
Set memory constraints of docker containers to match deployment #1809

Merged

merged 1 commit into develop from docker-mem-limit on Apr 19, 2018

Conversation

@toolness
Contributor

toolness commented Apr 19, 2018

To improve dev/prod parity and better anticipate potential problems like #1601 during development, we can use Docker's user memory constraints to simulate the kind of resource limits that our production environment has.

I'm not actually sure whether our production environment disables swap the way this PR does, but Brendan Gregg's Linux Performance Tools talk claims that most production environments do. Disabling swap also makes it much easier to tell when we're using too much memory: the app simply crashes, instead of slowly thrashing as the kernel swaps memory to disk.
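For illustration, here is a minimal sketch of what these constraints might look like in a docker-compose file (the service name, image, and sizes are placeholders, not this repo's actual config):

```yaml
# Illustrative docker-compose fragment (version 2 file format).
version: '2'
services:
  app:
    image: my-app
    mem_limit: 512m
    # Setting memswap_limit equal to mem_limit gives the container
    # zero swap, so exceeding the limit triggers the OOM killer
    # instead of thrashing.
    memswap_limit: 512m
```

The same effect can be had on a one-off container with `docker run --memory=512m --memory-swap=512m`.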

To test this out, you can create a new Python file with the following content and run it in a container:

from ctypes import c_int

# Each Chunk is an array of one million C ints (~4 MB on most platforms).
Chunk = c_int * 1_000_000

chunks = []

# Keep allocating until the kernel's OOM killer terminates the process.
while True:
    chunks.append(Chunk())
    print(f"Allocated {len(chunks)}M")

You should see it get killed by the kernel after it has allocated an amount of memory commensurate with its container's mem_limit.
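To confirm that the limit actually took effect inside the container, you can read it back from the cgroup filesystem. This is a sketch, and the paths are assumptions about the host: cgroup v1 exposes memory.limit_in_bytes, cgroup v2 exposes memory.max.

```python
from pathlib import Path
from typing import Optional

# Assumed cgroup paths; which one exists depends on the host's cgroup version.
CGROUP_PATHS = [
    Path("/sys/fs/cgroup/memory/memory.limit_in_bytes"),  # cgroup v1
    Path("/sys/fs/cgroup/memory.max"),                    # cgroup v2
]


def parse_limit(text: str) -> Optional[int]:
    """Return the limit in bytes, or None if the container is unlimited."""
    text = text.strip()
    if text == "max":  # cgroup v2 spelling for "no limit"
        return None
    return int(text)


def container_memory_limit() -> Optional[int]:
    """Read the effective memory limit, or None if no cgroup file is found."""
    for path in CGROUP_PATHS:
        if path.exists():
            return parse_limit(path.read_text())
    return None


if __name__ == "__main__":
    limit = container_memory_limit()
    print(f"Memory limit: {limit if limit is not None else 'unlimited'} bytes")
```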

I also tried uploading Kelly's latest R10 bulk data with these constraints in place, and it works, so that's a good sign.

In the future, we can also add a configuration test that ensures the container memory limits are identical to the ones specified in the manifests, but I'll leave that for a later PR.
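A hypothetical sketch of what such a parity check could look like, assuming both the docker-compose mem_limit and the manifest's memory value use a `<number><unit>` shape like "512M" (the helper names and the unit format are assumptions, not this project's actual config):

```python
import re


def parse_mem(value: str) -> int:
    """Convert a size string like '512M' or '1G' to bytes."""
    match = re.fullmatch(r"(\d+)([KMG]?)", value.strip(), re.IGNORECASE)
    if not match:
        raise ValueError(f"unrecognized memory size: {value!r}")
    number, unit = int(match.group(1)), match.group(2).upper()
    multiplier = {"": 1, "K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}[unit]
    return number * multiplier


def assert_limits_match(compose_limit: str, manifest_limit: str) -> None:
    """Fail if the two limits denote different byte counts."""
    if parse_mem(compose_limit) != parse_mem(manifest_limit):
        raise AssertionError(
            f"docker-compose mem_limit {compose_limit!r} != "
            f"manifest memory {manifest_limit!r}"
        )


assert_limits_match("512M", "512M")  # identical limits pass silently
```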

@toolness toolness requested a review from jseppi Apr 19, 2018

@toolness

Contributor

toolness commented Apr 19, 2018

Uh... I am confused as to why tests are failing, since the changes in this PR only affect Docker settings used during development, not during CI. Weird.

@toolness

Contributor

toolness commented Apr 19, 2018

Well, that's weird: I told CircleCI to re-run the tests and they passed. Filed #1810 to address it later.

@jseppi

jseppi approved these changes Apr 19, 2018

@toolness toolness merged commit 909e5fe into develop Apr 19, 2018

3 checks passed

ci/circleci: build Your tests passed on CircleCI!
codeclimate All good!
codeclimate/total-coverage 92% (0.0% change)

@toolness toolness deleted the docker-mem-limit branch Apr 19, 2018

@toolness

Contributor

toolness commented Apr 19, 2018

Update: I tried changing the batch size of bulk upload processing from 5,000 to 60,000 in an attempt to reproduce the crashes we were seeing on production, but it worked fine. Ah well; it's likely that other factors in the production deployment left the worker with even less memory there than it has in development. Regardless, I still think this PR brings us closer to dev/prod parity.
