
Create/implement task definition for 2018 Transportation-Systems-Service project #104

Closed
iant01 opened this issue May 17, 2018 · 19 comments

Comments


iant01 commented May 17, 2018

Create the service subdirectory and service.yaml file for use in getting the service task definition into ECS.


iant01 commented May 18, 2018

Created PR16 with the changes needed to master.yaml to add the transportation-systems service, and a service.yaml file to define the task definition and load balancer listener rule for the service.

Right now, the following items have arbitrarily set values:
Host: staging-2018.civicpdx.org
Path: /transportation-systems
Port: 3000
Priority: 40 (needs to be before the civic-2018 service and the civic-lab service)
Memory: 2048 (2 GB; last year's service was a memory pig, so hopefully this year's will use less. Setting it high to start.)
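For reference, a rough sketch of what the relevant pieces of service.yaml could look like with these values, assuming the usual CloudFormation listener-rule/task-definition pattern for ECS services; resource names and parameter wiring here are illustrative assumptions, not the actual file contents:

```yaml
# Illustrative CloudFormation fragment only; names and Refs are assumptions.
ListenerRule:
  Type: AWS::ElasticLoadBalancingV2::ListenerRule
  Properties:
    ListenerArn: !Ref Listener
    Priority: 40                     # must evaluate before civic-2018 / civic-lab rules
    Conditions:
      - Field: host-header
        Values: [staging-2018.civicpdx.org]
      - Field: path-pattern
        Values: [/transportation-systems*]
    Actions:
      - Type: forward
        TargetGroupArn: !Ref TargetGroup

TaskDefinition:
  Type: AWS::ECS::TaskDefinition
  Properties:
    Family: transportation-systems
    ContainerDefinitions:
      - Name: transportation-systems
        Memory: 2048                 # hard limit in MiB (2 GB)
        PortMappings:
          - ContainerPort: 3000
```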

iant01 self-assigned this May 19, 2018

znmeb commented May 19, 2018

@iant01 How much memory did we use last year? And how do you measure it? Is there some way we can test this locally before deploying?


iant01 commented May 19, 2018

We can possibly use `docker stats` on a running host to get memory info for the running containers. Either a container developer would need to run the command on their local system, or we'd run it on another ECS instance. Since we can't ssh into the hacko container instance to run the command, we might be able to run the transportation-systems container on another ECS instance, but I have not had any success running the 2017 container in my AWS account, so I may not have success with the 2018 container either. I will give it a try.

There may be a Docker API that would work against the hacko ECS instance, but again we might need an access key to get in.


znmeb commented May 19, 2018

These are the API containers, right? If they look like this year's API images from the backend-examplar, then either there's an AWS way to monitor their usage or we'd need console access to the Docker host. :-(

@MikeTheCanuck

@znmeb, is there any chance of running the container locally, performing a few operations through the API (to load up some in-memory data), and running the `docker stats` command as Ian suggested above?

@MikeTheCanuck

There is no way we're going to throw 1/4 of our available memory at a new container "just in case" - this was only done last year as a last-minute, last-resort fix, and no one's had time to go back and characterize that pig since then.


znmeb commented May 19, 2018

Yeah, I can spin it up locally but this isn't the full API. Should I just use the Docker host default settings for container resource usage?

It would be really nice if we could build resource limiting into the images - interpreted languages like Python tend to take up all the RAM they can find even if they're sharing it with a dozen other containers / VMs they don't know about.

@MikeTheCanuck

I'm confused - why isn't the Docker image you'd spin up locally "the full API"? Isn't that one of the benefits of Docker, that the app you run locally and the one you deploy into production are identical?


znmeb commented May 19, 2018

It's the full API for the one database we had when we built the image. We have more data now, which will mean more models and more API endpoints and probably more RAM used.

@BrianHGrant

So we have some options for profiling Python and Django behavior, including running with DEBUG=True on the gunicorn server (-p) connecting to the AWS DB, some usage of the Django Debug Toolbar (not currently installed), or maybe New Relic if we need more advanced info than `docker stats` provides.

That said, there were some complexities to the transportation project last year that didn't exist when I left a bit ago. I will catch up this weekend, but I'm not sure if this will be an issue.

Good data on usage is great, though.

@MikeTheCanuck

Let's not go overboard here - the most significant thing we need to know is roughly how much RAM the Django app(s) in the container will consume, so that we can allocate a sufficient amount of RAM in the AWS CloudFormation template for this container. We generally start out with 100 MB and bump it in increments of 100 from there; we spent a lot of time last year debugging containers that wouldn't stay running because we had no idea what kind of memory load they would have.

However, we're not just going to throw RAM at these - this isn't an unlimited resource - so if there's some risk that they'll need more than 100MB, let's get a rough number based on some rough characterization. Thanks!


znmeb commented May 19, 2018

By the way, wouldn't DEBUG=True use more RAM?


iant01 commented May 19, 2018

Silly question... was the transportation container last year running a database itself rather than connecting to a database server, or was it a hybrid of both (keeping large amounts of data local after grabbing it from a remote DB server)?

@bhgrant8

Yes DEBUG would use more RAM.

But here is where I've got to:

`docker stats` gives a streaming output, so a point-in-time (PIT) snapshot of memory usage and a few other stats:

CPU %               MEM USAGE / LIMIT     MEM %               NET I/O             BLOCK I/O           PIDS

I then ran this against the docker container transportation-system-backend_api_production_1 on my host machine, using the current API.

During startup of the container using the prod flag (./bin/start.sh -p) and connecting to the AWS-hosted DB, we see CPU % maxing out around 85%, with memory usage going to around 152 MiB.

[screenshot: docker stats output for transportation-system-backend_api_production_1]
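Since `docker stats` streams continuously, a one-shot snapshot (`docker stats --no-stream`) is easier to record and parse. Here's a small sketch of pulling the usage half of the MEM USAGE column into MiB; the container name is the one from this thread, but the numeric values are illustrative:

```python
# One-shot snapshot could come from e.g.:
#   docker stats --no-stream --format "{{.Name}}\t{{.MemUsage}}"
# Units ordered longest-first so "MiB" isn't matched by the bare "B" suffix.
UNITS = {"KiB": 1 / 1024, "MiB": 1.0, "GiB": 1024.0, "B": 1 / (1024 ** 2)}

def mem_usage_mib(mem_usage_field: str) -> float:
    """Convert the usage half of a field like '152MiB / 7.786GiB' to MiB."""
    used = mem_usage_field.split("/")[0].strip()   # e.g. '152MiB'
    for unit, factor in UNITS.items():
        if used.endswith(unit):
            return float(used[: -len(unit)]) * factor
    raise ValueError(f"unrecognized unit in {used!r}")

line = "transportation-system-backend_api_production_1\t152MiB / 7.786GiB"
name, usage = line.split("\t")
print(name, mem_usage_mib(usage))  # → ... 152.0
```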

The thing I was seeing, though, is that MEM USAGE did not seem to drop by more than a few MiB; after a few queries using filters on the crash data, it reached ~225 MiB. So I started looking into what this figure actually includes.

First, I found Google's cAdvisor (https://github.com/google/cadvisor). It provides a GUI and 60 seconds of historical data, so it's a bit more useful than `docker stats`.

[screenshot: cAdvisor memory usage graph, May 19, 2018]

Looking into the MEM usage, I came across this issue, which documents the different types of memory being recorded:

google/cadvisor#638

tldr is:

Hot is the working set - pages that have been recently touched, as calculated by the kernel.

Total includes hot + cold memory, where cold pages are those that have not been touched in a while and can be reclaimed if there is global memory pressure.

or another way:

Total (memory.usage_in_bytes) = rss + cache
Working set = Total - inactive (not recently accessed memory = inactive_anon + inactive_file)

So the question becomes: which is the most important number?
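The two formulas quoted above can be sketched as a small calculation over the cgroup counters (the field names follow the cgroup memory controller's memory.stat; the sample numbers below are invented, not measurements from the container):

```python
# Sketch of the cadvisor formulas: Total = rss + cache (memory.usage_in_bytes),
# Working set = Total - inactive, inactive = inactive_anon + inactive_file.
MIB = 1024 ** 2

def working_set(usage_in_bytes: int, inactive_anon: int, inactive_file: int) -> int:
    """Hot (working set) memory: total usage minus not-recently-accessed pages."""
    return usage_in_bytes - (inactive_anon + inactive_file)

# e.g. 225 MiB total with 60 MiB of cold (inactive) pages that could be reclaimed:
total = 225 * MIB
hot = working_set(total, inactive_anon=10 * MIB, inactive_file=50 * MIB)
print(total // MIB, hot // MIB)  # → 225 165
```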

@bhgrant8

@iant01 I feel like there was some type of hybrid data store going on, but I was not directly on the project last year and am not completely sure of the full magic that was happening.

@MikeTheCanuck

Awesome data Brian, thank you.

When we allocate memory to each container, there’s no memory management to worry about - as in, the “cold” memory that could be reclaimed probably wouldn’t be, because there’s nothing else in the container that would appreciably request contended memory (it’d all be consumed by one process - gunicorn, Python, whatever the runtime host is).

So given we’re doing hard allocations per container, I’m going to conservatively assume that we should use the Total - and then round up to the nearest 100 MB (just to give us a little breathing room for edge cases and future API enhancements).

Based on this data, I’m inclined to allocate 300 MB to this transportation-systems container.
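The "round up to the nearest 100" rule is simple to make mechanical; a sketch (the 225 MiB observation is from Brian's numbers above, and indeed rounds up to the 300 MB Mike proposes):

```python
import math

def allocation_mb(observed_mb: float, step_mb: int = 100) -> int:
    """Round an observed Total memory figure up to the next allocation step."""
    return int(math.ceil(observed_mb / step_mb)) * step_mb

print(allocation_mb(225))  # → 300
print(allocation_mb(152))  # → 200
print(allocation_mb(300))  # → 300 (already on a step boundary)
```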


znmeb commented May 19, 2018

I've got the merged database ready for testing - I'm planning to build a local development environment from it at the May 20 build session so we can see what we have.


iant01 commented May 20, 2018

All of the discussion on memory use should be moved to its own new issue; this issue was intended for creating the service task definition to get things going in ECS.

This issue can be closed once the memory discussion is in its own issue and PR 16 has been merged.


iant01 commented May 20, 2018

On the question of which memory figure is relevant: it would be the Total memory size.
