Support `--sysctl` `docker run` flag #789

Closed
aaithal opened this Issue May 3, 2017 · 22 comments

Member

aaithal commented May 3, 2017

From #502 (has four +1s):

Support for the sysctl option would be useful, especially to allow setting somaxconn.

wezzynl commented May 22, 2017

Yeah, being able to set somaxconn would be very, very welcome.

mancej commented Jul 5, 2017

We definitely need this feature ASAP, the workaround is a security risk.

jdreaver commented Aug 15, 2017

Are there any workarounds to set sysctl parameters until this feature is implemented? In particular, I want to set net.ipv4.ip_local_port_range and net.core.somaxconn. There is a lot of advice on the internet about workarounds, but a lot of it is old and I want to follow best practices here without shooting myself in the foot.

ayozemr commented Sep 8, 2017

Subscribed. We may need to tune somaxconn too, for a uWSGI Python server.

Thanks!

mancej commented Sep 8, 2017

Just a heads up, we instituted a workaround (for now, while we wait for this), but unfortunately it involves running the container as privileged. We also had to set the somaxconn on the host itself to get it working. That being said, it does work.

shoumitragametime commented Sep 28, 2017

@mancej Can you give me a little insight on the workaround? I'm running into a similar issue where I am trying to set the tcp_keepalive_time for one of my Airflow operators. Though it would be great if Amazon could deliver on the feature soon.

jdreaver commented Sep 28, 2017

@shoumitragametime if you set your container as privileged then you can just call sysctl yourself on startup.

AWS also recently implemented adding Linux capabilities to containers (https://aws.amazon.com/about-aws/whats-new/2017/09/amazon-ecs-adds-support-for-adding-or-dropping-linux-capabilities-to-containers/). I haven't tried this yet but maybe you don't have to use privileged and can instead just add a particular capability.

wezzynl commented Nov 20, 2017

Any update on this? It's really gnarly to have to run a lot of containers eating up memory because we can't simply increase somaxconn.

owengo commented Mar 20, 2018

Any update? Being unable to properly configure the tcp_keepalive_xxx sysctl settings causes TCP_ELB_Reset_Count errors on the NLB.

johvet commented May 24, 2018

Any update on this one? We're running into the same problem and need to adjust somaxconn.

jnmcross commented May 31, 2018

We also need to set net.ipv4.tcp_keepalive_* properties.
If this doesn't get fixed we may move off of ECS

shoumitragametime commented May 31, 2018

@jnmcross I was able to get around this by running my container in privileged mode (if you are okay with that) and executing sysctl -w net.ipv4.tcp_keepalive_time=xxx net.ipv4.tcp_keepalive_intvl=xxx net.ipv4.tcp_keepalive_probes=x in my entrypoint.sh script. Just make sure your script is executable and that should work.
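
A minimal sketch of the entrypoint approach described in this comment, assuming a privileged container with the sysctl binary (procps or equivalent) installed. The function name `apply_keepalive`, the `SYSCTL_BIN` override, and the example values are hypothetical; none of them come from the thread.

```shell
#!/bin/sh
# Hypothetical entrypoint helper: apply TCP keepalive settings before
# starting the container's main process.

# SYSCTL_BIN is an assumed override hook so the function can be dry-run
# (for example with SYSCTL_BIN=echo) outside a privileged container.
SYSCTL_BIN="${SYSCTL_BIN:-sysctl}"

apply_keepalive() {
    # $1 = keepalive time (s), $2 = probe interval (s), $3 = probe count
    "$SYSCTL_BIN" -w \
        "net.ipv4.tcp_keepalive_time=$1" \
        "net.ipv4.tcp_keepalive_intvl=$2" \
        "net.ipv4.tcp_keepalive_probes=$3"
}

# In a real entrypoint.sh this would be followed by something like:
#   apply_keepalive 120 30 5 && exec "$@"
# The numbers are placeholders, not recommendations.
```

Ending the script with `exec "$@"` hands the process over to the container's command, so the sysctl call only runs once at startup.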

ziggythehamster commented Jun 1, 2018

If you're not OK with running a privileged container and exposing read-write procfs and sysfs to your container for its lifetime, this trick works for us on the latest Docker/ECS:

nsenter --target $FIND_YOUR_CONTAINER_PID_SOMEHOW --mount --uts --ipc --net --pid \
   /bin/sh -c '/usr/bin/mount /proc/sys -o remount,rw;
               /usr/sbin/sysctl -q net.ipv6.conf.all.forwarding=1;
               /usr/bin/mount /proc/sys -o remount,ro;
               /usr/bin/mount /proc -o remount,rw # restore rw on /proc'

We built this into an internal tool which polls Docker for new containers having a sysctls label. Then we validate the (pipe-delimited) sysctls against a whitelist (net.core.* basically) and plug them into this command and have our tool run it.

Kind of a janky workaround for something ECS should support out of the box, but it works.

Also, I'd add that this has to run, as root, on the host machine since the namespaces of other containers would be inaccessible if you ran this in a container. You also need procps or equivalent installed in the container.
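
The internal tool itself is not shown in the thread. As a rough sketch of the validation step it describes, assuming the label value is a pipe-delimited list of key=value pairs and the allowlist is net.core.* with numeric values only, something like the following could work. The function name `validate_sysctls` and the exact label format are assumptions.

```shell
#!/bin/sh
# Hypothetical validator for a pipe-delimited "sysctls" label value, e.g.
#   net.core.somaxconn=1024|net.core.rmem_max=262144
# Prints each accepted key=value on its own line and returns 1 at the
# first entry that is malformed or outside the allowlist.

validate_sysctls() {
    label="$1"
    old_ifs="$IFS"
    IFS='|'
    set -f            # no globbing while splitting the label
    set -- $label     # unquoted on purpose: split on '|'
    set +f
    IFS="$old_ifs"
    for entry in "$@"; do
        # Allowlist: only net.core.* keys, and a key=value shape.
        case "$entry" in
            net.core.*=*) ;;
            *) echo "rejected: $entry" >&2; return 1 ;;
        esac
        key="${entry%%=*}"
        value="${entry#*=}"
        # Keys may contain only lowercase letters, digits, dots, underscores.
        case "$key" in
            *[!a-z0-9._]*) echo "rejected key: $key" >&2; return 1 ;;
        esac
        # Values must be non-empty and numeric, which blocks shell injection.
        case "$value" in
            ''|*[!0-9]*) echo "rejected value: $value" >&2; return 1 ;;
        esac
        printf '%s\n' "$entry"
    done
}
```

Each accepted line could then be passed to the sysctl call inside the nsenter command. Validating strictly matters here because the values end up in a command that runs as root on the host.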

nehalrp commented Jun 2, 2018

+1, any update on this? We need to set keepalives to work with NLBs.

rifelpet commented Jun 14, 2018

We were relying on the <4.13 kernel behavior of containers inheriting sysctl changes from the host in order for our containers to use the following overrides:

net_ipv4_tcp_tw_reuse = 1
net_core_somaxconn = 10000
net_ipv4_tcp_max_tw_buckets = 250000
net_ipv4_tcp_fin_timeout = 10
net_ipv4_ip_local_port_range = 25000 61000

This behavior changed with 4.13 as mentioned on the bottom of this page.

When we upgraded to the latest Amazon Linux AMI we lost those sysctl overrides. While the workarounds mentioned in this issue do work, it'd be great to avoid those altogether and instead specify them in the task definition.

nehalrp commented Jun 14, 2018

@rifelpet Agreed. Especially since Amazon Linux AMI 2018.03 uses a 4.14+ kernel...

https://aws.amazon.com/amazon-linux-ami/2018.03-release-notes/

jishi commented Jul 11, 2018

I want to point out that being able to set net.ipv4.tcp_keepalive_time is a very useful way to mitigate the silently dropped connections in AWS NAT gateways (which happen after 5-6 minutes for each connection), since not all applications/drivers allow you to configure the desired keepalive delay.

We always have issues with this on any externally hosted service that uses sockets, with Compose being one of the biggest culprits. Adding sysctl options to the taskDefinition seems to be by far the most bang for the buck IMO :)

shahidash commented Jul 18, 2018

Hi,
net.ipv4.tcp_keepalive_time is an important variable to set when using AWS infrastructure. We are using an NLB to connect multiple Docker services, and the NLB has an idle timeout of only 350 seconds, so we need net.ipv4.tcp_keepalive_time to keep connections alive beyond the NLB idle timeout.
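
The constraint behind this comment is simple arithmetic: the first keepalive probe has to go out before the idle timeout expires. A small sanity check, using the 350-second figure quoted in this comment as the default, might look like this. The function name is hypothetical.

```shell
#!/bin/sh
# Returns 0 if the first keepalive probe would be sent before the load
# balancer's idle timeout expires, 1 otherwise.
# $1 = net.ipv4.tcp_keepalive_time in seconds
# $2 = idle timeout in seconds (defaults to the 350 s quoted in the thread)
keepalive_beats_idle_timeout() {
    [ "$1" -lt "${2:-350}" ]
}
```

The Linux default for tcp_keepalive_time is 7200 seconds, so `keepalive_beats_idle_timeout 7200` fails, while a value such as 120 passes. On a Linux host the current value can be read from /proc/sys/net/ipv4/tcp_keepalive_time.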

devops-eatigo commented Jul 20, 2018

@shoumitragametime can you share some details of your entrypoint.sh or how you run the sysctl command? I tried adding the command in a Dockerfile RUN instruction and in the command section under ContainerDefinition, but both failed with no path/no such file or permission denied errors. I did set privileged to true.

wezzynl commented Sep 12, 2018

Not having control over various Docker run options was reported as early as 20 August 2016... Now, more than 2 years later, we're still not able to set kernel parameters with Docker's sysctl flag. This is ridiculous.

wezzynl commented Sep 12, 2018

@aaithal Can you provide us with an update?

Contributor

sharanyad commented Sep 17, 2018

You can now add sysctl options to your containers in your task definition. You can add these new options in the AWS Console now, and they'll be available in the AWS CLI and SDKs soon.
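
The thread does not show the task definition syntax. In the ECS container definition this is exposed as a `systemControls` list of `namespace`/`value` pairs; treat the field names and the example value below as something to check against the current ECS documentation rather than as part of this announcement. A small helper that prints one such entry, with a hypothetical function name:

```shell
#!/bin/sh
# Prints one entry for a container definition's systemControls list.
# $1 = sysctl name, $2 = value. No JSON escaping is done, so this sketch
# assumes plain sysctl names and simple numeric values.
system_control_entry() {
    printf '{"namespace": "%s", "value": "%s"}\n' "$1" "$2"
}

# Example (illustrative value only):
#   system_control_entry net.core.somaxconn 1024
# would be placed inside "systemControls": [ ... ] in the container definition.
```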

sharanyad closed this Sep 17, 2018
