[#120368533] Add set_mtu job #2

Merged
combor merged 1 commit into gds_master from 120368533_set_mtu on May 31, 2016

saliceti commented May 27, 2016

What

Story: Reduce MTU of VMs to 1500

We have discovered an MTU related problem with AWS's nat-gateway service which is causing us all sorts of problems at the moment, including:

  • 'Error tailing logs: Unexpected EOF'. It seems that doppler cannot properly validate the auth token with the UAA (presumably because it uses the UAA endpoint URL, which is routed out via the aws-nat-gateway).
  • 'Stats unavailable: Stats server temporarily unavailable.' This seems to be tps-listener failing to connect to the doppler endpoint.

The MTU is different for different machine sizes on AWS, but a lot of the larger ones (which we're using) now default to using Jumbo frames, with MTU 9001. AWS doesn't give us enough control over the dhcp-option-sets to set the MTU for ourselves.

The problem doesn't happen if you spin up your own nat-masquerading linux box. AWS confirmed they can see the problem after we sent them some code that tries MTU sizes from 1513 down to 517 bytes. They say they've worked out a fix and will deploy it by 30th June.

The bosh-agent seems to overwrite dhclient.conf when it starts up, so we can't set the MTU there. Instead, we can use a bosh addon to write a DHCP hook that sets the MTU whenever the lease is renewed.
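
For illustration, here is a minimal sketch of what such a hook could look like, written as a Python helper that renders the shell fragment. This is not the code in this PR: the function name `render_mtu_hook`, the default interface `eth0`, and the choice of dhclient events are assumptions. On Ubuntu, dhclient-script sources files placed in /etc/dhcp/dhclient-exit-hooks.d/ and exposes `$reason` and `$interface` to them.

```python
import textwrap


def render_mtu_hook(interface: str = "eth0", mtu: int = 1500) -> str:
    """Return a dhclient exit-hook fragment that pins `interface` to `mtu`.

    Hypothetical helper: the interface name and MTU stand in for the
    properties that the set_mtu job exposes.
    """
    # 68 bytes is the minimum MTU an IPv4 link must support.
    if mtu < 68:
        raise ValueError(f"MTU {mtu} is below the IPv4 minimum of 68")

    # dhclient-script sets $reason and $interface before sourcing exit hooks.
    return textwrap.dedent(f"""\
        # Re-apply the MTU after every lease event on the chosen interface.
        case "$reason" in
          BOUND|RENEW|REBIND|REBOOT)
            if [ "$interface" = "{interface}" ]; then
              ip link set dev "$interface" mtu {mtu}
            fi
            ;;
        esac
        """)


if __name__ == "__main__":
    print(render_mtu_hook())
```

Because the fragment runs on every BOUND, RENEW, REBIND and REBOOT event, the MTU is reset even if the DHCP server hands out 9001 again.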

How to review

Review with MTU PR on paas-cf

Note

Once this is merged, update the MTU PR on paas-cf to point to the merge commit ID.

Who can review

Anyone but @jimconner or me

set_mtu job
Allow setting the MTU via bosh addon. The interface name and the MTU can
be changed via properties.
An example manifest is provided.
This was tested on Ubuntu 14.04.
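
One way to check the result on a VM is to read the MTU back from sysfs. The sketch below is an assumption about how one might verify it, not part of the job: the helper name `current_mtu` and the default interface `eth0` are hypothetical. On Linux the kernel exposes the value in /sys/class/net/<interface>/mtu.

```python
from pathlib import Path


def current_mtu(interface: str = "eth0", sysfs_root: str = "/sys/class/net") -> int:
    """Read the MTU the kernel currently reports for `interface`.

    `sysfs_root` is a parameter only so the function can be exercised
    against a temporary directory instead of a real machine.
    """
    return int((Path(sysfs_root) / interface / "mtu").read_text().strip())


if __name__ == "__main__":
    # Assumes an interface named eth0 exists on this machine.
    print(current_mtu("eth0"))
```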

dcarley commented May 27, 2016

Could you add a brief description to the PR about the previous SPIKE:

  • why we want to do this; some VMs in AWS get 9001 by default but we've encountered a bug in another AWS product
  • why we don't change dhclient.conf; it's written by and hardcoded into bosh-agent
  • how this works; it gets run after every lease renewal to switch the value back again

combor self-assigned this May 31, 2016

combor commented May 31, 2016

I tested this together with alphagov/paas-cf#301 and it works properly. Merging...

combor merged commit a2bc2ab into gds_master May 31, 2016

combor deleted the 120368533_set_mtu branch May 31, 2016

combor restored the 120368533_set_mtu branch May 31, 2016

rjw1 unassigned combor Dec 30, 2016