MM Traffic Test: 15%+ Portland K8s #518
Using the Maintenance Mode plan (issue #409), put the SCL3 and K8s production deployments into maintenance mode, and send 15% or more of MDN traffic to the Portland K8s servers (#487 (comment)) for up to an hour. Monitor with New Relic, Sentry, Papertrail, and AWS. Record the results and open new issues as needed.
The test went well. We spent 5 minutes at 5%, then 10 minutes each at 15%, 50%, and 100% traffic to AWS. There were about 2,600 users on the site during the test.
CPU usage topped out around 10-15%, barely climbing between 50% and 100% traffic. Other AWS metrics looked good as well. All indications are that Portland K8s will have plenty of spare capacity.
New Relic showed the same roughly 30% reduction in traffic and throughput from switching to maintenance mode that we saw in previous tests. Throughput continued to decline during the test; it's unclear whether this is due to maintenance mode or should be expected in K8s. Response time was flat or about 10% better with traffic going to K8s. @jgmize suggested using a separate New Relic APM application, so that Portland K8s requests could be compared to SCL3 directly.
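One way to follow up on that suggestion: the New Relic Python agent reads its APM application name from its config file, so the Portland K8s pods could report under their own name. A minimal sketch, assuming a standard `newrelic.ini`; the `kuma-portland-k8s` name is hypothetical, not an existing app:

```ini
; Hypothetical newrelic.ini fragment for the Portland K8s deployment.
; A distinct app_name makes these requests chart separately from SCL3,
; so the two environments can be compared side by side in New Relic.
[newrelic]
app_name = kuma-portland-k8s
```

The same setting can also be supplied via the `NEW_RELIC_APP_NAME` environment variable, which may be more convenient in a K8s deployment spec.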
Papertrail was able to handle the logging volume. The top 404 was for /en-US/favicon.ico. This is also a 404 in SCL3, but there it is handled by Apache and doesn't do a locale-based redirect. We should probably redirect these requests to our generic favicon at https://developer.cdn.mozilla.net/static/img/favicon32.e1ca6d9bb933.png, or serve the same image at this location as well.
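The redirect rule could be sketched as below. This is a plain-Python illustration of the matching logic, not Kuma's actual routing code; the `favicon_redirect` helper and the locale-prefix regex are assumptions for illustration (in the Django app itself this would likely be a `RedirectView` entry in the URLconf):

```python
import re

FAVICON_URL = (
    "https://developer.cdn.mozilla.net/static/img/favicon32.e1ca6d9bb933.png"
)

# Matches /favicon.ico with or without a locale prefix such as /en-US/.
FAVICON_RE = re.compile(r"^/(?:[A-Za-z]{2,3}(?:-[A-Za-z]{2,4})?/)?favicon\.ico$")


def favicon_redirect(path):
    """Return the redirect target for favicon requests, or None otherwise."""
    if FAVICON_RE.match(path):
        return FAVICON_URL
    return None
```

This would turn the current locale-redirect-then-404 sequence into a single redirect to the CDN-hosted favicon.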
I think we've proven that Portland K8s can handle MDN traffic. Next we need to verify that write requests work correctly in staging.