The Lua rollout plan #294
Comments
I've completed the initial rollout for testing against the NREL APIs. So far things have looked pretty good, with no big issues and lots of nice benefits (lower memory use, lower CPU use, etc.). A few small things have cropped up, mostly around edge cases that the live traffic has helped pinpoint (e.g., geocoding results for analytics containing a city but no state/region). I've been fixing those issues and keeping an eye on traffic, but nothing is really impacting functionality. We'll continue to monitor things, but I think we can reach out to agencies soon about planning the wider rollout.
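For reference, the city-without-region edge case mentioned above can be handled along these lines. This is only a Python sketch, not the actual Lua code in API Umbrella; the helper name `location_key` and the field names `country`, `region`, and `city` are made up for illustration.

```python
from typing import Mapping, Optional


def location_key(geo: Mapping[str, Optional[str]]) -> str:
    """Build a key from whatever location parts are present.

    Hypothetical helper: the field names are assumptions, not API Umbrella's.
    """
    parts = []
    for field in ("country", "region", "city"):
        value = geo.get(field)
        # Treat missing, None, and blank values the same way instead of failing.
        parts.append(value.strip() if value and value.strip() else "-")
    return "/".join(parts)
```

A result with a city but no region then produces a key with a placeholder in the middle, instead of raising an error or dropping the record.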
As a quick update, I've been seeing some unexpected memory growth on the production system running the new stack. It's not happening super quickly, but it's something I've been looking into, so I wanted to make note of it for reference. This memory growth didn't show up in the multi-day stress tests I ran, but I have a couple of theories as to what's going on with production now:
I'm hoping it's the first option, since that means we don't really have a memory leak, just a slowly filling cache that does have an eventual cap. There are some recent signs that point towards that: I reloaded things at around 7AM, so that's the big drop-off, but the rate of increase does appear to be leveling off now. It's still increasing some, which I had expected to stop by now based on some calculations, but we'll see how this looks tomorrow after another good chunk of hours. Prior to the 7AM reload, I also made some tweaks to better tune the default sizes of our shared memory dicts inside nginx, so that might also be helping.
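To illustrate why a capped cache should level off rather than grow forever, here's a minimal Python sketch of a bounded least-recently-used cache. It is not how the nginx shared memory dicts are implemented; the `BoundedCache` class and its entry-count cap are assumptions chosen to keep the example small (a real shared dict is capped by bytes, not entries).

```python
from collections import OrderedDict


class BoundedCache:
    """LRU cache with a hard cap on the number of entries (illustrative only)."""

    def __init__(self, max_entries: int) -> None:
        if max_entries < 1:
            raise ValueError("max_entries must be at least 1")
        self.max_entries = max_entries
        self._items: OrderedDict = OrderedDict()

    def set(self, key, value) -> None:
        if key in self._items:
            # Refresh recency for keys that are already cached.
            self._items.move_to_end(key)
        self._items[key] = value
        # Evict the least recently used entries once the cap is exceeded.
        while len(self._items) > self.max_entries:
            self._items.popitem(last=False)

    def get(self, key, default=None):
        if key not in self._items:
            return default
        self._items.move_to_end(key)
        return self._items[key]

    def __len__(self) -> int:
        return len(self._items)
```

Filling a cache like this with far more distinct keys than its cap grows it until the cap is reached and then holds it flat, which is the shape I'd expect the memory graph to take if the cache theory is right.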
Status update on the memory growth: Despite things appearing like they were leveling off, the memory usage continued to grow. I think I've tracked it down to the geoip2 module. The memory growth was easily reproducible when making requests from many different IP addresses. After some deeper digging and testing, I think this should be resolved by switching to the geoip module that's built in to nginx and uses the legacy dataset. Overall memory usage should also improve with this switch. More details in this commit message: NREL/api-umbrella@19f2283 So we'll continue to keep our eyes peeled on that, but otherwise I think the plan is to announce the wider rollout for the week of November 16 and do a slow rollout that week to each agency domain.
A couple of status updates on the technical stuff:
And in terms of the general rollout, we announced our plans to roll out the changes to agencies this week. We're rolling things out to agencies one at a time on the following schedule:
Quick update for today: Things seem to be progressing well (knock on wood). The only notable issue discovered during the rollout this week was a pretty minor one. There was a bug that caused requests not to be logged in the analytics database if the request came from an IP address that geocoded to a city name that contained an accent or special character (Tórshavn, Faroe Islands is an example). This didn't affect a huge number of requests, but it has now been fixed.
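The root cause of that bug isn't spelled out here, so the following is not the fix itself, just a sketch of the kind of regression check that would catch it: a log record with a non-ASCII city name should survive serialization unchanged. It's written in Python rather than the Lua the stack actually uses, and the function names and the `request_ip_city` / `request_ip_country` field names are assumptions for illustration.

```python
import json


def encode_log_record(record: dict) -> bytes:
    """Serialize a log record as UTF-8 JSON, keeping non-ASCII characters intact."""
    return json.dumps(record, ensure_ascii=False, sort_keys=True).encode("utf-8")


def decode_log_record(payload: bytes) -> dict:
    """Parse a UTF-8 JSON payload back into a log record."""
    return json.loads(payload.decode("utf-8"))


# Hypothetical record using the example city from this thread.
record = {"request_ip_city": "Tórshavn", "request_ip_country": "FO"}
payload = encode_log_record(record)
```

Round-tripping `payload` through `decode_log_record` should give back the original record, accent included.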
The transition is fully complete! 🌟 🌟
We have a significant update to the API Umbrella platform we're going to be releasing: NREL/api-umbrella#183. This issue is to coordinate how we're going to update the api.data.gov stack with this update.