
Respecting SemVer and Changelog #260

Closed
houmark opened this issue Nov 5, 2021 · 12 comments
Labels
enhancement Making the library better meta Related to build process/workflows/issue templates/refactors

Comments

@houmark

houmark commented Nov 5, 2021

Hey there,

First of all, thanks for this package. Don't want to be Mr. Negative here, so please take this as constructive feedback.

Backstory: We've been running express-rate-limit on various Lambda functions for nearly a year and it's been working great.

Yesterday we upgraded the express-rate-limit package (a minor version update). There was no changelog or release notes, but considering the package has 300,000+ weekly downloads, we wrongly assumed that any breaking change in a minor or patch version would be documented somewhere.

The package broke on AWS Lambda (with API Gateway in front) due to not having app.set('trust proxy', true); set. In the README it was not clear that this was needed for Lambda and we never had any issues running it without, but a throw was added in commit 17135ea.

Please consider adding a changelog. Many projects rely on a package like this, and it literally breaks in a big way when these kinds of breaking changes are introduced in a patch release.

@nfriedly
Member

nfriedly commented Nov 6, 2021

Oof, I'm sorry to hear that happened to you.

For starters, that's a fair point about the changelog.

API Gateway in front

Yea, I'll add that one to the readme along with the others.

The package broke ... due to not having app.set('trust proxy', true); ... a throw was added in commit 17135ea

That seems like two unrelated issues. Actually, I don't even understand how they could happen at the same time.

The one issue, not having trust proxy set, I understand: it's about how to choose the right IP address when you have multiple of them.

As a side note, using true leaves you open to abuse, because it causes Express to trust an unlimited number of proxies. You really want to set it to the lowest number that works in normal usage. I should add something about that to the readme as well.
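That hop-counting behaviour can be sketched in plain Node (an illustrative model, not Express's actual implementation; the function and variable names are made up):

```javascript
// Sketch of Express's hop-count logic for `trust proxy` (assumption: a
// number n means "the n proxies closest to the server are trusted").
// Addresses in X-Forwarded-For are ordered client-first, left to right.
function clientIp(remoteAddr, xForwardedFor, trustedHops) {
  // The full chain as the server sees it: header addresses + the socket address.
  const chain = xForwardedFor
    .split(',')
    .map((s) => s.trim())
    .concat(remoteAddr);
  // Trusting `trustedHops` proxies discards that many addresses from the
  // right; the next one is taken as the client. Trusting everything
  // (`trust proxy: true`) effectively takes the leftmost value, which the
  // client itself can forge.
  const index = Math.max(chain.length - 1 - trustedHops, 0);
  return chain[index];
}

// Honest request through one trusted proxy:
console.log(clientIp('10.0.0.5', '203.0.113.7', 1)); // 203.0.113.7

// Attacker prepends a fake address; with unlimited trust the forged
// leftmost value wins:
console.log(clientIp('10.0.0.5', '1.2.3.4, 203.0.113.7', Infinity)); // 1.2.3.4
```

With a bounded hop count, the same forged header is ignored, which is why a low, accurate number is safer than true.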

The other issue, req.ip not being set, is the weird one. That means there is no IP address at all. As far as I know, Express always sets that, so I don't know how it could be triggered in an Express app. Are you using Express? Is all of your traffic coming over the network?

I'd like to understand a little bit more about your situation to know if this is something I need to roll back or not.

My expectation was that it would only help diagnose situations where things were already broken, not cause any new breakages. That's why I considered it a semver minor change.

nfriedly added a commit that referenced this issue Nov 6, 2021
@houmark
Author

houmark commented Nov 6, 2021

Thank you @nfriedly — I really appreciate your quick valuable feedback here.

Yes, it is in fact very weird that req.ip was undefined. Throwing was crashing the entire process, breaking payment in our case (a pretty bad spot), but why it became undefined is strange in the first place. As this slipped through to production, we rushed the best fix possible short of disabling the rate limiter altogether. I would like to circle back and test a better configuration that will parse the IP without trusting every proxy. Can you explain the possible options for that trust configuration, so I can test those settings in our development environment, where it also happens?

This is Express, running on Node.js 14.x in AWS Lambda with API Gateway in front (our platform is based on AWS Amplify, which sets all this up automatically). I am not sure how I can provide more information, besides giving you our setup so it can maybe be replicated on your end. After adding the trust configuration, we did use the keyGenerator function just to log req.ip and confirm it was now being set. I can use that same function to debug further and reply back here with different configurations, so we can maybe get closer to the original issue.
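The debugging trick described here can be sketched as follows (illustrative; keyGenerator is the express-rate-limit option that receives the request and returns the key to count against, and the stand-in request object is made up):

```javascript
// Sketch of using keyGenerator to log the key express-rate-limit would
// use for a request, while keeping the default per-IP behaviour.
const debugKeyGenerator = (req) => {
  console.log('rate-limit key (req.ip):', req.ip);
  return req.ip;
};

// Passed into the limiter options, e.g.:
// const limiter = rateLimit({ windowMs: 15 * 60 * 1000, max: 100,
//                             keyGenerator: debugKeyGenerator });

// Stand-in request object, just for illustration:
debugKeyGenerator({ ip: '203.0.113.7' });
```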

Let me know what you prefer.

@houmark
Author

houmark commented Nov 6, 2021

I was just reading the Express behind proxies documentation: setting true assumes that the leftmost value in X-Forwarded-For is the client IP address, which in our testing seems to be the case.

We always get two IP addresses inside the Node process. The leftmost one is the client; the second value is an AWS IP. The really weird thing is that this worked up until at least v5.3.0 (we upgraded from 5.3.0 to 5.5.0, which is where it broke), and a quick review of the commits since that version showed the throw when req.ip is undefined. It may have been another commit that broke it, or maybe it was always undefined and this package allowed that, so the rate limiting became global for the process in memory, without using the IP address? That was my quick theory, but having to get this fixed and live, I did not have time to roll back the package version to test which IP it was using.
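That theory can be illustrated with a toy hit counter (a sketch, not the library's actual MemoryStore): when req.ip is undefined, every request lands in the same bucket.

```javascript
// Toy hit counter keyed the way an in-memory store would be. When req.ip
// is undefined, all clients share the single key "undefined", so the
// limit is effectively global rather than per-client.
const hits = new Map();

function recordHit(ip) {
  const key = String(ip); // undefined becomes the literal key "undefined"
  hits.set(key, (hits.get(key) || 0) + 1);
  return hits.get(key);
}

recordHit(undefined); // client A, no trust proxy -> req.ip undefined
recordHit(undefined); // client B lands in the same bucket
console.log(hits.get('undefined')); // 2 -- both hits counted together
```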

nfriedly added a commit that referenced this issue Nov 6, 2021
@nfriedly
Member

nfriedly commented Nov 6, 2021

Ok, I just changed it from throwing an error to logging one; that should get published to npm as v5.5.1 once the CI run finishes. It's not exactly a fix, but it should reduce the impact of whatever is going on.
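The change can be sketched as a small validation helper (hypothetical names; the library's actual check may differ):

```javascript
// Hypothetical sketch of the v5.5.0 -> v5.5.1 change: validate req.ip,
// but log instead of crashing the process.
function validateIp(ip, { throwOnError = false } = {}) {
  if (ip !== undefined) return true;
  const message =
    'express-rate-limit: req.ip is undefined; check your trust proxy setting';
  if (throwOnError) throw new Error(message); // v5.5.0 behaviour
  console.error(message); // v5.5.1 behaviour: warn, keep serving traffic
  return false;
}
```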

I was just reading the Express behind proxies documentation and setting true assumes that the left most value in X-Forwarded-For is expected to be the client IP address, which in our testing seems to be the case.

A malicious end-user could set their own X-Forwarded-For and, by default, most proxies will preserve that as the leftmost IP address, which is why I don't recommend setting true.

I'm about spent for tonight, but I would like to get to the bottom of the undefined req.ip at some point.

@houmark
Author

houmark commented Nov 6, 2021

Right, reading more on that configuration, it seems 1 would be the best value for us, but I would have to test more. I'd also like to understand why this happened as it did, so I'll do some more testing over the weekend, I hope, and give you feedback here as I dig into it.

@houmark
Author

houmark commented Nov 6, 2021

Having said that, not having to use this configuration would obviously be better, which is how we had it and somehow Express or this package figured out the IP address automatically (or we ran on undefined IP all the time). I shall attempt to get to the bottom of it.

@houmark
Author

houmark commented Nov 9, 2021

Hey @nfriedly!

I was able to do some more testing on our Lambda set with API Gateway in front.

In our case, trust proxy should be 2: not the rightmost IP (which would be 1, I guess) but the second one from the right (which is also the first IP address).

In addition to that, I did some other tests. If trust proxy is not set, req.ip will be undefined both in version 5.5.x and back in version 5.3.0, and possibly in all versions before that. What changed, and what you reverted in 5.5.1, is that it would throw if the IP is undefined. In our case, though, rate limiting still worked, maybe in a global fashion where it rate limits based on the undefined key in memory, effectively rate limiting globally on the platform. I am not sure, but we never had any reports of rate-limiting issues in the past.

I think your current fix in 5.5.1 is good, but I could definitely see people missing the console.error and ending up with semi-broken rate limiting, whereas throwing would be blocking and would force people to fix it (by reading improved docs on the subject). Maybe adding more docs related to this, doing a major version update with a changelog that has a LARGE breaking-change note, and re-introducing the throw is the best long-term approach here?

In any case, we have cleared the "mystery" on our end and have a stable configuration now, so feel free to close this one and use any of my feedback however you think is best.

Thanks for responding and fixing fast!

@nfriedly
Member

I think you're right, throwing in a major release version would be a better idea. And, yeah, a change log.

I feel like there's still something about your setup that I'm not understanding, but if it works for you then I guess that's the important thing :-)

@nfriedly
Member

Oh, one more question, which data store are you using? The default memory store, or one of the external ones (Redis, MongoDB, memcache, etc.)?

If it's the memory store, then the way lambda runs things might explain why your users aren't getting rate-limited.

@houmark
Author

houmark commented Nov 20, 2021

Sorry for the late reply here. We are using the standard memory store on Lambda.

@nfriedly
Member

Ok, yea, I think what was happening is that you really were using a global rate limiter, as you suggested... but the default memory store doesn't share state between processes, and AWS Lambda was creating enough processes that none of them ever got up to the limit on their own.

It sounds like you fixed the IP address reading part, but you're probably still not getting as much protection as you'd like out of express-rate-limit due to the combination of the memory store and Lambda's behavior.

You could solve this with an external store (memcached, redis, etc) to let all of the lambda processes share their state, but a better answer might be to drop express-rate-limit and instead have something else provide the rate-limiting - the API Gateway itself, or maybe something that sits between it and the internet.
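A custom store along those lines might look like this sketch (the incr(key, callback) shape is my assumption based on the v5-era store API; a real deployment would back it with Redis or Memcached rather than a local Map, so that separate Lambda processes share counts):

```javascript
// Sketch of a custom store for express-rate-limit v5 (assumed legacy
// store interface: incr, decrement, resetKey). The Map is a stand-in
// for an external shared backend such as Redis.
class SharedCountStore {
  constructor(windowMs) {
    this.windowMs = windowMs;
    this.hits = new Map(); // would be an external datastore in production
  }
  incr(key, callback) {
    const hits = (this.hits.get(key) || 0) + 1;
    this.hits.set(key, hits);
    callback(null, hits, new Date(Date.now() + this.windowMs));
  }
  decrement(key) {
    this.hits.set(key, Math.max((this.hits.get(key) || 1) - 1, 0));
  }
  resetKey(key) {
    this.hits.delete(key);
  }
}

// Hypothetical wiring:
// const limiter = rateLimit({ max: 100, store: new SharedCountStore(60000) });
const store = new SharedCountStore(60000);
store.incr('203.0.113.7', (err, hits) => console.log(hits)); // 1
```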

It's something to consider, anyways.

Sorry about the initial trouble, and good luck with everything!

@houmark
Author

houmark commented Nov 20, 2021

Ok, yea, I think what was happening is that you were really using a global rate-limiter as you suggested... but the default memory store doesn't share state between processes, and AWS Lambda was creating enough processes that none of them ever got up to the limit on their own.

Yeah, that was also our theory, and even if it did not create multiple processes, we had very low traffic on that endpoint, so maybe only one user at a time, effectively making the global limit the user's limit.

I will consider your suggestions on alternatives, thanks for that, but this is really not super critical. Just protecting a few endpoints where we don't want traffic to be hammered (payment attempt) etc. so I think it works well for now. If we want more control we will either add redis or look at rate limiting on the API gateway itself.

No worries, thanks for your quick replies and fixes, and I hope you are able to at least add an automatically generated changelog based on commits.

@gamemaker1 gamemaker1 added enhancement Making the library better meta Related to build process/workflows/issue templates/refactors labels Dec 29, 2021