Skip to content
This repository was archived by the owner on Jan 29, 2026. It is now read-only.

feat(setup-waf): [INFRA-1167] create WAF for firefox-api-proxy#42

Merged
hyperparabolic merged 3 commits intomainfrom
setup-waf
Apr 27, 2023
Merged

feat(setup-waf): [INFRA-1167] create WAF for firefox-api-proxy#42
hyperparabolic merged 3 commits intomainfrom
setup-waf

Conversation

@hyperparabolic
Copy link
Copy Markdown
Contributor

@hyperparabolic hyperparabolic commented Apr 24, 2023

Goal

Set up a preliminary WAF implementation for firefox-api-proxy.

Going with a relatively permissive initial setup here. Traffic originating from the CDN is all permitted, and a fairly loose rate limit is in place for direct requests.

This change requires a newer version of @pocket-tools/terraform-modules, so updated that to the latest version. The loosely coupled version of cdktf dependencies are also updated with that change. This includes breaking changes to this package around imports. Changing all imports to use deep imports as the index files have been deprecated.

I think we should also consider replicating some of the dotcom gateway rules that are provided by AWS here as well (these rules include blocks for well known abuse), but I'd like to export those from terraform-modules rather than copy pasting them all over the place.

I'd love feedback/perspectives on:

  • Are there any other rules we want in this first pass?
  • Are we happy with resource names?
  • I tried to document some sensible next steps for mitigating abuse. Any other thoughts on these?
  • Some other rules we might want are defined in dotcom-gateway (see rules with vendorName: 'AWS'). I think these should probably be defined in terraform-modules if we want to reuse them in multiple places, but I have some open questions around that that require some digging:
    • How would this behave if priority for default rules started at an out of sequence number? Say 100 (arbitrary, could be higher), to leave room for application specific rules?
    • I think the other option would be making a function that returns default rules, starting at a provided priority, and basing any following rules off of the total rule count afterwards.
    • Any other thoughts on how to handle this? Leaving it out for a second to try to get feedback, or I'll dig for the next iteration.

Implementation Decisions

Deployment steps

  • Check to ensure this deployment doesn't block all traffic.
  • Update service page once this is live and we have links to metrics.

References

JIRA ticket:
https://getpocket.atlassian.net/browse/INFRA-1167

Going with a relatively permissive initial setup here. Traffic originating
from the CDN is all permitted, and a fairly loose rate limit is in place
for direct requests.

This change requires a newer version of @pocket-tools/terraform-modules, so updated
that to the latest version. The loosely coupled version of cdktf dependencies are
also updated with that change. This includes breaking changes to this package
around imports. Changing all imports to use deep imports as the index files
have been deprecated.

I think we should also consider replicating some of the dotcom gateway rules
that are provided by AWS here as well (these rules include blocks for well known
abuse), but I'd like to export those from terraform-modules rather than copy
pasting them all over the place.
@hyperparabolic hyperparabolic requested review from a team as code owners April 24, 2023 21:56
@hyperparabolic hyperparabolic requested review from Herraj and jpetto April 24, 2023 21:56
Comment thread .aws/src/main.ts
- `x-source`: `mozilla`
- `x-forwarded-for`: <end client IP address>

For now, allow all traffic with `x-source`: `mozilla` to bypass the WAF.
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hi, making sure i understand the intent right

we want only mozilla client to use this firefox proxy and any unknown client would be rejected and not served

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR is really just getting the WAF in place so we can start iterating with smaller changes if and when they are needed.

As of right now, this first rule allows any traffic coming from the CDN (identified by the x-source header) to bypass rate limiting entirely. We could add another condition to this rule to apply rate limiting to this rather than allowing bypass. We could also change this value to be a shared secret instead if we find that sources other than the VPN start setting this header. Plenty of options to move forward if we need it.

Comment thread .aws/src/main.ts
action: { block: {} },
statement: {
rateBasedStatement: {
limit: 1000,
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this 1000 req per hour from the same IP?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://docs.aws.amazon.com/waf/latest/developerguide/waf-rule-statement-type-rate-based.html

Per 5 minutes it seems actually. This could definitely be pretty aggressively throttled down.

Copy link
Copy Markdown

@sri-kirsh sri-kirsh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how do we usually test/QA waf rules?

@hyperparabolic
Copy link
Copy Markdown
Contributor Author

how do we usually test/QA waf rules?

This is entirely unclear to me, and this is also the primary reason I'm going with relatively loose limits for this first pass. I think we could test in the dev environment, but I don't think we have a corresponding dev CDN in GCP.

Copy link
Copy Markdown
Contributor

@efixler efixler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm.

It would be good to add some comments in the file, or in the README noting the thinking around the WAF settings; you mention that these are starting points and we'll look to home them (which is totally reasonable) and it'd be good to put a little bit of that along with the how-to-check info in case the person who is checking is not you :)

Also I'm sure that this was discussed with SRE, but it'd be good to have SRE 👀 specifically on the PR too.

@efixler
Copy link
Copy Markdown
Contributor

efixler commented Apr 27, 2023

Also, just want to mention, newtab is how the largest number of people interact with Pocket and now we make most of our money! Thanks for taking good care with this!

@hyperparabolic
Copy link
Copy Markdown
Contributor Author

It would be good to add some comments in the file, or in the README noting the thinking around the WAF settings; you mention that these are starting points and we'll look to home them (which is totally reasonable) and it'd be good to put a little bit of that along with the how-to-check info in case the person who is checking is not you :)

I tried to already leave some next steps hints. I should probably just also say that I expect this to need some iteration as we move into stable and get data on reasonable limits.

I'm going to follow up with some info on the WAF on the service page once this is deployed and I know where the metrics live. I've got a few TODOs waiting for me there.

Also I'm sure that this was discussed with SRE, but it'd be good to have SRE 👀 specifically on the PR too.

Yeah, I've been talking with Brett through all of this, but good call. I'll give them an explicit heads up after updating the top level comment.

@bkochendorfer
Copy link
Copy Markdown
Member

Couple notes here:

  1. I created the ability to pass this in to the module, is there a reason you aren't using that feature? Pocket/terraform-modules@4923c29
  2. If you are worried about the default block action and want to see what it looks like first you can set that to count which will count things it would have blocked.

@hyperparabolic
Copy link
Copy Markdown
Contributor Author

Thanks for the feedback @bkochendorfer

1. I created the ability to pass this in to the module, is there a reason you aren't using that feature? [Pocket/terraform-modules@4923c29](https://github.com/Pocket/terraform-modules/commit/4923c29a870dfd291ae23da98c3a1cbab2a231bc)

I kind of got lost in the weeds here and forgot this change was made 😅. Done.

2. If you are worried about the default `block` action and want to see what it looks like first you can set that to `count` which will count things it would have blocked.

Good call.

@hyperparabolic hyperparabolic merged commit 96ae082 into main Apr 27, 2023
@hyperparabolic hyperparabolic deleted the setup-waf branch April 27, 2023 21:03
Comment thread .aws/src/main.ts
const globalRateLimitRule = <Wafv2WebAclRule>{
name: 'GlobalRateLimit',
priority: 1,
action: { block: {} },
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you probably meant to change this one to count and the other one to allow?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I might have misunderstood your intent, but this is what I intended.

Nobody should be using this service directly other than for cache debugging imo. This limit is sky high for that requirement.

I turned on a count for the other rule so we could get visibility into the request volume from the CDN, and potentially enable rate limiting based on x-forwarded-for.

hyperparabolic pushed a commit that referenced this pull request Apr 27, 2023
See #42 for more context.
#42

This included some rules that were originally defined in dotcom-gateway,
and it seems the action syntax used there has been deprecated.

I believe this should sort things out.
hyperparabolic added a commit that referenced this pull request Apr 27, 2023
See #42 for more context.
#42

This included some rules that were originally defined in dotcom-gateway,
and it seems the action syntax used there has been deprecated.

I believe this should sort things out.
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants