Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement Geolocation mocking #165

Merged
merged 33 commits into from Nov 9, 2022
Merged

Conversation

bbutkovic
Copy link
Contributor

@bbutkovic bbutkovic commented Aug 9, 2022

This PR resolves #10 by adding support for mocked Geolocation responses. By default it will return the address of Fastly's HQ in San Francisco and users can override the response based on the requested IP address by modifying the fastly.toml file in the following manner:

[local_server]
	[local_server.geolocation]
                format = "inline-toml"
                [local_server.geolocation.addresses]
		[local_server.geolocation.addresses."127.0.0.1"]
			as_name = "Dummy, Inc."
			city = "New York"
			latitude = 1.23
			longitude = 1.23

or by providing it as an external resource:

[local_server]
	[local_server.geolocation]
                format = "json"
		file = "./test-geolocation-data.json"

test-geolocation-data.json:

{
    "127.0.0.1": {
        "as_name": "Fastly Test",
        "as_number": 12345,
        "area_code": 123,
        "city": "Test City",
        "conn_speed": "broadband",
        "conn_type": "wired",
        "continent": "NA",
        "country_code": "CA",
        "country_code3": "CAN",
        "country_name": "Canada",
        "latitude": 12.345,
        "longitude": 54.321,
        "metro_code": 0,
        "postal_code": "12345",
        "proxy_description": "?",
        "proxy_type": "?",
        "region": "CA-BC",
        "utc_offset": -700
    }
}

Current progress

  • support for mocked GeoIP responses
  • support for configuration through fastly.toml
  • support for IPv6 lookups
  • tests

@aturon
Copy link
Contributor

aturon commented Aug 10, 2022

👋 hi @bbutkovic, thanks very much for taking up this work! I've been working to get in touch with stakeholders on this topic to make sure we know the design constraints. I'll be happy to review the details of the code, and @JakeChampion is going to help shepherd the overall feature/design.

In general, the folks I spoke to internally at Fastly are in support of adding mocking along the lines you are taking here! 👍

A couple bits of context/feedback on the details:

  • The precise geoip data does not have strong guarantees in terms of which fields will be available in the future. While the Rust SDK, for example, nails down a specific expectation, that could potentially change later on (and is something we'd like to refine at the SDK level). I wonder if we can take a strategy that is less rigid about precisely which fields are included, since in the end we're just providing a json blob anyway.
  • In the future, we may want to support other means of mocking, e.g. by including an external, freely available geoip database. I'd suggest we follow the approach we take for dictionaries, where there's an explicit format option, and initially we only support inline-toml. Then the actual contents would show up in a TOML table underneath.

Those two points actually work nicely together: when using inline-toml, we could just take the contents of the inner TOML table, and translate them directly to json when servicing geoip requests.

How does all that sound?

Copy link
Contributor

@JakeChampion JakeChampion left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working on this, having geoip data for local development will be really nice to have ☺️

I understand that the code will likely change a bit based on @aturon's suggestions such as adding an explicit format option but I thought I would have an initial first-pass review of the code as it currently is.

In case it is useful, this is the test plan I would use when adding similar features:

  1. Add a test-fixture/app in test-fixtures/src/bin which can be used to shows that the feature is working, this is similar to an integration test - this test would only work for valid fastly.toml configurations, the tests for invalid fastly.toml configurations are handled in the next step
  2. Add unit-tests for parsing the new configuration section of fastly.toml in lib/src/config/unit_tests.rs - The test scenarios to cover would be similar to the already existing inline_toml_dictionary_config_tests

Let me know if you want any help or to talk through any parts of the tests or implementation, I'm very happy to help ☺️

lib/src/wiggle_abi/geo_impl.rs Outdated Show resolved Hide resolved
lib/src/session.rs Outdated Show resolved Hide resolved
@bbutkovic
Copy link
Contributor Author

Thank you both for the suggestions!

@aturon:

Those two points actually work nicely together: when using inline-toml, we could just take the contents of the inner TOML table, and translate them directly to json when servicing geoip requests.

I think going with something similar to how dictionaries work is the way to go ultimately.

Perhaps we should leave a default blob of JSON with the current up-to-date specification and periodically update it just to make local development easier, but allow developers to define their own object to be returned entirely as you described, how does that sound?


@JakeChampion:

  1. Add a test-fixture/app in test-fixtures/src/bin which can be used to shows that the feature is working, this is similar to an integration test - this test would only work for valid fastly.toml configurations, the tests for invalid fastly.toml configurations are handled in the next step

  2. Add unit-tests for parsing the new configuration section of fastly.toml in lib/src/config/unit_tests.rs - The test scenarios to cover would be similar to the already existing inline_toml_dictionary_config_tests

This will be definitely helpful when I get there. I'll update the code to include tests and reach out if I hit a roadblock.


TLDR; My plan now is to rework how the configuration data is passed in the TOML file to be loose and not follow a structure, basically to work similarly to how dictionaries work at the moment. Additionally I'll include a default blob of JSON that returns a currently-valid response just so that developers have an easier time spinning up the code locally.

I'll also address the suggestions by Jake and update the code to include tests.

@aturon
Copy link
Contributor

aturon commented Aug 11, 2022

@bbutkovic That all sounds great to me!

assert_eq!(geo_v4.proxy_description(), ProxyDescription::Unknown);
assert_eq!(geo_v4.proxy_type(), ProxyType::Unknown);
assert_eq!(geo_v4.region(), Some("CA-BC"));
// assert_eq!(geo_v4.utc_offset(), Some(UtcOffset::from_hms(-7, 0, 0).unwrap()));
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This fails because for negative UTC offset (-700 in the case above) Geo::utc_offset() returns None. I also tested this on C@E by using an IP address that belongs to North America region and found that it yields a None as well.

@bbutkovic bbutkovic marked this pull request as ready for review September 13, 2022 08:21
@bbutkovic
Copy link
Contributor Author

I have a working version right now with changes as requested above and tests added. A few things to note, however:

  • Currently inline-toml entries are added as [local_server.geoip_mapping.contents."127.0.0.1"]. I think we should probably rename the contents bit to addresses because I think it makes more sense in the case above. Thoughts?
  • We can probably do with more tests. I can add more tests when I get a chance, any ideas for additional test cases?
  • At the moment developers will need to define all of the fields they would like to provide for a certain IP address, there is no 'merging' with some sane default values. Can this stay like this?

@aturon
Copy link
Contributor

aturon commented Sep 16, 2022

Hi @bbutkovic, just wanted to acknowledge your updates -- I or somebody else will work to review ASAP!

Copy link
Contributor

@mgattozzi mgattozzi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @bbutkovic I'll be taking over the PR review for Aaron! 😄 Excited to help you get this over the finish line.

Currently inline-toml entries are added as [local_server.geoip_mapping.contents."127.0.0.1"]. I think we should probably rename the contents bit to addresses because I think it makes more sense in the case above. Thoughts?

I think addresses makes more sense here rather than contents. I say make the change!

We can probably do with more tests. I can add more tests when I get a chance, any ideas for additional test cases?

What did you have in mind? I think making sure to test failure modes for the config are good. Maybe a test case to trip the BufferSizeError for a lookup, though the Rust SDK handles this case already so this might not be needed here or might be too finicky to implement without much gain. Overall though I feel you've handled most of the needed testing here imo.

At the moment developers will need to define all of the fields they would like to provide for a certain IP address, there is no 'merging' with some sane default values. Can this stay like this?

My gut instinct is this is fine since the customer will likely want to make distinct values to use and you can't really use default values, but I'll ask to see what others on the team think.

A not so fun legal note about GeoIP, but unfortunately it is trademarked. Therefore we need to replace any reference to that with Geolocation to avoid any legal issues before we can merge this.

Overall this is fantastic work and a very comprehensive PR that I enjoyed reading and reviewing. I've noted a small idiom nit and one ask around defaulting to fastly for all lookups. I'll follow up when I get other's opinions around merging sane default values!

}

pub fn read_json_contents(
file: &PathBuf,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should switch this to be &Path since PathBuf will deref to Path.

pub fn geoip_lookup(&self, addr: &IpAddr) -> String {
match self.geoip_mapping.get(addr) {
Some(geoip) => geoip.to_string(),
None => String::from(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we have this instead be a default value in the mapping for 127.0.0.1 and the ipv6 equivalent that can be overwritten in the config. By that I mean when the mapping is first initialized in GeoIPMapping::new() this entry is included for localhost and can be overwritten after when the fastly.toml is read.

The reasoning being that if a customer is trying to test doing an IP lookup for a value that isn't available they will always get a value now which can be confusing since there are cases where on C@E it will return a None value.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm just thinking if we could somehow even disable the placeholder for loopback addresses as well? Perhaps we can make it so that it returns the placeholder address only in the case when you aren't reading addresses from JSON or reading them from TOML, effectively making it work only when you use the absolute default geolocation settings. Or perhaps we can have this behind a settings flag in the TOML?

For example, sometimes you may want to test what happens to your code when the IP address of the incoming request yields no result.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bbutkovic am I understanding correctly that you'd want to add an option to remove the default loopback value?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, precisely, if you choose to provide your own resolving through a JSON or TOML you must specify the result for loopback as well, otherwise it will be a "failed" lookup.

Essentially, the default would only work if you don't use TOML or JSON.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah okay! I think what we want is sort of a hybrid approach

  • By default we provide a loopback address default
  • If it's overwritten by a value in the toml/json use that
  • You can disable the default by setting a boolean field in the config under local_server.geolocation_mapping.use_default_loopback to false and this defaults to true

I think this will let us document the behavior better and allow people to opt into turning it off or overriding it! Does that sound good to you?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That sounds good!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great! I think this is probably the last thing we'll need. I'll review it whenever you get it in (just give me an @ ping)

@bbutkovic
Copy link
Contributor Author

Hey @mgattozzi thank you for your input! I took care of changing everything to Geolocation, the idiom change you suggested as well as changed contents to addresses.

@bbutkovic bbutkovic changed the title Implement GeoIP mocking Implement Geolocation mocking Sep 22, 2022
assert_eq!(geo_v4.proxy_description(), ProxyDescription::Unknown);
assert_eq!(geo_v4.proxy_type(), ProxyType::Unknown);
assert_eq!(geo_v4.region(), Some("CA-BC"));
// assert_eq!(geo_v4.utc_offset(), Some(UtcOffset::from_hms(-7, 0, 0).unwrap()));
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mgattozzi For some reason this line fails. I even tried an IP address in California with Compute@Edge and the result was wrong. Perhaps an issue with Rust SDK?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I saw this before and I tried digging into it a little bit (not too much mind you), but I'm not sure whether it's a bug in the SDK (I haven't tried our JS one to check) or if it is totally correct. If you're seeing this on C@E then I think we can just leave this as is for now and I can open up an issue on our internal tracker to investigate it more. You can put a small note saying why it's commented out.

@mgattozzi
Copy link
Contributor

@bbutkovic just so you know I'll be on vacation next week so if you do make any updates I'll make sure someone on the team can review it, unless you get the default changes in this week!

@bbutkovic
Copy link
Contributor Author

Hey, @mgattozzi enjoy your vacation!

I managed to wrap this one up, let me know if any more modifications will be needed.

assert_eq!(geo_v4.postal_code(), "12345");
assert_eq!(geo_v4.proxy_description(), ProxyDescription::Unknown);
assert_eq!(geo_v4.proxy_type(), ProxyType::Unknown);
assert_eq!(geo_v4.region(), Some("CA-BC"));
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These region values are the subdivision part only

I was a bit confused seeing geo_v4.region() == "CA-BC" based on the description here https://developer.fastly.com/reference/vcl/variables/geolocation/client-geo-region-ascii. Should the test fixture use "BC" without the "CA-" to better represent the format returned in C@E?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right, that's a mistake on my end. I corrected it now.

Thanks for the catch!

Copy link
Contributor

@mgattozzi mgattozzi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @bbutkovic I'm back from vacation and finally have had some cycles to review this. Again thanks for your patience in getting this all working!

I have one last test to ask for here. This PR is stellar phenomenal work on your part. We just need one more test in cli/tests/integration/geolocation_lookup.rs for when use_default_loopback is set to true or not present in the manifest since it defaults to true and test that we provide a value for 127.0.0.1 for people to use. Just to cover all of our bases. Once that's in I'm more than happy to approve and merge these changes!

@bbutkovic
Copy link
Contributor Author

Hey, @mgattozzi, I updated the branch with requested changes to test default config.

@bbutkovic bbutkovic requested review from mgattozzi and removed request for JakeChampion November 9, 2022 12:06
Copy link
Contributor

@mgattozzi mgattozzi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bbutkovic thanks for all the hard work and patience with getting this in. It's an absolutely phenomenal piece of work and one that we greatly appreciate. With the final test in I'm happy to approve this!

@mgattozzi
Copy link
Contributor

@bbutkovic I'm gonna run the test suite now before I merge this and if anything happens I'll let you know 😄

@mgattozzi mgattozzi merged commit 1905d9a into fastly:main Nov 9, 2022
@mgattozzi mgattozzi mentioned this pull request Nov 17, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Implement geolocation support
5 participants