HTTPS by default #34

Closed
konklone opened this issue Mar 17, 2014 · 14 comments

@konklone

HTTPS is already configured at https://api.data.gov. So, api.data.gov should make it the recommended and default protocol.

I'm sure the team here already understands the various benefits of HTTPS, and I don't mean to preach to the choir, but here are a few:

  • Broader client-side compatibility. An API under api.data.gov that supports CORS or JSONP for in-browser use will be completely blocked on HTTPS websites, as active mixed content, if it is requested over plain HTTP.
  • Enhanced privacy for apps and users using APIs at api.data.gov -- HTTP headers and query string parameters (among other things) will be encrypted. For APIs accepting particularly sensitive data (like lat/long or zip), this is critical.
  • The primary benefit of HTTPS: a stronger guarantee that you're speaking with the real api.data.gov.

Much of the performance hit involved in SSL can be mitigated - @igrigorik's excellent istlsfastyet.com contains many references for doing this.

There are a few ways this issue could be tackled:

  • Update the documentation to always use https:// in the examples.
  • Update all links to api.data.gov to use https://.
  • Begin redirecting all http:// requests to https:// for the main api.data.gov website.
  • Begin redirecting all http:// requests to https:// for any contained API.
  • Set an HSTS header that tells clients to always use https:// going forward.
@GUI
Member

GUI commented Mar 18, 2014

Great suggestion. At launch we didn't have HTTPS available on this api.data.gov subdomain, so a lot of this stems from that, but I agree HTTPS should be the default. I'll get the documentation updated soon.

However, I think HTTPS redirection might be slightly more problematic to introduce for existing APIs. While technically HTTP redirects should be followed by clients, I've experienced my share of problems with "dumb" clients not following redirects, so introducing them to an API can cause things to break. This is never really a problem in web browsers that properly follow the location headers, but if you're fetching an API using a lower-level HTTP library (as is relatively common for APIs), you often have to explicitly account for redirections in your code. You could definitely argue that redirects should be handled by a proper client, but in my experience that often doesn't happen, and by starting to return redirects I'm just afraid we'll break existing API usage.
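To illustrate what "explicitly accounting for redirections" can look like in client code, here is a minimal Python sketch. The `send` callable is a hypothetical stand-in for whatever low-level HTTP call a client makes (it is not part of any real library), and the default limit of 5 hops is an arbitrary assumption. Note that a client following an http:// to https:// redirect this way has still sent its first request unencrypted.

```python
from urllib.parse import urljoin

# Status codes treated as redirects in this sketch.
REDIRECT_STATUSES = {301, 302, 303, 307, 308}


def fetch_following_redirects(send, url, max_hops=5):
    """Issue a GET via `send` and follow redirects by hand.

    `send` is a hypothetical transport callable: send(url) returns a
    (status, headers, body) tuple, where headers is a dict that uses the
    exact key "Location" for the redirect target.
    Returns (status, final_url, body).
    """
    for _ in range(max_hops + 1):
        status, headers, body = send(url)
        location = headers.get("Location")
        if status not in REDIRECT_STATUSES or not location:
            # Not a redirect (or an unusable one): hand the response back.
            return status, url, body
        # Location may be relative, so resolve it against the current URL.
        url = urljoin(url, location)
    raise RuntimeError(f"gave up after {max_hops} redirects")
```

A client that skips this loop simply sees the 301 and an empty body, which is the kind of breakage described above.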

If you have any experience introducing HTTP redirects for APIs, though, I'd love to hear about it (maybe it's not as big of a deal as I think it might be). Or if you, or anyone else, has more general thoughts on this potential redirection issue, those would be great to hear. If everyone thinks tackling forced HTTPS is important enough, one option is to try and tackle this migration with all our current users (notify them of the upcoming change, give them ample time to change their code, give them more warnings, and then finally roll out the change at a later date).

Thanks for bringing this up!

@konklone
Author

Actually, I had the exact same concern about redirection that you've expressed here - when I announced HTTPS by default in January for an API I manage in my work, I didn't include HSTS or forced redirection, and we continue to support HTTP. This is primarily because I wanted to make that announcement right away, and so I put off tackling the harder aspects.

It's hard for me to know how difficult it is for api.data.gov to migrate its users, but maybe we can trade experiences together here. I'm recording the user agent of all requests, and just in the last few minutes started capturing the protocol of all requests in propublica/sunlight-congress@bb86221. That'll help me get a sense of how much http: traffic there is, and what API keys among those are using user agents other than web browsers.

Over the last couple months, I've tried to bend the curve of http: traffic downwards by converting our own applications that dogfood the API to exclusively make HTTPS queries. I also went and filed pull requests on the most popular Python, Ruby, and Node libraries that people use to integrate with our API, all of which were merged and released. (I actually just went and filed a PR with a PHP library a few minutes ago.)

I've also just trolled around GitHub's code search looking for domain references, and filed random pull requests updating the protocol wherever it looked like the code was in use. There aren't too many for api.data.gov, though of course they could be referenced in private repos and indirectly. On that note, I could probably go further by searching for people who are using outdated versions of the main client libraries I pushed to use https:.

Finally, I could also announce a migration window before we disable HTTP support, and send a warning email to all API keys that have hit our API with an http: protocol and a non-browser user-agent over some span of time.
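As a rough illustration of that last step, the Python sketch below picks out the API keys to warn from a batch of request records. The record fields (`api_key`, `protocol`, `user_agent`) and the browser check are assumptions made for this example, not the actual schema or logic used for the Congress API; the "starts with Mozilla/" test is a deliberately crude heuristic.

```python
def looks_like_browser(user_agent):
    """Crude heuristic (an assumption): most browsers send a
    User-Agent string that starts with "Mozilla/"."""
    return (user_agent or "").startswith("Mozilla/")


def keys_to_notify(records):
    """Return the sorted API keys that made at least one unencrypted
    request from something that does not look like a browser.

    `records` is an iterable of dicts with the hypothetical fields
    "api_key", "protocol" ("http" or "https"), and "user_agent".
    """
    keys = set()
    for record in records:
        if record["protocol"] == "http" and not looks_like_browser(record["user_agent"]):
            keys.add(record["api_key"])
    return sorted(keys)
```

Browser traffic is left out here because those requests usually come from a page the key owner may not control directly, and they can be handled separately.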

Hopefully me vocalizing my inner monologue provides some ideas for what to do next -- I'm game for pushing a migration window if you are. 😸

@konklone
Author

Yesterday, I sent out the below email to ~170 people who had made unencrypted requests in a 3-day span to our Congress API.

Hi,

This is the Sunlight Foundation!

We've noticed that in the last few days, your Sunlight Foundation API key has been making unencrypted requests to our Congress API.

We have an encrypted (https://) endpoint that we would like you to use instead. We are considering disabling or redirecting unencrypted use of the API in the future.

We feel strongly that usage of our API should be encrypted, so as to protect the confidentiality and privacy of you and your users. We recently explained why we feel strongly about this on our blog.

Fixing this is likely to be very simple:

  • If you use a helper library to use our API, check to see if they've shipped an updated version that uses our HTTPS endpoint. If so, update your version of that library. If not, request that they do so, or make the change yourself in your copy.
  • If you hit our API directly, without a helper library, toggle http:// to https://. In most contexts, this is all you need to do.

If you have any issues addressing this, you can reply to me directly, or write to our API discussion list. I'm happy to answer any questions that may come up.

Thanks,
Eric Mill

(Actually, about half of them got a more strongly worded variant, if their requests were sending user lat/longs or zip codes.) And this was just from usage in the last 3 days, as that's the size of our rolling analytics window at the moment -- I'm working on expanding it. There are probably more than 170 keys making unencrypted requests.

I've fielded ~25 responses from users since then, most of which were "Oh! Thanks, I fixed it" or "Oh! Thanks, I'll fix it". At least one of them was using the PHP client that I successfully nudged the week before to ship an update using HTTPS. For many others, it was just a one-character change. A few people had no idea they were even using our API, but either changed it or didn't care if their use was cut off.

No one complained! The quick, positive response makes me feel good about:

  1. Sending another blast from a longer analytics window, to people not yet emailed
  2. Announcing a "we're going to force HTTPS" window for upgrading.

No. 2 is something to talk over more here before actually doing it. But I feel good about the whole thing so far.

@konklone
Author

This thread was the inspiration for some of the content that @nacin wrote up for this project: https://https.cio.gov/apis/

As api.data.gov migrates, feeding some of the experience back into that page, managed for now at whitehouse/https, would be awesome for other agencies looking to do something similar.

@GUI
Member

GUI commented Mar 23, 2015

I saw that API page when the HTTPS site got released and thought it sounded vaguely familiar from our conversation here. :) But awesome work to everyone involved in the HTTPS standard--it looks really great.

But as an extremely belated followup, we are finally planning to at least tackle part of this for api.data.gov. In the next couple weeks we're planning to roll out functionality that will force all new APIs we host to HTTPS (and you can thank the HTTPS standard site going live for jogging my memory). Even though I've been trying to informally encourage all new users to use HTTPS, it easily gets missed as soon as people start exchanging links with HTTP in them. This should at least help us stem the tide of non-HTTPS APIs and ensure all new agencies moving forward on api.data.gov will be HTTPS-only (I just wish we had tackled this earlier). I may also try to implement something to try to ensure that new API keys that are signed up for are only used via HTTPS, even for existing APIs (but I'm less sure about how feasible that will be to implement).

Then the second part of this task becomes migrating existing APIs to HTTPS-only. As we've discussed, that's a little more complicated and will require a little more coordination/communication, but it's certainly doable. But we at least wanted to start with the simpler task of ensuring new clients are HTTPS-only, and then we can begin to work with existing agencies on potential migration.

And yeah, I'd definitely be happy to share our experiences and things we learn.

Thanks again for everything you're doing to push HTTPS forward, and sorry for the super-long delay in following up!

@konklone
Author

> In the next couple weeks we're planning to roll out functionality that will force all new APIs we host to HTTPS (and you can thank the HTTPS standard site going live for jogging my memory).

Awesome.

> I may also try to implement something to try to ensure that new API keys that are signed up for are only used via HTTPS, even for existing APIs (but I'm less sure about how feasible that will be to implement).

That's fascinating. If that is feasible, that's exactly the kind of case study I'd like to document.

> Then the second part of this task becomes migrating existing APIs to HTTPS-only. As we've discussed, that's a little more complicated and will require a little more coordination/communication, but it's certainly doable. But we at least wanted to start with the simpler task of ensuring new clients are HTTPS-only, and then we can begin to work with existing agencies on potential migration.

Absolutely. I don't underestimate the challenge. The kinds of steps you think of will also be useful.

> Thanks again for everything you're doing to push HTTPS forward, and sorry for the super-long delay in following up!

Thank you for being so supportive of the transition!

@GUI
Member

GUI commented Mar 30, 2015

This has been rolled out to production. Any new API backends we set up will default to HTTPS-only access. All existing APIs continue to be accessible over HTTP or HTTPS, but I've introduced a couple new "transitionary" options that can be applied to individual API backends to hopefully aid in eventually switching these over to HTTPS.

The transition options allow for any existing API keys to continue accessing APIs over HTTP (to prevent breaking usage for current users). However, any new API keys created after the transition mode is put into place on an API will be forced to use HTTPS. As a separate task, I think we can talk with existing agencies about at least putting their APIs into this transition mode. It should be a very safe option, since it doesn't affect existing users. The only reason I didn't default everyone to this already is that I think we should talk to agencies about this first and help ensure that their docs are using HTTPS links. This will just ensure new users have a smooth experience and don't encounter docs referencing insecure HTTP links, which would result in possible errors for the new users.
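The transition rule can be sketched as a small Python predicate. This is only an illustration of the rule as described in this comment, not the API Umbrella implementation; the function and parameter names are hypothetical, and treating a key created at exactly the cutover moment as an existing key is an assumption.

```python
from datetime import datetime, timezone


def key_must_use_https(key_created_at, transition_started_at):
    """Transition rule: only keys created after the transition mode was
    enabled on the API are forced onto HTTPS."""
    return key_created_at > transition_started_at


def request_allowed(scheme, key_created_at, transition_started_at):
    """HTTPS requests always pass; HTTP requests pass only for keys
    that existed before the transition mode was enabled."""
    if scheme == "https":
        return True
    return not key_must_use_https(key_created_at, transition_started_at)


# Hypothetical cutover date for one API backend.
cutover = datetime(2015, 3, 30, tzinfo=timezone.utc)
old_key = datetime(2014, 6, 1, tzinfo=timezone.utc)
new_key = datetime(2015, 4, 15, tzinfo=timezone.utc)
```

With these sample dates, `request_allowed("http", old_key, cutover)` is true while `request_allowed("http", new_key, cutover)` is false, so existing integrations keep working and no new HTTP usage is created.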

When picking your API's HTTPS requirements, you can also pick whether to redirect http users to https, or whether to return a straight-up error to http users. While redirects would technically allow for a smoother transition in some cases, there are a few reasons I think redirects aren't ideal for API usage. So for new APIs, I'm defaulting to an error message for HTTP access that instructs the user to use an HTTPS link instead.

Here's the help we provide in the admin that explains the options and some of the reasoning in a bit more detail:

Choose whether HTTPS is required to access this API. HTTPS is encouraged to protect the API keys.

  • Required & return message: HTTPS is required to access the API. HTTP requests will return an error message instructing the user to use an HTTPS URL instead. This is the recommended and default strategy.
  • Required & return redirect: HTTPS is required to access the API. HTTP requests will return a redirect to the HTTPS URL.
  • Transitionary & return message: New API keys that sign up after choosing this setting will be forced to use HTTPS. Existing API keys may continue to use either HTTP or HTTPS. HTTP requests made with new API keys will return an error message instructing the user to use an HTTPS URL instead.
  • Transitionary & return redirect: New API keys that sign up after choosing this setting will be forced to use HTTPS. Existing API keys may continue to use either HTTP or HTTPS. HTTP requests made with new API keys will return a redirect to the HTTPS URL.
  • Optional: HTTPS is optional and either HTTP or HTTPS may be used.

Notes on redirects:

  • Not all API clients will automatically follow redirects, so be careful if using a redirect strategy for existing APIs (since existing calls may break).
  • If API clients rely on the redirect for HTTPS access, this strategy does not secure the API keys, since the client may still be making an insecure initial HTTP request with their API key.
  • For GET requests, a 301 Moved Permanently redirect will be returned. For all other HTTP methods, a 307 Temporary Redirect will be returned (to instruct the client to retry using the same HTTP method).
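
The status-code rule in the last note can be sketched in Python as follows. This is an illustration of the rule as stated, not the gatekeeper's actual code; the function name is hypothetical, and the sketch assumes the incoming URL carries no explicit port (an explicit `:80` would be copied into the HTTPS URL unchanged).

```python
from urllib.parse import urlsplit


def https_redirect(method, url):
    """Build the redirect for a request that arrived over plain HTTP.

    GET gets a 301; every other method gets a 307 so that the client
    retries with the same method (and body) against the HTTPS URL.
    Returns (status, headers).
    """
    # Swap only the scheme; host, path, and query string are preserved.
    location = urlsplit(url)._replace(scheme="https").geturl()
    status = 301 if method.upper() == "GET" else 307
    return status, {"Location": location}
```

For example, a POST to an http:// URL yields a 307 pointing at the same path and query string under https://.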

As I mentioned, I think we still have some work to do to transition existing agencies to HTTPS, but I think we can tackle that as a separate issue. Let me know if anyone has any other thoughts on any of this or on some of the implementation details.

Implemented in NREL/api-umbrella-gatekeeper#10, NREL/api-umbrella-web#9, and NREL/api-umbrella-router#5

@konklone
Author

> This has been rolled out to production. Any new API backends we set up will default to HTTPS-only access.

👏 👏 👏 👏 👏 👏 🔒 🔒 🔒 🔒 🔒 🔒 🎊 🎊 🎊 🎊 🎊 🎊

That's so great, @GUI! And this is thorough work.

> All existing APIs continue to be accessible over HTTP or HTTPS, but I've introduced a couple new "transitionary" options that can be applied to individual API backends to hopefully aid in eventually switching these over to HTTPS. The transition options allow for any existing API keys to continue accessing APIs over HTTP (to prevent breaking usage for current users). However, any new API keys created after the transition mode is put into place on an API will be forced to use HTTPS.

That's so smart. It effectively means that the "surface area" of clients that APIs need to push to migrate can be immediately capped, even as the actual transition process might drag out over the long term.

> As a separate task, I think we can talk with existing agencies about at least putting their APIs into this transition mode. It should be a very safe option, since it doesn't affect existing users. The only reason I didn't default everyone to this already is that I think we should talk to agencies about this first and help ensure that their docs are using HTTPS links. This will just ensure new users have a smooth experience and don't encounter docs referencing insecure HTTP links, which would result in possible errors for the new users.

Makes total sense.

> When picking your API's HTTPS requirements, you can also pick whether to redirect http users to https, or whether to return a straight-up error to http users. While redirects would technically allow for a smoother transition in some cases, there are a few reasons I think redirects aren't ideal for API usage. So for new APIs, I'm defaulting to an error message for HTTP access that instructs the user to use an HTTPS link instead.

Also makes total sense.

> As I mentioned, I think we still have some work to do to transition existing agencies to HTTPS, but I think we can tackle that as a separate issue. Let me know if anyone has any other thoughts on any of this or on some of the implementation details.

👍 Feel free to tag me on any other issues you open around this. I'll be trying to translate api.data.gov's transition experience into recommendations for other agencies.

@nacin

nacin commented Mar 30, 2015

@GUI, this is fantastic. I love the idea about using API keys to control HTTPS-only access, this is so well done.

And now for a rant about redirects. :-)

I think redirects should be avoided if at all possible. https://https.cio.gov/apis/ goes into very specific detail as to why it's best to simply not use them, so I won't rehash what's already there. I would very much hesitate making it an option — I don't think it's a decision that can easily be made by choosing from a menu of options.

I do think there are limited situations where a redirect can be OK. Broadly, that'd be CORS, JSONP, etc. Narrowly, I do agree you can redirect read-only APIs if you're willing to accept the risk, as you basically lay out here. But I wouldn't want to emphasize that — it's more likely than not too risky. Any 30x is going to break some clients that aren't following redirects; any HTTPS URL might break some clients that aren't configured properly for HTTPS; a 307 is going to break clients that don't understand it; etc. I know this is stated in the notes (Is it worth linking these notes to https://https.cio.gov/apis/?), but I'd suggest even greater de-emphasis on redirects as a sub-par solution compared to the other options.

The nuance here is why https://https.cio.gov/apis/ simply says "don't use them" – it's easier to have blanket guidance, the same way the HTTPS-only proposal does not make exceptions for so-called non-sensitive content.

I would also strike the "Transitionary & return redirect" option because, as you noted, it's inherently insecure. If their HTTPS-only API key never works over HTTP (as in, returns an error instead of a redirect), then they won't accidentally do something that leaks the API key over HTTP (perhaps only in development when trying it out).

@nacin

nacin commented Mar 30, 2015

Also — I would love to expand https://https.cio.gov/apis/ to talk more about API keys and specifically the tactics outlined here.

@konklone
Author

> Also — I would love to expand https://https.cio.gov/apis/ to talk more about API keys and specifically the tactics outlined here.

Me too! While we should probably avoid changing https.cio.gov much during the comment period, I know I'm making a private branch or two with some other work I think should be added to the site after the comment period closes. We can share with each other over ngrok or whatever in the meantime, too.

@GUI
Member

GUI commented Mar 30, 2015

@nacin: Thanks for the feedback! I agree with everything you've said about redirects. I added the redirect support with the general thought that it might theoretically help as a transition path towards HTTPS for specific APIs, but the more I think about it, and after reading your comments, I think you're right--redirects don't really help matters much, and probably just confuse things.

We could certainly remove the redirect options altogether at this point. Can anyone think of any reasons to keep either redirect option (required or transitionary) at all?

If we do want to remove those options, it would also probably simplify the options for admins and prevent them from taking a path that we don't really want to encourage. If we remove things, the new simplified options would be:

  • Required: HTTPS is required to access the API. HTTP requests will return an error message instructing the user to use an HTTPS URL instead. This is the recommended and default strategy.
  • Transitionary: New API keys that sign up after choosing this setting will be forced to use HTTPS. Existing API keys may continue to use either HTTP or HTTPS. HTTP requests made with new API keys will return an error message instructing the user to use an HTTPS URL instead.
  • Optional: HTTPS is optional and either HTTP or HTTPS may be used.

@konklone
Author

> Can anyone think of any reasons to keep either redirect option (required or transitionary) at all?

The only reason I can think of would be an API with heavy existing CORS/JSONP usage. However, you can address that more straightforwardly by adding an HSTS header to api.data.gov.

And I think you can add that HSTS header pretty much any time, as browsers are the only API clients that will obey HSTS headers, and by obeying them they should Just Work and only request HTTPS.
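
A minimal Python sketch of adding that header is below. The function names and the one-year default `max-age` are assumptions chosen for illustration, not api.data.gov's configuration. The header is attached only to HTTPS responses, since browsers ignore `Strict-Transport-Security` when it arrives over plain HTTP (RFC 6797).

```python
def hsts_header(max_age_seconds=31536000, include_subdomains=False):
    """Build the Strict-Transport-Security header as a (name, value) pair.

    The one-year default max-age is an assumption for this sketch.
    """
    value = f"max-age={max_age_seconds}"
    if include_subdomains:
        value += "; includeSubDomains"
    return "Strict-Transport-Security", value


def with_hsts(scheme, headers, max_age_seconds=31536000):
    """Return a copy of `headers` with HSTS added for HTTPS responses.

    Responses served over plain HTTP are returned unchanged, because
    browsers disregard the header on insecure connections.
    """
    result = dict(headers)
    if scheme == "https":
        name, value = hsts_header(max_age_seconds)
        result[name] = value
    return result
```

Once a browser has seen the header on one HTTPS response, it rewrites later http:// requests to that host itself, so in-browser CORS and JSONP callers need no redirect.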

@konklone
Author

> However, you can address that more straightforwardly by adding an HSTS header to api.data.gov.

...and to APIs using custom domain names, as long as those custom domain names have HTTPS enabled.

GUI added a commit to NREL/api-umbrella that referenced this issue Feb 20, 2017
The redirect options were never actually implemented when we
transitioned to the Lua-based internal in v0.9.0. However, they were
left in the interface, but were non-functional.

Since we had been discussing removing the redirect options since as soon
as we added them
(18F/api.data.gov#34 (comment)),
this goes ahead and removes these broken options, since they haven't
been functional since v0.9.0.