Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

encode user_id when using it in urls #296

Closed
luisrudge opened this issue Aug 29, 2018 · 19 comments · Fixed by #374
Closed

encode user_id when using it in urls #296

luisrudge opened this issue Aug 29, 2018 · 19 comments · Fixed by #374
Labels
bug This points to a verified bug in the code

Comments

@luisrudge
Copy link
Contributor

No description provided.

@mmaddex
Copy link

mmaddex commented Aug 29, 2018

Given that this would be a breaking change would we ever consider something like adding an uriEncodeIDs or uriEncodeUserIDs (I'm not sure if any other IDs ever require url escaping) option the the ManagementClients constructor that could default to false?

@luisrudge
Copy link
Contributor Author

I don't think that makes sense. If you're adding a parameter, just encode the value yourself. We'll fix this in the next major.

@davidstrauss
Copy link

I don't think that makes sense.

Are you referring to the suggestion from @mmaddex or the need to encode values prior to using them in a URL? The fact that the API client library doesn't properly escape values prior to using them in a URL is both a bug and, possibly, a security issue.

@luisrudge
Copy link
Contributor Author

luisrudge commented Sep 2, 2018 via email

@davidstrauss
Copy link

davidstrauss commented Sep 3, 2018

Why do you think it’s a bug

You're breaking abstraction.

  • A consumer of the library shouldn't need to know or care that the library is using a parameter (without proper escaping) in a URL.
  • In a ticket, we reported that we had to add escaping to use a user_id that your own SAML gateway generated. It appears that you're using base64 encoding when your SAML gateway encounters a persistent ID that contains a structured value (rather than a single string). So, certain values Auth0 is producing for the user_id are unusable with your API library without a workaround. This caused weeks of downtime for us for a customer's SAML gateway until Auth0 support suggested that we implement the bizarre workaround of URL-encoding the value before sending it to the API library.
  • If the API changes the backend transportation (e.g. sending the user_id as a form value or via something like Protocol Buffers), then a library user escaping the data expecting use in a URL will suddenly find their implementation broken.
  • Accordingly, by not properly escaping the data, Auth0 can never change how the user_id is used in other APIs by this library.

or a security issue?

The user_id value is often derived from external sources like SAML gateways. It's often used directly by Auth0 if already a string. So, it's not a value that can be trusted to not be malicious. Accordingly, it must be properly escaped before use in an API. Your current implementation apparently allows the value to go into API requests without escaping delimiters like slashes. It's the REST equivalent of an SQL injection.

I suspect that a user of the library that doesn't URL-encode user_id values before using them (with this library) may allow a malicious user_id collision. In any case, Auth0 is a security product; I shouldn't need to prove a vulnerability to get unescaped data in API requests fixed.

How do you propose that we change this without breaking every one that is already successfully using this library?

What you're doing isn't made right by any quantity of existing usage. A broken implementation isn't justified by being operational 99% of the time, especially when it's a potential security issue like this. It's Auth0's job, as a vendor, to fix this, even if it's hard. I'm not fond of how you're suggesting it's my burden -- as a customer, FOSS developer, or security engineer -- to find a solution to a problem that Auth0 created.

If adding proper URL escaping breaks existing implementations, it's a sign that the implementations and library were already mishandling data.

@luisrudge
Copy link
Contributor Author

A consumer of the library shouldn't need to know or care that the library is using a parameter (without proper escaping) in a URL.

I fully agree with you.

Your current implementation apparently allows the value to go into API requests without escaping delimiters like slashes. It's the REST equivalent of an SQL injection.

I'm asking our security team to go over this so we can assess the security implications of this bug.

I'm not fond of how you're suggesting it's my burden -- as a customer, FOSS developer, or security engineer -- to find a solution to a problem that Auth0 created.

I'm sorry you feel this way. That wasn't my intention at all. I was merely asking for any ideas you might have on how to fix this without breaking everyone else that is already encoding the parameter.

If adding proper URL escaping breaks existing implementations, it's a sign that the implementations and library were already mishandling data.

Again, I agree with you. That's why this issue is still open. Rest assured this will get fixed. While I don't have any estimates on this, I'm sure we'll get this sorted out in the near future.

@rikaardhosein
Copy link

rikaardhosein commented Sep 3, 2018

@davidstrauss Your other points aside, if I'm understanding you correctly, this is not a security issue. From a security perspective, it is never the responsibility of an untrusted client to send properly encoded, escaped data to the server. The burden is on the server to properly sanitize incoming data before using it.

If a vulnerability exists here, then there's nothing stopping someone from just not using the client library and exploiting it anyway.

Edit: After thinking about it some more, I might have misunderstood you. I can see how this can lead to some weird cases on the client side.
Are you talking about a scenario like:

Completely hypothetical -: endpoint does not exist
/api/users/rikaard/sensitive_information

where rikaard/sensitive_information was the user_id returned

@davidstrauss
Copy link

From a security perspective, it is never the responsibility of an untrusted client to send properly encoded, escaped data to the server.

I agree, but I'm not sure how this applies to the situation. A "rule" in Auth0 ought to be able to trust this library to properly escape data before it reaches an HTTP request to the API. Rules may handle user ID values derived from something like a SAML assertion. A rule may then call the API (via this library) to read or update data for the user in the process of logging in. Because this library doesn't properly escape the value, it may be possible to trick the rule into updating a different user by sending a user ID with a URL delimiter or escaping that will decode to a different user ID.

Here's a case where it would definitely be a security issue. Let's say we've annotated the user profile with fields listing additional constraints to that user's authentication. Perhaps it's a 2FA or region requirement. If the user profile isn't found because of lack of escaping, we may not apply those constraints. We also couldn't just refuse login on lack of a profile, as first-time logins may not have one yet.

@rikaardhosein
Copy link

Yeah, I get you now. I misunderstood where you were coming from at first. I can definitely understand how this can be a security risk for people using the node-auth0 library since it goes against expected behaviour. We'll work on getting this fixed. Thank you!

@davidstrauss
Copy link

davidstrauss commented Sep 3, 2018

I'm sorry you feel this way. That wasn't my intention at all. I was merely asking for any ideas you might have on how to fix this without breaking everyone else that is already encoding the parameter.

Thank you for clearing up what you're requesting, but please consider how you present such requests in the future. Combined with asking if it was even a bug, it was hard to read it as anything other than resistance to addressing the issue.

Here is what I would suggest:

  1. Introduce a new, optional parameter to library functions that accept userId values. I'd suggest a Boolean that's something like escapeUserId and defaults to false (for the transition). When the value is true, the library should escape the data before using it in API requests. This would allow us to transition our code with the current workaround to something supportable for longer.
  2. For calls with escapeUserId == false, log warnings when the library receives unsafe characters in userId arguments. Perform outreach (like with the Lock widget issue) to customers that get these warnings. Two types of customers will get the warnings. The first type are ones sending in sensitive characters without realizing it. The second type are ones like us who are already escaping. Naturally, escaping also creates characters that set off the alarm. The customers who are already escaping would update their code to not escape but set escapeUserId == true. Customers sending in sensitive characters without realizing it may need to see if they have other issues to solve involving corrupted userId values, duplicate profiles, or information disclosure.
  3. Switch to a default of escapeUserId == true. Assuming outreach was successful, this shouldn't break customer code. I would have the library log warnings for any invocations with escapeUserId == false, as users really shouldn't be doing that, and it would only occur for code invoking the library intentionally that way.
  4. Retire the escapeUserId parameter on the next major library release. Most users probably won't even notice unless they were involved in the outreach from part two.

@luisrudge
Copy link
Contributor Author

@davidstrauss Thanks for the thoughtful suggestion. I'll keep it in mind when fixing this.

I also talked with a couple of coworkers and we had another idea as well. I'll work with the data team tomorrow to find if we have any users with a % in the user_id property. I don't think that is the case. If we don't have any users with that character, I think we can just add a check like this:

if (!user_id.includes('%')) {
  user_id = encodeURIComponent(user_id);
}

If this works, there's no extra overhead for anyone already encoding the user_id. In the next major, we can just document that we'll always encode it by default in the changelog.

If it doesn't work, I'll use something similar to your suggestion so we move forward with the fix.

Thanks for the patience!

@davidstrauss
Copy link

davidstrauss commented Sep 4, 2018

@luisrudge Activating encoding only when there's no percentage sign would cause slashes and other special characters to remain (potentially) exploitable. A malicious userId would simply need to contain a % to avoid getting further escaping. I could send in a userId like anotherUserId/donotescapeme%23, and the library would cause this library + API to process it as anotherUserId.

@davidstrauss
Copy link

To provide a more actionable alternative, I'd encourage escaping when the provided userId is found to contain a sensitive character like /. It's the slashes, specifically, that allow a lot of control to create malicious (or even accidental) collisions.

if (user_id.includes('/')) {
  user_id = encodeURIComponent(user_id);
}

Slashes may not be the only character that allows tricking rules into accessing the API in a way that reads or manipulates a different user than the active one, though.

@luisrudge
Copy link
Contributor Author

@davidstrauss sorry, I forgot to update you on this.

A malicious userId would simply need to contain a % to avoid getting further escaping

I've been talking with the security team and co workers and we got to the same conclusion.

Slashes may not be the only character that allows tricking rules into accessing the API in a way that reads or manipulates a different user than the active one, though.

I'll evaluate this with the security team as well.

My idea for a minor version was to decode the user_id recursively until it's a "raw" string again (to prevent an attacker for encoding a string twice, for example) and compare the end result with the provided user_id.

  • If they match, then I can encode it safely
  • If they don't match, then it's probably already encoded, so we leave it as is

Then, in the next major, I remove that fix, always encode the param and document the change.

I think that's going to work. I'll write some test cases tomorrow to make sure I'm not missing anything

@davidstrauss
Copy link

davidstrauss commented Sep 4, 2018

If they don't match, then it's probably already encoded, so we leave it as is

A malicious userId of anotherUserId/donotescapeme%23 would pass this test and be considered "already encoded."

If you're looking for a way to test for the string being already encoded, I would count the percent signs in the provided userId, run a test URL encode on it, count the percent signs in the encoded result, and compare the counts. If the test encoding has a greater number of percent signs, it means that there are still characters that ought to be escaped. If the string is already URL-encoded or otherwise safe, then the count will not go up, as a second round of encoding won't increase the number of percent signs.

@luisrudge
Copy link
Contributor Author

@davidstrauss Thanks again for all your patience on this.

After pairing with @rikaardhosein, we came up with this solution:

function containsUnsafeChars(s) {
  const safeChars =
    "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ$-_.+!*'(),%";
  return !!s.split("").find(c => !safeChars.includes(c));
}

const maybeDecode = url => {
  if (containsUnsafeChars(url)) {
    return encodeURIComponent(url);
  }
  return url;
};

test("encodes regular user_id", () => {
  const normalId = "auth0|1234";
  expect(maybeDecode(normalId)).toBe("auth0%7C1234");
});
test("encodes a malicious user_id", () => {
  const maliciousId = "anotherUserId/secret";
  expect(maybeDecode(maliciousId)).toBe("anotherUserId%2Fsecret");
});
test("encodes a malicious user_id with a percentage", () => {
  const maliciousId = "anotherUserId/secret%23";
  expect(maybeDecode(maliciousId)).toBe("anotherUserId%2Fsecret%2523");
});
test("does not encode an already encoded user_id", () => {
  const maliciousId = "auth0%7C1234";
  expect(maybeDecode(maliciousId)).toBe("auth0%7C1234");
});

This mitigates the risk of allowing a malicious id to hit a different endpoint. This is our proposed short-time solution for this issue (until a new major version is released).

We still need to do some internal review of this, but I wanted to be transparent on what are our plans for this and also gather any feedback you might have on this solution. Thanks again!

@davidstrauss
Copy link

davidstrauss commented Sep 5, 2018

@luisrudge Thanks for continuing to follow up. I believe the proposed solution fixes the primary issues I raised.

There remain ways to manipulate the encoding behavior, but I think this gets things to the the best possible place without altering the library's API. Most importantly, it prevents an attacker from supplying a user_id that will load or change data for an arbitrary different one (for common usage patterns of this library in rules and elsewhere). Creating a malicious collision would likely require controlling both user_id values now.

@luisrudge luisrudge added the bug This points to a verified bug in the code label May 23, 2019
@luisrudge luisrudge mentioned this issue Jul 8, 2019
1 task
@luisrudge
Copy link
Contributor Author

Released in 2.18.1

@davidstrauss
Copy link

Released in 2.18.1

Thank you for your work here!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug This points to a verified bug in the code
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants