Is it safe to change the alphabet used? #15

Closed
jazbek opened this issue Jan 25, 2017 · 3 comments

@jazbek

jazbek commented Jan 25, 2017

We are hoping to use this for creating order IDs that can't be guessed, and are finding that using mixed case is not ideal for a couple reasons.

If we take out all the uppercase letters, would there be any issues that arise with regard to uniqueness?

@ocram ocram added the question label Jan 25, 2017
@ocram
Contributor

ocram commented Jan 25, 2017

Thanks, very good question!

Changing the alphabet is fine. But you have to remember four things:

  • Change it once and stay with that modified alphabet forever. If you change the alphabet again later, or if you somehow "lose" your alphabet, all IDs transformed so far will become invalid, i.e. they will decode to the wrong resources.
  • Shortening the alphabet makes your transformed IDs longer.
  • Extending the alphabet makes your transformed IDs shorter. You don't have to worry about this if obfuscation is all you need, of course.
  • Your alphabet must never contain duplicate characters.
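
To make the four points above concrete, here is a minimal sketch of a generic base-N conversion over a custom alphabet. It is not this library's actual implementation; the function names and both alphabets are made up for illustration.

```python
def check_alphabet(alphabet: str) -> None:
    """Reject alphabets that cannot give a reversible mapping."""
    if len(alphabet) < 2:
        raise ValueError("alphabet needs at least two characters")
    if len(set(alphabet)) != len(alphabet):
        raise ValueError("alphabet must not contain duplicate characters")


def encode(number: int, alphabet: str) -> str:
    """Write a non-negative integer using the alphabet's characters as digits."""
    check_alphabet(alphabet)
    if number < 0:
        raise ValueError("number must be non-negative")
    base = len(alphabet)
    if number == 0:
        return alphabet[0]
    digits = []
    while number > 0:
        number, remainder = divmod(number, base)
        digits.append(alphabet[remainder])
    return "".join(reversed(digits))


def decode(text: str, alphabet: str) -> int:
    """Reverse encode(); only correct with the very same alphabet."""
    check_alphabet(alphabet)
    base = len(alphabet)
    index = {char: position for position, char in enumerate(alphabet)}
    number = 0
    for char in text:
        number = number * base + index[char]
    return number


# Hypothetical alphabets, chosen only for this example.
MIXED_CASE = "23456789bcdfghjkmnpqrstvwxyzBCDFGHJKLMNPQRSTVWXYZ"
LOWER_ONLY = "23456789bcdfghjkmnpqrstvwxyz"

order_id = 123456789
short_form = encode(order_id, MIXED_CASE)
long_form = encode(order_id, LOWER_ONLY)
print(len(short_form), len(long_form))
print(decode(long_form, LOWER_ONLY) == order_id)
print(decode(long_form, MIXED_CASE) == order_id)
```

In this sketch the lowercase-only alphabet yields the longer string, and decoding with a different alphabet than the one used for encoding silently returns a different number, which is why the alphabet has to stay fixed.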

And when using this library in general, you have to keep in mind that the ID transformation is "secure" only in the sense that it's "security through obscurity". If there is a "determined guesser", they will probably find a way to reverse the transformation. Against a layperson, however, that transformation will easily be enough as a defense.

In addition to that, an attacker will still be able to observe whether an ID is short (e.g. ms7) or long (e.g. m29slqmdf8), where the first one is a smaller number and the second one is a larger number, no matter what alphabet you're using. If you want to get rid of that property as well, and if you happen to work in PHP, you can use PHP-IDs, which is an extension of this library here.
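
The length leak can be illustrated with a small sketch that only counts digits in a given base. The helper name `encoded_length` is hypothetical, and the bases 28 and 49 are just example alphabet sizes.

```python
def encoded_length(number: int, base: int) -> int:
    """Number of digits needed to write a non-negative integer in the given base."""
    if base < 2 or number < 0:
        raise ValueError("need base >= 2 and a non-negative number")
    length = 1
    while number >= base:
        number //= base
        length += 1
    return length


# Whatever the alphabet size, the larger number needs more characters.
for base in (28, 49):
    print(base, encoded_length(1_000, base), encoded_length(10**12, base))
```

A larger alphabet shortens both strings, but the ordering by length stays the same.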

Finally, if you need to make guessing even harder, try non-sequential IDs such as UUIDs.

Does this help?

@jazbek
Author

jazbek commented Jan 25, 2017

This definitely helps, thanks for the detailed response! FWIW we are using MySQL short UUIDs and then encoding them with this library.

We need the IDs to be easy(ish) to type using a touchscreen keyboard, so that's why we're not using just regular UUIDs. Unfortunately we've found that both the short UUID and the resulting encoded string are sequential, but I think now that we're throwing letters into the mix, the average person won't be able to guess, which is all we need.

Thanks again!

@ocram
Contributor

ocram commented Jan 25, 2017

> the average person won't be able to guess, which is all we need

No, they definitely won't! If that's really all you need, you should be fine.

All in all, that seems like a pretty reasonable solution for your use case. Especially in combination with MySQL's UUID_SHORT(), this will work well.

Just make sure that the implementation that you're using supports 64-bit (unsigned) integers. I don't know which (language) version you're using from this repository, but not all of them support that out of the box. Some have 32-bit support only. If you're unsure, try a single large number (outside the 32-bit range), or open an issue here.
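
A quick way to run that check is a round-trip test with a few values beyond the 32-bit range. The sketch below is only an illustration: `supports_unsigned_64_bit` is a hypothetical helper, and the hex-based functions are stand-ins for whichever encode/decode pair you actually use, not this library's API.

```python
from typing import Callable

# Values that a 32-bit-only (or float-based) implementation would mangle.
TEST_VALUES = [2**32, 2**53 + 1, 2**63, 2**64 - 1]


def supports_unsigned_64_bit(
    encode: Callable[[int], str],
    decode: Callable[[str], int],
) -> bool:
    """Return True if every test value survives an encode/decode round trip."""
    for value in TEST_VALUES:
        try:
            if decode(encode(value)) != value:
                return False
        except (OverflowError, ValueError):
            return False
    return True


# Stand-in implementations, for demonstration only.
def hex_encode(number: int) -> str:
    return format(number, "x")


def hex_decode(text: str) -> int:
    return int(text, 16)


def truncating_encode(number: int) -> str:
    # Simulates an implementation that silently keeps only the low 32 bits.
    return format(number & 0xFFFFFFFF, "x")


print(supports_unsigned_64_bit(hex_encode, hex_decode))
print(supports_unsigned_64_bit(truncating_encode, hex_decode))
```

Passing your real encode and decode functions in place of the stand-ins shows whether large `UUID_SHORT()` values would survive the transformation.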

@ocram ocram closed this as completed Jan 25, 2017
@ocram ocram mentioned this issue Jan 30, 2017