Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Shorter urls #42

Merged
merged 2 commits into from
May 4, 2013
Merged

Shorter urls #42

merged 2 commits into from
May 4, 2013

Conversation

mk-fg
Copy link
Contributor

@mk-fg mk-fg commented Apr 29, 2013

Instead of this:

http://some.0bin.site/paste/9611846d1c86e549fabe205837d40b780f5efce3#bLilFEQok1qTOzifX2gdvZO1ZoHki1getSt/YnAdAFQ=

Produce this:

http://some.0bin.site/paste/FxsUsOGB#aW8DhwMy

6eec295 changes how paste-ids are generated - instead of base16 hexdigest, base64-like (with "/" replaced with "-" to produce less confusing urls and require no special fs-path conversion) ids of configurable length are used.

Adds new PASTE_ID_LENGTH option:

# length of base64-like paste-id string in the url, max: 27 (length of sha1 digest)
# total number of unique pastes can be calculated as 2^(6*PASTE_ID_LENGTH)
# for PASTE_ID_LENGTH=8, for example, it's 2^(6*8) = 281 474 976 710 656
PASTE_ID_LENGTH = 8

It's currently not mentioned in the doc, but if you see no problem with these commits, I can easily fix this (or you can, if you intend to update fr doc alongside anyway - just one option).

As mentioned in the default_settings.py, with 8 base64 (6-bit) chars, number of possible pastes should be 2^(6*8) = 281 474 976 710 656, which I think should suffice to keep ids unique.

All old pastes should still be accessible under long old-style ids, while new ones will be shorter.

@sametmax
Copy link
Contributor

Hello man, thanks again for this PL. It deserves I spend some time on it. I can do that today as I'm starting a new mission, but I'll have a look this WE.

@sametmax
Copy link
Contributor

Just a quick though: now that key length is, we probably want it to be checked so it's not under a certain size (at least 4 characters since we us it to create a path) and zerobin.py should return a non zero error code with an error message if it detects a too short key length.

@mk-fg
Copy link
Contributor Author

mk-fg commented Apr 30, 2013

Good point indeed, squashed that into 82e2dad.

@sametmax
Copy link
Contributor

sametmax commented May 1, 2013

We must somewhere else to put the check, the runserver command is only called if you don't use a personable WSGI wrapper. get_app() feels more appropriate, with should raise an excecption, then catched by runserver which display the massage and exit. This way we separate programmatic running (and allow a custom WSGI setup) and automaticly setup scripted running.

@mk-fg
Copy link
Contributor Author

mk-fg commented May 1, 2013

Ah, true, moved the thing to get_app(), also adding "settings" keyword there to be able to pass custom settings object, while leaving existing settings_file interface intact as well. I actually run 0bin from uwsgi as well.

Currently it doesn't mean much, as both runserver() and get_app() use module-global "settings" var, but if they'll ever get separated, passing pre-populated (using cli) settings like that might be useful, and I thought it might also be useful for a general wsgi apps.
Not sure if having second keyword there for a somewhat similar and maybe too liberal interface there might be too much though, but two functions communicating via global var just doesn't look right ;)

@xyb
Copy link

xyb commented May 2, 2013

This is a great idea, but I perform Crockford's Base32 encoding which is more human readable in case of you write it down.

@mk-fg
Copy link
Contributor Author

mk-fg commented May 2, 2013

I didn't really had the "write down on paper" use-case in mind, picking base64 because it includes most of the url-allowed ascii chars and is readily available (e.g. via str.encode('base64')).

For 8 chars, using 5-bit base32 still shouldn't make a difference, I'd question whether it's a relevant use-case to optimize for though - imho not, especially since you can still write down base64 with just a bit more effort to make upper/lower-case distinguishable.
Otherwise, I'd at least tweak base32 to be all-lowercase, which is more readable and less annoying than all-caps ;)

Actually, there's an interesting side-effect to both non-base16 encodings - there's a one-in-a-zillion chance your paste url will be "F-CKY0U#M0RON" ;)

@sametmax
Copy link
Contributor

sametmax commented May 4, 2013

A friend of mine pointed out that we should not use the hash to generate the id, but rather a uuid to avoid attacks base on hash prediction.

@mk-fg
Copy link
Contributor Author

mk-fg commented May 4, 2013

More details on what do you mean by "hash prediction attacks" would be nice, I assume attack here is to guess paste-id without having the URL?

0bin doesn't just use "hash of the plaintext" - it generates (a) random key which is added to (b) random nonces (pbkdf2, ccm), (c) hashed, (d) passed through aes along with the (e) contents and then all that is hashed to produce paste-id.

So:

  • Even having the key and the plaintext attacker still can't get the paste-id.
  • Having paste-id, to match some plaintext against it, attacker would still have to bruteforce the key and all the nonces.
  • With the complete ciphertext, attacker can get the paste-id, but then it'd be also enough to look at the browser URL, headers of the request or html, not sure why that would be bad.

Using e.g. os.urandom() to add server's randomness to that won't hurt, I guess, but find it hard to see why it'd be necessary.

@sametmax
Copy link
Contributor

sametmax commented May 4, 2013

Currently we don't check if a paste exist, and somebody could override a paste with that. So first we should check for collision (which I didn't do because I though about randomness not security). Secondly, having a non predictable name is always nice.

Anyway, it's not the purpose of this PL so let's not worry about it.

I may open a ticket for that or just fix it by myself.

@xyb : can you open a ticket for this as well, so we can debate about base32 ?

In the meantime, will accept the PL.

sametmax added a commit that referenced this pull request May 4, 2013
Shorter urls. Thanks again for this piece of work to the author.
@sametmax sametmax merged commit 759bb82 into Tygs:master May 4, 2013
@mk-fg mk-fg deleted the shorter_urls branch May 4, 2013 14:19
@mk-fg
Copy link
Contributor Author

mk-fg commented May 4, 2013

My point is that "somebody could override a paste with that" after several trillon paste submissions - sure, it's bad that "somebody could", but given the odds, I'd rather bet on lighting striking "somebody" before that happens ;)

@xyb xyb mentioned this pull request May 5, 2013
@xyb
Copy link

xyb commented May 5, 2013

@sametmax done.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants