Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't try and hash if the data is empty #17

Merged
merged 4 commits into from
May 4, 2016

Conversation

palfrey
Copy link
Contributor

@palfrey palfrey commented May 4, 2016

I ran into a situation where I tried to push in a hash of a zero-length value, and instead of erroring, it paniced! This PR fixes the problem, but my Rust is still not very good so I haven't figured out how to write the test correctly. If anyone has any thoughts about how to fix the commented out code, that'd be much appreciated.

@abonander
Copy link
Owner

abonander commented May 4, 2016

Using FromBase64Error::InvalidBase64Length in this case is a bit misleading because it doesn't mean what you want it to mean: base-64 encoded strings are required to have a length which is a multiple of 4, or be padded with = to the next multiple of 4. rustc_serialize considers both "" and "===" (any number of ='s, actually) to be valid base-64 strings for 0 bytes.

Making this clearer while not panicking would really necessitate creating a new error enum wrapping FromBase64Error with a variant describing an empty string error. Unfortunately, changing the return type of a method is a breaking change and I stupidly released this as 1.0.0 before it was really battle-tested and before any of its dependencies reached 1.0.0 (which is an important milestone for creating a release version, I've come to realize).

So the two options (three in actuality but I'm very hesitant for the last one) would be:

  1. Document the panic case for the method and do it early on an assert so we can provide a more informative error message. This still panicks but doesn't change the behavior in a breaking way.
  2. Return an ImageHash instance with 0 bits and an arbitrary hash type, which is misleading, useless, and hides the case where an empty string is being passed by mistake.
  3. Yank 1.0.0, create the error type I described above, and publish as 0.6.0, which would throw off existing users who (somewhat incorrectly but not unreasonably) expect semantic versions to always increase.

As for your test, it looks all right but you don't need to use assert! with constant conditions, just use panic!() and drop the first boolean term. For assert!(true) in that case, just replace it with ().

@palfrey
Copy link
Contributor Author

palfrey commented May 4, 2016

Or 4) Like 3), but without the yanking, and releasing as 2.0.0. It's unpleasant, but the best option given semantic versioning.

@abonander
Copy link
Owner

Another option I was editing in before you replied would be to do what you're doing anyways, but just clearly document it. It's still misleading for people who don't read the documentation but it's not breaking and it eliminates the panic.

assert!(decoded_result.is_err());
/* FIXME: Figure out how to make the below run
match decoded_result {
Err(FromBase64Error::InvalidBase64Length) => assert!(true),
Copy link
Owner

@abonander abonander May 4, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

assert!() is unnecessary for constant conditions. Just use () for assert!(true) and panic!(...) for assert!(false, ...).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The core problem I've got with the commented out lines is that they don't even compile. Any thoughts on how to fix that? I get

src/lib.rs:493:17: 493:53 error: unresolved enum variant, struct or const `InvalidBase64Length` [E0419]
src/lib.rs:493             Err(FromBase64Error::InvalidBase64Length) => (),
                               ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Copy link
Owner

@abonander abonander May 4, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh yeah, you need to duplicate the import of FromBase64Error in the sub-module.

In fact, you can just do use serialize::base64::*; for both the test submodule and the top level, since we're using a bunch of the definitions anyway.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What might also be good would be to add pub use FromBase64Error; to the top level so users who want to match on the error don't have to import from the rustc_serialize crate.

Copy link
Owner

@abonander abonander May 4, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually if you were to do that, you'd need to have this at the top level:

use serialize::base64::{ToBase64, FromBase64, Config, STANDARD};
// Needs to be fully qualified
pub use serialize::base64::FromBase64Error;

And then the test submodule can just have:

use serialize::base64::*;

Still a good idea though.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! That was confusing the heck out of me, but now fixed

@abonander
Copy link
Owner

Publishing 2.0.0 feels yuckier than yanking 1.0.0 even though it follows the specification.

@abonander
Copy link
Owner

Ultimately I think using FromBase64Error::InvalidBase64Length is all right as long as the error case is clearly documented, if you want to add that on.

@palfrey palfrey changed the title [WIP] Don't try and hash if the data is empty Don't try and hash if the data is empty May 4, 2016
@palfrey
Copy link
Contributor Author

palfrey commented May 4, 2016

I've fixed the test case and added a comment about the new error

@@ -131,7 +132,7 @@ impl ImageHash {
/// Create an `ImageHash` instance from the given Base64-encoded string.
/// ## Note:
/// **Not** compatible with Base64-encoded strings created before `HashType` was added.
///
/// Returns a FromBase64Error::InvalidBase64Length when trying to hash a zero-length string
Copy link
Owner

@abonander abonander May 4, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you format these docs like this?

     /// Create an `ImageHash` instance from the given Base64-encoded string.
     /// ## Note:
     /// **Not** compatible with Base64-encoded strings created before `HashType` was added.
     ///
     /// Does **not** preserve the internal value of `HashType::UserDCT`.
     /// ## Errors:
     /// Returns `FromBase64Error::InvalidBase64Length` when passed an empty string (`""`).

Thanks. (The empty line is intentional.)

@abonander
Copy link
Owner

Added a comment on the docs formatting, easy tweak.

@palfrey
Copy link
Contributor Author

palfrey commented May 4, 2016

Done

@abonander
Copy link
Owner

That'll work! I'll merge as soon as tests pass.

I also need to fix the docs uploading but I probably won't do that tonight as it's really late here.

@abonander abonander merged commit 43553eb into abonander:master May 4, 2016
@abonander
Copy link
Owner

Merged, thanks!

@palfrey palfrey deleted the zero-length-hash branch May 4, 2016 22:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants