Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve base58 crate #2481

Merged
merged 5 commits into from Mar 24, 2024
Merged

Conversation

tcharding
Copy link
Member

@tcharding tcharding commented Feb 15, 2024

Improve the error code in the new base58 crate.

@github-actions github-actions bot added C-bitcoin PRs modifying the bitcoin crate doc C-base58 labels Feb 15, 2024
@coveralls
Copy link

coveralls commented Feb 15, 2024

Pull Request Test Coverage Report for Build 8334091085

Details

  • 14 of 113 (12.39%) changed or added relevant lines in 7 files are covered.
  • 1 unchanged line in 1 file lost coverage.
  • Overall coverage decreased (-0.2%) to 83.599%

Changes Missing Coverage Covered Lines Changed/Added Lines %
base58/src/lib.rs 9 11 81.82%
bitcoin/src/address/mod.rs 0 3 0.0%
bitcoin/src/bip32.rs 0 10 0.0%
bitcoin/src/address/error.rs 0 24 0.0%
bitcoin/src/crypto/key.rs 0 25 0.0%
base58/src/error.rs 4 39 10.26%
Files with Coverage Reduction New Missed Lines %
bitcoin/src/crypto/key.rs 1 68.7%
Totals Coverage Status
Change from base Build 8318043190: -0.2%
Covered Lines: 19757
Relevant Lines: 23633

💛 - Coveralls

Copy link
Collaborator

@Kixunil Kixunil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concept ACK, light review.

bitcoin/src/bip32.rs Outdated Show resolved Hide resolved
/// Legacy address is too long.
LegacyAddressTooLong(LegacyAddressTooLongError),
/// Invalid base58 payload data length for legacy address.
InvalidBase58PayloadLength(InvalidBase58PayloadLengthError),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not treat too long and InvalidBase58PayloadLenght the same? They are both invalid lengths just in different layers. It'd be nicer to merge them into a simple InvalidLength if we can compute one from the other (which I believe we can) and only present the outer one to the user. (The user is only concerned with number of chars, not decoded bytes.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Warning, hand wavey response.

Without commenting too much on this specific case, I find complex errors annoying to work with. Error handling is proving to be hard enough as it is using simple types, every time we have more complex errors I end up having to spend ages working out if they conform to all the million nuanced rules we have, this is compounded when errors get used in more places - then when they break the rules it takes much longer to detangle them than if there are just many types. One of the reasons this PR adds multiple errors that are the same "shape" (same fields). I have the gut feeling that forward compatibility will be harder the more we reuse errors in different places also. I think the error code should all be brain dead simple and ridiculously uniform if we are to have any hope of limiting mistakes. Its a moving target still, unfortunately, I can do error stuff mechanically in the afternoon if its brain dead easy - if its complex it means I have to do it earlier in the day when I have more brain power and it also needs more thought when reviewing. Long wall of text, sorry, I've spent a lot of time writing error stuff lately :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tend to agree -- and to the extent possible, as a user I like when every error is only constructible from one place. Then I can trace backward from the rust-bitcoin source to figure out where the error actually came from (after grepping to reverse the error message into an error variant).

While conceptually "invalid length" always means that the string was somehow an invalid length, it's useful to know which layer of the decoding complained.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tcharding is it possible that you're confusing this with an objection against same-shaped errors? Because I'm not objecting to that. Same-shaped errors with different semantics are fine. Differently-shaped errors with the same semantics are not really that great.

I know it can be annoying but really, programming is hard anyway. If we merge the errors in thee library we can save downstream users from verbose matching. I tend to prefer making libraries internally complex if it can make downstream crates less complex because there are multiple downstream projects.

@apoelstra by "users" here I mean the people who are not programmers. I see no reason why they should concern themselves with layers. If you need to debug things we could make the debug representation print out the source or we could even add optional backtrace.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@apoelstra by "users" here I mean the people who are not programmers

You've defined "user" in a way that can't possibly include me, which isn't fair. I also use software. In basically every software I use when I encounter an error of some sort I dig into the source code.

If there is no source code (e.g. the opaque "Contact Key Verification" error that you get when you try to enable the new iMessage fingerprint verification tool) then generally I behave like a non-technical person by giving up and doing something else that doesn't involve computers.

If you need to debug things we could make the debug representation print out the source or we could even add optional backtrace.

I really like this idea. We could compile out that stuff when debug_assertions is off.

The backtrace API isn't stable until 1.65 so something to revisit, not related to this PR or anything soon.

BTW this is an example of something we could add where @tcharding's constructors would be unaffected :). They would just need to have #[track_caller] on and they could conditionally internally generate backtraces.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You've defined "user" in a way that can't possibly include me, which isn't fair. I also use software. In basically every software I use when I encounter an error of some sort I dig into the source code.

Fair, I do this too. :D

Maybe the error messages can be slightly different such that it doesn't bother users but allows debugging.

We could compile out that stuff when debug_assertions is off.

I'd rather use a different feature flag if we want to provide a flag at all. The Backtrace type already handles not capturing when RUST_BACKTRACE is not set

The backtrace API isn't stable until 1.65 so something to revisit

We can use cfg and/or replace it with core::panic::Location in the meantime (and maybe we should do it anyway because panic location is no_std while backtrace isn't).

There's also external library for this but we can't make it conditional on Rust version so I'd rather not use it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this thread resolved @Kixunil or do you still want to see changes to the introduced error types?

Base58(base58::Error),
/// Base58 decoded data was an invalid length.
InvalidBase58PayloadLength(InvalidBase58PayloadLengthError),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It'd be nicer to just call this InvalidLength and present the number of chars to the user.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can get behind that change though!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had a bit of a play with doing this but because one error uses expected length and one uses expected length I believe its more clear if left as is.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand, can you explain more?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow, that was bad English I wrote. Should have been - one uses expected length and one uses "should have been 33 or 34". WIP accepts data length of 33 or 34

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could distinguish those internally with an enum but maybe it's too much. I suggest we at least make the error messages more user-friendly.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I looked at the error messages and did not see exactly where you wanted them improved. FTR I'm not confident that all our nested errors stack error messages in a meaningful, non-duplicate way, I've kind of given up for now and am just putting something grep'able. I think we will need to come back pre-1.0 and do some deep investigation like I tried to do in rust-bitcoin/rust-bech32#151

base58/src/error.rs Outdated Show resolved Hide resolved
/// Checksum was not correct.
BadChecksum(BadChecksumError),
/// Checked data was too short.
TooShort(TooShortError),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is also invalid length so we should then just convert this one into InvalidLength downstream.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Conceptually you are correct but since I've argued above that we should not have the InvalidLength then I believe this comment is mute.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not convinced. The parsed type has a set of valid lengths. Either the length is in the set or isn't. The only issue is how we represent the sets (although they are statically known for their types).

base58/src/error.rs Outdated Show resolved Hide resolved
base58/src/error.rs Outdated Show resolved Hide resolved
base58/src/error.rs Outdated Show resolved Hide resolved
@tcharding
Copy link
Member Author

tcharding commented Feb 15, 2024

Thanks for the review @Kixunil. I'll leave the 'pub' constructors in there but note that the merit of constructors/getters is still under debate.

I've now used pub fields and non_exhaustive along with private constructors.

@github-actions github-actions bot added C-hashes PRs modifying the hashes crate C-units PRs modifying the units crate C-io PRs modifying the io crate test labels Feb 21, 2024
@tcharding tcharding force-pushed the 02-15-improve-base58 branch 2 times, most recently from 4463ffc to f6b86b3 Compare February 21, 2024 06:17
@tcharding
Copy link
Member Author

Note, I just saw "incorrect" vs "invalid" comment, will change tomorrow.

@tcharding tcharding force-pushed the 02-15-improve-base58 branch 2 times, most recently from f4135a1 to 6221c76 Compare February 26, 2024 01:57
@tcharding tcharding marked this pull request as ready for review February 26, 2024 02:46
@tcharding tcharding force-pushed the 02-15-improve-base58 branch 4 times, most recently from dce2aab to d7ceb25 Compare February 29, 2024 00:03
@tcharding
Copy link
Member Author

tcharding commented Feb 29, 2024

I do not claim that this PR makes the errors perfect but IMO it does get the crate into a state that it can be released as v0.1.0 - can we merge as is and release? We now need v.0.1.0 to be out before the next rust-bitcoin release.

Copy link
Collaborator

@Kixunil Kixunil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding invalid length, I'm fine with not doing it in this PR.

| TooShort(_) => None,
}
}
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the point of this change if the error is reexported anyway? I find this annoying to review.

Copy link
Member Author

@tcharding tcharding Mar 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure exactly what is annoying you, did you mean commit: 4f6f0bf6 base58: Add error module?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Kixunil I guess the point is just code organization. I do a very lax review of these things.

  • Check that number of variants in one file is the same of one in another file.
  • Check some random method if it is correctly copied or not.
  • Check that total lines are the same.
  • Check the total diff is small.

While reviewing move-only/formatting commits from contributors whose PRs I have been reviewing for years, I am looking for accidental mistakes not for deliberate/malicious errors. Sure, it is possible that @tcharding's or @apoelstra's account is hacked or they deliberately sneak in a change that causes issues. If the commit from a regular contributor cargo fmt something or move something, I tend to a spot sanity check and not use my brain trying to carefully review it.

If these guys wanted to break something, they just push a malicious crate to crates io that is unrelated to github code :) . I do rely on strongly on what commit messages and certainly it is possible that authors sneak in a code logic change along under a move/cargo fmt label under a PR that I ACKed.

base58/src/error.rs Show resolved Hide resolved
base58/src/error.rs Outdated Show resolved Hide resolved
bitcoin/src/address/error.rs Outdated Show resolved Hide resolved
@tcharding
Copy link
Member Author

tcharding commented Mar 13, 2024

Rebased and changed InvalidCharError to InvalidCharacterError as discussed previously with @Kixunil somewhere else. I also created rust-bitcoin/hex-conservative#85

@apoelstra
Copy link
Member

I have a mild preference for using pub(super) rather than pub(crate) for error internals. This signals "this error can be created by the module that owns this errors.rs but not by random other parts of the codebase".

But this is just a nit.

apoelstra
apoelstra previously approved these changes Mar 15, 2024
Copy link
Member

@apoelstra apoelstra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK 1784b0d

@tcharding
Copy link
Member Author

I have a mild preference for using pub(super) rather than pub(crate) for error internals.

Yeah, I like that. I'll fix and re-spin tomorrow.

@tcharding
Copy link
Member Author

Just the pub(super) thing.

@tcharding tcharding added the P-high High priority label Mar 18, 2024
apoelstra
apoelstra previously approved these changes Mar 18, 2024
Copy link
Member

@apoelstra apoelstra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK 33347ac

@tcharding
Copy link
Member Author

tcharding commented Mar 20, 2024

This whole PR is error handling, totally boring, and can be iterated upon later. However, we need to release this crate already so we can get to the RC release of rust-bitcoin. Can I get an ACK please @Kixunil or @sanket1729.

sanket1729
sanket1729 previously approved these changes Mar 20, 2024
Copy link
Member

@sanket1729 sanket1729 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK 33347ac

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct IncorrectChecksumError {
/// The incorrect checksum.
pub(super) incorrect: u32,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Most of errors I have seen got and expected format.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm happy to change all the errors across the board and just use got/expected, will leave it for another day (or month). Thanks man.

@sanket1729
Copy link
Member

I only reviewed this for correctness and I think the moves the code in forward direction. There are still some unresolved issues about which error variants to emit or whether to combine them or not.

I don't have a strong opinion on them and all of those decisions fall well within the acceptable good API boundary and I did not think about what would be the "perfect" solution.

@tcharding
Copy link
Member Author

Awesome, thanks for the review man.

@apoelstra
Copy link
Member

Frustratingly this needs a rebase now because of #2518, which was basically trivial.

In preparation for improving the `base58` error types crate an `error`
module and move the single current error type there. Make the module
public and reexport the type.
The `base58::decode` function can only return a single error type, add a
`InvalidCharacterError` struct (leaf error) to use as the return type.
We are currently using the `base58::Error` type to create errors in
`bitcoin`, these are bitcoin errors not `base58` errors.

Note that we add what looks like duplicate
`InvalidBase58PayloadLengthError` types but they are different because
of the expected length. This could have been a field but I elected not
to do so for two reasons:

1. We will need to do so anyways if we crate smash more
2. The `crypto::key` one can have one of two values 33 or 34.

With this applied we can remove the now unused error variants from
`base58::Error`.
As is convention here in `rust-bitcoin`, hide the `base58::Error`
internals by adding struct error types.
@tcharding
Copy link
Member Author

Done! No other changes.

@tcharding
Copy link
Member Author

tcharding commented Mar 20, 2024

This is a good example of how something in our set up requires manual intervention when it shouldn't really, there were no conflicts, I just ran cargo rebase master and force pushed - the machines should be able to do that.

(Lazy comment, I did not spend time thinking why.)

Copy link
Member

@apoelstra apoelstra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK af49841

@tcharding
Copy link
Member Author

Process idea: when there is an ack and then a force push with minor changes could a single ack from you @apoelstra mean "I ack the changes and I checked the diff with git range-diff since Dev A acked and its still reasonable to think that ack stands"? Especially when you previously acked the same hash so you are running git range-diff anyway. Just an idea.

@apoelstra
Copy link
Member

Yeah, agreed. Though we should PR such a change separately to CONTRIBUTING.md as a new one-ack carveout.

Copy link
Member

@sanket1729 sanket1729 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK af49841

| TooShort(_) => None,
}
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Kixunil I guess the point is just code organization. I do a very lax review of these things.

  • Check that number of variants in one file is the same of one in another file.
  • Check some random method if it is correctly copied or not.
  • Check that total lines are the same.
  • Check the total diff is small.

While reviewing move-only/formatting commits from contributors whose PRs I have been reviewing for years, I am looking for accidental mistakes not for deliberate/malicious errors. Sure, it is possible that @tcharding's or @apoelstra's account is hacked or they deliberately sneak in a change that causes issues. If the commit from a regular contributor cargo fmt something or move something, I tend to a spot sanity check and not use my brain trying to carefully review it.

If these guys wanted to break something, they just push a malicious crate to crates io that is unrelated to github code :) . I do rely on strongly on what commit messages and certainly it is possible that authors sneak in a code logic change along under a move/cargo fmt label under a PR that I ACKed.

@@ -732,11 +735,11 @@ impl FromStr for Address<NetworkUnchecked> {
// If segwit decoding fails, assume its a legacy address.

if s.len() > 50 {
return Err(ParseError::Base58(base58::Error::InvalidLength(s.len() * 11 / 15)));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice catch!

@apoelstra apoelstra merged commit bfd5255 into rust-bitcoin:master Mar 24, 2024
169 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-base58 C-bitcoin PRs modifying the bitcoin crate C-units PRs modifying the units crate doc P-high High priority test
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants