Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor address byte swapping #968

Merged
merged 1 commit into from May 24, 2022
Merged

Conversation

tcharding
Copy link
Member

@tcharding tcharding commented Apr 26, 2022

Refactor address byte swapping

When encoding a network::Address two of the fields are encoded
big-endian instead of little-endian as is done by consensus_encode. In
order to achieve this we have a helper function addr_to_be that swaps
the bytes. This function is miss-named because it is not converting to a
specific endian-ness (which implies different behaviour on machines with
different endian-ness) but is reversing the byte order irrespective of
the underlying architecture.

  • Remove function addr_to_be
  • Inline the endian-ness code when encoding an address
  • Remove TODO and use to_be_bytes when encoding port
  • Add a function for reading big-endian bytes read_be_address
  • Use read_be_address when decoding Address and Addrv2

Refactor only, no logic changes. Code path is already covered by
unit tests.

@Kixunil
Copy link
Collaborator

Kixunil commented Apr 26, 2022

Your observation about naming looks correct, the comment was talking about port though. Also I'd call the function swap_addr_bytes or something like that to distinguish it from u16::swap_bytes.

Anyway, I think the whole code looks over-complicated and confusing - why not just write to the writer in a for loop, converting individual u16s on the fly? It should be more obvious what it does.

@tcharding
Copy link
Member Author

Your observation about naming looks correct, the comment was talking about port though. Also I'd call the function swap_addr_bytes or something like that to distinguish it from u16::swap_bytes.

I used the same name on purpose, it's the same method implemented on a different type.

Anyway, I think the whole code looks over-complicated and confusing - why not just write to the writer in a for loop, converting individual u16s on the fly? It should be more obvious what it does.

The method is used when decoding as well, in multiple places.

@Kixunil
Copy link
Collaborator

Kixunil commented Apr 27, 2022

The reason I don't like swap_bytes being intentionally similar is it doesn't reorder the words, only swaps endianess of each individual word.

I think in a tricky code like this it may be worth a bit of code repetition (decoding should still be a single function) if it increases clarity. But maybe it's just me who has harder time tracking endianess. 🤷‍♂️

I wouldn't block the PR for any of my objections, it's just for consideration.

@tcharding
Copy link
Member Author

Force push is total re-write, based on review by @Kixunil. PR description has also been re-written.

Copy link
Collaborator

@Kixunil Kixunil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

read(...) != 2 is definitely a bug, the rest looks OK.

let mut buf = [0u8; 2];

for i in 0..8 {
if io::Read::read(&mut r, &mut buf)? != 2 {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be read_exact. read returning less bytes is not an error.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks man, I learned a bit more about readers.

if io::Read::read(&mut r, &mut buf)? != 2 {
return Err(encode::Error::ParseFailed("missing address bytes"));
}
address[i] = u16::from_be_bytes(buf);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not for word in &mut address and then *word = u16::from_be_bytes(buf);? (feel free to come up with better var name)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah nice, thanks. Implemented as suggested.

@tcharding
Copy link
Member Author

Changes in force push:

  • Rewrite the read_be_address function with suggestions from review above.
  • Rebase on master.

@tcharding
Copy link
Member Author

I have no idea why I did this PR other at this moment in time, converting to draft to let the edition bump stuff go in.

@tcharding tcharding marked this pull request as draft April 29, 2022 00:18
Copy link
Collaborator

@Kixunil Kixunil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for not finding the other one sooner.

let mut len = self.services.consensus_encode(&mut s)?;

for word in &self.address {
len += io::Write::write(&mut s, &word.to_be_bytes())?;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just realized we have the same issue as with read_exact, we need write_all() here. Also while changing you could use s.write_all() instead, no need to repeat the whole trait since it can't get confused anyway thanks to generics.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Legend man, late is better than never :) Will fix as suggested.

@tcharding
Copy link
Member Author

Changes in force push: Use write_all as suggested.

@Kixunil
Copy link
Collaborator

Kixunil commented May 2, 2022

Looks good now, will ACK when it's un-drafted.

When encoding a `network::Address` two of the fields are encoded
big-endian instead of little-endian as is done by `consensus_encode`. In
order to achieve this we have a helper function `addr_to_be` that swaps
the bytes. This function is miss-named because it is not converting to a
specific endian-ness (which implies different behaviour on machines with
different endian-ness) but is reversing the byte order irrespective of
the underlying architecture.

- Remove function `addr_to_be`
- Inline the endian-ness code when encoding an address
- Remove TODO and use `to_be_bytes` when encoding port
- Add a function for reading big-endian bytes `read_be_address`
- Use `read_be_address` when decoding `Address` and `Addrv2`

Refactor only, no logic changes. Code path is already covered by
unit tests.
@tcharding tcharding marked this pull request as ready for review May 19, 2022 06:19
Copy link
Member

@apoelstra apoelstra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK 07c7530

Copy link
Collaborator

@Kixunil Kixunil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK 07c7530

@tcharding
Copy link
Member Author

Good to have you back @Kixunil!

@apoelstra apoelstra merged commit 324fa0f into rust-bitcoin:master May 24, 2022
@tcharding tcharding deleted the to-be-bytes branch May 24, 2022 21:54
ChallengeDev210 pushed a commit to ChallengeDev210/rust-bitcoin that referenced this pull request Aug 1, 2022
07c7530 Refactor address byte swapping (Tobin C. Harding)

Pull request description:

  Refactor address byte swapping

  When encoding a `network::Address` two of the fields are encoded
  big-endian instead of little-endian as is done by `consensus_encode`. In
  order to achieve this we have a helper function `addr_to_be` that swaps
  the bytes. This function is miss-named because it is not converting to a
  specific endian-ness (which implies different behaviour on machines with
  different endian-ness) but is reversing the byte order irrespective of
  the underlying architecture.

  - Remove function `addr_to_be`
  - Inline the endian-ness code when encoding an address
  - Remove TODO and use `to_be_bytes` when encoding port
  - Add a function for reading big-endian bytes `read_be_address`
  - Use `read_be_address` when decoding `Address` and `Addrv2`

  Refactor only, no logic changes. Code path is already covered by
  unit tests.

ACKs for top commit:
  apoelstra:
    ACK 07c7530
  Kixunil:
    ACK 07c7530

Tree-SHA512: 186bc86512e264a7b306f3bc2e18d1619f3cd84fc54412148cfc2663e8d6e9616ea9e2fe19eafec72d76cc11367a9b39cac2b73210d9e43eb8f453bd253b33de
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants