Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

#24031: Protover.rs could use a better algorithm #33

Closed
wants to merge 36 commits into from

Conversation

Labels
None yet
Projects
None yet
3 participants
@isislovecruft
Copy link
Contributor

@isislovecruft isislovecruft commented Mar 28, 2018

PR for code review purposes only. Ticket is https://bugs.torproject.org/24031.

There's one fixup! commit, and it's only one line, but I wanted to keep the commits that @chelseakomlo reviewed (77e73e6..5eac7c1) separate to the ones addressing her review and my testing/bugfinding.

isislovecruft added 30 commits Mar 21, 2018
This will allow us to do actual error handling intra-crate in a more
rusty manner, e.g. propogating errors in match statements, conversion
between error types, logging messages, etc.

 * FIXES part of #24031: https://bugs.torproject.org/24031.
 * ADD new protover::protoset module.
 * ADD new protover::protoset::ProtoSet class for holding protover versions.
 * REMOVE protover::Versions type implementation and its method
   `from_version_string()`, and instead implement this behaviour in a more
   rust-like manner as `impl FromStr for ProtoSet`.
 * MOVE the `find_range()` utility function from protover::protover to
   protover::protoset since it's only used internally in the
   implementation of ProtoSet.
 * REMOVE the `contract_protocol_list()` function from protover::protover and
   instead refactor it (reusing nearly the entire thing, with minor superficial,
   i.e. non-behavioural, changes) into a more rusty
   `impl ToString for ProtoSet`.
 * REMOVE the `expand_version_range()` function from protover::protover and
   instead refactor it into a more rusty implementation of
   `impl Into<Vec<Version>> for ProtoSet` using the new error types in
   protover::errors.
 * FIXES part of #24031: https://bugs.torproject.org/24031.
 * ADD new type, protover::UnknownProtocol, so that we have greater type safety
   and our protover functionality which works with unsanitised protocol names is
   more clearly demarcated.
 * REFACTOR protover::Proto, renaming it protover::Protocol to mirror the new
   protover::UnknownProtocol type name.
 * ADD a utility conversion of `impl From<Protocol> for UnknownProtocol` so that
   we can easily with known protocols and unknown protocols simultaneously
   (e.g. doing comparisons, checking their version numbers), while not allowing
   UnknownProtocols to be accidentally used in functions which should only take
   Protocols.
 * FIXES part of #24031: https://bugs.torproject.org/24031
This replaces the `protover::SupportedProtocols` (why would you have a type just
for things which are supported?) with a new, more generic type,
`protover::ProtoEntry`, which utilises the new, more memory-efficient datatype
in protover::protoset.

 * REMOVE `get_supported_protocols()` and `SupportedProtocols::tor_supported()`
   (since they were never used separately) and collapse their functionality into
   a single `ProtoEntry::supported()` method.
 * REMOVE `SupportedProtocols::from_proto_entries()` and reimplement its
   functionality as the more rusty `impl FromStr for ProtoEntry`.
 * REMOVE `get_proto_and_vers()` function and refactor it into the more rusty
   `impl FromStr for ProtoEntry`.
 * FIXES part of #24031: https://bugs.torproject.org/24031
This adds a new protover::UnvalidatedProtoEntry type, which is the
UnknownProtocol variant of a ProtoEntry, and refactors several functions which
should operate on this type into methods.

 * REMOVE internal `contains_only_supported_protocols()` function.
 * REMOVE `all_supported()` function and refactor it into
   `UnvalidatedProtoEntry::all_supported()`.
 * REMOVE `parse_protocols_from_string_with_no_validation()` and
   refactor it into the more rusty implementation of
   `impl FromStr for UnvalidatedProtoEntry`.
 * REMOVE `protover_string_supports_protocol()` and refactor it into
   `UnvalidatedProtoEntry::supports_protocol()`.
 * REMOVE `protover_string_supports_protocol_or_later()` and refactor
   it into `UnvalidatedProtoEntry::supports_protocol_or_later()`.
 * FIXES part of #24031: https://bugs.torproject.org/24031
This implements conversions from either a ProtoEntry or an UnvalidatedProtoEntry
into a String, for use in replacing such functions as
`protover::write_vote_to_string()`.

 * ADD macro for implementing ToString trait for ProtoEntry and
   UnvalidatedProtoEntry.
 * FIXES part of #24031: https://bugs.torproject.org/24031
This adds a new type for votes upon `protover::ProtoEntry`s (technically, on
`protover::UnvalidatedProtoEntry`s, because the C code does not validate based
upon currently known protocols when voting, in order to maintain
future-compatibility), and converts several functions which would have operated
on this datatype into methods for ease-of-use and readability.

 * ADD new `protover::ProtoverVote` datatype.
 * REMOVE the `protover::compute_vote()` function and refactor it into an
   equivalent-in-behaviour albeit more memory-efficient voting algorithm based
   on the new underlying `protover::protoset::ProtoSet` datatype, as
   `ProtoverVote::compute()`.
 * REMOVE the `protover::write_vote_to_string()` function, since this
   functionality is now generated by the impl_to_string_for_proto_entry!() macro
   for both `ProtoEntry` and `UnvalidatedProtoEntry` (the latter of which is the
   correct type to return from a voting protocol instance, since the entity
   voting may not know of all protocols being voted upon or known about by other
   voting parties).
 * FIXES part of #24031: https://bugs.torproject.org/24031
This changes `protover::is_supported_here()` to be aware of new datatypes
(e.g. don't call `.0` on things which are no longer tuple structs) and also
changes the method signature to take borrows, making it faster, threadable, and
easier to read (i.e. the caller can know from reading the function signature
that the function won't mutate values passed into it).

 * CHANGE the `protover::is_supported_here()` function to take borrows.
 * REFACTOR the `protover::is_supported_here()` function to be aware of new
   datatypes.
 * FIXES part of #24031: https://bugs.torproject.org/24031
Previously, the rust implementation of protover considered an empty string to be
a valid ProtoEntry, while the C version did not (it must have a "=" character).
Other differences include that unknown protocols must now be parsed as
`protover::UnknownProtocol`s, and hence their entries as
`protover::UnvalidatedProtoEntry`s, whereas before (nearly) all protoentries
could be parsed regardless of how erroneous they might be considered by the C
version.

My apologies for this somewhat messy and difficult to read commit, if any part
is frustrating to the reviewer, please feel free to ask me to split this into
smaller changes (possibly hard to do, since so much changed), or ask me to
comment on a specific line/change and clarify how/when the behaviours differ.

The tests here should more closely match the behaviours exhibited by the C
implementation, but I do not yet personally guarantee they match precisely.

 * REFACTOR unittests in protover::protover.
 * ADD new integration tests for previously untested behaviour.
 * FIXES part of #24031: https://bugs.torproject.org/24031.
This includes differences in behaviour to before, which should now more closely
match the C version:

 - If parsing a protover `char*` from C, and the string is not parseable, this
   function will return 1 early, which matches the C behaviour when protocols
   are unparseable.  Previously, we would parse it and its version numbers
   simultaneously, i.e. there was no fail early option, causing us to spend more
   time unnecessarily parsing versions.

 * REFACTOR `protover::ffi::protover_all_supported()` to use new types and
   methods.
This includes a subtle difference in behaviour, as in 4258f1e, where we return
(matching the C impl's return behaviour) earlier than before if parsing failed,
saving us computation in parsing the versions into a
protover::protoset::ProtoSet.

 * REFACTOR `protover::ffi::protover_list_supports_protocol()` to use new types
   and methods.
This includes a subtle difference in behaviour, as in 4258f1e, where we return
(matching the C impl's return behaviour) earlier than before if parsing failed,
saving us computation in parsing the versions into a
protover::protoset::ProtoSet.

 * REFACTOR `protover::ffi::protover_list_supports_protocol_or_later()` to use
   new types and methods.
This includes a subtle difference in behaviour to the previous Rust
implementation, where, for each vote that we're computing over, if a single one
fails to parse, we skip it.  This now matches the current behaviour in the C
implementation.

 * REFACTOR `protover::ffi::protover_compute_vote()` to use
   new types and methods.
It was changed to take borrows instead of taking ownership.

 * REFACTOR `protover::ffi::protover_is_supported_here()` to use changed method
   signature on `protover::is_supported_here()`.
During code review and discussion with Chelsea Komlo, she pointed out
that protover::compute_for_old_tor() was a public function whose
return type was `&'static CStr`.  We both agree that C-like parts of
APIs should:

1. not be exposed publicly (to other Rust crates),
2. only be called in the appropriate FFI code,
3. not expose types which are meant for FFI code (e.g. `*mut char`,
   `CString`, `*const c_int`, etc.) to the pure-Rust code of other
   crates.
4. FFI code (e.g. things in `ffi.rs` modules) should _never_ be called
   from pure-Rust, not even from other modules in its own crate
   (i.e. do not call `protover::ffi::*` from anywhere in
   `protover::protoset::*`, etc).

With that in mind, this commit makes the following changes:

 * CHANGE `protover::compute_for_old_tor()` to be
   visible only at the `pub(crate)` level.
 * RENAME `protover::compute_for_old_tor()` to
   `protover::compute_for_old_tor_cstr()` to reflect the last change.
 * ADD a new `protover::compute_for_old_tor()` function wrapper which
   is public and intended for other Rust code to use, which returns a
   `&str`.
Previously, if "Link=1-5" was supported, and you asked protover_all_supported()
(or protover::all_supported() in Rust) if it supported "Link=3-999", the C
version would return "Link=3-999" and the Rust would return "Link=6-999".  These
both behave the same now, i.e. both return "Link=6-999".
The DoS potential is slightly higher in C now due to some differences
to the Rust code, see the C_RUST_DIFFERS tags in
src/rust/protover/tests/protover.rs.
There's now no difference in these tests w.r.t. the C or Rust: both
fail miserably (well, Rust fails with nice descriptive errors, and C
gives you a traceback, because, well, C).
The behaviours still do not match, unsurprisingly, but now we know where a
primary difference is: the Rust is validating version ranges more than the C,
so in the C it's possible to call protover_all_supported on a ridiculous
version range like "Sleen=0-4294967294" because the C uses
MAX_PROTOCOLS_TO_EXPAND to count the number of *subprotocols* whereas the Rust
uses it to count the total number of *versions* of all subprotocols.
Previously, the limit for MAX_PROTOCOLS_TO_EXPAND was actually being applied
in Rust to the maximum number of version (total, for all subprotocols).
Whereas in C, it was being applied to the number of subprotocols that were
allowed.  This changes the Rust to match C's behaviour.
The C version of compute_vote() calls expand_protocol_list() which checks if
there would be too many subprotocols *or* expanded individual version numbers,
i.e. more than MAX_PROTOCOLS_TO_EXPAND, and does this *per vote* (but only in
compute_vote(), everywhere else in the C seems to only care about the number of
subprotocols, not the number of individual versions).  We need to match its
behaviour in Rust and ensure we're not allowing more than it would to get the
votes to match.
Currently, if you ask the C version of compute_vote() to compute a single vote
containing "Fribble=", it will return a NULL.  However, the Rust version
currently returns "Fribble=" since it doesn't check if the versions were
empty, so this new test currently fails.
This fixes the unittest from the prior commit by checking if the versions are
empty before adding a protocol to a vote.
@isislovecruft isislovecruft self-assigned this Mar 28, 2018
@isislovecruft isislovecruft changed the title Bug24031 r4 #24031: Protover.rs could use a better algorithm Mar 28, 2018
@isislovecruft isislovecruft requested a review from nmathewson Mar 28, 2018
@isislovecruft isislovecruft removed their assignment Mar 28, 2018
@teor2345
Copy link
Contributor

@teor2345 teor2345 commented Apr 20, 2018

This was merged to 0.3.3 and later.

@teor2345 teor2345 closed this Apr 20, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment