Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Modernizes the ElementWriter trait #487

Merged
merged 23 commits into from
Mar 17, 2023
Merged

Modernizes the ElementWriter trait #487

merged 23 commits into from
Mar 17, 2023

Conversation

zslayton
Copy link
Contributor

@zslayton zslayton commented Mar 17, 2023

This PR has a weird commit history that was caused by needing to be rebased a few times in the wake of other (sweeping) changes getting merged. The diff is correct, however, so it can be safely ignored for the purposes of review. It'll get squashed at merge time anyway.


Changes:

  • ElementWriter is now automatically implemented by all IonWriter implementations, allowing IonWriters to write an Element to output without creating an instance of an intermediate type (e.g. NativeElementWriter).
  • Removed NativeElementWriter.
  • Adds the Element::value() method to get a reference to its Value. This allows matching on the Value directly instead of matching on its IonType and then calling the appropriate as_TYPE().unwrap() method.
  • Added output() and output_mut() to the IonWriter trait, giving users a way to see/manipulate their (eg) output buffer while the writer still exists. Prior to this, calling drop(writer) to unfreeze the output buffer was common.
  • Fixes a bug in RawBinaryWriter::write_f32 that caused negative zero to be serialized as positive zero.
  • Renames IonSequence::iter() to IonSequence::elements().
  • Several doc comments included code fences (triple backticks) containing example output (ie not Rust). Previously, those code fences were marked with ignore to keep them from being treated as doc tests; this caused them to be listed in the cargo test output as IGNORED. Now they are marked text, indicating that they are not test code at all.

Fixes #202.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@codecov
Copy link

codecov bot commented Mar 17, 2023

Codecov Report

Patch coverage: 100.00% and no project coverage change.

Comparison is base (dba18aa) 89.47% compared to head (84fd7fb) 89.47%.

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #487   +/-   ##
=======================================
  Coverage   89.47%   89.47%           
=======================================
  Files          77       76    -1     
  Lines       13711    13704    -7     
=======================================
- Hits        12268    12262    -6     
+ Misses       1443     1442    -1     
Impacted Files Coverage Δ
src/binary/binary_writer.rs 81.95% <ø> (ø)
src/element/element_stream_reader.rs 86.50% <ø> (ø)
src/text/parse_result.rs 84.90% <ø> (ø)
src/binary/raw_binary_writer.rs 94.77% <100.00%> (ø)
src/element/mod.rs 90.76% <100.00%> (ø)
src/element/owned.rs 90.70% <100.00%> (+0.03%) ⬆️
src/element/writer.rs 92.45% <100.00%> (+29.95%) ⬆️
src/ion_hash/representation.rs 99.06% <100.00%> (ø)
src/text/raw_text_writer.rs 96.17% <100.00%> (+0.01%) ⬆️
src/text/text_formatter.rs 58.72% <100.00%> (ø)
... and 2 more

... and 1 file with indirect coverage changes

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report in Codecov by Sentry.
📢 Do you have feedback about the report comment? Let us know in this issue.

Copy link
Contributor Author

@zslayton zslayton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗺️ PR tour

Comment on lines +197 to +198
fn output(&self) -> &Self::Output;
fn output_mut(&mut self) -> &mut Self::Output;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗺️ All IonWriters can now provide a handle to their underlying data sink.

For comparison, this is done in a variety of stdlib types:

While get_ref() and get_mut() are common names, I found output() and output_mut() to be more descriptive. As you'll see in later examples, they make for more obvious code.

@@ -638,7 +640,7 @@ impl<W: Write> IonWriter for RawBinaryWriter<W> {
/// Writes an Ion float with the specified value.
fn write_f32(&mut self, value: f32) -> IonResult<()> {
self.write_scalar(|enc_buffer| {
if value == 0f32 {
if value == 0f32 && !value.is_sign_negative() {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗺️ Fixes a hitherto undiscovered bug: because zero and negative zero are mathematically equal (that is: 0.0f32 == -0.0f32), this branch would cause negative zero to be serialized as positive zero (0x40). write_f64 (just below) did not have this problem.

@@ -110,7 +110,7 @@ impl ElementStreamReader {
value
.as_sequence()
.unwrap()
.iter()
.elements()
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗺️ IonSequence::iter() has been renamed IonSequence::elements() to be more self-describing.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the loss of iter() in IonSequence make any difference w.r.t. being able to use it in a for loop?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question. The short answer is no, what was previously:

for element in sequence.iter() {...}

is now

for element in sequence.elements() {...}

I still intend to impl IntoIterator for &List and for &SExp, which will enable the more desirable:

for element in &sequence {...}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗺️ The old ElementWriter trait had two implementations: IonCElementWriter and NativeElementWriter. IonCElementWriter was removed around v0.14.0, leaving only NativeElementWriter. It is now superfluous and has been removed.

/// ```
pub fn value(&self) -> &Value {
&self.value
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗺️ Exposing the Element's underlying Value enum transforms code like this:

let element: Element = ...;

if element.is_null() { // `is_null` matches on inner `value`
    return handle_null(element.ion_type());
}

use IonType::*;
match element.ion_type() { // `ion_type()` matches on inner `value`
  Null => unreachable!("we already handled null above :c"),
  Bool => handle_bool(element.as_bool().unwrap()),   // Matches on inner `value`
  Int => handle_int(element.as_int().unwrap()),      // Matches on inner `value`
  Float => handle_int(element.as_float()).unwrap()), // Matches on inner `value`
   // ...etc
};

into:

let element: Element = ...;

use Value::*;
match element.value() {
   Null(ion_type) => handle_null(ion_type), // Consolidated null handling
   Bool(b) => handle_bool(b), // No need to call as_bool(), etc
   Int(i) => handle_int(i),
   Float(f) => handle_int(f),
   // ...etc
};

which is much less noisy and eliminates a number of redundant match operations.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎉 🎉 🎉

@@ -51,7 +51,7 @@ pub(crate) type IonParseResult<'a, O> = IResult<&'a str, O, IonParseError<'a>>;
/// with other parsers that return `IonParseResult`, you can call `upgrade()` on your parser's
/// output before returning it.
///
/// ```ignore
/// ```ignore // Doc tests cannot use internal APIs
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗺️ This doc comment's code fence does contain Rust code, so ignore is appropriate, but it also uses crate-visible methods and so will not compile.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like a bad idea to have doc comment code that uses private/pub(crate) functions. Aren't you basically publishing an example that the reader can never use?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a private module, so it isn't exposed in the user-facing documentation.

Any doc comment that includes a code fence is treated as code that should compile using the public API; I'm unsure how to convince cargo to compile it with pub(crate) APIs. If you know of a way, let me know!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a private module, so it isn't exposed in the user-facing documentation.

Okay.

Any doc comment that includes a code fence is treated as code that should compile using the public API; I'm unsure how to convince cargo to compile it with pub(crate) APIs. If you know of a way, let me know!

I spent way too long trying to figure out even a hacky workaround (I thought I might be able to do something with #[cfg(doctest)]). Alas, it seems to be impossible—and I must apologize for ever doubting you. 😉

/// tag(" ") <-- Parser 2 matches a single space
/// let parser = terminated( // <-- Takes two parsers; if both succeed, returns the output of the first
/// tag("true").or(tag("false")), // <-- Parser 1 matches the text `true` or the text `false`
/// tag(" ") // <-- Parser 2 matches a single space
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗺️ Hazard of using ignore: the Rust code was not being compiled, so no one noticed that these lines should've been comments.

@@ -20,7 +20,7 @@ impl RawTextWriterBuilder {
/// Constructs a text Ion writer with modest (but not strictly minimal) spacing.
///
/// For example:
/// ```ignore
/// ```text
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗺️ All of these code fences contain example text Ion output, not Rust code. Tagging them with ignore caused them to be listed as IGNORED in the rust test --doc output, which was misleading. Tagging them with text causes them to not be treated as example code, which is more appropriate.

Comment on lines -766 to +767
drop(writer);
assert_eq!(str::from_utf8(&output).unwrap(), expected);
assert_eq!(
str::from_utf8(writer.output().as_slice()).unwrap(),
expected
);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗺️ Note that we no longer need to drop(writer) in order to be able to access its output.

@@ -104,7 +101,7 @@ trait ElementApi {
for (index, (element1, element2)) in e1.iter().zip(e2.iter()).enumerate() {
if !element1.ion_eq(element2) {
return format!(
"The values at position #{index} were not IonEq.\ne1: {element1:?}\ne2: {element2:?}"
"The values at position #{index} were not IonEq.\ne1: {element1}\ne2: {element2}"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗺️ The Debug impl for Element is really verbose, which makes it unhelpful for comparing actual/expected. Element has implemented Display for a while now (which just shows the text Ion serialization of the Element) and is more useful.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

/// ```
pub fn value(&self) -> &Value {
&self.value
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎉 🎉 🎉

where
W: IonWriter,
{
fn write_element(&mut self, element: &Element) -> IonResult<()> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So clean!

Comment on lines +58 to +62
Value::SExp(s) => {
self.step_in(IonType::SExp)?;
self.write_elements(s.elements())?;
self.step_out()
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This and the other sequence cases make me wish for a more closed form, but I don't know what that would look like yet.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you elaborate on what you mean by "closed form?"

I considered extending ElementWriter to also have write_struct(&Struct), write_list(&List), and write_sext(&SExp). Would that address it?

Comment on lines +109 to +114
let mut writer = TextWriterBuilder::new().build(&mut buffer)?;

let ion = r#"
null true 0 1e0 2.0 2022T foo "bar" (foo bar baz) [foo, bar, baz] {foo: true, bar: false}
"#;
let expected_elements = Element::read_all(ion.as_bytes())?;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would the semantics of the test be changed by using a builder macro for expected_elements? I feel like that would be an improvement but perhaps I'm missing something.

Separately, this is read text, write text, read text. Is there a version of this that round trips binary or round trips across formats, e.g. read text, write binary, then read binary?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could switch it to a builder; this test was written before that was an option.

Separately, this is read text, write text, read text. Is there a version of this that round trips binary or round trips across formats, e.g. read text, write binary, then read binary?

Yes, this code is exercised really thoroughly (cross product of all binary/text reader/writer pairs) in element_test_vectors.rs.

Comment on lines +731 to +737
fn output(&self) -> &W {
self.output.get_ref()
}

fn output_mut(&mut self) -> &mut W {
self.output.get_mut()
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like a move- I see the doc comments in the deleted code, are they preserved elsewhere?

EDIT: Nevermind I see much more detailed commentary added in writer.rs.

@@ -104,7 +101,7 @@ trait ElementApi {
for (index, (element1, element2)) in e1.iter().zip(e2.iter()).enumerate() {
if !element1.ion_eq(element2) {
return format!(
"The values at position #{index} were not IonEq.\ne1: {element1:?}\ne2: {element2:?}"
"The values at position #{index} were not IonEq.\ne1: {element1}\ne2: {element2}"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

let elems = Element::read_all(&data)?;
let elems = Element::read_all(data)?;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's going on here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clippy noticed that data (which was already a reference) was being turned into a doubly-indirect reference unnecessarily. This change was unrelated to the core purpose of the PR.

@zslayton zslayton merged commit cffa745 into main Mar 17, 2023
@zslayton zslayton deleted the element-writer branch March 17, 2023 18:10
@zslayton zslayton mentioned this pull request Mar 17, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Element should be writeable to a writer
3 participants