Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adds builder API and macros for Element #479

Merged
merged 13 commits into from
Mar 14, 2023
Merged

Adds builder API and macros for Element #479

merged 13 commits into from
Mar 14, 2023

Conversation

zslayton
Copy link
Contributor

@zslayton zslayton commented Mar 13, 2023

This PR aims to make the Element API more ergonomic.

Changes:

  • Splits Sequence into List and Struct types with their own methods (including builder())
  • Adds ListBuilder, SExpBuilder, and StructBuilder types.
  • Adds ion_list![...], ion_sexp!(...) and ion_struct!{...} macros for concise element construction.
  • Adds an ion!(...) macro for parsing data into a single Element.
  • Removes Rc from all Element types except Symbol in anticipation of a larger change to make Element Sync/Send.
  • Adds clone_builder() to the container types, allowing duplicates of a container to be made with elements added/removed.

Examples

Some of the 'before' cases shown here were more verbose than strictly necessary, but as they were pulled
directly from the ion-rs source I think they're good illustrations of the hoops some developers had to
jump through.

list

Before

let list = Element::new_list(
    vec![
        Element::from("greetings".to_owned()),
        Element::from(5),
        Element::from(true)
    ].into_iter()
)

With builder

let list = List::builder()
    .push("greetings")
    .push(5)
    .push(true)
    .build();

With ion_list! macro

let list = ion_list!["greetings", 5, true]; // Expands to the builder code in the previous example

sexp

Before

let list = Element::new_sexp(
    vec![
        Element::from("greetings".to_owned()),
        Element::from(5),
        Element::from(true)
    ].into_iter()
)

With builder

let sexp = SExp::builder()
    .push("greetings")
    .push(5)
    .push(true)
    .build();

With ion_sexp! macro

let sexp = ion_sexp!("greetings" 5 true); // Expands to the builder code in the previous example

struct

Before

let struct_ = Element::new_struct(
    vec![
        ("foo", Element::from(Value::String("hello".into()))),
        ("bar", Element::from(Value::Integer(Integer::U64(5u64)))),
        ("baz", Element::from(Value::Boolean(true)),
    ].into_iter()
);

With builder

let struct_ = Struct::builder()
    .with_field("foo", "hello")
    .with_field("bar", 5)
    .with_field("baz", true)
    .build();

With ion_struct! macro

let sexp = ion_struct! { // Expands to the builder code in the previous example
    "foo": "hello",
    "bar": 5,
    "baz": true
};

ion!

Especially in the context of unit tests, it's nice to be able to turn Ion text into an Element for the sake of comparison.

Previously, this required users to create an ElementReader, pass it the data, and ask it for
an Option<Result<Element>>.

let element = NativeElementReader
    .iterate_over("(foo bar baz)")
    .unwrap() // this never fails; removing it is a TODO
    .next()
    .expect("Data passed to the ion! macro did not contain any values.")
    .expect("Invalid Ion data passed to the `ion!` macro.")

This PR adds the ion! macro, which achieves the same thing more concisely:

let element = ion!("(foo bar baz)");

The macro will panic if the Ion data is invalid. It returns the first value from the stream. If we think it's useful,
in the future we could also provide an ion_stream!(...) macro that returns a Vec<Element> with all of the
values instead of just the first.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

This PR aims to make the `Element` API more ergonomic.

Changes:
* Splits `Sequence` into `List` and `Struct` types with their own
  methods (including `builder()`)
* Adds `ListBuilder`, `SExpBuilder`, and `StructBuilder` types.
* Adds `ion_list![...]``, `ion_sexp!(...)` and `ion_struct!{...}`
  macros for concise element construction.
* Adds an `ion!(...)` macro for parsing data into an Element.
* Removes `Rc` from all `Element` types except `Symbol` in anticipation
  of a larger change to make `Element` `Sync`/`Send`.
* Adds `clone_builder()` to the container types, allowing duplicates
  of a container to be made with elements added/removed.
@codecov
Copy link

codecov bot commented Mar 13, 2023

Codecov Report

Patch coverage: 91.28% and project coverage change: +0.23 🎉

Comparison is base (9019a2e) 90.14% compared to head (09bbc67) 90.37%.

❗ Current head 09bbc67 differs from pull request most recent head 1ad6237. Consider uploading reports for the commit 1ad6237 to get more accurate results

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #479      +/-   ##
==========================================
+ Coverage   90.14%   90.37%   +0.23%     
==========================================
  Files          77       78       +1     
  Lines       13568    13740     +172     
==========================================
+ Hits        12231    12418     +187     
+ Misses       1337     1322      -15     
Impacted Files Coverage Δ
src/text/text_value.rs 93.15% <ø> (ø)
src/value/native_writer.rs 94.54% <0.00%> (+3.31%) ⬆️
src/value/writer.rs 62.50% <ø> (ø)
src/value/mod.rs 90.76% <87.75%> (-1.50%) ⬇️
src/value/owned.rs 89.70% <88.14%> (+6.67%) ⬆️
src/value/builders.rs 93.70% <93.70%> (ø)
src/ion_hash/representation.rs 99.06% <100.00%> (+0.06%) ⬆️
src/ion_hash/type_qualifier.rs 100.00% <100.00%> (ø)
src/symbol.rs 84.69% <100.00%> (+0.99%) ⬆️
src/text/text_formatter.rs 58.82% <100.00%> (-0.89%) ⬇️
... and 3 more

... and 2 files with indirect coverage changes

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report in Codecov by Sentry.
📢 Do you have feedback about the report comment? Let us know in this issue.

Copy link
Contributor Author

@zslayton zslayton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗺️ PR tour

ion-hash/src/representation.rs Outdated Show resolved Hide resolved
fn from(text: String) -> Self {
Symbol::owned(text)
}
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗺️ While implementing StructBuilder I realized you couldn't provide an owned String as a field name because String didn't implement Into<Symbol>.

ivf.format_struct(
&Struct::builder()
.with_field("greetings", "hello")
.build_struct(),
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗺️ The new ion_list!, ion_sexp!, and ion_struct! macros produce an Element. The methods in this file operate on the more specific List, SExp, and Struct types that an Element can contain. As a result, we're using the builder APIs rather than the new macros.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this because the macros are not (or should not) be generic enough to be used recursively to return additional nested lists, structs, or sexp? Should we consider the macros returning an Option of a struct or element?

Copy link
Contributor

@desaikd desaikd Mar 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar thought should we have ion_struct!, ion_list! return Struct, List instead of Element and use ion! for returning an Element? This seems more intuitive to me and may also align with how builder uses list_builder(), sexp_builder() for building List or SExp?

Copy link
Contributor Author

@zslayton zslayton Mar 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this because the macros are not (or should not) be generic enough to be used recursively to return additional nested lists, structs, or sexp? Should we consider the macros returning an Option of a struct or element?

No, build() will always successfully return an Element. Take a look at the Value enum and Element struct:

ion-rust/src/value/owned.rs

Lines 401 to 424 in 5c88f8e

/// Variants for all _values_ within an [`Element`].
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
Null(IonType),
Integer(Integer),
Float(f64),
Decimal(Decimal),
Timestamp(Timestamp),
String(String),
Symbol(Symbol),
Boolean(bool),
Blob(Vec<u8>),
Clob(Vec<u8>),
SExpression(SExp),
List(List),
Struct(Struct),
}
/// An `(annotations, value)` pair representing an Ion value.
#[derive(Debug, Clone)]
pub struct Element {
annotations: Vec<Symbol>,
value: Value,
}

List, SExp, and Struct are each variants of the Value enum, and an Element is a (Value, annotations) pair.

In most cases, users will want an Element since that's what most APIs expect/return. That's why the macros and the build() methods both return an Element. However, in some cases (like the code this comment chain is on), you actually want the more specific List, SExp, or Struct types, not the more general Element type. In those cases, you can configure your (e.g.) ListBuilder and then call fn build_list(self) -> List instead of fn build(self) -> Element. Does that make sense?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar thought should we have ion_struct!, ion_list! return Struct, List instead of Element and use ion! for returning an Element? This seems more intuitive to me and may also align with how builder uses list_builder(), sexp_builder() for building List or SExp?

My aim is to make the most common use cases the most streamlined. Users almost never need/want a List/SExp/Struct after constructing one, and having those types be the default return values for the macros would mean you'd need to do something like:

let e = Element::from(ion_list![1, 2, 3]);

or

let e: Element = ion_list![1, 2, 3].into();

instead of just

let e = ion_list![1, 2, 3];

The macros are named ion_list!, ion_sexp! and ion_struct! because each offers syntax that mirrors the Ion text equivalent.

Ion Rust
[1, 2, 3] ion_list! [1, 2, 3]
(1 2 3) ion_sexp! (1 2 3)
{foo: "bar"} ion_struct! {"foo": "bar"}

Notice that the container macros are intended to make it easy to turn Rust variables into Ion data:

let first_name = "Lisa";
let last_name = "Jones";
let e = ion_struct! {"first": first_name, "last": last_name};

while the ion! macro expects an already-valid Ion data stream. Inserting existing values requires injecting them into Ion data like this:

let first_name = "Lisa";
let last_name = "Jones";
let ion_data = format!(r#"{{"first": "{}", "last": "{}"}}"#, first_name, last_name);
let e = ion!(ion_data);

which is error prone and expensive.

Copy link
Contributor Author

@zslayton zslayton Mar 14, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My aim is to make the most common use cases the most streamlined. Users almost never need/want a List/SExp/Struct after constructing one, and having those types be the default return values for the macros would mean you'd need to do something like:

After some more consideration, I ended up swinging the other way on this.

Instead of returning a fully constructed Element, the macros now return their corresponding bare value type:

  • ion_list! returns List
  • ion_sexp! returns SExp
  • ion_struct! returns Struct

I did this for a few reasons:

  • When working with any other value type (Decimal, Integer, bool, f64, etc) it's understood that there's a required conversion step that needs to happen before you can use it as an Element. This change makes the bare collection value types consistent with the scalar types; when you build a List, SExp, or Struct, there's now a required conversion step that needs to happen before you can use it as an Element.
  • Because the bare value types implement Into<Element>, you only end up paying the syntactic cost of .into() at the top level. For example:
let list: Element = ion_list! [
  1,
  true,
  ion_struct! {"foo": "bar"}, // nested; like all other bare values in this list, no .into() required
  7.5f64
].into(); // Top level, .into() required
  • Functions that take an Into<Element> instead of an Element can accept the bare value types without explicit conversion.
fn do_something<E: Into<Element>>(value: E) {
  // ...something...
}

do_something(ion_list! [1, 2, 3]); // No explicit conversion required
  • Thanks to Matt's idea to offer with_annotations(...) on values that haven't become Elements yet, we can also construct a full Element that way:
let list_element = ion_list! [1, 2, 3].with_annotations(["foo", "bar"]);
  • Going this route eliminates some minor API stutter that would happen if you used the builder APIs to construct the bare value types instead of Element:
// Before
// `ListBuilder::build()` makes `Element`, `build_list()` makes `List`
let list: List = List::builder()
  .push(1)
  .push(2)
  .push(3)
  .build_list(); // List::builder...build_list() ☹️ 

// After
// There is only `ListBuilder::build()`, which returns `List`
let list: List = List::builder()
  .push(1)
  .push(2)
  .push(3)
  .build();
// or even better:
let list = ion_list! [1, 2, 3];
  • There are some methods in the builder APIs where it's nice to be able to easily construct and pass in the bare value types List, SExp, and Struct instead of doing ion_list![1, 2, 3].as_list().unwrap(). While the builder APIs are simple/limited at the moment, I have a sense that we may use that more in the future.

@@ -94,7 +94,7 @@ impl TextValue {
}
}

/// Converts a given type into a `Vec<OwnedSymbolToken>`.
/// Converts a given type into a `Vec<RawSymbolToken>`.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗺️ This was a dangling documentation reference to the long ago removed OwnedSymbolToken type.

assert_eq!(actual, expected);
assert_eq!(actual, expected);
}
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗺️ I can't comment directly on element_stream_reader.rs directly because it wasn't changed in this PR, but the "redundant closure" warning from clippy is caused by #472.

@@ -12,6 +12,7 @@
//! [simd-json-value]: https://docs.rs/simd-json/latest/simd_json/value/index.html
//! [serde-json-value]: https://docs.serde.rs/serde_json/value/enum.Value.html

pub mod builders;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗺️ Changes in this file are superficial; I just updated the unit tests to use the new macros where possible.

}
impl Element {
pub fn null(null_type: IonType) -> Element {
null_type.into()
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗺️ I renamed all of the new_TYPE methods to simply TYPE (e.g. Element::new_blob(...) to Element::blob(...) with the exception of the container types. struct is a Rust keyword, so the container types' methods' names now indicate that they return the appropriate builder. (e.g. new_struct() -> struct_builder())

}

fn new_bool(bool: bool) -> Element {
Value::Boolean(bool).into()
pub fn float(float: f64) -> Element {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗺️ I renamed the new_f64 method to new_float for consistent use of Ion type names.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can a float be case into a specific float type, e.g. Element.as_f64?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the Ion data model, a float is always 64 bits. Once it's read from the stream into memory, it's basically always an f64. However, on the writing side we make an effort to encode it as an f32 if it can be done losslessly.

@@ -467,111 +466,111 @@ impl PartialEq for Element {

impl Eq for Element {}

impl From<Value> for Element {
fn from(val: Value) -> Self {
Self::new(vec![], val)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗺️ Changes in this file were made to guarantee that Rust values with an obvious mapping to Ion Elements could be passed into APIs without needing explicit boilerplate conversion code.

src/value/reader.rs Outdated Show resolved Hide resolved
@zslayton zslayton marked this pull request as ready for review March 13, 2023 16:34
src/value/reader.rs Outdated Show resolved Hide resolved
src/value/builders.rs Outdated Show resolved Hide resolved
src/value/builders.rs Outdated Show resolved Hide resolved
src/value/builders.rs Show resolved Hide resolved
src/value/builders.rs Show resolved Hide resolved
src/value/builders.rs Show resolved Hide resolved
src/value/builders.rs Outdated Show resolved Hide resolved
src/value/owned.rs Show resolved Hide resolved
src/value/reader.rs Outdated Show resolved Hide resolved
@zslayton zslayton requested a review from popematt March 14, 2023 17:11
@jobarr-amzn jobarr-amzn mentioned this pull request Mar 14, 2023
@zslayton zslayton deleted the element-macros branch April 10, 2023 16:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants