AVRO-3248 Rust: Support named types in UnionSchema #1393

Closed
martin-g wants to merge 2 commits into apache:master from martin-g:avro-3248-better-union-schema-in-rust

Conversation

@martin-g
Member

@martin-g martin-g commented Nov 7, 2021

Drop the index map in UnionSchema. It is not common to have many schemata in a union: usually it is either "null" + otherType or just a few other types. Iterating over a short list is not that slow.
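To illustrate the idea, here is a minimal sketch of a union that keeps its variants in a plain `Vec` and finds the matching one by linear scan. The `Schema`, `Value`, `validates` and `find_schema` names below are simplified stand-ins made up for this example; they are not the crate's actual types or API.

```rust
/// Simplified stand-in for an Avro schema (assumption, not the crate's real `Schema`).
#[derive(Debug, Clone, PartialEq)]
pub enum Schema {
    Null,
    Float,
    String,
}

/// Simplified stand-in for an Avro value.
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Null,
    Float(f32),
    String(String),
}

/// Hypothetical helper: does `value` conform to `schema`?
pub fn validates(value: &Value, schema: &Schema) -> bool {
    matches!(
        (value, schema),
        (Value::Null, Schema::Null)
            | (Value::Float(_), Schema::Float)
            | (Value::String(_), Schema::String)
    )
}

/// Union variants kept in a plain `Vec`, with no auxiliary index map.
pub struct UnionSchema {
    pub schemas: Vec<Schema>,
}

impl UnionSchema {
    /// Linear scan: position and schema of the first variant that accepts `value`.
    pub fn find_schema(&self, value: &Value) -> Option<(usize, &Schema)> {
        self.schemas
            .iter()
            .enumerate()
            .find(|(_, schema)| validates(value, schema))
    }
}
```

For a typical `["null", "float"]` union the scan touches at most two entries, which is the case the description above has in mind.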

Jira

  • My PR addresses the following AVRO-3248 issue

Tests

  • My PR adds new unit tests

Commits

  • My commits all reference Jira issues in their subject lines.

@github-actions github-actions bot added the Rust label Nov 7, 2021
@lulitao1997
Contributor

lulitao1997 commented Nov 12, 2021

Sorry, I missed your message; somehow it didn't send the notification by email :P

I checked, and your solution seems to be wrong if we have a union like this:

["A", "B"]

where A and B are defined like this:

        // A and B are the same except the name.
        let schema_str_1 = r#"{
            "name": "A",
            "type": "record",
            "fields": [
                {"name": "field_one", "type": "float"}
            ]
        }"#;

        let schema_str_2 = r#"{
            "name": "B",
            "type": "record",
            "fields": [
                {"name": "field_one", "type": "float"}
            ]
        }"#;

In this case, the schema found for a value will always be A's schema.

This also applies when A and B are "fixed" types, which seems plausible in real life (for example, A = {"name": "fieldA", "type": "fixed", "size": 10}, B = {"name": "fieldB", "type": "fixed", "size": 10}).
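A rough sketch of why this happens, assuming the union is resolved by taking the first variant whose structure accepts the value. `RecordSchema`, `RecordValue`, `validates` and `resolve` are hypothetical names for this example only, not the crate's API.

```rust
/// Simplified stand-in for a named record schema whose fields are all floats.
#[derive(Debug, Clone, PartialEq)]
pub struct RecordSchema {
    pub name: String,
    pub fields: Vec<String>,
}

/// Simplified record value: field names paired with floats.
/// Note that it carries no record name.
pub type RecordValue = Vec<(String, f32)>;

/// Hypothetical structural check: same field names in the same order.
pub fn validates(value: &RecordValue, schema: &RecordSchema) -> bool {
    value.len() == schema.fields.len()
        && value
            .iter()
            .zip(&schema.fields)
            .all(|((name, _), field)| name == field)
}

/// First-match resolution over the union's variants.
pub fn resolve<'a>(union: &'a [RecordSchema], value: &RecordValue) -> Option<&'a RecordSchema> {
    union.iter().find(|schema| validates(value, schema))
}
```

Because the value does not say which record it belongs to, a value intended for B also validates against A, and `resolve` stops at A since it comes first in the list.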

Here is my PR for this scenario: #1396

To solve that, we can change Value::Union to hold (index in the type list, value it holds), similar to Value::Enum. This allows us to get the Union's inner_schema for named types directly, without validating the schema.
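Roughly, the proposal could look like the sketch below. The types are simplified stand-ins, and the `usize` index type and the free function `inner_schema` are assumptions for illustration; see #1396 for the real change.

```rust
/// Simplified stand-in for an Avro schema (assumption, not the crate's real type).
#[derive(Debug, Clone, PartialEq)]
pub enum Schema {
    Null,
    Fixed { name: String, size: usize },
}

/// Simplified stand-in for an Avro value.
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Null,
    Fixed(Vec<u8>),
    /// Position of the chosen variant in the union, plus the wrapped value.
    Union(usize, Box<Value>),
}

/// Look up the inner schema by the index stored in the value.
/// No validation against the variants is performed.
pub fn inner_schema<'a>(variants: &'a [Schema], value: &Value) -> Option<&'a Schema> {
    match value {
        Value::Union(index, _) => variants.get(*index),
        _ => None,
    }
}
```

With the index stored in the value, two structurally identical variants such as fieldA and fieldB are no longer ambiguous: `Value::Union(1, ...)` always selects the second one.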

@martin-g
Member Author

Closed in favour of #1396

@martin-g martin-g closed this Jan 19, 2022