Parsing (registry) multiple schemas #32

Closed
x10ba opened this issue Feb 12, 2016 · 11 comments

Comments

@x10ba

x10ba commented Feb 12, 2016

Hi,
I am trying to parse schemas that are split across multiple files (my understanding is that I need to register these during the parse). However, the documentation is a bit sparse and I am having difficulty using the registry. Can you give an example of how to use it?
Thanks,
x10ba
Pseudo code (Node.js):

var first_type = avsc.parse('./first.avsc'); // how do I use the registry?
var second_type = avsc.parse('./second.avsc');

// parse an Avro-encoded message with fromBuffer
console.log(first_type.fromBuffer(buffer));

Two schema files:

// first.avsc
{
  "namespace": "x10ba",
  "type": "record",
  "name": "first",
  "fields": [
    { "name": "stuff", "type": [ "long", "null" ] },
    { "name": "moreStuff", "type": [ "x10ba.second", "null" ] }
  ]
}

// second.avsc
{
  "namespace": "x10ba",
  "type": "record",
  "name": "second",
  "fields": [
    { "name": "someStuff", "type": [ "string", "null" ] }
  ]
}

@mtth
Owner

mtth commented Feb 12, 2016

Sure! The two things to know are:

  • When parse creates a new named type (for example a record), it will add it to its registry.
  • When parse encounters a type reference, it will look it up inside its registry (and throw an error if it isn't found).

By default the registry is just an empty object, but by sharing it between parse calls we can allow a schema to reference types defined in another schema. Using your example, that would look like:

var avro = require('avsc');

var registry = {};
var secondType = avro.parse('./second.avsc', {registry: registry});
// At this point `registry` contains the definition of `x10ba.second`.
var firstType = avro.parse('./first.avsc', {registry: registry});

Note that the order in which we parse the schemas is important (it should be chosen such that references can be resolved).
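To make the ordering point concrete, here is a toy model of the two rules above. It is a sketch for illustration only and is not avsc's implementation: `resolve`, `PRIMITIVES`, and the simplified schema shapes are all hypothetical names introduced here.

```javascript
'use strict';

// Toy model only (NOT avsc's code): a tiny resolver that follows the two
// rules described above. All names here are hypothetical.
const PRIMITIVES = ['null', 'long', 'string'];

function resolve(schema, registry) {
  if (typeof schema === 'string') {
    if (PRIMITIVES.indexOf(schema) >= 0) {
      return {name: schema};
    }
    // Rule 2: a reference must already be present in the registry.
    if (!Object.prototype.hasOwnProperty.call(registry, schema)) {
      throw new Error('undefined type name: ' + schema);
    }
    return registry[schema];
  }
  // Rule 1: a newly created named type is added to the registry.
  const type = {name: schema.name, fields: {}};
  schema.fields.forEach(function (field) {
    type.fields[field.name] = resolve(field.type, registry);
  });
  registry[schema.name] = type;
  return type;
}

// Simplified stand-ins for second.avsc and first.avsc.
const second = {name: 'x10ba.second', fields: [{name: 'someStuff', type: 'string'}]};
const first = {name: 'x10ba.first', fields: [{name: 'moreStuff', type: 'x10ba.second'}]};

// Sharing one registry, with the dependency resolved first, works.
const shared = {};
resolve(second, shared);
const firstType = resolve(first, shared);

// Resolving `first` against an empty registry fails on the reference.
let wrongOrderError = null;
try {
  resolve(first, {});
} catch (err) {
  wrongOrderError = err.message;
}
console.log(wrongOrderError);
```

In this model, a fresh registry per call is what makes the reference fail, which mirrors why the real calls need to share one object.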

@x10ba
Author

x10ba commented Feb 12, 2016

BTW, I love your module, and you are very responsive (excellent open source committer).

So, I'd call:
var obj = avroEncodedMessage; // a Buffer
firstType.fromBuffer(obj);
// The secondType reference is held in the registry, so this would not error?

@mtth
Owner

mtth commented Feb 12, 2016

Thanks :).

So, I'd call:

obj = avroEncodedMessage; // (buffer)
firstType.fromBuffer(obj)
//secondType reference held in the registry, so would not error?

Right. If parse successfully returned, then you're all set.

@x10ba
Author

x10ba commented Feb 12, 2016

So simple.
Thanks for the help. Closing issue. Have a great weekend.

@x10ba x10ba closed this as completed Feb 12, 2016
@x10ba
Author

x10ba commented Feb 12, 2016

Hi,
Though the parser and registry work, I'm adding this comment about something I am trying to resolve:

// In the above examples, the decoded result shows this for one of the key/value pairs.

// The registry works when parsing, but the output shows [Object].
{...,
"moreStuff":{"x10ba.second": [Object]}
}

@mtth
Owner

mtth commented Feb 12, 2016

That's probably because nested objects get truncated when you print them (see util.inspect).

There are multiple ways around it, for example:

  • You can increase the depth: console.log(util.inspect(obj, {depth: null}))
  • You can stringify the object first: console.log(JSON.stringify(obj, null, 2))
  • You can use Avro's JSON encoding: console.log(type.toString(obj))
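The first two options can be tried with nothing but Node's standard library. The sketch below uses a hand-written object, `decoded`, as a hypothetical stand-in for a decoded record; it is nested deeply enough that the default `util.inspect` depth truncates it. The third option needs an avsc type and is not shown here.

```javascript
'use strict';
const util = require('util');

// Hypothetical stand-in for a decoded value; the nesting is deep enough
// to exceed util.inspect's default depth of 2.
const decoded = {
  record: {
    moreStuff: {
      'x10ba.second': {someStuff: {string: 'hello'}}
    }
  }
};

// Default depth: the innermost object is replaced by [Object].
const truncated = util.inspect(decoded);

// depth: null removes the limit, so every level is printed.
const full = util.inspect(decoded, {depth: null});

// JSON.stringify always serializes the whole structure.
const json = JSON.stringify(decoded, null, 2);

console.log(truncated);
console.log(full);
console.log(json);
```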

@x10ba
Author

x10ba commented Feb 12, 2016

yep. worked.

@mtth mtth mentioned this issue Feb 15, 2016
@krukru

krukru commented Mar 14, 2018

Is it possible to have circular dependencies with this pattern?
For example, record A has a field of type B, and record B has a field of type A.

Would it help if all the schemas were jumbled in one json file?

@mtth
Owner

mtth commented Mar 15, 2018

@krukru - yes, you should put both schemas in the same file if you have a circular dependency.

@krukru

krukru commented Mar 15, 2018

Hey @mtth, could you then help me out? I tried the following, but it did not work as expected.

    it("Should parse circular dependencies", function() {
        const type = avro.Type.forSchema([
            {
                "type": "record",
                "name": "A",
                "fields": [
                    {
                        "name": "fieldB",
                        "type": "B"
                    },
                ]
            },
            {
                "type": "record",
                "name": "B",
                "fields": [
                    {
                        "name": "fieldA",
                        "type": "A"
                    },
                ]
            }
        ]);

        const objA: A = new A();
        const buffer = type.toBuffer(objA);
    });

This fails with the following exception:

Error: undefined type name: B
    at Function.Type.forSchema (nodejs/node_modules/avsc/lib/types.js:167:11)
    at new Field (nodejs/node_modules/avsc/lib/types.js:2722:20)
    at RecordType.<anonymous> (nodejs/node_modules/avsc/lib/types.js:2086:17)
    at Array.map (<anonymous>)
    at new RecordType (nodejs/node_modules/avsc/lib/types.js:2085:45)
    at nodejs/node_modules/avsc/lib/types.js:210:14
    at Function.Type.forSchema (nodejs/node_modules/avsc/lib/types.js:211:7)

@mtth
Owner

mtth commented Mar 15, 2018

Sure, you'll need to expand B's schema the first time it's encountered. I think this'll do the trick:

const type = avro.Type.forSchema([
  {
    "type": "record",
    "name": "A",
    "fields": [
      {
        "name": "fieldB",
        "type": {
          // Put B's schema here since it's the first time we encounter it.
          "type": "record",
          "name": "B",
          "fields": [{"name": "fieldA", "type": "A"}]
        }
      }
    ]
  },
  "B" // Now we can just reference it by name.
]);
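Why inlining helps can be shown with a toy resolver. This is a sketch under an assumption, not avsc's code: it assumes a named type is registered before its fields are resolved, so that a nested reference back to the enclosing record succeeds. `resolve` and `resolveAll` are hypothetical names.

```javascript
'use strict';

// Toy model only (NOT avsc's code). Assumption: a record's name is
// registered BEFORE its fields are resolved.
function resolve(schema, registry) {
  if (typeof schema === 'string') {
    if (!Object.prototype.hasOwnProperty.call(registry, schema)) {
      throw new Error('undefined type name: ' + schema);
    }
    return registry[schema];
  }
  const type = {name: schema.name, fields: {}};
  registry[schema.name] = type; // registered first, fields filled in after
  for (const field of schema.fields) {
    type.fields[field.name] = resolve(field.type, registry);
  }
  return type;
}

// Resolve a list of schemas in order against one shared registry.
function resolveAll(schemas) {
  const registry = {};
  return schemas.map(function (schema) {
    return resolve(schema, registry);
  });
}

// Two sibling definitions: A references B before B has been seen.
const siblings = [
  {name: 'A', fields: [{name: 'fieldB', type: 'B'}]},
  {name: 'B', fields: [{name: 'fieldA', type: 'A'}]}
];

// B is defined inline at its first use, then referenced by name.
const inlined = [
  {
    name: 'A',
    fields: [
      {name: 'fieldB', type: {name: 'B', fields: [{name: 'fieldA', type: 'A'}]}}
    ]
  },
  'B'
];

let failure = null;
try {
  resolveAll(siblings);
} catch (err) {
  failure = err.message;
}

const resolved = resolveAll(inlined);
const typeA = resolved[0];
const typeB = resolved[1];
console.log(failure, typeA.fields.fieldB === typeB, typeB.fields.fieldA === typeA);
```

In this model the sibling layout fails because `B` is looked up while `A`'s fields are still being resolved, whereas the inlined layout has already registered `A` by the time `B`'s field refers back to it.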
