Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add simpleion support for structs with multiple values for the same key #36

Closed
tgregg opened this issue Jan 20, 2017 · 4 comments · Fixed by #82
Closed

Add simpleion support for structs with multiple values for the same key #36

tgregg opened this issue Jan 20, 2017 · 4 comments · Fixed by #82

Comments

@tgregg
Copy link
Contributor

tgregg commented Jan 20, 2017

Simpleion maps Ion structs to python dicts, whose keys are de-duped. IonPyDict needs to support duplicate mappings when iterated and compared.

Successful resolution of this issue will involve removal of the following files from the test_vectors.py skip list:
good/equivs/structsFieldsRepeatedNames.ion
good/nonequivs/structs.ion

@rubcuevas
Copy link
Contributor

rubcuevas commented Apr 16, 2019

Any plans to tackle this? I'm using this library for an internal project but will need to drop it unless this is fixed.

@tgregg
Copy link
Contributor Author

tgregg commented Apr 16, 2019

Yes, however there are several other things that we would probably prioritize over this in the short term, especially performance.

That said, we would be happy to entertain a pull request if you need this functionality immediately.

@rubcuevas
Copy link
Contributor

I'll be happy to help if I find some time.

Do you have an estimation on how complex this change is? Can I get a rough description about the changes I'd need to do / files to look at / small design - provided that you might have thought about it already?

I took a quick look at the code. Let me know if I got this right: in simple_types.py I can see this line:

IonPyDict = _ion_type_for('IonPyDict', dict)  # TODO support multiple mappings for same field name (iteration only).

One of the changes needed would be to remove that line and customize the IonPyDict class:

class IonPyDict(_IonNature):
# Custom methods here
...

In order to support multiple mappings for the same field, IonPyDict could keep its state in a defaultdict(list). However, if I do this, I suspect that I wouldn't be able to tell the difference between a key that maps to an array or a key with multiple mappings. Am I right?

Another change I could identify is in the _load method from simpleion.py:

def _load(out, reader, end_type=IonEventType.STREAM_END, in_struct=False):

    def add(obj):
        if in_struct:
            # TODO what about duplicate field names?
            out[event.field_name.text] = obj
        else:
            out.append(obj)

I understand that I should stop using the outdict and use a defaultdict(list) or the custom IonPyDict.

And there might also be some changes to do in the _dump method.

Does this sound right to you? Am I missing something?

@tgregg
Copy link
Contributor Author

tgregg commented Apr 17, 2019

I think you're on the right track here. When updating IonPyDict, it will be important to retain the existing dict behavior, namely the ability to use [] to access its values. Behind the scenes, you can keep track of multiple values for the same key (such as by using defaultdict(list)), but [] would return just one. That way, existing users won't break and the recommended usage pattern (namely, one value per key) won't be complicated.

You can add the ability to receive all values for a given key through a separate API. I'd also expect iterating over the IonPyDict to return all values for each key. If you're able to do this, you might even avoid having to change dump.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants