Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New features: faster xdr reader, ascii file reader, basic writers #31

Open
trossi opened this issue Dec 4, 2023 · 2 comments · May be fixed by #40
Open

New features: faster xdr reader, ascii file reader, basic writers #31

trossi opened this issue Dec 4, 2023 · 2 comments · May be fixed by #40

Comments

@trossi
Copy link
Contributor

trossi commented Dec 4, 2023

Hi @vnmabus! This package has been greatly useful and I have implemented some new features (faster reading of arrays, a reader for ascii files, and xdr and ascii writers for basic types). Would such features fit to your plans for the rdata package? I could submit them in a few separate PRs to make the review smoother.

@vnmabus
Copy link
Owner

vnmabus commented Dec 5, 2023

They definitely fit. In fact, I mentioned in #13 the need for faster reading of arrays.

Furthermore, we need to move away from xdrlib, as it will be removed in Python 3.13. I want to discuss this topic with the reviewers of PyOpenSci (pyOpenSci/software-submission#144), @rich-iannone, @has2k1 and @isabelizimm because I do not know if there is a semi-official successor package, or if it is better to vendor it. As I understand, your proposal is to add the XDR reader/writer to this project, instead of depending on an external parser, am I right?

As for the ASCII format, I guess that the implementation is to just add a parser subclass and overwrite the reading methods therein. At least that is what I thought that could be done (but nobody asked for that feature so I did not bother implementing it). I will definitively add it if you create a PR.

I have more doubts about the writing methods, as currently this package only reads and does not write XDR (but again, I am not opposed to it, it is just that I did not need it personally and it is a ton of work). I guess that you just added the methods for writing XDR basic types and arrays, which is fine.

@trossi
Copy link
Contributor Author

trossi commented Dec 5, 2023

They definitely fit. In fact, I mentioned in #13 the need for faster reading of arrays.

Great!

Furthermore, we need to move away from xdrlib, as it will be removed in Python 3.13. I want to discuss this topic with the reviewers of PyOpenSci (pyOpenSci/software-submission#144), @rich-iannone, @has2k1 and @isabelizimm because I do not know if there is a semi-official successor package, or if it is better to vendor it. As I understand, your proposal is to add the XDR reader/writer to this project, instead of depending on an external parser, am I right?

Yes, I did reading directly with numpy and a positive side effect is indeed that deprecated xdrlib isn't needed anymore.

As for the ASCII format, I guess that the implementation is to just add a parser subclass and overwrite the reading methods therein. At least that is what I thought that could be done (but nobody asked for that feature so I did not bother implementing it). I will definitively add it if you create a PR.

That's right, exactly so. ASCII format is quite handy for debugging and testing, but not really that useful for production use.

I have more doubts about the writing methods, as currently this package only reads and does not write XDR (but again, I am not opposed to it, it is just that I did not need it personally and it is a ton of work). I guess that you just added the methods for writing XDR basic types and arrays, which is fine.

Yes, I implemented xdr/ascii writer classes analogous to parsers, and conversion functions for lists, dicts, and numpy arrays to R-like format. I agree that covering all types would be a lot of work and the probably there isn't unique mapping between Python and R types either. I basically focused only to types that I needed for a project.

@trossi trossi mentioned this issue Dec 5, 2023
@trossi trossi linked a pull request Mar 4, 2024 that will close this issue
4 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants