Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feature: add support for io operations from filenames #52

Merged

Conversation

neutrinoceros
Copy link
Contributor

@neutrinoceros neutrinoceros commented May 22, 2021

fix #51
opening as a draft for now, I'll need to add tests

There's room for discussion over the exact implementation for this since no matter how it is handled, it will always add a slight overhead to the existing behaviour, either as an additional conditional check or a try/except.
The latter solution is particularly expensive (performance wise) if an exception is typically raised more than 1% of the time, so I went with the former, which should make both load and dump perform decently in any case, to no significant cost.

A side effect is that in order to achieve this properly, I had to add support to load/dump to and from binary files.

@neutrinoceros neutrinoceros force-pushed the io_operations_from_filenames branch 2 times, most recently from 6f4deaa to d79da4a Compare May 22, 2021 10:56
@neutrinoceros neutrinoceros marked this pull request as ready for review May 22, 2021 10:59
@marzer
Copy link
Collaborator

marzer commented May 22, 2021

RE performance concerns: given my findings in #42 I'd be surprised it came up on a profiler regardless of which route you go, since the vast majority of CPU time is spent in C++ (toml::parse)

@neutrinoceros
Copy link
Contributor Author

Nice, this is a relief ! However the patch ended up increasing the complexity of your io functions much more than I anticipated, no hard feelings if you don't want it for this reason :)

@marzer
Copy link
Collaborator

marzer commented May 22, 2021

Had a quick look at your changes and they look fine to me, but ultimately it's up to @bobfang1992 since the python lib is his project, I'm just the shadowy C++ spectre looming in the shadows :)

One concern I have is the addition of the encoding parameter; note that the TOML spec explicitly states that all valid TOML is UTF-8, so if any non-UTF-8 string is passed through to toml++ it will likely fail with a 'malformed UTF-8' error. Shouldn't be a problem if you can ensure that python handles all the encoding changes internally and the C++ lib never sees anything other than UTF-8.

(worth noting that I don't know how pybind11 handles strings; it's possible that it already ensures they're UTF-8?)

@neutrinoceros
Copy link
Contributor Author

One concern I have is the addition of the encoding parameter; note that the TOML spec explicitly states that all valid TOML is UTF-8,

Thank you, this is an excellent point that I failed to consider.

Furthermore, quoting the spec (re: binary)

for binary data, it is recommended that you use Base64 or another suitable ASCII or UTF-8 encoding

so It seems pretty clear to me that the encoding param should not be exposed in pytomlpp's api. I'll simplify my patch accordingly.

@neutrinoceros
Copy link
Contributor Author

I see that the wheels are failing to build because some of the tests that I added use pathlib.Path objects as inputs for load and dump, but the builtin open function only gained support for such object in Python 3.6
It should suffice to convert those paths to strings in the tests, let's see.

@bobfang1992 bobfang1992 self-requested a review May 22, 2021 19:43
@bobfang1992
Copy link
Owner

Hi I fixed this in #53 can you rebase?

@neutrinoceros
Copy link
Contributor Author

Done !
To be clear, should I remove the third commit, or do you need to keep tests compatible with Python 3.5 ?

@bobfang1992
Copy link
Owner

bobfang1992 commented May 23, 2021

I see no harm keeping them the way they are. I think we can merge this if the CI passes.

@bobfang1992 bobfang1992 merged commit f9d7c41 into bobfang1992:master May 23, 2021
@neutrinoceros
Copy link
Contributor Author

Thank you !

@neutrinoceros neutrinoceros deleted the io_operations_from_filenames branch May 24, 2021 13:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Feature request: overloading load/dump to accept filename/paths
3 participants