export to file-like objects #61

Closed
umlaeute opened this issue Sep 15, 2016 · 5 comments

@umlaeute

Would it be possible to have the exporters/importers take file-like objects (aka streams) instead of filenames?

E.g. I would like to get a JSON representation of the database without having to create a temporary file and then read that file back into my application.
Instead, I'd like to do something like:

import canmatrix.exportjson
import io
f = io.StringIO()
canmatrix.exportjson.exportJson(db, f)
j = f.getvalue()

(Of course, what I would really like is a meaningful representation of the database in Python; this is related to #57. But I still think it is a good idea not to have external files, referenced via filenames, as the sole interface.)

@ebroecker ebroecker self-assigned this Sep 16, 2016
@ebroecker
Owner

What do you think about this kind of solution?

283e904#diff-ca99f39597194379924d2c654a772772

(in branch stableApi)

@altendky
Collaborator

If detecting type you should use isinstance() one way or the other, but let's look a little deeper. Note that I already hacked around this on the input side myself.

I see four basic possibilities: file path (string-like-object), file contents (string-like-object), file (file-like-object), and file descriptor (integer? I forget). I'll suggest that we at least consider separate parameters or separate function calls to handle each. The core function would either take the file contents or a file like object. The file contents is reasonable since these files tend not to be too terribly large. If we want to be concerned about large files then a file like object would be better since it doesn't require that the entirety of the file be in memory at once. Of course, since that is behind the scenes it could change at any point without affecting the interface.

When I asked on #python I was given the standard JSON library as an example. It offers load() for file-like objects and loads() for a string containing the contents. I can argue against adding an interface for a string containing the file path by comparing the cases with and without it. With such an interface, the application programmer must figure out which exceptions the library raises when open() fails. Without it, they only have to write the standard open() call themselves. Plus, leaving the open() call to the application developer lets them control its other options directly. In short, I see little useful abstraction in the string-like file path option, but there is obfuscation. I would be fine with dropping this interface.

Of the remaining three, the file descriptor seems the least likely to be used.

This leaves the primary two interfaces as file-like-objects and contents-containing-strings.
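
A minimal sketch of that two-interface layout, modelled on the standard `json` module's `load()`/`loads()` pair. The toy `key = value` format and the function bodies are assumptions made for illustration; they are not canmatrix's code. The point is only that the core functions work on a file-like object and the string variants are thin wrappers.

```python
import io


def load(fileobj):
    """Core reader: parse 'key = value' lines from any iterable text stream."""
    result = {}
    for line in fileobj:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, _, value = line.partition("=")
        result[key.strip()] = value.strip()
    return result


def loads(contents):
    """Wrapper: parse from a string that holds the file contents."""
    return load(io.StringIO(contents))


def dump(data, fileobj):
    """Core writer: write to any object with a .write() method."""
    for key in sorted(data):
        fileobj.write("{} = {}\n".format(key, data[key]))


def dumps(data):
    """Wrapper: return the serialized contents as a string."""
    buf = io.StringIO()
    dump(data, buf)
    return buf.getvalue()
```

A caller who has a path simply writes `with open(path) as f: load(f)`, which keeps the choice of mode, encoding, and error handling in the application.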

@ebroecker
Owner

ebroecker commented Sep 19, 2016 via email

@altendky
Collaborator

That's an interesting point that the 'any' functions are based on the filename so they kind of need that information. It seems that all the 'any' functions provide is a mapping from a string (the file extension) to the corresponding import and export functions. Couldn't they provide the same pair of functions (file-like-object, and contents-containing-string) as the individual formats? Then additionally take the file extension. This would replace the existing force-output option.
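
One way to picture this is a small registry keyed by format name, with the same file-like and string entry points taking the format as an extra argument. This is only a sketch under stated assumptions: the registry `_FORMATS` and the JSON stand-in handlers are hypothetical, and real format modules would be plugged in where the JSON ones are.

```python
import io
import json


def _load_json(fileobj):
    return json.load(fileobj)


def _dump_json(data, fileobj):
    json.dump(data, fileobj, sort_keys=True)


# Hypothetical registry: format name -> (loader, dumper).
_FORMATS = {"json": (_load_json, _dump_json)}


def _lookup(fmt):
    try:
        return _FORMATS[fmt.lower()]
    except KeyError:
        raise ValueError("unsupported format: {!r}".format(fmt))


def load(fileobj, fmt):
    """Read from a file-like object; the caller names the format."""
    loader, _ = _lookup(fmt)
    return loader(fileobj)


def loads(contents, fmt):
    """Read from a string holding the file contents."""
    return load(io.StringIO(contents), fmt)


def dump(data, fileobj, fmt):
    """Write to a file-like object in the named format."""
    _, dumper = _lookup(fmt)
    dumper(data, fileobj)
```

With the format passed explicitly, nothing in these functions needs a filename, so a separate force-output option would have nothing left to do.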

On to the second point. Which exporters require the binary format? Hmm, I can probably answer that question myself. :] Looks like:

https://github.com/ebroecker/canmatrix/blob/383702bf4b73c31688214347018371b41a0fbe24/canmatrix/exportarxml.py
https://github.com/ebroecker/canmatrix/blob/383702bf4b73c31688214347018371b41a0fbe24/canmatrix/exportdbc.py
https://github.com/ebroecker/canmatrix/blob/383702bf4b73c31688214347018371b41a0fbe24/canmatrix/exportdbf.py
https://github.com/ebroecker/canmatrix/blob/383702bf4b73c31688214347018371b41a0fbe24/canmatrix/exportfibex.py
https://github.com/ebroecker/canmatrix/blob/383702bf4b73c31688214347018371b41a0fbe24/canmatrix/exportjson.py
https://github.com/ebroecker/canmatrix/blob/383702bf4b73c31688214347018371b41a0fbe24/canmatrix/exportkcd.py
https://github.com/ebroecker/canmatrix/blob/383702bf4b73c31688214347018371b41a0fbe24/canmatrix/exportsym.py

I presume there are various reasons these require binary, and in the case of JSON it depends on the Python version. :[ In some of the cases I believe that opening the file in non-binary mode would consistently result in an exception. To be helpful we could catch, chain, and reraise the exception and add a comment about the likely reason (I just read about exception chaining but haven't actually used it yet), or just leave it as is. In the exception-free cases (do you know off-hand which ones do this?), perhaps we could help out by checking the file's .mode to make sure it contains b? I'm not sure of the best way to handle this. I'll let you know if I find something, but asking on #python didn't yield anything nifty.
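
Both ideas can be sketched briefly. The helper names `require_binary` and `dump_bytes` are hypothetical, and `raise ... from` is Python 3 syntax, so this would need adapting if Python 2 has to be supported. Note that not every stream has a string `.mode` (`io.BytesIO` has none, and `gzip.GzipFile.mode` is an integer), so the mode check is treated as best-effort.

```python
import io


def require_binary(fileobj):
    """Raise early if fileobj is clearly a text-mode stream."""
    if isinstance(fileobj, io.TextIOBase):
        raise TypeError("expected a binary stream, got a text stream")
    # Best-effort fallback for objects outside the io hierarchy.
    mode = getattr(fileobj, "mode", None)
    if isinstance(mode, str) and "b" not in mode:
        raise TypeError("file mode {!r} does not contain 'b'".format(mode))


def dump_bytes(payload, fileobj):
    """Write bytes, chaining a hint onto the original error on failure."""
    try:
        fileobj.write(payload)
    except TypeError as exc:
        raise TypeError(
            "write failed; was the file opened without 'b' in its mode?"
        ) from exc
```

The chained exception keeps the original traceback available through `__cause__`, so the hint adds information without hiding the underlying error.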

I understand that this is not a solution and I am willing to put in some time to actually code but wanted to discuss a bit first.

@ebroecker
Owner

Fixed in "stableApi".

The interface has changed completely.

The easiest way to use it is now:

import canmatrix.formats
f = open("someDbc.dbc","rb")
# use file object interface
canmatrix.formats.load(f, "dbc")

##### 
# or use path interface
canmatrix.formats.loadp("someDbc.dbc")  # this is like importany.importany("someDbc.dbc")

##### 
# or use string interface
canmatrix.formats.loads(buffer, "dbc")

Exporting is now done with "dump":

import canmatrix.formats
f = open("someDbc.dbc","wb")  # open for writing, since dump writes to f
# use file object interface
canmatrix.formats.dump(db, f, "dbc")

##### 
# or use path interface
canmatrix.formats.dumpp(db, "someDbc.dbc")  # this is like exportany.exportany(db, "someDbc.dbc")

There is no documentation yet, sorry.
The best entry point is probably formats.py.
