Add comment writing capabilities #12

Closed
Anaphory opened this issue May 28, 2018 · 7 comments

Comments

@Anaphory

Anaphory commented May 28, 2018

I just had a use case for writing comments to a CSV file, in order to ensure that light metadata stays with the data.
Given that the readers in this module can apparently read CSV with comments, it seems appropriate for the writer to have a method like

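def writecomment(self, comment):
    """Write `comment` to the file, one prefixed line per comment line."""
    # A dialect without a comment prefix cannot represent comments.
    if self.comment_prefix is None:
        raise ValueError(
            'Cannot write comments in this csv dialect')
    for row in comment.split("\n"):
        self.f.write(self.comment_prefix)
        self.f.write(row)
        self.f.write('\n')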

which probably still needs consideration of encoding and escaping. (And self.comment_prefix would need to be set in the constructor.)
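
For illustration, here is a minimal sketch of how the constructor part could look. It assumes a hypothetical `CommentingWriter` class built on the standard library `csv` module, not on this package's own writer classes. The class name, the `comment_prefix` argument and its default of `None` are assumptions; encoding is left to whoever opens the file, and escaping is not handled.

```python
import csv
import io


class CommentingWriter:
    """Hypothetical wrapper around csv.writer (not this package's API)."""

    def __init__(self, f, comment_prefix=None, **fmtparams):
        self.f = f
        # Assumption: None means "this dialect has no comment syntax".
        self.comment_prefix = comment_prefix
        self._writer = csv.writer(f, **fmtparams)

    def writerow(self, row):
        self._writer.writerow(row)

    def writecomment(self, comment):
        if self.comment_prefix is None:
            raise ValueError('Cannot write comments in this csv dialect')
        # splitlines() handles \n, \r\n and \r, so every line of a
        # multi-line comment gets its own prefix.
        for line in comment.splitlines():
            self.f.write(self.comment_prefix + line + '\n')


# Usage: write two comment lines, then ordinary rows.
buffer = io.StringIO()
writer = CommentingWriter(buffer, comment_prefix='#', lineterminator='\n')
writer.writecomment('seed: 42\nruns: 10')
writer.writerow(['id', 'value'])
writer.writerow(['1', 'a'])
output = buffer.getvalue()
```

Keeping the prefix as a constructor argument means a writer created without one refuses comments outright instead of silently corrupting the data.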

@LinguList

Would be useful for handling things like edictor setting comments, but also other cases, of course. I remember we put this aside, but I do not remember in which issue we discussed it (maybe somewhere in pycldf).

@Anaphory
Author

Anaphory commented Jun 5, 2018

I have started a rudimentary implementation:
Anaphory@2c65974

@xrotwang
Contributor

xrotwang commented Jun 5, 2018

Hm. I almost feel bad about having implemented more support for comments than just skipping them on read :)

Why should the metadata stay with the data - but in an idiosyncratic/under-specified way? Why not add a column? Or put the metadata into the JSON file?

I sort of see the edictor use case - but then, for global (i.e. not per-row) metadata stored as comments in the csv, it would be easy enough to read the comments in separate code, and have csvw simply strip the comments.
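
As a rough sketch of what "reading the comments in separate code" could look like, the helper below uses only the standard library. The function name `split_comments` and the `#` prefix are assumptions, and it does not use or reproduce csvw's own comment handling.

```python
import csv
import io


def split_comments(lines, comment_prefix='#'):
    """Separate comment lines from data lines.

    Assumption: a comment is any line that starts with comment_prefix;
    a prefix at the start of a line inside a quoted multi-line cell
    would be misclassified as a comment.
    """
    comments, data = [], []
    for line in lines:
        if line.startswith(comment_prefix):
            comments.append(line[len(comment_prefix):].strip())
        else:
            data.append(line)
    return comments, data


# Usage: global metadata goes one way, the rows go to a plain CSV reader.
raw = io.StringIO('#source: simulation\nid,value\n1,a\n')
comments, data = split_comments(raw)
rows = list(csv.reader(data))
```

The CSV reader then only ever sees comment-free input, and the metadata can be handled by whatever tool cares about it.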

@LinguList

Yes, that's what edictor is doing now, and in fact, when sharing data that has comments inside it, it is probably better to strip those off anyway, at least for publication...

@Anaphory
Author

Anaphory commented Jun 5, 2018

In this case, I have a lot of simulation results in different files, and I have once too often accidentally moved data files without their metadata, losing the metadata. That's why I wanted an (admittedly) quick-and-dirty way to keep metadata in the data file.

Honestly, I would probably even prefer to use dedicated code to write to the data file because in general I don't like comments in CSVs, so I should not really promote their use. However, I thought that code should ideally be aware of the UnicodeDictWriter's dialect (which my commit is barely), and the logical consequence of that was to suggest it for inclusion here.

@xrotwang
Contributor

xrotwang commented Jun 5, 2018

@Anaphory I agree that the amount of support for reading comments would motivate exactly the kind of support for writing you have in mind. So I'm somewhat sympathetic. It's just that I think it was a mistake to begin with :)

@Anaphory
Author

Good! Let's drop this.
