#76: The trait `serde::ser::Serialize` is not implemented for `csv::ByteRecord`
We also upgraded to the latest beta release of `csv` to get support for `position`, which we may want to use in error messages in the future. But that's blocked on BurntSushi/rust-csv#76
@emk Could you maybe say why you are using […]? Also note that I can't test this with the compiler, but I'd think this would be the best way to write your loop:

```rust
let mut row = csv::ByteRecord::new();
while rdr.read_byte_record(&mut row)? {
    let zip = from_utf8(&row[zip_col_idx])
        .chain_err(|| -> Error { "Could not parse zip code as UTF-8".into() })?;
    row.push_field(self.chunk_for(zip)?.as_bytes());
    wtr.write_byte_record(&row)?;
}
```

Note that I also tried to get rid of the […]
I did indeed consider that. I decided not to do it for now because (as I think is the case here) it results in folks using sub-optimal APIs. e.g., you already have a […]
I think it's me who missed the relevant sections of the new documentation while porting code forward. :-/ Sorry about that. I'll go re-read the documentation more carefully this time. Thank you for the pointer in the right direction!
@emk No worries! Note that […]
I really learned a lot from the old tutorial section that showed the absolute fastest ways to do various operations; that was a super-helpful bit of documentation. Basically, if we touch CSV data, there's automatically going to be at least 150 GB of it. So I try to know all the tricks for these inner loops. Thank you once again for […] (Now I need to go rebuild my Pachyderm containers and get ready for another couple thousand CPU hours' worth of data processing next week. ;-) And no, […]
@emk Thanks for your kind words. :-) Note that […]
@emk Coming back around to this, I wanted to say a little bit more: I would very much appreciate any feedback you have on csv 1.0. I released a beta first specifically so I could still make breaking changes if I have to. :-)
I really ought to update […] We like CSV for "extract, transform, load" (ETL) tasks, not because it's the most performant data format, but because it allows us to build pipelines between heterogeneous tools and still get mostly acceptable performance.
Oh, also, I personally released cli_test_dir, which is a standalone version of your […] I noticed this pattern while working on […]
Hello! I'm trying out the new `csv` beta release on some of our in-house data-munging tools, and it's great. But I did discover one tiny way to improve the ergonomics:

The above code has no idea what the columns are in any given record, except for `zip_col_idx`, which it uses to compute a new column and add it to the output row. But this code fails with the error in the issue title: `Serialize` is not implemented for `csv::ByteRecord`. This turns up in a lot of places in the API when working with generic CSV tools. I suspect one way to improve this might be to implement `Serialize` and `Deserialize` for both `ByteRecord` and `Record`. I might be able to find time sometime soon to put together a PR. But is there a better way to fix this?