Join GitHub today
GitHub is home to over 31 million developers working together to host and review code, manage projects, and build software together.
Sign upHandle invalid CSV data with Serde #237
Conversation
alisha17
force-pushed the
alisha17:invalid_data
branch
from
8af57ab
to
ca6f3bd
Jul 11, 2017
budziq
requested changes
Jul 11, 2017
| id: u64, | ||
| } | ||
| fn run() -> Result<(), Box<Error>> { |
This comment has been minimized.
This comment has been minimized.
| @@ -8,6 +8,7 @@ | |||
| | [Encode a string as application/x-www-form-urlencoded][ex-urlencoded] | [![url-badge]][url] | [![cat-encoding-badge]][cat-encoding] | | |||
| | [Encode and decode hex][ex-hex-encode-decode] | [![data-encoding-badge]][data-encoding] | [![cat-encoding-badge]][cat-encoding] | | |||
| | [Encode and decode base64][ex-base64] | [![base64-badge]][base64] | [![cat-encoding-badge]][cat-encoding] | | |||
| | [Handle invalid CSV data with Serde][ex-invalid-csv] | [![csv-badge]][csv] | [![cat-encoding-badge]][cat-encoding] | | |||
This comment has been minimized.
This comment has been minimized.
| <a name="ex-invalid-csv"></a> | ||
| ## Handle invalid CSV data with Serde | ||
|
|
||
| [![csv-badge]][csv] [![cat-encoding-badge]][cat-encoding] |
This comment has been minimized.
This comment has been minimized.
| #[serde(rename_all = "PascalCase")] | ||
| struct Record { | ||
| Name: String, | ||
| #[serde(deserialize_with = "csv::invalid_option")] |
This comment has been minimized.
This comment has been minimized.
budziq
Jul 11, 2017
Collaborator
this serde macro is important in avoiding deserialization errors. pleas mention it in description and add links to docs.
| use std::process; | ||
| #[derive(Debug, Deserialize)] | ||
| #[serde(rename_all = "PascalCase")] |
This comment has been minimized.
This comment has been minimized.
budziq
Jul 11, 2017
Collaborator
If this serde macro is important then please mention it in description and add link to docs
| [![csv-badge]][csv] [![cat-encoding-badge]][cat-encoding] | ||
|
|
||
| We get deserialization errors if invalid CSV data like empty field | ||
| is encountered. Serde converts any such errors to a None value, when applied |
This comment has been minimized.
This comment has been minimized.
| let data = "\ | ||
| Name,Place,id | ||
| Mark,,46 | ||
| Ashley,Zurich,92"; |
This comment has been minimized.
This comment has been minimized.
| } | ||
| fn run() -> Result<(), Box<Error>> { | ||
| let data = "\ |
This comment has been minimized.
This comment has been minimized.
| fn run() -> Result<(), Box<Error>> { | ||
| let data = "\ | ||
| Name,Place,id | ||
| Mark,,46 |
This comment has been minimized.
This comment has been minimized.
budziq
Jul 11, 2017
Collaborator
this field is actually valid csv the place will be an empty string
please see the https://docs.rs/csv/1.0.0-beta.3/csv/fn.invalid_option.html for suggestion on invalid data from serde point of view.
alisha17
force-pushed the
alisha17:invalid_data
branch
from
24136f3
to
c1f5086
Jul 15, 2017
This comment has been minimized.
This comment has been minimized.
|
Hi @alisha17 thanks for all the submissions! Unfortunately I'll not be able to follow up with anything resembling review for the next two weeks due to holidays |
brson
reviewed
Jul 19, 2017
|
|
||
| [![csv-badge]][csv] [![serde-badge]][serde] [![cat-encoding-badge]][cat-encoding] | ||
|
|
||
| We get deserialization errors if [`invalid CSV data`] is encountered. [`Serde`] is |
This comment has been minimized.
This comment has been minimized.
brson
Jul 19, 2017
Contributor
Please remove the backticks from "invalid CSV data" - backticks are for formatting text as code and this is just regular text. Same with "Serde".
brson
reviewed
Jul 19, 2017
|
|
||
| We get deserialization errors if [`invalid CSV data`] is encountered. [`Serde`] is | ||
| essential to convert any such errors to a [`None`] value, when applied to | ||
| the [`Option`] field. |
This comment has been minimized.
This comment has been minimized.
brson
Jul 19, 2017
Contributor
The second sentence here, "Serde is essential to convert any errors ..." seems off to me. This examples looks to be about using csv::invalid_option to deal with bogus CSV data.
I might instead word this as something like
CSV files often contain invalid data. For these cases the
csvcrate provides a custom deserializer,csv::invalid_option, which automatically converts invalid data toNonevalues.
Place a link on csv::invalid_option, but remove the links from Some and None - those are very fundamental types that don't need links here IMO.
This comment has been minimized.
This comment has been minimized.
|
I added some further feedback about the description of this test. This needs a rebase. |
alisha17
force-pushed the
alisha17:invalid_data
branch
2 times, most recently
from
56bf92d
to
bffa320
Jul 20, 2017
budziq
requested changes
Jul 20, 2017
|
Nice! just few more details and we are golden. |
| provides a custom deserializer, [`csv::invalid_option`], which automatically | ||
| converts invalid data to None values. | ||
|
|
||
| [`csv::invalid_option`]: https://docs.rs/csv/*/csv/fn.invalid_option.html |
This comment has been minimized.
This comment has been minimized.
budziq
Jul 20, 2017
Collaborator
please move this link to the bottom of the file under referencje section
|
|
||
| ```rust | ||
| # #[macro_use] | ||
| #extern crate error_chain; |
This comment has been minimized.
This comment has been minimized.
| fn run() -> Result<()> { | ||
| let data = "name,place,id | ||
| mark,\"\",46 |
This comment has been minimized.
This comment has been minimized.
budziq
Jul 20, 2017
Collaborator
the value here is actually valid ( string containing """").
please note that in the example from docs has a i32 type trying to be deserialized both from such string and xyz which are both invalid i32.
another approach would be to actually leave out one of the fields (remove the , too).
This comment has been minimized.
This comment has been minimized.
alisha17
Jul 23, 2017
Author
Contributor
If we leave out one field, it will actually give error of fields: "found record with 2 fields, but the previous record has 3 fields".
This comment has been minimized.
This comment has been minimized.
budziq
Jul 23, 2017
Collaborator
Hmm the docs seem of then. I'll lok into it after holidays.
ok then exchange the Optional field to non string one as almost anything you throw at it will produce a string.
| fn run() -> Result<()> { | ||
| let data = "name,place,id | ||
| mark,\"\",46 | ||
| ashley,zurich,92"; |
This comment has been minimized.
This comment has been minimized.
budziq
Jul 20, 2017
Collaborator
I would add two more records. one valid and another invalid in another way.
| name: String, | ||
| #[serde(deserialize_with = "csv::invalid_option")] | ||
| place: Option<String>, | ||
| id: u64, |
This comment has been minimized.
This comment has been minimized.
budziq
Jul 20, 2017
Collaborator
how about making all of the fields optional as in the docs for csv::invalid_option?
alisha17
force-pushed the
alisha17:invalid_data
branch
from
cd0db61
to
6d81608
Jul 23, 2017
budziq
requested changes
Jul 23, 2017
|
one final thing and we are good to go. |
| #[derive(Debug, Deserialize)] | ||
| struct Record { | ||
| name: Option<String>, |
This comment has been minimized.
This comment has been minimized.
budziq
Jul 23, 2017
Collaborator
do the Option's here make sense without csv::invalid_option? these might obfuscate the example.
This comment has been minimized.
This comment has been minimized.
alisha17
Jul 23, 2017
Author
Contributor
I think it actually doesn't really matter. But yes, in the example it's better to keep Option with invalid field only. I will fix it.
This comment has been minimized.
This comment has been minimized.
budziq
Jul 23, 2017
Collaborator
yep the code is still valid but newb reader might get confused when seeing more Options than the relevant one.
alisha17
force-pushed the
alisha17:invalid_data
branch
from
e43448d
to
4fa3a38
Jul 24, 2017
budziq
approved these changes
Jul 24, 2017
This comment has been minimized.
This comment has been minimized.
|
I rebased and merged manually. Thanks @alisha17 |
alisha17 commentedJul 11, 2017
•
edited
fixes #231