Decodable sequence #15
Comments
Hi @josh, Yeah, last week was pretty productive 😄 I didn't manage to do everything I wanted (the Referring to your idea:
|
I haven't gotten a chance to look at any of this decoder work at all, so take my comments with a grain of salt, but two small notes:
In general, I agree with @dehesa that when an object is decoded, it is expected to be fully initialized. If lazily decoding is entirely transparent (e.g. a |
Hi @itaiferber, thank you for pitching in here. I completely agree with all the points you've raised. My doubt hangs more over my point 4. Let me rephrase it. For a lazy sequence decoder like the one you have described:
let sequence = CSVDecoder().decodeSequence(of: CustomRowType.self, from: url)
for row in sequence { ... }
How could we iterate through its elements when the operation may throw at any point? We would like to keep the
Do you see any other option for lazy sequence decoding? |
Exposing something like |
@dehesa Good point, I wasn't thinking about that. I was going to write the following:
But the more I think about it, the more
for row in sequence {
    guard case let .success(row) = row else {
        // Skip invalid rows.
        // <oops, this doesn't actually advance the sequence>
        continue
    }
    // Do something with `row`.
}
Alternatively, if you do want to So, it sounds like your original suggestion of
let decoder = CSVDecoder()
let sequence = decoder.__nameForOnDemandRowDecoding__(CustomRowType.self, from: url)
while let row = try sequence.next() { /* Do what you want with it */ }
is more apt. Getting rid of Two alternative ideas that might be interesting:
This last idea, to me, seems the more interesting one to investigate. |
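To make the `for ... in` option discussed above concrete, here is a minimal sketch of a non-throwing lazy sequence whose elements are `Result` values. It works on in-memory strings rather than on `CSVDecoder`; `Point`, `RowError`, and `parsePoint` are hypothetical names invented for the illustration. Note the assumption that the underlying source moves to the next line whether or not parsing succeeds, which is exactly what the "oops" comment above says a real decoder may not do.

```swift
// Hypothetical row type and error; not part of the library discussed here.
struct Point: Equatable {
    let x: Int
    let y: Int
}

struct RowError: Error {
    let line: String
}

// Hypothetical parser: turns one "x,y" line into a Point or a failure.
func parsePoint(_ line: String) -> Result<Point, RowError> {
    let fields = line.split(separator: ",")
    guard fields.count == 2, let x = Int(fields[0]), let y = Int(fields[1]) else {
        return .failure(RowError(line: line))
    }
    return .success(Point(x: x, y: y))
}

// A lazy, non-throwing sequence: each element carries its own outcome,
// and a line is only parsed when the loop reaches it.
let lines = ["1,2", "oops", "3,4"]
let results = lines.lazy.map(parsePoint)

var points: [Point] = []
var failures = 0
for result in results {
    switch result {
    case .success(let point):
        points.append(point)
    case .failure:
        // Skipping is safe here only because the array iterator
        // has already moved past the invalid line.
        failures += 1
    }
}
```

The trade-off is that every loop body has to unwrap a `Result`, which is the ergonomic cost weighed in the comment above.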
Ohh, I like the last idea you mentioned. Thank you @itaiferber (and congratulations for coming up with the amazing Codable interface) 😉 What do you say, @josh? Are you still up for giving it a go with all the information gathered? By the way @itaiferber, if you find yourself with a bit of time and end up checking the library, I will be very happy to receive any comment or criticism that you may have 😄 |
@dehesa yeah, I'd like to give it a try. How does this interface look?
extension CSVDecoder {
    /// Decode CSV data row by row.
    ///
    /// Does not throw since parsing does not start until requested by iterator.
    ///
    /// - Returns: Iterator for decoding each row.
    func decodeSequence<T: Decodable>(of type: T.Type, from url: URL) -> CSVDecodableIterator<T>
}
struct CSVDecodableIterator<Element: Decodable> {
    /// Decode next row.
    /// - Returns: Decoded row as `Element` or `nil` if no next row exists.
    /// - Throws: `DecodingError` if row can not be decoded.
    func next() throws -> Element?
}
|
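As a rough illustration of the shape proposed above, the following sketch implements a throwing `next()` over in-memory lines. It is not the library's implementation: `ThrowingRowIterator`, `InvalidRow`, and the closure-based parser are hypothetical stand-ins for the real row decoding.

```swift
/// Hypothetical iterator with a throwing `next()`, mirroring the proposed interface.
struct ThrowingRowIterator<Element> {
    private var lines: IndexingIterator<[String]>
    private let parse: (String) throws -> Element

    init(lines: [String], parse: @escaping (String) throws -> Element) {
        self.lines = lines.makeIterator()
        self.parse = parse
    }

    /// Returns the next parsed row, `nil` when no rows remain,
    /// or throws if the current row cannot be parsed.
    mutating func next() throws -> Element? {
        guard let line = lines.next() else { return nil }
        return try parse(line)
    }
}

struct InvalidRow: Error {}

var iterator = ThrowingRowIterator(lines: ["1", "2", "x"]) { line -> Int in
    guard let value = Int(line) else { throw InvalidRow() }
    return value
}

var values: [Int] = []
var didThrow = false
do {
    // The loop stops at the first row that fails to parse.
    while let value = try iterator.next() {
        values.append(value)
    }
} catch {
    didThrow = true
}
```

Because `next()` throws, this type cannot conform to `IteratorProtocol`, so it cannot be used in a `for ... in` loop. In this sketch `next()` is `mutating`, so the iterator must be declared with `var`; a class-based iterator would allow the `let` binding used in the snippets in this thread.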
If I understand correctly, @josh's proposed API will look as follows in regular use:
let decoder = CSVDecoder(configuration: ...)
for row in decoder.decodeSequence(of: Student.self, from: url) {
    let student = try row.decode()
    // Do something
}
I would suggest the following changes:
With those changes in mind, what do you think about this?
let decoder = CSVDecoder(configuration: ...)
for row in decoder.lazy(from: url) {
    let student = try row.decode(Student.self)
    // Do something
}
The benefit of point 2 is that the user can choose a different representation for each row. For example, you may have a students CSV where the students from 0 to 99 are represented as a different type than the students from 100 to 199, although the underlying CSV representation is the same. |
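The per-row `decode(_:)` idea above can be sketched without the library. The example below is only an illustration under stated assumptions: `RowDecodable` is a hypothetical stand-in for `Decodable` (to avoid writing a full `Decoder`), and `LazyRow`, `Student`, and `Auditor` are invented names. It shows the same sequence of undecoded rows being decoded as different types depending on the row index.

```swift
// Hypothetical stand-in for `Decodable`, to keep the sketch short.
protocol RowDecodable {
    init(fields: [String]) throws
}

struct RowDecodingError: Error {}

/// An undecoded row: it holds raw fields until the caller picks a type.
struct LazyRow {
    let fields: [String]

    func decode<T: RowDecodable>(_ type: T.Type) throws -> T {
        return try T(fields: fields)
    }
}

struct Student: RowDecodable {
    let name: String
    let age: Int

    init(fields: [String]) throws {
        guard fields.count == 2, let age = Int(fields[1]) else {
            throw RowDecodingError()
        }
        self.name = fields[0]
        self.age = age
    }
}

struct Auditor: RowDecodable {
    let name: String

    init(fields: [String]) throws {
        guard let name = fields.first else { throw RowDecodingError() }
        self.name = name
    }
}

// Splitting is lazy: a row is only built when the loop reaches it.
let csvLines = ["Ada,36", "Grace,45", "Linus,x"]
let rows = csvLines.lazy.map { line in
    LazyRow(fields: line.split(separator: ",").map(String.init))
}

var names: [String] = []
var failures = 0
for (index, row) in rows.enumerated() {
    do {
        // The first two rows are decoded as one type, the rest as another.
        if index < 2 {
            names.append(try row.decode(Student.self).name)
        } else {
            names.append(try row.decode(Auditor.self).name)
        }
    } catch {
        failures += 1
    }
}
```

Since the sequence itself never throws, it can conform to `Sequence`; the error surfaces only at the `decode` call, where the caller decides whether to skip the row or stop.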
Now, from an implementation perspective, there are some challenges that you will face:
I am happy for you to give it a go, but if it becomes too much, we can always fall back on the easier-to-implement API, which doesn't conform to the
let decoder = CSVDecoder()
let sequence = decoder.lazy(from: url)
while let row = try sequence.next() {
    // Do what you want with it
}
|
Hey @dehesa!
I think I was too slow, it looks like you already implemented the sequential buffering strategy for 0.5.2. I was taking some time to learn about the Decoder protocol internals.
What I learned is that it's possible to decode an UnkeyedDecodingContainer into any sequence without buffering. ShadowDecoder.UnkeyedContainer seems to do a good job of iteratively decoding each item.
The README demos decoding into a preallocated array.
Instead of an Array, I created a custom sequence wrapper. With the added benefit of customizing how the result is wrapped. I had my 🤞 that AnySequence was Decodable, but it's not.
Then:
Any thoughts on this technique or Alternatives? Would a sequence wrapper like this be useful to include as part of the library?
Thanks!
@josh
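The code for the wrapper described above is not preserved in this thread, so the following is only a guess at its general shape, not @josh's implementation. It is a minimal sketch of a `Decodable` wrapper that pulls elements one at a time from an `UnkeyedDecodingContainer` and wraps each outcome in `Result`. It uses Foundation's `JSONDecoder` instead of `CSVDecoder` to stay self-contained; `ResultSequence` and `Skipped` are hypothetical names, and for simplicity the results are collected in an array rather than streamed.

```swift
import Foundation

/// Hypothetical wrapper: decodes an unkeyed container element by element,
/// wrapping each outcome in `Result` instead of failing the whole decode.
struct ResultSequence<Value: Decodable>: Decodable, Sequence {
    /// Decodes from any value; used only to step past a bad element.
    private struct Skipped: Decodable {}

    let results: [Result<Value, Error>]

    init(from decoder: Decoder) throws {
        var container = try decoder.unkeyedContainer()
        var results: [Result<Value, Error>] = []
        while !container.isAtEnd {
            do {
                results.append(.success(try container.decode(Value.self)))
            } catch {
                results.append(.failure(error))
                // Assumption: a failed decode leaves the container on the same
                // element (JSONDecoder behaves this way), so advance explicitly.
                _ = try container.decode(Skipped.self)
            }
        }
        self.results = results
    }

    func makeIterator() -> IndexingIterator<[Result<Value, Error>]> {
        return results.makeIterator()
    }
}

let json = Data("[1, \"two\", 3]".utf8)
let decoded = try JSONDecoder().decode(ResultSequence<Int>.self, from: json)

// Keep only the elements that decoded successfully.
let numbers = decoded.compactMap { try? $0.get() }
```

A decoder that does advance past a failed element would lose one extra element with the explicit skip, so that step should be checked against the decoder actually in use.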