Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Combining multiple columns into one property - Missing .ConvertUsing option like in CsvHelper #54

Closed
ThaDaVos opened this issue Sep 16, 2019 · 14 comments

Comments

@ThaDaVos
Copy link

ThaDaVos commented Sep 16, 2019

Hello,

I have to deal with really big CSV files (100m+) and being able to use this library (because of the Parallel processing) feels really like a blessing, I was using the CsvHelper library but it doesn't support using Parallel processing now I've got a problem, I'm not in control of the CSV files and one of the CSV files contains 19 columns which function as flags (0 or 1) in my Class Object I'm storing this as an Enum with the Flag attribute -> Now with CsvHelper library I'm able to use a function called .ConvertUsing() to convert from the Csv file multiple columns into one property but I'm missing something like this in the TinyCsvParser -> is there any way to do this? Or do I have to implement a custom parser which uses CsvHelper's mapper?

PS: Maybe you two can join forces to create the BEST CsvHelper/Parser :P

@bytefish
Copy link
Collaborator

I will take a look at it, but I cannot promise when I will have the time. It's currently busy at work, so there is not much time left.

From what I see: The problem is, that the TypeConverter I designed only takes a string value as input. To implement the ConvertUsing functionality you would need access to the current tokenized row.

This isn't a big deal: I already extended the CsvMapping to allow Range Mappings (for mapping Arrays) some time ago. So there is already an implementation that takes a string Array.

Maybe this can serve as a starting point if you want to try implementing it yourself.

I am grateful for Pull Requests.

I will look at it, but I cannot promise when.

@bytefish
Copy link
Collaborator

You say you convert multiple rows into one? Did you mean columns or is it really rows?

@ThaDaVos
Copy link
Author

My mistake, I meant Columns
So like you said, access the tokenized row

@bytefish
Copy link
Collaborator

Let me write a small Prototype this evening. 🤜🤛

@ThaDaVos
Copy link
Author

Thanks!
I'm currently using both Libraries and have specialized extension methods per library on the DbContext so I can switch between them when needed... sadly different Generic constraints don't count as different overloads

@bytefish
Copy link
Collaborator

bytefish commented Sep 16, 2019

Well this is a quick prototype of what an API could look like:

One thing: It currently only supports reading the TokenizedRow values in the MapUsing function you pass into. What I don't like is, that this probably makes it neccessary to instantiate all TypeConverters by yourself. What I could think of instead: Extension Methods for the TokinzedRow like: row.Get<int>(column: 1)

It's a first try. I am open for feedback. 👍

@ThaDaVos
Copy link
Author

It looks really good - when I've got time, how will I be able to test it? Also, why not use Fluent design pattern? So

MapProperty(x => x.Id).Index(0).Using((values) => values.Tokens[0]); // return the value instead of assigning it for example

@bytefish
Copy link
Collaborator

bytefish commented Sep 20, 2019

Sorry for my late reply, these are busy times. 😑

I could release a Preview Version to NuGet, which you could include.

As for the Fluent API: Yes it looks better for single properties. You are totally right. 😎 But maybe people want to assign multiple properties of the entity at the same time and then a single property gets in the way. This generic approach doesn't go for high readability, but it's rather flexible.

(And it was easier to implement. 🤫)

I will try to release a NuGet Preview this evening.

@stevencovhlth
Copy link

Hello @bytefish I see that the code for this is now in the master branch, but it does not appear to be available in the nuget package. Do you still have plans to include? or maybe I'm looking in the wrong place?

I have kind of the opposite problem here, I'm populating multiple properties on my class with 1 column from the file (need to do some splitting). The MapUsing seems perfect, but open to alternatives.

@bytefish
Copy link
Collaborator

bytefish commented May 6, 2020

@stevencovhlth Sorry all, I must have forgotten about this ticket. Next time please bump it a little. There was something here I wasn't happy with back then and I wanted to improve... I think I wasn't happy with using string values in the MapUsing function. This somehow defeats the use of the TypeConverters for parsing Column Values. A user should at least have a chance to say: This column was an Integer, instead of operating on string values and leaving all Type conversion problems on the user. This shouldn't be, I want at least to give the User the opportunity to decide either using Raw Values or Converted Values. Let me improve the API here a little and then I'll do a Release to NuGet.

@stevencovhlth
Copy link

@bytefish thank you. I'll bump you again a week or so if that's ok. Thanks for your efforts on this great project.

@stevencovhlth
Copy link

@bytefish oops time got away from me :) I'm bumping as promised.

dustykline added a commit to dustykline/TinyCsvParser that referenced this issue Oct 30, 2020
…xceptions as flow control), add examples.
dustykline added a commit to dustykline/TinyCsvParser that referenced this issue Oct 30, 2020
bytefish added a commit to dustykline/TinyCsvParser that referenced this issue Oct 31, 2020
bytefish added a commit to dustykline/TinyCsvParser that referenced this issue Oct 31, 2020
bytefish added a commit to dustykline/TinyCsvParser that referenced this issue Oct 31, 2020
bytefish added a commit to dustykline/TinyCsvParser that referenced this issue Oct 31, 2020
bytefish added a commit that referenced this issue Oct 31, 2020
Issue #54 Makes pass/fail of MapUsing explicit
bytefish added a commit that referenced this issue Oct 31, 2020
@laicasaane
Copy link

laicasaane commented Dec 13, 2020

Recently I've stumbled on this issue too. I've been prototyping something called "UserTypeConverter" which implements IArrayTypeConverter, but the currently form is quite inconvienent. I use readonly struct a lot, but you can only construct them by calling constructors, so the user's converter must inherit any of the predefined UserTypeConverter<T1, ...T10, TTargetType> classes, then implement the abstract method TTargetType New(T1 p1, ...T10 p10) to create an instance of that user type.

Currently I have only the codebase, no test case yet, but I envisioned the actual usage to be cumbersome, because the user might want to do user type conversions inside user type conversions, (i.e recursively).

@bytefish
Copy link
Collaborator

A MapUsing has been added. If there are additional ideas, please reopen and bump the ticket.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants