Support multi-record fixed width file engine #10

Closed
petersondrew opened this issue Mar 20, 2015 · 3 comments

Comments

@petersondrew
Contributor

I've been playing around with some ideas for a multi-record file engine for fixed width files, something similar to the MultiRecordEngine in FileHelpers.

There are a couple of things to consider (mostly related to reading; I haven't thought much about writing yet):

  1. The nice separation between fixed and delimited makes it hard to mix them both in a multi-engine as FileHelpers does. My thought is to initially only support fixed width files as they more commonly require multiple record types.
  2. FileHelpers returns an ArrayList with all the parsed records, putting the burden on the user to loop through the list and decide what to do with each record one by one. That makes it easy on us, but harder on the user. I think we can do better with one of the following:
    i. Require the user to supply a single T, where T is an interface that serves as a simple marker. That allows us to return List<T> rather than ArrayList. The user still needs to go through the list item by item (or use a LINQ query with OfType<>), but gets the added benefit of more LINQ functionality on the returned list.
    ii. Return List<object>. Less compile-time type safety, but doesn't require the user to create a marker interface.
    iii. Return a Dictionary<Type, List<T>> that contains a List<T> for each record type passed to the multi-record engine. The user can then access results[typeof(SomeRecord)] to get the results for that type.
    iv. Similar to iii, but abstract the dictionary away and require the user to call GetResults<T>() on the engine after reading is complete. This returns the appropriate List<T> to the user and feels more polished from their perspective.
    v. Do some dynamic magic and return an expando object containing a public List<T> for each record type, automatically named something like MyRecordList. This has a certain "cool factor" to it, but ultimately I think I favor option iv as it requires less guessing on the user's part.
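
To make the difference between options i and iii concrete, here is a rough caller-side sketch. IRecord, HeaderRecord, DetailRecord, and ReturnShapeSketch are hypothetical names made up for illustration; none of them exist in this library or in FileHelpers.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical marker interface and record types, for illustration only.
public interface IRecord { }
public class HeaderRecord : IRecord { public string BatchId { get; set; } = ""; }
public class DetailRecord : IRecord { public decimal Amount { get; set; } }

public static class ReturnShapeSketch
{
    // Option i: one list typed to the marker interface; the caller filters with OfType<>.
    public static List<DetailRecord> DetailsFromMarkerList(List<IRecord> records)
    {
        return records.OfType<DetailRecord>().ToList();
    }

    // Option iii: a dictionary keyed by record type; the caller indexes, then casts.
    // The values are List<object> here because one dictionary cannot hold a
    // differently typed List<T> per key.
    public static List<DetailRecord> DetailsFromDictionary(Dictionary<Type, List<object>> results)
    {
        return results[typeof(DetailRecord)].Cast<DetailRecord>().ToList();
    }
}
```

In both shapes the cast or filter ends up in the caller's code, which is the part option iv would hide.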

I think I'm going to take a crack at implementing a fixed-width file engine (and factory) that takes a params array of types to parse and implements the GetResults<T> concept. However, before I go too far down the rabbit hole, I wanted to see what thoughts you had.
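
A minimal sketch of the GetResults<T> idea, assuming the engine keeps one bucket per registered record type. MultiRecordResultsSketch and its Add method are hypothetical stand-ins; the actual line reading and fixed-width parsing are left out entirely.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical result store behind a multi-record engine; not the library's API.
public class MultiRecordResultsSketch
{
    private readonly Dictionary<Type, List<object>> _results =
        new Dictionary<Type, List<object>>();

    // The record types to parse are supplied up front as a params array.
    public MultiRecordResultsSketch(params Type[] recordTypes)
    {
        foreach (var type in recordTypes)
            _results[type] = new List<object>();
    }

    // Would be called by the (omitted) read loop once per parsed record.
    public void Add(object record)
    {
        if (!_results.TryGetValue(record.GetType(), out var bucket))
            throw new ArgumentException("Unregistered record type: " + record.GetType().Name);
        bucket.Add(record);
    }

    // Option iv: the dictionary stays private and the caller asks by type.
    public List<T> GetResults<T>()
    {
        if (!_results.TryGetValue(typeof(T), out var bucket))
            throw new InvalidOperationException("Unregistered record type: " + typeof(T).Name);
        return bucket.Cast<T>().ToList();
    }
}
```

The caller never touches the dictionary or a cast; asking for a type that was not registered fails loudly instead of returning an empty list.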

Thanks.

@petersondrew
Contributor Author

You can track my progress at dev...RoadRanger:add-multi-engine

I've reduced some of the complexity in the interface/class hierarchy by pushing the generic type parameters down to the method level, which allows the line builder and parser objects to be reused rather than requiring an instance per record type. All the tests pass, but it is a minor breaking change. Having gotten this far, I may be able to add multiple record support to the existing file engines without creating a new file engine type, but it's going to take some playing around to see what feels right.
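
Roughly, the change in shape looks like the following. This is only an illustrative sketch: ILineParserOf<T>, ILineParser, SharedLineParser, and the Register method are hypothetical names, and the per-type parsing delegates stand in for whatever layout information the real parser uses.

```csharp
using System;
using System.Collections.Generic;

// Before: the type parameter sits on the type, so each record type needs its own instance.
public interface ILineParserOf<T>
{
    T Parse(string line);
}

// After: the type parameter sits on the method, so one instance can serve every record type.
public interface ILineParser
{
    T Parse<T>(string line) where T : class;
}

// Hypothetical shared parser; the delegates stand in for real fixed-width layouts.
public class SharedLineParser : ILineParser
{
    private readonly Dictionary<Type, Func<string, object>> _layouts =
        new Dictionary<Type, Func<string, object>>();

    public void Register<T>(Func<string, T> parse) where T : class
    {
        _layouts[typeof(T)] = line => parse(line);
    }

    public T Parse<T>(string line) where T : class
    {
        return (T)_layouts[typeof(T)](line);
    }
}
```

Moving the type parameter is what makes this a breaking change: callers that previously held a parser typed to one record now name the type at each call.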

@forcewake
Owner

Hello Drew [@petersondrew],

> I've reduced some of the complexity in the interface/class hierarchy

Great job! Some time ago, when I started implementing this library, the curiously recurring template pattern looked like a good idea. But right now... Just great job, thank you.

> All the tests pass, but it is a minor breaking change

It's OK, as right now we are at something like version 0.1.., so after your changes we can push the first stable version (this library was created as a full replacement for FileHelpers, so multi-engine support is a very useful feature).

> implement the GetResults<T> concept.

I will spend some time on this, but right now it looks like a good idea.

Thank you for your help.

Best Regards,
Pavel Nasovich

@petersondrew
Contributor Author

I've added some tests that demonstrate the functionality of Read() and GetRecords<T>().

I've not implemented a multi-engine version of Write() because it's quite easy to call the base Write<T>(IEnumerable<T> records) version multiple times for any number of different record types.
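
As a rough illustration of that pattern, the sketch below uses a hypothetical FixedWidthWriterSketch whose Write<T> takes a formatting delegate. Only the Write<T>(IEnumerable<T> records) shape comes from the engine; the delegate parameter and the class itself are assumptions made so the example is self-contained.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical stand-in for a file engine's writing side; not the library's API.
public class FixedWidthWriterSketch
{
    private readonly TextWriter _output;

    public FixedWidthWriterSketch(TextWriter output)
    {
        _output = output;
    }

    // Mirrors the Write<T>(IEnumerable<T> records) shape; the format delegate
    // is an assumption standing in for the real fixed-width line builder.
    public void Write<T>(IEnumerable<T> records, Func<T, string> format)
    {
        foreach (var record in records)
            _output.WriteLine(format(record));
    }
}

public static class MultiWriteSketch
{
    // One Write<T> call per record type, in the order the sections should appear.
    public static void WriteBatch(TextWriter output, IEnumerable<string> headers, IEnumerable<int> amounts)
    {
        var writer = new FixedWidthWriterSketch(output);
        writer.Write(headers, h => "H" + h.PadRight(5));
        writer.Write(amounts, a => "D" + a.ToString().PadLeft(5, '0'));
    }
}
```

Note that each call emits all records of one type before the next call starts, so this approach fits files whose record types are grouped into sections rather than interleaved line by line.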

I'm going to go ahead and submit this as a PR and we can discuss it further there.
