-
Notifications
You must be signed in to change notification settings - Fork 48
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Overload column #34
Overload column #34
Conversation
The `copy_template` logic was used for both standard and overloaded mapping fields. Refactor the code into it's own method.
Thanks for this creative patch. I don't have a strong opinion either way, but I wonder what are the merits of a new kwarg over simply allowing duplication in the standard mapping? What made you want to break this out? |
Expediency mostly. I needed the functionality for another project, and On Thu, Oct 27, 2016 at 12:25 AM, Ben Welsh notifications@github.com
|
I wonder if this could be handled in a similar "it just works" way as the skip functionality from your other pull request. I am going to borrow your tests here and see if I can get that to go. |
…nd overloaded fields 'just work' and everything is more legible and straight forward. For #34.
@ckirby, moments ago I pushed a sweeping refactor of the library that I believe will have excluded headers and overloaded fields "just work" without any additional configuration. I've also rearranged a lot of the mechanics of the system to make it more legible and straightforward. Your tests are passing and we're at 100% coverage but but before I ship the code I'd love to get your feedback on the master branch. I'm also aiming to draft a blog post promoting this new release when it's ready. As part of that I'd like to share your contributions. If you have a second, let me know a little more about your work and why this library is useful to achieve your goals. |
@palewire - The refactor looks nice, really cleaned up the code. I took it for a test drive and it works a charm also. My usage - my company works in the healthcare space. One of our products does automated disease and condition detection based on patient data - lots of it. Due to the sensitive nature of the data, it also is run exclusively at client sites. We use django-postgres-copy on an data aggregating/reporting/analysis tool. The client sites export line-level anonymized data to this tool -20 to 50 million rows in a shot, per month. dpc give us a nice clean and quick way to load the data into our data store while respecting the django-ness of everything. dpc also sped up the processing from a naive csvreader implementation that took hours to running in about 10 minutes. |
Great. Glad to hear it's working. I'll close this ticket and you can expect version 0.1.0 to carry the new features. I'll have a post up on www.californiacivicdata.org before too long and will share it back here with you. |
Awesome - thanks for the kind words! |
Cleaned up pull.
I wanted the ability to have two fields in my model to map to the same incoming data column. The second of the two fields is a transformation of the data. COPY with copy_templates runs quickly as compared to updating all the rows after a copy, and denormalizing is preferred for my use case I added this feature.
I thought it was better to be explicit with a new kwarg for overloaded_mapping as opposed to allowing multiple instances of a column in the mapping parameter.