I'd really like to see closer connections between these projects, but this could take a number of forms.
1. Complete integration (csvdedupe and csvlink would be subsumed into csvkit)
Pros
Seamless experience for users
Pooling of developer time
Cons
Current core devs of csvkit would need to become somewhat familiar with csvdedupe
The complicated stuff that csvdedupe is doing may not fit within the csvkit philosophy
Neutral
A few years ago, it was pretty hard to install dedupe, but Python packaging has gotten a lot better. I don't think installation is a serious disadvantage at present.
2. Interface compatibility, plus publicizing each other's projects while keeping them independent. csvdedupe and csvlink would need to provide csvkit's common arguments.
Pros
Better discoverability for users (more benefit for csvdedupe than csvkit obviously)
No need for csvkit core devs to know anything about csvdedupe
Users need to learn less to use csvdedupe
Cons
Harder for users
3. Only publicizing each other's projects
Pros
Better discoverability for users (more benefit for csvdedupe than csvkit obviously)
No need for csvkit core devs to know anything about csvdedupe
Cons
Harder for users
4. Do nothing (status quo)
Pros
Easiest for core devs
Cons
None of the advantages of options 1, 2, or 3
We, the core devs of csvdedupe, would be interested in options 1, 2, and 3.
I think we can pursue a version of option 2 that doesn't require as many changes to csvdedupe in terms of its support for common arguments. Users can just pipe their CSV through in2csv or csvformat (using whatever csvkit options they want) and then pipe the output to csvdedupe. We can perhaps add a page to the tutorial specifically for using csvdedupe with csvkit.
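As a rough sketch of the piping approach described above, the snippet below assembles such a pipeline as a shell command string. The helper name `build_pipeline`, the file name `messy.csv`, and the field names are made up for illustration. `--skip_training` is included on the assumption that interactive labeling is not practical when the data arrives on stdin.

```python
import shlex


def build_pipeline(input_path, field_names, delimiter=None):
    """Return a shell pipeline: csvformat normalizes the CSV, csvdedupe reads it from stdin.

    Hypothetical helper for illustration only; it builds the command, it does not run it.
    """
    normalize = ["csvformat"]
    if delimiter is not None:
        # -d is csvkit's common argument for the input delimiter.
        normalize += ["-d", delimiter]
    normalize.append(input_path)

    # Assumption: skip interactive training, since stdin is occupied by the piped data.
    dedupe = ["csvdedupe", "--skip_training", "--field_names", *field_names]

    # shlex.join quotes arguments that contain spaces or shell metacharacters.
    return shlex.join(normalize) + " | " + shlex.join(dedupe)


pipeline = build_pipeline("messy.csv", ["Site name", "Address"], delimiter=";")
print(pipeline)
```

The printed command can be pasted into a shell. Any other csvkit options would go on the csvformat (or in2csv) side of the pipe, so csvdedupe only ever sees standard comma-delimited input.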
On Twitter, @jpmckinney, @hunterowens, and I discussed integrating csvdedupe and csvlink with csvkit.
Beyond @jpmckinney and @onyxfish, @mbauman and @hunterowens might also be interested in this conversation.