Replace DataPusher #35

Open
BWibo opened this issue Mar 12, 2024 · 0 comments · May be fixed by #9
Labels
effort: 5 priority: soon type: feature Brand new functionality, features, pages, workflows, endpoints, etc.
BWibo commented Mar 12, 2024

DataPusher should be replaced

CKAN DataPusher is not a good choice for pushing data into the CKAN DataStore.
The core reasons to replace DataPusher are that it is complicated to set up and extremely slow. Some more arguments are listed here.
I identified two candidates to replace DataPusher.

ckanext-xloader

Pros

  • Comes as a CKAN extension and is easy to set up
  • Up to 10x faster than DataPusher

Cons

  • Needs to be included in the CKAN-SDDI image
  • Can only be autoscaled by scaling CKAN instances
  • All columns are defined as text, so the Data Publisher has to change the data types manually in the Data Dictionary and then reload the data.
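
To make the last con more concrete, here is a minimal sketch of the difference between loading everything as text and guessing column types up front. It is not how ckanext-xloader or DataPusher+ work internally; the sample data and the helper names (`load_as_text`, `guess_type`, `guess_schema`) are made up for illustration, and only the Python standard library is used.

```python
import csv
import io

# Hypothetical sample data, not taken from any real dataset.
SAMPLE_CSV = (
    "station,temperature,measured_on\n"
    "A1,21.5,2024-03-12\n"
    "B2,19,2024-03-13\n"
)


def load_as_text(raw):
    """Read every value as a string, mirroring an 'all columns are text' load."""
    return list(csv.DictReader(io.StringIO(raw)))


def guess_type(values):
    """Guess a column type from its values: 'int', 'float', or 'text'."""
    for name, cast in (("int", int), ("float", float)):
        try:
            for value in values:
                cast(value)
            return name
        except ValueError:
            # At least one value does not fit this type; try the next one.
            continue
    return "text"


def guess_schema(rows):
    """Map each column name to a guessed type."""
    return {col: guess_type([row[col] for row in rows]) for col in rows[0]}


rows = load_as_text(SAMPLE_CSV)   # every value is still a str here
schema = guess_schema(rows)       # what a type-guessing loader could propose
```

With a text-only load, the numeric `temperature` column stays a string until someone fixes the type by hand; a type-guessing step would at least propose a numeric type for it before the data lands in the DataStore.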

DataPusher+

Pros

  • Built on qsv, an ultra-fast data processing tool written in Rust.
  • Lives in a separate container and can be scaled individually

Cons

  • Complicated setup
BWibo added the effort: 5, priority: soon, and type: feature labels Mar 12, 2024
BWibo added this to the v4.0.0 milestone Mar 22, 2024
BWibo linked a pull request Mar 22, 2024 that will close this issue
BWibo modified the milestones: v4.0.0, v5.0.0 Apr 21, 2024