[DSIP-][Task] Add Datavines task to better support data quality #16113

Open
2 tasks done
xxzuo opened this issue Jun 3, 2024 · 5 comments
Labels
DSIP, help wanted

Comments

@xxzuo
Contributor

xxzuo commented Jun 3, 2024

Search before asking

  • I have searched the existing DSIPs and found no similar DSIP.

Motivation

Datavines is an easy-to-use data quality service platform that supports multiple metrics.
https://github.com/datavane/datavines

  • Datavines supports executing multiple metrics in one job.
  • Datavines provides an execution status dashboard and data quality reports.
  • Datavines supports plug-in extensions for components such as metrics, data sources, error data storage, and execution engines.
  • JDBC engines can be used to execute data quality tasks instead of relying solely on Spark engines.

Design Detail

Script Mode

  1. Configure the data quality job in Datavines
    image

  2. Get the job config script file

  3. Add a Datavines job node to the workflow and configure the script
    image
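
As a rough illustration of what the script mode steps could mean on the task side, the sketch below models the node parameters and stages the script in the task's working directory. It is not the actual plugin design: `ScriptModeParams`, `stage_script`, and the file name `datavines_job.conf` are all hypothetical names chosen for this example, and how the staged file would then be handed to Datavines is left open.

```python
import tempfile
from dataclasses import dataclass
from pathlib import Path


@dataclass
class ScriptModeParams:
    """Hypothetical node parameters for script mode (names are assumptions)."""

    script: str  # job config script obtained from Datavines in step 2

    def check(self) -> bool:
        # A node with an empty or whitespace-only script has nothing to run.
        return bool(self.script and self.script.strip())


def stage_script(params: ScriptModeParams, work_dir: Path) -> Path:
    """Write the script into the task's working directory and return its path."""
    if not params.check():
        raise ValueError("script must not be empty")
    work_dir.mkdir(parents=True, exist_ok=True)
    path = work_dir / "datavines_job.conf"  # file name is an assumption
    path.write_text(params.script, encoding="utf-8")
    return path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        staged = stage_script(ScriptModeParams(script="placeholder script"), Path(tmp))
        print(staged.name)
```

Validating the parameters before writing anything keeps a misconfigured node from failing later, at execution time.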

API Mode

  1. Configure the data quality job in Datavines
    image

  2. Get the jobId

  3. Add a Datavines job node to the workflow and configure the Datavines API address and jobId
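
To make the API mode steps more concrete, here is a minimal sketch of how a node configured with an API address and jobId might validate its parameters and build the request that triggers the job. The issue does not specify the Datavines endpoint, so the URL path used here is a placeholder, and `ApiModeParams` and `build_execute_request` are hypothetical names. The request is only constructed, not sent.

```python
from dataclasses import dataclass
from urllib.parse import urlparse
from urllib.request import Request


@dataclass
class ApiModeParams:
    """Hypothetical node parameters for API mode (names are assumptions)."""

    api_address: str  # base URL of the Datavines server
    job_id: int       # jobId obtained from Datavines in step 2

    def check(self) -> bool:
        # Require an http(s) URL with a host, and a positive job id.
        parsed = urlparse(self.api_address)
        return (
            parsed.scheme in ("http", "https")
            and bool(parsed.netloc)
            and self.job_id > 0
        )


def build_execute_request(params: ApiModeParams) -> Request:
    """Build (but do not send) a POST request that would trigger the job."""
    if not params.check():
        raise ValueError("invalid API address or jobId")
    base = params.api_address.rstrip("/")
    # Placeholder path: the real Datavines endpoint is not given in the issue.
    url = f"{base}/hypothetical/job/execute/{params.job_id}"
    return Request(url, data=b"", method="POST")


if __name__ == "__main__":
    req = build_execute_request(ApiModeParams("http://datavines.example:5600", 42))
    print(req.get_method(), req.full_url)
```

Keeping request construction separate from sending makes the node easy to unit test without a running Datavines server.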

Compatibility, Deprecation, and Migration Plan

No response

Test Plan

No response

Code of Conduct

xxzuo added the DSIP and Waiting for reply labels on Jun 3, 2024
@MYiYang

MYiYang commented Jun 4, 2024

It would be nice if you could submit a task here, see the status of the task in DS, and stop it via Datavines.

@zhangp8721

Very useful for data pipelines.

@xiaoshiqiai

If Datavines is incorporated into DS, it will be easier to integrate project management and data inspection.

@zixi0825
Member

zixi0825 commented Jun 7, 2024

+1

SbloodyS added the help wanted label and removed the Waiting for reply label on Jun 7, 2024
@ruanwenjun
Member

You should provide a detailed design of how to use the new task and how the task works in DS, rather than just some pictures of the UI.
