A lightweight declarative data framework that lets you run data pipelines from YAML config templates.
Note
I use this project as a real-world use case for my Workflow package, demonstrating that it can handle production data pipelines with a DataOps strategy.
Warning
This framework does not yet let you customize your pipeline. If you want to create your own workflow, you can implement a custom template using this package as a reference.
In my opinion, you should not duplicate workflow code when you can write one template workflow with dynamic input parameters and simply change those parameters per use case. This way you can handle many logical workflows in your organization with only metadata configuration. This approach is called a Metadata-Driven Data Workflow.
pip install -U deflow
Note
This project will deliver data framework Version 1 first.
After initializing your data framework project with Version 1, your data pipeline config files will be stored with this file structure:
conf/
├─ conn/
│  ├─ c_conn_01.yml
│  ╰─ c_conn_02.yml
├─ routes/
│  ╰─ routing.yml
├─ stream/
│  ╰─ s_stream_01/
│     ├─ g_group_01.tier.priority/
│     │  ├─ p_process_01.yml
│     │  ╰─ p_process_02.yml
│     ├─ g_group_02.tier.priority/
│     │  ├─ p_process_01.yml
│     │  ╰─ p_process_02.yml
│     ╰─ config.yml
╰─ .configore
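In the Version 1 layout above, a group directory name carries its tier and priority after the group name, separated by dots (e.g. `g_group_01.tier.priority`). As a minimal sketch of that naming convention, such a name could be split like this (`parse_group_dir` is a hypothetical helper for illustration, not part of deflow):

```python
def parse_group_dir(dirname: str) -> dict:
    """Split a "<group>.<tier>.<priority>" directory name into its parts.

    Hypothetical illustration of the Version 1 naming convention shown
    in the file structure above; deflow's real parsing may differ.
    """
    group, tier, priority = dirname.rsplit(".", 2)
    return {"group": group, "tier": tier, "priority": priority}

parse_group_dir("g_group_01.tier.priority")
```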
You can run the data flow with:
from deflow.flow import Flow
from ddeutil.workflow import Result
flow: Result = (
    Flow(name="s_stream_01")
    .option("conf_paths", ["./data/conf"])
    .run(mode="N")
)
Note
This version uses the same DAG and Task strategy as Airflow.
After initializing your data framework project with Version 2, your data pipeline config files will be stored with this file structure:
conf/
├─ pipeline/
│  ╰─ p_pipe_01/
│     ├─ config.yml
│     ├─ n_node_01.yml
│     ╰─ n_node_02.yml
╰─ .configore
| Name | Component | Default | Description |
|---|---|---|---|
| `DEFLOW_CORE_CONF_PATH` | CORE | `./conf` | A config path for the data framework configuration. |
| `DEFLOW_CORE_VERSION` | CORE | `v1` | A specific data framework version. |
| `DEFLOW_CORE_REGISTRY_CALLER` | CORE | `.` | A registry of caller functions. |
Supported data framework versions:

| Version | Supported | Description |
|---|---|---|
| 1 | In progress | A data framework based on `stream`, `group`, and `process`. |
| 2 | In progress | A data framework based on `pipeline` and `node`. |
I do not think this project will spread widely because it has a specific purpose, and for a long-term solution you can write your own code without depending on this project. For now, feel free to open a GitHub issue on this project 🙌 to report a bug or request a new feature.