mediacloud/sous-chef

Sous-Chef

UNDER CONSTRUCTION!

A package that wraps Prefect up in a small, easily configurable bow, for building self-validating, freely configurable data pipelines.

We call a pipeline configuration a "recipe". This is a YAML file which specifies a set of atoms and connections between them.
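To make "atoms and connections" concrete, here is a rough sketch that treats a recipe as a directed graph and works out an order in which the atoms could run. It is not the actual Sous-Chef schema: the `atoms` and `inputs` keys, the atom names, and the `execution_order` helper are all hypothetical, and the recipe is written as the kind of dict a YAML loader might return.

```python
from graphlib import TopologicalSorter

# Hypothetical recipe, shown as the dict a YAML loader might produce.
# The keys ("atoms", "inputs") and the atom names are illustrative only;
# they are not taken from the real Sous-Chef recipe format.
recipe = {
    "atoms": {
        "fetch_stories": {"inputs": []},
        "count_words": {"inputs": ["fetch_stories"]},
        "write_csv": {"inputs": ["count_words"]},
    }
}


def execution_order(recipe):
    """Return atom names ordered so that each atom comes after its inputs."""
    graph = {name: set(spec["inputs"]) for name, spec in recipe["atoms"].items()}

    # A basic self-validation step: every connection must point at a defined atom.
    unknown = {dep for deps in graph.values() for dep in deps} - graph.keys()
    if unknown:
        raise ValueError(f"connections reference undefined atoms: {sorted(unknown)}")

    # TopologicalSorter takes a mapping of node -> predecessors.
    return list(TopologicalSorter(graph).static_order())


print(execution_order(recipe))
```

A recipe whose connections form a cycle would raise `graphlib.CycleError` here, which is one simple way a pipeline definition can be rejected before anything runs.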

Two Entrypoints:

  1. By Directory: point to a directory with a recipe.yaml and, optionally, a mixins.yaml file
  • python run_recipe.py -d ../path/to/recipe/directory/
  2. By Directory with Date Iteration: modify start_date and end_date to
  • python run_recipe.py -d ../path/to/recipe/directory/ -s start_date(%Y-%m-%d)
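If you want to launch either entrypoint from another script, the command line can be assembled like this. This is a minimal sketch that uses only the `-d` and `-s` flags shown above; the `build_command` helper and the example date are assumptions for illustration, not part of the package.

```python
import sys
from datetime import date


def build_command(recipe_dir, start_date=None):
    """Build the argument list for run_recipe.py (hypothetical helper).

    Only the -d and -s flags documented above are used. start_date is a
    datetime.date, formatted as %Y-%m-%d.
    """
    cmd = [sys.executable, "run_recipe.py", "-d", recipe_dir]
    if start_date is not None:
        cmd += ["-s", start_date.strftime("%Y-%m-%d")]
    return cmd


# Illustrative values only: the date below is made up for the example.
cmd = build_command("../path/to/recipe/directory/", date(2024, 1, 15))
print(" ".join(cmd[1:]))
```

The resulting list can be handed to `subprocess.run(cmd, check=True)` from the directory that contains run_recipe.py.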

Recipes

All of the recipes I've been writing for this tool live in a separate, private repository.

The 'tests' folder there contains recipes that demonstrate the basic shape and functionality of the tool.

The Atom Wishlist is where I keep the list of new components I plan to add over time.

Version History

v0.1 - First beta tag for versioning

About

Configurable Data Analytics Pipeline
