Documentation
Rob edited this page Jan 7, 2021 · 2 revisions
- Pipeline(self, source, extract, transformations, load)
  - A data pipeline, made up of a data object (a DataFrame) and a set of Steps.
    - source: The data source for the pipeline. Either a DataFrame object or the file path of a CSV file to read.
    - extract: (Optional) The Step to run for extraction.
    - transformations: List of Steps and Transforms to run.
    - load: (Optional) The final Step in a pipeline. Should save or pass Pipeline.data somewhere.
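The two accepted forms of source can be illustrated with a small sketch. This is not the library's code: resolve_source is a hypothetical helper, and a list of row dicts stands in for the DataFrame so the example runs with the standard library only.

```python
import csv
import os
import tempfile


def resolve_source(source):
    """Hypothetical helper: turn a pipeline source into tabular data."""
    if isinstance(source, (str, os.PathLike)):
        # A path: read the CSV into a list of row dicts (values stay strings).
        with open(source, newline="") as handle:
            return list(csv.DictReader(handle))
    # Anything else is assumed to already be the data object.
    return source


# Write a small CSV to a temporary file so the example is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False, newline="") as tmp:
    tmp.write("name,score\nada,3\nbob,5\n")
    csv_path = tmp.name

from_path = resolve_source(csv_path)                          # read from disk
from_object = resolve_source([{"name": "cy", "score": "7"}])  # passed through
os.remove(csv_path)
```

Either way the pipeline ends up holding one data object, which is what the Steps described below operate on.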
- Step(self, func, *args, **kwargs)
  - A function and a set of arguments that are called during Pipeline.run().
- Transform(self, func, *args, **kwargs)
  - A subclass of Step. When run, its function is passed Pipeline.data as the first positional argument.
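The difference between a Step and a Transform can be sketched with two stand-in classes. These are assumptions for illustration, not the library's implementation: the run method name, the data parameter, and the scale function are all hypothetical, and a list of dicts stands in for the DataFrame.

```python
class Step:
    """Stand-in: a function plus the arguments to call it with."""

    def __init__(self, func, *args, **kwargs):
        self.func = func
        self.args = args
        self.kwargs = kwargs

    def run(self, data=None):
        # A plain Step ignores the pipeline data and calls func as given.
        return self.func(*self.args, **self.kwargs)


class Transform(Step):
    """Stand-in: like Step, but the data is the first positional argument."""

    def run(self, data=None):
        return self.func(data, *self.args, **self.kwargs)


def scale(rows, factor):
    # Hypothetical transformation: multiply every "x" value by factor.
    return [{**row, "x": row["x"] * factor} for row in rows]


rows = [{"x": 1}, {"x": 2}]
step_result = Step(len, "abc").run(rows)        # calls len("abc")
scaled = Transform(scale, factor=10).run(rows)  # calls scale(rows, factor=10)
```

In this sketch the Step never sees rows, while the Transform receives it ahead of its own stored arguments.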
- Load(self, func, *args, **kwargs)
  - A subclass of Transform. It requires a destination keyword argument, which indicates where the data will be saved or passed.
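Putting the four classes together, a run might look like the sketch below. Everything here is a stand-in written from the descriptions above, not the library's code. In particular, the run method, the order extract, transformations, load, the rule that each return value becomes the new data, the TypeError raised for a missing destination, and the add_one and save_rows functions are all assumptions. A list of dicts again stands in for the DataFrame.

```python
class Step:
    def __init__(self, func, *args, **kwargs):
        self.func, self.args, self.kwargs = func, args, kwargs

    def run(self, data=None):
        # Plain Step: call func with its stored arguments only.
        return self.func(*self.args, **self.kwargs)


class Transform(Step):
    def run(self, data=None):
        # Transform: the pipeline data goes in as the first argument.
        return self.func(data, *self.args, **self.kwargs)


class Load(Transform):
    def __init__(self, func, *args, **kwargs):
        # The destination keyword argument is required (error type assumed).
        if "destination" not in kwargs:
            raise TypeError("Load requires a 'destination' keyword argument")
        super().__init__(func, *args, **kwargs)


class Pipeline:
    def __init__(self, source, extract=None, transformations=(), load=None):
        self.data = source
        self.extract = extract
        self.transformations = list(transformations)
        self.load = load

    def run(self):
        # Assumed order: extract, then each transformation, then load.
        if self.extract is not None:
            self.data = self.extract.run(self.data)
        for step in self.transformations:
            self.data = step.run(self.data)
        if self.load is not None:
            self.load.run(self.data)
        return self.data


def add_one(rows):
    # Hypothetical transformation: increment every "x" value.
    return [{**row, "x": row["x"] + 1} for row in rows]


def save_rows(rows, destination):
    # Hypothetical loader: append the rows to an in-memory list.
    destination.extend(rows)


sink = []
pipeline = Pipeline(
    source=[{"x": 1}, {"x": 2}],
    transformations=[Transform(add_one)],
    load=Load(save_rows, destination=sink),
)
result = pipeline.run()
```

Because Load is a Transform, save_rows receives the current data first and destination as a keyword argument, so sink holds the transformed rows once run() returns.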