# Exploring Metaflow
> A look at how Metaflow works under the hood, specifically its decorators

- toc: true 
- badges: true
- comments: true
- categories: [jupyter]
- image: images/chart-preview.png

## About 
[Metaflow](https://github.com/Netflix/metaflow) is a Python package open-sourced by Netflix to help data scientists scale their project workflows. Users interact with Metaflow mainly through decorators. In this post, we will go behind the scenes to see how these decorators actually work.

## The code
To start, let's take a look at the [first example](https://github.com/Netflix/metaflow/blob/master/metaflow/tutorials/00-helloworld/helloworld.py) in the documentation. This is a simple flow. 

```python
from metaflow import FlowSpec, step

class LinearFlow(FlowSpec):
    @step
    def start(self):
        self.my_var = 'hello world'
        self.next(self.a)

    @step
    def a(self):
        print('the data artifact is: %s' % self.my_var)
        self.next(self.end)

    @step
    def end(self):
        print('the data artifact is still: %s' % self.my_var)

LinearFlow()
```

We see that the `LinearFlow` Python class inherits from Metaflow's `FlowSpec` class, and each of its methods is decorated with `@step`. As described [here](https://docs.metaflow.org/metaflow/basics), this basic flow follows Metaflow's guidelines. But what is actually happening? How does Metaflow turn our functions into pipeline steps? Let's start by taking a look at the `FlowSpec` class.

Below are the [FlowSpec](https://github.com/Netflix/metaflow/blob/master/metaflow/flowspec.py) definition and constructor. The full code can be found at the link.
```python
class FlowSpec(object):
    """
    Main class from which all Flows should inherit.
    Attributes
    ----------
    script_name
    index
    input
    """

    # Attributes that are not saved in the datastore when checkpointing.
    # Name starting with '__', methods, functions and Parameters do not need
    # to be listed.
    _EPHEMERAL = {'_EPHEMERAL',
                  '_datastore',
                  '_cached_input',
                  '_graph',
                  '_flow_decorators',
                  '_steps',
                  'index',
                  'input'}

    _flow_decorators = {}

    def __init__(self, use_cli=True):
        """
        Construct a FlowSpec
        Parameters
        ----------
        use_cli : bool, optional, default: True
            Set to True if the flow is invoked from __main__ or the command line
        """

        self.name = self.__class__.__name__

        self._datastore = None
        self._transition = None
        self._cached_input = {}

        self._graph = FlowGraph(self.__class__)
        self._steps = [getattr(self, node.name) for node in self._graph]

        if use_cli:
            # we import cli here to make sure custom parameters in
            # args.py get fully evaluated before cli.py is imported.
            from . import cli
            cli.main(self)
```
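
The constructor above builds a graph from the class and then collects the step methods into `self._steps`. To get a feel for how a base class can discover decorated methods at all, here is a minimal sketch of the general pattern. It is not Metaflow's implementation: `mark_step`, `MiniFlow`, and `Demo` are hypothetical names made up for illustration, and tagging functions with an `is_step` attribute is an assumption of this sketch rather than something shown in the code above.

```python
def mark_step(func):
    """Hypothetical stand-in for a step decorator: tag the function, return it unchanged."""
    func.is_step = True  # assumed marker attribute, for illustration only
    return func


class MiniFlow:
    """Hypothetical base class that discovers tagged methods on its subclass."""

    def __init__(self):
        self.name = self.__class__.__name__
        # Scan the subclass body (in definition order) for tagged functions
        # and bind each one to this instance via getattr.
        # Note: vars() only sees methods defined directly on the subclass.
        self._steps = [
            getattr(self, attr_name)
            for attr_name, attr in vars(self.__class__).items()
            if getattr(attr, "is_step", False)
        ]


class Demo(MiniFlow):
    @mark_step
    def start(self):
        return "start ran"

    def helper(self):
        # Not decorated, so the base class ignores it.
        return "not a step"

    @mark_step
    def end(self):
        return "end ran"


flow = Demo()
step_names = [s.__name__ for s in flow._steps]
results = [s() for s in flow._steps]
print(flow.name, step_names, results)
```

The point of the sketch is that the decorator itself does almost nothing: it leaves the function callable as before and only attaches metadata, while the base-class constructor does the work of finding and organizing the tagged methods. Metaflow's real constructor delegates that discovery to `FlowGraph`, which this sketch does not attempt to reproduce.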