# Basic Usage of Flux



## Introduction

In [1]:
from flux import Flux
from flux import MemoryDataset

In [2]:
flux = Flux()

### Adding Datasets

The most fundamental dataset is the MemoryDataset. This object is just a
wrapper around Python objects: you can store any object in it, and it will be
kept in memory.

To view all datasets in the current Flux, access the catalog
property.

In [3]:
from flux import MemoryDataset

input_ds = MemoryDataset(data=1, description='input data')

flux.add_dataset(name='input', dataset=input_ds)

In [4]:
flux.catalog

{'input': MemoryDataset(description='input data', _data='1')}

### Adding Nodes

Adding a node is as simple as passing a Python function, its inputs, and its expected outputs.

The inputs must be available in the flux catalog for the node to execute successfully.

Outputs, on the other hand, are created dynamically as MemoryDatasets if they are not already in the catalog.

To view all nodes in the current Flux object, access the pipeline property.

In [5]:
def add_number(input, number=0):
    return input+number

flux.add_node(
    func=add_number,
    inputs='input',
    outputs='sum_output',
    func_kwargs={"number":10}
)

def multiply_number(input, number=1):
    return input*number

flux.add_node(
    func=multiply_number,
    inputs='sum_output',
    outputs='output',
    func_kwargs={"number":2}
)

In [6]:
flux.pipeline

Pipeline 
 - Node: add_number([input]) -> [sum_output]
- Node: multiply_number([sum_output]) -> [output]
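The catalog rule described in this section (inputs must already exist, missing outputs are created on the fly) can be illustrated with a plain-Python sketch. This is not Flux's implementation: `run_nodes`, the plain `dict` catalog, and the tuple layout of each node are hypothetical names chosen only for this example.

```python
# Plain-Python sketch of the catalog rule; NOT Flux's implementation.
# `run_nodes` and the (func, input_name, output_name, kwargs) tuple layout
# are hypothetical and exist only for this illustration.

def add_number(value, number=0):
    return value + number

def multiply_number(value, number=1):
    return value * number

def run_nodes(catalog, nodes):
    """Run nodes in order, reading inputs from and writing outputs to `catalog`."""
    for func, input_name, output_name, kwargs in nodes:
        if input_name not in catalog:
            # An input must already exist: either added by the user
            # or produced by an earlier node.
            raise KeyError(f"dataset {input_name!r} is not in the catalog")
        # The output entry is created if it does not exist yet.
        catalog[output_name] = func(catalog[input_name], **kwargs)
    return catalog

catalog = {"input": 1}
nodes = [
    (add_number, "input", "sum_output", {"number": 10}),
    (multiply_number, "sum_output", "output", {"number": 2}),
]
run_nodes(catalog, nodes)
print(catalog)
```

In this sketch a node whose input is missing fails immediately with a `KeyError`, which mirrors why the `input` dataset has to be added before the pipeline is run.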

### Running Flux

In [7]:
flux.run()

In [8]:
flux.catalog

{'input': MemoryDataset(description='input data', _data='1'),
 'output': MemoryDataset(_data='22')}

### Accessing Results

In [9]:
flux.load_dataset(name='output')

22

### Saving Flux

In [10]:
flux.save('../data/basic_flux')

### Loading Flux

In [11]:
new_flux = Flux()
new_flux.load('../data/basic_flux.pkl')

In [12]:
new_flux.catalog

{}

In [13]:
new_flux.pipeline

Pipeline 
 - Node: add_number([input]) -> [sum_output]
- Node: multiply_number([sum_output]) -> [output]

In [14]:
# The loaded catalog is empty, so add the input dataset again before running
new_flux.add_dataset(
    name='input',
    dataset=MemoryDataset(data=2)
)

In [15]:
new_flux.run()
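The cells above suggest that loading restores the node definitions but not the in-memory datasets, so the same pipeline can be rerun on new input. As a rough illustration of that separation, here is a plain-Python sketch. It does not use Flux or its storage format; `steps` and `run_steps` are hypothetical names, and the value for the new input is simply what the two functions compute, assuming the reloaded pipeline behaves like the original.

```python
# Hypothetical stand-in for a reloaded pipeline: an ordered list of
# (function, keyword arguments) steps. This is not Flux's storage format.

def add_number(value, number=0):
    return value + number

def multiply_number(value, number=1):
    return value * number

steps = [(add_number, {"number": 10}), (multiply_number, {"number": 2})]

def run_steps(value, steps):
    """Feed `value` through each step in order and return the final result."""
    for func, kwargs in steps:
        value = func(value, **kwargs)
    return value

print(run_steps(1, steps))  # 22, matching the original run with input 1
print(run_steps(2, steps))  # 24, the same steps applied to the new input 2
```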

### Overriding Datasets

You don't necessarily need to create a new dataset for each node if your data
is temporary and will be changed by further steps.

In [16]:
flux = Flux()

input_ds = MemoryDataset(data=1, description='input data')

flux.add_dataset(name='input',dataset=input_ds)

def add_number(input, number=0):
    return input+number

flux.add_node(
    name='add_number',
    func=add_number,
    inputs='input',
    outputs='interim',
    func_kwargs={"number":10}
)

def multiply_number(input, number=1):
    return input*number

flux.add_node(
    name='multiply_number',
    func=multiply_number,
    inputs='interim', 
    outputs='interim',  # overwrites the output of the add_number node
    func_kwargs={"number":2}
)

flux.add_node(
    name='add_number_2',
    func=add_number,
    inputs='interim',
    outputs='output',
    func_kwargs={"number":2}
)

In [17]:
flux()
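To see what reusing the `interim` name does to the data, the three nodes can be traced by hand in plain Python. This sketch assumes the nodes run in the order they were added and uses an ordinary `dict` as a stand-in for the catalog; it does not call Flux.

```python
# Manual trace of the three nodes, assuming they run in insertion order.
# A plain dict stands in for the Flux catalog in this illustration.

def add_number(value, number=0):
    return value + number

def multiply_number(value, number=1):
    return value * number

catalog = {"input": 1}
catalog["interim"] = add_number(catalog["input"], number=10)        # 1 + 10 = 11
catalog["interim"] = multiply_number(catalog["interim"], number=2)  # 11 * 2 = 22, replaces 11
catalog["output"] = add_number(catalog["interim"], number=2)        # 22 + 2 = 24
print(catalog)
```

Only the latest value of `interim` survives, so the intermediate result 11 is no longer available once the second node has run.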