# SOAM FLOW RUN QUICKSTART

This notebook is a quickstart for the SoaM flow. It shows how to connect to a database, extract the data, transform it, generate a forecast, plot the results and send a mail report, using SoaM modules and methods chained together in a single flow.

In [2]:
from soam.workflow.time_series_extractor import TimeSeriesExtractor
from muttlib.dbconn import get_client
import pandas as pd
from soam.workflow import Transformer
from sklearn.preprocessing import MinMaxScaler
import numpy as np
from soam.workflow.forecaster import Forecaster
from soam.models import SkProphet
from soam.utilities.utils import add_future_dates
import matplotlib.pyplot as plt
from soam.reporting import mail_report
import datetime
from soam.core import SoamFlow
from prefect import task

ERROR:fbprophet.plot:Importing plotly failed. Interactive plots will not work.


## Extraction
DB connection using `muttlib`. <br>
`SQL Query` parameters definition. <br>
`SOAM Extractor` object initialization.

In [3]:
pg_cfg = {
    "host": "localhost",
    "port": 5432,
    "db_type": "postgres",
    "username": "mutt",
    "password": "mutt",
    "database": "sqlalchemy"
}
pg_client = get_client(pg_cfg)[1]

In [4]:
build_query_kwargs={
    'columns': '*',
    'timestamp_col': 'date',
    'start_date': "2021-03-01",
    'end_date': "2021-03-20",
    'extra_where_conditions': ["symbol = 'AAPL'"],
    'order_by': ["date ASC"]
}
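
To make the query parameters above more concrete, here is a minimal sketch of the kind of SQL statement they describe. `build_query_sketch` is a hypothetical helper written only for this illustration. It is not the SoaM implementation, and treating `end_date` as inclusive is an assumption. The string interpolation is for readability only; real code should use parameterized queries.

```python
def build_query_sketch(table_name, columns, timestamp_col, start_date, end_date,
                       extra_where_conditions=(), order_by=()):
    """Hypothetical helper: assemble a SELECT statement from the query kwargs."""
    # Date bounds first (assumed inclusive), then any extra filters.
    conditions = [
        f"{timestamp_col} >= '{start_date}'",
        f"{timestamp_col} <= '{end_date}'",
        *extra_where_conditions,
    ]
    query = f"SELECT {columns} FROM {table_name} WHERE " + " AND ".join(conditions)
    if order_by:
        query += " ORDER BY " + ", ".join(order_by)
    return query


example_kwargs = {
    "columns": "*",
    "timestamp_col": "date",
    "start_date": "2021-03-01",
    "end_date": "2021-03-20",
    "extra_where_conditions": ["symbol = 'AAPL'"],
    "order_by": ["date ASC"],
}
sql = build_query_sketch("stocks_valuation", **example_kwargs)
print(sql)
```

Under these assumptions the parameters select every column of the AAPL rows dated within the first twenty days of March 2021, oldest first.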

In [5]:
extractor = TimeSeriesExtractor(db=pg_client, table_name='stocks_valuation')

## Preprocessing
`SOAM Transformer` object initialization. <br>
Custom transformation functions wrapped as tasks.

In [6]:
scaler = MinMaxScaler()
ts = Transformer(transformer = scaler)

@task()
def transform_df_for_scaler(df: pd.DataFrame):
    data = np.array([df.avg_price])
    data = np.swapaxes(data, 0, 1)
    return data

@task()
def transform_df_format(df: pd.DataFrame):
    # keep only the columns the model needs and rename them to ds / y
    # (rename on a new frame avoids pandas' SettingWithCopyWarning)
    df = df[['date', 'avg_price']].rename(columns={
        'date': 'ds',
        'avg_price': 'y'})
    df['ds'] = pd.to_datetime(df['ds'])
    # append 7 future daily dates to be forecasted
    df = add_future_dates(df, periods=7, frequency="d")
    return df
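
As a rough illustration of what the scaling step does to the price column, the sketch below reproduces the reshaping done in `transform_df_for_scaler` and then applies min-max scaling by hand. It uses plain NumPy instead of `MinMaxScaler`, and the prices are made-up values, so read it as an explanation of the arithmetic rather than as the code the flow runs.

```python
import numpy as np

# Made-up prices standing in for the avg_price column.
prices = [120.0, 125.0, 130.0, 122.5]

# Same reshaping as transform_df_for_scaler: shape (1, n) becomes (n, 1),
# i.e. one feature column with n samples, which is what scalers expect.
column = np.swapaxes(np.array([prices]), 0, 1)

# Min-max scaling to the [0, 1] range (the default range of MinMaxScaler):
# subtract the column minimum and divide by the column range.
col_min = column.min(axis=0)
col_max = column.max(axis=0)
scaled = (column - col_min) / (col_max - col_min)

print(column.shape)
print(scaled.ravel())
```

The lowest price maps to 0 and the highest to 1, which is why the plot later in the notebook is labelled as a normalized price.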

## Forecasting
Forecasting model selected: `FBProphet`, used through the `SkProphet` class. <br>
`SOAM Forecaster` object initialization.

In [7]:
my_model = SkProphet(weekly_seasonality=False, daily_seasonality=False)
forecaster = Forecaster(my_model, output_length=7)
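
In this notebook the forecast horizon (`output_length=7`) is set to the same value as the number of future dates appended during preprocessing (`periods=7`). The sketch below shows one way such future rows could be appended with pandas. `append_future_dates_sketch` is a hypothetical stand-in written for this example. It is an assumption about the general idea, not the actual behaviour of `add_future_dates`.

```python
import pandas as pd

# Toy history with made-up values, already in the ds / y format.
history = pd.DataFrame({
    "ds": pd.to_datetime(["2021-03-18", "2021-03-19", "2021-03-20"]),
    "y": [0.2, 0.5, 1.0],
})


def append_future_dates_sketch(df, periods, freq="D"):
    """Hypothetical helper: append `periods` future dates with no target value."""
    # Build periods + 1 dates starting at the last known date, then drop that first one.
    future_dates = pd.date_range(start=df["ds"].max(), periods=periods + 1, freq=freq)[1:]
    future = pd.DataFrame({"ds": future_dates})
    # Rows coming from `future` have no y column, so y is NaN for them.
    return pd.concat([df, future], ignore_index=True)


extended = append_future_dates_sketch(history, periods=7)
print(extended.tail(8))
```

The rows with a missing `y` are the ones a forecaster would be expected to fill in.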

## Postprocessing
Postprocessing tasks based on functions for custom transformations.

In [8]:
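@task(nout=2)
def post_processing(df: pd.DataFrame, predictions: pd.DataFrame):
    # work on copies so the upstream task results are not modified in place
    dfp = df.copy()
    # use the day of the month as index for plotting
    dfp["ds"] = dfp["ds"].dt.strftime("%d")
    dfp = dfp.set_index("ds")

    predp = predictions.copy()
    predp["ds"] = predp["ds"].dt.strftime("%d")
    predp = predp.set_index("ds")
    return dfp, predp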
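
The post-processing step can be tried on a small frame without running the flow. The sketch below uses made-up values and shows the effect of formatting the `ds` column as a day of the month and using it as the index.

```python
import pandas as pd

# Toy frame with made-up values in the ds / y format.
frame = pd.DataFrame({
    "ds": pd.to_datetime(["2021-03-01", "2021-03-02", "2021-03-15"]),
    "y": [0.1, 0.4, 0.9],
})

# Format each timestamp as a zero-padded day of the month and index by it.
indexed = frame.assign(ds=frame["ds"].dt.strftime("%d")).set_index("ds")

print(indexed)
```

The resulting index holds strings such as `"01"` and `"15"`. Because these labels are only unique within one month, this works here since both the history and the 7 forecasted days fall in March 2021. A range spanning several months would need a different label format.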

## Plotting and Reporting
Plotting task based on a function to generate a custom `time-series` plot.

In [9]:
@task()
def plot_results(dfp: pd.DataFrame, predp: pd.DataFrame):
    # construct the plot
    fig = plt.figure(figsize=(18,7))
    plt.plot(dfp.index, dfp.y, marker='o', color='black', label="History")
    plt.plot(predp.index, predp.yhat, marker='o', color='purple', linestyle='dashed', label="Forecast")
    
    # legend title and labels
    plt.legend(loc='best')
    plt.title(
        "Normalized Average Price of Apple's Stock Value Time Series for March of 2021", 
        fontdict = {
            'fontsize': '20',
            'fontweight' : '300',
            'verticalalignment': 'baseline'
            })
    plt.ylabel('Normalized average price', fontdict = {'fontsize': '15'})
    plt.xlabel('Day of month', fontdict = {'fontsize': '15'})

    # save and show the figure (the img/ directory must already exist)
    plt.savefig('img/applestockprice.png')
    plt.show()

`SOAM Mail Report` object initialization.

In [14]:
mr = mail_report.MailReportTask(
    # list of recipient email addresses
    mail_recipients_list = ["scafatieugenio@gmail.com"],
    # the metric name will appear in the email subject
    metric_name = "Stocks Forecast"
)

# SoaMFlow

Putting it all together using `SoamFlow`.

In [15]:
with SoamFlow(name = "t") as t:
    # EXTRACTION
    df = extractor(build_query_kwargs)
    # PRE PROCESSING
    data = transform_df_for_scaler(df = df)
    df.avg_price = ts(data)[0]
    df = transform_df_format(df = df)
    # FORECASTING
    predictions, time_series, model = forecaster(time_series=df)
    # POST PROCESSING
    dfp, predp = post_processing(df = df, predictions = predictions)
    # PLOTTING
    plot_results(dfp = dfp, predp = predp)
    # REPORTING
    mr(current_date = "2021-05-06", plot_filename = "img/applestockprice.png")

In [16]:
t.run()

[2021-04-09 15:22:51-0300] INFO - prefect.FlowRunner | Beginning Flow run for 't'
INFO:prefect.FlowRunner:Beginning Flow run for 't'
[2021-04-09 15:22:51-0300] INFO - prefect.TaskRunner | Task 'MailReportTask': Starting task run...
INFO:prefect.TaskRunner:Task 'MailReportTask': Starting task run...
INFO:soam.reporting.mail_report:Sending email report to: ['scafatieugenio@gmail.com']
INFO:soam.reporting.mail_report:About to send the following email:
                    'From: ' SoaM Reporter
                    'To: ' ['scafatieugenio@gmail.com']
                    'Subject: ' [2021-05-06]Forecast report for Stocks Forecast
                    'Using host': smtp.gmail.com and port: 587
ERROR:soam.reporting.mail_report:With the following body: 
 <!DOCTYPE html>

<head>
  <meta charset='UTF-8'>
  <style type='text/css'>
    .maindiv {
      height: 100%;
      width: 100% !important;
      background-color: #f7f7f7;
      margin: 0;
      padding: 0;
      overflow: auto;
    }

    .con

<Failed: "Some reference tasks failed.">