# Example Notebook

This notebook showcases the structure of a generic Python notebook project.

In [None]:
import pandas as pd
from matplotlib.pyplot import xkcd

%matplotlib inline


With some common libraries imported above, let's generate some charts as if this were a real project.

For this, we need to load the data from a file. It is best to define all external sources in one place in the notebook. That way, if we need to run the code against an entirely different set of files, we can update every file location in a single spot.

In [None]:
# Location of the CSV for sales data
sales_csv = 'data/sales_data.csv'


Now we can start loading our data in a separate stage.

In [None]:
sales = pd.read_csv(sales_csv, parse_dates=['Date'])
sales.head()

We keep the next cell for code that modifies the loaded data where needed, or for exploring the dataset ourselves.

In [None]:
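# Calculate Revenue_per_Age and the total cost of each order
sales['Revenue_per_Age'] = sales['Revenue'] / sales['Customer_Age']
sales['Calculated_Cost'] = sales['Order_Quantity'] * sales['Unit_Cost']

# Increase Unit_Price by 3%
sales['Unit_Price'] *= 1.03

# A cell renders only its last expression, so display() is used for the first table.
# numeric_only=True restricts the correlation to numeric columns; since pandas 2.0,
# non-numeric columns would otherwise raise an error.
display(sales.corr(numeric_only=True))
sales['Age_Group'].value_counts()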

Finally, let's generate a chart.

In [None]:
with xkcd(randomness=2):
    cost_plot = sales['Unit_Cost'].plot(kind='density', figsize=(14, 6))
    cost_plot.axvline(sales['Unit_Cost'].mean(), color='red')
    cost_plot.axvline(sales['Unit_Cost'].median(), color='green')


## Improvements

We can go one step further by creating pure-Python functions for functionality we use often.

Below is an example of ingestion and chart generation handled in separate Python files. Note that we copied the file-reading logic from an earlier cell into a function in `libraries/ingestions.py` and removed any reference to a hardcoded filename, so the code is agnostic to specific file locations and names.
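
As a rough illustration of the idea above, `libraries/ingestions.py` might contain something like the following. This is only a sketch: the real file is not shown in this notebook, the `date_columns` parameter is a hypothetical addition, and the in-memory CSV at the end merely stands in for `data/sales_data.csv`.

```python
# Hypothetical sketch of libraries/ingestions.py (not the project's actual file)
import io

import pandas as pd


def sales_df(csv_source, date_columns=("Date",)):
    """Load sales data and return it as a DataFrame.

    csv_source is supplied by the caller (a path or a file-like object),
    so nothing in this function depends on a hardcoded file name.
    date_columns is an assumed parameter listing the columns to parse as dates.
    """
    return pd.read_csv(csv_source, parse_dates=list(date_columns))


# Usage with a tiny in-memory CSV standing in for the real sales file
sample = io.StringIO("Date,Revenue\n2020-01-01,10\n2020-01-02,20\n")
demo = sales_df(sample)
```

Because the location is an argument, the same function can be pointed at a different file by changing only the cell that defines `sales_csv`.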

The `correlation_chart()` function is another example of code reuse.

In [None]:
from libraries.ingestions import sales_df
from libraries.charts import plotfigure as correlation_chart

# Since sales_df() returns a DataFrame object,
# the output already comes with pandas statistical methods
# like .corr(), without importing pandas separately.
# numeric_only=True restricts the correlation to numeric columns.
sales_corr = sales_df(sales_csv).corr(numeric_only=True)

corr_chart = correlation_chart(sales_corr)
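
The body of `plotfigure` in `libraries/charts.py` is not shown in this notebook. One plausible way to write such a helper, assuming it draws the correlation matrix as a heatmap with matplotlib and returns the Axes, is sketched below. The `figsize` parameter, the colour map, and the demo data are all assumptions made for illustration.

```python
# Hypothetical sketch of libraries/charts.py (not the project's actual file)
import matplotlib.pyplot as plt
import pandas as pd


def plotfigure(corr, figsize=(8, 8)):
    """Draw a correlation matrix as a heatmap and return the Axes."""
    fig, ax = plt.subplots(figsize=figsize)
    # Correlations lie in [-1, 1], so the colour scale is pinned to that range
    image = ax.imshow(corr.to_numpy(), cmap="RdBu", vmin=-1, vmax=1)
    # Label both axes with the column names of the correlation matrix
    ax.set_xticks(list(range(len(corr.columns))))
    ax.set_xticklabels(list(corr.columns), rotation=90)
    ax.set_yticks(list(range(len(corr.index))))
    ax.set_yticklabels(list(corr.index))
    fig.colorbar(image, ax=ax)
    return ax


# Usage with a small made-up DataFrame
demo_corr = pd.DataFrame({"a": [1, 2, 3], "b": [3, 2, 1]}).corr()
demo_ax = plotfigure(demo_corr)
```

Returning the Axes, rather than calling `plt.show()` inside the helper, leaves the notebook free to adjust titles or save the figure afterwards.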

## Done!