## Context
You work on the data analysis team of a very important company. On Monday, the company shares some good news with you: it has just been hired by a major retail company! So get ready for a huge amount of work.

You then sit down with your team and define the following tasks:
1. You need to start your analysis using data from the past.  
2. You need to define a process that takes your daily data as an input and integrates it.  

You are in charge of the second part, so you are provided with a sample file that you will have to read daily. To complete your task, you need the following aggregates:
* One aggregate per store that adds up the rest of the values.
* One aggregate per item that adds up the rest of the values.
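
As a minimal sketch of these two aggregates, assuming the column names used in the code further down (`shop_id`, `item_id`, `item_price`, `item_cnt_day`) and a few made-up rows, the totals could be computed with a `groupby`. The `sample` frame and the `per_store` / `per_item` names are illustrative only.

```python
import pandas as pd

# Made-up rows for illustration only; the real data comes from raw_sales.
sample = pd.DataFrame({
    "shop_id": [1, 1, 2],
    "item_id": [10, 11, 10],
    "item_price": [5.0, 2.0, 5.0],
    "item_cnt_day": [2, 3, 1],
})

# Revenue per row, so there is a monetary value to add up.
sample["total_amount"] = sample["item_price"] * sample["item_cnt_day"]

# One row per store and one row per item, summing the numeric values.
per_store = sample.groupby("shop_id")[["item_cnt_day", "total_amount"]].sum()
per_item = sample.groupby("item_id")[["item_cnt_day", "total_amount"]].sum()

print(per_store)
print(per_item)
```

Summing `item_price` itself would rarely be meaningful, which is why this sketch adds up units sold and revenue instead.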

You can import the `raw_sales` table from the `retail_sales` database, one of Ironhack's databases.

## Your task
Therefore, your process will consist of the following steps:
1. Read the sample file that a daily process will save in your folder. 
2. Clean up the data.
3. Create the aggregates.
4. Write three tables in your local database: 
    - A table for the cleaned data.
    - A table for the aggregate per store.
    - A table for the aggregate per item.
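
The last step could look roughly like the sketch below. It assumes a SQLite database as the "local database" (an in-memory one here, so nothing is written to disk) and uses hypothetical table names (`clean_sales`, `sales_per_store`, `sales_per_item`); a different engine would need its own connection object.

```python
import sqlite3
import pandas as pd

def write_tables(conn, cleaned, per_store, per_item):
    """Write the three result tables, replacing any previous version."""
    cleaned.to_sql("clean_sales", conn, if_exists="replace", index=False)
    # The aggregates keep their index (shop_id / item_id) as a column.
    per_store.to_sql("sales_per_store", conn, if_exists="replace")
    per_item.to_sql("sales_per_item", conn, if_exists="replace")

# Tiny made-up inputs, standing in for the real cleaned data and aggregates.
cleaned = pd.DataFrame({
    "shop_id": [1, 2],
    "item_id": [10, 10],
    "total_amount": [10.0, 5.0],
})
per_store = cleaned.groupby("shop_id")[["total_amount"]].sum()
per_item = cleaned.groupby("item_id")[["total_amount"]].sum()

conn = sqlite3.connect(":memory:")  # assumption: swap for a file path or another engine
write_tables(conn, cleaned, per_store, per_item)

# Read back to check what was written.
tables = pd.read_sql_query(
    "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name", conn
)
item_total = pd.read_sql_query(
    "SELECT total_amount FROM sales_per_item", conn
)["total_amount"].iloc[0]
conn.close()
```

Using `if_exists="replace"` rebuilds each table on every run; for a daily process that accumulates data, `if_exists="append"` on the cleaned table may be the better fit.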

## Instructions
* Clean the data and create the aggregates as you see fit.
* Create the tables in your local database.
* Populate them with your process.
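
What "clean" means is left open here, so the following is only one possible sketch. The rules (drop exact duplicates, drop rows with missing key fields, drop non-positive prices) and the `clean_sales` helper are assumptions, not requirements of the exercise.

```python
import pandas as pd

def clean_sales(df):
    """Hypothetical cleaning rules; adjust them to what the real data needs."""
    # Note: drop_duplicates ignores the index, so keep `date` as a regular
    # column if rows on different dates must not be treated as duplicates.
    out = df.drop_duplicates()
    out = out.dropna(subset=["shop_id", "item_id", "item_price", "item_cnt_day"])
    # Assumption: a price of zero or below is a data error.
    out = out[out["item_price"] > 0]
    return out.reset_index(drop=True)

# Made-up rows: one duplicate, one missing price, one negative price.
raw = pd.DataFrame({
    "shop_id": [1, 1, 2, 2],
    "item_id": [10, 10, 11, 12],
    "item_price": [5.0, 5.0, None, -1.0],
    "item_cnt_day": [2, 2, 1, 1],
})
cleaned = clean_sales(raw)
print(cleaned)
```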

In [21]:
path = '../Datasets_as_CSV/retail_sales-raw_sales.csv'

In [23]:
# Import libraries
import numpy as np
import pandas as pd

# User input
path = input('Enter the path of the file: ')

# Data acquisition
def acquire(path):
    # The file is semicolon-separated; parse 'date' and use it as the index
    data = pd.read_csv(path, sep=';', index_col='date', parse_dates=True)
    return data

# Data wrangling
def wrangle(df):
    # Revenue per row: unit price times units sold that day
    df['total_amount'] = df['item_price'] * df['item_cnt_day']
    return df
    
# Data Analysis
def analyze_by_items(df):
    # Units sold and revenue per item and price (the groupby sum is already a DataFrame)
    item = df.groupby(['item_id', 'item_price'])[['item_cnt_day', 'total_amount']].sum()
    return item

def analyze_by_shop(df):
    shop = df.groupby('shop_id').agg({'total_amount':'sum'})
    return shop

# Export data
def export_tables(item, shop):
    # Write each aggregate to its own sheet of sales.xlsx
    with pd.ExcelWriter('sales.xlsx') as writer:
        item.to_excel(writer, sheet_name='items')
        shop.to_excel(writer, sheet_name='shop')
    
# Execution
if __name__ == '__main__':
    data = acquire(path)
    df = wrangle(data)
    item = analyze_by_items(df)
    shop = analyze_by_shop(df)
    export_tables(item,shop)

Enter the path of the file: ../Datasets_as_CSV/retail_sales-raw_sales.csv
