## Context
You work on the data analysis team of a very important company. On Monday, the company shares some good news with you: you have just been hired by a major retail company! So get ready for a huge amount of work!

Then you get to work with your team and define the following tasks to perform:   
1. You need to start your analysis using data from the past.  
2. You need to define a process that takes your daily data as an input and integrates it.  

You are in charge of the second part, so you are provided with a sample file that you will have to read daily. To complete your task, you need the following aggregates:
* One aggregate per store that adds up the rest of the values.
* One aggregate per item that adds up the rest of the values.

You can import the `raw_sales` table from the `retail_sales` database, one of Ironhack's databases.

## Your task
Therefore, your process will consist of the following steps:
1. Read the sample file that a daily process will save in your folder. 
2. Clean up the data.
3. Create the aggregates.
4. Write three tables in your local database: 
    - A table for the cleaned data.
    - A table for the aggregate per store.
    - A table for the aggregate per item.

## Instructions
* Clean the data and create the aggregates as you see fit.
* Create the tables in your local database.
* Populate them with your process.

In [1]:
# We first import the necessary libraries:
import pandas as pd
import numpy as np

In [9]:
# Import the dataset (semicolon-separated) and drop the `date` column
raw_sales = pd.read_csv('../Data/retail_sales-raw_sales.csv', sep=';')
raw_sales = raw_sales.drop(columns=['date'])
raw_sales.head()

Unnamed: 0,shop_id,item_id,item_price,item_cnt_day
0,29,1469,1199.0,1.0
1,28,21364,479.0,1.0
2,28,21365,999.0,2.0
3,28,22104,249.0,2.0
4,28,22091,179.0,1.0
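
Dropping the `date` column is the only cleaning step shown above. As a minimal sketch of a few further checks that could precede the aggregation, the block below counts missing values, drops incomplete rows, and counts negative quantities. The rows in `sample` are hypothetical and only mimic the shape of `raw_sales`; the policy of dropping incomplete rows is an assumption, not something the exercise prescribes.

```python
import pandas as pd

# Hypothetical rows in the same shape as raw_sales (values are illustrative only)
sample = pd.DataFrame({
    "shop_id": [29, 28, 28, 28],
    "item_id": [1469, 21364, 22104, 2575],
    "item_price": [1199.0, 479.0, None, 2099.0],
    "item_cnt_day": [1.0, 1.0, 2.0, -1.0],
})

# Count missing values in each column before deciding how to handle them
missing_per_column = sample.isna().sum()

# Assumed policy: drop rows with any missing value
cleaned = sample.dropna().reset_index(drop=True)

# Quantities are whole units, so store them as integers
cleaned["item_cnt_day"] = cleaned["item_cnt_day"].astype(int)

# Negative quantities are counted here, not removed
n_negative = int((cleaned["item_cnt_day"] < 0).sum())

print(missing_per_column)
print(cleaned)
print(n_negative)
```

Negative quantities are only counted in this sketch because the source does not say what they represent, so whether to keep them is a decision for the analyst.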


In [17]:
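# Compute the aggregate per item: sum the remaining columns for each item_id
sales_items = raw_sales.drop(columns=['shop_id']).groupby('item_id', as_index=False).sum()
# Display the items sorted by units sold, highest first
sales_items.sort_values('item_cnt_day', ascending=False)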

Unnamed: 0,item_id,item_price,item_cnt_day
932,20949,219.0,93.0
71,1969,190152.0,66.0
939,21364,22848.0,66.0
807,17717,59034.0,60.0
526,11927,20754.0,51.0
...,...,...,...
94,2575,6297.0,-3.0
97,2690,4794.0,-3.0
43,1523,2397.0,-3.0
122,2946,1347.0,-3.0
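
If `item_price` is a unit price, summing it across rows adds prices together regardless of how many units were sold, which makes that total hard to interpret. One possible refinement, sketched here on hypothetical rows, is to compute a per-row `revenue` column first and then use named aggregation. The `revenue` and `units_sold` names are assumptions introduced for illustration.

```python
import pandas as pd

# Hypothetical rows in the same shape as raw_sales (values are illustrative only)
sample = pd.DataFrame({
    "shop_id": [28, 28, 29, 29],
    "item_id": [21364, 21365, 21364, 1469],
    "item_price": [479.0, 999.0, 479.0, 1199.0],
    "item_cnt_day": [1.0, 2.0, 3.0, 1.0],
})

# Assumption: revenue per row is the unit price times the units sold
sample["revenue"] = sample["item_price"] * sample["item_cnt_day"]

# Named aggregation keeps only the totals that have a clear meaning
item_totals = (
    sample.groupby("item_id", as_index=False)
    .agg(units_sold=("item_cnt_day", "sum"), revenue=("revenue", "sum"))
    .sort_values("units_sold", ascending=False)
    .reset_index(drop=True)
)

print(item_totals)
```

The same pattern applies per store by grouping on `shop_id` instead of `item_id`.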


In [18]:
# Compute the aggregate per store: sum the remaining columns for each shop_id
sales_stores = raw_sales.drop(columns=['item_id']).groupby('shop_id', as_index=False).sum()
# Display the stores sorted by units sold, highest first
sales_stores.sort_values('item_cnt_day', ascending=False)

Unnamed: 0,shop_id,item_price,item_cnt_day
21,31,268098.0,402.0
42,57,219234.0,324.0
16,25,281796.0,312.0
28,42,327864.0,249.0
19,28,175614.0,246.0
7,12,212196.4,216.0
39,54,119352.0,195.0
40,55,72612.6,180.0
18,27,156324.0,180.0
13,21,224818.5,180.0
