## **Introduction** ##

Main Aim : An analysis of Irelands Agricultural data and comparing the Irish Agri Sector with other countries worldwide. 

Main Aim of Data Preparation & visualisation Tasks : 

- Using EDA ( Exploratory Data Analysis ) to help identify patterns , inconsistencies, anomalies , missing data and other attributes and issues in the choosen datasets. To address these as well
- With machine learning in mind , use the appropriate data cleaning ,engineering and extraction and other techniques to structure and enrich the data
- Develop an interactive dashboard ( visualisation to communicate information) tailored to modern farmers using tufts principles to shocaste the information gathered from the machine learning 

## **What data are we exploring today?**

Reference : https://agridata.ec.europa.eu/extensions/DashboardDairy/DairyPrices.html#
https://agridata.ec.europa.eu/extensions/DashboardRawMilk/RawMilkPrices.html#


Is there a correlation between the price of butter and its stock  ? 

typically - if the price is high , the stocks are low and most people sell , and vice versa 

Objective is to find the right predictors - investigate seasonal effects  and the price of milk on the butter prices


#### **1. Importing or required Libraries for EDA**

In [103]:
# Importing required libraries.
import pandas as pd
import numpy as np
import seaborn as sns #visualisation
import matplotlib.pyplot as plt #visualisation
%matplotlib inline
sns.set(color_codes=True) 
from datetime import datetime
import warnings
warnings.filterwarnings('ignore')


#### **2. Loading the data into the data frame.**

In [104]:
DairyButterStock = pd.read_csv('European Butter Stock.csv') 

In [105]:
DairyMilkStock = pd.read_csv('European Milk Stock.csv') 

https://stackoverflow.com/questions/22216076/unicodedecodeerror-utf8-codec-cant-decode-byte-0xa5-in-position-0-invalid-s
#The error is because there is some non-ascii character in the dictionary and it can't be encoded/decoded. One simple way to avoid this error is to encode such strings with encode() function as follows (if a is the string with non-ascii character):

In [106]:
ButterPrices = pd.read_csv('European union Butter Prices.csv',encoding='unicode_escape') 

In [107]:
MilkPrices = pd.read_csv('European milk prices.csv',encoding='unicode_escape') 

## **3. Data Preparation .**

### 1. ButterPrices Dataset

Organise data in terms of months and remove the week column 



In [108]:
ButterPrices.head(5)

Unnamed: 0,Year,Week,Member State,Product,Begin Date,End Date,Price (/100kg)
0,2000,52,Belgium,BUTTER,25/12/2000,31/12/2000,326.1
1,2000,52,Denmark,BUTTER,25/12/2000,31/12/2000,376.7
2,2000,52,Germany,BUTTER,25/12/2000,31/12/2000,352.79
3,2000,52,Ireland,BUTTER,25/12/2000,31/12/2000,292.04
4,2000,52,Greece,BUTTER,25/12/2000,31/12/2000,460.16


In [109]:
ButterPrices["Begin Date"] = pd.to_datetime(ButterPrices["Begin Date"])

In [110]:
ButterPrices.head(5)

Unnamed: 0,Year,Week,Member State,Product,Begin Date,End Date,Price (/100kg)
0,2000,52,Belgium,BUTTER,2000-12-25,31/12/2000,326.1
1,2000,52,Denmark,BUTTER,2000-12-25,31/12/2000,376.7
2,2000,52,Germany,BUTTER,2000-12-25,31/12/2000,352.79
3,2000,52,Ireland,BUTTER,2000-12-25,31/12/2000,292.04
4,2000,52,Greece,BUTTER,2000-12-25,31/12/2000,460.16


In [111]:
ButterPrices['Month'] = pd.DatetimeIndex(ButterPrices["Begin Date"]).month 

In [112]:
ButterPrices.groupby(ButterPrices['Begin Date'].dt.month)['Price (/100kg)'].mean()

Begin Date
1     352.518843
2     352.876086
3     355.813380
4     356.434747
5     357.043786
6     358.089287
7     363.152140
8     363.698226
9     363.378740
10    372.647288
11    372.050699
12    356.798542
Name: Price (/100kg), dtype: float64

In [113]:
ButterPrices.rename(columns={"Price (/100kg)": "Butter Price (/100kg)"},inplace=True)

In [114]:
ButterPrices.drop(['Begin Date','End Date','Week','Product'],axis=1)

Unnamed: 0,Year,Member State,Butter Price (/100kg),Month
0,2000,Belgium,326.10,12
1,2000,Denmark,376.70,12
2,2000,Germany,352.79,12
3,2000,Ireland,292.04,12
4,2000,Greece,460.16,12
...,...,...,...,...
14219,2022,Italy,514.00,3
14220,2022,Netherlands,608.00,3
14221,2022,Poland,574.43,3
14222,2022,Portugal,558.87,3


In [115]:
ButterPrices.rename(columns={"Price (/100kg)": "Butter Price (/100kg)"},inplace=True)

In [116]:
# rearranging the location of the columns
ButterPrices1=ButterPrices.iloc[:, [0,7,2,6]]
ButterPrices1.head()

Unnamed: 0,Year,Month,Member State,Butter Price (/100kg)
0,2000,12,Belgium,326.1
1,2000,12,Denmark,376.7
2,2000,12,Germany,352.79
3,2000,12,Ireland,292.04
4,2000,12,Greece,460.16


### 2. MilkPrices Dataset

In [117]:
MilkPrices.head(5)

Unnamed: 0,Year,Month,Member State,Product,Price(/100kg)
0,2022,Jan,Belgium,Raw milk,45.13
1,2022,Jan,Bulgaria,Raw milk,37.61
2,2022,Jan,Czechia,Raw milk,39.82
3,2022,Jan,Denmark,Raw milk,43.68
4,2022,Jan,Germany,Raw milk,43.03


In [118]:
MilkPrices.rename(columns={"Price(/100kg)": "Raw Milk Price(/100kg)"},inplace=True)

In [119]:
MilkPrices.head(5)

Unnamed: 0,Year,Month,Member State,Product,Raw Milk Price(/100kg)
0,2022,Jan,Belgium,Raw milk,45.13
1,2022,Jan,Bulgaria,Raw milk,37.61
2,2022,Jan,Czechia,Raw milk,39.82
3,2022,Jan,Denmark,Raw milk,43.68
4,2022,Jan,Germany,Raw milk,43.03


In [120]:
MilkPrices.drop(['Product'],axis=1)

Unnamed: 0,Year,Month,Member State,Raw Milk Price(/100kg)
0,2022,Jan,Belgium,45.13
1,2022,Jan,Bulgaria,37.61
2,2022,Jan,Czechia,39.82
3,2022,Jan,Denmark,43.68
4,2022,Jan,Germany,43.03
...,...,...,...,...
452,2005,Jan,Slovakia,25.07
453,2005,Jan,Finland,34.42
454,2005,Jan,Sweden,31.63
455,2004,Jan,Slovenia,27.95


### 3. DairyStock Dataset

In [121]:
DairyButterStock.head(5)

Unnamed: 0,Member State,Member State Code,Category,Year,Month,Thousand tonnes
0,Austria,AT,"Butter, incl. dehydrated butter and ghee, and ...",2022,January,3.32
1,Austria,AT,"Butter, incl. dehydrated butter and ghee, and ...",2022,February,2.85
2,Austria,AT,"Butter, incl. dehydrated butter and ghee, and ...",2022,March,3.36
3,Austria,AT,"Butter, incl. dehydrated butter and ghee, and ...",2022,April,3.13
4,Austria,AT,"Butter, incl. dehydrated butter and ghee, and ...",2022,May,3.09


In [125]:
DairyButterStock.rename(columns={'Thousand tonnes':"Butter Thousand tonnes"},inplace=True)

In [126]:
DairyButterStock.head(5)

Unnamed: 0,Member State,Member State Code,Category,Year,Month,Butter Thousand tonnes
0,Austria,AT,"Butter, incl. dehydrated butter and ghee, and ...",2022,January,3.32
1,Austria,AT,"Butter, incl. dehydrated butter and ghee, and ...",2022,February,2.85
2,Austria,AT,"Butter, incl. dehydrated butter and ghee, and ...",2022,March,3.36
3,Austria,AT,"Butter, incl. dehydrated butter and ghee, and ...",2022,April,3.13
4,Austria,AT,"Butter, incl. dehydrated butter and ghee, and ...",2022,May,3.09


In [127]:
DairyButterStock.drop(['Member State Code','Category'],axis=1)

Unnamed: 0,Member State,Year,Month,Butter Thousand tonnes
0,Austria,2022,January,3.32
1,Austria,2022,February,2.85
2,Austria,2022,March,3.36
3,Austria,2022,April,3.13
4,Austria,2022,May,3.09
...,...,...,...,...
6129,Slovakia,2004,August,0.89
6130,Slovakia,2004,September,0.74
6131,Slovakia,2004,October,0.64
6132,Slovakia,2004,November,0.62


In [122]:
DairyMilkStock.head(5)

Unnamed: 0,Member State,Member State Code,Category,Year,Month,Thousand tonnes
0,Austria,AT,Total raw cow's milk delivered to dairies,2022,January,281.97
1,Austria,AT,Total raw cow's milk delivered to dairies,2022,February,264.78
2,Austria,AT,Total raw cow's milk delivered to dairies,2022,March,298.06
3,Austria,AT,Total raw cow's milk delivered to dairies,2022,April,290.51
4,Austria,AT,Total raw cow's milk delivered to dairies,2022,May,299.46


In [128]:
DairyMilkStock.rename(columns={'Thousand tonnes':"Milk Thousand tonnes"},inplace=True)

In [129]:
DairyMilkStock.drop(['Member State Code','Category'],axis=1)

Unnamed: 0,Member State,Year,Month,Milk Thousand tonnes
0,Austria,2022,January,281.97
1,Austria,2022,February,264.78
2,Austria,2022,March,298.06
3,Austria,2022,April,290.51
4,Austria,2022,May,299.46
...,...,...,...,...
6403,Slovakia,2004,August,81.20
6404,Slovakia,2004,September,76.50
6405,Slovakia,2004,October,74.45
6406,Slovakia,2004,November,68.20
