# PROBLEM STATEMENT

- Data represents weekly 2018 retail scan data for National retail volume (units) and price. 
- Retail scan data comes directly from retailers’ cash registers based on actual retail sales of Hass avocados. 
- Starting in 2013, the table below reflects an expanded, multi-outlet retail data set. Multi-outlet reporting includes an aggregation of the following channels: grocery, mass, club, drug, dollar and military. 
- The Average Price (of avocados) in the table reflects a per unit (per avocado) cost, even when multiple units (avocados) are sold in bags. 
- The Product Lookup codes (PLUs) in the table are only for Hass avocados. Other varieties of avocados (e.g. greenskins) are not included in this table.

Some relevant columns in the dataset:

- Date - The date of the observation
- AveragePrice - The average price of a single avocado
- type - Conventional or organic
- year - The year of the observation
- region - The city or region of the observation
- Total Volume - Total number of avocados sold
- 4046 - Total number of avocados with PLU 4046 sold
- 4225 - Total number of avocados with PLU 4225 sold
- 4770 - Total number of avocados with PLU 4770 sold

Dataset: https://bigml.com/user/vidal/gallery/dataset/5c5c760cde2d4d330b009ff2
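
A quick check of the columns listed above can catch naming mismatches early; the code in this notebook refers to the region column in lowercase, as `region`. The following is a minimal sketch that uses a tiny made-up frame in place of `avocado.csv`. The helper name `missing_columns` is hypothetical and not part of pandas.

```python
import pandas as pd

# Columns this notebook relies on; the PLU column names are strings, not integers.
EXPECTED_COLUMNS = ["Date", "AveragePrice", "type", "year", "region",
                    "Total Volume", "4046", "4225", "4770"]

def missing_columns(df, expected=EXPECTED_COLUMNS):
    """Return the expected column names that are absent from df."""
    return [name for name in expected if name not in df.columns]

# Made-up row for illustration only, not real avocado data.
toy_df = pd.DataFrame({
    "Date": ["2018-01-07"], "AveragePrice": [1.10], "type": ["organic"],
    "year": [2018], "Region": ["West"],  # wrong case on purpose
    "Total Volume": [1000.0], "4046": [400.0], "4225": [500.0], "4770": [100.0],
})
print(missing_columns(toy_df))
```

Running the same helper on the real dataframe after `pd.read_csv` should return an empty list if the file matches the description above.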


# IMPORTING DATA

In [None]:
# Import libraries
import pandas as pd # Pandas for data manipulation using dataframes
import numpy as np # NumPy for numerical analysis
import matplotlib.pyplot as plt # Matplotlib for data visualisation
import random
import seaborn as sns
try:
    from prophet import Prophet # Package name since version 1.0
except ImportError:
    from fbprophet import Prophet # Older releases were published as fbprophet

In [None]:
# Load the avocado dataset into a dataframe
avocado_df = pd.read_csv('avocado.csv')


# EXPLORING THE DATASET  

In [None]:
# Let's view the first rows of the dataset
avocado_df.head()

In [None]:
# Let's view the last 20 rows of the dataset
avocado_df.tail(20)

In [None]:
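# Parse the dates, then sort chronologically so the plot uses a real time axis
avocado_df = avocado_df.assign(Date=pd.to_datetime(avocado_df['Date'])).sort_values("Date")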

In [None]:
plt.figure(figsize=(10,10))
plt.plot(avocado_df['Date'], avocado_df['AveragePrice'])


In [None]:
avocado_df

In [None]:
# Bar chart of the number of observations per region
plt.figure(figsize=[25,12])
sns.countplot(x = 'region', data = avocado_df)
plt.xticks(rotation = 45)


In [None]:
# Bar chart of the number of observations per year
plt.figure(figsize=[25,12])
sns.countplot(x = 'year', data = avocado_df)
plt.xticks(rotation = 45)


In [None]:
avocado_prophet_df = avocado_df[['Date', 'AveragePrice']] 


In [None]:
avocado_prophet_df

# PREDICTIONS

In [None]:
avocado_prophet_df = avocado_prophet_df.rename(columns={'Date':'ds', 'AveragePrice':'y'})
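
Prophet expects the input columns to be named `ds` (date) and `y` (value), which is what the rename above provides. The full dataset holds one row per region and type for each week, so the same date appears many times. One possible way to get a single observation per date is to average the prices before renaming. This is an assumption for illustration, not what the notebook does, and it uses made-up rows:

```python
import pandas as pd

# Made-up prices: two regions observed on the same two dates.
toy_df = pd.DataFrame({
    "Date": ["2018-01-07", "2018-01-07", "2018-01-14", "2018-01-14"],
    "region": ["West", "Boston", "West", "Boston"],
    "AveragePrice": [1.0, 2.0, 1.5, 2.5],
})

# Collapse to a single mean price per date, then use Prophet's column names.
prophet_ready = (
    toy_df.assign(Date=pd.to_datetime(toy_df["Date"]))
          .groupby("Date", as_index=False)["AveragePrice"].mean()
          .rename(columns={"Date": "ds", "AveragePrice": "y"})
)
print(prophet_ready)
```

Note that this is an unweighted mean; weighting by `Total Volume` would be another reasonable choice.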


In [None]:
avocado_prophet_df

In [None]:
m = Prophet()
m.fit(avocado_prophet_df)


In [None]:
# Forecasting into the future
future = m.make_future_dataframe(periods=365)
forecast = m.predict(future)
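
`make_future_dataframe` defaults to a daily frequency (`freq='D'`), so `periods=365` appends 365 daily dates after the last observation even though the data is weekly. The sketch below only mimics that date arithmetic with pandas on made-up weekly dates (it does not call Prophet) and contrasts it with a weekly horizon:

```python
import pandas as pd

# Four made-up weekly observation dates (Sundays).
history = pd.date_range("2018-01-07", periods=4, freq="W")
last_date = history.max()

# Daily horizon: 365 dates starting the day after the last observation.
daily_future = pd.date_range(last_date + pd.Timedelta(days=1), periods=365, freq="D")

# Weekly horizon: 52 dates on the same weekday as the history.
weekly_future = pd.date_range(last_date + pd.Timedelta(weeks=1), periods=52, freq="W")

print(daily_future[0].date(), daily_future[-1].date())
print(weekly_future[0].date(), weekly_future[-1].date())
```

With Prophet itself, the weekly variant would be `m.make_future_dataframe(periods=52, freq='W')`.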

In [None]:
forecast

In [None]:
figure = m.plot(forecast, xlabel='Date', ylabel='Price')

In [None]:
figure3 = m.plot_components(forecast)

In [None]:
# Reload the full dataset before selecting a single region
avocado_df = pd.read_csv('avocado.csv')


In [None]:
avocado_df

In [None]:
avocado_df_sample = avocado_df[avocado_df['region']=='West']
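
Since the dataset has a `type` column (conventional or organic), a region filter alone may still leave two rows per date. A minimal sketch on made-up rows, assuming one also wants to restrict the sample to a single type so that each date appears once:

```python
import pandas as pd

# Made-up rows: the West region appears once per type on each date.
toy_df = pd.DataFrame({
    "Date": ["2018-01-07", "2018-01-07", "2018-01-14", "2018-01-14"],
    "region": ["West"] * 4,
    "type": ["conventional", "organic", "conventional", "organic"],
    "AveragePrice": [1.0, 1.6, 1.1, 1.7],
})

# Filtering on region alone keeps both types, so dates repeat.
west = toy_df[toy_df["region"] == "West"]
print(west["Date"].is_unique)

# Adding a type filter leaves one row per date.
west_conventional = west[west["type"] == "conventional"]
print(west_conventional["Date"].is_unique)
```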

In [None]:
avocado_df_sample

In [None]:
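# Parse the dates, then sort chronologically so the plot uses a real time axis
avocado_df_sample = avocado_df_sample.assign(Date=pd.to_datetime(avocado_df_sample['Date'])).sort_values("Date")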

In [None]:
avocado_df_sample

In [None]:
plt.figure(figsize=(10,10))
plt.plot(avocado_df_sample['Date'], avocado_df_sample['AveragePrice'])

In [None]:
avocado_df_sample = avocado_df_sample.rename(columns={'Date':'ds', 'AveragePrice':'y'})


In [None]:
m = Prophet()
m.fit(avocado_df_sample)
# Forecasting into the future
future = m.make_future_dataframe(periods=365)
forecast = m.predict(future)

In [None]:
figure = m.plot(forecast, xlabel='Date', ylabel='Price')

In [None]:
figure3 = m.plot_components(forecast)