- In this notebook I explore global peanut prices using klib, forecast them with Facebook Prophet, and train a model using H2O.ai AutoML.

![](https://pangeabrokers.com/wp-content/uploads/2019/08/peanuts-global-market-research-pangea-new-report-2019.jpg)

# Global Peanut Market
- Peanuts are oval-shaped nuts commercially distributed as a pulse and an oilseed. They are widely used in the food and beverage industry in the form of oil, flour, snacks, and peanut butter.
- China is the world's leading producer of peanuts, accounting for nearly 41% of total output. In 2019 it produced 17.5 million metric tons, followed by India, Nigeria, and the United States with about 6.8, 3, and 2.5 million metric tons, respectively.
- Production decreased in 2018 in India, the United States, and Senegal due to adverse weather conditions, especially delayed and irregular rainfall.
- The major importers of peanuts are the Netherlands, Indonesia, the Russian Federation, Germany, and China. China in turn exports peanuts to countries such as Vietnam, Thailand, and Japan, which imported 43,791, 19,833, and 15,117 metric tons from China in 2018, respectively.
- Increased demand from other countries has led China to raise its production from 17.1 million metric tons in 2017 to 17.5 million metric tons in 2019. Market growth over the forecast period is expected to be supported globally by demand from processing industries, such as peanut snacks, which are widely consumed in the Asia-Pacific region.
![](https://s3.mordorintelligence.com/peanuts-market/1608725851974_peanuts-market_Global_Peanuts_Market%3A_Market_Size_by_Region%2C_2019.png)

# Install Klib.
[From here](https://www.kaggle.com/sripaadsrinivasan/klib-library-python)

In [None]:
!pip install -q '../input/klib-library-python/klib-0.1.3-py3-none-any.whl'

# Imports

In [None]:
import klib 
import pandas as pd
from statsmodels.tsa.seasonal import seasonal_decompose
import matplotlib.pyplot as plt
import warnings
warnings.filterwarnings("ignore")

In [None]:
df = pd.read_csv('../input/global-peanut-prices/Global_Peanut_Price.csv')
df.head(10)

# Missing Values

In [None]:
klib.missingval_plot(df)

In [None]:
df_cleaned = klib.data_cleaning(df)
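
As a rough illustration of the kind of cleanup `klib.data_cleaning` performs, the sketch below uses plain pandas on made-up rows: it normalizes column names, drops empty rows and columns, and removes duplicates. It is an approximation under those assumptions, not klib's actual implementation, and the `raw` frame is hypothetical. It does show why the price column is addressed as `pgnutsusdm` after cleaning while the raw CSV header is `PGNUTSUSDM`.

```python
import pandas as pd

# Made-up rows with an empty column, an empty row, and a duplicate (hypothetical data)
raw = pd.DataFrame(
    {
        "DATE": ["1980-01-01", "1980-02-01", "1980-02-01", None],
        "PGNUTSUSDM": [500.0, 510.0, 510.0, None],
        "Empty": [None, None, None, None],
    }
)

cleaned = (
    raw.rename(columns=lambda c: c.strip().lower().replace(" ", "_"))  # snake-case names
    .dropna(axis=1, how="all")   # drop columns that are entirely missing
    .dropna(axis=0, how="all")   # drop rows that are entirely missing
    .drop_duplicates()           # drop repeated rows
    .reset_index(drop=True)
)

print(cleaned.columns.tolist())
print(len(cleaned))
```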

# Correlation Matrix

In [None]:
klib.corr_mat(df_cleaned)

# Distribution Plots

In [None]:
klib.dist_plot(df_cleaned['pgnutsusdm'])

In [None]:
df_cleaned.head(10)

In [None]:
df_cleaned.plot(x='date', y='pgnutsusdm', title='prices over time', figsize=(15,6))

# Seasonal Decomposition

In [None]:
decompose = seasonal_decompose(df_cleaned.pgnutsusdm.values, period = 25)
decompose.plot()
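
To make the additive decomposition more concrete, here is a hand-rolled sketch on a made-up monthly series (linear trend, yearly cycle, noise). It is a simplified illustration of the idea, observed = trend + seasonal + residual, and not the exact algorithm `seasonal_decompose` uses; the series, the `period` of 12, and all variable names are assumptions for the example.

```python
import numpy as np
import pandas as pd

# Made-up monthly series: linear trend + yearly cycle + small noise
rng = np.random.default_rng(0)
n, period = 120, 12
t = np.arange(n)
true_seasonal = 10 * np.sin(2 * np.pi * t / period)
series = pd.Series(100 + 0.5 * t + true_seasonal + rng.normal(0, 1, n))

# Trend: centered moving average spanning one full period (2x12 for an even period)
trend = series.rolling(period).mean().rolling(2).mean().shift(-(period // 2))

# Seasonal: average detrended value at each position in the cycle, centered on zero
detrended = series - trend
cycle_means = detrended.groupby(t % period).mean()
cycle_means -= cycle_means.mean()
seasonal = pd.Series(cycle_means.to_numpy()[t % period], index=series.index)

# Residual: whatever the trend and seasonal components do not explain
resid = series - trend - seasonal

parts = pd.DataFrame({"trend": trend, "seasonal": seasonal, "resid": resid})
print(parts.iloc[6:12].round(2))
```

The first and last six trend values are missing because the centered window does not fit at the edges of the series.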

# Smoothing techniques
**(Smoothing reduces the effect of random variation and makes the underlying trend and seasonal components of the series easier to see.)**


In [None]:
# Moving average (3-period rolling mean; the first two values are NaN)
df_cleaned['moving_average'] = df_cleaned['pgnutsusdm'].rolling(window=3,center=False).mean()
df_cleaned.fillna(0).head(10)

In [None]:
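# Label each line so that plt.legend() has entries to show
plt.plot(df_cleaned.pgnutsusdm, '-', color='black', alpha=0.3, label='price')
plt.plot(df_cleaned.moving_average, color='b', label='3-period moving average')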
plt.title('Peanut Prices and Moving Average Smoothing')
plt.legend()
plt.show()

In [None]:
# Exponential smoothing (exponentially weighted moving average, EWMA)
df_cleaned['ewma']=df_cleaned['pgnutsusdm'].ewm(halflife=3,ignore_na=False,min_periods=0,adjust=True).mean()
df_cleaned.fillna(0).head(10)

In [None]:
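# Label each line so that plt.legend() has entries to show
plt.plot(df_cleaned.pgnutsusdm, '-', color='b', alpha=0.3, label='price')
plt.plot(df_cleaned.ewma, color='g', label='EWMA (halflife=3)')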
plt.title('Peanut Price and Exponential Smoothing')
plt.legend()
plt.show()
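
The `halflife` argument can be hard to read at a glance, so the sketch below works it out on a few made-up prices. pandas converts a half-life to a smoothing factor via alpha = 1 - exp(-ln(2) / halflife), and with `adjust=True` each smoothed value is a weighted average of all past observations with weights (1 - alpha)^k. The `prices` values and the `ewma_manual` helper are hypothetical and only serve the illustration.

```python
import numpy as np
import pandas as pd

prices = pd.Series([100.0, 102.0, 101.0, 105.0, 110.0, 108.0])  # made-up values

halflife = 3
alpha = 1 - np.exp(-np.log(2) / halflife)  # smoothing factor implied by the half-life

ewma_pandas = prices.ewm(halflife=halflife, adjust=True).mean()


def ewma_manual(values, alpha):
    """Weighted average of all observations so far, newest weighted most (adjust=True)."""
    out = []
    for i in range(len(values)):
        weights = (1 - alpha) ** np.arange(i, -1, -1)  # oldest ... newest
        out.append(np.dot(weights, values[: i + 1]) / weights.sum())
    return np.array(out)


print(round(alpha, 4))
print(np.allclose(ewma_manual(prices.to_numpy(), alpha), ewma_pandas))
```

A half-life of 3 means an observation's weight halves every three periods, so a larger half-life gives a smoother, slower-reacting line.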

# Facebook Prophet
- Prophet is open source software released by Facebook’s Core Data Science team.
- Prophet is a procedure for forecasting time series data based on an additive model where non-linear trends are fit with yearly, weekly, and daily seasonality, plus holiday effects.
- Prophet works best with time series that have strong seasonal effects and several seasons of historical data.

In [None]:
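# The package was renamed from `fbprophet` to `prophet` in version 1.0
try:
    from prophet import Prophet
except ImportError:
    from fbprophet import Prophet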

In [None]:
prophet_df = klib.data_cleaning(df_cleaned)

In [None]:
prophet_df = prophet_df.rename(columns={'date':'ds', 'pgnutsusdm':'y'})

In [None]:
prophet_df.head()

In [None]:
m = Prophet()
m.fit(prophet_df)

In [None]:
# Forecasting into the future
future = m.make_future_dataframe(periods=365)
future.tail()
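
`make_future_dataframe` uses a daily frequency by default (`freq='D'`), so `periods=365` adds 365 daily rows. If the price series is monthly, which is an assumption here and should be checked against the `ds` column, a monthly frequency such as `freq='MS'` may be a better match. The sketch below uses only pandas and a hypothetical last observation date to show the difference between the two date grids.

```python
import pandas as pd

last_date = pd.Timestamp("2020-12-01")  # hypothetical last observation

# Future dates only, excluding the last observed date itself
daily = pd.date_range(start=last_date, periods=366, freq="D")[1:]
monthly = pd.date_range(start=last_date, periods=13, freq="MS")[1:]

print(len(daily), daily[0].date(), daily[-1].date())
print(len(monthly), monthly[0].date(), monthly[-1].date())
```

Both grids cover the same year, but the monthly one asks the model for 12 predictions on the same spacing as the training data instead of 365 daily ones.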

In [None]:
forecast = m.predict(future)
forecast[['ds', 'yhat', 'yhat_lower', 'yhat_upper']].tail()

In [None]:
#forecast plotting
forecast_plot = m.plot(forecast, xlabel='Date', ylabel='Price', figsize=(15,6))

In [None]:
# forecast components
forecast_components = m.plot_components(forecast)

# H2O.ai AutoML

In [None]:
# !pip install h2o
# or 
# !pip install http://h2o-release.s3.amazonaws.com/h2o/rel-weierstrass/2/Python/h2o-3.14.0.2-py2.py3-none-any.whl
import h2o
from h2o.automl import H2OAutoML
h2o.init()

In [None]:
data = prophet_df.copy()
data.head()

In [None]:
# Load data into H2O
df = h2o.import_file('../input/global-peanut-prices/Global_Peanut_Price.csv')

In [None]:
df.describe()

In [None]:
y = 'PGNUTSUSDM'

- 60% for training
- 20% for validation (hyperparameter tuning)
- 20% for final testing; this part will not be used until the end.

In [None]:
# Split the H2O frame into roughly 60% / 20% / 20%
splits = df.split_frame([0.6, 0.2], seed = 1)
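
The ratios `[0.6, 0.2]` leave the remaining 20% for the third split. As far as I understand the H2O documentation, `split_frame` assigns rows probabilistically, so the resulting sizes are approximate rather than exact. The NumPy sketch below illustrates that idea only; `random_split` is a hypothetical helper, not H2O's implementation.

```python
import numpy as np


def random_split(n_rows, ratios, seed=1):
    """Assign each row index to a split by comparing a uniform draw to cumulative ratios."""
    rng = np.random.default_rng(seed)
    draws = rng.random(n_rows)
    bounds = np.cumsum(ratios)  # e.g. [0.6, 0.8]; the remainder forms the last split
    labels = np.searchsorted(bounds, draws, side="right")
    return [np.flatnonzero(labels == k) for k in range(len(ratios) + 1)]


train_idx, valid_idx, test_idx = random_split(1000, [0.6, 0.2])
print(len(train_idx), len(valid_idx), len(test_idx))
```

Every row lands in exactly one split, and the split sizes fluctuate around 600, 200, and 200 from seed to seed.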

In [None]:
splits

In [None]:
# Unpack the three splits
train = splits[0]
valid = splits[1]
test  = splits[2]

In [None]:
# Run AutoML
aml = H2OAutoML(max_runtime_secs = 300, seed = 1, project_name = "peanut_price")
aml.train(y = y, training_frame = train, leaderboard_frame = valid)

In [None]:
aml.leaderboard.head()

# Predict Using Leader Model

In [None]:
prediction = aml.predict(test)
prediction

In [None]:
performance = aml.leader.model_performance(test)
performance
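
For a regression target, the performance report includes error metrics such as RMSE and MAE. The sketch below computes both by hand on made-up numbers to show how they are defined; the `actual` and `predicted` arrays are hypothetical and unrelated to the model's real output.

```python
import numpy as np

actual = np.array([1500.0, 1520.0, 1480.0, 1600.0])     # made-up prices
predicted = np.array([1490.0, 1540.0, 1500.0, 1570.0])  # made-up predictions

errors = predicted - actual
mae = np.mean(np.abs(errors))          # mean absolute error
rmse = np.sqrt(np.mean(errors ** 2))   # root mean squared error

print(mae, round(rmse, 3))
```

RMSE is never smaller than MAE, and the gap between them grows when a few predictions are far off, because squaring penalizes large errors more heavily.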