
<h3 id="Business-Case-of-Deere-&amp;-Co.">Business Case of Deere &amp; Co.<a class="anchor-link" href="#Business-Case-of-Deere-&amp;-Co.">¶</a></h3><p>Deere &amp; Company forecast higher machinery sales for the next financial year as the world's largest tractor manufacturer downplayed the impact of the U.S.-China trade war on soybean prices.</p>
<p>Deere also forecast that its equipment sales will rise by about 30 percent in the current fiscal year. The company expects farmers' net returns per acre in 2019 to rise as much as 20 percent, to the highest level in about five years, Chief Finance Officer Rajesh Kalathur said on the call.</p>
<p>With demand this challenging to plan for, the company needs a data science team to help.</p>
<p>Deere is a tractor and farm equipment manufacturing company established in 1838.</p>
<p>The company has shown a consistent growth in its revenue from tractor sales since its inception.</p>
<p>However, over the years the company has struggled to keep its inventory and production costs down because of variability in sales and tractor demand.</p>
<p>The management at PowerHorse is under enormous pressure from the shareholders and board to reduce the production cost.</p>
<p>They are also interested in understanding the impact of their marketing and farmer-connect efforts on overall sales.</p>
<p>As part of this effort, they have hired you as a data science and predictive analytics consultant.</p>
<p>Can you help them optimize their operations and solve this business problem?</p>


In [None]:

import warnings
import itertools
import pandas as pd
import numpy as np
import statsmodels.api as sm
import statsmodels.tsa.api as smt
import statsmodels.formula.api as smf
from io import StringIO
import requests
import matplotlib.pyplot as plt
%matplotlib inline

plt.style.use('bmh')



In [None]:

# Read the data

sales_data = pd.read_csv("/Users/gobo/Desktop/IMS Pro School Data/Untitled Folder/Tractor-sales.csv")
sales_data.head(5)



In [None]:

# since the complete date was not mentioned, we assume that it was the first of every month
dates = pd.date_range(start='2003-01-01', freq='MS', periods=len(sales_data))

import calendar
sales_data['Month'] = dates.month
sales_data['Month'] = sales_data['Month'].apply(lambda x: calendar.month_abbr[x])
sales_data['Year'] = dates.year



In [None]:

sales_data.drop(['Month-Year'], axis=1, inplace=True)
sales_data.rename(columns={'Number of Tractor Sold':'Tractor-Sales'}, inplace=True)
sales_data = sales_data[['Month', 'Year', 'Tractor-Sales']]



In [None]:

# set the dates as the index of the dataframe, so that it can be treated as a time-series dataframe
sales_data.set_index(dates, inplace=True)
# check out first 5 samples of the data
sales_data.head(5)



In [None]:

# extract out the time-series
sales_ts = sales_data['Tractor-Sales']



In [None]:

# Plot the time series
plt.figure(figsize=(10, 5))
plt.plot(sales_ts)
plt.xlabel('Years')
plt.ylabel('Tractor Sales')



In [None]:

# Determine rolling statistics over a 12-month window
rolmean = sales_ts.rolling(window=12).mean()
rolstd = sales_ts.rolling(window=12).std()

#Plot rolling statistics:
orig = plt.plot(sales_ts, label='Original')
mean = plt.plot(rolmean, label='Rolling Mean')
std = plt.plot(rolstd, label = 'Rolling Std')
plt.legend(loc='best')
plt.title('Rolling Mean & Standard Deviation')
plt.show(block=False)



In [None]:

monthly_sales_data = pd.pivot_table(sales_data, values = "Tractor-Sales", columns = "Year", index = "Month")
monthly_sales_data = monthly_sales_data.reindex(index = ['Jan','Feb','Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec'])
monthly_sales_data



In [None]:

# This is to see month on month plot by year. This will help us to understand if we have
# similar patterns in the time series
monthly_sales_data.plot()



In [None]:

# Build a year-by-month table and box-plot it to compare the sales distribution per month
yearly_sales_data = pd.pivot_table(sales_data, values = "Tractor-Sales", columns = "Month", index = "Year")
yearly_sales_data = yearly_sales_data[['Jan','Feb','Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']]
yearly_sales_data
yearly_sales_data.boxplot()




<p>Important Inferences</p>
<p>The tractor sales have been increasing without fail every year.</p>
<p>July and August are the peak months for tractor sales, and both the mean and the variance in July and August are much higher than in any other month.</p>
<p>We can see a seasonal effect with a cycle of 12 months: the monthly mean rises from the beginning of the year and drops toward the end of the year.</p>
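<p>These inferences can also be checked numerically rather than read off the plots. The following is a minimal sketch on a synthetic series (not the tractor data): <code>demo_ts</code>, its trend, and its seasonal factors are all assumptions made for illustration. It averages the series by calendar month to find the peak months and by year to confirm the upward trend.</p>

```python
import numpy as np
import pandas as pd

# Synthetic monthly series (NOT the tractor data): an upward trend multiplied
# by a seasonal factor that peaks in July and August.
idx = pd.date_range(start="2003-01-01", freq="MS", periods=48)
trend = np.linspace(100.0, 200.0, len(idx))
seasonal_factor = np.array([0.8, 0.8, 0.9, 1.0, 1.1, 1.2,
                            1.4, 1.4, 1.1, 0.9, 0.8, 0.8])
demo_ts = pd.Series(trend * seasonal_factor[idx.month.to_numpy() - 1], index=idx)

# Average level per calendar month across all years
monthly_mean = demo_ts.groupby(demo_ts.index.month).mean()
# Average level per year, to check the trend
yearly_mean = demo_ts.groupby(demo_ts.index.year).mean()

peak_months = sorted(monthly_mean.nlargest(2).index.tolist())
print("Peak months:", peak_months)
print("Yearly means increasing:", yearly_mean.is_monotonic_increasing)
```

<p>The same two <code>groupby</code> calls should apply to <code>sales_ts</code>, since it is indexed by month-start dates.</p>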


In [None]:

decomposition = sm.tsa.seasonal_decompose(sales_ts, model='multiplicative')



In [None]:

fig = decomposition.plot()
fig.set_figwidth(12)
fig.set_figheight(8)
fig.suptitle('Decomposition of multiplicative time series')
plt.show()



In [None]:

# Let the d and q parameters take the value 0 or 1
q = d = range(0, 2)
# Let the p parameter take any integer value from 0 to 3
p = range(0, 4)

# Generate all combinations of (p, d, q) triplets
pdq = list(itertools.product(p, d, q))

# Generate all combinations of seasonal (P, D, Q, 12) orders; 12 is the seasonal period in months
seasonal_pdq = [(x[0], x[1], x[2], 12) for x in list(itertools.product(p, d, q))]

print('Examples of parameter combinations for Seasonal ARIMA...')
print('SARIMAX: {} x {}'.format(pdq[1], seasonal_pdq[1]))
print('SARIMAX: {} x {}'.format(pdq[1], seasonal_pdq[2]))
print('SARIMAX: {} x {}'.format(pdq[2], seasonal_pdq[3]))
print('SARIMAX: {} x {}'.format(pdq[2], seasonal_pdq[4]))
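
<p>It helps to know the size of the search space before fitting anything. As a small sketch that repeats the same ranges as the cell above (the names <code>p_values</code>, <code>orders</code>, and so on are chosen here for illustration), the grid has 4 × 2 × 2 = 16 non-seasonal orders and 16 seasonal orders, so an exhaustive search fits 256 candidate models.</p>

```python
import itertools

# Same ranges as the grid defined above
p_values = range(0, 4)   # 0, 1, 2, 3
d_values = range(0, 2)   # 0, 1
q_values = range(0, 2)   # 0, 1

orders = list(itertools.product(p_values, d_values, q_values))
seasonal_orders = [(P, D, Q, 12) for P, D, Q in orders]

n_models = len(orders) * len(seasonal_orders)
print(len(orders), len(seasonal_orders), n_models)
```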



In [None]:

warnings.filterwarnings("ignore") # specify to ignore warning messages

AIC = []
SARIMAX_model = []
for param in pdq:
    for param_seasonal in seasonal_pdq:
        try:
            mod = sm.tsa.statespace.SARIMAX(sales_ts,
                                            order=param,
                                            seasonal_order=param_seasonal,
                                            enforce_stationarity=False,
                                            enforce_invertibility=False)

            results = mod.fit()

            print('SARIMAX{}x{} - AIC:{}'.format(param, param_seasonal, results.aic), end='\r')
            AIC.append(results.aic)
            SARIMAX_model.append([param, param_seasonal])
        except Exception:
            # Skip parameter combinations that fail to fit
            continue



In [None]:

print('The smallest AIC is {} for model SARIMAX{}x{}'.format(min(AIC), SARIMAX_model[AIC.index(min(AIC))][0],SARIMAX_model[AIC.index(min(AIC))][1]))
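
<p>Selecting the best model means taking the candidate with the minimum AIC. One possible alternative to keeping two parallel lists is to store each fit as a single record and pick the minimum by its AIC field. The sketch below uses made-up AIC values purely for illustration; they are not results from the tractor data.</p>

```python
# Made-up (order, seasonal_order, AIC) records, for illustration only
candidates = [
    ((0, 1, 1), (0, 1, 1, 12), 710.2),
    ((1, 1, 0), (1, 1, 0, 12), 702.9),
    ((1, 1, 1), (0, 1, 1, 12), 705.4),
]

# Pick the record whose AIC (third field) is smallest
best_order, best_seasonal_order, best_aic = min(candidates, key=lambda rec: rec[2])
print('Best: SARIMAX{}x{} with AIC {}'.format(best_order, best_seasonal_order, best_aic))
```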



In [None]:

# Let's fit this model
mod = sm.tsa.statespace.SARIMAX(sales_ts,
                                order=SARIMAX_model[AIC.index(min(AIC))][0],
                                seasonal_order=SARIMAX_model[AIC.index(min(AIC))][1],
                                enforce_stationarity=False,
                                enforce_invertibility=False)

results = mod.fit()



In [None]:

results.plot_diagnostics(figsize=(20, 14))
plt.show()



In [None]:

pred1 = results.get_prediction(start='2003-01-01', dynamic=True)
pred1_ci = pred1.conf_int()



In [None]:

pred2 = results.get_forecast('2015-01-01')
pred2_ci = pred2.conf_int()



In [None]:

# In-sample prediction: the model predicts data that it was fitted on.
# With dynamic=False, each one-step-ahead forecast is conditioned on the actual
# observations up to the previous time point, not on earlier forecasts.
pred0 = results.get_prediction(start='2003-01-01', dynamic=False)
pred0_ci = pred0.conf_int()

# In-sample prediction with dynamic forecasting from the `start` date onward:
# each forecast is built on earlier forecasts instead of the observed values.
# Again, the model predicts data that it was fitted on.
pred1 = results.get_prediction(start='2003-01-01', dynamic=True)
pred1_ci = pred1.conf_int()

#"True" forecasting of out of sample data. 
#In this case the model is asked to predict data it has not seen before.
# The argument is the date up to which the forecast should run.
pred2 = results.get_forecast('2016-01-01')
pred2_ci = pred2.conf_int()
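
<p>The difference between one-step-ahead and dynamic prediction is easiest to see on a toy model. The following is a minimal sketch, assuming a noise-free AR(1) process with a known coefficient <code>phi</code> and made-up observations; it is not the SARIMAX model fitted above. The one-step-ahead forecast always starts from the last real observation, while the dynamic forecast feeds its own previous prediction back in.</p>

```python
# Toy AR(1) forecast rule: y_hat[t] = phi * y[t-1]; phi is assumed known here.
phi = 0.9
observed = [10.0, 9.5, 8.0, 7.9, 6.5, 6.0]  # made-up observations


def one_step_ahead(series, coef):
    """Predict each point from the actual previous observation."""
    return [coef * series[t - 1] for t in range(1, len(series))]


def dynamic(series, coef):
    """Predict each point from the previous prediction, seeded by the first observation."""
    preds = []
    last = series[0]
    for _ in range(1, len(series)):
        last = coef * last
        preds.append(last)
    return preds


one_step = one_step_ahead(observed, phi)
dyn = dynamic(observed, phi)
print(one_step)
print(dyn)
```

<p>Both forecasts agree on the first predicted point and diverge afterwards, because only the one-step-ahead version is corrected by new observations at every step.</p>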



In [None]:

# Plot the observed series together with the three sets of predictions
ax = sales_ts.plot(figsize=(20, 16))
pred0.predicted_mean.plot(ax=ax, label='1-step-ahead Forecast (get_prediction, dynamic=False)')
pred1.predicted_mean.plot(ax=ax, label='Dynamic Forecast (get_prediction, dynamic=True)')
pred2.predicted_mean.plot(ax=ax, label='Out-of-sample Forecast (get_forecast)')
ax.fill_between(pred2_ci.index, pred2_ci.iloc[:, 0], pred2_ci.iloc[:, 1], color='k', alpha=.1)
plt.ylabel('Monthly Tractor Sales')
plt.xlabel('Date')
plt.legend()
plt.show()

