# Hog Price Regression

This notebook walks through a multiple regression of weighted average hog prices on utilization rates, using data from 2010 to 2021. It supports a research paper written as part of my AURA fellowship.

First, we import our needed libraries.

In [19]:
import pandas as pd
from sklearn.linear_model import LinearRegression
import statsmodels.api as sm
import numpy as np

Next, let's read the .csv file using pandas' read_csv() function. read_csv() already returns a DataFrame, so passing it to pd.DataFrame() with a columns list simply keeps the columns we need.

In [20]:
sheet = pd.read_csv(r'C:\Users\csjes\Desktop\all hog prices.csv')
df = pd.DataFrame(sheet, columns = ['yr10', 'yr11', 'yr12', 'yr13', 'yr14', 'yr15', 'yr16', 'yr17', 'yr18', 'yr19', 'yr20', 'jan', 'feb', 'mar', 'apr', 'may', 'jun', 'jul', 'aug', 'sep', 'oct', 'nov', 'ln(utilization)', 'ln(wtd ave net price)'])

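We have a lot of columns in this dataset because year and month are coded as "dummy" (0/1 indicator) variables. One year (2021) and one month (December) have no column of their own, so they act as the baseline categories.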
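For readers who have not built dummy variables before, here is a minimal sketch of how columns like these could be produced with pandas. The weekly `date` column and the `yr_`/`m_` column names are assumptions made for illustration; the notebook's CSV already contains its dummies and does not show how they were created.

```python
import pandas as pd

# Hypothetical weekly dates; the real dataset's date column is not shown.
dates = pd.date_range("2010-01-01", "2021-12-31", freq="W")
frame = pd.DataFrame({"date": dates})

# One 0/1 indicator column per year and per month.
year_dummies = pd.get_dummies(frame["date"].dt.year, prefix="yr").astype(int)
month_dummies = pd.get_dummies(frame["date"].dt.month, prefix="m").astype(int)

# Drop one category from each set so it serves as the baseline
# (2021 and December, mirroring the columns used in this notebook).
year_dummies = year_dummies.drop(columns="yr_2021")
month_dummies = month_dummies.drop(columns="m_12")

dummies = pd.concat([year_dummies, month_dummies], axis=1)
print(dummies.columns.tolist())
```

Each row has at most one year dummy and at most one month dummy set to 1; a row with all zeros belongs to the baseline year or month.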

Next, we need to define our variables that will be used in the regression.

In [21]:
X = df[['ln(utilization)', 'yr10', 'yr11', 'yr12', 'yr13', 'yr14', 'yr15', 'yr16', 'yr17', 'yr18', 'yr19', 'yr20', 'jan', 'feb', 'mar', 'apr', 'may', 'jun', 'jul', 'aug', 'sep', 'oct', 'nov']]
y = df['ln(wtd ave net price)']

The variable 'X' contains our independent variables: log utilization plus the year and month dummies. These are the variables we expect to help explain the dependent variable.
The variable 'y' contains our dependent variable, the log of the weighted average net price of hogs.

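Finally, we fit the model and print the results. The model here is fit without a constant. Note that the dummy columns only stand in for a constant when a full set of categories is included. Because 2021 and December are left out, omitting the constant forces the baseline intercept to zero, and statsmodels reports the uncentered R-squared as a result.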

In [22]:
result = sm.OLS(y, X).fit()
print(result.summary())

                                  OLS Regression Results                                  
Dep. Variable:     ln(wtd ave net price)   R-squared (uncentered):                   0.959
Model:                               OLS   Adj. R-squared (uncentered):              0.958
Method:                    Least Squares   F-statistic:                              614.7
Date:                   Wed, 22 Jun 2022   Prob (F-statistic):                        0.00
Time:                           12:50:08   Log-Likelihood:                         -802.82
No. Observations:                    623   AIC:                                      1652.
Df Residuals:                        600   BIC:                                      1754.
Df Model:                             23                                                  
Covariance Type:               nonrobust                                                  
                      coef    std err          t      P>|t|      [0.025      0.975]
------

# Results

Let's interpret what we see.

The first thing to note is Prob (F-statistic). This is the p-value for the joint test that all coefficients are zero, so it tells us whether the model as a whole is statistically significant. It is reported as 0.00, which means the model is significant at any conventional significance level.

We can also see that a majority of our independent variables have a P>|t| (also known as a p-value) below alpha = 0.05, which means those variables are individually significant at the 5% significance level (95% confidence). The variables that don't meet this threshold are 'yr11', 'yr13', 'jan', and 'nov'.

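The next area of interest is our coefficient values. Because the dependent variable is in logs, each year coefficient is measured in log points relative to the omitted year, 2021. We can see that, generally speaking, prices for producers have increased over time. The variable 'yr19' has the largest coefficient among the year variables, at 1.867.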
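Since the dependent variable is a log price, a dummy coefficient b corresponds to a proportional difference of exp(b) - 1 relative to the baseline category, holding the other regressors fixed. The helper below is a minimal sketch; its name is hypothetical, and the small example values 0.05 and -0.10 are illustrative rather than taken from the regression output.

```python
import math

def pct_diff_from_log_coef(b: float) -> float:
    """Percentage difference implied by a dummy coefficient b in a log-price model."""
    return (math.exp(b) - 1.0) * 100.0

# 0.05 and -0.10 are illustrative; 1.867 is the 'yr19' coefficient quoted in the text.
for coef in (0.05, -0.10, 1.867):
    print(f"coef = {coef:+.3f} -> {pct_diff_from_log_coef(coef):+.1f}%")
```

For small coefficients the percentage is close to 100 times the coefficient; for large ones the two diverge, so the exact conversion matters.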

The coefficient on 'ln(utilization)' also suggests that an increase in the utilization ratio is associated with a decrease in price, on average.
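Because both price and utilization enter the model in logs, the utilization coefficient can be read as an elasticity. As a short sketch, write P for price, U for the utilization ratio, and β for the coefficient on 'ln(utilization)' (symbols introduced here for illustration), holding the dummy variables fixed:

```latex
\ln P = \beta \ln U + \dots
\quad\Longrightarrow\quad
\beta = \frac{\partial \ln P}{\partial \ln U}
      = \frac{\partial P / P}{\partial U / U}
      \approx \frac{\%\Delta P}{\%\Delta U}
```

So a 1% increase in utilization is associated with roughly a β% change in price, and a negative β corresponds to a decrease.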

We can also see variation in price based on the season: coefficient values for certain months are higher than for others. If we have some knowledge of demand for hogs, we know that prices tend to be lower for producers during the summer months, when demand increases. Our model supports this hypothesis, as the coefficients for June, July, and August are the lowest among our month variables.