# Analysis of Intel Stock Prices

### Alan Elbert

## 1. Introduction 

Predicting a stock's price is a multibillion-dollar industry, with numerous hedge funds investing massive sums of money to determine what will happen to a stock's price. Thousands, if not millions, of factors affect the price of a stock, including the overall performance of the economy, public sentiment, commodity prices, geopolitical events, and much more.

The overall objective of this project is to analyze the price of Intel stock, specifically in comparison to commodity price indices and the performance of other stocks related to Intel, mainly its competitors and the companies that Intel does business with (its suppliers and customers). Although this will come nowhere near the scale of analysis done at larger firms with more resources, I hope that it will reveal at least a few interesting trends that a budding trader might consider when deciding whether to buy or sell Intel stock.



In [28]:

import math
import urllib
from datetime import date

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import requests
import statsmodels.api as sm
import statsmodels.api as sma  # same module under a second alias
from scipy import stats
from sklearn import datasets, linear_model
from sklearn.metrics import mean_squared_error, r2_score

# Monthly INTC price history, downloaded from
# https://finance.yahoo.com/quote/intc/history/
n_data = pd.read_csv("INTC.csv")
n_data




Unnamed: 0,Date,Open,High,Low,Close,Adj Close,Volume
0,2002-05-01,28.570000,31.360001,25.559999,27.620001,16.863897,976795600
1,2002-06-01,27.430000,28.200001,17.450001,18.270000,11.163094,1334010600
2,2002-07-01,18.350000,19.879999,16.260000,18.790001,11.480818,1507029200
3,2002-08-01,18.719999,19.670000,15.820000,16.670000,10.185485,1182004600
4,2002-09-01,16.469999,17.040001,13.670000,13.890000,8.497054,1253168200
...,...,...,...,...,...,...,...
236,2022-01-01,51.650002,56.279999,46.299999,48.820000,48.450916,890542800
237,2022-02-01,48.779999,49.970001,43.630001,47.700001,47.339386,756814900
238,2022-03-01,47.540001,52.509998,44.070000,49.560001,49.560001,849288700
239,2022-04-01,49.830002,49.900002,43.500000,43.590000,43.590000,639357900


Based on the above data set, note how 

In [29]:


# Intra-month change from Open to Close, in percent.
n_data["percent_change"] = ((n_data["Close"] - n_data["Open"]) / n_data["Open"]) * 100.0

# Convert the ISO date strings to datetime.date objects.
n_data["Date"] = n_data["Date"].map(lambda s: date.fromisoformat(str(s)))


n_data = n_data.drop([239, 240])
n_data



Unnamed: 0,Date,Open,High,Low,Close,Adj Close,Volume,percent_change
0,2002-05-01,28.570000,31.360001,25.559999,27.620001,16.863897,976795600,-3.325163
1,2002-06-01,27.430000,28.200001,17.450001,18.270000,11.163094,1334010600,-33.394094
2,2002-07-01,18.350000,19.879999,16.260000,18.790001,11.480818,1507029200,2.397826
3,2002-08-01,18.719999,19.670000,15.820000,16.670000,10.185485,1182004600,-10.950850
4,2002-09-01,16.469999,17.040001,13.670000,13.890000,8.497054,1253168200,-15.664840
...,...,...,...,...,...,...,...,...
234,2021-11-01,49.400002,51.990002,48.119999,49.200001,48.490829,617076700,-0.404860
235,2021-12-01,49.840000,55.000000,48.330002,51.500000,51.110657,714651800,3.330658
236,2022-01-01,51.650002,56.279999,46.299999,48.820000,48.450916,890542800,-5.479190
237,2022-02-01,48.779999,49.970001,43.630001,47.700001,47.339386,756814900,-2.214018
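
As a sanity check on the `percent_change` column, the same calculation can be reproduced on a couple of rows. The sketch below is illustrative rather than part of the notebook itself: it rebuilds the first two INTC rows shown above as a small in-memory frame (the name `sample` is hypothetical) and applies the same formula, (Close - Open) / Open × 100.

```python
from datetime import date

import pandas as pd

# Two rows copied from the INTC output above; `sample` is a hypothetical name.
sample = pd.DataFrame(
    {
        "Date": ["2002-05-01", "2002-06-01"],
        "Open": [28.570000, 27.430000],
        "Close": [27.620001, 18.270000],
    }
)

# Same formula as the notebook: change from the month's open to its close, in percent.
sample["percent_change"] = (sample["Close"] - sample["Open"]) / sample["Open"] * 100.0

# Convert the ISO date strings to datetime.date objects in one pass.
sample["Date"] = sample["Date"].map(date.fromisoformat)

print(sample)
```

Note that this measures the move within each month, from open to close, rather than the change from the previous month's close.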


In [30]:

# Load the TSM and AMD monthly prices and derive the same columns as for INTC.
t_data = pd.read_csv("TSM.csv")
t_data["percent_change"] = ((t_data["Close"] - t_data["Open"]) / t_data["Open"]) * 100.0
t_data["Date"] = t_data["Date"].map(lambda s: date.fromisoformat(str(s)))

a_data = pd.read_csv("AMD.csv")
a_data["percent_change"] = ((a_data["Close"] - a_data["Open"]) / a_data["Open"]) * 100.0
a_data["Date"] = a_data["Date"].map(lambda s: date.fromisoformat(str(s)))

# Drop the first row of each frame; the AMD output then starts at 2002-05-01, like INTC.
t_data = t_data.drop([0])
a_data = a_data.drop([0])

a_data

Unnamed: 0,Date,Open,High,Low,Close,Adj Close,Volume,percent_change
1,2002-05-01,11.350000,12.950000,10.400000,11.430000,11.430000,100840400,0.704846
2,2002-06-01,11.400000,11.410000,7.950000,9.720000,9.720000,137478900,-14.736842
3,2002-07-01,9.600000,10.300000,7.460000,8.030000,8.030000,111618400,-16.354167
4,2002-08-01,8.030000,10.880000,7.010000,8.850000,8.850000,110296000,10.211706
5,2002-09-01,8.600000,8.690000,5.200000,5.340000,5.340000,118764300,-37.906977
...,...,...,...,...,...,...,...,...
235,2021-11-01,119.449997,164.460007,118.129997,158.369995,158.369995,1373609400,32.582670
236,2021-12-01,160.369995,160.880005,130.600006,143.899994,143.899994,1175493900,-10.270002
237,2022-01-01,145.139999,152.419998,99.349998,114.250000,114.250000,1638612600,-21.282899
238,2022-02-01,116.750000,132.960007,104.260002,123.339996,123.339996,2293656000,5.644536
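
The same load-and-derive steps are applied to INTC, TSM, and AMD, so they could be collected in one function. The following is a minimal sketch under that assumption: `load_monthly_prices` is a hypothetical helper, not something the notebook defines, and the demo reads two AMD rows from an in-memory string instead of `AMD.csv`.

```python
import io
from datetime import date

import pandas as pd


def load_monthly_prices(source):
    """Hypothetical helper: read a monthly price CSV and add percent_change."""
    frame = pd.read_csv(source)
    # Intra-month change from Open to Close, in percent.
    frame["percent_change"] = (frame["Close"] - frame["Open"]) / frame["Open"] * 100.0
    # Store dates as datetime.date objects, as the notebook's loops do.
    frame["Date"] = frame["Date"].map(lambda s: date.fromisoformat(str(s)))
    return frame


# Demo with two AMD rows from the output above, fed from memory instead of a file.
demo_csv = io.StringIO(
    "Date,Open,High,Low,Close,Adj Close,Volume\n"
    "2002-05-01,11.35,12.95,10.40,11.43,11.43,100840400\n"
    "2002-06-01,11.40,11.41,7.95,9.72,9.72,137478900\n"
)
demo = load_monthly_prices(demo_csv)
print(demo[["Date", "Open", "Close", "percent_change"]])
```

With such a helper, each load would reduce to a call like `load_monthly_prices("INTC.csv")`; the row drops that align the date ranges would still be needed afterwards.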


## 2. Commodity Price Data

### Acquisition

Compared with acquiring the stock data from Yahoo Finance, acquiring price data for several different commodities is as simple as downloading a single file.


In [31]:
tcomm = pd.read_excel('https://www.imf.org/-/media/Files/Research/CommodityPrices/Monthly/external-data-indices-onlyapril.ashx')
tcomm.head()

Unnamed: 0,Commodity,PALLFNF,PEXGALL,PNFUEL,PFANDB,PFOOD,PBEVE,PINDU,PAGRI,PRAWM,PALLMETA,PMETA,PPMETA,PEXGMETA,PFERT,PNRG,POILAPSP,PNGAS,PCOAL
0,Commodity.Description,"All Commodity Price Index, 2016 = 100, include...","Commodities for Index: All, excluding Gold, 20...","Non-Fuel Price Index, 2016 = 100, includes Pre...","Food and Beverage Price Index, 2016 = 100, inc...","Food Price Index, 2016 = 100, includes Cereal,...","Beverage Price Index, 2016 = 100, includes Cof...","Industrial Inputs Price Index, 2016 = 100, inc...","Agriculture Price Index, 2016 = 100, includes ...","Agricultural Raw Materials Index, 2016 = 100, ...","All Metals Index, 2016 = 100: includes Metal P...","Base Metals Price Index, 2016 = 100, includes ...","Precious Metals Price Index, 2016 = 100, inclu...","All Metals EX GOLD Index, 2016 = 100: includes...","Fertilizer Index, 2016 = 100, includes DAP, Po...","Fuel (Energy) Index, 2016 = 100, includes Crud...","Crude Oil (petroleum), Price index, 2016 = 100...","Natural Gas Price Index, 2016 = 100, includes ...","Coal Price Index, 2016 = 100, includes Austral..."
1,Data Type,Index,Index,Index,Index,Index,Index,Index,Index,Index,Index,Index,Index,Index,Index,Index,Index,Index,Index
2,Frequency,Monthly,Monthly,Monthly,Monthly,Monthly,Monthly,Monthly,Monthly,Monthly,Monthly,Monthly,Monthly,Monthly,Monthly,Monthly,Monthly,Monthly,Monthly
3,1990M1,,,,,,49.20388,,,72.855085,,,33.097764,,,,,,51.172528
4,1990M2,,,,,,49.947777,,,73.449658,,,33.701146,,,,,,51.172528


### Data Processing


In keeping with the rationale for computing percent change in the stock data, we want to do the same for commodity prices, and also to clean up the data so that it lines up with the stock data. To know what must be cleaned up, however, we must first look at the data.

Looking at the above data, you will notice several glaring issues. First, the first three rows (the description, data type, and frequency) carry no price data and must be dropped from the dataframe. Second, notice the format of the Commodity column: it holds month labels such as `1990M1` as plain strings. To match the commodity data with the stock data for analysis, these strings must be replaced with `date` objects. Furthermore, notice the columns containing NaN values. There are several ways of dealing with them, such as keeping columns 
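
The month labels in the Commodity column can be converted to `date` objects in a few lines. The sketch below assumes the labels always follow the pattern seen in the output above, a four-digit year, the letter `M`, then the month number (so October 2002 would be `2002M10`). `parse_imf_month` is a hypothetical helper name used only for illustration.

```python
from datetime import date


def parse_imf_month(label):
    """Hypothetical helper: turn 'YYYYMm' (e.g. '1990M1', '2002M10') into a date."""
    year_part, month_part = label.split("M")
    # Use the first day of the month, matching the stock data's Date values.
    return date(int(year_part), int(month_part), 1)


print(parse_imf_month("1990M1"))   # 1990-01-01
print(parse_imf_month("2002M10"))  # 2002-10-01
```

Splitting on `M` avoids having to pad single-digit months by hand.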

In [32]:
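from datetime import date

# Drop the three metadata rows (description, data type, frequency).
tcomm = tcomm.drop([0, 1, 2])

# Index columns for which percent change will be computed.
tpchange = ['PALLFNF', 'PEXGALL', 'PNFUEL', 'PFANDB', 'PFOOD', 'PBEVE',
    'PINDU', 'PAGRI', 'PRAWM', 'PALLMETA', 'PMETA', 'PPMETA', 'PEXGMETA',
    'PFERT', 'PNRG', 'POILAPSP', 'PNGAS', 'PCOAL']

# Keep only the date column and the index columns.
tcomm.drop(columns=tcomm.columns.difference(['Commodity'] + tpchange), inplace=True)

# Convert labels such as "1990M1" or "1990M10" to datetime.date objects.
for i, row in tcomm.iterrows():
    datestr = row['Commodity']

    yearstr = datestr[0:4]
    mstr = datestr[-2:]             # "M1" for January, "10" for October
    mstr = mstr.replace("M", "0")   # "M1" -> "01"

    isostr = yearstr + "-" + mstr + "-01"

    tcomm.at[i, 'Commodity'] = date.fromisoformat(isostr)

# Keep data from April 2002 onward: one month before the stock data starts,
# so the first percent change (May 2002) has a base value.
tcomm = tcomm[tcomm['Commodity'] >= date.fromisoformat('2002-04-01')]

# Make sure every index value is a float.
for i, row in tcomm.iterrows():
    for n in tpchange:
        tcomm.at[i, n] = float(row[n])

# commdata is tcomm shifted forward by one month: after dropping its first row
# (label 150, April 2002) and resetting both indexes, row i of commdata is the
# month following row i of tcomm.
commdata = tcomm.copy(deep=True)
commdata = commdata.drop([150])

tcomm = tcomm.reset_index(drop=True)
commdata = commdata.reset_index(drop=True)

# Month-over-month percent change for every index.
for i, row in commdata.iterrows():
    for n in tpchange:
        commdata.at[i, n + "_PCHANGE"] = 100 * ((commdata.at[i, n] - tcomm.at[i, n]) / tcomm.at[i, n])

# Keep only the date and the percent-change columns.
commdata.drop(columns=tpchange, inplace=True)

commdata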



Unnamed: 0,Commodity,PALLFNF_PCHANGE,PEXGALL_PCHANGE,PNFUEL_PCHANGE,PFANDB_PCHANGE,PFOOD_PCHANGE,PBEVE_PCHANGE,PINDU_PCHANGE,PAGRI_PCHANGE,PRAWM_PCHANGE,PALLMETA_PCHANGE,PMETA_PCHANGE,PPMETA_PCHANGE,PEXGMETA_PCHANGE,PFERT_PCHANGE,PNRG_PCHANGE,POILAPSP_PCHANGE,PNGAS_PCHANGE,PCOAL_PCHANGE
0,2002-05-01,,,,-1.365817,-1.164610,-4.766974,-0.545688,-1.104417,0.609014,0.446983,-1.233882,3.025502,-1.163527,,1.244313,-0.611939,3.208914,-2.386656
1,2002-06-01,,,,-0.923893,-0.968423,-0.142708,3.302399,-0.108400,5.132082,2.088413,2.191574,1.936703,2.070231,,-4.642933,-6.353787,-3.861909,-6.126784
2,2002-07-01,,,,3.340158,3.273896,4.492992,0.970699,3.489521,4.394058,-1.731508,-1.167473,-2.563065,-1.376520,,3.633266,2.768587,2.483602,-6.666984
3,2002-08-01,,,,-0.182643,-0.278322,1.462563,-1.692926,0.143315,2.097384,-2.898913,-4.193506,-0.962961,-3.838795,,1.629957,4.681271,-5.000659,-5.057825
4,2002-09-01,,,,1.502661,0.927666,11.220133,1.775284,1.920696,4.370782,1.014690,-0.049478,2.554153,0.096096,,6.664290,5.577429,8.432123,5.180339
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
234,2021-11-01,-4.442645,-5.002359,-0.418341,1.073580,0.929958,2.882470,-6.174645,1.300011,3.071831,-3.669579,-8.293148,2.354686,-7.323527,11.135439,-7.777791,-2.035268,-8.665011,-30.316569
235,2021-12-01,1.688678,1.998493,1.699513,2.135417,2.133783,2.155608,4.356264,1.674886,-1.858888,1.932890,5.956717,-2.764536,4.429774,0.112229,1.678983,-8.248421,21.006837,0.256197
236,2022-01-01,2.230122,2.276082,3.991393,3.528239,3.848250,-0.424908,8.441083,3.466084,2.969737,6.506196,9.746074,2.384716,9.500993,-10.956217,0.653667,14.839650,-22.608796,27.374485
237,2022-02-01,6.423653,6.779986,4.113165,4.723593,4.989165,1.302151,3.423100,4.438266,2.147407,3.409336,3.708583,3.001292,4.120920,7.195349,8.560272,11.110886,0.199381,18.989586
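
The `_PCHANGE` columns above hold month-over-month changes: each month's index value relative to the previous month's value. As a hedged alternative to pairing two offset copies of the frame, the same quantity can be computed with pandas' `Series.pct_change`. The sketch below uses made-up index values, not IMF data, and the name `toy` is hypothetical.

```python
import pandas as pd

# Toy index values, not IMF data; `toy` is a hypothetical name.
toy = pd.DataFrame({"PFOOD": [100.0, 110.0, 99.0]})

# pct_change() compares each row with the previous one; the first row has no
# predecessor and becomes NaN, so it is dropped afterwards.
toy["PFOOD_PCHANGE"] = toy["PFOOD"].pct_change() * 100.0
changes = toy.dropna(subset=["PFOOD_PCHANGE"]).reset_index(drop=True)

print(changes)
```

Dropping the first row here plays the same role as removing the April 2002 row in the cell above: that month only serves as the base for the first change.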
