Predicting the stock market is one of the most popular applications of Machine Learning in finance. In this project, we will walk through a simple Data Science project on Stock Price Prediction using Machine Learning with Python.

By the end of this project, we will have learned how to predict stock prices with a Linear Regression model implemented in Python.

### Stock Price Prediction

Predicting the stock market has been the bane and goal of investors since its inception. Every day billions of dollars are traded on the stock exchange, and behind every dollar is an investor hoping to make a profit in one way or another.

Entire companies rise and fall daily depending on market behaviour. The ability to accurately predict market movements would offer an investor a tantalizing promise of wealth and influence.

Today, many people make money by trading in the stock market from home. Combining experience of the stock market with machine learning skills is a natural fit for the task of stock price prediction.

In [1]:
import numpy as np
import pandas as pd
from sklearn import preprocessing
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression

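We can download a dataset from [Yahoo Finance](https://query1.finance.yahoo.com/v7/finance/download/INR=X?period1=1580035828&period2=1611658228&interval=1d&events=history&includeAdjustedClose=true). The link requests one year of daily data for the ticker `INR=X`; save the file as `prices.csv` so that the next cell can read it: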

In [2]:
stock_prices = pd.read_csv("prices.csv")

In [3]:
stock_prices.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 262 entries, 0 to 261
Data columns (total 7 columns):
 #   Column     Non-Null Count  Dtype  
---  ------     --------------  -----  
 0   Date       262 non-null    object 
 1   Open       262 non-null    float64
 2   High       262 non-null    float64
 3   Low        262 non-null    float64
 4   Close      262 non-null    float64
 5   Adj Close  262 non-null    float64
 6   Volume     262 non-null    int64  
dtypes: float64(5), int64(1), object(1)
memory usage: 14.5+ KB
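
The summary above shows that the `Date` column is loaded as a generic `object` (string) column. If the dates are needed later, for example to sort the rows chronologically, one option is to parse them while reading the file. The following is a minimal sketch under that assumption; the two rows in `csv_text` are made-up values in the same column layout, not rows from the downloaded file:

```python
import io

import pandas as pd

# Two made-up rows in the same column layout as the dataset (illustration only)
csv_text = """Date,Open,High,Low,Close,Adj Close,Volume
2020-01-28,71.0,71.5,70.9,71.2,71.2,0
2020-01-27,71.3,71.6,71.1,71.4,71.4,0
"""

# parse_dates converts the Date column to a datetime dtype while reading
df = pd.read_csv(io.StringIO(csv_text), parse_dates=["Date"])

# Sort oldest-first so that "later rows" really are later in time
df = df.sort_values("Date").reset_index(drop=True)

print(df.dtypes)
print(df)
```

With the real file, the same `parse_dates=["Date"]` argument can be passed to `pd.read_csv("prices.csv", ...)`.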


### Data Preparation

Now we will write a function that will prepare the dataset so that we can fit it easily in the Linear Regression model:

In [4]:
def prepare_data(df, forecast_col, forecast_out, test_size):

    # label: the value of forecast_col forecast_out rows ahead; the last forecast_out rows become NaN
    label = df[forecast_col].shift(-forecast_out)
    label.dropna(inplace=True)  # dropping the NaN rows that have no future value
    y = np.array(label)  # converting the Series into an array

    X = np.array(df[[forecast_col]])  # creating the feature array
    X = preprocessing.scale(X)  # standardizing the feature to zero mean and unit variance

    X_lately = X[-forecast_out:]  # most recent rows without a label; used later for the forecast
    X = X[:-forecast_out]  # rows with a label; used for training and testing

    # random train/test split (a single hold-out split, not cross-validation)
    X_train, X_test, Y_train, Y_test = train_test_split(X, y, test_size=test_size, random_state=0)

    response = [X_train, X_test, Y_train, Y_test, X_lately]
    return response
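
To see how the label is lined up with the feature, it can help to run the same `shift` and slicing steps on a tiny series. This is a small sketch with made-up prices and `forecast_out = 2`; the numbers are illustrative and do not come from the dataset:

```python
import numpy as np
import pandas as pd

# Toy closing prices (made-up values, for illustration only)
close = pd.Series([10.0, 11.0, 12.0, 13.0, 14.0, 15.0], name="Close")
forecast_out = 2  # predict the price two rows ahead

# Each row's label is the price forecast_out rows later;
# the last forecast_out rows have no future value and become NaN
label = close.shift(-forecast_out)
y = label.dropna().to_numpy()

X = close.to_numpy().reshape(-1, 1)
X_lately = X[-forecast_out:]  # most recent rows, no known label yet
X = X[:-forecast_out]         # rows that do have a label

for feature, target in zip(X.ravel(), y):
    print(f"price today = {feature}, label = {target}")
print("rows kept for forecasting:", X_lately.ravel())
```

Each remaining row pairs a price with the price two rows later (10 with 12, 11 with 13, and so on), while the last two prices (14 and 15) are held back in `X_lately` because their future values are not known yet.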

Now we need to define the three input variables that the function above expects:

* The column that we want to predict.
* How far ahead we want to predict, measured in rows (trading days) of the dataset.
* The fraction of the data to hold out as the test set.

Now let’s declare all the variables:

In [5]:
forecast_col = 'Close'
forecast_out = 5
test_size = 0.2

### Applying Machine Learning for Stock Price Prediction

Now we will split the data and fit the linear regression model on the training set:

In [6]:
X_train, X_test, Y_train, Y_test, X_lately = prepare_data(
    stock_prices, forecast_col, forecast_out, test_size
)  # calling the function that prepares the data and makes the train/test split

model = LinearRegression() #initializing linear regression model

model.fit(X_train,Y_train) #training the linear regression model

LinearRegression(copy_X=True, fit_intercept=True, n_jobs=None, normalize=False)
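
Note that `prepare_data` scales the feature before splitting and lets `train_test_split` shuffle the rows, so the test set contains days that fall between training days. For time-ordered data, a common alternative is to split chronologically and fit the scaler on the training rows only. The sketch below illustrates that variation, which is not the approach used in this project; it runs on a synthetic random walk rather than the downloaded file, and the names `close`, `split`, `scaler` and `model_ts` are hypothetical:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
# Synthetic random-walk "closing prices" (hypothetical data, not the Yahoo Finance file)
close = 70 + np.cumsum(rng.normal(0, 0.2, size=200))

forecast_out = 5
X = close[:-forecast_out].reshape(-1, 1)  # price on a given day
y = close[forecast_out:]                  # price forecast_out rows later

# Chronological split: the first 80% of rows train, the last 20% test
split = int(len(X) * 0.8)
X_train, X_test = X[:split], X[split:]
y_train, y_test = y[:split], y[split:]

# Fit the scaler on the training rows only, then reuse it on the test rows
scaler = StandardScaler().fit(X_train)
model_ts = LinearRegression().fit(scaler.transform(X_train), y_train)

print(model_ts.score(scaler.transform(X_test), y_test))
```

A score obtained this way is usually a more cautious estimate, because the model is evaluated only on rows that come after everything it was trained on.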

Now let’s score the model on the test set and have a look at the forecasted prices:

In [7]:
score = model.score(X_test, Y_test)  # R^2 score of the model on the test set
prediction = model.predict(X_lately)  # forecast for the most recent rows

response = {}  # creating a dictionary to hold the results
response['test_score']=score
response['forecast_set']=prediction

print(response)

{'test_score': 0.6393027829544529, 'forecast_set': array([73.37040254, 73.12634778, 73.16456803, 73.20017668, 73.10125498])}
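
For a `LinearRegression` model, `score` returns the coefficient of determination, R², on the data passed to it. A value of 1.0 means a perfect fit, and 0.0 means the model does no better than always predicting the mean of the test labels. As a rough sketch of what that number measures, the snippet below recomputes R² by hand on synthetic data (not the stock dataset) and compares it with `model.score`:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
# Synthetic linear data with noise (illustration only)
X = rng.normal(size=(100, 1))
y = 3.0 * X.ravel() + rng.normal(scale=0.5, size=100)

model = LinearRegression().fit(X[:80], y[:80])
X_test, y_test = X[80:], y[80:]

pred = model.predict(X_test)
ss_res = np.sum((y_test - pred) ** 2)           # residual sum of squares
ss_tot = np.sum((y_test - y_test.mean()) ** 2)  # total sum of squares
r2_manual = 1.0 - ss_res / ss_tot

# The two values should agree up to floating-point error
print(r2_manual, model.score(X_test, y_test))
```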


This is how we can predict stock prices with Machine Learning by implementing a Linear Regression model.