# Predicting Stock Change With Python

- toc: true
- branch: master
- badges: true
- comments: true
- categories: [Fastpages, Jupyter, Python, Selenium, Stock]
- annotations: true
- hide: false
- search_exclude: true

The stock market is highly sensitive and turbulent. During the recent crisis between Russia and Ukraine, for example, a few comments from Putin or another powerful figure could lead millions of people to lose or make money in a matter of minutes. An essential skill for modern investors is the ability to foresee trends in order to protect their investments and maximize their profits. In this blog, we are going to introduce three basic functions that help you achieve the following goals.


First, we use [selenium](https://pypi.org/project/selenium/) to scrape stock data from [Yahoo finance](https://finance.yahoo.com/screener/predefined/aggressive_small_caps?offset=0&count=202) based on the filter we set up, such as Aggressive Small Caps. Here is a [YouTube selenium tutorial](https://www.youtube.com/watch?v=Xjv1sY630Uc) that I recommend for setting up selenium. You can also use the selenium Python tutorial as a more detailed reference. We then get the historical data of the most active stock. Second, we predict the stock's trend. Finally, we send the predictions and the latest change in the stock to our receiver by email.

![Stock Image](https://miro.medium.com/max/1400/0*oWZiknkZ2GzGBkIJ)

## Set up the environment

Below are the packages needed to run the code.

In [None]:
# Standard library
import os
import re
import smtplib
from datetime import datetime, date, time, timezone
from getpass import getpass

# Data handling
import numpy as np
import pandas as pd

# For web scraping
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By

# For prediction
from sklearn import preprocessing
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_validate
from sklearn.model_selection import train_test_split

# For stock data
from iexfinance.stocks import get_historical_data
from iexfinance.stocks import Stock

## Do not have enough stock data?

### Method one: scraping the stock data from Yahoo for free!

After following the tutorial mentioned at the beginning of this blog, we can create our Chrome driver and use driver.get(url) to navigate to our desired webpage: [Yahoo finance](https://finance.yahoo.com/screener/predefined/aggressive_small_caps?offset=0&count=202), which displays the top 25 most active stocks by default. You can also change the filter based on what you are looking for. Inside webdriver.Chrome() you will need to pass the path to your chromedriver, wrapped in Service().
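
To change the filter, one option is to build the screener URL from its parts. The sketch below assumes that Yahoo's predefined screeners all follow the same URL pattern as the link above; the helper name `build_screener_url` and the `most_actives` screener name are illustrative assumptions, not something the original code uses.

```python
from urllib.parse import urlencode

BASE_URL = "https://finance.yahoo.com/screener/predefined/"

def build_screener_url(screener="aggressive_small_caps", offset=0, count=202):
    """Return a screener URL; assumes every predefined screener uses this pattern."""
    query = urlencode({"offset": offset, "count": count})
    return BASE_URL + screener + "?" + query

# The default reproduces the URL used in this blog
print(build_screener_url())
# Hypothetical example of another filter with fewer rows
print(build_screener_url("most_actives", count=25))
```

The resulting string can be passed to driver.get(url) in place of the hard-coded address.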

After connecting to Yahoo with the webdriver, we can use a nested for loop to crawl the whole table on the website. In detail, "j" is the row index (how many rows we read) and "i" is the column index (which column we read).
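
The indexing scheme can be tried without opening a browser. The sketch below only builds the XPath strings and shows how the per-column lists can be turned into per-row records; `cell_xpath` and `columns_to_rows` are hypothetical helper names, and the sample values are made up for illustration.

```python
def cell_xpath(row, col):
    """XPath of one table cell; row and col are 1-based, as in XPath."""
    return ('//*[@id = "scr-res-table"]/div[1]/table/tbody/tr['
            + str(row) + ']/td[' + str(col) + ']')

def columns_to_rows(columns):
    """Turn a list of column lists into a list of row tuples."""
    return list(zip(*columns))

# Made-up sample: 2 columns (symbol, price) with 3 rows each
columns = [["AAA", "BBB", "CCC"], ["1.10", "2.20", "3.30"]]
rows = columns_to_rows(columns)

print(cell_xpath(2, 1))  # second row, first column
print(rows[0])           # first row as (symbol, price)
```

Storing one list per column matches the loop order used for scraping, while the row view is handier when you want to look at one stock at a time.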

In [None]:
def getStocks(n):
    # Navigate to the Yahoo stock screener (replace the path with your own chromedriver location)
    driver = webdriver.Chrome(service=Service(
        '/Users/zeyu/Desktop/chromedriver'))
    url = "https://finance.yahoo.com/screener/predefined/aggressive_small_caps?offset=0&count=202"
    driver.get(url)

    # Collect the first n rows of the screener table, one list per column
    data = [[] for _ in range(9)]
    try:
        for i in range(1, len(data) + 1):
            for j in range(1, n + 1):
                cell = driver.find_element(By.XPATH,
                    '//*[@id = "scr-res-table"]/div[1]/table/tbody/tr[' + str(j) + ']/td[' + str(i) + ']')
                data[i - 1].append(cell.text)
    finally:
        # Close the browser even if a cell lookup fails
        driver.quit()

    # Predict the future price of each scraped stock a specified number of days ahead.
    # The first column of the table is assumed to hold the ticker symbols.
    for number, ticker in enumerate(data[0]):
        print("Number: " + str(number))
        try:
            predictData(ticker, 5)
        except Exception:
            print("Stock: " + ticker + " was not predicted")

if __name__ == '__main__':
    getStocks(20)

### Method two: getting historical stock data from IEX Cloud at a cost


Before anything else, you need to visit [iexcloud](https://iexcloud.io/) to create an account and get your own API token. The free version offers only very limited access.

Now we can use the get_historical_data() function to get the cleaned-up dataset we want. It takes a few parameters. First, enter the stock symbol, such as AAPL. Then set the start and end dates, the output format (we will use pandas in this project), and finally the token, which is the API token you acquired from the iexcloud website.
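
Typing the token at every run can get tedious. A minimal sketch of one alternative, assuming you have stored the token in an environment variable named `IEX_TOKEN` (the helper `load_token` and the fake token value are hypothetical), is to look there first and only prompt when it is missing:

```python
import os
from getpass import getpass

def load_token(env=os.environ, prompt=getpass):
    """Return the API token from the IEX_TOKEN variable, or ask for it."""
    token = env.get("IEX_TOKEN")
    if token:
        return token
    return prompt("Please enter your API token: ")

# Demonstration with a fake environment, so nothing is read from your machine
print(load_token(env={"IEX_TOKEN": "pk_example"}))
```

The returned value can then be passed as the token argument of get_historical_data().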

In [None]:
# Download one year of historical data (it can be saved to a .csv for later use)
stock_symbol = "SPOT"  # example ticker; use any symbol you like, e.g. "AAPL"
start = datetime(2021, 2, 17)
end = datetime(2022, 2, 16)
API = getpass("Please enter your API token")

df = get_historical_data(stock_symbol, start=start, end=end, output_format='pandas', token=API)

In [None]:
#hide
df = pd.read_csv('/Users/zeyu/Desktop/DS/Stock/Selenium/SPOT_df.csv')
df = df.drop(['subkey'], axis = 1)

pd.set_option('display.max_columns', None)
df = df.rename(columns = {"Unnamed: 0" : "date"})
df.head()

In [None]:
#hide
container = []
for i in range(len(df.date)):
    x = re.sub(("-"), "", df.date[i])
    container.append(x)

df.date = container
df = df.drop(['symbol', "id", "key", "label"], axis = 1)

## Predict the future stock price!

In the following lines, we first create a target column called "prediction": for each day it holds the next day's closing price, so each day's close is the prediction target of the previous day. Second, two datasets are created: X holds the predictor variables and Y holds the target variable. The preprocessing package is used to normalize the predictor variables. We then split the data into train and test sets for both X and Y, fit a linear regression on the training data, and predict the closing price for X_prediction, which consists of the last rows of X.
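
To make the first two steps concrete, here is a small pure-Python sketch on made-up closing prices. It mimics what `shift(-1)` followed by `dropna()` does to the target column and what standardizing a column means; the function names and the numbers are illustrative assumptions, and the real code relies on pandas and scikit-learn instead.

```python
from statistics import mean, pstdev

def next_day_targets(closes):
    """Pair each close with the next day's close; the last day has no target and is dropped."""
    return list(zip(closes[:-1], closes[1:]))

def standardize(values):
    """Rescale values to zero mean and unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    return [(v - mu) / sigma for v in values]

closes = [10.0, 12.0, 11.0, 14.0]  # made-up prices
pairs = next_day_targets(closes)
print(pairs)  # each tuple is (today's close, tomorrow's close)

features = [today for today, _ in pairs]
print(standardize(features))
```

Four prices yield only three training pairs, which is why the code drops the final row after shifting.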

In [None]:
def predictData(stock, days):
    # Save a copy of the raw data, creating the Exports folder if needed
    os.makedirs('Exports', exist_ok=True)
    csv_name = 'Exports/' + stock + '_Export.csv'
    df.to_csv(csv_name)

    # Work on a copy so the global df is not modified between calls
    data = df.copy()

    # Target column: the next day's closing price
    data['prediction'] = data['close'].shift(-1)
    data.dropna(inplace=True)
    forecast_time = int(days)

    # Predicting the stock price in the future
    X = np.array(data.drop(['prediction'], axis=1))
    Y = np.array(data['prediction'])

    # Normalize our predictor variables
    X = preprocessing.scale(X)
    print("\nX after preprocessing:", X)
    X_prediction = X[-forecast_time:]

    # Split our data into train and test data
    X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.5)
    print("\n\nX train:", X_train, "\nX_test:", X_test, "\nY_train:", Y_train, "\nY_test", Y_test)

    # Performing the regression on the training data
    clf = LinearRegression()
    clf.fit(X_train, Y_train)

    # Predict the closing price of X_prediction
    prediction = clf.predict(X_prediction)

    last_close = float(data['close'].iloc[-1])
    print("Last row:", data.tail(1))
    print("Last close:", last_close)

    # Sending the SMS if the last predicted price is more than 1 above the previous closing price
    if float(prediction[-1]) > last_close + 1:
        output = ("\n\nStock:" + str(stock) + "\nPrior Close:\n" + str(last_close) + "\n\nPrediction in 1 Day: " + str(
            prediction[0]) + "\nPrediction in " + str(forecast_time) + " Days: " + str(prediction[-1]))
        #sendMessage(output)
        print("This is the output:", output)

# df holds the SPOT data loaded above, so the ticker name is passed as a string
predictData('SPOT', 5)
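
The alert condition at the end of predictData() can also be isolated into a small helper, which makes it easy to try different thresholds. This is only a sketch: the name `should_alert` and the `threshold` parameter are assumptions added for illustration, and the prices are made up.

```python
def should_alert(predictions, last_close, threshold=1.0):
    """True if the final predicted price exceeds the last close by more than threshold."""
    return float(predictions[-1]) > float(last_close) + threshold

print(should_alert([101.2, 101.9, 102.5], last_close=101.0))  # True: 102.5 > 102.0
print(should_alert([101.2, 101.4, 101.8], last_close=101.0))  # False: 101.8 <= 102.0
```

Keeping the rule separate from the model code also makes it simple to check with a few hand-picked numbers before any message is sent.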

## Magic that sends messages through Gmail!

Finally, the last function, sendMessage(text), sends a message through Gmail when certain conditions are triggered, using the [smtplib](https://docs.python.org/3/library/smtplib.html) module. getpass() is used so that the user can enter the password without it being shown on screen, which improves security.
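
As an alternative to formatting the headers by hand, the standard library's `email.message.EmailMessage` can build the message, and `smtplib.SMTP.send_message` can send it. The sketch below is one possible variant rather than the code used in this blog; the helper names, the subject line, and the example addresses are placeholders.

```python
import smtplib
from email.message import EmailMessage

def build_message(sender, receiver, text, subject="Stock alert"):
    """Create an email with proper From/To/Subject headers and a plain-text body."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = receiver
    msg["Subject"] = subject
    msg.set_content(text)
    return msg

def send_via_gmail(msg, username, password):
    """Send msg through Gmail's SMTP server (needs network access and valid credentials)."""
    with smtplib.SMTP("smtp.gmail.com", 587) as server:
        server.starttls()
        server.login(username, password)
        server.send_message(msg)

# Building the message needs no network access, so it is safe to try
msg = build_message("sender@example.com", "receiver@example.com",
                    "Prediction in 1 Day: 102.5")
print(msg["To"])
print(msg.get_content())
```

Only `send_via_gmail` talks to the server, so the message can be inspected before anything is sent.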

In [None]:
def sendMessage(text):
    # If you're using Gmail to send the message, you might need to
    # go into the security settings of your email account and
    # enable the "Allow less secure apps" option, or, where that
    # option is unavailable, sign in with an app password
    username = input("Please enter your gmail address like chrisguan912@gmail.com: ")
    # getpass hides the password while it is typed
    password = getpass("Please enter your password: ")

    vtext = input("Please enter the receiver's phone number and company's domain like 9292649156@txt.att.net: ")

    # The headers and the body (the text argument) are separated by a blank line
    MSG = "From: %s\r\nTo: %s\r\n\r\n%s" % (username, vtext, text)

    server = smtplib.SMTP('smtp.gmail.com', 587)
    server.starttls()
    server.login(username, password)
    server.sendmail(username, vtext, MSG)
    server.quit()

    print('Sent')

## Conclusion

This blog introduced four functions in total. The first two help us obtain valuable data, either by scraping with selenium or by downloading from iexcloud. The third function predicts the stock trend. The last one sends a message through Gmail.