# Project proposal: Stock-price prediction

Stock market prediction has attracted much attention from both academia and business. Because
the market is non-linear, volatile and complex, it is difficult to predict. As
stock markets grow, more investors are interested in developing a systematic approach
to predicting them. Because the stock market is very sensitive to external information,
prediction systems that consider only traditional stock data have been limited in
performance. New forms of collective intelligence have emerged with the rise of the Internet (e.g.
Google Trends and Wikipedia). Activity on these platforms can significantly affect the
stock market. In addition, both the sentiment and the volume of financial news are believed to have
an impact on stock prices.

In this project, you should develop and evaluate a prediction model that can be used to predict
the short-term movement and price of stocks. Besides historical data taken directly from the stock
market, some external data sources should also be considered as inputs to the model.

In summary, the contributions of this work are listed below:

1. Identify the potential factors (features) that could affect the stock market and acquire the
data from disparate data sources:

    (a) publicly available market information on stocks from sources such as Yahoo Finance, including opening/closing prices, trade volume, etc.;

    (b) commonly used technical indicators from Yahoo Finance that reflect price variation over time, such as the Stochastic Oscillator (%K), the Larry Williams (LW) %R indicator and the Relative Strength Index (RSI);

    (c) daily Google Trends counts for the stocks of interest;

    (d) the number of unique visitors for pertinent Wikipedia pages per day.

2. Use a variable selection method such as PCA, the correlation coefficient or any other suitable method to select the most important features.

3. Use Artificial Neural Networks and the most important features from the disparate data sources to build the forecasting model.

4. The proposed model should allow investors to predict the next-day closing price, opening price or both for a particular stock or index.

5. Evaluate the prediction model using different metrics (e.g. MSE, MAPE) and provide decision-making suggestions for investors.
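
The two metrics named in item 5 can be sketched as follows. This is a minimal illustration, not a prescribed evaluation procedure: the function names `mse` and `mape` and the sample prices are made up for this example, and MAPE as written assumes that no true value is zero.

```python
import numpy as np


def mse(y_true, y_pred):
    """Mean squared error between true and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((y_true - y_pred) ** 2))


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent.

    Assumes y_true contains no zeros (prices are strictly positive).
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs((y_true - y_pred) / y_true)) * 100)


# Made-up closing prices, for illustration only
actual = [100.0, 102.0, 101.0, 105.0]
predicted = [101.0, 101.0, 103.0, 104.0]

print("MSE :", mse(actual, predicted))
print("MAPE:", mape(actual, predicted))
```

MSE is expressed in squared price units and penalizes large errors heavily, while MAPE is scale-free, which makes it easier to compare across stocks with different price levels.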

Note 1: You can use any stock, such as AAPL for Apple or GOOG for Google.

Note 2: Use the best practices you learned in the course to solve the problem.
For more information about this project, refer to Chapter 2 of the
following paper: Application of machine learning techniques for stock market
prediction, Bin Weng, 2017.
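
Item 2 of the list above names the correlation coefficient as one possible filter. A minimal sketch of that idea is shown here, assuming the candidate features sit in a pandas DataFrame and the target in a Series with the same index. The function name `select_by_correlation`, the threshold of 0.3, the column names and the synthetic data are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd


def select_by_correlation(features, target, threshold=0.3):
    """Return the feature names whose absolute Pearson correlation with
    the target is at least `threshold`, strongest first."""
    corr = features.corrwith(target).abs().dropna()
    selected = corr[corr >= threshold].sort_values(ascending=False)
    return list(selected.index)


# Synthetic data: one informative feature, one weaker one, one pure noise
rng = np.random.default_rng(0)
n = 200
signal = rng.normal(size=n)
features = pd.DataFrame({
    "volume_change": signal + rng.normal(scale=0.5, size=n),
    "trend_count": 0.5 * signal + rng.normal(scale=1.0, size=n),
    "noise": rng.normal(size=n),
})
target = pd.Series(signal + rng.normal(scale=0.2, size=n))

print(select_by_correlation(features, target, threshold=0.3))
```

A correlation filter only captures linear, one-feature-at-a-time relationships, so it is a cheap first pass rather than a replacement for PCA or a model-based selection method.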


# Factors 
Which variables must be included in the model?

1. price at market open and close, trade volume, etc.
2. technical indicators: Stochastic Oscillator, Williams %R, Relative Strength Index ...
simple moving average?
3. daily count of Google Trends on the stock of interest
4. the number of unique visitors for pertinent Wikipedia pages per day 
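
The technical indicators in item 2 can be derived from daily high, low and close prices. The sketch below uses their common textbook definitions with a rolling window, which is an assumption; in particular, the RSI here averages gains and losses with a simple moving average, whereas Wilder's original formulation uses a smoothed average. The function names and the toy prices are illustrative only.

```python
import pandas as pd


def stochastic_k(high, low, close, window=14):
    """Stochastic Oscillator %K: position of the close within the
    high-low range of the last `window` days, scaled to 0..100.
    Yields NaN or inf when the range is zero."""
    lowest = low.rolling(window).min()
    highest = high.rolling(window).max()
    return 100 * (close - lowest) / (highest - lowest)


def williams_r(high, low, close, window=14):
    """Williams %R: distance of the close from the highest high of the
    last `window` days, scaled to -100..0."""
    lowest = low.rolling(window).min()
    highest = high.rolling(window).max()
    return -100 * (highest - close) / (highest - lowest)


def rsi(close, window=14):
    """Relative Strength Index using simple rolling means of gains and losses."""
    delta = close.diff()
    gain = delta.clip(lower=0).rolling(window).mean()
    loss = (-delta.clip(upper=0)).rolling(window).mean()
    rs = gain / loss
    return 100 - 100 / (1 + rs)


# Toy prices, for illustration only
close = pd.Series([10.0, 11.0, 12.0, 11.0, 13.0])
high = close + 1
low = close - 1

k = stochastic_k(high, low, close, window=3)
r = williams_r(high, low, close, window=3)
strength = rsi(close, window=3)
print(pd.DataFrame({"%K": k, "%R": r, "RSI": strength}))
```

With real data, the `High`, `Low` and `Close` columns of a `yfinance` price history can be passed in directly. The first `window - 1` rows (and one more for RSI) are NaN because the rolling window is not yet full.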




In [113]:
# import guide https://colab.research.google.com/notebooks/snippets/importing_libraries.ipynb

!pip install yfinance
!pip install pytrends --upgrade
!pip install --upgrade --user git+https://github.com/GeneralMills/pytrends



# Libraries
yfinance: retrieves stock data from Yahoo Finance.


In [114]:
import yfinance as yf  # or just pandas-datareader?
import pandas as pd
import matplotlib.pyplot as plt
import scipy
import datetime
from datetime import date
import sklearn
from pytrends.request import TrendReq
# Connect to Google Trends directly. The previously hard-coded proxy
# (34.203.233.13:80) raised ConnectTimeout after the 10 s connect timeout,
# and requests_args={'verify': False} disabled TLS verification.
pytrends = TrendReq(timeout=(10, 25), retries=2, backoff_factor=0.1)

Enter desired ticker

In [None]:
# Yahoo finance - get data
TICKER = 'AAPL'

stock = yf.Ticker(TICKER)
stock.info

In [None]:
# Fetch the daily price history once and reuse it
history = stock.history(interval='1d', start='2000-01-01', end='2002-01-01')
stock_open_price = history['Open']

# Set up the figure, then plot the opening price
plt.figure(figsize=(50, 10))
plt.plot(stock_open_price.index, stock_open_price.values)

# Leave 10% headroom above the highest open/close price
# (call plt.ylim; assigning to it would overwrite the function)
plt.ylim(top=history[['Open', 'Close']].values.max() * 1.1)

In [None]:
help(sklearn)

In [None]:
def get_company_name(comp_name):
    """Return the first word of a company name given as a list of words.

    Always returns a string, so the result can be used as a search keyword.
    """
    return comp_name[0]


# Google trends data
keyword_list = [stock.info['shortName'], stock.info['symbol'], get_company_name(stock.info['shortName'].split(' '))]

#pytrend.build_payload(kw_list=[keyword_list])
from pytrends import dailydata
from datetime import datetime as dt

# Earliest date with price data, used as the start of the trends query
ipo_date = stock.history(period='max').index.min()
today = dt.today()

# Hourly search interest for the keywords from the first price date until today
trend = pytrends.get_historical_interest(
    keyword_list,
    year_start=ipo_date.year, month_start=ipo_date.month,
    day_start=ipo_date.day, hour_start=0,
    year_end=today.year, month_end=today.month,
    day_end=today.day, hour_end=0,
    cat=0, geo='', gprop='', sleep=10,
)

#trend = pytrends.build_payload(keyword_list, cat=0, timeframe='today 5-y', geo='', gprop='')

# trend = dailydata.get_daily_data(keyword_list[0], ipo_date.year, ipo_date.month, date.today().year, date.today().month, wait_time=1000)
# plt.figure(figsize=(50,10))
# plt.plot(trend)
# trend.resample('D').mean()
trend
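
Once the price history and the trends data are available, they need to be combined into one table with a next-day target, as the proposal asks for next-day price prediction. The sketch below is one possible way to do that, under stated assumptions: both inputs are indexed by timezone-naive daily dates (hourly trends would first need something like `resample('D').mean()`), missing external values are carried forward from the last observation, and the names `build_dataset` and `target_next` are invented for this example. The sample values are made up.

```python
import pandas as pd


def build_dataset(prices, external, target_col="Close"):
    """Join daily prices with daily external data and add a next-day target.

    Assumes both frames are indexed by timezone-naive daily dates.
    """
    data = prices.join(external, how="left")
    # Carry the last known external value over days without an observation
    data[external.columns] = data[external.columns].ffill()
    # Target: the price on the following trading day
    data["target_next"] = data[target_col].shift(-1)
    # Drop rows without a target (the last day) or without external data
    return data.dropna()


# Made-up prices for five business days (Mon 2021-01-04 to Fri 2021-01-08)
dates = pd.date_range("2021-01-04", periods=5, freq="B")
prices = pd.DataFrame(
    {"Open": [9.5, 10.5, 11.5, 12.5, 13.5],
     "Close": [10.0, 11.0, 12.0, 13.0, 14.0]},
    index=dates,
)
# Made-up search interest with two missing days
external = pd.DataFrame(
    {"AAPL": [50, 55, 60]},
    index=pd.to_datetime(["2021-01-04", "2021-01-05", "2021-01-07"]),
)

dataset = build_dataset(prices, external)
print(dataset)
```

Shifting the target by one row keeps every feature strictly in the past relative to the value being predicted, which avoids leaking next-day information into the model inputs.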