# STOCKER - ANALYSIS

#### Installing stocker from PyPI:

In [1]:
import sys
!{sys.executable} -m pip install --upgrade stocker

Requirement already up-to-date: stocker in c:\users\juang\appdata\local\programs\python\python37\lib\site-packages (0.1.2)




#### Importing the `tomorrow` function:

In [2]:
from stocker.predict import tomorrow

Using TensorFlow backend.


#### Defining the stock for the analysis:

In [3]:
stock = 'GOOGL'

#### Let's compare the error obtained with different time periods of data:

In [4]:
error1 = tomorrow(stock, years=1)[1]
error2 = tomorrow(stock, years=2)[1]
error3 = tomorrow(stock, years=3)[1]
print('Error by using 1 year of data:',error1,'%')
print('Error by using 2 years of data:',error2,'%')
print('Error by using 3 years of data:',error3,'%')

Error by using 1 year of data: 0.746 %
Error by using 2 years of data: 0.875 %
Error by using 3 years of data: 1.011 %


#### Apparently it is better to use only the data from the last year. This may be because a model trained only on the most recent behavior reflects current conditions more closely and can therefore make a better prediction.

#### Now let's compare the error obtained with different numbers of days for the input steps:

In [5]:
error1 = tomorrow(stock, steps=1)[1]
error2 = tomorrow(stock, steps=10)[1]
error3 = tomorrow(stock, steps=20)[1]
print('Error by using 1 previous day of data:',error1,'%')
print('Error by using 10 previous days of data:',error2,'%')
print('Error by using 20 previous days of data:',error3,'%')

Error by using 1 previous day of data: 0.757 %
Error by using 10 previous days of data: 4.727 %
Error by using 20 previous days of data: 7.777 %


#### Again, using the most recent data as input is the best option. This doesn't mean that the model ignores the behavior over the entire period of time; only the number of previous days used as input steps changes.
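The two comparisons above follow the same pattern: call `tomorrow` with one argument changed and keep the element at index 1 of the result. A minimal sketch of that pattern as a reusable loop is shown below. The helper name `sweep` and the stand-in `fake_tomorrow` are hypothetical, introduced only so the sketch runs without downloading data or training a model; its numbers are made up and are not stocker results.

```python
def sweep(predict, stock, param, values):
    """Call predict(stock, param=value) for each value and collect the error.

    Assumes, as in the cells above, that the error is the element at
    index 1 of whatever predict returns.
    """
    errors = {}
    for value in values:
        result = predict(stock, **{param: value})
        errors[value] = result[1]
    return errors


def fake_tomorrow(stock, years=1, steps=1):
    """Hypothetical stand-in for stocker.predict.tomorrow (made-up numbers)."""
    return (100.0, round(0.5 * years + 0.1 * steps, 3))


errors_by_years = sweep(fake_tomorrow, 'GOOGL', 'years', [1, 2, 3])
for years, error in errors_by_years.items():
    print('Error by using', years, 'year(s) of data:', error, '%')
```

With stocker installed, passing the real function, for example `sweep(tomorrow, stock, 'steps', [1, 10, 20])`, should repeat the comparison from cell 5, assuming `tomorrow` accepts the keyword arguments used above.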

#### Now let's check the difference by using different features:

In [6]:
error1 = tomorrow(stock, features=['Open'])[1]
error2 = tomorrow(stock, features=['Low'])[1]
error3 = tomorrow(stock, features=['High'])[1]
error4 = tomorrow(stock, features=['Volume'])[1]
error5 = tomorrow(stock, features=['Adj Close'])[1]
error6 = tomorrow(stock, features=['Interest'])[1]
error7 = tomorrow(stock, features=['Wiki_views'])[1]
error8 = tomorrow(stock, features=['RSI','%K','%R'])[1]
error9 = tomorrow(stock)[1]
error10 = tomorrow(stock, features=['Open','Low','High','Volume','Adj Close','Interest','Wiki_views','RSI','%K','%R'])[1]
error11 = tomorrow(stock, features=['Open','Low','High','Volume','Adj Close'])[1]
print('Error by including Open prices:',error1,'%')
print('Error by including Low prices:',error2,'%')
print('Error by including High prices:',error3,'%')
print('Error by including Volume:',error4,'%')
print('Error by including Adj Close prices:',error5,'%')
print('Error by including Interest:',error6,'%')
print('Error by including Wiki_views:',error7,'%')
print('Error by including indicators:',error8,'%')
print('Error by including only Close prices:',error9,'%')
print('Error by including all the features:',error10,'%')
print('Error by including the features from Yahoo Finance:',error11,'%')

Error by including Open prices: 0.838 %
Error by including Low prices: 0.763 %
Error by including High prices: 0.814 %
Error by including Volume: 0.682 %
Error by including Adj Close prices: 0.757 %
Error by including Interest: 1.081 %
Error by including Wiki_views: 0.758 %
Error by including indicators: 0.74 %
Error by including only Close prices: 0.76 %
Error by including all the features: 0.723 %
Error by including the features from Yahoo Finance: 0.811 %


**Using only the previous *Close prices* gives an error similar to that obtained with more features, so it is a good option for saving time when running the code. However, it is recommended to try several cases with different features and choose the one with the lowest error.**
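Choosing the case with the lowest error can be sketched as follows. The dictionary simply copies the errors printed by the cell above; the labels used as keys are only descriptive names chosen for this example.

```python
# Errors (in %) copied from the output of the feature comparison above.
errors = {
    'Open': 0.838,
    'Low': 0.763,
    'High': 0.814,
    'Volume': 0.682,
    'Adj Close': 0.757,
    'Interest': 1.081,
    'Wiki_views': 0.758,
    'indicators (RSI, %K, %R)': 0.74,
    'Close only (default)': 0.76,
    'all features': 0.723,
    'Yahoo Finance features': 0.811,
}

# The case whose error is smallest.
best_case = min(errors, key=errors.get)
print('Lowest error:', best_case, errors[best_case], '%')

# Rank every case from lowest to highest error.
for case, error in sorted(errors.items(), key=lambda item: item[1]):
    print(f'{case:<26} {error:.3f} %')
```

For this run the lowest error comes from `Volume`, although most cases differ by only a few hundredths of a percentage point.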

**Finally, let's check the Pearson correlation coefficient for each feature against the *Close prices*:**

In [7]:
from stocker.get_data import correlation

In [8]:
correlation(stock, interest=True, wiki_views=True, indicators=True)

High          0.991673
Low           0.992298
Open          0.980375
Close         1.000000
Volume       -0.193335
Adj Close     1.000000
Interest     -0.478602
Wiki_views    0.133284
%K            0.402858
%R            0.402858
RSI           0.511400
Name: Close, dtype: float64
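
For reference, the Pearson coefficient between two series is their covariance divided by the product of their standard deviations, so it ranges from -1 to 1. The sketch below computes it from that definition in plain Python. The function name `pearson` and the sample values are made up for illustration; this is not necessarily how `correlation` is implemented inside stocker.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError('need two sequences of equal length >= 2')
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    numerator = sum(a * b for a, b in zip(dx, dy))
    denominator = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    if denominator == 0:
        # A constant series has no defined correlation.
        return float('nan')
    return numerator / denominator


# Made-up values for illustration only; not real market data.
close = [10.0, 11.0, 12.0, 13.0, 14.0]
high = [10.5, 11.5, 12.5, 13.5, 14.5]
volume = [500.0, 400.0, 450.0, 300.0, 350.0]

r_high = pearson(high, close)
r_volume = pearson(volume, close)
print('High vs Close:  ', round(r_high, 6))
print('Volume vs Close:', round(r_volume, 6))
```

In this toy data `high` moves exactly with `close`, while `volume` tends to fall as `close` rises, which gives a negative coefficient like the one reported for `Volume` above. In pandas, `DataFrame.corr()` uses the Pearson method by default.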