# How to work with stock market data on Kaggle

Kaggle hosts a large number of financial datasets at kaggle.com/datasets. As of January 2021, [this dataset](https://www.kaggle.com/paultimothymooney/stock-market-data) seems to be the most complete and the most up-to-date. It contains daily prices and volumes for all stocks listed on the NYSE or NASDAQ or included in the S&P 500, and it is updated weekly. The data is organized by ticker symbol, and it joins cleanly with this [supplemental dataset](https://www.kaggle.com/paultimothymooney/stock-market-supplementary-data), which pairs each ticker symbol with the full company name (e.g. AAPL => Apple, Inc). For additional stock data, consider the yfinance Python package, as demonstrated near the bottom of this notebook. Hopefully this notebook helps you get started with your own project!
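
The join with the supplemental dataset can be sketched with `pandas.DataFrame.merge`. This is only an illustration, not the datasets' actual schema: the column names `Ticker` and `Name`, the company names, and the small in-memory frames are assumptions standing in for the real files. The NFLX closes are taken from the preview shown later in this notebook, and the AAPL value is a made-up placeholder.

```python
import pandas as pd

# Hypothetical price rows, as if collected from several per-ticker CSV files.
prices = pd.DataFrame({
    "Ticker": ["NFLX", "NFLX", "AAPL"],
    "Date": ["27-01-2021", "28-01-2021", "28-01-2021"],
    "Close": [523.280029, 538.599976, 100.0],  # AAPL value is a placeholder
})

# Hypothetical lookup table pairing ticker symbols with company names.
names = pd.DataFrame({
    "Ticker": ["AAPL", "NFLX"],
    "Name": ["Apple, Inc", "Netflix, Inc"],
})

# A left join keeps every price row, even if a ticker has no matching name.
merged = prices.merge(names, on="Ticker", how="left")
print(merged)
```

A left join is used here so that price rows are never dropped when the lookup table is missing a ticker; the unmatched rows would simply carry a missing name.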

# **Step 1: Import Python Packages**

In [1]:
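import time
from datetime import datetime

import numpy as np
import pandas as pd
import plotly.express as px  # plotly_express is now part of plotly (version 4+) as plotly.express
from plotly.offline import init_notebook_mode

# Load plotly.js from a CDN so that figures render inline in the notebook
init_notebook_mode(connected=True)

# Today's date as YYYY-MM-DD; used later as the end date for the yfinance downloads
todays_date = datetime.fromtimestamp(time.time()).strftime("%Y-%m-%d")

def plot_stock_data(data, title):
    '''Return a plotly line chart of the "Close" column against "Date".'''
    plot = px.line(data,
                   x="Date",
                   y=["Close"],
                   hover_name="Date",
                   line_shape="linear",
                   title=title)
    return plot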

# **Step 2: Load the Stock Market Data into your Kaggle Notebook**

In [2]:
facebook = pd.read_csv('/kaggle/input/stock-market-data/stock_market_data/sp500/csv/FB.csv')
apple = pd.read_csv('/kaggle/input/stock-market-data/stock_market_data/nasdaq/csv/AAPL.csv')
netflix = pd.read_csv('/kaggle/input/stock-market-data/stock_market_data/nasdaq/csv/NFLX.csv')
google = pd.read_csv('/kaggle/input/stock-market-data/stock_market_data/sp500/csv/GOOG.csv')
microsoft = pd.read_csv('/kaggle/input/stock-market-data/stock_market_data/nasdaq/csv/MSFT.csv')
amazon = pd.read_csv('/kaggle/input/stock-market-data/stock_market_data/nasdaq/csv/AMZN.csv')
nvidia = pd.read_csv('/kaggle/input/stock-market-data/stock_market_data/nasdaq/csv/NVDA.csv')
print('Preview of NFLX.csv:')
netflix.tail(5)

Preview of NFLX.csv:


| Unnamed: 0 | Date | Low | Open | Volume | High | Close | Adjusted Close |
|---|---|---|---|---|---|---|---|
| 4699 | 22-01-2021 | 564.349976 | 582.099976 | 7538200 | 583.98999 | 565.169983 | 565.169983 |
| 4700 | 25-01-2021 | 548.650024 | 567.0 | 7207300 | 569.75 | 556.780029 | 556.780029 |
| 4701 | 26-01-2021 | 554.059998 | 554.72998 | 5023800 | 567.98999 | 561.929993 | 561.929993 |
| 4702 | 27-01-2021 | 515.72998 | 550.710022 | 8670300 | 556.419983 | 523.280029 | 523.280029 |
| 4703 | 28-01-2021 | 530.73999 | 535.880005 | 5874532 | 553.148193 | 538.599976 | 538.599976 |
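
The preview shows dates stored as day-first strings (DD-MM-YYYY). If you want real datetime values, for example to sort or filter by date, one option is to parse the column with an explicit format. The sketch below reproduces two rows of the preview as inline CSV text so that it runs without the dataset; the name `netflix_sample` is hypothetical.

```python
import io

import pandas as pd

# Two rows from the NFLX preview above, inlined so the sketch is self-contained.
csv_text = """Date,Low,Open,Volume,High,Close,Adjusted Close
28-01-2021,530.73999,535.880005,5874532,553.148193,538.599976,538.599976
27-01-2021,515.72998,550.710022,8670300,556.419983,523.280029,523.280029
"""
netflix_sample = pd.read_csv(io.StringIO(csv_text))

# An explicit day-first format avoids ambiguous parsing of dates such as 01-02-2021.
netflix_sample["Date"] = pd.to_datetime(netflix_sample["Date"], format="%d-%m-%Y")

# With datetime values, sorting is chronological rather than alphabetical.
netflix_sample = netflix_sample.sort_values("Date").reset_index(drop=True)
print(netflix_sample[["Date", "Close"]])
```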


# **Step 3: Plot the data**

In [3]:
plot_stock_data(facebook[-30:],'Facebook')

In [4]:
plot_stock_data(apple[-30:],'Apple')

In [5]:
plot_stock_data(netflix[-30:],'Netflix')

In [6]:
plot_stock_data(google[-30:],'Google')

In [7]:
plot_stock_data(microsoft[-30:],'Microsoft')

In [8]:
plot_stock_data(amazon[-30:],'Amazon')

In [9]:
plot_stock_data(nvidia[-30:],'NVIDIA')
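
The seven cells above differ only in the DataFrame and the title, so the same slicing could be driven from a dictionary. This is a sketch rather than the notebook's own approach: `last_n_rows` is a hypothetical helper, and small synthetic frames stand in for the loaded CSV files so that the example runs on its own.

```python
import pandas as pd

def last_n_rows(frames, n=30):
    """Return {title: last n rows} for a dict of {title: DataFrame}."""
    return {title: df.tail(n) for title, df in frames.items()}

# Synthetic placeholder data; in the notebook these would be netflix, apple, etc.
frames = {
    "Netflix": pd.DataFrame({
        "Date": pd.date_range("2021-01-01", periods=40),
        "Close": range(40),
    }),
    "Apple": pd.DataFrame({
        "Date": pd.date_range("2021-01-01", periods=10),
        "Close": range(10),
    }),
}

recent = last_n_rows(frames)
for title, df in recent.items():
    # In the notebook, plot_stock_data(df, title).show() could be called here.
    print(title, len(df))
```

`df.tail(30)` selects the same rows as `df[-30:]`, and both return the whole frame when it has fewer than 30 rows.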

# **Step 4: Access additional stock data by using the yfinance Python package**

In [10]:
!pip install yfinance

Collecting yfinance
  Downloading yfinance-0.1.55.tar.gz (23 kB)
Collecting multitasking>=0.0.7
  Downloading multitasking-0.0.9.tar.gz (8.1 kB)
Building wheels for collected packages: yfinance, multitasking
  Building wheel for yfinance (setup.py) ... done
  Created wheel for yfinance: filename=yfinance-0.1.55-py2.py3-none-any.whl size=22616 sha256=203bbf7f9b360c3f7277fd2ec429b5e2733cce84f06a30a385da685a281c6738
  Stored in directory: /root/.cache/pip/wheels/aa/8a/36/59ed4f6fbcb6100967618eeb0696046bf9777a41ac2ff1f9b9
  Building wheel for multitasking (setup.py) ... done
  Created wheel for multitasking: filename=multitasking-0.0.9-py3-none-any.whl size=8368 sha256=672680f0630a5353f9c936b37f275ef2ac5cd189420e6db590a68a1d4897976d
  Stored in directory: /root/.cache/pip/wheels/ae/25/47/4d68431a7ec1b6c4b5233365934b74c1d4e665bf5f968d363a
Successfully built yfinance multitasking
Installing collected packages: multitasking, yfinance
Successful

In [11]:
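import yfinance as yf

# Download daily prices from 2017-01-01 up to (but not including) today's date.
# Each result is indexed by date, so the index is copied into a "Date" column
# for use with plot_stock_data.
# Note: Facebook's ticker changed from FB to META in June 2022; use 'META' when running this later.
facebook_yf = yf.download('FB', start='2017-01-01', end=todays_date)
facebook_yf['Date'] = facebook_yf.index
apple_yf = yf.download('AAPL', start='2017-01-01', end=todays_date)
apple_yf['Date'] = apple_yf.index
netflix_yf = yf.download('NFLX', start='2017-01-01', end=todays_date)
netflix_yf['Date'] = netflix_yf.index
google_yf = yf.download('GOOG', start='2017-01-01', end=todays_date)
google_yf['Date'] = google_yf.index
amazon_yf = yf.download('AMZN', start='2017-01-01', end=todays_date)
amazon_yf['Date'] = amazon_yf.index
microsoft_yf = yf.download('MSFT', start='2017-01-01', end=todays_date)
microsoft_yf['Date'] = microsoft_yf.index
nvidia_yf = yf.download('NVDA', start='2017-01-01', end=todays_date)
nvidia_yf['Date'] = nvidia_yf.index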

[*********************100%***********************]  1 of 1 completed
[*********************100%***********************]  1 of 1 completed
[*********************100%***********************]  1 of 1 completed
[*********************100%***********************]  1 of 1 completed
[*********************100%***********************]  1 of 1 completed
[*********************100%***********************]  1 of 1 completed
[*********************100%***********************]  1 of 1 completed
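
The cell above copies the date index into a `Date` column by assignment. An equivalent alternative is `reset_index()`, sketched below on a synthetic frame so that it runs without network access. It assumes the downloaded frame has a date index named `Date`, which is worth checking against your yfinance version; `sample_yf` and its prices are made up for illustration.

```python
import pandas as pd

# Synthetic stand-in for a yfinance result: a date index named "Date"
# and made-up closing prices.
idx = pd.date_range("2017-01-03", periods=3, freq="B", name="Date")
sample_yf = pd.DataFrame({"Close": [1.0, 2.0, 3.0]}, index=idx)

# reset_index() moves the index into an ordinary "Date" column in one step
# and replaces it with a default integer index.
sample_flat = sample_yf.reset_index()
print(sample_flat.columns.tolist())
```

Unlike the assignment used above, `reset_index()` leaves a plain integer index behind, which more closely matches the shape of the frames loaded from the Kaggle CSV files.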


In [12]:
plot_stock_data(facebook_yf,'Facebook')

In [13]:
plot_stock_data(apple_yf,'Apple')

In [14]:
plot_stock_data(netflix_yf,'Netflix')

In [15]:
plot_stock_data(google_yf,'Google')

In [16]:
plot_stock_data(microsoft_yf,'Microsoft')

In [17]:
plot_stock_data(amazon_yf,'Amazon')

In [18]:
plot_stock_data(nvidia_yf,'NVIDIA')