# Part 1: Reading in Alphabet Class A Stock Data

**Note: this loading method was chosen for collaboration. The data sits in a shared Google Drive folder alongside this Colab notebook.**

Link to dataset: https://www.kaggle.com/borismarjanovic/price-volume-data-for-all-us-stocks-etfs#aadr.us.txt 

In [0]:
import pandas as pd
import requests 
import json
from pandas import json_normalize  # pandas.io.json.json_normalize is deprecated since pandas 1.0


In [0]:
# Code to read csv file from Drive into Colaboratory:
!pip install -U -q PyDrive
from pydrive.auth import GoogleAuth
from pydrive.drive import GoogleDrive
from google.colab import auth
from oauth2client.client import GoogleCredentials
# Authenticate and create the PyDrive client.
auth.authenticate_user()
gauth = GoogleAuth()
gauth.credentials = GoogleCredentials.get_application_default()
drive = GoogleDrive(gauth)

In [0]:
link = 'https://drive.google.com/open?id=1IsfFD6tPmloFiMWuja_RhzU9k1ZWHASD'
fluff, idx = link.split('=')
downloaded = drive.CreateFile({'id':idx}) 
downloaded.GetContentFile('googl.csv')  
googl_df = pd.read_csv('googl.csv', sep = ',')
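
The `link.split('=')` step above relies on the share link containing exactly one `=`; a link with extra query parameters would make the two-name unpacking fail. A minimal sketch of a more tolerant alternative, using only the standard library, is shown below. The helper name `drive_file_id` is hypothetical and not part of the notebook.

```python
from urllib.parse import urlparse, parse_qs


def drive_file_id(link):
    """Return the value of the `id` query parameter from a Drive share link."""
    query = parse_qs(urlparse(link).query)
    ids = query.get("id")
    if not ids:
        raise ValueError(f"no 'id' parameter in link: {link}")
    return ids[0]


file_id = drive_file_id(
    "https://drive.google.com/open?id=1IsfFD6tPmloFiMWuja_RhzU9k1ZWHASD"
)
print(file_id)
```

The returned string could then be passed to `drive.CreateFile({'id': file_id})` in place of `idx`.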

In [0]:
googl_df.head()

Unnamed: 0,Date,Open,High,Low,Close,Volume,OpenInt
0,2004-08-19,50.0,52.03,47.98,50.17,44703800,0
1,2004-08-20,50.505,54.54,50.25,54.155,22857200,0
2,2004-08-23,55.375,56.74,54.525,54.7,18274400,0
3,2004-08-24,55.62,55.8,51.785,52.435,15262600,0
4,2004-08-25,52.48,54.0,51.94,53.0,9197800,0


In [0]:
# Convert the Date column to datetime objects and use it as the index
googl_df['Date'] = pd.to_datetime(googl_df['Date'])
googl_df.set_index('Date', inplace=True)

In [0]:
start_date = '2013-01-01'
end_date = '2016-12-31'
# 'Date' is now the index, so filter on googl_df.index rather than a 'Date' column
mask = (googl_df.index >= start_date) & (googl_df.index <= end_date)
googl_four_df = googl_df[mask]
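
Once `Date` is the index, the same 2013 to 2016 window can also be selected by slicing on the index instead of building a boolean mask. The sketch below uses a small toy frame (`toy_df`, with made-up values) in place of `googl_df`, and assumes the index is sorted.

```python
import pandas as pd

# Toy prices standing in for googl_df (values are illustrative, not real data).
toy_df = pd.DataFrame(
    {
        "Date": ["2012-12-31", "2013-01-02", "2016-12-30", "2017-01-03"],
        "Close": [1.0, 2.0, 3.0, 4.0],
    }
)
toy_df["Date"] = pd.to_datetime(toy_df["Date"])
toy_df = toy_df.set_index("Date").sort_index()

# Label-based slicing on a sorted DatetimeIndex includes both endpoints.
window = toy_df.loc["2013-01-01":"2016-12-31"]
print(window)
```

Only the two rows dated inside the window are kept; the 2012 and 2017 rows are dropped.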

# Part 2a: Retrieving News Data via NYT REST API 

**Note: once the initial API retrieval has finished, the data is written to a CSV in the "Project" folder. To save time, skip the code blocks below and read that file in from Drive instead (Part 2b).**

New York Times API documentation: https://developer.nytimes.com/docs/articlesearch-product/1/overview 

Using the NYT Article Search API to get all articles related to Alphabet/Google published between 2013 and 2016.

In [0]:
# default values 
key = "dntIk6KTZEo5h1RVjQ0wh9EZKsMrY6Q1"
link = ('https://api.nytimes.com/svc/search/v2/articlesearch.json?q=alphabet' + 
        '&begin_date=20130101&end_date=20161231' + '&api-key=' + key)
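
The request URL above is assembled by string concatenation. As a rough alternative sketch, the same query string can be built from a dictionary with the standard library, which also takes care of URL-encoding. The function name `build_search_url` and the placeholder key `YOUR_API_KEY` are assumptions for illustration only.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.nytimes.com/svc/search/v2/articlesearch.json"


def build_search_url(query, begin_date, end_date, api_key, page=None):
    """Build an Article Search URL from the parameters used in this notebook."""
    params = {
        "q": query,
        "begin_date": begin_date,  # YYYYMMDD, as in the cell above
        "end_date": end_date,
        "api-key": api_key,
    }
    if page is not None:
        params["page"] = page
    return BASE_URL + "?" + urlencode(params)


url = build_search_url("alphabet", "20130101", "20161231", "YOUR_API_KEY")
print(url)
```

Keeping the key out of the source (for example, reading it from an environment variable) would also avoid publishing it with the notebook.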

In [0]:
# get the number of result pages the API has for "alphabet" (10 articles per page)
resp = requests.get(link)
first_h = resp.json()
all_news = json_normalize(first_h)
n_arts = int(all_news['response.meta.hits'].iloc[0])
n_pages = -(-n_arts // 10)  # ceiling division, so an exact multiple of 10 adds no empty page
n_pages

87
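
The page count follows from the hit count at 10 articles per page. A small sketch of the arithmetic is below; the helper `page_count` is hypothetical. Ceiling division gives the same answer as `n_arts // 10 + 1` except when the hit count is an exact multiple of 10, where the latter asks for one extra, empty page.

```python
def page_count(hits, per_page=10):
    """Number of pages needed to hold `hits` results at `per_page` per page."""
    if hits < 0 or per_page <= 0:
        raise ValueError("hits must be >= 0 and per_page must be > 0")
    return -(-hits // per_page)  # ceiling division using integers only


print(page_count(867))  # 867 articles, matching the 0..866 index shown later
print(page_count(870))
```

With 867 hits this gives 87 pages, the value shown above; with 870 hits it gives 87 rather than 88.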

In [0]:
# traverse all the pages and add each page's articles to a list
import time 
articles = []
for i in range(0, n_pages): 
  new_l = link + "&page=" + str(i)
  resp = requests.get(new_l)
  hold = resp.json()
  articles.extend(hold['response']['docs'])
  # pause for a minute after every 10th request (counting the one made in the
  # previous cell), presumably to stay under the API's rate limit
  if (i + 2) % 10 == 0:
    time.sleep(60)
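
The loop above mixes paging, HTTP requests, and rate-limit pauses. The sketch below separates those concerns so the paging logic can be exercised without touching the network. The names `fetch_all_pages` and `fake_fetch` are hypothetical, and the 10-requests-then-60-seconds pacing simply mirrors the cell above rather than any documented limit.

```python
import time


def fetch_all_pages(fetch_page, n_pages, pause_every=10, pause_seconds=60,
                    sleep=time.sleep):
    """Collect the documents from pages 0..n_pages-1, pausing periodically."""
    docs = []
    for page in range(n_pages):
        docs.extend(fetch_page(page))
        # Pause after every `pause_every` requests, but not after the last one.
        if (page + 1) % pause_every == 0 and page + 1 < n_pages:
            sleep(pause_seconds)
    return docs


def fake_fetch(page):
    """Stand-in for the real request; returns two fake documents per page."""
    return [f"doc-{page}-{k}" for k in range(2)]


pauses = []  # records requested pauses instead of actually sleeping
docs = fetch_all_pages(fake_fetch, n_pages=25, sleep=pauses.append)
print(len(docs), pauses)
```

In the notebook, `fetch_page` could wrap `requests.get(link + "&page=" + str(page)).json()['response']['docs']`, with the default `time.sleep` left in place.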

In [0]:
# convert to DF
all_articles = json_normalize(articles)
all_articles.tail()

Unnamed: 0,abstract,web_url,snippet,lead_paragraph,source,multimedia,keywords,pub_date,document_type,news_desk,section_name,type_of_material,_id,word_count,uri,headline.main,headline.kicker,headline.content_kicker,headline.print_headline,headline.name,headline.seo,headline.sub,byline.original,byline.person,byline.organization,print_section,print_page,subsection_name,slideshow_credits
862,Two readers comment on a Sunday Review essay t...,https://www.nytimes.com/2015/06/04/opinion/the...,Two readers comment on a Sunday Review essay t...,To the Editor:,The New York Times,[],"[{'name': 'subject', 'value': 'Women and Girls...",2015-06-04T07:20:06+0000,article,Letters,Opinion,Letter,nyt://article/07629053-3e41-5b40-8ae5-8b9b1e35...,265,nyt://article/07629053-3e41-5b40-8ae5-8b9b1e35...,The Pressure to Primp,Letters,,The Pressure to Primp,,,,,[],,A,24.0,,
863,"Remembering some of the artists, innovators an...",https://www.nytimes.com/interactive/2015/12/16...,"Remembering some of the artists, innovators an...","Remembering some of the artists, innovators an...",The New York Times,"[{'rank': 0, 'subtype': 'watch308', 'caption':...","[{'name': 'subject', 'value': 'Deaths (Obituar...",2015-12-23T13:15:48+0000,multimedia,Magazine,Magazine,Interactive Feature,nyt://interactive/f09df3dd-47d1-54a7-9d26-05d1...,0,nyt://interactive/f09df3dd-47d1-54a7-9d26-05d1...,The Lives They Lived,Feature,,,,,,,[],,,,,
864,Two teachers agree with an opinion writer abou...,https://www.nytimes.com/2015/05/26/opinion/the...,Two teachers agree with an opinion writer abou...,To the Editor:,The New York Times,"[{'rank': 0, 'subtype': 'watch308', 'caption':...","[{'name': 'subject', 'value': 'Children and Ch...",2015-05-26T07:21:19+0000,article,Letters,Opinion,Letter,nyt://article/2aab24d1-919a-5a00-9d8d-5af95255...,380,nyt://article/2aab24d1-919a-5a00-9d8d-5af95255...,The Importance of Play as a Learning Tool,Letter,,The Importance of Play as a Learning Tool,,,,,[],,A,18.0,,
865,"The best present ideas, selected by Times expe...",https://www.nytimes.com/interactive/2014/multi...,"The best present ideas, selected by Times expe...","The best present ideas, selected by Times expe...",The New York Times,"[{'rank': 0, 'subtype': 'wide', 'caption': Non...","[{'name': 'subject', 'value': 'Gifts', 'rank':...",2014-10-14T18:56:40+0000,multimedia,Multimedia/Photos,Multimedia/Photos,Interactive Feature,nyt://interactive/e2ec3e6f-b0b1-58e8-8537-4f18...,0,nyt://interactive/e2ec3e6f-b0b1-58e8-8537-4f18...,"2014 Holiday Gift Ideas and Guide — Movies, Mu...",,,,,,,,[],,,,,
866,Readers reflect on how we see ourselves and le...,https://www.nytimes.com/2014/03/23/opinion/sun...,Readers reflect on how we see ourselves and le...,Readers reflect on how we see ourselves and le...,The New York Times,"[{'rank': 0, 'subtype': 'wide', 'caption': Non...","[{'name': 'subject', 'value': 'Books and Liter...",2014-03-22T18:30:01+0000,article,Letters,Opinion,Letter,nyt://article/020c0074-1418-5334-81d2-15a0e71a...,988,nyt://article/020c0074-1418-5334-81d2-15a0e71a...,Diversity in Kids’ Books,Letters,,Diversity in Kids’ Books,,,,,[],,SR,12.0,Sunday Review,
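
Dotted column names such as `headline.main` and `byline.original` in the table above come from `json_normalize` flattening nested dictionaries. The sketch below shows that behavior on two made-up records (not real API output); list-valued fields such as `keywords` are left as Python lists in a single column.

```python
import pandas as pd

# Made-up records shaped loosely like the API's article documents.
sample_docs = [
    {
        "headline": {"main": "Example headline", "kicker": None},
        "byline": {"original": "By A. Writer"},
        "keywords": [{"name": "subject", "value": "Gifts"}],
        "word_count": 265,
    },
    {
        "headline": {"main": "Another headline", "kicker": "Letters"},
        "byline": {"original": None},
        "keywords": [],
        "word_count": 380,
    },
]

flat = pd.json_normalize(sample_docs)
print(sorted(flat.columns))
```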


In [0]:
from google.colab import drive
drive.mount('drive')

Drive already mounted at drive; to attempt to forcibly remount, call drive.mount("drive", force_remount=True).


In [0]:
# write to CSV 
all_articles.to_csv('all_articles_googl.csv')
!cp all_articles_googl.csv "drive/My Drive/"

# Part 2b: Reading in Article Data via Drive CSV

In [0]:
# authenticate, then download the CSV from Drive and read it in
auth.authenticate_user()
gauth = GoogleAuth()
gauth.credentials = GoogleCredentials.get_application_default()
drive = GoogleDrive(gauth)

link = 'https://drive.google.com/open?id=1-3BMbLq2BGIRQVg4vhguypidcRLLKtOr'
fluff, idx = link.split('=')
downloaded = drive.CreateFile({'id':idx}) 
downloaded.GetContentFile('all_articles_googl.csv')  
articles_df = pd.read_csv('all_articles_googl.csv', sep = ',')

In [0]:
articles_df.tail()

Unnamed: 0.1,Unnamed: 0,abstract,web_url,snippet,lead_paragraph,source,multimedia,keywords,pub_date,document_type,news_desk,section_name,type_of_material,_id,word_count,uri,headline.main,headline.kicker,headline.content_kicker,headline.print_headline,headline.name,headline.seo,headline.sub,byline.original,byline.person,byline.organization,print_section,print_page,subsection_name,slideshow_credits
862,862,Two readers comment on a Sunday Review essay t...,https://www.nytimes.com/2015/06/04/opinion/the...,Two readers comment on a Sunday Review essay t...,To the Editor:,The New York Times,[],"[{'name': 'subject', 'value': 'Women and Girls...",2015-06-04T07:20:06+0000,article,Letters,Opinion,Letter,nyt://article/07629053-3e41-5b40-8ae5-8b9b1e35...,265,nyt://article/07629053-3e41-5b40-8ae5-8b9b1e35...,The Pressure to Primp,Letters,,The Pressure to Primp,,,,,[],,A,24.0,,
863,863,"Remembering some of the artists, innovators an...",https://www.nytimes.com/interactive/2015/12/16...,"Remembering some of the artists, innovators an...","Remembering some of the artists, innovators an...",The New York Times,"[{'rank': 0, 'subtype': 'watch308', 'caption':...","[{'name': 'subject', 'value': 'Deaths (Obituar...",2015-12-23T13:15:48+0000,multimedia,Magazine,Magazine,Interactive Feature,nyt://interactive/f09df3dd-47d1-54a7-9d26-05d1...,0,nyt://interactive/f09df3dd-47d1-54a7-9d26-05d1...,The Lives They Lived,Feature,,,,,,,[],,,,,
864,864,Two teachers agree with an opinion writer abou...,https://www.nytimes.com/2015/05/26/opinion/the...,Two teachers agree with an opinion writer abou...,To the Editor:,The New York Times,"[{'rank': 0, 'subtype': 'watch308', 'caption':...","[{'name': 'subject', 'value': 'Children and Ch...",2015-05-26T07:21:19+0000,article,Letters,Opinion,Letter,nyt://article/2aab24d1-919a-5a00-9d8d-5af95255...,380,nyt://article/2aab24d1-919a-5a00-9d8d-5af95255...,The Importance of Play as a Learning Tool,Letter,,The Importance of Play as a Learning Tool,,,,,[],,A,18.0,,
865,865,"The best present ideas, selected by Times expe...",https://www.nytimes.com/interactive/2014/multi...,"The best present ideas, selected by Times expe...","The best present ideas, selected by Times expe...",The New York Times,"[{'rank': 0, 'subtype': 'wide', 'caption': Non...","[{'name': 'subject', 'value': 'Gifts', 'rank':...",2014-10-14T18:56:40+0000,multimedia,Multimedia/Photos,Multimedia/Photos,Interactive Feature,nyt://interactive/e2ec3e6f-b0b1-58e8-8537-4f18...,0,nyt://interactive/e2ec3e6f-b0b1-58e8-8537-4f18...,"2014 Holiday Gift Ideas and Guide — Movies, Mu...",,,,,,,,[],,,,,
866,866,Readers reflect on how we see ourselves and le...,https://www.nytimes.com/2014/03/23/opinion/sun...,Readers reflect on how we see ourselves and le...,Readers reflect on how we see ourselves and le...,The New York Times,"[{'rank': 0, 'subtype': 'wide', 'caption': Non...","[{'name': 'subject', 'value': 'Books and Liter...",2014-03-22T18:30:01+0000,article,Letters,Opinion,Letter,nyt://article/020c0074-1418-5334-81d2-15a0e71a...,988,nyt://article/020c0074-1418-5334-81d2-15a0e71a...,Diversity in Kids’ Books,Letters,,Diversity in Kids’ Books,,,,,[],,SR,12.0,Sunday Review,
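
The `Unnamed: 0` and `Unnamed: 0.1` columns above most likely come from the row index being written out by `to_csv` and read back as ordinary data, possibly more than once. A minimal sketch of the effect, using a toy frame and an in-memory buffer rather than the real file:

```python
import io

import pandas as pd

toy = pd.DataFrame({"abstract": ["a", "b"], "word_count": [265, 380]})

# Default to_csv writes the index as an unnamed first column.
buf_with_index = io.StringIO()
toy.to_csv(buf_with_index)
buf_with_index.seek(0)
cols_with_index = pd.read_csv(buf_with_index).columns.tolist()

# index=False leaves the index out, so the round trip preserves the columns.
buf_without_index = io.StringIO()
toy.to_csv(buf_without_index, index=False)
buf_without_index.seek(0)
cols_without_index = pd.read_csv(buf_without_index).columns.tolist()

print(cols_with_index)
print(cols_without_index)
```

Writing with `index=False`, or reading with `index_col=0`, are two ways to keep the extra columns from appearing.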
