# Introduction
- Continuing the theme of 'war-profiteering', this study turns to the media interest generated during the Afghanistan war.
- Major events provoke strong reactions among citizens, which in turn draw intense attention from the media.
- The media follow these events in detail and work to capture and convey the public mood, distributing information as they do so.
- The increased demand for information is reflected in the stock value of the media outlet.

## Scope
- During the initial months of the war, in a world without social media, newspapers and television were the main sources of updates. The following analysis attempts to visualize media interest in the activities surrounding the Afghanistan war.
- It targets a major newspaper, the New York Times, and its stock value on the New York Stock Exchange.

## Timeline
- The roughly eight months following the onset of military action in the aftermath of the 9/11 attacks in 2001, compared against the same months of the current period (2016 to 2017).

## Resources
- Stock data for the New York Times during the war and at present.
- The assortment of published articles during the war.

In [9]:
#Import all the libraries
import requests
import json
import os
import csv
import re
import datetime
import time
import glob
import pandas as pd
import ystockquote
from pprint import pprint
from yahoo_finance import Share
from plotly.graph_objs import Scatter, Figure, Layout
from plotly.offline import download_plotlyjs, init_notebook_mode, plot, iplot
import plotly.plotly as py
from plotly.graph_objs import *
import matplotlib.pyplot as plt
import calendar


relativePath = os.getcwd()

In [88]:
#Fetch stocks and shares of NYT during the Afghan War 2002 and store in CSV.
nyt = Share('NYT')
nyt = nyt.get_historical('2001-9-1', '2002-4-20')
keys = nyt[0].keys()
with open(relativePath+"/"+'final/extra/SharesNYTWarTime.csv', 'w') as output_file:
    dict_writer = csv.DictWriter(output_file, keys)
    dict_writer.writeheader()
    dict_writer.writerows(nyt)

In [89]:
#Fetch stocks and shares of NYT now and store in CSV.
nyt2 = Share('NYT')
nyt2 = nyt2.get_historical('2016-9-1', '2017-04-20')
keys = nyt2[0].keys()
with open(relativePath+"/"+'final/extra/SharesNYTNow.csv', 'w') as output_file:
    dict_writer = csv.DictWriter(output_file, keys)
    dict_writer.writeheader()
    dict_writer.writerows(nyt2)

In [10]:
#Creating dataframe for each csv file
dfWar = pd.read_csv(relativePath+"/"+'final/extra/SharesNYTWarTime.csv')
dfNow = pd.read_csv(relativePath+"/"+'final/extra/SharesNYTNow.csv')
dfNow

Unnamed: 0,Symbol,Date,Open,High,Low,Close,Volume,Adj_Close
0,NYT,2017-04-19,14.55,14.73,14.50,14.60,752900,14.600000
1,NYT,2017-04-18,14.35,14.55,14.28,14.50,910000,14.500000
2,NYT,2017-04-17,14.30,14.48,14.20,14.45,540900,14.450000
3,NYT,2017-04-13,14.45,14.50,14.15,14.25,560800,14.250000
4,NYT,2017-04-12,14.60,14.70,14.35,14.50,588500,14.500000
5,NYT,2017-04-11,14.35,14.65,14.25,14.60,700300,14.600000
6,NYT,2017-04-10,14.45,14.55,14.20,14.45,523500,14.450000
7,NYT,2017-04-07,14.45,14.50,14.30,14.45,605400,14.450000
8,NYT,2017-04-06,14.40,14.53,14.25,14.50,603600,14.500000
9,NYT,2017-04-05,14.60,14.65,14.30,14.40,955500,14.400000


In [6]:
init_notebook_mode(connected=True)

In [12]:
#Extracting date and converting into 'Mon-d' labels (e.g. 'Apr-19') for stocks during afghan war
dfWar['Date'] = pd.to_datetime(dfWar['Date'])
dfWar['Day'] = dfWar['Date'].dt.day
dfWar['Month'] = dfWar['Date'].dt.month
dfWar['Month'] = dfWar['Month'].apply(lambda x: calendar.month_abbr[x])
dfWar['dayMonth'] = dfWar['Month'].astype(str).str.cat(dfWar['Day'].astype(str), sep='-')
dfWar = dfWar.drop('Day', axis=1)
#Calculating mean of the stocks and shares during Afghan war 2002  
High = dfWar.groupby(['Month']).mean(numeric_only=True)
High = High.reset_index()
High.head()

Unnamed: 0,Month,Open,High,Low,Close,Volume,Adj_Close
0,Apr,47.37,47.774667,46.809333,47.312,445073,37.900945
1,Dec,43.9825,44.416,43.537,43.953,799465,35.110321
2,Feb,43.731052,44.37421,43.392105,43.954737,631363,35.122167
3,Jan,43.489048,43.838572,43.086191,43.432857,523204,34.694823
4,Mar,46.968,47.5225,46.6705,47.102,536105,37.732717


In [11]:
#Extracting date and converting into 'Mon-d' labels (e.g. 'Apr-19') for stocks now
dfNow['Date'] = pd.to_datetime(dfNow['Date'])
dfNow['Day'] = dfNow['Date'].dt.day
dfNow['Month'] = dfNow['Date'].dt.month
dfNow['Month'] = dfNow['Month'].apply(lambda x: calendar.month_abbr[x])
dfNow['dayMonth'] = dfNow['Month'].astype(str).str.cat(dfNow['Day'].astype(str), sep='-')
dfNow = dfNow.drop('Day', axis=1)
#Calculating mean of the stocks and shares now
High = dfNow.groupby(['Month']).mean(numeric_only=True)
High = High.reset_index()
High.head()

Unnamed: 0,Month,Open,High,Low,Close,Volume,Adj_Close
0,Apr,14.4125,14.57,14.249167,14.458333,706008,14.458333
1,Dec,13.485714,13.678571,13.312381,13.490476,574709,13.413142
2,Feb,14.955263,15.188947,14.686316,14.955263,1179000,14.913721
3,Jan,13.4175,13.5495,13.272,13.4275,450730,13.382187
4,Mar,14.486957,14.636087,14.345652,14.486957,702265,14.446715


## Analysis Plot 1 
The grouped bar chart compares the stock value of the New York Times in the months following the 9/11 attacks against its current value. Months run along the x-axis, and the mean stock value for each month is plotted on the y-axis.
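
The monthly aggregation described above can be sketched as follows. This is a minimal illustration, assuming daily rows with `Date` and `High` columns like the dataframes in this notebook; `monthly_mean` is a hypothetical helper and the sample values are made up for the example, not real NYT quotes.

```python
import calendar
import pandas as pd

def monthly_mean(df, value_col="High"):
    """Return the mean of value_col per calendar month, in chronological order."""
    dates = pd.to_datetime(df["Date"])
    # Group on year-month periods so the result sorts by time,
    # not alphabetically by month abbreviation.
    month_period = dates.dt.to_period("M")
    means = df[value_col].groupby(month_period).mean()
    return pd.DataFrame({
        "Month": [calendar.month_abbr[p.month] for p in means.index],
        value_col: means.to_numpy(),
    })

# Hypothetical rows, for illustration only (not real NYT quotes).
sample = pd.DataFrame({
    "Date": ["2002-01-02", "2001-12-03", "2002-01-03", "2001-12-04"],
    "High": [20.0, 10.0, 22.0, 12.0],
})
monthly = monthly_mean(sample)
print(monthly)
```

Grouping on the year-month period rather than on the month abbreviation keeps December 2001 ahead of January 2002, which an alphabetical sort of `Dec` and `Jan` also happens to do but `Apr` and `Sep` would not.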

In [13]:
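#Plotting the monthly mean high of NYT stock now and during the war
#Sort by date first so the months appear in chronological order
warMonthly = dfWar.sort_values('Date').groupby('Month', sort=False)['High'].mean().reset_index()
nowMonthly = dfNow.sort_values('Date').groupby('Month', sort=False)['High'].mean().reset_index()
trace1 = Bar(
    x=warMonthly['Month'],
    y=warMonthly['High'],
    name='NYT Stock during war'
)
trace2 = Bar(
    x=nowMonthly['Month'],
    y=nowMonthly['High'],
    name='NYT Stock Now'
)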

data = [trace1, trace2]
layout = Layout(
    barmode='group', title='NYT Stocks Now and during Afghan War'
)

fig = Figure(data=data, layout=layout)
iplot(fig, filename='grouped-bar')


## Inference Plot 1
- Because the event was broadcast live around the globe and is widely regarded as a turning point in the political environment, the media covered even minute details to relay as much information as possible to a shocked public.
- The stock traded far higher in the months after one of the most appalling events in recent history than it does in the current period: the monthly mean highs shown above are roughly $44 to $48 for the war period against roughly $13.50 to $15 now.
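
To put a number on the gap, the monthly mean highs from the two `High.head()` outputs above can be compared directly. This is a rough sketch only: the values are copied from those outputs, the column names `war_high` and `now_high` are made up for the example, and the prices are nominal, so the ratio ignores inflation and any adjustments reflected in `Adj_Close`.

```python
import pandas as pd

# Monthly mean highs copied from the two High.head() outputs above.
comparison = pd.DataFrame({
    "Month": ["Apr", "Dec", "Feb", "Jan", "Mar"],
    "war_high": [47.774667, 44.416, 44.37421, 43.838572, 47.5225],
    "now_high": [14.57, 13.678571, 15.188947, 13.5495, 14.636087],
})

# Ratio of the war-time monthly mean high to the current one.
comparison["ratio"] = comparison["war_high"] / comparison["now_high"]
print(comparison.round(2))
```

For the five months shown, the war-time figure is roughly three times the current one.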

In [None]:
#Fetch the articles published in NYT during the Afghan war of 2002 and store in JSON format
relativePath = os.getcwd()
key = os.getenv('api_key')  #API key is read from the 'api_key' environment variable
begin_date = '20011216'
end_date = '20020411'
q = 'war'

#Create the output folder only if it does not exist yet
dataDir = relativePath+"/"+'final/data/'
if not os.path.exists(dataDir):
    os.makedirs(dataDir)
pages = 120
article_url = "http://api.nytimes.com/svc/search/v2/articlesearch.json"
for i in range(0, pages):
    #Let requests build and encode the query string
    params = {'api-key': key, 'q': q, 'begin_date': begin_date, 'end_date': end_date, 'page': i}
    response = requests.get(article_url, params=params)
    time.sleep(5)  #pause between requests
    print(response)
    content = response.json()
    with open(dataDir+"War"+str(i)+'.json', 'w') as file:
        json.dump(content, file)

In [3]:
from collections import Counter
folder='/final/data'
fileList=[]
for directory, subDirectory, filename in os.walk(relativePath+folder):
    for f in filename:
        fileList.append(os.path.join(directory, f))      #collecting all 120 json files in fileList

sectionCounter = Counter()                               #section_name -> number of articles
for filename in fileList:                                #iterating over all the json files in data folder
    with open(filename) as r:
        json_data = json.load(r)                         #loading the data of each file into json_data
    json_docs = json_data['response']['docs']            #getting all articles from this page
    for doc in json_docs:                                #iterating over articles
        if doc['section_name'] is not None:              #skipping articles without a section
            sectionCounter[doc['section_name']] += 1     #recording the count of articles for each section_name

sectionCountList = dict(sectionCounter)
#print(sectionCounter.most_common(10))
sectionCountList

{'Arts': 20,
 'Arts; Books': 23,
 'Arts; Books; Opinion': 2,
 'Arts; Corrections; Books': 1,
 'Arts; Obituaries': 4,
 'Arts; Opinion': 1,
 'Arts; Theater': 3,
 'Arts; Theater; Opinion': 3,
 'Books; New York and Region': 3,
 'Books; Week in Review': 1,
 'Business': 16,
 'Business; Obituaries': 5,
 'Business; Washington': 2,
 'Corrections; Books; New York and Region': 1,
 'Corrections; Magazine': 1,
 'Corrections; New York and Region': 14,
 'Corrections; Washington; New York and Region': 5,
 'Education; New York and Region': 2,
 'Education; U.S.; Washington': 1,
 'Front Page': 2,
 "Front Page; Corrections; Editors' Notes; New York and Region": 1,
 'Front Page; New York and Region': 9,
 'Front Page; U.S.': 3,
 'Front Page; U.S.; Washington': 7,
 'Health': 3,
 'Health; Opinion': 2,
 'Health; U.S.': 1,
 'Home and Garden; Style': 4,
 'Home and Garden; Style; Books': 1,
 'Magazine': 10,
 'Magazine; Opinion': 8,
 'Magazine; Washington': 4,
 'Magazine; Washington; Opinion': 4,
 'Movies; Arts': 

In [4]:
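#Write the section count list in csv, with a header row matching the columns read back later
with open(relativePath+"/"+'final/extra/TT1.csv', 'w') as csv_file:
    writer = csv.writer(csv_file)
    writer.writerow(['Section_Name', 'Article_Count'])
    for section, count in sectionCountList.items():
        writer.writerow([section, count])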

## Analysis Plot 2
The distribution of articles across the many sections shows how much interest each domain attracted among citizens and, consequently, the media.
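
A pie chart with dozens of small slices is hard to read. One possible way to keep only the major sections is sketched here, assuming the counts sit in `Section_Name` and `Article_Count` columns as in the plotting cell. `top_sections` is a hypothetical helper, not part of the original notebook, and the sample rows are a few counts copied from the `sectionCountList` output above.

```python
import pandas as pd

def top_sections(df, n=5, label_col="Section_Name", value_col="Article_Count"):
    """Keep the n largest sections and fold the remainder into 'Other'."""
    ranked = df.sort_values(value_col, ascending=False)
    top = ranked.head(n)
    rest_total = ranked[value_col].iloc[n:].sum()
    if rest_total > 0:
        other = pd.DataFrame({label_col: ["Other"], value_col: [rest_total]})
        top = pd.concat([top, other], ignore_index=True)
    return top.reset_index(drop=True)

# A few counts copied from the sectionCountList output above.
sample = pd.DataFrame({
    "Section_Name": ["Arts", "Arts; Books", "Business", "Magazine", "Health"],
    "Article_Count": [20, 23, 16, 10, 3],
})
major = top_sections(sample, n=3)
print(major)
```

The resulting frame can be passed to the pie chart in place of the full table; the choice of `n` is a judgment call.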

In [14]:
#Creating a dataframe from the section count csv file
dfCountArticle = pd.read_csv(relativePath+"/"+'final/extra/TT1.csv')
dfCountArticle = dfCountArticle.dropna() #drop null values
#Pie chart plotting
fig = {
    'data': [{'labels': dfCountArticle['Section_Name'],
              'values': dfCountArticle['Article_Count'],
              'type': 'pie'}],
    'layout': {'title': 'Major sections with most articles during Afghan War'}
     }
iplot(fig)

## Inference Plot 2
- The majority of articles focused on stories revolving around 9/11 and the rising political tensions, while other topics received comparatively little coverage.
- The pie chart makes this visible: most articles fall under sections that include World, New York, Washington, Front Page and Opinion.
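
Many labels in the section counts are compound, such as `Front Page; U.S.; Washington`, so the totals for a single topic like Washington are spread over several entries. A minimal sketch of one way to tally them per individual section follows, assuming the parts are separated by semicolons as in the output above. `count_by_single_section` is a hypothetical helper, and an article with a compound label is credited to each of its parts, so the totals can exceed the number of articles.

```python
from collections import Counter

def count_by_single_section(section_counts):
    """Split compound labels such as 'Arts; Books' and credit each part."""
    totals = Counter()
    for label, count in section_counts.items():
        for part in label.split(";"):
            totals[part.strip()] += count
    return totals

# A few entries copied from the sectionCountList output above.
sample_counts = {
    "Front Page": 2,
    "Front Page; New York and Region": 9,
    "Front Page; U.S.": 3,
    "Front Page; U.S.; Washington": 7,
    "Business; Washington": 2,
}
totals = count_by_single_section(sample_counts)
print(totals.most_common(3))
```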

# Conclusion
- The two plots support the problem statement. The first plot shows that the stock value of the New York Times, a major source of information, stood well above its current level in the aftermath of 9/11, and the second shows the interest in information surrounding the political environment over the same period. Together they point to a link between that interest and the stock value.
