In [None]:
#Import Google Drive to Google Colab
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [None]:
!pip install -U -q PyDrive

import os
from pydrive.auth import GoogleAuth
from pydrive.drive import GoogleDrive
from google.colab import auth
from oauth2client.client import GoogleCredentials
# Authenticate and create the PyDrive client.
auth.authenticate_user()
gauth = GoogleAuth()
gauth.credentials = GoogleCredentials.get_application_default()
drive = GoogleDrive(gauth)

# Analyzing How the Discussion of Climate Change Has Shifted from 1980 to 2010

This notebook, together with an accompanying storyboard, tells the story of how the discussion of rising global average surface temperatures changed between 1980 and 2010. Specifically, I use articles from The New York Times Archives for the years 1980-2010 to analyze how the language and content change from decade to decade.

The main questions I will explore:
- How have the themes surrounding the conversation of climate change shifted over the years?
- When did the discussion shift from just noticing an increase in temperature to discussing the causes and implications?
- Was climate change misunderstood in the past?
- Is there overlap between how it is discussed today vs. in the past?

# Data Collection

For my data, I collected 25 articles for each decade (1980-1990, 1990-2000, 2000-2010) from The New York Times Archive. All of these articles were found under the topic "climate change." I used newspaper3k to scrape the articles and then wrote the results to a CSV file.


The steps are as follows:
1. Download the latest version of Python
2. Install newspaper3k with `pip install newspaper3k`
3. Write a Python script to scrape the web pages (included in the data folder of the Google Drive)
4. Write the scrape output to a CSV file
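
The steps above can be sketched roughly as follows. This is not the script from the data folder, only a minimal illustration of the same idea. The function names `fetch_article_text` and `write_articles_csv` and the `urls_by_decade` argument are hypothetical, and the column layout only approximates the CSV loaded below (it omits the index column).

```python
import csv


def fetch_article_text(url):
    """Download one article and return its body text using newspaper3k."""
    # Imported here so the CSV helper below can be used without newspaper3k.
    from newspaper import Article

    article = Article(url)
    article.download()
    article.parse()
    return article.text


def write_articles_csv(urls_by_decade, csv_path, fetch=fetch_article_text):
    """Write one row per decade: the decade label, then one column per article."""
    # Assumes urls_by_decade is a non-empty dict mapping a label to a URL list.
    width = max(len(urls) for urls in urls_by_decade.values())
    header = ['Decade'] + [f'Article{i}' for i in range(1, width + 1)]
    with open(csv_path, 'w', newline='', encoding='utf-8') as handle:
        writer = csv.writer(handle)
        writer.writerow(header)
        for decade, urls in urls_by_decade.items():
            writer.writerow([decade] + [fetch(url) for url in urls])
```

With a dictionary such as `{'1980s': [...], '1990s': [...], '2000s': [...]}` of article URLs, calling `write_articles_csv(urls_by_decade, 'NYTArticles.csv')` would produce one row per decade. The `fetch` parameter exists only so the scraping step can be swapped out, for example when testing without network access.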

In [None]:
# Credit to https://www.linkedin.com/pulse/how-do-i-read-csv-file-from-google-drive-using-python-sigmundo/ for explaining how to import the csv file from my drive

downloaded = drive.CreateFile({'id':'1EAlJ1EQ4BpV-D2ZUdgQkE80yuUhByWBB'})
downloaded.GetContentFile('NYTArticles.csv')

In [28]:
import pandas as pd
article_data = pd.read_csv('NYTArticles.csv')
article_data.head()

Unnamed: 0,Decade,Article1,Article2,Article3,Article4,Article5,Article6,Article7,Article8,Article9,Article10,Article11,Article12,Article13,Article14,Article15,Article16,Article17,Article18,Article19,Article20,Article21,Article22,Article23,Article24,Article25
0,1980s,Climatologists say that it may be several deca...,Analysis of the effects of a warmer and drier ...,"A cloud of volcanic debris stretching 13,000 m...","In his announcement, Dr. Sovern cited radical ...","A researcher at the University of Florida, stu...",Despite widespread hardships caused by recent ...,Gases Trap Infrared Radiation\r\n\r\nScientist...,Mankind's activities in increasing the amount ...,"Core Radius Is 2,190 Miles\r\n\r\nThe core rad...",Full text is unavailable for this digitized ar...,AGLOBAL strategy to reduce a potentially dange...,Plans for meeting future energy needs should c...,What can we really say about future climate? T...,"AFTER decades of neglect, environmental issues...",The average cow belches up to 400 liters of me...,"Dr. Hansen, a leading expert on climate, said ...","The report, prepared by the Pacific Northwest ...",As the global sea levels rise by one to three ...,"The nuclear industry, which has has seen the c...",VIRTUALLY all the carbon dioxide emitted by po...,"A. James Wagner, an analyst at the Weather Ser...",The two criteria that have determined the choi...,"Like glass in a greenhouse, these gases are tr...",One futuristic idea is to use giant lasers ato...,The cold war isn't over. But even as the polit...
1,1990s,But the research going on in the Alps is cruci...,The 'best estimate' is about a third lower tha...,Scientists working in Brazil have found the fi...,The report comes at a time when 142 countries ...,The modelers reply that these efficiencies can...,"The latest research extends a study last May, ...","'We appear to be veering off the target, and a...",The human contribution to global warming could...,The evidence mounted last week that man-made g...,American officials have concluded that the Uni...,"Nevertheless, the panel's conclusion marks a w...",The Chinese also are heavy users of such subst...,"Instead, the American manufacturers called for...",The original study of the ice cores showed wha...,"To the Editor:\r\n\r\nS. Fred Singer, in 'Glob...",To the Editor:\r\n\r\nYour Sept. 10 front-page...,"Mr. Karl, who has been agnostic on the questio...","What Was Known\r\n\r\nYears of Research, Littl...",Developing countries make essentially a moral ...,''This paper goes way beyond what has been sho...,Since cutting carbon dioxide emissions would h...,"To the Editor:\r\n\r\nAstonishingly, the recen...",Leaders of environmental groups criticized the...,The information provided by the three groups i...,STUDYING a coral reef among the islands of Ind...
2,2000s,"SYDNEY — This was a good, if distant, vantage ...",COPENHAGEN  On any list of tough sales jobs i...,Whether it’s the manufacturer of the environme...,The glaciers that have long provided water and...,BRUSSELS — European Union leaders agreed on Fr...,'Our goal is to change weathercasts into envir...,In an early indication that the Copenhagen acc...,"BEIJING  Chinese officials, stung by criticis...","Peterborough, England\r\n\r\nFOR the many disa...","The Copenhagen conference, which ended without...",1:14\r\n\r\nChinese Astronauts Arrive at Space...,The price per metric ton of permits to spew ca...,That elsewhere will likely be a much smaller g...,China Presents ‘Challenges’ NATO Chief Says\r\...,"'He was in a hurry home, he said, because they...",The choice of an appropriate social time disco...,"The Inter-American Commission on Human Rights,...",Two miles per gallon may mean nothing more tha...,When you think about the growth of human popul...,'This cottage industry has the potential to be...,"POZNAN, Poland  As ministers from 189 countri...","Along with Jane Mayer of The New Yorker, the o...",CONSERVATIVES don’t support tax increases that...,"In an interview, Dr. Holdren, a 64-year-old ph...",'Just at the time that the U.S. is finally re-...


# Word Clouds


To visualize each decade's coverage, I will use word clouds and look for overarching themes and patterns.
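
A word cloud sizes each word by how often it appears, so a plain frequency count is a quick way to check what a cloud should emphasize. The helper below is a minimal sketch of that idea. The name `top_words` and its simple regex tokenizer are assumptions for illustration, not part of the analysis in this notebook.

```python
import re
from collections import Counter


def top_words(text, stopwords, n=10):
    """Return the n most frequent non-stopword tokens in text."""
    # Lowercase the text and keep runs of letters as tokens.
    tokens = re.findall(r"[a-z]+", text.lower())
    # Skip stopwords and very short tokens, then count what remains.
    counts = Counter(t for t in tokens if t not in stopwords and len(t) > 2)
    return counts.most_common(n)
```

Passing one decade's combined article text and a stopword collection returns `(word, count)` pairs that can be compared against the largest words in the corresponding cloud.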

In [29]:
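from wordcloud import WordCloud, STOPWORDS
import gensim
from gensim.utils import simple_preprocess
# gensim's STOPWORDS is not imported: it would shadow wordcloud's STOPWORDS,
# and the NLTK stopword list (english_stopwords) is the one used for filtering.
import matplotlib.pyplot as plt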

def show_wordcloud(data, title = ""):
    wordcloud = WordCloud(
        background_color='white',
        max_words=300,
        max_font_size=50, 
        scale=3,
        random_state=1 
    ).generate(str(data))

    fig = plt.figure(1, figsize=(12, 12))
    plt.axis('off')
    fig.suptitle(title, fontsize=20)
    fig.subplots_adjust(top=2.3)
    plt.imshow(wordcloud)
    plt.show()

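def ep_words(season):
    # `season` is a DataFrame of decade rows; every column whose name starts
    # with "Article" holds the text of one article.
    article_columns = [c for c in season.columns if c.startswith('Article')]
    result = []
    for i in range(len(season.index)):
        # Join all article texts in this row, skipping missing values.
        text = ' '.join(season.iloc[i][article_columns].dropna().astype(str))
        for token in gensim.utils.simple_preprocess(text):
            if token not in english_stopwords:
                result.append(token)
    return " ".join(result)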

In [30]:
import nltk
nltk.download('stopwords')
from nltk.corpus import stopwords
english_stopwords = stopwords.words('english')

[nltk_data] Downloading package stopwords to /root/nltk_data...
[nltk_data]   Package stopwords is already up-to-date!


In [None]:
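# Select each decade's row by its label in the 'Decade' column.
eighties = article_data[article_data['Decade'] == '1980s']
nineties = article_data[article_data['Decade'] == '1990s']
two_thousands = article_data[article_data['Decade'] == '2000s']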

# LDA Topic Modeling

Here, I will use LDA topic modeling to find the topics that occur in the articles from each decade.
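
One possible shape for this step, using gensim, is sketched below. It is an assumption about how the modeling could be done, not a finished analysis. The names `decade_documents` and `fit_lda` are hypothetical, and the values for `num_topics`, `passes`, and `num_words` are placeholders that would need tuning.

```python
def decade_documents(row):
    """Collect the non-empty article texts from one decade row (dict-like)."""
    return [
        value
        for key, value in row.items()
        if str(key).startswith('Article') and isinstance(value, str) and value.strip()
    ]


def fit_lda(documents, stopwords, num_topics=3, passes=10):
    """Fit an LDA model on the documents and return its top words per topic."""
    # Imported here so decade_documents can be used without gensim installed.
    from gensim import corpora, models
    from gensim.utils import simple_preprocess

    stopwords = set(stopwords)
    # Tokenize each article and drop stopwords.
    texts = [[t for t in simple_preprocess(doc) if t not in stopwords]
             for doc in documents]
    # Map tokens to integer ids and build a bag-of-words corpus.
    dictionary = corpora.Dictionary(texts)
    corpus = [dictionary.doc2bow(tokens) for tokens in texts]
    lda = models.LdaModel(corpus=corpus, id2word=dictionary,
                          num_topics=num_topics, passes=passes, random_state=1)
    return lda.print_topics(num_words=8)
```

For example, `fit_lda(decade_documents(article_data.iloc[0]), english_stopwords)` would model the 1980s row. With only 25 articles per decade, the resulting topics are likely to be unstable, so they should be read as rough hints and not as firm findings.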

# Article Count Visualizations

I will create a graphic showing how the number of articles about climate change increases over time. I may do this with a line graph of the count for each year, or with the number of articles per decade.
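
The per-decade version could look something like the sketch below. It assumes a list of publication years is available, which the CSV loaded above does not contain, so that input would have to be collected separately. The function names `count_by_decade` and `plot_counts` are hypothetical.

```python
from collections import Counter


def count_by_decade(years):
    """Map each decade label (e.g. '1980s') to the number of articles in it."""
    counts = Counter(f"{(year // 10) * 10}s" for year in years)
    return dict(sorted(counts.items()))


def plot_counts(counts, title="Climate change articles per decade"):
    """Draw a bar chart of article counts keyed by decade label."""
    # Imported here so count_by_decade can be used without matplotlib.
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots(figsize=(8, 4))
    ax.bar(list(counts.keys()), list(counts.values()))
    ax.set_xlabel("Decade")
    ax.set_ylabel("Number of articles")
    ax.set_title(title)
    plt.show()
```

Calling `plot_counts(count_by_decade(years))` with the collected years would draw the chart. Grouping by `year` directly instead of by decade, and swapping `ax.bar` for `ax.plot`, would give the per-year line graph.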