





# Which countries are least corrupt (2012-2016)? Has this changed over time? How does this relate to media perceptions and reporting? #

### Data sets: Transparency International's Corruption Perceptions Index; The Guardian newspaper ###



Initially, let's take a look at Transparency International's Corruption Perceptions Index (CPI), which can be found here: https://www.transparency.org/research/cpi/overview. For the purposes of this analysis we will be looking at CPI Scores from 2012-2016.




In [1]:
import pandas as pd

# Load the CPI history; df1 is the copy used in the rest of the notebook
df = pd.read_csv('history.csv')
df1 = pd.read_csv('history.csv')
print(len(df))
df.head()

176


Unnamed: 0,CPI 2016 Rank,Country,Country Code,Region,CPI 2016 Score,CPI 2015 Score,CPI 2014 Score,CPI 2013 Score,CPI 2012 Score
0,1,New Zealand,NZL,Asia Pacific,90,88.0,91.0,91.0,90.0
1,1,Denmark,DNK,Europe and Central Asia,90,91.0,92.0,91.0,90.0
2,3,Finland,FIN,Europe and Central Asia,89,90.0,89.0,89.0,90.0
3,4,Sweden,SWE,Europe and Central Asia,88,89.0,87.0,89.0,88.0
4,5,Switzerland,CHE,Europe and Central Asia,86,86.0,86.0,85.0,86.0


### GETTING TO KNOW THE DATA


The CSV is now imported into a data frame, so we can see what the data looks like and which keys (columns) it contains.

In [2]:
df1.keys()


Index(['CPI 2016 Rank', 'Country', 'Country Code', 'Region', 'CPI 2016 Score',
       'CPI 2015 Score', 'CPI 2014 Score', 'CPI 2013 Score', 'CPI 2012 Score'],
      dtype='object')

In [3]:
Countries = df1['Country']
print(Countries)

0                           New Zealand
1                               Denmark
2                               Finland
3                                Sweden
4                           Switzerland
5                                Norway
6                             Singapore
7                           Netherlands
8                                Canada
9                               Germany
10                           Luxembourg
11                       United Kingdom
12                            Australia
13                              Iceland
14                            Hong Kong
15                              Belgium
16                              Austria
17         The United States of America
18                              Ireland
19                                Japan
20                              Uruguay
21                              Estonia
22                               France
23                              Bahamas
24                                Chile


In [4]:
df1[['Country', 'CPI 2016 Score']]

Unnamed: 0,Country,CPI 2016 Score
0,New Zealand,90
1,Denmark,90
2,Finland,89
3,Sweden,88
4,Switzerland,86
5,Norway,85
6,Singapore,84
7,Netherlands,83
8,Canada,82
9,Germany,81


So, there are a total of 176 countries in the Transparency International Corruption Perceptions Index.

It would be good to visualize some of this data on a map, but we will need longitude and latitude points for that. So, let's import some data that we can use:


In [5]:
import pandas as pd

# World cities with coordinates; LL_df is the copy used below
df = pd.read_csv('simplemaps-worldcities-basic.csv')
LL_df = pd.read_csv('simplemaps-worldcities-basic.csv')
print(len(df))
df.head()


7322


Unnamed: 0,city,city_ascii,lat,lng,pop,country,iso2,iso3,province
0,Qal eh-ye Now,Qal eh-ye,34.983,63.1333,2997.0,Afghanistan,AF,AFG,Badghis
1,Chaghcharan,Chaghcharan,34.516701,65.250001,15000.0,Afghanistan,AF,AFG,Ghor
2,Lashkar Gah,Lashkar Gah,31.582998,64.36,201546.0,Afghanistan,AF,AFG,Hilmand
3,Zaranj,Zaranj,31.112001,61.886998,49851.0,Afghanistan,AF,AFG,Nimroz
4,Tarin Kowt,Tarin Kowt,32.633298,65.866699,10000.0,Afghanistan,AF,AFG,Uruzgan


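This gives multiple cities for every country, but we only want one row per country. Calling `groupby` on its own only returns a GroupBy object, so instead we drop the duplicate country rows and keep the first city listed for each: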

In [6]:
LL_df.groupby('country')

<pandas.core.groupby.groupby.DataFrameGroupBy object at 0x112003f98>

In [7]:
df2=LL_df.drop_duplicates('country', keep='first', inplace=False)
df2.head()


Unnamed: 0,city,city_ascii,lat,lng,pop,country,iso2,iso3,province
0,Qal eh-ye Now,Qal eh-ye,34.983,63.1333,2997.0,Afghanistan,AF,AFG,Badghis
33,Mariehamn,Mariehamn,60.096996,19.949004,10682.0,Aland,AX,ALD,Finström
34,Kruje,Kruje,41.518998,19.797004,21286.0,Albania,AL,ALB,Durrës
60,Jijel,Jijel,36.821997,5.766004,148000.0,Algeria,DZ,DZA,Jijel
111,Pago Pago,Pago Pago,-14.276611,-170.706645,12038.0,American Samoa,AS,ASM,


That's better! 
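Keeping the first row is quick, but the city that represents each country is then simply whichever one happens to be listed first. A minimal sketch of one alternative, assuming the same `country`, `lat`, `lng` and `pop` columns: keep the most populous city per country. The small `cities` frame below is made-up illustration data, not the simplemaps file.

```python
import pandas as pd

# Made-up rows standing in for the world cities file (illustration only)
cities = pd.DataFrame({
    'city': ['A-town', 'B-ville', 'C-city', 'D-burg'],
    'country': ['Aland', 'Aland', 'Borduria', 'Borduria'],
    'lat': [60.1, 60.3, 45.0, 46.0],
    'lng': [19.9, 20.1, 20.0, 21.0],
    'pop': [10000.0, 500.0, 2000.0, 90000.0],
})

# Sort by population (largest first), then keep the first row per country,
# which is now the most populous city rather than an arbitrary one.
one_per_country = (
    cities.sort_values('pop', ascending=False)
          .drop_duplicates('country', keep='first')
          .sort_values('country')
          .reset_index(drop=True)
)
print(one_per_country[['country', 'city', 'lat', 'lng']])
```

Either way the point is only a rough marker position for the country, so the choice matters mostly for large countries.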

In [8]:
df2.keys()

Index(['city', 'city_ascii', 'lat', 'lng', 'pop', 'country', 'iso2', 'iso3',
       'province'],
      dtype='object')

In [9]:
df1.keys()


Index(['CPI 2016 Rank', 'Country', 'Country Code', 'Region', 'CPI 2016 Score',
       'CPI 2015 Score', 'CPI 2014 Score', 'CPI 2013 Score', 'CPI 2012 Score'],
      dtype='object')

Looking at CPI Scores for 2016 only, we can remove some columns to make it simpler to merge with our longitude and latitude dataframe.

In [10]:
df3=df1.drop(columns=['Country Code', 'Region', 'CPI 2015 Score', 'CPI 2014 Score', 'CPI 2013 Score', 'CPI 2012 Score',])


In [11]:
df3.head()

Unnamed: 0,CPI 2016 Rank,Country,CPI 2016 Score
0,1,New Zealand,90
1,1,Denmark,90
2,3,Finland,89
3,4,Sweden,88
4,5,Switzerland,86


In [12]:
df2.keys() 

Index(['city', 'city_ascii', 'lat', 'lng', 'pop', 'country', 'iso2', 'iso3',
       'province'],
      dtype='object')

We can simplify this too:

In [13]:
df4=df2.drop(columns=['city','city_ascii','pop','iso2','iso3',
       'province',])
df4.head()

Unnamed: 0,lat,lng,country
0,34.983,63.1333,Afghanistan
33,60.096996,19.949004,Aland
34,41.518998,19.797004,Albania
60,36.821997,5.766004,Algeria
111,-14.276611,-170.706645,American Samoa


And now we can merge, so that we have a new data frame with CPI 2016 Scores, longitude and latitude!

In [14]:
merged_left = pd.merge(left=df3,right=df4, how='left', left_on='Country', right_on='country')

merged_left

Unnamed: 0,CPI 2016 Rank,Country,CPI 2016 Score,lat,lng,country
0,1,New Zealand,90,-42.472750,171.208725,New Zealand
1,1,Denmark,90,55.709001,9.534996,Denmark
2,3,Finland,89,60.996996,24.472000,Finland
3,4,Sweden,88,60.613002,15.647005,Sweden
4,5,Switzerland,86,47.369997,7.344999,Switzerland
5,6,Norway,85,58.464756,8.766001,Norway
6,7,Singapore,84,1.293033,103.855821,Singapore
7,8,Netherlands,83,53.000001,6.550003,Netherlands
8,9,Canada,82,50.150025,-96.883322,Canada
9,10,Germany,81,49.982472,8.273219,Germany


In [15]:
df5=pd.merge(left=df3,right=df4, how='left', left_on='Country', right_on='country')

df5.head()

Unnamed: 0,CPI 2016 Rank,Country,CPI 2016 Score,lat,lng,country
0,1,New Zealand,90,-42.47275,171.208725,New Zealand
1,1,Denmark,90,55.709001,9.534996,Denmark
2,3,Finland,89,60.996996,24.472,Finland
3,4,Sweden,88,60.613002,15.647005,Sweden
4,5,Switzerland,86,47.369997,7.344999,Switzerland
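A left merge keeps every CPI row, but a country whose name is spelled differently in the two files ends up with missing coordinates. A small sketch of how this could be checked, using the `indicator` option of `pd.merge`. The two frames are toy stand-ins for `df3` and `df4`, and the `Korea, North` spelling and its coordinates are assumptions made up for the illustration.

```python
import pandas as pd

# Toy stand-ins for df3 (CPI scores) and df4 (coordinates); values are illustrative
cpi = pd.DataFrame({
    'Country': ['Denmark', 'Korea (North)', 'Sweden'],
    'CPI 2016 Score': [90, 12, 88],
})
coords = pd.DataFrame({
    'country': ['Denmark', 'Korea, North', 'Sweden'],  # hypothetical spelling
    'lat': [55.709001, 39.0, 60.613002],
    'lng': [9.534996, 125.75, 15.647005],
})

# indicator=True adds a '_merge' column saying where each row was found
checked = pd.merge(cpi, coords, how='left',
                   left_on='Country', right_on='country', indicator=True)

# Rows marked 'left_only' had no matching name in the coordinates table
unmatched = checked.loc[checked['_merge'] == 'left_only', 'Country'].tolist()
print(unmatched)
```

Any names listed this way could then be renamed in one of the frames before merging again.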


Now, let's visualize this on a map.

In [16]:
import folium
from folium import plugins
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns


In [17]:
m = folium.Map([41.8781, -87.6298], zoom_start=2)
m

In [18]:
df5.head()


Unnamed: 0,CPI 2016 Rank,Country,CPI 2016 Score,lat,lng,country
0,1,New Zealand,90,-42.47275,171.208725,New Zealand
1,1,Denmark,90,55.709001,9.534996,Denmark
2,3,Finland,89,60.996996,24.472,Finland
3,4,Sweden,88,60.613002,15.647005,Sweden
4,5,Switzerland,86,47.369997,7.344999,Switzerland


In [19]:
df5.keys()


Index(['CPI 2016 Rank', 'Country', 'CPI 2016 Score', 'lat', 'lng', 'country'], dtype='object')

In [20]:
df5.keys()
str(df5['CPI 2016 Score'])

'0      90\n1      90\n2      89\n3      88\n4      86\n5      85\n6      84\n7      83\n8      82\n9      81\n10     81\n11     81\n12     79\n13     78\n14     77\n15     77\n16     75\n17     74\n18     73\n19     72\n20     71\n21     70\n22     69\n23     66\n24     66\n25     66\n26     65\n27     64\n28     62\n29     62\n       ..\n146    26\n147    26\n148    26\n149    26\n150    25\n151    25\n152    24\n153    22\n154    22\n155    21\n156    21\n157    21\n158    20\n159    20\n160    20\n161    20\n162    20\n163    18\n164    18\n165    17\n166    17\n167    16\n168    15\n169    14\n170    14\n171    14\n172    13\n173    12\n174    11\n175    10\nName: CPI 2016 Score, Length: 176, dtype: int64'

### Map to show the CPI Scores for 2016; the higher the CPI score, the lower the levels of corruption

(Zoom out)

In [21]:
map_test = folium.Map(location=[60.613002, 15.647005],
                      tiles = "Stamen Toner")

# Draw one circle per country; the radius scales with the 2016 CPI score
for index, row in df5.iterrows():
    Lat = row['lat']
    Long = row['lng']
    popup = row['Country']
    value = row['CPI 2016 Score']
    try:
        Marker = folium.Circle([Lat, Long], popup=popup, radius=value*3000,
                               color='#3db7e4', fill=True, fill_color='#3db7e4')
    except ValueError:
        # Intended to skip countries with no coordinates after the merge
        continue
    Marker.add_to(map_test)
map_test
# TODO: add a title to the map and change the initial zoom area


I would like to compare this map to similar maps showing the CPI scores for different years. This will show whether the areas with the lowest levels of corruption (the highest scores) stay the same or change over time. To do this, I will need to merge the CPI Scores from all years to create a new dataframe (df7).

In [22]:
df1.keys()

Index(['CPI 2016 Rank', 'Country', 'Country Code', 'Region', 'CPI 2016 Score',
       'CPI 2015 Score', 'CPI 2014 Score', 'CPI 2013 Score', 'CPI 2012 Score'],
      dtype='object')

In [23]:
df6=df1.drop(columns=['Country Code', 'Region', 'CPI 2016 Score',])
df6.head()

Unnamed: 0,CPI 2016 Rank,Country,CPI 2015 Score,CPI 2014 Score,CPI 2013 Score,CPI 2012 Score
0,1,New Zealand,88.0,91.0,91.0,90.0
1,1,Denmark,91.0,92.0,91.0,90.0
2,3,Finland,90.0,89.0,89.0,90.0
3,4,Sweden,89.0,87.0,89.0,88.0
4,5,Switzerland,86.0,86.0,85.0,86.0


In [24]:
df7=pd.merge(left=df5,right=df6, how='left', left_on='Country', right_on='Country')

df7.head()

Unnamed: 0,CPI 2016 Rank_x,Country,CPI 2016 Score,lat,lng,country,CPI 2016 Rank_y,CPI 2015 Score,CPI 2014 Score,CPI 2013 Score,CPI 2012 Score
0,1,New Zealand,90,-42.47275,171.208725,New Zealand,1,88.0,91.0,91.0,90.0
1,1,Denmark,90,55.709001,9.534996,Denmark,1,91.0,92.0,91.0,90.0
2,3,Finland,89,60.996996,24.472,Finland,3,90.0,89.0,89.0,90.0
3,4,Sweden,88,60.613002,15.647005,Sweden,4,89.0,87.0,89.0,88.0
4,5,Switzerland,86,47.369997,7.344999,Switzerland,5,86.0,86.0,85.0,86.0


### Map to show the CPI Scores for 2013; the higher the CPI score, the lower the levels of corruption

##### This serves as a comparison to the CPI 2016 Scores map 


In [25]:
map_test = folium.Map(location=[60.613002, 15.647005],
                      tiles = "Stamen Toner")

# Same map as for 2016, but sized by the 2013 CPI score
for index, row in df7.iterrows():
    Lat = row['lat']
    Long = row['lng']
    popup = row['Country']
    value = row['CPI 2013 Score']
    try:
        Marker = folium.Circle([Lat, Long], popup=popup, radius=value*3000,
                               color='#2ca02c', fill=True, fill_color='#2ca02c')
    except ValueError:
        # Intended to skip rows with missing coordinates or a missing 2013 score
        continue
    Marker.add_to(map_test)
map_test

This isn't a very useful comparison, as there is less data for 2013. However, it does show a concentration of lower levels of corruption in Northern Europe, a trend that can also be seen in the 2016 data.
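How much data is missing in each year can be checked by counting the empty scores in each year column. The sketch below does this on made-up rows that reuse the notebook's column names; the scores are illustrative and not taken from the CPI file.

```python
import numpy as np
import pandas as pd

# Made-up rows with the same column naming as df7 (illustration only)
scores = pd.DataFrame({
    'Country': ['X', 'Y', 'Z'],
    'CPI 2016 Score': [90, 11, 40],
    'CPI 2013 Score': [91.0, 14.0, np.nan],
    'CPI 2012 Score': [90.0, np.nan, np.nan],
})

# Pick out the per-year score columns by name
year_cols = [c for c in scores.columns
             if c.startswith('CPI') and c.endswith('Score')]

# isna().sum() counts the missing scores in each year column
missing = scores[year_cols].isna().sum()
print(missing)
```

Running the same two lines on `df7` would show how many countries have no score in each year.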



### Map to show the top CPI Scores for 2016, to demonstrate the concentration in Northern Europe


In [26]:
map_test = folium.Map(location=[60.613002, 15.647005],
                      tiles = "Stamen Toner")

folium.Marker([-42.472750, 171.20872], popup='New Zealand: CPI Score of 90').add_to(map_test)
folium.Marker([55.709001, 9.534996], popup='Denmark: CPI Score of 90').add_to(map_test)
folium.Marker([60.996999, 24.472000], popup='Finland: CPI Score of 89').add_to(map_test)
folium.Marker([60.613002, 15.647005], popup='Sweden: CPI Score of 88').add_to(map_test)
folium.Marker([47.369997, 7.344999], popup='Switzerland: CPI Score of 86').add_to(map_test)
folium.Marker([58.464756, 8.766001], popup='Norway: CPI Score of 85').add_to(map_test)

map_test

This is better! This shows the six countries with the highest CPI Scores in 2016. You can click on the points to see the CPI for each country. There is a clear concentration in Northern Europe.

It would be interesting to see if there is any media coverage of corruption levels in the European countries shown on this map: Sweden, Finland, Norway, Denmark and Switzerland.

Using the Guardian's API we can filter articles that may be of interest. 

In [27]:
import json
import requests
from os import makedirs
from os.path import join, exists
from datetime import date, timedelta


ARTICLES_DIR = join('tempdata', 'articles')
makedirs(ARTICLES_DIR, exist_ok=True)
# I'd like to see articles that mention corruption and the northern European
# countries that scored highest on the CPI (the least corrupt countries).
# I have included only articles with the tag finance, and initially looked
# over only two days.
API_ENDPOINT = 'http://content.guardianapis.com/search?q=corruption%20AND%20norway&tag=finance'
my_params ={
    'from-date': "",
    'to-date': "",
    'order-by': "newest",
    'show-fields': 'all',
    'page-size': 200,
    'api-key': "aa1b9597-f873-462d-a76f-f12e040388b1"
}


# day iteration from here: http://stackoverflow.com/questions/7274267/print-all-day-dates-between-two-dates
start_date = date(2016,2, 1)
end_date = date(2016,10, 3)
dayrange = range((end_date - start_date).days + 1)
for daycount in dayrange:
    dt = start_date + timedelta(days=daycount)
    datestr = dt.strftime('%Y-%m-%d')
    fname = join(ARTICLES_DIR, datestr + '.json')
    if not exists(fname):
        # Downloading
        print("Downloading", datestr)
        all_results = []
        my_params['from-date'] = datestr
        my_params['to-date'] = datestr
        current_page = 1
        total_pages = 1
        while current_page <= total_pages:
            print("...page", current_page)
            my_params['page'] = current_page
            resp = requests.get(API_ENDPOINT, my_params)
            data = resp.json()
            all_results.extend(data['response']['results'])
            # if there is more than one page
            current_page += 1
            total_pages = data['response']['pages']

        # Write once per day, after every page has been collected
        with open(fname, 'w') as f:
            print("Writing to", fname)
            # re-serializing
            f.write(json.dumps(all_results, indent=2))
                

# TODO: make a function to read files


Now that I have this information, I can look at individual articles and begin some analysis of their content: does it match the CPI assessment of these as the least corrupt countries? What other information could these articles reveal?
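A minimal sketch of how the saved files could be read back for that analysis, assuming each file is a JSON list of search results as written by the download loop. The function names `load_articles` and `headlines` are hypothetical helpers, not part of the notebook; `webTitle` and `webPublicationDate` are fields returned in Guardian search results.

```python
import json
from pathlib import Path


def load_articles(articles_dir):
    """Return a flat list of article dicts from every .json file in articles_dir."""
    articles = []
    for path in sorted(Path(articles_dir).glob('*.json')):
        with open(path) as f:
            day_results = json.load(f)
        for article in day_results:
            # Remember which daily file the article came from
            article['_source_file'] = path.name
            articles.append(article)
    return articles


def headlines(articles):
    """Return (publication date, title) pairs for a list of article dicts."""
    return [(a.get('webPublicationDate', '')[:10], a.get('webTitle', ''))
            for a in articles]
```

For example, `headlines(load_articles(ARTICLES_DIR))` would list the date and title of every downloaded article, which is a reasonable starting point before reading the full text fields.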

In [28]:
df7.keys()
df7.head()

Unnamed: 0,CPI 2016 Rank_x,Country,CPI 2016 Score,lat,lng,country,CPI 2016 Rank_y,CPI 2015 Score,CPI 2014 Score,CPI 2013 Score,CPI 2012 Score
0,1,New Zealand,90,-42.47275,171.208725,New Zealand,1,88.0,91.0,91.0,90.0
1,1,Denmark,90,55.709001,9.534996,Denmark,1,91.0,92.0,91.0,90.0
2,3,Finland,89,60.996996,24.472,Finland,3,90.0,89.0,89.0,90.0
3,4,Sweden,88,60.613002,15.647005,Sweden,4,89.0,87.0,89.0,88.0
4,5,Switzerland,86,47.369997,7.344999,Switzerland,5,86.0,86.0,85.0,86.0


In [29]:
# for index, row in df7.iterrows():
#     CPI15 = df7['CPI 2015 Score']
#     CPI16 = df7['CPI 2016 Score']
#     idx = df7['Country']
#     df = pd.DataFrame({'CPI15': CPI15,
#                        'CPI16': CPI16}, index=idx)
#     ax = df.plot.bar(rot=0) 


In [30]:
df7.head()

Unnamed: 0,CPI 2016 Rank_x,Country,CPI 2016 Score,lat,lng,country,CPI 2016 Rank_y,CPI 2015 Score,CPI 2014 Score,CPI 2013 Score,CPI 2012 Score
0,1,New Zealand,90,-42.47275,171.208725,New Zealand,1,88.0,91.0,91.0,90.0
1,1,Denmark,90,55.709001,9.534996,Denmark,1,91.0,92.0,91.0,90.0
2,3,Finland,89,60.996996,24.472,Finland,3,90.0,89.0,89.0,90.0
3,4,Sweden,88,60.613002,15.647005,Sweden,4,89.0,87.0,89.0,88.0
4,5,Switzerland,86,47.369997,7.344999,Switzerland,5,86.0,86.0,85.0,86.0


In [45]:
import plotly 

plotly.offline.init_notebook_mode(connected=True)

CPI15 = df7['CPI 2015 Score']
CPI16 = df7['CPI 2016 Score']
CPI14 = df7['CPI 2014 Score']
x = df7['Country']

In [51]:
import plotly 

plotly.offline.init_notebook_mode(connected=True)

trace0 = {'type' : 'bar', 'x': df7['Country'], 'y' : CPI16}

plotly.offline.iplot([trace0])

In [57]:
import plotly
import plotly.graph_objs as go

plotly.offline.init_notebook_mode(connected=True)


trace0 = go.Bar(
    x= df7['Country'],
    y=CPI16,
    name='2016',
    marker=dict(
        color='rgb(49,130,189)'
    )
)
trace1 = go.Bar(
    x=df7['Country'],
    y=CPI14,
    name='2014',
    marker=dict(
        color='rgb(204,204,204)',
    )
)

data = [trace0, trace1]
layout = go.Layout(
    xaxis=dict(tickangle=-45),
    barmode='group',
)

fig = go.Figure(data=data, layout=layout)
plotly.offline.iplot(fig, filename='angled-text-bar')

This chart includes all 176 countries; the next one narrows the comparison to the top 50.

In [93]:

import plotly
import plotly.graph_objs as go

plotly.offline.init_notebook_mode(connected=True)

# Limit the comparison to the 50 highest-ranked countries
CPI15 = df7.head(50)['CPI 2015 Score']
CPI16 = df7.head(50)['CPI 2016 Score']
CPI14 = df7.head(50)['CPI 2014 Score']
x = df7.head(50)['Country']


trace0 = go.Bar(
    x= df7.head(50)['Country'],
    y=CPI16,
    name='2016',
    marker=dict(
        color='rgb(253,174,97)'
    )
)
trace1 = go.Bar(
    x=df7.head(50)['Country'],
    y=CPI14,
    name='2014',
    marker=dict(
        color='rgb(204,204,204)',
    )
)

data = [trace0, trace1]
layout = go.Layout(
    xaxis=dict(tickangle=-45),
    barmode='group',
)

fig = go.Figure(data=data, layout=layout)
plotly.offline.iplot(fig, filename='angled-text-bar')

The bar chart above shows that, in general, CPI scores in the top 50 countries do not fluctuate much from year to year. However, the chart also highlights countries where the CPI score fell between 2014 and 2016: Barbados, Qatar, St Vincent and the Grenadines and Cyprus are notable here. It would be interesting to review news articles from the Guardian about these countries from these years.
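Rather than reading the falls off the chart, the change between the two years can be computed and sorted. This is a small sketch on made-up scores with the same column names as `df7`; the `change_14_16` column name is an assumption introduced here.

```python
import pandas as pd

# Made-up scores with the notebook's column names (illustration only)
scores = pd.DataFrame({
    'Country': ['A', 'B', 'C', 'D'],
    'CPI 2014 Score': [74.0, 69.0, 63.0, 50.0],
    'CPI 2016 Score': [61, 61, 57, 52],
})

# Negative values mean the score fell between 2014 and 2016
scores['change_14_16'] = scores['CPI 2016 Score'] - scores['CPI 2014 Score']

# Sort ascending so the largest falls come first
biggest_drops = scores.sort_values('change_14_16').head(3)
print(biggest_drops[['Country', 'change_14_16']])
```

Applied to `df7.head(50)`, the same two steps would rank the top 50 countries by how far their score fell.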

In [None]:
#add Guardian API call in here 

The bar chart below compares the bottom 5 countries for 2014 and 2016:


In [73]:
df7.tail()

Unnamed: 0,CPI 2016 Rank_x,Country,CPI 2016 Score,lat,lng,country,CPI 2016 Rank_y,CPI 2015 Score,CPI 2014 Score,CPI 2013 Score,CPI 2012 Score
171,170,Sudan,14,11.770404,34.349986,Sudan,170,12.0,11.0,11.0,13.0
172,173,Syria,13,32.625,36.105004,Syria,173,18.0,20.0,17.0,26.0
173,174,Korea (North),12,,,,174,8.0,8.0,8.0,8.0
174,175,South Sudan,11,9.233333,29.833333,South Sudan,175,15.0,15.0,14.0,
175,176,Somalia,10,4.183299,43.866703,Somalia,176,8.0,8.0,8.0,8.0


In [92]:
import plotly
import plotly.graph_objs as go

plotly.offline.init_notebook_mode(connected=True)

# Bottom five countries; x and y must cover the same rows
CPI15 = df7.tail()['CPI 2015 Score']
CPI16 = df7.tail()['CPI 2016 Score']
CPI14 = df7.tail()['CPI 2014 Score']
x = df7.tail()['Country']

trace0 = go.Bar(
    x=x,
    y=CPI16,
    name='2016',
    marker=dict(
        color='rgb(253,174,97)'
    )
)
trace1 = go.Bar(
    x=x,
    y=CPI14,
    name='2014',
    marker=dict(
        color='rgb(251,154,153)',
    )
)

data = [trace0, trace1]
layout = go.Layout(
    xaxis=dict(tickangle=-45),
    barmode='group',
)

fig = go.Figure(data=data, layout=layout)
plotly.offline.iplot(fig, filename='angled-text-bar')

And here we're comparing the CPI Scores (2014 and 2016) for the 30 lowest scoring countries:


In [94]:
import plotly
import plotly.graph_objs as go

plotly.offline.init_notebook_mode(connected=True)

# The 30 lowest-ranked countries
CPI15 = df7.tail(30)['CPI 2015 Score']
CPI16 = df7.tail(30)['CPI 2016 Score']
CPI14 = df7.tail(30)['CPI 2014 Score']
x = df7.tail(30)['Country']


trace0 = go.Bar(
    x= df7.tail(30)['Country'],
    y=CPI16,
    name='2016',
    marker=dict(
        color='rgb(227,26,28)'
    )
)
trace1 = go.Bar(
    x=df7.tail(30)['Country'],
    y=CPI14,
    name='2014',
    marker=dict(
        color='rgb(204,204,204)',
    )
)

data = [trace0, trace1]
layout = go.Layout(
    xaxis=dict(tickangle=-45),
    barmode='group',
)

fig = go.Figure(data=data, layout=layout)
plotly.offline.iplot(fig, filename='angled-text-bar')

This is more interesting! South Sudan, Syria, Yemen and Libya all scored lower in 2016 than in 2014, while North Korea's score increased. This indicates a greater perceived level of corruption in those four countries, and a reduced perceived level of corruption in North Korea. Again, this is useful data to feed into API calls, to begin looking for relevant articles that may shed some light on the reasons for these fluctuations!
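One thing to watch when repeating the download for several countries: both download cells in this notebook save to `tempdata/articles/<date>.json` and skip any date whose file already exists, so a second query over overlapping dates would reuse the first query's files. A sketch of one way around this, building the request from a parameter dictionary and caching per country. The helper names `build_request` and `cache_path` and the `GUARDIAN_API_KEY` environment variable are assumptions introduced for illustration.

```python
import os
from os.path import join
from urllib.parse import urlencode

SEARCH_URL = 'https://content.guardianapis.com/search'


def build_request(country, datestr, tag='politics', page=1):
    """Build the search URL for 'corruption AND <country>' on a single day."""
    params = {
        'q': 'corruption AND ' + country,
        'tag': tag,
        'from-date': datestr,
        'to-date': datestr,
        'order-by': 'newest',
        'show-fields': 'all',
        'page-size': 200,
        'page': page,
        # Hypothetical variable name; keeps the key out of the notebook
        'api-key': os.environ.get('GUARDIAN_API_KEY', 'YOUR-KEY'),
    }
    # urlencode takes care of escaping spaces and other special characters
    return SEARCH_URL + '?' + urlencode(params)


def cache_path(base_dir, country, datestr):
    """One sub-folder per country, so different queries never share a file."""
    return join(base_dir, country.lower().replace(' ', '-'), datestr + '.json')
```

With these, a loop over `['Syria', 'South Sudan', 'Yemen', 'Libya']` could request and store each country's articles separately.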

In [95]:
import json
import requests
from os import makedirs
from os.path import join, exists
from datetime import date, timedelta


ARTICLES_DIR = join('tempdata', 'articles')
makedirs(ARTICLES_DIR, exist_ok=True)
# I'd like to see articles that mention corruption and  Syria in 2016
API_ENDPOINT = 'http://content.guardianapis.com/search?q=corruption%20AND%20Syria&tag=politics'
my_params ={
    'from-date': "",
    'to-date': "",
    'order-by': "newest",
    'show-fields': 'all',
    'page-size': 200,
    'api-key': "aa1b9597-f873-462d-a76f-f12e040388b1"
}


# day iteration from here: http://stackoverflow.com/questions/7274267/print-all-day-dates-between-two-dates
start_date = date(2016,1, 1)
end_date = date(2016,12, 12)
dayrange = range((end_date - start_date).days + 1)
for daycount in dayrange:
    dt = start_date + timedelta(days=daycount)
    datestr = dt.strftime('%Y-%m-%d')
    fname = join(ARTICLES_DIR, datestr + '.json')
    if not exists(fname):
        # Downloading
        print("Downloading", datestr)
        all_results = []
        my_params['from-date'] = datestr
        my_params['to-date'] = datestr
        current_page = 1
        total_pages = 1
        while current_page <= total_pages:
            print("...page", current_page)
            my_params['page'] = current_page
            resp = requests.get(API_ENDPOINT, my_params)
            data = resp.json()
            all_results.extend(data['response']['results'])
            # if there is more than one page
            current_page += 1
            total_pages = data['response']['pages']

        # Write once per day, after every page has been collected
        with open(fname, 'w') as f:
            print("Writing to", fname)
            # re-serializing
            f.write(json.dumps(all_results, indent=2))
                

Downloading 2016-01-01
...page 1
Writing to tempdata/articles/2016-01-01.json
Downloading 2016-01-02
...page 1
Writing to tempdata/articles/2016-01-02.json
Downloading 2016-01-03
...page 1
Writing to tempdata/articles/2016-01-03.json
Downloading 2016-01-04
...page 1
Writing to tempdata/articles/2016-01-04.json
Downloading 2016-01-05
...page 1
Writing to tempdata/articles/2016-01-05.json
Downloading 2016-01-06
...page 1
Writing to tempdata/articles/2016-01-06.json
Downloading 2016-01-07
...page 1
Writing to tempdata/articles/2016-01-07.json
Downloading 2016-01-08
...page 1
Writing to tempdata/articles/2016-01-08.json
Downloading 2016-01-09
...page 1
Writing to tempdata/articles/2016-01-09.json
Downloading 2016-01-10
...page 1
Writing to tempdata/articles/2016-01-10.json
Downloading 2016-01-11
...page 1
Writing to tempdata/articles/2016-01-11.json
Downloading 2016-01-12
...page 1
Writing to tempdata/articles/2016-01-12.json
Downloading 2016-01-13
...page 1
Writing to tempdata/articles/20