# Load data about articles from the Guardian Media Group API

## 1. Extract information using the Guardian Media Group API

For more information about using the Guardian Media Group API, see the [documentation](https://open-platform.theguardian.com/documentation/).

To access the API, you need to [sign up for an API key](https://open-platform.theguardian.com/access/), which must be sent with every request.

I generated an API key and saved it in the `api.cfg` file.

Import required libraries:

In [110]:
import configparser
import json
from datetime import datetime, timedelta, date

In [111]:
import requests

In [112]:
import pandas as pd

Load the API key from `api.cfg`:

In [113]:
config = configparser.ConfigParser()
config.read('api.cfg')

['api.cfg']
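The key is later read as `config['API']['KEY']`, so `api.cfg` presumably contains an `[API]` section with a `KEY` entry. The sketch below shows that assumed layout, parsed from a string rather than from the real file; the key value and the names `SAMPLE_CFG`, `config_demo` and `api_key_demo` are placeholders, not part of the notebook.

```python
import configparser

# Assumed layout of api.cfg, inferred from config['API']['KEY'] used later.
# The key value is a placeholder.
SAMPLE_CFG = """
[API]
KEY = your-api-key-here
"""

config_demo = configparser.ConfigParser()
config_demo.read_string(SAMPLE_CFG)

# Fail early with a clear message if the section or option is missing.
if not config_demo.has_option('API', 'KEY'):
    raise KeyError("api.cfg must contain a KEY entry in an [API] section")

api_key_demo = config_demo['API']['KEY']
print(api_key_demo)
```

Note that `config.read()` returns the list of files it managed to parse, so an empty list would mean that `api.cfg` was not found.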

Build the query for the Guardian Media Group API. It requests articles published between `from_date` and `to_date` (here 2019-07-01 to 2019-12-31), ordered from oldest to newest, with keyword tags included:

In [114]:
from_date = '2019-07-01'
to_date = '2019-12-31'

In [115]:
url_querry = 'https://content.guardianapis.com/search?' \
    + 'api-key=' + config['API']['KEY'] + '&' \
    + 'from-date=' + from_date + '&' \
    + 'to-date=' + to_date + '&' \
    + 'type=' + 'article' + '&' \
    + 'show-tags=keyword' + '&' \
    + 'order-by=' + 'oldest'
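Concatenating strings works for these values, but it does not URL-encode anything. As an alternative sketch, the same query can be built from a dictionary with `urllib.parse.urlencode`. The helper name `build_search_url` and the key `'test'` are hypothetical and only serve the illustration.

```python
from urllib.parse import urlencode

BASE_URL = 'https://content.guardianapis.com/search'


def build_search_url(api_key, from_date, to_date, page=None):
    """Hypothetical helper: build the search query from a parameter dict."""
    params = {
        'api-key': api_key,
        'from-date': from_date,
        'to-date': to_date,
        'type': 'article',
        'show-tags': 'keyword',
        'order-by': 'oldest',
    }
    # Only add the page parameter when a specific page is requested.
    if page is not None:
        params['page'] = page
    return BASE_URL + '?' + urlencode(params)


demo_url = build_search_url('test', '2019-07-01', '2019-12-31', page=2)
print(demo_url)
```

`requests.get(BASE_URL, params=params)` performs the same encoding, so the dictionary could also be passed to `requests` directly.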

Send the request and parse the JSON response into a Python dictionary:

In [116]:
response = requests.get(url_querry)
data_json = response.json()

Get the number of result pages for the query:

In [117]:
# could be max only 3800
data_json['response']['pages']

3754

In [118]:
data_json['response']

{'status': 'ok',
 'userTier': 'developer',
 'total': 37534,
 'startIndex': 1,
 'pageSize': 10,
 'currentPage': 1,
 'pages': 3754,
 'orderBy': 'oldest',
 'results': [{'id': 'sport/2019/jul/01/familiar-foes-australia-and-england-prepare-to-do-battle-for-womens-ashes',
   'type': 'article',
   'sectionId': 'sport',
   'sectionName': 'Sport',
   'webPublicationDate': '2019-07-01T00:35:28Z',
   'webTitle': "Familiar foes Australia and England prepare to do battle for Women's Ashes",
   'webUrl': 'https://www.theguardian.com/sport/2019/jul/01/familiar-foes-australia-and-england-prepare-to-do-battle-for-womens-ashes',
   'apiUrl': 'https://content.guardianapis.com/sport/2019/jul/01/familiar-foes-australia-and-england-prepare-to-do-battle-for-womens-ashes',
   'tags': [{'id': 'sport/womens-ashes',
     'type': 'keyword',
     'sectionId': 'sport',
     'sectionName': 'Sport',
     'webTitle': "Women's Ashes",
     'webUrl': 'https://www.theguardian.com/sport/womens-ashes',
     'apiUrl': 'http
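The `pages` value follows from `total` and `pageSize` in the response: 37534 results at 10 per page need 3754 requests. A minimal sketch of that arithmetic is below. The API also has a `page-size` query parameter; the value 50 used here is only illustrative, so check the documentation for the allowed range.

```python
def page_count(total, page_size):
    """Number of pages needed for `total` results (ceiling division)."""
    return -(-total // page_size)


total = 37534  # 'total' from the response shown above

pages_default = page_count(total, 10)  # pageSize reported in the response
pages_larger = page_count(total, 50)   # illustrative larger page size (assumption)

print(pages_default, pages_larger)
```

Fewer, larger pages would mean fewer requests for the same date range.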

Convert first page of response to dataframe `df_response`:

In [119]:
df_response = pd.DataFrame(data_json['response']['results'])

In [120]:
df_response.head()

| | id | type | sectionId | sectionName | webPublicationDate | webTitle | webUrl | apiUrl | tags | isHosted | pillarId | pillarName |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | sport/2019/jul/01/familiar-foes-australia-and-... | article | sport | Sport | 2019-07-01T00:35:28Z | Familiar foes Australia and England prepare to... | https://www.theguardian.com/sport/2019/jul/01/... | https://content.guardianapis.com/sport/2019/ju... | [{'id': 'sport/womens-ashes', 'type': 'keyword... | False | pillar/sport | Sport |
| 1 | business/2019/jul/01/mobile-banking-to-overtak... | article | business | Business | 2019-07-01T01:19:39Z | Mobile banking to overtake high street branch ... | https://www.theguardian.com/business/2019/jul/... | https://content.guardianapis.com/business/2019... | [{'id': 'business/banking', 'type': 'keyword',... | False | pillar/news | News |
| 2 | books/2019/jul/01/im-a-sucker-for-happy-ending... | article | books | Books | 2019-07-01T01:41:35Z | I'm a sucker for happy endings but sometimes I... | https://www.theguardian.com/books/2019/jul/01/... | https://content.guardianapis.com/books/2019/ju... | [{'id': 'books/books', 'type': 'keyword', 'sec... | False | pillar/arts | Arts |
| 3 | sport/commentisfree/2019/jul/01/are-you-for-is... | article | commentisfree | Opinion | 2019-07-01T01:41:46Z | Are you for Israel Folau or against? We love a... | https://www.theguardian.com/sport/commentisfre... | https://content.guardianapis.com/sport/comment... | [{'id': 'sport/israel-folau', 'type': 'keyword... | False | pillar/opinion | Opinion |
| 4 | world/2019/jun/30/fears-of-violence-as-sudan-g... | article | world | World news | 2019-07-01T01:46:41Z | Scores of protesters wounded and seven dead on... | https://www.theguardian.com/world/2019/jun/30/... | https://content.guardianapis.com/world/2019/ju... | [{'id': 'world/sudan', 'type': 'keyword', 'sec... | False | pillar/news | News |


Fetch the remaining pages, from page 2 up to the last page, and append each page's results to `df_response`:

In [121]:
count_of_pages = data_json['response']['pages']
for page in range(2, count_of_pages + 1):
    response = requests.get(url_querry + '&' + 'page=' + str(page))
    data_json = response.json()
    df_page = pd.DataFrame(data_json['response']['results'])
    # Append this page's results to the accumulated dataframe
    df_response = pd.concat([df_response, df_page])
    print('Processed {}/{} - {:2.2%}'.format(page, count_of_pages, page / count_of_pages))

Processed 2/3754 - 0.05%
Processed 3/3754 - 0.08%
Processed 4/3754 - 0.11%
...
Processed 3469/3754 - 92.41%
Processed 3470/3754 - 92.43%
Processed 3471/3754 - 92.46%


KeyError: 'response'

In [122]:
data_json

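{'message': 'API rate limit exceeded'}

The loop stopped at page 3472 because the API returned a rate-limit message instead of a `response` object, so `df_response` contains only pages 1 to 3471.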
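One way to avoid an unhandled `KeyError` is to check each payload for the `response` key, wait and retry a few times, and otherwise stop and report the page to resume from. The sketch below is only an illustration under stated assumptions: `fetch_pages`, its parameters and the fake page function are hypothetical, and waiting will not help if the limit that was hit is a daily quota. It also collects the pages in a list and concatenates once at the end, which avoids copying the growing dataframe on every iteration.

```python
import time

import pandas as pd


def fetch_pages(get_page, start_page, last_page,
                max_retries=3, wait_seconds=60, sleep=time.sleep):
    """Hypothetical helper: collect result pages and stop cleanly on errors.

    get_page(page) must return the parsed JSON (a dict) for that page.
    Returns (dataframe of collected results, next page still to fetch).
    """
    frames = []
    page = start_page
    while page <= last_page:
        payload = get_page(page)
        retries = 0
        # A payload without 'response' is treated as a rate-limit or error message.
        while 'response' not in payload and retries < max_retries:
            sleep(wait_seconds)
            payload = get_page(page)
            retries += 1
        if 'response' not in payload:
            break  # give up; the caller can resume from `page` later
        frames.append(pd.DataFrame(payload['response']['results']))
        page += 1
    df = pd.concat(frames, ignore_index=True) if frames else pd.DataFrame()
    return df, page


# Demo with a fake page function whose second call is rate limited.
calls = {'n': 0}


def fake_get_page(page):
    calls['n'] += 1
    if calls['n'] == 2:
        return {'message': 'API rate limit exceeded'}
    return {'response': {'results': [{'id': 'article-{}'.format(page)}]}}


df_demo, next_page = fetch_pages(fake_get_page, 1, 3, sleep=lambda seconds: None)
print(len(df_demo), next_page)
```

With the real API, `get_page` could be `lambda p: requests.get(url_querry + '&page=' + str(p)).json()`, and the returned `next_page` could be passed as `start_page` in a later run.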

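Reset the index, because each concatenated page carries its own 0 to 9 index:

In [None]:
df_response = df_response.reset_index(drop=True)

Add a sequential article number `id_num`, used later to link tags to their articles:

In [None]:
df_response['id_num'] = range(1, len(df_response) + 1)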

## 2. Save general information about articles

In [None]:
df_article = df_response[['id_num',
               'id', 
               'type', 
               'sectionId',
               'sectionName',
               'webPublicationDate',
               'webTitle',
               'isHosted',
               'pillarId',
               'pillarName'
              ]]

In [None]:
df_article.head()

In [None]:
name_csv = 'general/guardian-article-data_start-date-' + from_date + '_end-date-' + to_date + '_general.csv'

In [None]:
df_article.to_csv(name_csv,index=False,sep=';')
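`to_csv` does not create missing directories, so the `general/` folder has to exist before the file is written. A minimal sketch of creating the parent directory first is shown below, using a temporary directory and a toy dataframe; the names `df_demo`, `out_path` and `demo_general.csv` are placeholders.

```python
import tempfile
from pathlib import Path

import pandas as pd

df_demo = pd.DataFrame({'id_num': [1, 2], 'webTitle': ['First', 'Second']})

with tempfile.TemporaryDirectory() as tmp:
    out_path = Path(tmp) / 'general' / 'demo_general.csv'
    # Create the target folder (and any parents) if it does not exist yet.
    out_path.parent.mkdir(parents=True, exist_ok=True)
    df_demo.to_csv(out_path, index=False, sep=';')
    # Read the file back with the same separator to check the round trip.
    df_back = pd.read_csv(out_path, sep=';')

print(df_back)
```

The same `mkdir` call applies to the `tags/` folder used for the tag file.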

## 3. Save tag information

In [None]:
df_tags = df_response[['id_num',
               'id',
               'tags'
              ]]

In [None]:
df_tags.head()

In [None]:
pd.DataFrame(df_tags.iloc[3]['tags'])

In [None]:
name_json = 'guardian-article-data_start-date-' + from_date + '_end-date-' + to_date + '_pre-tags.json'

In [None]:
df_tags.to_json(name_json,orient="records")

In [None]:
with open(name_json, 'r') as f:
    data = json.load(f)

In [None]:
df_tag = pd.json_normalize(data,
                           record_path=['tags'],
                           meta=['id_num', 'id'],
                           record_prefix='tag_')
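To make the effect of `record_path`, `meta` and `record_prefix` concrete, here is a small sketch on made-up records that mimic the structure of the article data: one output row per tag, with the article's `id_num` and `id` repeated on each row. The records and the name `df_flat` are illustrative only.

```python
import pandas as pd

# Toy records with the same shape as the article data: each has a list of tags.
records = [
    {'id_num': 1, 'id': 'sport/a', 'tags': [
        {'id': 'sport/womens-ashes', 'webTitle': "Women's Ashes"},
        {'id': 'sport/cricket', 'webTitle': 'Cricket'}]},
    {'id_num': 2, 'id': 'business/b', 'tags': [
        {'id': 'business/banking', 'webTitle': 'Banking'}]},
]

df_flat = pd.json_normalize(records,
                            record_path=['tags'],   # one row per element of 'tags'
                            meta=['id_num', 'id'],  # article fields copied to each row
                            record_prefix='tag_')   # avoids clashing with the article 'id'

print(df_flat)
```

Articles with an empty `tags` list produce no rows. The list of records could presumably also be obtained with `df_tags.to_dict(orient='records')`, which would skip the intermediate JSON file.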

In [None]:
df_tag = df_tag[['id_num',
               'id',
               'tag_id',
               'tag_sectionName',
               'tag_webTitle',
               'tag_description'
              ]]

In [None]:
df_tag.head()

In [None]:
name_csv = 'tags/guardian-article-data_start-date-' + from_date + '_end-date-' + to_date + '_tags.csv'

In [None]:
df_tag.to_csv(name_csv,index=False,sep=';')

As a quick check, list all rows for a single tag:

In [None]:
df_tag[df_tag['tag_webTitle'] == 'Justin Trudeau']