# Analyze sentiment in news articles

Important documentation:
- [Python client for News API](https://newsapi.org/docs/client-libraries/python)
- [Sources endpoint](https://newsapi.org/docs/endpoints/sources)
- [Articles endpoint](https://newsapi.org/docs/endpoints/everything)

## Imports and notebook configuration

In [1]:
# basic configuration, put these lines at the top of each notebook
%load_ext autoreload
%autoreload 2
%matplotlib inline

In [33]:
import os
from json import dump

import matplotlib.pyplot as plt
import pandas as pd
from dotenv import load_dotenv
from newsapi import NewsApiClient

In [3]:
plt.rcParams["figure.figsize"] = (16, 9)
pd.options.display.max_columns = None
pd.set_option('mode.chained_assignment', None)
pd.set_option("display.precision", 2)
pd.options.display.max_rows = 100

## Set up API connection

In [6]:
load_dotenv()

True

In [8]:
NEWS_API_KEY = os.getenv("NEWS_API_KEY")

In [10]:
api = NewsApiClient(api_key=NEWS_API_KEY)
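
`os.getenv` returns `None` when the variable is missing, so a misconfigured `.env` file would only surface later as a failed request. A minimal sketch of an early check follows; the helper name `require_env` is hypothetical and not part of the notebook or of any library.

```python
import os


def require_env(name: str) -> str:
    """Return the value of environment variable `name`, or raise if it is unset or empty."""
    # Hypothetical helper (an assumption for illustration, not part of the notebook).
    value = os.getenv(name)
    if not value:
        raise RuntimeError(
            f"Environment variable {name} is not set; check your .env file."
        )
    return value
```

It could be used as `NEWS_API_KEY = require_env("NEWS_API_KEY")` right after `load_dotenv()`.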

## Example: Get data sources

In [11]:
us_sources = api.get_sources(country='us', language='en')

In [18]:
print(len(us_sources['sources']))
us_sources

55


{'status': 'ok',
 'sources': [{'id': 'abc-news',
   'name': 'ABC News',
   'description': 'Your trusted source for breaking news, analysis, exclusive interviews, headlines, and videos at ABCNews.com.',
   'url': 'https://abcnews.go.com',
   'category': 'general',
   'language': 'en',
   'country': 'us'},
  {'id': 'al-jazeera-english',
   'name': 'Al Jazeera English',
   'description': 'News, analysis from the Middle East and worldwide, multimedia and interactives, opinions, documentaries, podcasts, long reads and broadcast schedule.',
   'url': 'http://www.aljazeera.com',
   'category': 'general',
   'language': 'en',
   'country': 'us'},
  {'id': 'ars-technica',
   'name': 'Ars Technica',
   'description': "The PC enthusiast's resource. Power users and the tools they love, without computing religion.",
   'url': 'http://arstechnica.com',
   'category': 'technology',
   'language': 'en',
   'country': 'us'},
  {'id': 'associated-press',
   'name': 'Associated Press',
   'description': 

In [19]:
de_sources = api.get_sources(country='de', language='de')

In [20]:
print(len(de_sources['sources']))
de_sources

10


{'status': 'ok',
 'sources': [{'id': 'bild',
   'name': 'Bild',
   'description': 'Die Seite 1 für aktuelle Nachrichten und Themen, Bilder und Videos aus den Bereichen News, Wirtschaft, Politik, Show, Sport, und Promis.',
   'url': 'http://www.bild.de',
   'category': 'general',
   'language': 'de',
   'country': 'de'},
  {'id': 'der-tagesspiegel',
   'name': 'Der Tagesspiegel',
   'description': 'Nachrichten, News und neueste Meldungen aus dem Inland und dem Ausland - aktuell präsentiert von tagesspiegel.de.',
   'url': 'http://www.tagesspiegel.de',
   'category': 'general',
   'language': 'de',
   'country': 'de'},
  {'id': 'die-zeit',
   'name': 'Die Zeit',
   'description': 'Aktuelle Nachrichten, Kommentare, Analysen und Hintergrundberichte aus Politik, Wirtschaft, Gesellschaft, Wissen, Kultur und Sport lesen Sie auf ZEIT ONLINE.',
   'url': 'http://www.zeit.de/index',
   'category': 'business',
   'language': 'de',
   'country': 'de'},
  {'id': 'focus',
   'name': 'Focus',
   'des
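
Each source entry carries a `category` field, so a quick tally shows how the sources are distributed. The sketch below assumes the response shape shown in the outputs above; the function name `count_by_category` and the three-entry sample are illustrative only.

```python
from collections import Counter


def count_by_category(sources_response: dict) -> Counter:
    """Count sources per category in a response shaped like the output of `api.get_sources`."""
    # Hypothetical helper; entries without a category are grouped under "unknown".
    return Counter(
        source.get("category", "unknown")
        for source in sources_response.get("sources", [])
    )


# Small sample built from the first three US sources shown above.
sample = {
    "status": "ok",
    "sources": [
        {"id": "abc-news", "category": "general"},
        {"id": "al-jazeera-english", "category": "general"},
        {"id": "ars-technica", "category": "technology"},
    ],
}
category_counts = count_by_category(sample)
print(category_counts.most_common())  # [('general', 2), ('technology', 1)]
```

Passing `us_sources` or `de_sources` instead of `sample` would give the tally for the full lists.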

In [34]:
with open('../data/news-sentiment/us_sources.json', 'w') as f:
    dump(us_sources, f, indent=2)

In [35]:
with open('../data/news-sentiment/de_sources.json', 'w') as f:
    dump(de_sources, f, indent=2)
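
The two cells above assume that `../data/news-sentiment/` already exists. A possible variant, sketched here under that assumption, creates the directory when needed and writes UTF-8 so that German umlauts stay readable in the file. The helper name `save_json` is hypothetical.

```python
import json
from pathlib import Path


def save_json(data: dict, path: str) -> Path:
    """Write `data` as indented JSON to `path`, creating parent directories if missing."""
    # Hypothetical helper, not part of the notebook.
    target = Path(path)
    target.parent.mkdir(parents=True, exist_ok=True)
    with target.open("w", encoding="utf-8") as f:
        # ensure_ascii=False keeps characters such as "ü" unescaped in the output file.
        json.dump(data, f, indent=2, ensure_ascii=False)
    return target
```

For example, `save_json(de_sources, '../data/news-sentiment/de_sources.json')` would replace the `with open(...)` block.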

## Example: Get news articles

In [24]:
us_source_ids = ','.join([s['id'] for s in us_sources['sources']])
us_source_ids

'abc-news,al-jazeera-english,ars-technica,associated-press,axios,bleacher-report,bloomberg,breitbart-news,business-insider,buzzfeed,cbs-news,cnn,crypto-coins-news,engadget,entertainment-weekly,espn,espn-cric-info,fortune,fox-news,fox-sports,google-news,hacker-news,ign,mashable,medical-news-today,msnbc,mtv-news,national-geographic,national-review,nbc-news,new-scientist,newsweek,new-york-magazine,next-big-future,nfl-news,nhl-news,politico,polygon,recode,reddit-r-all,reuters,techcrunch,techradar,the-american-conservative,the-hill,the-huffington-post,the-next-web,the-verge,the-wall-street-journal,the-washington-post,the-washington-times,time,usa-today,vice-news,wired'

In [25]:
us_news = api.get_everything(q='artificial intelligence', sources=us_source_ids, page_size=100)

In [26]:
us_news

{'status': 'ok',
 'totalResults': 486,
 'articles': [{'source': {'id': 'techcrunch', 'name': 'TechCrunch'},
   'author': 'Rita Liao',
   'title': 'Imint: the Swedish firm that gives Chinese smartphones an edge in video production',
   'description': 'If your phone takes amazing photos, chances are its camera has been augmented by artificial intelligence embedded in the operating system. Now videos are getting the same treatment. In recent years, smartphone makers have been gradually transforming their cam…',
   'url': 'http://techcrunch.com/2020/07/26/imint-vidhance-profile/',
   'urlToImage': 'https://techcrunch.com/wp-content/uploads/2020/07/imint-e1595731081769.jpeg?w=750',
   'publishedAt': '2020-07-26T12:10:06Z',
   'content': 'If your phone takes amazing photos, chances are its camera has been augmented by artificial intelligence embedded in the operating system. Now videos are getting the same treatment.\r\nIn recent years,… [+6948 chars]'},
  {'source': {'id': 'techcrunch', 'na

In [27]:
len(us_news['articles'])

100
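
Only 100 of the 486 reported results were returned, because `page_size` caps a single response. One way to collect more is to request successive pages. The sketch below is written against a generic `fetch_page` callable so it can be tried without network access; `fetch_all_articles` and `max_pages` are hypothetical names, and depending on the API plan, requests beyond the first page may be rejected.

```python
from typing import Callable, Dict, List


def fetch_all_articles(fetch_page: Callable[[int], Dict], max_pages: int = 5) -> List[Dict]:
    """Collect articles page by page until all results are fetched or `max_pages` is reached."""
    # Hypothetical helper: `fetch_page(page)` must return a dict with
    # "articles" and "totalResults", like the responses shown in this notebook.
    articles: List[Dict] = []
    for page in range(1, max_pages + 1):
        response = fetch_page(page)
        batch = response.get("articles", [])
        articles.extend(batch)
        # Stop on an empty page or once everything reported has been collected.
        if not batch or len(articles) >= response.get("totalResults", 0):
            break
    return articles
```

With the client from above, the callable could be `lambda page: api.get_everything(q='artificial intelligence', sources=us_source_ids, page_size=100, page=page)`.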

In [28]:
de_source_ids = ','.join([s['id'] for s in de_sources['sources']])
de_source_ids

'bild,der-tagesspiegel,die-zeit,focus,gruenderszene,handelsblatt,spiegel-online,t3n,wired-de,wirtschafts-woche'

In [29]:
de_news = api.get_everything(q='künstliche intelligenz', sources=de_source_ids, page_size=100)

In [30]:
len(de_news['articles'])

100

In [31]:
de_news

{'status': 'ok',
 'totalResults': 102,
 'articles': [{'source': {'id': 'spiegel-online', 'name': 'Spiegel Online'},
   'author': 'Sonja Peteranderl',
   'title': 'Künstliche Intelligenz als Krisenhelfer: Software, die Fluchtbewegungen vorhersieht',
   'description': 'Einem Algorithmus zufolge könnte die Corona-Pandemie eine Million Menschen in der Sahelzone aus ihrer Heimat vertreiben. Solche Vorhersagen helfen humanitären Organisationen, Trends zu erkennen und schneller einzugreifen.',
   'url': 'https://www.spiegel.de/politik/ausland/kuenstliche-intelligenz-als-krisenhelfer-software-die-fluchtbewegungen-vorhersieht-a-a8b6f194-cc33-4284-bac9-4f0326c0e47f',
   'urlToImage': 'https://cdn.prod.www.spiegel.de/images/0c62c9bc-3726-4a6f-a6cd-1aa4ef222f35_w1280_r1.77_fpx55.46_fpy50.jpg',
   'publishedAt': '2020-08-26T20:47:09Z',
   'content': None},
  {'source': {'id': 'die-zeit', 'name': 'Die Zeit'},
   'author': 'ZEIT ONLINE: Digital - Rosa Thoneick',
   'title': 'Künstliche Intelligenz: D

In [36]:
with open('../data/news-sentiment/us_articles.json', 'w') as f:
    dump(us_news, f, indent=2)

In [37]:
with open('../data/news-sentiment/de_articles.json', 'w') as f:
    dump(de_news, f, indent=2)
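
For later analysis the nested article dictionaries are easier to handle as flat rows. The sketch below assumes the response shape shown above and falls back to `description` when `content` is `None`, as in the first German article. The function name `flatten_articles` and the output column names are hypothetical choices.

```python
from typing import Dict, List, Optional


def flatten_articles(response: Dict) -> List[Dict[str, Optional[str]]]:
    """Turn a response shaped like the output of `api.get_everything` into flat rows."""
    # Hypothetical helper, not part of the notebook.
    rows = []
    for article in response.get("articles", []):
        source = article.get("source") or {}
        rows.append(
            {
                "source_id": source.get("id"),
                "title": article.get("title"),
                "published_at": article.get("publishedAt"),
                # Use the description when the API returns no content.
                "text": article.get("content") or article.get("description"),
            }
        )
    return rows
```

The resulting list can be passed directly to `pd.DataFrame(flatten_articles(de_news))`.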