# Get mentions of keywords from several news channels and save them to a CSV file (NewsAPI)
Python Script by Femi Oyebamiji, Data Scientist and AI Leader. 
I am available for further help with using this script or integrating it into a full software solution or product:
08160837781, oyebamijioluwafemi@yahoo.com

In [45]:
import requests
import os
import pandas as pd
from datetime import datetime


NEWS_API_KEY = ''  # your NewsAPI key (see https://newsapi.org)
# List of available news sources: https://newsapi.org/sources


class NewsEngine:
    """Thin wrapper around the NewsAPI /v2/everything endpoint."""

    def __init__(self, topic, NEWS_API_KEY, from_date, to_date, language):
        self.topic = topic
        self.NEWS_API_KEY = NEWS_API_KEY
        self.from_date = from_date
        self.to_date = to_date
        self.language = language

    def fetchNews(self):
        # Pass the query as params so that requests URL-encodes the keywords
        params = {
            "q": self.topic,
            "apiKey": self.NEWS_API_KEY,
            "from": self.from_date,
            "to": self.to_date,
            "pageSize": 100,
            "language": self.language,
        }
        r = requests.get("https://newsapi.org/v2/everything", params=params, timeout=30)
        r.raise_for_status()  # fail early on HTTP error responses
        data = r.json()  # parse the JSON response body
        # Return only the list of articles (empty if the key is missing)
        return data.get("articles", [])

In [46]:
def search_news(topic_news, from_date, to_date, language):
    newsEngine = NewsEngine(topic=topic_news,
                            NEWS_API_KEY=NEWS_API_KEY,
                            from_date=from_date,
                            to_date=to_date,
                            language=language)
    data = newsEngine.fetchNews()

    # Keep the title, description and URL of each article
    data_table = pd.DataFrame({
        'Title': [article.get('title') for article in data],
        'Description': [article.get('description') for article in data],
        'URL': [article.get('url') for article in data],
    })

    # Timestamp used in the output filename, in YYYYMMDD_HHMMSS form
    to_csv_timestamp = datetime.today().strftime('%Y%m%d_%H%M%S')
    # Create the data folder in the working directory if it does not exist
    data_dir = os.path.join(os.getcwd(), 'data')
    os.makedirs(data_dir, exist_ok=True)
    filename = os.path.join(data_dir, to_csv_timestamp + '_NEWS.csv')
    # Store the dataframe as CSV, named by its creation timestamp
    data_table.to_csv(filename, index=False)
    return filename

# Only modify the fields below and run the script

In [47]:
topic_news = "covid19"
from_date = "2020-12-04"
to_date = from_date
language = "en"
search_news(topic_news, from_date, to_date, language)
print("Done")

Done


# Documentation

This script fetches mentions of the keywords you enter from more than 50 news channels using NewsAPI.
To run the script, follow these steps:
1. Supply your NewsAPI key. You can obtain one at https://newsapi.org. The default used while testing this script was a personal key of Femi Oyebamiji, so set NEWS_API_KEY to your own key. The script works with the free version of the NewsAPI key, but a free key can only access news published within the last month.


2. Modify only the last section of the code:
    1. topic_news: the keywords to search for, which can be separated with commas.
    2. from_date: the start date of the news you want to fetch (for example, 2020-12-04).
    3. to_date: the end date of the news you want to fetch.
    4. language: the language of the news to fetch. The default is English (en).
    

3. Run the script: click on the last code section and press Shift+Enter in Jupyter Notebook.
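
The four fields in step 2 can be checked before any request is sent. The sketch below is one possible way to do that using only the standard library; `build_query` is a hypothetical helper that is not part of the original script, and the rules it enforces (non-empty keywords, dates in year-month-day form, start date not after end date) are assumptions about sensible input rather than documented NewsAPI requirements.

```python
from datetime import datetime
from urllib.parse import urlencode

# Endpoint used by the script above.
EVERYTHING_URL = "https://newsapi.org/v2/everything"


def build_query(topic_news, from_date, to_date, language="en", api_key=""):
    """Validate the user-editable fields and return the request URL.

    Hypothetical helper for illustration, not part of the original script.
    """
    if not topic_news.strip():
        raise ValueError("topic_news must not be empty")
    # strptime raises ValueError if the string does not parse
    # as a year-month-day date.
    start = datetime.strptime(from_date, "%Y-%m-%d").date()
    end = datetime.strptime(to_date, "%Y-%m-%d").date()
    if start > end:
        raise ValueError("from_date must not be later than to_date")
    params = {
        "q": topic_news,
        "from": start.isoformat(),
        "to": end.isoformat(),
        "language": language,
        "pageSize": 100,
        "apiKey": api_key,
    }
    # urlencode percent-encodes commas and other special characters
    # and turns spaces into plus signs.
    return EVERYTHING_URL + "?" + urlencode(params)
```

Calling `build_query("covid19", "2020-12-04", "2020-12-04")` returns the full URL, while a misspelled date or a reversed date range raises `ValueError` before the API is contacted.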


The result is written automatically to a CSV file (which can be opened in Excel) in the data folder, named with the current timestamp followed by NEWS.
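
Because each run writes a new timestamped file, it can be handy to pick up the most recent one programmatically. The following is a minimal sketch using only the standard library; `load_latest_csv` is a hypothetical helper that is not part of the original script, and it assumes the files live in a folder called data and have the Title, Description and URL columns written by the script.

```python
import csv
import glob
import os


def load_latest_csv(folder="data"):
    """Return the rows of the most recently modified CSV file in `folder`.

    Hypothetical helper for illustration, not part of the original script.
    """
    paths = glob.glob(os.path.join(folder, "*.csv"))
    if not paths:
        raise FileNotFoundError(f"no CSV files found in {folder!r}")
    # Pick the file with the newest modification time.
    latest = max(paths, key=os.path.getmtime)
    # utf-8 matches the default encoding pandas uses when writing CSV files.
    with open(latest, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))
```

Each returned row is a dictionary keyed by column name, so `rows[0]["Title"]` gives the title of the first article. Reading the same file with `pd.read_csv` would work equally well inside the notebook.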

This script can also be integrated into a software solution.
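
One simple form of integration is a command-line wrapper, so the script can be run outside a notebook. The sketch below is only an illustration under stated assumptions: `parse_args` and `get_api_key` are hypothetical helpers, the option names are invented for this example, and reading the key from an environment variable named NEWS_API_KEY is an assumption rather than something the original script does.

```python
import argparse
import os


def parse_args(argv=None):
    """Parse the same four fields that the last notebook cell sets by hand."""
    parser = argparse.ArgumentParser(
        description="Fetch news mentions and save them as CSV.")
    parser.add_argument("topic_news", help="keywords to search for")
    parser.add_argument("--from-date", required=True,
                        help="start date, YYYY-MM-DD")
    parser.add_argument("--to-date",
                        help="end date, YYYY-MM-DD (defaults to --from-date)")
    parser.add_argument("--language", default="en",
                        help="language code of the news to fetch")
    args = parser.parse_args(argv)
    # Mirror the notebook, where to_date is set equal to from_date.
    if args.to_date is None:
        args.to_date = args.from_date
    return args


def get_api_key(env=os.environ):
    """Read the key from the environment (assumed variable name)."""
    key = env.get("NEWS_API_KEY", "")
    if not key:
        raise RuntimeError("set the NEWS_API_KEY environment variable")
    return key
```

In a standalone script, the parsed values could then be passed on as `search_news(args.topic_news, args.from_date, args.to_date, args.language)`, which keeps the key out of the source file.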