**NEWSAPI**

Documentation
News API is a simple HTTP REST API for searching and retrieving live articles from all over the web.

Endpoints

News API has 2 main endpoints:

*  **Top headlines /v2/top-headlines** - returns breaking news headlines for a country and category, or the headlines currently running on one or more sources. It suits news tickers or anywhere you want to display live, up-to-date news headlines and images.

*  **Everything /v2/everything** - we index every recent news and blog article published by over 50,000 different sources large and small, and you can search through them with this endpoint. This endpoint is better suited for news analysis and article discovery, but can be used to retrieve articles for display too.

We also have a minor endpoint that can be used to retrieve a small subset of the publishers we index from:

*   **Sources /v2/sources** - returns information (including name, description, and category) about the most notable sources we index. This list could be piped directly through to your users when showing them some of the options available.

You need an API key to use the API - this is a unique key that identifies your requests. 
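Since the key identifies your requests, it is usually better kept out of the notebook itself. The following is a minimal sketch, not part of News API: it assumes the key is stored in an environment variable named `NEWSAPI_KEY` (a hypothetical name) and uses a hypothetical helper, `build_url`, to encode the query string with the standard library.

```python
import os
from urllib.parse import urlencode

BASE_URL = "https://newsapi.org/v2/"

def build_url(endpoint, api_key, **params):
    """Return the full request URL for an endpoint such as 'everything'."""
    # Drop parameters left as None so they do not appear in the query string.
    query = {name: value for name, value in params.items() if value is not None}
    query["apiKey"] = api_key
    return BASE_URL + endpoint + "?" + urlencode(query)

# NEWSAPI_KEY is an assumed variable name; "YOUR_API_KEY" is only a placeholder.
api_key = os.environ.get("NEWSAPI_KEY", "YOUR_API_KEY")
url = build_url("everything", api_key, q="samsung", language="en", pageSize=30)
print(url)
```

The resulting string can be passed to `requests.get(url)`. Because `from` is a Python keyword, a date filter would have to be passed as `**{"from": "2020-05-30"}`.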


In [0]:
import requests
import pandas as pd

# NEWSAPI is also able to provide headlines according to keywords


# **Everything /v2/everything**

Search through millions of articles from over 50,000 large and small news sources and blogs. This includes breaking news as well as less prominent articles.

This endpoint suits article discovery and analysis, but can be used to retrieve articles for display, too.

Suppose we want to find news articles that contain the keyword "samsung".

In [0]:
#build the URL: BBC articles mentioning samsung from 30 May to 1 June 2020, sorted by popularity
url = ('https://newsapi.org/v2/everything?' #search everything
       'q=samsung&'
       #q - keywords or phrases to search for in the article title and body
       #qInTitle - keywords or phrases to search for in the article title only
       'from=2020-05-30&' #oldest article date allowed
       'to=2020-06-01&' #newest article date allowed
       'domains=bbc.co.uk&' #restrict the search to these domains
       'language=en&' #language of the articles
       'sortBy=popularity&' #one of relevancy, popularity or publishedAt
       'pageSize=30&' #number of articles per page: default is 20, maximum is 100
       'apiKey=4ac92a95346643fdbdb26a7e4d0e98b1') #this is compulsory

#create request to get news article from newsapi
response = requests.get(url)

In [0]:
#test the connection
response

<Response [200]>
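A 200 status only covers the happy path. As a hedged sketch, the helper below checks the `status` field of the parsed body before using it. It assumes, based on the News API documentation rather than on the output shown here, that successful bodies carry `status: "ok"` and failed ones carry `status: "error"` with `code` and `message` fields. `NewsApiError`, `articles_or_raise`, and both sample payloads are illustrative names and data, not real API output.

```python
class NewsApiError(Exception):
    """Raised when the response body reports a status other than 'ok'."""

def articles_or_raise(payload):
    """Return payload['articles'], or raise NewsApiError for an error body."""
    if payload.get("status") != "ok":
        raise NewsApiError(f"{payload.get('code')}: {payload.get('message')}")
    return payload.get("articles", [])

# Illustrative payloads only; field values are made up for the example.
ok_payload = {"status": "ok", "totalResults": 1, "articles": [{"title": "Example"}]}
error_payload = {"status": "error", "code": "apiKeyInvalid", "message": "Your API key is invalid."}

print(articles_or_raise(ok_payload))
try:
    articles_or_raise(error_payload)
except NewsApiError as exc:
    print("request failed:", exc)
```

In the notebook this would be called as `articles_or_raise(response.json())`.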

In [0]:
#parse the JSON response body into a Python dict
samsungheadline=response.json()
samsungheadline

{'articles': [{'author': 'https://www.facebook.com/bbcnews',
   'content': 'Image copyrightGetty Images\r\nAmazon has blamed a "bad actor" for racist abuse that appeared on multiple listings on its UK website.\r\nThe abuse, now removed, appeared when users searched the online sh… [+1608 chars]',
   'description': 'The online giant blames a "bad actor" for the language appearing alongside multiple product listings.',
   'publishedAt': '2020-05-31T10:26:22Z',
   'source': {'id': 'bbc-news', 'name': 'BBC News'},
   'title': 'Amazon UK website defaced with racist abuse',
   'url': 'https://www.bbc.co.uk/news/business-52867334',
   'urlToImage': 'https://ichef.bbci.co.uk/news/1024/branded_news/227B/production/_102972880_amazonpackages_getty.jpg'},
  {'author': 'https://www.facebook.com/bbcnews',
   'content': 'Image copyrightGetty / Beyonce / TwitterImage caption\r\n Rihanna and Beyoncé called for justice for George Floyd, while Ariana Grande joined protests in LA\r\nThe music industry is c

In [0]:
#list the top-level keys of the response: articles, status and totalResults
sorted(samsungheadline.keys())

['articles', 'status', 'totalResults']
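`totalResults` can exceed `pageSize`, in which case the remaining articles sit on later pages selected with the `page` query parameter. A small sketch of the arithmetic follows; `page_count` and `page_numbers` are hypothetical helper names.

```python
import math

def page_count(total_results, page_size=20):
    """Number of pages needed to cover total_results at page_size per page."""
    if page_size < 1:
        raise ValueError("page_size must be at least 1")
    return math.ceil(total_results / page_size)

def page_numbers(total_results, page_size=20):
    """Values to send as the `page` query parameter, starting at 1."""
    return list(range(1, page_count(total_results, page_size) + 1))

# With pageSize=30, 95 matching articles would span four pages.
print(page_numbers(95, page_size=30))
```

Each value would be appended to the request URL as `page=<n>`, one request per page.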

In [0]:
#we only want the articles for the keyword 
samsungheadline = samsungheadline['articles']
samsungheadline

[{'author': 'https://www.facebook.com/bbcnews',
  'content': 'Image copyrightGetty Images\r\nAmazon has blamed a "bad actor" for racist abuse that appeared on multiple listings on its UK website.\r\nThe abuse, now removed, appeared when users searched the online sh… [+1608 chars]',
  'description': 'The online giant blames a "bad actor" for the language appearing alongside multiple product listings.',
  'publishedAt': '2020-05-31T10:26:22Z',
  'source': {'id': 'bbc-news', 'name': 'BBC News'},
  'title': 'Amazon UK website defaced with racist abuse',
  'url': 'https://www.bbc.co.uk/news/business-52867334',
  'urlToImage': 'https://ichef.bbci.co.uk/news/1024/branded_news/227B/production/_102972880_amazonpackages_getty.jpg'},
 {'author': 'https://www.facebook.com/bbcnews',
  'content': 'Image copyrightGetty / Beyonce / TwitterImage caption\r\n Rihanna and Beyoncé called for justice for George Floyd, while Ariana Grande joined protests in LA\r\nThe music industry is calling for a "Blacko… 

In [0]:
#convert to dataframe first
df =  pd.DataFrame(samsungheadline)
df

Unnamed: 0,source,author,title,description,url,urlToImage,publishedAt,content
0,"{'id': 'bbc-news', 'name': 'BBC News'}",https://www.facebook.com/bbcnews,Amazon UK website defaced with racist abuse,"The online giant blames a ""bad actor"" for the ...",https://www.bbc.co.uk/news/business-52867334,https://ichef.bbci.co.uk/news/1024/branded_new...,2020-05-31T10:26:22Z,Image copyrightGetty Images\r\nAmazon has blam...
1,"{'id': 'bbc-news', 'name': 'BBC News'}",https://www.facebook.com/bbcnews,George Floyd: Music industry calls for 'blacko...,Record labels say they will not release new mu...,https://www.bbc.co.uk/news/entertainment-arts-...,https://ichef.bbci.co.uk/news/1024/branded_new...,2020-06-01T09:12:07Z,Image copyrightGetty / Beyonce / TwitterImage ...
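The `source` column above still holds a nested dict per row, which is awkward to filter or group on. One possible approach, sketched here in plain Python, is to flatten each article before building the DataFrame. `flatten_article` and the column names `source_id` and `source_name` are assumptions made for this example.

```python
def flatten_article(article):
    """Copy an article dict, replacing the nested 'source' dict with two flat keys."""
    flat = {key: value for key, value in article.items() if key != "source"}
    source = article.get("source") or {}
    flat["source_id"] = source.get("id")
    flat["source_name"] = source.get("name")
    return flat

# Trimmed copy of the first article shown above.
article = {
    "source": {"id": "bbc-news", "name": "BBC News"},
    "title": "Amazon UK website defaced with racist abuse",
    "publishedAt": "2020-05-31T10:26:22Z",
}
flat = flatten_article(article)
print(flat["source_name"], "|", flat["title"])
```

In the notebook this could be used as `pd.DataFrame([flatten_article(a) for a in samsungheadline])`. `pd.json_normalize(samsungheadline)` is an alternative that should yield `source.id` and `source.name` columns instead.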


# **Top headlines /v2/top-headlines**

## What if we want top headlines for a country, for a specific category in a country, for one or more sources, or for a keyword?

In [0]:
#build the URL for top headlines in a given country and category
catenewsheadlinesurl = ('https://newsapi.org/v2/top-headlines?'
       'q=trump&' #the keyword we are looking for
       'country=us&' #the country we want headlines for
       'category=general&' #the category we want
       'language=en&'
       'pageSize=30&' #number of results to return per page (request): default is 20, maximum is 100
       'apiKey=4ac92a95346643fdbdb26a7e4d0e98b1') #API key
#send the request to get the news headlines
catenewsheadlinesurlresponse = requests.get(catenewsheadlinesurl)

In [0]:
#check the response: a 200 status code means the request succeeded
catenewsheadlinesurlresponse

<Response [200]>

In [0]:
#parse the JSON response body into a Python dict
catenewsheadlines=catenewsheadlinesurlresponse.json()
catenewsheadlines

{'articles': [{'author': 'Tom Gjelten',
   'content': "Police gather to remove demonstrators from the area around Lafayette Park and the White House before President Trump's walk to St. John's Church.\r\nAlex Brandon/AP\r\nThe plaza between St. John's Church… [+3942 chars]",
   'description': "After U.S. Park Police and National Guard troops pushed demonstrators out of Lafayette Park, President Trump walked from the White House to St. John's Church where he posed for pictures with a Bible.",
   'publishedAt': '2020-06-02T08:14:32Z',
   'source': {'id': None, 'name': 'NPR'},
   'title': 'Park Police Tear Gas Peaceful Protesters To Clear Way For Trump Church Photo-Op - NPR',
   'url': 'https://www.npr.org/2020/06/01/867532070/trumps-unannounced-church-visit-angers-church-officials',
   'urlToImage': 'https://media.npr.org/assets/img/2020/06/01/ap_201538622364011_wide-b070944736fe027bbb81b95cc5c728d4bc879d39.jpg?s=1400'},
  {'author': 'Michael M. Grynbaum',
   'content': 'As Mr. Carlson s

In [0]:
#list the top-level keys of the response
sorted(catenewsheadlines.keys())

['articles', 'status', 'totalResults']

In [0]:
#we only want the articles
catenewsheadlines = catenewsheadlines['articles']

In [0]:
#convert to dataframe first
df2 =  pd.DataFrame(catenewsheadlines)
df2.head()

Unnamed: 0,source,author,title,description,url,urlToImage,publishedAt,content
0,"{'id': None, 'name': 'NPR'}",Tom Gjelten,Park Police Tear Gas Peaceful Protesters To Cl...,After U.S. Park Police and National Guard troo...,https://www.npr.org/2020/06/01/867532070/trump...,https://media.npr.org/assets/img/2020/06/01/ap...,2020-06-02T08:14:32Z,Police gather to remove demonstrators from the...
1,"{'id': None, 'name': 'New York Times'}",Michael M. Grynbaum,"Tucker Carlson, Anderson Cooper Deliver Starkl...","In a sign of partisan divide, his monologue ca...",https://www.nytimes.com/2020/06/01/business/me...,https://static01.nyt.com/images/2020/06/01/us/...,2020-06-02T04:42:47Z,"As Mr. Carlson spoke on Fox News, Mr. Trump al..."
2,"{'id': None, 'name': 'Investor's Business Daily'}",Investor's Business Daily,"Dow Jones Futures Slide On Trump Comments, As ...",The Dow Jones futures were squarely lower late...,https://www.investors.com/market-trend/stock-m...,https://www.investors.com/wp-content/uploads/2...,2020-06-02T04:21:10Z,Dow Jones futures were squarely lower late Mon...
3,"{'id': None, 'name': 'YouTube'}",,Trump vows to send troops into cities if neede...,President Donald Trump said on Monday he was d...,https://www.youtube.com/watch?v=Vij8hnFsOxs,https://i.ytimg.com/vi/Vij8hnFsOxs/maxresdefau...,2020-06-02T03:59:19Z,President Donald Trump said on Monday he was d...
4,"{'id': None, 'name': 'The Guardian'}",Sam Levin,George Floyd protests: can Trump deploy federa...,Outcome seems to hang on the interpretation of...,https://www.theguardian.com/us-news/2020/jun/0...,https://i.guim.co.uk/img/media/6e24eabed484dd9...,2020-06-02T02:51:00Z,US president Donald Trump has threatened to de...
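The `publishedAt` values are plain strings, so ordering or filtering by time needs them parsed first. The sketch below assumes every timestamp follows the exact `YYYY-MM-DDTHH:MM:SSZ` shape seen in the rows above (no fractional seconds). `parse_published_at` is a hypothetical helper name.

```python
from datetime import datetime, timezone

def parse_published_at(value):
    """Parse a timestamp such as '2020-06-02T08:14:32Z' into an aware UTC datetime."""
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)

# Timestamps copied from rows 1, 0 and 2 above, deliberately out of order.
stamps = ["2020-06-02T04:42:47Z", "2020-06-02T08:14:32Z", "2020-06-02T04:21:10Z"]
newest_first = sorted(stamps, key=parse_published_at, reverse=True)
print(newest_first)
```

On the DataFrame itself, `pd.to_datetime(df2["publishedAt"])` should give a comparable datetime column.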


# **Sources /v2/sources**
This endpoint returns the subset of news publishers that top headlines (/v2/top-headlines) are available from. It's mainly a convenience endpoint that you can use to keep track of the publishers available on the API, and you can pipe it straight through to your users.

In [0]:
#build the URL listing the sources for a category, language and country
sourcesurl = ('https://newsapi.org/v2/sources?'
       'category=technology&' #the category we want
       'language=en&' #the language of the sources
       'country=us&' #the country we want
       'apiKey=4ac92a95346643fdbdb26a7e4d0e98b1') #API key
#send the request to get the list of sources
sourcesurlresponse = requests.get(sourcesurl)

In [0]:
#check the response: a 200 status code means the request succeeded
sourcesurlresponse

<Response [200]>

In [0]:
#parse the JSON response body (the list of sources) into a Python dict
sourcesheadlines=sourcesurlresponse.json()
sourcesheadlines

{'sources': [{'category': 'technology',
   'country': 'us',
   'description': "The PC enthusiast's resource. Power users and the tools they love, without computing religion.",
   'id': 'ars-technica',
   'language': 'en',
   'name': 'Ars Technica',
   'url': 'http://arstechnica.com'},
  {'category': 'technology',
   'country': 'us',
   'description': 'Providing breaking cryptocurrency news - focusing on Bitcoin, Ethereum, ICOs, blockchain technology, and smart contracts.',
   'id': 'crypto-coins-news',
   'language': 'en',
   'name': 'Crypto Coins News',
   'url': 'https://www.ccn.com'},
  {'category': 'technology',
   'country': 'us',
   'description': 'Engadget is a web magazine with obsessive daily coverage of everything new in gadgets and consumer electronics.',
   'id': 'engadget',
   'language': 'en',
   'name': 'Engadget',
   'url': 'https://www.engadget.com'},
  {'category': 'technology',
   'country': 'us',
   'description': 'Hacker News is a social news website focusing on co

In [0]:
#list the top-level keys of the response
sorted(sourcesheadlines.keys())

['sources', 'status']

In [0]:
#we only want the sources
sourcesheadlines = sourcesheadlines['sources']

In [0]:
#convert to dataframe first
df3 =  pd.DataFrame(sourcesheadlines)
df3.head()

Unnamed: 0,id,name,description,url,category,language,country
0,ars-technica,Ars Technica,The PC enthusiast's resource. Power users and ...,http://arstechnica.com,technology,en,us
1,crypto-coins-news,Crypto Coins News,Providing breaking cryptocurrency news - focus...,https://www.ccn.com,technology,en,us
2,engadget,Engadget,Engadget is a web magazine with obsessive dail...,https://www.engadget.com,technology,en,us
3,hacker-news,Hacker News,Hacker News is a social news website focusing ...,https://news.ycombinator.com,technology,en,us
4,recode,Recode,"Get the latest independent tech news, reviews ...",http://www.recode.net,technology,en,us
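The `id` column is what the top-headlines endpoint expects in its `sources` parameter, given as a comma-separated list. A rough sketch of building that query string follows. `top_headlines_query` is a hypothetical helper, and the API key is left out, so `apiKey` would still need to be appended. According to the News API documentation, `sources` cannot be combined with `country` or `category`.

```python
from urllib.parse import urlencode

def top_headlines_query(source_ids, page_size=20):
    """Build the query string for top headlines restricted to the given source ids."""
    if not source_ids:
        raise ValueError("at least one source id is required")
    # urlencode percent-encodes the commas as %2C; the server decodes them back.
    return urlencode({"sources": ",".join(source_ids), "pageSize": page_size})

# Source ids taken from the table above.
ids = ["ars-technica", "engadget", "hacker-news"]
query = top_headlines_query(ids)
print("https://newsapi.org/v2/top-headlines?" + query)
```

With the DataFrame above, the ids could be taken from `df3["id"].tolist()`.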
