# Newsdata.IO

On the free tier, we can make 200 searches per day, with 10 articles returned per request (see https://newsdata.io/pricing). The available metadata is:
- Title.
- Link.
- Date.
- Author.
- Publisher.
- Country.
- Language.
- Description.

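The full text, however, is not available on the free tier (the `content` field only reads `ONLY AVAILABLE IN PAID PLANS`), so it has to be scraped directly from the article link.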
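As a rough idea of what that scraping step could look like, here is a minimal sketch that pulls paragraph text out of an HTML page using only the standard library. The names `ParagraphExtractor` and `extract_paragraphs` are hypothetical, and the assumption that the article body lives in `<p>` elements will not hold for every publisher. Downloading the page (for example with `requests.get(link).text`) is left out so the snippet runs offline.

```python
from html.parser import HTMLParser


class ParagraphExtractor(HTMLParser):
    """Collect the text found inside <p> elements (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self._in_paragraph = False
        self._buffer = []       # text chunks of the paragraph being read
        self.paragraphs = []    # finished, whitespace-normalised paragraphs

    def flush(self):
        # Join the buffered chunks, collapse whitespace, keep non-empty text.
        text = " ".join("".join(self._buffer).split())
        if text:
            self.paragraphs.append(text)
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self.flush()        # also copes with unclosed <p> tags
            self._in_paragraph = True

    def handle_endtag(self, tag):
        if tag == "p":
            self.flush()
            self._in_paragraph = False

    def handle_data(self, data):
        if self._in_paragraph:
            self._buffer.append(data)


def extract_paragraphs(html):
    """Return the list of paragraph texts found in an HTML string."""
    parser = ParagraphExtractor()
    parser.feed(html)
    parser.close()
    parser.flush()
    return parser.paragraphs


sample = "<html><body><h1>Title</h1><p>Hello <b>world</b>.</p><p>  Second\n para </p></body></html>"
print(extract_paragraphs(sample))  # → ['Hello world.', 'Second para']
```

Real news pages usually need publisher-specific handling (navigation, ads, cookie banners), so this is only a starting point.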

They also provide a [Python client](https://newsdata.io/documentation/#client_py); see the [GitHub repository](https://github.com/bytesview/python-client) and the [guide](https://newsdata.io/blog/news-api-python-client/). It doesn't seem to add much over plain HTTP requests, though.

# Setup

In [19]:
from dotenv import load_dotenv
import os
import requests
from pprint import pprint

In [2]:
# Load environment variables from .env file
load_dotenv()

# Access the API key
api_key = os.getenv("NEWSDATA_API_KEY")
print(f"API Key loaded: {api_key[:3]}..." if api_key else "No API key found")

API Key loaded: pub...


# Making a request

See the [documentation](https://newsdata.io/documentation/#first-api-request).

For example, to search the latest news articles for the keyword "pizza": https://newsdata.io/api/1/latest?apikey=pub_857112301fae34e55b800f06711f2df4d7b25&q=pizza. The API key comes first (`apikey=`), followed by `&q=` and the query.

In [None]:
# Build the request URL: base address + endpoint + API key + query
website = 'https://newsdata.io/api/1/'
search = 'latest?'
query = 'conflict in zimbabwe'
url = website + search + 'apikey=' + api_key + '&q=' + query

# Send the GET request
example = requests.get(url=url)

print(url)
print(example)

https://newsdata.io/api/1/latest?apikey=pub_857112301fae34e55b800f06711f2df4d7b25&q=conflict in zimbabwe
<Response [200]>
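The query above contains spaces, which end up in the URL unescaped. A minimal sketch of building the same URL with explicit percent-encoding follows; `build_url` and the `YOUR_API_KEY` placeholder are hypothetical names used for illustration, and only the `apikey` and `q` parameters shown above are used.

```python
from urllib.parse import quote, urlencode

BASE_URL = "https://newsdata.io/api/1/"


def build_url(endpoint, api_key, query):
    """Return the request URL with the query string percent-encoded."""
    params = {"apikey": api_key, "q": query}
    # quote_via=quote encodes spaces as %20 instead of '+'.
    return f"{BASE_URL}{endpoint}?{urlencode(params, quote_via=quote)}"


print(build_url("latest", "YOUR_API_KEY", "conflict in zimbabwe"))
# → https://newsdata.io/api/1/latest?apikey=YOUR_API_KEY&q=conflict%20in%20zimbabwe
```

Using a placeholder also keeps the real key out of anything that gets printed. An alternative is to pass the dictionary through the `params` argument of `requests.get`, which does the encoding itself.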


In [21]:
# Inspect the response status code and headers
print(example.status_code)
pprint(example.headers)

200
{'Date': 'Thu, 08 May 2025 15:57:20 GMT', 'Server': 'Apache/2.4.41 (Ubuntu)', 'x_rate_limit_remaining': '29', 'x_api_limit_remaining': '199', 'Access-Control-Allow-Origin': '*', 'Vary': 'Accept-Encoding', 'Content-Encoding': 'gzip', 'Keep-Alive': 'timeout=5, max=100', 'Connection': 'Keep-Alive', 'Transfer-Encoding': 'chunked', 'Content-Type': 'application/json'}
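The headers include two counters, `x_rate_limit_remaining` and `x_api_limit_remaining`. The value 199 for the second one is consistent with one request having been used out of the daily 200, but that reading is an assumption, as is treating the first one as a shorter-window rate limit. A small sketch for pulling them out as integers follows (`remaining_quota` is a hypothetical helper name):

```python
def remaining_quota(headers):
    """Return the remaining-quota counters found in the headers as integers."""
    names = ("x_rate_limit_remaining", "x_api_limit_remaining")
    quota = {}
    for name in names:
        value = headers.get(name)
        if value is not None:   # skip counters the server did not send
            quota[name] = int(value)
    return quota


# Subset of the headers printed above, used here as offline sample data.
sample_headers = {
    "x_rate_limit_remaining": "29",
    "x_api_limit_remaining": "199",
    "Content-Type": "application/json",
}
print(remaining_quota(sample_headers))
# → {'x_rate_limit_remaining': 29, 'x_api_limit_remaining': 199}
```

The same function works on `example.headers`, since it only relies on `.get`.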


In [22]:
# Convert the response to JSON
response_json = example.json()
pprint(response_json, indent=2, width=100)

{ 'nextPage': '1746585892676231727',
  'results': [ { 'ai_org': 'ONLY AVAILABLE IN CORPORATE PLANS',
                 'ai_region': 'ONLY AVAILABLE IN CORPORATE PLANS',
                 'ai_tag': 'ONLY AVAILABLE IN PROFESSIONAL AND CORPORATE PLANS',
                 'article_id': 'eee9c04dcadf09089732cfee0b749f7b',
                 'category': ['top'],
                 'content': 'ONLY AVAILABLE IN PAID PLANS',
                 'country': ['zimbabwe'],
                 'creator': ['daglous masveta'],
                 'description': 'Wallace Ruzvidzo Herald Reporter SENEGAL supports Zimbabwe’s '
                                'regional and continental leadership and is throwing its weight '
                                'behind Zimbabwe’s bid for a non-permanent seat in the United '
                                'Nations Security Council (UNSC)...The post Senegal backs Zim UNSC '
                                'seat bid appeared first on herald.',
                 'duplicate': Fals

Note that the request returns 10 articles, which is the per-request limit.
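Since each request is capped at 10 articles and the free tier allows 200 requests per day, at most 2,000 articles can be collected daily. The response also carries a `nextPage` token, which suggests that further results are fetched page by page. The sketch below follows those tokens. `collect_articles` and `fetch_page` are hypothetical names, and the fake pages stand in for the API so the snippet runs offline.

```python
def collect_articles(fetch_page, max_requests=3):
    """Follow `nextPage` tokens and return the accumulated articles.

    `fetch_page(token)` must return the decoded JSON of one request;
    `token` is None for the first page.
    """
    articles = []
    token = None
    for _ in range(max_requests):   # hard cap to protect the daily quota
        payload = fetch_page(token)
        articles.extend(payload.get("results", []))
        token = payload.get("nextPage")
        if not token:               # no further pages
            break
    return articles


# Offline stand-in for the API: two fake pages keyed by their page token.
FAKE_PAGES = {
    None: {"results": [{"article_id": "a"}, {"article_id": "b"}], "nextPage": "t1"},
    "t1": {"results": [{"article_id": "c"}], "nextPage": None},
}

articles = collect_articles(lambda token: FAKE_PAGES[token])
print([a["article_id"] for a in articles])  # → ['a', 'b', 'c']
```

Against the real API, `fetch_page` would wrap `requests.get` and send the token back as a query parameter. The parameter name (presumably `page`) is an assumption here and should be checked against the documentation.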

# News Archive

Unfortunately, accessing the news archive (articles older than those served as "latest news") requires a paid plan (see the [documentation](https://newsdata.io/documentation/#news-archive)). The most basic paid plan costs 150 USD/month and covers only 6 months of historical data.

In [26]:
# Same request as before, but against the archive endpoint
website = 'https://newsdata.io/api/1/'
search = 'archive?'
query = 'conflict'
url = website + search + 'apikey=' + api_key + '&q=' + query

# Send the GET request
example = requests.get(url=url)

print(url)
print(example.status_code)

https://newsdata.io/api/1/archive?apikey=pub_857112301fae34e55b800f06711f2df4d7b25&q=conflict
403


The request returns a 403 error because the free plan does not include access to the news archive.
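To make such failures explicit instead of silently carrying on with an error response, the status code can be checked before the body is parsed. A minimal sketch using only the standard library follows; `describe_status` and `ensure_ok` are hypothetical helper names.

```python
from http import HTTPStatus


def describe_status(status_code):
    """Return the status code with its standard reason phrase."""
    try:
        return f"{status_code} {HTTPStatus(status_code).phrase}"
    except ValueError:              # code not defined in http.HTTPStatus
        return f"{status_code} Unknown"


def ensure_ok(status_code):
    """Raise if the request did not succeed."""
    if status_code != 200:
        raise RuntimeError(f"Request failed: {describe_status(status_code)}")


print(describe_status(200))  # → 200 OK
print(describe_status(403))  # → 403 Forbidden
```

With the response above, `ensure_ok(example.status_code)` would raise for the archive request. `example.raise_for_status()` from `requests` offers similar behaviour out of the box.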