# API-driven News Analytics: Probing Worldwide Media Trend

## Introduction

In today's digital landscape, as news rapidly evolves and spreads at an unprecedented pace, the need to quickly and efficiently access pertinent news articles has never been more vital. Our endeavor, "API-driven News Analytics: Probing Worldwide Media Trend," rises to meet this challenge, seeking to provide a methodical approach to searching and analyzing articles tailored to specific criteria.

At the heart of our project lies the News API, a potent HTTP REST API designed explicitly for on-the-fly article searches and retrievals. This platform extends beyond mere volume – it delves into the intricacies of the news world. Whether it's the trending headlines on TechCrunch, the newest revelations about the upcoming iPhone, or recent mentions of a particular company or product, NewsAPI is your comprehensive guide.

The search capabilities it offers are nuanced, allowing users to:

- Zero in on particular keywords or phrases, with our primary focus being news related to 'Bitcoin'.
- Filter by publication date, possibly highlighting articles from a specific day.
- Restrict results to articles from designated domains like thenextweb.com.
- Limit the search to articles written in a particular language, such as English.

Furthermore, the search results are dynamic. They can be sorted based on the publication date, relevance to the search term, or the popularity of the publishing source. This ensures a multifaceted analysis and deeper insights.

By leveraging the News API, we aspire to craft a platform that parallels the likes of Google News in interactivity but surpasses it in terms of search depth and analytic potential. From keeping tabs on global news trends to closely monitoring mentions of specific products, our mission is to deliver a holistic news analysis solution, achieved by curating the relevant corpus through API-based HTTP requests to NewsAPI (https://newsapi.org/).

## Demonstration

### Loading and importing necessary libraries and API keys

In [37]:
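import requests
import os
# API_KEY comes from a module named newsapi (assumed here to be a local newsapi.py that holds the key)
from newsapi import API_KEY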

### Endpoint details and usage

#### Everything Endpoint: Comprehensive News Search & Analysis

The "Everything" endpoint of NewsAPI allows users to scour millions of articles from over 80,000 diverse news sources and blogs. Designed for both article discovery and deeper analysis, it provides a versatile array of parameters to refine searches:

API Key (apiKey): Mandatory for authentication.

Query (q): Enables advanced search techniques within article titles and content. Supports exact matches, inclusion/exclusion symbols (+/-), and logical operators (AND/OR/NOT).

Search Fields (searchIn): Limit search to specific fields like title, description, or content.

Sources & Domains (sources, domains, excludeDomains): Allows narrowing or broadening search to specific sources or domains, with an option to exclude certain domains.

Date Ranges (from, to): Filter articles based on publication dates, adhering to the ISO 8601 format.

Language (language): Restrict results to specific languages using 2-letter ISO-639-1 codes.

Sorting (sortBy): Order articles by relevance (relevancy), popularity (popularity), or publication date (publishedAt).

Pagination (pageSize, page): Control result quantity and navigate through multiple result pages.

In response, users receive a status indicating the request's success, the total result count, and an array of articles. Each article provides details such as the source, author, title, description, URLs (for the article and its image), publication date, and a truncated content preview.
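
As a minimal sketch of how the parameters above combine into a single request URL, the helper below URL-encodes a dictionary of filters with the standard library. The function name `build_everything_url`, the placeholder key, and the example filter values are illustrative assumptions, not part of the original notebook.

```python
from urllib.parse import urlencode

BASE_URL = "https://newsapi.org/v2/everything"


def build_everything_url(api_key, q, filters=None):
    """Return an Everything endpoint URL for the query and optional filters."""
    params = {"q": q}
    # Drop filters left as None so they do not appear in the URL
    params.update({k: v for k, v in (filters or {}).items() if v is not None})
    params["apiKey"] = api_key
    # urlencode escapes spaces, quotes and commas in the parameter values
    return f"{BASE_URL}?{urlencode(params)}"


url = build_everything_url(
    "YOUR_API_KEY",  # placeholder, replace with a real key
    q='"bitcoin etf" AND NOT ethereum',  # exact phrase plus a logical operator
    filters={
        "searchIn": "title,description",
        "domains": "thenextweb.com",
        "from": "2023-08-01",  # ISO 8601 date
        "to": None,  # omitted from the URL
        "language": "en",
        "sortBy": "publishedAt",
        "pageSize": 20,
        "page": 1,
    },
)
print(url)
```

Passing the filters as a dictionary avoids the clash between the `from` parameter and the Python keyword of the same name, and encoding the values keeps queries that contain spaces or quotes valid.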

In [None]:
# Search criteria passed as query parameters in the API call
q = 'Bitcoin'  # search for 'Bitcoin' in the news articles
language = 'en'  # restrict results to articles written in English
sortBy = 'popularity'  # sort the results by popularity

### Sending the HTTP GET request to NewsAPI

In [62]:
response = requests.get(f'https://newsapi.org/v2/everything?q={q}&language={language}&sortBy={sortBy}&apiKey={API_KEY}')
data = response.json()
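
Before using the payload, it can help to confirm that the request succeeded. The sketch below checks the `status` field described earlier and raises if it is not `'ok'`. The helper name `extract_articles` is hypothetical, the sample payloads are hand-written for illustration, and the `code` and `message` fields of the error payload are an assumption about the error format that should be checked against the NewsAPI documentation.

```python
def extract_articles(payload):
    """Return the list of articles, or raise if the API reported a failure."""
    if payload.get("status") != "ok":
        # 'code' and 'message' are assumed field names for error payloads
        code = payload.get("code", "unknown")
        message = payload.get("message", "no message provided")
        raise RuntimeError(f"NewsAPI request failed ({code}): {message}")
    return payload.get("articles", [])


# Hand-written sample payloads, not real API output
ok_payload = {"status": "ok", "totalResults": 1, "articles": [{"title": "Example"}]}
error_payload = {"status": "error", "code": "apiKeyInvalid", "message": "Invalid key."}

print(len(extract_articles(ok_payload)))

try:
    extract_articles(error_payload)
except RuntimeError as exc:
    print(exc)
```

In the notebook this would be called as `extract_articles(data)` right after `response.json()`.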

### Checking the response received from the API

In [63]:
data

{'status': 'ok',
 'totalResults': 9612,
 'articles': [{'source': {'id': 'the-verge', 'name': 'The Verge'},
   'author': 'Emma Roth',
   'title': 'PayPal launches PYUSD stablecoin backed by the US dollar',
   'description': 'PayPal has launched a stablecoin called PayPal USD, starting today and “rolling out in the coming weeks.” The new stablecoin can be used for purchases and person-to-person payments.',
   'url': 'https://www.theverge.com/2023/8/7/23822752/paypal-pyusd-stablecoin-cryptocurrency',
   'urlToImage': 'https://cdn.vox-cdn.com/thumbor/AzUxs8UmwIY2lOByn5LIX8geWjY=/0x0:2200x1650/1200x628/filters:focal(1100x825:1101x826)/cdn.vox-cdn.com/uploads/chorus_asset/file/24835037/PayPal_stablecoin.png',
   'publishedAt': '2023-08-07T14:07:51Z',
   'content': 'PayPal launches PYUSD stablecoin backed by the US dollar\r\nPayPal launches PYUSD stablecoin backed by the US dollar\r\n / PayPal USD is built on Ethereum and is 1:1 redeemable for US dollars.\r\nPayPal is… [+1960 chars]'},
  {'so

In [64]:
type(data)

dict
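
Because the payload is a plain dictionary, fields such as `totalResults` can be read directly, for example to estimate how many pages a full retrieval would need. The sketch below shows only the arithmetic. The function name `page_count`, the default page size of 100, and the optional `max_results` cap are assumptions for illustration; the actual limits depend on the NewsAPI plan in use.

```python
import math


def page_count(total_results, page_size=100, max_results=None):
    """Number of pages needed to fetch the available results."""
    # Optionally cap the total, e.g. if a plan only exposes the first N results
    available = total_results if max_results is None else min(total_results, max_results)
    return math.ceil(available / page_size)


print(page_count(9612))                   # 97
print(page_count(9612, page_size=20))     # 481
print(page_count(9612, max_results=100))  # 1
```

Each page would then be requested by incrementing the `page` parameter while keeping the other search criteria fixed.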

### Building the corpus: converting the relevant news articles into a DataFrame (which can later be stored as a CSV)

In [69]:
import pandas as pd

articles_data = data['articles']

# Convert the list of articles into a dataframe
df = pd.DataFrame(articles_data)

# To display top 5 rows of the dataframe
df.head(5)


| | source | author | title | description | url | urlToImage | publishedAt | content |
|---|---|---|---|---|---|---|---|---|
| 0 | {'id': 'the-verge', 'name': 'The Verge'} | Emma Roth | PayPal launches PYUSD stablecoin backed by the... | PayPal has launched a stablecoin called PayPal... | https://www.theverge.com/2023/8/7/23822752/pay... | https://cdn.vox-cdn.com/thumbor/AzUxs8UmwIY2lO... | 2023-08-07T14:07:51Z | PayPal launches PYUSD stablecoin backed by the... |
| 1 | {'id': None, 'name': 'Gizmodo.com'} | Cheryl Eddy | Everyone's Favorite Knife-Wielding Robot Retur... | Futurama’s new season continues its examinatio... | https://gizmodo.com/futurama-hulu-new-ep-3-cli... | https://i.kinja-img.com/gawker-media/image/upl... | 2023-08-04T20:45:00Z | Futuramas new season continues its examination... |
| 2 | {'id': None, 'name': 'Gizmodo.com'} | Gordon Jackson and James Whitbrook | It's Time For Even More Casting Rumors for Fan... | Final Destination’s Jeffrey Reddick says the s... | https://gizmodo.com/fantastic-four-casting-rum... | https://i.kinja-img.com/gawker-media/image/upl... | 2023-08-04T14:00:00Z | Final Destinations Jeffrey Reddick says the si... |
| 3 | {'id': None, 'name': 'Gizmodo.com'} | Kyle Barr | Dropbox Is Dropping Unlimited Storage, Blames ... | Dropbox is no longer offering new customers un... | https://gizmodo.com/dropbox-ends-unlimited-sto... | https://i.kinja-img.com/gawker-media/image/upl... | 2023-08-25T14:10:00Z | Dropbox is no longer offering new customers un... |
| 4 | {'id': 'bbc-news', 'name': 'BBC News'} | https://www.facebook.com/bbcnews | Anonymous Sudan hacks X to put pressure on Elo... | Prolific hackers accused of being a front for ... | https://www.bbc.co.uk/news/technology-66668053 | https://ichef.bbci.co.uk/news/1024/branded_new... | 2023-08-31T08:45:39Z | A hacking group called Anonymous Sudan took X,... |
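
The `source` column above still holds a nested dictionary, which is awkward to filter or group on. One possible approach, sketched here in plain Python, is to flatten each article before building the DataFrame. The helper name `flatten_article` and the column names `source_id` and `source_name` are hypothetical, and the two sample records are abbreviated from the response shown earlier.

```python
def flatten_article(article):
    """Return a copy of the article with the nested source split into two keys."""
    flat = dict(article)  # shallow copy so the original record is untouched
    source = flat.pop("source", None) or {}
    flat["source_id"] = source.get("id")
    flat["source_name"] = source.get("name")
    return flat


# Abbreviated records in the same shape as data['articles']
sample_articles = [
    {"source": {"id": "the-verge", "name": "The Verge"}, "author": "Emma Roth"},
    {"source": {"id": None, "name": "Gizmodo.com"}, "author": "Kyle Barr"},
]

rows = [flatten_article(article) for article in sample_articles]
for row in rows:
    print(row)
```

The flattened rows can be passed to `pd.DataFrame(rows)` in place of the raw article list. `pd.json_normalize(data['articles'])` is a pandas alternative that expands nested dictionaries into dotted column names such as `source.name`.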


### Exporting the developed corpus into the csv format for later use

In [70]:
# Save the dataframe to a CSV file
df.to_csv('articles_data.csv', index=False)

### Summary

By leveraging an API to curate a corpus of 'Bitcoin'-related articles, we have the opportunity to undertake several insightful data mining and analytical ventures:

- Sentiment Analysis: By utilizing tools like TextBlob or VADER, we can discern the prevailing sentiment surrounding Bitcoin in these articles.
- Topic Modeling: With the aid of techniques such as LDA or NMF, we're equipped to pinpoint the salient themes and topics permeating the Bitcoin narrative.
- Trend Analysis: A deep dive into article publication dates and the frequency of specific terms will illuminate evolving trends and shifts in the Bitcoin discourse over time.
- Named Entity Recognition: Harnessing platforms like spaCy or NLTK, we can detect and categorize prominent entities, be they corporations or individuals, mentioned in conjunction with Bitcoin.
- Source Analysis: By examining articles' origins, we can juxtapose and understand the varied stances and focal points different news outlets adopt concerning Bitcoin.
- Keyword Frequency Analysis: A closer look at recurrent terms linked to Bitcoin, aided by visual representations like word clouds, offers a snapshot of the most resonant topics in the Bitcoin sphere.
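
As a starting point for the keyword frequency, trend, and source analyses listed above, the sketch below uses only the standard library. The three records are made up for illustration and are not taken from the API response; the function names and the small stopword set are likewise assumptions, and a real analysis would run on the exported corpus with a fuller stopword list.

```python
import re
from collections import Counter

# Made-up records, shaped like flattened rows of the corpus
SAMPLE = [
    {"source": "Outlet A", "publishedAt": "2023-08-07T14:07:51Z",
     "title": "Bitcoin price rises as stablecoin launches"},
    {"source": "Outlet B", "publishedAt": "2023-08-07T09:00:00Z",
     "title": "Bitcoin mining and energy use"},
    {"source": "Outlet A", "publishedAt": "2023-08-08T10:30:00Z",
     "title": "Stablecoin rules and bitcoin markets"},
]

STOPWORDS = {"a", "and", "as", "in", "of", "the", "to"}


def keyword_counts(texts):
    """Count lowercase alphabetic tokens, ignoring a small stopword set."""
    words = re.findall(r"[a-z]+", " ".join(texts).lower())
    return Counter(word for word in words if word not in STOPWORDS)


def articles_per_day(records):
    """Count articles per calendar day using the date part of publishedAt."""
    return Counter(record["publishedAt"][:10] for record in records)


def articles_per_source(records):
    """Count articles per publishing source."""
    return Counter(record["source"] for record in records)


keywords = keyword_counts(record["title"] for record in SAMPLE)
print(keywords.most_common(2))      # [('bitcoin', 3), ('stablecoin', 2)]
print(articles_per_day(SAMPLE))
print(articles_per_source(SAMPLE))
```

The same counters can feed a word cloud or a time-series plot, while sentiment analysis, topic modeling, and named entity recognition would need the third-party libraries named in the list.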