## Using Stocknews API to grab financial data
The script uses the `requests` module to extract financial news headlines along with other important fields: time and date, news outlet, sector (Technology, Healthcare, Finance), summary text and headline (both used to calculate sentiment), and tickers.

The next part of the script uses the headline and summary text returned by the API to calculate a sentiment score with VADER; see `https://github.com/cjhutto/vaderSentiment` for more information.

### Import necessary libraries

In [1]:
import requests
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer
import pandas as pd
import os

I defined the `API_KEY` that you can all use to grab financial news info; more on the request syntax can be found in the documentation: `https://stocknewsapi.com/documentation`

These values are fixed, so we can define them here. For `date_range`, you can play around and choose a single day or a shorter time range.

In [4]:
API_KEY = 'n44o8yuk9jdyg4lgw8rsueij4xmv5gznbpsw9rni'
date_range = '03152019-03152024'
sectors = ['technology', 'healthcare', 'financial']
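
Since `os` is already imported, one option is to read the key from an environment variable instead of keeping it in the notebook. This is only a minimal sketch: the variable name `STOCKNEWS_API_KEY` and the helper `load_api_key` are hypothetical choices of mine, not something the API requires.

```python
import os

def load_api_key(env_var="STOCKNEWS_API_KEY"):
    """Return the API token stored in an environment variable.

    The variable name is a hypothetical choice for this sketch.
    """
    key = os.environ.get(env_var)
    if not key:
        # Fail early with a clear message instead of sending an empty token
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return key
```

With the variable set in your shell, `API_KEY = load_api_key()` would replace the hard-coded string above.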

## Using only one Sector 

`base_url: https://stocknewsapi.com/api/v1/category?section=alltickers` - this is where the financial data lives
- `section=alltickers` means it covers all available stocks (I filter further in the next line)
`url = f'{base_url}&sector=technology&exchange=NYSE,NASDAQ&index=SP500&country=USA&type=article&date={date_range}&items=100&page=1&extra-fields=id&token={API_KEY}'`
- `sector=technology`: the sector you want; you can change this manually or, as you can see below, loop over several sectors
- `exchange=NYSE,NASDAQ`: these are the two main sources of S&P 500 stocks
- `index=SP500`: limits the results to S&P 500 stocks
- `country=USA`: limits the results to the US
- `type=article`: they also have videos, but I couldn't get proper sentiment scores for those, so I stuck to articles
- `date={date_range}`: the variable above is fixed to March 15, 2019 - March 15, 2024, but for simpler calls you can make the window smaller; in the simple code below it's set to `last60min`
- `items=100`: number of news articles per page; 100 is the maximum (the simple code below uses `items=10`)
- `page=1`: which page of results to fetch; to keep this example simple I am getting only page 1
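
The parameters listed above can also be assembled from a dictionary rather than typed into one long f-string. The sketch below uses only the standard library; `build_news_url` and `BASE_URL` are names I made up for illustration, and the parameter values are the same ones used in this notebook.

```python
from urllib.parse import urlencode

BASE_URL = "https://stocknewsapi.com/api/v1/category"

def build_news_url(sector, date, items, page, token):
    """Build the request URL from the query parameters described above."""
    params = {
        "section": "alltickers",
        "sector": sector,
        "exchange": "NYSE,NASDAQ",
        "index": "SP500",
        "country": "USA",
        "type": "article",
        "date": date,
        "items": items,
        "page": page,
        "extra-fields": "id",
        "token": token,
    }
    # safe="," keeps the comma in "NYSE,NASDAQ" unescaped, like the hand-written URL
    return f"{BASE_URL}?{urlencode(params, safe=',')}"
```

For example, `build_news_url("technology", "last60min", 10, 1, API_KEY)` should give the same URL as the simple request in the next cell.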


In [5]:

base_url = 'https://stocknewsapi.com/api/v1/category?section=alltickers'
url = f'{base_url}&sector=technology&exchange=NYSE,NASDAQ&index=SP500&country=USA&type=article&date=last60min&items=10&page=1&extra-fields=id&token={API_KEY}'
response = requests.get(url)
# print(response.text)         # uncomment to check whether you got any output
# print(response.status_code)  # uncomment to check the status code (200 means successful)
news_data = response.json()
news_data

{'message': 'Trial Plans can query up to 3 news items. Activate your plan today and query up to 100 news items.'}
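
The output above shows that the API can answer with just a `message` and no news items. A small, hedged way to handle both shapes is sketched below; `extract_news_items` is a hypothetical helper, and I am assuming (based on the functions later in this notebook) that successful responses keep their items under a `data` key.

```python
def extract_news_items(payload):
    """Return the list under 'data', or an empty list when only a message came back."""
    if "data" not in payload:
        # Surface the API's explanation, e.g. the trial-plan notice
        print(f"No news returned: {payload.get('message', 'unknown reason')}")
        return []
    return payload["data"]
```

Calling `extract_news_items(news_data)` on the response above would print the trial-plan notice and return an empty list.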

## Current code: Extracting News from API + Calculating Sentiment Scores
`def fetch_news_for_sector(sector, page)`: this function is similar to the code above and takes the sector and the page number to fetch as input. Reminder: one page holds at most 100 news items. The function returns the `data` field of the JSON response (`response_news.json()['data']`), which contains all of the information.

`def calculate_sentiment_for_news`: using VADER, I combine the title/headline and summary text to calculate the sentiment scores. As defined in the VADER documentation, you get 4 values:
- positive, neutral, and negative scores are the proportions of the text that fall in each category (so these should add up to 1, or close to it given floating-point rounding)
- compound is the combined score, from -1 to 1
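
If you want a label instead of a number, the VADER README suggests typical cutoffs on the compound score: positive at 0.05 or above, negative at -0.05 or below, and neutral in between. The helper below is a sketch of that rule; `label_from_compound` is my own hypothetical name and is not part of the `vaderSentiment` package.

```python
def label_from_compound(compound, threshold=0.05):
    """Map a VADER compound score in [-1, 1] to a coarse sentiment label."""
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"
```

The threshold is a parameter so you can tighten or loosen it for financial headlines if the default turns out to be too sensitive.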

In [6]:
# Calculate sentiment scores for the news items returned by the API

# First step: fetch one page of news for a sector
def fetch_news_for_sector(sector, page):
    news_base_url = 'https://stocknewsapi.com/api/v1/category?section=alltickers'
    url_news = f'{news_base_url}&sector={sector}&exchange=NYSE,NASDAQ&index=SP500&country=USA&type=article&metadata=1&date={date_range}&items=100&page={page}&extra-fields=id&token={API_KEY}'
    response_news = requests.get(url_news)
    if response_news.status_code == 200:
        # .get avoids a KeyError if a 200 response carries no 'data' field
        return response_news.json().get('data', [])
    else:
        print(f"Failed to fetch data for sector {sector} on page {page}, status code: {response_news.status_code}")
        print(f"Response content: {response_news.text}")
        return []

# Second step: compute the sentiment scores for one news item
def calculate_sentiment_for_news(title, text):
    analyzer = SentimentIntensityAnalyzer()
    # Score the headline and the summary together as one piece of text
    title_text = title + " " + text
    return analyzer.polarity_scores(title_text)

This part creates a dictionary called `news_by_sector`, which groups the financial news data by sector and then by ticker, and saves all the important components of each news item, including its sentiment scores.
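
Because each sector is fetched page by page, one possible refinement is to stop at the first empty page rather than requesting every page up to the limit. The sketch below shows that idea in isolation: `collect_pages` is a hypothetical helper, and `fake_fetch` is a stand-in for `fetch_news_for_sector` so the example runs without calling the API. Note that an empty list here could mean either "no more news" or "the request failed", so stopping early treats both cases the same way.

```python
def collect_pages(fetch_page, sector, max_pages=100):
    """Gather items page by page, stopping at the first empty page."""
    items = []
    for page in range(1, max_pages + 1):
        batch = fetch_page(sector, page)
        if not batch:
            # Nothing came back, so later pages are unlikely to help
            break
        items.extend(batch)
    return items

# Stand-in fetcher for illustration only: two pages of one item each, then nothing
def fake_fetch(sector, page):
    return [{"title": f"{sector} story {page}"}] if page <= 2 else []
```

Here `collect_pages(fake_fetch, "technology")` gathers the two items and makes three calls instead of one hundred.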

In [7]:
news_by_sector = {sector: {} for sector in sectors}
for sector in sectors:
    for page in range(1, 101):
        news_data = fetch_news_for_sector(sector, page)
        if not news_data:
            continue
        for news_item in news_data:
            news_title = news_item['title']
            news_text = news_item['text']
            sentiment_scores = calculate_sentiment_for_news(news_title, news_text)
            # File each item under the first ticker it lists
            ticker = news_item['tickers'][0]
            if ticker not in news_by_sector[sector]:
                news_by_sector[sector][ticker] = []
            news_item['sentiment_neg'] = sentiment_scores['neg']
            news_item['sentiment_neu'] = sentiment_scores['neu']
            news_item['sentiment_pos'] = sentiment_scores['pos']
            news_item['sentiment_tot'] = sentiment_scores['compound']
            news_by_sector[sector][ticker].append(news_item)

Failed to fetch data for sector technology on page 1, status code: 403
Response content: {"message":"Trial Plans can query up to 3 news items. Activate your plan today and query up to 100 news items."}
Failed to fetch data for sector technology on page 2, status code: 403
Response content: {"message":"Trial Plans can query up to 3 news items. Activate your plan today and query up to 100 news items."}
Failed to fetch data for sector technology on page 3, status code: 403
Response content: {"message":"Trial Plans can query up to 3 news items. Activate your plan today and query up to 100 news items."}
Failed to fetch data for sector technology on page 4, status code: 403
Response content: {"message":"Trial Plans can query up to 3 news items. Activate your plan today and query up to 100 news items."}
Failed to fetch data for sector technology on page 5, status code: 403
Response content: {"message":"Trial Plans can query up to 3 news items. Activate your plan today and query up to 100 news