# A.I.N.A. 1.0

#### **Artificially Intelligent News Anchor (A.I.N.A.)** is an AI that fetches current news, summarizes it, and reads it aloud. The inspiration behind AINA came from my desire to keep up with financial news without having to browse many different sources. Additionally, when I tried new sources or podcasts that covered financial news, they did not give me the kind of information I wanted. So I decided to create AINA, a way to quickly consume financial news in one place.

In [1]:
!pip install gtts

Collecting gtts
  Downloading gTTS-2.5.1-py3-none-any.whl (29 kB)
Installing collected packages: gtts
Successfully installed gtts-2.5.1


In [2]:
import pandas as pd
import numpy as np
import yfinance as yf

# Web scraping
import requests
from bs4 import BeautifulSoup

# Transformer
from transformers import pipeline

# Voice
from gtts import gTTS
from IPython.display import Audio, display

In [3]:
class AINA:

  def __init__(self) -> None:
    pass


  def get_links_titles(self, tickers_list):
    # Uses yfinance to collect one news link and title per ticker.
    # article_amt is a zero-based index, so 1 selects the second item in each news list.
    links = []
    titles = []
    article_amt = 1
    for tick in tickers_list:
      # Fetch the news list once per ticker instead of once per field
      article = yf.Ticker(tick).news[article_amt]
      links.append(article['link'])
      titles.append(article['title'])
    return links, titles



  def get_story(self, link_list, title_list):
    # Downloads each article and maps its title to a list of paragraph strings
    title_story = {}
    # Browser-like User-Agent, since some sites reject the default one sent by requests
    headers = {
      'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.3'}

    for title, url in zip(title_list, link_list):
      # Fetch the article
      response = requests.get(url, headers=headers, timeout=30)
      if response.status_code != 200:
        # Sentinel string that summarize_stories checks for
        title_story[title] = "Failed to retrieve the article."
        continue

      # Parse the HTML content
      soup = BeautifulSoup(response.text, 'html.parser')

      # Extract the text of every <p> element as the article body
      article_body = soup.find_all('p')
      title_story[title] = [paragraph.get_text() for paragraph in article_body]
    return title_story


  def summarize_stories(self, stories_dict):
    summarizer = pipeline("summarization", model='facebook/bart-large-cnn')

    summarized_stories = {}
    for title, story in stories_dict.items():
        if story != "Failed to retrieve the article.":
            # Join the paragraphs into a single string if story is a list
            if isinstance(story, list):
                story = " ".join(story)

            # Ensure the story is not too long for the model
            story = story[:2000]  # Truncate to the first 2000 characters

            summary = summarizer(story, max_length=200, min_length=100, do_sample=False)

            if summary:
                summarized_stories[title] = summary[0]['summary_text']
            else:
                summarized_stories[title] = "Summary could not be generated."
        else:
            summarized_stories[title] = story
    return summarized_stories

  def voice(self, summarized):
    # Iterate over the dictionary and convert each story to speech
    for title, story in summarized.items():
        print(f"Title: '{title}'")
        tts = gTTS(text=story, lang='en')

        # Save the audio file; replace characters such as '/' or '?' that are unsafe in file names
        safe_title = "".join(c if c.isalnum() or c in " -_" else "_" for c in title)
        audio_file = f"{safe_title}.mp3"
        tts.save(audio_file)

        # Play the audio file
        display(Audio(audio_file))
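
`summarize_stories` shortens each article with a plain character slice, which can cut the text in the middle of a sentence before it reaches the summarizer. Below is a minimal sketch of one alternative that cuts at the last sentence end inside the limit. The helper name `truncate_at_sentence` and the sample text are my own assumptions for illustration, not part of AINA, and the sentence detection is deliberately naive (it only looks for `. `, `! `, and `? `).

```python
def truncate_at_sentence(text: str, limit: int = 2000) -> str:
    """Return at most `limit` characters of `text`, cut at a sentence end when possible."""
    if len(text) <= limit:
        return text
    clipped = text[:limit]
    # Find the last sentence-ending punctuation mark that is followed by a space.
    cut = max(clipped.rfind(". "), clipped.rfind("! "), clipped.rfind("? "))
    if cut == -1:
        # No sentence boundary inside the window: fall back to the plain character cut.
        return clipped
    # Keep the punctuation mark itself, drop everything after it.
    return clipped[: cut + 1]


# Made-up sample text, only used to show the behavior.
sample = "Stocks rose on Monday. Bond yields fell. Analysts expect more volatility ahead."
print(truncate_at_sentence(sample, limit=45))  # prints: Stocks rose on Monday. Bond yields fell.
```

If this were used, it would stand in for the `story[:2000]` slice. The trade-off is that the summarizer may see slightly less text in exchange for a clean ending.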


In [4]:
if __name__ == '__main__':

  tickers = ['^GSPC', '^RUT', '^IXIC', '^DJI']
  aina = AINA()
  links, titles = aina.get_links_titles(tickers)
  story = aina.get_story(link_list = links, title_list = titles)
  summary = aina.summarize_stories(story)
  aina.voice(summary)
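
The run above relies on every ticker having a news item at index 1 with both a `link` and a `title` key. Here is a small sketch of how that lookup could be guarded so a ticker with too few news items is skipped instead of raising an `IndexError`. The function name `pick_article` and the sample data are hypothetical; the sample only mimics the two keys the class reads and is not real yfinance output.

```python
from typing import Optional, Tuple


def pick_article(news_items: list, index: int = 1) -> Optional[Tuple[str, str]]:
    """Return (link, title) for the item at `index`, or None if it is unavailable."""
    if not 0 <= index < len(news_items):
        return None
    item = news_items[index]
    link, title = item.get("link"), item.get("title")
    if not link or not title:
        # Skip items that are missing either field.
        return None
    return link, title


# Stand-in data shaped like the dictionaries the class indexes into.
fake_news = [
    {"link": "https://example.com/a", "title": "First story"},
    {"link": "https://example.com/b", "title": "Second story"},
]
print(pick_article(fake_news))      # ('https://example.com/b', 'Second story')
print(pick_article(fake_news[:1]))  # None
```

A caller would pass in the news list for one ticker and only append to `links` and `titles` when the result is not `None`.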


Title: 'Dave Ramsey Says Take Social Security at Age 62, But Only If You Do This With Each Check'


Title: 'The Stock Market Is Doing Something Unseen Since the Year 2000. History Says This Happens Next.'


Title: 'Netflix results, retail sales, and a chip update: What to watch this week'
