# Your personalized news podcast with Jina Reader, PromptPerfect, and Bark

This notebook will:

- Scrape RSS feeds from news sources for the latest articles.
- Summarize each article.
- Generate a one-paragraph news report from those summaries.
- Read it to you via text-to-speech.

This notebook is a companion to [this blog post](https://jina.ai/news/create-your-personalized-podcast-with-jina-reader-and-promptperfect) on Jina AI's blog.

## Define Settings

These settings define:

- **Feed URLs**: The feed URLs you want to extract data from. In this example they're from a couple of tech news websites.
- **Maximum quantities**: To keep this example manageable, we want to limit things to a certain number of feeds, news items, and sentences in the spoken output.

In [32]:
feed_urls = [
    "https://www.osnews.com/feed/",
    "https://www.theregister.com/headlines.atom"
]

In [33]:
# Maximum number of feeds to fetch
MAX_FEEDS = 10

# Maximum news items per feed to fetch
MAX_ENTRIES = 3

# Maximum sentences of the news script to convert to speech
VOICE_MAX_SENTENCES = 7

## Add API Key

You will be prompted to enter your PromptPerfect API key below.

In [34]:
import getpass
PROMPTPERFECT_KEY = getpass.getpass()



## Get Article URLs from Feeds

Extract the latest stories from the feeds we defined.

In [None]:
!pip install feedparser

In [37]:
import feedparser, requests

In [38]:
page_urls = []

In [39]:
for feed_url in feed_urls[:MAX_FEEDS]:
    feed = feedparser.parse(feed_url)
    for entry in feed["entries"][:MAX_ENTRIES]:
        page_urls.append(entry["link"])
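The loop above assumes every entry has a `link` and that no story appears in more than one feed. As a minimal sketch, assuming you want to skip entries without a link and drop duplicates, a helper like the following could replace it. The name `collect_links` is hypothetical and not part of the original notebook; it only relies on parsed feeds behaving like dictionaries, which `feedparser` results do.

```python
def collect_links(feeds, max_entries):
    """Gather up to max_entries links per parsed feed, skipping duplicates."""
    seen = set()
    links = []
    for feed in feeds:
        # A feed that failed to parse may have no "entries" at all
        for entry in feed.get("entries", [])[:max_entries]:
            link = entry.get("link")  # some entries may lack a link
            if link and link not in seen:
                seen.add(link)
                links.append(link)
    return links

# Possible use in this notebook (assumes feedparser is imported):
# parsed = (feedparser.parse(u) for u in feed_urls[:MAX_FEEDS])
# page_urls = collect_links(parsed, MAX_ENTRIES)
```

The order of first appearance is kept, so the news script still follows the order of the feeds.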

## Extract Article text

Pass each article URL collected from the feeds to Jina Reader, which returns the text of the article without clutter such as sidebars, headers, and footers.

In [40]:
articles = []

for url in page_urls:
    print(f"Processing {url}")
    reader_url = f"https://r.jina.ai/{url}"
    article = requests.get(reader_url)
    articles.append(article.text)
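The cell above stores whatever the server returns, including error pages, and a single network failure stops the whole loop. The sketch below shows one way to skip pages that fail, assuming that dropping a story is acceptable. The names `fetch_articles` and `fetch_with_requests` and the 30-second timeout are illustrative choices, not part of the original notebook.

```python
READER_PREFIX = "https://r.jina.ai/"

def fetch_articles(page_urls, fetch):
    """Fetch each URL through Jina Reader, skipping pages that fail.

    `fetch` is any callable that takes a URL and returns the response
    text, raising an exception on failure.
    """
    articles = []
    for url in page_urls:
        try:
            articles.append(fetch(f"{READER_PREFIX}{url}"))
        except Exception as exc:  # broad on purpose: one bad page should not stop the run
            print(f"Skipping {url}: {exc}")
    return articles

def fetch_with_requests(reader_url, timeout=30):
    """One possible `fetch` callable built on the requests library."""
    import requests  # imported here so fetch_articles has no hard dependency

    response = requests.get(reader_url, timeout=timeout)
    response.raise_for_status()  # turn 4xx/5xx responses into exceptions
    return response.text

# Possible use in this notebook:
# articles = fetch_articles(page_urls, fetch_with_requests)
```

Passing the fetch function as an argument also makes it possible to try the loop without network access.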

Processing https://www.osnews.com/story/139409/logitech-adds-chatgpt-to-its-computer-mice/
Processing https://www.osnews.com/story/139407/the-man-who-killed-google-search/
Processing https://www.osnews.com/story/139405/fedora-40-released-with-kde-plasma-6-and-gnome-46/
Processing https://go.theregister.com/feed/www.theregister.com/2024/04/24/ads_on_gov_uk_websites/
Processing https://go.theregister.com/feed/www.theregister.com/2024/04/24/japan_telco_app_ip/
Processing https://go.theregister.com/feed/www.theregister.com/2024/04/24/china_telcos_buying_ai_servers/


## Summarize each article

Pass each article text to a customized Prompt-as-a-Service on PromptPerfect, generating a summary of each.

Since we're using several Prompts-as-a-Service, let's define one function that we can use throughout the notebook:
- The function's `prompt_id` parameter defines which prompt we call. Each Prompt-as-a-Service has a unique ID.
- The `template_dict` parameter supplies the prompt's variables, such as the article text or the concatenated summaries.

In [41]:
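def get_paas_response(prompt_id, template_dict):
    """Call a PromptPerfect Prompt-as-a-Service and return its output text."""
    url = f"https://api.promptperfect.jina.ai/{prompt_id}"

    headers = {
        "x-api-key": f"token {PROMPTPERFECT_KEY}",
        "Content-Type": "application/json"
    }

    # template_dict fills in the variables defined in the prompt template
    response = requests.post(url, headers=headers, json={"parameters": template_dict})
    if response.status_code == 200:
        return response.json()["data"]
    # On any other status, return the raw response body so the error is visible
    return response.text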
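Because the function above returns the raw response body on a non-200 status, an error message can end up in the list of summaries and later be read aloud. As a hedged alternative, the sketch below raises an exception instead. The name `get_paas_response_strict`, the explicit `api_key` argument, the 60-second timeout, and the injectable `post` parameter are all assumptions made for illustration; the URL, header, and payload shape are copied from the cell above.

```python
def get_paas_response_strict(prompt_id, template_dict, api_key, post=None):
    """Like get_paas_response, but raise on HTTP errors instead of returning them."""
    if post is None:
        import requests  # default to requests.post when no callable is supplied
        post = requests.post

    url = f"https://api.promptperfect.jina.ai/{prompt_id}"
    headers = {
        "x-api-key": f"token {api_key}",
        "Content-Type": "application/json",
    }
    response = post(url, headers=headers, json={"parameters": template_dict}, timeout=60)
    response.raise_for_status()  # raises for 4xx/5xx responses
    return response.json()["data"]
```

A caller can then wrap each call in `try`/`except` and decide whether to skip the article or stop the run.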

In [42]:
summaries = []

for article in articles:
    summary = get_paas_response(prompt_id="mkuMXLdx1kMU0Xa8l19A", template_dict={"article": article})
    summaries.append(summary)

In [43]:
summaries

['Logitech has added a new feature called Logi AI Prompt Builder to its computer mice, which utilizes Generative AI to enhance efficiency and creativity for users. The tool allows users to rephrase, summarize, and create custom prompts with ChatGPT, with minimal disruption to their workflow. This unexpected addition has surprised Logitech mice users, who can now access the AI prompt builder tool after a software update.',
 "The article discusses how Google's finance and advertising teams, led by Raghavan with CEO Sundar Pichai's approval, intentionally made Google Search worse in order to increase profits. This approach, referred to as the Rot Economy, focuses on maximizing revenue at the expense of user experience, turning beloved products into frustrating tools that require users to fight against the company's intentions. The author encourages readers to look up the emails detailing this story, highlighting the negative impact of prioritizing financial gain over product quality.",
 "

In [44]:
# Put all of the summaries into one text string as bullet points
concat_summaries = "\n".join(f"- {summary}" for summary in summaries)

## Convert summaries to news report script

Use another Prompt-as-a-Service to generate a natural-sounding news report from the summaries.

In [45]:
news_script = get_paas_response(prompt_id="tmW07mipzJ14HgAjOcfD", template_dict={"summaries": concat_summaries})

## Convert news report script to speech

We'll use the [gTTS library](https://pypi.org/project/gTTS/), a Python interface to Google Translate's text-to-speech API, to convert our news report script to speech.

In [155]:
!pip install gTTS



In [162]:
from gtts import gTTS

tts = gTTS(news_script, tld="us")
tts.save("output.mp3")
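The `VOICE_MAX_SENTENCES` setting defined at the top of the notebook is not applied in the cell above, so the whole script is converted to speech. A minimal sketch of one way to apply it follows, assuming a naive rule that a sentence ends at `.`, `!`, or `?` followed by whitespace (abbreviations such as "U.S. government" will be split too early). The helper name `limit_sentences` is hypothetical.

```python
import re

def limit_sentences(text, max_sentences):
    """Return at most the first max_sentences sentences of text."""
    # Naive split: break on whitespace that follows ., !, or ?
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return " ".join(sentences[:max_sentences])

# Possible use in this notebook:
# tts = gTTS(limit_sentences(news_script, VOICE_MAX_SENTENCES), tld="us")
```

For example, `limit_sentences("One. Two! Three? Four.", 2)` returns `"One. Two!"`.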

## Play audio

In [161]:
from IPython.display import Audio

Audio("output.mp3")

## Next steps

We're not going to dive into how to turn this content into a podcast. There are plenty of better guides out there that can cover that!