engineervix/zed-news

zed-news

Automated news podcast consisting of AI-powered updates from various Zambian 🇿🇲 sources.

Introduction

This is a tool that gathers news from various Zambian 🇿🇲 sources, summarizes the news items and presents the news as a podcast.

It consists primarily of two parts / components:

  • core -- this is primarily Python code, where the following tasks are handled:
    • gather the news using requests, feedparser and beautifulsoup4,
    • summarise the news using LLMs,
    • create the podcast transcript,
    • convert text to speech using AWS Polly,
    • process the audio using ffmpeg, and
    • generate content for the website.
  • web -- this is an 11ty project, consisting of logic to build a static site for the podcast, including an RSS feed.
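
To give a feel for the first step (gathering news), here is a minimal sketch of pulling headlines out of an RSS feed. It is not the project's actual code: the project uses requests, feedparser and beautifulsoup4, whereas this sketch substitutes the standard library's xml.etree.ElementTree so that it runs without extra dependencies. The sample feed and the extract_items function name are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample feed, inlined so the example needs no network access.
SAMPLE_RSS = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example News</title>
    <item>
      <title>First headline</title>
      <link>https://example.com/first</link>
    </item>
    <item>
      <title>Second headline</title>
      <link>https://example.com/second</link>
    </item>
  </channel>
</rss>"""


def extract_items(rss_text):
    """Return a list of {"title", "link"} dicts, one per <item> in the feed."""
    root = ET.fromstring(rss_text)
    items = []
    for item in root.iter("item"):
        items.append(
            {
                # findtext returns None when the tag is missing, hence the "or"
                "title": (item.findtext("title") or "").strip(),
                "link": (item.findtext("link") or "").strip(),
            }
        )
    return items


if __name__ == "__main__":
    for entry in extract_items(SAMPLE_RSS):
        print(entry["title"], "->", entry["link"])
```

In a real run the feed text would come from an HTTP request rather than a constant, and each linked article would still need to be fetched and cleaned before summarisation.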

Why this project?

  • I'm generally terrible at keeping up with current affairs
  • I wanted to learn how to work with AI tools while solving a real problem
  • I was inspired by Hackercast

Development

  • clone / fork the project
  • cd into the project directory

Core

Note

You need to have docker and docker-compose on your machine

On your machine:

  • you need to have poetry installed

  • create a Python virtual environment

  • upgrade pip to the latest version

    pip install --upgrade pip
  • install dependencies

    poetry install
  • update environment variables.

    # copy .env.sample to .env
    cp -v .env.sample .env
    
    # Now you can update the relevant values in the .env file
  • build images and spin up docker containers

    inv up --build
  • access the app container

    inv exec app bash

In the container:

  • you can run tests

    inv test
  • you can run the program

    inv toolchain

See available invoke tasks with invoke -l

The project uses pgweb to help visualize database changes. You can access this in your browser at http://127.0.0.1:8081

Web

This project uses Node v16. I use volta to manage node versions. If you have volta installed on your machine, then it'll automatically use the correct Node binary for this project.

  • install frontend dependencies

    npm install
  • start the dev server, accessible at http://127.0.0.1:8080/

    npm start

See other available scripts in package.json.

Deployment

The final outputs of this project are:

Warning

Ensure that environment variables are updated accordingly for both core and web.

For a smooth, unattended setup, please follow these steps:

  1. Set up a *nix machine (it can be your laptop, a VPS, etc.) with a Python virtual environment for the project, and make sure docker and docker-compose are installed.

  2. Configure a cron job on the machine to run the cron.sh script located in the repository root. This script will handle the automated generation and deployment process.

  3. Ensure that the machine has git properly configured. This is necessary for the cron.sh script to push the generated content to the repository, triggering the build and deployment.

By following these steps, you can automate the deployment process and keep your project up to date without manual intervention.

Note

The cron.sh script uses apprise to notify the owner when a new episode is ready. You'll need to check the apprise docs on how to configure Telegram.

Feel free to adapt the deployment setup to your specific requirements and preferred hosting platforms.

Contributing

This project follows the all-contributors specification. Contributions, issues and feature requests are most welcome! A good place to start is by helping out with the unchecked items in the TODO section of this README!

Feel free to check the issues page and take a look at the contributing guide before you get started.

To maintain code quality and formatting consistency, we use pre-commit hooks. These hooks automatically check and format your code before each commit, which keeps the codebase clean and consistent. Set up the Git pre-commit hooks by running the following:

pre-commit install && pre-commit install --hook-type commit-msg

See pre-commit-config.yaml for more details. In addition, please note the following:

  • if you're making code contributions, please try to write some tests to accompany your code, and ensure that the tests pass. Also, where necessary, update the docs so that they reflect your changes.
  • your commit messages should follow the conventions described here. Write your commit message in the imperative: "Fix bug" and not "Fixed bug" or "Fixes bug". Once you are done, please create a pull request.

TODO

Docs

  • Add architecture diagram

Dev

Frontend (Web)

  • Create a More ways to listen button with a popup/modal so that people can choose multiple services
  • Keep things DRY. For example, the More ways to listen modal on the home and about pages, the header and footer icons.
  • Toggle Dark/Light mode
  • Improve the mobile UI. For example, the audio player controls
  • Improve a11y. For instance, learn more about using the aria-current attribute
  • Implement search on the web app

Backend (Core)

  • Add a separate module for summarization backends so we can choose which one to work with
  • Add more robust error handling on requests and feedparser jobs as well as all other operations, such as connecting to AWS Polly, etc.
  • Add task to perform substitution so that, for instance, K400 is written as 400 Kwacha. The AWS Polly voices fail to read Zambian money correctly.
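
The currency substitution in the last item could be sketched with a regular expression, as below. This is only an illustration of the idea, not an existing task in the project; the pattern, the KWACHA_PATTERN name and the spell_out_kwacha function are all hypothetical, and the sketch assumes amounts are written as "K" directly followed by digits, with optional thousands separators and decimals.

```python
import re

# "K" at a word boundary, followed by an amount such as 400, 1,500 or 2.50.
# The amount is captured so it can be re-emitted before the word "Kwacha".
KWACHA_PATTERN = re.compile(r"\bK(\d+(?:,\d{3})*(?:\.\d+)?)")


def spell_out_kwacha(text):
    """Rewrite amounts like 'K400' as '400 Kwacha' for text-to-speech."""
    return KWACHA_PATTERN.sub(r"\1 Kwacha", text)


if __name__ == "__main__":
    print(spell_out_kwacha("Mealie meal now costs K400."))
    print(spell_out_kwacha("The fee rose from K1,500 to K2,000.50 this year."))
```

Amounts followed by a magnitude word (for example "K400 million") would come out as "400 Kwacha million", so a fuller version would need to handle those cases separately.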

Features for future releases

  • Add Diamond TV as a news source. Might be a good idea to replace Muvi TV with Diamond TV because the latter seems to have infrequent updates. Also, we don't want too many news items -- it kills the whole point of this project -- to get the latest updates delivered in a concise manner.
  • Connect with social media platforms and automagically tweet and post to Facebook when a new episode is out.
  • Incorporate a newsletter version where the news is sent to your mailbox in a nice, clean format. People can subscribe / unsubscribe.
  • Mention the weather in Lusaka, Livingstone, Kabwe, etc. Perhaps the weather forecast for the following day?
  • Mention exchange rates
  • Clean up the news by consolidating similar articles from different sources. In other words, let's make this DRY.
  • Find a way of training the voice to learn how to pronounce Zambian words.
  • Find a way to summarize for free, without relying on OpenAI's API. Perhaps train your own model, learn how to leverage tools like NLTK, spaCy, etc.
  • Find a way to make a closing statement based on the news. Something like, "Don't forget to register your SIM card before the ZICTA deadline ..."
  • Keep the background music running throughout the show
  • Different background music for each day of the week
  • Possibly allow for passing of an argument variable for the voice, or dynamically choose a voice from a list, just like the random intros and outros.
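
As a starting point for the "consolidate similar articles" idea above, one possible approach is to compare headlines with the standard library's difflib and keep only the first of any near-duplicates. This is a rough sketch, not something the project currently does; the function names and the 0.8 threshold are assumptions and would need tuning against real headlines.

```python
from difflib import SequenceMatcher


def is_similar(a, b, threshold=0.8):
    """True when two headlines are near-duplicates, ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold


def consolidate(headlines, threshold=0.8):
    """Keep the first headline of each group of near-duplicates, in order."""
    kept = []
    for headline in headlines:
        if not any(is_similar(headline, seen, threshold) for seen in kept):
            kept.append(headline)
    return kept


if __name__ == "__main__":
    sample = [
        "Kwacha gains against the dollar",
        "Kwacha gains against the dollar today",
        "Parliament passes new mining bill",
    ]
    for headline in consolidate(sample):
        print(headline)
```

Comparing headlines character by character is cheap but shallow: two sources reporting the same story with different wording would not be merged, so comparing article bodies or summaries may work better in practice.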

Credits

Music

Note

These audio files have the gain reduced by 20 dB, like this:

ffmpeg -i intro.src.mp3 -af "volume=-20dB" intro.mp3
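
For a sense of scale, a decibel change maps to a linear amplitude factor of 10^(dB/20), so -20 dB leaves the signal at one tenth of its original amplitude. The small helper below (a hypothetical name, not part of the project) shows the arithmetic:

```python
def db_to_amplitude(db):
    """Convert a gain in decibels to a linear amplitude multiplier."""
    return 10 ** (db / 20)


if __name__ == "__main__":
    # -20 dB corresponds to multiplying the amplitude by about 0.1
    print(round(db_to_amplitude(-20), 4))
```

ffmpeg's volume filter also accepts a plain multiplier, so "volume=0.1" should be roughly equivalent to "volume=-20dB".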

Icon

News Sources