A neural network generated daily digest based on RSS feeds

zmactep/gpt-digest

GPT Digest

Try it! (en) Try it! (ru)

This project was created to produce daily digests of biotech news. You may need to edit the prompts to suit your own purpose.

This repository includes a project designed to download RSS feeds, filter them with regular expressions, and generate daily digests using OpenAI GPT models (GPT-3.5 and GPT-4 are used for different stages of the pipeline). Here's a quick overview:

  1. Download RSS feeds: The project reads data from various RSS feeds.
  2. Filter using regex: Filters the content of the feeds based on pre-defined regular expressions.
  3. Detect topics: Utilizes OpenAI's GPT to analyze and categorize text into topics.
  4. Generate daily digest: Rounds up the categorized content and generates a daily digest using OpenAI GPT models.
  5. Translate digest: Translates the digest into Russian.
  6. Publish on the web: Publishes the digests for web access.
  7. Publish to Telegram: Posts the highlights and a link to the digest to a Telegram channel.

Enjoy exploring the code and feel free to contribute!

The text above is generated by GPT-4
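
To make the first step of the overview concrete, here is a minimal sketch of pulling titles and links out of an RSS 2.0 document using only the Python standard library. It is not taken from this repository: the function name `parse_items` and the sample feed are assumptions made for illustration, and the project itself may use a different parser.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample feed, used only for this illustration.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>My Feed</title>
    <item>
      <title>New PD-1 inhibitor data</title>
      <link>https://example.com/a</link>
    </item>
    <item>
      <title>Quarterly earnings report</title>
      <link>https://example.com/b</link>
    </item>
  </channel>
</rss>"""


def parse_items(xml_text):
    """Return a list of {'title', 'link'} dicts, one per <item> element."""
    root = ET.fromstring(xml_text)
    items = []
    for item in root.iter("item"):
        items.append({
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
        })
    return items


for entry in parse_items(SAMPLE_RSS):
    print(entry["title"], entry["link"])
```

The later stages (regex filtering, topic detection, digest generation) would then operate on a list of entries like this one.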

Prerequisites

  • conda
  • docker (with compose plugin)
  • Telegram bot
  • 2 or 3 Telegram channels
  • public domain name (for web access)

Installation

Create one or more CSV source files with RSS feeds using this format:

My Feed; https://example.com/rss
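
As an illustration of this format, the sketch below reads semicolon-separated `name; url` lines with Python's `csv` module. The helper name `load_sources` is hypothetical and the repository's own loader may behave differently, for example in how it treats blank or malformed lines.

```python
import csv
import io


def load_sources(text):
    """Parse 'name; url' lines into a list of (name, url) tuples."""
    reader = csv.reader(io.StringIO(text), delimiter=";")
    sources = []
    for row in reader:
        if len(row) < 2:
            continue  # assumption: silently skip blank or malformed lines
        sources.append((row[0].strip(), row[1].strip()))
    return sources


print(load_sources("My Feed; https://example.com/rss\n"))
```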

Create one or more TXT keyword files containing the regular expressions you want to filter by:

PD-?1
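
The following sketch shows how a keyword file like this could be applied to feed titles. It assumes one pattern per line and case-insensitive matching; both are assumptions for the example rather than confirmed behavior of this project, and the function names are hypothetical.

```python
import re


def load_patterns(text):
    """Compile one regular expression per non-empty line."""
    return [
        re.compile(line.strip(), re.IGNORECASE)  # case-insensitivity is an assumption
        for line in text.splitlines()
        if line.strip()
    ]


def matches_any(patterns, text):
    """Return True if at least one pattern is found anywhere in the text."""
    return any(p.search(text) for p in patterns)


patterns = load_patterns("PD-?1\n")
titles = ["Anti-PD-1 antibody approved", "PD1 blockade in melanoma", "Quarterly earnings report"]
kept = [t for t in titles if matches_any(patterns, t)]
print(kept)
```

The pattern `PD-?1` makes the hyphen optional, so both "PD-1" and "PD1" are kept while unrelated titles are dropped.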

Create a conda environment:

conda env create -f environment.yml

Set these environment variables:

export DIGEST_PATH="..." # Path to the current repository local installation
export CONDA_PATH="..." # Path of conda installation

export OPENAI_API_KEY="sk-..." # OpenAI API Key, should support GPT-4
export TELEGRAM_DIGEST_KEY="..." # Telegram Bot API token, obtained from @BotFather

export TELEGRAM_DIGEST_EN_CHANNEL="..." # English channel chat_id
export TELEGRAM_DIGEST_RU_CHANNEL="..." # Russian channel chat_id
export TELEGRAM_DIGEST_TEST_CHANNEL="..." # Test channel chat_id

export DIGEST_DOMAIN="example.com" # FQDN of the domain for web access
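
Before running the pipeline, it can help to confirm that these variables are actually set. The check below is a small sketch, not part of the repository; the helper name `missing_vars` is hypothetical, and treating every listed variable as mandatory is an assumption (the test channel may be optional).

```python
import os

# Variable names copied from the list above.
# Assumption: all of them are required; the test channel may in fact be optional.
DIGEST_VARS = [
    "DIGEST_PATH",
    "CONDA_PATH",
    "OPENAI_API_KEY",
    "TELEGRAM_DIGEST_KEY",
    "TELEGRAM_DIGEST_EN_CHANNEL",
    "TELEGRAM_DIGEST_RU_CHANNEL",
    "TELEGRAM_DIGEST_TEST_CHANNEL",
    "DIGEST_DOMAIN",
]


def missing_vars(env, names=DIGEST_VARS):
    """Return the names that are unset or empty in the given mapping."""
    return [name for name in names if not env.get(name)]


missing = missing_vars(os.environ)
if missing:
    print("Missing environment variables:", ", ".join(missing))
else:
    print("All digest environment variables are set.")
```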

Build the digest Docker image:

docker build . -t digest

Copy cron_task.sh to the cron directory that matches the schedule you want, e.g.:

cp cron_task.sh /etc/cron.daily/

Run a compose with

Test usage

You can use a test mode to send all output to a test channel:

TEST_RUN=1 ./cron_task.sh

You can also run the individual components of the package:

% python -m digest.generate --help
usage: digest.generate [-h] [--output OUTPUT] [--summaries] [--highlights]
                       [--fix-links]

options:
  -h, --help       show this help message and exit
  --output OUTPUT  custom path to write digest
  --summaries      add topics summaries
  --highlights     add daily highlights
  --fix-links      fix meshed up links

% python -m digest.translate --help
usage: digest.translate [-h] [--input INPUT] [--output OUTPUT]

options:
  -h, --help       show this help message and exit
  --input INPUT    custom path of digest to translate
  --output OUTPUT  custom path to write translated digest

% python -m digest.telegram --help 
usage: digest.telegram [-h] --input INPUT_PATH [--english] [--russian]
                       [--only-highlights]

options:
  -h, --help          show this help message and exit
  --input INPUT_PATH  input path of digest
  --english           work with english channel
  --russian           work with russian channel
  --only-highlights   send ony highlights and a URL
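
For readers who want to script against these commands, the sketch below builds an `argparse` parser with the same options as the `digest.telegram` help text above. It is a reconstruction from that help output only: the function name `build_parser` is hypothetical, the help strings are paraphrased, and the real module may define its arguments differently.

```python
import argparse


def build_parser():
    """Build a parser mirroring the options listed in the help text above."""
    parser = argparse.ArgumentParser(prog="digest.telegram")
    # dest="input_path" makes argparse display the value as INPUT_PATH in the usage line.
    parser.add_argument("--input", dest="input_path", required=True,
                        help="input path of digest")
    parser.add_argument("--english", action="store_true",
                        help="work with the English channel")
    parser.add_argument("--russian", action="store_true",
                        help="work with the Russian channel")
    parser.add_argument("--only-highlights", action="store_true",
                        help="send only highlights and a URL")
    return parser


# The file name "digest.md" is a placeholder for this example.
args = build_parser().parse_args(["--input", "digest.md", "--russian"])
print(args.input_path, args.english, args.russian, args.only_highlights)
```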
