This project was created to produce daily digests of biotech news. You may need to edit the prompts for your own purpose.
This repository includes a project designed to download RSS feeds, filter them with regular expressions, and generate daily digests using OpenAI GPT models (GPT-3.5 and GPT-4 are used for different stages of the pipeline). Here's a quick overview:
- Download RSS feeds: The project reads data from various RSS feeds.
- Filter using regex: Filters the content of the feeds based on pre-defined regular expressions.
- Detect topics: Utilizes OpenAI's GPT to analyze and categorize text into topics.
- Generate daily digest: Rounds up the categorized content and generates a daily digest using OpenAI GPT models.
- Translate: Translates the digest into Russian.
- Publish to the web: Publishes digests for web access.
- Publish to Telegram: Posts highlights and a link to the digest to a Telegram channel.
Enjoy exploring the code and feel free to contribute!
The text above is generated by GPT-4
- conda
- docker (with compose plugin)
- Telegram bot
- 2 or 3 Telegram channels
- public domain name
Create one or more CSV source files with RSS feeds using this format:
My Feed; https://example.com/rss
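As a rough illustration of the source file format, the sketch below parses such a file into name and URL pairs. The function name `parse_sources` is hypothetical and not part of this package, and the sketch assumes there is no header row and that fields are separated by a semicolon as in the example above.

```python
import csv
import io


def parse_sources(text):
    """Parse 'name; url' lines into (name, url) tuples.

    Hypothetical helper: assumes no header row and a ';' delimiter.
    Blank or incomplete lines are skipped.
    """
    feeds = []
    reader = csv.reader(io.StringIO(text), delimiter=";")
    for row in reader:
        if len(row) < 2:
            continue  # blank line or line without a URL field
        name, url = row[0].strip(), row[1].strip()
        if name and url:
            feeds.append((name, url))
    return feeds


sample = "My Feed; https://example.com/rss\n"
print(parse_sources(sample))
```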
Create one or more TXT keywords files with regexps you want to use to filter:
PD-?1
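To show what a pattern like the one above would match, here is a minimal sketch of regex filtering. The helper names `load_patterns` and `matches_any` are hypothetical, and the one-pattern-per-line layout and case-insensitive matching are assumptions; the package's own filtering may behave differently.

```python
import re


def load_patterns(text):
    """Compile one regex per non-empty line (assumed file layout)."""
    return [
        re.compile(line.strip(), re.IGNORECASE)  # case-insensitivity is an assumption
        for line in text.splitlines()
        if line.strip()
    ]


def matches_any(patterns, title):
    """Return True if any pattern is found anywhere in the title."""
    return any(p.search(title) for p in patterns)


patterns = load_patterns("PD-?1\n")
for title in ["Anti-PD-1 therapy update", "PD1 inhibitor trial", "PD-L1 expression study"]:
    print(title, "->", matches_any(patterns, title))
```

The optional hyphen in `PD-?1` covers both "PD-1" and "PD1", but it does not match "PD-L1", which would need its own pattern.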
Create a conda environment:
conda env create -f environment.yml
Set these environment variables:
export DIGEST_PATH="..." # Path to the current repository local installation
export CONDA_PATH="..." # Path of conda installation
export OPENAI_API_KEY="sk-..." # OpenAI API Key, should support GPT-4
export TELEGRAM_DIGEST_KEY="..." # Telegram Bot API token, obtained from @BotFather
export TELEGRAM_DIGEST_EN_CHANNEL="..." # English channel chat_id
export TELEGRAM_DIGEST_RU_CHANNEL="..." # Russian channel chat_id
export TELEGRAM_DIGEST_TEST_CHANNEL="..." # Test channel chat_id
export DIGEST_DOMAIN="example.com" # FQDN of the domain for web access
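Before running the pipeline it can help to confirm that all of the variables above are set. The following is a small optional sketch, not part of the package; `missing_vars` is a hypothetical helper, and treating empty values as missing is an assumption.

```python
import os

# Variable names taken from the list above.
REQUIRED_VARS = [
    "DIGEST_PATH",
    "CONDA_PATH",
    "OPENAI_API_KEY",
    "TELEGRAM_DIGEST_KEY",
    "TELEGRAM_DIGEST_EN_CHANNEL",
    "TELEGRAM_DIGEST_RU_CHANNEL",
    "TELEGRAM_DIGEST_TEST_CHANNEL",
    "DIGEST_DOMAIN",
]


def missing_vars(env=None):
    """Return the required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


missing = missing_vars()
if missing:
    print("Missing environment variables:", ", ".join(missing))
else:
    print("All required environment variables are set.")
```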
Create digest docker image:
docker build . -t digest
Copy cron_task.sh to the cron directory for the period you want, e.g.:
cp cron_task.sh /etc/cron.daily/
Run a compose with
You can use a test mode to send all output to a test channel:
TEST_RUN=1 ./cron_task.sh
You can also run the individual components of the package on their own:
% python -m digest.generate --help
usage: digest.generate [-h] [--output OUTPUT] [--summaries] [--highlights]
                       [--fix-links]
options:
  -h, --help       show this help message and exit
  --output OUTPUT  custom path to write digest
  --summaries      add topics summaries
  --highlights     add daily highlights
  --fix-links      fix meshed up links
% python -m digest.translate --help
usage: digest.translate [-h] [--input INPUT] [--output OUTPUT]
options:
  -h, --help       show this help message and exit
  --input INPUT    custom path of digest to translate
  --output OUTPUT  custom path to write translated digest
% python -m digest.telegram --help
usage: digest.telegram [-h] --input INPUT_PATH [--english] [--russian]
                       [--only-highlights]
options:
  -h, --help          show this help message and exit
  --input INPUT_PATH  input path of digest
  --english           work with english channel
  --russian           work with russian channel
  --only-highlights   send ony highlights and a URL
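The three components can be chained by hand. The sketch below only builds and prints the command lines, using the flags documented in the help output above; it does not run them, since doing so would call the OpenAI and Telegram APIs. The function `build_commands` and the file names are hypothetical, and the order of steps is an assumption rather than a copy of what cron_task.sh does.

```python
import shlex
import sys


def build_commands(digest_path, translated_path):
    """Build command lines for generate, translate, and publish steps.

    Uses only flags shown in the --help output; paths are placeholders.
    """
    py = sys.executable
    return [
        [py, "-m", "digest.generate", "--output", digest_path,
         "--summaries", "--highlights"],
        [py, "-m", "digest.translate", "--input", digest_path,
         "--output", translated_path],
        [py, "-m", "digest.telegram", "--input", digest_path,
         "--english", "--only-highlights"],
        [py, "-m", "digest.telegram", "--input", translated_path,
         "--russian", "--only-highlights"],
    ]


# Hypothetical file names, for illustration only.
for cmd in build_commands("digest_en.md", "digest_ru.md"):
    print(shlex.join(cmd))
```

To execute the steps, each list could be passed to `subprocess.run(cmd, check=True)` once the environment variables are set.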