An automated Python-based system that collects articles from multiple sources, filters them, and publishes updates directly to a Telegram channel.
This project is designed for content automation, especially useful for research platforms, blogs, and news aggregation systems.
This application performs the following tasks:
- Reads a list of sources (websites/RSS feeds)
- Extracts new article URLs
- Avoids duplicates using tracking
- Optionally classifies articles using AI
- Sends updates to Telegram automatically
- Stores processed data for tracking
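The duplicate-tracking step can be illustrated with a small sketch. It assumes the tracking file holds one URL per line, which matches how `seen_urls.txt` is described but is not confirmed by the project code; the function names `load_seen`, `filter_new`, and `mark_seen` are hypothetical.

```python
from pathlib import Path


def load_seen(path: Path) -> set[str]:
    """Return the URLs already recorded, or an empty set if the file is missing."""
    if not path.exists():
        return set()
    lines = path.read_text(encoding="utf-8").splitlines()
    return {line.strip() for line in lines if line.strip()}


def filter_new(urls: list[str], seen: set[str]) -> list[str]:
    """Keep unseen URLs in their original order, dropping repeats within the batch."""
    fresh: list[str] = []
    for url in urls:
        if url not in seen and url not in fresh:
            fresh.append(url)
    return fresh


def mark_seen(path: Path, urls: list[str]) -> None:
    """Append newly processed URLs to the tracking file, one per line."""
    with path.open("a", encoding="utf-8") as handle:
        for url in urls:
            handle.write(url + "\n")
```

Appending only after a URL has been handled successfully means a failed run can retry the same article later.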
- Python 3
- requests – HTTP requests for static websites
- BeautifulSoup (bs4) – HTML parsing
- Playwright – Dynamic website scraping (JavaScript support)
- dotenv – Environment variable management
- Telegram Bot API – Sending messages
- CSV / JSON – Data storage
- Cron – Task scheduling (automation)
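To show what the HTML parsing step involves, here is a minimal link-extraction sketch. It deliberately uses the standard library's `html.parser` as a stand-in so it runs without extra packages; the project itself lists BeautifulSoup, where the rough equivalent would be `soup.find_all("a", href=True)`. The names `LinkCollector` and `extract_links` are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collects absolute http(s) links found in <a href="..."> tags."""

    def __init__(self, base_url: str) -> None:
        super().__init__()
        self.base_url = base_url
        self.links: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        # Resolve relative links against the page URL
        absolute = urljoin(self.base_url, href)
        # Skip mailto:, javascript:, and other non-web schemes, and repeats
        if urlparse(absolute).scheme in ("http", "https") and absolute not in self.links:
            self.links.append(absolute)


def extract_links(html: str, base_url: str) -> list[str]:
    """Return the unique absolute links in the given HTML, in document order."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    return parser.links
```

A real scraper would usually narrow this further, for example by keeping only links whose path looks like an article.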
oped-agent/
│
├── app-telegram.py # Main automation script
├── sources.txt # List of source URLs
├── seen_urls.txt # Tracks processed URLs
├── data/ # Stored article data (CSV/JSON)
├── .env # Environment variables (Telegram token, etc.)
├── requirements.txt # Python dependencies
└── README.md
git clone https://github.com/drmashiur/oped-agent.git
cd oped-agent
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
python -m playwright install
(Optional – install dependencies for Linux)
python -m playwright install --with-deps
Create a .env file:
# OpenAI API Key
OPENAI_API_KEY=your_openai_api_key_here
# Telegram Bot Configuration
TELEGRAM_BOT_TOKEN=your_telegram_bot_token_here
TELEGRAM_CHAT_ID=your_telegram_chat_id_here
You can use env.example as a template for your own .env file.
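A script might read and validate these settings as sketched below. This is an assumption about usage, not the project's actual code: `load_config` and `REQUIRED_KEYS` are hypothetical names, and the OpenAI key is treated as optional only because classification is described as optional. `load_dotenv` comes from the python-dotenv package and is skipped if that package is not installed.

```python
import os

try:
    from dotenv import load_dotenv  # provided by the python-dotenv package
except ImportError:
    load_dotenv = None

# Assumed to be mandatory because Telegram delivery needs both values
REQUIRED_KEYS = ("TELEGRAM_BOT_TOKEN", "TELEGRAM_CHAT_ID")


def load_config(environ=None) -> dict[str, str]:
    """Return the settings as a dict, raising if a required one is missing."""
    if environ is None:
        if load_dotenv is not None:
            load_dotenv()  # copies values from .env into os.environ
        environ = os.environ
    missing = [key for key in REQUIRED_KEYS if not environ.get(key)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    config = {key: environ[key] for key in REQUIRED_KEYS}
    # Empty string means "AI classification disabled" in this sketch
    config["OPENAI_API_KEY"] = environ.get("OPENAI_API_KEY", "")
    return config
```

Failing early with a clear message is easier to debug than a Telegram error halfway through a scheduled run.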
Edit sources.txt and add URLs:
https://example.com/rss
https://another-source.com
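Reading this file could look like the sketch below, assuming one URL per line. Skipping blank lines and lines starting with `#` is an added convenience of this example, not documented behavior, and `parse_sources` and `load_sources` are hypothetical names.

```python
from pathlib import Path


def parse_sources(text: str) -> list[str]:
    """Return the unique http(s) URLs in the text, in their original order."""
    sources: list[str] = []
    for raw in text.splitlines():
        line = raw.strip()
        # Assumption: blank lines and '#' comments are ignored
        if not line or line.startswith("#"):
            continue
        if line.startswith(("http://", "https://")) and line not in sources:
            sources.append(line)
    return sources


def load_sources(path: str = "sources.txt") -> list[str]:
    """Read the sources file from disk and parse it."""
    return parse_sources(Path(path).read_text(encoding="utf-8"))
```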
Create a folder named data to store the output data.
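As an illustration of what storing output in that folder might involve, the sketch below appends article records to a CSV file. The file name `articles.csv`, the column names, and the helper names are all assumptions made for this example; the project's real schema may differ.

```python
import csv
from datetime import datetime, timezone
from pathlib import Path

# Hypothetical columns; the real output may store different fields
FIELDS = ["url", "source", "fetched_at"]


def make_record(url: str, source: str) -> dict[str, str]:
    """Build one row with a UTC timestamp."""
    return {
        "url": url,
        "source": source,
        "fetched_at": datetime.now(timezone.utc).isoformat(),
    }


def append_records(records, data_dir="data", filename="articles.csv") -> Path:
    """Append rows to the CSV, creating the folder and header when needed."""
    folder = Path(data_dir)
    folder.mkdir(parents=True, exist_ok=True)
    target = folder / filename
    write_header = not target.exists()
    with target.open("a", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDS)
        if write_header:
            writer.writeheader()
        writer.writerows(records)
    return target
```

Because the helper creates the folder itself, a missing `data` directory would not stop a scheduled run in this sketch.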
For simple local output, run the following command. It does not send notifications to your channel:
python app.py
To send notifications to your Telegram channel, run:
python app-telegram.py
Run every hour from 7 AM to 10 PM:
crontab -e
Add:
0 7-22 * * * cd /home/user/oped-agent && /home/user/oped-agent/.venv/bin/python app-telegram.py >> /home/user/app.log 2>&1
- Loads sources from sources.txt
- Scrapes article links using:
  - requests + BeautifulSoup
  - Playwright (for dynamic sites)
- Checks against seen_urls.txt to avoid duplicates
- Optionally classifies content (AI-based)
- Sends formatted messages to Telegram
- Saves data into data/ folder
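The Telegram step relies on the Bot API's `sendMessage` method. The sketch below shows one way to call it using only the standard library's `urllib` (the project lists `requests`, which would work equally well). The helper names are hypothetical, and the message layout with a title line followed by the URL is an assumption about what "formatted" means here.

```python
import json
import urllib.request

API_BASE = "https://api.telegram.org"


def format_message(title: str, url: str) -> str:
    """Assumed layout: the title on one line, the link on the next."""
    return f"{title}\n{url}"


def build_request(token: str, chat_id: str, text: str) -> urllib.request.Request:
    """Prepare a POST to sendMessage without sending it."""
    payload = json.dumps({"chat_id": chat_id, "text": text}).encode("utf-8")
    return urllib.request.Request(
        f"{API_BASE}/bot{token}/sendMessage",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_message(token: str, chat_id: str, text: str, timeout: float = 10) -> dict:
    """Send the message and return Telegram's decoded JSON response."""
    request = build_request(token, chat_id, text)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Separating `build_request` from `send_message` lets the payload be checked in tests without any network traffic.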
- ✅ Fully automated content pipeline
- ✅ Duplicate filtering
- ✅ Supports dynamic websites
- ✅ Telegram integration
- ✅ Scalable for multiple sources
- ✅ Lightweight and customizable
- Ensure Playwright dependencies are installed on Linux
- Use a virtual environment to avoid package conflicts
- Keep the .env file secure (do not commit it)
- News aggregation systems
- Research monitoring tools
- Telegram content automation
- Blog/article discovery pipelines
Dr. Mashiur Rahman
This project is open-source and can be modified for personal or commercial use.