Skip to content

larsborn/python-telegram-scraper

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

14 Commits
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Python Telegram Scraper (pts)

This is a scraper doesn't use the Telegram API but scrapes the web frontend. Hence, it doesn't require any API credentials. It currently supports the telegram-store.com web frontend as well as t.me. It seems that the first of the two only supports scrolling back to a certain point, while t.me allows for retrieval of the entire content of a channel.

Installation

Use a virtual environment:

pip install -r requirements.txt

Usage Examples

The following will use t.me to scroll through the entire supplied channel and store all results in a directory called output.

./pts.py --storage file --directory output --strategy preview crawl_channel YOUR_CHANNEL_NAME 

Instead of file, you can pass nsq to publish to an NSQ instead of the local file system. Pass an appropriate value for --nsqd-tcp-address in that case.

./pts.py extract

The above line will print some stats of the parsed data (if stored on disk) to quickly gauge related channels and referenced websites. It will also print a histogram across times of day to be able to easily guess the activity time zone of a channel.

About

Telegram scraper written in Python with a focus on web frontends

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages