Python Telegram Scraper (pts)

This scraper does not use the Telegram API; it scrapes the web frontend instead and therefore requires no API credentials. It currently supports the telegram-store.com web frontend as well as t.me. The former appears to support scrolling back only to a certain point, while t.me allows retrieval of a channel's entire content.

Installation

Install the dependencies, preferably inside a virtual environment:

pip install -r requirements.txt

Usage Examples

The following will use t.me to scroll through the entire supplied channel and store all results in a directory called output.

./pts.py --storage file --directory output --strategy preview crawl_channel YOUR_CHANNEL_NAME

Instead of file, you can pass nsq to publish the results to NSQ rather than the local file system. In that case, pass an appropriate value for --nsqd-tcp-address.
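To illustrate what scrolling back through a channel without the API can look like, here is a minimal sketch. It is not the code pts uses. It assumes that the public preview page at t.me/s/CHANNEL marks each message with a data-post attribute of the form CHANNEL/ID and accepts a before=ID query parameter to load older messages; both are assumptions about the page markup and may change at any time. The names PostIdParser, extract_post_ids and next_page_url are hypothetical.

```python
from html.parser import HTMLParser
from typing import List, Optional


class PostIdParser(HTMLParser):
    """Collects numeric message IDs from data-post="CHANNEL/ID" attributes.

    The attribute name and format are assumptions about the preview page.
    """

    def __init__(self) -> None:
        super().__init__()
        self.post_ids: List[int] = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "data-post" and value and "/" in value:
                post_id = value.rsplit("/", 1)[1]
                if post_id.isdigit():
                    self.post_ids.append(int(post_id))


def extract_post_ids(html: str) -> List[int]:
    """Return all message IDs found in one page of HTML, in document order."""
    parser = PostIdParser()
    parser.feed(html)
    return parser.post_ids


def next_page_url(channel: str, html: str) -> Optional[str]:
    """Return the URL of the next older page, or None if the page has no messages."""
    post_ids = extract_post_ids(html)
    if not post_ids:
        return None
    # Assumed pagination: ask for messages older than the oldest one seen so far.
    return f"https://t.me/s/{channel}?before={min(post_ids)}"
```

A crawler built on this idea would fetch a page, store its messages, and follow next_page_url until it returns None.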

./pts.py extract

The above command prints statistics about the parsed data (if it was stored on disk) so you can quickly gauge related channels and referenced websites. It also prints a histogram of activity by time of day, which makes it easy to guess the time zone in which a channel is active.
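The time-of-day histogram can be sketched as follows. This is an illustration under stated assumptions, not the output format or implementation of pts: it assumes the message timestamps are available as timezone-aware datetime objects, and the function names hourly_histogram and render are hypothetical.

```python
from collections import Counter
from datetime import datetime, timezone
from typing import Dict, Iterable


def hourly_histogram(timestamps: Iterable[datetime]) -> Dict[int, int]:
    """Count messages per UTC hour of day (0-23).

    Timestamps are expected to be timezone-aware.
    """
    counts = Counter(ts.astimezone(timezone.utc).hour for ts in timestamps)
    return {hour: counts.get(hour, 0) for hour in range(24)}


def render(histogram: Dict[int, int], width: int = 40) -> str:
    """Render one text bar per hour, scaled so the busiest hour spans `width`."""
    peak = max(histogram.values()) or 1  # avoid division by zero when empty
    lines = []
    for hour, count in histogram.items():
        bar = "#" * round(count / peak * width)
        lines.append(f"{hour:02d} {bar} {count}")
    return "\n".join(lines)
```

A block of consecutive quiet hours in the UTC histogram usually corresponds to local night time, which hints at the time zone a channel operates in.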

Repository: larsborn/python-telegram-scraper
