# 03_telegram_scraper.ipynb
Scrape public Canadian far-right / extremist Telegram channels
→ 100–300+ messages
(For Garon’s PhD – Carleton RA project)

In [5]:
from telethon.sync import TelegramClient
from telethon.tl.functions.messages import GetHistoryRequest
import pandas as pd
from pathlib import Path
import os

# Default to "0" so a missing variable reaches the check below instead of crashing int()
api_id   = int(os.getenv("TELEGRAM_API_ID", "0"))
api_hash = os.getenv("TELEGRAM_API_HASH")
phone    = os.getenv("TELEGRAM_PHONE")

if not all([api_id, api_hash, phone]):
    raise ValueError("Run Cell 0 first to load credentials!")

client = TelegramClient('garon_session', api_id, api_hash)
print("Telegram client ready!")

Telegram client ready!
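
The credential check above reports only that something is missing, not which variable. A minimal sketch of a stricter loader follows; the function name `load_telegram_credentials` and the `env` parameter are illustrative assumptions, not part of the original notebook, and "Cell 0" (not shown here) is assumed to populate the three environment variables.

```python
import os

# The three variables read by the cell above.
REQUIRED_VARS = ("TELEGRAM_API_ID", "TELEGRAM_API_HASH", "TELEGRAM_PHONE")

def load_telegram_credentials(env=None):
    """Return (api_id, api_hash, phone) or raise ValueError naming what is missing."""
    env = os.environ if env is None else env
    # Treat unset and empty variables the same way.
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise ValueError("Missing environment variables: " + ", ".join(missing))
    try:
        api_id = int(env["TELEGRAM_API_ID"])
    except ValueError:
        raise ValueError("TELEGRAM_API_ID must be an integer") from None
    return api_id, env["TELEGRAM_API_HASH"], env["TELEGRAM_PHONE"]
```

Passing a plain dict as `env` makes the helper easy to check without touching the real environment.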


In [2]:
# Public Canadian far-right / adjacent channels 
CHANNELS = [
    "https://t.me/canadianpatriotnetwork",
    "https://t.me/diagonation",
    "https://t.me/plandemicrevolution",
    "https://t.me/canadafirstofficial",
    "https://t.me/therealcanadianpatriot",
    "https://t.me/standupcanada",
    "https://t.me/convoyreports",
    # Add more 
]

# Where to save
TELEGRAM_DIR = Path("data/telegram_raw")
TELEGRAM_DIR.mkdir(parents=True, exist_ok=True)
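
Before making any network calls, the links in `CHANNELS` can be checked for shape. The sketch below pulls the username out of a `t.me` link and tests it against the username pattern that Telethon prints in the error output further down in this notebook. The helper name `username_from_url` is an assumption for illustration. The check only catches malformed links: a well-formed username can still fail to resolve if no channel uses it, and invite-style links are not handled.

```python
import re
from urllib.parse import urlparse

# Pattern copied from the Telethon error message in this notebook's output.
USERNAME_RE = re.compile(r"[a-zA-Z][\w\d]{3,30}[a-zA-Z\d]")

def username_from_url(url):
    """Return the username part of a t.me link, or None if it does not look valid."""
    # Accept links written without a scheme, e.g. "t.me/name".
    parsed = urlparse(url if "://" in url else "https://" + url)
    if parsed.netloc.lower() != "t.me":
        return None
    parts = [p for p in parsed.path.split("/") if p]
    if not parts:
        return None
    candidate = parts[0].lstrip("@")
    return candidate if USERNAME_RE.fullmatch(candidate) else None
```

For example, `[u for u in CHANNELS if username_from_url(u) is None]` would list the entries worth fixing by hand before running the scraper.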

In [3]:
async def scrape_channel(channel_url):
    await client.start(phone)
    entity = await client.get_entity(channel_url)
    channel_name = entity.username or entity.id
    
    messages = []
    offset_id = 0
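    page_size = 100  # Telegram returns at most 100 messages per history request
    
    while True:
        # Fetch the next page of messages older than offset_id (0 = start from the newest)
        history = await client(GetHistoryRequest(
            peer=entity,
            offset_id=offset_id,
            offset_date=None,
            add_offset=0,
            limit=page_size,
            max_id=0,
            min_id=0,
            hash=0
        ))
        if not history.messages:
            break
        for msg in history.messages:
            messages.append({
                'channel': channel_name,
                'message_id': msg.id,
                'date': msg.date,
                # Some messages carry no text; store an empty string for them
                'text': getattr(msg, 'message', None) or "",
                'views': getattr(msg, 'views', 0)
            })
        # Continue from the oldest message on this page
        offset_id = history.messages[-1].id
        if len(history.messages) < page_size:
            break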
    
    print(f"Scraped {len(messages)} messages from {channel_name}")
    return messages

# Run this cell: on first login it asks for the Telegram verification code
# (the phone number is read from TELEGRAM_PHONE)
all_messages = []
for url in CHANNELS:
    try:
        msgs = await scrape_channel(url)
        all_messages.extend(msgs)
    except Exception as e:
        print(f"Failed {url}: {e}")

df = pd.DataFrame(all_messages)
df.to_csv("telegram_dataset.csv", index=False)
print(f"\nDONE! Saved {len(df)} Telegram messages to telegram_dataset.csv")
df.head(10)

Please enter the code you received:  56261


Signed in successfully as Hesjay; remember to not break the ToS or you will risk an account ban!
Failed https://t.me/canadianpatriotnetwork: No user has "canadianpatriotnetwork" as username
Failed https://t.me/diagonation: No user has "diagonation" as username
Failed https://t.me/plandemicrevolution: No user has "plandemicrevolution" as username
Scraped 1499 messages from CanadaFirstOfficial
Failed https://t.me/therealcanadianpatriot: Nobody is using this username, or the username is unacceptable. If the latter, it must match r"[a-zA-Z][\w\d]{3,30}[a-zA-Z\d]" (caused by ResolveUsernameRequest)


Telegram is having internal issues RpcCallFailError: Telegram is having internal issues, please try again later. (caused by GetHistoryRequest)


Scraped 73249 messages from standupcanada
Failed https://t.me/convoyreports: No user has "convoyreports" as username

DONE! Saved 74748 Telegram messages to telegram_dataset.csv


index,channel,message_id,date,text,views
0,CanadaFirstOfficial,8435,2025-12-01 16:51:43+00:00,When White Canada began to fight back.\n\nAt t...,141.0
1,CanadaFirstOfficial,8434,2025-11-30 17:37:55+00:00,"""WHITE MAN, FIGHT BACK"" banner in Hamilton Ont...",166.0
2,CanadaFirstOfficial,8433,2025-11-25 06:09:59+00:00,The Anti-White system is allowed refugee statu...,296.0
3,CanadaFirstOfficial,8432,2025-11-17 20:18:16+00:00,The time for National Socialism in Canada is N...,352.0
4,CanadaFirstOfficial,8431,2025-11-12 21:01:32+00:00,"""A National Socialist wants to save his people...",431.0
5,CanadaFirstOfficial,8430,2025-11-12 17:59:58+00:00,If there was ever a face for Mass Deportations...,381.0
6,CanadaFirstOfficial,8429,2025-11-11 02:02:25+00:00,,331.0
7,CanadaFirstOfficial,8428,2025-11-11 02:02:24+00:00,,280.0
8,CanadaFirstOfficial,8427,2025-11-11 02:02:24+00:00,We are pleased to announce that we are finally...,250.0
9,CanadaFirstOfficial,8426,2025-11-07 00:01:22+00:00,We crashed a kosher-conservative vs Antifa ral...,326.0
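
The cell above writes a single CSV only after every channel has finished, so an interruption partway through a long run loses everything collected so far, and `TELEGRAM_DIR` is created but never used. One possible alternative, sketched here with the standard `csv` module, is to write each channel's messages to its own file as soon as that channel completes. The helper name `save_channel_csv` and the `FIELDS` list are assumptions for illustration; the field names mirror the dictionaries built in `scrape_channel`.

```python
import csv
from pathlib import Path

# Column order matching the dictionaries built in scrape_channel.
FIELDS = ["channel", "message_id", "date", "text", "views"]

def save_channel_csv(messages, out_dir):
    """Write one channel's message dicts to <out_dir>/<channel>.csv and return the path."""
    if not messages:
        return None
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    path = out_dir / f"{messages[0]['channel']}.csv"
    # newline="" lets the csv module handle line breaks inside message text.
    with path.open("w", newline="", encoding="utf-8") as fh:
        writer = csv.DictWriter(fh, fieldnames=FIELDS, extrasaction="ignore")
        writer.writeheader()
        writer.writerows(messages)
    return path
```

Inside the scraping loop this could be called as `save_channel_csv(msgs, TELEGRAM_DIR)` right after each successful `scrape_channel(url)`.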


Server closed the connection: [Errno 54] Connection reset by peer
Server closed the connection: [Errno 54] Connection reset by peer
Server closed the connection: [Errno 54] Connection reset by peer
Server closed the connection: [Errno 54] Connection reset by peer
Server resent the older message 7581377777622241281, ignoring
Server closed the connection: [Errno 54] Connection reset by peer
Server resent the older message 7581393565986852865, ignoring
Server closed the connection: [Errno 54] Connection reset by peer
Server resent the older message 7581426112611117057, ignoring
Server closed the connection: [Errno 54] Connection reset by peer
Server closed the connection: [Errno 54] Connection reset by peer
Server closed the connection: [Errno 54] Connection reset by peer
Server resent the older message 7581496278687440897, ignoring
Server closed the connection: [Errno 54] Connection reset by peer
Server closed the connection: [Errno 54] Connection reset by peer
Server resent the older me
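
The output above shows transient failures during the run (an `RpcCallFailError` and repeated connection resets). A minimal sketch of a retry wrapper with exponential backoff follows. The function `with_retries` and its parameters are assumptions for illustration, not part of Telethon or the original notebook, and the connection-reset lines may already be handled by the client's own reconnect logic, so this mainly targets errors that actually propagate to the caller.

```python
import asyncio

async def with_retries(make_call, retry_on=(ConnectionError,), attempts=4,
                       base_delay=1.0, sleep=asyncio.sleep):
    """Await make_call(), retrying on the given exceptions with exponential backoff."""
    for attempt in range(1, attempts + 1):
        try:
            return await make_call()
        except retry_on as exc:
            if attempt == attempts:
                raise  # out of attempts: let the caller see the error
            delay = base_delay * 2 ** (attempt - 1)  # 1s, 2s, 4s, ...
            print(f"Attempt {attempt} failed ({exc!r}); retrying in {delay:.0f}s")
            await sleep(delay)
```

In `scrape_channel`, the history request could be wrapped as `await with_retries(lambda: client(GetHistoryRequest(...)), retry_on=(...))`, with `retry_on` set to whichever exception classes should count as transient, such as the `RpcCallFailError` named in the output.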

In [4]:
print(f"Total messages: {len(df)}")
print(f"Date range: {df['date'].min()} → {df['date'].max()}")
print("\nSample messages:")
for text in df['text'].dropna().head(5):
    print("- " + text[:200].replace('\n', ' ') + "...")

Total messages: 74748
Date range: 2021-02-03 15:01:04+00:00 → 2025-12-07 23:17:55+00:00

Sample messages:
- When White Canada began to fight back.  At the spot where the statue of Canada's first prime minister Sir John A. Macdonald once stood in Hamilton Ontario.  Contact: NS13_88@protonmail.com  Instagram ...
- "WHITE MAN, FIGHT BACK" banner in Hamilton Ontario's Gore Park. In front of the empty pedestal where the statue of Sir John A. Macdonald once stood.  Contact: NS13_88@protonmail.com  Instagram ⚪️ GAB...
- The Anti-White system is allowed refugee status by app for anyone who comes to the border.  This is insanity , and it needs to be stopped.  Why even have a border control agency at this point? We need...
- The time for National Socialism in Canada is NOW.  We need a strong, intentional movement of our White population, and NS is the only political ideology that can unify and take the offensive for White...
- "A National Socialist wants to save his people.  A conservative wants