# Cat-egories YouTube Channel Data Scraper

This notebook uses the YouTube Data API v3 to collect metadata from cat-themed YouTube channels.

## What it does:
- Reads channel identifiers (`@handles` or channel IDs) from `accounts.txt`
- Fetches channel metadata (subscribers, total views, etc.)
- Retrieves video data including:
  - Titles, descriptions, tags, hashtags
  - View counts, likes, comments
  - Published dates
- Exports data to separate CSV files for each channel
- Creates a summary CSV with all channels

## Before running:
1. **Get a YouTube API Key:**
   - Go to [Google Cloud Console](https://console.cloud.google.com/)
   - Create a new project or select an existing one
   - Enable YouTube Data API v3
   - Create credentials (API Key)

2. **Add your API key to the `.env` file:**
   - Open the `.env` file in this directory
   - Replace `your_api_key_here` with your actual YouTube API key (the notebook reads it from the `YOUTUBE_API_KEY` variable)
   - Save the file

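3. **Add channel identifiers** to `accounts.txt`, one per line (already populated with examples). Both `@handle` identifiers and raw channel IDs are accepted; blank lines and lines starting with `#` are skipped.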

## What Data Gets Scraped?

### For Each Channel:
The scraper collects **channel-level metadata**:
- **Channel Title** - Name of the channel
- **Channel Description** - About section text
- **Subscriber Count** - Number of subscribers
- **Total View Count** - All-time views across all videos
- **Video Count** - Total number of videos published
- **Published Date** - When the channel was created

### For Each Video (currently up to 50 per channel):
The scraper collects **video-level data**:

**Content Information:**
- **Video Title** - The video's title
- **Description** - Full video description text
- **Tags** - YouTube tags the creator assigned (stored as pipe-separated: `tag1|tag2|tag3`)
- **Hashtags** - Hashtags extracted from the description (stored as pipe-separated)
- **Duration** - Video length as an ISO 8601 duration (e.g. `PT5M32S`)
- **Published Date** - When the video was uploaded
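Because tags and hashtags are stored as a single pipe-separated string, they have to be split again before analysis. A minimal sketch, assuming a hypothetical helper named `split_pipe` (not part of the notebook) and assuming that an empty or missing cell means "no tags":

```python
def split_pipe(value):
    """Split a pipe-separated cell such as 'cat|funny|pets' into a list.

    `split_pipe` is a hypothetical helper, not part of the scraper.
    Empty or missing cells (None, NaN, '') yield an empty list.
    """
    if not isinstance(value, str) or not value:
        return []
    return [part for part in value.split('|') if part]


print(split_pipe('cat|funny|pets|animals'))  # ['cat', 'funny', 'pets', 'animals']
```

With pandas this could be applied as `df['tags'].apply(split_pipe)` after reading a channel CSV. Note that a tag containing a literal `|` would be split in two, since the separator is not escaped when the file is written.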

**Engagement Metrics:**
- **View Count** - Number of views
- **Like Count** - Number of likes
- **Comment Count** - Number of comments

The legacy favorite count (usually 0) is not collected.

**Identifiers:**
- **Video ID** - Unique YouTube video identifier

Channel ID and channel title are not repeated in the per-video rows; they are stored once per channel in the summary CSV.

### Example Row from CSV:
```
video_id: dQw4w9WgXcQ
title: Cute Cat Playing with Box
description: My cat loves this box! #catsofyoutube #funny
tags: cat|funny|pets|animals
hashtags: #catsofyoutube|#funny
view_count: 125000
like_count: 3500
comment_count: 450
duration: PT5M32S
published_at: 2024-03-15T10:30:00Z
```
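The `duration` value is an ISO 8601 duration string, which is awkward to sort or average. A minimal sketch of converting it to seconds, assuming only days, hours, minutes, and seconds appear; `duration_to_seconds` is a hypothetical helper, not part of the notebook:

```python
import re

# Matches durations such as PT5M32S, PT1H2M3S, PT45S, or P1DT2H.
_DURATION_RE = re.compile(
    r'^P(?:(?P<days>\d+)D)?'
    r'(?:T(?:(?P<hours>\d+)H)?(?:(?P<minutes>\d+)M)?(?:(?P<seconds>\d+)S)?)?$'
)


def duration_to_seconds(duration):
    """Convert an ISO 8601 duration string to whole seconds.

    Hypothetical helper; handles days, hours, minutes, and seconds only
    and raises ValueError for anything else.
    """
    match = _DURATION_RE.match(duration)
    if not match:
        raise ValueError(f"Unsupported duration: {duration!r}")
    parts = {name: int(value or 0) for name, value in match.groupdict().items()}
    return (parts['days'] * 86400 + parts['hours'] * 3600
            + parts['minutes'] * 60 + parts['seconds'])


print(duration_to_seconds('PT5M32S'))  # 332
```

Applied to a scraped DataFrame, this could look like `df['duration_s'] = df['duration'].apply(duration_to_seconds)`.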

## 1. Install Dependencies

In [12]:
%pip install google-api-python-client
%pip install pandas tqdm python-dotenv

Defaulting to user installation because normal site-packages is not writeable
Note: you may need to restart the kernel to use updated packages.




## 2. Import Libraries, API Config, and Print Helpers

In [13]:
# Import required libraries
from googleapiclient.discovery import build
import pandas as pd
from datetime import datetime
import os
from tqdm import tqdm
import json
from dotenv import load_dotenv

# Color codes for terminal output
class Colors:
    GREEN = '\033[92m'
    RED = '\033[91m'
    YELLOW = '\033[93m'
    BLUE = '\033[94m'
    RESET = '\033[0m'
    BOLD = '\033[1m'

def print_success(message):
    """Print success message in green"""
    print(f"{Colors.GREEN}{message}{Colors.RESET}")

def print_error(message):
    """Print error message in red"""
    print(f"{Colors.RED}{message}{Colors.RESET}")

def print_warning(message):
    """Print warning message in yellow"""
    print(f"{Colors.YELLOW}{message}{Colors.RESET}")

def print_info(message):
    """Print info message in blue"""
    print(f"{Colors.BLUE}{message}{Colors.RESET}")

# Load environment variables from .env file
load_dotenv()

# Get YouTube API Key from environment variable
API_KEY = os.getenv('YOUTUBE_API_KEY')

if not API_KEY or API_KEY == 'your_api_key_here':
    raise ValueError("Please set your YOUTUBE_API_KEY in the .env file")

# Initialize YouTube API client
youtube = build('youtube', 'v3', developerKey=API_KEY)
print_success("YouTube API client initialized successfully")

YouTube API client initialized successfully


## 3. Define Helper Functions

These functions handle:
- Reading channel IDs from file
- Fetching channel information
- Retrieving video lists
- Getting detailed video metadata

In [15]:
def resolve_channel_identifier(youtube, identifier):
    """
    Resolve a channel identifier to a channel ID.
    Handles @handles, legacy usernames, and direct channel IDs.
    Returns None if the identifier cannot be resolved.
    """
    identifier = identifier.strip()
    
    # If it's already a channel ID (starts with UC), return it
    if identifier.startswith('UC') and len(identifier) == 24:
        return identifier
    
    # If it starts with @, it's a username handle
    if identifier.startswith('@'):
        username = identifier[1:]  # Remove the @
        try:
            request = youtube.channels().list(
                part='id',
                forHandle=username
            )
            response = request.execute()
            if response.get('items'):
                return response['items'][0]['id']
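        except Exception as e:
            # Report the failure (e.g. quota or network error) instead of hiding it
            print_warning(f"Handle lookup failed for {identifier}: {e}")
    
    # Fall back to a legacy username lookup
    try:
        request = youtube.channels().list(
            part='id',
            forUsername=identifier.replace('@', '')
        )
        response = request.execute()
        if response.get('items'):
            return response['items'][0]['id']
    except Exception as e:
        print_warning(f"Username lookup failed for {identifier}: {e}")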
    
    print_error(f"Could not resolve: {identifier}")
    return None


def read_channel_ids(filename='accounts.txt'):
    """
    Read channel identifiers (handles or channel IDs) from a text file.
    Skips empty lines and lines starting with #.
    Resolution to channel IDs happens later, in resolve_channel_identifier.
    """
    identifiers = []
    with open(filename, 'r', encoding='utf-8') as f:
        for line in f:
            line = line.strip()
            if line and not line.startswith('#'):
                identifiers.append(line)
    return identifiers

# Test the function
identifiers = read_channel_ids('accounts.txt')
print_info(f"Found {len(identifiers)} channel identifiers:")
for identifier in identifiers:
    print(f"  - {identifier}")

Found 29 channel identifiers:
  - @CrunchycatLuna
  - @chacha-rme
  - @funnycattshorts
  - @PrincessNikacat
  - @ChipTheManx
  - @cats101_03
  - @CatPusicTeam
  - @OwlKitty
  - @coleandmarmalade
  - @DailyDoseOfInternetCats
  - @ChefCatChangAn666
  - @FeedingStreetCats
  - @TastyPaws
  - @cat.mp4.666
  - @LittleLove666
  - @PawsomeCatsOfTheInternet
  - @PurrfectPets24
  - @CaDAnimals
  - @dextheorangecat
  - @RenusDelph
  - @FunnyandcuteCatLife
  - @Maine_Coon_Kittens
  - @walterthecatt
  - @CatManChrisPoole
  - @TakeYourDoseOfCats
  - @funcatflicks-l
  - @funnyPaws_show
  - @Meowphorius
  - @elcatogato


In [16]:
def get_channel_info(youtube, channel_id):
    """
    Fetch channel metadata including title, description, subscriber count, view count, etc.
    """
    try:
        request = youtube.channels().list(
            part='snippet,statistics,contentDetails',
            id=channel_id
        )
        response = request.execute()
        
        if not response.get('items'):
            print(f"No channel found for ID: {channel_id}")
            return None
        
        channel = response['items'][0]
        
        # Clean description for CSV compatibility
        description = channel['snippet']['description']
        description_clean = description.replace('\n', ' ').replace('\r', ' ')
        
        channel_data = {
            'channel_id': channel_id,
            'channel_title': channel['snippet']['title'],
            'channel_description': description_clean,  # Cleaned description
            'published_at': channel['snippet']['publishedAt'],
            'subscriber_count': channel['statistics'].get('subscriberCount', 0),
            'view_count': channel['statistics'].get('viewCount', 0),
            'video_count': channel['statistics'].get('videoCount', 0),
            'uploads_playlist_id': channel['contentDetails']['relatedPlaylists']['uploads']
        }
        
        return channel_data
    except Exception as e:
        print(f"Error fetching channel info for {channel_id}: {e}")
        return None

In [17]:
def get_channel_videos(youtube, uploads_playlist_id, max_results=50):
    """
    Fetch video IDs from a channel's uploads playlist.
    Returns a list of video IDs.
    """
    video_ids = []
    next_page_token = None
    
    try:
        while len(video_ids) < max_results:
            request = youtube.playlistItems().list(
                part='contentDetails',
                playlistId=uploads_playlist_id,
                # The API returns at most 50 items per page; the loop keeps paging until max_results is reached
                maxResults=min(50, max_results - len(video_ids)),
                pageToken=next_page_token
            )
            response = request.execute()
            
            for item in response['items']:
                video_ids.append(item['contentDetails']['videoId'])
            
            next_page_token = response.get('nextPageToken')
            
            if not next_page_token:
                break
                
    except Exception as e:
        print(f"Error fetching videos: {e}")
    
    return video_ids
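
Raising `max_results` increases the number of paged requests, so it helps to estimate quota use before a run. A rough sketch, assuming each `channels.list`, `playlistItems.list`, and `videos.list` call costs 1 quota unit (an assumption based on Google's published quota table, which may change); `estimate_quota_units` is a hypothetical helper, not part of the notebook:

```python
import math

PAGE_SIZE = 50  # maximum items per playlistItems.list / videos.list request


def estimate_quota_units(num_channels, max_videos=50, unit_cost=1):
    """Estimate quota units for one full run of the scraper.

    Assumes every channel has at least `max_videos` uploads and that each
    list request costs `unit_cost` units (an assumption, not a guarantee).
    """
    pages = math.ceil(max_videos / PAGE_SIZE)
    per_channel = (
        1        # resolve handle -> channel ID (channels.list)
        + 1      # channel metadata (channels.list)
        + pages  # uploads playlist pages (playlistItems.list)
        + pages  # video detail batches (videos.list)
    )
    return num_channels * per_channel * unit_cost


print(estimate_quota_units(29))       # 116
print(estimate_quota_units(29, 500))  # 638
```

The estimate is an upper bound for channels with fewer uploads than `max_videos`, and it ignores the extra fallback lookup made when a handle fails to resolve.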

In [18]:
def get_video_details(youtube, video_ids):
    """
    Fetch detailed information for a list of video IDs.
    Includes title, description, tags, views, likes, comments, etc.
    NOTE: Does NOT include channel-level data - that goes in the summary CSV.
    """
    all_video_data = []
    
    # YouTube API allows max 50 videos per request
    for i in range(0, len(video_ids), 50):
        batch = video_ids[i:i+50]
        
        try:
            request = youtube.videos().list(
                part='snippet,statistics,contentDetails',
                id=','.join(batch)
            )
            response = request.execute()
            
            for video in response['items']:
                # Extract hashtags from description
                description = video['snippet'].get('description', '')
                # Replace newlines with spaces to prevent CSV issues
                description_clean = description.replace('\n', ' ').replace('\r', ' ')
                
                hashtags = [word for word in description.split() if word.startswith('#')]
                
                video_data = {
                    'video_id': video['id'],
                    'title': video['snippet']['title'],
                    'description': description_clean,  # Cleaned description
                    'published_at': video['snippet']['publishedAt'],
                    'tags': '|'.join(video['snippet'].get('tags', [])),  # Join tags with |
                    'hashtags': '|'.join(hashtags),  # Join hashtags with |
                    'duration': video['contentDetails']['duration'],
                    'view_count': video['statistics'].get('viewCount', 0),
                    'like_count': video['statistics'].get('likeCount', 0),
                    'comment_count': video['statistics'].get('commentCount', 0),
                }
                
                all_video_data.append(video_data)
                
        except Exception as e:
            print(f"Error fetching video details: {e}")
    
    return all_video_data
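
The whitespace split used above also keeps trailing punctuation (`#funny,`) and a bare `#`. A regex-based alternative is sketched below; it is not the notebook's method. `extract_hashtags` is a hypothetical name, the pattern treats a hashtag as `#` followed by word characters, and, unlike the original, it drops duplicates:

```python
import re

# '#' at the start of the text or after whitespace, followed by one or more
# word characters (letters, digits, underscore; Unicode-aware in Python 3).
_HASHTAG_RE = re.compile(r'(?<!\S)#(\w+)')


def extract_hashtags(description):
    """Return hashtags found in a description, in order, without duplicates."""
    seen = []
    for tag in _HASHTAG_RE.findall(description):
        hashtag = f'#{tag}'
        if hashtag not in seen:
            seen.append(hashtag)
    return seen


print(extract_hashtags('My cat loves this box! #catsofyoutube #funny, #funny #'))
# ['#catsofyoutube', '#funny']
```

Swapping it in would mean replacing the list comprehension with `hashtags = extract_hashtags(description)`.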

In [19]:
def scrape_channel_data(youtube, channel_id, max_videos=50):
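    """
    Main function to scrape all data for a single channel.
    Returns a (DataFrame, channel_info) tuple. The DataFrame holds video
    details only (NO channel info - that's in the summary).
    Returns (None, None) if the channel cannot be fetched.
    """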
    print(f"\n{'='*60}")
    print(f"Scraping channel: {channel_id}")
    print(f"{'='*60}")
    
    # Get channel info
    channel_info = get_channel_info(youtube, channel_id)
    if not channel_info:
        return None, None
    
    print(f"Channel: {channel_info['channel_title']}")
    print(f"Subscribers: {channel_info['subscriber_count']}")
    print(f"Total Views: {channel_info['view_count']}")
    print(f"Total Videos: {channel_info['video_count']}")
    
    # Get video IDs
    print(f"\nFetching up to {max_videos} videos...")
    video_ids = get_channel_videos(youtube, channel_info['uploads_playlist_id'], max_videos)
    print(f"Found {len(video_ids)} videos")
    
    # Get video details
    print("Fetching video details...")
    video_data = get_video_details(youtube, video_ids)
    
    # Create DataFrame
    df = pd.DataFrame(video_data)
    
    # Convert numeric columns
    numeric_cols = ['view_count', 'like_count', 'comment_count']
    for col in numeric_cols:
        if col in df.columns:
            df[col] = pd.to_numeric(df[col], errors='coerce').fillna(0).astype(int)
    
    print(f"Successfully scraped {len(df)} videos")
    
    return df, channel_info

## Main Scraping Process
Scrape every channel listed in `accounts.txt` and save the results to the `data/` directory.

### File Structure:
 - **Individual channel CSVs**: `data/{ChannelName}.csv`
   - Contains ONLY video data (no channel info)
 - **Summary CSV**: `data/channels_summary.csv`
   - Contains one row per channel with channel metadata
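
Each per-channel filename is derived from the channel title by dropping every character that is not a letter, digit, space, hyphen, or underscore. The sketch below wraps that rule in a function and adds two safeguards the notebook does not have: a fallback for titles that sanitize to an empty string, and a numeric suffix when two titles collide. `safe_filename` and its `fallback` and `taken` parameters are hypothetical.

```python
def safe_filename(channel_title, fallback, taken):
    """Build a CSV filename stem from a channel title.

    `fallback` is used when nothing survives sanitizing (e.g. an emoji-only
    title); `taken` is a set of stems already used in this run.
    """
    kept = ''.join(c for c in channel_title if c.isalnum() or c in (' ', '-', '_'))
    stem = kept.replace(' ', '_').strip('_') or fallback
    candidate, n = stem, 2
    while candidate in taken:
        candidate = f'{stem}_{n}'
        n += 1
    taken.add(candidate)
    return candidate


used = set()
print(safe_filename('Chef Cat ChangAn ', 'UCv5R_Lzpu-4PopdHiDaH7-A', used))
# Chef_Cat_ChangAn
print(safe_filename("Funny And Cute Cat's Life", 'UCW_X3-IRNdOl9vrZ-zGthlQ', used))
# Funny_And_Cute_Cats_Life
print(safe_filename("Funny And Cute Cat's Life", 'UCW_X3-IRNdOl9vrZ-zGthlQ', used))
# Funny_And_Cute_Cats_Life_2
```

Passing the channel ID as `fallback` keeps the name unique even when the title contains no usable characters.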

In [20]:
# Create data directory if it doesn't exist
os.makedirs('data', exist_ok=True)

# Read channel identifiers
identifiers = read_channel_ids('accounts.txt')

print_info("Resolving channel identifiers...")
channel_ids = []
for identifier in identifiers:
    channel_id = resolve_channel_identifier(youtube, identifier)
    if channel_id:
        print_success(f"  [SUCCESS] {identifier} -> {channel_id}")
        channel_ids.append(channel_id)
    else:
        print_error(f"  [FAILED] {identifier} -> Could not resolve")

print_info(f"\nResolved {len(channel_ids)} out of {len(identifiers)} channels\n")

# Store all results
all_channels_data = []
channel_metadata = []

# Scrape each channel
for channel_id in tqdm(channel_ids, desc="Scraping channels"):
    df, channel_info = scrape_channel_data(youtube, channel_id, max_videos=50)
    
    if df is not None and channel_info is not None:
        channel_name = channel_info['channel_title']
        # Keep letters and digits (including non-ASCII ones), spaces, hyphens, and underscores
        clean_name = ''.join(c for c in channel_name if c.isalnum() or c in (' ', '-', '_'))
        clean_name = clean_name.replace(' ', '_').strip('_')
        filename = f"data/{clean_name}.csv"
        
        df.to_csv(filename, index=False, encoding='utf-8')
        print_success(f"[SAVED] {filename}\n")
        
        all_channels_data.append(df)
        channel_metadata.append(channel_info)

print(f"\n{'='*60}")
print_success(f"Scraping Complete!")
print_info(f"Successfully scraped {len(all_channels_data)} channels")
print_info(f"CSV files saved in the 'data/' directory")
print(f"{'='*60}")

Resolving channel identifiers...
  [SUCCESS] @CrunchycatLuna -> UCbGvv6m9qjuZF9_fcljgvzw
  [SUCCESS] @chacha-rme -> UCTJh2CW0v9MvqXAEC2mpsSQ
  [SUCCESS] @funnycattshorts -> UCbTaoonXlipQ-JsG7lhfKog
  [SUCCESS] @PrincessNikacat -> UC0X1dp5Gkhos0QJpainsjUA
  [SUCCESS] @ChipTheManx -> UC6s-MijPBrcnr55akiKarmQ
  [SUCCESS] @cats101_03 -> UCbWNEXG8GoszgTmJgaPgvnQ
  [SUCCESS] @CatPusicTeam -> UCyIqcxz-vR_o2GK4HWuZL8w
  [SUCCESS] @OwlKitty -> UCpLQXR116cLVUa1LRY8KS4w
  [SUCCESS] @coleandmarmalade -> UCvmijL-eepDVHYSJHDY3d6w
  [SUCCESS] @DailyDoseOfInternetCats -> UCTIa8uo_aisNdqQpMf4wKTg
  [SUCCESS] @ChefCatChangAn666 -> UCv5R_Lzpu-4PopdHiDaH7-A
  [SUCCESS] @FeedingStreetCats -> UC4Eqt3jI2inoD4TIcvuctYQ
  [SUCCESS] @TastyPaws -> UCfVwRiihN37PIcmtlLihj6w
  [SUCCESS] @cat.mp4.666 -> UC-xSWdtNE9tSp9SyfPaNHkQ
  [SUCCESS] @LittleLove666 -> UCuPLku1Zrk6HMr2S51yG

Scraping channels:   0%|          | 0/29 [00:00<?, ?it/s]


Scraping channel: UCbGvv6m9qjuZF9_fcljgvzw
Channel: Crunchycat
Subscribers: 589000
Total Views: 115594347
Total Videos: 560

Fetching up to 50 videos...
Found 50 videos
Fetching video details...


Scraping channels:   3%|▎         | 1/29 [00:00<00:13,  2.00it/s]

Successfully scraped 50 videos
[SAVED] data/Crunchycat.csv

Scraping channel: UCTJh2CW0v9MvqXAEC2mpsSQ
Channel: 元野良猫チャチャとR me
Subscribers: 663000
Total Views: 328444839
Total Videos: 549

Fetching up to 50 videos...
Found 50 videos
Fetching video details...


Scraping channels:   7%|▋         | 2/29 [00:01<00:14,  1.81it/s]

Successfully scraped 50 videos
[SAVED] data/元野良猫チャチャとR_me.csv

Scraping channel: UCbTaoonXlipQ-JsG7lhfKog
Channel: The Meow Show
Subscribers: 415000
Total Views: 205245352
Total Videos: 91

Fetching up to 50 videos...
Found 50 videos
Fetching video details...


Scraping channels:  10%|█         | 3/29 [00:01<00:14,  1.83it/s]

Successfully scraped 50 videos
[SAVED] data/The_Meow_Show.csv

Scraping channel: UC0X1dp5Gkhos0QJpainsjUA
Channel: Princess Nika cat
Subscribers: 10800000
Total Views: 6278111788
Total Videos: 80

Fetching up to 50 videos...


Scraping channels:  14%|█▍        | 4/29 [00:02<00:12,  2.02it/s]

Found 50 videos
Fetching video details...
Successfully scraped 50 videos
[SAVED] data/Princess_Nika_cat.csv

Scraping channel: UC6s-MijPBrcnr55akiKarmQ
Channel: Chip The Manx
Subscribers: 484000
Total Views: 373055692
Total Videos: 1122

Fetching up to 50 videos...
Found 50 videos
Fetching video details...


Scraping channels:  17%|█▋        | 5/29 [00:02<00:13,  1.84it/s]

Successfully scraped 50 videos
[SAVED] data/Chip_The_Manx.csv

Scraping channel: UCbWNEXG8GoszgTmJgaPgvnQ
Channel: cats101
Subscribers: 53200
Total Views: 51075766
Total Videos: 203

Fetching up to 50 videos...
Found 50 videos
Fetching video details...


Scraping channels:  21%|██        | 6/29 [00:03<00:11,  1.92it/s]

Successfully scraped 50 videos
[SAVED] data/cats101.csv

Scraping channel: UCyIqcxz-vR_o2GK4HWuZL8w
Channel: CatPusic Team
Subscribers: 1970000
Total Views: 828472191
Total Videos: 320

Fetching up to 50 videos...
Found 50 videos
Fetching video details...


Scraping channels:  24%|██▍       | 7/29 [00:03<00:10,  2.12it/s]

Successfully scraped 50 videos
[SAVED] data/CatPusic_Team.csv

Scraping channel: UCpLQXR116cLVUa1LRY8KS4w
Channel: OwlKitty
Subscribers: 2580000
Total Views: 501717927
Total Videos: 101

Fetching up to 50 videos...
Found 50 videos
Fetching video details...


Scraping channels:  28%|██▊       | 8/29 [00:03<00:09,  2.18it/s]

Successfully scraped 50 videos
[SAVED] data/OwlKitty.csv

Scraping channel: UCvmijL-eepDVHYSJHDY3d6w
Channel: Cole and Marmalade
Subscribers: 1400000
Total Views: 444960820
Total Videos: 477

Fetching up to 50 videos...
Found 50 videos
Fetching video details...


Scraping channels:  31%|███       | 9/29 [00:04<00:09,  2.01it/s]

Successfully scraped 50 videos
[SAVED] data/Cole_and_Marmalade.csv

Scraping channel: UCTIa8uo_aisNdqQpMf4wKTg
Channel: DailyDoseOfInternetCats
Subscribers: 1180000
Total Views: 791577211
Total Videos: 368

Fetching up to 50 videos...


Scraping channels:  34%|███▍      | 10/29 [00:05<00:09,  2.04it/s]

Found 50 videos
Fetching video details...
Successfully scraped 50 videos
[SAVED] data/DailyDoseOfInternetCats.csv

Scraping channel: UCv5R_Lzpu-4PopdHiDaH7-A
Channel: Chef Cat ChangAn 
Subscribers: 10700000
Total Views: 7242246227
Total Videos: 610

Fetching up to 50 videos...
Found 50 videos
Fetching video details...


Scraping channels:  38%|███▊      | 11/29 [00:05<00:08,  2.03it/s]

Successfully scraped 50 videos
[SAVED] data/Chef_Cat_ChangAn.csv

Scraping channel: UC4Eqt3jI2inoD4TIcvuctYQ
Channel: Feeding Street Cats
Subscribers: 1580000
Total Views: 237969065
Total Videos: 1731

Fetching up to 50 videos...
Found 50 videos
Fetching video details...


Scraping channels:  41%|████▏     | 12/29 [00:06<00:09,  1.84it/s]

Successfully scraped 50 videos
[SAVED] data/Feeding_Street_Cats.csv

Scraping channel: UCfVwRiihN37PIcmtlLihj6w
Channel: Tasty Paws
Subscribers: 302000
Total Views: 71410908
Total Videos: 390

Fetching up to 50 videos...
Found 50 videos
Fetching video details...


Scraping channels:  45%|████▍     | 13/29 [00:06<00:08,  1.90it/s]

Successfully scraped 50 videos
[SAVED] data/Tasty_Paws.csv

Scraping channel: UC-xSWdtNE9tSp9SyfPaNHkQ
Channel: cat.mp4
Subscribers: 71400
Total Views: 26559625
Total Videos: 58

Fetching up to 50 videos...
Found 50 videos
Fetching video details...


Scraping channels:  48%|████▊     | 14/29 [00:07<00:07,  1.96it/s]

Successfully scraped 50 videos
[SAVED] data/catmp4.csv

Scraping channel: UCuPLku1Zrk6HMr2S51yGkpQ
Channel: Little Love 
Subscribers: 856000
Total Views: 232736024
Total Videos: 753

Fetching up to 50 videos...


Scraping channels:  52%|█████▏    | 15/29 [00:08<00:12,  1.16it/s]

Found 50 videos
Fetching video details...
Successfully scraped 50 videos
[SAVED] data/Little_Love.csv

Scraping channel: UCMkRHqr6MjsJ7VEhXJnSRcg
Channel: Pawsome Cats of the Internet
Subscribers: 15500
Total Views: 13861174
Total Videos: 77

Fetching up to 50 videos...
Found 50 videos
Fetching video details...


Scraping channels:  55%|█████▌    | 16/29 [00:09<00:09,  1.38it/s]

Successfully scraped 50 videos
[SAVED] data/Pawsome_Cats_of_the_Internet.csv

Scraping channel: UCAUqPgpcjPk_JVv8aPYL-rw
Channel: Purrfect Pets
Subscribers: 81300
Total Views: 20667442
Total Videos: 50

Fetching up to 50 videos...


Scraping channels:  59%|█████▊    | 17/29 [00:09<00:07,  1.60it/s]

Found 50 videos
Fetching video details...
Successfully scraped 50 videos
[SAVED] data/Purrfect_Pets.csv

Scraping channel: UCU0nAvDjqfXo1fTA7xn527w
Channel: CaD Animals
Subscribers: 27900
Total Views: 20883020
Total Videos: 54

Fetching up to 50 videos...
Found 50 videos
Fetching video details...
Successfully scraped 50 videos


Scraping channels:  62%|██████▏   | 18/29 [00:10<00:06,  1.76it/s]

[SAVED] data/CaD_Animals.csv

Scraping channel: UCfm5SLKZDJqVluh-oo6q9Cw
Channel: Dexter The Cat
Subscribers: 159000
Total Views: 131678507
Total Videos: 203

Fetching up to 50 videos...


Scraping channels:  66%|██████▌   | 19/29 [00:10<00:05,  1.96it/s]

Found 50 videos
Fetching video details...
Successfully scraped 50 videos
[SAVED] data/Dexter_The_Cat.csv

Scraping channel: UCezWXy_EVsFpQWQ6yk6a4FQ
Channel: Renus Delph
Subscribers: 518000
Total Views: 171563548
Total Videos: 370

Fetching up to 50 videos...
Found 50 videos
Fetching video details...


Scraping channels:  69%|██████▉   | 20/29 [00:10<00:04,  2.01it/s]

Successfully scraped 50 videos
[SAVED] data/Renus_Delph.csv

Scraping channel: UCW_X3-IRNdOl9vrZ-zGthlQ
Channel: Funny And Cute Cat's Life
Subscribers: 194000
Total Views: 54318925
Total Videos: 461

Fetching up to 50 videos...


Scraping channels:  72%|███████▏  | 21/29 [00:11<00:05,  1.49it/s]

Found 50 videos
Fetching video details...
Successfully scraped 50 videos
[SAVED] data/Funny_And_Cute_Cats_Life.csv

Scraping channel: UCmCOoX7E0hybWf1h7UJ4mrg
Channel: Maine Coon Kittens
Subscribers: 147000
Total Views: 73158784
Total Videos: 435

Fetching up to 50 videos...
Found 50 videos
Fetching video details...


Scraping channels:  76%|███████▌  | 22/29 [00:12<00:04,  1.57it/s]

Successfully scraped 50 videos
[SAVED] data/Maine_Coon_Kittens.csv

Scraping channel: UChKwZfRAjbfuWXo1NhKqHgQ
Channel: Walter the Catt
Subscribers: 704000
Total Views: 325062818
Total Videos: 347

Fetching up to 50 videos...
Found 50 videos
Fetching video details...


Scraping channels:  79%|███████▉  | 23/29 [00:13<00:03,  1.52it/s]

Successfully scraped 50 videos
[SAVED] data/Walter_the_Catt.csv

Scraping channel: UC6VzUf8LyXxOz2ZkQP8_uhw
Channel: CAT MAN CHRIS
Subscribers: 938000
Total Views: 274067260
Total Videos: 154

Fetching up to 50 videos...


Scraping channels:  83%|████████▎ | 24/29 [00:13<00:03,  1.64it/s]

Found 50 videos
Fetching video details...
Successfully scraped 50 videos
[SAVED] data/CAT_MAN_CHRIS.csv

Scraping channel: UCZTM8ZG29egxUmYJ_pL-iZQ
Channel: TakeYourDoseOfCats
Subscribers: 237000
Total Views: 143080975
Total Videos: 183

Fetching up to 50 videos...
Found 50 videos
Fetching video details...


Scraping channels:  86%|████████▌ | 25/29 [00:14<00:02,  1.83it/s]

Successfully scraped 50 videos
[SAVED] data/TakeYourDoseOfCats.csv

Scraping channel: UCgpxaYSBRT6Ocb0wAw9S2OQ
Channel: funcatflicks
Subscribers: 28500
Total Views: 25288964
Total Videos: 326

Fetching up to 50 videos...


Scraping channels:  90%|████████▉ | 26/29 [00:14<00:01,  1.95it/s]

Found 50 videos
Fetching video details...
Successfully scraped 50 videos
[SAVED] data/funcatflicks.csv

Scraping channel: UCzdQGuxoc-B9WrWQnMF4vmA
Channel: FunnyPaws
Subscribers: 173000
Total Views: 48519956
Total Videos: 150

Fetching up to 50 videos...
Found 50 videos
Fetching video details...


Scraping channels:  93%|█████████▎| 27/29 [00:14<00:00,  2.10it/s]

Successfully scraped 50 videos
[SAVED] data/FunnyPaws.csv

Scraping channel: UCmSu6aPS3LKmf-5MrTr3TRg
Channel: Meowphorius 
Subscribers: 759000
Total Views: 562050369
Total Videos: 245

Fetching up to 50 videos...
Found 50 videos
Fetching video details...


Scraping channels:  97%|█████████▋| 28/29 [00:15<00:00,  1.89it/s]

Successfully scraped 50 videos
[SAVED] data/Meowphorius.csv

Scraping channel: UCxZKZ-T6Res7lrwDP9i9iUA
Channel: el Cato
Subscribers: 46600
Total Views: 24458105
Total Videos: 96

Fetching up to 50 videos...
Found 50 videos
Fetching video details...


Scraping channels: 100%|██████████| 29/29 [00:15<00:00,  1.81it/s]

Successfully scraped 50 videos
[SAVED] data/el_Cato.csv

Scraping Complete!
Successfully scraped 29 channels
CSV files saved in the 'data/' directory





## Summary CSV Creation
- While scraping, each channel's metadata is appended to the `channel_metadata` list
- That list is then converted to a DataFrame, merged with per-channel engagement averages computed from the scraped videos, and saved as `data/channels_summary.csv`

In [21]:
# Create summary DataFrame from channel metadata
summary_df = pd.DataFrame(channel_metadata)

# Calculate average engagement per channel from video data
# Match up with channel metadata by index (they're in the same order)
engagement_stats = []
for i, df in enumerate(all_channels_data):
    if len(df) > 0:
        # Get the corresponding channel info
        channel_info = channel_metadata[i]
        
        stats = {
            'channel_id': channel_info['channel_id'],
            'channel_title': channel_info['channel_title'],
            'total_videos_scraped': len(df),
            'avg_views': df['view_count'].mean(),
            'avg_likes': df['like_count'].mean(),
            'avg_comments': df['comment_count'].mean(),
            'total_views_scraped_videos': df['view_count'].sum(),
            'total_likes_scraped_videos': df['like_count'].sum(),
        }
        engagement_stats.append(stats)

engagement_df = pd.DataFrame(engagement_stats)

# Merge with channel metadata
if len(engagement_df) > 0:
    summary_full = pd.merge(summary_df, engagement_df, on=['channel_id', 'channel_title'], how='left')
    
    # Save summary
    summary_full.to_csv('data/channels_summary.csv', index=False)
    print("Channel Summary:")
    print(summary_full[['channel_title', 'subscriber_count', 'video_count', 
                        'total_videos_scraped', 'avg_views', 'avg_likes']].to_string(index=False))
    print(f"\nSummary saved to: data/channels_summary.csv")
else:
    print("No data scraped")

Channel Summary:
               channel_title subscriber_count video_count  total_videos_scraped   avg_views  avg_likes
                  Crunchycat           589000         560                    50   116697.14   10485.46
               元野良猫チャチャとR me           663000         549                    50   731521.90   21536.54
               The Meow Show           415000          91                    50  2270376.92  151891.80
           Princess Nika cat         10800000          80                    50 84724582.72 1682311.86
               Chip The Manx           484000        1122                    50    20164.66    1246.86
                     cats101            53200         203                    50    36323.78     653.48
               CatPusic Team          1970000         320                    50   335144.12    2602.88
                    OwlKitty          2580000         101                    50  5618675.78  156738.96
          Cole and Marmalade          1400000         47