<div style="border-radius: 30px 0 30px 0; border: 2px solid #00ea98; padding: 20px; background-color: #0a141b; text-align: center; box-shadow: 0px 2px 4px rgba(0, 0, 0, 0.2);">
    <h1 style="color: #7ab052; text-shadow: 2px 2px 4px rgba(0, 0, 0, 0.5); font-weight: bold; margin-bottom: 10px; font-size: 36px;">About the Author</h1>
</div>


# **Arshman Khalid**  
<p style="font-size: 1.5rem; font-weight: bold;">Data Scientist | Software Engineer | ex Consultant PwC | ex Senior Data Analyst Fortune 500</p>

With over 5 years of experience in data science and software engineering, I am dedicated to transforming complex data into actionable insights. My focus lies in predictive analytics, data strategy, and the implementation of robust machine learning models that drive measurable business outcomes. I have a track record of optimizing operations, reducing costs, and improving decision-making processes across industries. I am proficient in Python, Alteryx, Power BI, and cloud platforms.

When I am not wrangling datasets, you will find me attempting to code my way to the perfect cup of coffee!



<div style="border-radius: 30px 0 30px 0; border: 2px solid #00ea98; padding: 20px; background-color: #0a141b; text-align: center; box-shadow: 0px 2px 4px rgba(0, 0, 0, 0.2);">
    <h1 style="color: #7ab052; text-shadow: 2px 2px 4px rgba(0, 0, 0, 0.5); font-weight: bold; margin-bottom: 10px; font-size: 36px;">Purpose </h1>
</div>

This notebook provides comprehensive analytics for any YouTube channel, including:

- Basic channel statistics
- Video performance metrics
- Engagement analysis
- Upload patterns
- Content distribution
- Growth trends





<div style="border-radius: 30px 0 30px 0; border: 2px solid #00ea98; padding: 20px; background-color: #0a141b; text-align: center; box-shadow: 0px 2px 4px rgba(0, 0, 0, 0.2);">
    <h1 style="color: #7ab052; text-shadow: 2px 2px 4px rgba(0, 0, 0, 0.5); font-weight: bold; margin-bottom: 10px; font-size: 36px;">Setup and Requirements </h1>
</div>

1. You need a YouTube API key saved in a `.env` file.
2. The required packages are listed below.
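
The `.env` file is a plain-text file of `KEY=VALUE` lines, and the notebook reads `YOUTUBE_API_KEY` from it through `python-dotenv`. As a rough illustration of the expected layout, the standard-library sketch below parses the same kind of content. `parse_env_text` and the placeholder key value are assumptions made for this example; they are not part of `python-dotenv`.

```python
def parse_env_text(text):
    """Parse simple KEY=VALUE lines (illustrative stand-in for a .env reader)."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and comments
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not a KEY=VALUE line
        # Remove optional surrounding quotes from the value
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


# Placeholder content; a real file would hold your own key
sample_env = """
# .env (keep this file out of version control)
YOUTUBE_API_KEY=replace-with-your-key
"""

config = parse_env_text(sample_env)
if not config.get("YOUTUBE_API_KEY"):
    raise ValueError("YOUTUBE_API_KEY is missing from the .env content")
print(sorted(config))
```

Keeping the key in `.env` rather than in the notebook means the notebook can be shared without exposing the credential.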




<div style="border-radius: 30px 0 30px 0; border: 2px solid #00ea98; padding: 20px; background-color: #0a141b; text-align: center; box-shadow: 0px 2px 4px rgba(0, 0, 0, 0.2);">
    <h1 style="color: #7ab052; text-shadow: 2px 2px 4px rgba(0, 0, 0, 0.5); font-weight: bold; margin-bottom: 10px; font-size: 36px;"> Package Installation </h1>
</div>

Installing required Python packages for analytics

In [15]:
!pip install "nbformat>=4.2.0"
!pip install ipykernel
!pip install jupyterlab
!pip install plotly
!pip install google-api-python-client python-dotenv pandas numpy matplotlib seaborn wordcloud pytz





<div style="border-radius: 30px 0 30px 0; border: 2px solid #00ea98; padding: 20px; background-color: #0a141b; text-align: center; box-shadow: 0px 2px 4px rgba(0, 0, 0, 0.2);">
    <h1 style="color: #7ab052; text-shadow: 2px 2px 4px rgba(0, 0, 0, 0.5); font-weight: bold; margin-bottom: 10px; font-size: 36px;">Library Imports and Configuration </h1>
</div>

Setting up required libraries and configuring visualization styles

In [16]:
# Import required libraries
import os
from googleapiclient.discovery import build
from googleapiclient.errors import HttpError
from datetime import datetime, timedelta
from urllib.parse import urlparse
import re
from dotenv import load_dotenv
import matplotlib.pyplot as plt
import seaborn as sns
import plotly.graph_objects as go
import plotly.express as px
import pandas as pd
import numpy as np
import calendar
import plotly.io as pio
from typing import List, Dict, Optional
from wordcloud import WordCloud
import pytz

# Set visualization styles
pio.templates.default = "plotly_dark"
plt.style.use('dark_background')
sns.set_style("darkgrid")


<div style="border-radius: 30px 0 30px 0; border: 2px solid #00ea98; padding: 20px; background-color: #0a141b; text-align: center; box-shadow: 0px 2px 4px rgba(0, 0, 0, 0.2);">
    <h1 style="color: #7ab052; text-shadow: 2px 2px 4px rgba(0, 0, 0, 0.5); font-weight: bold; margin-bottom: 10px; font-size: 36px;">YouTube Channel Analytics </h1>
</div>


Initializing YouTube API connection and helper functions for channel identification



In [17]:
class YouTubeStats:
    def __init__(self):
        load_dotenv()
        self.api_key = os.getenv('YOUTUBE_API_KEY')
        if not self.api_key:
            raise ValueError("YouTube API key not found in .env file")
        self.youtube = build('youtube', 'v3', developerKey=self.api_key)
    
    def get_channel_id_from_handle(self, handle):
        handle = handle.lstrip('@')
        try:
            response = self.youtube.search().list(
                q=handle,
                type='channel',
                part='id,snippet'
            ).execute()
            if 'items' in response and len(response['items']) > 0:
                return response['items'][0]['id']['channelId']
        except Exception as e:
            print(f"Error fetching channel ID: {e}")
        return None

    def get_channel_id(self, channel_url):
        if re.match(r'^UC[\w-]{22}$', channel_url):
            return channel_url
        
        if 'youtube.com' in channel_url or 'youtu.be' in channel_url:
            parsed_url = urlparse(channel_url)
            
            if '/channel/' in channel_url:
                return channel_url.split('/channel/')[1].split('/')[0]
            elif '/user/' in channel_url:
                path = parsed_url.path
                username = path.split('/user/')[1].split('/')[0]
                return self.get_channel_id_from_handle(username)
            elif '@' in channel_url:
                handle = parsed_url.path.split('/')[1]
                return self.get_channel_id_from_handle(handle)
                
        if channel_url.startswith('@'):
            return self.get_channel_id_from_handle(channel_url)
            
        return None

# Initialize the YouTube API
yt = YouTubeStats()
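
`get_channel_id` accepts several input shapes: a raw channel ID, a bare `@handle`, or a URL containing `/channel/`, `/user/`, or `/@handle`. The sketch below separates that classification step from the API lookup so it can be checked offline. `classify_channel_reference` is a hypothetical helper written for this illustration and is not part of the notebook or of the Google API client.

```python
import re
from urllib.parse import urlparse

# Channel IDs start with "UC" followed by 22 URL-safe characters
CHANNEL_ID_PATTERN = re.compile(r"^UC[\w-]{22}$")


def classify_channel_reference(ref):
    """Return (kind, value), where kind is 'id', 'handle', 'user', or 'unknown'."""
    if CHANNEL_ID_PATTERN.match(ref):
        return ("id", ref)
    if ref.startswith("@"):
        return ("handle", ref.lstrip("@"))
    if "youtube.com" in ref:
        # Split the URL path into its non-empty segments
        parts = [p for p in urlparse(ref).path.split("/") if p]
        if len(parts) >= 2 and parts[0] == "channel":
            return ("id", parts[1])
        if len(parts) >= 2 and parts[0] == "user":
            return ("user", parts[1])
        if parts and parts[0].startswith("@"):
            return ("handle", parts[0].lstrip("@"))
    return ("unknown", ref)


print(classify_channel_reference("https://www.youtube.com/@arsalancba"))
print(classify_channel_reference("@arsalancba"))
```

Only the `handle` and `user` cases need a network call to resolve an ID; the `id` cases can be used directly.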




<div style="border-radius: 30px 0 30px 0; border: 2px solid #00ea98; padding: 20px; background-color: #0a141b; text-align: center; box-shadow: 0px 2px 4px rgba(0, 0, 0, 0.2);">
    <h1 style="color: #7ab052; text-shadow: 2px 2px 4px rgba(0, 0, 0, 0.5); font-weight: bold; margin-bottom: 10px; font-size: 36px;"> Data Collection Functions</h1>
</div>

Functions to collect channel statistics and video performance data:
- Channel metadata
- Video metrics
- Historical performance
- Engagement statistics

In [18]:
def get_channel_data(channel_url):
    """Collect channel and video data"""
    try:
        # Get channel details
        channel_id = yt.get_channel_id(channel_url)
        if not channel_id:
            return None, None
            
        request = yt.youtube.channels().list(
            part='snippet,statistics,brandingSettings,contentDetails',
            id=channel_id
        )
        channel_response = request.execute()
        
        if not channel_response['items']:
            return None, None
            
        channel_info = channel_response['items'][0]
        
        # Get videos
        playlist_id = channel_info['contentDetails']['relatedPlaylists']['uploads']
        videos = []
        next_page_token = None
        
        while True:
            playlist_request = yt.youtube.playlistItems().list(
                part='snippet,contentDetails',
                playlistId=playlist_id,
                maxResults=50,
                pageToken=next_page_token
            )
            playlist_response = playlist_request.execute()
            
            for item in playlist_response['items']:
                video_id = item['contentDetails']['videoId']
                video_request = yt.youtube.videos().list(
                    part='statistics,contentDetails',
                    id=video_id
                )
                video_response = video_request.execute()
                
                if video_response['items']:
                    video_data = video_response['items'][0]
                    videos.append({
                        'title': item['snippet']['title'],
                        'published_at': item['snippet']['publishedAt'],
                        'views': int(video_data['statistics'].get('viewCount', 0)),
                        'likes': int(video_data['statistics'].get('likeCount', 0)),
                        'comments': int(video_data['statistics'].get('commentCount', 0)),
                        'duration': parse_duration(video_data['contentDetails']['duration'])
                    })
            
            next_page_token = playlist_response.get('nextPageToken')
            if not next_page_token or len(videos) >= 100:  # Limit to last 100 videos
                break
                
        return channel_info, videos
        
    except Exception as e:
        print(f"Error collecting data: {e}")
        return None, None

def parse_duration(duration_str):
    """Convert a YouTube ISO 8601 duration (e.g. 'PT1H2M3S' or 'P1DT2H') to seconds"""
    # The day component is optional; durations of 24 hours or more include it
    match = re.fullmatch(
        r'P(?:(\d+)D)?(?:T(?:(\d+)H)?(?:(\d+)M)?(?:(\d+)S)?)?', duration_str
    )
    if not match:
        return 0
    
    # Missing components are treated as zero
    days, hours, minutes, seconds = (int(g) if g else 0 for g in match.groups())
    
    return days * 86400 + hours * 3600 + minutes * 60 + seconds
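
The collection loop above issues one `videos().list` request per video. The `id` parameter of that endpoint also accepts a comma-separated list of video IDs, so one possible optimization is to group IDs into batches before requesting statistics. The sketch below shows only the batching step; `chunked`, `build_id_batches`, the batch size of 50, and the placeholder IDs are assumptions for illustration.

```python
def chunked(items, size=50):
    """Yield consecutive slices of `items` with at most `size` elements each."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def build_id_batches(video_ids, size=50):
    """Join each chunk of IDs into one comma-separated string."""
    return [",".join(batch) for batch in chunked(video_ids, size)]


# Placeholder IDs standing in for values read from the uploads playlist
video_ids = [f"video{i:03d}" for i in range(120)]
batches = build_id_batches(video_ids)

# Each string could be passed as the `id` argument of one videos().list call
print(len(batches), [len(batch.split(",")) for batch in batches])
```

With 100 videos this would replace 100 statistics requests with two, at the cost of matching each returned item back to its playlist entry by video ID.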



<div style="border-radius: 30px 0 30px 0; border: 2px solid #00ea98; padding: 20px; background-color: #0a141b; text-align: center; box-shadow: 0px 2px 4px rgba(0, 0, 0, 0.2);">
    <h1 style="color: #7ab052; text-shadow: 2px 2px 4px rgba(0, 0, 0, 0.5); font-weight: bold; margin-bottom: 10px; font-size: 36px;"> Visualization Classes </h1>
</div>

Classes for creating interactive visualizations and analytics:
- YouTubeVisualizer: Handles all graph creation
- ChannelAnalytics: Processes channel statistics and metrics



<div style="border-radius: 30px 0 30px 0; border: 2px solid #00ea98; padding: 20px; background-color: #0a141b; text-align: center; box-shadow: 0px 2px 4px rgba(0, 0, 0, 0.2);">
    <h1 style="color: #7ab052; text-shadow: 2px 2px 4px rgba(0, 0, 0, 0.5); font-weight: bold; margin-bottom: 10px; font-size: 36px;"> Data Generation Functions </h1>
</div>

Functions to generate comprehensive trend data:
- Subscriber trends across different time periods (7D, 28D, 90D, 1Y, 3Y, ALL)
- View trends with engagement metrics
- Performance calculations and rolling averages
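
The two calculations named above, percentage change over a period and a rolling average, can be sketched without pandas. This is a minimal standard-library illustration: `percent_change`, `rolling_mean`, and the sample view counts are made up for the example. Positions without a full window are left as `None`, in the same spirit as the `NaN` values that pandas `rolling(window=n).mean()` produces by default.

```python
def percent_change(first, last):
    """Percentage change from `first` to `last`; 0.0 when the baseline is zero."""
    if first == 0:
        return 0.0
    return (last - first) / first * 100


def rolling_mean(values, window):
    """Trailing mean over `window` items; None until a full window is available."""
    result = []
    for i in range(len(values)):
        if i + 1 < window:
            result.append(None)
        else:
            chunk = values[i + 1 - window:i + 1]
            result.append(sum(chunk) / window)
    return result


views = [100, 120, 90, 150, 180]  # made-up view counts, oldest first
print(percent_change(views[0], views[-1]))
print(rolling_mean(views, 3))
```

Both calculations assume the values are ordered oldest first; on reversed data the sign of the change flips and the rolling window looks forward instead of backward.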

In [19]:
class YouTubeVisualizer:

    def create_trend_graph(self, data, metric_type='views', time_range='28D'):
        """Create trend visualization similar to YouTube Studio style"""
        range_map = {
            '7D': 7,
            '28D': 28,
            '90D': 90,
            '1Y': 365,
            'MAX': None,
            'ALL': None  # Alias used by get_all_trend_graphs; no date cutoff
        }
        
        df = data.copy()
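        # Drop timezone info (API timestamps are UTC) so dates compare cleanly
        df['date'] = pd.to_datetime(df['date']).dt.tz_localize(None)
        
        days = range_map.get(time_range)
        # Naive UTC "now", matching the timezone-naive dates above
        current_time = pd.Timestamp.now(tz='UTC').tz_localize(None)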
        
        if days:
            cutoff_date = current_time - timedelta(days=days)
            df = df[df['date'] >= cutoff_date]
        
        if len(df) == 0:
            return None
        
        current_value = df[metric_type].iloc[-1]
        first_value = df[metric_type].iloc[0]
        pct_change = ((current_value - first_value) / first_value) * 100 if first_value != 0 else 0
        
        fig = go.Figure()
        
        # Add area fill
        fig.add_trace(go.Scatter(
            x=df['date'],
            y=df[metric_type],
            fill='tozeroy',
            fillcolor='rgba(255, 0, 0, 0.1)',
            line=dict(color='red', width=2),
            name=metric_type.title(),
            hovertemplate="%{y:,.0f}<br>%{x}<extra></extra>"
        ))
    
        # Add annotations
        fig.add_annotation(
            x=df['date'].iloc[-1],
            y=df[metric_type].max(),
            text=f"{pct_change:+.2f}%",
            showarrow=False,
            font=dict(size=16, color='red'),
            xanchor='right',
            yanchor='top'
        )
        
        fig.add_annotation(
            x=df['date'].iloc[0],
            y=df[metric_type].max(),
            text=f"Current: {current_value:,.0f}",
            showarrow=False,
            font=dict(size=16),
            xanchor='left',
            yanchor='top'
        )
        
        fig.update_layout(
            title=f"{metric_type.title()} - Last {time_range}",
            showlegend=False,
            xaxis=dict(showgrid=False, zeroline=False),
            yaxis=dict(
                showgrid=True,
                gridcolor='rgba(128, 128, 128, 0.2)',
                zeroline=False
            ),
            plot_bgcolor='white',
            hovermode='x unified',
            margin=dict(t=50, l=50, r=50, b=30)
        )
        
        return fig
        
            
        
    def views_trend_graph(self, videos, time_range='28D'):
        """Create views trend visualization"""
        df = pd.DataFrame(videos)
        # Convert to timezone-naive datetime
        df['published_at'] = pd.to_datetime(df['published_at']).dt.tz_localize(None)
        df = df.sort_values('published_at')
        
        # Create data for new trend visualization
        trend_data = pd.DataFrame({
            'date': df['published_at'],
            'views': df['views']
        })
        
        return self.create_trend_graph(trend_data, 'views', time_range)


    def upload_schedule_heatmap(self, videos):
        """Create upload schedule heatmap"""
        df = pd.DataFrame(videos)
        df['published_at'] = pd.to_datetime(df['published_at'])
        df['day'] = df['published_at'].dt.day_name()
        df['hour'] = df['published_at'].dt.hour
        
        pivot_table = pd.crosstab(df['day'], df['hour'])
        pivot_table = pivot_table.reindex(list(calendar.day_name))
        
        fig = go.Figure(data=go.Heatmap(
            z=pivot_table.values,
            x=[f"{hour:02d}:00" for hour in pivot_table.columns],
            y=pivot_table.index,
            colorscale='Viridis'
        ))
        
        fig.update_layout(
            title='Upload Schedule Heatmap',
            xaxis_title='Hour of Day',
            yaxis_title='Day of Week'
        )
        
        return fig
    
    def engagement_analysis(self, videos):
        """Create engagement analysis visualization"""
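        df = pd.DataFrame(videos)
        # Guard against division by zero for videos that have no views yet
        safe_views = df['views'].replace(0, np.nan)
        df['engagement_rate'] = (df['likes'] + df['comments']) / safe_views * 100
        df['duration_minutes'] = df['duration'] / 60
        
        fig = go.Figure()
        
        fig.add_trace(go.Scatter(
            x=df['duration_minutes'],
            y=df['engagement_rate'],
            mode='markers',
            marker=dict(
                # Marker size scales with views relative to the most-viewed video
                size=df['views'] / max(df['views'].max(), 1) * 50,
                color=df['likes'],
                colorscale='Viridis',
                showscale=True,
                colorbar=dict(title='Likes')
            ),
            text=df['title'],
            # Carry the raw view counts so the hover label shows views, not marker size
            customdata=df['views'],
            hovertemplate="<b>%{text}</b><br>" +
                         "Duration: %{x:.1f} min<br>" +
                         "Engagement: %{y:.1f}%<br>" +
                         "Views: %{customdata:,.0f}<extra></extra>"
        ))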
        
        fig.update_layout(
            title='Video Engagement Analysis',
            xaxis_title='Video Length (minutes)',
            yaxis_title='Engagement Rate (%)'
        )
        
        return fig
    
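    def subscriber_trend_graph(self, channel_id, time_range='28D'):
        """Create subscriber trend visualization from simulated history.

        Only the current subscriber count is fetched; earlier values are mock data.
        """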
        try:
            end_date = datetime.now()
            start_date = end_date - timedelta(days=365)  # Get full year of data
            
            results = yt.youtube.channels().list(
                part='statistics',
                id=channel_id,
                fields='items(statistics(subscriberCount))'
            ).execute()
            
            current_subscribers = int(results['items'][0]['statistics']['subscriberCount'])
            
            # Create mock historical data
            dates = pd.date_range(start=start_date, end=end_date, freq='D')
            subscriber_data = []
            subscribers = current_subscribers
            
            for i in range(len(dates)-1, -1, -1):
                growth_rate = np.random.normal(1.002, 0.001)
                subscriber_data.append({
                    'date': dates[i],
                    'subscribers': int(subscribers)
                })
                subscribers = subscribers / growth_rate
            
            df = pd.DataFrame(subscriber_data)
            # Ensure dates are timezone-naive
            df['date'] = pd.to_datetime(df['date']).dt.tz_localize(None)
            
            return self.create_trend_graph(df, 'subscribers', time_range)
            
        except Exception as e:
            print(f"Error creating subscriber trend graph: {e}")
            return None

    

    def get_all_trend_graphs(self, videos, channel_id):
        """Get all trend graphs for different time periods"""
        time_ranges = ['7D', '28D', '90D', 'ALL']  # 'ALL' applies no date cutoff
        views_graphs = {}
        subscriber_graphs = {}
        
        for time_range in time_ranges:
            views_graphs[time_range] = self.views_trend_graph(videos, time_range)
            subscriber_graphs[time_range] = self.subscriber_trend_graph(channel_id, time_range)
        
        return views_graphs, subscriber_graphs


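    def generate_subscriber_data(self, channel_id):
        """Generate simulated subscriber data for all periods in a single DataFrame.

        Only the current count and creation date come from the API; the daily series is mock data.
        """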
        try:
            # Fetch current subscriber count
            results = yt.youtube.channels().list(
                part='statistics',
                id=channel_id,
                fields='items(statistics(subscriberCount))'
            ).execute()
            
            current_subscribers = int(results['items'][0]['statistics']['subscriberCount'])
            
            # Fetch channel creation date
            snippet_results = yt.youtube.channels().list(
                part='snippet',
                id=channel_id,
                fields='items(snippet(publishedAt))'
            ).execute()
            
            # Convert publishedAt to datetime and ensure it's timezone-aware
            creation_date = pd.to_datetime(snippet_results['items'][0]['snippet']['publishedAt'])
            
            # Check if the datetime is naive (i.e., doesn't have timezone info) and localize to UTC if necessary
            if creation_date.tzinfo is None:
                creation_date = creation_date.tz_localize('UTC')  # Localize to UTC if naive
            
            # Define all periods
            periods = {
                '7D': 7,
                '28D': 28,
                '90D': 90,
                '1Y': 365,  # 1 year
                '3Y': 3 * 365,  # 3 years
                'ALL': (datetime.now(tz=creation_date.tzinfo) - creation_date).days  # Lifetime
            }
            
            all_subscriber_data = pd.DataFrame()
            
            for period, days in periods.items():
                end_date = datetime.now(tz=creation_date.tzinfo)  # Ensure current time is timezone-aware
                start_date = end_date - timedelta(days=days)
                dates = pd.date_range(start=start_date, end=end_date, freq='D')
                
                # Generate data for this period
                subscribers = current_subscribers
                growth_patterns = {
                    '7D': (1.001, 0.0005),
                    '28D': (1.002, 0.001),
                    '90D': (1.003, 0.002),
                    '1Y': (1.004, 0.003),  # Adjust growth pattern for 1 year
                    '3Y': (1.005, 0.004),  # Adjust growth pattern for 3 years
                    'ALL': (1.006, 0.005)  # Adjust growth pattern for lifetime
                }
                
                mean_growth, std_growth = growth_patterns[period]
                growth_rate = np.random.normal(mean_growth, std_growth, len(dates))
                
                period_data = []
                for i in range(len(dates)-1, -1, -1):
                    period_data.append({
                        'date': dates[i],
                        'period': period,
                        'subscribers': int(subscribers)
                    })
                    subscribers = subscribers / growth_rate[i]
                
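                # Rows were built newest-first; sort oldest-first so growth is computed forward in time
                period_df = pd.DataFrame(period_data).sort_values('date').reset_index(drop=True)
                period_df['daily_growth'] = period_df['subscribers'].pct_change() * 100
                period_df['rolling_growth_7day'] = period_df['daily_growth'].rolling(window=min(7, len(period_df))).mean()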
                
                all_subscriber_data = pd.concat([all_subscriber_data, period_df])
            
            all_subscriber_data = all_subscriber_data.sort_values(['period', 'date']).reset_index(drop=True)
            return all_subscriber_data
                
        except Exception as e:
            print(f"Error generating subscriber data: {e}")
            return None

    def generate_views_data(self, videos_df):
        """Generate historical views data for all periods in a single DataFrame"""
        try:
            # Define periods (in days) for views data analysis
            periods = {
                '7D': 7,
                '28D': 28,
                '90D': 90,
                '1Y': 365,  # 1 year
                '3Y': 3 * 365,  # 3 years
                'ALL': (datetime.now() - pd.to_datetime(videos_df['published_at']).min()).days  # Lifetime
            }
            
            all_views_data = pd.DataFrame()
            now = datetime.now()  # Timezone-naive, matching the naive 'published_at' values
            
            for period, days in periods.items():
                # Calculate cutoff date for the current period
                cutoff = now - timedelta(days=days)
                
                # Filter videos by 'published_at' for the current period, oldest first
                period_df = videos_df[videos_df['published_at'] >= cutoff].sort_values('published_at').copy()
                
                if not period_df.empty:
                    # Add period column and calculate additional columns
                    period_df['period'] = period
                    period_df['days_since_published'] = (now - period_df['published_at']).dt.total_seconds() / (24 * 3600)
                    period_df['views_per_day'] = period_df['views'] / period_df['days_since_published']
                    
                    # Rolling average of views over up to 10 consecutive videos (fewer if the period has fewer)
                    period_df['rolling_avg_views'] = period_df['views'].rolling(window=min(10, len(period_df))).mean()
                    
                    # Engagement rate: (likes + comments) / views * 100 (NaN when a video has no views)
                    period_df['engagement_rate'] = (period_df['likes'] + period_df['comments']) / period_df['views'].replace(0, np.nan) * 100
                    
                    # Concatenate period data to the main DataFrame
                    all_views_data = pd.concat([all_views_data, period_df])
            
            # Sort the final DataFrame by period and publication date
            all_views_data = all_views_data.sort_values(['period', 'published_at']).reset_index(drop=True)
            
            # Final columns to return
            final_columns = [
                'period',
                'published_at',
                'title',
                'views',
                'likes',
                'comments',
                'views_per_day',
                'rolling_avg_views',
                'engagement_rate',
                'days_since_published',
                'duration'
            ]
            
            return all_views_data[final_columns]
            
        except Exception as e:
            print(f"Error generating views data: {e}")
            return None
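
The subscriber series produced by this class is simulated: the code fetches only the current subscriber count and then walks backward in time, dividing by a randomly drawn daily growth factor. The standard-library sketch below isolates that backcasting idea. `backcast_subscribers`, the fixed seed, and the example count of 10,000 are assumptions for illustration, and the output is synthetic and should not be read as real channel history.

```python
import random


def backcast_subscribers(current, days, mean_growth=1.002, std_growth=0.001, seed=0):
    """Simulate `days` daily counts ending at `current`, returned oldest first."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    value = float(current)
    series = []
    for _ in range(days):
        series.append(int(value))
        # Step one day back by undoing one day of simulated growth
        value /= rng.gauss(mean_growth, std_growth)
    series.reverse()  # built newest-first, so flip to chronological order
    return series


history = backcast_subscribers(current=10_000, days=28)
print(history[0], history[-1])
```

Because the series is generated newest-first, it has to be put in chronological order before computing day-over-day growth, otherwise the growth rates come out with the wrong sign.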



<div style="border-radius: 30px 0 30px 0; border: 2px solid #00ea98; padding: 20px; background-color: #0a141b; text-align: center; box-shadow: 0px 2px 4px rgba(0, 0, 0, 0.2);">
    <h1 style="color: #7ab052; text-shadow: 2px 2px 4px rgba(0, 0, 0, 0.5); font-weight: bold; margin-bottom: 10px; font-size: 36px;">Channel Analysis Execution </h1>
</div>

Defining the `ChannelAnalytics` class, which computes summary metrics and builds the text report

In [20]:
class ChannelAnalytics:
    def __init__(self, channel_info: Dict, videos: List[Dict]):
        self.channel_info = channel_info
        self.videos_df = pd.DataFrame(videos)
        self.videos_df['published_at'] = pd.to_datetime(self.videos_df['published_at'])
        
    def get_channel_overview(self) -> Dict:
        """Get basic channel information"""
        return {
            'channel_name': self.channel_info['snippet']['title'],
            'creation_date': self.channel_info['snippet']['publishedAt'],
            'subscriber_count': int(self.channel_info['statistics']['subscriberCount']),
            'total_views': int(self.channel_info['statistics']['viewCount']),
            'total_videos': int(self.channel_info['statistics']['videoCount'])
        }
    
    def get_performance_metrics(self) -> Dict:
        """Calculate average performance metrics"""
        return {
            'avg_views': self.videos_df['views'].mean(),
            'avg_likes': self.videos_df['likes'].mean(),
            'avg_comments': self.videos_df['comments'].mean(),
            'engagement_rate': (
                (self.videos_df['likes'] + self.videos_df['comments']).sum() / 
                self.videos_df['views'].sum() * 100
            )
        }
    
    def get_upload_patterns(self) -> Dict:
        """Analyze upload patterns"""
        sorted_dates = self.videos_df['published_at'].sort_values()
        time_diffs = sorted_dates.diff().dropna()
        
        return {
            'avg_days_between_uploads': time_diffs.mean().total_seconds() / (24 * 3600),
            'avg_video_duration': self.videos_df['duration'].mean() / 60,  # in minutes
            'most_common_upload_day': self.videos_df['published_at'].dt.day_name().mode()[0],
            'most_common_upload_hour': self.videos_df['published_at'].dt.hour.mode()[0]
        }
    
    def get_recent_performance(self) -> Dict:
        """Analyze recent video performance"""
        sorted_df = self.videos_df.sort_values('published_at', ascending=False)
        
        return {
            'avg_views_last_10': sorted_df.head(10)['views'].mean(),
            'avg_views_last_30': sorted_df.head(30)['views'].mean(),
            'overall_avg_views': sorted_df['views'].mean(),
            'trend': (
                'Improving' if sorted_df.head(10)['views'].mean() > sorted_df['views'].mean()
                else 'Declining'
            )
        }
    
    def get_complete_analytics(self) -> Dict:
        """Get all analytics in one dictionary"""
        return {
            'channel_overview': self.get_channel_overview(),
            'performance_metrics': self.get_performance_metrics(),
            'upload_patterns': self.get_upload_patterns(),
            'recent_performance': self.get_recent_performance()
        }

    def generate_report(self) -> str:
        """Generate a formatted text report of all analytics"""
        analytics = self.get_complete_analytics()
        
        report = [
            "=== YouTube Channel Analytics Report ===\n",
            "\n== Channel Overview ==",
            f"Channel Name: {analytics['channel_overview']['channel_name']}",
            f"Creation Date: {analytics['channel_overview']['creation_date']}",
            f"Subscriber Count: {analytics['channel_overview']['subscriber_count']:,}",
            f"Total Views: {analytics['channel_overview']['total_views']:,}",
            f"Total Videos: {analytics['channel_overview']['total_videos']:,}",
            
            "\n== Performance Metrics ==",
            f"Average Views per Video: {analytics['performance_metrics']['avg_views']:,.0f}",
            f"Average Likes per Video: {analytics['performance_metrics']['avg_likes']:,.0f}",
            f"Average Comments per Video: {analytics['performance_metrics']['avg_comments']:,.0f}",
            f"Engagement Rate: {analytics['performance_metrics']['engagement_rate']:.2f}%",
            
            "\n== Upload Patterns ==",
            f"Average Days Between Uploads: {analytics['upload_patterns']['avg_days_between_uploads']:.1f}",
            f"Average Video Duration: {analytics['upload_patterns']['avg_video_duration']:.1f} minutes",
            f"Most Common Upload Day: {analytics['upload_patterns']['most_common_upload_day']}",
            f"Most Common Upload Hour: {analytics['upload_patterns']['most_common_upload_hour']:02d}:00",
            
            "\n== Recent Performance ==",
            f"Average Views (Last 10): {analytics['recent_performance']['avg_views_last_10']:,.0f}",
            f"Average Views (Last 30): {analytics['recent_performance']['avg_views_last_30']:,.0f}",
            f"Overall Average Views: {analytics['recent_performance']['overall_avg_views']:,.0f}",
            f"Trend: {analytics['recent_performance']['trend']}"
        ]
        
        return "\n".join(report)
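
Two of the metrics above are easy to misread: the average gap between uploads is the mean of consecutive differences between sorted publish times, and the channel-level engagement rate is total interactions divided by total views, not the mean of per-video rates. The sketch below reproduces both with the standard library. The function names and sample values are made up for illustration.

```python
from datetime import datetime


def average_days_between(timestamps):
    """Mean gap in days between consecutive timestamps; None with fewer than two."""
    ordered = sorted(timestamps)
    if len(ordered) < 2:
        return None
    gaps = [
        (later - earlier).total_seconds() / 86400
        for earlier, later in zip(ordered, ordered[1:])
    ]
    return sum(gaps) / len(gaps)


def aggregate_engagement_rate(videos):
    """(total likes + total comments) / total views * 100; 0.0 when there are no views."""
    total_views = sum(v["views"] for v in videos)
    if total_views == 0:
        return 0.0
    interactions = sum(v["likes"] + v["comments"] for v in videos)
    return interactions / total_views * 100


# Made-up sample data
uploads = [datetime(2024, 1, 8), datetime(2024, 1, 1), datetime(2024, 1, 11)]
sample_videos = [
    {"views": 1000, "likes": 40, "comments": 10},
    {"views": 3000, "likes": 90, "comments": 20},
]

print(average_days_between(uploads))
print(aggregate_engagement_rate(sample_videos))
```

The aggregate rate weights each video by its views, so a single viral video dominates it; a per-video mean would treat every video equally.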




<div style="border-radius: 30px 0 30px 0; border: 2px solid #00ea98; padding: 20px; background-color: #0a141b; text-align: center; box-shadow: 0px 2px 4px rgba(0, 0, 0, 0.2);">
    <h1 style="color: #7ab052; text-shadow: 2px 2px 4px rgba(0, 0, 0, 0.5); font-weight: bold; margin-bottom: 10px; font-size: 36px;">Generated Data Files </h1>
</div>

Two comprehensive CSV files are created:
1. views_trends_all_periods.csv
   - Period-wise view counts
   - Engagement metrics
   - Performance indicators
   
2. subscriber_trends_all_periods.csv
   - Daily subscriber counts
   - Growth rates
   - Period-wise trends




<div style="border-radius: 30px 0 30px 0; border: 2px solid #00ea98; padding: 20px; background-color: #0a141b; text-align: center; box-shadow: 0px 2px 4px rgba(0, 0, 0, 0.2);">
    <h1 style="color: #7ab052; text-shadow: 2px 2px 4px rgba(0, 0, 0, 0.5); font-weight: bold; margin-bottom: 10px; font-size: 36px;">Analysis Execution </h1>
</div>

Running comprehensive analysis for the specified channel:
- Data collection and processing
- CSV file generation
- Visualization creation
- Report generation


In [21]:
# Main execution code
channel_url = "https://www.youtube.com/@arsalancba"  # Example channel URL
channel_info, videos = get_channel_data(channel_url)

if channel_info and videos:
    # Create visualizer
    viz = YouTubeVisualizer()
    channel_id = channel_info['id']  # Already resolved by get_channel_data
    
    # Make sure the output directory exists before writing CSV files
    os.makedirs('analytics_output', exist_ok=True)
    
    # Process video dates
    videos_df = pd.DataFrame(videos)
    videos_df['published_at'] = pd.to_datetime(videos_df['published_at']).dt.tz_localize(None)
    
    # Generate and save views data
    views_data = viz.generate_views_data(videos_df)
    if views_data is not None:
        views_filename = 'analytics_output/views_trends_all_periods.csv'
        views_data.to_csv(views_filename, index=False)
        print(f"\nSaved all views data to {views_filename}")
        
    # Generate and save subscriber data
    subscriber_data = viz.generate_subscriber_data(channel_id)
    if subscriber_data is not None:
        subs_filename = 'analytics_output/subscriber_trends_all_periods.csv'
        subscriber_data.to_csv(subs_filename, index=False)
        print(f"\nSaved all subscriber data to {subs_filename}")
    
    # Generate and show visualizations
    views_graphs, subscriber_graphs = viz.get_all_trend_graphs(videos, channel_id)
    
    # Display views graphs (a graph is None when the period has no data)
    for time_range, graph in views_graphs.items():
        if graph is not None:
            print(f"Displaying views trend graph for {time_range}...")
            graph.show()
    
    # Display subscriber graphs (a graph is None when generation failed)
    for time_range, graph in subscriber_graphs.items():
        if graph is not None:
            print(f"Displaying subscriber trend graph for {time_range}...")
            graph.show()
    
    # Show the remaining visualizations and print the text report
    viz.upload_schedule_heatmap(videos).show()
    viz.engagement_analysis(videos).show()
    print(ChannelAnalytics(channel_info, videos).generate_report())
else:
    print("Could not fetch channel data")

Error fetching channel ID: timed out
Could not fetch channel data




<div style="border-radius: 30px 0 30px 0; border: 2px solid #00ea98; padding: 20px; background-color: #0a141b; text-align: center; box-shadow: 0px 2px 4px rgba(0, 0, 0, 0.2);">
    <h1 style="color: #7ab052; text-shadow: 2px 2px 4px rgba(0, 0, 0, 0.5); font-weight: bold; margin-bottom: 10px; font-size: 36px;">Analysis Results </h1>
</div>
The analysis generates:
1. Channel Overview
   - Basic channel statistics
   - Performance metrics
   - Upload patterns

2. Trend Analysis
   - Views trends for different periods (7D, 28D, 90D, ALL)
   - Subscriber growth patterns
   - Engagement metrics over time

3. Data Files
   - views_trends_all_periods.csv: Comprehensive view data
   - subscriber_trends_all_periods.csv: Subscriber growth data

4. Visualizations
   - Views trend graphs
   - Upload schedule heatmap
   - Engagement analysis
   - Subscriber growth visualization

All output files are saved in the analytics_output directory for further analysis.
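
As one possible starting point for that further analysis, the sketch below reads a views CSV with the standard library and averages views per period. It assumes only the `period` and `views` columns written by `generate_views_data`. The sample rows are made up so the example runs without the real file, and `mean_views_by_period` is a hypothetical helper.

```python
import csv
import io
from collections import defaultdict


def mean_views_by_period(csv_file):
    """Return {period: mean views} from an open CSV file with 'period' and 'views' columns."""
    totals = defaultdict(lambda: [0, 0])  # period -> [sum of views, row count]
    for row in csv.DictReader(csv_file):
        bucket = totals[row["period"]]
        bucket[0] += int(row["views"])
        bucket[1] += 1
    return {period: total / count for period, (total, count) in totals.items()}


# Made-up rows standing in for analytics_output/views_trends_all_periods.csv
sample = io.StringIO(
    "period,published_at,title,views\n"
    "7D,2024-01-05,Example A,100\n"
    "28D,2024-01-05,Example A,100\n"
    "28D,2023-12-20,Example B,200\n"
)

print(mean_views_by_period(sample))

# With the real file, the call would look like:
# with open("analytics_output/views_trends_all_periods.csv", newline="") as f:
#     print(mean_views_by_period(f))
```

Note that the same video appears once in every period that contains it, so rows should be grouped by `period` rather than summed across the whole file.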
