***A Brief Introduction Before We Start***

Welcome to the final phase of this comprehensive data visualization project. In this stage, we integrate advanced visual techniques and analytical methods to transform complex datasets into clear, actionable insights. You’ll encounter everything from interactive 3D bubble charts and dynamic NetworkX visualizations to AI-driven image labeling and high-impact KPI dashboards—each designed to illuminate underlying patterns and trends with precision and aesthetic clarity.

Our journey culminates with specialized charts—donut charts that break down category engagement, Sankey and chord diagrams that map flows and connections, and animated line charts that reveal quarterly emotional trends over time. Complementing these visuals are carefully crafted annotations, highlights, and a concise mini-documentary segment that guides you through key findings with professional polish.

By blending data science rigor with thoughtful design, this project not only showcases technical proficiency but also tells a compelling story. Let’s dive in and explore the insights that can drive better decision-making and strategic impact.

1. ***Inside YouTube: A Director's Cut on Trends, Engagement, and Data Stories 🎥***

## 📊 KPIs: Numbers That Matter


In [None]:
import pandas as pd
import plotly.graph_objects as go

#Loaded dataset
youtube_df = pd.read_csv('USvideos.csv')

#️Calculating KPIs (store both raw and formatted version)
metrics = {
    "Total Videos": {
        "raw": len(youtube_df),
        "formatted": f"{len(youtube_df):,}"
    },
    "Total Views": {
        "raw": youtube_df['views'].sum(),
        "formatted": f"{youtube_df['views'].sum():,}"
    },
    "Total Likes": {
        "raw": youtube_df['likes'].sum(),
        "formatted": f"{youtube_df['likes'].sum():,}"
    },
    "Total Comments": {
        "raw": youtube_df['comment_count'].sum(),
        "formatted": f"{youtube_df['comment_count'].sum():,}"
    },
    "Likes per 100 Views": {
        "raw": (youtube_df['likes'].sum() / youtube_df['views'].sum()) * 100,
        "formatted": f"{(youtube_df['likes'].sum() / youtube_df['views'].sum()) * 100:.2f}"
    },
    "Comments per 100 Views": {
        "raw": (youtube_df['comment_count'].sum() / youtube_df['views'].sum()) * 100,
        "formatted": f"{(youtube_df['comment_count'].sum() / youtube_df['views'].sum()) * 100:.2f}"
    }
}

#️Creating KPI Cards
fig = go.Figure()

for i, (kpi, values) in enumerate(metrics.items()):
    fig.add_trace(go.Indicator(
        mode = "number",
        value = values["raw"],
        number = {"font": {"size": 36}},
        title = {
            "text": f"<b>{kpi}</b><br><span style='font-size:0.7em'>{values['formatted']}</span>"
        },
        domain = {"row": i // 3, "column": i % 3}
    ))

fig.update_layout(
    grid = {"rows": 2, "columns": 3, "pattern": "independent"},
    template = "plotly_dark",
    title = {
        "text": "YouTube Director's Cut: Key Engagement Metrics",
        "x": 0.5,
        "xanchor": "center",
        "font": {"size": 24}
    },
    margin = dict(t = 100, b = 20)
)

fig.show()
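
The two ratio KPIs in the cell above are plain arithmetic: total likes (or comments) divided by total views, scaled to a per-100 basis. A minimal sketch of that calculation follows. The helper name `per_hundred` and the totals are made up for illustration, and the zero-denominator guard is an addition that the notebook cell does not include.

```python
def per_hundred(numerator, denominator):
    """Return numerator per 100 units of denominator, or 0.0 if denominator is 0."""
    if denominator == 0:
        return 0.0
    return numerator / denominator * 100


# Made-up totals for illustration only (not taken from USvideos.csv)
total_views = 10_000
total_likes = 250
total_comments = 40

likes_rate = per_hundred(total_likes, total_views)
comments_rate = per_hundred(total_comments, total_views)
print(f"Likes per 100 views: {likes_rate:.2f}")
print(f"Comments per 100 views: {comments_rate:.2f}")
```

Computing each ratio once and formatting it afterwards would also avoid repeating the same `sum()` calls for the raw and formatted entries.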


## 🔗 NetworkX Graph: Creator Connections


In [None]:
import warnings
warnings.filterwarnings('ignore')

import pandas as pd
import networkx as nx
import matplotlib.pyplot as plt
from collections import Counter
import matplotlib.cm as cm
import numpy as np

#️Loading and cleaning the dataset
youtube_df = pd.read_csv('USvideos.csv')
youtube_df = youtube_df.dropna(subset = ['tags'])

#️Getting top 30 tags by frequency
tag_counter = Counter()
for tag_str in youtube_df['tags']:
    tags = [tag.strip() for tag in tag_str.split('|')]
    tag_counter.update(tags)

top_tags = set([tag for tag, count in tag_counter.most_common(30)])

#️Building graph with only top tags
G = nx.Graph()
for tag_str in youtube_df['tags']:
    tags = [tag.strip() for tag in tag_str.split('|') if tag.strip() in top_tags]
    for i in range(len(tags)):
        for j in range(i + 1, len(tags)):
            if G.has_edge(tags[i], tags[j]):
                G[tags[i]][tags[j]]['weight'] += 1
            else:
                G.add_edge(tags[i], tags[j], weight = 1)

#️Removing isolated nodes
G.remove_nodes_from(list(nx.isolates(G)))

# Drawing the graph with custom visuals
plt.figure(figsize = (16, 12))
pos = nx.spring_layout(G, k = 0.5, seed = 42)

#️Node sizes by tag frequency
node_sizes = [tag_counter[node] * 20 for node in G.nodes()]

#️Node colors by degree
degrees = dict(G.degree())
colors = [degrees[node] for node in G.nodes()]
cmap = plt.get_cmap('plasma')  # matplotlib.cm.get_cmap was removed in Matplotlib 3.9

#️Drawing elements
nx.draw_networkx_nodes(G, pos, node_size = node_sizes, node_color = colors, cmap = cmap, alpha = 0.9)
nx.draw_networkx_edges(G, pos, width = 1.0, alpha = 0.3)
nx.draw_networkx_labels(G, pos, font_size = 9)

plt.title("YouTube Tag Network (Top 30 Tags)", fontsize = 18, fontweight = 'bold')
plt.axis('off')
plt.tight_layout()
plt.show()
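
The nested loop in the cell above counts how often two top tags appear on the same video. The same idea can be sketched with `itertools.combinations` and a `Counter`. The function name `count_tag_pairs` and the sample tag strings are hypothetical. Unlike the notebook loop, this sketch de-duplicates tags within a video, so a tag repeated on one video cannot pair with itself.

```python
from collections import Counter
from itertools import combinations


def count_tag_pairs(tag_strings, top_tags):
    """Count how often two tags from top_tags appear on the same video."""
    pair_counts = Counter()
    for tag_str in tag_strings:
        # De-duplicate and sort so each unordered pair has one canonical key
        tags = sorted({t.strip() for t in tag_str.split('|')} & top_tags)
        pair_counts.update(combinations(tags, 2))
    return pair_counts


# Made-up tag strings in the same pipe-separated form as the 'tags' column
sample = ["music|pop|live", "pop|music", "comedy|funny", "music|funny|pop"]
pairs = count_tag_pairs(sample, {"music", "pop", "funny"})
print(pairs.most_common())
```

Each key in `pairs` could then become a weighted edge, for example with `G.add_edge(a, b, weight=count)`.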


## 🌡️ Heatmap: Relationship Intensity



In [None]:
category_mapping = {
    1: 'Film & Animation',
    2: 'Autos & Vehicles',
    10: 'Music',
    # Abbreviated mapping; unmapped category IDs become NaN and drop out of the groupby below
}

youtube_df['category'] = youtube_df['category_id'].map(category_mapping)


In [None]:
youtube_df['publish_time'] = pd.to_datetime(youtube_df['publish_time'])
youtube_df['year'] = youtube_df['publish_time'].dt.year


In [None]:
top_categories = youtube_df.groupby('category')['views'].sum().nlargest(6).index
filtered_df = youtube_df[youtube_df['category'].isin(top_categories)]

recent_years = sorted(youtube_df['year'].unique())[-4:]
filtered_df = filtered_df[filtered_df['year'].isin(recent_years)]

heatmap_data = filtered_df.groupby(['category', 'year'])['views'].sum().unstack(fill_value=0)


In [None]:
import plotly.express as px

fig = px.imshow(
    heatmap_data,
    labels=dict(x="Year", y="Category", color="Total Views"),
    x=heatmap_data.columns.astype(str),
    y=heatmap_data.index,
    color_continuous_scale='YlGnBu',
    text_auto=True
)

fig.update_layout(
    title='Total Views by Top 6 Categories and Recent Years',
    template='plotly_dark',
    height=500,
    margin=dict(l=80, r=50, t=50, b=80)
)

fig.show()


## 📖 Emoji Storytelling: A Data Drama


🧠 Inside YouTube: A Director's Cut on Trends, Engagement, and Data Stories
(Emoji Storytelling Panel)

🎬 Category Showdown: Who’s Ruling the Stage?
🥇 Entertainment takes the crown with billions of views 👑🎭

🎵 Music stays timeless and powerful with ultra-high engagement 🎧💖

🎮 Gaming continues to level up — explosive growth in recent years 🚀🕹️

📚 Education rises quietly, delivering value 📈🧠

😂 Comedy keeps us laughing and clicking 😂🔥

❤️ Audience Engagement Pulse
💬 Videos with meaningful comments spark conversation & loyalty 🔄🗣️

👍 High like-to-view ratio? That’s content people truly love ❤️👏

🔁 Reactions go viral when comments + likes + shares = BOOM 💥🌍

🧩 Tag Tactics: Secrets Behind Smart Creators
🏷️ Repeating tags = stronger networks = more reach 🤝📡

🕸️ Top tags like music, vlog, funny form clusters 🔗🎶😂

🎯 Using 4–6 focused tags = better discoverability 🎯⚙️

🌐 The World Watches Together
🌍 Multi-language or global-tagged videos draw wider audiences 🗺️📈

🎌 Korean pop? Indian vlogs? Global love pouring in 💖🌏

🧑‍🤝‍🧑 Cultural trends spread fast — creators ride the wave 🏄‍♂️🎤🎥

⏱️ Time + Consistency = Growth
🔁 Regular posting creates trust & habit with viewers ⏰💡

📅 Creators uploading weekly gain more loyal subs over time 📊❤️

🎯 Story Wrap-Up
YouTube isn’t just a video platform — it’s a playground of patterns, people, and powerful stories.
From viral rockets 🚀 to slow-burn educators 📚, the secret sauce is in data + creativity.
So next time you click ❤️ or 💬 — you’re shaping the trends too!

2. ***YouTube Rewind 2.0: A Data-Driven Dive into Views, Voices & Viral Vibes 🔥***

## 🏁 Bar Chart Race: And They're Off!


In [None]:
#Extracting year from publish_time
youtube_df['publish_time'] = pd.to_datetime(youtube_df['publish_time'], errors='coerce')
youtube_df['year'] = youtube_df['publish_time'].dt.year

#Category mapping
category_map = {
    1: "Film & Animation", 2: "Autos & Vehicles", 10: "Music", 15: "Pets & Animals",
    17: "Sports", 18: "Short Movies", 19: "Travel & Events", 20: "Gaming",
    22: "People & Blogs", 23: "Comedy", 24: "Entertainment", 25: "News & Politics",
    26: "Howto & Style", 27: "Education", 28: "Science & Technology", 29: "Nonprofits & Activism"
}
youtube_df['category'] = youtube_df['category_id'].map(category_map)


In [None]:
#Aggregate total views per year and category
df = youtube_df.groupby(['year', 'category'])['views'].sum().reset_index()

#Filtering the top 6 categories overall
top_categories = df.groupby('category')['views'].sum().nlargest(6).index
df = df[df['category'].isin(top_categories)]

# Creating grid of all years x top categories
years = sorted(df['year'].unique())
full_index = pd.MultiIndex.from_product([years, top_categories], names=['year', 'category'])

#Filling missing year/category combinations with zero views
df = df.set_index(['year', 'category']).reindex(full_index, fill_value=0).reset_index()

#Sorting for animation
df = df.sort_values(by=['year', 'views'], ascending=[True, False])
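
The reindex step above matters because an animated bar chart needs every category to be present in every frame. Otherwise bars appear and disappear between years. A minimal pure-Python sketch of the same "fill the grid" idea follows. The function name `fill_missing_combinations` and the totals are illustrative assumptions, not part of the notebook.

```python
from itertools import product


def fill_missing_combinations(totals, years, categories):
    """Return a dict with an entry for every (year, category), defaulting to 0."""
    return {key: totals.get(key, 0) for key in product(years, categories)}


# Made-up view totals; (2017, "Gaming") is deliberately missing
totals = {(2017, "Music"): 120, (2018, "Music"): 300, (2018, "Gaming"): 90}
grid = fill_missing_combinations(totals, [2017, 2018], ["Music", "Gaming"])

for key in sorted(grid):
    print(key, grid[key])
```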


In [None]:
import plotly.express as px

fig = px.bar(
    df,
    x = 'views',
    y = 'category',
    color = 'category',
    animation_frame = 'year',
    orientation = 'h',
    text = 'views',
    range_x = [0, df['views'].max() * 1.1],
    title = 'YouTube Category Growth Over Time (Views)'
)

fig.update_layout(
    template = 'plotly_dark',
    showlegend = False,
    height = 600
)

fig.update_traces(texttemplate = '%{text:.2s}', textposition = 'outside')

fig.show()


## 📈 Line Chart Animation: Trends Over Time


In [None]:
#Extracting year from publish_time
youtube_df['publish_time'] = pd.to_datetime(youtube_df['publish_time'], errors='coerce')
youtube_df['year'] = youtube_df['publish_time'].dt.year

#Category mapping
category_map = {
    1: "Film & Animation", 2: "Autos & Vehicles", 10: "Music", 15: "Pets & Animals",
    17: "Sports", 18: "Short Movies", 19: "Travel & Events", 20: "Gaming",
    22: "People & Blogs", 23: "Comedy", 24: "Entertainment", 25: "News & Politics",
    26: "Howto & Style", 27: "Education", 28: "Science & Technology", 29: "Nonprofits & Activism"
}
youtube_df['category'] = youtube_df['category_id'].map(category_map)

In [None]:
import numpy as np

years = sorted(df['year'].unique())
categories = df['category'].unique()

full_index = pd.MultiIndex.from_product([categories, years], names = ['category', 'year'])
df_full = df.set_index(['category', 'year']).reindex(full_index).reset_index()

#Filling missing values
df_full['views'] = df_full['views'].fillna(0)


In [None]:
#Frames for animation
frames = []
for frame_year in years:
    frame_data = df_full[df_full['year'] <= frame_year].copy()
    frame_data['frame'] = frame_year
    frames.append(frame_data)

df_anim = pd.concat(frames)
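
Each frame built above repeats every row up to and including that frame's year, which is what makes the line appear to grow. A small sketch of the same construction on plain dictionaries follows. The function name `build_cumulative_frames` and the sample rows are hypothetical. It also shows how quickly the row count grows.

```python
def build_cumulative_frames(rows, years):
    """For each frame year, copy every row whose year is <= that frame year."""
    frames = []
    for frame_year in years:
        for row in rows:
            if row["year"] <= frame_year:
                frames.append({**row, "frame": frame_year})
    return frames


# Made-up yearly totals for a single category
rows = [
    {"category": "Music", "year": 2016, "views": 10},
    {"category": "Music", "year": 2017, "views": 25},
    {"category": "Music", "year": 2018, "views": 40},
]
anim_rows = build_cumulative_frames(rows, [2016, 2017, 2018])
print(len(anim_rows))  # 1 + 2 + 3 rows across the three frames
```

With n years and c categories the animated frame holds c * n * (n + 1) / 2 rows, which stays small for yearly data but is worth keeping in mind for finer time steps.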


In [None]:
import plotly.express as px

fig = px.line(
    df_anim,
    x = 'year',
    y = 'views',
    color = 'category',
    line_group = 'category',
    animation_frame = 'frame',
    animation_group = 'category',
    markers = True,
    title = 'YouTube Category Views: Cumulative Trend Animation'
)

fig.update_layout(template = 'plotly_dark', xaxis = dict(tickmode = 'linear'), height = 600)
fig.show()


## 🔥 Viral Video Highlights: The Showstoppers


In [None]:
top_5_viral = youtube_df.sort_values('views', ascending = False).head(5)

In [None]:
from IPython.display import display, HTML

for idx, row in top_5_viral.iterrows():
    # Build one HTML card per video and display it inside the loop
    html = f"""
    <div style="border:2px solid #fff; padding:10px; margin:10px; background:#111; color:#eee; width:350px;">
        <h3>{row['title']}</h3>
        <img src="{row['thumbnail_link']}" width="320px" style="border-radius:10px;"><br>
        <b>Views:</b> {row['views']:,} &nbsp; <b>Likes:</b> {row['likes']:,}<br>
        <i>Channel: {row['channel_title']}</i>
    </div>
    """
    display(HTML(html))



*Videos*

In [None]:
top_viral = youtube_df.sort_values('views', ascending=False).head(5)


In [None]:
from IPython.display import display, HTML

top_5_viral = youtube_df.sort_values('views', ascending=False).head(5)

for idx, row in top_5_viral.iterrows():
    video_id = row['video_id']
    title = row['title']
    thumbnail = row['thumbnail_link']
    channel = row['channel_title']
    views = row['views']
    likes = row['likes']

    #Building an HTML card that links the thumbnail to the video
    html = f"""
    <div style="background-color:#111; border-radius:10px; padding:15px; margin:10px; color:#eee; width:400px;">
        <a href="https://www.youtube.com/watch?v={video_id}" target="_blank">
            <img src="{thumbnail}" width="360px" style="border-radius:10px;"><br>
        </a>
        <h3 style="margin-top:10px;">{title}</h3>
        <b>Views:</b> {views:,} &nbsp;|&nbsp; <b>Likes:</b> {likes:,}<br>
        <i>Channel:</i> {channel}
    </div>
    """
    display(HTML(html))
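
The cards above interpolate titles and channel names into HTML verbatim, so a title containing `<`, `>`, or `&` could break the markup. A minimal sketch of escaping that text with the standard library's `html.escape` follows. The helper `render_card`, its CSS class, and the sample values are hypothetical.

```python
from html import escape


def render_card(title, channel, views):
    """Build a small HTML card with user-supplied text escaped."""
    return (
        '<div class="card">'
        f"<h3>{escape(title)}</h3>"
        f"<b>Views:</b> {views:,}<br>"
        f"<i>Channel: {escape(channel)}</i>"
        "</div>"
    )


# Made-up title containing characters that are special in HTML
card = render_card("Tom & Jerry <Official>", "Example Channel", 1234567)
print(card)
```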


## 🎞️ GIFs: Motion in the Metrics


In [None]:
import pandas as pd

#Loading the dataset
youtube_df = pd.read_csv('USvideos.csv')

#Cleaning or converting necessary columns
youtube_df['views'] = pd.to_numeric(youtube_df['views'], errors='coerce')
youtube_df['likes'] = pd.to_numeric(youtube_df['likes'], errors='coerce')

#Re-deriving video_id from the thumbnail URL (expand=False returns a Series)
youtube_df['video_id'] = youtube_df['thumbnail_link'].str.extract(r'vi/([^/]+)/', expand=False)


In [None]:
from IPython.display import display, HTML

# Rebuild thumbnail_link from video_id first, in case it is missing or broken
youtube_df['thumbnail_link'] = youtube_df['video_id'].apply(lambda vid: f"https://img.youtube.com/vi/{vid}/0.jpg")

# Select the top videos afterwards so they carry the rebuilt thumbnail links
top_viral = youtube_df[youtube_df['views'] > 0].sort_values('views', ascending=False).head(5)

html_code = """
<style>
.carousel-container {
  display: flex;
  overflow-x: auto;
  scroll-behavior: smooth;
  padding: 20px;
  background: #000;
}
.viral-card {
  flex: 0 0 auto;
  margin-right: 20px;
  padding: 15px;
  background-color: #111;
  color: #eee;
  border-radius: 12px;
  width: 350px;
  animation: pulse 4s infinite;
  position: relative;
  text-align: center;
}
.viral-card:hover {
  transform: scale(1.05);
  background-color: #222;
  box-shadow: 0 0 15px #ff4;
}
.viral-thumb {
  border-radius: 10px;
  width: 100%;
}
.floating-emojis {
  position: absolute;
  top: -10px;
  right: 10px;
  font-size: 24px;
  animation: float-emoji 3s infinite;
}
.quote {
  font-style: italic;
  font-size: 0.9rem;
  margin-top: 8px;
  color: #ccc;
}
@keyframes float-emoji {
  0% { transform: translateY(0); opacity: 0.8; }
  50% { transform: translateY(-15px); opacity: 1; }
  100% { transform: translateY(0); opacity: 0.8; }
}
@keyframes pulse {
  0% { transform: scale(1); box-shadow: 0 0 5px #ff5; }
  50% { transform: scale(1.03); box-shadow: 0 0 25px #f06; }
  100% { transform: scale(1); box-shadow: 0 0 5px #ff5; }
}
</style>
<div class="carousel-container">
"""



#Sample viral quotes
viral_quotes = [
    "This broke the internet! 💥",
    "One word: ICONIC 🔥",
    "How did this go so viral?! 🤯",
    "Everyone was talking about it 🎤",
    "A moment in YouTube history 📺"
]

#Floating emoji sets
emoji_sets = [
    "🔥💥", "😂📈", "🎉❤️", "😱🔥", "🤩🎬"
]

for i, (_, row) in enumerate(top_viral.iterrows()):
    quote = viral_quotes[i % len(viral_quotes)]
    emojis = emoji_sets[i % len(emoji_sets)]
    html_code += f"""
    <div class="viral-card">
        <div class="floating-emojis">{emojis}</div>
        <a href="https://www.youtube.com/watch?v={row['video_id']}" target="_blank">
            <img src="{row['thumbnail_link']}" class="viral-thumb"><br>
        </a>
        <h3>{row['title']}</h3>
        <p><b>Views:</b> {row['views']:,} &nbsp; <b>Likes:</b> {row['likes']:,}</p>
        <p><i>Channel: {row['channel_title']}</i></p>
        <div class="quote">“{quote}”</div>
    </div>
    """

html_code += "</div>"

#Displaying the result
display(HTML(html_code))


## 💬 Sentiment Exploration: What the Comments Say


In [None]:
import pandas as pd
import nltk
from nltk.sentiment.vader import SentimentIntensityAnalyzer

nltk.download('vader_lexicon')

youtube_df = pd.read_csv('USvideos.csv')  #Loading original data

sia = SentimentIntensityAnalyzer()

youtube_df['sentiment_scores'] = youtube_df['title'].apply(lambda x: sia.polarity_scores(x))
youtube_df['sentiment_compound'] = youtube_df['sentiment_scores'].apply(lambda d: d['compound'])

def sentiment_label(score):
    if score >= 0.05:
        return 'Positive'
    elif score <= -0.05:
        return 'Negative'
    else:
        return 'Neutral'

youtube_df['sentiment_label'] = youtube_df['sentiment_compound'].apply(sentiment_label)

print(youtube_df[['title', 'sentiment_compound', 'sentiment_label']].head())
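
The labelling rule above turns a compound score into one of three classes using cut-offs at +0.05 and -0.05. A short sketch of how the boundaries behave follows. It restates the notebook's `sentiment_label` and applies it to made-up scores. No VADER model is involved here.

```python
from collections import Counter


def sentiment_label(score):
    """Same cut-offs as the notebook: +/-0.05 on the compound score."""
    if score >= 0.05:
        return 'Positive'
    elif score <= -0.05:
        return 'Negative'
    return 'Neutral'


# Made-up compound scores, including both boundary values
sample_scores = [0.62, 0.05, 0.049, 0.0, -0.049, -0.05, -0.7]
label_counts = Counter(sentiment_label(s) for s in sample_scores)
print(label_counts)
```

Scores exactly at a cut-off fall into the Positive or Negative class, so only the open interval between -0.05 and 0.05 is treated as Neutral.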


In [None]:
from nltk.sentiment import SentimentIntensityAnalyzer

sia = SentimentIntensityAnalyzer()

def get_vader_tone(text):
    scores = sia.polarity_scores(text)
    return scores  #dict with pos, neu, neg, compound

youtube_df['vader_scores'] = youtube_df['title'].apply(get_vader_tone)


In [None]:
print(youtube_df[['title', 'vader_scores']].head())


In [None]:
vader_df = youtube_df['vader_scores'].apply(pd.Series)
youtube_df = pd.concat([youtube_df, vader_df], axis=1)

#Checking the updated DataFrame
print(youtube_df[['title', 'neg', 'neu', 'pos', 'compound']].head())


3.  ***Streaming the Stream: Unmasking YouTube Engagement & Growth Dynamics 📊***

## 🌈🖌️ A Creative Canvas: Artistic Visualization

In [None]:
import pandas as pd

youtube_df['likes_to_views'] = youtube_df['likes'] / youtube_df['views']
youtube_df['comments_to_views'] = youtube_df['comment_count'] / youtube_df['views']

youtube_df.fillna(0, inplace = True)
youtube_df.replace([float('inf'),  -float('inf')], 0, inplace = True)

In [None]:
import plotly.express as px

top_categories = youtube_df.groupby('category_id')['views'].sum().nlargest(10).index
filtered_df = youtube_df[youtube_df['category_id'].isin(top_categories)]

fig = px.scatter(filtered_df,
                 x = 'views',
                 y = 'likes',
                 color = 'category_id',
                 size = 'comment_count',
                 hover_data = ['title', 'channel_title'],
                 title = 'YouTube Engagement')
fig.update_layout(
    template = 'plotly_dark',
    height = 600,
    margin = dict(l = 80, r = 50, t = 50, b = 80)
)
fig.show()

## 🔊📝 Vocal Vibes: Voice-to-Text Polarity

In [None]:
from gtts import gTTS

text = "It starts with a click. A spark. A single video, launched into the vast unknown. Then—something shifts. Views multiply, emotions ripple, a conversation begins. But what lies beneath that moment when a video becomes a movement? In this segment, we unravel the hidden dynamics of YouTube’s most explosive content — not just the numbers, but the story behind the surge. This isn’t just virality. It’s digital resonance."
tts = gTTS(text = text, lang = 'en')

tts.save('output.mp3')
print("Saved as output.mp3")

In [None]:
!ls


In [None]:
from IPython.display import Audio
Audio("output.mp3")

In [None]:
import whisper

model = whisper.load_model('base')

result = model.transcribe('output.mp3')

transcript = result['text']
print('Transcript:\n', transcript)

In [None]:
import nltk
nltk.download('brown')
nltk.download('punkt')

In [None]:
from textblob import TextBlob

transcript = """It starts with a click, a spark, a single video, launched into the vast unknown.
Then, something shifts, views multiply, emotions ripple, a conversation begins.
But what lies beneath that moment when a video becomes a movement?
In this segment, we unravel the hidden dynamics of YouTube's most explosive content.
Not just the numbers, but the story behind the search. This isn't just virality. It's digital resonance.
"""
blob = TextBlob(transcript)

print('polarity', blob.sentiment.polarity)
print('subjectivity', blob.sentiment.subjectivity)

In [None]:
import plotly.graph_objects as go

fig = go.Figure(go.Indicator(
    mode = 'gauge+number',
    value = blob.sentiment.polarity,
    title = {'text': 'Tone Polarity', 'font': {'color': 'white'}},
    gauge = {
        'axis': {'range': [-1, 1]},
        'bar': {'color': 'darkred' if blob.sentiment.polarity < 0 else 'green'},
        'bgcolor': 'black',
        'bordercolor': 'white',
        'borderwidth': 2
    }
))



fig.update_layout(
    paper_bgcolor = 'black',
    font = {'color': 'white'}
)
fig.show()

In [None]:
from gtts import gTTS

paragraphs = [ """ “Welcome to our first clip: the Engagement Surge. In this segment, we spotlight the videos that saw views climb like wildfire—driven by shareable moments, community buzz, and perfectly timed uploads. We’ll unpack the metrics behind each spike—when comments flooded in, likes skyrocketed, and watch time hit record highs. Strap in, because this is how YouTube engagement truly ignites.” """

]

for i, text in enumerate(paragraphs, start = 1):
  tts = gTTS(text = text, lang = 'en')
  filename = f"video{i}.mp3"
  tts.save(filename)
  print(f"Saved {filename}")


In [None]:
import whisper

model = whisper.load_model('base')

result = model.transcribe('video1.mp3')

transcript = result['text']
print('Transcript:\n', transcript)

In [None]:
with open("video1_transcript.txt", "w") as f:
    f.write(transcript)


In [None]:
from textblob import TextBlob

blob = TextBlob(transcript)
print("Polarity:", blob.sentiment.polarity)
print("Subjectivity:", blob.sentiment.subjectivity)


In [None]:
import whisper

model = whisper.load_model("base")
result = model.transcribe("video1.mp3")

transcript1 = result['text']
print("Transcript for video1.mp3:\n", transcript1)


In [None]:
with open('video1_transcript.txt', 'w') as f:
  f.write(transcript1)

In [None]:
from textblob import TextBlob

blob1 = TextBlob(transcript1)

print('Polarity', blob1.sentiment.polarity)
print('Subjectivity', blob1.sentiment.subjectivity)


In [None]:
import plotly.graph_objects as go

fig = go.Figure(go.Indicator(
    mode = 'gauge+number',
    value = blob1.sentiment.polarity,
    title = {'text': 'Tone Polarity'},
    gauge={
        'axis': {'range': [-1, 1]},
        'bar': {'color': 'darkred' if blob1.sentiment.polarity < 0 else 'green'}
    }
))
fig.update_layout(paper_bgcolor = "black", font = {'color': "white"})
fig.show()


4. ***The YouTube Playbook: Animated Insights & Hidden Patterns of Content Power 📘***

## 🟠✨ Interactive 3D Bubble Chart with Floating Emojis  

In [None]:
import re
from collections import Counter
import pandas as pd

transcripts = [
    "Wow 😂 this video is amazing! Love it ❤️",
    "Fail 😞 didn't expect that... lol",
    "So funny 😂😂 loved it!",
]
reaction_keywords = ['wow', 'amazing', 'love', 'fail', 'lol', 'funny']

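emoji_pattern = re.compile(
    "["
    u"\U0001F600-\U0001F64F"  # emoticons
    u"\U0001F300-\U0001F5FF"  # symbols and pictographs
    u"\U0001F680-\U0001F6FF"  # transport and map symbols
    u"\U0001F1E0-\U0001F1FF"  # regional indicator (flag) symbols
    "]+",
    flags = re.UNICODE
)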

all_emojis = []
all_reactions = []

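for text in transcripts:
    # Collect emoji matches
    emojis = emoji_pattern.findall(text)
    all_emojis.extend(emojis)

    # Collect reaction keywords, matched as whole lowercase words
    words = re.findall(r'\b\w+\b', text.lower())
    all_reactions.extend(word for word in words if word in reaction_keywords)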

emoji_counts = Counter(all_emojis)
reaction_counts = Counter(all_reactions)

print("Emoji Counts:")
print(emoji_counts)

print("Reaction Keyword Counts:")
print(reaction_counts)

emoji_df = pd.DataFrame(emoji_counts.items(), columns = ['Emoji', 'Count'])
reaction_df = pd.DataFrame(reaction_counts.items(), columns = ['Reaction', 'Count'])

print("Emoji DataFrame:")
print(emoji_df)

print("Reaction DataFrame")
print(reaction_df)
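
Because the notebook's pattern ends in `+`, a run such as 😂😂 comes back as a single match and is counted as its own "emoji". A minimal sketch of counting individual emojis instead follows. It uses the same four code-point ranges without the quantifier. The names `single_emoji` and `count_emojis` and the sample texts are illustrative assumptions.

```python
import re
from collections import Counter

# Same four ranges as the notebook's pattern, but without the trailing "+"
single_emoji = re.compile(
    "["
    "\U0001F600-\U0001F64F"
    "\U0001F300-\U0001F5FF"
    "\U0001F680-\U0001F6FF"
    "\U0001F1E0-\U0001F1FF"
    "]"
)


def count_emojis(texts):
    """Count each emoji character separately across all texts."""
    counts = Counter()
    for text in texts:
        counts.update(single_emoji.findall(text))
    return counts


counts = count_emojis(["So funny 😂😂 loved it!", "Wow 😂 this is amazing"])
print(counts)
```

Characters outside these ranges, such as the heart in ❤️ (U+2764), are not matched by either version of the pattern.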

In [None]:
import plotly.graph_objects as go
import numpy as np
import random

#Emoji list
emojis = ['🔥', '😂', '💡', '❤️', '🎬', '📈', '🤯', '🎉', '📢', '👁️‍🗨️']
n = len(emojis)

#Random 3D positions
x = np.random.rand(n)
y = np.random.rand(n)
z = np.random.rand(n)

#Random sizes
sizes = np.random.randint(25, 50, size = n)

#Colors for each emoji
colors = ['red', 'orange', 'yellow', 'green', 'cyan', 'magenta', 'blue', 'deeppink', 'gold', 'violet']

#Creating the bubble chart
fig = go.Figure()

for i in range(n):
    fig.add_trace(go.Scatter3d(
        x = [x[i]],
        y = [y[i]],
        z = [z[i]],
        mode = 'text',
        text = [emojis[i]],
        textfont = dict(size = sizes[i], color = colors[i]),
        hoverinfo = 'text'
    ))

fig.update_layout(
    scene=dict(
        xaxis = dict(title = dict(text = 'Engagement', font = dict(color = 'white')), tickfont = dict(color = 'white'), showbackground = False),
        yaxis = dict(title = dict(text = 'Tone', font = dict(color = 'white')), tickfont = dict(color = 'white'), showbackground = False),
        zaxis = dict(title = dict(text = 'Virality', font = dict(color = 'white')), tickfont = dict(color = 'white'), showbackground = False),
    ),
    paper_bgcolor = 'black',
    plot_bgcolor = 'black',
    margin = dict(l = 0, r = 0, t = 0, b = 0)
)

fig.show()


## ✨🎬 Engaging Text Animations for Enhanced Storytelling  

In [None]:
from wordcloud import WordCloud
import numpy as np
import plotly.graph_objects as go

texts = [
    """
    It starts with a click, a spark, a single video, launched into the vast unknown.
    Then, something shifts, views multiply, emotions ripple, a conversation begins.
    """,
    """
    From the shadows of the internet, creators emerge with stories untold.
    Their voices rise, narratives intertwine, shaping the culture of the digital age.
    """,
    """
    Engagement explodes, communities form, trends accelerate, and the world watches.
    This is not just content; it is the pulse of a generation connected through screens.
    """
]

#Images of the text
imgs = []
for t in texts:
    wc = WordCloud(width = 400, height = 200, background_color = 'black', colormap = 'viridis').generate(t)
    img = np.array(wc.to_image())
    if img.dtype != 'uint8':
        img = img.astype('uint8')
    imgs.append(img)

fig = go.Figure(
    frames = [go.Frame(data = [go.Image(z = img)]) for img in imgs]
)

#Showing the first image initially
fig.add_trace(go.Image(z = imgs[0]))

fig.update_layout(
    paper_bgcolor = 'black',
    plot_bgcolor = 'black',
    updatemenus = [dict(
        type = "buttons",
        buttons = [
            dict(
                label = "Play",
                method = "animate",
                args = [None, {"frame": {"duration": 1000, "redraw": True}, "fromcurrent": True, "transition": {"duration": 500}}]
            ),
            dict(
                label = "Pause",
                method = "animate",
                args = [[None], {"frame": {"duration": 0, "redraw": False}, "mode": "immediate", "transition": {"duration": 0}}]
            )
        ],
        direction = "left",
        pad = {"r": 10, "t": 10},
        showactive = True,
        x = 0.1,
        y = 0,
        xanchor = "right",
        yanchor = "top",
        bgcolor = 'black',
        bordercolor = 'white',
        borderwidth = 1,
        font = dict(color = 'white')
    )],
    font = dict(color = 'white')
)

fig.show()


## 🌐🔗 NetworkX Graph for Deeper Relationship Insights

In [None]:
import networkx as nx
import matplotlib.pyplot as plt
from collections import Counter
import re

#Sample video texts
video_texts = [
    "It starts with a click, a spark, a single video, launched into the vast unknown. Then, something shifts, views multiply, emotions ripple, a conversation begins.",
    "From the shadows of the internet, creators emerge with stories untold. Their voices rise, narratives intertwine, shaping the culture of the digital age.",
    "Engagement explodes, communities form, trends accelerate, and the world watches. This is not just content; it is the pulse of a generation connected through screens."
]

theme_keywords = ['click', 'spark', 'video', 'views', 'emotions', 'conversation',
                  'shadows', 'creators', 'stories', 'voices', 'narratives',
                  'engagement', 'communities', 'trends', 'content', 'generation']

def extract_themes(text, keywords):
    words = re.findall(r'\b\w+\b', text.lower())
    return [word for word in words if word in keywords]

co_occurrence = Counter()
for text in video_texts:
    themes_in_text = extract_themes(text, theme_keywords)
    for i in range(len(themes_in_text)):
        for j in range(i+1, len(themes_in_text)):
            pair = tuple(sorted([themes_in_text[i], themes_in_text[j]]))
            co_occurrence[pair] += 1

top_edges = co_occurrence.most_common(10)

G = nx.Graph()
for (node1, node2), weight in top_edges:
    G.add_edge(node1, node2, weight=weight)

nodes = set()
for (n1, n2), _ in top_edges:
    nodes.add(n1)
    nodes.add(n2)

G.add_nodes_from(nodes)

fig, ax = plt.subplots(figsize = (12,10))


fig.patch.set_facecolor('black')
ax.set_facecolor('black')

pos = nx.kamada_kawai_layout(G)

node_sizes = [300 + 500*G.degree(n) for n in G.nodes()]
edge_widths = [G[u][v]['weight'] * 3 for u,v in G.edges()]

nx.draw_networkx_nodes(G, pos, ax = ax, node_color = 'deepskyblue', node_size = node_sizes, alpha = 0.8)
nx.draw_networkx_edges(G, pos, ax = ax, width = edge_widths, edge_color='gray', alpha = 0.6)
nx.draw_networkx_labels(G, pos, ax = ax, font_color = 'white', font_size = 14, font_weight = 'bold')

ax.set_title("Top Themes Co-occurrence Network (Dark Background)", fontsize = 18, color = 'white')
ax.axis('off')

plt.show()


## 🎞️🕸️ Animated NetworkX Visualizations to Bring Data to Life

In [None]:
import matplotlib.pyplot as plt
import networkx as nx
import numpy as np
from matplotlib.animation import FuncAnimation
from IPython.display import HTML

#Creating a small theme network graph
G = nx.Graph()
edges = [
    ('Virality', 'Emotion'),
    ('Emotion', 'Engagement'),
    ('Engagement', 'Trend'),
    ('Trend', 'Reach'),
    ('Reach', 'Virality')
]
G.add_edges_from(edges)

#Positioning
pos = nx.spring_layout(G, seed = 42)

#Setting up the figure
plt.style.use('dark_background')  # apply before creating the figure so it takes effect
fig, ax = plt.subplots(figsize=(8, 6))
plt.close()

#Base node size
base_size = 800

def update(frame):
    ax.clear()
    ax.set_axis_off()

    #Create pulsing effect
    pulse = 1 + 0.25 * np.sin(frame / 5)
    node_sizes = [base_size * pulse for _ in G.nodes()]

    #Draw graph
    nx.draw_networkx_edges(G, pos, ax = ax, edge_color = 'white', width = 2)
    nx.draw_networkx_nodes(G, pos, ax = ax, node_color = 'deepskyblue', node_size = node_sizes, alpha = 0.9)
    nx.draw_networkx_labels(G, pos, ax = ax, font_color = 'white', font_size = 12)

    ax.set_title("📈 Animated Pulse Network: Content Themes", fontsize=16, color = 'white')

#Animation
ani = FuncAnimation(fig, update, frames = 60, interval = 100)

#Display in Colab
HTML(ani.to_jshtml())
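
The pulsing effect above scales every node by `1 + 0.25 * sin(frame / 5)`, so sizes stay within 25% of `base_size`. A small sketch of that arithmetic follows. It also shows a variant, offered only as a suggestion, that completes exactly one cycle over the animation so the loop restarts without a jump. The function names are hypothetical.

```python
import math

BASE_SIZE = 800
N_FRAMES = 60


def pulse_size(frame, base_size=BASE_SIZE):
    """Node size used in the notebook's update(): +/-25% around base_size."""
    return base_size * (1 + 0.25 * math.sin(frame / 5))


def looping_pulse_size(frame, n_frames=N_FRAMES, base_size=BASE_SIZE):
    """Variant (an assumption, not the notebook's code): one full sine cycle per loop."""
    return base_size * (1 + 0.25 * math.sin(2 * math.pi * frame / n_frames))


sizes = [pulse_size(f) for f in range(N_FRAMES)]
print(round(min(sizes)), round(max(sizes)))
```

With `frame / 5` one cycle takes about 31.4 frames, so 60 frames end partway through the second cycle. The variant trades that for a seamless repeat.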


## 📈🔮 Intuitive Forecasting Graph for Future Trends  

In [None]:
df['date'] = pd.to_datetime(df['date'])


In [None]:
df_prophet = df.groupby('date')['views'].sum().reset_index()

df_prophet.rename(columns = {'date': 'ds', 'views': 'y'}, inplace = True)

In [None]:
from prophet import Prophet

model = Prophet()
model.fit(df_prophet)

future = model.make_future_dataframe(periods=30)
forecast = model.predict(future)

model.plot(forecast)


5. ***YouTube Analytics 360: Category Wars, Viewer Trends & Visual Stories 🌍***

## 🍩📊 Donut Chart Visualization of Category Share Engagement

In [None]:
import numpy as np

# Placeholder labels: each row gets a random category, so the shares below are illustrative only
categories = ['Music', 'Gaming', 'News', 'Education', 'Entertainment']
df['category'] = np.random.choice(categories, size=len(df))


In [None]:
import plotly.express as px

if 'likes' not in df.columns:
  df['likes'] = df['views']

df['engagement'] = df['views'] + df['likes']
engagement_df = df.groupby('category')[['views', 'likes', 'engagement']].sum().reset_index()

fig = px.pie(engagement_df, values = 'engagement', names = 'category',
             title = 'Category Share of Engagement', hole = 0.5,
             color_discrete_sequence = px.colors.qualitative.Set3)

fig.update_layout(
    paper_bgcolor = 'black',
    plot_bgcolor = 'black',
    font = dict(color = 'white'),
    title_font = dict(size = 20, color = 'white', family = 'Arial')
)

fig.update_traces(textinfo = 'percent+label')
fig.show()

## 🔥📈 Heatmap for Visualizing Data Density  

In [None]:
import pandas as pd

df['date'] = pd.to_datetime(df['date'])

heatmap_data = df.groupby([df['date'].dt.to_period('M').astype(str), 'category'])['views'].sum().reset_index()
heatmap_pivot = heatmap_data.pivot(index = 'category', columns = 'date', values = 'views').fillna(0)

import plotly.graph_objects as go

fig = go.Figure(data = go.Heatmap(
    z = heatmap_pivot.values,
    x = heatmap_pivot.columns,
    y = heatmap_pivot.index,
    colorscale = 'YlOrRd'

))

fig.update_layout(
    title = 'Category Popularity Over Time (Views)',
    xaxis_title = 'Month',
    yaxis_title = 'Category',
    paper_bgcolor = 'black',
    plot_bgcolor = 'black',
    font = dict(color = 'white'),  #keeps axis labels readable on the black background
    title_font = dict(size = 20, color = 'white', family = 'Arial')
)
fig.show()
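
The heatmap needs a grid with one row per category and one column per month. This sketch builds the same shape on toy data, using `unstack` as an equivalent route to the `pivot` call above (the `toy` frame and its values are made up):

```python
import pandas as pd

# Hypothetical toy data: two Music videos in January, one Gaming video in February.
toy = pd.DataFrame({
    "date": pd.to_datetime(["2018-01-05", "2018-01-20", "2018-02-03"]),
    "category": ["Music", "Music", "Gaming"],
    "views": [100, 150, 80],
})
toy["month"] = toy["date"].dt.to_period("M").astype(str)

# Sum views per (month, category), then spread months across the columns.
grid = (
    toy.groupby(["month", "category"])["views"].sum()
    .unstack("month", fill_value=0)
)
print(grid)
```

Cells with no videos come out as 0 rather than NaN, so the colour scale treats them as empty months.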

## 🎞️✨ Dynamic Animated Visuals to Captivate  

In [None]:
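import plotly.express as px

#Long-format frame for the animation: one row per (category, metric)
#Assumption: df_anim is not built in an earlier cell, so it is derived here from engagement_df
df_anim = engagement_df.melt(
    id_vars = 'category',
    value_vars = ['views', 'likes', 'engagement'],
    var_name = 'metric',
    value_name = 'value'
)
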
fig = px.bar(
    df_anim,
    x = 'category',
    y = 'value',
    color = 'category',
    animation_frame = 'metric',
    range_y = [0, df_anim['value'].max() * 1.1],
    title = 'Category Engagement: Views vs Likes vs Total Engagement',
    color_discrete_sequence=px.colors.qualitative.Bold
)

fig.update_layout(
    plot_bgcolor = 'black',
    paper_bgcolor = 'black',
    font_color = 'white',
    xaxis_title = 'Category',
    yaxis_title = 'Count'
)

fig.update_traces(texttemplate = '%{y:.0f}', textposition = 'outside')
fig.show()


## 🎯📊 Key Performance Indicators (KPIs) at a Glance  

In [None]:
df['engagement'] = df['views'] + df['likes']

agg = df.groupby('category').agg(
    total_views = ('views', 'sum'),
    total_likes = ('likes', 'sum'),
    total_engagement = ('engagement', 'sum'),
    video_count = ('category', 'count')
).reset_index()

top_views = agg.loc[agg['total_views'].idxmax()]
top_likes = agg.loc[agg['total_likes'].idxmax()]
top_engagement = agg.loc[agg['total_engagement'].idxmax()]

avg_views_top_cat = df[df['category'] == top_views['category']]['views'].mean()

print(f" Top category by Views: {top_views['category']} with {top_views['total_views']:,} views")
print(f" Top category by Likes: {top_likes['category']} with {top_likes['total_likes']:,} likes")
print(f" Top category by Engagement: {top_engagement['category']} with {top_engagement['total_engagement']:,} total engagement")
print(f" Average Views per Video in '{top_views['category']}': {avg_views_top_cat:,.0f}")
print(f" Total Videos in '{top_views['category']}': {top_views['video_count']}")
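
Raw totals with thousands separators can get long on a dashboard. One option is a compact suffix format; the helper below is a hypothetical addition, not part of the original notebook:

```python
def abbreviate(n):
    """Format a count with a K/M/B suffix, falling back to a plain number."""
    for limit, suffix in ((1e9, "B"), (1e6, "M"), (1e3, "K")):
        if abs(n) >= limit:
            return f"{n / limit:.1f}{suffix}"
    return f"{n:,.0f}"

for sample in (950, 1500, 1234567, 2_500_000_000):
    print(sample, "->", abbreviate(sample))
```

It could wrap any of the totals printed above, for example `abbreviate(top_views['total_views'])`.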

In [None]:
import pandas as pd

from google.colab import files
uploaded = files.upload()

df = pd.read_csv('USvideos.csv')

#Testing it
print("🔥 DATAFRAME LOADED 🔥\n")
print(df.head())
print("\n📦 COLUMNS:\n", df.columns)


In [None]:
from IPython.display import display, Markdown

#Calculate KPIs
total_views = df['views'].sum()
total_likes = df['likes'].sum()
avg_views = df['views'].mean()
top_channel = df.groupby('channel_title')['views'].sum().idxmax()

#Displaying KPIs as markdown cards
display(Markdown(f"""
<style>
.kpi {{
  background-color: #121212;
  color: white;
  border-radius: 10px;
  padding: 15px;
  margin: 10px;
  font-family: Arial, sans-serif;
  width: 250px;
  display: inline-block;
  text-align: center;
  box-shadow: 0 0 10px #00ccff;
}}
.kpi h2 {{
  margin: 0;
  font-size: 30px;
}}
.kpi p {{
  margin: 5px 0 0;
  font-size: 14px;
  color: #66ccff;
}}
</style>

<div class="kpi">
  <h2>{total_views:,}</h2>
  <p>Total Views</p>
</div>

<div class="kpi">
  <h2>{total_likes:,}</h2>
  <p>Total Likes</p>
</div>

<div class="kpi">
  <h2>{avg_views:,.0f}</h2>
  <p>Average Views per Video</p>
</div>

<div class="kpi">
  <h2>{top_channel}</h2>
  <p>Top Channel by Views</p>
</div>
"""))


6. ***YouTube’s Untold Story: Visual Explorations of What Makes Content Win 📽️***

## 📈💖 Quarterly Emotional Trend (Line Chart)  

In [None]:
import pandas as pd
from collections import Counter
import emoji

def extract_emojis(text):
  return [char for char in text if char in emoji.EMOJI_DATA]

emoji_emotion_map = {
    '😂': 'Joy', '😢': 'Sadness', '😭': 'Sadness', '😍': 'Love', '😡': 'Anger',
    '🔥': 'Excitement', '💀': 'Shock', '🤔': 'Thinking', '😱': 'Surprise',
    '😎': 'Cool', '👍': 'Approval', '👎': 'Disapproval', '🎉': 'Celebration'
}

emotion_over_time = []

for index, row in df.iterrows():
    combined_text = str(row['title']) + " " + str(row['tags'])
    emojis_found = extract_emojis(combined_text)
    mapped_emotions = [emoji_emotion_map[e] for e in emojis_found if e in emoji_emotion_map]

    for emotion in mapped_emotions:
        emotion_over_time.append({
            'trending_date': row['trending_date'],
            'emotion': emotion
        })


emotion_df = pd.DataFrame(emotion_over_time)
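#USvideos.csv stores trending_date as yy.dd.mm (e.g. 17.14.11), so the format is given explicitly
emotion_df['trending_date'] = pd.to_datetime(emotion_df['trending_date'], format = '%y.%d.%m', errors = 'coerce')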
emotion_df.dropna(inplace=True)
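
The core of the cell above is a character-by-character lookup followed by a count. A stripped-down sketch of that idea using only the standard library (the sample titles and the three-entry mapping are illustrative assumptions):

```python
from collections import Counter

# Subset of the notebook's emoji-to-emotion mapping.
EMOJI_TO_EMOTION = {"😂": "Joy", "😭": "Sadness", "🔥": "Excitement"}

def emotions_in(text):
    """Return the mapped emotion for every known emoji character in text."""
    return [EMOJI_TO_EMOTION[ch] for ch in text if ch in EMOJI_TO_EMOTION]

# Hypothetical titles, one of them with a repeated emoji.
titles = ["Funniest fails 😂😂", "Hot new single 🔥", "No emoji here"]
counts = Counter(e for t in titles for e in emotions_in(t))
print(counts)
```

A per-character scan like this only matches single-code-point keys, so emoji made of several code points (flags, skin-tone variants) would need a different approach.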


In [None]:
emotion_df.set_index('trending_date', inplace=True)


emotion_monthly = (
    emotion_df.groupby('emotion')
    .resample('ME')  #month-end alias, consistent with the 'QE' alias used for quarters below (pandas 2.2+)
    .size()
    .reset_index(name='count')
)

In [None]:
print(emotion_monthly.groupby('trending_date')['count'].sum().sort_values(ascending=False).head(10))


In [None]:
emotion_quarterly = (
    emotion_df.groupby('emotion')
    .resample('QE')
    .size()
    .reset_index(name='count')
)

import plotly.express as px

fig = px.line(
    emotion_quarterly,
    x = 'trending_date',
    y = 'count',
    color = 'emotion',
    title = "📆 Quarterly Emotional Trends in YouTube Titles & Tags",
    markers = True,
    template = "plotly_dark"
)

fig.show()
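
Quarterly bucketing depends on the dates parsing correctly. The sketch below assumes `trending_date` is stored as `yy.dd.mm` (the layout used in the Kaggle export of `USvideos.csv`) and shows how one value maps to a quarter; both helper names are hypothetical:

```python
from datetime import datetime

def parse_trending_date(raw):
    """Parse a 'yy.dd.mm' string (assumed layout) into a datetime."""
    return datetime.strptime(raw, "%y.%d.%m")

def quarter_label(dt):
    """Format a datetime as 'YYYY-Qn'."""
    return f"{dt.year}-Q{(dt.month - 1) // 3 + 1}"

d = parse_trending_date("17.14.11")
print(d.date(), quarter_label(d))
```

If the dates in your copy of the file use a different layout, the format string is the only part that should need changing.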


## 🎞️📊 Animated Quarterly Emotional Trend  

In [None]:
emoji_map = {
    'Joy': '😊',
    'Love': '❤️',
    'Surprise': '😲',
    'Anger': '😠',
    'Sadness': '😢',
    'Cool': '😎',
    'Excitement': '😃',
    'Shock': '😱',
    'Thinking': '🤔',
    'Approval': '👍',
    'Disapproval': '👎',
    'Celebration': '🎉'
}



In [None]:
emotion_quarterly['emoji'] = emotion_quarterly['emotion'].map(emoji_map)


In [None]:
import plotly.express as px

fig = px.scatter(
    emotion_quarterly,
    x = 'trending_date',
    y = 'count',
    animation_frame = emotion_quarterly['trending_date'].dt.to_period('Q').astype(str),  #e.g. '2018Q1'; datetime strftime has no quarter directive
    color = 'emotion',
    text = 'emoji',
    size = 'count',
    title = 'Animated Quarterly Emotional Trends with Emojis',
    template = 'plotly_dark'
)


fig.update_traces(
    mode = 'markers+text',
    textposition = 'middle center',
    marker = dict(size = 20)

)

fig.update_layout(
    xaxis_title = 'Date',
    yaxis_title = 'Count',
    legend_title = 'Emotion',
    font = dict(size = 14)
)

fig.show()

## 🔄🌊 Sankey Diagram for Flow Analysis

In [None]:
import plotly.graph_objects as go

labels = ['Music', 'Gaming', 'Education', 'Likes', 'Comments']

source = [0, 0, 1, 1, 2, 2]
target = [3, 4, 3, 4, 3, 4]
values = [3000, 1200, 2000, 800, 1500, 700]

fig_sankey = go.Figure(data=[go.Sankey(
    node = dict(
        pad = 15,
        thickness = 20,
        line = dict(color = 'black', width = 0.5),
        label = labels,
        color = ['#636EFA', '#EF553B', '#00CC96', '#AB63FA', '#FFA15A']
    ),
    link = dict(
        source = source,
        target = target,
        value = values
    ))])

fig_sankey.update_layout(
    title_text = "YouTube Viewer Flow and Engagement Paths",
    font_size = 12,
    template = 'plotly_dark'
)
fig_sankey.show()
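
The `source`, `target`, and `values` lists above are hard-coded indices into `labels`. As a sketch, the same three lists can be derived from a small table of totals, which scales better once real numbers replace the samples (the `totals` dict reuses the sample figures from the cell above):

```python
# Hypothetical per-category totals (same sample numbers as the Sankey cell).
totals = {
    "Music": {"Likes": 3000, "Comments": 1200},
    "Gaming": {"Likes": 2000, "Comments": 800},
    "Education": {"Likes": 1500, "Comments": 700},
}

categories = list(totals)
metrics = ["Likes", "Comments"]
labels = categories + metrics
index_of = {name: i for i, name in enumerate(labels)}

# One link per (category, metric) pair, addressed by node index.
source, target, values = [], [], []
for cat in categories:
    for metric in metrics:
        source.append(index_of[cat])
        target.append(index_of[metric])
        values.append(totals[cat][metric])

print(labels)
print(source, target, values)
```

The resulting lists can be passed straight to `go.Sankey(link=dict(source=..., target=..., value=...))`.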


## 🎼🔗 Chord Diagram to Show Connection

In [None]:
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np

#Sample data (note: the chart drawn below is a polar bar chart standing in for a chord diagram)
data = {
    'Category': ['Music', 'Gaming', 'Education'],
    'Likes': [3000, 2000, 1500],
    'Comments': [1200, 800, 700]
}

df = pd.DataFrame(data)

num_vars = len(df)
angles = np.linspace(0, 2 * np.pi, num_vars, endpoint=False).tolist()
angles += angles[:1]

#Values
likes = df['Likes'].tolist()
likes += likes[:1]
comments = df['Comments'].tolist()
comments += comments[:1]

#Labels
categories = df['Category'].tolist()
categories += categories[:1]

#Setup plot
fig, ax = plt.subplots(figsize = (6, 6), subplot_kw = dict(polar = True))
fig.patch.set_facecolor('black')
ax.set_facecolor('#111111')

#Plotting bars
bars_likes = ax.bar(angles[:-1], likes[:-1], width = 0.3, color = '#636EFA', label = 'Likes', alpha = 0.8)
bars_comments = ax.bar([a + 0.3 for a in angles[:-1]], comments[:-1], width = 0.3, color = '#EF553B', label = 'Comments', alpha = 0.8)

#Setting category labels at angles
ax.set_xticks(angles[:-1])
ax.set_xticklabels(categories[:-1], color='white', fontsize=12)

#Rotating each label outward
for label, angle in zip(ax.get_xticklabels(), angles[:-1]):
    angle_deg = np.degrees(angle)
    if angle_deg >= 90 and angle_deg <= 270:
        label.set_rotation(angle_deg + 180)
        label.set_horizontalalignment('right')
    else:
        label.set_rotation(angle_deg)
        label.set_horizontalalignment('left')

#Removing radial labels
ax.set_yticklabels([])

#Style
ax.grid(color = 'gray', linestyle = 'dotted', linewidth = 0.7)
ax.spines['polar'].set_color('white')
ax.set_title('Circular Engagement Chart', color = 'white', fontsize = 14, pad = 20)
ax.legend(loc = 'upper right', bbox_to_anchor = (1.2, 1.1), facecolor = 'black', edgecolor = 'white', labelcolor = 'white')

plt.tight_layout()
plt.show()


## ✨🌟 Highlight Specials  

In [None]:
import pandas as pd

highlight_data = {
    'Title': ['Best Prank Ever', 'Emotional Reunion', 'Epic Fail Compilation'],
    'Views': [12000000, 9400000, 8800000],
    'Likes': [850000, 790000, 770000],
    'Emotion': ['😂 Joy', '😭 Sadness', '🤯 Surprise'],
    'Trending Date': ['2019-06-14', '2020-01-10', '2021-08-05'],
    'Thumbnail': ['thumbs/prank.jpg', 'thumbs/reunion.jpg', 'thumbs/fail.jpg']
}

df_highlights = pd.DataFrame(highlight_data)
df_highlights

In [None]:
from IPython.display import display, HTML

df_highlights['Thumbnail'] = [
  'https://images.unsplash.com/photo-1503023345310-bd7c1de61c7d?auto=format&fit=crop&w=400&q=80',
  'https://images.unsplash.com/photo-1494790108377-be9c29b29330?auto=format&fit=crop&w=400&q=80',
  'https://images.unsplash.com/photo-1511367461989-f85a21fda167?auto=format&fit=crop&w=400&q=80'
]

html_cards = ""

for i, row in df_highlights.iterrows():
  html_cards += f"""
    <div style='
        display: inline-block;
        background-color: #1e1e1e;
        color: white;
        border-radius: 15px;
        margin: 10px;
        padding: 10px;
        width: 250px;
        font-family: Arial;
        box-shadow: 0 0 15px #00ccff;
        text-align: center;
    '>
        <img src='{row["Thumbnail"]}' style='width: 100%; border-radius: 10px;'/>
        <h3 style = 'margin: 10px 0 5px'>{row['Title']}</h3>
        <p style='color: #66ccff; margin: 5px 0;'>{row["Trending Date"]}</p>
        <p style='margin: 5px 0;'>👁️ {row["Views"]:,} | ❤️ {row["Likes"]:,}</p>
        <p style='font-size: 18px;'>{row["Emotion"]}</p>
    </div>
    """

display(HTML(f"<div style='display: flex; flex-wrap: wrap;'>{html_cards}</div>"))




In [None]:
from IPython.display import display, HTML

#Sample data
import pandas as pd
df_highlights = pd.DataFrame({
    'Title': ['Video 1', 'Video 2', 'Video 3'],
    'Emotion': ['Joy', 'Love', 'Surprise'],
    'Trending Date': ['2025-06-01', '2025-06-02', '2025-06-03'],
    'Views': [10000, 20000, 15000],
    'Likes': [500, 800, 600],
    'Thumbnail': [
        'https://images.unsplash.com/photo-1508214751196-bcfd4ca60f91?auto=format&fit=crop&w=500&q=80',
        'https://images.unsplash.com/photo-1524504388940-b1c1722653e1?auto=format&fit=crop&w=500&q=80',
        'https://images.unsplash.com/photo-1544005313-94ddf0286df2?auto=format&fit=crop&w=500&q=80'
    ]
})

def get_emotion_color(emotion):
    color_map = {
        'Joy': '#FFD700',
        'Love': '#FF69B4',
        'Surprise': '#00FFFF',
        'Anger': '#FF4500',
        'Sadness': '#1E90FF',
        'Fear': '#8A2BE2'
    }
    return color_map.get(emotion, '#CCCCCC')

html_cards = """
<style>
.card-container {
    display: flex;
    gap: 20px;
    flex-wrap: wrap;
}
.card {
    background: #222;
    color: white;
    border-radius: 10px;
    padding: 10px;
    width: 280px;
    box-shadow: 0 0 10px #555;
    position: relative;
    transition: transform 0.3s ease;
}
.card:hover {
    transform: scale(1.05);
    box-shadow: 0 0 20px #fff;
}
.thumb {
    width: 100%;
    height: 160px;
    object-fit: cover;
    border-radius: 10px 10px 0 0;
}
.tooltip {
    position: absolute;
    top: 10px;
    right: 10px;
    background: rgba(255,255,255,0.2);
    padding: 4px 8px;
    font-size: 12px;
    border-radius: 5px;
    color: #eee;
}
</style>

<div class="card-container">
"""

for _, row in df_highlights.iterrows():
    glow = get_emotion_color(row['Emotion'])
    img_url = row.get('Thumbnail', 'https://img.youtube.com/vi/dQw4w9WgXcQ/hqdefault.jpg')
    html_cards += f"""
    <div class='card' style='box-shadow: 0 0 15px {glow};'>
        <div class='tooltip'>Trending on {row['Trending Date']}</div>
        <img src='{img_url}' class='thumb' />
        <h3>{row['Title']}</h3>
        <p style='color: {glow}; font-weight: bold;'>🎭 {row['Emotion']}</p>
        <p>👁️ {row["Views"]:,} | ❤️ {row["Likes"]:,}</p>
    </div>
    """

html_cards += "</div>"

display(HTML(html_cards))
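
The sample titles here are safe, but real video titles can contain characters such as `<` or `&` that would be read as markup once dropped into an HTML string. A minimal sketch of escaping text fields first, with a hypothetical `card_html` helper and a simplified card layout:

```python
from html import escape

def card_html(title, emotion, views, likes):
    """Build one card, escaping text fields so markup in a title is shown literally."""
    return (
        "<div class='card'>"
        f"<h3>{escape(title)}</h3>"
        f"<p>{escape(emotion)}</p>"
        f"<p>{views:,} views | {likes:,} likes</p>"
        "</div>"
    )

# Hypothetical title containing characters that HTML treats specially.
print(card_html("Tom & Jerry <Official>", "Joy", 10000, 500))
```

The same `escape` call could be applied to `row['Title']` and `row['Emotion']` inside the loop above.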


7. ***YouTube’s Untold Story: Visual Explorations of What Makes Content Win (GIFs and Short Video Clips to Spice Up Visuals)***


## 📝✨ Informative Annotations for Clarity  

In [None]:
import plotly.express as px
import pandas as pd

df = pd.DataFrame({
    'x': [1, 2, 3, 4, 5],
    'y': [10, 11, 12, 13, 14],
    'label': ['A', 'B', 'C', 'D', 'E'],
    'info': ['Info A', 'Info B', 'Info C', 'Info D', 'Info E']
})

fig = px.scatter(df, x = 'x', y = 'y', text = 'label', hover_name = 'label', hover_data = ['info'])

#Dark mode background
fig.update_layout(
    plot_bgcolor = 'black',  #plot area
    paper_bgcolor = 'black',
    font=dict(color = 'white'),  #default text color
)

#Annotations styled for dark bg
annotations = [
    dict(
        x = 3, y = 12,
        xref = 'x', yref = 'y',
        text = "✨ Key Point C: This spike is interesting!",
        showarrow = True,
        arrowhead = 3,
        arrowsize = 2,
        arrowwidth = 2,
        arrowcolor = 'cyan',
        bgcolor = '#222222',  #Dark grey bg
        bordercolor = 'cyan',
        borderwidth = 2,
        borderpad = 4,
        font = dict(size = 14, color = 'cyan'),
        ax = 0,
        ay = -60
    ),
    dict(
        x = 5, y = 14,
        xref = 'x', yref = 'y',
        text = "🔥 Peak at E",
        showarrow = True,
        arrowhead = 2,
        arrowsize = 1.5,
        arrowwidth = 2,
        arrowcolor = 'magenta',
        bgcolor = '#222222',
        bordercolor = 'magenta',
        borderwidth = 1,
        borderpad = 3,
        font = dict(size=12, color = 'magenta'),
        ax = 20,
        ay = -40
    )
]

fig.update_layout(annotations=annotations)
fig.show()


## 🎨🔥 Stylish Highlights to Grab Attention  

In [None]:
from IPython.display import display, HTML
import pandas as pd

df_highlights = pd.DataFrame({
    'Title': ['Video 1', 'Video 2', 'Video 3'],
    'Emotion': ['Joy', 'Love', 'Surprise'],
    'Trending Date': ['2025-06-01', '2025-06-02', '2025-06-03'],
    'Views': [10000, 20000, 15000],
    'Likes': [500, 800, 600],
    'Thumbnail': [
        'https://upload.wikimedia.org/wikipedia/commons/thumb/4/47/PNG_transparency_demonstration_1.png/300px-PNG_transparency_demonstration_1.png',
        'https://upload.wikimedia.org/wikipedia/commons/thumb/4/47/PNG_transparency_demonstration_1.png/300px-PNG_transparency_demonstration_1.png',
        'https://upload.wikimedia.org/wikipedia/commons/thumb/5/56/Donald_Trump_official_portrait.jpg/300px-Donald_Trump_official_portrait.jpg'
    ]
})

html_cards = ""

for _, row in df_highlights.iterrows():
    html_cards += f"""
    <div style='
        display: inline-block;
        background-color: #1e1e1e;
        color: white;
        border-radius: 15px;
        margin: 10px;
        padding: 10px;
        width: 250px;
        font-family: Arial;
        box-shadow: 0 0 15px #00ccff;
        text-align: center;
    '>
        <img src="{row['Thumbnail']}" style='width: 100%; border-radius: 10px;'/>
        <h3 style='margin: 10px 0 5px'>{row['Title']}</h3>
        <p style='color: #66ccff; margin: 5px 0;'>{row['Trending Date']}</p>
        <p style='margin: 5px 0;'>👁️ {row['Views']:,} | ❤️ {row['Likes']:,}</p>
        <p style='font-size: 18px;'>{row['Emotion']}</p>
    </div>
    """

display(HTML(f"<div style='display: flex; flex-wrap: wrap;'>{html_cards}</div>"))


## 🤖🖼️ AI-Generated Image Labelling Magic

In [None]:
import base64

image_path = "viral_pulse.png"

#Reading the image and encoding it
with open(image_path, "rb") as f:
    image_bytes = f.read()
encoded_image = base64.b64encode(image_bytes).decode()

#Preparing the HTML with embedded image (base64)
html = f"""
<div style="
    text-align: center;
    margin: 30px auto;
    background: linear-gradient(135deg, #1f0036, #2c003e);
    padding: 25px;
    border-radius: 20px;
    box-shadow: 0 0 25px #ff00cc44;
    max-width: 700px;
">
  <img src="data:image/png;base64,{encoded_image}" style="
    max-width: 90%;
    border-radius: 12px;
    box-shadow: 0 0 20px #00ccff;
  "/>
  <p style="
    font-family: 'Trebuchet MS', sans-serif;
    font-size: 20px;
    color: #66ccff;
    margin-top: 15px;
  ">🚀 <b>Chapter Title:</b> YouTube’s Viral Pulse</p>
  <p style="
    color: #bbbbbb;
    font-size: 16px;
    margin: 5px;
  ">✨ A Data-Driven Exploration of Content Power, Performance & Pulse.</p>
</div>
"""

from IPython.display import display, HTML
display(HTML(html))
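
The embedding trick above is a data URI: the image bytes are base64-encoded and placed directly in the `src` attribute. This sketch wraps that step in a hypothetical `to_data_uri` helper that also guesses the MIME type, so the same code would work for a JPEG or GIF:

```python
import base64
import mimetypes

def to_data_uri(data, filename):
    """Encode raw bytes as a data URI, guessing the MIME type from the file name."""
    mime, _ = mimetypes.guess_type(filename)
    if mime is None:
        mime = "application/octet-stream"  # fallback when the extension is unknown
    payload = base64.b64encode(data).decode("ascii")
    return f"data:{mime};base64,{payload}"

# The 8-byte PNG signature stands in for a real image file here.
uri = to_data_uri(b"\x89PNG\r\n\x1a\n", "viral_pulse.png")
print(uri)
```

Keep in mind that base64 inflates the payload by roughly a third, so large images make the notebook file noticeably heavier.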


## 🎥🌫️ Artistic Mini-Documentary with Smooth Fades  

In [None]:
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.backends.backend_agg import FigureCanvasAgg as FigureCanvas
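#moviepy 1.x import path; moviepy 2.x dropped moviepy.editor (use `from moviepy import VideoClip, AudioFileClip`) and renamed set_audio to with_audio
from moviepy.editor import VideoClip, AudioFileClip

#Setting up the animation duration and audio path
animation_duration = 5  #Seconds
audio_path = "narration.mp3"  #Narration file; must already exist in the working directory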

#Defining the animated frame function
def make_frame(t):
    fig, ax = plt.subplots(figsize=(6, 4), dpi=100)

    # --- Fade-in and fade-out setup ---
    fade_duration = 1.0  #Seconds
    if t < fade_duration:
        fade_alpha = t / fade_duration
    elif t > animation_duration - fade_duration:
        fade_alpha = (animation_duration - t) / fade_duration
    else:
        fade_alpha = 1.0

    # --- Animation growth ---
    progress = min(t / animation_duration, 1)
    value = progress * 100
    pulse = 0.2 + 0.1 * np.sin(2 * np.pi * t)

    # --- Glow Bar ---
    ax.barh([''], [value + pulse], color = (1, 0, 0.8, 0.12 * fade_alpha), height = 0.8, zorder = 0)
    ax.barh([''], [value], color = (1, 0, 0.8, fade_alpha), height = 0.6, zorder = 1)

    # --- Label ---
    ax.text(value + 2, 0, f"{value:.1f}%", va = 'center', ha = 'left',
            fontsize = 14 + 3 * (1 - fade_alpha),  #Shrink text on fade
            color = (1, 1, 1, fade_alpha), fontweight = 'bold')

    # --- Title ---
    ax.set_title("Viral Views Over Time", fontsize = 16, color = (1, 1, 1, fade_alpha), pad = 20)

    # --- Axes cleanup ---
    ax.set_xlim(0, 110)
    ax.set_ylim(-0.5, 0.5)
    ax.set_yticks([])
    ax.set_xticks([])
    ax.set_facecolor((0.1, 0, 0.2, fade_alpha))     #Fade background
    fig.patch.set_facecolor((0.1, 0, 0.2, fade_alpha))

    for spine in ax.spines.values():
        spine.set_visible(False)

    # --- Render as image ---
    canvas = FigureCanvas(fig)
    canvas.draw()
    buf, (w, h) = canvas.print_to_buffer()
    image = np.frombuffer(buf, dtype=np.uint8).reshape((h, w, 4))
    plt.close(fig)
    return image[:, :, :3]  #Keep RGB, drop the alpha channel

#Generating the video with audio
animation = VideoClip(make_frame, duration=animation_duration)
animation = animation.set_audio(AudioFileClip(audio_path))
animation.write_videofile("mini_doc.mp4", fps=24)
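
The fade logic inside `make_frame` is easier to check when pulled out as a small pure function. The sketch below mirrors it; the `fade_alpha` name and the clamping to non-negative values are additions for illustration:

```python
def fade_alpha(t, duration=5.0, fade=1.0):
    """Opacity in [0, 1]: ramps up over the first `fade` seconds, down over the last."""
    if t < fade:
        return max(t, 0.0) / fade
    if t > duration - fade:
        return max(duration - t, 0.0) / fade
    return 1.0

# Sample the curve at the start, mid fade-in, plateau, mid fade-out, and end.
for t in (0.0, 0.5, 2.5, 4.5, 5.0):
    print(t, fade_alpha(t))
```

Testing the curve this way is much quicker than rendering the whole clip to see whether the fades line up.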


In [None]:
from IPython.display import Video
Video("mini_doc.mp4", embed=True)


***A Little Bit of Summary since this project is officially over!***

Project Wrap-Up: Streaming Stories, a YouTube & Netflix Deep Dive

Wow, what a journey! This project took us through the fascinating world of YouTube and Netflix data: exploring what makes videos go viral, which shows dominate across countries, and how streaming trends change over time.

What we uncovered:

For YouTube, we dug into the most popular categories, tracked views and engagement, and even created a narrated mini-documentary to bring the data to life.

On the Netflix side, we analyzed release patterns, country-wise trends, and used cool animations and charts to tell the story visually.

The goal? To turn raw numbers into stories that anyone can enjoy and understand — blending data with a little bit of art and creativity.

Thanks so much for sticking with me through this project! 🙏 I hope you found it insightful and maybe even a little fun. If you have questions, ideas, or just want to chat about data, I’m always here.

Here’s to many more data adventures ahead! 🚀✨