# **Exploratory Data Analysis**: FlipTop

## Question
----

The central question of this project revolves around FlipTop battle rap careers and their length.

> How long does a career in the Philippines' premier rap battle league FlipTop last?

Who rises to the top? What determines who stays relevant over the years?

Here I'm thinking survival (time-to-event) analysis.
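
To make the time-to-event idea concrete, here is a minimal Kaplan-Meier sketch in plain Python. The career lengths and the "career over" flags are made-up placeholders, not FlipTop data, and how to define the end of a career (the event) is still an open assumption at this point.

```python
from collections import Counter

def kaplan_meier(durations, observed):
    """Return [(time, survival_probability)] at each time where an event occurred.

    durations: career length per emcee (e.g. years between first and last battle)
    observed:  True if the career is considered over, False if still active (censored)
    """
    events = Counter(t for t, seen in zip(durations, observed) if seen)
    exits = Counter(durations)  # everyone leaves the risk set at their own duration
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = events.get(t, 0)
        if d:
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]
    return curve

# Hypothetical career lengths in years (placeholders, not real data)
durations = [1, 2, 2, 3, 5, 5, 8]
observed = [True, True, False, True, True, False, False]

for t, s in kaplan_meier(durations, observed):
    print(f"S({t}) = {s:.3f}")
```

Emcees who are still active are treated as censored: they count toward the at-risk group up to their current career length but never trigger a drop in the curve.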

Another question I'm curious about is this:
> Who has battled who? And what matchups haven't been done yet?

I'm picturing a graph here. I want to understand how dense or sparse the network of rap battles is in FlipTop.

Ideas:
- Emcees are nodes
- "Battled" as edges
- Weight of the edge is how many times they've battled?
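
A rough sketch of these ideas using only the standard library. The emcee names and matchups are placeholders, not real battle cards, and treating the graph as undirected (A vs B is the same edge as B vs A) is my assumption.

```python
from collections import Counter
from itertools import combinations

# Hypothetical battle cards (placeholder names, not real matchups)
battles = [
    ("Emcee A", "Emcee B"),
    ("Emcee B", "Emcee C"),
    ("Emcee B", "Emcee A"),  # rematch, so this edge gets weight 2
    ("Emcee C", "Emcee D"),
]

# Undirected weighted edges: a sorted pair is the same edge in either order
edge_weights = Counter(tuple(sorted(pair)) for pair in battles)

emcees = sorted({name for pair in battles for name in pair})
# Pairs come out in sorted order because emcees is sorted
possible_pairs = list(combinations(emcees, 2))

# Density = distinct matchups that happened / all possible matchups
density = len(edge_weights) / len(possible_pairs)

# Matchups that have not happened yet
unmatched = [pair for pair in possible_pairs if pair not in edge_weights]

print(edge_weights)
print(f"density = {density:.2f}")
print(unmatched)
```

A density near 0 would mean most possible matchups have never happened; a density near 1 would mean almost everyone has battled everyone.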

## Data Wrangling
----

Data wrangled using [YouTube API](https://developers.google.com/youtube/v3).

In [1]:
import sys
import json
import pandas as pd
import numpy as np
import isodate
import re

Note: This data was scraped July 2, 2025.

In [2]:
# Load JSON data
with open("../data/videos.json", "r", encoding="utf-8") as f:
    data = json.load(f)

# Convert to DataFrame
df = pd.DataFrame(data)

# Preview
df.head()

Unnamed: 0,id,title,description,upload_date,view_count,duration,url,likeCount,commentCount,tags
0,NCElL6MbZ_g,FlipTop - Caspher vs CRhyme,FlipTop presents: Zoning 18 @ The TakeOver Lou...,2025-07-02T13:23:54Z,139884,PT32M32S,https://www.youtube.com/watch?v=NCElL6MbZ_g,3117,613,"[fliptop, flip top, flip top battles, fliptopb..."
1,JabvhPBmoVs,FlipTop CripLi - Isabuhay 2025 | Abangan si Cr...,"FlipTop presents: Gubat 15 @ Mariner's Court, ...",2025-06-25T13:49:25Z,27498,PT1M51S,https://www.youtube.com/watch?v=JabvhPBmoVs,640,31,"[fliptop, flip top, flip top battles, fliptopb..."
2,YiJI_ohq4Pc,FlipTop Ban - Isabuhay 2025 | Abangan si Ban n...,"FlipTop presents: Gubat 15 @ Mariner's Court, ...",2025-06-25T13:49:25Z,29353,PT1M24S,https://www.youtube.com/watch?v=YiJI_ohq4Pc,763,34,"[fliptop, flip top, flip top battles, fliptopb..."
3,yRE3PU0ekaA,FlipTop J-Blaque - Gods Pa Rin? | | Abangan si...,"FlipTop presents: Gubat 15 @ Mariner's Court, ...",2025-06-25T13:49:25Z,17415,PT1M56S,https://www.youtube.com/watch?v=yRE3PU0ekaA,360,8,"[fliptop, flip top, flip top battles, fliptopb..."
4,Ftk2DZ3hcxw,FlipTop Vitrum - Ang HipHop Pinalakas ng mga T...,"FlipTop presents: Gubat 15 @ Mariner's Court, ...",2025-06-25T13:48:58Z,28969,PT3M15S,https://www.youtube.com/watch?v=Ftk2DZ3hcxw,449,50,"[fliptop, flip top, flip top battles, fliptopb..."


In [3]:
# No need to run this again (already ran once and wrote to file)
sys.path.append("../scripts")

from emcee_scraper import scrape_names, write_names_to_csv
emcees = scrape_names()
write_names_to_csv(emcees, "../data/emcees.csv")

Scraped 20 names from page 1.
Scraped 20 names from page 2.
Scraped 20 names from page 3.
Scraped 20 names from page 4.
Scraped 20 names from page 5.
Scraped 20 names from page 6.
Scraped 20 names from page 7.
Scraped 20 names from page 8.
Scraped 7 names from page 9.
No emcees found on page 10. Stopping early.
Wrote 167 names to ../data/emcees.csv


## Data Cleaning
----

Tasks:

- convert `upload_date` to something more usable.
- convert `duration` to human readable (HH:MM:SS maybe?)
- Some shorts/round highlight moments were uploaded on the [videos playlist](https://www.youtube.com/@fliptopbattles/videos) of the channel so need to filter by duration, especially post-2020 (when YouTube Shorts became a thing).
- Only considering 1v1 battles that were judged (for now)
    - So remove:
        - Sound Check
        - Freestyle battles
        - Dos Por Dos (2v2)
        - Royal Rumble (1v1v1v1v1)
        - The two 5v5 battles
        - Anygma Machine
        - Video Flyer
        - Announcement videos
        - Emcee interview videos

BIG TASKS

- Clean up `title` and create new column `battle_card` or something that contains only names e.g. Emcee1 vs Emcee2
- Webscrape [emcee page](https://www.fliptop.com.ph/emcees) to get a full and complete list of the emcees in FlipTop (according to their own website roster).
    - Good for later use when cross-referencing titles with emcee names.
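
For the `battle_card` idea, a first-pass sketch could look like the following. The function name `battle_card` and the assumed title layout ("FlipTop - A vs B", optionally followed by "@ event" or "| extra") are my guesses from the titles seen so far, not a confirmed convention.

```python
import re

def battle_card(title):
    """Return 'Emcee1 vs Emcee2' from a video title, or None if it does not look like a 1v1."""
    # Drop the leading "FlipTop - " prefix if present
    card = re.sub(r"^\s*FlipTop\s*-\s*", "", title, flags=re.IGNORECASE)
    # Drop trailing event info such as "@ Isabuhay 2018" or "| Surprise Freestyle Battle"
    card = re.split(r"\s+[@|]\s+", card, maxsplit=1)[0]
    # Split on a standalone "vs" (optionally "vs.")
    names = re.split(r"\s+vs\.?\s+", card, flags=re.IGNORECASE)
    if len(names) != 2:
        return None
    left, right = (name.strip() for name in names)
    return f"{left} vs {right}"

print(battle_card("FlipTop - Sayadd vs J-King @ Isabuhay 2018"))
print(battle_card("FlipTop - No. 144 vs Markong Bungo"))
print(battle_card("FlipTop - Gubat 8 Post-Battle Interviews"))
```

Titles with no "vs", or with more than one, return `None` so they can be inspected by hand instead of being silently mangled.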

In [5]:
# looking at desc
samp = df.sample(n=5, replace=False)
samp

Unnamed: 0,id,title,description,upload_date,view_count,duration,url,likeCount,commentCount,tags
823,QeSzwXSYi_Y,FlipTop - Gubat 8 Post-Battle Interviews,FlipTop Visayas presents: Gubat 8 @ CAP Center...,2019-06-21T10:22:55Z,43244,PT9M43S,https://www.youtube.com/watch?v=QeSzwXSYi_Y,585,85,"[fliptop, fliptop new, fliptop latest, fliptop..."
965,TjJnuG8hhas,FlipTop - Sayadd vs J-King @ Isabuhay 2018,FlipTop Metro Manila presents: Second Sight 6 ...,2018-06-10T09:14:38Z,1433353,PT24M22S,https://www.youtube.com/watch?v=TjJnuG8hhas,6889,1761,"[fliptop, fliptop new, fliptop latest, fliptop..."
759,oiN22Wyf_Uo,FlipTop - Chris Ace vs Pen Pluma,https://fliptop.com.ph\nFlipTop Metro Cebu pre...,2019-11-17T09:35:59Z,277024,PT24M16S,https://www.youtube.com/watch?v=oiN22Wyf_Uo,2231,644,"[fliptop, flip top, flip top battles, fliptopb..."
833,pR4axEbcEbQ,FlipTop - BLKD vs Marshall Bonifacio @ Isabuha...,"FlipTop Visayas presents: Gubat 8, Isabuhay To...",2019-05-30T11:17:47Z,2128344,PT25M12S,https://www.youtube.com/watch?v=pR4axEbcEbQ,19154,3585,"[fliptop, fliptop new, fliptop latest, fliptop..."
709,lcX31_kRojs,"FlipTop - Ahon 10, Day 1 Post-Battle Interviews",FlipTop Metro Manila presents: Ahon 10 @ TIU T...,2020-04-13T09:55:24Z,143133,PT11M15S,https://www.youtube.com/watch?v=lcX31_kRojs,2479,272,"[fliptop, flip top, flip top battles, fliptopb..."


In [6]:
samp["description"]

823    FlipTop Visayas presents: Gubat 8 @ CAP Center...
965    FlipTop Metro Manila presents: Second Sight 6 ...
759    https://fliptop.com.ph\nFlipTop Metro Cebu pre...
833    FlipTop Visayas presents: Gubat 8, Isabuhay To...
709    FlipTop Metro Manila presents: Ahon 10 @ TIU T...
Name: description, dtype: object

In [7]:
# Write multi-line descriptions to file for analysis
title_desc = samp[["title", "description"]]
records = title_desc.to_dict(orient="records")

with open("../data/sample_descriptions.json", "w", encoding="utf-8") as f:
    json.dump(records, f, indent=2, ensure_ascii=False)

In [8]:
with open("../data/sample_descriptions.json", "r", encoding="utf-8") as f:
    descriptions = json.load(f)

# Pick a record by index (here index 1, the second sample)
sample = descriptions[1]

# Print nicely
print("TITLE:\n", sample["title"])
print("\nDESCRIPTION:\n", sample["description"])

TITLE:
 FlipTop - Sayadd vs J-King @ Isabuhay 2018

DESCRIPTION:
 FlipTop Metro Manila presents: Second Sight 6 Pre-Battle Interviews @ B-Side, The Collective, Malugay St., Makati City, Metro Manila, Philippines. March 16, 2018. Filipino Conference Battle.

-SAYADD VS J-KING-

Subscribe Here! http://bit.ly/fliptopsub
Check out our top videos! http://bit.ly/fliptopTopVideos

Website: https://fliptop.com.phFacebook: https://www.facebook.com/fliptop.battleleague
Twitter: https://twitter.com/FlipTop_Battles

#fliptopbattles

About fliptopbattles:
FlipTop Kru Corp. is a self-produced events and artist management company with its first product in the FlipTop Battle League. The FlipTop Battle League is the Philippines’ first premier – and the world’s most-viewed – rap battle league. It is popularly credited for the resurgence and widespread acceptance of hiphop culture in the Philippines since its inception in February 2010, and continues to champion all other hiphop elements in its variety o

Video descriptions often include good info on when and where the rap battle took place. Need to scrape these too!
- Let's put a pin in that for now. Do the low-hanging fruit first before going for messy regex text extraction tasks.
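
For when that pin comes out, here is a tentative regex sketch. It assumes the first line follows the layout "... presents: EVENT @ VENUE. Month D, YYYY." as in the sample printed above; descriptions that deviate (for example the ones starting with a URL, or using a comma instead of `@`) may return `None` and would need their own handling.

```python
import re
from datetime import datetime

MONTHS = "January|February|March|April|May|June|July|August|September|October|November|December"

# Assumed layout: "<league> presents: <event> @ <venue>. <Month D, YYYY>. ..."
DESCRIPTION_PATTERN = re.compile(
    rf"presents:\s*(?P<event>.+?)\s*@\s*(?P<venue>.+?)\.\s*"
    rf"(?P<date>(?:{MONTHS})\s+\d{{1,2}},\s+\d{{4}})"
)

def parse_description(description):
    """Return event, venue and date from a description, or None if the layout differs."""
    match = DESCRIPTION_PATTERN.search(description)
    if match is None:
        return None
    return {
        "event": match["event"],
        "venue": match["venue"],
        "date": datetime.strptime(match["date"], "%B %d, %Y").date(),
    }

sample_text = (
    "FlipTop Metro Manila presents: Second Sight 6 Pre-Battle Interviews @ B-Side, "
    "The Collective, Malugay St., Makati City, Metro Manila, Philippines. "
    "March 16, 2018. Filipino Conference Battle."
)
print(parse_description(sample_text))
```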

In [10]:
# Convert upload_date to datetime objects
df["upload_date"] = pd.to_datetime(df["upload_date"], utc=True)

In [9]:
df.head()

Unnamed: 0,id,title,description,upload_date,view_count,duration,url,likeCount,commentCount,tags
0,NCElL6MbZ_g,FlipTop - Caspher vs CRhyme,FlipTop presents: Zoning 18 @ The TakeOver Lou...,2025-07-02 13:23:54+00:00,139884,PT32M32S,https://www.youtube.com/watch?v=NCElL6MbZ_g,3117,613,"[fliptop, flip top, flip top battles, fliptopb..."
1,JabvhPBmoVs,FlipTop CripLi - Isabuhay 2025 | Abangan si Cr...,"FlipTop presents: Gubat 15 @ Mariner's Court, ...",2025-06-25 13:49:25+00:00,27498,PT1M51S,https://www.youtube.com/watch?v=JabvhPBmoVs,640,31,"[fliptop, flip top, flip top battles, fliptopb..."
2,YiJI_ohq4Pc,FlipTop Ban - Isabuhay 2025 | Abangan si Ban n...,"FlipTop presents: Gubat 15 @ Mariner's Court, ...",2025-06-25 13:49:25+00:00,29353,PT1M24S,https://www.youtube.com/watch?v=YiJI_ohq4Pc,763,34,"[fliptop, flip top, flip top battles, fliptopb..."
3,yRE3PU0ekaA,FlipTop J-Blaque - Gods Pa Rin? | | Abangan si...,"FlipTop presents: Gubat 15 @ Mariner's Court, ...",2025-06-25 13:49:25+00:00,17415,PT1M56S,https://www.youtube.com/watch?v=yRE3PU0ekaA,360,8,"[fliptop, flip top, flip top battles, fliptopb..."
4,Ftk2DZ3hcxw,FlipTop Vitrum - Ang HipHop Pinalakas ng mga T...,"FlipTop presents: Gubat 15 @ Mariner's Court, ...",2025-06-25 13:48:58+00:00,28969,PT3M15S,https://www.youtube.com/watch?v=Ftk2DZ3hcxw,449,50,"[fliptop, flip top, flip top battles, fliptopb..."


In [11]:
# Create duration_timedelta col
df["duration_timedelta"] = df["duration"].apply(isodate.parse_duration)

# Create duration_seconds col (for ez calculations down the line)
df["duration_seconds"] = df["duration_timedelta"].dt.total_seconds()
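
For reference, this is roughly what the ISO 8601 duration parsing amounts to for the strings in this dataset. It is a stdlib-only sketch, not how `isodate` is implemented, and it only handles the `PT#H#M#S` shape; durations with a date part (such as `P1DT2H`) are deliberately out of scope.

```python
import re

# Handles only the "PT#H#M#S" shape; each of the three components is optional
DURATION_PATTERN = re.compile(r"^PT(?:(?P<h>\d+)H)?(?:(?P<m>\d+)M)?(?:(?P<s>\d+)S)?$")

def duration_to_seconds(iso_duration):
    """Convert a duration like 'PT32M32S' to a number of seconds."""
    match = DURATION_PATTERN.match(iso_duration)
    if match is None:
        raise ValueError(f"Unsupported duration: {iso_duration!r}")
    # Missing components come back as None and count as zero
    hours, minutes, seconds = (int(match[g] or 0) for g in ("h", "m", "s"))
    return hours * 3600 + minutes * 60 + seconds

print(duration_to_seconds("PT32M32S"))
print(duration_to_seconds("PT26M"))
```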

In [12]:
df.head()

Unnamed: 0,id,title,description,upload_date,view_count,duration,url,likeCount,commentCount,tags,duration_timedelta,duration_seconds
0,NCElL6MbZ_g,FlipTop - Caspher vs CRhyme,FlipTop presents: Zoning 18 @ The TakeOver Lou...,2025-07-02 13:23:54+00:00,139884,PT32M32S,https://www.youtube.com/watch?v=NCElL6MbZ_g,3117,613,"[fliptop, flip top, flip top battles, fliptopb...",0 days 00:32:32,1952.0
1,JabvhPBmoVs,FlipTop CripLi - Isabuhay 2025 | Abangan si Cr...,"FlipTop presents: Gubat 15 @ Mariner's Court, ...",2025-06-25 13:49:25+00:00,27498,PT1M51S,https://www.youtube.com/watch?v=JabvhPBmoVs,640,31,"[fliptop, flip top, flip top battles, fliptopb...",0 days 00:01:51,111.0
2,YiJI_ohq4Pc,FlipTop Ban - Isabuhay 2025 | Abangan si Ban n...,"FlipTop presents: Gubat 15 @ Mariner's Court, ...",2025-06-25 13:49:25+00:00,29353,PT1M24S,https://www.youtube.com/watch?v=YiJI_ohq4Pc,763,34,"[fliptop, flip top, flip top battles, fliptopb...",0 days 00:01:24,84.0
3,yRE3PU0ekaA,FlipTop J-Blaque - Gods Pa Rin? | | Abangan si...,"FlipTop presents: Gubat 15 @ Mariner's Court, ...",2025-06-25 13:49:25+00:00,17415,PT1M56S,https://www.youtube.com/watch?v=yRE3PU0ekaA,360,8,"[fliptop, flip top, flip top battles, fliptopb...",0 days 00:01:56,116.0
4,Ftk2DZ3hcxw,FlipTop Vitrum - Ang HipHop Pinalakas ng mga T...,"FlipTop presents: Gubat 15 @ Mariner's Court, ...",2025-06-25 13:48:58+00:00,28969,PT3M15S,https://www.youtube.com/watch?v=Ftk2DZ3hcxw,449,50,"[fliptop, flip top, flip top battles, fliptopb...",0 days 00:03:15,195.0


Next task:
- Subset `df` to only include 1v1 battles.

While the majority of the videos posted on the [FlipTop YouTube channel](https://www.youtube.com/@fliptopbattles) are rap battles, a number of them are video flyers, announcement videos, behind-the-scenes content, reaction videos, and the like. As such, it's important to filter out these videos so that we're left to analyze the data from the **rap battles** alone.

### But what constitutes a **rap battle**?

Good question.

My criteria for what is considered a rap battle for the purposes of this project:
1. The video needs to be *a cappella* (no underlying beat accompanying the emcees' rapping).
2. The video involves emcees performing written material (not all of their rounds are off-the-top freestyle). 
    - The early days of FlipTop saw emcees testing each other's skills in the artform known as off-the-top freestyle, where emcees would take turns berating each other lyrically with material they thought of on the spot or in the moment of speaking. 
3. To a lesser extent, the video needs to involve judging at the end (there needs to be stakes).

Note:
- **By these criteria, earlier videos of FlipTop wouldn't be included. Need explanation for what I'm doing here.**

As a long-time viewer of these videos, I know a few keywords that make filtering with these criteria easier.
- Include:
    - **vs** - most, if not all, of the a cappella rap battles on the FlipTop YouTube channel have "vs" in the video title.
        - For the uninitiated: "vs" is short for "versus."
- Exclude:
    - **tryout** - while these are battles, they are for newcomers to the scene and are often not judged, especially in older videos.
    - **beatbox** - this is another genre of battle, separate from the a cappella, judged battles.
    - **flyer** and **promo** - these are advertisements and announcement videos for upcoming events.
    - **Anygma Machine** - Anygma, the head of FlipTop as a company, sometimes reviews battles and gives his take on upcoming matches.
        - A reference to the real [Enigma Machine](https://en.wikipedia.org/wiki/Enigma_machine) whose cipher the Allies had to break in WW2.
    - **[LIVE]** - live performances from the FlipTop Festival event that happened in 2020.
    - **UnggoYan** - emcees read comments left on videos of their previous battles.
    - **Pre-Battle Interviews** - self-explanatory.
    - **Salitang Ugat** - translation: "root word." These are interviews of notable emcees who tell the stories behind how they came up with their rap battle name.
    - **Trailer** - promo trailers for upcoming events.
    - **Video Flyer** - self-explanatory.
    - **Silip** - BTS videos added recently.
    - **Sound Check** - pre-event check-in covering FlipTop event prep.
    - **Tribute** - tributes to rappers who have died.
    - **Tutok** - other BTS videos?

For the scope of this project, I will only consider the battles that are between two people. FlipTop has a variety of rap battle formats, not just two people insulting each other back and forth. Examples include: the Royal Rumble, the 5 vs 5, and the tag-team 2 vs 2 (Dos Por Dos) matches. The vast majority of the battles, though, are one versus one. Those battles will be the focus of this project.

In [14]:
df.shape

(1749, 12)

In [None]:
# Include if "vs" in the title
with_vs = df[df["title"].str.contains(r"\bvs\b", case=False, regex=True)]

# Key words and phrases to exclude
exclude_keywords = [
    "tryout", "tryouts", "beatbox", "beat box", "flyer", "promo", "promos", 
    "anygma machine", "unggoyan", "pre-battle interviews", "interview",
    "interviews", "salitang ugat", "trailer", "video flyer", "[live]",
    "silip", "sound check", "tribute", "anniversary party","tutok", "review",
    "abangan"
]

# Regex OR pattern to discard non-battles
exclude_pattern = "|".join([re.escape(word) for word in exclude_keywords])


df_f = with_vs[~with_vs["title"].str.contains(exclude_pattern, case=False, regex=True)]


In [16]:
df_f.shape

(1294, 12)

In [17]:
df_f.head()

Unnamed: 0,id,title,description,upload_date,view_count,duration,url,likeCount,commentCount,tags,duration_timedelta,duration_seconds
0,NCElL6MbZ_g,FlipTop - Caspher vs CRhyme,FlipTop presents: Zoning 18 @ The TakeOver Lou...,2025-07-02 13:23:54+00:00,139884,PT32M32S,https://www.youtube.com/watch?v=NCElL6MbZ_g,3117,613,"[fliptop, flip top, flip top battles, fliptopb...",0 days 00:32:32,1952.0
1,JabvhPBmoVs,FlipTop CripLi - Isabuhay 2025 | Abangan si Cr...,"FlipTop presents: Gubat 15 @ Mariner's Court, ...",2025-06-25 13:49:25+00:00,27498,PT1M51S,https://www.youtube.com/watch?v=JabvhPBmoVs,640,31,"[fliptop, flip top, flip top battles, fliptopb...",0 days 00:01:51,111.0
2,YiJI_ohq4Pc,FlipTop Ban - Isabuhay 2025 | Abangan si Ban n...,"FlipTop presents: Gubat 15 @ Mariner's Court, ...",2025-06-25 13:49:25+00:00,29353,PT1M24S,https://www.youtube.com/watch?v=YiJI_ohq4Pc,763,34,"[fliptop, flip top, flip top battles, fliptopb...",0 days 00:01:24,84.0
3,yRE3PU0ekaA,FlipTop J-Blaque - Gods Pa Rin? | | Abangan si...,"FlipTop presents: Gubat 15 @ Mariner's Court, ...",2025-06-25 13:49:25+00:00,17415,PT1M56S,https://www.youtube.com/watch?v=yRE3PU0ekaA,360,8,"[fliptop, flip top, flip top battles, fliptopb...",0 days 00:01:56,116.0
5,RDiDkJHcnqs,FlipTop - Negho Gy vs Hespero,FlipTop presents: Zoning 18 @ The TakeOver Lou...,2025-06-25 12:44:16+00:00,334399,PT32M9S,https://www.youtube.com/watch?v=RDiDkJHcnqs,4636,764,"[fliptop, flip top, flip top battles, fliptopb...",0 days 00:32:09,1929.0


I still see non-battles from the titles. Need to inspect all titles from `df_f`.

In [18]:
# Inspection
import os
os.makedirs("../data", exist_ok=True)
df_f[["title"]].to_csv("../data/to_filter.csv", index=False)

Low-hanging fruit: "Abangan si..." or "Abangan sa..." are NOT battles but *clips* from battles.

In [19]:
exclude_phrase = "abangan"
df_f = df_f[~df_f["title"].str.contains(exclude_phrase, case=False, regex=True)]

In [23]:
df_f.shape

(1285, 12)

In [24]:
df_f.head()

Unnamed: 0,id,title,description,upload_date,view_count,duration,url,likeCount,commentCount,tags,duration_timedelta,duration_seconds
0,NCElL6MbZ_g,FlipTop - Caspher vs CRhyme,FlipTop presents: Zoning 18 @ The TakeOver Lou...,2025-07-02 13:23:54+00:00,139884,PT32M32S,https://www.youtube.com/watch?v=NCElL6MbZ_g,3117,613,"[fliptop, flip top, flip top battles, fliptopb...",0 days 00:32:32,1952.0
5,RDiDkJHcnqs,FlipTop - Negho Gy vs Hespero,FlipTop presents: Zoning 18 @ The TakeOver Lou...,2025-06-25 12:44:16+00:00,334399,PT32M9S,https://www.youtube.com/watch?v=RDiDkJHcnqs,4636,764,"[fliptop, flip top, flip top battles, fliptopb...",0 days 00:32:09,1929.0
6,9xWuT-atbuo,FlipTop - Yagi vs JP,"FlipTop presents: Pakusganay 8 @ Davao City, M...",2025-03-24 12:51:25+00:00,157503,PT38M52S,https://www.youtube.com/watch?v=9xWuT-atbuo,3078,904,"[fliptop, flip top, flip top battles, fliptopb...",0 days 00:38:52,2332.0
7,UfFT7TbtRy0,FlipTop - No. 144 vs Markong Bungo,FlipTop presents: Second Sight 13 @ TIU Theate...,2024-12-04 12:45:47+00:00,744384,PT26M,https://www.youtube.com/watch?v=UfFT7TbtRy0,15920,2717,"[fliptop, flip top, flip top battles, fliptopb...",0 days 00:26:00,1560.0
8,zWTapWo_Iho,FlipTop - Shaboy vs Marichu,FlipTop presents: Won Minutes Luzon 2 @ 88Fry...,2024-08-03 09:58:17+00:00,294043,PT13M50S,https://www.youtube.com/watch?v=zWTapWo_Iho,4583,689,"[fliptop, flip top, flip top battles, fliptopb...",0 days 00:13:50,830.0


Now it's time to include only 1v1 battles.

In [58]:
import re

def is_probably_not_1v1(title):
    if not isinstance(title, str):
        return True  # defensive: non-string, toss it

    title_lower = title.lower()

    # Heuristic 1: multiple "vs"
    if len(re.findall(r"\bvs\b", title_lower)) > 1:
        return True

    # Heuristic 2: "and" appears more than once (i.e., on both sides of the battle)
    if len(re.findall(r"\band\b", title_lower)) > 1:
        return True

    # Heuristic 3: slash anywhere in the title (tag-team names like "A / B" or "A/B")
    if "/" in title_lower:
        return True

    # Heuristic 4: plus sign (no-shows)
    if "+" in title_lower:
        return True

    # Heuristic 5: common 2v2 pattern: "and" on both sides of "vs"
    if re.search(r"\band\b.*\bvs\b.*\band\b", title_lower):
        return True
    
    # Heuristic 6: team format like "5 on 5 battle"
    if re.search(r"\b\d+\s*on\s*\d+\b", title_lower):
        return True

    return False


# Filtering
mask = df_f["title"].apply(is_probably_not_1v1)
df_not_1v1 = df_f[mask]         # Filtered out (bad) - inspect
df_1v1_only = df_f[~mask]       # Final 1v1 dataset? Pending inspection
    


In [59]:
# Audit output
print("Filtered out (not 1v1):")
for title in df_not_1v1["title"]:
    print("-", title)

Filtered out (not 1v1):
- FlipTop - Crazymix/Bassilyo vs Cripli/Towpher @ DosPorDos 2017
- FlipTop - Rapido/Icaruz vs Dopee/Snob @ Dos Por Dos 2
- FlipTop - CripLi / Towpher vs K-Ram / SlockOne
- FlipTop - Caspher / Hespero vs Atoms / Cygnus @ DosPorDos 2024 Finals
- FlipTop - Negho G / Pamoso vs Atoms / Cygnus @ DosPorDos 2024 Semifinals
- FlipTop - Caspher / Hespero vs Aubrey / Marichu @ DosPorDos 2024 Semifinals
- FlipTop - Negho Gy / Pamoso vs RG / Deadline @ DosPorDos2024
- FlipTop - Aubrey / Marichu vs Sickreto / Article Clipted @ DosPorDos2024
- FlipTop - Bisente / Jawz vs Atoms / Cygnus @ DosPorDos 2024
- FlipTop - Caspher / Hespero vs Kenzer / Mimack @ DosPorDos 2024
- FlipTop - Frooz / Elbiz vs Mac T / G-Spot
- FlipTop - Pistolero / Luxuria vs MastaFeat / Hearty | Surprise Freestyle Battle
- FlipTop - Sur Henyo vs Kregga vs Batang Rebelde vs LilStrocks vs Bagsik *ROYAL RUMBLE*
- FlipTop - K-Ram/SlockOne vs Vitrum/Illtimate
- FlipTop - Zaito vs C-Quence vs CNine vs Prince Rhym

In [60]:
df_not_1v1[["title"]].to_csv("../data/not_1v1_filtered_out.csv", index=False)

In [61]:
df_1v1_only.head(3)

Unnamed: 0,id,title,description,upload_date,view_count,duration,url,likeCount,commentCount,tags,duration_timedelta,duration_seconds
0,NCElL6MbZ_g,FlipTop - Caspher vs CRhyme,FlipTop presents: Zoning 18 @ The TakeOver Lou...,2025-07-02 13:23:54+00:00,139884,PT32M32S,https://www.youtube.com/watch?v=NCElL6MbZ_g,3117,613,"[fliptop, flip top, flip top battles, fliptopb...",0 days 00:32:32,1952.0
5,RDiDkJHcnqs,FlipTop - Negho Gy vs Hespero,FlipTop presents: Zoning 18 @ The TakeOver Lou...,2025-06-25 12:44:16+00:00,334399,PT32M9S,https://www.youtube.com/watch?v=RDiDkJHcnqs,4636,764,"[fliptop, flip top, flip top battles, fliptopb...",0 days 00:32:09,1929.0
6,9xWuT-atbuo,FlipTop - Yagi vs JP,"FlipTop presents: Pakusganay 8 @ Davao City, M...",2025-03-24 12:51:25+00:00,157503,PT38M52S,https://www.youtube.com/watch?v=9xWuT-atbuo,3078,904,"[fliptop, flip top, flip top battles, fliptopb...",0 days 00:38:52,2332.0


In [62]:
df_1v1_only.shape

(1194, 12)

In [63]:
df_1v1_only = df_1v1_only.copy()  # work on a copy so pandas does not warn about writing to a slice
df_1v1_only["title"] = df_1v1_only["title"].str.strip()

In [64]:
# Strip leading and trailing " from title, only if both exist
df_1v1_only = df_1v1_only.copy()
df_1v1_only["title"] = df_1v1_only["title"].str.replace(r'^"(.*)"$', r'\1', regex=True)

In [65]:
df_1v1_only.shape

(1194, 12)

Need to inspect this 1v1-only df to see if it still contains non-1v1 battles.

In [67]:
df_1v1_only[["title"]].to_csv("../data/filtered_1v1.tsv", sep="\t", index=False)

Three of the 1v1 battle titles contain commas. Writing them to CSV with the default comma separator wraps those titles in double quotes, so I write tab-separated values instead.
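
A quick stdlib illustration of the quoting behavior. The title used here is a comma-containing one from the earlier sample (an interview video, not one of the three battles), and the `dump` helper is just for this demo.

```python
import csv
import io

# A title from this dataset that contains a comma
rows = [["title"], ["FlipTop - Ahon 10, Day 1 Post-Battle Interviews"]]

def dump(rows, delimiter):
    """Write rows to an in-memory buffer and return the resulting text."""
    buffer = io.StringIO()
    csv.writer(buffer, delimiter=delimiter, lineterminator="\n").writerows(rows)
    return buffer.getvalue()

as_csv = dump(rows, ",")   # the comma inside the title forces quoting
as_tsv = dump(rows, "\t")  # no tab in the title, so no quoting is needed
print(as_csv)
print(as_tsv)
```

The quotes are only an escaping mechanism: a CSV reader strips them again on load, so either format round-trips. TSV is simply easier to eyeball during manual inspection.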

Looks all good on cursory inspection.

Next task: deal with early FlipTop battles that were uploaded in parts (`pt. 1`, `pt. 2`, etc.).
- Merge the parts into one record, with the links stored as a list instead of a single string?
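
One possible shape for that merge, sketched with the standard library. The titles, URLs, and the assumed suffix format (`pt. 1`, `pt 2`, `Pt.3` at the end of the title) are hypothetical; the real early titles need to be checked before relying on this pattern.

```python
import re
from collections import defaultdict

# Assumed suffix format: "pt. 1", "pt 2", "Pt.3" at the end of the title
PART_PATTERN = re.compile(r"\s*\bpt\.?\s*(\d+)\s*$", flags=re.IGNORECASE)

def merge_parts(videos):
    """Group (title, url) pairs by base title and collect URLs in part order."""
    grouped = defaultdict(list)
    for title, url in videos:
        match = PART_PATTERN.search(title)
        part = int(match.group(1)) if match else 1
        base_title = PART_PATTERN.sub("", title) if match else title
        grouped[base_title].append((part, url))
    return {base: [url for _, url in sorted(parts)] for base, parts in grouped.items()}

# Hypothetical titles and URLs (placeholders, not rows from the dataset)
videos = [
    ("FlipTop - Emcee A vs Emcee B pt. 2", "https://example.com/b"),
    ("FlipTop - Emcee A vs Emcee B pt. 1", "https://example.com/a"),
    ("FlipTop - Emcee C vs Emcee D", "https://example.com/c"),
]
print(merge_parts(videos))
```

Storing every battle's links as a list, even single-video ones, keeps the column type consistent. Views, likes, and durations across parts would still need their own aggregation rule.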