### <span style="color:black"><b>Pandas Tutorial 5</b></span>

---

<ins>Strings & Advanced Filtering</ins>

This video builds on the ideas from the previous one; we can now use `df.loc[]` to be even more specific with our filters.



In [1]:
import pandas as pd

In [2]:
# Read in data
df = pd.read_csv('yt_data.csv')
df.head()

Unnamed: 0,title,channel_title,category_id,views,likes,dislikes,comment_count,thumbnail_link,description
0,"Racist Superman | Rudy Mancuso, King Bach & Le...",rudy mancuso,23.0,3191434.0,146033.0,5339.0,8181.0,https://i.ytimg.com/vi/5qpjK5DgCt4/default.jpg,WATCH MY PREVIOUS VIDEO ▶ \n\nSUBSCRIBE ► http...
1,Which Countries Are About To Collapse?,NowThis World,25.0,544770.0,7848.0,1171.0,3981.0,https://i.ytimg.com/vi/GgVmn66oK_A/default.jpg,"The world at large is improving, but some coun..."
2,How Can You Control Your Dreams?,Life Noggin,27.0,115791.0,9586.0,75.0,2800.0,https://i.ytimg.com/vi/vU14JY3x81A/default.jpg,What if there was a way to control your dreams...
3,Is It Dangerous To Talk To A Camera While Driv...,Tom Scott,27.0,144418.0,11758.0,89.0,1014.0,https://i.ytimg.com/vi/_-aDHxoblr4/default.jpg,I'm visiting the University of Iowa's National...
4,Using Other People's Showers,gus johnson,23.0,33980.0,4884.0,52.0,234.0,https://i.ytimg.com/vi/lZ68j2J_GOM/default.jpg,Why is it so hard to figure out other people's...


In [3]:
# Shape
df.shape

(2247, 9)

**A Review of .loc[]**

In [4]:
# Get all rows where the channel is 'Katy Perry', 'Keith Urban' or 'Alicia Keys'
df.loc[df['channel_title'].isin(['Katy Perry', 'Keith Urban', 'Alicia Keys']), :]

# If these names were written in lowercase ⬆️, .loc[] would not return any of these rows!
# The rest of this notebook shows how to address this.
# The basic idea is to remove upper/lower case differences and strip whitespace,
# so that ' Alicia KeYs  ' and '      Alicia keys ' compare as equal.

Unnamed: 0,title,channel_title,category_id,views,likes,dislikes,comment_count,thumbnail_link,description
45,Alicia Keys - When You Were Gone,Alicia Keys,10.0,95944.0,1354.0,181.0,117.0,https://i.ytimg.com/vi/5x1FAiIq_pQ/default.jpg,Find out more in The Vault: http://bit.ly/AK_A...
49,Keith Urban - Female (Official Audio),Keith Urban,24.0,754558.0,11985.0,1356.0,1757.0,https://i.ytimg.com/vi/_qSW96a2aKY/default.jpg,Buy or stream Female now: www.smarturl.it/KU-F...
63,Katy Perry - Swish Swish (Behind the Scenes wi...,Katy Perry,24.0,56012.0,6243.0,106.0,414.0,https://i.ytimg.com/vi/M2-IpsWQUWs/default.jpg,"Go behind the scenes with me and my friends, J..."
1231,Keith Urban - Female Acoustic,Keith Urban,24.0,64143.0,1782.0,111.0,122.0,https://i.ytimg.com/vi/SJXXGL_8VcU/default.jpg,Keith Urban performing an acoustic version of ...
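
To make the case and whitespace problem concrete, here is a minimal sketch on a made-up four-row frame (the `demo` data and the variable names are assumptions for illustration, not part of `yt_data.csv`). It shows that `isin()` only matches text that is exactly identical, and that normalising the column first recovers the other spellings.

```python
import pandas as pd

# Hypothetical mini dataset: the same channel typed three different ways
demo = pd.DataFrame({
    "channel_title": ["Alicia Keys", "alicia keys", "  ALICIA KEYS ", "Tom Scott"],
    "views": [100, 200, 300, 400],
})

# Exact matching only finds the row whose text is identical to the lookup value
exact = demo.loc[demo["channel_title"].isin(["Alicia Keys"]), :]

# Normalise the column (strip whitespace, lowercase), then compare against lowercase names
cleaned = demo["channel_title"].str.strip().str.lower()
normalised = demo.loc[cleaned.isin(["alicia keys"]), :]

print(len(exact))       # 1
print(len(normalised))  # 3
```

Note that the lookup values have to be normalised the same way as the column, otherwise the comparison still fails.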


**Exercise: Get the 'channel_title' series and do the following:**

Show how it can be put into:
* Uppercase
* Lowercase
* Title form

Here is a [great resource](https://www.w3schools.com/python/python_ref_string.asp) on how to manipulate strings in Python.

In [5]:
# Upper
df['channel_title'].str.upper()

0           RUDY MANCUSO
1          NOWTHIS WORLD
2            LIFE NOGGIN
3              TOM SCOTT
4            GUS JOHNSON
              ...       
2242         LOL NETWORK
2243      MICHAEL DAPAAH
2244    LINDSEY STIRLING
2245           LAURA LEE
2246         BON APPÉTIT
Name: channel_title, Length: 2247, dtype: object

In [6]:
# Lower
df['channel_title'].str.lower()

0           rudy mancuso
1          nowthis world
2            life noggin
3              tom scott
4            gus johnson
              ...       
2242         lol network
2243      michael dapaah
2244    lindsey stirling
2245           laura lee
2246         bon appétit
Name: channel_title, Length: 2247, dtype: object

In [7]:
# Title
df['channel_title'].str.title()

0           Rudy Mancuso
1          Nowthis World
2            Life Noggin
3              Tom Scott
4            Gus Johnson
              ...       
2242         Lol Network
2243      Michael Dapaah
2244    Lindsey Stirling
2245           Laura Lee
2246         Bon Appétit
Name: channel_title, Length: 2247, dtype: object
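
The three methods can also be compared side by side on a small series (the `titles` values below are picked for illustration). One thing the sketch highlights is that `.str.title()` lowercases everything after the first letter of each word, which is why 'NowThis World' becomes 'Nowthis World' in the output above. That makes these methods handy for comparisons, but not always for display.

```python
import pandas as pd

# A few channel names with mixed casing (illustrative values)
titles = pd.Series(["NowThis World", "gus johnson", "LOL Network"])

upper = titles.str.upper()   # every letter uppercase
lower = titles.str.lower()   # every letter lowercase
title = titles.str.title()   # first letter of each word uppercase, the rest lowercase

print(title.tolist())  # ['Nowthis World', 'Gus Johnson', 'Lol Network']
```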

**Exercise: Find all YouTube videos that might have been about music 🎶**
* In your pandas command, account for the fact that we might get 'music' or 'Music' or 'MuSiC' or 'muSIC' or 'MUSIc'

In [8]:
# Solution
df.loc[df['description'].str.lower().str.contains('music'), :]

Unnamed: 0,title,channel_title,category_id,views,likes,dislikes,comment_count,thumbnail_link,description
4,Using Other People's Showers,gus johnson,23.0,33980.0,4884.0,52.0,234.0,https://i.ytimg.com/vi/lZ68j2J_GOM/default.jpg,Why is it so hard to figure out other people's...
6,Hunter Hayes - You Should Be Loved (Part One O...,Hunter Hayes,10.0,13917.0,1318.0,24.0,76.0,https://i.ytimg.com/vi/e_7zHm7GsYc/default.jpg,You Should Be Loved (feat. The Shadowboxers) A...
8,Will It Watermarble?! Sister Edition | Waterma...,Simply Nailogical,24.0,1842393.0,99086.0,1339.0,11800.0,https://i.ytimg.com/vi/0PpNlNJ6Nng/default.jpg,Guess who's back... back again... Jenny's back...
10,JaVale McGee's Parking Lot Chronicles: Episode 3,Kevin Durant,17.0,162597.0,5734.0,106.0,525.0,https://i.ytimg.com/vi/w0XYVssCKjw/default.jpg,"The Parking Lot Chronicles return, coming at y..."
13,The Super Google Pixel 2 Camera Upgrade!,Jonathan Morrison,28.0,290801.0,10237.0,377.0,1016.0,https://i.ytimg.com/vi/4v0nOAzcG2A/default.jpg,The Best iPhone X Accessories! https://youtu.b...
...,...,...,...,...,...,...,...,...,...
2240,BIG SHAQ - MAN DON'T DANCE (OFFICIAL MUSIC VIDEO),Michael Dapaah,23.0,3291375.0,237975.0,8625.0,23787.0,https://i.ytimg.com/vi/qQn6TsbYrH8/default.jpg,Man Don't Dance performed by Big Shaq (Michael...
2242,It’s Just Water Weight | Kevin Hart: What The ...,LOL Network,23.0,226849.0,5788.0,130.0,242.0,https://i.ytimg.com/vi/6fi1RZNWCDI/default.jpg,Fight your way to fitness with a firefighter’s...
2243,BIG SHAQ - MAN DON'T DANCE (OFFICIAL MUSIC VIDEO),Michael Dapaah,10.0,5199107.0,328699.0,13876.0,30797.0,https://i.ytimg.com/vi/qQn6TsbYrH8/default.jpg,Man Don't Dance performed by Big Shaq (Michael...
2244,Stampede - Alexander Jean Ft. Lindsey Stirling,Lindsey Stirling,10.0,296615.0,38671.0,463.0,2348.0,https://i.ytimg.com/vi/-9rdDeWzvsU/default.jpg,Lindsey Stirling & Evanescence Co-Headline Sum...
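
As an alternative sketch, `str.contains()` has a `case` parameter, so the lowercase step can be folded into the call. The example also passes `na=False`, which treats missing descriptions as "no match". Whether `yt_data.csv` contains missing descriptions is not shown in this notebook, so that part is an assumption, and the `desc` series is made up.

```python
import pandas as pd

# Made-up descriptions, including one missing value
desc = pd.Series(
    ["Official MUSIC video", "cooking tips", None, "New muSiC out now"],
    dtype=object,
)

# Approach used in the solution: lowercase first, then search
mask_lower = desc.str.lower().str.contains("music", na=False)

# Equivalent: let str.contains ignore case itself
mask_flag = desc.str.contains("music", case=False, na=False)

print(mask_flag.tolist())  # [True, False, False, True]
```

Without `na=False`, the missing row would produce a missing value in the mask, which `.loc[]` cannot use as a boolean filter.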


**Exercise: Find all YouTube videos that might have been about Pinterest, a holiday or a firefighter**
* Use basic [regex](https://www.programiz.com/python-programming/regex)

In [9]:
# Solution
df.loc[df.description.str.lower().str.contains('pinterest|holiday|firefighter'), :]

Unnamed: 0,title,channel_title,category_id,views,likes,dislikes,comment_count,thumbnail_link,description
41,Birthdays - Simon's Cat | GUIDE TO,Simon's Cat,15.0,426078.0,19323.0,245.0,945.0,https://i.ytimg.com/vi/qEEtzzi1EII/default.jpg,Watch Simon's Cat's Guide To Birthdays! \nSUBS...
50,I Picked My Girlfriend's Outfit Blindfolded,Tyler Williams,22.0,691229.0,31892.0,197.0,1404.0,https://i.ytimg.com/vi/P4YJwy_T9pM/default.jpg,Link to our pinterest recipes shopping room: h...
71,Brent Pella - Why You Shouldn't Fly on Spirit ...,Brent Pella,23.0,462490.0,14132.0,795.0,666.0,https://i.ytimg.com/vi/YvfYK0EEhK4/default.jpg,Traveling for the holidays? Think twice before...
83,funfetti is extremely fun,emma chamberlain,24.0,126438.0,12229.0,86.0,1865.0,https://i.ytimg.com/vi/MltuW2kcREI/default.jpg,the mini whisk IS BACK\n\nIf you liked this vi...
95,HUGE BEAUTY FAVOURITES! AUTUMN LOVES!,Estée Lalonde,22.0,128523.0,4170.0,148.0,307.0,https://i.ytimg.com/vi/ijJPFowBsBc/default.jpg,Come to Cambodia with me! \nhttps://youtu.be/I...
...,...,...,...,...,...,...,...,...,...
2175,The Sims 4 Seasons: Official Reveal Trailer,The Sims,20.0,875878.0,60686.0,715.0,13595.0,https://i.ytimg.com/vi/tn0hCTyj2Kc/default.jpg,Add weather to your Sims’ lives to tell new st...
2190,Joey Graceffa's Enchanted Gaming Room Makeover...,Mr. Kate,26.0,1116194.0,62929.0,550.0,5298.0,https://i.ytimg.com/vi/PV_GHHXy7Po/default.jpg,Big thanks to DazzlePro for sponsoring and get...
2212,GET READY WITH ME FOR THE MILITARY BALL! | Cas...,Casey Holmes,26.0,287522.0,15784.0,251.0,1371.0,https://i.ytimg.com/vi/sfxilYIycTc/default.jpg,Don't forget to subscribe here-- http://bit.ly...
2233,Waking Up With Ariana Grande | British Vogue,British Vogue,24.0,1092534.0,59225.0,1226.0,2014.0,https://i.ytimg.com/vi/n2wIqXBz4os/default.jpg,"I'm very grateful to be here, and that overpow..."
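
The `|` in the pattern means "or" in regex. When the keywords come from a list, one possible way to build the same pattern is to join them with `|`. The sketch below does that on made-up descriptions; `re.escape()` is included as a precaution in case a keyword ever contains a character that is special in regex (the three keywords here do not).

```python
import re
import pandas as pd

# Made-up descriptions for illustration
desc = pd.Series([
    "Find us on Pinterest!",
    "Traveling for the holidays?",
    "Train like a firefighter",
    "Nothing relevant here",
])

keywords = ["pinterest", "holiday", "firefighter"]

# Join the keywords into one regex: 'pinterest|holiday|firefighter'
pattern = "|".join(re.escape(k) for k in keywords)

mask = desc.str.contains(pattern, case=False)

print(pattern)        # pinterest|holiday|firefighter
print(mask.tolist())  # [True, True, True, False]
```

Because `contains` looks for the pattern anywhere in the text, 'holiday' also matches 'holidays'.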


**Exercise: Find all YouTube videos that have a description starting with 'In this video' or 'Hi everyone'**

In [10]:
# startswith does not accept a regex. Instead, pass in a Python tuple of prefixes
df.loc[df.description.str.lower().str.strip().str.startswith(('in this video', 'hi everyone')), :]

Unnamed: 0,title,channel_title,category_id,views,likes,dislikes,comment_count,thumbnail_link,description
150,4 Reasons I Don't Like Thanksgiving || Mayim B...,Mayim Bialik,1.0,46197.0,3794.0,254.0,889.0,https://i.ytimg.com/vi/AJkOHavUH_A/default.jpg,Hi everyone! Mayim Bialik here. You may know m...
228,My Favorite Thanksgiving Recipe || Mayim Bialik,Mayim Bialik,1.0,27225.0,1966.0,47.0,374.0,https://i.ytimg.com/vi/YRRczTwopgA/default.jpg,Hi everyone! Mayim Bialik here. You may know m...
582,Big Bang Recap - The Celebration Reverberation...,Mayim Bialik,1.0,14373.0,1123.0,38.0,133.0,https://i.ytimg.com/vi/JVHFMBfG-cM/default.jpg,Hi everyone! Mayim Bialik here. You may know m...
1909,Updated Everyday Makeup Routine 💋,Daisy Marquez,22.0,412135.0,21146.0,1077.0,1949.0,https://i.ytimg.com/vi/NIf8LjND67Y/default.jpg,In this video I filmed my updated Makeup Routi...
1921,I built a PC out of rope and wood...,DIY Perks,26.0,947071.0,47282.0,1678.0,5425.0,https://i.ytimg.com/vi/N-z9PidYH4E/default.jpg,In this video I show you how I built a unique ...
2227,How My Makeup Looks Under a Microscope,Tina Yong,26.0,676111.0,23613.0,525.0,1773.0,https://i.ytimg.com/vi/UXNnAB3ugPU/default.jpg,"In this video, I look at my makeup under a mic..."
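
If a regex is preferred for the prefix check, `str.match()` is one option, since it only matches at the start of each string. The sketch below is an alternative to the tuple approach, not the method used in the solution, and it runs on made-up descriptions.

```python
import pandas as pd

# Made-up descriptions, including leading whitespace and a missing value
desc = pd.Series(
    [
        "  Hi everyone! Welcome back.",
        "In this video I build a PC.",
        "Today, in this video, we cook.",
        None,
    ],
    dtype=object,
)

# Same clean-up as in the solution: lowercase, then strip whitespace
cleaned = desc.str.lower().str.strip()

# str.match anchors the pattern at the beginning of each string
mask = cleaned.str.match(r"in this video|hi everyone", na=False)

print(mask.tolist())  # [True, True, False, False]
```

The third row contains 'in this video' in the middle of the sentence, so it is not matched.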


**Exercise: Replace underscores with spaces in all the column names**

In [11]:
df.columns = df.columns.str.replace("_", " ")

df.head()

Unnamed: 0,title,channel title,category id,views,likes,dislikes,comment count,thumbnail link,description
0,"Racist Superman | Rudy Mancuso, King Bach & Le...",rudy mancuso,23.0,3191434.0,146033.0,5339.0,8181.0,https://i.ytimg.com/vi/5qpjK5DgCt4/default.jpg,WATCH MY PREVIOUS VIDEO ▶ \n\nSUBSCRIBE ► http...
1,Which Countries Are About To Collapse?,NowThis World,25.0,544770.0,7848.0,1171.0,3981.0,https://i.ytimg.com/vi/GgVmn66oK_A/default.jpg,"The world at large is improving, but some coun..."
2,How Can You Control Your Dreams?,Life Noggin,27.0,115791.0,9586.0,75.0,2800.0,https://i.ytimg.com/vi/vU14JY3x81A/default.jpg,What if there was a way to control your dreams...
3,Is It Dangerous To Talk To A Camera While Driv...,Tom Scott,27.0,144418.0,11758.0,89.0,1014.0,https://i.ytimg.com/vi/_-aDHxoblr4/default.jpg,I'm visiting the University of Iowa's National...
4,Using Other People's Showers,gus johnson,23.0,33980.0,4884.0,52.0,234.0,https://i.ytimg.com/vi/lZ68j2J_GOM/default.jpg,Why is it so hard to figure out other people's...
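
Two equivalent ways of renaming the columns are sketched below on an empty frame with three of the column names from this dataset. `rename()` with a function returns a new DataFrame and leaves the original untouched, while assigning to `.columns` changes the frame in place. Passing `regex=False` makes it explicit that `"_"` is treated as plain text.

```python
import pandas as pd

# Empty frame with a few of the column names used in this notebook
demo = pd.DataFrame(columns=["channel_title", "category_id", "views"])

# Option 1: rename() with a function, returns a new DataFrame
renamed = demo.rename(columns=lambda name: name.replace("_", " "))

# Option 2: the .str accessor on the column Index, treated as a literal replacement
new_columns = demo.columns.str.replace("_", " ", regex=False)

print(list(renamed.columns))  # ['channel title', 'category id', 'views']
print(list(demo.columns))     # ['channel_title', 'category_id', 'views']
```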


**Exercise: Sweep through each value and put it into lowercase, removing any whitespace at the beginning or end of the word**
* For people who know `applymap()` (renamed to `DataFrame.map()` in pandas 2.1), you can do `df.applymap(lambda x: x.lower().strip() if type(x) == str else x)`
* Though a `for` loop over the text columns might be more readable
* It may be a little bit slower, though
* In general, pandas tries to do things more efficiently than a plain `for` loop (vectorised operations are planned for a later video in this series)

In [12]:
for col in df.select_dtypes("object").columns:
    df[col] = df[col].str.lower().str.strip()
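
The two approaches from this exercise can be checked against each other on a small frame. This is a sketch under a few assumptions: `make_demo` and `clean` are helper names invented here, the data is made up, and the `hasattr` check is only there so the code runs on pandas versions both before and after `applymap()` was renamed to `map()`.

```python
import pandas as pd

def make_demo():
    # Made-up frame with one text column and one numeric column
    return pd.DataFrame({
        "title": pd.Series(["  Hello World ", "PANDAS  "], dtype=object),
        "views": [10, 20],
    })

def clean(value):
    # Lowercase and strip strings, leave every other value unchanged
    return value.lower().strip() if isinstance(value, str) else value

# Approach 1: loop over the text columns and use the .str accessor
df_loop = make_demo()
for col in df_loop.select_dtypes("object").columns:
    df_loop[col] = df_loop[col].str.lower().str.strip()

# Approach 2: apply a function to every single value
df_demo = make_demo()
elementwise = df_demo.map if hasattr(df_demo, "map") else df_demo.applymap
df_map = elementwise(clean)

print(df_loop["title"].tolist())  # ['hello world', 'pandas']
print(df_map["title"].tolist())   # ['hello world', 'pandas']
```

The loop only runs once per text column, and each `.str` call then processes the whole column at once.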

In [13]:
df

Unnamed: 0,title,channel title,category id,views,likes,dislikes,comment count,thumbnail link,description
0,"racist superman | rudy mancuso, king bach & le...",rudy mancuso,23.0,3191434.0,146033.0,5339.0,8181.0,https://i.ytimg.com/vi/5qpjk5dgct4/default.jpg,watch my previous video ▶ \n\nsubscribe ► http...
1,which countries are about to collapse?,nowthis world,25.0,544770.0,7848.0,1171.0,3981.0,https://i.ytimg.com/vi/ggvmn66ok_a/default.jpg,"the world at large is improving, but some coun..."
2,how can you control your dreams?,life noggin,27.0,115791.0,9586.0,75.0,2800.0,https://i.ytimg.com/vi/vu14jy3x81a/default.jpg,what if there was a way to control your dreams...
3,is it dangerous to talk to a camera while driv...,tom scott,27.0,144418.0,11758.0,89.0,1014.0,https://i.ytimg.com/vi/_-adhxoblr4/default.jpg,i'm visiting the university of iowa's national...
4,using other people's showers,gus johnson,23.0,33980.0,4884.0,52.0,234.0,https://i.ytimg.com/vi/lz68j2j_gom/default.jpg,why is it so hard to figure out other people's...
...,...,...,...,...,...,...,...,...,...
2242,it’s just water weight | kevin hart: what the ...,lol network,23.0,226849.0,5788.0,130.0,242.0,https://i.ytimg.com/vi/6fi1rznwcdi/default.jpg,fight your way to fitness with a firefighter’s...
2243,big shaq - man don't dance (official music video),michael dapaah,10.0,5199107.0,328699.0,13876.0,30797.0,https://i.ytimg.com/vi/qqn6tsbyrh8/default.jpg,man don't dance performed by big shaq (michael...
2244,stampede - alexander jean ft. lindsey stirling,lindsey stirling,10.0,296615.0,38671.0,463.0,2348.0,https://i.ytimg.com/vi/-9rddewzvsu/default.jpg,lindsey stirling & evanescence co-headline sum...
2245,crayola makeup | hit or miss?,laura lee,26.0,607422.0,26166.0,895.0,3517.0,https://i.ytimg.com/vi/ds5thrl-4kc/default.jpg,"hey larlees, todays video is me testing crayol..."
