In [my webscrapping repository](https://github.com/kzoldak/webscrapping), I scraped all of the video data from the 17 pages of [TED Talks](https://www.ted.com/talks?event=tedx&sort=newest) and saved it to `ted_talks.csv`. In this notebook I read that data into `pandas` and explore it.

__The videos on the TED Talks site change constantly, so this data is a snapshot from Jan. 21, 2019, around 2 pm PST.__

In [1]:
import pandas as pd

In [6]:
data = pd.read_csv('/Users/kimzoldak/Github/webscrapping/ted_talks.csv')

In [7]:
# read_csv already returns a DataFrame, so this conversion is redundant.
data = pd.DataFrame(data)

In [9]:
data.head()

Unnamed: 0,page,pagesource,title,speaker,dateposted,duration,link
0,1,https://www.ted.com/talks?event=tedx&amp;sort=...,The ruralities of autism,Amy Price Azano,Jan 2019,12:31,https://www.ted.com/talks/amy_price_azano_the_...
1,1,https://www.ted.com/talks?event=tedx&amp;sort=...,How stigma shaped modern medicine,Nathalia Holt,Jan 2019,15:30,https://www.ted.com/talks/nathalia_holt_how_st...
2,1,https://www.ted.com/talks?event=tedx&amp;sort=...,3 ways to build a happy marriage and avoid div...,George Blair-West,Jan 2019,11:13,https://www.ted.com/talks/george_blair_west_3_...
3,1,https://www.ted.com/talks?event=tedx&amp;sort=...,A mother and son's photographic journey throug...,Tony Luciani,Jan 2019,13:32,https://www.ted.com/talks/tony_luciani_a_mothe...
4,1,https://www.ted.com/talks?event=tedx&amp;sort=...,5 ways to share math with kids,Dan Finkel,Jan 2019,14:41,https://www.ted.com/talks/dan_finkel_5_ways_to...
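The `Unnamed: 0` column in the output above is what pandas typically produces when a CSV was written with its index and then read back without declaring it. A minimal sketch, assuming that is what happened here, and using a made-up two-row CSV rather than the real `ted_talks.csv`:

```python
import io
import pandas as pd

# Hypothetical stand-in for ted_talks.csv: the first, unnamed column is a saved index.
csv_text = ",page,speaker\n0,1,Amy Price Azano\n1,1,Nathalia Holt\n"

# Default read: the unnamed first column comes back as 'Unnamed: 0'.
raw = pd.read_csv(io.StringIO(csv_text))

# index_col=0 tells pandas to use that first column as the index instead.
clean = pd.read_csv(io.StringIO(csv_text), index_col=0)

print(list(raw.columns))
print(list(clean.columns))
```

Whether to drop the column depends on whether the saved index is needed later; the rest of this notebook keeps it.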


In [10]:
data.shape

(604, 7)

In [13]:
# 8 speakers have two talks listed on the site.
# Some of the talks are joint talks, though.
data.speaker.value_counts()

Hans Rosling                                 2
Sebastian Wernicke                           2
Melinda Gates                                2
Hannah Fry                                   2
Ash Beckham                                  2
Mandy Len Catron                             2
Julia Shaw                                   2
Mikko Hypponen                               2
Yann Dall'Aglio                              1
Christine Porath                             1
Louie Schwartzberg                           1
Yale Fox                                     1
Mary Maker                                   1
Rives                                        1
Karima Bennoune                              1
Joe von Fischer                              1
James B. Glattfelder                         1
Kelly Richmond Pope                          1
David Anderson                               1
Baroness Beeban Kidron                       1
Kandice Sumner                               1
James Beacham
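Rather than reading the repeat speakers off the top of the printout, the counts can be filtered directly. A small sketch on a made-up speaker column (the variable names `speakers`, `counts`, and `repeat_speakers` are illustrative, not from the notebook):

```python
import pandas as pd

# Made-up speaker column, not the scraped data.
speakers = pd.Series(
    ["Hans Rosling", "Hannah Fry", "Hans Rosling", "Rives", "Hannah Fry"]
)

counts = speakers.value_counts()

# Boolean-mask the counts to keep only names that occur more than once.
repeat_speakers = counts[counts > 1]

print(len(repeat_speakers))
print(sorted(repeat_speakers.index))
```

On the real data the same filter would be applied to `data.speaker.value_counts()`.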

In [163]:
data.speaker.str.contains('and', case=False).sum()
# 25 speaker entries contain the substring 'and'.
# case=False makes the match case-insensitive so every variant is caught.
# The problem is that this also catches people with 'and' inside their name.

25

In [62]:
# Putting spaces around the word restricts the match to joint speakers.
data.speaker.str.contains(' and ', case=False).sum()

6
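The drop from 25 to 6 comes from requiring spaces around the word. A short sketch of the difference on three names taken from outputs in this notebook (the Series `names` here is a hand-built sample, not the full column):

```python
import pandas as pd

names = pd.Series(
    ["David Anderson", "Sandra Fisher-Martins", "Trevor Copp and Jeff Fox"]
)

# Substring match anywhere in the name, ignoring case.
loose = names.str.contains("and", case=False)

# Match only when 'and' stands alone between two spaces.
strict = names.str.contains(" and ", case=False)

print(loose.tolist())
print(strict.tolist())
```

The loose pattern flags all three names, while the strict one flags only the joint talk.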

In [114]:
data[data.speaker.str.contains(' and ', case=False)]

Unnamed: 0,page,pagesource,title,speaker,dateposted,duration,link
111,4,https://www.ted.com/talks?event=tedx&page=4&so...,The virginity fraud,Nina Dølvik Brochmann and Ellen Støkken Dahl,"Informative, Courageous",11:41,https://www.ted.com/talks/nina_dolvik_brochman...
149,5,https://www.ted.com/talks?event=tedx&page=5&so...,How our friendship survives our opposing politics,Caitlin Quattromani and Lauran Arledge,"Inspiring, Courageous",0:00,https://www.ted.com/talks/caitlin_quattromani_...
226,7,https://www.ted.com/talks?event=tedx&page=7&so...,Ballroom dance that breaks gender roles,Trevor Copp and Jeff Fox,"Beautiful, Inspiring",15:33,https://www.ted.com/talks/trevor_copp_jeff_fox...
338,10,https://www.ted.com/talks?event=tedx&page=10&s...,Meet the robots for humanity,Henry Evans and Chad Jenkins,"Inspiring, Fascinating",10:21,https://www.ted.com/talks/henry_evans_and_chad...
345,10,https://www.ted.com/talks?event=tedx&page=10&s...,A mouse. A laser beam. A manipulated memory.,Steve Ramirez and Xu Liu,"Fascinating, Informative",15:25,https://www.ted.com/talks/steve_ramirez_and_xu...
348,10,https://www.ted.com/talks?event=tedx&page=10&s...,In the key of genius,Derek Paravicini and Adam Ockelford,"Inspiring, Beautiful",19:38,https://www.ted.com/talks/derek_paravicini_and...


__From `.value_counts()`, we know some entries still use `+` between the speaker names, for example the talk by `Charles Hazlewood + British Paraorchestra`.__

In [76]:
# '+' is a regex metacharacter, so escape it with a backslash (in a raw string).
data.speaker.str.contains(r'\+').sum()

3

In [115]:
data[data.speaker.str.contains(r'\+')]

Unnamed: 0,page,pagesource,title,speaker,dateposted,duration,link
411,12,https://www.ted.com/talks?event=tedx&page=12&s...,How to step up in the face of disaster,Caitria + Morgan O'Neill,"Inspiring, Informative",9:23,https://www.ted.com/talks/caitria_and_morgan_o...
495,14,https://www.ted.com/talks?event=tedx&page=14&s...,The debut of the British Paraorchestra,Charles Hazlewood + British Paraorchestra,"Inspiring, Beautiful",13:36,https://www.ted.com/talks/the_debut_of_the_bri...
529,15,https://www.ted.com/talks?event=tedx&page=15&s...,What we learned from 5 million books,Jean-Baptiste Michel + Erez Lieberman Aiden,"Funny, Fascinating",14:08,https://www.ted.com/talks/what_we_learned_from...


In [120]:
# '&' is not a regex metacharacter, so it needs no backslash.
data.speaker.str.contains('&').sum()

0
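There is more than one way to search for a literal `+` with `str.contains`. The sketch below compares three of them on a hand-built sample; which one to prefer is a matter of taste, and the variable names are illustrative:

```python
import re
import pandas as pd

names = pd.Series(
    ["Caitria + Morgan O'Neill", "Ellen Gustafson", "George Blair-West"]
)

# 1. Escape the metacharacter yourself, inside a raw string.
escaped = names.str.contains(r"\+")

# 2. Turn regex handling off and search for the plain substring.
literal = names.str.contains("+", regex=False)

# 3. Let re.escape add the backslash for you.
auto = names.str.contains(re.escape("+"))

print(escaped.tolist())
print(literal.tolist())
print(auto.tolist())
```

All three give the same mask; `regex=False` avoids having to remember which characters are special.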

__To get all entries that use either separator:__

In [116]:
joint_speaker_talks = data[(data.speaker.str.contains(" and ", case=False)) | (data.speaker.str.contains(r"\+"))]

In [117]:
joint_speaker_talks

Unnamed: 0,page,pagesource,title,speaker,dateposted,duration,link
111,4,https://www.ted.com/talks?event=tedx&page=4&so...,The virginity fraud,Nina Dølvik Brochmann and Ellen Støkken Dahl,"Informative, Courageous",11:41,https://www.ted.com/talks/nina_dolvik_brochman...
149,5,https://www.ted.com/talks?event=tedx&page=5&so...,How our friendship survives our opposing politics,Caitlin Quattromani and Lauran Arledge,"Inspiring, Courageous",0:00,https://www.ted.com/talks/caitlin_quattromani_...
226,7,https://www.ted.com/talks?event=tedx&page=7&so...,Ballroom dance that breaks gender roles,Trevor Copp and Jeff Fox,"Beautiful, Inspiring",15:33,https://www.ted.com/talks/trevor_copp_jeff_fox...
338,10,https://www.ted.com/talks?event=tedx&page=10&s...,Meet the robots for humanity,Henry Evans and Chad Jenkins,"Inspiring, Fascinating",10:21,https://www.ted.com/talks/henry_evans_and_chad...
345,10,https://www.ted.com/talks?event=tedx&page=10&s...,A mouse. A laser beam. A manipulated memory.,Steve Ramirez and Xu Liu,"Fascinating, Informative",15:25,https://www.ted.com/talks/steve_ramirez_and_xu...
348,10,https://www.ted.com/talks?event=tedx&page=10&s...,In the key of genius,Derek Paravicini and Adam Ockelford,"Inspiring, Beautiful",19:38,https://www.ted.com/talks/derek_paravicini_and...
411,12,https://www.ted.com/talks?event=tedx&page=12&s...,How to step up in the face of disaster,Caitria + Morgan O'Neill,"Inspiring, Informative",9:23,https://www.ted.com/talks/caitria_and_morgan_o...
495,14,https://www.ted.com/talks?event=tedx&page=14&s...,The debut of the British Paraorchestra,Charles Hazlewood + British Paraorchestra,"Inspiring, Beautiful",13:36,https://www.ted.com/talks/the_debut_of_the_bri...
529,15,https://www.ted.com/talks?event=tedx&page=15&s...,What we learned from 5 million books,Jean-Baptiste Michel + Erez Lieberman Aiden,"Funny, Fascinating",14:08,https://www.ted.com/talks/what_we_learned_from...
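The two masks can also be folded into one regular expression using alternation. This is an alternative sketch, not the approach used in the cell above, and it runs on a hand-built sample rather than the scraped data:

```python
import pandas as pd

names = pd.Series(
    [
        "Henry Evans and Chad Jenkins",
        "Caitria + Morgan O'Neill",
        "David Anderson",
        "George Blair-West",
    ]
)

# ' and ' between spaces OR a literal plus sign, in a single pattern.
pattern = r" and |\+"
joint_mask = names.str.contains(pattern, case=False)

print(names[joint_mask].tolist())
```

A single pattern is easier to extend if another separator turns up later.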


In [142]:
chars = ['+','-','/','@','!','#','$','%','^','&','*','_','=']

In [125]:
name

'Amy Price Azano'

In [160]:
any(i in '613 mitzvahs' for i in ('18','36','613'))

True

In [161]:
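# ('and',) is a one-element tuple; ('and') is just the string 'and' and would be iterated letter by letter.
any(i in 'Kim and Derek' for i in ('and',))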

True
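The parentheses in `('and')` do not make a tuple, so the generator loops over the letters `'a'`, `'n'`, `'d'` instead of the word. A short sketch of how that changes the answer for a name from this dataset that does not contain "and":

```python
name = "Dan Finkel"

# ('and') is the string 'and': this asks whether 'a', 'n', or 'd' occurs in the name.
by_letter = any(i in name for i in ('and'))

# ('and',) is a one-element tuple: this asks whether the word 'and' occurs in the name.
by_word = any(i in name for i in ('and',))

print(by_letter, by_word)
```

The trailing comma is what creates the tuple.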

In [158]:
# Print and count every speaker entry that contains one of the characters in `chars`.
n = 0
for name in names:
    if any(c in name for c in chars):
        print(name)
        n += 1
print(n)

George Blair-West
Jordan Wirfs-Brock
Elise Payzan-LeNestour
Antón García-Abril
Juan López-Aranguren
Natasha Hurley-Walker
Aala El-Khani
Emily Parsons-Lord
Sofia Jawed-Wessel
Jean-Paul Mari
Yassmin Abdel-Magied
Alex Wissner-Gross
Young-ha Kim
Fahad Al-Attiya
Caitria + Morgan O'Neill
Noah Wilson-Rich
John Graham-Cumming
Shereen El-Feki
Charles Hazlewood + British Paraorchestra
Sandra Fisher-Martins
Guy-Philippe Goldstein
Jean-Baptiste Michel + Erez Lieberman Aiden
Ali Carr-Chellman
Ellen Dunham-Jones
Joshua Prince-Ramus


In [153]:
l = [1, 3, 4, 0]
print(any(l))

l = [0, False]
print(any(l))

l = [0, False, 5]
print(any(l))

True
False
True


In [123]:
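names = data.speaker.tolist()

# Attempted confirmation that there are 6. This raises the TypeError below
# because any(chars) is a single bool, so the test becomes `True in name`.
for name in names:
    if any(chars) in name: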
        print(name)

TypeError: 'in <string>' requires string as left operand, not bool
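The error arises because `any(chars)` is evaluated first and collapses the whole list to one boolean. A sketch of the working form, using a shortened, illustrative character list (no `'-'`, to avoid hyphenated surnames) and three sample names:

```python
chars = ['+', '&', '/']  # shortened illustrative list, not the full one from the notebook
names = [
    "Amy Price Azano",
    "Caitria + Morgan O'Neill",
    "Steve Ramirez and Xu Liu",
]

# any(chars) only asks "is any item in the list truthy?", which is True here.
flag = any(chars)

# Test each character against the name, then reduce with any().
matches = [name for name in names if any(c in name for c in chars)]

print(flag)
print(matches)
```

The generator expression moves the `in` test inside `any()`, so every character is checked against every name.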

In [72]:
names = data.speaker.tolist()

# we find 3 entries this way.
for name in names:
    if ' + ' in name:
        print(name)

Caitria + Morgan O'Neill
Charles Hazlewood + British Paraorchestra
Jean-Baptiste Michel + Erez Lieberman Aiden


In [75]:
# Escaping the plus sign finds the 3 entries printed above.
data.speaker.str.contains(r'\+').sum()

3

In [60]:
data

Unnamed: 0,page,pagesource,title,speaker,dateposted,duration,link
0,1,https://www.ted.com/talks?event=tedx&amp;sort=...,The ruralities of autism,Amy Price Azano,Jan 2019,12:31,https://www.ted.com/talks/amy_price_azano_the_...
1,1,https://www.ted.com/talks?event=tedx&amp;sort=...,How stigma shaped modern medicine,Nathalia Holt,Jan 2019,15:30,https://www.ted.com/talks/nathalia_holt_how_st...
2,1,https://www.ted.com/talks?event=tedx&amp;sort=...,3 ways to build a happy marriage and avoid div...,George Blair-West,Jan 2019,11:13,https://www.ted.com/talks/george_blair_west_3_...
3,1,https://www.ted.com/talks?event=tedx&amp;sort=...,A mother and son's photographic journey throug...,Tony Luciani,Jan 2019,13:32,https://www.ted.com/talks/tony_luciani_a_mothe...
4,1,https://www.ted.com/talks?event=tedx&amp;sort=...,5 ways to share math with kids,Dan Finkel,Jan 2019,14:41,https://www.ted.com/talks/dan_finkel_5_ways_to...
5,1,https://www.ted.com/talks?event=tedx&amp;sort=...,Why is algebra so hard? The answer is surprisi...,Emmanuel Schanzer,Jan 2019,13:51,https://www.ted.com/talks/emmanuel_schanzer_wh...
6,1,https://www.ted.com/talks?event=tedx&amp;sort=...,How math is our real sixth sense,Eddie Woo,Jan 2019,13:13,https://www.ted.com/talks/eddie_woo_mathematic...
7,1,https://www.ted.com/talks?event=tedx&amp;sort=...,The real reason female entrepreneurs get less ...,Dana Kanze,Dec 2018,14:48,https://www.ted.com/talks/dana_kanze_the_real_...
8,1,https://www.ted.com/talks?event=tedx&amp;sort=...,3 kinds of bias that shape your worldview,J. Marshall Shepherd,Dec 2018,12:21,https://www.ted.com/talks/j_marshall_shepherd_...
9,1,https://www.ted.com/talks?event=tedx&amp;sort=...,How storytelling helps parents in prison stay ...,Alan Crickmore,Dec 2018,15:28,https://www.ted.com/talks/alan_crickmore_how_s...


In [46]:
# Unescaped, ' + ' is the regex "one or more spaces, then another space", not a literal plus sign.
joint_speakers_2 = data.speaker.str.contains(' + ')

In [None]:
# Charles Hazlewood + British Paraorchestra 

In [43]:
data[joint_speakers_2]

Unnamed: 0,page,pagesource,title,speaker,dateposted,duration,link
584,17,https://www.ted.com/talks?event=tedx&page=17&s...,Obesity + hunger = 1 global food issue,Ellen Gustafson,"Informative, Persuasive",11:15,https://www.ted.com/talks/ellen_gustafson_obes...
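The unescaped pattern `' + '` explains why none of the three `+` talks show up here: the `+` acts as a quantifier on the preceding space. The sketch below checks this with the standard `re` module. The double-spaced string is a made-up example, and the single hit above would be consistent with row 584 having two consecutive spaces in its speaker field, although the truncated output cannot confirm that:

```python
import re

plus_name = "Charles Hazlewood + British Paraorchestra"
double_space = "Some  Speaker"  # hypothetical name with two consecutive spaces

# ' + ' means: one or more spaces, followed by one more space.
unescaped_on_plus = re.search(" + ", plus_name)
unescaped_on_spaces = re.search(" + ", double_space)

# r' \+ ' means: space, literal plus sign, space.
escaped_on_plus = re.search(r" \+ ", plus_name)

print(unescaped_on_plus is None)
print(unescaped_on_spaces is not None)
print(escaped_on_plus is not None)
```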


In [23]:
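# 8 speaker entries contain the capitalized substring 'And'.
# The match is case-sensitive, so it hits names such as 'David Anderson', not the lowercase ' and ' separator.
data.speaker.str.contains('And').sum()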

8

In [24]:
?data.speaker.str.contains

In [15]:
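# Failed attempt: this pandas version has no DataFrame.map, and `'and' in data.speaker` is a single bool anyway.
data.map('and' in data.speaker)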

AttributeError: 'DataFrame' object has no attribute 'map'

In [14]:
# Caitlin Quattromani and Lauran Arledge
# `in` on a Series checks the index labels, not the values, hence False.
'and ' in data.speaker

False
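The `False` above is expected: the `in` operator on a Series looks at index labels, not at the stored strings. A brief sketch on a two-row sample (the Series `speaker` is hand-built for illustration):

```python
import pandas as pd

speaker = pd.Series(["Caitlin Quattromani and Lauran Arledge", "Eddie Woo"])

# `in` tests membership in the index (0 and 1 here).
label_hit = 0 in speaker
value_as_label = "Eddie Woo" in speaker

# To test the values, look in .values or use the vectorized string methods.
value_hit = "Eddie Woo" in speaker.values
has_and = bool(speaker.str.contains(" and ").any())

print(label_hit, value_as_label, value_hit, has_and)
```

For substring questions like the ones in this notebook, `.str.contains(...)` followed by `.any()` or `.sum()` is the usual route.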