# Gendering the talks 

One of our areas of interest is whether TED talks by men and women differ. If they do, we would like to see whether we can detect the gender of a speaker from the words in a TED talk transcript. To pursue these questions, we begin by "gendering" talks when possible.

In this notebook, we use the genders of the speakers to "gender" the TED talks themselves. A talk with one speaker inherits the gender of the speaker. For a talk with two speakers of the same gender, we proceed as if the talk had one speaker. Talks with two speakers whose genders differ are set aside.
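The rule above can be sketched as a small function. This is only an illustration: the name `gender_talk` is hypothetical, and treating a speaker with an unknown gender (`None`) as a reason to set the talk aside is an assumption the text does not address.

```python
def gender_talk(speaker_genders):
    """Return the gender assigned to a talk, or None if the talk is set aside.

    speaker_genders: list with one gender label per speaker of the talk.
    """
    # Only talks with one or two speakers are gendered in this notebook.
    if len(speaker_genders) not in (1, 2):
        return None
    # Assumption: an unknown speaker gender means the talk is set aside.
    if None in speaker_genders:
        return None
    # A talk inherits the gender only when all of its speakers share it.
    if len(set(speaker_genders)) == 1:
        return speaker_genders[0]
    return None


print(gender_talk(["male"]))
print(gender_talk(["female", "female"]))
print(gender_talk(["male", "female"]))
```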

In [2]:
import pandas as pd
import csv
import string

In [3]:
# Load the gendered speaker file:
speakers = pd.read_csv("speakers_with_gender.csv")

In [4]:
ted_only = pd.read_csv('../data/Release_v0/TEDonly_final.csv')
ted_plus = pd.read_csv('../data/Release_v0/TEDplus_final.csv')

In [5]:
ted_only.head()

Unnamed: 0.2,Unnamed: 0,Unnamed: 0.1,Talk_ID,public_url,headline,description,event,duration,published,tags,views,text,speaker_1,speaker_2,speaker_3,speaker_4
0,0,0,1,https://www.ted.com/talks/al_gore_on_averting_...,Averting the climate crisis,With the same humor and humanity he exuded in ...,TED2006,0:16:17,6/27/06,"alternative energy,cars,global issues,climate ...",3266733,"Thank you so much, Chris. And it's truly a g...",Al Gore,,,
1,1,1,7,https://www.ted.com/talks/david_pogue_says_sim...,Simplicity sells,New York Times columnist David Pogue takes aim...,TED2006,0:21:26,6/27/06,"simplicity,entertainment,interface design,soft...",1702201,"(Music: ""The Sound of Silence,"" Simon & Garf...",David Pogue,,,
2,2,2,53,https://www.ted.com/talks/majora_carter_s_tale...,Greening the ghetto,"In an emotionally charged talk, MacArthur-winn...",TED2006,0:18:36,6/27/06,"MacArthur grant,cities,green,activism,politics...",2000421,If you're here today — and I'm very happy th...,Majora Carter,,,
3,3,3,66,https://www.ted.com/talks/ken_robinson_says_sc...,Do schools kill creativity?,Sir Ken Robinson makes an entertaining and pro...,TED2006,0:19:24,6/27/06,"children,teaching,creativity,parenting,culture...",51614087,Good morning. How are you? (Laughter) ...,Ken Robinson,,,
4,4,4,92,https://www.ted.com/talks/hans_rosling_shows_t...,The best stats you've ever seen,You've never seen data presented like this. Wi...,TED2006,0:19:50,6/27/06,"demo,Asia,global issues,visualizations,global ...",12662135,"About 10 years ago, I took on the task to te...",Hans Rosling,,,


In [6]:
# Set the talk ID as the index and drop the unnecessary first two columns: 
ted_only = ted_only.set_index('Talk_ID')
ted_only = ted_only.drop(columns = ['Unnamed: 0', 'Unnamed: 0.1'])

In [7]:
ted_only.head()

Talk_ID,public_url,headline,description,event,duration,published,tags,views,text,speaker_1,speaker_2,speaker_3,speaker_4
1,https://www.ted.com/talks/al_gore_on_averting_...,Averting the climate crisis,With the same humor and humanity he exuded in ...,TED2006,0:16:17,6/27/06,"alternative energy,cars,global issues,climate ...",3266733,"Thank you so much, Chris. And it's truly a g...",Al Gore,,,
7,https://www.ted.com/talks/david_pogue_says_sim...,Simplicity sells,New York Times columnist David Pogue takes aim...,TED2006,0:21:26,6/27/06,"simplicity,entertainment,interface design,soft...",1702201,"(Music: ""The Sound of Silence,"" Simon & Garf...",David Pogue,,,
53,https://www.ted.com/talks/majora_carter_s_tale...,Greening the ghetto,"In an emotionally charged talk, MacArthur-winn...",TED2006,0:18:36,6/27/06,"MacArthur grant,cities,green,activism,politics...",2000421,If you're here today — and I'm very happy th...,Majora Carter,,,
66,https://www.ted.com/talks/ken_robinson_says_sc...,Do schools kill creativity?,Sir Ken Robinson makes an entertaining and pro...,TED2006,0:19:24,6/27/06,"children,teaching,creativity,parenting,culture...",51614087,Good morning. How are you? (Laughter) ...,Ken Robinson,,,
92,https://www.ted.com/talks/hans_rosling_shows_t...,The best stats you've ever seen,You've never seen data presented like this. Wi...,TED2006,0:19:50,6/27/06,"demo,Asia,global issues,visualizations,global ...",12662135,"About 10 years ago, I took on the task to te...",Hans Rosling,,,


First, we want to know which talks have only one speaker. To do this, we select only the rows that have `NaN` for the second speaker.

In [8]:
talk1speak = ted_only[ted_only['speaker_2'].isnull()]

# https://stackoverflow.com/questions/43831539/how-to-select-rows-with-nan-in-particular-column

In [9]:
talk1speak.shape

(966, 13)

We have 966 talks that have only one speaker. 


To figure out which talks have exactly two speakers, we select the rows that have a second speaker, but **not** a third one:  

In [10]:
s2temp = ted_only[ted_only['speaker_3'].isnull()]
talk2speak = s2temp[~s2temp['speaker_2'].isnull()]

In [11]:
talk2speak.shape

(25, 13)

We have 25 talks that have two speakers. 


To figure out which talks have three speakers, we select the rows that have a third speaker, but **not** a fourth one:  

In [12]:
s3temp = ted_only[ted_only['speaker_4'].isnull()]
talk3speak = s3temp[~s3temp['speaker_3'].isnull()]

In [13]:
talk3speak.shape

(0, 13)

There are no talks with exactly three speakers. 


To figure out which talks have four speakers, we select the rows that have a fourth speaker:  

In [14]:
talk4speak = ted_only[~ted_only['speaker_4'].isnull()]


In [15]:
talk4speak.shape

(1, 13)

As a quick check, we confirm that the total number of rows in the `talkNspeak` data frames equals the number of rows in `ted_only`:

In [16]:
ted_only.shape[0] == talk1speak.shape[0] + talk2speak.shape[0] + talk3speak.shape[0] + talk4speak.shape[0]

True
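The same breakdown can also be obtained in one pass by counting the non-missing speaker columns in each row. The sketch below uses a toy data frame with placeholder names rather than `ted_only`, and it assumes that speaker slots are filled from left to right with no gaps.

```python
import pandas as pd

# Toy stand-in for ted_only; the speaker names are placeholders, not real data.
toy = pd.DataFrame(
    {
        "speaker_1": ["A", "B", "C", "D"],
        "speaker_2": [None, "E", None, "F"],
        "speaker_3": [None, None, None, "G"],
        "speaker_4": [None, None, None, "H"],
    }
)

speaker_cols = ["speaker_1", "speaker_2", "speaker_3", "speaker_4"]

# Number of filled speaker slots per talk.
n_speakers = toy[speaker_cols].notna().sum(axis=1)

# How many talks have 1, 2, 3, or 4 speakers.
counts = n_speakers.value_counts().sort_index()
print(counts)
```

Because every row receives exactly one count, the per-group totals necessarily add up to the number of rows, which is the property the check above confirms for the separate filters.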

## Adding gender to the talks

Now that we have separated the talks into smaller data frames by number of speakers, we can "gender" the talks.

In [17]:
gender_slice = speakers[["speaker","Gender_handcheck"]]

In [18]:
gender_slice

Unnamed: 0,speaker,Gender_handcheck
0,Al Gore,male
1,David Pogue,male
2,Majora Carter,female
3,Ken Robinson,male
4,Hans Rosling,male
5,Tony Robbins,male
6,Joshua Prince-Ramus,male
7,Julia Sweeney,female
8,Rick Warren,male
9,Dan Dennett,male


In [19]:
# We want to take speaker gender from speakers and put it into the talks

talk1speak = talk1speak.merge(gender_slice, left_on = "speaker_1", right_on = "speaker", how = "left")


In [20]:
talk1speak

Unnamed: 0,public_url,headline,description,event,duration,published,tags,views,text,speaker_1,speaker_2,speaker_3,speaker_4,speaker,Gender_handcheck
0,https://www.ted.com/talks/al_gore_on_averting_...,Averting the climate crisis,With the same humor and humanity he exuded in ...,TED2006,0:16:17,6/27/06,"alternative energy,cars,global issues,climate ...",3266733,"Thank you so much, Chris. And it's truly a g...",Al Gore,,,,Al Gore,male
1,https://www.ted.com/talks/david_pogue_says_sim...,Simplicity sells,New York Times columnist David Pogue takes aim...,TED2006,0:21:26,6/27/06,"simplicity,entertainment,interface design,soft...",1702201,"(Music: ""The Sound of Silence,"" Simon & Garf...",David Pogue,,,,David Pogue,male
2,https://www.ted.com/talks/majora_carter_s_tale...,Greening the ghetto,"In an emotionally charged talk, MacArthur-winn...",TED2006,0:18:36,6/27/06,"MacArthur grant,cities,green,activism,politics...",2000421,If you're here today — and I'm very happy th...,Majora Carter,,,,Majora Carter,female
3,https://www.ted.com/talks/ken_robinson_says_sc...,Do schools kill creativity?,Sir Ken Robinson makes an entertaining and pro...,TED2006,0:19:24,6/27/06,"children,teaching,creativity,parenting,culture...",51614087,Good morning. How are you? (Laughter) ...,Ken Robinson,,,,Ken Robinson,male
4,https://www.ted.com/talks/hans_rosling_shows_t...,The best stats you've ever seen,You've never seen data presented like this. Wi...,TED2006,0:19:50,6/27/06,"demo,Asia,global issues,visualizations,global ...",12662135,"About 10 years ago, I took on the task to te...",Hans Rosling,,,,Hans Rosling,male
5,https://www.ted.com/talks/tony_robbins_asks_wh...,Why we do what we do,"Tony Robbins discusses the ""invisible forces"" ...",TED2006,0:21:45,6/27/06,"entertainment,goal-setting,potential,psycholog...",22368699,Thank you. I have to tell you I'm both chall...,Tony Robbins,,,,Tony Robbins,male
6,https://www.ted.com/talks/joshua_prince_ramus_...,Behind the design of Seattle's library,Architect Joshua Prince-Ramus takes the audien...,TED2006,0:19:58,7/10/06,"library,architecture,design,culture,collaboration",1042335,I'm going to present three projects in rapid...,Joshua Prince-Ramus,,,,Joshua Prince-Ramus,male
7,https://www.ted.com/talks/julia_sweeney_on_let...,Letting go of God,When two young Mormon missionaries knock on Ju...,TED2006,0:16:32,7/10/06,"atheism,Christianity,religion,God,comedy,humor...",3903747,"On September 10, the morning of my seventh b...",Julia Sweeney,,,,Julia Sweeney,female
8,https://www.ted.com/talks/rick_warren_on_a_lif...,A life of purpose,"Pastor Rick Warren, author of ""The Purpose-Dri...",TED2006,0:21:02,7/18/06,"Christianity,philanthropy,religion,God,happine...",3361934,"I'm often asked, ""What surprised you about t...",Rick Warren,,,,Rick Warren,male
9,https://www.ted.com/talks/dan_dennett_s_respon...,Let's teach religion -- all religion -- in sch...,Philosopher Dan Dennett calls for religion -- ...,TED2006,0:24:45,7/18/06,"atheism,consciousness,evolution,philosophy,rel...",2751013,It's wonderful to be back. I love this wonde...,Dan Dennett,,,,Dan Dennett,male
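A left merge keeps every talk even when its speaker has no match in the gender table, in which case the gender is `NaN`. One way to surface such rows is the `indicator` option of `merge`. The sketch below uses toy data with hypothetical names; it is not the notebook's actual data.

```python
import pandas as pd

# Toy talks and gender lookup; all names are hypothetical.
talks = pd.DataFrame(
    {
        "headline": ["T1", "T2", "T3"],
        "speaker_1": ["Ann Example", "Bob Example", "Cy Example"],
    }
)
lookup = pd.DataFrame(
    {
        "speaker": ["Ann Example", "Bob Example"],
        "Gender_handcheck": ["female", "male"],
    }
)

# indicator=True adds a "_merge" column recording where each row came from.
merged = talks.merge(
    lookup, left_on="speaker_1", right_on="speaker", how="left", indicator=True
)

# Talks whose speaker was not found in the lookup table.
unmatched = merged[merged["_merge"] == "left_only"]
print(unmatched[["headline", "speaker_1"]])
```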


In [21]:
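# Preview dropping the duplicate "speaker" column and renaming the last column
# to "talk_gender". Neither call is assigned back, so talk1speak itself is
# unchanged; only the result of the rename (the last expression) is displayed.

talk1speak.drop(columns = ['speaker'])
talk1speak.rename(columns={"Gender_handcheck": "talk_gender"})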

Unnamed: 0,public_url,headline,description,event,duration,published,tags,views,text,speaker_1,speaker_2,speaker_3,speaker_4,speaker,talk_gender
0,https://www.ted.com/talks/al_gore_on_averting_...,Averting the climate crisis,With the same humor and humanity he exuded in ...,TED2006,0:16:17,6/27/06,"alternative energy,cars,global issues,climate ...",3266733,"Thank you so much, Chris. And it's truly a g...",Al Gore,,,,Al Gore,male
1,https://www.ted.com/talks/david_pogue_says_sim...,Simplicity sells,New York Times columnist David Pogue takes aim...,TED2006,0:21:26,6/27/06,"simplicity,entertainment,interface design,soft...",1702201,"(Music: ""The Sound of Silence,"" Simon & Garf...",David Pogue,,,,David Pogue,male
2,https://www.ted.com/talks/majora_carter_s_tale...,Greening the ghetto,"In an emotionally charged talk, MacArthur-winn...",TED2006,0:18:36,6/27/06,"MacArthur grant,cities,green,activism,politics...",2000421,If you're here today — and I'm very happy th...,Majora Carter,,,,Majora Carter,female
3,https://www.ted.com/talks/ken_robinson_says_sc...,Do schools kill creativity?,Sir Ken Robinson makes an entertaining and pro...,TED2006,0:19:24,6/27/06,"children,teaching,creativity,parenting,culture...",51614087,Good morning. How are you? (Laughter) ...,Ken Robinson,,,,Ken Robinson,male
4,https://www.ted.com/talks/hans_rosling_shows_t...,The best stats you've ever seen,You've never seen data presented like this. Wi...,TED2006,0:19:50,6/27/06,"demo,Asia,global issues,visualizations,global ...",12662135,"About 10 years ago, I took on the task to te...",Hans Rosling,,,,Hans Rosling,male
5,https://www.ted.com/talks/tony_robbins_asks_wh...,Why we do what we do,"Tony Robbins discusses the ""invisible forces"" ...",TED2006,0:21:45,6/27/06,"entertainment,goal-setting,potential,psycholog...",22368699,Thank you. I have to tell you I'm both chall...,Tony Robbins,,,,Tony Robbins,male
6,https://www.ted.com/talks/joshua_prince_ramus_...,Behind the design of Seattle's library,Architect Joshua Prince-Ramus takes the audien...,TED2006,0:19:58,7/10/06,"library,architecture,design,culture,collaboration",1042335,I'm going to present three projects in rapid...,Joshua Prince-Ramus,,,,Joshua Prince-Ramus,male
7,https://www.ted.com/talks/julia_sweeney_on_let...,Letting go of God,When two young Mormon missionaries knock on Ju...,TED2006,0:16:32,7/10/06,"atheism,Christianity,religion,God,comedy,humor...",3903747,"On September 10, the morning of my seventh b...",Julia Sweeney,,,,Julia Sweeney,female
8,https://www.ted.com/talks/rick_warren_on_a_lif...,A life of purpose,"Pastor Rick Warren, author of ""The Purpose-Dri...",TED2006,0:21:02,7/18/06,"Christianity,philanthropy,religion,God,happine...",3361934,"I'm often asked, ""What surprised you about t...",Rick Warren,,,,Rick Warren,male
9,https://www.ted.com/talks/dan_dennett_s_respon...,Let's teach religion -- all religion -- in sch...,Philosopher Dan Dennett calls for religion -- ...,TED2006,0:24:45,7/18/06,"atheism,consciousness,evolution,philosophy,rel...",2751013,It's wonderful to be back. I love this wonde...,Dan Dennett,,,,Dan Dennett,male
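By default, `drop` and `rename` return new data frames and leave the original untouched. If the goal is to keep the cleaned columns, the result has to be assigned. A minimal sketch on toy data follows; the name `cleaned` and the values are illustrative.

```python
import pandas as pd

# Toy stand-in for the merged talk1speak frame; values are hypothetical.
toy = pd.DataFrame(
    {
        "speaker_1": ["Ann Example", "Bob Example"],
        "speaker": ["Ann Example", "Bob Example"],
        "Gender_handcheck": ["female", "male"],
    }
)

# Chain both steps and keep the result under a new name.
cleaned = toy.drop(columns=["speaker"]).rename(
    columns={"Gender_handcheck": "talk_gender"}
)

print(list(toy.columns))      # the original frame is unchanged
print(list(cleaned.columns))  # the assigned result has the new layout
```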


In [22]:
talk1speak.shape

(968, 15)

Two extra rows were added during this process. This is likely due to speakers having the same name, as noted in the "TED talks as Data" paper.
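A left merge adds rows whenever a key on the left matches more than one row on the right. The toy example below, with hypothetical names, reproduces that effect and shows one way to list the repeated keys in a lookup table.

```python
import pandas as pd

# Toy data: "Ann Example" appears twice in the lookup table.
talks = pd.DataFrame(
    {"headline": ["T1", "T2"], "speaker_1": ["Ann Example", "Bob Example"]}
)
lookup = pd.DataFrame(
    {
        "speaker": ["Ann Example", "Ann Example", "Bob Example"],
        "Gender_handcheck": ["female", "female", "male"],
    }
)

merged = talks.merge(lookup, left_on="speaker_1", right_on="speaker", how="left")

# The merged frame has one more row than the talks frame.
print(len(talks), len(merged))

# keep=False marks every occurrence of a repeated name, not just the later ones.
repeated = lookup[lookup["speaker"].duplicated(keep=False)]
print(repeated)
```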

We need to remove these extra rows:

In [26]:
talk1_dup = talk1speak[talk1speak.duplicated(['headline'])]


In [27]:
talk1_dup


Unnamed: 0,public_url,headline,description,event,duration,published,tags,views,text,speaker_1,speaker_2,speaker_3,speaker_4,speaker,Gender_handcheck
46,https://www.ted.com/talks/neil_gershenfeld_on_...,Unleash your creativity in a Fab Lab,MIT professor Neil Gershenfeld talks about his...,TED2006,0:17:18,2/19/07,"code,engineering,materials,computers,science,i...",744087,This meeting has really been about a digital...,Neil Gershenfeld,,,,Neil Gershenfeld,male
582,https://www.ted.com/talks/amanda_palmer_the_ar...,The art of asking,"Don't make people pay for music, says Amanda P...",TED2013,0:13:47,3/1/13,"entertainment,performance art,business,music",9417001,"(Breathes in) (Breathes out) So, I did...",Amanda Palmer,,,,Amanda Palmer,female


In [30]:
talk1speak[talk1speak["speaker_1"] == "Neil Gershenfeld"]

Unnamed: 0,public_url,headline,description,event,duration,published,tags,views,text,speaker_1,speaker_2,speaker_3,speaker_4,speaker,Gender_handcheck
45,https://www.ted.com/talks/neil_gershenfeld_on_...,Unleash your creativity in a Fab Lab,MIT professor Neil Gershenfeld talks about his...,TED2006,0:17:18,2/19/07,"code,engineering,materials,computers,science,i...",744087,This meeting has really been about a digital...,Neil Gershenfeld,,,,Neil Gershenfeld,male
46,https://www.ted.com/talks/neil_gershenfeld_on_...,Unleash your creativity in a Fab Lab,MIT professor Neil Gershenfeld talks about his...,TED2006,0:17:18,2/19/07,"code,engineering,materials,computers,science,i...",744087,This meeting has really been about a digital...,Neil Gershenfeld,,,,Neil Gershenfeld,male


In [31]:
talk1speak[talk1speak["speaker_1"] == "Amanda Palmer"]

Unnamed: 0,public_url,headline,description,event,duration,published,tags,views,text,speaker_1,speaker_2,speaker_3,speaker_4,speaker,Gender_handcheck
581,https://www.ted.com/talks/amanda_palmer_the_ar...,The art of asking,"Don't make people pay for music, says Amanda P...",TED2013,0:13:47,3/1/13,"entertainment,performance art,business,music",9417001,"(Breathes in) (Breathes out) So, I did...",Amanda Palmer,,,,Amanda Palmer,female
582,https://www.ted.com/talks/amanda_palmer_the_ar...,The art of asking,"Don't make people pay for music, says Amanda P...",TED2013,0:13:47,3/1/13,"entertainment,performance art,business,music",9417001,"(Breathes in) (Breathes out) So, I did...",Amanda Palmer,,,,Amanda Palmer,female


In [33]:
speakers[speakers["speaker"] == "Amanda Palmer"]

Unnamed: 0,speaker,occupation,introduction,profile,Gender_auto,MaleScore,FemaleScore,NonBinaryScore,Gender_handcheck
441,Amanda Palmer,"Musician, blogger",Alt-rock icon Amanda Fucking Palmer believes w...,Why you should listen\nAmanda Palmer commands ...,female,0.0,15.0,4.0,female
665,Amanda Palmer,musician,Alt-rock icon Amanda Fucking Palmer believes w...,Why you should listen\nAmanda Palmer commands ...,female,0.0,15.0,3.0,female


In [34]:
speakers[speakers["speaker"] == "Neil Gershenfeld"]

Unnamed: 0,speaker,occupation,introduction,profile,Gender_auto,MaleScore,FemaleScore,NonBinaryScore,Gender_handcheck
757,Neil Gershenfeld,"Physicist, personal fab pioneer",As Director of MIT’s Center for Bits and Atoms...,Why you should listen\nMIT's Neil Gershenfeld ...,undetected,2.0,0.0,2.0,male
803,Neil Gershenfeld,Physicist,As Director of MIT’s Center for Bits and Atoms...,Why you should listen\nMIT's Neil Gershenfeld ...,undetected,2.0,0.0,2.0,male


Interestingly, the cause is that the speakers dataset contains two copies of the same person. Because the merged rows are identical, we can drop either copy of the repeated rows in `talk1speak`:

In [35]:
talk1speak = talk1speak.drop_duplicates()
talk1speak.shape

(966, 15)
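An alternative to cleaning up after the merge is to deduplicate the lookup table first and ask pandas to verify that each talk matches at most one speaker row. This is a sketch on toy data with hypothetical names, not the approach taken in this notebook.

```python
import pandas as pd

# Toy data: the lookup table lists "Ann Example" twice.
talks = pd.DataFrame(
    {"headline": ["T1", "T2"], "speaker_1": ["Ann Example", "Bob Example"]}
)
lookup = pd.DataFrame(
    {
        "speaker": ["Ann Example", "Ann Example", "Bob Example"],
        "Gender_handcheck": ["female", "female", "male"],
    }
)

# With repeated keys on the right, validate="many_to_one" raises an error.
try:
    talks.merge(
        lookup, left_on="speaker_1", right_on="speaker",
        how="left", validate="many_to_one",
    )
    raised = False
except pd.errors.MergeError:
    raised = True
print(raised)

# Keep one row per (speaker, gender) pair, then merge with the same check.
lookup_unique = lookup.drop_duplicates(subset=["speaker", "Gender_handcheck"])
merged = talks.merge(
    lookup_unique, left_on="speaker_1", right_on="speaker",
    how="left", validate="many_to_one",
)
print(len(merged))
```

If two different people shared a name but had different genders, the deduplicated table would still contain a repeated key and the validated merge would raise, flagging a case that needs manual resolution.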

Now that the talks with just one speaker each have been "gendered," we turn our attention to the talks with multiple speakers. We start with those that have two speakers:

In [36]:
talk2speak

Talk_ID,public_url,headline,description,event,duration,published,tags,views,text,speaker_1,speaker_2,speaker_3,speaker_4
118,https://www.ted.com/talks/sergey_brin_and_larr...,The genesis of Google,Google co-founders Larry Page and Sergey Brin ...,TED2004,0:20:33,5/3/07,"web,design,Google,culture,business,technology,...",1529641,Sergey Brin: I want to discuss a question I ...,Sergey Brin,Larry Page,,
222,https://www.ted.com/talks/the_jill_and_julia_show,The Jill and Julia Show,"Two TED favorites, Jill Sobule and Julia Sween...",TED2007,0:06:14,2/20/08,"entertainment,comedy,humor,storytelling,collab...",507130,♫ Jill Sobule: At a conference in Monterey b...,Jill Sobule,Julia Sweeney,,
224,https://www.ted.com/talks/roy_gould_and_curtis...,A preview of the WorldWide Telescope,Educator Roy Gould and researcher Curtis Wong ...,TED2008,0:06:42,2/27/08,"telescopes,demo,astronomy,universe,science,tec...",1043036,"Roy Gould: Less than a year from now, the wo...",Roy Gould,Curtis Wong,,
246,https://www.ted.com/talks/tod_machover_and_dan...,Inventing instruments that unlock new music,Tod Machover of MIT's Media Lab is devoted to ...,TED2008,0:20:41,4/15/08,"demo,entertainment,writing,live music,health c...",519734,The first idea I'd like to suggest is that w...,Tod Machover,Dan Ellsey,,
322,https://www.ted.com/talks/bruno_bowden_folds_w...,Blindfold origami and cello,After Robert Lang's talk on origami at TED2008...,TED2008,0:02:58,8/1/08,"origami,entertainment,cello,music",384129,Hello everyone. And so the two of us are her...,Bruno Bowden,Rufus Cappadocia,,
385,https://www.ted.com/talks/toys_from_the_future,Toys and materials from the future,"The Inventables guys, Zach Kaplan and Keith Sc...",TED2005,0:15:46,10/30/08,"toy,smell,industrial design,design,creativity,...",420887,Zach Kaplan: Keith and I lead a research tea...,Zach Kaplan,Keith Schacht,,
481,https://www.ted.com/talks/pattie_maes_demos_th...,Meet the SixthSense interaction,"This demo -- from Pattie Maes' lab at MIT, spe...",TED2009,0:08:42,3/10/09,"demo,interface design,design,technology",9912033,I've been intrigued by this question of whet...,Pattie Maes,Pranav Mistry,,
881,https://www.ted.com/talks/debate_does_the_worl...,Debate: Does the world need nuclear energy?,Nuclear power: the energy crisis has even die-...,TED2010,0:22:59,6/10/10,"nuclear weapons,wind energy,green,climate chan...",1362908,Chris Anderson: We're having a debate. The d...,Stewart Brand,Mark Z. Jacobson,,
988,https://www.ted.com/talks/david_byrne_sings_no...,"""(Nothing But) Flowers"" with string quartet","David Byrne sings the Talking Heads' 1988 hit,...",TED2010,0:03:15,10/22/10,"garden,future,music,performance,society",665679,(Music) ♫ Here we stand ♫ ♫ Like an Ad...,David Byrne,Thomas Dolby,,
1156,https://www.ted.com/talks/robert_gupta_and_jos...,"On violin and cello, ""Passacaglia""",It's a master class in collaboration as violin...,TED2011,0:09:21,5/27/11,"entertainment,live music,TED Fellows,creativit...",802538,(Music) (Applause) (Music) (Applaus...,Robert Gupta,Joshua Roman,,
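One possible way to apply the two-speaker rule is to look up each speaker's gender and keep it only where both agree. The sketch below uses toy data with hypothetical names and genders; the column name `talk_gender` and the helper names are illustrative.

```python
import pandas as pd

# Toy two-speaker talks; names and genders are hypothetical.
two = pd.DataFrame(
    {
        "headline": ["T1", "T2", "T3"],
        "speaker_1": ["Ann Example", "Bob Example", "Ann Example"],
        "speaker_2": ["Dee Example", "Cy Example", "Bob Example"],
    }
)

# Lookup from speaker name to gender, one entry per name.
gender_of = pd.Series(
    {
        "Ann Example": "female",
        "Dee Example": "female",
        "Bob Example": "male",
        "Cy Example": "male",
    }
)

g1 = two["speaker_1"].map(gender_of)
g2 = two["speaker_2"].map(gender_of)

# Both genders must be known and equal for the talk to be gendered.
same = g1.notna() & (g1 == g2)
two["talk_gender"] = g1.where(same)

gendered = two[same]
set_aside = two[~same]
print(gendered[["headline", "talk_gender"]])
print(set_aside[["headline"]])
```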
