# Analyzing Drag Race Success

<a id="table_of_contents"></a>

### Table of contents

<ol>
  <li><a href="#overview">Situation Overview</a>
  <ul>
   <li><a href='#dataset'>Examine the overall dataset</a></li>
    <li><a href='#cleaning'>Execute cleaning procedures</a></li>
  </ul>
  </li>
  
  <li><a href="#eda">Exploratory analysis</a>
  <ul>
    <li><a href='#eda'>How old are the contestants?</a></li>
    <li><a href='#eda'>What is the average age of all the contestants on this show?</a></li>
     <li><a href='#eda'>How many contestants are participating, and where do they come from?</a></li>
     <li><a href='#eda'>What is the average score?</a></li>
     <li><a href='#eda'>What is the average number of challenges won?</a></li>
    </ul>
  </li>  
  
  <li><a href="#analysis">Data analysis</a>
  <ul>
    <li><a href='#analysis'>Where do the winners come from?</a></li>
    <li><a href='#analysis'>What's the expected number of challenges that the top 4 participants should win, on average, over the course of the competition?</a></li>
    <li><a href='#analysis'>To make it to the top 4, what's the average score a participant should aim for in each episode of the season?</a></li>
    <li><a href='#analysis'>What is the average age of the top 4 participants in the competitions?</a></li>
    <li><a href='#analysis'>Top 4 contestants key elements table </a></li>
    </ul>
  </li>  
  <li><a href="#insights">Insights</a></li>
</ol>


<a id="overview"></a>
## Situation Overview
This project uses the available data to identify the key factors that help a participant reach at least the semifinals of the contest.

<a href="#table_of_contents">Navigate to contents</a>

<a id="dataset"></a>
#### Examine the overall dataset
<br>
<a href="#table_of_contents">Navigate to contents</a>

In [1]:
#load sql extension
%load_ext sql

In [2]:
#connect to mysql database
%sql mysql://root:***@localhost:3306/project

In [3]:
%%sql 
select * from la_mas_draga 
limit 3

 * mysql://root:***@localhost:3306/project
3 rows affected.


Lugar,Participante,Nombre,Lugar de residencia,Edad,Selección,Retos ganados,Resultado,Temporada
1,Alexis 3XL,Itzel Moreno,"Matamoros, Tamaulipas",28,Audiciones MTY,2,Ganadora,2
2,Sophia Jiménez,Guillermo Jiménez,"Guadalajara, Jal.",31,Audiciones GDL,3,Finalistas,2
3,Gvajardo,Pablo Guajardo,"Monterrey, N.L.",29,Secretísima,1,Finalistas,2


In [4]:
%%sql 
select * from la_mas_draga_scores
limit 3

 * mysql://root:***@localhost:3306/project
3 rows affected.


Concursante,Episodio,Nombre_de_episodio,Progreso,Calificacion
Aisha Dollkills,1,Artesanal,MENOS,7
Aisha Dollkills,2,Juguete,SALV,14
Aisha Dollkills,3,A Color,ELIM,12


<a id="cleaning"></a>
#### Execute cleaning procedures
<br>
<a href="#table_of_contents">Navigate to contents</a>

In [38]:
%%sql 
select lugar, participante, temporada
from la_mas_draga
where participante = "Paper cut"

 * mysql://root:***@localhost:3306/project
2 rows affected.


lugar,participante,temporada
12,Paper Cut,4
2,Paper Cut,5


In [39]:
%%sql 
select Concursante, episodio, progreso, calificacion
from la_mas_draga_scores
where concursante = "Paper cut"

 * mysql://root:***@localhost:3306/project
40 rows affected.


Concursante,episodio,progreso,calificacion
Paper Cut,1,ALTO,17
Paper Cut,1,ALTO,0
Paper Cut,1,ENTRA,17
Paper Cut,1,ENTRA,0
Paper Cut,2,ALTO,18
Paper Cut,2,ALTO,16
Paper Cut,2,SALV,18
Paper Cut,2,SALV,16
Paper Cut,3,EL MÁS,19
Paper Cut,3,EL MÁS,16
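
The output above shows four rows for Paper Cut in each of episodes 1 and 2. Paper Cut competed in seasons 4 and 5, so up to two rows per episode number could be legitimate; four suggests that values were cross-combined when the data was collected. A minimal sketch of how such groups can be counted, assuming an in-memory SQLite table as a stand-in for the MySQL one (the table layout is simplified, and only the first eight rows shown above are loaded):

```python
import sqlite3

# In-memory SQLite stand-in for the MySQL table (an assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE la_mas_draga_scores ("
    "concursante TEXT, episodio INTEGER, progreso TEXT, calificacion INTEGER)"
)

# The first eight Paper Cut rows from the output above.
rows = [
    ("Paper Cut", 1, "ALTO", 17),
    ("Paper Cut", 1, "ALTO", 0),
    ("Paper Cut", 1, "ENTRA", 17),
    ("Paper Cut", 1, "ENTRA", 0),
    ("Paper Cut", 2, "ALTO", 18),
    ("Paper Cut", 2, "ALTO", 16),
    ("Paper Cut", 2, "SALV", 18),
    ("Paper Cut", 2, "SALV", 16),
]
conn.executemany("INSERT INTO la_mas_draga_scores VALUES (?, ?, ?, ?)", rows)

# Count rows per contestant and episode; keep only groups with more than one row.
duplicates = conn.execute(
    """
    SELECT concursante, episodio, COUNT(*) AS n_rows
    FROM la_mas_draga_scores
    GROUP BY concursante, episodio
    HAVING COUNT(*) > 1
    ORDER BY episodio
    """
).fetchall()
print(duplicates)
```

The same `GROUP BY ... HAVING` query, run without the contestant filter, would list every contestant whose episode rows are inflated.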


In [33]:
%%sql 
alter table la_mas_draga_scores
add column unique_id INT AUTO_INCREMENT PRIMARY KEY

 * mysql://root:***@localhost:3306/project
0 rows affected.


[]

In [41]:
%%sql 
alter table la_mas_draga
add column unique_id INT AUTO_INCREMENT PRIMARY KEY

 * mysql://root:***@localhost:3306/project
0 rows affected.


[]

In [40]:
%%sql 
select * from la_mas_draga_scores
limit 3

 * mysql://root:***@localhost:3306/project
3 rows affected.


Concursante,Episodio,Nombre_de_episodio,Progreso,Calificacion,unique_id
Huma Kyle,6,Del Toro,ABDICÓ,0,1
Aisha Dollkills,1,Artesanal,MENOS,7,2
Huma Kyle,7,Tejocote,FUERA,0,3


In [42]:
%%sql 
select * from la_mas_draga
limit 3

 * mysql://root:***@localhost:3306/project
3 rows affected.


Lugar,Participante,Nombre,Lugar de residencia,Edad,Selección,Retos ganados,Resultado,Temporada,unique_id
1,Alexis 3XL,Itzel Moreno,"Matamoros, Tamaulipas",28,Audiciones MTY,2,Ganadora,2,1
2,Sophia Jiménez,Guillermo Jiménez,"Guadalajara, Jal.",31,Audiciones GDL,3,Finalistas,2,2
3,Gvajardo,Pablo Guajardo,"Monterrey, N.L.",29,Secretísima,1,Finalistas,2,3


In [44]:
%%sql 
select sc.unique_id, d.lugar, d.participante, `lugar de residencia`, d.edad, `retos ganados`, d.resultado, d.temporada, sc.episodio, sc.progreso, sc.calificacion 
from la_mas_draga as d
join la_mas_draga_scores as sc
on d.participante = sc.concursante
where participante = "Paper Cut"
order by temporada desc

 * mysql://root:***@localhost:3306/project
80 rows affected.


unique_id,lugar,participante,lugar de residencia,edad,retos ganados,resultado,temporada,episodio,progreso,calificacion
286,2,Paper Cut,Ciudad De México,25,2,Finalista,5,1,ALTO,17
306,2,Paper Cut,Ciudad De México,25,2,Finalista,5,6,FUERA,0
287,2,Paper Cut,Ciudad De México,25,2,Finalista,5,1,ALTO,0
307,2,Paper Cut,Ciudad De México,25,2,Finalista,5,6,FUERA,20
288,2,Paper Cut,Ciudad De México,25,2,Finalista,5,1,ENTRA,17
308,2,Paper Cut,Ciudad De México,25,2,Finalista,5,6,EL MÁS,0
289,2,Paper Cut,Ciudad De México,25,2,Finalista,5,1,ENTRA,0
309,2,Paper Cut,Ciudad De México,25,2,Finalista,5,6,EL MÁS,20
290,2,Paper Cut,Ciudad De México,25,2,Finalista,5,2,ALTO,18
310,2,Paper Cut,Ciudad De México,25,2,Finalista,5,7,FUERA,0
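
The join returns 80 rows although the scores table holds only 40 Paper Cut rows. Paper Cut has two rows in `la_mas_draga` (seasons 4 and 5), and joining on the name alone pairs every score row with both of them. The sketch below reproduces that fan-out on a small scale; it assumes an in-memory SQLite database with simplified tables, and the three score rows are placeholders chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE la_mas_draga (participante TEXT, temporada INTEGER, lugar INTEGER);
    CREATE TABLE la_mas_draga_scores (concursante TEXT, episodio INTEGER, calificacion INTEGER);
    -- Paper Cut's two season rows, as returned by the first cleaning query.
    INSERT INTO la_mas_draga VALUES ('Paper Cut', 4, 12), ('Paper Cut', 5, 2);
    -- Three illustrative score rows (placeholder values).
    INSERT INTO la_mas_draga_scores VALUES
        ('Paper Cut', 1, 17), ('Paper Cut', 2, 18), ('Paper Cut', 3, 19);
    """
)

score_rows = conn.execute("SELECT COUNT(*) FROM la_mas_draga_scores").fetchone()[0]

# Joining on the name only: each score row matches both season rows.
joined_rows = conn.execute(
    """
    SELECT COUNT(*)
    FROM la_mas_draga AS d
    JOIN la_mas_draga_scores AS sc ON d.participante = sc.concursante
    """
).fetchone()[0]
print(score_rows, joined_rows)
```

Because the scores table has no season column, a change to one of Paper Cut's score rows appears under both seasons in any query that joins on the name. Adding a season column to the scores table and including it in the join condition would be one way to avoid this, though the notebook does not take that step.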


In [46]:
%%sql 
update la_mas_draga_scores
set progreso = "FUERA"
where concursante = "Paper Cut" and calificacion = 0

 * mysql://root:***@localhost:3306/project
12 rows affected.


[]

In [48]:
%%sql 
select sc.unique_id, d.lugar, d.participante, `lugar de residencia`, d.edad, `retos ganados`, d.resultado, d.temporada, sc.episodio, sc.progreso, sc.calificacion 
from la_mas_draga as d
join la_mas_draga_scores as sc
on d.participante = sc.concursante
where d.participante = "Paper Cut" and d.temporada = 5
order by episodio desc

 * mysql://root:***@localhost:3306/project
40 rows affected.


unique_id,lugar,participante,lugar de residencia,edad,retos ganados,resultado,temporada,episodio,progreso,calificacion
325,2,Paper Cut,Ciudad De México,25,2,Finalista,5,10,ALTO,19
324,2,Paper Cut,Ciudad De México,25,2,Finalista,5,10,FUERA,0
323,2,Paper Cut,Ciudad De México,25,2,Finalista,5,10,FUERA,19
322,2,Paper Cut,Ciudad De México,25,2,Finalista,5,10,FUERA,0
321,2,Paper Cut,Ciudad De México,25,2,Finalista,5,9,ALTO,18
320,2,Paper Cut,Ciudad De México,25,2,Finalista,5,9,FUERA,0
319,2,Paper Cut,Ciudad De México,25,2,Finalista,5,9,FUERA,18
318,2,Paper Cut,Ciudad De México,25,2,Finalista,5,9,FUERA,0
317,2,Paper Cut,Ciudad De México,25,2,Finalista,5,8,ALTO,18
316,2,Paper Cut,Ciudad De México,25,2,Finalista,5,8,FUERA,0


In [49]:
%%sql 
delete from la_mas_draga_scores
where unique_id in ("286", "288", "287","291", "292", "290", "294", "295", "296", "298", "299", "301", "302", "303", "304", "308", "307", "306", "310", "311", "312", "314", "315", "316", "318", "319", "320", "322", "323", "324" )

 * mysql://root:***@localhost:3306/project
30 rows affected.


[]
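
The delete above lists 30 hand-picked ids. For comparison, here is a sketch of a rule-based alternative that keeps the highest `unique_id` in each contestant and episode group. This is not the selection the notebook made, and it assumes the surviving row in each group is the one worth keeping. It uses an in-memory SQLite table named `scores` (a hypothetical name) with a handful of rows taken from the Paper Cut output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE scores ("
    "unique_id INTEGER PRIMARY KEY AUTOINCREMENT, "
    "concursante TEXT, episodio INTEGER, progreso TEXT, calificacion INTEGER)"
)
conn.executemany(
    "INSERT INTO scores (concursante, episodio, progreso, calificacion) "
    "VALUES (?, ?, ?, ?)",
    [
        ("Paper Cut", 1, "ALTO", 17),
        ("Paper Cut", 1, "ALTO", 0),
        ("Paper Cut", 1, "ENTRA", 17),
        ("Paper Cut", 1, "ENTRA", 0),
        ("Paper Cut", 2, "ALTO", 18),
        ("Paper Cut", 2, "SALV", 16),
    ],
)

# Delete every row except the highest unique_id in each (contestant, episode) group.
deleted = conn.execute(
    """
    DELETE FROM scores
    WHERE unique_id NOT IN (
        SELECT MAX(unique_id) FROM scores GROUP BY concursante, episodio
    )
    """
).rowcount

kept = conn.execute(
    "SELECT unique_id, episodio, progreso, calificacion FROM scores ORDER BY unique_id"
).fetchall()
print(deleted, kept)
```

MySQL rejects a `DELETE` whose subquery reads the table being deleted from, so there the subquery would need to be wrapped in a derived table. Whichever rule is used, it is worth inspecting the kept rows before committing, since the highest id is not guaranteed to hold the correct values.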

In [50]:
%%sql 
select sc.unique_id, d.lugar, d.participante, `lugar de residencia`, d.edad, `retos ganados`, d.resultado, d.temporada, sc.episodio, sc.progreso, sc.calificacion 
from la_mas_draga as d
join la_mas_draga_scores as sc
on d.participante = sc.concursante
where d.participante = "Paper Cut" and d.temporada = 5
order by episodio desc

 * mysql://root:***@localhost:3306/project
10 rows affected.


unique_id,lugar,participante,lugar de residencia,edad,retos ganados,resultado,temporada,episodio,progreso,calificacion
325,2,Paper Cut,Ciudad De México,25,2,Finalista,5,10,ALTO,19
321,2,Paper Cut,Ciudad De México,25,2,Finalista,5,9,ALTO,18
317,2,Paper Cut,Ciudad De México,25,2,Finalista,5,8,ALTO,18
313,2,Paper Cut,Ciudad De México,25,2,Finalista,5,7,ALTO,19
309,2,Paper Cut,Ciudad De México,25,2,Finalista,5,6,EL MÁS,20
305,2,Paper Cut,Ciudad De México,25,2,Finalista,5,5,ALTO,18
300,2,Paper Cut,Ciudad De México,25,2,Finalista,5,4,EL MÁS,19
297,2,Paper Cut,Ciudad De México,25,2,Finalista,5,3,SALV,16
293,2,Paper Cut,Ciudad De México,25,2,Finalista,5,2,SALV,16
289,2,Paper Cut,Ciudad De México,25,2,Finalista,5,1,FUERA,0


In [51]:
%%sql 
select sc.unique_id, d.lugar, d.participante, `lugar de residencia`, d.edad, `retos ganados`, d.resultado, d.temporada, sc.episodio, sc.progreso, sc.calificacion 
from la_mas_draga as d
join la_mas_draga_scores as sc
on d.participante = sc.concursante
where d.participante = "Paper Cut" and d.temporada = 4
order by episodio desc

 * mysql://root:***@localhost:3306/project
10 rows affected.


unique_id,lugar,participante,lugar de residencia,edad,retos ganados,resultado,temporada,episodio,progreso,calificacion
325,12,Paper Cut,Ciudad de México,23,1,4° eliminado,4,10,ALTO,19
321,12,Paper Cut,Ciudad de México,23,1,4° eliminado,4,9,ALTO,18
317,12,Paper Cut,Ciudad de México,23,1,4° eliminado,4,8,ALTO,18
313,12,Paper Cut,Ciudad de México,23,1,4° eliminado,4,7,ALTO,19
309,12,Paper Cut,Ciudad de México,23,1,4° eliminado,4,6,EL MÁS,20
305,12,Paper Cut,Ciudad de México,23,1,4° eliminado,4,5,ALTO,18
300,12,Paper Cut,Ciudad de México,23,1,4° eliminado,4,4,EL MÁS,19
297,12,Paper Cut,Ciudad de México,23,1,4° eliminado,4,3,SALV,16
293,12,Paper Cut,Ciudad de México,23,1,4° eliminado,4,2,SALV,16
289,12,Paper Cut,Ciudad de México,23,1,4° eliminado,4,1,FUERA,0


In [55]:
%%sql 
update la_mas_draga_scores
set progreso = "FUERA", calificacion = 0
where concursante = "Paper Cut" and unique_id in ("309", "313", "317", "321", "325")


 * mysql://root:***@localhost:3306/project
5 rows affected.


[]

In [56]:
%%sql 
select sc.unique_id, d.lugar, d.participante, `lugar de residencia`, d.edad, `retos ganados`, d.resultado, d.temporada, sc.episodio, sc.progreso, sc.calificacion 
from la_mas_draga as d
join la_mas_draga_scores as sc
on d.participante = sc.concursante
where d.participante = "Paper Cut" and d.temporada = 4
order by episodio desc

 * mysql://root:***@localhost:3306/project
10 rows affected.


unique_id,lugar,participante,lugar de residencia,edad,retos ganados,resultado,temporada,episodio,progreso,calificacion
325,12,Paper Cut,Ciudad de México,23,1,4° eliminado,4,10,FUERA,0
321,12,Paper Cut,Ciudad de México,23,1,4° eliminado,4,9,FUERA,0
317,12,Paper Cut,Ciudad de México,23,1,4° eliminado,4,8,FUERA,0
313,12,Paper Cut,Ciudad de México,23,1,4° eliminado,4,7,FUERA,0
309,12,Paper Cut,Ciudad de México,23,1,4° eliminado,4,6,FUERA,0
305,12,Paper Cut,Ciudad de México,23,1,4° eliminado,4,5,ALTO,18
300,12,Paper Cut,Ciudad de México,23,1,4° eliminado,4,4,EL MÁS,19
297,12,Paper Cut,Ciudad de México,23,1,4° eliminado,4,3,SALV,16
293,12,Paper Cut,Ciudad de México,23,1,4° eliminado,4,2,SALV,16
289,12,Paper Cut,Ciudad de México,23,1,4° eliminado,4,1,FUERA,0


In [64]:
%%sql 
select distinct(resultado) from la_mas_draga 

 * mysql://root:***@localhost:3306/project
18 rows affected.


resultado
Ganadora
Finalistas
6ª eliminada
5ª eliminada
4ª eliminada
3ª eliminadas
2ª eliminada
1ª eliminada
8ª/9ª eliminada
7ª eliminada


In [65]:
%%sql 
select distinct(lugar) from la_mas_draga

 * mysql://root:***@localhost:3306/project
14 rows affected.


lugar
1
2
3
4
5
6
7
8
9
10


In [66]:
%%sql
select distinct(progreso) from la_mas_draga_scores

 * mysql://root:***@localhost:3306/project
24 rows affected.


progreso
ABDICÓ
MENOS
FUERA
SALV
ELIM
VUELVE
ALTA
LA MÁS
Invitada
Compitiendo


In [67]:
%%sql 
update la_mas_draga_scores
set progreso = case
    when progreso = "ALTA MENOS" then "MENOS"
    when progreso = "ALTA EG" then "ALTA"
    when progreso = "SALV MENOS" then "MENOS"
    when progreso = "MENOS SALV" then "SALV"
    when progreso = "MENOS BAJA" then "BAJA"
    else progreso
end 

 * mysql://root:***@localhost:3306/project
433 rows affected.


[]
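
The `CASE` expression above has an `else` branch and no `where` clause, so the statement visits every row in the table. A sketch of the same mapping restricted to the labels being rewritten, assuming an in-memory SQLite table named `scores` (hypothetical) that holds one row per combined label plus one label that should stay untouched:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (progreso TEXT)")
conn.executemany(
    "INSERT INTO scores VALUES (?)",
    [
        ("ALTA MENOS",),
        ("ALTA EG",),
        ("SALV MENOS",),
        ("MENOS SALV",),
        ("MENOS BAJA",),
        ("ELIM",),  # not part of the mapping; must remain unchanged
    ],
)

# Same mapping as the notebook cell, limited to the five combined labels.
updated = conn.execute(
    """
    UPDATE scores
    SET progreso = CASE progreso
        WHEN 'ALTA MENOS' THEN 'MENOS'
        WHEN 'ALTA EG'    THEN 'ALTA'
        WHEN 'SALV MENOS' THEN 'MENOS'
        WHEN 'MENOS SALV' THEN 'SALV'
        WHEN 'MENOS BAJA' THEN 'BAJA'
    END
    WHERE progreso IN
        ('ALTA MENOS', 'ALTA EG', 'SALV MENOS', 'MENOS SALV', 'MENOS BAJA')
    """
).rowcount

labels = [row[0] for row in conn.execute("SELECT progreso FROM scores ORDER BY rowid")]
print(updated, labels)
```

With the `WHERE` clause in place the `ELSE` branch is no longer needed, and the reported row count reflects only the labels that were actually targeted.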

<a id="eda"></a>
## Exploratory Analysis
The first step toward understanding the data is to explore it.

<a href="#table_of_contents">Navigate to contents</a>

#### How old are the contestants?

In [68]:
%%sql 
select distinct(edad) from la_mas_draga 

 * mysql://root:***@localhost:3306/project
20 rows affected.


edad
28
31
29
34
33
25
37
35
157
39


#### What is the average age of all the contestants on this show?

In [69]:
%%sql
select round(avg(edad)) as avg_edad 
from la_mas_draga

 * mysql://root:***@localhost:3306/project
1 rows affected.


avg_edad
32
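
The list of distinct ages includes 157, which looks like a data-entry error, although the source does not confirm this. A single value like that can shift an average noticeably. The sketch below illustrates the effect using only the ten distinct ages displayed above, not the full contestant list, so its numbers are not comparable to the result of 32. The bounds of 18 and 100 are hypothetical plausibility limits.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ages (edad INTEGER)")

# The ten distinct ages displayed above; NOT the full contestant list.
shown_ages = [28, 31, 29, 34, 33, 25, 37, 35, 157, 39]
conn.executemany("INSERT INTO ages VALUES (?)", [(age,) for age in shown_ages])

# Mean over every value, outlier included.
avg_all = conn.execute("SELECT AVG(edad) FROM ages").fetchone()[0]

# Mean over values inside hypothetical plausibility bounds (18 to 100).
avg_plausible = conn.execute(
    "SELECT AVG(edad) FROM ages WHERE edad BETWEEN 18 AND 100"
).fetchone()[0]
print(round(avg_all, 2), round(avg_plausible, 2))
```

On these ten values the outlier raises the mean from about 32.3 to 44.8. Checking `min(edad)` and `max(edad)` on the real table before averaging would show whether the reported figure is affected.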


#### How many contestants are participating, and where do they come from?

In [70]:
%%sql
select `lugar de residencia` as place, COUNT(*) as participant_count
from la_mas_draga
group by `lugar de residencia`
order by participant_count desc

 * mysql://root:***@localhost:3306/project
23 rows affected.


place,participant_count
Ciudad de México,11
"Monterrey, N.L.",9
"Guadalajara, Jal.",7
"Chihuahua, Chih.",2
"Acapulco de Juárez, Gro.",2
"Matamoros, Tamaulipas",1
"Saltillo, Coah.",1
"Progreso, Yuc.",1
"Mérida, Yuc.",1
"Aguascalientes, Ags.",1


#### What is the average score?

In [71]:
%%sql 
select round(avg(calificacion)) as avg_score
from la_mas_draga_scores

 * mysql://root:***@localhost:3306/project
1 rows affected.


avg_score
9


#### What is the average number of challenges won?

In [72]:
%%sql 
select round(avg(`retos ganados`),2) as avg_challenges_won
from la_mas_draga

 * mysql://root:***@localhost:3306/project
1 rows affected.


avg_challenges_won
0.82


<a id="analysis"></a>
## Data Analysis

<a href="#table_of_contents">Navigate to contents</a>

#### Where do the winners come from?

In [73]:
%%sql 
select `lugar de residencia` as place, count(*) as participant_count
from la_mas_draga
where resultado in ("Ganadora", "Finalista", "Finalistas", "Finalista secreta")
group by `lugar de residencia`
order by participant_count desc

 * mysql://root:***@localhost:3306/project
11 rows affected.


place,participant_count
"Guadalajara, Jal.",3
"Monterrey, N.L.",3
"Matamoros, Tamaulipas",1
"Mérida, Yuc.",1
"Antofagasta, Chile",1
"Chihuahua, Chih.",1
"Ensenada, B.C.",1
"Lázaro Cárdenas, Mich.",1
Ciudad De México,1
"Acapulco de Juárez, Gro.",1


#### What's the expected number of challenges that the top 4 participants should win, on average, over the course of the competition?

In [74]:
%%sql 
select round(avg(`retos ganados`)) as wins
from la_mas_draga
where resultado in ("Ganadora", "Finalista", "Finalistas", "Finalista secreta")

 * mysql://root:***@localhost:3306/project
1 rows affected.


wins
2


#### To make it to the top 4, what's the average score a participant should aim for in each episode of the season?

In [75]:
%%sql 
select round(avg(s.calificacion),2)
from la_mas_draga_scores as s
join la_mas_draga as d 
on d.participante = s.concursante
where d.resultado in ("Ganadora", "Finalista", "Finalistas", "Finalista secreta")

 * mysql://root:***@localhost:3306/project
1 rows affected.


"round(avg(s.calificacion),2)"
13.38
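
This average includes rows with a score of 0. In the top-4 table, for example, Alexis 3XL's final two episodes (`Compitiendo` and `Ganadora`) carry a score of 0. Assuming a 0 marks an episode without a judged score rather than a genuine zero, which the source does not state, those rows pull the average down. A sketch using Alexis 3XL's nine scores from that table, in an in-memory SQLite table named `scores` (hypothetical):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (episodio INTEGER, progreso TEXT, calificacion INTEGER)")

# Alexis 3XL's season 2 rows as listed in the top-4 table.
conn.executemany(
    "INSERT INTO scores VALUES (?, ?, ?)",
    [
        (1, "MENOS", 14),
        (2, "ALTA", 13),
        (3, "ALTA", 18),
        (4, "LA MÁS", 18),
        (5, "SALV", 15),
        (6, "LA MÁS", 19),
        (7, "MENOS", 18),
        (8, "Compitiendo", 0),
        (9, "Ganadora", 0),
    ],
)

# Average over all rows, zeros included.
avg_all = conn.execute("SELECT ROUND(AVG(calificacion), 2) FROM scores").fetchone()[0]

# Average over scored episodes only (assumption: 0 means "no score recorded").
avg_scored = conn.execute(
    "SELECT ROUND(AVG(calificacion), 2) FROM scores WHERE calificacion > 0"
).fetchone()[0]
print(avg_all, avg_scored)
```

For this one contestant the two averages differ by more than three points, so the per-episode target is sensitive to how the zero rows are treated.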


#### What is the average age of the top 4 participants in the competitions?

In [76]:
%%sql 
select round(avg(edad))
from la_mas_draga
where resultado in ("Ganadora", "Finalista", "Finalistas", "Finalista secreta")

 * mysql://root:***@localhost:3306/project
1 rows affected.


round(avg(edad))
29


#### Top 4 contestants key elements table 

In [63]:
%%sql 
select d.lugar, d.participante, `lugar de residencia`, d.edad, `retos ganados`, d.resultado, d.temporada, sc.episodio, sc.progreso, sc.calificacion 
from la_mas_draga as d
join la_mas_draga_scores as sc
on d.participante = sc.concursante
where resultado in ("Ganadora", "Finalista", "Finalistas", "Finalista secreta")
order by temporada asc

 * mysql://root:***@localhost:3306/project
128 rows affected.


lugar,participante,lugar de residencia,edad,retos ganados,resultado,temporada,episodio,progreso,calificacion
1,Alexis 3XL,"Matamoros, Tamaulipas",28,2,Ganadora,2,1,MENOS,14
1,Alexis 3XL,"Matamoros, Tamaulipas",28,2,Ganadora,2,2,ALTA,13
1,Alexis 3XL,"Matamoros, Tamaulipas",28,2,Ganadora,2,3,ALTA,18
1,Alexis 3XL,"Matamoros, Tamaulipas",28,2,Ganadora,2,4,LA MÁS,18
1,Alexis 3XL,"Matamoros, Tamaulipas",28,2,Ganadora,2,5,SALV,15
1,Alexis 3XL,"Matamoros, Tamaulipas",28,2,Ganadora,2,6,LA MÁS,19
1,Alexis 3XL,"Matamoros, Tamaulipas",28,2,Ganadora,2,7,MENOS,18
1,Alexis 3XL,"Matamoros, Tamaulipas",28,2,Ganadora,2,8,Compitiendo,0
1,Alexis 3XL,"Matamoros, Tamaulipas",28,2,Ganadora,2,9,Ganadora,0
3,Gvajardo,"Monterrey, N.L.",29,1,Finalistas,2,1,LA MÁS,19


<a id="insights"></a>
## Insights

* Participants should aim for at least two challenge wins over the course of the competition.
* To remain competitive, a participant should maintain an average score of about 13.38 per episode, the average among the top four contestants.
* The top four contestants come predominantly from Guadalajara and Monterrey.
* The top four contestants are 29 years old on average.

<a href="#table_of_contents">Navigate to contents</a>