# Data Analysis on the Results of the Hip Hop International 2019
*Developer:* Juan Lorenzo Mercado

## Rationale
This data science project analyzes the score cards of teams in the dance competition Hip Hop International 2019, looking at each team's results across the stages of the competition. It also serves as an exploratory analysis of the insights that can be drawn from the dataset extracted from the competition results. In addition, the study hopes to shed light on how judging is conducted throughout the competition.



##### References
Hip Hop International. (n.d.). World hip hop dance championship. Hip Hop International. https://www.hiphopinternational.com/world-hip-hop-dance-championship/

##### Database Resource
1. *Hip Hop International (Results) -* http://www.hiphopinternational.com/medalists/

----

# What is Hip Hop International?
Hip Hop International (HHI) is one of the world's largest dance competitions. The event hosts over four thousand dancers from all over the world, who compete for the competition's world title (Hip Hop International, n.d.). It is often regarded as the Olympics of street dance, as dance teams from around the world gather to compete in the event annually.

## Competition Divisions
The competition currently has 5 divisions: Junior, Varsity, Adult, MiniCrew, and MegaCrew, with teams competing from various parts of the world.

| Division | # of Members | Age Group|
|:-----------:|:-----------:|:-----------:|
| **Junior Division** | 5-9 | 7-12 |  
| **Varsity Division** | 5-9 | 13-18 |  
| **Adult Division** | 5-9 | 18+ |  
| **MiniCrew Division** | 3 | No Restrictions |
| **MegaCrew Division** | 10-40 | No Restrictions |


## Competition Flow
Competing teams in HHI go through 3 stages of competition: the Prelims, SemiFinals, and Finals. At each stage, teams are eliminated based on their ranking, until the Finals, where a podium of 3 winners is determined (Champion, 1st Runner-Up, 2nd Runner-Up).

# Data Analysis

This section analyzes the results of Hip Hop International 2019. The results were released on the HHI website at http://www.hiphopinternational.com/medalists/. However, they were published as separate PDF files for the Prelims, SemiFinals, and Finals, so I manually encoded the data into a CSV file for uniformity in the rest of the analysis.

In [186]:
# import packages
import pandas as pd
import numpy as np
from math import ceil

## Dataset Documentation
This section documents the file hhi2019.csv, which contains the results of Hip Hop International 2019.

| Column Name | Data Type | Description |
|:------------|:---------:|:-----------|
| **Year** | Date | Date of the performance |
| **Category** | String | Category of Performance (Preliminary, Semi-Finals, Finals) |
| **rank** | int64 | The rank of the Dance Team in their respective competition category |
| **Team** | String | The name of the Dance Team |
| **Division** | String | The Division of where the Team is competing in |
| **Country** | String | Origin Country of the Team |
| **Performance_1** | float64 | Score of Performance Judge 1 |
| **Performance_2** | float64 | Score of Performance Judge 2 |
| **Performance_3** | float64 | Score of Performance Judge 3 |
| **Performance_4** | float64 | Score of Performance Judge 4 |
| **Skill_1** | float64 | Score of Skill Judge 1 |
| **Skill_2** | float64 | Score of Skill Judge 2 |
| **Skill_3** | float64 | Score of Skill Judge 3 |
| **Skill_4** | float64 | Score of Skill Judge 4 |
| **Deductions** | float64 | Deductions given to the team |
| **Overall** | float64 | Overall Score of the team |

*Notes:*
1. Currently the dataset only contains data for the MegaCrew Division.
2. The dataset does not include overall score values; these are computed from the raw scores given by the HHI 2019 panel of judges, to get the full numeric total rather than a rounded-off value.

In [187]:
hhi = pd.read_csv('./../datasets/hhi2019.csv')
hhi.head()

Unnamed: 0,year,category,rank,name,country,performance_1,performance_2,performance_3,performance_4,total_performance,skill_1,skill_2,skill_3,skill_4,total_skill,deductions,overall,Remarks
0,"August 06, 2019",Preliminary,1,The Royal Family,New Zealand,4.2,4.23,4.05,3.35,,4.07,4.01,3.68,3.79,,,,
1,"August 06, 2019",Preliminary,2,J.B.Star,Japan,3.92,4.17,3.85,3.4,,4.19,4.06,4.05,4.11,,,,
2,"August 06, 2019",Preliminary,3,Kana-Boon! All Star,Japan,4.35,4.08,4.5,3.49,,4.25,4.13,4.44,4.3,,0.6,,Deductions: Entire crew not on stage min. 30 s...
3,"August 06, 2019",Preliminary,4,Legit Status,Philippines,4.12,3.84,3.7,3.57,,3.88,3.32,4.13,4.18,,,,
4,"August 06, 2019",Preliminary,5,La Docta,Argentina,3.8,4.1,3.5,3.41,,4.23,3.83,3.95,3.46,,,,


In [188]:
hhi.dtypes

year                  object
category              object
rank                   int64
name                  object
country               object
performance_1        float64
performance_2        float64
performance_3        float64
performance_4        float64
total_performance    float64
skill_1              float64
skill_2              float64
skill_3              float64
skill_4              float64
total_skill          float64
deductions           float64
overall              float64
Remarks               object
dtype: object
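
The `year` column loads as `object` (plain strings) rather than a date type. A minimal sketch of one way to parse it, assuming every row uses the same `"August 06, 2019"` layout seen in the preview (only the first five rows are shown, so this is an assumption about the rest of the file):

```python
import pandas as pd

# Sample values copied from the preview; the full column is assumed to share this format.
years = pd.Series(["August 06, 2019", "August 06, 2019"], name="year")

# %B = full month name, %d = zero-padded day, %Y = four-digit year
parsed = pd.to_datetime(years, format="%B %d, %Y")
print(parsed.dtype)
```

Passing an explicit `format` makes the parse fail loudly on any row that does not match, which can help catch typos introduced during manual encoding.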

## Computing overall scores
This section will compute the overall scores based on the raw scores of the teams. 

The formula for the overall score is `overall score = total performance score + total skill score - deductions`

Both the total performance score and the total skill score go through a scoring system in which the minimum and maximum judge scores are dropped and the remaining scores are averaged.
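
As a worked example of this scoring rule, the sketch below applies it to the raw Preliminary scores of Kana-Boon! All Star from the dataset preview. The helper name `drop_high_low_mean` is hypothetical and used only for illustration, and rounding is left out so the intermediate values stay visible.

```python
def drop_high_low_mean(scores):
    """Average the scores left after removing one lowest and one highest score."""
    kept = sorted(scores)[1:-1]
    return sum(kept) / len(kept)

# Raw judge scores for Kana-Boon! All Star (Preliminary), taken from the preview
performance_scores = [4.35, 4.08, 4.5, 3.49]
skill_scores = [4.25, 4.13, 4.44, 4.3]
deduction = 0.6

total_performance = drop_high_low_mean(performance_scores)  # (4.08 + 4.35) / 2
total_skill = drop_high_low_mean(skill_scores)              # (4.25 + 4.3) / 2
overall = total_performance + total_skill - deduction

print(round(total_performance, 3), round(total_skill, 3), round(overall, 3))
```

Before rounding, the two criterion scores come out at about 4.215 and 4.275. The notebook rounds each one up to two decimals (4.22 and 4.28) before subtracting the deduction, which gives the 7.9 reported for this team.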

In [189]:
# Lists of per-team scores given by the judges for each criterion
# There are only 2 criteria: Performance & Skill

performance = [list(scores) for scores in zip(hhi['performance_1'], hhi['performance_2'], hhi['performance_3'], hhi['performance_4'])]
skill = [list(scores) for scores in zip(hhi['skill_1'], hhi['skill_2'], hhi['skill_3'], hhi['skill_4'])]

In [190]:
def round_conditional(number):
    '''
    number: any number
    ---
    Round a number up to 2 decimal places, but only if it has 3 or more decimals.
    '''
    num = str(number)
    deci = num.split('.')
    # ceil comes from `from math import ceil` at the top of the notebook
    if len(deci) > 1 and len(deci[1]) >= 3:
        return ceil(number*100) / 100
    else:
        return number
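
The check on the number of decimals matters because multiplying by 100 in binary floating point can overshoot: `1.1 * 100` evaluates to slightly more than 110, so an unconditional ceiling would turn 1.1 into 1.11. A possible alternative, sketched below, does the rounding in decimal arithmetic instead. The helper name `ceil_2dp` is hypothetical, and the round-up behavior simply mirrors the notebook's choice; the source does not confirm that HHI rounds this way.

```python
from decimal import Decimal, ROUND_CEILING

def ceil_2dp(number):
    """Round up to 2 decimal places using decimal arithmetic (hypothetical helper)."""
    # str() yields the shortest decimal text for the float, e.g. '1.1' for 1.1
    exact = Decimal(str(number))
    return float(exact.quantize(Decimal("0.01"), rounding=ROUND_CEILING))

print(ceil_2dp(4.125), ceil_2dp(3.9), ceil_2dp(1.1))
```

Like the string-based check, this still rounds up any floating-point noise already present in the input (for example a value stored as 4.120000000000001), so it is a sketch of the idea rather than a complete fix.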

In [191]:
def total_score(scores):
    '''
    Arguments:
    
    scores: a list of lists containing the scores of all judges in a criterion, one inner list per team.
    ---------
    total_score calculates the score of a criterion based on HHI's grading computation for the total performance score and total skill score.
    Both are calculated by dropping the highest and lowest scores and averaging the remaining scores.
    '''
    scores = np.array(scores)
    criteria_scores = []
    for team in scores:
        # drop the minimum and maximum scores before averaging
        official_score = np.delete(team, [np.argmin(team), np.argmax(team)])
        criteria_score = official_score.sum() / len(official_score)
        criteria_scores.append(round_conditional(criteria_score))
    return criteria_scores
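
As a cross-check on the loop above, the same drop-highest-and-lowest average can be computed for all teams at once by sorting each row. This is a minimal sketch with a hypothetical function name, using the five Preliminary performance rows from the preview; it returns unrounded values and does not apply `round_conditional`.

```python
import numpy as np

def drop_high_low_mean_rows(scores):
    """Per row: sort, discard the first (lowest) and last (highest) value, average the rest."""
    scores = np.asarray(scores, dtype=float)
    ordered = np.sort(scores, axis=1)
    return ordered[:, 1:-1].mean(axis=1)

# performance_1..performance_4 for the five teams shown in the preview
performance_rows = [
    [4.2, 4.23, 4.05, 3.35],   # The Royal Family
    [3.92, 4.17, 3.85, 3.4],   # J.B.Star
    [4.35, 4.08, 4.5, 3.49],   # Kana-Boon! All Star
    [4.12, 3.84, 3.7, 3.57],   # Legit Status
    [3.8, 4.1, 3.5, 3.41],     # La Docta
]
unrounded = drop_high_low_mean_rows(performance_rows)
print(unrounded)
```

Rounding each of these values up to two decimals should reproduce the `total_performance` column computed by `total_score`.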

In [192]:
# Encode the total performance and skill scores for each team
hhi['total_performance'] = total_score(performance)
hhi['total_skill'] = total_score(skill)

# Compute the overall score per team
hhi['overall'] = [performance + skill - deductions for performance, skill, deductions in zip(hhi['total_performance'], hhi['total_skill'], hhi['deductions'].fillna(0))]

In [195]:
hhi.head()

Unnamed: 0,year,category,rank,name,country,performance_1,performance_2,performance_3,performance_4,total_performance,skill_1,skill_2,skill_3,skill_4,total_skill,deductions,overall,Remarks
0,"August 06, 2019",Preliminary,1,The Royal Family,New Zealand,4.2,4.23,4.05,3.35,4.13,4.07,4.01,3.68,3.79,3.9,,8.03,
1,"August 06, 2019",Preliminary,2,J.B.Star,Japan,3.92,4.17,3.85,3.4,3.89,4.19,4.06,4.05,4.11,4.09,,7.98,
2,"August 06, 2019",Preliminary,3,Kana-Boon! All Star,Japan,4.35,4.08,4.5,3.49,4.22,4.25,4.13,4.44,4.3,4.28,0.6,7.9,Deductions: Entire crew not on stage min. 30 s...
3,"August 06, 2019",Preliminary,4,Legit Status,Philippines,4.12,3.84,3.7,3.57,3.77,3.88,3.32,4.13,4.18,4.01,,7.78,
4,"August 06, 2019",Preliminary,5,La Docta,Argentina,3.8,4.1,3.5,3.41,3.65,4.23,3.83,3.95,3.46,3.89,,7.54,
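
One quick sanity check on the computed scores is to confirm that ordering teams by `overall` reproduces the official `rank`. The sketch below does this for the five rows shown above, rebuilt by hand so the example is self-contained; the `computed_rank` column name is an assumption introduced here.

```python
import pandas as pd

# Values copied from the five Preliminary rows in the output above
preview = pd.DataFrame({
    "name": ["The Royal Family", "J.B.Star", "Kana-Boon! All Star", "Legit Status", "La Docta"],
    "rank": [1, 2, 3, 4, 5],
    "total_performance": [4.13, 3.89, 4.22, 3.77, 3.65],
    "total_skill": [3.9, 4.09, 4.28, 4.01, 3.89],
    "deductions": [None, None, 0.6, None, None],
})

# Missing deductions count as zero, as in the notebook
preview["overall"] = (
    preview["total_performance"] + preview["total_skill"] - preview["deductions"].fillna(0)
)

# Highest overall score gets rank 1; tied scores share the lowest rank number
preview["computed_rank"] = preview["overall"].rank(ascending=False, method="min").astype(int)

mismatches = preview[preview["computed_rank"] != preview["rank"]]
print(mismatches)
```

On the full dataset the ranking would need to be computed within each stage, for example with `hhi.groupby("category")["overall"].rank(ascending=False, method="min")`, since ranks restart for the Preliminary, Semi-Finals, and Finals rounds.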
