# Translate Grade of Execution

After each competition, the International Skating Union (ISU) releases an event protocol PDF, a full compendium of the scores given to skaters during that competition. (See, for instance, [this event protocol from the 2017-18 Grand Prix Final](http://www.isuresults.com/results/season1718/gpf1718/gpf2017_protocol.pdf).) These PDFs include each judge's *Grade of Execution (GOE)* for each performed element, but not the equivalent points. This notebook takes the data available in `data/raw/` and translates each judge's GOE for each technical element into points, based on the publicly available translation tables that the ISU publishes at the beginning of each competition season. Finally, it verifies that the translated GOEs, when combined, equal the skater's overall translated GOE for each element. (Unlike the per-judge translated GOEs, the overall translated GOE is included in the PDFs.)

In [1]:
import pandas as pd
import re

## Load the data

In [2]:
performances = pd.read_csv("../data/raw/performances.csv")
print("{:,} performances".format(len(performances)))

aspects = pd.read_csv("../data/raw/judged-aspects.csv")
print("{:,} aspects".format(len(aspects)))

scores = pd.read_csv("../data/raw/judge-scores.csv")
print("{:,} scores".format(len(scores)))

1,726 performances
23,932 aspects
214,531 scores


In [3]:
scored = scores.pipe(
    pd.merge,
    aspects,
    on = "aspect_id",
    how = "left"
).pipe(
    pd.merge,
    performances,
    on = "performance_id",
    how = "left"
).assign(
    is_junior = lambda x: x["program"].str.contains("JUNIOR")
)

len(scored)

214531
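
Both merges are left joins on an ID column, so the row count stays at 214,531 only if each `aspect_id` and `performance_id` appears once in its lookup table. The sketch below uses made-up toy frames (not the real CSVs) to show one way of having pandas enforce that assumption through the `validate` argument of `pd.merge`.

```python
import pandas as pd

# Toy stand-ins for the real CSVs (values are made up for illustration).
scores = pd.DataFrame({
    "aspect_id": ["a1", "a1", "a2"],
    "judge": ["J1", "J2", "J1"],
    "score": [1.0, 0.0, -1.0],
})
aspects = pd.DataFrame({
    "aspect_id": ["a1", "a2"],
    "aspect_desc": ["3Tw2", "CoSp4"],
})

# validate="many_to_one" raises pandas.errors.MergeError if aspect_id
# is duplicated on the right-hand side, which would inflate the row count.
merged = pd.merge(
    scores, aspects, on="aspect_id", how="left", validate="many_to_one"
)

print(len(merged) == len(scores))
```

If the check passes, every score row has been matched to at most one aspect, which is what the unchanged row count above suggests for the real data.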

## Prepare the data

Only **technical elements** (as opposed to artistic components) receive a Grade of Execution from the judges. Because our analysis covers only senior-level competitions, we also remove the junior-level programs.

In [4]:
elements = scored[
    (scored["section"] == "elements") &
    (~scored["is_junior"])
].copy()
len(elements)

129814

Different **competition seasons** use different Scale of Values translation tables. BuzzFeed News downloaded the PDFs containing these translation tables and transformed them into CSVs. Here are the PDFs: [Figure Skating 2016/17](http://www.isu.org/docman-documents-links/isu-files/documents-communications/isu-communications/459-2000-sptc-sov-and-goe-2016-2017-revised-july-14/file), [Figure Skating 2017/18](http://www.isu.org/docman-documents-links/isu-files/documents-communications/isu-communications/14352-isu-communication-2089/file), [Ice Dancing 2016/17](http://isu.org/docman-documents-links/isu-files/documents-communications/isu-communications/476-isu-communication-2015/file), [Ice Dancing 2017/18](http://www.isu.org/docman-documents-links/isu-files/documents-communications/isu-communications/589-isu-communication-2094/file). The function below assigns each competition to a season.

In [5]:
def find_comp_season(competition):
    if competition in [
        "Grand Prix Final 2017 Senior and Junior",
        "ISU GP 2017 Bridgestone Skate America",
        "ISU GP 2017 Skate Canada International",
        "ISU GP NHK Trophy 2017",
        "ISU GP Audi Cup of China 2017",
        "ISU GP Rostelecom Cup 2017",
        "ISU GP Internationaux de France de Patinage 2017"
    ]:
        return 2017
    elif competition in [
        "ISU Grand Prix of Figure Skating Final 2016",
        "ISU GP Trophee de France 2016",
        "ISU GP NHK Trophy 2016",
        "ISU GP 2016 Skate Canada International",
        "ISU GP Rostelecom Cup 2016",
        "ISU GP Audi Cup of China 2016",
        "ISU World Figure Skating Championships 2017",
        "ISU Four Continents Championships 2017",
        "ISU GP 2016 Progressive Skate America",
        "ISU European Figure Skating Championships 2017"
    ]:
        return 2016
    else:
        print("Competition {} not found!".format(competition))
        return None

In [6]:
elements["season"] = elements["competition"].apply(find_comp_season)
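
Seasons are labeled by their starting year, so championships held in early 2017 belong to season `2016`. As a rough sketch of the same idea, the assignment could also be written as a dictionary lookup followed by a check for names that were not matched. The names `SEASON_BY_COMPETITION` and `lookup_season` are hypothetical, and only a few of the competitions listed above are included.

```python
# Hypothetical alternative to the if/elif chain: a plain lookup table.
# Only a few of the competitions listed above are included here.
SEASON_BY_COMPETITION = {
    "ISU GP NHK Trophy 2017": 2017,
    "Grand Prix Final 2017 Senior and Junior": 2017,
    "ISU GP NHK Trophy 2016": 2016,
    # Held in early 2017, but part of the 2016/17 season:
    "ISU World Figure Skating Championships 2017": 2016,
}


def lookup_season(competition):
    """Return the season's starting year, or None if the name is unknown."""
    return SEASON_BY_COMPETITION.get(competition)


competitions = [
    "ISU GP NHK Trophy 2017",
    "ISU World Figure Skating Championships 2017",
    "Some Unlisted Event",
]

# Collect names that could not be assigned, instead of only printing them.
unmatched = [c for c in competitions if lookup_season(c) is None]
print(unmatched)
```

An empty `unmatched` list would indicate that every competition received a season before the translation tables are chosen.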

## Load conversion tables

In [7]:
def convert_to_float(x):
    return float(x.replace(",", ".")) if not pd.isnull(x) else None

In [8]:
fs_goe_adj_2016_17 = pd.read_csv(
    "../data/processed/figure-skating-goe-adj-2016-17.csv", 
    names = [
        "name", "code", "+3", "+2", "+1", 
        "base", "v", "v1", "-1", "-2", "-3"
    ],
    decimal = ","
)

fs_goe_adj_2016_17["base"] = fs_goe_adj_2016_17["base"]\
    .apply(convert_to_float)
    
fs_goe_adj_2016_17["v"] = fs_goe_adj_2016_17["v"]\
    .apply(convert_to_float)

In [9]:
fs_goe_adj_2017_18 = pd.read_csv(
    "../data/processed/figure-skating-goe-adj-2017-18.csv", 
    names = [
        "name", "code", "+3", "+2", "+1", 
        "base", "v", "v1", "-1", "-2", "-3"
    ],
    decimal = ","
)

fs_goe_adj_2017_18["base"] = fs_goe_adj_2017_18["base"]\
    .apply(convert_to_float)
    
fs_goe_adj_2017_18["v"] = fs_goe_adj_2017_18["v"]\
    .apply(convert_to_float)

In [10]:
id_goe_adj_2017_18 = pd.read_csv(
    "../data/processed/ice-dancing-goe-adj-2017-18.csv",
    names = [
        "name", "code", "+3", "+2", "+1", 
        "base", "-1", "-2", "-3"
    ],
    dtype = {
        "+3": float, "+2": float, "+1": float, "base": float,
        "-3": float, "-2": float, "-1": float 
    }
)

In [11]:
id_goe_adj_2016_17 = pd.read_csv(
    "../data/processed/ice-dancing-goe-adj-2016-17.csv",
    names=[
        "name", "code", "+3", "+2", "+1", 
        "base", "-1", "-2", "-3"
    ],
    dtype = {
        "+3": float, "+2": float, "+1": float, "base": float,
        "-3": float, "-2": float, "-1": float 
    }
)

## Translate the Grade of Execution

The functions below translate the individual grade of execution that each judge gave a skater.

- `clean_fs_aspect`: Cleans the aspect string that is given in the protocol PDF
- `downgrade_fs_aspect`: Converts a downgraded aspect to the aspect used for the translation
- `translate_sk_goe`: Translates the GOE for an individual figure skating aspect
- `translate_id_goe`: Translates the GOE for an individual ice dancing aspect

In [12]:
def clean_fs_aspect(aspect_string):
    trimmed_aspect = aspect_string.rstrip("<").rstrip("e")
    convert_aspect = {
        "CCoSp4": "(F)CCoSp4",
        "FCCoSp4": "(F)CCoSp4",
        "FCoSp4": "(F)CCoSp4",
        "CCoSp3": "(F)CCoSp3",
        "FCCoSp3": "(F)CCoSp3",
        "FCCoSp1": "(F)CCoSp1",
        "CSSp4": "(F)CSSp4",
        "CCSp4": "(F)CCSp4",
        "CCSp1": "(F)CCSp1",
        "ChSq1": "ChSq",
        "5RLi4": "5A/RLi4",
        "5RLiB": "5A/RLiB",
        "5ALi4": "5A/RLi4",
        "5TLi3": "5T/SLi3",
        "3FTh": "3F/LzTh",
        "3LzTh": "3F/LzTh",
        "CSSp3": "(F)CSSp3",
        "CCSp3": "(F)CCSp3",
        "BiDs4": "Fi/BiDs4",
        "BiDs3": "Fi/BiDs3",
        "BoDs3": "Fo/BoDs3",
        "FiDs4": "Fi/BiDs4",
        "FiDs2": "F/BiDs2",
        "BoDs4": "Fo/BoDs4",
        "FiDs3": "Fi/BiDs3",
        "FiDs1": "Fi/BiDs1",
        "BiDsB": "Fi/BiDsB",
        "CCoSp2": "(F)CCoSp2",
        "CCoSp1": "(F)CCoSp1",
        "5ALi1": "5A/RLi1",
        "5ALi3": "5A/RLi3",
        "5ALiB": "5A/RLiB",
        "5RLi3": "5A/RLi3",
        "5RLi2": "5A/RLi2",
        "BoDs2": "Fo/BoDs2",
        "BoDs1": "Fo/BoDs1",
        "BiDs2": "Fi/BiDs2",
        "BiDs2": "Fi/BiDs2",
        "FCCoSp2": "(F)CCoSp2",
        "CSSp2": "(F)CSSp2",
        "CSSp1": "(F)CSSp1",
        "FCSSp4": "(F)CSSp4",
        "FCSSp3": "(F)CSSp3",
        "CCSp2": "(F)CCSp2",
        "5ALi2": "5A/RLi2",
        "5SLi4": "5T/SLi4",
        "BiDs1": "Fi/BiDs1",
        "BiDs2": "F/BiDs2",
        "BoDs1": "Fo/BoDs1",
        "CCoSp3V": "(F)CCoSp3",
        "CCoSp2V": "(F)CCoSp2",
        "CCoSp4V": "(F)CCoSp4",
        "CCoSp1V": "(F)CCoSp1",
        "FCCoSp1V": "(F)CCoSp1",
        "FCCoSp2V": "(F)CCoSp2",
        "FCCoSp3V": "(F)CCoSp3",
        "FCCoSp4V": "(F)CCoSp4",
        "PCoSp3V": "PCoSp3",
        "PCoSp2V": "PCoSp2",
        "PCoSp4V": "PCoSp4",
        "PCoSp1V": "PCoSp1",
        "FCUSp4": "(F)CUSp4",
        "BoDsB": "Fo/BoDsB",
        "PCoSpBV": "PCoSpB",
        "5SLi1": "5T/SLi1",
        "5SLi2": "5T/SLi2",
        "5SLi3": "5T/SLi3",
        "T": "1T"
    }
    # Fall back to the trimmed string when no conversion is listed
    return convert_aspect.get(trimmed_aspect, trimmed_aspect)

In [13]:
def downgrade_fs_aspect(aspect_string):
    trimmed_aspect = re.sub(r"<<$", "", aspect_string).rstrip("e")
    downgrade_dict = {
        "4F": "3F",
        "3F": "2F",
        "2F": "1F",
        "4Lz": "3Lz",
        "3Lz": "2Lz",
        "4Lo": "3Lo",
        "3Lo": "2Lo",
        "2Lo": "1Lo",
        "4S": "3S",
        "3S": "2S",
        "2S": "1S",
        "4T": "3T",
        "3T": "2T",
        "2T": "1T",
        "3A": "2A",
        "2A": "1A",
        "3TwB": "2TwB",
        "1T": None,
        "1Lo": None
    }
    # Fall back to the trimmed string when no downgrade is listed;
    # jumps that cannot be downgraded further map to None
    return downgrade_dict.get(trimmed_aspect, trimmed_aspect)

In [14]:
def translate_sk_goe(row):
    if row["season"] == 2016:
        sov_df = fs_goe_adj_2016_17
    else:
        sov_df = fs_goe_adj_2017_18
    score_col = str(int(row["score"]))
    # If the score is 0, then the translated GOE is 0
    if row["score"] == 0:
        return 0
    # Positive scores need a "+" in front of them for the column lookup
    elif ("-" not in score_col):
        score_col = "+" + score_col
    value = None
    # No info_flag is simple case, ! is a warning and doesn't affect the translation
    if pd.isnull(row["info_flag"]) or (row["info_flag"] == "!"):
        aspect = clean_fs_aspect(row["aspect_desc"])
        if "+" in aspect:
            aspects = aspect.split("+")
            highest_bv = 0
            value_highest_element = 0
            for a in aspects:
                if a in ["COMBO", "REP", "SEQ"]:
                    pass
                else:
                    a = clean_fs_aspect(a)
                    aspect_bv = sov_df.set_index("code").loc[a.strip("!")]["base"]
                    if aspect_bv > highest_bv:
                        highest_bv = aspect_bv
                        value_highest_element = sov_df.set_index("code").loc[a.strip("!")][score_col]
            value = value_highest_element
        # There was one case of a downgraded jump with an edge warning
        elif "<<" in row["aspect_desc"]:
            a = downgrade_fs_aspect(row["aspect_desc"])
            value = sov_df.set_index("code").loc[a][score_col]
        else:
            value = sov_df.set_index("code").loc[clean_fs_aspect(aspect.strip("!"))][score_col]
        

    # These are single jumps with flags
    elif ("+" not in row["aspect_desc"]):
        aspect = clean_fs_aspect(row["aspect_desc"])
        # This is an underrotated jump, it affects base value
        if (row["info_flag"] == "<"):
            value = sov_df.set_index("code").loc[aspect][score_col]
        # The edge call affects the base value
        elif (row["info_flag"] == "e"):
            # Edge calls can have a downgrade too
            if "<<" in row["aspect_desc"]:
                aspect = downgrade_fs_aspect(row["aspect_desc"])
                value = sov_df.set_index("code").loc[aspect][score_col]
            else:
                value = sov_df.set_index("code").loc[aspect][score_col]
        elif (row["info_flag"] == "<<"):
            aspect = downgrade_fs_aspect(row["aspect_desc"])
            value = sov_df.set_index("code").loc[aspect][score_col]
    
    else:
        aspects = row["aspect_desc"].split("+")
        highest_bv = 0
        value_highest_element = 0
        for a in aspects:
            if a in ["COMBO", "REP", "SEQ"]:
                pass
            else:
                if "*" in a:
                    aspect_bv = 0
                elif "<<" in a:
                    a = downgrade_fs_aspect(a)
                    # Sometimes the lowest-difficulty jumps get downgraded; 
                    # in that case, `downgrade_fs_aspect` returns None
                    if a is None:
                        aspect_bv = 0
                    else:
                        aspect_bv = sov_df.set_index("code").loc[a]["base"]
                elif "<" in a:
                    a = clean_fs_aspect(a)
                    aspect_bv = sov_df.set_index("code").loc[a]["base"] * 0.7
                else:
                    a = clean_fs_aspect(a)
                    aspect_bv = sov_df.set_index("code").loc[a]["base"]
                if "e" in (a or ""):
                    aspect_bv = aspect_bv * 0.7
                if aspect_bv > highest_bv:
                    highest_bv = aspect_bv
                    value_highest_element = sov_df.set_index("code").loc[a][score_col]
        value = value_highest_element          
    return value
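
For jump combinations, the function above takes the GOE value from the jump with the highest base value. The sketch below isolates that rule for a combination without any flags (`<`, `<<`, `e`, `*`). The table values are illustrative assumptions rather than rows read from the ISU CSVs, and `SOV`, `score_to_column`, and `translate_combo` are hypothetical names.

```python
# Illustrative Scale of Values rows; these numbers are assumptions for the
# sketch and are not read from the ISU tables used in this notebook.
SOV = {
    "3Lz": {"base": 6.0, "+1": 0.7, "-1": -0.7},
    "2T": {"base": 1.3, "+1": 0.2, "-1": -0.2},
}


def score_to_column(score):
    """Map a judge's nonzero integer grade to its column label, e.g. 1 -> '+1'."""
    label = str(int(score))
    return label if label.startswith("-") else "+" + label


def translate_combo(aspect_desc, score):
    """Translate a GOE for a flag-free combination such as '3Lz+2T'."""
    if score == 0:
        return 0
    column = score_to_column(score)
    jumps = [
        j for j in aspect_desc.split("+") if j not in ("COMBO", "REP", "SEQ")
    ]
    # The jump with the highest base value determines the GOE value.
    top = max(jumps, key=lambda j: SOV[j]["base"])
    return SOV[top][column]


print(translate_combo("3Lz+2T", -1))  # -0.7
```

The order of the jumps does not matter here: `"2T+3Lz"` would be translated with the `3Lz` row as well.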

In [15]:
def translate_id_goe(row):
    if row["season"] == 2016:
        sov_df = id_goe_adj_2016_17
    else:
        sov_df = id_goe_adj_2017_18
    trimmed_aspect = row["aspect_desc"].rstrip("<").strip()
    score_col = str(int(row["score"]))
    # If the score is 0, then the translated GOE is 0
    if row["score"] == 0:
        return 0
    # Positive scores need a "+" in front of them for the column lookup
    elif ("-" not in score_col):
        score_col = "+" + score_col
    value = None
    if "kp" in trimmed_aspect:
        # Key Points don't affect score, just the initial grade
        aspect = trimmed_aspect.split("+")[0]
        value = sov_df.set_index("code").loc[aspect][score_col]
    elif "+" in trimmed_aspect:
        aspects = trimmed_aspect.split("+")
        total_value = 0
        for a in aspects:
            total_value += sov_df.set_index("code").loc[a][score_col]
        value = total_value
    else:
        value = sov_df.set_index("code").loc[trimmed_aspect][score_col]
    return value
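
The ice dancing function treats `+` in two ways: a key-point suffix (containing `kp`) is dropped before the lookup, while genuinely combined elements have their GOE values added together. The sketch below mirrors those two branches. The element codes and values in `ID_SOV` are assumptions made up for illustration, and `translate_dance` is a hypothetical name.

```python
# Illustrative ice dance rows; codes and values are assumptions for the sketch.
ID_SOV = {
    "SlLi4": {"+1": 0.6, "-1": -0.5},
    "RoLi4": {"+1": 0.6, "-1": -0.5},
    "1RH3": {"+1": 0.6, "-1": -0.5},
}


def translate_dance(aspect_desc, column):
    """Translate one judge's grade, given as a column label such as '+1'."""
    parts = aspect_desc.split("+")
    if "kp" in aspect_desc:
        # Key points don't change the lookup; use the element code itself.
        return ID_SOV[parts[0]][column]
    # Combined elements: add up the value of every part.
    return sum(ID_SOV[p][column] for p in parts)


print(translate_dance("1RH3+kpYNY", "+1"))   # 0.6
print(translate_dance("SlLi4+RoLi4", "-1"))  # -1.0
```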

In [16]:
def translate_goe(row):
    if "ICE DANCE" in row["program"]:
        return translate_id_goe(row)
    else:
        return translate_sk_goe(row)

In [17]:
elements["judge_goe"] = elements.apply(translate_goe, axis=1)

Example translations:

In [18]:
elements[[ 
    "season",
    "aspect_desc", 
    "info_flag", 
    "score", 
    "judge_goe" 
]].head(20).drop_duplicates()

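| | season | aspect_desc | info_flag | score | judge_goe |
|---|---|---|---|---|---|
| 36 | 2017 | 3Tw2 | | -1.0 | -0.7 |
| 39 | 2017 | 3Tw2 | | 0.0 | 0.0 |
| 41 | 2017 | 3Tw2 | | 1.0 | 0.7 |
| 43 | 2017 | 3Tw2 | | -2.0 | -1.4 |
| 45 | 2016 | CoSp4 | | 2.0 | 1.2 |
| 46 | 2016 | CoSp4 | | 1.0 | 0.6 |
| 72 | 2016 | FCCoSp4 | | 1.0 | 0.5 |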


## Check that the translated GOEs are correct

The protocol PDFs do include the final *overall* translated Grade of Execution for each technical element. To check that we have correctly translated each individual judge's Grade of Execution, this code recalculates the overall translated GOE from our computed values and then checks that it equals the overall GOE in the protocol PDF.

In [19]:
def round_goe(goe):
    return float("{:.2f}".format(goe))

In [20]:
aspect_grp = elements.groupby(["aspect_id"])

In [21]:
check_df = pd.DataFrame({
    "translated_goes": aspect_grp["judge_goe"].apply(lambda x: x.values),
    "base_value": aspect_grp["base_value"].first(),
    "scores_of_panel": aspect_grp["scores_of_panel"].first(),
    "given_goe": aspect_grp["goe"].first()
})

The overall translated GOE for a technical element is calculated by dropping the single highest and single lowest judge values and then taking the mean of the remaining values (i.e., the "trimmed mean"). The cell below goes through that process for our translated GOEs.

In [22]:
check_df["sorted_goes"] = check_df["translated_goes"].apply(sorted)
check_df["trimmed_goes"] = check_df["sorted_goes"].apply(lambda x: x[1:-1])
check_df["calculated_total_goe"] = check_df["trimmed_goes"].apply(lambda x: round_goe(sum(x)/len(x)))
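
As a small worked example of the trimmed mean, the sketch below applies the same steps to one made-up nine-judge panel in plain Python. The panel values are invented for illustration, and `trimmed_mean` is a hypothetical helper that assumes at least three values.

```python
def round_goe(goe):
    # Same two-decimal rounding as the notebook's helper.
    return float("{:.2f}".format(goe))


def trimmed_mean(values):
    """Drop one highest and one lowest value, then average the rest."""
    kept = sorted(values)[1:-1]
    return round_goe(sum(kept) / len(kept))


# Made-up translated GOEs from a nine-judge panel.
panel = [0.7, 0.0, -1.4, 0.7, 0.7, -0.7, 0.0, 0.7, 0.7]

# Sorted: [-1.4, -0.7, 0.0, 0.0, 0.7, 0.7, 0.7, 0.7, 0.7]
# Kept:   [-0.7, 0.0, 0.0, 0.7, 0.7, 0.7, 0.7], which sums to 2.1
print(trimmed_mean(panel))  # 0.3
```

Only one copy of a tied extreme is dropped: the panel has five `0.7` values, and four of them remain in the average.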

The overall translated GOE given in the protocol PDF should equal the panel's overall score for the element minus the element's base value. The rest of this notebook calculates the overall translated GOE from those two values in the protocol PDF and then checks it against both the overall translated GOE printed in the protocol PDF and the overall translated GOE calculated from our individual judge GOE translations.

In [23]:
check_df["goe_from_protocol"] = (check_df["scores_of_panel"] - check_df["base_value"]).apply(round_goe)
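
Rounding the difference matters because the comparison that follows uses exact equality, and subtracting two decimal values in binary floating point rarely gives a clean two-decimal result. The sketch below illustrates this with a made-up panel score and base value.

```python
def round_goe(goe):
    # Same two-decimal rounding as the notebook's helper.
    return float("{:.2f}".format(goe))


# Made-up panel score and base value for a single element.
scores_of_panel = 6.1
base_value = 5.4

raw_difference = scores_of_panel - base_value

print(raw_difference == 0.7)             # False: binary floating-point error
print(round_goe(raw_difference) == 0.7)  # True after rounding
```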

In [24]:
check_df[["calculated_total_goe", "goe_from_protocol", "given_goe"]].head()

| aspect_id | calculated_total_goe | goe_from_protocol | given_goe |
|---|---|---|---|
| 000df5399a | -0.2 | -0.2 | -0.2 |
| 000f259b7c | 0.77 | 0.77 | 0.77 |
| 0015b0e54a | 0.36 | 0.36 | 0.36 |
| 00184539f1 | 0.14 | 0.14 | 0.14 |
| 001d80257c | 0.9 | 0.9 | 0.9 |


In [25]:
assert (check_df["calculated_total_goe"] != check_df["given_goe"]).sum() == 0

In [26]:
assert (check_df["calculated_total_goe"] != check_df["goe_from_protocol"]).sum() == 0
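
Both assertions pass silently when every element matches. If one ever failed, it could help to look at the offending rows rather than only their count. The sketch below shows one possible way to do that on a toy frame; the IDs and values are made up, and the last row is deliberately inconsistent.

```python
import pandas as pd

# Toy stand-in for check_df; the last row is deliberately inconsistent.
check_df = pd.DataFrame(
    {
        "calculated_total_goe": [-0.2, 0.77, 0.36],
        "given_goe": [-0.2, 0.77, 0.30],
    },
    index=pd.Index(["id_a", "id_b", "id_c"], name="aspect_id"),
)

# Select the rows where the recomputed GOE disagrees with the protocol.
mismatches = check_df[
    check_df["calculated_total_goe"] != check_df["given_goe"]
]

print(len(mismatches))
print(mismatches.index.tolist())
```

The resulting `aspect_id` values can then be traced back to the element codes and flags that the translation functions did not handle as expected.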

In [27]:
elements[["aspect_id", "judge", "judge_goe"]].to_csv("../data/processed/judge-goe.csv", index=False)
