# Data setup

In this notebook, we run all of the functions necessary for setting up the data that is later displayed on the website.


In [36]:
from helper_functions import setup
from helper_functions import SPORTS_EVENTS, Subteam
import numpy as np


## Sanitize player data

First, we simply load the responses and anonymize them.

In [34]:
df = setup.sanitize_and_anonymize_data(overwrite=True, verbose=False, anonymize=False)

print(f"{len(df)} entries, of which {np.sum(~df.is_postdoc)} are PhDs and {np.sum(df.is_postdoc)} are postdocs")

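# Conflicting sports: uncomment to inspect players signed up for concurrent events.
# np.sum(df["volleyball"] & df["basketball"])
# np.sum(df["football"] & df["tennis"])
# df[df["capture_the_flag"] & df["spikeball"]]["nickname"].tolist()

# E-mail domain of every response; "???" marks a missing address.
endings = [email.split("@")[1] if email != "???" else email for email in df["email"]]
# Number of responses without an e-mail address.
n_missing_email = sum(end == "???" for end in endings)
# df[df["nickname"] == "Magnificent Barracuda"]

# Export the nicknames to a CSV file.
df["nickname"].to_csv("animal_names.csv", index=False)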


90 entries, of which 70 are PhDs and 20 are postdocs
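
The nicknames in this notebook follow an "Adjective Animal" pattern. As a rough illustration of how such an anonymization step could work, here is a minimal sketch. The word lists and the function `assign_nicknames` are hypothetical and not taken from `helper_functions`; the actual logic inside `sanitize_and_anonymize_data` may differ.

```python
import random

# Hypothetical word lists; the lists used by the project are not shown here.
ADJECTIVES = ["Magnificent", "Animated", "Kindly", "Nutty"]
ANIMALS = ["Barracuda", "Yak", "Clownfish", "Sheep"]


def assign_nicknames(emails, seed=0):
    """Map each (unique) e-mail address to a unique 'Adjective Animal' nickname."""
    rng = random.Random(seed)
    # Build every possible combination, then shuffle so the assignment is random.
    pool = [f"{adj} {animal}" for adj in ADJECTIVES for animal in ANIMALS]
    if len(emails) > len(pool):
        raise ValueError("Not enough nickname combinations for all players.")
    rng.shuffle(pool)
    # zip stops at the shorter sequence, so each e-mail gets exactly one nickname.
    return dict(zip(emails, pool))


nicknames = assign_nicknames(["a@example.org", "b@example.org", "c@example.org"])
print(nicknames)
```

Drawing from a shuffled pool of all combinations guarantees that no two players share a nickname, which matters if the nickname is later used as an identifier.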


In [39]:
from helper_functions.setup.openai_image_download import generate_all_images, save_resized_animal_images
animals = [animal.lower() for animal in df["nickname"]]
# The following operation uses up openai credits.
generate_all_images([])
save_resized_animal_images(150)


## Team generation

Since we now have all the player data, including the sports for which each player is available, we can generate the teams based on this information.

We want to keep the teams balanced with regard to all sports; this is handled in the `create_teams` routine.

In [41]:
teams = setup.create_teams()
teams[0].player_df.head(3)


| Unnamed: 0 | nickname | institute | is_postdoc | avail_monday | avail_tuesday | avail_thursday | avail_friday | wants_basketball | basketball | wants_running_sprints | ... | spikeball | wants_beer_pong | beer_pong | wants_fooseball | fooseball | wants_ping_pong | ping_pong | num_sports | num_sports_not_avail | late_entry |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 3 | Magnificent Barracuda | MPE | False | True | True | True | True | False | False | True | ... | True | True | True | True | True | True | True | 9 | 0 | False |
| 21 | Animated Yak | MPE | False | True | True | True | True | False | False | False | ... | True | True | True | True | True | True | True | 8 | 0 | False |
| 60 | Failing Muskrat | IPP | False | True | True | True | True | True | True | False | ... | False | False | False | False | False | True | True | 6 | 0 | False |
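
To illustrate what balancing teams across all sports can mean, here is a minimal greedy sketch. It is not the implementation of `create_teams`, whose internals are not shown in this notebook; the function `greedy_balanced_teams` and the player names are hypothetical.

```python
from collections import defaultdict


def greedy_balanced_teams(players, n_teams):
    """Assign players to teams so that per-sport head counts stay close.

    `players` maps a nickname to the set of sports that player signed up for.
    """
    teams = [[] for _ in range(n_teams)]
    # sport_counts[i][sport] = number of players in team i who play `sport`
    sport_counts = [defaultdict(int) for _ in range(n_teams)]

    # Place the most versatile players first; they affect the most sports.
    order = sorted(players, key=lambda name: (-len(players[name]), name))
    for name in order:
        sports = players[name]
        # Pick the team that currently has the fewest players in this
        # player's sports; break ties by team size, then by team index.
        best = min(
            range(n_teams),
            key=lambda i: (sum(sport_counts[i][s] for s in sports), len(teams[i]), i),
        )
        teams[best].append(name)
        for s in sports:
            sport_counts[best][s] += 1
    return teams


example_players = {
    "P1": {"chess", "tennis"},
    "P2": {"chess", "tennis"},
    "P3": {"chess"},
    "P4": {"chess"},
}
example_teams = greedy_balanced_teams(example_players, n_teams=2)
print(example_teams)  # [['P1', 'P3'], ['P2', 'P4']]
```

A greedy pass like this is simple and deterministic, but it does not guarantee an optimal balance; a real routine might add random restarts or swaps to improve on it.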


## Subteam generation

Now we're getting to the spicy stuff!

We can generate the subteams for each main team, but there are a few caveats:

- Some sports take place simultaneously. The SportEvent class keeps track of these overlaps, and the weights assigned while subteams are drawn take them into account.
- Some players have chosen only one sport. To make sure they can take part in it, we also adjust their weights during the draw.
- Some sports do not use the traditional subteam generation, which we need to account for.
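
The weighting idea from the first two caveats can be sketched as a weighted draw without replacement. This is only an illustration under assumptions: the function `draw_subteam`, the `boost` and `penalty` values, and the player names are hypothetical and do not come from `helper_functions`.

```python
import numpy as np


def draw_subteam(players, size, n_sports, busy, rng, boost=5.0, penalty=0.2):
    """Draw `size` distinct players with adjusted selection weights.

    players  : list of nicknames available for this sport
    n_sports : dict nickname -> number of sports the player signed up for
    busy     : set of nicknames already booked in a concurrent event
    """
    weights = np.ones(len(players), dtype=float)
    for i, name in enumerate(players):
        if n_sports[name] == 1:
            weights[i] *= boost    # single-sport players should get a spot
        if name in busy:
            weights[i] *= penalty  # avoid double-booking where possible
    probs = weights / weights.sum()
    # Weighted sampling of indices without replacement.
    chosen = rng.choice(len(players), size=size, replace=False, p=probs)
    return [players[i] for i in chosen]


rng = np.random.default_rng(seed=0)
example_pool = ["P1", "P2", "P3", "P4"]
example_n_sports = {"P1": 1, "P2": 3, "P3": 2, "P4": 4}
example_busy = {"P2"}
print(draw_subteam(example_pool, 2, example_n_sports, example_busy, rng))
```

Because the weights only shift probabilities, a double-booking can still occur, which is consistent with the need for a separate conflict-resolution pass afterwards.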

### Generate the subteams for each sport

For running/sprints, everyone competes on their own, and there are several different events.\
As a first iteration, we simply put each player in their own subteam.\
Conveniently, all reserve players are also taking part in the other concurrent events.

The same applies to chess.
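
For individual sports, the grouping described above amounts to one subteam per player. The sketch below assumes a subteam key of the form `<team letter>_<index>`, loosely modelled on keys such as `A_1` that appear in the log output; both this interpretation and the function `individual_subteams` are assumptions, not the project's actual code.

```python
def individual_subteams(nicknames, team_letter):
    """Put every player into a subteam of their own, e.g. 'A_1', 'A_2', ..."""
    return {
        f"{team_letter}_{i}": [name]
        for i, name in enumerate(nicknames, start=1)
    }


solo = individual_subteams(["P1", "P2", "P3"], "A")
print(solo)  # {'A_1': ['P1'], 'A_2': ['P2'], 'A_3': ['P3']}
```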

In [45]:
from helper_functions.setup import generate_all_subteams 

for team in teams:
    all_subteams = generate_all_subteams(team)
    team.add_subteam_keys(all_subteams)
    team.create_backup()


----------------------------------------
 volleyball basketball
WARN: The following players are still set as reserve for both sports: {'Excited Rabbit'}
----------------------------------------
 football tennis
WARN: No solution found for Magnificent Barracuda, they are currently double-booked.
----------------------------------------
 capture_the_flag spikeball
capture_the_flag: Switched out Animated Yak with Nutty Sheep from A_1 to A_R
capture_the_flag: Switched out Magnificent Barracuda with Failing Muskrat from A_1 to A_R
capture_the_flag: Switched out Kindly Clownfish with Frightening Avocet from A_1 to A_R
WARN: No solution found for Nice Albatross, they are currently double-booked.
----------------------------------------
 volleyball basketball
WARN: The following players are still set as reserve for both sports: {'Trivial Uguisu'}
----------------------------------------
 football tennis
----------------------------------------
 capture_the_flag spikeball
capture_the_flag: Swit