# STA 141B Assignment 3

### Preliminaries
Due __February 27__ by __11:59pm__.

Submit your work by uploading it to Gradescope. Submission requires two files: the original Jupyter Notebook and its PDF export.
Please rename this file as "H1_Lastname_Firstname_srnr", where srnr are the last four digits of your student's ID number and do the same for the PDF export file.

### Objective

The objective of this homework assignment is to solidify your understanding of and proficiency with __APIs__, and, more generally, with Web scraping.

### Instructions

1. Provide your solutions in new cells between the `Solution START` cell and the `Solution END` cell. Create as many new cells as necessary within these two blocks. Use code cells for your Python scripts and Markdown cells for explanatory text or answers to non-coding questions.

2. You must execute the code following every `Validation` block to get credits for the corresponding task. Failure to do so may result in a loss of points.

3. Prioritize code readability. Just as in writing a book, the clarity of each line matters. Adopt the __one-statement-per-line__ rule. If you have a lengthy code statement, consider breaking it into multiple lines for clarity. Note you can use `'''` to start and end strings in Python that are written over multiple lines.

4. To help understand and maintain code, you should add comments to explain your code. Use the hash symbol (#) to start writing a comment.

5. Submit your work by uploading it to __Gradescope__. Submission requires two files: the original Jupyter Notebook (.ipynb) and its PDF export. To convert your Jupyter notebook file into a PDF, navigate to "File", select "Download as", and then choose either "PDF via LaTeX" or "HTML". If "PDF via LaTeX" does not work for you, export to "HTML", and then use Chrome to print the .html file into PDF.

6. This assignment will be graded on your proficiency in programming. Be sure to demonstrate your abilities and submit your own, correct and readable solutions.

### Code of conduct

The usage of AI for this homework is strictly forbidden.

### Setting

We will use the [lichess](https://lichess.org/api) API to retrieve some information about the current state of chess in the world. In order to answer below questions, make precise and economical requests. This API is well documented, so please make sure to read the documentation before working on the tasks.

Kindly note that the __results depend on the time of your request.__ Thus, your answer might be correct even if you get a different output as provided at the end of your task.

In [15]:
import requests
import json
import pandas as pd
import time

from datetime import datetime

## Exercise 1 [10 points]

#### Examples

The following examples are provided to help you for the task. Note that these data depend on the time of your request and therefore you might get different results.

### 1a) [1 Points]

#### Task

Consider the top 10 chess players (in terms of their rating) of some chess `variant` in lichess.

Write a function that takes a string `variant` as input and returns a list or Pandas.Series consisting of their __ids__.

Afterwards, create a Pandas.Series called `top_pls` that contains the id's of the top10 players of __classical__ chess on lichess by using this function.

#### Solution START

All code for this task must be written between this `Solution START` and the following `Solution END` block.

In [16]:
def top10_ids(variant):
    url = f"https://lichess.org/api/player/top/10/{variant}"
    response = requests.get(url)
    data = response.json()

    # Extract player ids
    ids = [player["id"] for player in data["users"]]
    return ids

# Creating Pandas Series for classical top 10
top_pls = pd.Series(top10_ids("classical"))

#### Solution END

#### Examples

The following examples are provided to help you for the task. Note that these data depend on the time of your request and therefore you might get different results.

In [17]:
top_pls[0:2]

0      yuuki-asuna
1    chesstheory64
dtype: object

#### Validation
Please run the following code lines. Wrong results or errors in the following code may still get partial credits - as long as the following code is executed.

In [18]:
print(type(top_pls))
print(len(top_pls))

<class 'pandas.core.series.Series'>
10


In [19]:
top_pls

0       yuuki-asuna
1     chesstheory64
2    universalruler
3    theunknownplay
4          sandbad9
5          ojaijoao
6    vlad_lazarev79
7      inner___join
8            emandr
9       elwarrior80
dtype: object

### 1b) [1 Points]

#### Task

Create a list or Pandas.Series `variants` consisting of all chess styles on lichess (this includes both chess games with different rules and at different speed). Do NOT fill the list by hand. Instead create the list by post-processing a request that returns all TOP10 lists of all chess variants and retrieve the variants from there. In particular, note that Puzzles are not considered as a chess variant.

#### Solution START

All code for this task must be written between this `Solution START` and the following `Solution END` block.

In [20]:
url = "https://lichess.org/api/player"
response = requests.get(url, headers={"Accept": "application/json"})
data = response.json()

variants = [key for key in data.keys() if key != "puzzle"] # Key not considered as a variant


#### Solution END

#### Examples

The following examples are provided to help you for the task.

In [21]:
len(variants)

13

In [22]:
variants[0:2]

['bullet', 'blitz']

#### Validation
Please run the following code lines. Wrong results or errors in the following code may still get partial credits - as long as the following code is executed.

In [23]:
variants

['bullet',
 'blitz',
 'rapid',
 'classical',
 'ultraBullet',
 'crazyhouse',
 'chess960',
 'kingOfTheHill',
 'threeCheck',
 'antichess',
 'atomic',
 'horde',
 'racingKings']

### 1c) [2 Points]

#### Task
Create a DataFrame `df` that contains the ids of the top10 players of all chess styles. Each column shall contain the ids of the top10 players of a certain chess style. You may either use functions from previous tasks or process a different request.

Afterwards, create a list or Pandas.Series `multi_talents` of all player id's that appear in more than one top10 list. The list shall contain the id as index and the number of occurences as values.

#### Task description

#### Solution START

All code for this task must be written between this `Solution START` and the following `Solution END` block.

In [24]:
topdict = {}
# Each variant key maps to a list of player dicts with an "id" field
for variant in variants:
    topdict[variant] = [player["id"] for player in data[variant]]

df = pd.DataFrame(topdict)

# Find players that appear in more than one top-10 list
# Stack IDs into one flat Series, then count occurrences
all_ids = pd.Series(df.values.flatten())
id_counts = all_ids.value_counts()
multi_talents = id_counts[id_counts > 1]


#### Solution END

#### Examples

The following examples are provided to help you for the task. Note that these data depend on the time of your request and therefore you might get different results.

In [25]:
df.head(1)

Unnamed: 0,bullet,blitz,rapid,classical,ultraBullet,crazyhouse,chess960,kingOfTheHill,threeCheck,antichess,atomic,horde,racingKings
0,mraquariyaz67,calisthenicsboy,yuuki-asuna,yuuki-asuna,mraquariyaz67,catask,zhigalko_sergei,woshigeshagua,statham_13,outoff0rm,wolfram_ep,matvei-e2e4,seth_777


In [26]:
multi_talents[0:2]

mraquariyaz67    4
yuuki-asuna      3
Name: count, dtype: int64

#### Validation
Please run the following code lines. Wrong results or errors in the following code may still get partial credits - as long as the following code is executed.

In [27]:
df

Unnamed: 0,bullet,blitz,rapid,classical,ultraBullet,crazyhouse,chess960,kingOfTheHill,threeCheck,antichess,atomic,horde,racingKings
0,mraquariyaz67,calisthenicsboy,yuuki-asuna,yuuki-asuna,mraquariyaz67,catask,zhigalko_sergei,woshigeshagua,statham_13,outoff0rm,wolfram_ep,matvei-e2e4,seth_777
1,arkadiy_khromaev,wonderland305,just_no_fun,chesstheory64,rush_g7,larso,emirislamking,johnsam1,mraquariyaz67,tetiksh1agrawal,maxwellssilvrhammer,stubenfisch,royalmaniac
2,chess-art-us,mraquariyaz67,ilqar_7474,universalruler,tamojerry,visualdennis,maratgilfanovyoutube,amirreza_p,trixr4kidzzz,pinni7,judebaetorrens,horus_88,m-a01
3,emirislamking,cutemouse83,aspiringstar,theunknownplay,ohanyaneminchess,crazy_eight,johnsam1,yakov25,lion2006-45,devansh2008,jakestatefarm,kutlugbilgekagan,queeneatingdragon
4,boozyguy,woshigeshagua,kurald_galain,sandbad9,trixr4kidzzz,littleplotkin,hanskai,aqua_blazing,littleplotkin,bonab_bakery,ihatespammers,mereseberserknhi,cybershredder
5,sindarovgm,athena-pallada,tuzakli_egitim,ojaijoao,chess-art-us,jannlee,dfsocial,nurhan2020,knightrider-777,antichessepico,rkrounit,jerry_is_here,r2300
6,rush_g7,mlchael,olhasocomosejoga,vlad_lazarev79,legacyofbob,presh_1,sultai,aquashymath,maxwellssilvrhammer,antichess_valentino,chrisrapid,christopherius,natso
7,chessplayer202024,ciderdrinker,metan0ia,inner___join,rakitind,kingswitcher,jl202074,shnitez,yuuki-asuna,cagatayulusoy,haydurus365,emirislamking,nilsewon
8,ediz_gurel,legacyofbob,qucani,emandr,nu_aspect,bardiash76,pap-g,mereseberserknhi,vyga2012,sudenurk2,pashpash,mindhoonter-f5f6,panagiskosmatos
9,zhigalko_sergei,ilikeknightmost,yhm250903,elwarrior80,to3kandbeyond,neutralizerr,oleg_chess14,sultai,luukdegrote,theunknownguyreborn,rechesster,hod-konem96,playanotherone


In [28]:
print(multi_talents)

mraquariyaz67          4
yuuki-asuna            3
emirislamking          3
rush_g7                2
mereseberserknhi       2
legacyofbob            2
littleplotkin          2
sultai                 2
chess-art-us           2
maxwellssilvrhammer    2
johnsam1               2
trixr4kidzzz           2
woshigeshagua          2
zhigalko_sergei        2
Name: count, dtype: int64


In [29]:
type(multi_talents)

pandas.core.series.Series

### 1d) [1 Points]

#### Task

Write a function `get_user_info` that gets a user's ID as an argument and returns their name, title, rating on lichess of the _classical_ variant and the fide rating. Whenever one of these properties cannot be found, replace the corresponding entry with `None` (but return all the other properties). Furthermore, if a player has no profile at all, return `None` for each of their properties.

#### Solution START

In [30]:
def get_user_info(user_id):
    user_id = user_id.strip() # remove whitespaces 

    url = f"https://lichess.org/api/user/{user_id}"
    response = requests.get(url)

    # If user does not exist then liches 404 error!
    if response.status_code != 200:
        return (None, None, None, None)
    
    data = response.json()
    name = data.get("username", None)
    title = data.get("title", None)

    # Classical Rating
    # Stored under profile.realName, None if missing
    classical_rating = None
    if "perfs" in data and "classical" in data["perfs"]:
        classical_rating = data["perfs"]["classical"].get("rating", None)

    # FIDE rating
    # Stored under perfs, None if missing
    fide_rating = None
    if "profile" in data:
        fide_rating = data["profile"].get("fideRating", None)
    
    return [name,title,classical_rating,fide_rating]


#### Solution END

#### Examples

The following examples are provided to help you for the task. Note that these data depend on the time of your request and therefore you might get different results.

In [31]:
get_user_info('kurald_galain ')

['Kurald_Galain', None, 2292, None]

#### Validation
Please run the following code lines. Wrong results or errors in the following code may still get partial credits - as long as the following code is executed.

In [32]:
get_user_info('muisback')

['muisback', 'GM', 1500, 2659]

In [33]:
get_user_info('chesstheory64')

['ChessTheory64', 'FM', 2585, 2280]

In [34]:
get_user_info('mysterious-master')

['Mysterious-Master', None, 2429, None]

### 1e) [2 Points]

#### Task

Write to functions `get_best_style` and `get_favourite_style` that take the player's ID as input and do the following:

`get_best_style` calculates the best current rating of that player within all variants and returns a dictionary consisting of the best variant as keyword and the corresponding rating as value. If the best variant is not unique, then the dictionary shall contain all best variants.

`get_favourite_style` calculates for each variant the number of games a player has played and returns a dictionary consisting of the variant the player has played most often as a keyword and the number of games as a value. If the most played variant is not unique, then the dictionary shall contain all such variants.

#### Solution START

In [35]:
def get_best_style(user_id):
    user_id = user_id.strip()
    
    url = f"https://lichess.org/api/user/{user_id}"
    response = requests.get(url, headers={"Accept": "application/json"})
    data = response.json()
    
    perfs = data.get("perfs", {})
    
    # Build a dict for ratings skip entries with no rating key
    ratings = {
        variant: info["rating"]
        for variant, info in perfs.items()
        if "rating" in info
    }
    
    if not ratings:
        return {}
    
    # Find the maximum rating value
    max_rating = max(ratings.values())
    
    # Return all variants that share the maximum rating
    max_variants = {var: rating for var, rating in ratings.items() if rating == max_rating}
    return max_variants

def get_favourite_style(user_id):
    user_id = user_id.strip()
    
    url = f"https://lichess.org/api/user/{user_id}"
    response = requests.get(url, headers={"Accept": "application/json"})
    data = response.json()
    
    perfs = data.get("perfs", {})
    
    # Build a dict of games played
    games_played = {
        variant: info["games"]
        for variant, info in perfs.items()
        if "games" in info
    }
    
    if not games_played:
        return {}
    
    # Find the maximum game count
    max_games = max(games_played.values())
    
    # Return all variants tied for the most games
    favourite = {var: games for var, games in games_played.items() if games == max_games}
    return favourite


#### Solution END

#### Examples

The following examples are provided to help you for the task. Note that these data depend on the time of your request and therefore you might get different results.

In [36]:
get_best_style('kurald_galain ')

{'bullet': 3016}

In [37]:
get_favourite_style('kurald_galain ')

{'bullet': 39000}

#### Validation
Please run the following code lines. Wrong results or errors in the following code may still get partial credits - as long as the following code is executed.

In [38]:
get_best_style(top_pls[0])

{'bullet': 2915}

In [39]:
get_best_style('muisback')

{'bullet': 3202}

In [40]:
get_favourite_style(top_pls[0])

{'correspondence': 17639}

In [41]:
get_favourite_style('muisback')

{'bullet': 12637}

### 1f) [1 Points]

#### Task

Define a function `get_best_rating` that takes the user's id and a variant as input and calculates the highest rating in the history of this player for this specific variant. The function shall return a dictionary consisting of the best ratings' date (the key) and the highest rating (as the dictionary value).

Note: if a user has not played any games of this variant yet, return the dictionary `{None: None}`.

#### Solution START

In [42]:
def get_best_rating(user_id, variant):
    user_id = user_id.strip()
    
    url = f"https://lichess.org/api/user/{user_id}/rating-history"
    response = requests.get(url, headers={"Accept": "application/json"})
    data = response.json()
    
    for entry in data:
        # Entry can be a dict OR a list 
        if isinstance(entry, (list, tuple)) and len(entry) >= 2:
            name = str(entry[0])
            points = entry[1]
        elif isinstance(entry, dict):
            name = entry.get("name", "")
            points = entry.get("points", [])
        else:
            continue  # skip anything unexpected

        if name == variant:
            if not points:
                return {None: None}

            # Build lists of dates and ratings from the points data
            dates = []
            ratings = []
            for point in points:
                year, month, day, rating = point
                # add 1 for 0-index
                dates.append(datetime(year, month + 1, day))
                ratings.append(rating)

            # Use a Pandas Series with datetime index for max lookup
            rating_series = pd.Series(ratings, index=dates)

            # idxmax() gives the date, max() gives the rating value
            best_date = rating_series.idxmax()
            best_val = rating_series.max()

            return {best_date: best_val}
        
    # If nothing found
    return {None: None}

#### Solution END

#### Examples

The following examples are provided to help you for the task. Note that these data depend on the time of your request and therefore you might get different results.

In [43]:
print(get_best_rating('kurald_galain', 'Bullet'))

{Timestamp('2025-03-06 00:00:00'): 3068}


#### Validation
Please run the following code lines. Wrong results or errors in the following code may still get partial credits - as long as the following code is executed.

In [44]:
print(get_best_rating(top_pls[0], 'Classical'))

{Timestamp('2026-02-27 00:00:00'): 2649}


In [45]:
print(get_best_rating('muisback', 'Classical'))

{None: None}


In [46]:
print(get_best_rating('muisback', 'Blitz'))

{Timestamp('2024-04-20 00:00:00'): 3067}


### 1g) [2 Points]

#### Task

Write a function `get_games` that takes two (different) user id's as arguments and returns the number of total games these two players played against each other.

Afterwards, find out against which player in the top10 list of the variant 'Classical' the player 'yuuki-asuna' played the most games and how many games they played. Display this player (e.g. by using the `print` command).

Hint: For this request, you may easily exceed the rate limit. To avoid this, wait for 10 seconds between each request.

#### Solution START

In [47]:
def get_games(user1, user2):
    url = f"https://lichess.org/api/crosstable/{user1}/{user2}"
    response = requests.get(url)
    data = response.json()
    
    return data.get("nbGames", 0)

results = {}

# Find who "yuuki-asuna" played most among classical top10
target = "yuuki-asuna"
best_opponent = None
best_games = 0

for player in top_pls:
    # Skip comparing player to themselves
    if player.lower() == target.lower():
        continue

    games = get_games(target, player)

    # Track maximum of the most games
    if games > best_games:
        best_games = games
        best_opponent = player

    # Avoid hitting API rate limit 
    time.sleep(10)

print(best_opponent, best_games)


ojaijoao 13


#### Solution END

## Exercise 2 [5 points]

### 2a) [1 Points]

#### Task

Write a function that takes a user_id 'opponent' as argument, loads all games between 'yuuki-asuna' and the player 'opponent' (regardless of the variant) and returns a DataFrame that consists of (at least) the following columns: `players`, `winner` and `opening`.


Hint: for this task, it may be helpful to use the ndjson package because the API returns an ndjson object. To do so, you might want to consider using the following lines:

`import ndjson`: Imports the package (don't forget to install it first)

`headers = {'Accept': 'application/x-ndjson'}`: Add headers to your request

`response.json(cls=ndjson.Decoder)`: Decode the response using the ndjson.Decoder.

#### Solution START

In [None]:
def get_all_games_vs(opponent):
    url = "https://lichess.org/api/games/user/yuuki-asuna"
    params = {
        "opponent": opponent,
        "opening": True,
        "max": get_games("yuuki-asuna", opponent)
    }
    
    headers = {"Accept": "application/x-ndjson"}
    response = requests.get(url, params=params, headers=headers)
    response.raise_for_status()

    # games takes games line by line for the format
    games = [json.loads(line) for line in response.text.splitlines()]
    df_games = pd.DataFrame(games)

    return df_games

#### Solution END

#### Validation
Please run the following code lines. Wrong results or errors in the following code may still get partial credits - as long as the following code is executed.

In [49]:
df = get_all_games_vs('byebob')

In [50]:
df

Unnamed: 0,id,rated,variant,speed,perf,createdAt,lastMoveAt,status,source,players,winner,moves,clock,opening,tournament,initialFen
0,FtvP1cpk,True,antichess,bullet,antichess,1772258265249,1772258348070,outoftime,lobby,"{'white': {'user': {'name': 'yuuki-asuna', 'ti...",white,e3 b5 Bxb5 Nf6 Bxd7 Nfxd7 Qh5 g6 Qxg6 hxg6 Nh3...,"{'initial': 30, 'increment': 0, 'totalTime': 30}",,,
1,g9qse7GC,True,antichess,bullet,antichess,1772258185608,1772258263149,variantEnd,lobby,"{'white': {'user': {'name': 'COLE_1PALMER', 'f...",black,b3 b6 g3 g5 e3 Ba6 Bxa6 Nxa6 b4 Nxb4 Qg4 Nxa2 ...,"{'initial': 30, 'increment': 0, 'totalTime': 30}",,,
2,1aM1BRAk,True,horde,bullet,horde,1772184078696,1772184142514,outoftime,lobby,"{'white': {'user': {'name': 'yuuki-asuna', 'ti...",white,d5 d6 c6 a6 d4 axb5 axb5 bxc6 bxc6 e6 b5 exd5 ...,"{'initial': 30, 'increment': 0, 'totalTime': 30}",,,
3,AsNjPhWU,True,horde,bullet,horde,1772184005865,1772184075736,variantEnd,lobby,{'white': {'user': {'name': 'LimeSuccinctAnand...,black,h5 e6 h4 d6 fxe6 fxe6 cxd6 cxd6 h3 a5 f5 Nd7 f...,"{'initial': 30, 'increment': 0, 'totalTime': 30}",,,
4,rw00mZ6B,True,kingOfTheHill,blitz,kingOfTheHill,1772174684905,1772175027409,mate,arena,"{'white': {'user': {'name': 'yuuki-asuna', 'ti...",white,e4 e5 Nf3 d6 d4 Qf6 Be2 h6 Nc3 c6 Be3 Be7 Qd2 ...,"{'initial': 180, 'increment': 2, 'totalTime': ...","{'eco': 'C41', 'name': 'Philidor Defense', 'pl...",Rmney4G3,
5,rZ1cNnAl,True,kingOfTheHill,blitz,kingOfTheHill,1772174297315,1772174566398,resign,arena,"{'white': {'user': {'name': 'yuuki-asuna', 'ti...",white,e4 e5 Nf3 d6 d4 Nd7 Bc4 c6 a4 Be7 Nc3 Ngf6 O-O...,"{'initial': 180, 'increment': 2, 'totalTime': ...","{'eco': 'C41', 'name': 'Philidor Defense: Hanh...",Rmney4G3,
6,obI4MQuG,True,kingOfTheHill,blitz,kingOfTheHill,1772174067364,1772174176423,resign,arena,"{'white': {'user': {'name': 'Hodge42', 'id': '...",black,e4 e6 h3 d5 exd5 exd5 Nf3 Nf6 d3 Bd6 Be2 O-O O...,"{'initial': 180, 'increment': 2, 'totalTime': ...","{'eco': 'C00', 'name': 'French Defense', 'ply'...",Rmney4G3,
7,nUcc9uVl,True,kingOfTheHill,blitz,kingOfTheHill,1772173791513,1772173954877,mate,arena,"{'white': {'user': {'name': 'GameN', 'id': 'ga...",black,e4 e6 d4 d5 f4 dxe4 c4 c5 Be3 cxd4 Qxd4 Qxd4 B...,"{'initial': 180, 'increment': 2, 'totalTime': ...","{'eco': 'C00', 'name': 'French Defense', 'ply'...",Rmney4G3,
8,amSfQyzN,True,crazyhouse,bullet,crazyhouse,1772173601842,1772173660549,outoftime,lobby,"{'white': {'user': {'name': 'yuuki-asuna', 'ti...",white,e4 e5 Nf3 Nc6 Bc4 Bc5 Nc3 d6 d3 Nf6 O-O Ng4 Bg...,"{'initial': 30, 'increment': 0, 'totalTime': 30}","{'eco': 'C50', 'name': 'Italian Game: Giuoco P...",,
9,PLdc3zax,True,standard,rapid,rapid,1772173023970,1772173321335,resign,lobby,"{'white': {'user': {'name': 'playforpleasure',...",black,d4 e6 g3 c5 dxc5 Bxc5 e3 Nf6 Bg2 O-O a3 d5 b4 ...,"{'initial': 480, 'increment': 0, 'totalTime': ...","{'eco': 'A40', 'name': 'Horwitz Defense', 'ply...",,


In [51]:
len(df)

17

### 2b) [1 Points]

#### Task

Write a function `get_players` that takes one game of the DataFrame in 2a) as argument (that is: it gets a row of the DataFrame) and returns a dictionary.
The dictionary contains 'white' and 'black' as keywords and the user_ids of the corresponding white/black players as values.

#### Solution START

In [52]:
def get_players(game_row):
    # The 'players' field uses dict with 'white' and 'black' keys
    players = game_row["players"]
    
    white_id = players["white"]["user"]["id"]
    black_id = players["black"]["user"]["id"]
    
    return {"white": white_id, "black": black_id}

#### Solution END

#### Validation
Please run the following code lines. Wrong results or errors in the following code may still get partial credits - as long as the following code is executed.

In [53]:
get_players(df.loc[0])

{'white': 'yuuki-asuna', 'black': 'cole_1palmer'}

### 2c) [1 Points]

#### Task

Write a function `get_winner` that takes one game of 2a) as argument (that is: it gets a row of the DataFrame) and returns the user_id of the player who won the game. For this task, you may use the function of 2b). If the game ended with a draw, the function should return `None`.

Afterwards, create a Pandas.Series or list `winners` that contains the winners of all games between 'byebob' and 'yuuki-asuna'.

#### Solution START

In [54]:
def get_winner(game_row):
    # The 'winner' field is 'white', 'black', or missing (draw)
    winner_color = game_row.get("winner", None)
    
    if winner_color is None:
        return None
    
    # Use get_players to map color -> user ID
    players = get_players(game_row)
    return players[winner_color]

# Load games for the validation example
df = get_all_games_vs("byebob")

# Series
winners = df.apply(get_winner, axis=1)

#### Solution END

#### Validation
Please run the following code lines. Wrong results or errors in the following code may still get partial credits - as long as the following code is executed.

In [55]:
get_winner(df.loc[0])

'yuuki-asuna'

In [56]:
winners

0     yuuki-asuna
1     yuuki-asuna
2     yuuki-asuna
3     yuuki-asuna
4     yuuki-asuna
5     yuuki-asuna
6     yuuki-asuna
7     yuuki-asuna
8     yuuki-asuna
9     yuuki-asuna
10    yuuki-asuna
11    yuuki-asuna
12    yuuki-asuna
13    yuuki-asuna
14    yuuki-asuna
15    yuuki-asuna
16    yuuki-asuna
dtype: object

### 2d) [2 Points]

#### Task

Define a function `get_opening` that takes one game of 2a) as argument (that is: it gets a row of the DataFrame) and returns a dictionary. This dictionary consists of only one entry: the user_id of the white player as key and the name of the opening as value.

Create a DataFrame of all games between 'yuuki-asuna' and 'byebob' that contains the name of the white player in the first column and the opening in the second column. Afterwards, create a new DataFrame consisting of all openings that 'yuuki-asuna' played as a white player and how often he played this opening. Find the two most common openings of all games 'yuuki-asuna' played as a white player against 'byebob' and display them (e.g. by using the `print` command).

#### Solution START

In [None]:
def get_opening(row):
    players = get_players(row)
    return {players["white"]: row["opening"]}


df_byebob = get_all_games_vs("byebob")

# Create DataFrame with white player and opening
white_openings = []

for _, row in df_byebob.iterrows():
    players = get_players(row)
    white_player = players["white"]
    opening_name = row["opening"]
    
    white_openings.append({
        "white": white_player,
        "opening": opening_name
    })

white_openings = pd.DataFrame(white_openings)

# Filter games where yuuki-asuna was white
yuuki_white = white_openings[
    white_openings["white"] == "yuuki-asuna"
]

# Count how often each opening was played
opening_counts = yuuki_white["opening"].value_counts(dropna=False)

# Convert to df
opening_counts = opening_counts.reset_index()
opening_counts.columns = ["opening", "count"]

# Print the two most common openings
print(opening_counts.head(2))

                                             opening  count
0                                                NaN      5
1  {'eco': 'C41', 'name': 'Philidor Defense', 'pl...      1


#### Solution END