In [1]:
import ibis
from ibis import _

ibis.options.interactive = True

# Create game-level features

In [2]:
game_level_features = []

In [None]:
games = ibis.read_parquet("data/games.parquet")
games

## `event`-based features

The `event` field includes interesting information, such as whether the game was rated or part of a tournament.

In the preview above, every visible `event` value starts with `"Rated "`. Is this really the case for the whole table?

In [None]:
games.event[: len("Rated ")].value_counts()

It looks like unrated games simply exclude the prefix. Let's create our first feature, `is_rated`, given this information.

In [None]:
is_rated = games.event.startswith("Rated ")
is_rated.value_counts()
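To make the two checks above concrete, here is a plain-Python sketch that mirrors them with `collections.Counter`. The event strings are made up for illustration and are not rows from the dataset.

```python
from collections import Counter

PREFIX = "Rated "

# Hypothetical event strings, for illustration only.
sample_events = [
    "Rated Blitz game",
    "Rated Bullet game",
    "Blitz game",
    "Rated Classical game",
    "Rapid game",
]

# Analogue of games.event[: len("Rated ")].value_counts():
# count the distinct six-character prefixes.
prefix_counts = Counter(event[: len(PREFIX)] for event in sample_events)

# Analogue of games.event.startswith("Rated ").value_counts():
# count how many events do and do not carry the prefix.
is_rated_counts = Counter(event.startswith(PREFIX) for event in sample_events)

print(prefix_counts)
print(is_rated_counts)
```

In this toy sample, the unrated events show up under prefixes such as `"Blitz "` and `"Rapid "`, which is the same pattern the notebook output points to.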

We'll add each feature we define in this section to our list of game-level features. Spoiler alert: when we combine our features later, we'll see an interesting property of working with Ibis this way.

Don't forget to give each feature a meaningful name!

In [None]:
game_level_features.append(is_rated.name("is_rated"))
game_level_features

What else can we extract from the `event` field? For starters, let's examine the most popular `event` values.

In [None]:
games.event.value_counts().order_by(ibis.desc("event_count"))

Lichess categorizes games according to their "time control". If you're not familiar with chess, Classical games are the slowest, followed by Rapid, then Blitz. Bullet games are very fast, and UltraBullet games are, well, ultra-fast.

Correspondence games are essentially untimed. We'll exclude these games later, because we want to see how time modulates win likelihood.

Notice that we reuse the `is_rated` logic below when creating the time control feature.

In [None]:
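# Drop the "Rated " prefix where present, so every value starts with the time control.
event_with_rated_prefix_stripped = is_rated.ifelse(
    games.event[len("Rated ") :], games.event
)
# Keep everything before the first space, i.e. the time control name.
lichess_time_control_type = event_with_rated_prefix_stripped.substr(
    0, event_with_rated_prefix_stripped.find(" ")
)
lichess_time_control_type.value_counts()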
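The nested expression above may be easier to follow next to a plain-Python equivalent. This sketch applies the same two steps (strip the optional prefix, then keep the text before the first space) to a few hypothetical event strings. The helper name `time_control_type` is made up for illustration and is not part of the notebook.

```python
PREFIX = "Rated "


def time_control_type(event: str) -> str:
    """Return the first word of `event` after dropping an optional "Rated " prefix."""
    # Step 1: mirrors is_rated.ifelse(games.event[len("Rated ") :], games.event).
    stripped = event[len(PREFIX):] if event.startswith(PREFIX) else event
    # Step 2: same idea as substr(0, find(" ")): keep the text before the first space.
    # str.partition returns the whole string when there is no space at all.
    head, _, _ = stripped.partition(" ")
    return head


# Hypothetical event strings, for illustration only.
for event in ["Rated Blitz game", "Bullet game", "Rated Rapid tournament"]:
    print(event, "->", time_control_type(event))
```

Because the second step operates on the already-stripped value, rated and unrated games of the same kind end up in the same category.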

In [9]:
game_level_features.append(lichess_time_control_type.name("lichess_time_control_type"))

### Exercise 1

The last `event`-based feature we want for now is whether the game was a tournament game. No need to overcomplicate things; just check whether the `event` field [contains](https://ibis-project.org/reference/expression-strings#ibis.expr.types.strings.StringValue.contains) the relevant text.

In [10]:
is_tournament = games

#### Solution

In [12]:
%load solutions/nb02_ex01.py

As usual, don't forget to add the feature you created to the list!

In [13]:
game_level_features.append(is_tournament.name("is_tournament"))