# Wishes data visualisation

This notebook contains various graphs and data based on my personal gacha rolls. It is still WIP.

In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import math

In [None]:
df_perm = pd.read_csv("./datasets/wishes_permanent.csv")
df_lim = pd.read_csv("./datasets/wishes_character_event.csv")

In [None]:
def count_sequence_lengths(df, quality):
    """Return the length of every completed roll sequence for the given quality."""
    is_success = df["quality"] == quality
    hits = is_success.astype(int)

    # Label each roll with the number of successes strictly before it,
    # so a success is the last roll of its own sequence.
    sequence = (hits.cumsum() - hits).rename("sequence")
    lengths = df.groupby(sequence).size()

    # Rolls after the last success form an unfinished sequence; drop it.
    if len(df) > 0 and not is_success.iloc[-1]:
        lengths = lengths.iloc[:-1]

    return lengths

## Lengths of roll sequences

Genshin Impact has a so-called pity system: it guarantees a success roll within a fixed number of rolls, 10 for 4* items and 90 for 5* items. Getting a 3* item is considered a failure. A roll sequence here is the run of rolls that starts right after the previous success and ends with the next one.

It has already been shown that a 5* roll does not reset the 4* sequence, so the two are analysed separately here.
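
To make the definition concrete, here is a minimal sketch of the counting rule on a plain list of roll qualities. The name `sequence_lengths` and the sample rolls are made up for illustration; the notebook's own `count_sequence_lengths` applies the same rule to a DataFrame.

```python
def sequence_lengths(qualities, target):
    """Lengths of completed roll sequences that end with a `target`-quality roll."""
    lengths = []
    current = 0
    for quality in qualities:
        current += 1
        if quality == target:
            # A success closes the sequence and is counted as part of it.
            lengths.append(current)
            current = 0
    # Rolls after the last success belong to an unfinished sequence and are ignored.
    return lengths


# Hypothetical rolls: the 5* in the middle does not reset the 4* counter.
rolls = [3, 3, 4, 3, 5, 3, 4, 3, 3]

four_star = sequence_lengths(rolls, 4)
five_star = sequence_lengths(rolls, 5)
```

Here the 4* sequences have lengths 3 and 4, because the 5* roll counts as an ordinary non-4* roll, and the single 5* sequence has length 5.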

In [None]:
def compare_seqs(seq_1, seq_2, pity):
    x_axis = np.arange(1, pity + 1)

    # Share of sequences for every possible length; lengths never observed get 0.
    y_axis_1 = seq_1.value_counts(normalize=True).reindex(x_axis, fill_value=0)
    y_axis_2 = seq_2.value_counts(normalize=True).reindex(x_axis, fill_value=0)

    width = 0.35

    fig, ax = plt.subplots()
    rects1 = ax.bar(x_axis - width / 2, y_axis_1, width, label="Permanent banner")
    rects2 = ax.bar(x_axis + width / 2, y_axis_2, width, label="Limited event banners")

    # The values are fractions (normalize=True), not percentages.
    ax.set_ylabel("Share of sequences")
    ax.set_title("Success rolls distribution")
    ax.legend()

    fig.tight_layout()
    plt.show()

The in-game description of the wishes system is written in a way that makes you believe the chance of success is the same on every roll. However, the graphs below show that it is not.
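
For comparison, this is roughly what the distribution would look like if the chance really were constant. With a fixed per-roll chance `p` and a guaranteed success on roll `pity`, sequence lengths follow a geometric distribution truncated at the guarantee. The 5.1% rate below is an assumption, taken to be the advertised 4* base rate and not derived from this data, and `constant_rate_distribution` is a name made up for this sketch.

```python
def constant_rate_distribution(p, pity):
    """P(sequence length == k) for k = 1..pity with a fixed chance p per roll.

    The last roll is a guaranteed success, so it absorbs all remaining probability.
    """
    probs = [(1 - p) ** (k - 1) * p for k in range(1, pity)]
    probs.append((1 - p) ** (pity - 1))
    return probs


# Assumed base rate for 4* items; swap in whatever rate you want to test against.
expected_4 = constant_rate_distribution(0.051, 10)

for length, prob in enumerate(expected_4, start=1):
    print(f"{length:2d} rolls: {prob:.3f}")
```

Under this model the probabilities fall slowly from roll 1 to roll 9 and everything left over lands on the guaranteed roll, which gives a baseline shape to compare the bars against.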

### Four-star item sequences

Around 70% of 4* items take either 9 or 10 rolls to get.
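
A figure like this can be read straight off the sequence lengths. The sketch below uses made-up lengths rather than the real dataset, and `share_at_least` is a hypothetical helper, not part of the notebook.

```python
import pandas as pd


def share_at_least(lengths, threshold):
    """Fraction of sequences whose length is `threshold` or more."""
    return (lengths >= threshold).mean()


# Made-up sequence lengths, only to show the calculation.
sample_lengths = pd.Series([10, 9, 3, 10, 7, 10, 1, 9])

late_share = share_at_least(sample_lengths, 9)
print(f"{late_share:.1%} of the sample sequences took 9 or more rolls")
```

On the real data the same call would be `share_at_least(perm_4_seq, 9)`, once `perm_4_seq` from the cell below exists.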

In [None]:
perm_4_seq = count_sequence_lengths(df_perm, 4)
lim_4_seq = count_sequence_lengths(df_lim, 4)
compare_seqs(perm_4_seq, lim_4_seq, 10)

### Five-star item sequences

Since 5* rolls are pretty rare and I only analyse my own data, the graph for them looks less interesting. For that reason, and to make the bars wider than a few pixels, sequence lengths are grouped into buckets of 10 rolls:

| Rolls | Bucket   |
| ----- | -------- |
| 1-10  | 1        |
| 11-20 | 2        |
| 21-30 | 3        |
| 31-40 | 4        |
| 41-50 | 5        |
| 51-60 | 6        |
| 61-70 | 7        |
| 71-80 | 8        |
| 81-90 | 9        |

As with 4* items, successes cluster near the end of the sequence: most of the 5* items take 71-80 rolls.

In [None]:
def condense(length):
    # Map a sequence length to its bucket of 10 rolls (1-10 -> 1, ..., 81-90 -> 9).
    return math.ceil(length / 10)

perm_5_seq = count_sequence_lengths(df_perm, 5).apply(condense)
lim_5_seq = count_sequence_lengths(df_lim, 5).apply(condense)
compare_seqs(perm_5_seq, lim_5_seq, 9)

## Most common items

Here are the most common items dropping from gacha on each banner. Many of the rolls are listed as "unknown" because I didn't bother writing down trash rolls at first.
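
If the "unknown" entries get in the way, they can be filtered out before counting. This is a sketch on made-up rows; it assumes the placeholder is stored literally as `"unknown"` in the `item` column, which may not match the real CSV files.

```python
import pandas as pd

# Made-up rolls; the real data comes from the CSV files loaded above.
sample = pd.DataFrame({
    "item": ["unknown", "Slingshot", "unknown", "Cool Steel", "Slingshot", "unknown"],
})

# Drop the placeholder rows, then count the remaining items.
known = sample[sample["item"] != "unknown"]
top_known = known["item"].value_counts().head(10)
```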

### Permanent banner

In [None]:
df_perm["item"].value_counts().head(11)

### Limited banners

In [None]:
df_lim["item"].value_counts().head(11)