# TLoL-LLM - Initial LLM Experiments

## Overview

This notebook contains the initial experiments for generating text descriptions of observations / scenes, embedding them with `text-embedding-ada-002` (an OpenAI embedding model), and then seeing how well we can query these embeddings. OpenAI's models are used for the embeddings for now because they most likely have the best zero-shot performance. This may change to a fine-tuned model in the future, but for the experimental stage it is easier to use a model with very good out-of-the-box performance. The approach has two uses:

1. For analysis. Similar situations can be compared or queried later, which lets us build a large database of League of Legends situations. This can be used to coach new players on what high-elo or pro players would have done in a similar situation, and by coaches to analyse similar situations and get a more nuanced summary or description of the events that occurred.
2. For creating a game-playing bot. If we cover a large enough number of situations, the bot can either copy what was done before or attempt to generalise from examples of similar situations.
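
The querying step described in the overview can be sketched as a nearest-neighbour search over embedding vectors. The snippet below is a minimal illustration, not the notebook's actual pipeline: the three-dimensional vectors, the scene descriptions, and the helper names `cosine_similarity` and `top_k` are all hypothetical stand-ins, and real vectors would come from the embedding model.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query_vec, index, k=2):
    """Return the k (description, score) pairs most similar to query_vec."""
    scored = [(desc, cosine_similarity(query_vec, vec)) for desc, vec in index]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]

# Hypothetical toy index: real embeddings would come from the embedding model.
scene_index = [
    ("aatrox fights varus in top lane", [0.9, 0.1, 0.0]),
    ("bard roams to mid lane",          [0.1, 0.9, 0.1]),
    ("team fight at baron pit",         [0.0, 0.2, 0.9]),
]

query = [0.8, 0.2, 0.1]  # stand-in for an embedded query description
results = top_k(query, scene_index, k=2)
for desc, score in results:
    print(f"{score:.3f}  {desc}")
```

Cosine similarity is used here because it ignores vector length and compares direction only. Other distance measures would work with the same structure.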

## Dataset

The dataset for this notebook is Game 5 of the League of Legends Worlds 2022 Finals. The processed version of the `.rofl` replay file is available on [Google Drive](https://drive.google.com/file/d/1kZchHUksTCOvpN_hJZ5iVvESF6Be5FPt/view?usp=sharing).

### Dataset Reliability

Some of the fields may be inaccurate, so it is a good idea to check the data by eye before relying on it.
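
A manual check can be backed up with a few automated ones. The sketch below counts rows that break some basic consistency rules. The rules themselves are assumptions (for example, that `team` only takes the values 100 and 200, as in the rows shown in this notebook), `sanity_report` is a hypothetical helper, and the toy rows are made up for illustration.

```python
import pandas as pd

def sanity_report(df):
    """Count rows that violate a few assumed consistency rules (not exhaustive)."""
    bad_health = (df["health"] < 0) | (df["health"] > df["max_health"])
    return {
        "health_out_of_range": int(bad_health.sum()),
        "gold_exceeds_total": int((df["current_gold"] > df["total_gold"]).sum()),
        "unexpected_team": int((~df["team"].isin([100, 200])).sum()),
        "missing_values": int(df.isna().any(axis=1).sum()),
    }

# Toy rows for illustration only; the last row is deliberately broken.
sample = pd.DataFrame({
    "name":         ["varus", "azir", "bard"],
    "health":       [600.0, 622.0, 700.0],
    "max_health":   [600.0, 622.0, 630.0],
    "team":         [100, 200, 300],
    "current_gold": [500.0, 500.0, 900.0],
    "total_gold":   [500.0, 500.0, 500.0],
})

report = sanity_report(sample)
print(report)
```

A non-zero count does not prove a field is wrong, but it points to the rows worth inspecting first.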

## Load the Replay File

In [19]:
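import pandas as pd
from pathlib import Path
from sqlite3 import connect

# The replay database is expected in the current working directory.
HOME   = Path.cwd()
REPLAY = "ESPORTSTMNT02-3080905(old).db"

# Read the full `champs` table, closing the connection even if the query fails.
conn = connect(HOME / REPLAY)
try:
    champs_df = pd.read_sql('SELECT * FROM champs;', conn)
finally:
    conn.close()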

### Preprocess Replay File

#### Remove Duplicate Frames

In [20]:
champs_df

Unnamed: 0,game_id,time,obj_type,net_id,obj_id,name,health,max_health,team,armour,...,max_mana,ability_haste,ap,lethality,experience,mana_regen,health_regen,attack_range,current_gold,total_gold
0,3080905,2.006888,champs,1073741856,2589,varus,600.000000,600.000000,100,0.0,...,360.000000,0.000000,0.0,0.0,1.000000,1.600000,0.700000,575.0,500.000000,500.000000
1,3080905,2.006888,champs,1073741860,2608,azir,622.000000,622.000000,200,0.0,...,480.000000,0.000000,9.0,0.0,1.000000,1.600000,1.400000,525.0,500.000000,500.000000
2,3080905,2.006888,champs,1073741858,2594,aatrox,730.000000,730.000000,200,0.0,...,0.000000,0.000000,0.0,0.0,1.000000,0.000000,1.800000,225.0,50.000000,500.000000
3,3080905,2.006888,champs,1073741862,2615,bard,630.000000,630.000000,200,0.0,...,350.000000,0.000000,0.0,0.0,1.000000,1.200000,1.100000,500.0,500.000000,500.000000
4,3080905,2.006888,champs,1073741854,2583,viego,630.000000,630.000000,100,0.0,...,10000.000000,0.000000,0.0,0.0,1.000000,0.000000,1.400000,200.0,150.000000,500.000000
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
170583,3080905,2528.976807,champs,1073741858,2804,aatrox,2366.304932,2843.145264,200,0.0,...,0.000000,35.000000,0.0,18.0,1.000000,0.000000,6.219801,225.0,2286.463379,15345.379883
170584,3080905,2528.976807,champs,1073741862,2776,bard,2471.295166,2471.295166,200,0.0,...,1013.250061,65.000000,110.0,0.0,1.000000,9.934479,2.559150,500.0,1033.226807,9447.924805
170585,3080905,2528.976807,champs,1073741857,2844,karma,0.000000,2273.550049,100,0.0,...,1041.500000,100.800003,208.0,0.0,1.000000,18.334251,2.304500,525.0,108.697708,10123.342773
170586,3080905,2528.976807,champs,1073741859,2803,hecarim,2944.588379,3258.025146,200,0.0,...,1145.500000,50.000000,0.0,0.0,1.000000,3.037000,3.571250,175.0,1475.142334,13764.005859


In [21]:
champs_df = champs_df.drop_duplicates(subset=['time', 'name'], keep='first')
champs_df

Unnamed: 0,game_id,time,obj_type,net_id,obj_id,name,health,max_health,team,armour,...,max_mana,ability_haste,ap,lethality,experience,mana_regen,health_regen,attack_range,current_gold,total_gold
0,3080905,2.006888,champs,1073741856,2589,varus,600.000000,600.000000,100,0.0,...,360.000000,0.000000,0.0,0.0,1.000000,1.600000,0.700000,575.0,500.000000,500.000000
1,3080905,2.006888,champs,1073741860,2608,azir,622.000000,622.000000,200,0.0,...,480.000000,0.000000,9.0,0.0,1.000000,1.600000,1.400000,525.0,500.000000,500.000000
2,3080905,2.006888,champs,1073741858,2594,aatrox,730.000000,730.000000,200,0.0,...,0.000000,0.000000,0.0,0.0,1.000000,0.000000,1.800000,225.0,50.000000,500.000000
3,3080905,2.006888,champs,1073741862,2615,bard,630.000000,630.000000,200,0.0,...,350.000000,0.000000,0.0,0.0,1.000000,1.200000,1.100000,500.0,500.000000,500.000000
4,3080905,2.006888,champs,1073741854,2583,viego,630.000000,630.000000,100,0.0,...,10000.000000,0.000000,0.0,0.0,1.000000,0.000000,1.400000,200.0,150.000000,500.000000
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
170533,3080905,2528.976807,champs,1073741858,2804,aatrox,2366.304932,2843.145264,200,0.0,...,0.000000,35.000000,0.0,18.0,1.000000,0.000000,6.219801,225.0,2286.463379,15345.379883
170534,3080905,2528.976807,champs,1073741862,2776,bard,2471.295166,2471.295166,200,0.0,...,1013.250061,65.000000,110.0,0.0,1.000000,9.934479,2.559150,500.0,1033.226807,9447.924805
170535,3080905,2528.976807,champs,1073741857,2844,karma,0.000000,2273.550049,100,0.0,...,1041.500000,100.800003,208.0,0.0,1.000000,18.334251,2.304500,525.0,108.697708,10123.342773
170536,3080905,2528.976807,champs,1073741859,2803,hecarim,2944.588379,3258.025146,200,0.0,...,1145.500000,50.000000,0.0,0.0,1.000000,3.037000,3.571250,175.0,1475.142334,13764.005859
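
To make the effect of `drop_duplicates(subset=['time', 'name'], keep='first')` concrete, here is a small sketch on made-up rows. When several rows share the same `time` and `name`, only the first is kept, even if other columns differ. The toy values are illustrative and not taken from the replay.

```python
import pandas as pd

# Two rows share (time=2.0, name="varus"); only the first should survive.
frames = pd.DataFrame({
    "time":   [2.0, 2.0, 2.0, 3.0],
    "name":   ["varus", "varus", "azir", "varus"],
    "health": [600.0, 590.0, 622.0, 580.0],
})

deduped = frames.drop_duplicates(subset=["time", "name"], keep="first")
n_removed = len(frames) - len(deduped)

print(deduped)
print(f"removed {n_removed} duplicate row(s)")
```

Note that `drop_duplicates` keeps the original index labels of the surviving rows. If a contiguous index is wanted afterwards, `reset_index(drop=True)` is one option.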


#### Champ Count

In [22]:
unique_champs = champs_df['name'].nunique()
unique_champs

10
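
The total of 10 can be broken down further by team. A standard game fields five champions per side, so on the real data one would expect five unique names for each `team` value. The sketch below shows the grouping on a made-up frame with fewer champions. The toy rows are illustrative only.

```python
import pandas as pd

# Toy frames: one champion on team 100, two on team 200, seen at two timestamps.
toy = pd.DataFrame({
    "time": [2.0, 2.0, 2.0, 3.0, 3.0, 3.0],
    "name": ["varus", "azir", "bard", "varus", "azir", "bard"],
    "team": [100, 200, 200, 100, 200, 200],
})

# Count distinct champion names within each team.
champs_per_team = toy.groupby("team")["name"].nunique()
print(champs_per_team.to_dict())
```

Applied to `champs_df`, an uneven split would suggest a mislabelled `team` or `name` field.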