## Requirements install

In [None]:
%pip install pandas
%pip install "betterproto[compiler]"
%pip install matplotlib
%pip install scipy
%pip install pyQt5
%pip install scikit-learn
%pip install numpy
%matplotlib qt      
%matplotlib

## Pyproto generation

In [None]:
from utils.auto_generate_proto import generate_proto_classes

generate_proto_classes()

## Base includes 

In [3]:
from data_select.data_filter_operator import *
import pandas as pd
import matplotlib.pyplot as plt
from utils.utils import OUTPUT_DIR, data_frame_to_csv, LogFields, gel_2d_length_in_column
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score, mean_absolute_error
import scipy.stats as stats
import numpy as np
import math

## Path to log file

In [4]:
PREFIX_PATH = '../logs/'
LOG_FILE = 'testKickDataSplit.log.gz'

## Load log data

In [5]:
selects = [LogFields.PROCESSED_FRAME, LogFields.RAW_FRAME, LogFields.TELEMETRY, LogFields.ROBOTS_COMMAND]

data_list = load_select_modules(PREFIX_PATH+LOG_FILE, selects)

# Data analysis

## Kick Data Split 

### 🧩 DataFrame descriptions

During the extraction of **kick events**, several *dataframes* are generated and filtered to the same time interval.  
Each one holds a different set of data (ball, robots, commands, and telemetry) obtained from a different part of the system.

---

#### ⚽ `ball`
Contains processed ball information over time, derived from the vision system.

| Column | Description |
|:--|:--|
| `timestamp` | Time at which the frame was published. |
| `position_x`, `position_y` | Ball position on the field, in millimeters. |
| `velocity_x`, `velocity_y` | Ball velocity components along the X and Y axes. |
| `acceleration_x`, `acceleration_y` | Ball acceleration components along the X and Y axes. |
| `velocity_norm` | Magnitude of the velocity. |
| `acceleration_norm` | Magnitude of the acceleration. |

These data are used to **detect and analyze kicks**, since a kick shows up as an abrupt change in acceleration and velocity.
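
As a minimal sketch of how the norm columns relate to the components, the example below assumes the norm is the Euclidean length of the X and Y components. The helper `add_norm_column` and the sample values are hypothetical and are not part of the project's `utils` module.

```python
import numpy as np
import pandas as pd


def add_norm_column(df: pd.DataFrame, prefix: str) -> pd.DataFrame:
    """Return a copy of df with a `<prefix>_norm` column (Euclidean length of x/y)."""
    out = df.copy()
    out[f"{prefix}_norm"] = np.hypot(out[f"{prefix}_x"], out[f"{prefix}_y"])
    return out


# Synthetic ball samples: at rest, then a sudden jump in speed (made-up values).
ball_demo = pd.DataFrame({
    "timestamp": [0.00, 0.01, 0.02, 0.03],
    "velocity_x": [0.0, 0.0, 3000.0, 2900.0],
    "velocity_y": [0.0, 0.0, 4000.0, 3900.0],
})
ball_demo = add_norm_column(ball_demo, "velocity")

# Frame-to-frame change in speed; the largest jump marks the candidate kick frame.
speed_jump = ball_demo["velocity_norm"].diff()
kick_frame = speed_jump.idxmax()
```

In this toy series the largest jump falls on the third row, where the speed goes from zero to its peak.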

---

#### 🤖 `processed_robots`
Contains the **estimated** position and velocity of every robot (allies and enemies), computed from the frames processed by the vision system.

| Column | Description |
|:--|:--|
| `timestamp` | Time at which the frame was published (`publish_timestamp`). |
| `team` | Whether the robot belongs to the allied team (`allies`) or the opposing team (`enemies`). |
| `robot_id` | Robot identifier. |
| `position_x`, `position_y`, `position_w` | Robot position and orientation on the field. |
| `velocity_x`, `velocity_y` | Linear velocity components. |
| `velocity_norm` | Magnitude of the velocity. |

---

#### 🧭 `commands`
Contains the movement and actuation commands sent to the allied robots.

| Column | Description |
|:--|:--|
| `timestamp` | Time at which the command was sent. |
| `robot_id` | Allied robot identifier. |
| `move_x`, `move_y`, `move_w` | Movement command components (desired velocities along X, Y, and angular). |
| `actuation_kick_strength` | Capacitor discharge time. |
| `actuation_chip` | Whether the kick was a *chip* kick. |
| `actuation_front` | Whether the kick was a *front* kick. |
| `actuation_charge` | Whether the capacitors should be charged. |
| `actuation_dribbler` | Dribbler state (on/off). |
| `actuation_dribbler_velocity` | Dribbler velocity (RPM). |

---

#### 🔧 `telemetry`
Contains the telemetry data sent by the allied robots.

| Column | Description |
|:--|:--|
| `timestamp` | Time at which the telemetry was received. |
| `robot_id` | Allied robot identifier. |
| `wheel1`, `wheel2`, `wheel3`, `wheel4` | Speed of each individual wheel. |
| `position_x`, `position_y`, `position_w` | Position and orientation reported by the robot. |
| `velocity_x`, `velocity_y`, `velocity_w` | Linear and angular velocities reported by the robot. |
| `dribbler_speed` | Dribbler speed. |
| `capacitor_charge` | Capacitor charge level. |
| `dribbler_ball_contact` | Whether the IR sensor is active. |
| `battery` | Battery level. |
| `count` | Packet counter (used to detect packet loss). |
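
The packet counter can be turned into a loss estimate by looking at gaps between consecutive values. The sketch below assumes `count` increases by exactly 1 per packet and never wraps around, which the log format may not guarantee. `count_lost_packets` is a hypothetical helper, and the column is called `robot_id` to match the output of the extraction functions.

```python
import pandas as pd


def count_lost_packets(telemetry: pd.DataFrame) -> pd.Series:
    """Estimate lost telemetry packets per robot from gaps in `count`.

    Assumes `count` increases by exactly 1 per packet and never wraps around.
    """
    ordered = telemetry.sort_values(["robot_id", "timestamp"])
    # A step of 1 means no loss; a step of k means k - 1 packets are missing.
    step = ordered.groupby("robot_id")["count"].diff()
    missing = (step - 1).clip(lower=0)
    return missing.groupby(ordered["robot_id"]).sum().astype(int)


# Synthetic telemetry: robot 0 skips counters 12 and 13, robot 1 loses nothing.
telemetry_demo = pd.DataFrame({
    "timestamp": [0.0, 0.1, 0.2, 0.0, 0.1, 0.2],
    "robot_id": [0, 0, 0, 1, 1, 1],
    "count": [10, 11, 14, 5, 6, 7],
})
lost = count_lost_packets(telemetry_demo)
```

If the real counter wraps (for example at 255), the negative step at the wrap would need separate handling.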

---

#### 🟦 `raw_robots`

This dataframe contains the **raw robot detection data** (allies and enemies alike), taken directly from the vision messages (*raw frames*) of the **SSL-Vision** system.

Each row represents the detection of **one robot at a given instant (`timestamp`)**.

| Column | Description |
|:--------|:-----------|
| `timestamp` | Exact time at which the frame was published by the vision system. |
| `team` | Name of the team the robot belongs to: either `"robots_blue"` (blue team) or `"robots_yellow"` (yellow team). |
| `robot_id` | Unique identifier of the robot within its team (usually 0 to 11). |
| `position_x`, `position_y`, `position_w` | Robot position and orientation on the field. |

---

#### 🟠 `raw_ball`

This dataframe contains the **raw ball detection data**, also taken from the **SSL-Vision** *raw frames*.

Each row represents the detected position of the **ball** at a given instant (`timestamp`).

| Column | Description |
|:--------|:-----------|
| `timestamp` | Time at which the ball was detected in the frame, in seconds. |
| `position_x`, `position_y` | Ball position on the field, in millimeters. |

### Defining important functions

In [125]:
def processed_frame_extract_all_robots_data(filtered_data: list) -> pd.DataFrame:
    robot_data_list = []

    for data in filtered_data:
        if 'processed_frame' not in data:
            continue

        process_frame = data['processed_frame']
        timestamp = process_frame.get('publish_timestamp', 0)

        if timestamp == 0:
            continue

        # Iterate over both allies and enemies, when present
        for team_name in ['allies', 'enemies']:
            if team_name not in process_frame:
                continue

            for robot in process_frame[team_name]:
                data_line = {
                    'timestamp': timestamp,
                    'team': team_name,
                    'robot_id': robot.get('id')
                }

                # Position
                if 'position' in robot:
                    data_line['position_x'] = robot['position'].get('x')
                    data_line['position_y'] = robot['position'].get('y')
                    data_line['position_w'] = robot['position'].get('omega')

                # Velocity
                if 'velocity' in robot:
                    data_line['velocity_x'] = robot['velocity'].get('x')
                    data_line['velocity_y'] = robot['velocity'].get('y')

                robot_data_list.append(data_line)

    return pd.DataFrame(robot_data_list)

In [126]:
def raw_frame_extract_all_robots_data(raw_data: list) -> pd.DataFrame:
    robot_data_list = []

    for data in raw_data:
        if 'raw_frame' not in data or not data['raw_frame']:
            continue

        for raw_frame in data['raw_frame']:
            timestamp = raw_frame.get('publish_timestamp', 0)
            if timestamp == 0:
                continue

            # Read the detection data, falling back to empty dicts when it is missing
            packet = raw_frame.get('packet', {})
            detection = packet.get('detection', {})

            # Iterate over both teams, when present
            for team_name in ['robots_blue', 'robots_yellow']:
                if team_name not in detection or not detection[team_name]:
                    continue

                for robot in detection[team_name]:
                    data_line = {
                        'timestamp': timestamp,
                        'team': team_name,
                        'robot_id': robot.get('robot_id')
                    }

                    # Position and orientation
                    data_line['position_x'] = robot.get('x')
                    data_line['position_y'] = robot.get('y')
                    data_line['position_w'] = robot.get('orientation')

                    robot_data_list.append(data_line)

    return pd.DataFrame(robot_data_list)

In [127]:
def robots_command_extract_all_robots_data(filtered_data: list) -> pd.DataFrame:
    robot_command_data_list = []

    for data in filtered_data:
        if 'robots_command' not in data:
            continue

        navigation = data['robots_command'].get('navigation', [])
        if not navigation:
            continue

        for robot in navigation:
            timestamp = robot.get('publish_timestamp', 0)
            if timestamp == 0:
                continue

            data_line = {'timestamp': timestamp, 'robot_id': robot.get('id')}

            # Movement
            move = robot.get('move', {})
            data_line['move_x'] = move.get('x')
            data_line['move_y'] = move.get('y')
            data_line['move_w'] = move.get('omega')

            # Actuation
            actuation = robot.get('actuation', {})
            data_line['actuation_kick_strength'] = actuation.get('kick_strength')
            data_line['actuation_front'] = actuation.get('front')
            data_line['actuation_chip'] = actuation.get('chip')
            data_line['actuation_charge'] = actuation.get('charge')
            data_line['actuation_dribbler'] = actuation.get('dribbler')
            data_line['actuation_dribbler_velocity'] = actuation.get('dribbler_velocity')

            robot_command_data_list.append(data_line)

    return pd.DataFrame(robot_command_data_list)

In [128]:
def telemetry_extract_all_robots_data(filtered_data: list) -> pd.DataFrame:
    telemetry_data_list = []

    for data in filtered_data:
        if 'telemetry' not in data:
            continue

        telemetries = data['telemetry'].get('telemetries', [])
        if not telemetries:
            continue

        for robot in telemetries:
            timestamp = robot.get('receive_timestamp', 0)
            if timestamp == 0:
                continue

            data_line = {'timestamp': timestamp, 'robot_id': robot.get('id')}

            # Wheels
            for i in range(1, 5):
                data_line[f'wheel{i}'] = robot.get(f'wheel{i}')

            # Position
            position = robot.get('position', {})
            data_line['position_x'] = position.get('x')
            data_line['position_y'] = position.get('y')
            data_line['position_w'] = position.get('omega')

            # Velocity
            velocity = robot.get('velocity', {})
            data_line['velocity_x'] = velocity.get('x')
            data_line['velocity_y'] = velocity.get('y')
            data_line['velocity_w'] = velocity.get('omega')

            # Other parameters
            data_line['dribbler_speed'] = robot.get('dribbler_speed')
            data_line['capacitor_charge'] = robot.get('capacitor_charge')
            data_line['dribbler_ball_contact'] = robot.get('dribbler_ball_contact')
            data_line['battery'] = robot.get('battery')
            data_line['count'] = robot.get('count')

            telemetry_data_list.append(data_line)

    return pd.DataFrame(telemetry_data_list)

In [129]:
def extract_kick_events(ball_df: pd.DataFrame, other_dfs: dict, buffer_size=20, pre_frames=100, post_frames=100):
    """
    Extracts kick events from the ball dataframe and slices corresponding rows
    from other dataframes based on matching timestamp ranges.

    Parameters:
        ball_df (pd.DataFrame): Ball dataframe with 'timestamp', 'acceleration_norm', 'velocity_norm'
        other_dfs (dict): Dictionary of other DataFrames, e.g. {'robots': df_robots, 'telemetry': df_telemetry}
        buffer_size (int): Number of samples to use for rolling analysis
        pre_frames (int): Frames to include before the kick start
        post_frames (int): Frames to include after the kick end

    Returns:
        dict: A dictionary with keys 'ball' and one key per other dataframe,
              each containing a list of DataFrame slices corresponding to kick events.
    """

    buffer = []            # sliding window of [timestamp, acceleration_norm, velocity_norm]
    sectorIdx = []         # [start_index, end_index] pairs, one per detected kick
    startKick = None
    HasKickEnable = False

    # --- Detect kicks in ball dataframe ---
    for index, row in ball_df.iterrows():
        buffer.append([row['timestamp'], row['acceleration_norm'], row['velocity_norm']])

        # Keep only the most recent `buffer_size` samples before analysing the window
        while len(buffer) > buffer_size:
            buffer.pop(0)

        if len(buffer) < buffer_size:
            continue

        mean_acc = np.average([b[1] for b in buffer])
        mean_vel = np.average([b[2] for b in buffer])

        # A kick starts when the window is both fast and strongly accelerating,
        # and ends once the ball has nearly stopped
        if mean_vel > 500 and mean_acc > 1000 and not HasKickEnable:
            HasKickEnable = True
            startKick = index
        elif mean_vel < 10 and HasKickEnable:
            HasKickEnable = False
            sectorIdx.append([startKick, index])

    # --- Collect dataframes for each kick ---
    kicks_data = {'ball': []}
    for name in other_dfs.keys():
        kicks_data[name] = []

    for s, e in sectorIdx:
        # Get timestamp bounds from ball dataframe
        s_idx = int(s - pre_frames)
        e_idx = int(e + post_frames)
        
        # Ball segment
        kicks_data['ball'].append(ball_df.loc[s_idx:e_idx])

        start_ts = kicks_data['ball'][-1].iloc[0]['timestamp']
        end_ts = kicks_data['ball'][-1].iloc[-1]['timestamp']

        # Match data from other dataframes by timestamp range
        for name, df in other_dfs.items():
            mask = (df['timestamp'] >= start_ts) & (df['timestamp'] <= end_ts)
            kicks_data[name].append(df[mask])

    return kicks_data
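
The kick-start test in `extract_kick_events` is a threshold on windowed averages. As an illustration only, the same condition can be expressed with pandas rolling means instead of a manual buffer. This is a swapped-in, vectorized sketch and not the function above. `rolling_kick_flags` is a hypothetical name, its defaults mirror the thresholds used in the loop, and the synthetic values are made up.

```python
import pandas as pd


def rolling_kick_flags(ball_df: pd.DataFrame, window: int = 20,
                       vel_start: float = 500.0,
                       acc_start: float = 1000.0) -> pd.Series:
    """Flag rows whose rolling-mean speed and acceleration both exceed the start thresholds."""
    mean_vel = ball_df["velocity_norm"].rolling(window).mean()
    mean_acc = ball_df["acceleration_norm"].rolling(window).mean()
    # Rows with an incomplete window yield NaN means, which compare as False.
    return (mean_vel > vel_start) & (mean_acc > acc_start)


# Synthetic ball: 30 samples at rest followed by 30 samples of fast motion.
ball_demo = pd.DataFrame({
    "timestamp": [i * 0.01 for i in range(60)],
    "velocity_norm": [0.0] * 30 + [2000.0] * 30,
    "acceleration_norm": [0.0] * 30 + [3000.0] * 30,
})
flags = rolling_kick_flags(ball_demo)
first_kick_row = flags.idxmax()  # index of the first True value

# Inclusive timestamp-range slice, the same kind of mask used for the other dataframes.
start_ts = ball_demo.loc[first_kick_row, "timestamp"]
segment = ball_demo[ball_demo["timestamp"].between(start_ts, start_ts + 0.05)]
```

Because the averages lag the raw signal, the first flagged row comes a few frames after the motion actually starts, which is one reason the original function pads each event with `pre_frames`.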

### Extract Dataframes

In [130]:
df_processed_robots = processed_frame_extract_all_robots_data(data_list)
df_raw_robots = raw_frame_extract_all_robots_data(data_list)
df_commands = robots_command_extract_all_robots_data(data_list)
df_telemetry = telemetry_extract_all_robots_data(data_list)
raw_ball = raw_frame_extract_ball_data_frame(data_list)
ball = processed_frame_extract_ball_data_frame(data_list)

### Process timestamps

In [131]:
# CONVERT TIMESTAMPS TO FLOAT
df_processed_robots['timestamp'] = df_processed_robots['timestamp'].astype(float)
df_raw_robots['timestamp'] = df_raw_robots['timestamp'].astype(float)
df_commands['timestamp'] = df_commands['timestamp'].astype(float)
df_telemetry['timestamp'] = df_telemetry['timestamp'].astype(float)
raw_ball['timestamp'] = raw_ball['timestamp'].astype(float)
ball['timestamp'] = ball['timestamp'].astype(float)

# DROP ROWS WITH INVALID (ZERO OR NEGATIVE) TIMESTAMPS
# .copy() keeps the later column assignments from triggering SettingWithCopyWarning
df_processed_robots = df_processed_robots[df_processed_robots['timestamp'] > 1].copy()
df_raw_robots = df_raw_robots[df_raw_robots['timestamp'] > 1].copy()
df_commands = df_commands[df_commands['timestamp'] > 1].copy()
df_telemetry = df_telemetry[df_telemetry['timestamp'] > 1].copy()
raw_ball = raw_ball[raw_ball['timestamp'] > 1].copy()
ball = ball[ball['timestamp'] > 1].copy()

# GET TIME REFERENCE
timeref = min(ball['timestamp'].values[0], raw_ball['timestamp'].values[0])

# SHIFT TO THE TIME REFERENCE AND CONVERT FROM NANOSECONDS TO SECONDS
df_processed_robots['timestamp'] = (df_processed_robots['timestamp'] - timeref).apply(lambda x: float(x)%1e13/1e9)
df_raw_robots['timestamp'] = (df_raw_robots['timestamp'] - timeref).apply(lambda x: float(x)%1e13/1e9)
df_commands['timestamp'] = (df_commands['timestamp'] - timeref).apply(lambda x: float(x)%1e13/1e9)
df_telemetry['timestamp'] = (df_telemetry['timestamp'] - timeref).apply(lambda x: float(x)%1e13/1e9)
raw_ball['timestamp'] = (raw_ball['timestamp'] - timeref).apply(lambda x: float(x)%1e13/1e9)
ball['timestamp'] = (ball['timestamp'] - timeref).apply(lambda x: float(x)%1e13/1e9)
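
The core of the conversion above is a shift to a common reference followed by a division by 1e9. A minimal vectorized sketch of that idea is shown below, assuming the raw timestamps are absolute nanoseconds. `to_relative_seconds` is a hypothetical helper and the sample timestamps are made up. The modulo step used in the cell above is deliberately left out.

```python
import pandas as pd


def to_relative_seconds(timestamps_ns: pd.Series, reference_ns: float) -> pd.Series:
    """Convert absolute nanosecond timestamps to seconds elapsed since reference_ns."""
    return (timestamps_ns.astype(float) - reference_ns) / 1e9


# Three synthetic timestamps: the reference, then 0.5 s and 2 s later.
stamps = pd.Series([
    1_700_000_000_000_000_000,
    1_700_000_000_500_000_000,
    1_700_000_002_000_000_000,
])
relative = to_relative_seconds(stamps, reference_ns=float(stamps.iloc[0]))
```

Note that a float64 near 1.7e18 has a resolution of about 256 ns, so converting to float before subtracting can round very fine timing differences.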

### Create columns for velocity and acceleration norms

In [132]:
# CREATE COLUMNS FOR VELOCITY AND ACCELERATION NORMS
df_processed_robots['velocity_norm'] = gel_2d_length_in_column(df_processed_robots, 'velocity')
ball['velocity_norm'] = gel_2d_length_in_column(ball, 'velocity')
ball['acceleration_norm'] = gel_2d_length_in_column(ball, 'acceleration')

### Printing head of Dataframes

In [None]:
df_processed_robots.head()

In [None]:
df_raw_robots.head()

In [None]:
df_commands.head()

In [None]:
df_telemetry.head()

In [None]:
ball.head()

In [None]:
raw_ball.head()

### Process Dataframe and save data

In [139]:
other_dfs = {
    'processed_robots': df_processed_robots,
    'raw_robots': df_raw_robots,
    'telemetry': df_telemetry,
    'commands': df_commands,
    'raw_ball': raw_ball
}

kicks_data = extract_kick_events(ball, other_dfs)

In [140]:
# --- Save plots and CSVs for each kick event ---

for kick_idx, kick_ball in enumerate(kicks_data['ball']):
    label_base = f"{OUTPUT_DIR}/kickEvent{kick_idx}"

    # --- Plot ball velocity for this kick ---
    plt.plot(kick_ball.index, kick_ball['velocity_norm'])
    plt.xlabel('frame index')
    plt.ylabel('velocity_norm')
    plt.savefig(label_base + '.png', dpi=300, bbox_inches='tight')
    plt.clf()

    # --- Save CSV for each dataframe (ball + others) ---
    for df_name, df_list in kicks_data.items():
        if len(df_list) <= kick_idx:
            continue  # skip if this df has fewer events

        df_event = df_list[kick_idx]
        if df_event.empty:
            continue  # skip empty dataframes

        df_event.to_csv(f"{label_base}_{df_name}.csv", index=False)
