# GameBus Health Behavior Mining - Data Extraction

This notebook demonstrates how to extract raw data from the GameBus platform using the project's extraction framework: the `GameBusClient` and the data collectors in `src.extraction`.

## Setup

First, let's set up our environment and import the necessary modules.

In [1]:
import sys
import os
import pandas as pd
import json
from pprint import pprint
import matplotlib.pyplot as plt
import seaborn as sns
import datetime

# Add the project root directory to the Python path
sys.path.append('..')

# Import project modules
from config.credentials import AUTHCODE
from config.paths import USERS_FILE_PATH, RAW_DATA_DIR
from src.extraction.gamebus_client import GameBusClient
from src.extraction.data_collectors import (
    LocationDataCollector, 
    MoodDataCollector,
    ActivityTypeDataCollector,
    HeartRateDataCollector,
    AccelerometerDataCollector,
    NotificationDataCollector
)
from src.utils.logging import setup_logging
from src.utils.file_handlers import load_json, save_json, load_csv, save_csv

## Initialize Logging

Set up logging for the notebook.

In [None]:
# Configure logging once, at INFO level, through the project helper
logger = setup_logging(log_level="INFO")
logger.info("Notebook initialized")

## Load GameBus Users

Load the GameBus users from the CSV file. If the file doesn't exist yet, we'll create it from the example file.

In [None]:
# Load the users
try:
    users_df = pd.read_csv(USERS_FILE_PATH, delimiter=';')
    display(users_df.head())
    print(f"Loaded {len(users_df)} users")
except FileNotFoundError as e:
    # The users file doesn't exist yet, so create it from the example file
    print(f"Users file not found: {e}")
    print("Loading example users from the examples directory...")
    example_users_path = os.path.join('..', 'examples', 'GB-users.csv')
    users_df = pd.read_csv(example_users_path, delimiter=';')
    # Save to the configured path (use the current directory if the path has no parent)
    os.makedirs(os.path.dirname(USERS_FILE_PATH) or '.', exist_ok=True)
    users_df.to_csv(USERS_FILE_PATH, sep=';', index=False)
    display(users_df.head())
    print(f"Loaded {len(users_df)} example users and saved to {USERS_FILE_PATH}")
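
Later cells read the `Username` and `Password` columns from `users_df`, so it can help to check that both are present right after loading. The following is a minimal sketch; the helper name `missing_columns` is hypothetical and not part of the project.

```python
def missing_columns(columns, required=("Username", "Password")):
    """Return the required column names that are absent from `columns`."""
    present = set(columns)
    return [name for name in required if name not in present]


# Example with plain lists; in the notebook, pass users_df.columns instead
print(missing_columns(["Username", "Password", "Email"]))  # -> []
print(missing_columns(["Username"]))                       # -> ['Password']
```

In the notebook this could be called as `missing_columns(users_df.columns)`, stopping early if the returned list is not empty.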

## Initialize GameBus Client

Create a GameBus client for API interactions.

In [None]:
# Initialize the GameBus client
client = GameBusClient(AUTHCODE)
print("GameBus client initialized")

## Define Date Range (Optional)

Optionally define a date range for data extraction. If no dates are specified, all available data will be extracted.

In [None]:
# Set to True to enable date filtering
use_date_filter = True

if use_date_filter:
    # Define start date (inclusive)
    start_date = datetime.datetime(2025, 5, 9)  # Format: YYYY, MM, DD
    
    # Define end date (inclusive)
    end_date = datetime.datetime(2025, 7, 1)    # Format: YYYY, MM, DD
    
    print(f"Date range: {start_date.strftime('%Y-%m-%d')} to {end_date.strftime('%Y-%m-%d')}")
else:
    start_date = None
    end_date = None
    print("No date filtering applied - extracting all available data")
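
Note that `datetime.datetime(2025, 7, 1)` represents midnight at the start of 1 July. If the collectors compare raw timestamps against `end_date` (an assumption; this notebook does not show how the bounds are applied), records from later that day would fall outside the range. A minimal sketch of one way to move the bound to the end of the day, using a hypothetical helper `end_of_day`:

```python
import datetime


def end_of_day(day):
    """Return the last representable instant (23:59:59.999999) of `day`."""
    return datetime.datetime.combine(day.date(), datetime.time.max)


end_date = datetime.datetime(2025, 7, 1)
print(end_of_day(end_date))  # 2025-07-01 23:59:59.999999
```

Whether this adjustment is needed depends on how the collectors interpret `end_date`, so it is worth checking their implementation first.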

## Select User

Select a user from the list to extract data for.

In [None]:
# Select a user
user_index = 0  # Change this to select a different user
selected_user = users_df.iloc[user_index]
username = selected_user['Username']
password = selected_user['Password']

print(f"Selected user: {username}")

## Authenticate User

Authenticate with the GameBus API and get the player token and ID.

In [None]:
# Get player token and ID
token = client.get_player_token(username, password)
# Only look up the player ID when a token was actually returned
player_id = client.get_player_id(token) if token else None

print(f"Player token: {token[:10]}..." if token else "Failed to get token")
print(f"Player ID: {player_id}" if player_id else "Failed to get player ID")
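
Every collector below needs both the token and the player ID, so failing fast here can be easier to debug than an error in a later cell. The following is a minimal sketch under the assumption that the client methods return an empty value (such as `None`) on failure; `authenticate` and `FakeClient` are hypothetical names used only for illustration.

```python
def authenticate(client, username, password):
    """Return (token, player_id), raising if either lookup comes back empty."""
    token = client.get_player_token(username, password)
    if not token:
        raise RuntimeError(f"Failed to get token for user {username}")
    player_id = client.get_player_id(token)
    if player_id is None:
        raise RuntimeError(f"Failed to get player ID for user {username}")
    return token, player_id


class FakeClient:
    """Stand-in for GameBusClient, used only to make this sketch runnable."""

    def __init__(self, token):
        self._token = token

    def get_player_token(self, username, password):
        return self._token

    def get_player_id(self, token):
        return 42


token, player_id = authenticate(FakeClient("abc123"), "user@example.com", "secret")
print(player_id)  # 42
```

In the notebook, the real `client` would be passed in place of `FakeClient`.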

## Extract GPS Location Data

Extract GPS location data from GameBus. This includes latitude, longitude, altitude, speed, error, timestamp, and arm.

In [None]:
# Create a location data collector
location_collector = LocationDataCollector(client, token, player_id)

# Collect location data
location_data, location_file = location_collector.collect(start_date=start_date, end_date=end_date)

print(f"Collected {len(location_data)} location data points")
print(f"Data saved to {location_file}")

# Display a sample of the data
location_df = pd.DataFrame(location_data)
if len(location_df) > 0:
    display(location_df.head())

## Extract Mood Data

Extract mood logging data from GameBus. This includes valence_state_value, arousal_state_value, stress_state_value, and event_timestamp.

In [None]:
# Create a mood data collector
mood_collector = MoodDataCollector(client, token, player_id)

# Collect mood data
mood_data, mood_file = mood_collector.collect(start_date=start_date, end_date=end_date)

print(f"Collected {len(mood_data)} mood data points")
print(f"Data saved to {mood_file}")

# Display a sample of the data
mood_df = pd.DataFrame(mood_data)
if len(mood_df) > 0:
    display(mood_df.head())

## Extract Activity Type Data

Extract activity type data from GameBus. This includes src, ts (timestamp), type, speed, steps, walks, runs, freq, distance, and cals.

In [None]:
# Create an activity type data collector
activity_collector = ActivityTypeDataCollector(client, token, player_id)

# Collect activity type data
activity_data, activity_file = activity_collector.collect(start_date=start_date, end_date=end_date)

print(f"Collected {len(activity_data)} activity data points")
print(f"Data saved to {activity_file}")

# Display a sample of the data
activity_df = pd.DataFrame(activity_data)
if len(activity_df) > 0:
    display(activity_df.head())

## Extract Heart Rate Data

Extract heart rate monitoring data from GameBus. This includes ts (timestamp), hr (heart rate), and pp.

In [None]:
# Create a heart rate data collector
heartrate_collector = HeartRateDataCollector(client, token, player_id)

# Collect heart rate data
heartrate_data, heartrate_file = heartrate_collector.collect(start_date=start_date, end_date=end_date)

print(f"Collected {len(heartrate_data)} heart rate data points")
print(f"Data saved to {heartrate_file}")

# Display a sample of the data
heartrate_df = pd.DataFrame(heartrate_data)
if len(heartrate_df) > 0:
    display(heartrate_df.head())

## Extract Accelerometer Data

Extract accelerometer data from GameBus. This includes ts (timestamp), x axis, y axis, and z axis.

In [None]:
# Create an accelerometer data collector
accelerometer_collector = AccelerometerDataCollector(client, token, player_id)

# Collect accelerometer data
accelerometer_data, accelerometer_file = accelerometer_collector.collect(start_date=start_date, end_date=end_date)

print(f"Collected {len(accelerometer_data)} accelerometer data points")
print(f"Data saved to {accelerometer_file}")

# Display a sample of the data
accelerometer_df = pd.DataFrame(accelerometer_data)
if len(accelerometer_df) > 0:
    display(accelerometer_df.head())

## Extract Notification Data

Extract notification data from GameBus. This includes action (e.g., received, read) and event_timestamp.

In [None]:
# Create a notification data collector
notification_collector = NotificationDataCollector(client, token, player_id)

# Collect notification data
notification_data, notification_file = notification_collector.collect(start_date=start_date, end_date=end_date)

print(f"Collected {len(notification_data)} notification data points")
print(f"Data saved to {notification_file}")

# Display a sample of the data
notification_df = pd.DataFrame(notification_data)
if len(notification_df) > 0:
    display(notification_df.head())

## Summary

Let's summarize all the data we've collected.

In [None]:
# Create a summary of collected data
summary = {
    "Location": len(location_data),
    "Mood": len(mood_data),
    "Activity": len(activity_data),
    "Heart Rate": len(heartrate_data),
    "Accelerometer": len(accelerometer_data),
    "Notification": len(notification_data)
}

summary_df = pd.DataFrame({
    "Data Type": list(summary.keys()),
    "Count": list(summary.values())
})

# Display summary
display(summary_df)

print(f"\nTotal data points collected: {sum(summary.values())}")
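
The six extraction cells above follow the same pattern, so the summary counts could also be gathered in a single loop. This is a minimal sketch, assuming each collector exposes `collect(start_date=..., end_date=...)` returning a `(data, file_path)` pair as in the cells above; `run_collectors` and `FakeCollector` are hypothetical names.

```python
def run_collectors(collectors, start_date=None, end_date=None):
    """Run each collector and return {name: number of data points}.

    `collectors` maps a display name to an object exposing
    collect(start_date=..., end_date=...) -> (data, file_path).
    """
    counts = {}
    for name, collector in collectors.items():
        data, _file_path = collector.collect(start_date=start_date, end_date=end_date)
        counts[name] = len(data)
    return counts


class FakeCollector:
    """Stand-in collector, used only to make this sketch runnable."""

    def __init__(self, rows):
        self._rows = rows

    def collect(self, start_date=None, end_date=None):
        return self._rows, "fake.json"


counts = run_collectors({"Mood": FakeCollector([1, 2, 3]), "Location": FakeCollector([])})
print(counts)  # {'Mood': 3, 'Location': 0}
```

In the notebook, the dictionary would hold the real collectors, for example `{"Location": location_collector, "Mood": mood_collector}`.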

## Next Steps

Now that we have extracted the raw data from GameBus, the next steps are:

1. **Preprocessing**: Clean and normalize the data, handle missing values, synchronize timestamps, etc.
2. **Activity Recognition**: Recognize human activities from the sensor data using machine learning techniques.
3. **OCEL Generation**: Transform the preprocessed data and recognized activities into the Object-Centric Event Log (OCEL) format for process mining.

These steps will be covered in the subsequent notebooks.