Here are the two detailed "how-to" guides you requested, outlining the setup for distributed chess engine computing and the creation of new machine learning-based AI engines. These guides consolidate information from our previous discussions and your provided project files, aiming to be actionable for implementation with the help of Gemini AI in VS Code Copilot.

-----

# 1\. Distributed Chess Engine Computing How-To Guide

This guide details how to set up a distributed computing network for your Viper Chess Engine, leveraging your existing Windows PCs and extending to a low-cost cloud environment like AWS. The goal is to efficiently run thousands or millions of game simulations and gather data for AI experimentation.

## Core Principles for Distributed Chess Engine Testing

Before diving into implementation, understanding these principles is key:

  * **Decoupling**: Separate your application into distinct, independent services:
      * **Game Runner/Worker**: An instance of your chess engine that plays games and generates data. This should be a lightweight, single-purpose unit.
      * **Metrics Storage**: A centralized database to store all game results, move-by-move data, and configuration details. This is the single source of truth for your analytics.
      * **Orchestration/Job Queue**: A system to manage and distribute game-playing tasks to available workers.
      * **Dashboard**: A visualization layer to monitor progress and analyze collected data.
  * **Stateless Workers**: Each game runner/worker should not retain state between game runs. It receives a task, executes it, reports results, and then is ready for a new, independent task. This simplifies scaling and fault tolerance.
  * **Centralized Data**: All output data (PGNs, detailed move metrics, game results, configurations) must be sent to a single, shared database. This is crucial for aggregated analysis and A/B testing.
  * **Scalability**: The architecture should allow you to easily add more workers (local PCs or cloud instances) to increase computational throughput without significant re-configuration.
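
As a rough illustration of how these principles fit together, the sketch below models a job queue feeding stateless workers that all report to one shared results store. It uses only the standard library. `GameTask`, `play_game`, `run_batch`, and the in-memory `results` list are hypothetical stand-ins for the real engine and the central database, not part of the Viper code base.

```python
import queue
import threading
from dataclasses import dataclass


@dataclass(frozen=True)
class GameTask:
    """Everything a worker needs for one game; nothing is remembered between tasks."""
    game_id: str
    white_ai: str
    black_ai: str


def play_game(task: GameTask) -> dict:
    """Hypothetical stand-in for the engine: returns one result record."""
    return {"game_id": task.game_id, "white": task.white_ai,
            "black": task.black_ai, "winner": "1/2-1/2"}


def worker(tasks: queue.Queue, results: list, lock: threading.Lock) -> None:
    """Pull tasks until the queue is empty, reporting each result centrally."""
    while True:
        try:
            task = tasks.get_nowait()
        except queue.Empty:
            return
        record = play_game(task)
        with lock:  # stands in for a write to the central database
            results.append(record)
        tasks.task_done()


def run_batch(n_games: int, n_workers: int) -> list:
    """Queue n_games tasks and process them with n_workers parallel workers."""
    tasks: queue.Queue = queue.Queue()
    for i in range(n_games):
        tasks.put(GameTask(f"game-{i}", "deepsearch", "stockfish"))
    results: list = []
    lock = threading.Lock()
    threads = [threading.Thread(target=worker, args=(tasks, results, lock))
               for _ in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Because a worker holds no state of its own, adding throughput only means starting more workers against the same queue and the same results store.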

## Step 1: Containerization with Docker

Docker allows you to package your application and its dependencies into a standardized unit (a container), ensuring it runs consistently across different environments (your various Windows PCs, Linux micro-PCs, or cloud servers).

### 1.1 Install Docker

  * **For Windows 10/11 Desktops/Laptops**: Install [Docker Desktop for Windows](https://docs.docker.com/desktop/install/windows-install/). Ensure WSL 2 (Windows Subsystem for Linux 2) is enabled as Docker Desktop uses it for its backend.
  * **For Linux (Orange Pi, Raspberry Pi)**: Install [Docker Engine](https://docs.docker.com/engine/install/) directly.

### 1.2 Create a `Dockerfile`

In the root of your project, create a file named `Dockerfile`:

```dockerfile
# Dockerfile
# Use a slim Python base image for smaller container size
# (the Debian "buster" variant is end-of-life, so the plain slim tag is used)
FROM python:3.10-slim

# Set the working directory in the container
WORKDIR /app

# Copy the requirements.txt file and install dependencies
# This step is cached if requirements.txt doesn't change, speeding up builds
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy the entire project directory into the container
COPY . .

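# Set environment variables if needed (e.g., for database connection, Stockfish path)
# Stockfish path will be relative to /app/engine_utilities/external_engines/stockfish/
# These are defaults only: override DB_HOST, DB_PORT, DB_USER, DB_PASSWORD, DB_NAME at run time
# (docker run -e, Kubernetes env, ECS task definition). The password below is a placeholder;
# a real password set here would be baked into the image layers.
ENV VIPER_DB_HOST="localhost" \
    VIPER_DB_PORT="5432" \
    VIPER_DB_USER="chessuser" \
    VIPER_DB_PASSWORD="securepassword" \
    VIPER_DB_NAME="chess_metrics_db" \
    STOCKFISH_EXEC_PATH="/app/engine_utilities/external_engines/stockfish/stockfish-windows-x86-64-avx2.exe"
# NOTE: this is a Linux image, so the Windows .exe above cannot run inside it.
# Point STOCKFISH_EXEC_PATH at a Linux Stockfish build placed in the same directory.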

# Expose any ports if your application has a web interface (like the dashboard)
# EXPOSE 8050 # For the Dash dashboard

# Command to run your chess_game.py when the container starts
# The arguments passed will be available in sys.argv in chess_game.py
CMD ["python", "chess_game.py"]
```

### 1.3 Create `requirements.txt`

Ensure your `requirements.txt` includes all Python packages used:

```
pygame
python-chess
numpy
PyYAML
pandas
psutil
dash
plotly
# sqlite3 is part of Python's standard library, so it is not listed here (pip cannot install it)
# For PostgreSQL:
psycopg2-binary
SQLAlchemy
```

### 1.4 Build the Docker Image

Navigate to your project root in the terminal and run:

```bash
docker build -t viper-chess-engine .
```

This will create a Docker image named `viper-chess-engine` which bundles your code and dependencies.

## Step 2: Centralized Database

SQLite is file-based and unsuitable for multiple concurrent writes from distributed workers, especially over a network. You need a robust, centralized database. PostgreSQL is an excellent open-source choice.

### 2.1 Choose a Database Solution

  * **Local Network**: Deploy **PostgreSQL** on one of your more powerful Windows desktops (e.g., the i9-11900k machine). You can run it directly or within Docker.
      * **Direct Installation (Windows)**: Download and install PostgreSQL from [postgresql.org](https://www.postgresql.org/download/windows/).
      * **Dockerized PostgreSQL**:
        ```bash
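        docker run --name some-postgres -e POSTGRES_USER=chessuser -e POSTGRES_PASSWORD=securepassword -e POSTGRES_DB=chess_metrics_db -p 5432:5432 -d postgres
        ```
        This runs PostgreSQL in a container on the machine where you execute the command, creating the `chessuser` role and the `chess_metrics_db` database used throughout this guide. Other machines can connect to `[IP_OF_POSTGRES_HOST]:5432`.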
  * **Cloud (AWS)**: Use **Amazon RDS for PostgreSQL**. This is a managed service, meaning AWS handles setup, backups, patching, and scaling. Free-tier coverage is limited, so expect charges if an instance runs continuously, but it can be cost-effective for burst testing when you stop or delete the instance between runs.
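
Before pointing workers at the database, it can help to confirm that the PostgreSQL port is reachable from each machine. The following is a minimal standard-library sketch; `can_reach` is a hypothetical helper name, and it only checks that a TCP connection can be opened, not that the credentials are valid.

```python
import socket


def can_reach(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable host names
        return False
```

For example, `can_reach("192.168.1.100", 5432)` run on a worker PC should return `True` once the database host accepts LAN connections (substitute your own host IP).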

### 2.2 Update `MetricsStore` for PostgreSQL

Modify `metrics_store.py` to connect to PostgreSQL (or another SQL database) using SQLAlchemy.

In [None]:
# metrics_store.py (excerpt - modifications to existing file)
import os
import sqlite3 # Keep for potential local dev fallback or initial setup if needed
import json
import re
import time
import pandas as pd
import yaml
import glob
import threading
import chess.pgn
import io
import random
from datetime import datetime
from typing import Optional # Needed for the Optional[...] type hints below

# --- New Imports for SQLAlchemy and PostgreSQL ---
from sqlalchemy import create_engine, text, inspect
from sqlalchemy.exc import OperationalError, ProgrammingError
from sqlalchemy.schema import Table, MetaData, Column
from sqlalchemy import String, Integer, Float, Text, DateTime
# --------------------------------------------------

class MetricsStore:
    def __init__(self, db_url: Optional[str] = None):
        # Determine if using SQLite or a relational DB based on db_url
        if db_url is None:
            # Fallback to SQLite if no DB URL provided (e.g., for local dev/testing without external DB)
            self.db_path = "metrics/chess_metrics.db"
            os.makedirs(os.path.dirname(self.db_path), exist_ok=True)
            self.engine = create_engine(f"sqlite:///{self.db_path}", connect_args={'timeout': 30})
            self.is_sqlite = True
        else:
            self.db_url = db_url
            self.engine = create_engine(self.db_url, pool_pre_ping=True, pool_recycle=3600)
            self.is_sqlite = False

        self.lock = threading.RLock()
        self.local = threading.local()
        self.metadata = MetaData() # For SQLAlchemy reflection

        self._initialize_database()

        self.collection_active = False
        self.collection_thread = None

    def _get_connection(self):
        # Use SQLAlchemy engine for connection management.
        # Reuse this thread's connection, reopening it if a previous `with` block closed it.
        conn = getattr(self.local, 'connection', None)
        if conn is None or conn.closed:
            try:
                conn = self.engine.connect()
                # For SQLite, ensure WAL and normal sync are set for better performance
                if self.is_sqlite:
                    conn.execute(text('PRAGMA journal_mode=WAL'))
                    conn.execute(text('PRAGMA synchronous=NORMAL'))
                # No explicit begin(): SQLAlchemy 2.x starts a transaction automatically on the
                # first execute(), and calling begin() after that raises InvalidRequestError
                self.local.connection = conn
            except Exception as e:
                print(f"Error getting database connection: {e}")
                self.local.connection = None # Ensure it's None on failure
                raise
        return self.local.connection

    def _initialize_database(self):
        # Use SQLAlchemy's metadata and Table objects for schema definition
        # This allows for a more database-agnostic schema definition and migration
        with self._get_connection() as conn:
            # Define tables
            log_entries_table = Table(
                'log_entries', self.metadata,
                Column('id', Integer, primary_key=True, autoincrement=True),
                Column('timestamp', String),
                Column('function_name', String),
                Column('log_file', String),
                Column('message', Text),
                Column('value', Float),
                Column('label', String),
                Column('side', String),
                Column('fen', Text),
                Column('raw_text', Text, unique=True),
                Column('created_at', DateTime, default=datetime.now) # Pass the callable so each row gets its own timestamp
            )
            game_results_table = Table(
                'game_results', self.metadata,
                Column('id', Integer, primary_key=True, autoincrement=True),
                Column('game_id', String, unique=True),
                Column('timestamp', String),
                Column('winner', String),
                Column('game_pgn', Text),
                Column('white_player', String),
                Column('black_player', String),
                Column('game_length', Integer),
                Column('created_at', DateTime, default=datetime.now), # Pass the callable so each row gets its own timestamp
                Column('white_ai_type', String),
                Column('black_ai_type', String),
                Column('white_depth', Integer),
                Column('black_depth', Integer)
            )
            config_settings_table = Table(
                'config_settings', self.metadata,
                Column('id', Integer, primary_key=True, autoincrement=True),
                Column('config_id', String, unique=True),
                Column('timestamp', String),
                Column('game_id', String),
                Column('config_data', Text),
                Column('white_engine', String),
                Column('black_engine', String),
                Column('white_depth', Integer),
                Column('black_depth', Integer),
                Column('created_at', DateTime, default=datetime.now), # Pass the callable so each row gets its own timestamp
                Column('white_ai_type', String),
                Column('black_ai_type', String)
            )
            metrics_table = Table(
                'metrics', self.metadata,
                Column('id', Integer, primary_key=True, autoincrement=True),
                Column('metric_name', String),
                Column('metric_value', Float),
                Column('side', String),
                Column('function_name', String),
                Column('timestamp', String),
                Column('game_id', String),
                Column('config_id', String),
                Column('created_at', DateTime, default=datetime.now), # Pass the callable so each row gets its own timestamp
                # Adding composite unique constraint via SQL, not directly in SQLAlchemy for simplicity here
            )
            move_metrics_table = Table(
                'move_metrics', self.metadata,
                Column('id', Integer, primary_key=True, autoincrement=True),
                Column('game_id', String),
                Column('move_number', Integer),
                Column('player_color', String),
                Column('move_uci', String),
                Column('fen_before', Text),
                Column('evaluation', Float),
                Column('ai_type', String),
                Column('depth', Integer),
                Column('nodes_searched', Integer),
                Column('time_taken', Float),
                Column('pv_line', Text),
                Column('created_at', DateTime, default=datetime.now), # Pass the callable so each row gets its own timestamp
                # Adding composite unique constraint via SQL, not directly in SQLAlchemy for simplicity here
            )

            self.metadata.create_all(self.engine) # Creates tables if they don't exist

            # The ON CONFLICT(game_id, move_number, player_color) clause in add_move_metric only
            # works if a matching unique index exists. CREATE UNIQUE INDEX IF NOT EXISTS is
            # accepted by both PostgreSQL and SQLite. The index name is illustrative.
            # (Any composite constraint wanted on `metrics` still has to be added separately.)
            conn.execute(text(
                "CREATE UNIQUE INDEX IF NOT EXISTS uq_move_metrics_game_move_color "
                "ON move_metrics (game_id, move_number, player_color)"
            ))

            conn.commit() # Commit changes after creating tables and the index

    def _execute_with_retry(self, query, params=(), max_retries=5):
        # Modified to use SQLAlchemy's connection execute
        # Local import keeps this excerpt self-contained; it can move to the module-level imports
        from sqlalchemy.exc import IntegrityError
        retries = 0
        last_error = None
        while retries < max_retries:
            try:
                with self._get_connection() as conn:
                    result = conn.execute(query, params)
                    conn.commit() # Explicit commit for each operation
                    return result
            except OperationalError as e:
                if "locked" in str(e).lower(): # SQLite fallback: "database is locked"
                    last_error = e
                    retries += 1
                    time.sleep(0.1 * (2 ** retries)) # Exponential backoff
                else:
                    raise
            except (IntegrityError, ProgrammingError) as e:
                # Unique-constraint violations are raised as IntegrityError
                if "duplicate key value violates unique constraint" in str(e).lower() or "UNIQUE constraint failed" in str(e):
                    # Ignore duplicate inserts as per OR IGNORE behavior
                    return None
                raise
            except Exception as e:
                print(f"Unhandled error during DB operation: {e}")
                raise
        # SQLAlchemy's OperationalError cannot be constructed from a bare message,
        # so signal exhaustion with a RuntimeError chained to the last lock error
        raise RuntimeError("Max retries reached for query execution.") from last_error

    def add_game_result(self, game_id: str, timestamp: str, winner: str, game_pgn: str,
                        white_player: str, black_player: str, game_length: int,
                        white_ai_config: dict, black_ai_config: dict):
        # Use SQLAlchemy's text() for literal SQL or map to ORM if fully migrating
        query = text(f"""
            INSERT INTO game_results (game_id, timestamp, winner, game_pgn, white_player, black_player, game_length,
                                     white_ai_type, black_ai_type, white_depth, black_depth)
            VALUES (:game_id, :timestamp, :winner, :game_pgn, :white_player, :black_player, :game_length,
                    :white_ai_type, :black_ai_type, :white_depth, :black_depth)
            ON CONFLICT(game_id) DO NOTHING;
        """)
        params = {
            "game_id": game_id,
            "timestamp": timestamp,
            "winner": winner,
            "game_pgn": game_pgn,
            "white_player": white_player,
            "black_player": black_player,
            "game_length": game_length,
            "white_ai_type": white_ai_config.get('ai_type', 'unknown'),
            "black_ai_type": black_ai_config.get('ai_type', 'unknown'),
            "white_depth": white_ai_config.get('depth', 0),
            "black_depth": black_ai_config.get('depth', 0)
        }
        self._execute_with_retry(query, params)

    def add_move_metric(self, game_id: str, move_number: int, player_color: str,
                        move_uci: str, fen_before: str, evaluation: float,
                        ai_type: str, depth: int, nodes_searched: int,
                        time_taken: float, pv_line: str):
        query = text(f"""
            INSERT INTO move_metrics (game_id, move_number, player_color, move_uci, fen_before,
                                     evaluation, ai_type, depth, nodes_searched, time_taken, pv_line)
            VALUES (:game_id, :move_number, :player_color, :move_uci, :fen_before,
                    :evaluation, :ai_type, :depth, :nodes_searched, :time_taken, :pv_line)
            ON CONFLICT(game_id, move_number, player_color) DO NOTHING;
        """)
        params = {
            "game_id": game_id, "move_number": move_number, "player_color": player_color,
            "move_uci": move_uci, "fen_before": fen_before, "evaluation": evaluation,
            "ai_type": ai_type, "depth": depth, "nodes_searched": nodes_searched,
            "time_taken": time_taken, "pv_line": pv_line
        }
        self._execute_with_retry(query, params)

    def get_all_game_results_df(self):
        with self._get_connection() as conn:
            # Wrap raw SQL in text() so it works with SQLAlchemy 2.x connections
            df = pd.read_sql_query(text("SELECT * FROM game_results"), conn)
        return df

    def get_distinct_move_metric_names(self):
        # Query schema using inspect for dynamic column names
        inspector = inspect(self.engine)
        columns_info = inspector.get_columns('move_metrics')

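        # PostgreSQL reflects Float columns as DOUBLE PRECISION, so match that (and NUMERIC) too
        plot_eligible_types = ['REAL', 'INTEGER', 'FLOAT', 'DOUBLE', 'NUMERIC']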
        exclude_names = ['id', 'game_id', 'move_number', 'player_color', 'move_uci', 'fen_before', 'ai_type', 'pv_line', 'created_at']

        metric_names = []
        for col in columns_info:
            col_name = col['name']
            col_type = str(col['type']).upper()

            if col_name not in exclude_names and any(eligible_type in col_type for eligible_type in plot_eligible_types):
                metric_names.append(col_name)
        return sorted(metric_names)

    def get_filtered_move_metrics(self, white_ai_types: Optional[list] = None, black_ai_types: Optional[list] = None, metric_name: Optional[str] = None):
        with self._get_connection() as conn:
            # Building WHERE clause
            where_clauses = []
            params = {}

            if white_ai_types:
                # Use parameterized query for IN clause
                white_placeholders = ', '.join([f':white_ai_type_{i}' for i in range(len(white_ai_types))])
                where_clauses.append(f"gr.white_ai_type IN ({white_placeholders})")
                params.update({f'white_ai_type_{i}': val for i, val in enumerate(white_ai_types)})

            if black_ai_types:
                black_placeholders = ', '.join([f':black_ai_type_{i}' for i in range(len(black_ai_types))])
                where_clauses.append(f"gr.black_ai_type IN ({black_placeholders})")
                params.update({f'black_ai_type_{i}': val for i, val in enumerate(black_ai_types)})

            select_columns = "mm.game_id, mm.move_number, mm.player_color, mm.move_uci, mm.fen_before, mm.created_at, mm.evaluation, mm.nodes_searched, mm.time_taken, mm.depth, mm.pv_line"
            if metric_name:
                if metric_name not in self.get_distinct_move_metric_names():
                    print(f"Warning: Attempted to query invalid or non-numeric metric_name: {metric_name}")
                    return [] # Return empty if invalid metric_name
                # Only append the metric if it is not already selected, to avoid duplicate column names
                if f"mm.{metric_name}" not in select_columns.split(", "):
                    select_columns += f", mm.{metric_name}"

            query_str = f"""
            SELECT {select_columns}, gr.white_ai_type, gr.black_ai_type, gr.white_depth, gr.black_depth
            FROM move_metrics mm
            JOIN game_results gr ON mm.game_id = gr.game_id
            """

            if where_clauses:
                query_str += " WHERE " + " AND ".join(where_clauses)

            query_str += " ORDER BY mm.created_at"

            df = pd.read_sql_query(text(query_str), conn, params=params)
        return df.to_dict(orient='records') # Return as list of dicts for Dash

    def close(self):
        if hasattr(self.local, 'connection') and self.local.connection:
            try:
                self.local.connection.commit() # Commit any pending transactions
            except Exception as e:
                print(f"Error during final commit: {e}")
            self.local.connection.close()
            self.local.connection = None
        if self.collection_active:
            self.stop_collection()
        # Dispose of the engine if it's not SQLite to close pooled connections
        if not self.is_sqlite and self.engine:
            self.engine.dispose()

**Key Changes in `metrics_store.py`:**

  * **`__init__`**: Now accepts an optional `db_url`. If `None`, it defaults to SQLite. Otherwise, it uses `sqlalchemy.create_engine` to connect to the specified database (e.g., PostgreSQL).
  * **`_initialize_database`**: Uses `sqlalchemy.schema.Table` and `MetaData.create_all` to define and create tables in a database-agnostic way. `create_all` only creates tables that are missing; it does not add columns to tables that already exist, so schema changes to an existing database still need explicit `ALTER TABLE` statements or a migration tool such as Alembic.
  * **`_get_connection`**: Manages a per-thread connection drawn from SQLAlchemy's connection pool. Statements on it run inside a transaction that is committed explicitly, which matters when many workers write concurrently.
  * **`_execute_with_retry`**: Adapted to use SQLAlchemy's connection execution, retrying with exponential backoff when SQLite reports a locked database. The `INSERT` statements themselves use `ON CONFLICT ... DO NOTHING` to replicate SQLite's `INSERT OR IGNORE` behavior and tolerate duplicate submissions from concurrent workers.
  * **`add_game_result` / `add_move_metric`**: Updated to use SQLAlchemy's `text()` construct for SQL queries with named parameters, improving readability and security against SQL injection.
  * **`get_distinct_move_metric_names`**: Now uses SQLAlchemy's `inspect` module to dynamically query the schema of the `move_metrics` table, ensuring it only presents valid numeric columns for plotting.
  * **`get_filtered_move_metrics`**: Enhanced to use SQLAlchemy's `text()` and parameterized queries for dynamic filtering based on AI types and selected metrics. It explicitly joins `move_metrics` with `game_results` to access AI configuration details. Returns a list of dictionaries for easier Dash consumption.
  * **`close`**: Ensures proper disposal of the SQLAlchemy engine if an external database is used.
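
To make the `ON CONFLICT ... DO NOTHING` behavior concrete, here is a small standalone sketch using the standard-library `sqlite3` module and an in-memory database. It uses a trimmed-down `move_metrics` table; the helper names and the index name `uq_move` are assumptions for illustration. It also assumes the SQLite library bundled with Python is version 3.24 or newer, which is when this syntax was added. The point it shows is that the conflict target must be backed by a unique index, and that a repeated insert is then skipped silently.

```python
import sqlite3

INSERT_SQL = """
    INSERT INTO move_metrics (game_id, move_number, player_color, move_uci)
    VALUES (:game_id, :move_number, :player_color, :move_uci)
    ON CONFLICT(game_id, move_number, player_color) DO NOTHING
"""


def make_db() -> sqlite3.Connection:
    """Create an in-memory database with a reduced move_metrics table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE move_metrics ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, game_id TEXT, "
        "move_number INTEGER, player_color TEXT, move_uci TEXT)"
    )
    # Without this unique index the ON CONFLICT target above is rejected
    conn.execute(
        "CREATE UNIQUE INDEX uq_move "
        "ON move_metrics (game_id, move_number, player_color)"
    )
    return conn


def record_move(conn: sqlite3.Connection, row: dict) -> None:
    """Insert one move; a duplicate (game, move, color) is ignored."""
    conn.execute(INSERT_SQL, row)
    conn.commit()


def count_moves(conn: sqlite3.Connection) -> int:
    return conn.execute("SELECT COUNT(*) FROM move_metrics").fetchone()[0]
```

Calling `record_move` twice with the same `game_id`, `move_number`, and `player_color` leaves a single row, which is the behavior a worker needs when it retries a submission.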

**Action for You:**

  * You must modify `local_metrics_dashboard.py` to use the new `db_url` parameter when initializing `MetricsStore` if you want to use PostgreSQL. E.g., `metrics_store = MetricsStore(db_url="postgresql://chessuser:securepassword@localhost:5432/chess_metrics_db")`. If left as `MetricsStore()`, it will continue using SQLite.
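
Rather than hard-coding credentials, the URL can be assembled from the `VIPER_DB_*` environment variables defined in the Dockerfile. The helper below is a sketch under that assumption; `build_db_url` is a hypothetical name, and returning `None` when host or password is missing is a design choice that lets `MetricsStore()` fall back to SQLite.

```python
import os
from typing import Mapping, Optional
from urllib.parse import quote_plus


def build_db_url(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Build a PostgreSQL URL from VIPER_DB_* variables, or return None."""
    host = env.get("VIPER_DB_HOST")
    password = env.get("VIPER_DB_PASSWORD")
    if not host or not password:
        return None  # caller falls back to the SQLite default
    user = quote_plus(env.get("VIPER_DB_USER", "chessuser"))
    port = env.get("VIPER_DB_PORT", "5432")
    name = env.get("VIPER_DB_NAME", "chess_metrics_db")
    # quote_plus escapes characters such as '@' or '/' that would break the URL
    return f"postgresql://{user}:{quote_plus(password)}@{host}:{port}/{name}"
```

The dashboard could then create its store with `MetricsStore(db_url=build_db_url())`, so the same code runs against SQLite locally and PostgreSQL in a container.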

## Step 3: Local Distributed Setup with Kubernetes (K3s/MicroK8s)

For your Windows PCs, K3s or MicroK8s are great lightweight options for a local Kubernetes cluster. This will allow you to deploy your Dockerized chess engine workers.

### 3.1 Install K3s (Recommended for simplicity on Windows)

  * **Install WSL 2**: Follow Microsoft's guide to install WSL 2 on all your Windows 10/11 machines.
  * **Install K3s on Windows (via WSL 2)**:
    1.  On each Windows machine, open your WSL 2 Linux distribution (e.g., Ubuntu).
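    2.  On the machine that will act as the master (server) node only, run the K3s installation script. Running it unchanged on every machine would create a separate single-node cluster on each: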
        ```bash
        curl -sfL https://get.k3s.io | sh -
        ```
    3.  **One machine as Master**: On one machine (e.g., your i9 desktop), set it as the master node. Get its token:
        ```bash
        sudo cat /var/lib/rancher/k3s/server/node-token
        ```
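        And its IP address in WSL2: `ip a | grep eth0`. By default WSL 2 sits behind a NAT, so this address is usually not reachable from other PCs. Agents on other machines then need the Windows host's LAN IP, with port 6443 (and the other ports K3s uses between nodes) forwarded into WSL 2, for example via `netsh interface portproxy`, or WSL 2's mirrored networking mode where available.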
    4.  **Other machines as Agents**: On other Windows machines, join them as agent nodes to the master. Replace `MASTER_IP` and `NODE_TOKEN`:
        ```bash
        curl -sfL https://get.k3s.io | K3S_URL=https://MASTER_IP:6443 K3S_TOKEN=NODE_TOKEN sh -
        ```
  * **Verify Cluster**: On the master node, check if nodes are ready:
    ```bash
    kubectl get nodes
    ```

### 3.2 Kubernetes Deployment Files

You'll define Kubernetes objects (YAML files) to tell your cluster how to run your chess engine.

#### `chess-engine-worker.yaml` (Kubernetes Job)

This defines a `Job` that runs your chess engine container. Each `Job` will run a game (or a batch of games) and then exit.

```yaml
# chess-engine-worker.yaml
apiVersion: batch/v1
kind: Job
metadata:
  generateName: chess-engine-worker- # Kubernetes appends a random suffix so each Job instance gets a unique name (works with `kubectl create`, not `kubectl apply`)
spec:
  template:
    metadata:
      labels:
        app: chess-engine-worker
    spec:
      restartPolicy: Never # Crucial for Jobs: container completes and does not restart
      containers:
      - name: viper-chess-engine
        image: viper-chess-engine:latest # Use the Docker image you built
        imagePullPolicy: Never # Use local image if not pulling from registry
        env:
          - name: VIPER_DB_HOST
            value: "192.168.1.100" # Replace with your PostgreSQL host IP (e.g., your i9 desktop's LAN IP)
          - name: VIPER_DB_PORT
            value: "5432"
          - name: VIPER_DB_USER
            value: "chessuser"
          - name: VIPER_DB_PASSWORD
            valueFrom: # Best practice: use Kubernetes Secret for passwords
              secretKeyRef:
                name: chess-db-secret
                key: db-password
          - name: VIPER_DB_NAME
            value: "chess_metrics_db"
          - name: STOCKFISH_EXEC_PATH # Path inside the container; it must point at a Linux Stockfish binary, since a Windows .exe cannot run in a Linux container
            value: "/app/engine_utilities/external_engines/stockfish/stockfish-windows-x86-64-avx2.exe"
          # Pass AI config dynamically (example: override for this specific job)
          # These values would override defaults in config.yaml for this run
          # Or, modify chess_game.py to read these from env vars directly
          - name: WHITE_AI_TYPE
            value: "deepsearch"
          - name: BLACK_AI_TYPE
            value: "stockfish"
          - name: AI_GAME_COUNT
            value: "10" # Play 10 games per job instance
        volumeMounts:
          - name: stockfish-volume
            mountPath: /app/engine_utilities/external_engines/stockfish # Mount Stockfish binary
        resources:
          requests: # Request minimum resources
            cpu: "500m" # 0.5 CPU core
            memory: "1Gi" # 1 GB RAM
          limits: # Set maximum resources
            cpu: "2" # 2 CPU cores
            memory: "4Gi" # 4 GB RAM
      volumes:
        - name: stockfish-volume
          hostPath:
            # K3s runs inside WSL 2, so the node sees the Windows C: drive under /mnt/c
            path: /mnt/c/Users/patss/OneDrive/Documents/Programming/ViperChessEngine/v7p3r_chess_bot_simple/engine_utilities/external_engines/stockfish # Host path to Stockfish folder
            type: Directory # The directory must already exist on the node

```

**Explanation for `chess-engine-worker.yaml`:**

  * **`apiVersion: batch/v1`, `kind: Job`**: Defines this as a Kubernetes Job.
  * **`restartPolicy: Never`**: Important for Jobs; the container exits after completing its task and is not restarted.
  * **`image: viper-chess-engine:latest`**: Refers to the Docker image you built.
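  * **`imagePullPolicy: Never`**: Tells Kubernetes not to pull the image from a remote registry, assuming it is already available on each node. K3s uses containerd rather than the Docker daemon, so an image built with `docker build` has to be imported on every node, e.g. `docker save viper-chess-engine:latest | sudo k3s ctr images import -`, or pushed to a registry the nodes can pull from (in which case remove `imagePullPolicy: Never`).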
  * **`env`**: Environment variables are passed to the container. This is how you'll pass database connection details and potentially override AI configurations for specific test runs.
  * **`volumeMounts` / `volumes`**: This is how you'll make your Stockfish executable available *inside* the container. `hostPath` is used to mount a directory from the host machine directly into the container. **Adjust `path` to your exact Stockfish directory on your Windows hosts.**
  * **`resources`**: Critical for resource allocation. You can specify CPU (e.g., "500m" for 0.5 CPU core, "2" for 2 CPU cores) and memory limits. This helps Kubernetes schedule jobs efficiently on your diverse hardware.
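
The resource requests also determine how many workers the scheduler can place on one node. The arithmetic can be sketched as below; `max_workers_per_node` is a hypothetical helper, and the result is an upper bound, since Kubernetes reserves some CPU and memory for system components.

```python
def max_workers_per_node(node_cpu_millicores: int, node_memory_mib: int,
                         request_cpu_millicores: int = 500,
                         request_memory_mib: int = 1024) -> int:
    """Upper bound on worker pods per node, based on resource requests.

    Defaults mirror the manifest above: cpu "500m" and memory "1Gi".
    """
    by_cpu = node_cpu_millicores // request_cpu_millicores
    by_memory = node_memory_mib // request_memory_mib
    # The scarcer resource decides how many pods fit
    return min(by_cpu, by_memory)
```

For instance, a node offering 8000 millicores and 16384 MiB fits at most `min(16, 16) = 16` workers by requests, while a small board with 4000 millicores but only 2048 MiB is limited to 2 by memory.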

#### `chess-db-secret.yaml` (Kubernetes Secret - for passwords)

```yaml
# chess-db-secret.yaml
apiVersion: v1
kind: Secret
metadata:
  name: chess-db-secret
type: Opaque
data:
  db-password: c2VjdXJlcGFzc3dvcmQ= # Base64 encoded 'securepassword'
```

**To create the secret**:

1.  Base64 encode your actual password: `echo -n "your_db_password" | base64`
2.  Replace the `db-password` value with your encoded password.
3.  Apply the secret: `kubectl apply -f chess-db-secret.yaml`
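
If `base64` is not available in your shell (for example in a plain Windows terminal), the same encoding can be done in Python. This is a small sketch with hypothetical helper names; decoding is included so you can check what an existing `data` value actually contains.

```python
import base64


def encode_secret(value: str) -> str:
    """Base64-encode a secret for the `data` section of a Kubernetes Secret."""
    return base64.b64encode(value.encode("utf-8")).decode("ascii")


def decode_secret(encoded: str) -> str:
    """Decode a value copied from a Secret manifest back to plain text."""
    return base64.b64decode(encoded).decode("utf-8")
```

Note that a trailing newline changes the encoding, which is why the shell command uses `echo -n`: `encode_secret("pw")` and `encode_secret("pw\n")` give different strings.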

### 3.3 Running Jobs on Kubernetes

To run a batch of games:

```bash
kubectl create -f chess-engine-worker.yaml
```

You can scale up the number of parallel workers by setting the `completions` and `parallelism` fields in the Job spec (how many pods must finish in total, and how many may run at once), or by creating multiple `Job` instances programmatically (e.g., using a Python script or a CI/CD pipeline).
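
One way to create Jobs programmatically is to generate the manifest in Python and hand it to `kubectl`, which accepts JSON as well as YAML. The sketch below is an assumption about how such a script might look: `build_job_manifest` is a hypothetical helper, it reuses the names from `chess-engine-worker.yaml`, and it omits the database variables, volume, and resource settings for brevity.

```python
import json


def build_job_manifest(white_ai: str, black_ai: str, games: int,
                       parallelism: int = 1) -> dict:
    """Return a Kubernetes Job manifest for one engine matchup."""
    env = [
        {"name": "WHITE_AI_TYPE", "value": white_ai},
        {"name": "BLACK_AI_TYPE", "value": black_ai},
        {"name": "AI_GAME_COUNT", "value": str(games)},  # env values must be strings
    ]
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        # generateName lets the API server pick a unique name per Job
        "metadata": {"generateName": "chess-engine-worker-"},
        "spec": {
            "parallelism": parallelism,  # pods running at the same time
            "completions": parallelism,  # pods that must finish in total
            "template": {
                "metadata": {"labels": {"app": "chess-engine-worker"}},
                "spec": {
                    "restartPolicy": "Never",
                    "containers": [{
                        "name": "viper-chess-engine",
                        "image": "viper-chess-engine:latest",
                        "imagePullPolicy": "Never",
                        "env": env,
                    }],
                },
            },
        },
    }


def manifest_json(manifest: dict) -> str:
    """Serialize the manifest so it can be piped to `kubectl create -f -`."""
    return json.dumps(manifest, indent=2)
```

A driver script could loop over the matchups you want to compare, call `manifest_json(build_job_manifest(...))` for each, and pass the text to `kubectl create -f -` on standard input.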

## Step 4: Cloud Deployment Option (AWS)

For scaling beyond your home network or for continuous, high-volume testing, AWS is a powerful option.

### 4.1 AWS Services Overview

  * **Amazon EC2 (Elastic Compute Cloud)**: Virtual servers (VMs) where you can run your Docker containers. You'd choose instance types with powerful GPUs for ML training or high-CPU instances for search-intensive work.
      * **Pricing**: Pay-as-you-go per second. Spot Instances offer significant discounts for fault-tolerant workloads.
  * **Amazon ECS (Elastic Container Service) / EKS (Elastic Kubernetes Service)**: Managed container orchestration services. ECS is simpler for basic Docker deployments, while EKS provides a fully managed Kubernetes experience.
      * **AWS Fargate**: A compute engine for ECS/EKS that lets you run containers without managing servers. You only pay for the compute resources your containers use. Ideal for short-lived, burstable jobs like game simulations.
  * **Amazon RDS (Relational Database Service)**: Managed relational databases (PostgreSQL, MySQL, etc.). Simplifies database administration (backups, scaling, patching).
      * **Pricing**: Instance-based, plus storage and I/O.
  * **Amazon S3 (Simple Storage Service)**: Object storage for large files (PGNs, model checkpoints, raw log data).
  * **AWS CloudWatch**: Monitoring and logging for all your AWS resources. Essential for observing resource usage, performance, and errors.
  * **AWS Budgets**: Set custom cost alerts to monitor your spending and avoid surprises.

### 4.2 AWS Deployment Strategy (Fargate for Workers, RDS for DB)

This is a good starting point for a cost-effective and managed deployment.

1.  **Set up Amazon RDS for PostgreSQL**:
      * Go to the RDS console.
      * Create a new PostgreSQL database instance. Choose a small instance size (e.g., `db.t3.micro` or `db.t4g.micro`) for low-cost testing.
      * Enable public accessibility only if you need to connect from your local network during testing, and restrict the security group to your own IP address. Disable it for production.
      * Note the endpoint (hostname), port, username, and password. Update your `MetricsStore`'s `db_url` and Dockerfile environment variables accordingly.
2.  **Push Docker Image to Amazon ECR (Elastic Container Registry)**:
      * Create a repository in ECR.
      * Follow ECR's push commands to push your `viper-chess-engine` Docker image:
        ```bash
        aws ecr get-login-password --region <your-region> | docker login --username AWS --password-stdin <your-account-id>.dkr.ecr.<your-region>.amazonaws.com
        docker tag viper-chess-engine:latest <your-account-id>.dkr.ecr.<your-region>.amazonaws.com/viper-chess-engine:latest
        docker push <your-account-id>.dkr.ecr.<your-region>.amazonaws.com/viper-chess-engine:latest
        ```
3.  **Create an Amazon ECS Cluster**:
      * In the ECS console, create a new ECS Cluster (choose Fargate as the capacity provider).
4.  **Define an ECS Task Definition**:
      * This specifies your Docker image, CPU/memory limits, environment variables (for DB connection), and volume mounts (if using EFS for shared data, though direct DB writes are preferred).
      * For Stockfish executable, you'd either bundle it in the Docker image (which is what we set up) or mount it from EFS (if you need a shared, persistent file system across tasks, less common for binaries).
      * Example task definition structure. JSON does not allow comments, so the notes live here: the database password is injected from AWS Secrets Manager through the `secrets` list (the execution role needs permission to read that secret), `AI_GAME_COUNT` makes each task play 100 games, and `STOCKFISH_EXEC_PATH` has to point at a Linux Stockfish binary inside the image:
        ```json
        {
          "family": "viper-chess-engine-task",
          "networkMode": "awsvpc",
          "requiresCompatibilities": ["FARGATE"],
          "cpu": "1024",
          "memory": "2048",
          "executionRoleArn": "arn:aws:iam::ACCOUNT_ID:role/ecsTaskExecutionRole",
          "containerDefinitions": [
            {
              "name": "viper-chess-engine-worker",
              "image": "ACCOUNT_ID.dkr.ecr.REGION.amazonaws.com/viper-chess-engine:latest",
              "essential": true,
              "command": ["python", "chess_game.py"],
              "environment": [
                {"name": "VIPER_DB_HOST", "value": "your-rds-endpoint"},
                {"name": "VIPER_DB_PORT", "value": "5432"},
                {"name": "VIPER_DB_USER", "value": "chessuser"},
                {"name": "VIPER_DB_NAME", "value": "chess_metrics_db"},
                {"name": "STOCKFISH_EXEC_PATH", "value": "/app/engine_utilities/external_engines/stockfish/stockfish-windows-x86-64-avx2.exe"},
                {"name": "AI_GAME_COUNT", "value": "100"}
              ],
              "secrets": [
                {"name": "VIPER_DB_PASSWORD", "valueFrom": "arn:aws:secretsmanager:REGION:ACCOUNT_ID:secret:your-db-secret"}
              ],
              "logConfiguration": {
                "logDriver": "awslogs",
                "options": {
                  "awslogs-group": "/ecs/viper-chess-engine",
                  "awslogs-region": "REGION",
                  "awslogs-stream-prefix": "ecs"
                }
              }
            }
          ]
        }
        ```
5.  **Run ECS Tasks**:
      * In the ECS console, go to your cluster and select "Run new Task".
      * Specify your task definition and the number of tasks to run (e.g., 100 tasks to run 10,000 games if each task plays 100 games).
      * **Programmatic Execution**: For large-scale testing, you'd write a script (e.g., a Python script or AWS Lambda function) that uses the AWS SDK (Boto3) to launch many ECS tasks in parallel, possibly varying `AI_GAME_COUNT` or other AI parameters for A/B testing.
6.  **Deploy the Dashboard (Streamlit Cloud or AWS App Runner)**:
      * **Streamlit Cloud**: Easiest way. Connect your GitHub repo containing `local_metrics_dashboard.py` and `metrics_store.py`. Ensure your `metrics_store.py` connects to your **publicly accessible** AWS RDS endpoint. (Be mindful of security implications if exposing your DB publicly.)
      * **AWS App Runner**: A fully managed service for deploying web applications and APIs. You provide your source code or a Docker image, and App Runner handles infrastructure. This would be a more robust and secure way to host your dashboard in AWS.
7.  **Monitoring and Cost Management (CloudWatch, Budgets)**:
      * **CloudWatch**: Monitor CPU, memory, and network usage of your ECS tasks. Set up alarms for high resource utilization or task failures. Stream container logs to CloudWatch Logs for debugging.
      * **AWS Budgets**: Set up monthly budgets for your EC2/Fargate and RDS costs. Configure alerts (email/SNS) when your actual or forecasted spend approaches your budget limit. This is crucial for controlling costs in the cloud.
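The programmatic execution mentioned in step 5 could be sketched roughly as follows. This is a minimal sketch, not a definitive launcher: the cluster name, subnet IDs, and the choice to assign public IPs are assumptions you would replace with your own values, and the helper names `build_run_task_requests` and `launch_tasks` are hypothetical. It relies on the fact that a single ECS `RunTask` call starts at most 10 tasks, so larger runs are split into batches.

```python
from typing import Any, Dict, List, Sequence

MAX_TASKS_PER_CALL = 10  # ECS RunTask accepts at most 10 tasks per request


def build_run_task_requests(
    total_tasks: int,
    games_per_task: int,
    cluster: str,
    subnets: Sequence[str],
    task_definition: str = "viper-chess-engine-task",
    container_name: str = "viper-chess-engine-worker",
) -> List[Dict[str, Any]]:
    """Build keyword arguments for successive ecs.run_task calls."""
    requests: List[Dict[str, Any]] = []
    remaining = total_tasks
    while remaining > 0:
        count = min(remaining, MAX_TASKS_PER_CALL)
        requests.append({
            "cluster": cluster,
            "taskDefinition": task_definition,
            "launchType": "FARGATE",
            "count": count,
            "networkConfiguration": {
                # Assumption: tasks run in public subnets and need a public IP to pull from ECR
                "awsvpcConfiguration": {"subnets": list(subnets), "assignPublicIp": "ENABLED"}
            },
            "overrides": {
                "containerOverrides": [{
                    "name": container_name,
                    # Override the per-task game count, e.g. to vary it for A/B tests
                    "environment": [{"name": "AI_GAME_COUNT", "value": str(games_per_task)}],
                }]
            },
        })
        remaining -= count
    return requests


def launch_tasks(ecs_client: Any, requests: List[Dict[str, Any]]) -> List[str]:
    """Send each request through ecs_client.run_task and collect the started task ARNs."""
    task_arns: List[str] = []
    for request in requests:
        response = ecs_client.run_task(**request)
        task_arns.extend(task["taskArn"] for task in response.get("tasks", []))
    return task_arns
```

With Boto3 installed and credentials configured, you would call it along the lines of `launch_tasks(boto3.client("ecs"), build_run_task_requests(100, 100, "your-cluster", ["subnet-..."]))`, which corresponds to the 100 tasks × 100 games = 10,000 games example above.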

### 4.3 Key Considerations for Your Hardware

  * **Mixed Hardware**: Your diverse set of Windows PCs (i9, i5, older desktop, laptop, mini-PCs) can all serve as Docker hosts and Kubernetes nodes. The `requests` and `limits` in Kubernetes YAMLs are important to ensure tasks are scheduled on nodes with sufficient resources.
  * **GPU Utilization**: For future ML integration, direct GPU access in Kubernetes (especially on Windows) is complex.
      * **Local Kubernetes**: Requires specific drivers and GPU-aware Docker runtimes (e.g., NVIDIA Container Toolkit) on each node. K3s has some experimental support for GPU passthrough.
      * **Cloud (AWS)**: EC2 offers GPU instance families (e.g., `p3`, `g4dn`, `g5`). NVIDIA drivers come pre-installed when you launch them from a GPU-ready image such as an AWS Deep Learning AMI or the ECS GPU-optimized AMI. Fargate does not support GPUs, so you'd use EC2 instances managed by ECS/EKS for GPU workloads.
  * **Storage**: For local Kubernetes, you might need a Network File System (NFS) or SMB share if your workers need shared file access (e.g., for large datasets for ML training). For metrics, stick to the central database.
  * **Network (LAN)**: Ensure reliable network connectivity between all your local PCs and the PostgreSQL host. Hardwiring (Gigabit Ethernet) is always preferred over Wi-Fi for stability and performance.

This detailed guide provides a roadmap for your distributed computing setup.

-----

# 2\. Machine Learning Engine Creation and Enhancements How-To Guide

This guide outlines how to build four additional AI types for your Viper Chess Engine, leveraging machine learning techniques and integrating them with your existing evaluation logic and the overall `chess_game.py` framework. This will allow your AI to learn from data and improve its play, and be compatible with external chess GUIs.

## Overview of New AI Types

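We will define how to implement the following AI types (section 2.4 additionally covers the older V7P3R NN engine, a variant of the first type trained only on your own games):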

1.  **Viper NN AI Engine (Supervised Learning)**: Learns moves from existing human-played PGN data.
2.  **Viper Genetic AI Engine**: Evolves evaluation parameters using genetic algorithms, driven by performance against other AIs.
3.  **Viper Reinforcement AI Engine**: Learns through self-play by receiving rewards/penalties from game outcomes and evaluation scores.
4.  **Viper Hybrid NN-Search Engine**: Integrates a neural network directly into your existing search algorithms (e.g., for leaf node evaluation or move ordering).

## Common Integration Points for New AI Types

All new AI types will need to conform to an interface similar to your existing `EvaluationEngine` and `StockfishHandler` to be plug-and-play within `chess_game.py`.

### 2.0 General Structure for a New AI Engine

Each new AI engine type will primarily exist as a new Python class, similar to `EvaluationEngine` and `StockfishHandler`.

It will need:

  * **`__init__(self, board: chess.Board, player: chess.Color, ai_config: dict)`**: Initializes the engine with the current board, player color, and specific AI configuration (from `config.yaml`).
  * **`search(self, board: chess.Board, player: chess.Color, ai_config: dict, stop_callback=None)`**: This is the primary method called by `chess_game.py` to get a move. It should return a `chess.Move` object.
  * **`evaluate_position_from_perspective(self, board: chess.Board, player: chess.Color)`**: Returns a numerical evaluation of the board from the given player's perspective. This is crucial for the dashboard and for internal reward/penalty functions.
  * **`reset(self, board: chess.Board)`**: Resets the AI's internal state for a new game.
  * **Configuration in `config.yaml`**: Add a new `ai_type` and `engine` entry, along with any specific parameters for that AI (e.g., `model_path`, `mutation_rate`).
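The method list above can be captured in a small base class so that every new engine is interchangeable. The following is a minimal sketch under stated assumptions: `BaseAIEngine`, `FirstLegalMoveEngine`, `ENGINE_REGISTRY`, and `create_engine` are hypothetical names, not part of your existing code, and the board is typed loosely so the sketch runs without `python-chess` installed.

```python
from abc import ABC, abstractmethod
from typing import Any, Callable, Dict, Optional, Type


class BaseAIEngine(ABC):
    """Hypothetical base class mirroring the methods listed above."""

    def __init__(self, board: Any, player: bool, ai_config: Dict[str, Any]):
        self.board = board
        self.player = player
        self.ai_config = ai_config

    @abstractmethod
    def search(self, board: Any, player: bool, ai_config: Dict[str, Any],
               stop_callback: Optional[Callable[[], bool]] = None) -> Any:
        """Return the move to play (a chess.Move in the real project)."""

    @abstractmethod
    def evaluate_position_from_perspective(self, board: Any, player: bool) -> float:
        """Return a numerical evaluation from the given player's perspective."""

    def reset(self, board: Any) -> None:
        """Drop per-game state and attach the new board."""
        self.board = board


class FirstLegalMoveEngine(BaseAIEngine):
    """Trivial placeholder engine: plays the first legal move, evaluates everything as 0."""

    def search(self, board, player, ai_config, stop_callback=None):
        for move in board.legal_moves:
            return move
        return None  # No legal moves: the game is over

    def evaluate_position_from_perspective(self, board, player) -> float:
        return 0.0


# Maps the ai_type string from config.yaml to the class that implements it
ENGINE_REGISTRY: Dict[str, Type[BaseAIEngine]] = {
    "first_legal_move": FirstLegalMoveEngine,
}


def create_engine(ai_type: str, board: Any, player: bool, ai_config: Dict[str, Any]) -> BaseAIEngine:
    """Instantiate the engine registered for ai_type, or fail with a clear error."""
    try:
        engine_class = ENGINE_REGISTRY[ai_type]
    except KeyError:
        raise ValueError(f"Unknown ai_type: {ai_type!r}") from None
    return engine_class(board, player, ai_config)
```

A registry like this keeps `chess_game.py` free of long `if`/`elif` chains: adding an engine means adding one class and one dictionary entry.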

## 2.1 Viper NN AI Engine (Supervised Learning)

This AI learns to play by mimicking moves from human-played games (your PGN data).

### 2.1.1 Core Components (Based on `v7p3r_chess_ai_old_2025-05-31/train.py` and `chess_core.py`)

  * **`chess_core.py` (or similar)**:
      * **`ChessDataset(Dataset)`**: A PyTorch `Dataset` that reads your PGN files. It converts chess board positions into a numerical tensor representation (e.g., a 12x8x8 tensor representing pieces on squares) and associates them with the UCI string of the move played by "you" (the specified `username` in the PGN header).
      * **`board_to_tensor(board)`**: A function within `ChessDataset` (or as a standalone utility) that transforms a `chess.Board` object into the numerical input tensor for the neural network.
      * **`ChessAI(nn.Module)`**: Your neural network architecture. The provided example uses convolutional layers for spatial patterns and fully connected layers for move prediction. It outputs logits for each possible move.

        ```python
        # Simplified ChessAI (from provided files)
        import torch.nn as nn
        import torch.nn.functional as F

        class ChessAI(nn.Module):
            def __init__(self, num_classes):  # num_classes is the size of your move vocabulary
                super().__init__()
                self.conv1 = nn.Conv2d(12, 64, kernel_size=3, padding=1)
                self.conv2 = nn.Conv2d(64, 128, kernel_size=3, padding=1)
                self.conv3 = nn.Conv2d(128, 64, kernel_size=3, padding=1)
                self.fc1 = nn.Linear(64 * 8 * 8, 512)
                self.fc2 = nn.Linear(512, 256)
                self.fc3 = nn.Linear(256, num_classes)  # Output logits for each move
                # Optional value head that scores the position with a single number
                self.value_fc1 = nn.Linear(64 * 8 * 8, 256)
                self.value_fc2 = nn.Linear(256, 1)

            def forward(self, x):
                # x has shape (batch, 12, 8, 8)
                x = F.relu(self.conv1(x))
                x = F.relu(self.conv2(x))
                x = F.relu(self.conv3(x))
                x = x.view(x.size(0), -1)  # Flatten for FC layers

                # Policy head (move prediction)
                policy_output = F.relu(self.fc1(x))
                policy_output = F.relu(self.fc2(policy_output))
                policy_output = self.fc3(policy_output)

                # Value head (position evaluation)
                value_output = F.relu(self.value_fc1(x))
                value_output = self.value_fc2(value_output)

                return policy_output, value_output  # Return both for supervised learning
        ```

  * **`train.py` (or similar)**:
      * **`MoveEncoder`**: Maps unique UCI move strings to integer indices and vice-versa. This is essential for the NN's output layer and for converting predicted move indices back to `chess.Move` objects.
      * **Training Loop**: Loads your PGNs, converts them into a dataset using `ChessDataset`, and trains the `ChessAI` model using a supervised learning approach (e.g., `nn.CrossEntropyLoss` for move prediction and `nn.MSELoss` for value prediction).
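          * **Objective**: Minimize the difference between the predicted and actual moves from your PGNs and, for the value head, between the predicted value and a target such as the final game result (or an engine evaluation, if your PGNs include one).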
      * **Saving/Loading Model**: Saves the trained `state_dict` of the `ChessAI` model (e.g., as `v7p3r_chess_ai_model.pth`).
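To make the 12x8x8 board representation and the `MoveEncoder` idea concrete, here is a minimal, dependency-free sketch. It is not the code from your `chess_core.py` or `train.py`: `fen_to_planes` and this `MoveEncoder` are illustrative stand-ins, the plane ordering in `PIECE_ORDER` is an assumption, and the board is read from a FEN string so the example runs without `python-chess` or PyTorch. In the project you would build the same planes from a `chess.Board` and wrap them in a tensor.

```python
from typing import Dict, Iterable, List

PIECE_ORDER = "PNBRQKpnbrqk"  # Assumed plane order: white pieces first, then black


def fen_to_planes(fen: str) -> List[List[List[int]]]:
    """Turn the piece-placement field of a FEN string into 12 planes of 8x8 zeros and ones."""
    planes = [[[0] * 8 for _ in range(8)] for _ in range(12)]
    placement = fen.split()[0]
    for row, rank_text in enumerate(placement.split("/")):  # row 0 is rank 8
        col = 0
        for char in rank_text:
            if char.isdigit():
                col += int(char)  # A digit encodes a run of empty squares
            else:
                planes[PIECE_ORDER.index(char)][row][col] = 1
                col += 1
    return planes


class MoveEncoder:
    """Illustrative two-way mapping between UCI move strings and integer class indices."""

    def __init__(self) -> None:
        self.move_to_index: Dict[str, int] = {}
        self.index_to_move: List[str] = []

    def fit(self, uci_moves: Iterable[str]) -> "MoveEncoder":
        """Assign the next free index to every move string not seen before."""
        for uci in uci_moves:
            if uci not in self.move_to_index:
                self.move_to_index[uci] = len(self.index_to_move)
                self.index_to_move.append(uci)
        return self

    def encode(self, uci: str) -> int:
        return self.move_to_index[uci]

    def decode(self, index: int) -> str:
        return self.index_to_move[index]

    def __len__(self) -> int:
        return len(self.index_to_move)
```

The length of the fitted encoder is what you would pass as `num_classes` to `ChessAI`, and the encoded index of the move actually played is the target for the cross-entropy loss.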

### 2.1.2 Integration into `chess_game.py`

1.  **Load Model and Encoder**: In your `ChessGame.__init__()`, load the trained NN model and `MoveEncoder`.

    ```python
    # In ChessGame.__init__()
    # Requires at module level: import pickle, torch
    # ... other initializations ...
    self.move_encoder = MoveEncoder()  # Assuming MoveEncoder class is available

    # Load move vocabulary (mapping of UCI moves to indices)
    with open("move_vocab.pkl", "rb") as f:  # You need to create this file during training
        self.move_to_index = pickle.load(f)
    self.index_to_move = {v: k for k, v in self.move_to_index.items()}  # Reverse map

    self.device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    self.nn_model = ChessAI(num_classes=len(self.move_to_index)).to(self.device)
    # A saved state_dict holds only tensors, so the safer weights_only=True is sufficient
    self.nn_model.load_state_dict(
        torch.load("v7p3r_chess_ai_model.pth", map_location=self.device, weights_only=True)
    )
    self.nn_model.eval()  # Set model to evaluation mode
    # ...
    ```

2.  **New AI Type in `process_ai_move`**: Add a new `elif` block in `process_ai_move` to handle `ai_type == 'viper_nn_ai_engine'`.

    ```python
    # In EvaluationEngine.search (or where the AI decision is made in chess_game.py)
    # Requires at module level: import random, chess, numpy as np, torch, torch.nn.functional as F
        elif self.ai_type == 'viper_nn_ai_engine':  # New AI type name
            legal_moves = list(self.board.legal_moves)
            with torch.no_grad():  # No gradients needed for inference
                board_tensor = board_to_tensor(self.board).unsqueeze(0).to(self.device)
                policy_logits, value_eval = self.nn_model(board_tensor)
            probabilities = F.softmax(policy_logits, dim=1).squeeze(0).cpu().numpy().astype(np.float64)

            # Keep probability mass only on legal moves that exist in the vocabulary
            legal_move_indices = [self.move_to_index[m.uci()] for m in legal_moves if m.uci() in self.move_to_index]
            legal_probabilities = np.zeros_like(probabilities)
            if legal_move_indices:
                legal_probabilities[legal_move_indices] = probabilities[legal_move_indices]
            total = legal_probabilities.sum()

            if not legal_moves:
                ai_move = None  # Game is over; the caller must handle this case
            elif total > 0:
                # Sample a move in proportion to the network's probabilities.
                # Use np.argmax(legal_probabilities) instead for greedy (highest-probability) play.
                chosen_idx = np.random.choice(len(legal_probabilities), p=legal_probabilities / total)
                ai_move = chess.Move.from_uci(self.index_to_move[chosen_idx])
            else:  # No legal move is in the vocabulary: fall back to a random legal move
                ai_move = random.choice(legal_moves)

            self.current_eval = value_eval.item()  # Update current_eval from NN's value head
    ```

## 2.2 Viper Genetic AI Engine

This AI learns by evolving its evaluation parameters through genetic algorithms, with fitness determined by playing games using your existing evaluation logic.

### 2.2.1 Core Components (Based on `v7p3r_chess_ai_old_2025-05-31/genetic_algorithm.py` and `config.yaml` rulesets)

  * **`GeneticAlgorithm` Class**: From `v7p3r_chess_ai_old_2025-05-31/genetic_algorithm.py`.
      * **`initialize_population(model_template)`**: Creates an initial population of AI "models" (which are essentially sets of `config.yaml` evaluation parameters). Each "model" is a copy of your `EvaluationEngine` but with randomized `genetic_params` (e.g., `material_weight`, `center_control_bonus`, `king_safety_bonus`, etc.).
      * **`evaluate_fitness(model, games)`**: This is where your existing `EvaluationEngine` shines. Each `model` (set of parameters) is used to play games against a benchmark (e.g., your current best Viper, Stockfish, or even other genetic AIs). The fitness score for a `model` is derived from its game results (win/loss/draw, average evaluation deltas).
          * **Reward Function**: Use your existing `EvaluationEngine.evaluate_position_from_perspective()` output as a granular reward signal at each step, in addition to terminal game results. For example:
              * Win: +1000 points
              * Draw: +500 points
              * Loss: -1000 points
              * Evaluation gain from previous move: +X points (where X is the eval difference)
              * Evaluation loss from previous move: -Y points
      * **`select_parents()`**: Chooses fitter individuals from the population (e.g., roulette wheel selection, tournament selection).
      * **`crossover(parent1, parent2)`**: Combines parameters from two parent models to create offspring.
      * **`mutate(model)`**: Randomly alters some parameters of an offspring based on `mutation_rate` (e.g., slightly adjust a `checkmate_bonus` or `pst_weight`).
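The operators above can be sketched on plain parameter dictionaries. This is an illustrative sketch, not the code from your `genetic_algorithm.py`: the function names and signatures are assumptions, tournament selection and uniform crossover are just one reasonable choice among those mentioned, and the result points are the example values from the reward list.

```python
import random
from typing import Dict, List, Sequence

Params = Dict[str, float]

# Example terminal scores taken from the reward function list above
RESULT_POINTS = {"win": 1000.0, "draw": 500.0, "loss": -1000.0}


def game_fitness(result: str, eval_deltas: Sequence[float]) -> float:
    """Terminal points for the result plus the summed per-move evaluation changes."""
    return RESULT_POINTS[result] + sum(eval_deltas)


def select_parent(population: List[Params], fitnesses: Sequence[float],
                  rng: random.Random, tournament_size: int = 3) -> Params:
    """Tournament selection: draw a few individuals at random and keep the fittest."""
    contenders = rng.sample(range(len(population)), tournament_size)
    best = max(contenders, key=lambda i: fitnesses[i])
    return population[best]


def crossover(parent1: Params, parent2: Params, rng: random.Random) -> Params:
    """Uniform crossover: each gene is copied from one parent chosen by coin flip."""
    return {key: (parent1[key] if rng.random() < 0.5 else parent2[key]) for key in parent1}


def mutate(params: Params, rng: random.Random,
           mutation_rate: float = 0.1, scale: float = 0.1) -> Params:
    """Perturb each gene with probability mutation_rate using Gaussian noise."""
    child = dict(params)
    for key, value in params.items():
        if rng.random() < mutation_rate:
            # Noise is proportional to the gene's magnitude, with a floor of 1.0
            child[key] = value + rng.gauss(0.0, scale * max(abs(value), 1.0))
    return child
```

Passing an explicit `random.Random` instance makes runs reproducible, which helps when comparing parameter sets across generations.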

### 2.2.2 Integration and Training Workflow

1.  **Define Genetic Parameters**: Explicitly list which parameters from your `config.yaml` rulesets (`default_evaluation`, `aggressive_evaluation`, etc.) will be subject to genetic evolution. These become the "genes" of your AI.
2.  **Training Script (`train_genetic.py` - New File)**:
      * Initialize `GeneticAlgorithm`.
      * Load your `EvaluationEngine` (the current one) as the "template" for parameter structure.
      * Implement the main genetic loop:
          * `initialize_population()`
          * For each generation:
              * For each `model` in the population:
                  * Play `N` games using this `model`'s parameters.
                  * Use your `chess_game.py` (which now supports loading external AI configs) to play games.
                  * After each game, calculate `fitness` based on results and `EvaluationEngine`'s per-move scores.
              * `select_parents()`
              * `crossover()` and `mutate()` to create the next generation.
          * Track and log the best performing individual (its parameters and fitness) over generations.
      * **Store Best Parameters**: Save the `config.yaml` snippet of the best-performing genetic AI.
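The generational loop described above might look roughly like this. It is a sketch under assumptions: `run_generations` is a hypothetical helper, the fitness and breeding steps are passed in as callables, and the toy versions below only stand in for "play `N` games" and "select, cross over, mutate" so that the loop can run without a chess engine.

```python
import random
from typing import Callable, Dict, List, Sequence, Tuple

Params = Dict[str, float]


def run_generations(
    population: List[Params],
    fitness_fn: Callable[[Params], float],
    breed_fn: Callable[[List[Params], Sequence[float]], Params],
    generations: int,
    elite_count: int = 1,
) -> Tuple[Params, float]:
    """Evolve the population and return the best parameters and fitness seen."""
    best_params: Params = dict(population[0])
    best_fitness = float("-inf")
    for _ in range(generations):
        fitnesses = [fitness_fn(params) for params in population]
        order = sorted(range(len(population)), key=lambda i: fitnesses[i], reverse=True)
        # Track the best individual across all generations
        if fitnesses[order[0]] > best_fitness:
            best_fitness = fitnesses[order[0]]
            best_params = dict(population[order[0]])
        # Elitism: carry the top individuals over unchanged, breed the rest
        elites = [dict(population[i]) for i in order[:elite_count]]
        children = [breed_fn(population, fitnesses) for _ in range(len(population) - elite_count)]
        population = elites + children
    return best_params, best_fitness


def toy_fitness(params: Params) -> float:
    """Toy stand-in: pretends the ideal material_weight is 1.0 instead of playing games."""
    return -(params["material_weight"] - 1.0) ** 2


def make_toy_breeder(rng: random.Random) -> Callable[[List[Params], Sequence[float]], Params]:
    """Toy stand-in for selection plus mutation: fitter of two random parents, plus noise."""
    def breed(population: List[Params], fitnesses: Sequence[float]) -> Params:
        i, j = rng.sample(range(len(population)), 2)
        parent = population[i] if fitnesses[i] >= fitnesses[j] else population[j]
        return {key: value + rng.gauss(0.0, 0.05) for key, value in parent.items()}
    return breed
```

In `train_genetic.py`, `fitness_fn` would instead run games through `chess_game.py` with the candidate's parameters, which is by far the most expensive step and the natural unit of work to hand to your distributed workers.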

### 2.2.3 Integration into `chess_game.py`

1.  **New AI Type**: Add `ai_type: genetic_ai_engine` to your `ai_types` list in `config.yaml`.
2.  **Dynamic Config Loading**: In `ChessGame._initialize_ai_engines()`, when `engine == 'Viper'` and `ai_type == 'genetic_ai_engine'`, load a saved genetic parameter set (e.g., from a specific YAML file `best_genetic_config.yaml`) and apply it to the `EvaluationEngine` instance. This means `EvaluationEngine` might need a method to `load_ruleset(config_dict)`.
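The core of a `load_ruleset(config_dict)` method could be a guarded merge like the one below. This is only a sketch: `merge_ruleset` is a hypothetical helper, and it assumes a ruleset is a flat mapping of parameter names to numbers, which may not match how your `config.yaml` rulesets are actually nested.

```python
from typing import Any, Dict, Mapping


def merge_ruleset(defaults: Mapping[str, float], overrides: Mapping[str, Any]) -> Dict[str, float]:
    """Return a copy of defaults with the evolved values applied.

    Unknown keys are rejected so that a typo in the saved file cannot
    silently leave a parameter at its default value.
    """
    unknown = set(overrides) - set(defaults)
    if unknown:
        raise KeyError(f"Unknown evaluation parameters: {sorted(unknown)}")
    merged = dict(defaults)
    for key, value in overrides.items():
        merged[key] = float(value)  # Raises ValueError for non-numeric values
    return merged
```

The `overrides` mapping would come from parsing `best_genetic_config.yaml` (for example with PyYAML's `yaml.safe_load`), and the merged result would then replace the ruleset used by the `EvaluationEngine` instance.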

## 2.3 Viper Reinforcement AI Engine

This AI learns through self-play and trial-and-error, using your evaluation scores as immediate rewards.

### 2.3.1 Core Concepts (Based on `reinforcement_dot_ai.py` concepts and `self_play.py`)

  * **State Representation**: Convert `chess.Board` into a state that an RL agent can understand (e.g., the same tensor representation as for the NN AI).
  * **Action Space**: The set of all legal moves from a given state. You'll need `move_to_index`/`index_to_move` from your NN AI part.
  * **Reward Function**: This is where your existing evaluation logic is invaluable.
      * **Immediate Rewards**: The change in `EvaluationEngine.evaluate_position_from_perspective()` score after making a move. A positive change is a reward, a negative change is a penalty.
      * **Terminal Rewards**: Large positive reward for winning (checkmate), a large negative reward for losing (checkmated), and a neutral reward for draws.
  * **RL Algorithm**:
      * **Q-Learning / SARSA**: A tabular Q-learning approach (like in `reinforcement_dot_ai.py`) is feasible for very small board states or simplified chess (e.g., endgames with few pieces). For full chess, a neural network is required to approximate the Q-function (Deep Q-Networks - DQN).
      * **Policy Gradients (REINFORCE, Actor-Critic)**: More suitable for complex games like chess, where the network learns a policy (probability distribution over moves) directly.
  * **Self-Play Loop**: Your `self_play.py` provides the foundation.
      * An agent plays against itself (or a copy of itself).
      * At each step, the agent chooses a move (with some exploration - epsilon-greedy).
      * The environment (the chess board) updates, and a reward is calculated based on the immediate evaluation change.
      * The experience `(state, action, reward, next_state)` is stored.
      * The agent updates its policy/Q-values based on these experiences.
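The reward scheme and the epsilon-greedy move choice described above can be sketched in a few lines. The names `step_reward` and `choose_move_epsilon_greedy` are hypothetical, and the terminal reward of 1000 is an assumed magnitude chosen only to dominate typical per-move evaluation changes.

```python
import random
from typing import Callable, Optional, Sequence, TypeVar

Move = TypeVar("Move")

TERMINAL_REWARD = 1000.0  # Assumed magnitude; tune relative to your evaluation scale


def step_reward(eval_before: float, eval_after: float, outcome: Optional[str] = None) -> float:
    """Reward for one move from the agent's perspective.

    outcome is None while the game continues, otherwise "win", "loss" or "draw".
    """
    if outcome == "win":
        return TERMINAL_REWARD
    if outcome == "loss":
        return -TERMINAL_REWARD
    if outcome == "draw":
        return 0.0
    return eval_after - eval_before  # Immediate reward: change in evaluation


def choose_move_epsilon_greedy(
    legal_moves: Sequence[Move],
    score_fn: Callable[[Move], float],
    epsilon: float,
    rng: random.Random,
) -> Move:
    """With probability epsilon explore a random legal move, otherwise take the best-scored one."""
    if rng.random() < epsilon:
        return rng.choice(list(legal_moves))
    return max(legal_moves, key=score_fn)
```

Here `score_fn` would wrap the agent's Q-values or policy logits for the current state. Starting with a high `epsilon` and decaying it over training is a common way to shift from exploration to exploitation.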

### 2.3.2 Integration and Training Workflow

1.  **Define RL Environment (`chess_env.py` - New File)**:
      * A class that wraps `chess.Board`.
      * `reset()`: Resets the board to a starting position.
      * `step(action)`: Takes a `chess.Move` (action), applies it, returns `(new_state, reward, done, info)`.
          * `new_state`: Tensor representation of the new board.
          * `reward`: Calculated using `EvaluationEngine.evaluate_position_from_perspective()` changes and terminal game results.
          * `done`: Boolean indicating if the game is over.
          * `info`: Additional data (e.g., legal moves, actual evaluation).
      * `get_legal_actions()`: Returns a list of legal moves (or their indices).
2.  **RL Agent (`rl_agent.py` - New File)**:
      * Contains the `ChessAI` (NN) from your supervised learning part, but now trained with an RL objective.
      * Implements `choose_move(state, legal_moves)` (with exploration).
      * Implements `learn(experience)`: Updates the NN's weights based on experience.
3.  **Training Script (`train_rl.py` - New File)**:
      * Main loop for self-play episodes.
      * Each episode, instantiate `chess_env` and two `rl_agent`s (or one agent playing against itself).
      * Generate games, collect experiences, and train the agent(s).
      * **Save/Load Model**: Save the trained `state_dict` of the RL agent's `ChessAI` model.
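A minimal environment following the `reset()` / `step(action)` outline above might look like this. It is a sketch with loud assumptions: the class is duck-typed so it runs without `python-chess`, expecting only that the board object offers `push`, `legal_moves`, `is_game_over()`, `result()`, and `turn` (as `chess.Board` does); the `evaluate` and `encode_state` callables stand in for `EvaluationEngine.evaluate_position_from_perspective` and `board_to_tensor`; and the terminal reward magnitude is an assumed value.

```python
from typing import Any, Callable, Dict, List, Tuple


class ChessEnv:
    """Gym-style wrapper around a board object (a sketch, not a full environment)."""

    def __init__(
        self,
        board_factory: Callable[[], Any],         # e.g. chess.Board
        evaluate: Callable[[Any, bool], float],   # score from the given player's perspective
        encode_state: Callable[[Any], Any],       # e.g. board_to_tensor
        terminal_reward: float = 1000.0,          # assumed magnitude
    ) -> None:
        self._board_factory = board_factory
        self._evaluate = evaluate
        self._encode_state = encode_state
        self._terminal_reward = terminal_reward
        self.board = board_factory()

    def reset(self) -> Any:
        """Start a new game and return the encoded initial state."""
        self.board = self._board_factory()
        return self._encode_state(self.board)

    def get_legal_actions(self) -> List[Any]:
        return list(self.board.legal_moves)

    def step(self, action: Any) -> Tuple[Any, float, bool, Dict[str, Any]]:
        """Apply a move and return (new_state, reward, done, info) for the player who moved."""
        mover = self.board.turn  # True for White, False for Black
        eval_before = self._evaluate(self.board, mover)
        self.board.push(action)
        done = self.board.is_game_over()
        if done:
            result = self.board.result()  # "1-0", "0-1" or "1/2-1/2"
            if result == "1/2-1/2":
                reward = 0.0
            else:
                mover_won = (result == "1-0") == bool(mover)
                reward = self._terminal_reward if mover_won else -self._terminal_reward
        else:
            reward = self._evaluate(self.board, mover) - eval_before
        info = {"legal_moves": self.get_legal_actions()}
        return self._encode_state(self.board), reward, done, info
```

Rewards are always reported from the perspective of the side that just moved, which keeps a single self-playing agent's experience consistent for both colors.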

### 2.3.3 Integration into `chess_game.py`

1.  **New AI Type**: Add `ai_type: reinforcement_ai_engine` to `config.yaml`.
2.  **Load RL Model**: In `ChessGame._initialize_ai_engines()`, load the saved RL model and `MoveEncoder`.
3.  **Use RL Agent for `search`**: The `search` method for this AI type would involve feeding the current board state into the loaded RL agent's NN to predict a move.

## 2.4 V7P3R Chess AI Engine (Old NN Approach)

This refers to your original Neural Network AI that trains specifically off *your* PGNs to attempt to play like you. The core components are very similar to the "Viper NN AI Engine (Supervised Learning)" described in 2.1, but with a stronger emphasis on filtering and training on your own games.

### 2.4.1 Key Components and Differences

  * **`chess_core.py` (specifically `ChessDataset`)**: Ensure your `ChessDataset` can filter PGNs to include only games where "v7p3r" (or your specific username) was a player. This is explicitly handled in your `filter_v7p3r_games` function in `train.py`.
  * **`train.py`**: This script would be very similar to the supervised learning train script, but the dataset preparation would specifically filter for your games.
  * **Objective**: The primary objective is accurate move prediction based on *your* playstyle, not necessarily absolute chess strength (though it can correlate).

### 2.4.2 Integration into `chess_game.py`

This is largely identical to the "Viper NN AI Engine" integration (2.1.2), but you would set `ai_type: v7p3r_nn_ai_engine` and potentially use a different model file path in `config.yaml`.

## 2.5 Viper Hybrid NN-Search Engine (Additional ML/Eval Strategy)

This approach combines the strengths of traditional search algorithms (like your refined Minimax/Negamax with alpha-beta pruning) with the pattern recognition capabilities of a Neural Network. This is the foundation of modern top-tier chess engines (like Stockfish + NNUE).

### 2.5.1 Core Idea

Instead of solely using your hand-crafted evaluation function (`_calculate_score`) at the leaf nodes of your search tree, you would use a trained Neural Network for evaluation.

### 2.5.2 Components and Integration

1.  **Trained Neural Network**: You'll need a pre-trained `ChessAI` (or a similar lightweight NN architecture focused on evaluation) from your "Viper NN AI Engine" or a dedicated evaluation-focused NN. This NN's `value_output` will be its primary output.
2.  **Modify `EvaluationEngine.evaluate_position()`**:
      * In your `evaluation_engine.py`, find `evaluate_position()` or `_calculate_score()`.
      * Introduce a conditional check: If the AI type is "Viper Hybrid NN-Search", and a NN model is loaded, use the NN's `value_output` for evaluation. Otherwise, fall back to your existing rule-based evaluation.

    ```python
    # In EvaluationEngine (or a new HybridEvaluationEngine class)
    # Assume self.nn_model and self.device are initialized if using this AI type
    def evaluate_position_hybrid(self, board: chess.Board):
        nn_model = getattr(self, 'nn_model', None)
        if self.ai_type == 'viper_hybrid_nn_search' and nn_model is not None:
            with torch.no_grad():
                board_tensor = board_to_tensor(board).unsqueeze(0).to(self.device)
                _, value_eval = nn_model(board_tensor)  # Only interested in value_eval
                return value_eval.item()  # Return the numerical evaluation

        # Fallback to traditional evaluation for other AI types or if NN not loaded
        return self.evaluate_position(board)  # Your existing rule-based evaluation
    ```

3.  **Search Algorithm Integration**: Your `_deep_search`, `_minimax_search`, `_negamax_search`, and `_negascout` functions (in `evaluation_engine.py`) would then call `evaluate_position_hybrid` (or directly use `self.nn_model` if deeply integrated) at their leaf nodes (when `depth == 0` or game is over).
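The idea of swapping the leaf evaluator while leaving the search untouched can be shown on a toy game tree. This is a sketch, not your `_negamax_search`: the `Node` class and its two precomputed scores are invented for illustration, with `nn_score` standing in for the network's `value_output`. Scores are from the perspective of the side to move at that node, as negamax requires.

```python
import math
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Node:
    """Toy position: two precomputed leaf scores plus child positions."""
    rule_score: float = 0.0   # stand-in for the hand-crafted evaluation
    nn_score: float = 0.0     # stand-in for the neural network's value output
    children: List["Node"] = field(default_factory=list)


def negamax(node: Node, depth: int, alpha: float, beta: float,
            evaluate: Callable[[Node], float]) -> float:
    """Negamax with alpha-beta pruning; evaluate() is called only at leaf nodes."""
    if depth == 0 or not node.children:
        return evaluate(node)
    best = -math.inf
    for child in node.children:
        # The child's score is from the opponent's perspective, so negate it
        score = -negamax(child, depth - 1, -beta, -alpha, evaluate)
        best = max(best, score)
        alpha = max(alpha, score)
        if alpha >= beta:
            break  # The opponent would never allow this line
    return best


def rule_based_eval(node: Node) -> float:
    return node.rule_score


def nn_eval(node: Node) -> float:
    return node.nn_score


# A two-ply tree: the root player moves, then the opponent replies
TREE = Node(children=[
    Node(children=[Node(rule_score=3, nn_score=1), Node(rule_score=5, nn_score=4)]),
    Node(children=[Node(rule_score=2, nn_score=6), Node(rule_score=9, nn_score=7)]),
])
```

Calling `negamax(TREE, 2, -math.inf, math.inf, rule_based_eval)` and then the same with `nn_eval` runs the identical search with a different leaf evaluator, and the two can prefer different moves. One practical caveat: a network forward pass per leaf is far slower than a hand-crafted evaluation, so batching leaf positions or reducing search depth is usually necessary.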

### 2.5.3 Training the Hybrid NN

  * The NN component of the hybrid engine would be trained using supervised learning, similar to the "Viper NN AI Engine," but specifically focused on producing accurate positional evaluations.
  * **Data Sources**: You can use:
      * Your own PGNs (extract board states and their final game results, or use deep search evaluations as target values).
      * Lichess datasets (millions of games with Stockfish evaluations often included).
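  * **Training Objective**: The network learns to predict an accurate evaluation score for a position, typically by minimizing a regression loss such as `nn.MSELoss` between its `value_output` and the target evaluation.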

## 2.6 GUI and Lichess Bot Compatibility

The current `chess_game.py` and `stockfish_handler.py` are designed with the UCI protocol in mind. This is the key to compatibility.

  * **UCI Compatibility**: UCI is the standard text protocol, spoken over stdin/stdout, that chess GUIs use to talk to engines. Inside `chess_game.py`, your engines are called directly through `EvaluationEngine.search()` and `evaluate_position_from_perspective()`, so any new AI (Viper NN, Genetic, Reinforcement, Hybrid) only needs to implement those methods to work there. To work with external GUIs, it additionally needs a UCI wrapper around those methods.
      * If you were to create a standalone UCI interface (like `uci_interface.py` from old projects), it would simply instantiate your chosen AI engine and handle UCI commands (`position`, `go`, `isready`, etc.) by calling the engine's `search` and `evaluate_position_from_perspective` methods.
  * **Lichess Bot Deployment**: Your `lichess_bot.py` already uses `chess.engine` and `requests` to interact with the Lichess API.
      * The `LichessBot` class takes an `engine` parameter during initialization. You would simply pass an instance of your new AI engine class (e.g., `LichessBot(token, engine=ViperNNAIEngine(board, color, config))`) instead of `EvaluationEngine` directly.
      * As long as your new AI classes implement the `search()` and `evaluate_position_from_perspective()` methods (or adapt to what the `LichessBot` expects for move generation and evaluation), integration should be straightforward.
  * **Nibbler and Other GUIs**: These GUIs expect a UCI-compatible executable.
      * You would use `PyInstaller` (as hinted in `package_exe.py`) to create a standalone executable for a dedicated UCI interface script. This script would act as a wrapper, taking UCI commands from the GUI and passing them to your chosen AI engine internally.
      * The `package_uci_engine` function in `package_exe.py` provides a good starting point for creating an executable for a UCI interface for your custom engines.
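A dedicated UCI wrapper script can be quite small. The sketch below handles only the core commands (`uci`, `isready`, `ucinewgame`, `position`, `go`, `quit`) and is not your old `uci_interface.py`: the `UciEngineAdapter` protocol and its three methods are hypothetical, standing in for a thin layer that keeps a `chess.Board` and calls your engine's `search` method. A real wrapper would also parse `go` parameters and support `stop`.

```python
import sys
from typing import IO, List, Optional, Protocol


class UciEngineAdapter(Protocol):
    """Hypothetical adapter between UCI commands and one of your engine classes."""

    def new_game(self) -> None: ...
    def set_position(self, fen: Optional[str], moves: List[str]) -> None: ...
    def best_move(self) -> str: ...


def handle_command(line: str, engine: UciEngineAdapter,
                   name: str = "Viper", author: str = "unknown") -> List[str]:
    """Process one UCI command line and return the lines to send back."""
    tokens = line.split()
    if not tokens:
        return []
    command = tokens[0]
    if command == "uci":
        return [f"id name {name}", f"id author {author}", "uciok"]
    if command == "isready":
        return ["readyok"]
    if command == "ucinewgame":
        engine.new_game()
        return []
    if command == "position":
        end = tokens.index("moves") if "moves" in tokens else len(tokens)
        moves = tokens[end + 1:]
        # "position startpos ..." means the standard start; otherwise a FEN follows "fen"
        fen = " ".join(tokens[2:end]) if len(tokens) > 1 and tokens[1] == "fen" else None
        engine.set_position(fen, moves)
        return []
    if command == "go":
        return [f"bestmove {engine.best_move()}"]
    return []  # UCI says unknown commands should be ignored


def run_uci_loop(stdin: IO[str], stdout: IO[str], engine: UciEngineAdapter) -> None:
    """Read commands until "quit", writing and flushing each response line."""
    for line in stdin:
        if line.strip() == "quit":
            break
        for response in handle_command(line, engine):
            stdout.write(response + "\n")
            stdout.flush()  # GUIs read line by line, so never leave output buffered
```

The script's entry point would call `run_uci_loop(sys.stdin, sys.stdout, adapter)`, and that script is what you would package into an executable for Nibbler or another GUI.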

These two comprehensive guides should provide you with the detailed steps and conceptual understanding needed to implement your distributed computing setup and advanced AI engines. Good luck with your exciting experiments!
