# QCML Client Demo: End-to-End Training with the QCML API

This notebook demonstrates the complete QCML workflow:
- 🔍 **Resource Discovery**: Finding available environments and experiments
- 📁 **Dataset Management**: Loading and uploading training data
- ⚙️ **Model Configuration**: Setting up hyperparameters for quantum models
- 🚀 **Training Experiments**: Running training with real-time monitoring
- 📊 **Results Analysis**: Viewing training metrics and outcomes

**Prerequisites:**
- QCOG API key available as `QCOG_API_KEY`, set in the environment or in a `.env` file
- Access to a QCML project
- Test dataset (we'll create one!)

## 📦 Setup and Imports

First, let's import all the necessary libraries and set up our clients.


In [1]:
import qcogclient
import qcogclient._version
from qcogclient.qcog.adapters import load_csv
from qcog_types.pytorch_models.hyperparameters import (
    GeneralHSModelHyperparameters,
    OptimizerConfig,
    LossFunctionConfig,
    EarlyStoppingConfig
)
import dotenv
import os
import pandas as pd
import numpy as np

# Load environment variables
dotenv.load_dotenv()

print("✅ All imports successful!")


✅ All imports successful!


We'll create three different clients for different operations:
- **AdminClient**: For discovering system resources
- **ProjectClient**: For managing datasets
- **ExperimentClient**: For running training experiments

In [2]:
# Initialize clients with API key from environment
api_key = os.getenv("QCOG_API_KEY")
if not api_key:
    raise ValueError("❌ QCOG_API_KEY not found in environment variables!")

admin_client = qcogclient.AdminClient(api_key=api_key)
project_client = qcogclient.ProjectClient(api_key=api_key)
experiment_client = qcogclient.ExperimentClient(api_key=api_key)

print("✅ QCML clients initialized successfully!")
print("🔑 API key loaded from environment")


✅ QCML clients initialized successfully!
🔑 API key loaded from environment
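
The fail-fast check on the API key can be factored into a small reusable helper. The sketch below is illustrative only: `require_env` and `mask_secret` are hypothetical names and not part of `qcogclient`. It shows one way to raise a clear error when a variable is missing and to log a key without exposing it.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable or raise a clear error."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; export it or add it to your .env file")
    return value


def mask_secret(secret: str, visible: int = 4) -> str:
    """Hide all but the last few characters so a key can be logged safely."""
    if len(secret) <= visible:
        return "*" * len(secret)
    return "*" * (len(secret) - visible) + secret[-visible:]
```

With these helpers, `api_key = require_env("QCOG_API_KEY")` would replace the manual `if not api_key` check, and `mask_secret(api_key)` could be printed for confirmation.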


## 🔍 Discover Available Resources

Before we start, let's see what environments and experiments are available in the system.

In [3]:
# Discover available environments
print("🖥️ Available Compute Environments:")
print("=" * 50)

envs = await admin_client.list_environments()
for env in envs['response']:
    print(f"📋 {env['name']: <15} - {env['description']}")

print("\n🧪 Available Experiments:")
print("=" * 50)

experiments = await admin_client.list_experiments()
for exp in experiments['response']:
    print(f"🔬 {exp['name']: <15} - {exp['description']}")

print("\n🎯 For this demo, we'll use:")
print("   Environment: py3.12 (CPU)")
print("   Experiment: pytorch-models")


🖥️ Available Compute Environments:
📋 py3.12          - Python 3.12. CPU
📋 py3.12-cuda     - Python 3.12. GPU with CUDA 19.9
📋 python3.12      - Basic Python environment with no GPU support

🧪 Available Experiments:
🔬 pytorch-models  - A package for training and predicting with pytorch models
🔬 weighted-general - A package for training and predicting with a weighted general model

🎯 For this demo, we'll use:
   Environment: py3.12 (CPU)
   Experiment: pytorch-models
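
Both listing calls return a dictionary whose `response` key holds a list of entries with `name` and `description` fields. As a minimal sketch, a lookup helper can confirm that the environment or experiment you plan to use actually exists before you launch anything. `find_by_name` is a hypothetical helper, and the sample payload is copied from the output above rather than fetched from the API.

```python
from typing import Any


def find_by_name(payload: dict[str, Any], name: str) -> dict[str, Any]:
    """Return the entry called `name` from a {'response': [...]} payload."""
    entries = payload.get("response", [])
    for entry in entries:
        if entry.get("name") == name:
            return entry
    available = ", ".join(sorted(e.get("name", "?") for e in entries))
    raise LookupError(f"{name!r} not found; available: {available}")


# Sample payload copied from the listing output above (not a live API call).
sample_envs = {
    "response": [
        {"name": "py3.12", "description": "Python 3.12. CPU"},
        {"name": "py3.12-cuda", "description": "Python 3.12. GPU with CUDA 19.9"},
        {"name": "python3.12", "description": "Basic Python environment with no GPU support"},
    ]
}

cpu_env = find_by_name(sample_envs, "py3.12")
print(cpu_env["description"])  # Python 3.12. CPU
```

In the notebook, the same helper could be applied to `envs` and `experiments` directly.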


## 📊 Create Test Dataset

Let's create a small synthetic regression dataset for testing QCML models: five numerical features, two of which are correlated with the first, and a target built from linear terms, one interaction term, and noise.

In [4]:
# Set random seed for reproducibility
np.random.seed(42)

# Generate synthetic dataset
n_samples = 100
n_features = 5

# Create correlated features
X = np.random.randn(n_samples, n_features)
X[:, 1] = X[:, 0] * 0.8 + np.random.randn(n_samples) * 0.2  # feature_2 correlated with feature_1
X[:, 2] = X[:, 0] * 0.3 + X[:, 1] * 0.4 + np.random.randn(n_samples) * 0.3  # feature_3 depends on both

# Scale features to reasonable range
X = (X + 3) * 1.5  # Shift and scale: values are mostly positive and center near 4.5

# Create the target: linear terms plus one nonlinear interaction term and Gaussian noise
# (the interaction term gives the model something beyond a purely linear fit to learn)
target = (
    2.5 * X[:, 0] +
    1.8 * X[:, 1] +
    1.2 * X[:, 2] +
    0.9 * X[:, 3] +
    0.6 * X[:, 4] +
    0.3 * X[:, 0] * X[:, 1] +  # Interaction term
    np.random.randn(n_samples) * 0.5  # Noise
)

# Create DataFrame
feature_names = [f'feature_{i+1}' for i in range(n_features)]
df = pd.DataFrame(X, columns=feature_names)
df['target'] = target

# Save to CSV
dataset_path = "demo_dataset.csv"
df.to_csv(dataset_path, index=False)

print(f"📊 Created dataset with {len(df)} rows and {len(df.columns)} columns")
print(f"💾 Saved to: {dataset_path}")
print("\n🔍 Dataset Preview:")
print(df.head())
print("\n📈 Dataset Statistics:")
print(df.describe().round(3))


📊 Created dataset with 100 rows and 6 columns
💾 Saved to: demo_dataset.csv

🔍 Dataset Preview:
   feature_1  feature_2  feature_3  feature_4  feature_5     target
0   5.245071   5.373910   5.413730   6.784545   4.148770  46.072146
1   4.148795   4.791861   4.096408   3.795788   5.313840  37.006177
2   3.804873   3.524328   4.292516   1.630080   1.912623  27.292355
3   3.656569   3.994146   4.654666   3.137964   2.381544  29.846807
4   6.698473   6.063586   5.971022   2.362878   3.683426  50.569224

📈 Dataset Statistics:
       feature_1  feature_2  feature_3  feature_4  feature_5   target
count    100.000    100.000    100.000    100.000    100.000  100.000
mean       4.433      4.412      4.455      4.692      4.452   37.562
std        1.351      1.119      0.963      1.439      1.609    9.632
min        1.572      1.999      2.318      1.630      0.570   18.343
25%        3.398      3.598      3.927      3.696      3.364   29.783
50%        4.485      4.464      4.469      4.729     
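
As a sanity check on the construction above, the correlation between `feature_1` and `feature_2` can be predicted analytically: since `feature_2 = 0.8 * feature_1 + 0.2 * noise`, the expected Pearson correlation is `0.8 / sqrt(0.8**2 + 0.2**2)`, roughly 0.970, and the later shift and scale do not change it. The sketch below reproduces the construction with the standard library only (so it does not use the notebook's NumPy random stream) and a larger sample, with a hypothetical `pearson` helper.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (norm_x * norm_y)


rng = random.Random(0)
n = 10_000  # larger than the demo's 100 rows, to reduce sampling noise

# Same recipe as the notebook: feature_2 = 0.8 * feature_1 + 0.2 * noise
f1 = [rng.gauss(0, 1) for _ in range(n)]
f2 = [0.8 * a + 0.2 * rng.gauss(0, 1) for a in f1]

# Apply the same shift and scale; correlation is unaffected by it
f1_scaled = [(a + 3) * 1.5 for a in f1]
f2_scaled = [(b + 3) * 1.5 for b in f2]

expected = 0.8 / math.sqrt(0.8**2 + 0.2**2)
observed = pearson(f1_scaled, f2_scaled)
print(f"expected ~ {expected:.3f}, observed = {observed:.3f}")
```

With only 100 rows, as in the demo dataset, the observed value will scatter more around the expected one.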

## 📤 Upload Dataset to QCML Platform

Now let's upload our dataset to the QCML platform using the `load_csv` adapter.

In [5]:
# Dataset configuration
dataset_name = "demo-dataset"
dataset_description = "Synthetic dataset for QCML demo"

print("📤 Uploading dataset to QCML platform...")
print("=" * 50)

# Load the CSV and extract metadata
loaded_dataset = load_csv(dataset_path)
print(f"✅ Dataset loaded: {loaded_dataset['number_of_rows']} rows, {loaded_dataset['number_of_columns']} columns")

try:
    await project_client.upload_dataset(
        file=loaded_dataset['file'],
        name=dataset_name,
        description=dataset_description,
        override=True,  # Allow overwriting for demo purposes
        chunk_size=1024 * 1024 * 10  # 10MB chunks
    )
    print("🎉 Dataset uploaded successfully!")

    # Store metadata for later use
    n_columns = loaded_dataset['number_of_columns']
    n_rows = loaded_dataset['number_of_rows']

except Exception as e:
    print(f"❌ Failed to upload dataset: {e}")
    raise


📤 Uploading dataset to QCML platform...


Loading CSV: 100%|██████████| 1/1 [00:00<00:00, 506.86chunk/s, percentage=100] 

✅ Dataset loaded: 101 rows, 6 columns





🎉 Dataset uploaded successfully!
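
Note that the loader reports 101 rows, while the DataFrame we saved has 100. One plausible explanation, which is an assumption and not confirmed by this notebook, is that the header line is counted as a row. A quick way to tell total lines from data rows in a CSV is sketched below, using a tiny in-memory stand-in for `demo_dataset.csv`; `count_csv_rows` is a hypothetical helper.

```python
import csv
import io


def count_csv_rows(text: str) -> tuple[int, int]:
    """Return (total_lines, data_rows) for CSV text with one header line."""
    rows = list(csv.reader(io.StringIO(text)))
    total = len(rows)
    return total, max(total - 1, 0)


# A tiny stand-in for demo_dataset.csv: one header line plus three data rows.
sample = "feature_1,target\n1.0,2.0\n3.0,4.0\n5.0,6.0\n"
total, data_rows = count_csv_rows(sample)
print(total, data_rows)  # 4 3
```

If the same one-row gap shows up for the real file, any value derived from `number_of_rows` should be read with that offset in mind.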


## ⚙️ Configure Model Hyperparameters

Now let's set up hyperparameters for our General Hilbert Space Model. We'll use small settings that train quickly on this demo dataset.

In [6]:
print("⚙️ Configuring QCML Model Hyperparameters")
print("=" * 50)

# Calculate input features (total columns - target column)
input_features_count = n_columns - 1
print(f"📊 Input features: {input_features_count}")
print("🎯 Target: 1 (regression)")
print(f"📏 Dataset size: {n_rows} samples")

# Configure hyperparameters for demo
hyperparameters = GeneralHSModelHyperparameters(
    # Model architecture
    hsm_model="general",
    input_operator_count=input_features_count,  # Number of input features
    output_operator_count=1,  # Single target (regression)
    hilbert_space_dims=8,  # Small dimension for quick demo
    complex=True,  # Use complex operators (recommended)

    # Training configuration
    epochs=50,  # Reasonable number for demo
    batch_size=16,  # Small batch for small dataset
    seed=42,  # Reproducibility
    targets="target",  # Name of our target column (the parameter itself is plural: 'targets')
    device="cpu",  # Use CPU for broader compatibility
    split=0.2,  # 20% validation split

    # Optimizer settings
    optimizer_config=OptimizerConfig(
        type="Adam",
        default_params={"lr": 0.001}  # Standard learning rate
    ),

    # Loss function
    loss_fn_config=LossFunctionConfig(
        type="MSELoss",  # Mean Squared Error for regression
        params={}
    ),

    # Early stopping to prevent overfitting
    early_stopping_config=EarlyStoppingConfig(
        monitor="val_loss",
        patience=10,  # Stop after 10 consecutive epochs without improvement
        mode="min",
        verbose=True,
        restore_best_weights=True
    )
)

print("✅ Hyperparameters configured successfully!")
print("\n🔧 Model Configuration:")
print("   Architecture: General HSM")
print(f"   Hilbert Space Dims: {hyperparameters.hilbert_space_dims}")
print(f"   Input Operators: {hyperparameters.input_operator_count}")
print(f"   Output Operators: {hyperparameters.output_operator_count}")
print(f"   Training Epochs: {hyperparameters.epochs}")
print(f"   Batch Size: {hyperparameters.batch_size}")
print(f"   Learning Rate: {hyperparameters.optimizer_config.default_params['lr']}")


⚙️ Configuring QCML Model Hyperparameters
📊 Input features: 5
🎯 Target: 1 (regression)
📏 Dataset size: 101 samples
✅ Hyperparameters configured successfully!

🔧 Model Configuration:
   Architecture: General HSM
   Hilbert Space Dims: 8
   Input Operators: 5
   Output Operators: 1
   Training Epochs: 50
   Batch Size: 16
   Learning Rate: 0.001
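
To make the `patience` and `mode="min"` settings concrete, here is a minimal sketch of a generic patience-based early-stopping rule. It is not the platform's implementation, which this notebook does not show; `early_stop_epoch` and `min_delta` are hypothetical names, and the loss sequences are made up for illustration.

```python
def early_stop_epoch(val_losses, patience, min_delta=0.0):
    """Return the 1-based epoch at which training would stop, or None.

    Training stops once the monitored loss has failed to improve on its
    best value for `patience` consecutive epochs (the 'min' mode).
    """
    best = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best - min_delta:
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return None


improving = [10.0, 9.0, 8.5, 8.0, 7.9]       # keeps improving: never stops early
plateau = [10.0, 9.0, 9.1, 9.2, 9.05, 9.3]   # best value at epoch 2, then stalls

print(early_stop_epoch(improving, patience=3))  # None
print(early_stop_epoch(plateau, patience=3))    # 5
```

Under this rule, a validation loss that decreases every epoch runs for the full number of configured epochs.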


## 🚀 Launch Training Experiment

Time to start our training! We'll submit the experiment to the QCML platform.

In [7]:
# Launch training experiment
experiment_name = "qcml-demo"

print("🚀 Launching QCML Training Experiment")
print("=" * 50)

try:
    result = await experiment_client.run_experiment(
        name=experiment_name,
        description="QCML demo experiment",
        experiment_name="pytorch-models",
        dataset_name=dataset_name,
        environment_name="py3.12",
        parameters={
            "hyperparameters": hyperparameters,
            "cpu_count": 2,
            "memory": 1024 * 4,
            "timeout": 3600 * 2,
        }
    )

    if 'response' in result:
        print("🎉 Training launched successfully!")
    else:
        print(f"❌ Failed: {result.get('error')}")

except Exception as e:
    print(f"💥 Exception: {e}")
    raise  # Re-raise so later cells don't run against a failed launch


🚀 Launching QCML Training Experiment
🎉 Training launched successfully!
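
The launch cell distinguishes success from failure by checking for a `response` key and falling back to an `error` key. A small helper can turn that convention into an exception so later code never continues with a failed result. This is a sketch under that assumption only: `unwrap` and `ApiError` are hypothetical names, not part of `qcogclient`.

```python
class ApiError(RuntimeError):
    """Raised when a result dictionary carries no 'response' payload."""


def unwrap(result: dict) -> object:
    """Return result['response'], or raise ApiError with the reported error."""
    if "response" in result:
        return result["response"]
    raise ApiError(result.get("error", "unknown error"))


# Illustrative payloads following the shape used in the cell above.
ok = {"response": {"status": "pending"}}
bad = {"error": "dataset not found"}

print(unwrap(ok))  # {'status': 'pending'}
try:
    unwrap(bad)
except ApiError as exc:
    print(f"launch failed: {exc}")  # launch failed: dataset not found
```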


## 📊 Monitor Training Progress

The cell below fetches the current status and metrics of the training run. Re-run it to refresh the view until the run completes or fails.

In [9]:
# Monitor training progress
print("📊 Training Monitor")
print("=" * 50)


result = await experiment_client.get_experiment_run(experiment_name)
response = result.get("response", {})
response.pop("params", None)

status = response.get("status")
metrics = response.get("metrics")

print(f"⏱️ Status: {status}")

if metrics:
    print("📈 Metrics:")
    for key, value in metrics.items():
        if isinstance(value, float):
            print(f"   {key}: {value:.4f}")
        else:
            print(f"   {key}: {value}")
else:
    print("📊 No metrics available yet")

if status == 'completed':
    print("🎉 Training completed!")
elif status == 'failed':
    print("❌ Training failed!")
    print(result)
    print(f"Error: {response.get('errors')}")  # .get avoids a KeyError if 'errors' is absent
elif status == 'running':
    print("🏃 Training in progress...")
else:
    print("⏳ Waiting for training to start...")


📊 Training Monitor
⏱️ Status: completed
📈 Metrics:
   test_loss: 1230.4495
   best_val_loss: 1230.4495
   avg_epoch_time: 0.0197
   final_val_loss: 1230.4495
   epochs_completed: 50
   final_train_loss: 1238.3502
   val_dataset_size: 20
   val_loss_history: [1476.8082275390625, 1463.5443115234375, 1451.2857666015625, 1439.960205078125, 1429.5654296875, 1419.9169921875, 1410.8724365234375, 1402.4344482421875, 1394.45458984375, 1386.916259765625, 1379.869140625, 1373.3643798828125, 1367.2841796875, 1361.681396484375, 1356.4644775390625, 1351.5858154296875, 1347.027587890625, 1342.6849365234375, 1338.5345458984375, 1334.558837890625, 1330.7103271484375, 1326.9677734375, 1323.2818603515625, 1319.666748046875, 1316.068603515625, 1312.531494140625, 1309.0074462890625, 1305.512939453125, 1302.0322265625, 1298.5528564453125, 1295.1033935546875, 1291.650390625, 1288.215087890625, 1284.7786865234375, 1281.3504638671875, 1277.92724609375, 1274.51953125, 1271.1123046875, 1267.707275390625, 1264.31
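
Instead of re-running the status cell by hand, the check can be wrapped in a polling loop. The sketch below is a generic pattern and makes several assumptions: `wait_for_run` and `make_fake_fetcher` are hypothetical helpers, the `"pending"` status is made up, and only `"completed"` and `"failed"` are treated as terminal because those are the statuses handled above. A fake fetcher stands in for the API so the example runs on its own.

```python
import asyncio

# Statuses treated as final; taken from the branches handled in the cell above.
TERMINAL_STATUSES = {"completed", "failed"}


async def wait_for_run(fetch_status, interval_s=5.0, timeout_s=3600.0):
    """Poll `fetch_status` until it returns a terminal status or time runs out."""
    loop = asyncio.get_running_loop()
    deadline = loop.time() + timeout_s
    while True:
        status = await fetch_status()
        if status in TERMINAL_STATUSES:
            return status
        if loop.time() >= deadline:
            raise TimeoutError(f"run still {status!r} after {timeout_s} s")
        await asyncio.sleep(interval_s)


def make_fake_fetcher(statuses):
    """Build an async callable that replays a fixed list of statuses."""
    iterator = iter(statuses)

    async def fetch():
        return next(iterator)

    return fetch


fake = make_fake_fetcher(["pending", "running", "running", "completed"])
final = asyncio.run(wait_for_run(fake, interval_s=0.01))
print(final)  # completed
```

In the notebook, where an event loop is already running, you would write `await wait_for_run(...)` rather than `asyncio.run(...)`, and `fetch_status` could wrap `experiment_client.get_experiment_run(experiment_name)` and return `result.get("response", {}).get("status")`.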

## 🎓 What You've Learned

Congratulations! You've successfully completed a full QCML workflow:

### 🔧 **Technical Skills:**
- ✅ Set up QCML clients and authentication
- ✅ Created and uploaded datasets to the platform
- ✅ Configured quantum machine learning hyperparameters
- ✅ Launched training experiments on cloud infrastructure
- ✅ Monitored training status and metrics

### 🚀 **Next Steps:**
1. **Experiment with different hyperparameters** (Hilbert space dims, learning rates)
2. **Upload your own datasets** and see how quantum models perform
3. **Deploy trained models** for inference
4. **Scale up** with GPU environments for larger datasets
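
For the first step, a simple way to organise a hyperparameter sweep is to enumerate the combinations up front and give each one its own run name. The sketch below is one possible approach: `build_grid` is a hypothetical helper, and the candidate values are arbitrary examples rather than recommendations.

```python
from itertools import product


def build_grid(hilbert_space_dims, learning_rates):
    """Return one config dict per (dimension, learning rate) combination."""
    return [
        {"hilbert_space_dims": dims, "lr": lr}
        for dims, lr in product(hilbert_space_dims, learning_rates)
    ]


# Arbitrary example values; pick ranges that suit your own dataset.
grid = build_grid(hilbert_space_dims=[4, 8, 16], learning_rates=[1e-2, 1e-3])

for index, config in enumerate(grid):
    run_name = f"qcml-demo-{index}"
    print(run_name, config)
```

Each config could then be used to set `hilbert_space_dims` and the optimizer's `lr` in a fresh `GeneralHSModelHyperparameters` before launching a run under `run_name`.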


Check out the documentation for more information on how to use the QCML API!

### 📚 **Resources:**
- [PyTorch Models Guide](docs/pytorch-models.md)
- [Dataset Management](docs/dataset-management.md) 
- [Training Runs Documentation](docs/training-runs.md)
- [Model Deployment](docs/model-deployment.md)