# DLT Framework Quick Start Guide

Welcome to the Deep Learning Toolkit (DLT) framework! This notebook will walk you through the essential features and show you how to get started with both sklearn and PyTorch models.

## What is DLT?

DLT is a universal machine learning framework that provides:
- **Unified API** for sklearn, PyTorch, TensorFlow, and JAX
- **Smart Configuration** with automatic validation
- **Advanced Features** such as hyperparameter optimization and mixed precision training
- **Production Ready** with comprehensive logging and monitoring

Let's dive in!

In [1]:
# Install DLT if needed (uncomment if running locally)
# !pip install -e .

# Import essential modules
import numpy as np
import pandas as pd
from sklearn.datasets import make_classification, make_regression
from sklearn.model_selection import train_test_split

# DLT imports
from dlt.core.config import DLTConfig
from dlt.core.model import DLTModel
from dlt.core.pipeline import train, evaluate, predict, tune

print("✅ DLT Framework imported successfully!")


✅ DLT Framework imported successfully!



## 🚀 Example 1: Sklearn Classification

Let's start with a simple classification task using scikit-learn:

In [2]:
# Generate sample data
X, y = make_classification(
    n_samples=1000, 
    n_features=20, 
    n_classes=3, 
    n_informative=15,
    n_redundant=0,
    random_state=42
)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

print(f"Training data shape: {X_train.shape}")
print(f"Test data shape: {X_test.shape}")
print(f"Number of classes: {len(np.unique(y))}")

Training data shape: (800, 20)
Test data shape: (200, 20)
Number of classes: 3
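
The `stratify=y` argument keeps the class proportions the same in both splits. The pure-Python sketch below illustrates that idea on made-up labels; `stratified_split` is a hypothetical helper written for this guide, and scikit-learn's own implementation differs in detail.

```python
import random
from collections import Counter, defaultdict


def stratified_split(labels, test_size=0.2, seed=42):
    """Return (train_idx, test_idx) with per-class proportions preserved."""
    rng = random.Random(seed)

    # Group sample indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_size)  # same fraction from every class
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Toy labels (made up for illustration): 50 / 30 / 20 samples per class.
labels = [0] * 50 + [1] * 30 + [2] * 20
train_idx, test_idx = stratified_split(labels)
test_counts = Counter(labels[i] for i in test_idx)
print(dict(test_counts))
```

Each class contributes the same fraction of its samples to the test set, so a rare class cannot end up missing from one of the splits by chance.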


In [3]:
# Create a DLT configuration for Random Forest
config = DLTConfig(
    model_type='sklearn.ensemble.RandomForestClassifier',
    model_params={
        'n_estimators': 100,
        'max_depth': 10,
        'random_state': 42
    },
    experiment={
        'name': 'sklearn_quickstart',
        'tags': ['classification', 'random_forest']
    }
)

print("📋 Configuration created:")
print(f"Model: {config.model_type}")
print(f"Experiment: {config.experiment['name']}")

📋 Configuration created:
Model: sklearn.ensemble.RandomForestClassifier
Experiment: sklearn_quickstart
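
The `model_type` value is a dotted import path. How DLT resolves it internally is not shown in this notebook, so the sketch below is only one plausible approach, using the standard `importlib` module. Both `resolve_dotted_path` and `framework_of` are hypothetical helpers, and the demo uses a standard-library class so it runs without any ML package installed.

```python
import importlib


def resolve_dotted_path(path):
    """Import `package.module.Name` and return the object called `Name`."""
    module_name, _, attr = path.rpartition(".")
    if not module_name:
        raise ValueError(f"expected a dotted path, got {path!r}")
    module = importlib.import_module(module_name)
    return getattr(module, attr)


def framework_of(path):
    """Assumption: the framework name is the first component of the path."""
    return path.split(".", 1)[0]


# Demonstrated with a standard-library class instead of an sklearn estimator.
cls = resolve_dotted_path("collections.OrderedDict")
instance = cls(a=1)  # keyword arguments play the role of model_params

print(cls.__name__)
print(framework_of("sklearn.ensemble.RandomForestClassifier"))
print(framework_of("torch.nn.Sequential"))
```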


In [4]:
# Train the model using DLT's high-level API
results = train(
    config=config,
    train_data=(X_train, y_train),
    test_data=(X_test, y_test),
    verbose=True
)

print("\n🎯 Training Results:")
print(f"Training completed in {results['training_time']:.2f} seconds")
print(f"Test accuracy: {results.get('test_results', {}).get('accuracy', 'N/A')}")

Starting training with sklearn.ensemble.RandomForestClassifier
Framework: sklearn
Device: cuda:0
Training sklearn.ensemble.RandomForestClassifier...
Training completed in 0.23s
Evaluating on test data...
Test accuracy: 0.8100
Training completed in 0.23 seconds

🎯 Training Results:
Training completed in 0.23 seconds
Test accuracy: 0.81


In [5]:
# Make predictions
model = results['model']
predictions = predict(model=model, data=X_test[:5])
probabilities = model.predict_proba(X_test[:5])

print("🔮 Predictions for first 5 test samples:")
for i in range(5):
    print(f"Sample {i+1}: Predicted={predictions[i]}, Actual={y_test[i]}, Confidence={probabilities[i].max():.3f}")

🔮 Predictions for first 5 test samples:
Sample 1: Predicted=0, Actual=0, Confidence=0.792
Sample 2: Predicted=2, Actual=2, Confidence=0.663
Sample 3: Predicted=1, Actual=1, Confidence=0.652
Sample 4: Predicted=0, Actual=1, Confidence=0.484
Sample 5: Predicted=2, Actual=2, Confidence=0.781
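
The "confidence" printed above is the largest class probability in each row. Note that Sample 4, the only misclassified sample of the five, also has the lowest confidence (0.484). The sketch below shows the same computation in plain Python; the probability rows are made up for illustration and are not taken from the run above.

```python
def top_class_and_confidence(prob_row):
    """Return (index of the largest probability, that probability)."""
    best = max(range(len(prob_row)), key=prob_row.__getitem__)
    return best, prob_row[best]


# Made-up probability rows for a three-class problem.
rows = [
    [0.70, 0.20, 0.10],
    [0.30, 0.25, 0.45],
]
results = [top_class_and_confidence(row) for row in rows]

for i, (label, confidence) in enumerate(results, start=1):
    print(f"Sample {i}: Predicted={label}, Confidence={confidence:.3f}")
```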


## 🧠 Example 2: PyTorch Neural Network

Now let's create a PyTorch neural network for the same task:

In [6]:
# Create PyTorch configuration
pytorch_config = DLTConfig(
    model_type='torch.nn.Sequential',
    model_params={
        'layers': [
            {'type': 'Linear', 'in_features': 20, 'out_features': 64},
            {'type': 'ReLU'},
            {'type': 'Dropout', 'p': 0.2},
            {'type': 'Linear', 'in_features': 64, 'out_features': 32},
            {'type': 'ReLU'},
            {'type': 'Linear', 'in_features': 32, 'out_features': 3}
        ]
    },
    training={
        'epochs': 50,
        'batch_size': 32,
        'optimizer': {'type': 'adam', 'lr': 0.001},
        'loss': {'type': 'cross_entropy'},
        'early_stopping': {'patience': 10}
    },
    experiment={
        'name': 'pytorch_quickstart',
        'tags': ['neural_network', 'pytorch']
    }
)

print("🧠 PyTorch Neural Network Configuration:")
print(f"Layers: {len(pytorch_config.model_params['layers'])}")
print(f"Training epochs: {pytorch_config.training['epochs']}")

🧠 PyTorch Neural Network Configuration:
Layers: 6
Training epochs: 50
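
Before training, it can help to sanity-check a layer specification like the one above. The sketch below works on the plain list of dictionaries, without PyTorch: it checks that consecutive `Linear` widths line up and counts trainable parameters. It assumes every `Linear` layer has a bias term (the default for `torch.nn.Linear`); `count_parameters` and `widths_are_consistent` are hypothetical helpers, not DLT functions.

```python
layers = [
    {"type": "Linear", "in_features": 20, "out_features": 64},
    {"type": "ReLU"},
    {"type": "Dropout", "p": 0.2},
    {"type": "Linear", "in_features": 64, "out_features": 32},
    {"type": "ReLU"},
    {"type": "Linear", "in_features": 32, "out_features": 3},
]


def count_parameters(layer_specs):
    """Count trainable parameters, assuming each Linear layer has a bias."""
    total = 0
    for spec in layer_specs:
        if spec["type"] == "Linear":
            # Weight matrix (out x in) plus one bias per output unit.
            total += spec["in_features"] * spec["out_features"] + spec["out_features"]
    return total


def widths_are_consistent(layer_specs):
    """Check that each Linear input width equals the previous Linear output width."""
    previous_out = None
    for spec in layer_specs:
        if spec["type"] != "Linear":
            continue  # ReLU and Dropout leave the width unchanged
        if previous_out is not None and spec["in_features"] != previous_out:
            return False
        previous_out = spec["out_features"]
    return True


print(widths_are_consistent(layers))
print(count_parameters(layers))  # 1344 + 2080 + 99
```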


In [7]:
# Train PyTorch model
pytorch_results = train(
    config=pytorch_config,
    train_data=(X_train.astype(np.float32), y_train),
    test_data=(X_test.astype(np.float32), y_test),
    verbose=True
)

print("\n🚀 PyTorch Training Results:")
print(f"Training completed in {pytorch_results['training_time']:.2f} seconds")
print(f"Final training loss: {pytorch_results['training_results'].get('history', {}).get('train_loss', ['N/A'])[-1]}")
print(f"Test accuracy: {pytorch_results.get('test_results', {}).get('accuracy', 'N/A')}")

Starting training with torch.nn.Sequential
Framework: torch
Device: cuda:0
Training torch.nn.Sequential...
Using configured task type: classification
Epoch 1/50, Loss: 0.9905
Epoch 11/50, Loss: 0.3997
Epoch 21/50, Loss: 0.2605
Epoch 31/50, Loss: 0.2032
Epoch 41/50, Loss: 0.1929
Training completed in 1.99s
Evaluating on test data...
Test accuracy: 0.8900
Training completed in 1.99 seconds

🚀 PyTorch Training Results:
Training completed in 1.99 seconds
Final training loss: N/A
Test accuracy: 0.89
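
The configuration above sets `early_stopping` with a `patience` of 10. This notebook does not show which metric DLT monitors or how it applies the rule, so the sketch below only illustrates the general patience idea: stop once a given number of consecutive epochs pass without a new best loss. `early_stopping_epoch` is a hypothetical helper and the loss curve is made up.

```python
def early_stopping_epoch(losses, patience):
    """Return the 1-based epoch at which training would stop, or None.

    Training stops once `patience` consecutive epochs pass without a new best loss.
    """
    best = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(losses, start=1):
        if loss < best:
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return None


# Made-up loss curve: improves for three epochs, then plateaus.
losses = [1.0, 0.8, 0.7, 0.7, 0.71, 0.72, 0.69]
stop_at = early_stopping_epoch(losses, patience=3)
never = early_stopping_epoch(losses, patience=10)
print(stop_at, never)
```

A larger patience tolerates longer plateaus at the cost of extra epochs; with only seven epochs of data, a patience of 10 can never trigger.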


## ⚙️ Example 3: Hyperparameter Optimization

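DLT includes hyperparameter optimization built on Optuna. In the run recorded below, every trial reports the `val_loss` objective as `inf`, so the reported best parameters are simply those of trial 0; treat the output as a demonstration of the API rather than a tuned result: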

In [8]:
# Define hyperparameter search space
base_config = {
    'model_type': 'sklearn.ensemble.RandomForestClassifier',
    'model_params': {'random_state': 42}
}

param_space = {
    'model_params.n_estimators': (50, 200),
    'model_params.max_depth': (3, 20),
    'model_params.min_samples_split': (2, 20),
    'model_params.min_samples_leaf': (1, 10)
}

print("🔍 Starting hyperparameter optimization...")
print(f"Search space: {len(param_space)} parameters")

# Run optimization
optimization_results = tune(
    base_config=base_config,
    config_space=param_space,
    train_data=(X_train, y_train),
    val_data=(X_test, y_test),  # reuses the test split for brevity; prefer a separate validation split
    n_trials=20,
    verbose=True
)

print("\n🏆 Optimization Results:")
print(f"Best score: {optimization_results['best_value']}")
print(f"Best parameters: {optimization_results['best_params']}")

[I 2025-09-29 12:06:28,139] A new study created in memory with name: no-name-f32120de-2ccc-42a2-ba10-8a01a59fc35c


🔍 Starting hyperparameter optimization...
Search space: 4 parameters
Starting hyperparameter optimization with 20 trials...
Optimizing val_loss (minimize)


[I 2025-09-29 12:06:28,317] Trial 0 finished with value: inf and parameters: {'model_params.n_estimators': 81, 'model_params.max_depth': 16, 'model_params.min_samples_split': 20, 'model_params.min_samples_leaf': 4}. Best is trial 0 with value: inf.


[I 2025-09-29 12:06:28,512] Trial 1 finished with value: inf and parameters: {'model_params.n_estimators': 92, 'model_params.max_depth': 15, 'model_params.min_samples_split': 18, 'model_params.min_samples_leaf': 2}. Best is trial 0 with value: inf.


[I 2025-09-29 12:06:28,671] Trial 2 finished with value: inf and parameters: {'model_params.n_estimators': 116, 'model_params.max_depth': 3, 'model_params.min_samples_split': 8, 'model_params.min_samples_leaf': 8}. Best is trial 0 with value: inf.


[I 2025-09-29 12:06:28,896] Trial 3 finished with value: inf and parameters: {'model_params.n_estimators': 105, 'model_params.max_depth': 10, 'model_params.min_samples_split': 6, 'model_params.min_samples_leaf': 3}. Best is trial 0 with value: inf.
[I 2025-09-29 12:06:29,033] Trial 4 finished with value: inf and parameters: {'model_params.n_estimators': 76, 'model_params.max_depth': 7, 'model_params.min_samples_split': 9, 'model_params.min_samples_leaf': 10}. Best is trial 0 with value: inf.


[I 2025-09-29 12:06:29,170] Trial 5 finished with value: inf and parameters: {'model_params.n_estimators': 88, 'model_params.max_depth': 4, 'model_params.min_samples_split': 8, 'model_params.min_samples_leaf': 4}. Best is trial 0 with value: inf.


[I 2025-09-29 12:06:29,412] Trial 6 finished with value: inf and parameters: {'model_params.n_estimators': 131, 'model_params.max_depth': 20, 'model_params.min_samples_split': 15, 'model_params.min_samples_leaf': 9}. Best is trial 0 with value: inf.


[I 2025-09-29 12:06:29,760] Trial 7 finished with value: inf and parameters: {'model_params.n_estimators': 195, 'model_params.max_depth': 6, 'model_params.min_samples_split': 13, 'model_params.min_samples_leaf': 5}. Best is trial 0 with value: inf.


[I 2025-09-29 12:06:30,075] Trial 8 finished with value: inf and parameters: {'model_params.n_estimators': 164, 'model_params.max_depth': 13, 'model_params.min_samples_split': 2, 'model_params.min_samples_leaf': 7}. Best is trial 0 with value: inf.


[I 2025-09-29 12:06:30,329] Trial 9 finished with value: inf and parameters: {'model_params.n_estimators': 133, 'model_params.max_depth': 8, 'model_params.min_samples_split': 2, 'model_params.min_samples_leaf': 6}. Best is trial 0 with value: inf.
[I 2025-09-29 12:06:30,452] Trial 10 finished with value: inf and parameters: {'model_params.n_estimators': 55, 'model_params.max_depth': 18, 'model_params.min_samples_split': 20, 'model_params.min_samples_leaf': 1}. Best is trial 0 with value: inf.


[I 2025-09-29 12:06:30,565] Trial 11 finished with value: inf and parameters: {'model_params.n_estimators': 52, 'model_params.max_depth': 15, 'model_params.min_samples_split': 20, 'model_params.min_samples_leaf': 2}. Best is trial 0 with value: inf.
[I 2025-09-29 12:06:30,750] Trial 12 finished with value: inf and parameters: {'model_params.n_estimators': 88, 'model_params.max_depth': 16, 'model_params.min_samples_split': 17, 'model_params.min_samples_leaf': 3}. Best is trial 0 with value: inf.


[I 2025-09-29 12:06:30,911] Trial 13 finished with value: inf and parameters: {'model_params.n_estimators': 73, 'model_params.max_depth': 13, 'model_params.min_samples_split': 17, 'model_params.min_samples_leaf': 1}. Best is trial 0 with value: inf.


[I 2025-09-29 12:06:31,121] Trial 14 finished with value: inf and parameters: {'model_params.n_estimators': 103, 'model_params.max_depth': 16, 'model_params.min_samples_split': 17, 'model_params.min_samples_leaf': 4}. Best is trial 0 with value: inf.


[I 2025-09-29 12:06:31,444] Trial 15 finished with value: inf and parameters: {'model_params.n_estimators': 160, 'model_params.max_depth': 20, 'model_params.min_samples_split': 13, 'model_params.min_samples_leaf': 5}. Best is trial 0 with value: inf.
[I 2025-09-29 12:06:31,588] Trial 16 finished with value: inf and parameters: {'model_params.n_estimators': 69, 'model_params.max_depth': 11, 'model_params.min_samples_split': 20, 'model_params.min_samples_leaf': 3}. Best is trial 0 with value: inf.


[I 2025-09-29 12:06:31,803] Trial 17 finished with value: inf and parameters: {'model_params.n_estimators': 101, 'model_params.max_depth': 14, 'model_params.min_samples_split': 18, 'model_params.min_samples_leaf': 2}. Best is trial 0 with value: inf.


[I 2025-09-29 12:06:32,096] Trial 18 finished with value: inf and parameters: {'model_params.n_estimators': 149, 'model_params.max_depth': 18, 'model_params.min_samples_split': 14, 'model_params.min_samples_leaf': 6}. Best is trial 0 with value: inf.
[I 2025-09-29 12:06:32,283] Trial 19 finished with value: inf and parameters: {'model_params.n_estimators': 88, 'model_params.max_depth': 17, 'model_params.min_samples_split': 11, 'model_params.min_samples_leaf': 4}. Best is trial 0 with value: inf.
Best trial value: inf
Best parameters:
  model_params.n_estimators: 81
  model_params.max_depth: 16
  model_params.min_samples_split: 20
  model_params.min_samples_leaf: 4
Starting training with sklearn.ensemble.RandomForestClassifier
Framework: sklearn
Device: cuda:0
Training sklearn.ensemble.RandomForestClassifier...
Training completed in 0.16s
Training completed in 0.16 seconds

🏆 Optimization Results:
Best score: inf
Best parameters: {'model_params.n_estimators': 81, 'model_params.max_depth
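
The search space uses dotted keys such as `model_params.n_estimators` to address nested fields of the base configuration. The sketch below shows one way such keys could be merged into a nested dictionary for a single trial. `apply_overrides` is a hypothetical helper, not part of DLT, and the values are the trial 0 parameters from the log above.

```python
import copy


def apply_overrides(base, overrides):
    """Return a deep copy of `base` with dotted-key overrides applied."""
    result = copy.deepcopy(base)
    for dotted_key, value in overrides.items():
        *parents, leaf = dotted_key.split(".")
        node = result
        for key in parents:
            node = node.setdefault(key, {})  # create intermediate dicts as needed
        node[leaf] = value
    return result


base_config = {
    "model_type": "sklearn.ensemble.RandomForestClassifier",
    "model_params": {"random_state": 42},
}

# Parameters sampled for trial 0 in the log above.
trial_params = {
    "model_params.n_estimators": 81,
    "model_params.max_depth": 16,
    "model_params.min_samples_split": 20,
    "model_params.min_samples_leaf": 4,
}

trial_config = apply_overrides(base_config, trial_params)
print(trial_config["model_params"])
```

Copying the base configuration first means each trial starts from the same untouched defaults.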

## 💾 Example 4: Model Persistence

Save and load models easily:

In [9]:
import tempfile
import os

# Save the best model
best_model = optimization_results['best_model']
best_config = optimization_results['best_config']

# Create temporary files for demo using mkstemp (secure)
model_fd, model_path = tempfile.mkstemp(suffix='.pkl')
config_fd, config_path = tempfile.mkstemp(suffix='.yaml')

try:
    # Close the file descriptors since we only need the paths
    os.close(model_fd)
    os.close(config_fd)
    
    # Save model and config
    best_model.save(model_path)
    best_config.save(config_path)
    
    print(f"💾 Model saved to: {os.path.basename(model_path)}")
    print(f"📋 Config saved to: {os.path.basename(config_path)}")
    
    # Load model and config
    loaded_config = DLTConfig.from_file(config_path)
    loaded_model = DLTModel.load(model_path, loaded_config)
    
    # Test loaded model
    test_predictions = loaded_model.predict(X_test[:3])
    original_predictions = best_model.predict(X_test[:3])
    
    print("\n✅ Model loaded successfully!")
    print(f"Predictions match: {np.array_equal(test_predictions, original_predictions)}")
    
finally:
    # Clean up temporary files
    for path in [model_path, config_path]:
        if os.path.exists(path):
            os.unlink(path)

💾 Model saved to: tmpvjw8lif3.pkl
📋 Config saved to: tmp_qlp4de1.yaml

✅ Model loaded successfully!
Predictions match: True
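
If you want to see the save-then-load round trip without DLT, here is a standard-library sketch of the same pattern. It is only an analogy: the notebook does not show which serialization format `best_model.save` uses, and the sketch stores the config as JSON because the standard library has no YAML module. `model_like` and `config_like` are stand-in objects.

```python
import json
import pickle
import tempfile
from pathlib import Path

model_like = {"weights": [0.1, 0.2, 0.3]}  # stand-in for a trained model
config_like = {"model_type": "example.Model", "model_params": {"depth": 3}}

# TemporaryDirectory removes the directory and its files on exit,
# so no manual cleanup in a `finally` block is needed.
with tempfile.TemporaryDirectory() as tmp_dir:
    model_path = Path(tmp_dir) / "model.pkl"
    config_path = Path(tmp_dir) / "config.json"

    model_path.write_bytes(pickle.dumps(model_like))
    config_path.write_text(json.dumps(config_like))

    loaded_model = pickle.loads(model_path.read_bytes())
    loaded_config = json.loads(config_path.read_text())

round_trip_ok = loaded_model == model_like and loaded_config == config_like
print(f"Round trip matches: {round_trip_ok}")
```

Only unpickle files you trust, since loading a pickle can execute arbitrary code.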


## 📊 Example 5: Configuration Management

DLT provides flexible configuration management:

In [10]:
# Create comprehensive configuration
advanced_config = DLTConfig(
    model_type='sklearn.svm.SVC',
    model_params={
        'kernel': 'rbf',
        'C': 1.0,
        'probability': True,
        'random_state': 42
    },
    experiment={
        'name': 'advanced_svm_experiment',
        'tags': ['svm', 'classification', 'rbf_kernel'],
        'notes': 'Testing SVM with RBF kernel',
        'seed': 42
    },
    hardware={
        'device': 'cpu',
        'num_workers': 4
    }
)

print("🛠️ Advanced Configuration:")
print(f"Framework: {advanced_config.get_framework()}")
print(f"Model: {advanced_config.model_type}")
print(f"Experiment: {advanced_config.experiment['name']}")
print(f"Tags: {', '.join(advanced_config.experiment['tags'])}")

# Convert to dictionary for inspection
config_dict = advanced_config.to_dict()
print(f"\n📋 Configuration keys: {list(config_dict.keys())}")

🛠️ Advanced Configuration:
Framework: sklearn
Model: sklearn.svm.SVC
Experiment: advanced_svm_experiment
Tags: svm, classification, rbf_kernel

📋 Configuration keys: ['model_type', 'model_params', 'training', 'data', 'experiment', 'hardware', 'performance', 'optimization']
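
To give a feel for what validation at construction time and `to_dict()` can look like, here is a small dataclass sketch. It is not DLT's implementation: `SketchConfig`, its fields, and its two validation rules are all assumptions made for illustration.

```python
from dataclasses import asdict, dataclass, field


@dataclass
class SketchConfig:
    """Hypothetical stand-in for a validated config (not DLT's DLTConfig)."""

    model_type: str
    model_params: dict = field(default_factory=dict)
    hardware: dict = field(default_factory=lambda: {"device": "cpu"})

    def __post_init__(self):
        # Assumed rule: model_type must be a dotted import path.
        if "." not in self.model_type:
            raise ValueError(f"model_type must be a dotted path, got {self.model_type!r}")
        # Assumed rule: num_workers, if given, must be non-negative.
        if self.hardware.get("num_workers", 0) < 0:
            raise ValueError("num_workers must be non-negative")

    def to_dict(self):
        """Return a plain nested dictionary copy of the config."""
        return asdict(self)


cfg = SketchConfig(
    model_type="sklearn.svm.SVC",
    model_params={"kernel": "rbf", "C": 1.0},
    hardware={"device": "cpu", "num_workers": 4},
)
config_dict = cfg.to_dict()
print(list(config_dict.keys()))

try:
    SketchConfig("sklearn.svm.SVC", hardware={"num_workers": -1})
    rejected = False
except ValueError:
    rejected = True
print(f"Invalid config rejected: {rejected}")
```

Failing at construction time surfaces a bad setting before any training starts, which is the main benefit of validating configurations up front.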


## 🎯 Summary

In this quick start guide, you've learned how to:

✅ **Create configurations** for different ML frameworks  
✅ **Train models** with sklearn and PyTorch  
✅ **Optimize hyperparameters** automatically  
✅ **Save and load** models and configurations  
✅ **Manage complex configurations** with validation  

### Next Steps:

- Explore `02_advanced_features.ipynb` for GPU training, mixed precision, and distributed training
- Check `03_multi_framework_comparison.ipynb` to compare different ML frameworks
- See `04_production_deployment.ipynb` for production best practices

### 📚 Resources:

- **Documentation**: Check the `tests/` folder for comprehensive examples
- **Configuration**: See `src/dlt/core/config.py` for all configuration options
- **Examples**: Explore other notebooks in this folder

Happy training! 🚀