# Domino Flow Training and Evaluation Workflow

This notebook demonstrates how to execute a Domino Flow that trains multiple fraud detection models, and how to register the best-performing model in the Domino Model Registry.

## Overview
- Execute Domino Flow to train AdaBoost, GaussianNB, and XGBoost classifiers
- Monitor execution progress
- Compare model performance using Experiment Manager
- Register the best model to Model Registry

## Step 1: Execute Domino Flow

In [3]:
import subprocess
import json
import time
from pathlib import Path
import requests
import os
import sys

# Store the flow execution ID for later use
flow_execution_id = None
experiment_name = None

# Detect working directory (file-based or git-based project)
domino_working_dir = os.environ.get("DOMINO_WORKING_DIR", "/mnt")
print(f"Detected working directory: {domino_working_dir}")

# Add project root to path
if domino_working_dir not in sys.path:
    sys.path.insert(0, domino_working_dir)

# Execute the Domino Flow
print("\n" + "=" * 60)
print("STARTING DOMINO FLOW EXECUTION")
print("=" * 60)

try:
    # Alternative approach: Use flytectl or direct Domino API
    # If pyflyte has issues, we can execute via Domino Jobs directly
    
    print("\nAttempting to execute workflow via pyflyte...")
    print("Note: If you encounter CLI parameter conflicts, you can:")
    print("  1. Execute the workflow from the Domino Flows UI instead")
    print("  2. Run each trainer script as separate Domino Jobs")
    print("  3. Use the workflow from a Domino Workspace with flytekit installed\n")
    
    # Run the workflow using pyflyte with minimal parameters
    result = subprocess.run(
        [
            "pyflyte",
            "run",
            "--remote",
            f"{domino_working_dir}/exercises/d_TrainingAndEvaluation/workflow.py",
            "credit_card_fraud_detection_workflow"
        ],
        capture_output=True,
        text=True,
        timeout=120
    )
    
    if result.returncode == 0:
        print("✅ Flow execution started successfully!")
        print(f"\nOutput:\n{result.stdout}")
        
        # Extract experiment name from the environment or generate it
        try:
            # Import the domino_short_id function to generate the same experiment name
            from domino_short_id import domino_short_id
            experiment_name = f"CC Fraud Classifier Training {domino_short_id()}"
            print(f"\nFlow will create experiments under: {experiment_name}")
        except Exception as e:
            print(f"Could not determine experiment name: {e}")
            experiment_name = "CC Fraud Classifier Training"
        
    else:
        print(f"❌ Flow execution failed with return code {result.returncode}")
        print(f"\nStderr:\n{result.stderr}")
        print(f"\nStdout:\n{result.stdout}")
        
        print("\n" + "=" * 60)
        print("ALTERNATIVE: Execute workflow manually")
        print("=" * 60)
        print("\nYou can execute the training tasks individually:")
        print("  1. Run: python exercises/d_TrainingAndEvaluation/trainer_ada.py")
        print("  2. Run: python exercises/d_TrainingAndEvaluation/trainer_xgb.py")
        print("  3. Run: python exercises/d_TrainingAndEvaluation/trainer_gnb.py")
        print("  4. Run: python exercises/d_TrainingAndEvaluation/compare.py")
        print("\nOr use the Domino Flows UI to create and execute the workflow visually.")
        
except subprocess.TimeoutExpired:
    print("⏱️  Workflow execution timed out (2 min limit for initial submission)")
    print("   The flow may still be running in the background.")
    print("   Check the Domino Flows UI for execution status.")
except Exception as e:
    print(f"❌ Error executing flow: {e}")
    print("\nNote: Domino Flows requires proper platform environment setup")
    print("If CLI execution fails, please use the Domino Flows UI interface.")

Detected working directory: /mnt

STARTING DOMINO FLOW EXECUTION

Attempting to execute workflow via pyflyte...
Note: If you encounter CLI parameter conflicts, you can:
  1. Execute the workflow from the Domino Flows UI instead
  2. Run each trainer script as separate Domino Jobs
  3. Use the workflow from a Domino Workspace with flytekit installed

✅ Flow execution started successfully!

Output:
Running Execution on Remote.
Upgrade to ydata-sdk
Improve your data and profiling with ydata-sdk, featuring data quality scoring, redundancy detection, outlier identification, text validation, and synthetic data generation.
Register at https://ydata.ai/register
Creating task using latest values. This is not recommended, as values not explicitly defined may change between subsequent executions of this task
Retrieving default properties for job against project 68f2477cfe21ba6bf5ff485b
Resolved job properties: DominoJobConfig(Command='python exercises/d_TrainingAndEvaluation/trainer_

---

## Step 2: Monitor Flow Execution in Domino UI

### Instructions for Viewing Flow Progress:

1. **Navigate to Flows Dashboard**
   - In the Domino UI, click on "Flows" in the left navigation panel
   - You should see your "credit_card_fraud_detection_workflow" listed

2. **Monitor Execution Status**
   - Click on the flow name to view the execution details
   - Watch the DAG (Directed Acyclic Graph) as tasks complete
   - Each task will show status: Running, Completed, or Failed

3. **Task Execution Order**
   - Three training tasks run in parallel:
     - Train AdaBoost classifier
     - Train GaussianNB classifier  
     - Train XGBoost classifier
   - After all training completes, the comparison task executes

4. **View Task Logs**
   - Click on individual tasks to view their execution logs
   - Monitor progress and check for any errors
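
The execution order described above is a fan-out/fan-in pattern: independent training tasks run concurrently, and the comparison waits for all of them. The sketch below illustrates that ordering with plain Python threads. It is not how Domino Flows schedules work, and `train`, `compare`, and `run_flow` are hypothetical stand-ins rather than functions from the workshop code.

```python
from concurrent.futures import ThreadPoolExecutor


def train(model_name: str) -> dict:
    # Hypothetical stand-in for a training task; a real Flow task launches a Domino Job.
    return {"model": model_name, "status": "Completed"}


def compare(results: list) -> str:
    # Hypothetical stand-in for the comparison task, which needs every training result.
    finished = sorted(r["model"] for r in results if r["status"] == "Completed")
    return "compared: " + ", ".join(finished)


def run_flow(model_names: list) -> str:
    # Fan out: submit all training tasks at once so they can run concurrently.
    with ThreadPoolExecutor(max_workers=len(model_names)) as pool:
        results = list(pool.map(train, model_names))
    # Fan in: the comparison starts only after the pool has finished every task.
    return compare(results)


summary = run_flow(["AdaBoost", "GaussianNB", "XGBoost"])
print(summary)
```
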

### Expected Flow Duration: 10-15 minutes

Wait for all tasks to complete before proceeding to the next step.

---

## Step 3: Access Experiment Manager and Compare Results

### Instructions for Experiment Manager:

1. **Open Experiment Manager**
   - Click "Experiment Manager" in the left navigation panel
   - You should see 3 new experiment runs from the flow execution

2. **Compare Model Performance**
   - Select all 3 runs (AdaBoost, GaussianNB, XGBoost)
   - Click "Compare" button in the top toolbar

3. **Analyze Metrics**
   - Review key metrics: ROC-AUC, Precision, Recall, F1-Score
   - Look for the model with the highest performance
   - Expected best performer: XGBoost

4. **Review Model Details**
   - Click on the best performing model
   - Review complete traceability: code, data, parameters, artifacts
   - Check model artifacts and performance visualizations
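
To make the metrics above easier to interpret, the following sketch shows how precision, recall, and F1 for the fraud class follow from confusion-matrix counts. The function name `fraud_metrics` and the counts are illustrative assumptions, not values produced by this workshop.

```python
def fraud_metrics(tp: int, fp: int, fn: int) -> dict:
    """Precision, recall, and F1 for the positive (fraud) class."""
    # Guard each ratio against an empty denominator.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Illustrative counts only: 80 frauds caught, 20 false alarms, 20 frauds missed.
metrics = fraud_metrics(tp=80, fp=20, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Because fraud is usually rare, a model can score high accuracy while missing most fraud cases, which is why recall and precision for the fraud class are worth checking alongside accuracy.
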

---

## Step 4: Register Best Model to Model Registry

### Find the Best Performing Model

This cell searches the MLflow experiment for the best performing model based on accuracy. It will:
- Search for runs with accuracy metrics in the flow's experiment
- Sort by accuracy (highest first)
- Display key metrics (ROC-AUC, Precision, Recall, F1-Score)
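
The selection logic can be sketched without an MLflow server. The example below uses made-up records whose keys mirror the column names returned by `mlflow.search_runs()`; `pick_best_run` is a hypothetical helper, not part of MLflow or the workshop code.

```python
def pick_best_run(runs, metric="metrics.accuracy"):
    """Return the run with the highest value for `metric`, or None if no run logged it."""
    # Keep only runs that actually logged a positive value for the metric.
    scored = [r for r in runs if r.get(metric) is not None and r[metric] > 0]
    if not scored:
        return None
    return max(scored, key=lambda r: r[metric])


# Made-up records for illustration; real runs come from the flow's experiment.
runs = [
    {"run_id": "a1", "tags.model": "AdaBoost", "metrics.accuracy": 0.91},
    {"run_id": "b2", "tags.model": "GaussianNB", "metrics.accuracy": 0.88},
    {"run_id": "c3", "tags.model": "XGBoost", "metrics.accuracy": 0.95},
    {"run_id": "d4", "tags.model": "Unfinished"},  # no accuracy logged, so it is skipped
]

best = pick_best_run(runs)
print(best["run_id"], best["tags.model"])
```
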

### Register the Best Model to Model Registry

This cell registers the best performing model to the Domino Model Registry with:
- Comprehensive model card and description
- Model specifications (training framework, size, memory requirements)
- Tags for governance and discoverability
- Dataset information and performance metrics
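
As a rough illustration of how the specifications are attached, the sketch below builds the tag dictionary that would then be passed, one key at a time, to `client.set_registered_model_tag`. The `mlflow.domino.specs.` prefix is taken from the registration cell in this notebook; `build_spec_tags` is a hypothetical helper, and skipping empty values is an assumption rather than documented Domino behavior.

```python
SPEC_PREFIX = "mlflow.domino.specs."  # prefix used by the registration cell in this notebook


def build_spec_tags(specs: dict) -> dict:
    """Prefix each spec name and coerce values to strings, skipping empty ones."""
    return {
        f"{SPEC_PREFIX}{name}": str(value)
        for name, value in specs.items()
        if value not in (None, "")
    }


tags = build_spec_tags({
    "Training Framework": "XGBoost",
    "Model Artifacts Size": "15-25 MB",
    "Inference Memory": None,  # unknown, so no tag is emitted
})
for key, value in tags.items():
    print(f"{key} = {value}")
```
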

### Register the Preprocessing Pipeline

This cell registers the feature scaling/preprocessing pipeline from Exercise 3 as a separate model endpoint. This creates the feature transformation endpoint needed by the Streamlit app and deployment services.

In [None]:
# This section demonstrates programmatic model registration
# In practice, you would typically do this through the Domino UI as shown in the instructions

import mlflow
import mlflow.tracking
from datetime import datetime

def get_best_experiment_run(target_experiment_name=None):
    """
    Retrieve the best performing experiment run based on accuracy from the specified experiment
    """
    try:
        if target_experiment_name is None:
            target_experiment_name = experiment_name
            
        if target_experiment_name is None:
            print("No experiment name specified")
            return None
            
        print(f"Searching for best model in experiment: {target_experiment_name}")
        
        # Get the specific experiment by name
        try:
            experiment = mlflow.get_experiment_by_name(target_experiment_name)
            if experiment is None:
                print(f"Experiment '{target_experiment_name}' not found")
                # Fall back to searching all experiments
                experiments = mlflow.search_experiments()
                if not experiments:
                    print("No experiments found")
                    return None
                experiment_ids = [exp.experiment_id for exp in experiments]
            else:
                experiment_ids = [experiment.experiment_id]
                print(f"Found experiment: {experiment.name} (ID: {experiment.experiment_id})")
        except Exception as e:
            print(f"Error finding experiment: {e}")
            # Fall back to all experiments
            experiments = mlflow.search_experiments()
            experiment_ids = [exp.experiment_id for exp in experiments] if experiments else []
        
        if not experiment_ids:
            print("No experiments available")
            return None
        
        # Search for runs with accuracy metrics, ordered by accuracy (best first)
        runs = mlflow.search_runs(
            experiment_ids=experiment_ids,
            filter_string="metrics.accuracy > 0",
            order_by=["metrics.accuracy DESC"],
            max_results=10
        )
        
        if runs.empty:
            print("No runs found with accuracy metrics in the target experiment")
            return None
        
        # Get the best performing run
        best_run = runs.iloc[0]
        
        print("Best model found:")
        print(f"  Run ID: {best_run['run_id']}")
        print(f"  Experiment: {best_run.get('experiment_id', 'Unknown')}")
        print(f"  Model: {best_run.get('tags.model_type', 'Unknown')}")
        print(f"  Accuracy: {best_run['metrics.accuracy']:.4f}")
        print(f"  ROC-AUC: {best_run.get('metrics.roc_auc', 'N/A')}")
        print(f"  Precision: {best_run.get('metrics.precision_fraud', 'N/A')}")
        print(f"  Recall: {best_run.get('metrics.recall_fraud', 'N/A')}")
        print(f"  F1-Score: {best_run.get('metrics.f1_fraud', 'N/A')}")
        
        return best_run
        
    except Exception as e:
        print(f"Error retrieving experiments: {e}")
        return None

# Get the best performing model from the current flow's experiment
best_run = get_best_experiment_run(experiment_name)

### Register All Three Classifiers

This cell registers **all three models** trained in the Domino Flow to the Model Registry:
- **CC Fraud ADA Classifier** - AdaBoost ensemble model
- **CC Fraud XGBoost Classifier** - Gradient boosting model
- **CC Fraud GaussianNB Classifier** - Naive Bayes probabilistic model

Each model is registered with:
- Descriptive name and performance metrics
- Model type tags and specifications
- Metadata for governance and deployment
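
The naming and artifact-path conventions used by the registration code can be summarized in two small helpers. The patterns (`CC_Fraud_{Classifier}_Classifier_{owner}_{project}` and `runs:/{run_id}/{model}_model`) come from the cells in this notebook; the helper names `registry_name` and `model_uri` and the sample values are hypothetical.

```python
def registry_name(label: str, owner: str, project: str) -> str:
    """Build the unique registered-model name for one classifier."""
    return f"CC_Fraud_{label}_Classifier_{owner}_{project}"


def model_uri(run_id: str, model_tag: str) -> str:
    """Build the MLflow model URI, assuming artifacts are logged as '<model>_model'."""
    artifact_path = f"{model_tag.lower().replace(' ', '_')}_model"
    return f"runs:/{run_id}/{artifact_path}"


# Sample owner, project, and run ID are placeholders, not real Domino values.
name = registry_name("XGBoost", "jane", "fraud-demo")
uri = model_uri("abc123", "XGBoost")
print(name)
print(uri)
```

If the artifact path in the URI does not match the path used when the model was logged, `mlflow.register_model` cannot locate the model, so it is worth confirming the logged path in the run's artifact view.
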

In [None]:
def register_model_to_registry(run_info, base_model_name="fraud_detection_classifier"):
    """
    Register the best model to Domino Model Registry with model card and specifications.
    Model name is made unique by appending project name for cross-project compatibility.
    """
    try:
        if run_info is None:
            print("No run information provided")
            return None
        
        # Create unique model name using project name
        project_name = os.environ.get('DOMINO_PROJECT_NAME', 'unknown-project')
        project_owner = os.environ.get('DOMINO_PROJECT_OWNER', 'unknown-user')
        model_name = f"{base_model_name}_{project_owner}_{project_name}"
        
        print(f"📝 Creating unique model name: {model_name}")
        print(f"   Project: {project_name}")
        print(f"   Owner: {project_owner}")
            
        run_id = run_info['run_id']
        
        # Extract training framework to determine correct artifact path
        training_framework = None
        methods = [
            ('tags.model_type', run_info.get('tags.model_type')),
            ('tags.mlflow.runName', run_info.get('tags.mlflow.runName', '')),
            ('tags.model', run_info.get('tags.model')),
        ]
        
        for method, value in methods:
            if value and training_framework is None:
                value_lower = str(value).lower()
                if 'xgb' in value_lower or 'xgboost' in value_lower:
                    training_framework = 'XGBoost'
                    break
                elif 'ada' in value_lower or 'adaboost' in value_lower:
                    training_framework = 'AdaBoost'
                    break
                elif 'gnb' in value_lower or 'gaussian' in value_lower or 'naive' in value_lower:
                    training_framework = 'GaussianNB'
                    break
        
        # Fallback if no detection worked
        if training_framework is None:
            training_framework = 'Unknown'
            print("⚠️ Could not detect training framework")
        
        # Construct the artifact path from the model tag, falling back to the detected
        # framework when the tag is missing (a missing tag may come back as None)
        model_artifact_name = str(run_info.get('tags.model') or training_framework).lower().replace(' ', '_')
        artifact_path = f"{model_artifact_name}_model"
        model_uri = f"runs:/{run_id}/{artifact_path}"
        
        print(f"🔍 Detected model type: {training_framework}")
        print(f"📁 Using artifact path: {artifact_path}")
        
        # Create model version with correct model URI
        model_version = mlflow.register_model(
            model_uri=model_uri,
            name=model_name
        )
        
        print("✅ Model registered successfully:")
        print(f"   Name: {model_version.name}")
        print(f"   Version: {model_version.version}")
        print(f"   Run ID: {run_id}")
        print(f"   Model URI: {model_uri}")
        
        # Check dataset information used by Flow scripts
        dataset_info, data_files = check_flow_datasets()
        
        # Print first 3 data files found
        if data_files:
            print(f"📁 Data files used: {', '.join(data_files[:3])}")
            if len(data_files) > 3:
                print(f"   ... and {len(data_files) - 3} more files")
        
        # Update model with comprehensive card template and specifications
        client = mlflow.tracking.MlflowClient()
        
        # Set simple one-sentence description for model version
        desc = f"Best performing fraud detection classifier from {project_owner}/{project_name} based on accuracy from automated flow training"
        client.update_model_version(
            name=model_name,
            version=model_version.version,
            description=desc
        )
        
        # Set basic model tags using client
        basic_tags = {
            "model_type": training_framework,
            "use_case": "fraud_detection",
            "training_data": "credit_card_transactions",
            "performance_metric": "accuracy",
            "deployment_ready": "true",
            "project_name": project_name,
            "project_owner": project_owner
        }
        
        for key, value in basic_tags.items():
            client.set_model_version_tag(model_name, model_version.version, key, value)
        
        # Check if model card template exists and use it for registered model description
        domino_working_dir = os.environ.get("DOMINO_WORKING_DIR", "/mnt")
        template_path = f"{domino_working_dir}/exercises/d_TrainingAndEvaluation/Model_Registry_template.md"
        try:
            with open(template_path, 'r') as file:
                model_card_description = file.read()
                client.update_registered_model(model_name, model_card_description)
                print(f"✅ Updated model description from template: {template_path}")
        except Exception as e:
            print(f"⚠️ Could not load model card template: {e}")
            # Fallback to simple description
            fallback_desc = f"Fraud Detection Classifier from {project_owner}/{project_name} trained using Domino Flow with multiple algorithm comparison"
            client.update_registered_model(model_name, fallback_desc)
        
        # Estimate model artifacts size based on model type
        size_estimates = {
            'XGBoost': '15-25 MB',
            'AdaBoost': '8-15 MB', 
            'GaussianNB': '< 5 MB'
        }
        model_size = size_estimates.get(training_framework, '10-20 MB (estimated)')
        
        # Estimate inference memory requirements based on model type
        memory_estimates = {
            'XGBoost': '512 MB - 1 GB (tree ensemble requires memory for all trees)',
            'AdaBoost': '256 MB - 512 MB (smaller ensemble, moderate memory)',
            'GaussianNB': '128 MB - 256 MB (simple probabilistic model, minimal memory)'
        }
        inference_memory = memory_estimates.get(training_framework, '256 MB - 512 MB (estimated)')
        
        # Set relevant model specifications
        model_specs = {
            "mlflow.domino.specs.Training Framework": training_framework,
            "mlflow.domino.specs.Model Artifacts Size": model_size,
            "mlflow.domino.specs.Training Dataset Size": dataset_info,
            "mlflow.domino.specs.Inference Memory": inference_memory
        }
        
        print("🔧 Setting model specs:")
        for key, value in model_specs.items():
            print(f"   {key}: {value}")
        
        # Apply all model specifications
        for key, value in model_specs.items():
            try:
                client.set_registered_model_tag(model_name, key, value)
                print(f"✅ Set spec: {key}")
            except Exception as e:
                print(f"❌ Failed to set spec {key}: {e}")
        
        print("✅ Model registration completed")
        print(f"   Unique Model Name: {model_name}")
        print(f"   Training Framework: {training_framework}")
        print(f"   Model Size: {model_size}")
        print(f"   Dataset Info: {dataset_info}")
        print(f"   Inference Memory: {inference_memory}")
        
        return model_version
        
    except Exception as e:
        print(f"❌ Error registering model: {e}")
        import traceback
        traceback.print_exc()
        return None

def check_flow_datasets():
    """
    Check dataset information used by Flow scripts - fast file-based check
    Returns dataset info and list of data files found
    """
    try:
        import os
        data_files_found = []
        
        # Primary dataset file used by all trainers
        dataset_filename = 'transformed_cc_transactions.csv'
        
        # Detect project type and set appropriate paths
        domino_working_dir = os.environ.get("DOMINO_WORKING_DIR", "/mnt")
        domino_project_name = os.environ.get("DOMINO_PROJECT_NAME", "Fraud-Detection-Workshop")
        
        # Check multiple possible locations for datasets based on project type
        if domino_working_dir == "/mnt/code":
            # Git-based project
            search_paths = [
                f"/mnt/data/{domino_project_name}/",
                "/mnt/artifacts/",
                domino_working_dir
            ]
        else:
            # File-based project
            search_paths = [
                f"/domino/datasets/local/{domino_project_name}/",
                "/mnt/",
                "/tmp/"
            ]
        for search_path in search_paths:
            if os.path.exists(search_path):
                try:
                    files = [f for f in os.listdir(search_path) 
                           if f.endswith('.csv') and os.path.isfile(os.path.join(search_path, f))]
                    for file in files:
                        if file not in data_files_found:
                            data_files_found.append(file)
                except Exception:
                    continue
        
        # Try to get info from the main dataset file
        dataset_info = "~500,000 credit card transactions (estimated)"
        for search_path in search_paths:
            dataset_path = os.path.join(search_path, dataset_filename)
            if os.path.exists(dataset_path):
                try:
                    file_size_mb = os.path.getsize(dataset_path) / (1024 * 1024)
                    with open(dataset_path, 'r') as f:
                        first_line = f.readline()
                        line_count = 1
                        for _ in range(100):
                            if f.readline():
                                line_count += 1
                            else:
                                break
                        sample_size = f.tell()
                        if sample_size > 0:
                            estimated_rows = int((file_size_mb * 1024 * 1024) / sample_size * line_count)
                        else:
                            estimated_rows = line_count
                    dataset_info = f"{estimated_rows:,} transactions ({file_size_mb:.1f} MB)"
                    break
                except Exception:
                    # file_size_mb is unbound if os.path.getsize() itself raised,
                    # so the fallback message must not reference it
                    dataset_info = "Dataset found (size unavailable)"
                    break
        
        return dataset_info, data_files_found
        
    except Exception as e:
        print(f"Warning: Could not check dataset info: {e}")
        return "~500,000 credit card transactions (estimated)", []

# Register the best model
if best_run is not None:
    registered_model = register_model_to_registry(best_run)
else:
    print("No best run available for registration")
    print("Please use the Domino UI to manually register the model as described in the instructions")
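
The row estimate in `check_flow_datasets` extrapolates from a short leading sample instead of reading the whole file. A standalone sketch of that idea follows, assuming rows are of roughly similar length; `estimate_line_count` is a hypothetical helper, and it uses binary mode and integer arithmetic, which differs slightly from the cell above.

```python
import os
import tempfile


def estimate_line_count(path: str, sample_lines: int = 100) -> int:
    """Estimate the number of lines (header included) from a short leading sample."""
    file_size = os.path.getsize(path)
    sampled_bytes = 0
    lines_read = 0
    with open(path, "rb") as f:  # binary mode so byte counts match the file size
        for _ in range(sample_lines + 1):  # header plus up to `sample_lines` rows
            line = f.readline()
            if not line:
                break
            sampled_bytes += len(line)
            lines_read += 1
    if sampled_bytes == 0:
        return 0
    # Scale the sampled line count by the ratio of total bytes to sampled bytes.
    return file_size * lines_read // sampled_bytes


# Build a throwaway CSV with a header and 999 rows of identical length.
with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False, newline="") as tmp:
    tmp.write("amount,flag\n")
    tmp.writelines("000012.50,0\n" for _ in range(999))
    demo_path = tmp.name

estimated = estimate_line_count(demo_path)
os.remove(demo_path)
print(estimated)
```

Here the estimate equals the true count of 1,000 lines because every line has the same length; with variable-length rows the result is only an approximation.
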

In [None]:
def register_preprocessing_pipeline():
    """
    Register the preprocessing pipeline from Exercise 3 as a separate model endpoint.
    Model name is made unique by appending project/user info for cross-project compatibility.
    """
    import sys
    import os
    
    # Create unique model name
    project_name = os.environ.get('DOMINO_PROJECT_NAME', 'unknown-project')
    project_owner = os.environ.get('DOMINO_PROJECT_OWNER', 'unknown-user')
    model_name = f"CC_Fraud_Feature_Scaling_{project_owner}_{project_name}"
    
    print(f"📝 Creating unique preprocessing model name: {model_name}")
    
    # Add project root to path to import from Exercise 3
    project_root = os.path.abspath(os.path.join(os.getcwd(), '..', '..'))
    if project_root not in sys.path:
        sys.path.insert(0, project_root)
    
    try:
        # Import the preprocessing functions from Exercise 3
        from exercises.c_DataEngineering.data_engineering import add_derived_features
        
        # Look for existing preprocessing experiments
        preprocessing_experiments = mlflow.search_experiments(
            filter_string="name LIKE '%Preprocessing%'"
        )
        
        if preprocessing_experiments:
            # Get the most recent preprocessing experiment
            latest_experiment = max(preprocessing_experiments, 
                                  key=lambda x: x.creation_time)
            
            print(f"Found preprocessing experiment: {latest_experiment.name}")
            
            # Search for preprocessing runs in this experiment
            preprocessing_runs = mlflow.search_runs(
                experiment_ids=[latest_experiment.experiment_id],
                filter_string="tags.pipeline = 'preprocessing'",
                order_by=["start_time DESC"],
                max_results=1
            )
            
            if not preprocessing_runs.empty:
                best_preprocessing_run = preprocessing_runs.iloc[0]
                run_id = best_preprocessing_run.run_id
                
                print(f"Found preprocessing run: {run_id}")
                
                # Register the preprocessing pipeline as a model
                preprocessing_model_uri = f"runs:/{run_id}/preprocessing_pipeline"
                
                try:
                    # Register the preprocessing pipeline
                    registered_model = mlflow.register_model(
                        model_uri=preprocessing_model_uri,
                        name=model_name
                    )
                    
                    print(f"✅ Registered preprocessing pipeline as model: {model_name}")
                    print(f"   Model Version: {registered_model.version}")
                    print(f"   Project: {project_name}")
                    print(f"   Owner: {project_owner}")
                    
                    # Update model version with description and tags
                    client = mlflow.tracking.MlflowClient()
                    
                    # Add description
                    client.update_model_version(
                        name=model_name,
                        version=registered_model.version,
                        description=f"Feature scaling pipeline for fraud detection from {project_owner}/{project_name}"
                    )
                    
                    # Add tags
                    client.set_model_version_tag(
                        name=model_name,
                        version=registered_model.version,
                        key="stage",
                        value="staging"
                    )
                    
                    client.set_model_version_tag(
                        name=model_name,
                        version=registered_model.version,
                        key="model_type",
                        value="preprocessing"
                    )
                    
                    client.set_model_version_tag(
                        name=model_name,
                        version=registered_model.version,
                        key="project_name",
                        value=project_name
                    )
                    
                    client.set_model_version_tag(
                        name=model_name,
                        version=registered_model.version,
                        key="project_owner",
                        value=project_owner
                    )
                    
                    print("✅ Updated model version with description and tags")
                    
                    return registered_model
                    
                except Exception as e:
                    print(f"❌ Error registering preprocessing model: {e}")
                    return None
            else:
                print("❌ No preprocessing runs found with 'preprocessing' tag")
                return None
        else:
            print("❌ No preprocessing experiments found")
            print("   Please run Exercise 3 (Data Engineering) first to create the preprocessing pipeline")
            return None
            
    except ImportError as e:
        print(f"❌ Could not import from Exercise 3: {e}")
        print("   Please ensure Exercise 3 (Data Engineering) has been completed")
        return None

# Register the preprocessing pipeline
print("🔄 Registering preprocessing pipeline as feature scaling endpoint...")
preprocessing_model = register_preprocessing_pipeline()

In [None]:
def register_all_flow_models(target_experiment_name=None):
    """
    Register all three classifiers trained in the Domino Flow to the Model Registry.
    Each model is registered with a descriptive name: 'CC_Fraud_{Classifier}_Classifier_{owner}_{project}'
    Model names are made unique by appending project/user info for cross-project compatibility.
    """
    try:
        # Get project information for unique naming
        project_name = os.environ.get('DOMINO_PROJECT_NAME', 'unknown-project')
        project_owner = os.environ.get('DOMINO_PROJECT_OWNER', 'unknown-user')
        
        print(f"📝 Creating unique model names for project: {project_owner}/{project_name}")
        
        if target_experiment_name is None:
            target_experiment_name = experiment_name
            
        if target_experiment_name is None:
            print("No experiment name specified")
            return []
            
        print(f"Searching for all models in experiment: {target_experiment_name}")
        
        # Get the specific experiment by name
        try:
            experiment = mlflow.get_experiment_by_name(target_experiment_name)
            if experiment is None:
                print(f"Experiment '{target_experiment_name}' not found")
                return []
            experiment_ids = [experiment.experiment_id]
            print(f"Found experiment: {experiment.name} (ID: {experiment.experiment_id})")
        except Exception as e:
            print(f"Error finding experiment: {e}")
            return []
        
        # Search for all runs with accuracy metrics
        runs = mlflow.search_runs(
            experiment_ids=experiment_ids,
            filter_string="metrics.accuracy > 0",
            order_by=["metrics.accuracy DESC"],
            max_results=50
        )
        
        if runs.empty:
            print("No runs found with accuracy metrics")
            return []
        
        # Define the three models with unique naming pattern
        model_mapping = {
            'AdaBoost': f'CC_Fraud_ADA_Classifier_{project_owner}_{project_name}',
            'XGBoost': f'CC_Fraud_XGBoost_Classifier_{project_owner}_{project_name}',
            'GaussianNB': f'CC_Fraud_GaussianNB_Classifier_{project_owner}_{project_name}'
        }
        
        registered_models = []
        
        # Find and register each model type
        for model_type, registered_name in model_mapping.items():
            # Search for this specific model type
            matching_run = None
            for _, run in runs.iterrows():
                # Check multiple possible tag locations for model type
                run_model_type = (
                    run.get('tags.model', '') or
                    run.get('tags.model_type', '') or 
                    run.get('tags.mlflow.runName', '') or
                    ''
                )
                
                # Match model type (case insensitive)
                if model_type.lower() in str(run_model_type).lower():
                    matching_run = run
                    break
            
            if matching_run is None:
                print(f"⚠️  No run found for {model_type}")
                continue
            
            # Register this model
            print(f"\n{'='*60}")
            print(f"Registering {model_type} model as: {registered_name}")
            print(f"{'='*60}")
            
            run_id = matching_run['run_id']
            
            # CRITICAL FIX: Use correct artifact path
            # Models are logged as "{model_name}_model" in generic_trainer.py
            # Get the actual model name from tags, falling back to model_type if the tag is missing
            model_tag_name = matching_run.get('tags.model') or model_type
            artifact_path = f"{str(model_tag_name).lower().replace(' ', '_')}_model"
            model_uri = f"runs:/{run_id}/{artifact_path}"
            
            # Print model metrics and artifact path
            print(f"  Run ID: {run_id}")
            print(f"  Artifact Path: {artifact_path}")
            print(f"  Model URI: {model_uri}")
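            # Every run passed the "metrics.accuracy > 0" filter, so index directly;
            # a string default such as 'N/A' would break the :.4f format spec
            print(f"  Accuracy: {matching_run['metrics.accuracy']:.4f}")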
            print(f"  ROC-AUC: {matching_run.get('metrics.roc_auc', 'N/A')}")
            print(f"  Precision: {matching_run.get('metrics.precision_fraud', 'N/A')}")
            print(f"  Recall: {matching_run.get('metrics.recall_fraud', 'N/A')}")
            print(f"  F1-Score: {matching_run.get('metrics.f1_fraud', 'N/A')}")
            
            try:
                # Register the model with correct URI and unique name
                model_version = mlflow.register_model(
                    model_uri=model_uri,
                    name=registered_name
                )
                
                print(f"\n✅ Model registered successfully:")
                print(f"   Name: {model_version.name}")
                print(f"   Version: {model_version.version}")
                print(f"   Project: {project_name}")
                print(f"   Owner: {project_owner}")
                
                # Add metadata and tags
                client = mlflow.tracking.MlflowClient()
                
                # Set description
                desc = f"{model_type} fraud detection classifier from {project_owner}/{project_name} trained via Domino Flow with ROC-AUC: {matching_run.get('metrics.roc_auc', 'N/A')}"
                client.update_model_version(
                    name=registered_name,
                    version=model_version.version,
                    description=desc
                )
                
                # Set basic tags (including project info)
                basic_tags = {
                    "model_type": model_type,
                    "use_case": "fraud_detection",
                    "training_data": "credit_card_transactions",
                    "deployment_ready": "true",
                    "training_method": "domino_flow",
                    "project_name": project_name,
                    "project_owner": project_owner
                }
                
                for key, value in basic_tags.items():
                    client.set_model_version_tag(registered_name, model_version.version, key, value)
                
                # Set model specifications
                size_estimates = {
                    'XGBoost': '15-25 MB',
                    'AdaBoost': '8-15 MB', 
                    'GaussianNB': '< 5 MB'
                }
                memory_estimates = {
                    'XGBoost': '512 MB - 1 GB',
                    'AdaBoost': '256 MB - 512 MB',
                    'GaussianNB': '128 MB - 256 MB'
                }
                
                model_specs = {
                    "mlflow.domino.specs.Training Framework": model_type,
                    "mlflow.domino.specs.Training Dataset Size": "~500,000 credit card transactions",
                    "mlflow.domino.specs.Model Artifacts Size": size_estimates.get(model_type, '10-20 MB'),
                    "mlflow.domino.specs.Inference Memory": memory_estimates.get(model_type, '256 MB - 512 MB')
                }
                
                for key, value in model_specs.items():
                    client.set_registered_model_tag(registered_name, key, value)
                
                print(f"✅ Added metadata and tags")
                
                registered_models.append({
                    'name': registered_name,
                    'version': model_version.version,
                    'model_type': model_type,
                    'run_id': run_id,
                    'artifact_path': artifact_path
                })
                
            except Exception as e:
                print(f"❌ Error registering {model_type} model: {e}")
                import traceback
                traceback.print_exc()
                continue
        
        # Summary
        print(f"\n{'='*60}")
        print(f"REGISTRATION SUMMARY")
        print(f"{'='*60}")
        print(f"Successfully registered {len(registered_models)} models:")
        for model in registered_models:
            print(f"  • {model['name']} (v{model['version']}) - {model['model_type']}")
            print(f"    Artifact: {model['artifact_path']}")
        
        return registered_models
        
    except Exception as e:
        print(f"❌ Error in registration process: {e}")
        import traceback
        traceback.print_exc()
        return []

# Register all three models from the flow
print("Registering all models from the Domino Flow...")
all_registered_models = register_all_flow_models(experiment_name)

if not all_registered_models:
    print("\n⚠️  No models were registered. Please check that:")
    print("   1. The Domino Flow has completed successfully")
    print("   2. All three training tasks generated MLflow runs")
    print("   3. The experiment name is correct")
    print("\nYou can also register models manually through the Domino UI.")
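
The registration cell above derives each model URI from the run ID and the run's model tag. The sketch below isolates that step so it can be checked without an MLflow server. The helper name `build_model_uri` is hypothetical (it is not part of the notebook or of MLflow), and the `"{model_name}_model"` naming convention is taken from the comment about `generic_trainer.py`. It also assumes a missing tag may surface as `None` or `NaN` in the runs DataFrame.

```python
def build_model_uri(run_id, model_tag, fallback_name):
    """Return (artifact_path, model_uri) for one run.

    Hypothetical helper for illustration: mirrors the "{model_name}_model"
    naming convention described in the registration cell.
    """
    # Assumption: a missing tag arrives as None or NaN, neither of which is a str
    if not isinstance(model_tag, str) or not model_tag.strip():
        model_tag = fallback_name
    # Lowercase the name and replace spaces so it matches the logged artifact path
    artifact_path = f"{model_tag.strip().lower().replace(' ', '_')}_model"
    return artifact_path, f"runs:/{run_id}/{artifact_path}"


# Run IDs here are placeholders, not real MLflow run IDs
print(build_model_uri("abc123", "XGBoost", "XGBoost"))
print(build_model_uri("def456", float("nan"), "GaussianNB"))
```

If a registration fails with an artifact-not-found error, compare the printed artifact path with the artifact folder shown for that run in Experiment Manager.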

## Manual Model Registration Instructions

If programmatic registration is not available, follow these steps in the Domino UI:

1. **Select Best Model in Experiment Manager**
   - Choose the run with the highest ROC-AUC (typically XGBoost)
   - Click on the run to view details

2. **Register Model**
   - Click "Register Model From Run" in the upper right corner
   - Enter model name: `fraud_detection_classifier_v1`

3. **Add Model Metadata**
   - Description: "Production-ready fraud detection classifier"
   - Tags: Use the template from `Model_Registry_template.md`
   - Model specifications: Define input/output schemas

4. **Governance and Approval**
   - Review governance requirements
   - Submit for approval if required by your organization
   - Set deployment stage (Staging/Production)
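
Step 1 above (choosing the run with the highest ROC-AUC) can also be checked in code before opening the UI. This is a minimal sketch using plain dictionaries rather than an MLflow query. The function name `pick_best_run` and the metric values are made up for illustration and are not results from this workshop.

```python
import math


def pick_best_run(runs, metric="roc_auc"):
    """Return the run with the highest value for `metric`, or None if no run has it."""
    # Keep only runs whose metric is a real number (skip missing and NaN values)
    scored = [
        r for r in runs
        if isinstance(r.get(metric), (int, float)) and not math.isnan(r[metric])
    ]
    if not scored:
        return None
    return max(scored, key=lambda r: r[metric])


# Illustrative values only; real metrics come from the experiment's runs
example_runs = [
    {"model": "AdaBoost", "roc_auc": 0.95},
    {"model": "GaussianNB", "roc_auc": 0.91},
    {"model": "XGBoost", "roc_auc": 0.98},
    {"model": "Incomplete", "roc_auc": float("nan")},
]

best = pick_best_run(example_runs)
print(f"Best run: {best['model']} (ROC-AUC {best['roc_auc']:.4f})")
```

Filtering out missing values first matters because `max` gives unreliable results when NaN is among the compared values.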

---

## Summary

This notebook demonstrated the complete Domino Flow workflow for:

1. **Flow Execution**: Running parallel model training tasks
2. **Monitoring**: Tracking progress through Domino UI
3. **Comparison**: Evaluating model performance in Experiment Manager
4. **Registration**: Registering the trained models, including the best performer, in Model Registry

### Key Domino Concepts Utilized:
- **Domino Flows**: Visual workflow orchestration for ML pipelines
- **Experiment Manager**: Centralized tracking and comparison of model runs
- **Model Registry**: Cataloging and governance for production models

### Next Steps:
- Model deployment via REST API endpoints
- Integration with Streamlit application
- Production monitoring and governance

This completes the **Training and Evaluation** phase of the fraud detection workshop.