# Azure ML SDK MCP Sample Notebook

This notebook demonstrates advanced usage of the Azure Machine Learning SDK v2 with a focus on MCP (Model Context Protocol) integration and comprehensive ML workflows.

## Overview

This sample covers:
- Workspace connection with credential fallback
- Asset inventory (data, models, environments)
- Compute resource inventory
- Data asset registration and exploratory data analysis
- Job, experiment, and endpoint analysis
- A heuristic workspace health check with recommendations

## Prerequisites

- Azure subscription with Azure ML workspace
- Azure ML SDK v2 installed (`azure-ai-ml`)
- Proper authentication credentials
- Basic understanding of machine learning concepts

## 1. Setup and Installation

First, ensure you have the required packages installed:

In [None]:
# Install required packages (run this if packages are not installed)
# !pip install azure-ai-ml azure-identity pandas scikit-learn matplotlib seaborn

## 2. Import Required Libraries

In [None]:
import os
import json
from azure.ai.ml import MLClient
from azure.identity import DefaultAzureCredential, InteractiveBrowserCredential
from azure.ai.ml.entities import (
    Data, Model, Environment, Component, Job, 
    ManagedOnlineEndpoint, ManagedOnlineDeployment,
    AmlCompute, ComputeInstance
)
from azure.ai.ml.constants import AssetTypes
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from datetime import datetime, timedelta

## 3. Advanced Workspace Connection

Replace the placeholder values with your actual Azure subscription and workspace details:

In [None]:
# Azure ML workspace details - Update these with your values
subscription_id = "<your-subscription-id>"
resource_group_name = "<your-resource-group>"
workspace_name = "<your-workspace-name>"

# Initialize credential and ML client with error handling
def get_ml_client():
    """Get authenticated ML client with fallback credentials"""
    credentials_to_try = [
        ("DefaultAzureCredential", DefaultAzureCredential()),
        ("InteractiveBrowserCredential", InteractiveBrowserCredential())
    ]
    
    for cred_name, credential in credentials_to_try:
        try:
            ml_client = MLClient(
                credential=credential,
                subscription_id=subscription_id,
                resource_group_name=resource_group_name,
                workspace_name=workspace_name
            )
            # Test the connection
            _ = ml_client.workspaces.get(workspace_name)
            print(f"✅ Successfully connected using {cred_name}")
            return ml_client
        except Exception as e:
            print(f"❌ {cred_name} failed: {str(e)[:100]}...")
            continue
    
    raise RuntimeError("All authentication methods failed")

# Get authenticated client
ml_client = get_ml_client()

print(f"Connected to workspace: {ml_client.workspace_name}")
print(f"Resource group: {ml_client.resource_group_name}")
print(f"Subscription: {ml_client.subscription_id}")
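
Hard-coding identifiers in a notebook makes it easy to commit them by accident. A minimal sketch, assuming the three values are supplied through environment variables. The variable names `AZURE_SUBSCRIPTION_ID`, `AZURE_RESOURCE_GROUP`, and `AZURE_ML_WORKSPACE` and the helper `load_workspace_settings` are illustrative choices made for this example:

```python
import os

# Hypothetical variable names chosen for this sketch
REQUIRED_VARS = {
    "subscription_id": "AZURE_SUBSCRIPTION_ID",
    "resource_group_name": "AZURE_RESOURCE_GROUP",
    "workspace_name": "AZURE_ML_WORKSPACE",
}

def load_workspace_settings(environ=None):
    """Return MLClient keyword arguments read from environment variables."""
    environ = os.environ if environ is None else environ
    settings = {key: environ.get(var, "").strip() for key, var in REQUIRED_VARS.items()}
    # Report every missing variable at once instead of failing on the first
    missing = [REQUIRED_VARS[key] for key, value in settings.items() if not value]
    if missing:
        raise ValueError(f"Missing environment variables: {', '.join(missing)}")
    return settings
```

The returned dictionary can be unpacked into the client constructor, for example `MLClient(credential=credential, **load_workspace_settings())`.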

## 4. Comprehensive Workspace Assets Analysis

Let's perform a detailed analysis of all assets in your workspace:

In [None]:
def analyze_workspace_assets():
    """Comprehensive analysis of workspace assets"""
    
    # Data Assets Analysis
    print("=== DATA ASSETS ANALYSIS ===")
    data_assets = list(ml_client.data.list())
    
    if data_assets:
        data_df = pd.DataFrame([
            {
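                'name': asset.name,
                # Container-level entries may leave version unset; fall back to latest_version
                'version': asset.version or getattr(asset, 'latest_version', 'N/A'),
                'type': asset.type,
                'description': asset.description or 'No description',
                # creation_context can be None, so guard before reading created_at
                'created_date': asset.creation_context.created_at if getattr(asset, 'creation_context', None) else 'Unknown'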
            } for asset in data_assets[:10]  # Show first 10
        ])
        print(f"Total data assets: {len(data_assets)}")
        print("\nRecent data assets:")
        print(data_df.to_string(index=False))
        
        # Asset type distribution
        type_counts = pd.Series([asset.type for asset in data_assets]).value_counts()
        print("\nData asset types distribution:")
        print(type_counts)
    else:
        print("No data assets found")
    
    print("\n" + "="*50)
    
    # Model Assets Analysis
    print("=== MODEL ASSETS ANALYSIS ===")
    model_assets = list(ml_client.models.list())
    
    if model_assets:
        print(f"Total model assets: {len(model_assets)}")
        for i, model in enumerate(model_assets[:5]):
            print(f"{i+1}. {model.name} (v{model.version or getattr(model, 'latest_version', 'N/A')}) - {model.description or 'No description'}")
        if len(model_assets) > 5:
            print(f"... and {len(model_assets) - 5} more")
    else:
        print("No model assets found")
    
    print("\n" + "="*50)
    
    # Environment Assets Analysis
    print("=== ENVIRONMENT ASSETS ANALYSIS ===")
    env_assets = list(ml_client.environments.list())
    
    if env_assets:
        print(f"Total environment assets: {len(env_assets)}")
        
        # Categorize environments
        curated_envs = [env for env in env_assets if env.name.startswith('AzureML')]
        custom_envs = [env for env in env_assets if not env.name.startswith('AzureML')]
        
        print(f"Curated environments: {len(curated_envs)}")
        print(f"Custom environments: {len(custom_envs)}")
        
        print("\nRecent custom environments:")
        for i, env in enumerate(custom_envs[:3]):
            print(f"{i+1}. {env.name} (v{env.version or getattr(env, 'latest_version', 'N/A')})")
    else:
        print("No environment assets found")

analyze_workspace_assets()
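
Entries returned by the list calls do not always carry every field: `creation_context` can be `None`, and `version` may be unset on container-level entries. A small sketch of a defensive accessor, using `SimpleNamespace` objects as stand-ins for SDK entities. The helper `describe_asset` and the `latest_version` fallback are assumptions made for illustration:

```python
from types import SimpleNamespace

def describe_asset(asset):
    """Build a display row without assuming optional fields are populated."""
    context = getattr(asset, "creation_context", None)
    version = getattr(asset, "version", None) or getattr(asset, "latest_version", None)
    return {
        "name": asset.name,
        "version": version if version is not None else "N/A",
        "description": getattr(asset, "description", None) or "No description",
        # getattr on None returns the default, so a missing context yields "Unknown"
        "created_date": getattr(context, "created_at", None) or "Unknown",
    }

# Stand-in objects for the demo; real entities come from calls such as ml_client.data.list()
complete = SimpleNamespace(name="titanic", version="2", description="demo",
                           creation_context=SimpleNamespace(created_at="2024-01-01"))
sparse = SimpleNamespace(name="diabetes", version=None, latest_version="5",
                         description=None, creation_context=None)

for row in (describe_asset(complete), describe_asset(sparse)):
    print(row)
```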

## 5. Advanced Compute Resource Management

Let's explore and manage compute resources comprehensively:

In [None]:
def analyze_compute_resources():
    """Comprehensive compute resource analysis"""
    
    print("=== COMPUTE RESOURCES ANALYSIS ===")
    compute_resources = list(ml_client.compute.list())
    
    if compute_resources:
        compute_df = pd.DataFrame([
            {
                'name': compute.name,
                'type': compute.type,
                'state': getattr(compute, 'provisioning_state', 'N/A'),
                'size': getattr(compute, 'size', 'N/A'),
                # AmlCompute exposes scale limits as min_instances / max_instances; other compute types fall back to 'N/A'
                'min_nodes': getattr(compute, 'min_instances', 'N/A'),
                'max_nodes': getattr(compute, 'max_instances', 'N/A')
            } for compute in compute_resources
        ])
        
        print(f"Total compute resources: {len(compute_resources)}")
        print("\nCompute resources details:")
        print(compute_df.to_string(index=False))
        
        # Compute type distribution
        type_counts = compute_df['type'].value_counts()
        print("\nCompute types distribution:")
        print(type_counts)
        
        # State analysis
        state_counts = compute_df['state'].value_counts()
        print("\nCompute states:")
        print(state_counts)
        
    else:
        print("No compute resources found")

analyze_compute_resources()
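
One simple cost check on a compute listing is to flag clusters whose minimum node count is above zero, since those nodes stay allocated while idle. A sketch assuming the `min_instances` and `max_instances` attribute names used by `AmlCompute` in SDK v2. `find_always_on_clusters` is a hypothetical helper, and the objects below are stand-ins rather than real compute entities:

```python
from types import SimpleNamespace

def find_always_on_clusters(computes):
    """Return names of clusters that never scale down to zero nodes."""
    flagged = []
    for compute in computes:
        min_nodes = getattr(compute, "min_instances", None)
        # Compute types without scale settings (e.g. compute instances) are skipped
        if isinstance(min_nodes, int) and min_nodes > 0:
            flagged.append(compute.name)
    return flagged

# Stand-in objects for the demo; real entities come from ml_client.compute.list()
sample = [
    SimpleNamespace(name="cpu-cluster", type="amlcompute", min_instances=0, max_instances=4),
    SimpleNamespace(name="gpu-cluster", type="amlcompute", min_instances=1, max_instances=2),
    SimpleNamespace(name="dev-box", type="computeinstance"),
]
print(find_always_on_clusters(sample))  # ['gpu-cluster']
```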

## 6. Advanced Data Asset Management

Create and manage data assets with comprehensive metadata:

In [None]:
# Create multiple sample data assets with rich metadata
sample_datasets = [
    {
        "name": "titanic-mcp-demo",
        "path": "https://raw.githubusercontent.com/Azure/azureml-examples/main/sdk/python/assets/data/sample_data/titanic.csv",
        "description": "Titanic dataset for MCP demonstration - passenger survival data",
        "tags": {"source": "github", "type": "demo", "format": "csv", "domain": "transportation", "task": "classification"}
    },
    {
        "name": "diabetes-mcp-demo",
        "path": "https://raw.githubusercontent.com/Azure/azureml-examples/main/sdk/python/assets/data/sample_data/diabetes.csv",
        "description": "Diabetes dataset for MCP demonstration - medical prediction data",
        "tags": {"source": "github", "type": "demo", "format": "csv", "domain": "healthcare", "task": "regression"}
    }
]

registered_assets = []

for dataset_info in sample_datasets:
    try:
        sample_data = Data(
            path=dataset_info["path"],
            type=AssetTypes.URI_FILE,
            description=dataset_info["description"],
            name=dataset_info["name"],
            tags=dataset_info["tags"]
        )
        
        # Register the data asset
        registered_data = ml_client.data.create_or_update(sample_data)
        registered_assets.append(registered_data)
        
        print(f"✅ Successfully registered: {registered_data.name}")
        print(f"   Version: {registered_data.version}")
        print(f"   Type: {registered_data.type}")
        print(f"   Description: {registered_data.description}")
        print(f"   Tags: {registered_data.tags}")
        print()
        
    except Exception as e:
        print(f"❌ Failed to register {dataset_info['name']}: {e}")

print(f"Successfully registered {len(registered_assets)} data assets")
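
Because each registration is a remote call, it can help to validate the dataset specifications locally first. A minimal sketch: the required keys mirror the dictionaries in `sample_datasets`, while `validate_dataset_spec`, its naming rule, and its list of path prefixes are assumptions made for illustration, not the service's actual validation logic:

```python
import re

REQUIRED_KEYS = ("name", "path", "description", "tags")
# Illustrative rule: letters, digits, '-', '_' or '.', starting with a letter or digit
NAME_PATTERN = re.compile(r"^[A-Za-z0-9][A-Za-z0-9._-]*$")
# Illustrative list of accepted path prefixes
PATH_PREFIXES = ("https://", "http://", "azureml://", "wasbs://", "abfss://", "./", "/")

def validate_dataset_spec(spec):
    """Return a list of problems found in one dataset specification."""
    problems = [f"missing key: {key}" for key in REQUIRED_KEYS if key not in spec]
    name = spec.get("name", "")
    if name and not NAME_PATTERN.match(name):
        problems.append(f"invalid name: {name!r}")
    path = spec.get("path", "")
    if path and not path.startswith(PATH_PREFIXES):
        problems.append(f"unrecognized path scheme: {path!r}")
    if not isinstance(spec.get("tags", {}), dict):
        problems.append("tags must be a dictionary")
    return problems
```

Running `validate_dataset_spec` over every entry before the registration loop lets the notebook report all local problems in one pass.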

## 7. Advanced Data Exploration and Analysis

Perform comprehensive data analysis with visualizations:

In [None]:
def comprehensive_data_analysis(dataset_url, dataset_name):
    """Perform comprehensive data analysis with visualizations"""
    
    try:
        # Read the dataset
        df = pd.read_csv(dataset_url)
        
        print(f"=== {dataset_name.upper()} DATASET ANALYSIS ===")
        print(f"Dataset shape: {df.shape}")
        print(f"Columns: {list(df.columns)}")
        
        # Basic info
        print("\n=== DATASET INFO ===")
        df.info()  # info() prints its report directly and returns None
        
        # First few rows
        print("\n=== FIRST 5 ROWS ===")
        print(df.head())
        
        # Statistical summary
        print("\n=== STATISTICAL SUMMARY ===")
        print(df.describe())
        
        # Missing values analysis
        print("\n=== MISSING VALUES ANALYSIS ===")
        missing_data = df.isnull().sum()
        missing_percent = (missing_data / len(df)) * 100
        missing_df = pd.DataFrame({
            'Missing Count': missing_data,
            'Missing Percentage': missing_percent
        })
        print(missing_df[missing_df['Missing Count'] > 0])
        
        # Data quality insights
        print("\n=== DATA QUALITY INSIGHTS ===")
        print(f"Total missing values: {df.isnull().sum().sum()}")
        print(f"Duplicate rows: {df.duplicated().sum()}")
        print(f"Unique rows: {df.drop_duplicates().shape[0]}")
        
        # Visualization setup
        # 'seaborn-v0_8' requires Matplotlib 3.6+; older releases name this style 'seaborn'
        plt.style.use('seaborn-v0_8')
        fig, axes = plt.subplots(2, 2, figsize=(15, 10))
        fig.suptitle(f'{dataset_name} Dataset Analysis', fontsize=16)
        
        # Plot 1: Missing values heatmap
        if df.isnull().sum().sum() > 0:
            sns.heatmap(df.isnull(), ax=axes[0, 0], cbar=True, yticklabels=False)
            axes[0, 0].set_title('Missing Values Heatmap')
        else:
            axes[0, 0].text(0.5, 0.5, 'No Missing Values', ha='center', va='center', transform=axes[0, 0].transAxes)
            axes[0, 0].set_title('Missing Values Status')
        
        # Plot 2: Data types distribution
        dtype_counts = df.dtypes.value_counts()
        axes[0, 1].pie(dtype_counts.values, labels=dtype_counts.index, autopct='%1.1f%%')
        axes[0, 1].set_title('Data Types Distribution')
        
        # Plot 3: Correlation matrix for numeric columns
        numeric_cols = df.select_dtypes(include=['number']).columns
        if len(numeric_cols) > 1:
            corr_matrix = df[numeric_cols].corr()
            sns.heatmap(corr_matrix, annot=True, cmap='coolwarm', center=0, ax=axes[1, 0])
            axes[1, 0].set_title('Correlation Matrix')
        else:
            axes[1, 0].text(0.5, 0.5, 'Insufficient numeric columns', ha='center', va='center', transform=axes[1, 0].transAxes)
            axes[1, 0].set_title('Correlation Analysis')
        
        # Plot 4: Sample distribution of first numeric column
        if len(numeric_cols) > 0:
            first_numeric = numeric_cols[0]
            df[first_numeric].hist(bins=20, ax=axes[1, 1], alpha=0.7)
            axes[1, 1].set_title(f'Distribution of {first_numeric}')
            axes[1, 1].set_xlabel(first_numeric)
            axes[1, 1].set_ylabel('Frequency')
        else:
            axes[1, 1].text(0.5, 0.5, 'No numeric columns', ha='center', va='center', transform=axes[1, 1].transAxes)
            axes[1, 1].set_title('Distribution Analysis')
        
        plt.tight_layout()
        plt.show()
        
        return df
        
    except Exception as e:
        print(f"❌ Failed to analyze {dataset_name}: {e}")
        return None

# Analyze Titanic dataset
titanic_df = comprehensive_data_analysis(
    "https://raw.githubusercontent.com/Azure/azureml-examples/main/sdk/python/assets/data/sample_data/titanic.csv",
    "Titanic"
)

## 8. Advanced Job and Experiment Management

Comprehensive analysis of jobs and experiments:

In [None]:
def analyze_jobs_and_experiments():
    """Comprehensive job and experiment analysis"""
    
    print("=== JOBS AND EXPERIMENTS ANALYSIS ===")
    
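    try:
        # Get recent jobs, capping the paged iterator client-side instead of relying on a max_results keyword
        from itertools import islice
        jobs = list(islice(ml_client.jobs.list(), 20))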
        
        if jobs:
            # Create jobs dataframe
            jobs_data = []
            for job in jobs:
                jobs_data.append({
                    'name': job.name,
                    'display_name': getattr(job, 'display_name', 'N/A'),
                    'type': job.type,
                    'status': job.status,
                    'experiment_name': getattr(job, 'experiment_name', 'N/A'),
                    # creation_context can be None, so guard before reading created_at
                    'created_date': job.creation_context.created_at if getattr(job, 'creation_context', None) else 'Unknown'
                })
            
            jobs_df = pd.DataFrame(jobs_data)
            
            print(f"Total recent jobs: {len(jobs)}")
            print("\nRecent jobs overview:")
            print(jobs_df.to_string(index=False))
            
            # Job status analysis
            print("\n=== JOB STATUS DISTRIBUTION ===")
            status_counts = jobs_df['status'].value_counts()
            print(status_counts)
            
            # Job type analysis
            print("\n=== JOB TYPE DISTRIBUTION ===")
            type_counts = jobs_df['type'].value_counts()
            print(type_counts)
            
            # Experiment analysis
            print("\n=== EXPERIMENT ANALYSIS ===")
            experiment_counts = jobs_df['experiment_name'].value_counts()
            print(f"Number of unique experiments: {len(experiment_counts)}")
            print("Top experiments by job count:")
            print(experiment_counts.head())
            
            # Time-based analysis
            if 'created_date' in jobs_df.columns:
                print("\n=== TEMPORAL ANALYSIS ===")
                # Convert dates and analyze patterns
                valid_dates = jobs_df[jobs_df['created_date'] != 'Unknown']['created_date']
                if len(valid_dates) > 0:
                    print(f"Jobs with valid timestamps: {len(valid_dates)}")
                    print(f"Date range: {valid_dates.min()} to {valid_dates.max()}")
            
        else:
            print("No jobs found in the workspace")
            
    except Exception as e:
        print(f"Failed to analyze jobs: {e}")

analyze_jobs_and_experiments()
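
List operations in the SDK return lazy, paged iterators, so wrapping one in `list()` fetches every page. A sketch of capping the number of items client-side with `itertools.islice` and tallying statuses with `collections.Counter`. `fake_job_pages` is a hypothetical generator standing in for `ml_client.jobs.list()`:

```python
from collections import Counter
from itertools import islice
from types import SimpleNamespace

def fake_job_pages(total):
    """Stand-in for a paged listing; yields job-like objects lazily."""
    statuses = ["Completed", "Failed", "Running"]
    for i in range(total):
        yield SimpleNamespace(name=f"job-{i}", status=statuses[i % 3])

def summarize_recent(jobs, limit):
    """Consume at most `limit` jobs and count them by status."""
    recent = list(islice(jobs, limit))
    return len(recent), Counter(job.status for job in recent)

# Only the first 20 items are pulled from the generator
count, by_status = summarize_recent(fake_job_pages(1000), limit=20)
print(count, dict(by_status))
```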

## 9. Model and Endpoint Management

Advanced model deployment and endpoint management:

In [None]:
def analyze_endpoints():
    """Analyze existing endpoints in the workspace"""
    
    print("=== ENDPOINT ANALYSIS ===")
    
    try:
        # Online endpoints
        online_endpoints = list(ml_client.online_endpoints.list())
        
        if online_endpoints:
            print(f"Total online endpoints: {len(online_endpoints)}")
            
            for i, endpoint in enumerate(online_endpoints[:5]):
                print(f"\n{i+1}. Endpoint: {endpoint.name}")
                print(f"   Location: {getattr(endpoint, 'location', 'N/A')}")
                print(f"   Provisioning State: {getattr(endpoint, 'provisioning_state', 'N/A')}")
                print(f"   Scoring URI: {getattr(endpoint, 'scoring_uri', 'N/A')}")
                
                # Get deployments for this endpoint
                try:
                    deployments = list(ml_client.online_deployments.list(endpoint_name=endpoint.name))
                    print(f"   Deployments: {len(deployments)}")
                    for j, deployment in enumerate(deployments):
                        print(f"     {j+1}. {deployment.name} - {getattr(deployment, 'provisioning_state', 'N/A')}")
                except Exception as e:
                    print(f"   Could not retrieve deployments: {e}")
                    
            if len(online_endpoints) > 5:
                print(f"\n... and {len(online_endpoints) - 5} more endpoints")
        else:
            print("No online endpoints found")
        
        # Batch endpoints
        try:
            batch_endpoints = list(ml_client.batch_endpoints.list())
            print(f"\nTotal batch endpoints: {len(batch_endpoints)}")
            
            for i, endpoint in enumerate(batch_endpoints[:3]):
                print(f"{i+1}. Batch Endpoint: {endpoint.name}")
                print(f"   Provisioning State: {getattr(endpoint, 'provisioning_state', 'N/A')}")
                
        except Exception as e:
            print(f"Could not retrieve batch endpoints: {e}")
            
    except Exception as e:
        print(f"Failed to analyze endpoints: {e}")

analyze_endpoints()

## 10. Workspace Health and Performance Metrics

Comprehensive workspace health analysis:

In [None]:
def workspace_health_check():
    """Comprehensive workspace health and performance analysis"""
    
    print("=== WORKSPACE HEALTH CHECK ===")
    
    health_metrics = {
        'data_assets': 0,
        'model_assets': 0,
        'environment_assets': 0,
        'compute_resources': 0,
        'recent_jobs': 0,
        'online_endpoints': 0,
        'failed_jobs': 0,
        'running_jobs': 0
    }
    
    try:
        # Count assets
        health_metrics['data_assets'] = len(list(ml_client.data.list()))
        health_metrics['model_assets'] = len(list(ml_client.models.list()))
        health_metrics['environment_assets'] = len(list(ml_client.environments.list()))
        health_metrics['compute_resources'] = len(list(ml_client.compute.list()))
        
        # Analyze recent jobs (cap the paged iterator client-side at 50)
        from itertools import islice
        recent_jobs = list(islice(ml_client.jobs.list(), 50))
        health_metrics['recent_jobs'] = len(recent_jobs)
        
        if recent_jobs:
            health_metrics['failed_jobs'] = sum(1 for job in recent_jobs if job.status == 'Failed')
            health_metrics['running_jobs'] = sum(1 for job in recent_jobs if job.status in ['Running', 'Queued'])
        
        # Count endpoints
        try:
            health_metrics['online_endpoints'] = len(list(ml_client.online_endpoints.list()))
        except Exception:  # a bare except would also swallow KeyboardInterrupt
            health_metrics['online_endpoints'] = 0
        
        # Generate health report
        print("\n=== WORKSPACE METRICS ===")
        for metric, value in health_metrics.items():
            print(f"{metric.replace('_', ' ').title()}: {value}")
        
        # Health score calculation
        health_score = 0
        max_score = 100
        
        # Asset diversity (40 points)
        if health_metrics['data_assets'] > 0: health_score += 10
        if health_metrics['model_assets'] > 0: health_score += 10
        if health_metrics['environment_assets'] > 5: health_score += 10  # More than curated
        if health_metrics['compute_resources'] > 0: health_score += 10
        
        # Activity (30 points)
        if health_metrics['recent_jobs'] > 0: health_score += 15
        if health_metrics['recent_jobs'] > 10: health_score += 15
        
        # Deployment (20 points)
        if health_metrics['online_endpoints'] > 0: health_score += 20
        
        # Reliability (10 points)
        if health_metrics['recent_jobs'] > 0:
            failure_rate = health_metrics['failed_jobs'] / health_metrics['recent_jobs']
            if failure_rate < 0.1: health_score += 10
            elif failure_rate < 0.3: health_score += 5
        
        print(f"\n=== WORKSPACE HEALTH SCORE: {health_score}/{max_score} ===")
        
        if health_score >= 80:
            print("🟢 Excellent - Workspace is highly active and well-utilized")
        elif health_score >= 60:
            print("🟡 Good - Workspace is active with room for improvement")
        elif health_score >= 40:
            print("🟠 Fair - Consider increasing workspace utilization")
        else:
            print("🔴 Needs Attention - Low workspace activity detected")
        
        # Recommendations
        print("\n=== RECOMMENDATIONS ===")
        if health_metrics['data_assets'] == 0:
            print("• Consider registering data assets for better data management")
        if health_metrics['model_assets'] == 0:
            print("• Register trained models for version control and deployment")
        if health_metrics['compute_resources'] == 0:
            print("• Set up compute resources for training and inference")
        if health_metrics['online_endpoints'] == 0:
            print("• Consider deploying models to online endpoints for real-time inference")
        if health_metrics['failed_jobs'] > health_metrics['recent_jobs'] * 0.3:
            print("• High job failure rate detected - review job configurations")
        
    except Exception as e:
        print(f"Health check failed: {e}")

workspace_health_check()
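
Separating the scoring rules from the service calls makes them easy to test without a workspace. A sketch that reproduces the point scheme from `workspace_health_check` as a pure function. `compute_health_score` is a hypothetical name, and the thresholds are copied from that function rather than taken from any Azure guidance:

```python
def compute_health_score(metrics):
    """Score a metrics dictionary out of 100 using the notebook's rules."""
    score = 0
    # Asset diversity: up to 40 points
    if metrics["data_assets"] > 0:
        score += 10
    if metrics["model_assets"] > 0:
        score += 10
    if metrics["environment_assets"] > 5:
        score += 10
    if metrics["compute_resources"] > 0:
        score += 10
    # Activity: up to 30 points
    if metrics["recent_jobs"] > 0:
        score += 15
    if metrics["recent_jobs"] > 10:
        score += 15
    # Deployment: 20 points
    if metrics["online_endpoints"] > 0:
        score += 20
    # Reliability: up to 10 points, only measurable when jobs exist
    if metrics["recent_jobs"] > 0:
        failure_rate = metrics["failed_jobs"] / metrics["recent_jobs"]
        if failure_rate < 0.1:
            score += 10
        elif failure_rate < 0.3:
            score += 5
    return score

example = {
    "data_assets": 3, "model_assets": 0, "environment_assets": 8,
    "compute_resources": 1, "recent_jobs": 20, "online_endpoints": 0,
    "failed_jobs": 4,
}
print(compute_health_score(example))  # 65
```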

## 11. Summary and Advanced Next Steps

This comprehensive notebook demonstrated:

✅ **Workspace Connection**: Authentication with a fallback from `DefaultAzureCredential` to `InteractiveBrowserCredential`  
✅ **Asset Analysis**: Listing data, model, and environment assets with their metadata  
✅ **Compute Analysis**: Inventory of compute resources by type, size, and provisioning state  
✅ **Data Asset Registration**: Registering multiple datasets with descriptions and tags  
✅ **Data Exploration**: Summary statistics, missing-value checks, and visualizations  
✅ **Job and Experiment Analytics**: Status, type, and experiment distributions for recent jobs  
✅ **Endpoint Inventory**: Listing online and batch endpoints with their deployments  
✅ **Workspace Health Check**: A heuristic activity score with follow-up recommendations  

### Advanced Next Steps

To further advance your Azure ML expertise, explore these advanced topics:

- **[MLOps Pipelines](../../../tutorials/mlops)**: Enterprise-grade ML operations and automation
- **[AutoML Integration](../../../tutorials/automl)**: Automated machine learning workflows
- **[Responsible AI](../../../tutorials/responsible-ai)**: Fairness, explainability, and model governance
- **[Distributed Training](../../../tutorials/distributed-training)**: Large-scale model training
- **[Edge Deployment](../../../tutorials/edge-deployment)**: IoT and edge inference scenarios
- **[Feature Stores](../../../tutorials/feature-store)**: Advanced feature engineering and management

### Advanced Resources

- [Azure ML Architecture Patterns](https://docs.microsoft.com/azure/machine-learning/concept-ml-pipelines)
- [Production MLOps Guide](https://docs.microsoft.com/azure/machine-learning/concept-model-management-and-deployment)
- [Azure ML Best Practices](https://docs.microsoft.com/azure/machine-learning/concept-enterprise-security)
- [Advanced SDK Reference](https://docs.microsoft.com/python/api/azure-ai-ml/)

### Performance Optimization Tips

1. **Compute Optimization**: Use appropriate compute sizes and auto-scaling
2. **Data Optimization**: Implement data versioning and caching strategies
3. **Model Optimization**: Use model quantization and optimization techniques
4. **Cost Management**: Monitor and optimize resource usage
5. **Security**: Implement proper RBAC and network security