## Define Problem & Objectives

## Business Problem Definition

Organizations face continuous threats, ranging from external intrusions (e.g., brute-force login attempts, malicious IPs) to insider misuse (e.g., unusual session patterns). Traditional rule-based systems generate many false positives. The goal is to build a system that detects threats by learning from historical behavior, adapts over time, and provides interpretable results.

### Use Case:
Build a cyber threat detection system that combines: 
- **Supervised Classification** for known threats  
- **Explainability** to support compliance and human trust  

### Key Objectives:
- Automatically flag suspicious sessions or network activities  
- Support SOC analysts by providing a reason or risk score for each alert  
- Automate retraining and deployment using Azure MLOps tools  
- Log predictions and monitor for drift or changes in behavior patterns  


# Storage Service Reevaluation Based on Dataset Size and Use Case

**Source:** [Kaggle - Cybersecurity Intrusion Detection Dataset](https://www.kaggle.com/datasets/dnkumars/cybersecurity-intrusion-detection-dataset/data)

## Dataset Profile
- **Rows:** ~9,000  
- **Format:** Structured tabular (CSV)  
- **Size:** Likely under 10 MB  
- **Usage:** For ML model development, training, versioning  
- **Access pattern:** Infrequent writes, multiple reads (especially during training)  


Given the small dataset and the simple access pattern this project requires, Azure Blob Storage was selected as the primary storage solution. It balances flexibility, cost-efficiency, and integration with the Azure ML ecosystem.

## Benefits:
- Simple to configure and manage  
- Fully supported by Azure Machine Learning, Azure Databricks, and Azure Data Factory  
- Seamlessly integrates with versioned datasets in Azure ML  
- Cost-effective choice for small, structured datasets  
- Supports secure access through:
  - Microsoft Entra ID (formerly Azure Active Directory) with role-based access control (RBAC)  
  - Private Endpoints for secure network isolation  
  - Storage-level encryption  
- Easily scalable if data volume increases in the future  

## Implementation:
- Created a new storage account within the Azure environment  
- Selected general-purpose v2 (GPv2) as the storage account kind, which includes Blob Storage  
- Enforced HTTPS-only communication for secure data transfer  
- Enabled Soft Delete to provide protection against accidental data removal  
- Provisioned a dedicated container (e.g., `ml-data-intrusion`) for the ML assets  
- Uploaded the dataset (`intrusion_dataset.csv`) to the storage container  

## Secure Access:
- Assigned the appropriate RBAC roles (Storage Blob Data Reader or Contributor) to the Azure ML and Databricks service identities  
- Configured Private Endpoint access to ensure the storage account is not publicly accessible in production  

## Integration with Azure ML:
- Registered the dataset in Azure ML using `AssetTypes.URI_FILE`  
- Provided the direct path to the CSV file hosted in Blob Storage for subsequent use in training and pipeline workflows  


## Register the Dataset in Azure Machine Learning
## Step 1: Create a Container in Azure Blob Storage

To begin organizing the data assets, I created a dedicated container within the Blob Storage account.

In the upload window:
- Clicked **"Create new"** under the *Select an existing container* dropdown  
- Named the container using all lowercase characters with no special characters or spaces — for this project, I used:  
  - `intrusiondata`  
- Confirmed creation by clicking **Create**

## Step 2: Upload the Dataset

With the `intrusiondata` container selected:
- Clicked **Upload** to add the dataset  
- Uploaded the `cybersecurity_intrusion_data.csv` file  

The file is now available at the following path:

https://mldataintrusion.blob.core.windows.net/intrusiondata/cybersecurity_intrusion_data.csv


This path is important—I'll be using it to reference the dataset inside Azure Machine Learning.

## Step 3: Register the Dataset in Azure Machine Learning

Now that the dataset is securely stored in Blob Storage, I registered it in Azure Machine Learning. This enables version control, traceability, and easy reuse in training workflows and pipelines.

### Register via Azure ML Studio

1. Opened [Azure Machine Learning Studio](https://ml.azure.com)  
2. Navigated to the target ML Workspace  
3. In the left-hand menu, selected **Data → + Create**  
4. Filled out the dataset details:
   - **Type:** Tabular  
   - **Name:** `cyber_intrusion_dataset`  
   - **Description:** (left optional)  
   - **Datastore:** Selected the default linked to my storage account  
   - **Path:** Browsed to the uploaded CSV in the `intrusiondata` container  
5. Walked through the wizard and completed the dataset registration
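
The Studio wizard above can also be replaced by a few SDK calls. The following is a minimal sketch, assuming the `azure-ai-ml` package is installed and that the asset is registered as a `uri_file` pointing at the blob URL from Step 2. The helper names `blob_url` and `register_csv_asset` are hypothetical and do not appear in the original notebook.

```python
def blob_url(account: str, container: str, blob_name: str) -> str:
    """Build the HTTPS URL of a blob from its account, container, and name."""
    return f"https://{account}.blob.core.windows.net/{container}/{blob_name}"


def register_csv_asset(ml_client, name: str, path: str, version: str = "1"):
    """Register a single CSV as a uri_file data asset (hypothetical helper)."""
    # Imported lazily so the URL helper works without the Azure SDK installed.
    from azure.ai.ml.constants import AssetTypes
    from azure.ai.ml.entities import Data

    asset = Data(
        name=name,
        version=version,
        path=path,
        type=AssetTypes.URI_FILE,
        description="Cybersecurity intrusion detection dataset (CSV)",
    )
    return ml_client.data.create_or_update(asset)


# Names taken from Steps 1 and 2 above.
csv_url = blob_url("mldataintrusion", "intrusiondata", "cybersecurity_intrusion_data.csv")
print(csv_url)
```

With a connected client, `register_csv_asset(ml_client, "cyber_intrusion_dataset", csv_url)` would create version 1 of the asset. Whether a plain HTTPS blob URL is readable depends on the storage account's access settings, so a datastore path may be needed instead.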




## Create an Azure Machine Learning Workspace

To manage experiments, datasets, models, environments, and compute resources in one place, I created an Azure ML workspace.

This workspace acts as the central hub for all machine learning operations in the project.

### Step 1: How to Create an Azure ML Workspace

#### Method: Azure Portal (GUI)

1. Navigated to [Azure Portal](https://portal.azure.com)  
2. Clicked **Create a resource** → searched for **Machine Learning**  
3. Clicked **Create** and filled out the workspace configuration form:

| Field            | Value                                      |
|------------------|--------------------------------------------|
| **Subscription** | Azure for Students                         |
| **Resource Group** | `cyberml-canada-rg` (already created)     |
| **Workspace name** | `cyberml-ws` (or similar, lowercase)      |
| **Region**       | Canada Central (to match the storage region) |
| **Storage account** | Selected `mldataintrusion`               |
| **Key vault**    | Auto-created or selected existing          |
| **App Insights** | Auto-created                               |
| **Container Registry** | Optional (created if prompted)       |

4. Clicked **Review + Create**, validated the configuration, and deployed the workspace  

Once deployment completed:
- Navigated to [Azure ML Studio](https://ml.azure.com)  
- Switched to the new workspace using the dropdown in the top menu

---

### ⚡️ Alternative: Azure CLI (Faster)

If using the CLI, the workspace can be created with the following command:

```bash
az ml workspace create \
  --name cyberml-ws \
  --resource-group cyberml-canada-rg \
  --location canadacentral
```

---

This will provision the workspace with default settings and link it to the specified resource group and region.

With the workspace in place, I’m now able to:

- Register datasets
- Run and manage notebooks, experiments, and pipelines
- Train and register models
- Monitor and manage model deployments

---
## ✅ Once the Workspace Is Created

With the Azure ML workspace successfully deployed, I returned to [Azure Machine Learning Studio](https://ml.azure.com) to begin working within it.

### Switching to the New Workspace
- Used the **top-left workspace dropdown**  
- Selected my workspace: `cyberml-ws`

### Verifying Access to Data Management
Once inside the correct workspace:
- Navigated to the **Data** section from the left-hand menu  
- Clicked **+ Create**

At this point, the **Create** option was enabled, confirming that the workspace was fully active and ready for dataset registration and ML pipeline development.


## Data Exploration & Preprocessing

With the dataset registered in the Azure ML workspace, I proceeded to explore and preprocess the data in preparation for modeling.

### 🔍 What I Covered in This Step:
- Loaded the dataset in a notebook environment within Azure ML Studio  
- Explored the schema and overall data quality  
- Checked for missing values and data type inconsistencies  

### 🧹 Data Cleaning & Transformation:
- Converted categorical variables to numerical formats (e.g., label encoding or one-hot)  
- Handled any null values or inconsistent data types  
- Scaled or normalized numerical columns as needed  
- Optionally: Saved a cleaned version of the dataset back to Blob Storage or registered it as a new dataset version in Azure ML  

---

### 🔧 Launching a Notebook in Azure ML Studio

#### 1. Create the Notebook
- Navigated to **Notebooks** from the left-hand menu in [Azure ML Studio](https://ml.azure.com)  
- Clicked **User files**  
- Selected **+ New file → Notebook**  
- Named the notebook:  
  - `01_data_exploration.ipynb`

#### 2. Attach Compute (If Not Yet Created)
- Clicked **Select compute** in the top-right corner of the notebook interface  
- If no compute was available, I created a new one with the following settings:

| Field       | Value                                |
|-------------|----------------------------------------|
| **Name**    | `cpu-cluster`                         |
| **VM Size** | `Standard_DS11_v2` (or any free-tier eligible) |
| **Min nodes** | 0                                    |
| **Max nodes** | 1                                    |

- Waited for the compute to reach the running state before executing any cells
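
The same settings can be expressed with the SDK. This is a minimal sketch, assuming `azure-ai-ml` is installed and that the table describes a compute cluster (`AmlCompute`). `CLUSTER_SETTINGS` and `create_cluster` are hypothetical names, and the idle timeout is an assumed value that the table does not specify.

```python
# Settings taken from the table above; the idle timeout is an assumption.
CLUSTER_SETTINGS = {
    "name": "cpu-cluster",
    "size": "Standard_DS11_v2",
    "min_instances": 0,  # scale to zero when idle
    "max_instances": 1,
    "idle_time_before_scale_down": 120,  # seconds (assumed value)
}


def create_cluster(ml_client, settings=None):
    """Create or update the cluster and wait for provisioning (hypothetical helper)."""
    # Imported lazily so the settings can be inspected without the Azure SDK.
    from azure.ai.ml.entities import AmlCompute

    cluster = AmlCompute(**(settings or CLUSTER_SETTINGS))
    return ml_client.compute.begin_create_or_update(cluster).result()
```

Keeping the minimum node count at 0 means the cluster releases its node when no job is running, which matters on a quota-limited subscription.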

Once everything was set up, I began exploring the dataset and preparing it for the next phase: feature engineering and model training.


# Feature Engineering and Model Training

# Step 1: Connect to Azure ML Workspace

We connect to our Azure ML Workspace using `MLClient` and `DefaultAzureCredential`. This allows us to access registered assets like datasets and models from our workspace.


In [None]:
%pip install azure-ai-ml
%pip install azure-identity
from azure.ai.ml import MLClient, Input, load_component
from azure.ai.ml.entities import BatchEndpoint, ModelBatchDeployment, ModelBatchDeploymentSettings, PipelineComponentBatchDeployment, Model, AmlCompute, Data, BatchRetrySettings, CodeConfiguration, Environment
from azure.ai.ml.constants import AssetTypes, BatchDeploymentOutputAction
from azure.ai.ml.dsl import pipeline
from azure.identity import DefaultAzureCredential
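
`MLClient.from_config` looks for a `config.json` file that describes the workspace. As a minimal sketch, such a file could be written as shown below. The subscription ID is a placeholder and `write_workspace_config` is a hypothetical helper; the resource group and workspace names come from the workspace section above.

```python
import json
import tempfile
from pathlib import Path


def write_workspace_config(folder, subscription_id, resource_group, workspace_name):
    """Write the config.json that MLClient.from_config reads (hypothetical helper)."""
    config = {
        "subscription_id": subscription_id,
        "resource_group": resource_group,
        "workspace_name": workspace_name,
    }
    path = Path(folder) / "config.json"
    path.write_text(json.dumps(config, indent=2))
    return path


# Written to a temporary folder here; in a project it would sit next to the notebook.
config_path = write_workspace_config(
    tempfile.mkdtemp(),
    subscription_id="<subscription-id>",  # placeholder, not a real ID
    resource_group="cyberml-canada-rg",
    workspace_name="cyberml-ws",
)
print(config_path.read_text())
```

On an Azure ML compute instance a `config.json` is typically already present, so this step is mainly relevant when running elsewhere. In that case `MLClient.from_config(credential=DefaultAzureCredential(), path=str(config_path))` would pick the file up.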


In [None]:


import os
from azure.ai.ml import MLClient
from azure.ai.ml.entities import Data
from azure.ai.ml.constants import AssetTypes
from azure.identity import DefaultAzureCredential

# First, create MLTable file with correct structure
os.makedirs("mltable_cyber_intrusion", exist_ok=True)

mltable_content = """type: mltable
paths:
  - file: ../cybersecurity_intrusion_data.csv
transformations:
  - read_delimited:
      delimiter: ','
      encoding: utf8
      header: all_files_same_headers
"""

# Write MLTable file
with open("mltable_cyber_intrusion/MLTable", "w") as f:
    f.write(mltable_content)

# Connect to Azure ML Workspace
ml_client = MLClient.from_config(credential=DefaultAzureCredential())

try:
    # Register MLTable dataset using absolute path
    mltable_data = Data(
        path=os.path.abspath("mltable_cyber_intrusion"),
        type=AssetTypes.MLTABLE,
        name="cyber_intrusion_dataset_mltable",
        description="Cyber intrusion dataset in MLTable format",
        version="1"
    )
    
    result = ml_client.data.create_or_update(mltable_data)
    print(f"Dataset registered successfully. Name: {result.name}, Version: {result.version}")
    
except Exception as e:
    print(f"Error registering dataset: {str(e)}")







# Step 2: Load and Explore the Dataset
We retrieve the registered `cyber_intrusion_dataset` from Azure ML and load it into a pandas DataFrame. We examine the dataset shape, columns, and view a sample of the data.


In [3]:
from azure.ai.ml import MLClient
from azure.identity import DefaultAzureCredential
import pandas as pd

# Connect to workspace
ml_client = MLClient.from_config(credential=DefaultAzureCredential())

# Get data asset (uri_file type)
data_asset = ml_client.data.get(name="cyber_intrusion_dataset", version="1")

# The asset's path is the storage URI of the CSV; nothing is downloaded locally
data_uri = data_asset.path

# Load the CSV directly from its storage URI
# (reading an azureml:// URI with pandas requires the azureml-fsspec package)
df = pd.read_csv(data_uri)

# Explore
print("Shape:", df.shape)
print("Columns:", df.columns.tolist())
df.head()








# Step 3: Data Cleaning and Preprocessing

### ✔ 3.1 Check for Issues
We check for missing values, duplicated rows, and inspect data types to confirm data consistency.

### ✔ 3.2 Summary Statistics
We use `describe()` to understand the distribution of numeric features, helping inform transformations.

### ✔ 3.3 Encode Categorical Features
Categorical columns `protocol_type`, `encryption_used`, and `browser_type` are one-hot encoded to make them machine learning-ready.


In [4]:
# Check for missing values
print("Missing values per column:")
print(df.isnull().sum())

# Check for duplicates
print("\nNumber of duplicate rows:", df.duplicated().sum())

# Check column data types
print("\nColumn data types:")
print(df.dtypes)




In [5]:
# Statistical overview for numeric columns
df.describe()




In [6]:
# List of categorical columns to encode
categorical_cols = ['protocol_type', 'encryption_used', 'browser_type']

# Apply one-hot encoding
df_encoded = pd.get_dummies(df, columns=categorical_cols, drop_first=True)

# Show updated columns
print("New columns after encoding:")
print(df_encoded.columns.tolist())
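
`drop_first=True` removes one indicator column per categorical feature, because that category is implied when all remaining indicators are zero. The following dependency-free sketch illustrates the idea with made-up protocol values. `one_hot_drop_first` is a hypothetical helper and is not how pandas implements the encoding.

```python
def one_hot_drop_first(values, prefix):
    """One-hot encode a list of categories, dropping the first sorted category."""
    categories = sorted(set(values))
    kept = categories[1:]  # the first category becomes the implicit baseline
    return [{f"{prefix}_{c}": int(v == c) for c in kept} for v in values]


# Made-up values for illustration only.
rows = one_hot_drop_first(["TCP", "UDP", "ICMP", "TCP"], prefix="protocol_type")
for row in rows:
    print(row)
```

Here `ICMP` sorts first and is dropped, so a row with zeros in both remaining columns represents an ICMP session.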




# Step 4: Analyze Class Distribution
We check the balance of our target variable `attack_detected` to understand class imbalance. This will guide how we evaluate model performance (e.g., accuracy vs. F1-score).


In [7]:
# Check target class distribution
df_encoded['attack_detected'].value_counts(normalize=True) * 100
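
To see why class balance affects the choice of metric, consider a classifier that always predicts the majority class. The sketch below uses made-up labels (90% benign, 10% attack), not the project's data, and the helper names are hypothetical.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def f1_positive(y_true, y_pred):
    """F1-score for the positive class (label 1)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0  # no true positives means precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Made-up labels: 90 benign sessions, 10 attacks.
y_true = [0] * 90 + [1] * 10
always_benign = [0] * 100

print("accuracy:", accuracy(y_true, always_benign))
print("F1 (attack class):", f1_positive(y_true, always_benign))
```

The always-benign predictor scores high on accuracy while detecting no attacks, which is why the F1-score or recall on the attack class is worth tracking when classes are imbalanced.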




# Step 5: Normalize Numerical Features
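We apply Min-Max scaling so that the numeric features share a common [0, 1] range during model training. This is especially helpful for distance-based or gradient-based algorithms. Note that the scaler here is fitted on the full dataset before the train/test split in Step 6, so test-set minima and maxima influence the transformation; fitting it on the training split only gives a stricter evaluation.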


In [8]:
from sklearn.preprocessing import MinMaxScaler

# Select numeric features to scale (excluding label)
num_features = ['network_packet_size', 'login_attempts', 'session_duration', 'ip_reputation_score', 'failed_logins']

# Initialize scaler and apply
scaler = MinMaxScaler()
df_encoded[num_features] = scaler.fit_transform(df_encoded[num_features])

# Confirm scaling
df_encoded[num_features].describe().T
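
The cell above fits the scaler on every row. A stricter variant learns the minimum and maximum from the training rows only and reuses them for the test rows, which is what calling `fit` on `X_train` and `transform` on `X_test` does with `MinMaxScaler`. Below is a dependency-free sketch of that idea, with made-up values and hypothetical function names.

```python
def fit_min_max(values):
    """Learn the scaling parameters from the training values only."""
    return min(values), max(values)


def apply_min_max(values, lo, hi):
    """Scale values with previously learned parameters."""
    span = hi - lo
    if span == 0:
        return [0.0 for _ in values]  # constant feature: map everything to 0
    return [(v - lo) / span for v in values]


# Made-up feature values for illustration only.
train = [2.0, 4.0, 6.0, 10.0]
test = [3.0, 12.0]

lo, hi = fit_min_max(train)  # parameters come from the training split
train_scaled = apply_min_max(train, lo, hi)
test_scaled = apply_min_max(test, lo, hi)

print(train_scaled)
print(test_scaled)
```

Test values can fall outside [0, 1] when they exceed the training range. That is expected and reflects what the model would see on genuinely new data.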




# Step 6: Train-Test Split & Baseline Model Training
We split the preprocessed data into training and testing sets, then train a baseline Logistic Regression model to establish initial performance metrics.


In [9]:
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report, confusion_matrix

# Separate features and target
X = df_encoded.drop(['session_id', 'attack_detected'], axis=1)
y = df_encoded['attack_detected']

# Split into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42, stratify=y)

# Train baseline Logistic Regression model
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

# Predict and evaluate
y_pred = model.predict(X_test)

print("Classification Report:")
print(classification_report(y_test, y_pred, digits=4))

print("\nConfusion Matrix:")
print(confusion_matrix(y_test, y_pred))
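
For a binary problem, scikit-learn's `confusion_matrix` returns `[[TN, FP], [FN, TP]]`: rows are true labels and columns are predictions. Since false positives are the main pain point of rule-based systems, it can help to derive the false-positive rate and recall directly from that matrix. This is a minimal sketch with made-up counts; `alert_rates` is a hypothetical helper.

```python
def _ratio(numerator, denominator):
    """Safe division that returns 0.0 for an empty denominator."""
    return numerator / denominator if denominator else 0.0


def alert_rates(matrix):
    """Derive alert-quality rates from a 2x2 matrix laid out as [[TN, FP], [FN, TP]]."""
    (tn, fp), (fn, tp) = matrix
    return {
        "false_positive_rate": _ratio(fp, fp + tn),  # benign sessions flagged as attacks
        "recall": _ratio(tp, tp + fn),               # attacks that were caught
        "precision": _ratio(tp, tp + fp),            # alerts that were real attacks
    }


# Made-up counts for illustration only.
rates = alert_rates([[900, 100], [50, 150]])
for name, value in rates.items():
    print(f"{name}: {value:.3f}")
```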




# Step 7: Train Using Azure AutoML (SDK)

In this step, we use Azure AutoML to automate model selection, hyperparameter tuning, and training. This approach helps identify the best model without manual trial-and-error and ensures reproducibility.

We specify the compute target, data source, task type (classification), target column, and primary evaluation metric (accuracy, as set in the job definition below).


In [10]:
from azure.ai.ml import MLClient
from azure.identity import DefaultAzureCredential

ml_client = MLClient.from_config(credential=DefaultAzureCredential())
ml_client.data.get(name="cyber_intrusion_dataset_mltable", version="1")






In [13]:
import os
from azure.ai.ml import MLClient
from azure.ai.ml.entities import Data
from azure.ai.ml.constants import AssetTypes
from azure.identity import DefaultAzureCredential

# 1. Setup paths
base_dir = "mltable_cyber_intrusion"
os.makedirs(base_dir, exist_ok=True)

# 2. Rewrite MLTable definition correctly
mltable_yaml = """paths:
  - file: ./cybersecurity_intrusion_data.csv

transformations:
  - read_delimited:
      delimiter: ","
      encoding: utf8
      header: all_files_same_headers
"""

with open(os.path.join(base_dir, "MLTable"), "w") as f:
    f.write(mltable_yaml)

# 3. Register the MLTable dataset with absolute path
ml_client = MLClient.from_config(credential=DefaultAzureCredential())

mltable_data = Data(
    path=os.path.abspath(base_dir),
    type=AssetTypes.MLTABLE,
    name="cyber_intrusion_dataset_mltable",
    description="Cyber intrusion dataset in MLTable format",
    version="3",  # Use a new version!
)

result = ml_client.data.create_or_update(mltable_data)
print(f"Dataset registered: {result.name}, Version: {result.version}")
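
Paths in an `MLTable` file are resolved relative to the folder that contains it, so `./cybersecurity_intrusion_data.csv` only resolves if the CSV sits inside `mltable_cyber_intrusion`. The following is a minimal sketch of preparing such a folder, assuming the CSV is available locally. `build_mltable_folder` is a hypothetical helper, and the demo uses a throwaway CSV in a temporary directory.

```python
import shutil
import tempfile
from pathlib import Path

MLTABLE_TEMPLATE = """paths:
  - file: ./{csv_name}
transformations:
  - read_delimited:
      delimiter: ","
      encoding: utf8
      header: all_files_same_headers
"""


def build_mltable_folder(csv_path, folder):
    """Copy the CSV next to a freshly written MLTable file (hypothetical helper)."""
    csv_path, folder = Path(csv_path), Path(folder)
    folder.mkdir(parents=True, exist_ok=True)
    shutil.copy(csv_path, folder / csv_path.name)
    (folder / "MLTable").write_text(MLTABLE_TEMPLATE.format(csv_name=csv_path.name))
    return folder


# Demo with a tiny stand-in CSV; the column names are taken from later cells.
work_dir = Path(tempfile.mkdtemp())
sample_csv = work_dir / "cybersecurity_intrusion_data.csv"
sample_csv.write_text("session_id,attack_detected\nSID_1,0\n")

mltable_dir = build_mltable_folder(sample_csv, work_dir / "mltable_cyber_intrusion")
print(sorted(p.name for p in mltable_dir.iterdir()))
```

Registering the resulting folder with `AssetTypes.MLTABLE` then uploads the definition and the data together.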






In [None]:
from azure.ai.ml import MLClient, Input
from azure.ai.ml.automl import classification
from azure.ai.ml.constants import AssetTypes
from azure.identity import DefaultAzureCredential
import time

# Connect to ML workspace
ml_client = MLClient.from_config(credential=DefaultAzureCredential())

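# Load training data (MLTable). The pinned version must exist in the workspace;
# the registration cell above creates version "3", so adjust this if version 2 is absent.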
my_training_data = Input(
    type=AssetTypes.MLTABLE,
    path="azureml:cyber_intrusion_dataset_mltable:2"
)

# Define AutoML classification job
automl_classification_job = classification(
    training_data=my_training_data,
    target_column_name="attack_detected",
    compute="cpucluster01",
    experiment_name="cyber_intrusion_detection_automl",
    primary_metric="accuracy",
    n_cross_validations=5,
    enable_model_explainability=True,
    tags={"training_type": "automl", "attack_detection": "cybersecurity"}
)

# Optional: Training config
automl_classification_job.set_training(
    enable_stack_ensemble=True,
    enable_vote_ensemble=True
)

# Limits
automl_classification_job.set_limits(
    max_trials=20,
    max_concurrent_trials=4,
    timeout_minutes=180,
    enable_early_termination=True
)

# Submit the job
returned_job = ml_client.jobs.create_or_update(automl_classification_job)

# Print job info
print(f"Submitted AutoML job: {returned_job.name}")
print(f"Monitor in Azure ML Studio: {returned_job.studio_url}")

# Optional: Live monitoring
while True:
    job = ml_client.jobs.get(returned_job.name)
    print(f"Job status: {job.status}")
    if job.status in ['Completed', 'Failed', 'Canceled']:
        break
    time.sleep(60)
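
The polling loop above runs until the job reaches a terminal state. A variant with an overall timeout can be sketched as follows. `wait_for_status` is a hypothetical helper that accepts any zero-argument function returning the current status, for example `lambda: ml_client.jobs.get(returned_job.name).status`. The SDK also offers `ml_client.jobs.stream(job_name)`, which blocks and streams logs until the job finishes.

```python
import time

TERMINAL_STATES = {"Completed", "Failed", "Canceled"}


def wait_for_status(get_status, poll_seconds=60.0, timeout_seconds=3 * 60 * 60, sleep=time.sleep):
    """Poll get_status() until a terminal state is reached or the timeout expires."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = get_status()
        if status in TERMINAL_STATES:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Job still in state {status!r} after {timeout_seconds} seconds")
        sleep(poll_seconds)


# Demo with a fake status source instead of a real job.
fake_statuses = iter(["Running", "Running", "Completed"])
final_status = wait_for_status(lambda: next(fake_statuses), poll_seconds=0)
print(final_status)
```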






# Step 8: Register the Best Model

Following the completion of the AutoML experiment, the best-performing model from the run was registered in Azure Machine Learning. This step ensures the model is versioned, reproducible, and ready for deployment or integration into downstream pipelines.

> **Note:** The job name used for registration was taken from a previously successful AutoML run. The job ID was retrieved from the Jobs section of Azure ML Studio.

By registering the model:
- The asset becomes accessible in the Azure ML model registry  
- Model versioning and lifecycle management are enabled  
- Future steps such as deployment, batch scoring, or monitoring can reference the registered version directly  
- CI/CD and MLOps pipelines can leverage the model consistently across environments


In [2]:
# Step 8: Register the Best AutoML Model

# Import necessary libraries
import mlflow
from azure.ai.ml import MLClient
from azure.ai.ml.entities import Model
from azure.ai.ml.constants import AssetTypes
from azure.identity import DefaultAzureCredential
from azureml.core import Workspace
from mlflow.tracking import MlflowClient

# Connect to Azure ML Workspace
credential = DefaultAzureCredential()
ml_client = MLClient.from_config(credential=credential)

# Retrieve the AutoML parent run name from earlier step
parent_run_id = "amusing_skin_nw68t1k275"

# Point MLflow at the workspace tracking URI before creating the client
ws = Workspace.from_config()
mlflow.set_tracking_uri(ws.get_mlflow_tracking_uri())
mlflow_client = MlflowClient()

# Get the parent AutoML run
parent_run = mlflow_client.get_run(parent_run_id)

# Extract best child run ID
best_child_run_id = parent_run.data.tags.get("automl_best_child_run_id")
if not best_child_run_id:
    raise ValueError("Best child run ID not found. Ensure AutoML run completed successfully.")

print(f"Best child run ID: {best_child_run_id}")

# Define model path in outputs
model_path = f"azureml://jobs/{best_child_run_id}/outputs/artifacts/outputs/mlflow-model/"

# Create and register the model
model = Model(
    path=model_path,
    name="cyber_intrusion_model",
    description="Best model from AutoML run for cybersecurity intrusion detection",
    type=AssetTypes.MLFLOW_MODEL,
)

registered_model = ml_client.models.create_or_update(model)

print(f"Model registered: {registered_model.name}, version: {registered_model.version}")






# Step 9: Review and Track Model Performance

Before deploying the model, we review its performance, including metrics like precision, recall, and accuracy. We also explore explainability insights to validate model behavior.


In [3]:
from azure.ai.ml import MLClient
from azure.identity import DefaultAzureCredential
from mlflow.tracking import MlflowClient
from azureml.core import Workspace
import mlflow

# Connect to ML Workspace
ml_client = MLClient.from_config(credential=DefaultAzureCredential())
ws = Workspace.from_config()
mlflow.set_tracking_uri(ws.get_mlflow_tracking_uri())
mlflow_client = MlflowClient()

# Define parent and best child run
parent_run_id = "amusing_skin_nw68t1k275"
parent_run = mlflow_client.get_run(parent_run_id)
best_child_run_id = parent_run.data.tags["automl_best_child_run_id"]

# Load best run metrics
best_run = mlflow_client.get_run(best_child_run_id)
metrics = best_run.data.metrics
print("Best model metrics:")
for k, v in metrics.items():
    print(f"{k}: {v}")

# Open in Azure ML Studio (AutoML Run link)
print("\nView full experiment in Azure ML Studio:")
print(f"https://ml.azure.com/runs/{parent_run_id}?wsid=/subscriptions/{ml_client.subscription_id}/resourceGroups/{ml_client.resource_group_name}/workspaces/{ml_client.workspace_name}")






## Next Steps: MLflow Batch Inference (Continued)

To streamline the batch inference process and align with Azure Machine Learning best practices, the work continues in a separate notebook that adopts the MLflow format for batch deployments.

### Transition Overview

The following artifacts have been prepared and structured to support this transition:

- `mlflow-for-batch-tabular.ipynb`  
  Contains the implementation for batch inference using MLflow-based deployment and scoring.

- `environment/`  
  Defines the environment configuration using a `conda` environment file (e.g., `env.yml`) including MLflow, scikit-learn, and Azure ML SDK dependencies.

- `code/`  
  Contains the `score.py` file structured with `init()` and `run()` functions for use in MLflow batch deployments.

- `named-outputs/score/`  
  Folder designated for storing batch inference outputs, such as the generated `predictions.csv`.

- `cybersecurity_intrusion_data_test.csv`  
  Sample dataset used to test batch inference jobs.

### Next Notebook

Please proceed to the notebook:

`mlflow-for-batch-tabular.ipynb`

This notebook walks through:
- Registering or referencing an MLflow model
- Creating a batch endpoint and deployment
- Submitting a batch inference job
- Verifying and reviewing output predictions

By structuring the process this way, we ensure separation between exploratory analysis and production inference workflows, promoting modularity and scalability in the MLOps lifecycle.


## Attempt to Deploy a Real-Time Endpoint (Optional Exploration)

As part of exploring different deployment strategies, an attempt was made to deploy the model to an **Azure ML managed online endpoint** to evaluate real-time inference capabilities.

### Rationale for Real-Time Endpoint Exploration
While the core use case and resource constraints justified batch inference, deploying to a real-time endpoint offered a chance to:
- Experiment with REST-based, low-latency predictions  
- Simulate production-grade API integration scenarios  
- Benchmark deployment and containerization processes on Azure

### Outcome & Limitation
The deployment was initially successful in terms of endpoint provisioning (`cyber-intrusion-endpoint-v1`), but failed during the container startup phase with the following error:

```
(ResourceNotReady) User container has crashed or terminated.
Minimum recommended compute SKU is Standard_DS3_v2 for general-purpose online endpoints.
```


The current workspace was using `Standard_DS2_v2`, which is part of the **student subscription** and may not meet the baseline compute requirements for real-time scoring containers. As a result, the scoring container failed to initialize and crashed during deployment.

### Cleanup
After the failure, the endpoint was deleted programmatically to avoid paying for unused resources.

---

### Note for Future Iterations
Real-time endpoints are highly recommended in production settings where low-latency, event-driven predictions are required—especially for cybersecurity systems that need to trigger immediate responses.

For users with access to **higher-tier Azure subscriptions**, it's encouraged to:
- Use `Standard_DS3_v2` or higher SKUs  
- Explore **managed online endpoints** with autoscaling and authentication features  
- Benchmark response latency vs. cost trade-offs
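
Once an online endpoint is healthy, it is called over HTTPS with the key passed in an `Authorization: Bearer` header. The sketch below only builds the request and does not send it. The scoring URI, the key, and the payload shape are placeholders, since the payload format depends on how `score.py` parses its input; `build_scoring_request` is a hypothetical helper.

```python
import json
import urllib.request


def build_scoring_request(scoring_uri, api_key, records):
    """Build (but do not send) a POST request for a key-authenticated endpoint."""
    body = json.dumps({"data": records}).encode("utf-8")  # payload shape is an assumption
    return urllib.request.Request(
        scoring_uri,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


request = build_scoring_request(
    "https://example.invalid/score",  # placeholder for endpoint.scoring_uri
    "<endpoint-key>",                 # placeholder for the primary key
    [{"login_attempts": 4, "failed_logins": 2}],
)
print(request.get_method(), request.full_url)
# Sending it would be: urllib.request.urlopen(request, timeout=30)
```

Timing repeated calls built this way is one simple route to the latency benchmark mentioned above.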

---

Although this part of the project did not succeed under current constraints, it provided valuable insights into environment compatibility and production-readiness considerations in Azure ML.



In [29]:
from azure.ai.ml import MLClient
from azure.ai.ml.entities import CodeConfiguration, ManagedOnlineDeployment, ManagedOnlineEndpoint
from azure.core.exceptions import ResourceNotFoundError
from azure.identity import DefaultAzureCredential

# Initialize the MLClient
ml_client = MLClient.from_config(credential=DefaultAzureCredential())

# Configuration
endpoint_name = "cyberintrusionendpointv1"
deployment_name = "blue"
model_name = "cyber_intrusion_model"
model_version = "11"

# Use curated AzureML environment to avoid image build issues
env = ml_client.environments.get(
    name="AzureML-sklearn-1.0-ubuntu20.04-py38-cpu",
    version="1"
)


try:
    # Verify model exists
    model = ml_client.models.get(name=model_name, version=model_version)
    print(f"Found model: {model.name} version {model.version}")

    # Delete existing endpoint if present
    try:
        existing = ml_client.online_endpoints.get(name=endpoint_name)
        print(f"Deleting existing endpoint: {endpoint_name}")
        ml_client.online_endpoints.begin_delete(name=endpoint_name).result()
    except ResourceNotFoundError:
        print(f"No existing endpoint found with name: {endpoint_name}")

    # Create endpoint
    print("Creating endpoint...")
    endpoint = ManagedOnlineEndpoint(
        name=endpoint_name,
        description="Real-time endpoint for cybersecurity intrusion detection",
        auth_mode="key",
        tags={
            "project": "cyber_intrusion",
            "type": "realtime",
            "model_name": model_name,
            "model_version": model_version
        }
    )
    ml_client.online_endpoints.begin_create_or_update(endpoint).result()
    print(f"Endpoint {endpoint_name} created successfully")

    # Create deployment with curated environment
    print("Creating deployment...")
    deployment = ManagedOnlineDeployment(
        name=deployment_name,
        endpoint_name=endpoint_name,
        model=model,
        environment=env,
        code_configuration=CodeConfiguration(code="./", scoring_script="score.py"),
        instance_type="Standard_DS3_v2",
        instance_count=1
    )

    ml_client.online_deployments.begin_create_or_update(deployment).result()
    print(f"Deployment {deployment_name} created successfully")

    # Update traffic
    endpoint.traffic = {deployment_name: 100}
    ml_client.online_endpoints.begin_create_or_update(endpoint).result()
    print("Traffic allocation updated")

    # Output endpoint info
    endpoint = ml_client.online_endpoints.get(endpoint_name)
    print("\nEndpoint Details:")
    print(f"Name: {endpoint.name}")
    print(f"State: {endpoint.provisioning_state}")
    print(f"URI: {endpoint.scoring_uri}")

    # Retrieve primary key
    key = ml_client.online_endpoints.get_keys(endpoint_name).primary_key
    print("\nAuthentication key retrieved successfully")

except Exception as e:
    print(f"\nError: {str(e)}")
    print(f"Error type: {type(e).__name__}")
    print(f"Full error details: {e.__dict__}")
    try:
        print(f"\nAttempting to clean up endpoint {endpoint_name}")
        ml_client.online_endpoints.begin_delete(name=endpoint_name).result()
        print("Cleanup completed")
    except Exception as cleanup_error:
        print(f"Cleanup error: {str(cleanup_error)}")


Found the config file in: /config.json


Found model: cyber_intrusion_model version 11
No existing endpoint found with name: cyberintrusionendpointv1
Creating endpoint...
Endpoint cyberintrusionendpointv1 created successfully
Creating deployment...


Check: endpoint cyberintrusionendpointv1 exists
Uploading code (21.44 MBs): 100%|██████████| 21440276/21440276 [00:00<00:00, 21612867.54it/s]





Error: (BadRequest) The request is invalid.
Code: BadRequest
Message: The request is invalid.
Exception Details:	(InferencingClientCallFailed) {"error":{"code":"Validation","message":"{\"errors\":{\"VmSize\":[\"Not enough quota available for Standard_DS3_v2 in SubscriptionId 43dae9af-3755-421b-bfae-29b91f9e85dd. Current usage/limit: 2/6. Additional needed: 8 Please see troubleshooting guide, available here: https://aka.ms/oe-tsg#error-outofquota\"]},\"type\":\"https://tools.ietf.org/html/rfc9110#section-15.5.1\",\"title\":\"One or more validation errors occurred.\",\"status\":400,\"traceId\":\"00-f2a5699fa6470ff757e34bab0327809c-98a4b4a33421a756-01\"}"}}
	Code: InferencingClientCallFailed
	Message: {"error":{"code":"Validation","message":"{\"errors\":{\"VmSize\":[\"Not enough quota available for Standard_DS3_v2 in SubscriptionId 43dae9af-3755-421b-bfae-29b91f9e85dd. Current usage/limit: 2/6. Additional needed: 8 Please see troubleshooting guide, available here: https://aka.ms/oe-tsg#e