# Running a Script with a Custom Environment
In this notebook, I run a Python script as an Azure Machine Learning job using a custom environment.

In [1]:
# !pip install azure-ai-ml
# !pip install xgboost

In [1]:
pip show azure-ai-ml

Name: azure-ai-ml
Version: 1.16.1
Summary: Microsoft Azure Machine Learning Client Library for Python
Home-page: https://github.com/Azure/azure-sdk-for-python
Author: Microsoft Corporation
Author-email: azuresdkengsysadmins@microsoft.com
License: MIT License
Location: /anaconda/envs/azureml_py310_sdkv2/lib/python3.10/site-packages
Requires: azure-common, azure-core, azure-mgmt-core, azure-storage-blob, azure-storage-file-datalake, azure-storage-file-share, colorama, isodate, jsonschema, marshmallow, msrest, opencensus-ext-azure, opencensus-ext-logging, pydash, pyjwt, pyyaml, strictyaml, tqdm, typing-extensions
Required-by: 
Note: you may need to restart the kernel to use updated packages.


In [1]:
from azure.ai.ml import MLClient
from azure.identity import DefaultAzureCredential

My subscription ID, resource group, and workspace name are all stored in a file called config.py. Make sure you put your own values in that file before running the cells below.
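A minimal sketch of what config.py might contain. The three attribute names match what the MLClient call below reads; the values are placeholders (assumptions for illustration, not real identifiers) and should be replaced with your own.

```python
# config.py (sketch): placeholder values only, replace with your own.
# The attribute names mirror those used in this notebook:
# config.subscription_id, config.resource_group, config.workspace.
subscription_id = "<your-subscription-id>"
resource_group = "<your-resource-group>"
workspace = "<your-workspace-name>"
```

If the file holds real identifiers, it is probably best kept out of version control.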

In [3]:
import config

In [4]:
ml_client = MLClient(
    DefaultAzureCredential(), config.subscription_id, config.resource_group, config.workspace
)

## Training a Model
We will train a simple classification model using scikit-learn's LogisticRegression. The first line of the cell, %%writefile, saves the rest of the cell to a file called titanic_is_back.py.

In [6]:
%%writefile titanic_is_back.py

# Importing the libraries

import pandas as pd
import numpy as np

from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder
from sklearn.metrics import accuracy_score
from sklearn.linear_model import LogisticRegression

# Reading the dataset from the internet
df = pd.read_csv('https://web.stanford.edu/class/archive/cs/cs109/cs109.1166/stuff/titanic.csv')

# I checked the dataset beforehand; apart from a few exceptions, it needs no preprocessing.
# We will go directly to training the model.

df = pd.get_dummies(df, columns = ['Sex'], dtype=int)
le = LabelEncoder()
df['Survived'] = le.fit_transform(df['Survived'])

X = df.drop(['Survived', 'Name'], axis=1)
y = df['Survived']

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=19)

# Create an instance of LogisticRegression
model = LogisticRegression()

# Fit the model to the training data
model.fit(X_train, y_train)

y_pred = model.predict(X_test)

print('Accuracy score is ', accuracy_score(y_test, y_pred))

# Two hand-made passengers to try out prediction; the values must follow the column order of X
sample = np.array([[2,30,1,2,40,0,1],[3,30,1,2,40,1,0]])
print(sample.shape)

# sample = np.reshape(sample, (-1, 1))
# print(sample.shape)

print(model.predict(sample))



Overwriting titanic_is_back.py


## Creating a Custom Environment
I create a custom environment from scratch: I start with the Python version and add only the libraries I need. The environment is then saved to a YAML file called titanic_env.yml. I usually prepare the environment in the terminal of my compute, but this is not the only way.
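As one alternative to the terminal, the YAML file can also be written from Python. The sketch below is only an illustration: the Python version, the channel, and the package list are assumptions that cover what titanic_is_back.py imports (pandas, numpy, scikit-learn), and the actual titanic_env.yml used here may contain more. The helper name `write_conda_spec` is hypothetical.

```python
from pathlib import Path

# Hypothetical minimal Conda specification. The Python version, channel and
# package list are assumptions, not the author's actual titanic_env.yml.
CONDA_SPEC = """\
name: titanic_env
channels:
  - conda-forge
dependencies:
  - python=3.10
  - pip
  - pip:
      - pandas
      - numpy
      - scikit-learn
"""


def write_conda_spec(path="titanic_env.yml"):
    """Write the Conda specification to disk and return the path."""
    target = Path(path)
    target.write_text(CONDA_SPEC)
    return target
```

Calling `write_conda_spec()` in the notebook's working directory would produce the file that `conda_file="./titanic_env.yml"` points to in the next cell.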

In [6]:
# Creating a custom environment.

from azure.ai.ml.entities import Environment

env_docker_conda = Environment(
    image="mcr.microsoft.com/azureml/openmpi3.1.2-ubuntu18.04",
    conda_file="./titanic_env.yml",
    name="titanic_env",
    description="Environment created from a Docker image plus Conda environment.",
)
ml_client.environments.create_or_update(env_docker_conda)

Environment({'arm_type': 'environment_version', 'latest_version': None, 'image': 'mcr.microsoft.com/azureml/openmpi3.1.2-ubuntu18.04', 'intellectual_property': None, 'is_anonymous': False, 'auto_increment_version': False, 'auto_delete_setting': None, 'name': 'titanic_env', 'description': 'Environment created from a Docker image plus Conda environment.', 'tags': {}, 'properties': {'azureml.labels': 'latest'}, 'print_as_yaml': False, 'id': '/subscriptions/a54b1e51-86a2-4073-b2a5-1a79c43cf955/resourceGroups/model_dep/providers/Microsoft.MachineLearningServices/workspaces/ml-workspace/environments/titanic_env/versions/9', 'Resource__source_path': '', 'base_path': '/mnt/batch/tasks/shared/LS_root/mounts/clusters/sckaraman1/code/Users/sckaraman', 'creation_context': <azure.ai.ml.entities._system_data.SystemData object at 0x7f69012b96c0>, 'serialize': <msrest.serialization.Serializer object at 0x7f69012b97e0>, 'version': '9', 'conda_file': {'channels': ['defaults'], 'dependencies': ['_libgcc_

## Submitting the Job
Finally, I submit the script as a command job that runs on my compute with the custom environment.

In [13]:
from azure.ai.ml import command

# configure job
job = command(
    code="./",
    command="python titanic_is_back.py",
    environment="titanic_env:7",  # name:version; pinned to version 7, while the cell above registered version 9
    compute="sckaraman1",
    display_name="titanic-lives",
    experiment_name="titanic-lives-1"
)  

# submit job
returned_job = ml_client.create_or_update(job)
aml_url = returned_job.studio_url
print("Monitor your job at", aml_url)

Monitor your job at https://ml.azure.com/runs/honest_farm_dtw8lw08xn?wsid=/subscriptions/a54b1e51-86a2-4073-b2a5-1a79c43cf955/resourcegroups/model_dep/workspaces/ml-workspace&tid=2e5df792-f2e7-43fc-954b-c3dd8d56a02d
