# Create an empty Ubuntu VM on Azure

Install HAGrid

```
pip install hagrid
```

__NOTE__: In case some VMs have issues, provision a few extra. Here we use `--node_count 12` for a session of 10 users.

Run hagrid launch with these arguments:

```
hagrid launch to azure --image_name=domain_0.7.0 --jupyter --ansible_extras="install=false" --node_count 12
```

- Use a new, unique resource group for this session, like `aa-test-1`
- Choose the location closest to your demo participants, e.g. `eastus`, `westus`
- Choose an 8-core machine like `Standard_D8s_v3`
- Set the username to `azureuser`
- Choose password authentication, then enter `n` at the auto-generate password prompt
- Set an easy-to-remember password of at least 12 characters, like `Adastrademo2022`
- Whatever you enter for Repo and Branch will be ignored

![ip_address](img/hagrid_bare_vm.png)

When it finishes, you should see this message:

![ip_address](img/hagrid_bare_vm_output.png)

Now run this to get JSON containing the information for all the VMs:

```
cat ~/.hagrid/host_ips.json
```

In [6]:
import os
import json

# paste the path to host ip json here
HOST_IP_PATH = "~/.hagrid/host_ips.json"
HOST_IP_PATH = os.path.expanduser(HOST_IP_PATH)

with open(HOST_IP_PATH) as fp:
    host_ips = json.load(fp)

In [7]:
host_ips

{'host_ips': [{'username': 'azureuser',
   'password': 'Adastrademo2022',
   'ip_address': '20.231.237.145',
   'jupyter_token': '553skvdcvowca7gokxsk9ii3oq7pt2kwbos9qb7ynhu4717f'},
  {'username': 'azureuser',
   'password': 'Adastrademo2022',
   'ip_address': '20.231.237.252',
   'jupyter_token': 'nm9m0xavqq61nuxgn2ca1890rmgjn4a3twzlgnylsk215gdn'},
  {'username': 'azureuser',
   'password': 'Adastrademo2022',
   'ip_address': '20.231.237.103',
   'jupyter_token': '50us5dyi9e0g4g483uf6z8w04vezbka3kjrn7kuhdvyu3v2t'},
  {'username': 'azureuser',
   'password': 'Adastrademo2022',
   'ip_address': '20.231.237.102',
   'jupyter_token': 'dafs5od777totbas9tz35dbvbiye1jyquqf6qtvncyhkmtws'},
  {'username': 'azureuser',
   'password': 'Adastrademo2022',
   'ip_address': '20.231.237.146',
   'jupyter_token': 'hhfnz0jnj1xqfk32x8dqa8g2c0xgb824fleuxa3iv9agbidu'},
  {'username': 'azureuser',
   'password': 'Adastrademo2022',
   'ip_address': '20.231.237.144',
   'jupyter_token': 'qz20q5bvltgvukmrrsjf

In [15]:
# update TOTAL_PARTICIPANTS
# use the total participants not the total machines, e.g. 10 not 12
# as this is used to calculate the data split assignment
TOTAL_PARTICIPANTS = 10

In [16]:
# optionally add names or emails here which will be printed below to help keep track of assignment
participants = [
    "Teo",
    "Ruchi",
    "Kyoko",
    "Ivy",
    "Shubham",
    "Irina",
    "Laura",
    "Ionesio",
    "Ronnie",
    "Rasswanth"
]
print("Total participants:", len(participants))

Total participants: 10
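
Before generating session details, it can help to confirm that the numbers above agree with each other. The following is a minimal sketch, not part of HAGrid or the original notebook: `check_session_config` is a hypothetical helper, and the host count of 12 simply mirrors the `--node_count 12` used earlier.

```python
def check_session_config(total_participants, participants, host_count):
    """Return a list of problems; an empty list means the settings look consistent."""
    problems = []
    # every participant needs a VM of their own
    if total_participants > host_count:
        problems.append(
            f"TOTAL_PARTICIPANTS ({total_participants}) exceeds VM count ({host_count})"
        )
    # the names list is optional, but if given it should match the participant count
    if participants and len(participants) != total_participants:
        problems.append(
            f"{len(participants)} names listed but TOTAL_PARTICIPANTS is {total_participants}"
        )
    return problems


# Illustrative values: 10 participants, 10 names, 12 provisioned VMs
ok_problems = check_session_config(10, ["name"] * 10, 12)
bad_problems = check_session_config(13, ["name"] * 10, 12)
print(ok_problems)
print(bad_problems)
```

In the notebook itself, the arguments would be `TOTAL_PARTICIPANTS`, `participants`, and `len(host_ips["host_ips"])`.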


If you need to re-partition the MedNIST dataset and create new data subsets, switch to the [prepare MedNIST dataset notebook](02-prepare-datasets-MedNIST.ipynb).

In [8]:
import requests

DATASET_INFO_FILEPATH = "https://raw.githubusercontent.com/shubham3121/datasets/main/MedNIST/dataset.json"

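def get_dataset_urls():
    """Return the download URL of every MedNIST data subset listed in dataset.json."""
    data_subset_urls = []

    # fail early on network or HTTP errors instead of parsing an error page
    response = requests.get(DATASET_INFO_FILEPATH, timeout=30)
    response.raise_for_status()
    data_subset_info = response.json()
    DATASET_REPO_URL = "https://media.githubusercontent.com/media/shubham3121/datasets/main/MedNIST/subsets/"

    # each value in dataset.json is the file name of one subset
    for dataset_name in data_subset_info.values():
        url = DATASET_REPO_URL + dataset_name
        data_subset_urls.append(url)

    return data_subset_urls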
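
To see what the URL construction does without a network call, here is a small offline sketch. It assumes `dataset.json` is a JSON object whose values are subset file names, which is what the loop over `.values()` implies; the keys `subset-1` and `subset-2` are made up for illustration, and `build_subset_urls` is a hypothetical helper.

```python
DATASET_REPO_URL = (
    "https://media.githubusercontent.com/media/shubham3121/datasets/main/MedNIST/subsets/"
)


def build_subset_urls(data_subset_info, base_url=DATASET_REPO_URL):
    """Join the base URL with each file name, preserving the mapping's order."""
    return [base_url + file_name for file_name in data_subset_info.values()]


# Assumed shape of dataset.json; the keys here are placeholders
sample_info = {
    "subset-1": "MedNIST-437467c744.pkl",
    "subset-2": "MedNIST-b48a3173fe.pkl",
}

sample_urls = build_subset_urls(sample_info)
for url in sample_urls:
    print(url)
```

Because Python dictionaries preserve insertion order, the n-th URL goes to the n-th participant as long as the JSON file lists the subsets in a stable order.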

In [9]:
import socket


def check_ip_port(host_ip: str, port: int) -> bool:
    """Return True if a TCP connection to host_ip:port succeeds within 2 seconds."""
    try:
        # the context manager closes the socket even if connect_ex raises
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            sock.settimeout(2)
            return sock.connect_ex((host_ip, port)) == 0
    except Exception:
        return False
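
Freshly provisioned VMs may need some time before their services accept connections, so a single probe can report a false negative. The sketch below shows one way to retry a TCP probe a few times. `port_is_open` and `wait_for_port` are hypothetical helpers written for this example, and the demonstration uses a local listening socket instead of a real VM.

```python
import socket
import time


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            sock.settimeout(timeout)
            return sock.connect_ex((host, port)) == 0
    except OSError:
        return False


def wait_for_port(host, port, attempts=5, delay=1.0):
    """Probe host:port up to `attempts` times, sleeping `delay` seconds between tries."""
    for attempt in range(attempts):
        if port_is_open(host, port):
            return True
        if attempt < attempts - 1:
            time.sleep(delay)
    return False


# Local demonstration: listen on an ephemeral port on localhost
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(1)
demo_port = listener.getsockname()[1]

reachable = wait_for_port("127.0.0.1", demo_port, attempts=2, delay=0.1)
listener.close()  # nothing listens on demo_port after this
unreachable = wait_for_port("127.0.0.1", demo_port, attempts=2, delay=0.1)
print(reachable, unreachable)
```

The attempt count and delay are arbitrary defaults; longer values may suit slow-booting machines.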

In [10]:
def get_icon(status: bool) -> str:
    return "✅" if status else "❌"

In [11]:
def check_hosts_ready(host_ips: dict) -> None:
    for host in host_ips["host_ips"]:
        print("-----------------------")
        host_ip = host["ip_address"]
        # make sure the containers are not running
        http_up = check_ip_port(host_ip=host_ip, port=80)
        print(f"{get_icon(not http_up)} Containers Off {host_ip}:80")
        
        # make sure the Jupyter notebook server is up
        jupyter_up = check_ip_port(host_ip=host_ip, port=8888)
        print(f"{get_icon(jupyter_up)} Jupyter Up {host_ip}:8888")
        
        # make sure SSH is up
        ssh_up = check_ip_port(host_ip=host_ip, port=22)
        print(f"{get_icon(ssh_up)} SSH Up {host_ip}:22")

        print()
        all_status = (not http_up) and jupyter_up and ssh_up
        print(f"{get_icon(all_status)} Node {host_ip} Ready!")
        print("-----------------------")
        print()
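
The readiness rule used above is: port 80 closed, port 8888 open, and port 22 open. When there are many VMs, a dictionary of results can be easier to act on than printed output. The following is a sketch under that assumption; `summarize_readiness` and `fake_probe` are hypothetical names, the IP addresses are placeholders, and the probe is injected so the example runs without network access.

```python
def summarize_readiness(host_ips, probe):
    """Map each IP address to True when it satisfies the readiness rule."""
    summary = {}
    for host in host_ips["host_ips"]:
        ip = host["ip_address"]
        containers_off = not probe(ip, 80)
        jupyter_up = probe(ip, 8888)
        ssh_up = probe(ip, 22)
        summary[ip] = containers_off and jupyter_up and ssh_up
    return summary


def fake_probe(ip, port):
    """Stand-in for a real TCP check, backed by a fixed table of open ports."""
    open_ports = {"10.0.0.1": {22, 8888}, "10.0.0.2": {22, 80}}
    return port in open_ports.get(ip, set())


sample_hosts = {"host_ips": [{"ip_address": "10.0.0.1"}, {"ip_address": "10.0.0.2"}]}
summary = summarize_readiness(sample_hosts, fake_probe)
not_ready = [ip for ip, ready in summary.items() if not ready]
print(summary)
print("Not ready:", not_ready)
```

In the notebook, the real port check could be passed as `probe`, for example `lambda ip, port: check_ip_port(host_ip=ip, port=port)`.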

In [8]:
check_hosts_ready(host_ips)

-----------------------
✅ Containers Off 20.85.158.49:80
✅ Jupyter Up 20.85.158.49:8888
✅ SSH Up 20.85.158.49:22

✅ Node 20.85.158.49 Ready!
-----------------------

-----------------------
✅ Containers Off 20.85.155.54:80
✅ Jupyter Up 20.85.155.54:8888
✅ SSH Up 20.85.155.54:22

✅ Node 20.85.155.54 Ready!
-----------------------

-----------------------
✅ Containers Off 20.85.157.249:80
✅ Jupyter Up 20.85.157.249:8888
✅ SSH Up 20.85.157.249:22

✅ Node 20.85.157.249 Ready!
-----------------------

-----------------------
✅ Containers Off 20.85.159.232:80
✅ Jupyter Up 20.85.159.232:8888
✅ SSH Up 20.85.159.232:22

✅ Node 20.85.159.232 Ready!
-----------------------

-----------------------
✅ Containers Off 20.85.159.130:80
✅ Jupyter Up 20.85.159.130:8888
✅ SSH Up 20.85.159.130:22

✅ Node 20.85.159.130 Ready!
-----------------------

-----------------------
✅ Containers Off 20.85.158.23:80
✅ Jupyter Up 20.85.158.23:8888
✅ SSH Up 20.85.158.23:22

✅ Node 20.85.158.23 Ready!
-----------------

In [28]:
def ds_user_credentials(host_ip, participant_number, total_participants):
    creds = {
        "url": f"{host_ip}",
        "name": "Samantha Carter",
        "email": "sam@sg1.net",
        "password": "stargate",
        "dataset_name": f"MedNIST Data {participant_number}/{total_participants}",
    }
    return creds


DS_USER_CREDENTIALS = []


def output_user_details(host_ips: dict, participants: list[str] = []) -> None:
    notebook_path = "adastra/data-owners/01-data-owners-login.ipynb"
    print("===============================")
    print("Ad Astra Demo 1")
    print("===============================")
    print()
    print("Send to each participant")
    print()
    # every participant needs a VM of their own
    if TOTAL_PARTICIPANTS > len(host_ips["host_ips"]):
        raise Exception(
            f"TOTAL_PARTICIPANTS: {TOTAL_PARTICIPANTS} is greater than VM count: {len(host_ips['host_ips'])}"
        )
    partition = 0
    dataset_urls = get_dataset_urls()
    for host in host_ips["host_ips"]:
        partition += 1
        if partition <= len(participants):
            print(f"Hi {participants[partition - 1]},")
        if partition <= TOTAL_PARTICIPANTS:
            print("These are your Session Details:")
        else:
            print("Spare Session Details:")
        print("-------------------------------")
        print(f"Username: {host['username']}")
        print(f"Password: {host['password']}")
        print(f"VM IP Address: {host['ip_address']}")
        if partition <= TOTAL_PARTICIPANTS:
            print(f"📎 MY_DATASET_URL:\n{dataset_urls[partition-1]}")
            print()
            DS_USER_CREDENTIALS.append(
                ds_user_credentials(
                    host["ip_address"], partition, TOTAL_PARTICIPANTS
                )
            )

        print()
        print("👉🏽 Start Here:")
        print(
            f"http://{host['ip_address']}:8888/lab/tree/notebooks/{notebook_path}"
            f"?token={host['jupyter_token']}"
        )
        

        print()

In [29]:
output_user_details(host_ips, participants)

Ad Astra Demo 1

Send to each participant

Hi Teo,
These are your Session Details:
-------------------------------
Username: azureuser
Password: Adastrademo2022
VM IP Address: 20.231.237.145
📎 MY_DATASET_URL:
https://media.githubusercontent.com/media/shubham3121/datasets/main/MedNIST/subsets/MedNIST-437467c744.pkl


👉🏽 Start Here:
http://20.231.237.145:8888/lab/tree/notebooks/adastra/data-owners/01-data-owners-login.ipynb?token=553skvdcvowca7gokxsk9ii3oq7pt2kwbos9qb7ynhu4717f

Hi Ruchi,
These are your Session Details:
-------------------------------
Username: azureuser
Password: Adastrademo2022
VM IP Address: 20.231.237.252
📎 MY_DATASET_URL:
https://media.githubusercontent.com/media/shubham3121/datasets/main/MedNIST/subsets/MedNIST-b48a3173fe.pkl


👉🏽 Start Here:
http://20.231.237.252:8888/lab/tree/notebooks/adastra/data-owners/01-data-owners-login.ipynb?token=nm9m0xavqq61nuxgn2ca1890rmgjn4a3twzlgnylsk215gdn

Hi Kyoko,
These are your Session Details:
-------------------------------
Use

In [30]:
# share this list of credentials with the Data Scientist
DS_USER_CREDENTIALS

[{'url': '20.231.237.145',
  'name': 'Samantha Carter',
  'email': 'sam@sg1.net',
  'password': 'stargate',
  'dataset_name': 'MedNIST Data 1/10'},
 {'url': '20.231.237.252',
  'name': 'Samantha Carter',
  'email': 'sam@sg1.net',
  'password': 'stargate',
  'dataset_name': 'MedNIST Data 2/10'},
 {'url': '20.231.237.103',
  'name': 'Samantha Carter',
  'email': 'sam@sg1.net',
  'password': 'stargate',
  'dataset_name': 'MedNIST Data 3/10'},
 {'url': '20.231.237.102',
  'name': 'Samantha Carter',
  'email': 'sam@sg1.net',
  'password': 'stargate',
  'dataset_name': 'MedNIST Data 4/10'},
 {'url': '20.231.237.146',
  'name': 'Samantha Carter',
  'email': 'sam@sg1.net',
  'password': 'stargate',
  'dataset_name': 'MedNIST Data 5/10'},
 {'url': '20.231.237.144',
  'name': 'Samantha Carter',
  'email': 'sam@sg1.net',
  'password': 'stargate',
  'dataset_name': 'MedNIST Data 6/10'},
 {'url': '20.231.237.101',
  'name': 'Samantha Carter',
  'email': 'sam@sg1.net',
  'password': 'stargate',
  'd
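
One way to hand the credentials list to the Data Scientist is to write it to a JSON file. This is a minimal sketch, not something the original notebook does: `save_credentials` is a hypothetical helper, the output file name is arbitrary, and the sample entry copies the first record shown above.

```python
import json
import tempfile
from pathlib import Path


def save_credentials(credentials, path):
    """Write the credentials list to `path` as indented JSON and return the path."""
    out_path = Path(path).expanduser()
    out_path.write_text(json.dumps(credentials, indent=2))
    return out_path


sample_credentials = [
    {
        "url": "20.231.237.145",
        "name": "Samantha Carter",
        "email": "sam@sg1.net",
        "password": "stargate",
        "dataset_name": "MedNIST Data 1/10",
    }
]

# A temporary directory keeps the demonstration from cluttering the working directory
target = Path(tempfile.gettempdir()) / "ds_user_credentials.json"
saved_path = save_credentials(sample_credentials, target)
loaded = json.loads(saved_path.read_text())
print(saved_path.name, len(loaded))
```

In the notebook, `DS_USER_CREDENTIALS` would be passed in place of `sample_credentials`. The file contains passwords, so it should be shared over a trusted channel.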