# HPE Ezmeral ML Ops - Lab 1
## Intro and Setup of HPE Ezmeral ML Ops environment


**Requirements:**
- HPE Ezmeral Container Platform with HPE Ezmeral ML Ops deployment
- IP address or FQDN of the HPE Ezmeral Container Platform's controller host 
- A tenant member user account (your userID)

**Utilities:**     
- HPE Ezmeral Container Platform UI
- Jupyter Notebook server with a Python kernel installed

**Definitions:**
- **HPE Ezmeral Container Platform** is an enterprise-grade container platform designed to deploy both cloud-native and non-cloud-native applications whether on-premises, at the edge, in multiple public clouds, or in a hybrid model. This makes the HPE Ezmeral Container Platform ideal for helping enterprise customers accelerate their application development and deployment on containers, on-demand through a self-service portal and a RESTful API that surfaces programmable access. To learn more about HPE Ezmeral Container Platform visit the [HPE DEV portal](https://developer.hpe.com) and check out the blog articles.

- **Tenant:** A tenant is a group of users created by the platform administrator. A tenant can represent, for example, an office location, a business unit, an organization, or a project. The platform administrator allocates each tenant a quota of resources (CPU, GPU, memory, storage). The resources used by a tenant are private to that tenant; other tenants have no visibility into them. A tenant user is granted the role of member or admin for the tenant.

- **HPE Ezmeral ML Ops:** HPE Ezmeral Machine Learning (ML) Ops brings the power of containers to the entire ML lifecycle to enable users to build, train, share, deploy, and monitor ML models. See [HPE Ezmeral ML Ops](http://docs.bluedata.com/50_about-hpe-ml-ops). To enable sharing of model and data artifacts, ML Ops sets up a tenant-private persistent data store, the Project repository, when an AI/ML tenant is created.

### Log in to HPE Ezmeral Container Platform

Open a browser tab and log in to [HPE Ezmeral Container Platform](https://77.158.163.130:9911/bdswebui/login/) with your studentID as the username and stuDISCO2020 as the password.

You will be logged into the tenant *HackShack Tenant*, which shows up on the top right before your userID. Ensure that you are **not** connected to k8sHackTenant. 

![Tenant](Pictures/TenantUser.png)




### Familiarize Yourself with the HPE Ezmeral ML Ops UI

In the **Dashboard** screen you can monitor resource usage (compute (CPU/GPU), memory, network, and storage I/O) for the entire ML Ops cluster, for a specific tenant, or for individual notebook and training clusters.

Navigate to the **Project Repository** from the left pane. See [Browsing the project repository](http://docs.bluedata.com/50_browsing-the-project-repository) for the standard repository folders that HPE Ezmeral ML Ops pre-creates for an AI/ML project. Drill down into the code/XGB folder. You will notice two pre-loaded files: **XGB_Income.ipynb** (Python code for the income prediction model) and **XGB_Scoring.py** (scoring script).

#### Data
This tutorial uses the data set available from [1994 Census data](https://archive.ics.uci.edu/ml/datasets/Adult). This data set is a spreadsheet with approximately 32,000 rows of training data that was acquired from the 1994 Census database.

The features (columns) in this spreadsheet that are used to train the model are:

- age 
- workclass
- fnlwgt (final weight: the number of people the census believes this record represents)
- education 
- education_num (number representation of education) 
- marital_status 
- occupation 
- relationship 
- race 
- sex 
- capital_gain 
- capital_loss 
- hours_per_week
- native_country

The last column is not a feature: it indicates the income classification of that individual, which is what the model predicts.
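
To make the column layout concrete, here is a minimal sketch that maps the fields of a row to the feature names listed above, using only the Python standard library. It assumes the CSV files have no header row and that fields are separated by a comma followed by a space; the label name `income`, the helper `read_census_rows`, and the sample row are illustrative assumptions, not taken from the lab files.

```python
import csv
import io

# Column names follow the feature list above; "income" is an assumed
# name for the final label column.
COLUMNS = [
    "age", "workclass", "fnlwgt", "education", "education_num",
    "marital_status", "occupation", "relationship", "race", "sex",
    "capital_gain", "capital_loss", "hours_per_week", "native_country",
    "income",
]


def read_census_rows(text_stream):
    """Yield one dict per well-formed CSV row, keyed by COLUMNS."""
    reader = csv.reader(text_stream, skipinitialspace=True)
    for row in reader:
        if len(row) != len(COLUMNS):
            continue  # skip blank or malformed lines
        yield dict(zip(COLUMNS, row))


# Illustrative row in the assumed layout.
sample = io.StringIO(
    "39, State-gov, 77516, Bachelors, 13, Never-married, Adm-clerical, "
    "Not-in-family, White, Male, 2174, 0, 40, United-States, <=50K\n"
)
rows = list(read_census_rows(sample))
print(rows[0]["education"], rows[0]["income"])
```

To try this on a real file, pass an open file object (for example `open("adult_data.csv", newline="")`) instead of the in-memory sample.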

### Setup

1. Download the two CSV files from the **InputData** folder of your Notebook to your laptop. Click the InputData folder, right-click each CSV data file in the InputData directory, and choose Download. Alternatively, you can double-click adult_data.csv to open the file in a tab in the right pane and choose File->Download to save it to a specific location on your laptop. Repeat this for adult_test.csv.

2. In the HPE Ezmeral Container Platform UI, navigate to the data->UCI_Income folder of the Project repository. Create your own sub-directory with the name of your userID (student##). Select the UCI_Income folder and then press the + control button to create a new folder.

![New Folder](Pictures/NewFolder.png)

3. Upload the input data files to your project repository data folder:
- Click (select) the data folder you created in step 2.
- Click the Upload control button of the Project Repository.
- Select both data files, adult_data.csv and adult_test.csv, from the location on your laptop where you saved them in step 1, and then click Open to begin the upload.

![Upload button](Pictures/DataUpload.png)

4. Download the Lab 2 Python code, 2-WKSHP-HPECP-ModelDevelopment.ipynb, and the scoring script, XGB_Scoring.py, from the **Python Code** folder in the same manner as in step 1. Open the downloaded XGB_Scoring.py in your favorite text editor and replace **userID** with your own student ID (student##).

For example, for student11 the two edited lines read:

final_gb.load_model(saveInProjectRepo('models/XGB_Income/**student11**/XGB.pickle.dat'))

with open(saveInProjectRepo('data/UCI_Income/**student11**/encoding.json'), 'r') as file:
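
If you prefer to script the edit in step 4 instead of using a text editor, the following minimal sketch shows one way to do it. It assumes the placeholder in XGB_Scoring.py is the literal string `userID` and that it occurs only in the repository paths; the helper names `personalize_script` and `personalize_file` are hypothetical. Review the result before uploading, because a plain string replacement would also change any other occurrence of `userID` in the file.

```python
from pathlib import Path


def personalize_script(source_text, student_id, placeholder="userID"):
    """Return the script text with each placeholder replaced by student_id."""
    if placeholder not in source_text:
        raise ValueError(f"placeholder {placeholder!r} not found")
    return source_text.replace(placeholder, student_id)


def personalize_file(path, student_id):
    """Rewrite the file at path in place with the placeholder replaced."""
    script = Path(path)
    script.write_text(personalize_script(script.read_text(), student_id))


# In-memory demonstration on a line in the assumed placeholder form.
example = (
    "final_gb.load_model("
    "saveInProjectRepo('models/XGB_Income/userID/XGB.pickle.dat'))"
)
print(personalize_script(example, "student11"))
```

To apply it to the downloaded file, call `personalize_file("XGB_Scoring.py", "student11")` with your own student ID.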

5. Under the folder code->XGB, create a sub-folder with the name of your userID (student##) as in step 2 above.

6. Navigate to the code->XGB->**your-student ID** folder of the Project repository. Upload the updated XGB_Scoring.py file from your laptop 

![ScoringScript](Pictures/ScoringScript.png)

7. Navigate to the models->XGB_Income folder in the Project Repository, and create a folder with the name of your userID (student##)
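
After steps 2, 5, and 7 you should have three personal folders in the Project repository. As a quick checklist, this small sketch builds the three expected paths from a student ID; the helper `user_folders` is hypothetical and only composes path strings, it does not talk to the platform.

```python
import posixpath


def user_folders(student_id):
    """Return the per-user Project repository folders created in this lab."""
    return {
        "data": posixpath.join("data", "UCI_Income", student_id),
        "code": posixpath.join("code", "XGB", student_id),
        "models": posixpath.join("models", "XGB_Income", student_id),
    }


for purpose, folder in user_folders("student11").items():
    print(f"{purpose}: {folder}")
```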

8. Navigate to the Notebooks screen (bottom left) and click <font color='green'>Create Notebook</font>. Enter "mynotebook_##" as the notebook name, where ## is your student number, enter a short description, select pythonmldl for "Associate with Training Environments", and leave the other settings at their defaults. This association lets your notebook submit model training jobs to a tenant-shared training cluster, which has more resources than your notebook environment.

![Create Notebook](Pictures/CreateNotebook.png)

9. Notebook creation may take 1-2 minutes. Once the status of the notebook is <font color='green'>Ready</font>, click the <font color='green'>"mynotebook_##"</font> link and then click the <font color='green'>NotebookServer</font> link on the right.

In the URL that opens, replace the host name gateway2.etc.fr.comm.hpecorp.net with notebooks.hpedev.io and keep the port number unchanged. The resulting URL looks like, for example, https://notebooks.hpedev.io:10525/

![Notebook Server](Pictures/Notebookserver.png)
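
The host name replacement in step 9 can be illustrated with a short sketch using Python's standard `urllib.parse` module. It keeps the scheme, port, and path of the original URL and swaps only the host name. The function name `swap_host` is hypothetical, and the port 10525 is just the example value from above; yours may differ.

```python
from urllib.parse import urlsplit, urlunsplit


def swap_host(url, new_host="notebooks.hpedev.io"):
    """Return url with its host name replaced, keeping port and path."""
    parts = urlsplit(url)
    # Re-attach the original port, if any, to the new host name.
    netloc = new_host if parts.port is None else f"{new_host}:{parts.port}"
    return urlunsplit(parts._replace(netloc=netloc))


print(swap_host("https://gateway2.etc.fr.comm.hpecorp.net:10525/"))
```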

10. This opens a JupyterHub login screen in a new browser tab. Log in with your userID and password.
![Jupyter Hub](Pictures/Jupyter.png)

11. Click the Upload button on the top right.

12. Upload the Lab 2 Python code, 2-WKSHP-HPECP-ModelDevelopment.ipynb, which you downloaded to your laptop in step 4.
![Lab2 upload](Pictures/Lab2Upload.png)

Now, follow the instructions in Lab 2 to develop and train the model.

## Summary

In this lab, you learned how to navigate to the Project Repository of the tenant, create user-specific folders to store models and data, upload data files to the Project repository, and create your own notebook for model building and training, which is required in the next lab.