In [1]:
# run this to shorten the data import from the files
import os
cwd = os.path.dirname(os.getcwd())+'/'
path_data = os.path.join(os.path.dirname(os.getcwd()), 'datasets/')


# Reasons for tests

Testing is a pivotal aspect of the machine learning lifecycle. Just as developers rigorously test software applications to identify bugs and ensure they function as intended, data scientists and machine learning practitioners must thoroughly test machine learning models. Robust testing ensures that your models are accurate, reliable, and free from unintended biases. Which of the following statements about testing is not correct?

### Possible Answers


    Testing is an important part of pre-deployment and ensures the model's predictions are working as expected.
    
    
    With unittest.TestCase(), you can create a new test case by subclassing TestCase; test methods added should always start with the word "test".
    
    
    Automated tests are run during the deployment phase and not at other points in the machine learning development lifecycle. {Answer}
    
    
    The purpose of a test case in the unittest library might be to verify that the endpoint is correctly returning a prediction when valid input data is given.

**Tests don't have to be confined to deployment. In general, it is helpful to write tests throughout the machine learning development lifecycle to ensure all components and phases are working as intended.**
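
As a rough illustration of testing more than one phase of the lifecycle, the sketch below puts a data check and a prediction check in the same test case. The `toy_predict` function and its threshold are hypothetical stand-ins for a trained model, and the two sample rows reuse only the `age` and `thalach` values from the heart disease data.

```python
import unittest


def toy_predict(row):
    """Hypothetical stand-in for a trained model; the threshold is arbitrary."""
    return 1 if row["thalach"] < 130 else 0


class TestLifecycleStages(unittest.TestCase):
    def setUp(self):
        # Two sample rows (age and thalach only) used by every test method
        self.rows = [{"age": 52, "thalach": 168}, {"age": 70, "thalach": 125}]

    def test_data_has_no_missing_values(self):
        # Pre-modelling phase: the input data should be complete
        for row in self.rows:
            self.assertNotIn(None, row.values())

    def test_prediction_is_binary(self):
        # Inference phase: every prediction should be 0 or 1
        for row in self.rows:
            self.assertIn(toy_predict(row), (0, 1))


# Load and run the test case explicitly, which also works inside a notebook
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestLifecycleStages)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Both method names start with `test`, so the loader picks them up automatically; `result.wasSuccessful()` reports whether every check passed.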

In [2]:
# exercise 01

"""
Writing unit tests

In the previous video on inference testing, you learned about the importance of writing test cases for your trained and evaluated model using the Python unittest library. In this exercise, you will put your new skills to the test by writing a test case for the model to check that it is producing binary outputs as expected. Your trained model is imported, as well as the testing portion of the dataset X_test.
"""

# Instructions

"""
    Define a test case class called TestModelInference that inherits from unittest.TestCase.
---
    Complete the setUp function by assigning X_test as a testcase class attribute.
---
    Define a test called test_prediction_output_values().
---
    Complete the test case by calling model.predict() on X_test; the test then checks that the output values are either 1 or 0.
"""

# solution

import unittest
import numpy as np

# Create a class called TestModelInference
class TestModelInference(unittest.TestCase):
	def setUp(self):
		self.model = model

		# set X_test as a class attribute
		self.X_test = X_test

	# define a test for prediction output values
	def test_prediction_output_values(self):
		print("Running test_prediction_output_values test case")

		# Get model predictions
		y_pred = self.model.predict(self.X_test)
		unique_values = np.unique(y_pred)
		for value in unique_values:
			self.assertIn(value, [0, 1])

#----------------------------------#

# Conclusion

"""
Insightful inference testing! You have written a unittest test case that checks your model's output aligns with expectations; that is, a binary classification of the presence or absence of heart disease. Sanity checks such as this are important for all phases of deployment and can increase overall confidence in predictions for more stable development. Keep up the good work!
"""

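
Defining the test case does not run it. One way to execute it from a notebook is sketched below, under the assumption that a small `StubModel` replaces the trained model and a plain list replaces `X_test` (both are hypothetical, so the example runs without the course data or NumPy).

```python
import unittest


class StubModel:
    """Hypothetical stand-in for the trained classifier."""

    def predict(self, rows):
        # Arbitrary rule so the stub returns a mix of 0s and 1s
        return [1 if row[0] > 55 else 0 for row in rows]


class TestModelInference(unittest.TestCase):
    def setUp(self):
        self.model = StubModel()
        self.X_test = [[52], [61], [70]]

    def test_prediction_output_values(self):
        y_pred = self.model.predict(self.X_test)
        # Every distinct prediction must be 0 or 1
        self.assertTrue(set(y_pred) <= {0, 1})


# Build a suite from the class and run it with a text runner
inference_suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestModelInference)
inference_result = unittest.TextTestRunner(verbosity=2).run(inference_suite)
```

Loading the class explicitly avoids `unittest.main()`, which tries to parse the notebook kernel's command-line arguments.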

# Feature stores vs model registries

Feature stores and model registries are essential components of the machine learning lifecycle. In order to ensure organized, reproducible, and sharable results, you need to keep a comprehensive record of the various resources you have created over time, keeping careful track of how they evolve and change and how this impacts the overall development process. 

![Answer](/home/nero/Documents/Estudos/DataCamp/Python/courses/end-to-end-machine-learning/ch_03-01.png)
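
To make the idea of a model registry more concrete, here is a toy in-memory sketch. It is not the API of any real registry tool: the `ModelRegistry` class, its methods, the model name, and the metric values are all placeholders invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ModelRegistry:
    """Toy in-memory registry; a real one would persist entries and model artifacts."""

    _versions: dict = field(default_factory=dict)

    def register(self, name, metrics, feature_names):
        # Each call appends a new, numbered version for the given model name
        version = len(self._versions.get(name, [])) + 1
        entry = {
            "version": version,
            "metrics": dict(metrics),
            "features": list(feature_names),
        }
        self._versions.setdefault(name, []).append(entry)
        return version

    def latest(self, name):
        return self._versions[name][-1]


registry = ModelRegistry()
# Placeholder metric values, not results from the course
first = registry.register("heart_disease_clf", {"accuracy": 0.0}, ["cp", "thalach"])
second = registry.register(
    "heart_disease_clf", {"accuracy": 0.0}, ["cp", "thalach", "ca", "thal"]
)
latest_entry = registry.latest("heart_disease_clf")
```

Recording the feature names next to each version is what links a registry entry back to the features kept in a feature store.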

In [17]:
from feast import Field, Entity
from feast.types import Int32, Float32

In [18]:
# exercise 02

"""
Defining features for a feature store

Before creating a feature store, you need to ensure that features are formally defined, in order to ensure the feature store knows the relationships, type, and structure of the features to be loaded. In this exercise, you will formally define a number of features in preparation for the creation of a feature store. Field is imported for you from feast.
"""

# Instructions

"""
    Define the cp, thalach, ca, thal features using Feast's Field class.
"""

# solution

# Define entity and selected features
patient = Entity(name="patient", join_keys=["patient_id"])
cp = Field(name='cp', dtype=Float32)
thalach = Field(name='thalach', dtype=Int32)
ca = Field(name='ca', dtype=Int32)
thal = Field(name='thal', dtype=Int32)

#----------------------------------#

# Conclusion

"""
Fearless feature creation! Now that you have formally defined a number of features and their data types, the next step will be to load them into a feature store. This ensures ease of use and reproducibility of results.
"""

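
One reason to declare feature types up front is that incoming records can then be checked against them. The sketch below mimics that idea in plain Python. The `FEATURE_TYPES` dict and the `find_type_mismatches` helper are hypothetical and are not part of Feast; they only mirror the `Float32` and `Int32` choices made in the exercise.

```python
# Declared feature types, mirroring the Field definitions (Float32 -> float, Int32 -> int).
# This dict is illustrative only; it is not a Feast object.
FEATURE_TYPES = {"cp": float, "thalach": int, "ca": int, "thal": int}


def find_type_mismatches(row, feature_types=FEATURE_TYPES):
    """Return the names of features that are missing or have the wrong Python type."""
    mismatches = []
    for name, expected in feature_types.items():
        value = row.get(name)
        # bool is a subclass of int, so it is rejected explicitly
        if isinstance(value, bool) or not isinstance(value, expected):
            mismatches.append(name)
    return mismatches


good_row = {"cp": 0.0, "thalach": 168, "ca": 2, "thal": 3}
bad_row = {"cp": "0", "thalach": 168, "ca": 2}  # cp is a string, thal is missing

print(find_type_mismatches(good_row))  # []
print(find_type_mismatches(bad_row))   # ['cp', 'thal']
```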

In [42]:
import pandas as pd
from feast import FeatureStore, ValueType, FileSource, FeatureView

heart_disease_df = pd.read_csv(path_data+'heart_disease_cleaned_3.csv', parse_dates=['timestamp']).drop(columns=['Unnamed: 0'])
heart_disease_df.rename(columns={'index':'patient'}, inplace=True)
heart_disease_df.head()

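| | patient | age | sex | cp | trestbps | chol | fbs | restecg | thalach | exang | slope | ca | thal | target | timestamp |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 52 | 1 | 0.0 | 125 | 212.0 | 0 | 1.0 | 168 | 0 | 2 | 2 | 3 | 0 | 2024-04-24 13:26:30.410757 |
| 1 | 1 | 53 | 1 | 0.0 | 140 | 203.0 | 1 | 0.0 | 155 | 1 | 0 | 0 | 3 | 0 | 2024-04-24 13:26:30.410757 |
| 2 | 2 | 70 | 1 | 0.0 | 145 | 174.0 | 0 | 1.0 | 125 | 1 | 0 | 0 | 3 | 0 | 2024-04-24 13:26:30.410757 |
| 3 | 3 | 61 | 1 | 0.0 | 148 | 203.0 | 0 | 1.0 | 161 | 0 | 2 | 1 | 3 | 0 | 2024-04-24 13:26:30.410757 |
| 4 | 4 | 62 | 0 | 0.0 | 138 | 294.0 | 1 | 1.0 | 106 | 0 | 1 | 3 | 2 | 0 | 2024-04-24 13:26:30.410757 |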


In [43]:
# exercise 03

"""
Feature store using Feast

To develop effectively throughout the machine learning lifecycle, it is important to maintain detailed and comprehensive records of resources. Feature stores and model registries are examples of helpful resource records in the pre-modelling and modelling phases. In this exercise, you will implement a feature store using Feast. The predefined patient Entity, as well as the cp, thalach, ca, and thal features, have been loaded for you. ValueType, FeatureStore, and FileSource are all imported from feast. heart_disease_df is also imported.
"""

# Instructions

"""
    Define a data source of your heart_disease_df.
---
    Create a Feature View object using the features defined - make sure to pass features in the right order!
---
    Create a Feature Store and apply the features you have defined.
"""

# solution

heart_disease_df.to_parquet(path_data+"heart_disease.parquet")

# Point File Source to the saved file
data_source = FileSource(
    path=path_data+"heart_disease.parquet",
    event_timestamp_column="timestamp",
    created_timestamp_column="created",
)

# Create a Feature View of the features
heart_disease_fv = FeatureView(
    name="heart_disease",
    entities=[patient],
    schema=[cp, thalach, ca, thal],
    source=data_source,
)

# Create a store of the data and apply the features
store = FeatureStore(repo_path=".")
store.apply([patient, heart_disease_fv])

#----------------------------------#

# Conclusion

"""Fantastic feature-storing! You have defined the relationship between a number of features in your heart disease dataset which have been pre-selected by your feature-selection algorithm. Now that the features are saved in a feature store, you have a consistent, reproducible registry to work with for future phases of the machine learning lifecycle.
"""




# Containerization steps

In this video, you covered how to containerize a machine learning model for deployment using Docker. You saw how Docker allows you to package your model and its dependencies into a standalone unit, which can be executed easily in different environments. You also learned about Dockerfiles, the text documents containing the commands that build an image, and the sequence of steps to follow in order to containerize a machine learning model.

![Answer](/home/nero/Documents/Estudos/DataCamp/Python/courses/end-to-end-machine-learning/ch_03-02.png)

# Containerization using Docker

Containers and containerization are some of the most widely used technologies and practices in the software and machine learning world. As the de facto framework for container-based deployment, Docker is a must-learn for any ML practitioner. As such, let's test your working knowledge of Docker with a few questions.

    Select the correct answers.

### Possible Answers

A Dockerfile is a text document that contains all the commands a user could call on the command line to assemble an image.{Answer}


After specifying the base image, Docker is instructed to copy necessary files like the Python script and a requirements.txt file into the Docker image.{Answer}


The tagging of a Docker image is done before building the Docker image.


The Dockerfile should specify the port to run on, define any environment variables, and the command that should run when a container is launched.{Answer}


It is recommended to include sensitive data in your Docker images for quick and easy access, as Docker images are securely packaged to run independently on various platforms.
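
The answers above mention a Python script that is copied into the image, plus a port, environment variables, and a launch command. A minimal sketch of what such a script might look like follows. The `PORT` and `HOST` variable names, the default values, and the `predict` rule are all assumptions made for illustration, not the course's actual inference script.

```python
import json
import os
from http.server import BaseHTTPRequestHandler, HTTPServer


def load_settings(environ=os.environ):
    """Read runtime settings that a Dockerfile could define with ENV (names assumed)."""
    return {
        "host": environ.get("HOST", "0.0.0.0"),
        "port": int(environ.get("PORT", "8080")),
    }


def predict(features):
    """Hypothetical placeholder for the trained model; the rule is arbitrary."""
    return 1 if features.get("thalach", 0) < 120 else 0


def handle_payload(raw_body):
    """Turn a JSON request body into a JSON response body."""
    features = json.loads(raw_body)
    return json.dumps({"prediction": predict(features)}).encode()


class PredictHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        body = handle_payload(self.rfile.read(length))
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def main():
    # Intended as the command the container runs at launch; blocks until stopped
    settings = load_settings()
    HTTPServer((settings["host"], settings["port"]), PredictHandler).serve_forever()


# Local check that exercises the logic without opening a socket
settings = load_settings({"PORT": "5000"})
reply = handle_payload(b'{"thalach": 106}')
```

Reading the port from the environment keeps the script itself unchanged across environments, and no secrets are hard-coded into the file that ends up in the image.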

# Inference using Docker

You have seen how you can use a containerization service like Docker to bundle and run applications. This can be used for a variety of purposes, including ensuring a consistent environment for development and deployment, simplifying dependencies management, and providing a portable solution that can run almost anywhere. This is especially important in healthcare projects with non-technical stakeholders who want to be able to deploy and run models easily and at speed.

![Answer](/home/nero/Documents/Estudos/DataCamp/Python/courses/end-to-end-machine-learning/ch_03-03.png)

# CI/CD principles

Which of the following statements best represent the principles of Continuous Integration and Continuous Deployment (CI/CD) as applied to deploying machine learning models on AWS Elastic Beanstalk?

### Possible Answers

In CI/CD, Continuous Integration involves sending models to production, while Continuous Deployment involves incorporating newly trained machine learning models into a central repository.


In CI/CD, Continuous Integration means incorporating code changes into a central repository, while Continuous Deployment means sending updates or changes in the codebase to production.{Answer}


Continuous Integration and Continuous Deployment are often used together.{Answer}


AWS Elastic Beanstalk does not support the principles of CI/CD for deploying machine learning models.


Kubernetes is an AWS-specific tool for CI/CD management.
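
The relationship between the two halves of CI/CD can be sketched as a simple gate: checks run whenever changes are integrated, and deployment proceeds only if they pass. The function names and the placeholder check below are hypothetical; real pipelines are configured in a CI service rather than written like this.

```python
def continuous_integration(checks):
    """Run every automated check against the integrated code; True if all pass."""
    return all(check() for check in checks)


def continuous_deployment(checks_passed, deploy):
    """Ship the change to production only when integration succeeded."""
    if not checks_passed:
        return "blocked"
    deploy()
    return "deployed"


def prediction_is_binary():
    prediction = 1  # placeholder for a real model call
    return prediction in (0, 1)


deployed_versions = []
status = continuous_deployment(
    continuous_integration([prediction_is_binary]),
    lambda: deployed_versions.append("v1"),
)
print(status)  # deployed
```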

# Deploying a model using AWS EB

Deploying your machine learning model is a crucial step in the end-to-end machine learning process. It enables you to serve the model to end users and other applications, providing valuable insights and predictions based on the healthcare data you have processed and the model you have trained. Deploying your model to a cloud service like Elastic Beanstalk ensures that it is accessible, scalable, and reliable, allowing you to focus on improving the model's performance and integrating it with other systems.

In this exercise, you will put into practice the concepts learned in Chapter 3. Your objective is to deploy your containerized machine learning model using Amazon Web Services' (AWS) Elastic Beanstalk. You will be provided with commands that need to be arranged in the correct order to deploy and view your model.

![Answer](/home/nero/Documents/Estudos/DataCamp/Python/courses/end-to-end-machine-learning/ch_03-04.png)

# Deployment: bringing it all together

Congratulations on making it this far! You have reached the end of the deployment section in our course, and should have a well-rounded, high-level overview of the general steps involved in serving your model to external stakeholders. Remember, the deployment process is multifaceted and could look different for your particular use case - this is just a high-level overview based on our heart-disease study!

![Answer](/home/nero/Documents/Estudos/DataCamp/Python/courses/end-to-end-machine-learning/ch_03-05.png)