# Practical No. 9
**Theory:**  
Machine Learning (ML) deployment is the process of making trained ML models available for use in production environments. The models produced during the development phase are packaged and hosted so that they can serve predictions (inference) on new, unseen data.

Here are some key aspects and considerations in ML deployment:

1. Model Packaging: ML models need to be packaged in a format that can be easily deployed and consumed by the target environment. Common formats include serialized files (e.g., pickle, ONNX), container images (e.g., Docker), or formats specific to an ML deployment platform.

2. Infrastructure and Environment: ML models require appropriate infrastructure and computing resources to run efficiently. This may involve provisioning servers, cloud instances, or using specialized ML deployment platforms that handle scaling, load balancing, and resource management.

3. Deployment Options: ML models can be deployed in various ways, depending on the requirements and constraints of the application. Options include cloud-based deployments (e.g., AWS, Azure, Google Cloud), on-premises deployments, edge deployments (running models on edge devices), or using specialized ML deployment platforms.

4. Model Serving: ML models need to be served so that they can receive input data and return predictions or inference results. This typically involves exposing APIs or endpoints that client applications or systems can call. Common interfaces for serving models include REST APIs over HTTP, gRPC, or custom protocols.

5. Scalability and Performance: ML deployment should consider scalability to handle varying workloads and ensure good performance. This may involve techniques such as load balancing, distributed computing, and optimizing the inference process for efficient resource utilization.

6. Monitoring and Maintenance: Once deployed, ML models need to be monitored to ensure they function correctly, perform well, and produce accurate results. Monitoring may involve tracking metrics like prediction latency, throughput, error rates, and data drift. Regular maintenance and updates may also be required as new data becomes available or the models are improved.

7. Security and Privacy: ML deployment must consider security and privacy aspects to protect sensitive data and prevent unauthorized access or attacks. This may involve securing the model serving endpoints, encrypting data in transit and at rest, and following best practices for handling sensitive information.

ML deployment is a critical step in the ML lifecycle, as it allows organizations to realize the value of their ML models by integrating them into real-world applications and systems. Effective deployment ensures reliable and efficient usage of ML models and enables data-driven decision-making in various domains.
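
The monitoring aspect (item 6) can be sketched with the standard library alone. The example below wraps a prediction function and records request counts, errors, and per-call latency. `MonitoredModel` and `dummy_predict` are hypothetical names introduced for illustration, and the threshold rule stands in for a real trained model.

```python
import statistics
import time
from typing import Callable, List, Sequence


class MonitoredModel:
    """Wraps a predict function and records latency and error counts."""

    def __init__(self, predict_fn: Callable[[Sequence[float]], int]) -> None:
        self._predict_fn = predict_fn
        self.latencies_ms: List[float] = []
        self.request_count = 0
        self.error_count = 0

    def predict(self, features: Sequence[float]) -> int:
        self.request_count += 1
        start = time.perf_counter()
        try:
            return self._predict_fn(features)
        except Exception:
            self.error_count += 1
            raise
        finally:
            # Record latency for successful and failed calls alike
            self.latencies_ms.append((time.perf_counter() - start) * 1000.0)

    def summary(self) -> dict:
        has_data = self.request_count > 0
        return {
            "requests": self.request_count,
            "error_rate": self.error_count / self.request_count if has_data else 0.0,
            "mean_latency_ms": statistics.mean(self.latencies_ms) if has_data else 0.0,
            "max_latency_ms": max(self.latencies_ms, default=0.0),
        }


def dummy_predict(features: Sequence[float]) -> int:
    """Hypothetical stand-in for a real model: thresholds the first feature."""
    if len(features) != 4:
        raise ValueError("expected 4 features")
    return 0 if features[0] < 5.5 else 1


monitored = MonitoredModel(dummy_predict)
monitored.predict([5.1, 3.5, 1.4, 0.2])
monitored.predict([6.7, 3.0, 5.2, 2.3])
try:
    monitored.predict([1.0])  # malformed input, counted as an error
except ValueError:
    pass
print(monitored.summary())
```

In a real deployment these numbers would usually be exported to a metrics system rather than printed, and data drift would need a separate check on the distribution of incoming features.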

## Machine Learning model for a web service
**Theory:**  
A machine learning web service is a service that provides machine learning capabilities over the internet. It allows users to access and utilize machine learning models and algorithms through an application programming interface (API).

Here are some key components and characteristics of a machine learning web service:

1. Model Deployment: The web service hosts trained machine learning models and makes them available for consumption by client applications.

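2. API Interface: It provides a well-defined API that allows clients to interact with the machine learning models. The API is typically exposed over HTTP: prediction requests are usually sent with POST because they carry input data in the request body, while GET, PUT, and DELETE can serve operations such as reading model metadata or managing deployed models.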

3. Input and Output Formats: The web service defines the expected input format for making predictions or running inference on the models. It also specifies the format of the output response, which could be in JSON, XML, or other commonly used formats.

4. Scalability: Machine learning web services are designed to handle a large number of requests concurrently and provide scalability to accommodate varying loads and user demands.

5. Security: The service incorporates security measures to protect the models and the data they process. These may include authentication and authorization mechanisms, encryption, and secure communication protocols such as HTTPS.

6. Versioning and Model Updates: Web services often support versioning to manage changes and updates to the deployed models. This allows for backward compatibility and smooth transitions when introducing new versions of models.

7. Monitoring and Analytics: The web service may provide monitoring and analytics capabilities to track usage statistics, performance metrics, and other relevant information for model evaluation and improvement.

Machine learning web services are widely used in various applications, including predictive analytics, recommendation systems, fraud detection, natural language processing, and image recognition. They enable developers and data scientists to leverage machine learning capabilities without having to worry about the infrastructure and deployment complexities, making it easier to integrate machine learning into their applications.
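
The components listed above (an API endpoint, defined input and output formats, and a version label) can be sketched with Python's standard library. This is a minimal illustration under stated assumptions, not a production server: the `/predict` path, `MODEL_VERSION`, `handle_predict`, and the rule-based `predict_species` stub are all hypothetical, and the stub stands in for a trained model.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Tuple

MODEL_VERSION = "v1"  # hypothetical version label returned with every response
N_FEATURES = 4        # the iris dataset used in this practical has 4 features


def predict_species(features: list) -> int:
    """Hypothetical stand-in for a trained model (a simple petal-length rule)."""
    petal_length = features[2]
    if petal_length < 2.5:
        return 0
    return 1 if petal_length < 4.8 else 2


def is_number(value: object) -> bool:
    # bool is a subclass of int, so it is excluded explicitly
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def handle_predict(body: bytes) -> Tuple[int, dict]:
    """Validate a JSON request body and return (HTTP status, response payload)."""
    try:
        payload = json.loads(body)
    except ValueError:  # covers invalid JSON and undecodable bytes
        return 400, {"error": "body must be valid JSON"}
    features = payload.get("features") if isinstance(payload, dict) else None
    if (not isinstance(features, list) or len(features) != N_FEATURES
            or not all(is_number(v) for v in features)):
        return 400, {"error": f"'features' must be a list of {N_FEATURES} numbers"}
    return 200, {"prediction": predict_species(features), "model_version": MODEL_VERSION}


class PredictHandler(BaseHTTPRequestHandler):
    def do_POST(self) -> None:
        if self.path != "/predict":
            status, response = 404, {"error": "unknown endpoint"}
        else:
            length = int(self.headers.get("Content-Length", 0))
            status, response = handle_predict(self.rfile.read(length))
        data = json.dumps(response).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)


def run_server(port: int = 8000) -> None:
    """Start the service on localhost; blocks until interrupted."""
    HTTPServer(("127.0.0.1", port), PredictHandler).serve_forever()


# The request logic can be exercised without starting a server:
print(handle_predict(b'{"features": [5.1, 3.5, 1.4, 0.2]}'))
print(handle_predict(b'{"features": [1, 2]}'))
```

Calling `run_server()` would start the service locally, after which a client could POST a JSON body of the same shape to `/predict`. Keeping validation in `handle_predict`, separate from the HTTP handler, makes the request logic testable on its own.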

In [3]:
# Import the required libraries
import pandas as pd
from sklearn import datasets, metrics
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Load the iris plants dataset (classification)
iris = datasets.load_iris()
print(iris.target_names)
print(iris.feature_names)

# Separate the feature matrix X from the target vector y
X, y = iris.data, iris.target

# Split into random train and test subsets: 70% training, 30% test.
# No random_state is set, so the split and the accuracy vary between runs.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.30)

# Build a DataFrame of the iris dataset for inspection
data = pd.DataFrame({'sepallength': iris.data[:, 0], 'sepalwidth': iris.data[:, 1],
                     'petallength': iris.data[:, 2], 'petalwidth': iris.data[:, 3],
                     'species': iris.target})
# Print the first 5 rows of the dataset
print(data.head())

# Create a random forest classifier with 100 trees
clf = RandomForestClassifier(n_estimators=100)

# Train the model on the training set
clf.fit(X_train, y_train)

# Predict the classes of the test set
y_pred = clf.predict(X_test)

print()

# Compute accuracy by comparing predictions with the true test labels
print("ACCURACY OF THE MODEL: ", metrics.accuracy_score(y_test, y_pred))

['setosa' 'versicolor' 'virginica']
['sepal length (cm)', 'sepal width (cm)', 'petal length (cm)', 'petal width (cm)']
   sepallength  sepalwidth  petallength  petalwidth  species
0          5.1         3.5          1.4         0.2        0
1          4.9         3.0          1.4         0.2        0
2          4.7         3.2          1.3         0.2        0
3          4.6         3.1          1.5         0.2        0
4          5.0         3.6          1.4         0.2        0

ACCURACY OF THE MODEL:  0.9555555555555556



### Save The Model

In [5]:
import joblib
joblib.dump(clf, 'classifier.pkl')

['classifier.pkl']
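
A saved model is only useful once a serving process loads it again. The sketch below shows the full round trip with `joblib.dump` and `joblib.load`, assuming scikit-learn and joblib are installed. To stay self-contained it trains its own small model and writes to a temporary directory (both are assumptions of this example); in the practical, the file would be the `classifier.pkl` saved above.

```python
import os
import tempfile

import joblib
from sklearn import datasets
from sklearn.ensemble import RandomForestClassifier

# Train a small model so the example runs on its own
iris = datasets.load_iris()
clf = RandomForestClassifier(n_estimators=10, random_state=0)
clf.fit(iris.data, iris.target)

with tempfile.TemporaryDirectory() as tmp_dir:
    model_path = os.path.join(tmp_dir, "classifier.pkl")
    joblib.dump(clf, model_path)

    # In a serving process, only this load step is needed
    loaded_clf = joblib.load(model_path)

# One flower: sepal length, sepal width, petal length, petal width (cm)
sample = [[5.1, 3.5, 1.4, 0.2]]
predicted_index = int(loaded_clf.predict(sample)[0])
predicted_name = str(iris.target_names[predicted_index])
print(predicted_index, predicted_name)

# The reloaded model gives the same predictions as the original
same = bool((clf.predict(iris.data) == loaded_clf.predict(iris.data)).all())
print("round trip preserved predictions:", same)
```

Pickle-based files can execute arbitrary code when loaded, so they should only be loaded from trusted sources. A model should also be loaded with the same scikit-learn version that saved it where possible, since compatibility across versions is not guaranteed.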

## Deploying machine learning models on edge devices as embedded models: TFLite

**Theory:**  
Deploying machine learning models on edge devices as embedded models refers to the process of running trained ML models directly on edge devices such as smartphones, IoT devices, embedded systems, or other resource-constrained hardware. Instead of relying on cloud or remote servers for inference, the models are integrated into the edge devices themselves, allowing for real-time, offline, or low-latency inference.

Here are some key considerations and benefits of deploying ML models as embedded models on edge devices:

1. Reduced Latency: By running ML models directly on edge devices, latency is minimized as there is no need to send data to remote servers for inference. This enables real-time or near-real-time processing, which is crucial for applications where low latency is essential, such as real-time object detection or voice recognition.

2. Privacy and Security: Deploying ML models on edge devices reduces or removes the need to send sensitive data to external servers for processing. Because the data stays on the device, there is less exposure to the risks of transmitting data over networks, which helps maintain privacy and can make it easier to comply with data protection regulations.

3. Offline Capability: Embedded ML models can operate offline without requiring a constant internet connection. This is beneficial in scenarios where internet connectivity is limited, unstable, or expensive. Applications like voice assistants, mobile apps, or industrial IoT devices can continue to function even in offline environments.

4. Bandwidth Optimization: By performing inference on the edge device itself, the amount of data transmitted over the network is significantly reduced. Only relevant or processed results are sent, optimizing bandwidth usage and reducing the load on network infrastructure.

5. Real-time Decision Making: With embedded ML models, edge devices can make autonomous, real-time decisions without relying on cloud or remote servers. This is advantageous in applications where quick decision-making is critical, such as autonomous vehicles, industrial automation, or healthcare devices.

6. Enhanced Reliability: Deploying models on edge devices increases reliability by reducing dependence on external servers. It mitigates issues related to network connectivity, server downtime, or latency fluctuations, ensuring continuous and consistent operation of the ML models.

7. Resource Constraints: Edge devices often have limited computational resources, memory, and power. Optimizing ML models for such devices requires techniques like model compression, quantization, or pruning, which reduce model size and computational requirements while maintaining acceptable accuracy.

Deploying ML models as embedded models on edge devices brings numerous benefits, including reduced latency, improved privacy and security, offline capability, bandwidth optimization, and real-time decision-making. It enables a wide range of applications that require local processing, responsiveness, and autonomy, opening up opportunities for edge computing in various domains.
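
The quantization technique mentioned in item 7 can be illustrated on a plain list of weights. The sketch below applies affine 8-bit quantization, which stores each float as an integer in [0, 255] plus a shared scale and zero point. It shows the general idea only and is not TFLite's actual implementation; the function names and the example weights are made up for illustration.

```python
from typing import List, Tuple


def quantize_uint8(weights: List[float]) -> Tuple[List[int], float, int]:
    """Map floats to integers in [0, 255] using a scale and a zero point."""
    # Extend the range to include 0.0 so that zero is represented exactly
    w_min = min(min(weights), 0.0)
    w_max = max(max(weights), 0.0)
    scale = (w_max - w_min) / 255.0
    if scale == 0.0:
        scale = 1.0  # all weights are zero; any scale works
    zero_point = round(-w_min / scale)
    # Clamp to the valid uint8 range after rounding
    quantized = [min(255, max(0, round(w / scale) + zero_point)) for w in weights]
    return quantized, scale, zero_point


def dequantize(quantized: List[int], scale: float, zero_point: int) -> List[float]:
    """Recover approximate float values from the quantized integers."""
    return [(q - zero_point) * scale for q in quantized]


weights = [-0.82, -0.11, 0.0, 0.27, 0.64, 1.35]  # made-up example values
q, scale, zero_point = quantize_uint8(weights)
restored = dequantize(q, scale, zero_point)
max_error = max(abs(a - b) for a, b in zip(weights, restored))

print("quantized:", q)
print("scale:", scale, "zero point:", zero_point)
print("max round-trip error:", max_error)
print("bytes as float32:", 4 * len(weights), "bytes as uint8:", len(q))
```

Each weight shrinks from 4 bytes to 1 byte, at the cost of a rounding error of about half the scale per weight for values that are not clamped. This is the trade-off between model size and accuracy described above.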

In [6]:
import numpy as np
import tensorflow as tf

# Create a model using high-level tf.keras.* APIs
model = tf.keras.models.Sequential([
    tf.keras.layers.Dense(units=1, input_shape=[1]),
    tf.keras.layers.Dense(units=16, activation='relu'),
    tf.keras.layers.Dense(units=1)
])
model.compile(optimizer='sgd', loss='mean_squared_error')  # compile the model

# Train on three points of the line y = 2x - 1.
# NumPy arrays are used because newer Keras releases can reject plain Python lists here.
x_train = np.array([[-1.0], [0.0], [1.0]])
y_train = np.array([[-3.0], [-1.0], [1.0]])
model.fit(x=x_train, y=y_train, epochs=5)  # train the model
# (to generate a SavedModel) tf.saved_model.save(model, "saved_model_keras_dir")

# Convert the Keras model to the TFLite format
converter = tf.lite.TFLiteConverter.from_keras_model(model)
tflite_model = converter.convert()

# Save the converted model to disk
with open('model.tflite', 'wb') as f:
    f.write(tflite_model)


Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


