# Machine Learning Lifecycle

The machine learning lifecycle is the process of developing, deploying, and maintaining a machine learning model for a specific application. It serves as our basic guideline for building the solutions that our use case requires.

The steps of the machine learning lifecycle are laid out below.

## Problem Identification and Understanding

Problem identification is the first step in the lifecycle. It is crucial because it sets the direction for the project: it involves defining the problem and setting the objectives needed to solve it.

**Define the Problem**: The first task in this stage is to clearly define the problem. This involves understanding what we want to achieve and how a machine learning model can help achieve that goal. In this case, this module guides us toward creating a model that recognizes American Sign Language hand signs shown to a camera.

**Determine the Machine Learning Task**: Once the problem is defined, it's necessary to define the machine learning task based on that problem. This could involve deciding whether the problem is a classification problem, a regression problem, a clustering problem, etc. For our case, the task at hand would be a classification problem.

**Define the Optimization Objective**: The next step is to determine the key business performance metrics that the machine learning model should aim to improve. These metrics should align with the overall objectives. For our use case, if the goal is accurate recognition of the hand sign shown in an image, the optimization objective might be to reduce the model's error rate.
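
As a rough illustration of the error-rate objective mentioned above, the sketch below computes the fraction of misclassified hand signs. The `error_rate` helper and the sample labels are hypothetical and exist only for this example.

```python
import numpy as np


def error_rate(y_true, y_pred):
    """Return the fraction of predictions that differ from the true labels."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    if y_true.shape != y_pred.shape:
        raise ValueError("y_true and y_pred must have the same shape")
    if y_true.size == 0:
        raise ValueError("at least one label is required")
    return float(np.mean(y_true != y_pred))


# Made-up labels for five images (illustration only).
true_signs = ["A", "B", "C", "A", "B"]
predicted_signs = ["A", "B", "A", "A", "C"]

rate = error_rate(true_signs, predicted_signs)
print(f"error rate: {rate:.2f}, accuracy: {1 - rate:.2f}")
```

Accuracy is simply one minus the error rate, so either number can serve as the target agreed on at this stage.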

**Review Data Requirements**: Reviewing data requirements involves determining what data is needed to solve the problem and whether that data is available. This might also involve considering the cost of data acquisition and whether external data sources might improve model performance. For our case, we will need to collect images of American Sign Language hand signs.

## Machine Learning Lifecycle

### Problem Formation and Understanding

* Analyze whether machine learning is a viable solution for the given problem
* Identify the inputs and outputs for the model
* Identify the acceptable accuracy and prediction error you would tolerate from the model

### Data Collection and Preparation

* Identify the source of raw data you would need for the development of your machine learning model
* Allocate time and effort to annotating and wrangling the raw data so that it can be used for training.
* Allocate time and effort to labeling data, removing irrelevant features, discarding outliers, transforming data, and imputing missing values.
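
The preparation steps above can be sketched with Pandas. This is a minimal example under stated assumptions: the column names (`file_id`, `hand_width_px`, `label`) and the values are made up, and the 1.5 × IQR rule is just one common way to flag outliers.

```python
import numpy as np
import pandas as pd

# Made-up raw data: one irrelevant column, one missing value, one outlier,
# and one row without a label.
raw = pd.DataFrame({
    "file_id": [1, 2, 3, 4, 5, 6],
    "hand_width_px": [110.0, 115.0, np.nan, 112.0, 900.0, 108.0],
    "label": ["A", "B", "A", None, "B", "A"],
})

# Rows without a label cannot be used for supervised training.
clean = raw.dropna(subset=["label"])

# Remove a feature that carries no information about the hand sign.
clean = clean.drop(columns=["file_id"])

# Discard outliers with the 1.5 * IQR rule, keeping rows whose value is
# missing so that they can be imputed in the next step.
q1 = clean["hand_width_px"].quantile(0.25)
q3 = clean["hand_width_px"].quantile(0.75)
iqr = q3 - q1
in_range = clean["hand_width_px"].between(q1 - 1.5 * iqr, q3 + 1.5 * iqr)
clean = clean[in_range | clean["hand_width_px"].isna()].copy()

# Impute the remaining missing values with the column median.
median_width = clean["hand_width_px"].median()
clean["hand_width_px"] = clean["hand_width_px"].fillna(median_width)

print(clean)
```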

### Model Training and Testing

* Split the dataset into training, validation, and test sets, for example 80% for training, 10% for validation, and 10% for testing.
* Identify a machine learning algorithm appropriate for the problem.
* Allocate time and effort to iterating and experimenting with the algorithm, fine-tuning the model, evaluating the results, and preparing the model for deployment.
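
As a sketch of how data can be held out for the training, validation, and testing steps above, the snippet below splits a dataset 80/10/10 using scikit-learn's `train_test_split`. The feature matrix and labels are synthetic placeholders, and the 80/10/10 ratio is an assumption for illustration.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in data: 200 samples, 16 features, 4 balanced classes.
rng = np.random.default_rng(0)
X = rng.random((200, 16))
y = np.repeat(np.arange(4), 50)

# First hold out 20% of the data, keeping class proportions (stratify).
X_train, X_rest, y_train, y_rest = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Then split the held-out 20% in half: 10% validation, 10% test.
X_val, X_test, y_val, y_test = train_test_split(
    X_rest, y_rest, test_size=0.5, stratify=y_rest, random_state=0
)

print(len(X_train), len(X_val), len(X_test))
```

The validation set is used while tuning the model, and the test set is kept aside until the final evaluation.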

### Model Deployment and Maintenance

## Identification of Pre-built Models

When developing a model for your problem, you have two choices: **create a model from scratch** or **use a pre-built model**.

**Use of pre-built models**

Using a pre-built model speeds up the development cycle, provided the model was trained on a benchmark dataset similar to your problem. For example, in image classification, a pre-built model can be adapted to a similar problem through transfer learning. This lets you train on your own data on top of the pre-trained model, so that the new model inherits what the pre-built model has already learned.
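
A minimal sketch of transfer learning with Keras follows. It assumes MobileNetV2 as the pre-built model and 224 × 224 RGB inputs; the choice of base model, the `build_transfer_model` helper, and the number of classes are all assumptions for illustration, not a prescribed setup.

```python
import tensorflow as tf


def build_transfer_model(num_classes, input_shape=(224, 224, 3), weights="imagenet"):
    """Stack a new classification head on top of a frozen pre-trained base."""
    base = tf.keras.applications.MobileNetV2(
        input_shape=input_shape,
        include_top=False,  # drop the original ImageNet classifier
        weights=weights,    # "imagenet" downloads pre-trained weights
    )
    base.trainable = False  # keep what the base has already learned

    inputs = tf.keras.Input(shape=input_shape)
    # MobileNetV2 expects pixel values scaled from [0, 255] to [-1, 1].
    x = tf.keras.layers.Rescaling(1.0 / 127.5, offset=-1.0)(inputs)
    x = base(x, training=False)
    x = tf.keras.layers.GlobalAveragePooling2D()(x)
    # Only this new layer is trained on your own data.
    outputs = tf.keras.layers.Dense(num_classes, activation="softmax")(x)

    model = tf.keras.Model(inputs, outputs)
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model
```

For example, `build_transfer_model(num_classes=26)` would build a classifier with one output per letter, which you could then train on your own images with `model.fit`.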

Several websites and organizations offer pre-built models, either for a price or for free under an open license such as Apache 2.0. Examples include **AWS Marketplace**, **ModelZoo**, and **Hugging Face**.

## Machine Learning Model Training Tools

If no pre-built model fits our problem, you can build your own custom model from scratch. Once you have collected the dataset, make sure before you start that your hardware can support the computationally intensive training process. Typical laptops and desktop computers are often unable to handle the load needed to train a model.

To solve this, you can rent GPU power from cloud services such as **Amazon Web Services** and **Google Colab**.

You can develop your own custom model in Python using Jupyter Notebook. The following libraries are useful during development:

**Pandas and NumPy libraries** for accessing and modifying tabular data structures and n-dimensional arrays, and for performing exploratory data analysis. Pandas also lets you read CSV, JSON, and TSV data files.
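
The short sketch below shows the kind of work these two libraries do. The CSV content and column names are made up; for a TSV file you would pass `sep="\t"` to `pd.read_csv`, and for JSON you would use `pd.read_json`.

```python
import io

import numpy as np
import pandas as pd

# Made-up CSV describing an image dataset (an in-memory stand-in for a file).
csv_text = """image_path,label,width,height
img_001.png,A,64,64
img_002.png,B,64,64
img_003.png,A,128,128
img_004.png,C,64,64
"""

df = pd.read_csv(io.StringIO(csv_text))

# Quick exploratory checks: summary statistics and images per class.
print(df.describe())
label_counts = df["label"].value_counts()
print(label_counts)

# NumPy holds the n-dimensional data itself, for example a batch of
# 64 x 64 RGB images with shape (batch, height, width, channels).
batch = np.zeros((len(df), 64, 64, 3), dtype=np.float32)
print(batch.shape)
```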

**Matplotlib and Seaborn libraries** for the data visualization phase, which requires plotting charts and graphs.
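
As a small example of this phase, the sketch below draws a bar chart of images per class using Matplotlib only. The class names and counts are made up, and the chart is written to an in-memory buffer so that the example does not depend on a display or on file paths.

```python
import io

import matplotlib

matplotlib.use("Agg")  # render without a display, e.g. on a remote server
import matplotlib.pyplot as plt

# Made-up class counts (illustration only).
labels = ["A", "B", "C"]
counts = [120, 95, 110]

fig, ax = plt.subplots(figsize=(4, 3))
bars = ax.bar(labels, counts)
ax.set_xlabel("Hand sign")
ax.set_ylabel("Number of images")
ax.set_title("Images per class")
fig.tight_layout()

# Save the chart as PNG bytes; pass a file name instead to write to disk.
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
plt.close(fig)
```

In a Jupyter Notebook the figure is shown inline, so the backend line and the buffer can be dropped.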

**Scikit-learn, TensorFlow, MXNet, PyTorch,** and **Keras** framework libraries for the actual training of the model.
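
To give a feel for what training looks like, here is a minimal scikit-learn sketch. It uses the small handwritten-digits dataset bundled with scikit-learn as a stand-in for hand sign images, and logistic regression as an assumed baseline classifier; a real hand sign model would more likely use a deep learning framework.

```python
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# 8 x 8 grayscale digit images, flattened to 64 features per sample.
X, y = load_digits(return_X_y=True)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Scale the features, then fit a multi-class logistic regression.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)

# Evaluate on data the model has not seen during training.
accuracy = model.score(X_test, y_test)
print(f"test accuracy: {accuracy:.3f}, error rate: {1 - accuracy:.3f}")
```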