# Machine Learning Project Pipeline

- **Define the problem and its scope**: Identify the problem you are facing and the outcome you need. This is the first and most important step of an ML project. A well-defined problem sets the direction and scope of the entire project and avoids wasting time and money on irrelevant tasks.
- **Preliminary Study**: Ask a few questions before starting the ML project. 
    - Do you need a machine learning model to solve this problem? 
    - Is there an out-of-the-box solution, or do you need to build your own from scratch?
    - Do you have the required data? Is there enough of it to support the model? Is its quality good enough?
    - Do you have the required technologies? 
    - What is the budget and timeline for this project?
- **Data Collection**: Collect the required data from the relevant data sources and store it securely.
- **Data Cleaning and Feature Engineering**: Remove outliers, impute missing values, and apply any other transformations needed to make the data suitable for modeling.
- **Model Training**: Choose appropriate algorithms/methods and train the model on the training set. 
- **Model Validation and Hyperparameter Tuning**: Evaluate model performance on the validation set and adjust the hyperparameters to improve it.
- **Model Evaluation**: Evaluate the final model on the test set, which was not used for training or tuning, to estimate how it performs on unseen data.
- **Deployment**: Deploy the model to the production environment. 
- **Monitoring and Maintenance**: Continuously monitor the model in production. Track metrics, errors, data drift, and model degradation over time. Retrain the model with new data periodically or when monitoring shows that performance has degraded.
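
The training, validation, and evaluation steps above can be sketched in a few lines of plain Python. This is a minimal illustration, not a recommended implementation: the synthetic data, the 60/20/20 split, the 1-D k-nearest-neighbours model, and the candidate values of `k` are all assumptions chosen to keep the example self-contained.

```python
import random

def split(rows, train=0.6, val=0.2, seed=0):
    """Shuffle rows and split them into train, validation, and test sets."""
    rows = rows[:]
    random.Random(seed).shuffle(rows)
    n_train = int(len(rows) * train)
    n_val = int(len(rows) * val)
    return rows[:n_train], rows[n_train:n_train + n_val], rows[n_train + n_val:]

def knn_predict(train_rows, x, k):
    """Predict the majority label among the k training points nearest to x."""
    nearest = sorted(train_rows, key=lambda r: abs(r[0] - x))[:k]
    votes = sum(label for _, label in nearest)
    return 1 if votes * 2 > len(nearest) else 0

def accuracy(train_rows, eval_rows, k):
    """Fraction of eval_rows that the k-NN model labels correctly."""
    correct = sum(knn_predict(train_rows, x, k) == y for x, y in eval_rows)
    return correct / len(eval_rows)

# Synthetic data (assumption): label is 1 when x > 5, with 10% flipped labels.
rng = random.Random(42)
data = []
for _ in range(300):
    x = rng.uniform(0, 10)
    y = int(x > 5)
    if rng.random() < 0.1:
        y = 1 - y
    data.append((x, y))

train_set, val_set, test_set = split(data)

# Hyperparameter tuning: choose k using the validation set only.
candidates = [1, 3, 5, 11, 21]
val_scores = {k: accuracy(train_set, val_set, k) for k in candidates}
best_k = max(val_scores, key=val_scores.get)

# Final evaluation: the test set is touched once, after tuning is finished.
test_accuracy = accuracy(train_set, test_set, best_k)
print(f"best k = {best_k}, test accuracy = {test_accuracy:.2f}")
```

Keeping the test set out of the tuning loop is the point of the three-way split: the validation score is optimistically biased because `k` was chosen to maximize it, while the test score is not.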

# Automated Machine Learning - AutoML

AutoML streamlines the ML workflow by automating time-consuming steps such as model selection and hyperparameter tuning. It lets people with limited machine learning experience use ML models, and it lets experienced practitioners build models quickly and focus on the business problem rather than on tedious tasks.
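
To make "automating model selection and hyperparameter tuning" concrete, here is a toy sketch of the idea: loop over a search space of model families and parameter values, score each on a validation set, and keep the best. Everything in it is hypothetical (the `SEARCH_SPACE` dictionary, the three toy models, the synthetic data), and real AutoML tools search far larger spaces with smarter strategies than this exhaustive loop.

```python
def majority_model(train, _param):
    """Baseline: always predict the most common training label."""
    label = int(sum(y for _, y in train) * 2 >= len(train))
    return lambda x: label

def threshold_model(train, threshold):
    """Rule: predict 1 when x exceeds a fixed threshold."""
    return lambda x: int(x > threshold)

def knn_model(train, k):
    """1-D k-nearest-neighbours classifier with majority voting."""
    def predict(x):
        nearest = sorted(train, key=lambda r: abs(r[0] - x))[:k]
        return int(sum(y for _, y in nearest) * 2 > len(nearest))
    return predict

# Hypothetical search space: model family -> (builder, candidate parameters).
SEARCH_SPACE = {
    "majority": (majority_model, [None]),
    "threshold": (threshold_model, [2.0, 4.0, 6.0, 8.0]),
    "knn": (knn_model, [1, 5, 15]),
}

def accuracy(predict, rows):
    return sum(predict(x) == y for x, y in rows) / len(rows)

def auto_select(train, val):
    """Try every (family, parameter) pair; return the best (score, name, param)."""
    best = None
    for name, (builder, params) in SEARCH_SPACE.items():
        for param in params:
            score = accuracy(builder(train, param), val)
            if best is None or score > best[0]:
                best = (score, name, param)
    return best

# Synthetic, noise-free data (assumption): label is 1 when x > 6.
data = [(i / 10, int(i / 10 > 6)) for i in range(100)]
train, val = data[::2], data[1::2]

best = auto_select(train, val)
print(best)
```

The user only supplies data and a search space; the selection loop does the rest, which is the part AutoML tools automate.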

# Machine Learning Operations - MLOps

MLOps, inspired by DevOps, bridges the gap between ML development and operations. MLOps enables businesses to scale ML efforts, maximize the value of models, and make efficient use of resources.

Key practices for achieving MLOps:
- Automation: Automate ML workflows to reduce manual work, minimize errors, increase efficiency, achieve scalability, and ensure consistency.
- Version Control: Keep data, models, and code under version control, and track changes at every stage.
- CI/CD: Integrate changes continuously and deploy models to the production environment reliably.
- Continuous Monitoring: Continuously monitor model performance, errors, and data drift. 
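
As an illustration of continuous monitoring, the sketch below computes the Population Stability Index (PSI), one common way to quantify data drift for a single numeric feature, and uses it as a retraining trigger. The function names, the 10-bin histogram, and the `0.2` threshold are assumptions for this example; the threshold is a rule of thumb that should be tuned per use case.

```python
import math

def psi(reference, current, bins=10):
    """Population Stability Index between two samples of one numeric feature."""
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / bins or 1.0  # guard against a constant feature

    def proportions(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            # Values outside the reference range fall into the edge bins.
            counts[min(max(idx, 0), bins - 1)] += 1
        eps = 1e-6  # smoothing so empty bins do not produce log(0)
        return [(c + eps) / (len(values) + eps * bins) for c in counts]

    ref, cur = proportions(reference), proportions(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref, cur))

DRIFT_THRESHOLD = 0.2  # assumption: rule-of-thumb value, not a standard

def needs_retraining(reference, current, threshold=DRIFT_THRESHOLD):
    """Flag drift when the PSI exceeds the chosen threshold."""
    return psi(reference, current) > threshold

# Hypothetical feature values seen at training time and in production.
training_feature = [i / 100 for i in range(1000)]
stable_batch = list(training_feature)
shifted_batch = [v + 3.0 for v in training_feature]

print(needs_retraining(training_feature, stable_batch))
print(needs_retraining(training_feature, shifted_batch))
```

In an automated setup, a check like this would run on a schedule against fresh production data, and a positive result would start the retraining workflow.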


# Readings
- [(GitHub) Azure: ML Notebooks](https://github.com/Azure/MachineLearningNotebooks/blob/master/how-to-use-azureml/machine-learning-pipelines/intro-to-pipelines/aml-pipelines-getting-started.ipynb)
- [(GitHub) AWS: SageMaker Examples](https://github.com/aws/amazon-sagemaker-examples)
- [(GitHub) Microsoft FLAML - A Fast Library for Automated Machine Learning & Tuning](https://github.com/microsoft/FLAML)
- [AWS: AutoML Solutions](https://aws.amazon.com/machine-learning/automl/)
- [Azure: Automated Machine Learning](https://azure.microsoft.com/en-gb/products/machine-learning/automatedml/)
- [Google Cloud: AutoML](https://cloud.google.com/automl)
- [Databricks: What are ML Pipelines](https://www.databricks.com/glossary/what-are-ml-pipelines)
- [neptune.ai: MLOps: What It Is, Why It Matters, and How to Implement It](https://neptune.ai/blog/mlops#:~:text=It%20applies%20to%20the%20entire,Data%20analysis)
- [Google Cloud MLOps: Continuous delivery and automation pipelines in machine learning](https://cloud.google.com/architecture/mlops-continuous-delivery-and-automation-pipelines-in-machine-learning)
- [Azure: Machine learning operations](https://azure.microsoft.com/en-gb/products/machine-learning/mlops/#features) 