# Overview of Machine Learning

## Approaching a Machine Learning Problem

### Resources

- [Machine Learning Engineering for Production (MLOps) Specialization](https://www.coursera.org/specializations/machine-learning-engineering-for-production-mlops#courses)
- [The 4 Pillars of MLOps: How to Deploy ML Models to Production](https://www.phdata.io/blog/the-ultimate-mlops-guide-how-to-deploy-ml-models-to-production/)
- [Putting Machine Learning Models into Production](https://blog.cloudera.com/putting-machine-learning-models-into-production/)
- [The MLOps Toolkit](https://testdriven.io/blog/mlops/)
- [Introduction to Machine Learning Reliability Engineering](https://testdriven.io/blog/machine-learning-reliability-engineering/#background)
- [Serving a Machine Learning Model with FastAPI and Streamlit](https://testdriven.io/blog/fastapi-streamlit/)
- [Awesome production machine learning](https://github.com/EthicalML/awesome-production-machine-learning)

## From Prototype to Production

- [Machine Learning: The High Interest Credit Card of Technical Debt](https://research.google/pubs/pub43146/): for those interested in putting models into production


### Model-centric vs Data-centric

<img src="https://miro.medium.com/max/1400/1*BzxsNxyyP77dVW3_Lto1Hw.png" alt="From model-centric to data-centric | by Fabiana Clemente | Towards Data Science" width="434">

### What Does it Take to Deploy an ML Model?

<img src="https://testdriven.io/static/images/blog/mlops/mlops_toolkit_lifecycle.png" width="766" height="421">

## Testing Production Systems

- [Bandit Algorithms for Website Optimization](https://www.amazon.com/Bandit-Algorithms-Website-Optimization-Developing/dp/1449341330)
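
The book linked above covers bandit algorithms as an alternative to fixed A/B tests when comparing variants in production. As a rough illustration, here is a minimal epsilon-greedy sketch using only the standard library. The class name, the `simulate` helper, and the simulated click-through rates are all hypothetical and chosen for illustration; this is not code from the book.

```python
import random


class EpsilonGreedy:
    """Epsilon-greedy bandit over a fixed number of arms (e.g. page variants)."""

    def __init__(self, n_arms, epsilon=0.1, seed=None):
        self.epsilon = epsilon
        self.counts = [0] * n_arms      # how many times each arm was shown
        self.values = [0.0] * n_arms    # running mean reward per arm
        self.rng = random.Random(seed)

    def select_arm(self):
        # Explore a random arm with probability epsilon,
        # otherwise exploit the arm with the best current estimate.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.counts))
        return max(range(len(self.values)), key=self.values.__getitem__)

    def update(self, arm, reward):
        # Incremental mean: new_mean = old_mean + (reward - old_mean) / n
        self.counts[arm] += 1
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]


def simulate(true_rates, steps=5000, epsilon=0.1, seed=0):
    """Run the bandit against simulated Bernoulli rewards (assumed rates)."""
    bandit = EpsilonGreedy(len(true_rates), epsilon=epsilon, seed=seed)
    env = random.Random(seed + 1)
    for _ in range(steps):
        arm = bandit.select_arm()
        reward = 1.0 if env.random() < true_rates[arm] else 0.0
        bandit.update(arm, reward)
    return bandit


if __name__ == "__main__":
    result = simulate([0.02, 0.05, 0.10])
    print("pulls per variant:", result.counts)
    print("estimated rates:", [round(v, 3) for v in result.values])
```

In a real system the reward would come from logged user behaviour rather than a simulator, and the choice of `epsilon` trades off exploration against serving the currently best variant.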

## ML Landscape

<img src="https://i.stack.imgur.com/2EDiO.png">

<img src="https://i.stack.imgur.com/42u1r.png" width="766" height="477">

<img alt="" src="https://miro.medium.com/max/1400/1*yGDZqNVkJINY61ONFV-aiA.jpeg" width="700" height="458" role="presentation">

https://landscape.lfai.foundation/

## Where to Go from Here

### Theory

### Other Machine Learning Frameworks and Packages


- [Vowpal Wabbit](https://vowpalwabbit.org/)
- [Spark MLlib](https://spark.apache.org/mllib/)

### Cloud Services

### Scaling to Larger Datasets

## Tools

- [Streamlit](https://streamlit.io/): Streamlit is an open-source Python library for creating and sharing custom web apps for machine learning and data science. It bills itself as the fastest way to build and share data apps.
    - https://docs.streamlit.io/library/get-started/create-an-app
- [MLflow](https://mlflow.org/): MLflow is an open source platform to manage the ML lifecycle, including experimentation, reproducibility, deployment, and a central model registry.
- [Dask](https://docs.dask.org/en/stable/): Dask is a flexible library for parallel computing in Python.
- [XGBoost](https://xgboost.readthedocs.io/en/stable/): XGBoost is an optimized distributed gradient boosting library designed to be highly efficient, flexible and portable. It implements machine learning algorithms under the Gradient Boosting framework.
- [OpenCV](https://opencv.org/): OpenCV is an open-source library for computer vision and image processing, with more than 2500 algorithms for computer vision and ML. It can track human movement, detect moving objects, extract 3D models, stitch images together into a high-resolution image, and support augmented reality (AR) applications. It is reportedly used for CCTV monitoring by many governments, especially in China and Israel, and major camera companies use OpenCV to make their products smarter and more user-friendly.
- [Apache Spark](https://spark.apache.org/)

**Other useful tools**

- SQL
- Git

**Deep learning**
- [TensorFlow](https://www.tensorflow.org/): TensorFlow is a library developed by the Google Brain team, primarily for deep learning and neural networks. It makes it easy to distribute work across multiple CPU or GPU cores, and even across multiple GPUs. Its core data structure is the tensor, a container that stores N-dimensional data together with the linear operations on it. Although it is production-ready and supports reinforcement learning as well as neural networks, it is not commercially supported, so bugs and defects are resolved through the community.
- [Keras](https://keras.io/): Keras provides a Python interface to TensorFlow that focuses on neural networks. Earlier versions also supported other backends such as Theano, Microsoft Cognitive Toolkit (CNTK), and PlaidML. Keras contains standard building blocks for commonly used neural networks, including recurrent neural networks, as well as tools that make image and text processing faster and smoother.
- [PyTorch](https://pytorch.org/): PyTorch is an ML library developed by Facebook and based on the Torch library (an open-source ML library written in Lua). The project is written in Python, C++, and CUDA, and it can be extended in both C and C++. It competes with TensorFlow, since both libraries are built around tensors, but it is often considered easier to learn and integrates better with Python. Although it supports NLP, its main focus is developing and training deep learning models.

**Natural Language Processing**
- [Natural Language Toolkit (NLTK)](https://www.nltk.org/): NLTK is a widely used library for text classification and natural language processing. It performs stemming, lemmatization, tokenization, and keyword search in documents. It can also be used for sentiment analysis (for example on movie or food reviews), text classification, detecting and censoring vulgar words in comments, text mining, and many other language-related tasks. More broadly, it is used in AI-powered chatbots, which need text processing to train models that recognize and generate sentences for human-machine interaction.
- [spaCy](https://spacy.io/): a relatively new but very efficient and well-designed package
- [Gensim](https://radimrehurek.com/gensim/index.html): an NLP package with an emphasis on topic modeling
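
The NLTK entry above mentions tokenization, stemming, and keyword search. The sketch below illustrates those three steps using only the Python standard library. It does not use NLTK's API: the function names are hypothetical, and the suffix-stripping rule is a deliberately naive stand-in for a real stemmer, which would handle far more cases.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny rule set; real stemmers are far richer.
SUFFIXES = ("ing", "ed", "es", "s")


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def naive_stem(token):
    """Strip one common suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def keyword_counts(documents, keyword):
    """Count how often the keyword's stem occurs in each document."""
    target = naive_stem(keyword.lower())
    counts = []
    for doc in documents:
        stems = Counter(naive_stem(tok) for tok in tokenize(doc))
        counts.append(stems[target])
    return counts
```

For example, `keyword_counts(["I reviewed two reviews", "No match here"], "review")` matches both "reviewed" and "reviews" in the first document because they reduce to the same stem. For real work, the libraries listed above provide tested tokenizers and stemmers.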


**Other resources**
- [Kaggle](https://www.kaggle.com/)
- [pandas - User Guide](https://pandas.pydata.org/pandas-docs/stable/user_guide/index.html)
- [Open Machine Learning Course mlcourse.ai](https://mlcourse.ai/)
- [KDnuggets](https://www.kdnuggets.com/)
- [DEV](https://dev.to/)
- [DZone](https://dzone.com)
- [Medium](https://medium.com/)
- [Towards Data Science](https://towardsdatascience.com/)
- [Machine Learning Crash Course](https://developers.google.com/machine-learning): Google's fast-paced, practical introduction to machine learning
- [Accurately Measuring Model Prediction Error](http://scott.fortmann-roe.com/docs/MeasuringError.html)
- [Understanding the Bias-Variance Tradeoff](http://scott.fortmann-roe.com/docs/BiasVariance.html)
- [Two Minute Papers](https://www.youtube.com/@TwoMinutePapers/videos)

**Courses**
- [Applied Machine Learning in Python](https://www.coursera.org/learn/python-machine-learning#syllabus)
- [In-depth introduction to machine learning in 15 hours of expert videos](https://www.dataschool.io/15-hours-of-expert-machine-learning-videos/)
- [The Analytics Edge](https://www.edx.org/course/the-analytics-edge)
- [Lecture Collection | Machine Learning](https://www.youtube.com/playlist?list=PLA89DCFA6ADACE599)
- [Real Python: Machine Learning Tutorials](https://realpython.com/tutorials/machine-learning/)
- [Inria scikit-learn MOOC](https://inria.github.io/scikit-learn-mooc/index.html)