# Machine Learning - General
> [Main Table of Contents](../../README.md)

## In This Notebook
- What is machine learning?
- Supervised learning
    - Supervised workflow
- Unsupervised learning
- Fine-tuning model vs. Hypertuning parameters
- Vocabulary

## What is machine learning?
- Machine learning has become the most prevalent subset of artificial intelligence
- Machine learning makes inferences and predictions based on datasets
- 3 broad areas of machine learning:
    - Reinforcement learning
    - Supervised learning
    - Unsupervised learning
    - Supervised and Unsupervised learning are the more popular of the three

## Supervised learning
- Given a dataset containing features (the input variables that describe each example) and labels (the values of one or more target variables), a supervised model learns a mapping from features to labels
- 2 broad categories of supervised learning:
    - Classification
        - Predicts categorical (discrete) values
        - Applies linear or logistic decision boundary functions
        - Performance evaluation often utilizes confusion matrices and associated metrics like:
            - Accuracy
            - Specificity
            - Sensitivity (another name for Recall)
            - Precision
            - Recall
            - F1 Score (harmonic mean of Recall and Precision)
    - Regression
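        - Predicts continuous values
        - Applies linear or non-linear regression
            - Logistic regression, despite its name, is normally used for classification: the logistic function (a type of sigmoid function) outputs a probability that is thresholded into a class
        - Performance evaluation often utilizes the distance between predicted and actual values
            - Root mean squared error
            - Mean squared error
            - Mean absolute error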
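
The metrics listed above can be sketched in a few lines of plain Python. This is a minimal illustration, not a replacement for a library implementation: the function names and the example counts are made up for this sketch, and it assumes a binary classifier whose confusion matrix has no empty row or column (otherwise the divisions fail).

```python
import math


def classification_metrics(tp, fp, fn, tn):
    """Derive common metrics from the four cells of a binary confusion matrix."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)  # also called sensitivity
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    # F1 is the harmonic mean of precision and recall, not the arithmetic mean.
    f1 = 2 * precision * recall / (precision + recall)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": f1,
    }


def regression_errors(y_true, y_pred):
    """Distance-based errors between actual and predicted continuous values."""
    n = len(y_true)
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n
    return {"mae": mae, "mse": mse, "rmse": math.sqrt(mse)}


# Hypothetical counts: 40 true positives, 10 false positives,
# 20 false negatives, 30 true negatives.
clf = classification_metrics(tp=40, fp=10, fn=20, tn=30)

# Hypothetical actual vs. predicted values.
reg = regression_errors([3.0, 5.0, 7.0], [2.0, 5.0, 9.0])
```

With these counts, precision is 0.8 and recall is about 0.67, so F1 comes out slightly below their arithmetic average, which is why the harmonic mean penalizes a gap between the two.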

### Supervised learning workflow
1. Raw data
2. Clean data (inspect for bias, outliers, format)
3. Split data into train/test
4. Train model
    - Input training data into ML model to output a trained model
5. Evaluate performance of the trained model on test data
6. If performance is poor
    - Modify the model with techniques like:
        - Dimensionality reduction (Modify the dataset)
            - Remove obviously irrelevant features
            - Correlation: find features that highly correlate and only keep one
                - e.g. height and foot size, keep one of the two
            - Collapse features
                - e.g. combine height and weight features into bmi feature
        - Hyperparameter tuning (Modify the model)
            - Hyperparameters are dataset dependent
            - Adjust the hyperparameters for optimal model performance
            - There are many methods for finding optimal hyperparameters, two examples:
                - sklearn.model_selection.RandomizedSearchCV (samples a fixed number of combinations; less comprehensive but effective)
                - sklearn.model_selection.GridSearchCV (tries every combination in the grid; comprehensive)
        - Ensemble Method
            - Use the aggregate value(s) of more than one model
    - DR (dataset manipulation): Reduce the number of features. Exclude features that are obviously not indicative of the output.
    - HP (model manipulation): Find the best hyperparameters for the model. The best values are context dependent, e.g. tuning an instrument requires different settings for heavy metal vs. folk music.
7. Go back to step 4 to re-train the modified model
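
The workflow above can be sketched with scikit-learn. This is one possible sketch under stated assumptions, not the only way to do it: the synthetic dataset from `make_classification`, the choice of `LogisticRegression` as the model, and the grid of `C` values are illustrative picks that do not come from these notes.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import GridSearchCV, train_test_split

# Steps 1-2: synthetic, already-clean data stands in for a real dataset.
X, y = make_classification(
    n_samples=300, n_features=8, n_informative=4, random_state=0
)

# Step 3: hold out a test set that the model never sees during training.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Step 4: train a model on the training data.
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

# Step 5: evaluate the trained model on the test data.
baseline_accuracy = accuracy_score(y_test, model.predict(X_test))

# Step 6: hyperparameter tuning. GridSearchCV tries every value in the grid,
# scoring each one with cross-validation on the training data only.
param_grid = {"C": [0.01, 0.1, 1.0, 10.0]}
search = GridSearchCV(LogisticRegression(max_iter=1000), param_grid, cv=5)
search.fit(X_train, y_train)

# Step 7: by default the search re-trains the best configuration on the full
# training set, so it can be evaluated on the test data directly.
tuned_accuracy = accuracy_score(y_test, search.predict(X_test))
best_c = search.best_params_["C"]
```

Swapping `GridSearchCV` for `RandomizedSearchCV` keeps the same `fit`/`predict` usage but takes a `param_distributions` argument and an `n_iter` budget instead of trying every combination.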

## Unsupervised learning
- Given a dataset that does not contain labels for a target variable(s), unsupervised learning models are able to detect patterns
- 3 broad ways of detecting patterns:
    - Clustering
        - Find groupings in data points
        - e.g. nearest
    - Association
        - Find important relationships in data points
    - Anomaly detection
        - Find outliers in data points
- Evaluation of unsupervised learning is less straightforward because there are no labels to compare predictions against, so results more often need further investigation/exploration
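
Two of the pattern-detection approaches above, clustering and anomaly detection, can be illustrated without any labels. The sketch below is a toy under stated assumptions: it uses a hand-written 1-D k-means with caller-supplied starting centroids and a simple z-score rule for outliers, both chosen here for brevity, and the function names and data are hypothetical.

```python
import statistics


def kmeans_1d(points, centroids, iterations=10):
    """Tiny 1-D k-means: alternate between assignment and centroid update."""
    centroids = list(centroids)
    labels = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda k: abs(p - centroids[k]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for k in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:
                centroids[k] = sum(members) / len(members)
    return labels, centroids


def zscore_outliers(values, threshold=2.0):
    """Flag values more than `threshold` standard deviations from the mean."""
    mean = statistics.fmean(values)
    spread = statistics.pstdev(values)
    return [v for v in values if abs(v - mean) / spread > threshold]


# Clustering: two visibly separate groups, no labels provided.
labels, centers = kmeans_1d([1.0, 1.5, 2.0, 9.0, 9.5, 10.0], centroids=[1.0, 10.0])

# Anomaly detection: one value sits far from the rest.
outliers = zscore_outliers([10, 11, 9, 10, 12, 11, 10, 50])
```

Nothing in the code says whether two clusters or a threshold of 2.0 is "right", which is the evaluation difficulty noted above: those choices have to be justified by exploring the data.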


## Fine-tuning model vs. Hypertuning parameters
- Fine-tuning a model is a term used in transfer learning and means re-training a model on a more specific dataset
    - Transfer learning is the process of reusing a model trained in one context in another context
    - This makes highly functional models quickly accessible without the need for supercomputers
    - Training the original models can take days and requires processing power that mostly only big companies have access to, so they use their resources to train models on very large benchmark datasets, which can then be re-trained/fine-tuned on smaller niche datasets
    - e.g. Google's BERT model fine-tuned on financial data to produce FinBERT models
- Hypertuning parameters (more commonly called hyperparameter tuning) is about configuring the hyperparameters of a model. There are many functions that can automate this process to find the best hyperparameter combination for a model given a type of dataset
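
The idea behind fine-tuning can be shown with a deliberately tiny stand-in. The sketch below is an analogy under stated assumptions, not how BERT or FinBERT are trained: a one-variable linear model plays the role of the large model, and the datasets, learning rate, and epoch counts are invented for illustration. It compares a short training run that starts from pre-trained weights against the same short run that starts from scratch.

```python
def train(xs, ys, w=0.0, b=0.0, lr=0.1, epochs=20):
    """Fit y ~ w*x + b by batch gradient descent, starting from (w, b)."""
    n = len(xs)
    for _ in range(epochs):
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        grad_w = 2 * sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = 2 * sum(errors) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def mse(xs, ys, w, b):
    """Mean squared error of the line (w, b) on the given data."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# "Pre-training": a long run on a larger, general dataset (y = 2x + 1).
general_x = [i / 10 for i in range(11)]
general_y = [2 * x + 1 for x in general_x]
pretrained_w, pretrained_b = train(general_x, general_y, epochs=2000)

# Niche dataset: few examples and a slightly different relationship (y = 2x + 1.5).
niche_x = [0.0, 0.5, 1.0]
niche_y = [1.5, 2.5, 3.5]

# Fine-tuning: a short run that starts from the pre-trained weights.
tuned_w, tuned_b = train(niche_x, niche_y, w=pretrained_w, b=pretrained_b, epochs=20)

# Baseline: the same short run starting from zero-initialized weights.
scratch_w, scratch_b = train(niche_x, niche_y, epochs=20)

tuned_error = mse(niche_x, niche_y, tuned_w, tuned_b)
scratch_error = mse(niche_x, niche_y, scratch_w, scratch_b)
```

Here the weights `w` and `b` are what training learns, while `lr` and `epochs` are hyperparameters: fine-tuning changes the former by further training, whereas hyperparameter tuning searches over the latter.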

## Vocabulary

Term | Definition
--- | ---
Features | Independent variables<br>Input<br>e.g. Name of columns in a pd.DataFrame
Labels | Values of a target variable<br>Dependent variables<br>e.g. Values of the target variable column in a pd.DataFrame
Target variable | e.g. Name of result/output/dependent variable column in a pd.DataFrame
Training Data | Data used to fit a model<br>Includes both features and labels (aka target in deep learning)
Testing Data<br>Validation Data | Data used to measure performance of model<br>Includes both features and labels (aka target in deep learning)
Classifier | Type of function that deals with discrete results
Regressor | Type of function that deals with continuous results
Transformer in NLP, CV context (General) | Deep learning model (based on ANNs) that adopts the mechanism of self-attention: it weights the parts of each input differently depending on context, i.e. it focuses more on the important parts<br>The attention weights are learned during training with gradient descent
Transformer (Spark) | Abstraction that includes feature transformers and learned models<br>Technically, a Transformer implements a method transform(), which converts one DataFrame into another, generally by appending one or more columns<br>e.g. A feature transformer could convert text-valued features into numerical values<br>e.g. A learned model could read in a DataFrame and output a DataFrame with a new column of predicted labels
Estimator (General) | Any learning algorithm or any algorithm that fits or trains on data<br>The result is a model, which may be accepted or rejected as a representation of reality
Estimator (Spark) | Technically, an Estimator implements a method fit(), which accepts a DataFrame and produces a Model, which is a Transformer
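
The Estimator/Transformer relationship in the last rows can be mimicked in plain Python. This is a sketch of the pattern only: the classes below are hypothetical, a list of dicts stands in for a DataFrame, and no Spark API is used.

```python
class MeanCenterModel:
    """Plays the role of a Transformer: holds learned state and maps rows to new rows."""

    def __init__(self, mean):
        self.mean = mean

    def transform(self, rows):
        # Append a new column instead of modifying the input rows.
        return [{**row, "centered": row["value"] - self.mean} for row in rows]


class MeanCenterEstimator:
    """Plays the role of an Estimator: fit() learns from data and returns a model."""

    def fit(self, rows):
        mean = sum(row["value"] for row in rows) / len(rows)
        return MeanCenterModel(mean)


rows = [{"value": 1.0}, {"value": 2.0}, {"value": 3.0}]

# fit() produces a model, and that model is itself a transformer.
model = MeanCenterEstimator().fit(rows)
result = model.transform(rows)
```

The split matters because the learned state (here, the mean) is computed once in `fit()` and can then be applied by `transform()` to any later data with the same columns.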