# 1. What is Artificial Intelligence?

![What is AI?](resources/images/what_is_ai.png)

# 2. AI Models

- Machine learning
- Deep learning

# 3. Machine Learning

Types:
- Supervised learning
- Unsupervised learning
- Semi-supervised learning
- Reinforcement learning

References:
- [Types of machine learning](https://www.javatpoint.com/types-of-machine-learning)

## 3.1 Supervised Learning

Trained with labelled datasets:
- input: features, labels
- output: trained models, accuracy

![Dataset splitting](resources/images/supervised_learning.png)

Types:
- Classification
    - fraud detection
    - spam detection
    - speech recognition
- Predictive modelling (regression analysis)
    - sales amount prediction
    - stock market indices prediction

**Examples**
Classification
![Classify galaxies](resources/images/classification.png)

Predictive modelling
![Predict stock market](resources/images/predictive_modelling.png)
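
As a rough illustration of the supervised workflow above (labelled data in, trained model and accuracy out), here is a minimal sketch of a 1-nearest-neighbour classifier with a holdout split. The dataset, feature meanings, and labels are invented for illustration; in practice a library such as scikit-learn would normally be used instead.

```python
import math

# Hypothetical labelled dataset: (features, label) pairs.
# The two features and the "normal"/"fraud" labels are made up for this sketch.
labelled = [
    ([1.0, 1.0], "normal"), ([1.2, 0.8], "normal"), ([0.9, 1.1], "normal"),
    ([5.0, 5.2], "fraud"), ([5.1, 4.9], "fraud"), ([4.8, 5.0], "fraud"),
    ([1.1, 0.9], "normal"), ([5.2, 5.1], "fraud"),
]

# Holdout split: the first 6 rows train the model, the last 2 test it.
train, test = labelled[:6], labelled[6:]


def predict(features, training_rows):
    """Return the label of the closest training row (1-nearest neighbour)."""
    nearest = min(training_rows, key=lambda row: math.dist(row[0], features))
    return nearest[1]


predictions = [predict(features, train) for features, _ in test]
correct = sum(pred == label for pred, (_, label) in zip(predictions, test))
accuracy = correct / len(test)
print(predictions, accuracy)
```

The same split-train-evaluate pattern applies to regression; only the model and the evaluation metric change.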

## 3.2 Unsupervised Learning

Trained with unlabelled datasets:
- input: features
- output: models, performance metrics

Types:
- Clustering
- Anomaly detection
- Association rules

**Examples**
Clustering
![Clustering](resources/images/clustering.png)

Anomaly detection
![Anomaly detection](resources/images/anomaly_detection.png)

Association rules
![Association rules](resources/images/association_rules.png)
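
To make clustering concrete, the following is a minimal sketch of k-means on unlabelled 2-D points, written in plain Python. The points and the choice of initial centroids are assumptions made for illustration; real projects would typically rely on a library implementation.

```python
import math


def k_means(points, centroids, iterations=10):
    """Plain k-means: alternate the assignment and centroid-update steps."""
    clusters = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            idx = min(range(len(centroids)),
                      key=lambda i: math.dist(p, centroids[i]))
            clusters[idx].append(p)
        # Update step: move each centroid to the mean of its cluster.
        # An empty cluster keeps its previous centroid.
        centroids = [
            tuple(sum(c) / len(cluster) for c in zip(*cluster))
            if cluster else centroid
            for cluster, centroid in zip(clusters, centroids)
        ]
    return centroids, clusters


# Hypothetical unlabelled data: two visually separate groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5),
          (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centroids, clusters = k_means(points, centroids=[points[0], points[3]])
print(centroids)
```

No labels are supplied anywhere: the grouping comes only from the distances between the feature vectors.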

## 3.3 Semi-supervised Learning

Trained with a combination of labelled and unlabelled datasets.

## 3.4 Reinforcement Learning

Trained by learning from experience: an agent interacts with an environment and improves based on reward feedback (no labelled datasets).
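
One simple way to picture learning from experience is a multi-armed bandit with an epsilon-greedy policy, sketched below. The reward values, the noise range, and the names `epsilon_greedy` and `noisy_reward` are all hypothetical; this is only one of many reinforcement learning setups.

```python
import random


def epsilon_greedy(reward_fn, n_arms, steps=200, epsilon=0.1, seed=0):
    """Estimate the value of each action purely from observed rewards."""
    rng = random.Random(seed)
    counts = [0] * n_arms
    estimates = [0.0] * n_arms
    for step in range(steps):
        if step < n_arms:
            arm = step                     # try every action once first
        elif rng.random() < epsilon:
            arm = rng.randrange(n_arms)    # explore: pick a random action
        else:
            # exploit: pick the action with the best estimate so far
            arm = max(range(n_arms), key=lambda a: estimates[a])
        reward = reward_fn(arm, rng)
        counts[arm] += 1
        # Incremental running mean of the rewards seen for this action.
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
    return estimates, counts


# Hypothetical environment: each action has a hidden mean reward plus noise.
TRUE_MEANS = [1.0, 3.0, 2.0]


def noisy_reward(arm, rng):
    return TRUE_MEANS[arm] + rng.uniform(-0.4, 0.4)


estimates, counts = epsilon_greedy(noisy_reward, n_arms=3)
best_arm = max(range(3), key=lambda a: estimates[a])
print(best_arm, counts)
```

The agent is never told which action is correct; it only sees rewards and gradually favours the action that has paid off best.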

# 4. Machine Learning Model Training Process

Steps:
1. Data collection
2. Data pre-processing
3. Model training
4. Performance evaluation
5. Applications of trained models

## 4.1 Data Collection

1. Scrape data with a web scraper

2. Download available datasets
    - [UCI Machine Learning Repository](https://archive.ics.uci.edu/ml/datasets.php)
    - [Google Dataset Search](https://toolbox.google.com/datasetsearch)
    - [Kaggle](https://www.kaggle.com/)
    - [Visual Data](https://visualdata.io/discovery)
    - [CMU Libraries](https://guides.library.cmu.edu/machine-learning/datasets)

## 4.2 Data Pre-processing

To understand the data better, perform exploratory data analysis on the datasets (refer to the sample code).
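
As a small illustration of exploratory analysis (separate from the sample code referred to above), the sketch below summarises a numeric column and counts the values of a categorical column using only the standard library. The rows and column names are invented; pandas offers richer tooling for the same task.

```python
import statistics
from collections import Counter

# Hypothetical dataset rows; None marks a missing value.
rows = [
    {"age": 25, "grade": "A"},
    {"age": 31, "grade": "B"},
    {"age": None, "grade": "A"},
    {"age": 47, "grade": "C"},
    {"age": 31, "grade": "A"},
]

# Numeric column: ignore missing values, then summarise.
ages = [r["age"] for r in rows if r["age"] is not None]
summary = {
    "count": len(ages),
    "missing": len(rows) - len(ages),
    "min": min(ages),
    "max": max(ages),
    "mean": statistics.mean(ages),
    "median": statistics.median(ages),
}

# Categorical column: frequency of each value.
grade_counts = Counter(r["grade"] for r in rows)

print(summary)
print(grade_counts)
```

Summaries like these quickly reveal missing values, value ranges, and class imbalance before any model is trained.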

Understand the Data
![Data](resources/images/data.png)

Data Attributes
- Nominal: ID numbers, eye color, zip codes (discrete)
- Ordinal (with order): rankings (e.g., taste of potato chips on a scale from 1 to 10), grades, height in {tall, medium, short}
- Interval (differences are meaningful, but there is no true zero): calendar dates, temperatures in Celsius or Fahrenheit
- Ratio (has a true zero, so ratios are meaningful): temperature in Kelvin, length, time
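
The attribute type determines which operations and encodings make sense. The sketch below, using made-up category lists and helper names, shows one common treatment of each type; it is an illustration rather than a fixed rule.

```python
# Nominal: categories have no order, so one-hot encoding avoids implying one.
EYE_COLORS = ["brown", "blue", "green"]  # hypothetical category list


def one_hot(value, categories):
    return [1 if value == c else 0 for c in categories]


# Ordinal: order matters, so map each category to its rank.
HEIGHT_RANK = {"short": 0, "medium": 1, "tall": 2}


# Interval vs ratio: Celsius has no true zero, Kelvin does.
def celsius_to_kelvin(celsius):
    return celsius + 273.15


cold_c, warm_c = 10.0, 20.0
difference_c = warm_c - cold_c                      # meaningful on both scales
difference_k = celsius_to_kelvin(warm_c) - celsius_to_kelvin(cold_c)
ratio_c = warm_c / cold_c                           # 2.0, but physically meaningless
ratio_k = celsius_to_kelvin(warm_c) / celsius_to_kelvin(cold_c)  # meaningful

print(one_hot("blue", EYE_COLORS), HEIGHT_RANK["tall"] > HEIGHT_RANK["short"])
print(difference_c, difference_k, ratio_c, ratio_k)
```

Note that 20 °C is not "twice as hot" as 10 °C: the ratio only becomes meaningful on the Kelvin scale, where it is about 1.035.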

Data Hierarchy
![Data hierarchy](resources/images/data_hierarchy.png)

Data Pre-processing Strategies
![Data pre-processing](resources/images/data_preprocessing.png)
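
Two widely used pre-processing steps are filling in missing values and rescaling features. The sketch below assumes mean imputation and min-max scaling, which may or may not match the strategies in the figure above; the function names and data are hypothetical.

```python
def impute_mean(values):
    """Replace missing entries (None) with the mean of the known entries."""
    known = [v for v in values if v is not None]
    mean = sum(known) / len(known)
    return [mean if v is None else v for v in values]


def min_max_scale(values):
    """Rescale values linearly into the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


raw = [10.0, None, 30.0, 20.0]   # hypothetical feature column
filled = impute_mean(raw)
scaled = min_max_scale(filled)
print(filled, scaled)
```

To avoid leaking information, the mean, minimum, and maximum should be computed on the training data only and then reused for the test data.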

## 4.3 Model Training & Performance Evaluation

![Models training process](resources/images/models_training.png)

**Problems**
![Models fitting](resources/images/models_fitting.png)

### 4.3.1 Classification Models

**Performance Evaluation Metrics**
1. Confusion Matrix
    ![Confusion matrix](resources/images/confusion_matrix.png)
2. F1-score/F-measure
3. Accuracy
4. Misclassification rate
5. Precision
6. Recall
7. Prevalence
8. Cost Matrix
    ![Cost matrix](resources/images/cost_matrix.png)
9. Receiver Operating Characteristics (ROC)
10. Area under the ROC curve (AUC)
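
Most of the metrics listed above follow directly from the four confusion-matrix counts. The sketch below computes them for a binary problem using invented labels, and also estimates AUC as the fraction of positive/negative pairs ranked correctly. The function names are hypothetical; scikit-learn's `sklearn.metrics` module provides tested equivalents.

```python
def confusion_counts(actual, predicted, positive=1):
    """Return (TP, FP, FN, TN) for a binary classification problem."""
    tp = fp = fn = tn = 0
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            tp += 1
        elif p == positive:
            fp += 1
        elif a == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def roc_auc(actual, scores, positive=1):
    """AUC as the probability that a positive outscores a negative."""
    pos = [s for a, s in zip(actual, scores) if a == positive]
    neg = [s for a, s in zip(actual, scores) if a != positive]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical ground truth and model predictions.
actual = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

tp, fp, fn, tn = confusion_counts(actual, predicted)
total = tp + fp + fn + tn
accuracy = (tp + tn) / total
misclassification_rate = (fp + fn) / total
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)
prevalence = (tp + fn) / total

# AUC needs predicted scores rather than hard labels.
auc = roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.6, 0.1])
print(accuracy, misclassification_rate, precision, recall, f1, prevalence, auc)
```

The sketch does not guard against division by zero, which occurs when a class is never predicted or never present.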

**More Reliable Performance Evaluation Methods**
Cross-validation methods
   - Holdout
   - Random sub-sampling
   - k-fold
   - leave-one-out
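
To illustrate how k-fold cross-validation partitions a dataset, here is a minimal sketch that generates train/test index sets. The helper name `k_fold_indices` is hypothetical, and the sketch assumes the data has already been shuffled; scikit-learn's `KFold` offers a production-ready version.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for each of the k folds."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


# 3-fold cross-validation over 6 samples.
folds = list(k_fold_indices(6, 3))
for train, test in folds:
    print("train:", train, "test:", test)

# Leave-one-out is the special case k == n_samples.
leave_one_out = list(k_fold_indices(4, 4))
```

Every sample is used for testing exactly once, and the k scores are averaged, which is why the estimate is more stable than a single holdout split.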

**Summary**
![Summary](resources/images/classification_evaluation_summary.png)

### 4.3.2 Regression Models

**Performance Evaluation Metrics**
- Sum of Squares of Error (SSE)
- Standard Error of Estimate
- Sum of Squares Regression (SSR)
- Correlation Coefficient (R)
- Coefficient of Determination (R², the square of the correlation coefficient)
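
The regression metrics above can be worked through on a small example. The sketch below fits a least-squares line to invented data and computes each quantity; it assumes simple linear regression with one predictor, so the standard error of estimate uses n - 2 degrees of freedom.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical observations.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

slope, intercept = fit_line(xs, ys)
predicted = [intercept + slope * x for x in xs]
mean_y = sum(ys) / len(ys)

sse = sum((y - p) ** 2 for y, p in zip(ys, predicted))   # unexplained variation
ssr = sum((p - mean_y) ** 2 for p in predicted)          # explained variation
sst = sum((y - mean_y) ** 2 for y in ys)                 # total variation
r_squared = ssr / sst                                    # coefficient of determination
r = math.copysign(math.sqrt(r_squared), slope)           # correlation coefficient
std_error = math.sqrt(sse / (len(xs) - 2))               # standard error of estimate

print(slope, intercept, sse, ssr, sst, r_squared, r, std_error)
```

For a least-squares fit with an intercept, SST = SSR + SSE, so R² can equally be computed as 1 - SSE/SST.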

# 5. IDE & Tools

## 5.1 IDE

1. PyCharm
2. Visual Studio Code
3. Anaconda (a Python distribution that bundles Spyder and Jupyter Notebook)
4. Spyder
5. Atom
6. Sublime Text

## 5.2 Tools

- Matplotlib (visualization)
- pandas (DataFrame manipulation)
- scikit-learn (various ML models)