# Machine Learning Foundations

**Machine Learning (ML)** is a subfield of Artificial Intelligence that focuses on the development of systems that learn or improve performance based on the data they consume. While all machine learning is AI, not all AI is machine learning. Other subfields also fall under the scope of Artificial Intelligence, such as natural language processing, computer vision, robotics, and expert systems.

What distinguishes machine learning from other approaches within artificial intelligence is that its systems learn from data, then improve and make decisions based on what they have learned, rather than following rules that were written out by hand.

## Fundamental Concepts

### Representation

Representation refers to how knowledge is represented in the machine learning model. The choice of representation is an essential factor that influences the performance of the model. Examples of representations include decision trees, sets of rules, instances, graphical models, neural networks, support vector machines, model ensembles, and others. The representation forms the basis for the model's ability to understand and learn from the data.

### Evaluation

Evaluation involves assessing the performance of the model. This is done by comparing the model's predictions or outputs against actual or known values. There are several ways to evaluate models, and the choice of evaluation metric depends on the task at hand. Common evaluation metrics include accuracy, precision and recall, squared error, likelihood, posterior probability, cost, margin, entropy, and Kullback-Leibler (K-L) divergence. The evaluation process helps in understanding how well the model has learned and how it is likely to perform on unseen data.
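
To make a few of these metrics concrete, here is a minimal sketch in plain Python. The function names and the toy labels are assumptions made for illustration; in practice a library such as scikit-learn would normally be used instead.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the known labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def precision_recall(y_true, y_pred, positive=1):
    """Precision and recall for the class treated as 'positive'."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def mean_squared_error(y_true, y_pred):
    """Average squared difference, used for numeric (regression) targets."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy classification labels (made up for this example).
labels = [1, 0, 1, 1, 0, 0, 1, 0]
predictions = [1, 0, 0, 1, 0, 1, 1, 1]

print("accuracy:", accuracy(labels, predictions))
print("precision, recall:", precision_recall(labels, predictions))
print("squared error:", mean_squared_error([3.0, 5.0, 2.0], [2.5, 5.0, 4.0]))
```

On these toy labels the model gets 5 of 8 predictions right, 3 of its 5 positive predictions are correct, and it finds 3 of the 4 true positives. That is why accuracy, precision, and recall can tell different stories about the same model.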

### Optimization

Optimization refers to the process of adjusting the model's parameters to improve its performance. This is typically done by minimizing a loss function, which measures the discrepancy between the model's predictions and the actual values. The optimization process can involve a variety of techniques, including combinatorial optimization, convex optimization, and constrained optimization. The goal of optimization is to find the best set of parameters that minimizes the loss function, thereby improving the model's performance.
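
As one illustration of minimizing a loss function, the sketch below uses plain gradient descent to fit a line `y = w * x + b` under mean squared error. Gradient descent is only one of many optimization techniques, and the function name, learning rate, step count, and data are assumptions chosen for this example.

```python
def fit_line(xs, ys, learning_rate=0.05, steps=2000):
    """Fit y = w * x + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Residuals of the current model on every training point.
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        # Partial derivatives of the mean squared error w.r.t. w and b.
        grad_w = (2.0 / n) * sum(e * x for e, x in zip(errors, xs))
        grad_b = (2.0 / n) * sum(errors)
        # Step against the gradient to reduce the loss.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Toy data generated from y = 2x + 1 with no noise.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

w, b = fit_line(xs, ys)
print(f"w = {w:.4f}, b = {b:.4f}")  # approaches w = 2, b = 1
```

The learning rate controls the step size: too large and the updates diverge, too small and convergence is slow.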

### Generalization

Generalization refers to the model's ability to apply what it has learned from the training data to new, unseen data. A model that generalizes well will perform well not just on the training data but also on new data. This is the ultimate goal of machine learning: to create models that make accurate predictions on new, unseen data.

### Model Validation

Model validation involves assessing how well a trained model will generalize to new data. This is typically done by splitting the available data into a training set and a validation set. The model is trained on the training set and then evaluated on the validation set. The performance on the validation set gives an estimate of how well the model is likely to perform on unseen data.
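
The split described above can be sketched as follows. This is a minimal, hypothetical setup: the helper names, the 25% validation fraction, and the synthetic data are assumptions, and the "model" is a simple least-squares line so that the example stays self-contained.

```python
import random


def train_validation_split(rows, validation_fraction=0.25, seed=0):
    """Shuffle the rows, then hold out a fraction of them for validation."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    n_val = int(len(rows) * validation_fraction)
    return rows[n_val:], rows[:n_val]


def fit_least_squares(rows):
    """Closed-form least-squares fit of y = slope * x + intercept."""
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in rows)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mse(rows, slope, intercept):
    """Mean squared error of the fitted line on the given rows."""
    return sum((slope * x + intercept - y) ** 2 for x, y in rows) / len(rows)


# Synthetic data: y = 3x + 2 plus Gaussian noise (made up for this example).
rng = random.Random(42)
data = [(float(x), 3.0 * x + 2.0 + rng.gauss(0, 1.0)) for x in range(40)]

train, val = train_validation_split(data)
slope, intercept = fit_least_squares(train)  # the model never sees `val`
print("training error:  ", mse(train, slope, intercept))
print("validation error:", mse(val, slope, intercept))
```

The validation error is the number to watch, because it is computed on rows that played no part in fitting the model.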

### Bias/Variance Tradeoff

The bias/variance tradeoff is a fundamental concept in machine learning that refers to the balance that must be achieved between bias (error from erroneous assumptions in the learning algorithm) and variance (error from sensitivity to small fluctuations in the training set). A model with high bias pays little attention to the training data and oversimplifies the problem (underfitting), which leads to high error on both training and test data. A model with high variance fits the training data too closely (overfitting) and does not generalize to data it has not seen before, which leads to low error on training data but high error on test data. The key is to find a balance that avoids both underfitting and overfitting.
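
One way to see the tradeoff is with k-nearest-neighbour regression, used here purely as an illustrative stand-in. With `k = 1` the model memorizes the training set (high variance), and with `k` equal to the training-set size it always predicts the overall mean (high bias). The data generator, noise level, and values of `k` are assumptions made for this sketch.

```python
import math
import random


def knn_predict(train, x, k):
    """Average the targets of the k training points closest to x."""
    nearest = sorted(train, key=lambda row: abs(row[0] - x))[:k]
    return sum(y for _, y in nearest) / k


def knn_mse(train, rows, k):
    """Mean squared error of k-NN predictions on the given rows."""
    return sum((knn_predict(train, x, k) - y) ** 2 for x, y in rows) / len(rows)


def make_data(n, seed):
    """Noisy samples of sin(x) on [0, 2*pi] (synthetic, for illustration)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.uniform(0.0, 2.0 * math.pi)
        rows.append((x, math.sin(x) + rng.gauss(0.0, 0.5)))
    return rows


train = make_data(200, seed=1)
test = make_data(1000, seed=2)

for k in (1, 7, len(train)):
    print(f"k={k:3d}  train MSE={knn_mse(train, train, k):.3f}  "
          f"test MSE={knn_mse(train, test, k):.3f}")
```

With `k = 1` the training error is zero while the test error stays high, and with `k = 200` both errors are high. An intermediate `k` would be expected to give the lowest test error of the three.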

### Curse of Dimensionality

The curse of dimensionality refers to various phenomena that arise when analyzing and organizing data in high-dimensional spaces (often with hundreds or thousands of dimensions) that do not occur in low-dimensional settings such as the three-dimensional physical space of everyday experience. As the number of features or dimensions grows, the amount of data we need to generalize accurately grows exponentially.
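
A rough way to see the exponential growth is to divide each feature's range into a fixed number of bins and count the resulting cells. The sketch below assumes 10 bins per feature and a dataset of one million samples; both numbers, like the function names, are arbitrary choices for illustration.

```python
def cells_needed(bins_per_axis, dimensions):
    """Number of grid cells when every axis is divided into equal bins."""
    return bins_per_axis ** dimensions


def max_coverage(n_samples, bins_per_axis, dimensions):
    """Upper bound on the fraction of cells holding at least one sample."""
    return min(1.0, n_samples / cells_needed(bins_per_axis, dimensions))


n_samples = 1_000_000
for d in (1, 2, 3, 6, 10, 20):
    print(f"dimensions={d:2d}  cells={cells_needed(10, d):.1e}  "
          f"max coverage={max_coverage(n_samples, 10, d):.1e}")
```

With three features, a million samples can fill every cell many times over. With ten features, the same million samples can touch at most one cell in ten thousand, so most of the input space has no nearby training example at all.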

### Feature Engineering

Feature engineering is the process of using domain knowledge to create features from raw data that help machine learning algorithms work well. Done correctly, it increases the predictive power of those algorithms by giving them inputs that capture the relevant structure in the data (baeldung.com).
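
As a small illustration, the sketch below derives three features from a raw record. The record layout (`timestamp`, `price`, `area_sqm`) and the feature names are hypothetical; which features are actually useful depends on the domain.

```python
from datetime import datetime


def engineer_features(record):
    """Turn a raw record into features a model can use directly."""
    ts = datetime.fromisoformat(record["timestamp"])
    return {
        # A raw timestamp string is opaque to most models; hour and weekend
        # flags expose patterns such as daily or weekly cycles.
        "hour": ts.hour,
        "is_weekend": ts.weekday() >= 5,  # Monday is 0, so 5 and 6 are Sat/Sun
        # A ratio can be more informative than either raw value alone.
        "price_per_sqm": record["price"] / record["area_sqm"],
    }


# Hypothetical raw record, made up for this example.
raw = {"timestamp": "2024-03-16T14:30:00", "price": 250000.0, "area_sqm": 80.0}
features = engineer_features(raw)
print(features)
```

None of these features adds new information. They restate what is already in the raw record in a form that a learning algorithm can pick up more easily.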

## Final Notes

These notes serve as a brief introduction to machine learning. This module does not aim to teach every concept or all of the underlying theory, and will instead focus on how these ideas can be applied. We learn better through application, so that is the approach we are going to take. Concepts will be explained further as the need arises.