# Chapter 1: Introduction and Fundamentals
Nowadays, the use of artificial intelligence and machine learning in various industries, especially in engineering, is expanding at a remarkable speed. This notebook series offers an introduction to the field through end-to-end projects that cover fundamental concepts and build practical skills.

## Table of Contents
- Scope, Advantages and Challenges
- Types of Machine Learning Algorithms
    - Based on Human Supervision
    - Based on Learning Paradigms
    - Based on Generalization Approaches
- Life Cycle
- Data Pipeline
    - Data Collection
    - Data Preprocessing
    - Exploratory Data Analysis (EDA)
    - Feature Engineering 
- Model Pipeline
    - Model Selection
    - Model Training and Validation
    - Model Evaluation
    - Optimization
- Deployment, Monitoring and Maintenance
- References

In the development of **learning-based intelligent systems**, selecting suitable and **reliable algorithms** is highly important. Selection criteria include the **type, dimensions, and complexity of the problem**, the presence of **parameters and nonlinear patterns**, **flexibility and scalability**, and access to **sufficient data**.

## 1. Scope, Advantages and Challenges
**Machine learning** is a branch of **computer science** and **artificial intelligence** that focuses on the development of algorithms and **statistical models** that enable computers to perform tasks **without explicit instructions**. These models **extract patterns** from data and, by learning from them, make **decisions** based on the provided information. In practice, this means solving a problem by gathering a **dataset** and then algorithmically building a statistical model from it. Since machine learning is **data-driven**, it is closely connected to other fields such as **data science** [1, 2].

**Deep learning** is a subfield of machine learning that uses multi-layered **artificial neural networks** to automatically learn hierarchical representations of data and can be appropriate for more complex problems. Figure 1.1 illustrates the scope of the related fields and their relationships.

<img src="../figures/figure_1_1_ai_scope.jpeg" alt="AI Scope" style="width:30%; margin-left: 35%;">
<p style="text-align:center;">Figure 1.1. AI Scope [3]</p>

Machine learning has numerous **applications**. It is ideal for solving problems where existing solutions require extensive **manual tuning** or the creation of long lists of rules. In these cases, a machine learning algorithm can often simplify the coding process and deliver better performance. It is also uniquely suited for complex problems that either lack a solution using **classic methods** or are difficult to solve using such approaches.

Furthermore, due to their inherent ability to **adapt to new data**, properly **optimized** machine learning algorithms can be effective in the face of **uncertainties**. Finally, machine learning is useful for extracting meaningful patterns from complex problems and **large volumes of data** that may seem impenetrable at first glance [4].

Despite its many advantages, machine learning faces significant **challenges and limitations**. The most foundational is **data quality**: poor data complicates preprocessing and feature engineering, which are themselves notoriously **time-consuming**. The field also suffers from a scarcity of specialists, making expertise hard to find [5].

One of the fundamental **challenges** of machine learning is **generalization**. Generalization refers to the ability of the model to perform accurately and appropriately on **scenarios and data it was not trained on**. This issue is related to undesirable phenomena such as **underfitting** and **overfitting**, which are examined separately. If an intelligent system lacks good generalization, it will not have **robust performance** and will not function properly in the face of **uncertainties** and **unmodeled scenarios**, especially in **sensitive and critical situations** where **ethics** also matter.

Another drawback of machine learning is its **low interpretability**. Machine learning algorithms largely do not provide a **specific reason** for a decision that is grounded in the governing rules of logic, mathematics, or physics of the problem. Therefore, they are sometimes referred to as a **black box**.

Additionally, the **curse of dimensionality** can overwhelm models with too many features, and even when a successful model is built, its inherent complexity frequently creates major obstacles for real-world deployment [5].

Table 1.1 provides a summary of the mentioned key points about advantages and challenges.

<p style="text-align:center;">Table 1.1. Machine Learning Advantages and Challenges</p>

| Aspect | Key Applications | Key Challenges & Limitations |
| :-: | :-: | :-: |
| **Core Use Cases** | - Replacing complex, manual rule-based systems.<br>- Solving problems with no known traditional solution.<br>- Adapting automatically to new data and uncertainties.<br>- Uncovering patterns in large, complex datasets. | - **Data Dependency:** Requires high-quality data; preprocessing is time-consuming.<br>- **Skill Gap:** Shortage of expert personnel. |
| **Model Performance** | - Can deliver superior performance compared to traditional coding. | - **Generalization:** Models often fail on new, unseen data due to overfitting or underfitting, leading to non-robust performance in critical situations. |
| **Interpretability & Complexity** | - | - **Black Box Nature:** Decisions lack clear, logical, or physical explanations.<br>- **Curse of Dimensionality:** Performance degrades with too many irrelevant features.<br>- **Deployment Difficulty:** Complex models are hard to implement in real-world systems. |

## 2. Types of Machine Learning Algorithms
There are various fundamental criteria for categorizing machine learning algorithms, chief among them being the **level of human supervision**, the underlying **learning paradigm** and the **method of generalization** [4, 5].

### 2.1. Based on Human Supervision
Based on human supervision, machine learning problems and algorithms are classified into four types: **Supervised Learning**, **Unsupervised Learning**, **Semi-supervised Learning**, and **Reinforcement Learning**, as shown in Figure 2.1 [2, 4].

<img src="../figures/figure_2_1_machine_learning_types.png" alt="Machine Learning Types" style="width:50%; margin-left: 25%;">
<p style="text-align:center;">Figure 2.1. Types of Machine Learning Systems</p>

**Supervised learning** is an approach where an algorithm learns from a set of examples, each paired with a known answer called a **label**. In this case, the outputs are known and the dataset is labeled. Each sample in the training dataset is described by a set of **features** that are consistent and descriptive, such as height or weight. The objective is to analyze these labeled samples and build a model that can infer the correct label for **new and unseen** samples based solely on their features [2, 4]. Figure 2.2 demonstrates this approach.

<img src="../figures/figure_2_2_supervised_learning.jpeg" alt="Supervised Learning" style="width:50%; margin-left: 25%;">
<p style="text-align:center;">Figure 2.2. Supervised Learning [6]</p>

Supervised algorithms fall into two types: **regression** and **classification**. Many algorithms can be used for both.

**Regression** is used when the objective is to predict a numerical value for a **dependent variable** based on **independent variables** (features). Predicting the price of a car based on a set of features is an example of a problem that can be solved using this method [4]. In this example, the car price is the dependent variable, and the features affecting it are the independent variables.
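
As a minimal sketch of regression, the following fits a straight line by ordinary least squares to a hypothetical car dataset. The ages and prices are invented for illustration and lie exactly on a line by construction, and the helper name `fit_line` is an assumption rather than a library function.

```python
# Hypothetical toy data: car age in years (independent variable)
# and price in thousands of dollars (dependent variable).
ages = [1.0, 2.0, 3.0, 4.0, 5.0]
prices = [28.0, 26.0, 24.0, 22.0, 20.0]


def fit_line(x, y):
    """Ordinary least squares for a single feature: returns (slope, intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # slope = covariance(x, y) / variance(x)
    cov_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    var_x = sum((xi - mean_x) ** 2 for xi in x)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(ages, prices)
# Predict the price of an unseen 6-year-old car.
predicted_price = slope * 6.0 + intercept
print(slope, intercept, predicted_price)
```

With real data the points would scatter around the line, and the fitted slope and intercept would be the values that minimize the sum of squared errors.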

**Classification** is another supervised method where the dependent variable represents a specific group or **class**, and the objective is to predict which class a new sample belongs to based on its features [4]. If there are two classes, the problem is **binary**, and if there are more than two classes, it is a **multi-class** problem. Diagnosing a disease is an example of a classification problem where the type of disease is the dependent variable.
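
A binary classifier can be sketched with logistic regression trained by gradient descent. The example below assumes a single invented measurement per patient and a hypothetical label (1 = disease present, 0 = absent); the learning rate, number of epochs, and function names are illustrative choices, not recommendations.

```python
import math

# Hypothetical toy data: one measurement per patient and a binary label.
measurements = [1.0, 1.5, 2.0, 2.5, 6.0, 6.5, 7.0, 7.5]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

# Center the feature so that gradient descent converges quickly.
feature_mean = sum(measurements) / len(measurements)
centered = [m - feature_mean for m in measurements]


def sigmoid(z):
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(xs, ys, learning_rate=0.1, epochs=1000):
    """Fit weight w and bias b by batch gradient descent on the log loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        w -= learning_rate * sum(e * x for e, x in zip(errors, xs)) / n
        b -= learning_rate * sum(errors) / n
    return w, b


w, b = train_logistic(centered, labels)


def predict_class(measurement):
    """Return 1 if the predicted probability is at least 0.5, else 0."""
    return 1 if sigmoid(w * (measurement - feature_mean) + b) >= 0.5 else 0


print(predict_class(2.0), predict_class(7.0))
```

A multi-class problem would need an extension such as one-vs-rest or a softmax output, which this sketch does not cover.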

Various algorithms have been developed for regression and classification problems. Some of these algorithms are listed in Table 2.1. Each of these algorithms is selected based on criteria such as the **size of the dataset, nonlinear patterns, the number of independent variables and dimensions of the input data**.

<p style="text-align:center;">Table 2.1. Supervised Algorithms</p>

| **Algorithm** | **Type and Application** |
| :-: | :-: |
| Linear Regression | Regression |
| Logistic Regression | Classification (Linear and Binary by default) |
| k-Nearest Neighbor (k-NN) | Regression & Classification |
| Support Vector Machine (SVM) | Regression & Classification |
| Decision Tree and Random Forest | Regression & Classification |
| Deep Learning and Neural Networks<br> e.g. Multilayer Perceptrons (MLPs),<br> Convolutional Neural Networks (CNNs),<br> Recurrent Neural Networks (RNNs)| Regression & Classification<br> (can also be used as unsupervised algorithms) |

---
Unlike supervised learning, in **unsupervised learning**, the data are **unlabeled** and there is no pre-specified answer as a dependent variable. **Clustering** algorithms such as **k-Means** are common unsupervised methods. An example of a clustering problem is segmenting users of a website based on their behavior, similarities, and differences [4].
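
The clustering idea can be sketched with a small k-Means implementation. The points below are invented (for instance, pages per visit and minutes per visit for six website users), and the initial centroids are picked by hand to keep the run deterministic; practical implementations usually use randomized initialization.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, centroids, max_iterations=10):
    """Alternate assignment and centroid-update steps until nothing changes."""
    centroids = list(centroids)
    labels = []
    for _ in range(max_iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                new_centroids.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members))
                )
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return labels, centroids


# Hypothetical user-behavior data with two visibly separate groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
cluster_labels, final_centroids = kmeans(points, [points[0], points[3]])
print(cluster_labels, final_centroids)
```

Note that no labels are supplied: the grouping emerges only from distances between samples.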

In **semi-supervised learning**, the dataset contains a mixture of labeled and unlabeled samples, with the number of unlabeled samples typically being much larger. The objective of a semi-supervised learning algorithm is the same as that of a supervised one, with the expectation that using the abundant unlabeled data can help the learning algorithm train a better model [3].

**Reinforcement learning** is significantly different from the previous methods. In this approach, as shown in Figure 2.3, the learning system, called an **agent**, can observe the **environment**, receive a **state**, choose and perform **actions**, and in return receive **rewards** or **penalties** in the form of negative rewards. It must then learn (discover) for itself the best strategy, called a **policy**, that maximizes the total future reward. A policy specifies which action the agent should choose in a given situation for optimal performance [4, 6].

<img src="../figures/figure_2_3_reinforcement_learning.png" alt="Reinforcement Learning" style="width:50%; margin-left: 25%;">
<p style="text-align:center;">Figure 2.3. Reinforcement Learning [6]</p>

Reinforcement learning can be used as an **end-to-end** control algorithm in control systems. It does not require a pre-collected dataset, since the agent learns through trial and error. Furthermore, the agent learns by interacting with the environment on tasks that are significant in **reality**, so the objective function leads to practically feasible solutions [6].
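
The agent, state, action, reward, and policy vocabulary can be made concrete with tabular Q-learning, one common reinforcement learning algorithm (the text above does not prescribe a specific one). The sketch assumes a hypothetical five-cell corridor in which the agent starts in cell 0 and receives a reward of 1 on reaching cell 4; the environment, constants, and names are all invented for illustration.

```python
import random

N_STATES = 5             # corridor cells 0..4; cell 4 is the goal (terminal state)
ACTIONS = ["left", "right"]
ALPHA, GAMMA = 0.5, 0.9  # learning rate and discount factor (hypothetical values)


def step(state, action):
    """Deterministic toy environment: returns (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == "left" else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


rng = random.Random(0)  # fixed seed for reproducibility
q_table = [{a: 0.0 for a in ACTIONS} for _ in range(N_STATES)]

for episode in range(500):
    state = 0
    for _ in range(100):  # cap on steps per episode
        # Pure random exploration; Q-learning is off-policy, so it can still
        # learn the value of the greedy policy from random behavior.
        action = rng.choice(ACTIONS)
        next_state, reward, done = step(state, action)
        # Move the estimate toward reward + discounted best next value.
        best_next = 0.0 if done else max(q_table[next_state].values())
        target = reward + GAMMA * best_next
        q_table[state][action] += ALPHA * (target - q_table[state][action])
        state = next_state
        if done:
            break

# The learned policy picks the highest-valued action in each non-terminal state.
policy = [max(ACTIONS, key=lambda a: q_table[s][a]) for s in range(N_STATES - 1)]
print(policy)
```

No dataset is involved: every training signal comes from the rewards the agent collects while interacting with the environment.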

### 2.2. Based on Learning Paradigms
Machine learning systems are typically trained using either batch or online learning. In **batch learning** (or offline learning), the system is trained all at once on the entire dataset, a process that is computationally intensive and must be repeated from scratch with both old and new data to incorporate updates. This makes it unsuitable for rapidly changing environments or systems with limited resources. In contrast, **online learning** involves incremental updates by processing data sequentially, either individually or in small groups called mini-batches, allowing the system to adapt continuously and efficiently to new data in real time. While online learning is resource-efficient and ideal for dynamic data streams, its performance can be degraded by poor-quality incoming data, and its adaptability must be carefully managed via a **learning rate** parameter that balances the assimilation of new information with the retention of past knowledge [4].
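
The incremental nature of online learning can be sketched with a one-parameter model that is updated one sample at a time (a mini-batch of size one). The data stream, the drift from `y = 2x` to `y = 3x`, and the learning rate are assumptions chosen only to show adaptation.

```python
def online_update(weight, x, y, learning_rate):
    """One incremental gradient step for the model y_hat = weight * x."""
    error = weight * x - y
    return weight - learning_rate * error * x


weight = 0.0
learning_rate = 0.05  # hypothetical value: larger adapts faster but forgets faster

# Phase 1: the incoming stream follows y = 2x.
for _ in range(30):
    for x in (1.0, 2.0, 3.0):
        weight = online_update(weight, x, 2.0 * x, learning_rate)
weight_after_phase_1 = weight

# Phase 2: the relationship drifts to y = 3x; the same model keeps adapting
# without being retrained from scratch on the old data.
for _ in range(30):
    for x in (1.0, 2.0, 3.0):
        weight = online_update(weight, x, 3.0 * x, learning_rate)
weight_after_phase_2 = weight

print(weight_after_phase_1, weight_after_phase_2)
```

A batch learner would instead need to refit on the full dataset after the drift, which is the cost the paragraph above describes.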

### 2.3. Based on Generalization Approaches
Machine learning systems can be broadly categorized into instance-based and model-based learning approaches. In **instance-based learning**, the system (a lazy learner) does not build an explicit model; it stores the training examples and generalizes directly from them, using similarity metrics to compare new data points with the stored examples. Conversely, in **model-based learning**, an explicit model (an eager learner) is constructed through an iterative process that optimizes the model parameters on the training data, with model validation techniques guiding the generalization process [5].
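
The contrast can be sketched on the same invented data with a k-nearest-neighbor predictor (instance-based) and a fitted line (model-based). The data, the query value, and the function names are assumptions for illustration.

```python
# Hypothetical training data.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 6.0, 8.0, 10.0]


def knn_predict(query, k=2):
    """Instance-based: keep the data and average the k most similar stored examples."""
    nearest = sorted(zip(xs, ys), key=lambda pair: abs(pair[0] - query))[:k]
    return sum(y for _, y in nearest) / k


def fit_linear_model():
    """Model-based: compress the data into two learned parameters."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    return slope, mean_y - slope * mean_x


slope, intercept = fit_linear_model()  # training happens once, up front

instance_based_prediction = knn_predict(2.4)        # needs the stored data
model_based_prediction = slope * 2.4 + intercept    # needs only the parameters
print(instance_based_prediction, model_based_prediction)
```

The instance-based predictor does its work at query time and must keep the whole training set, whereas the model-based predictor can discard the data once the parameters are fitted.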

## 3. Life Cycle
The development of machine learning models involves several **key stages** that play important roles in most problems. From a **production perspective**, these stages are implemented as an **iterative process** known as the **machine learning life cycle**, which moves from a business problem to a machine learning solution, as shown in Figure 3.1 [7].

<img src="../figures/figure_3_1_machine_learning_life_cycle.jpg" alt="Machine Learning Life Cycle" style="width:50%; margin-left: 25%;">
<p style="text-align:center;">Figure 3.1. Machine Learning Life Cycle [7]</p>

- **Problem Definition**: This initial phase involves clearly defining the problem, the objectives and the criteria for success to establish a clear foundation.
- **Data Preparation**: This stage prepares raw data for modeling through collection, preprocessing, exploratory analysis, and feature engineering to create an optimal training dataset.
- **Model Development**: In this phase, an algorithm is selected, trained on the prepared data, and then evaluated using metrics like accuracy to ensure it meets performance standards when facing unseen data.
- **Model Deployment**: The finalized and tested model is integrated into a real-world production environment for end-users to access and utilize.
- **Monitoring and Maintenance**: After deployment, the model performance is continuously tracked to detect issues, requiring retraining with new data or a return to earlier lifecycle stages for improvement.

## 4. Data Pipeline
The foundational step for any machine learning project is **acquiring data**. It is essential to address a number of key considerations regarding the data [5].

### 4.1. Data Collection
Once the problem is defined, the critical next step is **data collection**, which involves gathering relevant raw materials from diverse sources like **surveys**, **existing databases**, **APIs** or online platforms such as **Kaggle**. Datasets are most commonly provided in the **CSV** (Comma-Separated Values) format. This simple, plain-text file type stores tabular data, similar to a spreadsheet. The effectiveness of this phase is paramount, as the quality, quantity, and variety of the data directly impact the future performance of the model. Therefore, the collected data must not only be relevant and useful for the problem at hand but also sufficiently diverse to train the model to recognize patterns across multiple scenarios [5, 7].
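
As a small illustration of the CSV format, the sketch below parses tabular text with Python's standard `csv` module. The column names and values are invented, and the text is held in a string so the example is self-contained; a real project would read from a file instead.

```python
import csv
import io

# Hypothetical CSV content: a header row followed by one sample per line.
csv_text = (
    "height_cm,weight_kg,label\n"
    "170,65,healthy\n"
    "160,72,sick\n"
    "182,80,healthy\n"
)

reader = csv.DictReader(io.StringIO(csv_text))
rows = [
    {
        "height_cm": float(record["height_cm"]),  # CSV values arrive as strings
        "weight_kg": float(record["weight_kg"]),
        "label": record["label"],
    }
    for record in reader
]
print(len(rows), rows[0])
```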

### 4.2. Data Preprocessing
Collected data is often unstructured and messy, which can negatively impact model outcomes. To improve accuracy and performance, this raw data must undergo preprocessing (or data wrangling) to address issues like **missing values**, **duplicates**, **invalid data**, and **noise**. A crucial part of this preparation is **data labeling**, which assigns meaningful tags or categories to the data, providing the essential context that supervised learning algorithms need to identify patterns. **Label encoding** converts the **non-numeric labels** of a classification problem into **numeric labels** that most algorithms can process. The goal of this entire step is to make the data more consumable and useful for analytics [5, 7].
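
Three of the steps named above (removing duplicates, handling missing values, and label encoding) can be sketched as follows. The rows are invented, and mean imputation is only one of several possible strategies for missing values; it is used here as an assumption.

```python
# Hypothetical raw data with one missing value and one duplicate row.
raw_rows = [
    {"height_cm": 170.0, "weight_kg": 65.0, "label": "healthy"},
    {"height_cm": 160.0, "weight_kg": None, "label": "sick"},
    {"height_cm": 170.0, "weight_kg": 65.0, "label": "healthy"},  # duplicate
    {"height_cm": 180.0, "weight_kg": 85.0, "label": "sick"},
]

# 1. Remove exact duplicates while preserving the original order.
seen = set()
unique_rows = []
for row in raw_rows:
    key = tuple(row.items())
    if key not in seen:
        seen.add(key)
        unique_rows.append(dict(row))

# 2. Impute missing weights with the mean of the observed weights.
observed = [r["weight_kg"] for r in unique_rows if r["weight_kg"] is not None]
mean_weight = sum(observed) / len(observed)
for r in unique_rows:
    if r["weight_kg"] is None:
        r["weight_kg"] = mean_weight

# 3. Label encoding: map each class name to an integer.
classes = sorted({r["label"] for r in unique_rows})
label_to_id = {name: index for index, name in enumerate(classes)}
encoded_labels = [label_to_id[r["label"]] for r in unique_rows]

print(unique_rows, label_to_id, encoded_labels)
```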

### 4.3. Exploratory Data Analysis (EDA)
Following data preparation, the data is explored through **statistical summaries and visualizations** to extract meaningful insights.
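
A statistical summary can be sketched with the standard `statistics` module. The two columns are invented, the helper names are assumptions, and plotting is left out to keep the example dependency-free.

```python
import math
import statistics

# Hypothetical columns of a small dataset.
heights_cm = [150.0, 160.0, 170.0, 180.0, 190.0]
weights_kg = [52.0, 58.0, 70.0, 79.0, 91.0]


def summarize(values):
    """Basic descriptive statistics for one numeric column."""
    return {
        "min": min(values),
        "max": max(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
    }


def pearson(a, b):
    """Pearson correlation coefficient between two equally long columns."""
    mean_a, mean_b = statistics.mean(a), statistics.mean(b)
    covariance = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    spread_a = sum((x - mean_a) ** 2 for x in a)
    spread_b = sum((y - mean_b) ** 2 for y in b)
    return covariance / math.sqrt(spread_a * spread_b)


height_summary = summarize(heights_cm)
height_weight_correlation = pearson(heights_cm, weights_kg)
print(height_summary, height_weight_correlation)
```

A correlation close to 1 between two features suggests that they carry overlapping information, which is useful input for the feature engineering step.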

### 4.4. Feature Engineering
Feature engineering involves creating or enhancing measurable variables (the features introduced earlier) to help machine learning models better identify patterns in data. This often works in tandem with **feature selection**, the process of choosing the most relevant and consistent features for a given problem. Together, these practices streamline datasets by **reducing size and complexity**, which is crucial for managing large-scale data and improving model efficiency [7].
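
Both ideas can be sketched on invented data: a new feature (body mass index) is derived from two raw features, and a simple filter drops any feature that never varies. The constant `sensor_id` column and the zero-variance rule are assumptions chosen for illustration; real projects often use richer selection criteria.

```python
import statistics

# Hypothetical samples; "sensor_id" is constant and carries no information.
samples = [
    {"height_cm": 170.0, "weight_kg": 65.0, "sensor_id": 1.0},
    {"height_cm": 160.0, "weight_kg": 72.0, "sensor_id": 1.0},
    {"height_cm": 182.0, "weight_kg": 80.0, "sensor_id": 1.0},
]

# Feature creation: body mass index in kg / m^2 from height and weight.
for sample in samples:
    height_m = sample["height_cm"] / 100.0
    sample["bmi"] = sample["weight_kg"] / height_m ** 2

# Feature selection (simple filter): keep only features with nonzero variance.
feature_names = list(samples[0])
selected_features = [
    name
    for name in feature_names
    if statistics.pvariance([sample[name] for sample in samples]) > 0.0
]

print(selected_features, samples[0]["bmi"])
```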

## 5. Model Pipeline
The model development phase involves constructing a machine learning model using the prepared data. This is executed through a cycle of several core steps [7].

### 5.1. Model Selection
The choice of algorithm depends on several fundamental factors such as the **data characteristics**, the **complexity** of the problem, the presence of **nonlinear patterns**, the **desired outcomes**, and most importantly, the **model alignment** with the originally defined problem [7].

### 5.2. Model Training and Validation
The **preprocessed dataset** is split into **training** and **test** sets, often with a separate **validation** set held out from the training data. During training, the algorithm is fitted to the training set to identify patterns and relationships within the features. By iteratively adjusting the model based on validation performance, its predictive accuracy improves, making it more reliable for real-world application.
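
A shuffled split into training, validation, and test subsets can be sketched as follows. The 60/20/20 proportions, the fixed seed, and the function name `split_dataset` are assumptions; suitable proportions depend on the dataset size.

```python
import random


def split_dataset(samples, train_fraction=0.6, validation_fraction=0.2, seed=0):
    """Shuffle the samples and cut them into train, validation, and test subsets."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # fixed seed for reproducibility
    n_train = round(len(shuffled) * train_fraction)
    n_validation = round(len(shuffled) * validation_fraction)
    train = shuffled[:n_train]
    validation = shuffled[n_train:n_train + n_validation]
    test = shuffled[n_train + n_validation:]  # whatever remains
    return train, validation, test


# Ten hypothetical sample identifiers stand in for real data rows.
train_set, validation_set, test_set = split_dataset(range(10))
print(len(train_set), len(validation_set), len(test_set))
```

Shuffling before splitting avoids subsets that reflect the original ordering of the data; the three subsets never overlap.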

### 5.3. Model Evaluation
As mentioned in the previous parts, generalization is one of the most important objectives of machine learning. It refers to the ability of the model to perform accurately and appropriately on scenarios and data it was not trained on. After the model is trained on the training set, it must be evaluated on the test set containing **unseen data**.

Model evaluation assesses the model performance using metrics such as **accuracy**, **precision**, **recall**, and **F1 score**, often read alongside a **confusion matrix** [7]. It is important to distinguish that validation guides the model's development **during training**, while evaluation provides a final, unbiased assessment of its performance **after training is complete**.
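
For a binary problem, these metrics all follow from the four counts of the confusion matrix. The sketch below uses invented true and predicted labels; the function name and the matrix layout (rows for the true class, columns for the predicted class) are assumptions.

```python
def classification_metrics(y_true, y_pred):
    """Confusion-matrix counts and derived metrics for binary labels (0 or 1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "confusion_matrix": [[tn, fp], [fn, tp]],  # rows: true class, columns: predicted
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical test-set labels and model predictions.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Precision answers "how many predicted positives were correct", while recall answers "how many actual positives were found"; the F1 score is their harmonic mean.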

### 5.4. Optimization
If the evaluation results are **unsatisfactory**, the model undergoes **hyperparameter tuning** to improve its predictive accuracy. This iterative cycle is essential for developing a reliable and accurate model [7]. Note that tuning should be guided by validation performance rather than by the test set, so that the final evaluation remains an unbiased estimate.
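
Hyperparameter tuning can be sketched as a small grid search: each candidate value is scored on a validation set and the best one is kept. The example tunes the number of neighbors `k` of a k-NN classifier on invented one-dimensional data that contains two deliberately mislabeled training points; the data, candidate values, and names are assumptions.

```python
# Hypothetical training data (feature, label); the points at 3.0 and 8.0 are noisy.
train = [(1.0, 0), (2.0, 0), (3.0, 1), (4.0, 0), (5.0, 0),
         (6.0, 1), (7.0, 1), (8.0, 0), (9.0, 1), (10.0, 1)]
# Hypothetical validation data held out from training.
validation = [(2.9, 0), (8.1, 1), (1.5, 0), (9.5, 1)]


def knn_classify(x, k):
    """Majority vote among the k nearest training samples."""
    nearest = sorted(train, key=lambda sample: abs(sample[0] - x))[:k]
    votes_for_1 = sum(label for _, label in nearest)
    return 1 if 2 * votes_for_1 > k else 0


def validation_accuracy(k):
    """Fraction of validation samples classified correctly for a given k."""
    correct = sum(1 for x, label in validation if knn_classify(x, k) == label)
    return correct / len(validation)


candidate_ks = [1, 3, 5]  # the hyperparameter grid
scores = {k: validation_accuracy(k) for k in candidate_ks}
best_k = max(candidate_ks, key=lambda k: scores[k])  # first best value wins ties
print(scores, best_k)
```

With `k = 1` the classifier copies the noisy training labels and misclassifies nearby validation points, while larger values of `k` smooth the noise out. The test set is not touched during this search.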

## 6. Deployment, Monitoring and Maintenance
Model deployment is the stage of the machine learning life cycle where an **evaluated model** is transitioned from a **development environment** into a live **production system**, enabling real-world use. This is typically achieved by packaging the model into an **API (Application Programming Interface)**, which acts as a standardized gateway for receiving input and returning predictions. This API can then be consumed by various client applications, such as **web platforms**, **mobile apps**, or **native desktop software**, seamlessly integrating the model's intelligence into user-facing products and business workflows. Successful deployment also requires **continuous monitoring** to ensure performance, scalability, and reliability after launch.
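
The core of such an API is a function that accepts a request body and returns a prediction. The sketch below assumes a JSON request of the form `{"features": [...]}` and a hypothetical two-feature linear model with fixed coefficients; the request format, the coefficients, and the function names are all invented, and a real service would wrap this logic in a web framework.

```python
import json

# Hypothetical trained model: fixed coefficients of a two-feature linear model.
WEIGHTS = [0.5, -0.25]
BIAS = 1.0


def predict(features):
    """Apply the (hypothetical) trained model to one feature vector."""
    return sum(w * x for w, x in zip(WEIGHTS, features)) + BIAS


def handle_request(body):
    """Parse a JSON request body and return a JSON response string."""
    try:
        features = json.loads(body)["features"]
        if len(features) != len(WEIGHTS):
            raise ValueError("expected %d features" % len(WEIGHTS))
        return json.dumps({"prediction": predict([float(x) for x in features])})
    except (KeyError, TypeError, ValueError) as exc:
        # Malformed input yields an error message instead of a crash.
        return json.dumps({"error": str(exc)})


print(handle_request('{"features": [1.0, 2.0]}'))
print(handle_request('{"features": [1.0]}'))
```

Validating the input at the gateway keeps malformed requests from reaching the model, and logging each request and prediction at this point is one way to feed the monitoring described above.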

## References
[1] Deisenroth, M. P., Faisal, A. A., and Ong, C. S. "Mathematics for Machine Learning", Cambridge University Press, 2020.

[2] Burkov, A. "The Hundred-Page Machine Learning Book", Andriy Burkov, 2019.

[3] Hossain, E. "Machine Learning Crash Course for Engineers", Springer, 2024.

[4] Géron, A. "Hands-on Machine Learning with Scikit-Learn, Keras, and TensorFlow", Second Edition, O’Reilly, 2020.

[5] "Machine Learning with Python Tutorial", Tutorialspoint, https://www.tutorialspoint.com/machine_learning_with_python/index.htm, Accessed October 2025.

[6] Geiger, A. "Self-Driving Cars", Lecture Slides, University of Tübingen, 2023, https://uni-tuebingen.de/fakultaeten/mathematisch-naturwissenschaftliche-fakultaet/fachbereiche/informatik/lehrstuehle/autonomous-vision/lectures/self-driving-cars/, Accessed September 2025.

[7] "Machine Learning (ML) Tutorial", Tutorialspoint, https://www.tutorialspoint.com/machine_learning/index.htm, Accessed October 2025.