In [None]:
'''
 * Copyright (c) 2018 Radhamadhab Dalai
 *
 * Permission is hereby granted, free of charge, to any person obtaining a copy
 * of this software and associated documentation files (the "Software"), to deal
 * in the Software without restriction, including without limitation the rights
 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 * copies of the Software, and to permit persons to whom the Software is
 * furnished to do so, subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included in
 * all copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
 * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
 * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
 * THE SOFTWARE.
'''

$$
\textbf{Machine Learning - Man Vs Machine}
$$

Machine learning (ML) is a subset of artificial intelligence that builds a mathematical model from sample data, known as “training data,” in order to make predictions or decisions without being explicitly programmed to perform the task.

A good definition given by Mitchell [181] is:

$$
\text{A computer program is said to learn from experience } E \text{ with respect to some class of tasks } T \text{ and performance measure } P \text{ if its performance at tasks in } T, \text{ as measured by } P, \text{ improves with experience } E.
$$

In machine learning, neural networks, support vector machines, and evolutionary computation, we are usually given a training set and a test set. The training set is the union of the labeled set and the unlabeled set of examples available to the machine learner, whereas the test set consists of examples the learner has never seen before.

Let 

$$
(X_l, Y_l) = \{(x_1, y_1), \ldots, (x_l, y_l)\}
$$

denote the labeled set, where \(x_i \in \mathbb{R}^D\) is the \(i\)-th \(D\)-dimensional data vector and \(y_i \in \mathbb{R}\) or \(y_i \in \{1, \ldots, M\}\) is the corresponding label of the data vector \(x_i\). In regression problems, \(y_i\) is the regression (fitted) value of \(x_i\). In classification problems, \(y_i\) is the class label of \(x_i\) among the \(M\) classes of targets. The data vectors \(x_i\), \(i = 1, \ldots, l\) are observed by the user, while the labels \(y_i\), \(i = 1, \ldots, l\) are assigned by data labeling experts or supervisors. The unlabeled set consists of data vectors without labels and is denoted by 

$$
X_u = \{x_1, \ldots, x_u\}.
$$

Machine learning aims to establish a regressor or classifier by learning from the training set, and then to evaluate the performance of that regressor or classifier on the test set.
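The train-then-evaluate workflow described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the synthetic two-class data, the 70/30 split, the nearest-centroid classifier, and all function names (`make_blobs`, `fit_centroids`, `predict`, `accuracy`) are hypothetical choices that do not come from the text. Class labels are 0-based here, whereas the text uses \(\{1, \ldots, M\}\).

```python
import random

def make_blobs(n_per_class, centers, spread, rng):
    """Draw n_per_class 2-D points around each center; the label is the center index."""
    data = []
    for label, (cx, cy) in enumerate(centers):
        for _ in range(n_per_class):
            x = (rng.gauss(cx, spread), rng.gauss(cy, spread))
            data.append((x, label))
    return data

def fit_centroids(labeled):
    """Learn one mean vector per class from the labeled pairs (x_i, y_i)."""
    sums, counts = {}, {}
    for (x0, x1), y in labeled:
        s = sums.setdefault(y, [0.0, 0.0])
        s[0] += x0
        s[1] += x1
        counts[y] = counts.get(y, 0) + 1
    return {y: (s[0] / counts[y], s[1] / counts[y]) for y, s in sums.items()}

def predict(centroids, x):
    """Assign x to the class whose centroid is nearest (squared Euclidean distance)."""
    return min(
        centroids,
        key=lambda y: (x[0] - centroids[y][0]) ** 2 + (x[1] - centroids[y][1]) ** 2,
    )

def accuracy(centroids, labeled):
    """Fraction of labeled examples whose predicted label matches the true label."""
    hits = sum(predict(centroids, x) == y for x, y in labeled)
    return hits / len(labeled)

rng = random.Random(0)  # fixed seed so the sketch is reproducible
data = make_blobs(50, centers=[(0.0, 0.0), (5.0, 5.0)], spread=1.0, rng=rng)
rng.shuffle(data)

# Assumed 70/30 split: the classifier never sees the test examples during fitting.
train_set, test_set = data[:70], data[70:]

centroids = fit_centroids(train_set)
test_accuracy = accuracy(centroids, test_set)
print(f"test accuracy: {test_accuracy:.2f}")
```

Because the two synthetic classes are well separated, the test accuracy should be high; on real data the gap between training and test performance is the quantity of interest.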

According to the nature of training data, we can classify machine learning as follows:

$$
\textbf{1. Regular or Euclidean Structured Data Learning}
$$

- \textbf{Supervised Learning}: Given a training set consisting of labeled data 

$$
(X_{\text{train}}, Y_{\text{train}}) = \{(x_1, y_1), \ldots, (x_l, y_l)\},
$$

supervised learning learns a general rule that maps inputs to outputs. This is like a “teacher” or supervisor (a data labeling expert) giving a student a problem (finding the mapping between inputs and outputs) together with its solutions (the labeled output data), and then asking the student to work out how to solve other, similar problems, that is, to map the features of unseen samples to their correct labels or target values.
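As a small worked instance of learning a rule that maps inputs to outputs, the sketch below fits a straight line to labeled pairs by ordinary least squares and then applies it to an unseen input. The linear model, the toy data generated from \(y = 2x + 1\), and the names `fit_line` and `predict` are assumptions made for illustration only.

```python
def fit_line(xs, ys):
    """Closed-form least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Labeled training set (x_i, y_i); the targets follow y = 2x + 1 exactly.
x_train = [0.0, 1.0, 2.0, 3.0, 4.0]
y_train = [1.0, 3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(x_train, y_train)

def predict(x):
    """Apply the learned rule to an input that was not in the training set."""
    return slope * x + intercept

y_new = predict(10.0)
print(slope, intercept, y_new)
```

The learned rule generalizes to the unseen input only because the assumed model class (a line) matches the process that generated the labels.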

- \textbf{Unsupervised Learning}: In unsupervised learning, the training set consists of the unlabeled set only 

$$
X_{\text{train}} = X_u = \{x_1, \ldots, x_u\}.
$$

The main task of the machine learner is to find the solutions on its own (i.e., patterns, structures, or knowledge in unlabeled data). This is like giving a student a set of patterns and asking him or her to figure out the underlying motifs that generated the patterns.
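One common way to find structure in unlabeled data is clustering. The sketch below uses k-means as an example; the text does not prescribe a particular algorithm, so the choice of k-means, the toy points, the hand-picked initial centers, and the names `squared_distance`, `assign`, and `k_means` are all assumptions.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def assign(points, centers):
    """Index of the nearest center for every point."""
    return [
        min(range(len(centers)), key=lambda j: squared_distance(p, centers[j]))
        for p in points
    ]

def k_means(points, initial_centers, iterations=10):
    """Alternate between assigning points to centers and re-averaging the centers."""
    centers = list(initial_centers)
    for _ in range(iterations):
        labels = assign(points, centers)
        for j in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous center
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return centers, assign(points, centers)

# Unlabeled set X_u: two visually obvious groups, but no labels are given.
X_u = [
    (0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0),
    (10.0, 10.0), (10.0, 11.0), (11.0, 10.0), (11.0, 11.0),
]

# Assumed initialisation: one point from each end of the data.
centers, cluster_ids = k_means(X_u, initial_centers=[X_u[0], X_u[-1]])
print(centers)
print(cluster_ids)
```

The result of k-means depends on the initial centers; a poor initialisation can leave it in a worse grouping, which is why the starting points are fixed explicitly here.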

- \textbf{Semi-supervised Learning}: Given a training set 

$$
(X_{\text{train}}, Y_{\text{train}}) = \{(x_1, y_1), \ldots, (x_l, y_l)\} \cup \{x_{l+1}, \ldots, x_{l+u}\}
$$

with \(l \ll u\), i.e., we are given a small amount of labeled data together with a large amount of unlabeled data. Semi-supervised learning falls between unsupervised learning (without any labeled training data) and supervised learning (with completely labeled training data). Depending on how the data is labeled, semi-supervised learning can be divided into the following categories:

  - \textbf{Self-training} is a semi-supervised learning method in which the learner uses its own predictions to teach itself.
  - \textbf{Co-training} is a weakly semi-supervised learning method for multi-view data, in which the learners trained on the different views use their own predictions to teach themselves.
  - \textbf{Active Learning} is a semi-supervised learning method in which the learner plays an active or participatory role in determining which data points it will ask an expert or teacher to label.
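The self-training idea can be sketched as a loop in which a classifier fitted on the labeled set pseudo-labels the unlabeled points it is most confident about and then refits. Everything specific below is an assumption for illustration: the 1-D data, the class-mean classifier, the distance margin used as a confidence score, the batch size, and the names `class_means`, `predict_with_margin`, and `self_train`.

```python
def class_means(labeled):
    """Mean of the 1-D inputs for each class label."""
    groups = {}
    for x, y in labeled:
        groups.setdefault(y, []).append(x)
    return {y: sum(xs) / len(xs) for y, xs in groups.items()}

def predict_with_margin(means, x):
    """Return (nearest class, margin); a larger margin means a more confident prediction."""
    ranked = sorted(means, key=lambda y: abs(x - means[y]))
    best, second = ranked[0], ranked[1]
    margin = abs(x - means[second]) - abs(x - means[best])
    return best, margin

def self_train(labeled, unlabeled, batch=2):
    """Repeatedly pseudo-label the most confident unlabeled points and refit."""
    labeled = list(labeled)
    pool = list(unlabeled)
    while pool:
        means = class_means(labeled)
        # Rank the remaining unlabeled points, most confident first.
        ranked = sorted(
            pool, key=lambda x: predict_with_margin(means, x)[1], reverse=True
        )
        for x in ranked[:batch]:
            pseudo_label = predict_with_margin(means, x)[0]
            labeled.append((x, pseudo_label))  # the learner's own prediction becomes a label
            pool.remove(x)
    return class_means(labeled), labeled

# Small labeled set (l = 2) and a larger unlabeled set (u = 8).
labeled_set = [(0.0, 0), (10.0, 1)]
unlabeled_set = [1.0, 2.0, 3.0, 4.0, 6.0, 7.0, 8.0, 9.0]

final_means, augmented = self_train(labeled_set, unlabeled_set)
pseudo_labels = dict(augmented)
print(final_means)
```

A known risk of this scheme is that early mistakes are reinforced, since wrongly pseudo-labeled points are treated as ground truth in later rounds.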

- \textbf{Reinforcement Learning}: Training data, in the form of rewards and punishments, is given only as feedback to an artificial intelligence agent acting in a dynamic environment. The agent uses this feedback from its interaction experience to improve its performance on the task being learned. Machine learning based on such feedback is called reinforcement learning. Q-learning is a popular model-free reinforcement learning method that learns an action-value function, simply called the Q-function, from the rewards and punishments received.
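A minimal sketch of tabular Q-learning is given below, using the standard update \(Q(s,a) \leftarrow Q(s,a) + \alpha\,[r + \gamma \max_{a'} Q(s',a') - Q(s,a)]\). The environment is a hypothetical five-state chain invented for this example, and the learning rate, discount factor, episode count, and uniformly random behaviour policy are likewise assumptions rather than choices taken from the text.

```python
import random

N_STATES = 5        # states 0..4; state 4 is the terminal goal
ACTIONS = (-1, +1)  # action 0 moves left, action 1 moves right

def step(state, action):
    """Hypothetical deterministic chain: reward 1 on reaching the goal, otherwise 0."""
    next_state = min(max(state + ACTIONS[action], 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done

def q_learning(episodes=500, alpha=0.5, gamma=0.9, seed=0):
    """Learn a Q-table from interaction only; no model of the environment is used."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Q-learning is off-policy, so a uniformly random behaviour policy suffices here.
            action = rng.randrange(2)
            next_state, reward, done = step(state, action)
            # Bootstrapped target: immediate reward plus discounted best next value.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q

q_table = q_learning()

# Greedy policy read off the learned Q-function for the non-terminal states.
greedy_policy = [max((0, 1), key=lambda a: q_table[s][a]) for s in range(N_STATES - 1)]
print(greedy_policy)
```

In this chain the learned greedy policy moves right in every non-terminal state, since that is the shortest route to the only rewarded state.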

- \textbf{Transfer Learning}: In many real-world applications, the data distribution changes or the data become outdated, so transfer learning is needed to transfer knowledge from the source domain to the target domain. Transfer learning includes, but is not limited to, inductive transfer learning, transductive transfer learning, unsupervised transfer learning, multitask learning, self-taught transfer learning, domain adaptation, and EigenTransfer.

$$
\textbf{2. Graph Machine Learning}
$$

Graph machine learning is the learning of irregular or non-Euclidean structured data. It learns the structure of a graph from training samples, a task also called graph construction, in the semi-supervised and unsupervised learning cases.
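To make graph construction concrete, the sketch below builds a symmetric \(k\)-nearest-neighbor graph from a handful of samples. The text does not say which construction is intended, so the \(k\)-nearest-neighbor rule, the toy 1-D samples, and the names `squared_distance` and `knn_graph` are assumptions for illustration only.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def knn_graph(points, k):
    """Connect every sample to its k nearest samples; edges are made undirected."""
    n = len(points)
    adjacency = {i: set() for i in range(n)}
    for i in range(n):
        others = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: squared_distance(points[i], points[j]),
        )
        for j in others[:k]:
            adjacency[i].add(j)
            adjacency[j].add(i)  # symmetrise so the graph is undirected
    return adjacency

samples = [(0.0,), (1.0,), (2.5,), (10.0,), (11.0,)]
graph = knn_graph(samples, k=1)
print(graph)
```

With these samples the graph splits into two connected components, which mirrors the two groups in the data; larger values of `k` would link them.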

The above classification of machine learning can be vividly represented by a machine learning tree, as shown in Fig. 6.1.

The machine learning tree is so called because Fig. 6.1 looks like a tree when rotated 90° to the left.

There are two basic tasks of machine learning: classification (predicting discrete class labels) and prediction (predicting continuous values, as in regression).

Deep learning learns the underlying regularities and hierarchical representation levels of sample data. The information obtained at the different hierarchy levels during learning is very helpful for interpreting data such as text, images, and speech. Deep learning is a family of complex machine learning algorithms that has achieved much better results in speech and image recognition than earlier related technologies.

Owing to space limitations, this chapter does not discuss deep learning and focuses only on supervised learning, unsupervised learning, reinforcement learning, and transfer learning.

Before dealing with machine learning in detail, it is necessary to start with some preliminaries: its optimization problems, majorization-minimization algorithms, and how to boost a weak learning algorithm into a strong learning algorithm.

![Decision Directed Acyclic Graphs (DDAG)](mvm1.png)