## Neural Networks Intro
This section introduces artificial neural networks (ANNs).

### Biological neural networks

The human brain can be roughly divided into an old brain and a new brain that sits on top of it. The old brain encompasses several old -- in developmental terms -- structures which are responsible for specific, mostly _unconscious_ tasks such as breathing and temperature regulation. It also provides quick responses that support the individual's survival and therefore plays a major role in basic feelings like fear and anger. Most of the time, we are not even aware that these tasks are being managed by our old brain. On the other hand, the new brain is responsible for executing higher-level tasks that typically require slower responses and can eventually reach our _attention_:
* creating an internal representation of the external world;
* performing higher-level reasoning and decision making;
* creating and maintaining memories of events and facts;
* following this fascinating piece of text 😉;
* among many other high-level tasks.

::::{margin}
:::{note}
The old brain keeps us from unintentionally getting ourselves killed every day, e.g. by waiting too long to run away from a hungry beast.
:::
::::

The new brain comprises several structures that run on top of the old brain. One can think of the old brain as the _operating system_ of a modern computer, which takes care of hardware-specific issues, whereas the new brain is more like the software applications that run on top of the operating system and take advantage of all the hardware-independent facilities it provides. Keep in mind though that the implications of this weak analogy stop right there.

The neocortex is the most prominent structure of the new brain and it is paramount for such high-level tasks. The neocortex is so important that some researchers claim that we are our neocortexes {cite}`hawkins2021thousand`. Anatomically, the neocortex consists of the so-called _gray matter_ that forms the outer sheet of mammalian brains. In primates, it has deep grooves (sulci) and ridges (gyri) so that it fits inside the skull despite its huge size. If completely stretched out (unfolded), the human neocortex has approximately the size and shape of a thick napkin, which covers the deeper white matter of the brain. In contrast with the structures of the old brain, the neocortex has a surprisingly homogeneous arrangement of neuronal cells. More specifically, cortical cells are arranged in layers -- six layers in the human brain -- and are mostly grouped into so-called cortical columns across the neocortical sheet. However, we still do not fully understand exactly how these layers interact within a cortical column, or how cortical columns interact with nearby columns (in the same neocortex region) and with distant columns (in different neocortex regions).

::::{margin}
:::{note}
Some researchers {cite}`edelman1978mindful` go further and claim that the 6-layer neuronal circuit within cortical columns is the basic modular circuit, replicated everywhere in the neocortex, which runs a common _algorithm_ to deal with all kinds of tasks, e.g. sensory and motor tasks.
:::
::::

:::{figure} /images/neuralnets/neocortex.jpg
---
height: 320px
name: neocortex_fig
align: left
---
The neocortex (adapted from {cite}`hawkins2021nice`).
:::

The grayish appearance of this _sheet_ is due to the massive presence of neuronal cell bodies. On the other hand, the so-called white matter beneath this external sheet corresponds to the brain _wiring_ that connects cells from different regions of the neocortex. The regions are typically connected hierarchically, such that sensory regions and higher-reasoning regions correspond, respectively, to lower and higher levels of this hierarchy.

The basic computational unit of the brain is the biological neuron cell. There are several types of neuron cells, but they share a basic anatomy. Biological neurons have three main parts: a main body a.k.a. soma, a dendrite tree and an axon. The **dendrites** make a lot of connections a.k.a. synapses to other cells, the input neurons. The **soma** in turn integrates the stimuli, or spikes, received from the input neurons and activates, i.e. generates a spike itself, when a certain threshold is reached. Roughly speaking, the neuron cell activates when multiple input spikes corresponding to a learned pattern reach its dendrites at about the same time. Finally, the **axon** carries the generated spikes away from the soma; it is coated with a fatty insulation sheath called myelin, which allows neurons to efficiently transmit spikes over long distances. Eventually, the spikes reach the dendrite trees of the multiple neurons that form synapses with the axon terminals. Note though that, despite being widely used, this model is oversimplified. Some researchers, for example, argue that proximal dendrites -- dendrites close to the soma -- and distal dendrites play different roles in neuronal activation {cite}`hawkins2016neurons`.

:::{prf:observation}
A lot of data has been collected about the human brain. Researchers know a lot about the anatomy of the brain and the role that different regions of the neocortex play in specific tasks. However, to date, no overall theory of the brain exists. There are some promising efforts in this direction. As an appetizer, refer to the work of the company Numenta {cite}`hawkins2021thousand`.
:::

### Artificial neural networks (ANNs)
ANNs are computational (mathematical) models inspired by our limited understanding of how some mechanisms in the brain might work. Similar to the biological brain, they include multiple interconnected layers of computational units -- called artificial neurons -- that cooperate to perform a task. Although a single artificial neuron is not able to perform a _complex task_, the _trained network_ is. In this sense, _intelligent behavior_, i.e. being able to perform the task, is an emergent ability of the artificial network. Examples of challenging tasks are image processing, e.g. classifying objects in images, and natural language processing, e.g. voice recognition (speech-to-text), voice synthesis (text-to-speech) and translation (text-to-text). These are _apparently easy_ tasks for the human brain, but it is very hard to explicitly design a computer program that solves them without the machine learning approach, which consists of teaching a computer program to perform a task using data (typically, a lot of data is required).

{numref}`biological_neuron_fig` and {numref}`artificial_neuron_fig` illustrate respectively a typical biological neuron and its artificial model counterpart. Note that the artificial neuron is an oversimplified model of the biological one. First of all, whereas the intensity of a biological neuron response is related to how frequently it spikes over time, the output of an artificial neuron -- associated with the index $ j $ in {numref}`artificial_neuron_fig` -- at any time is often represented by a real number $ y_{j} $. Moreover, the strength of a biological synapse depends on several factors, e.g. its proximity to the soma, while the strength of an artificial synapse is represented by a real number $ w_{i,j} $ which simply weights the contribution of the corresponding input $ x_{i} $ to the global activation of the $ j $-th neuron. In this sense, the artificial neuron activation is just a weighted sum (a linear combination) of its $ n $ inputs $$ a_{j} = \sum_{i=1}^{n} w_{i,j} x_{i}. $$ Note also that the biological neuron activation produces an electrical spike which depends in turn on several factors, e.g. the time since the last spike produced by the cell. On the other hand, the output value of the artificial neuron is determined by a non-linear activation function $\phi:\mathbb{R} \rightarrow \mathbb{R} $ such that $$ y_{j} = \phi(a_{j}). $$ Lastly, to the best of our knowledge, learning in the brain can be achieved by either strengthening existing biological synapses or creating new ones. In weighted ANNs, the network architecture and connections are often fixed, and learning is achieved instead by adapting the weights of the available artificial synapses.
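
The two formulas above can be sketched in a few lines of Python. This is only a minimal illustration: the function names, the choice of the logistic function for $ \phi $ and the numeric values are assumptions made for the example, not part of the model itself.

```python
import math


def neuron_output(weights, inputs, phi):
    """Compute y_j = phi(a_j), with a_j = sum_i w_ij * x_i."""
    if len(weights) != len(inputs):
        raise ValueError("weights and inputs must have the same length")
    # Activation a_j: weighted sum of the n inputs.
    a_j = sum(w * x for w, x in zip(weights, inputs))
    # Output y_j: the non-linear activation function applied to a_j.
    return phi(a_j)


def sigmoid(a):
    """Logistic function, one common (assumed) choice for phi."""
    return 1.0 / (1.0 + math.exp(-a))


# Hypothetical values for a neuron j with n = 3 inputs.
x = [1.0, 0.5, -1.0]    # inputs x_i
w_j = [0.2, -0.4, 0.1]  # synaptic weights w_ij
y_j = neuron_output(w_j, x, sigmoid)
print(y_j)
```

Any other function $ \phi:\mathbb{R} \rightarrow \mathbb{R} $ can be passed in place of `sigmoid`, since the weighted sum and the activation function are kept separate.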

:::{figure} /images/neuralnets/biological_neuron.png
---
height: 320px
name: biological_neuron_fig
align: left
---
Biological neuron.
:::

:::{figure} /images/neuralnets/artificial_neuron.png
---
height: 320px
name: artificial_neuron_fig
align: left
---
Artificial neuron.
:::
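
To make the contrast between a single neuron and a network concrete, the sketch below wires three artificial neurons into two layers that compute the exclusive-or (XOR) of two binary inputs, a function that no single threshold neuron can represent because it is not linearly separable. Several things here are assumptions made for the illustration: the step activation function, the extra constant input of 1 used to implement a threshold, and the weights, which are picked by hand instead of being learned from data.

```python
def step(a):
    """Threshold activation (assumed): the neuron fires when a > 0."""
    return 1.0 if a > 0 else 0.0


def layer(weight_rows, inputs, phi):
    """Apply y_j = phi(sum_i w_ij * x_i) for every neuron j in a layer."""
    return [phi(sum(w * x for w, x in zip(row, inputs))) for row in weight_rows]


# Hand-picked (not trained) weights; the last weight of each row multiplies
# a constant input of 1.0 and therefore acts as a threshold.
HIDDEN = [
    [1.0, 1.0, -0.5],  # fires if x1 OR x2 is active
    [1.0, 1.0, -1.5],  # fires if x1 AND x2 are both active
]
OUTPUT = [
    [1.0, -1.0, -0.5],  # fires if OR is active but AND is not
]


def xor_net(x1, x2):
    """Two-layer network: inputs -> hidden layer -> single output neuron."""
    hidden = layer(HIDDEN, [x1, x2, 1.0], step)
    return layer(OUTPUT, hidden + [1.0], step)[0]


for x1 in (0.0, 1.0):
    for x2 in (0.0, 1.0):
        print(x1, x2, xor_net(x1, x2))
```

In practice the weights are not set by hand like this; they are adapted from data during training.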

:::{prf:remark}
By performing complex tasks, ANNs are able to exhibit intelligent behavior. Eventually, a trained ANN may even pass a Turing test designed to check whether that particular task was executed by a human or not. Note though that there is an endless debate about the difference between being truly intelligent and merely showing intelligent behavior. Refer to the Chinese Room thought experiment {cite}`standford2020chinese` (just out of curiosity; do not invest too much time there). For now, it suffices to say that we are intelligent even when showing no behavior at all, e.g. while standing still and paying attention to a lecture.
:::

:::{admonition} See also
In the typical isolated learning paradigm {cite}`chen2018lifelong`, ANNs are designed and trained to execute a single, isolated task. Furthermore, in contrast with the human brain, ANNs are often not able to learn from new experiences. Specifically, once they are trained, their knowledge -- typically stored in the weights of the artificial synapses -- is frozen. Thus, after deployment, during the recall phase, artificial networks are not able to learn -- that is, update their weights -- from experience, i.e. by observing new input data. Lastly, learning to execute new tasks by accumulating new experiences is a widely researched topic today, known as continuous learning or lifelong learning. One of the challenges faced by continuous learning methods based on conventional ANNs is to adjust the network parameters (weights) to learn new tasks while avoiding the catastrophic forgetting of previously learned tasks (refer to {cite}`chen2018lifelong` for an overview, but do not spend too much time there).
:::
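
The difference between the training phase and the recall phase, as well as the forgetting problem, can be illustrated with a deliberately tiny sketch. Everything in it is an assumption made for the example: the model is a single weight with an identity activation, learning uses plain gradient descent on the squared error, and the two hypothetical tasks are chosen to be in direct conflict so that the effect is easy to see.

```python
def predict(w, x):
    """Recall phase: the weight is used as is and never updated."""
    return w * x


def train(w, samples, lr=0.1, epochs=200):
    """Training phase: adapt the weight by gradient descent on squared error."""
    for _ in range(epochs):
        for x, target in samples:
            error = predict(w, x) - target
            w -= lr * error * x  # gradient of 0.5 * error**2 with respect to w
    return w


def mean_squared_error(w, samples):
    """Average squared error of the frozen weight over a set of samples."""
    return sum((predict(w, x) - t) ** 2 for x, t in samples) / len(samples)


# Two hypothetical tasks that demand different weights (2 and -1).
task_a = [(x, 2.0 * x) for x in (-1.0, 0.5, 1.0)]
task_b = [(x, -1.0 * x) for x in (-1.0, 0.5, 1.0)]

w = train(0.0, task_a)                           # learn task A
error_a_before = mean_squared_error(w, task_a)   # close to zero

w = train(w, task_b)                             # keep training, now on task B
error_a_after = mean_squared_error(w, task_a)    # task A is forgotten
error_b_after = mean_squared_error(w, task_b)    # close to zero

print(error_a_before, error_a_after, error_b_after)
```

Calling `predict` any number of times leaves `w` untouched, which mirrors the frozen knowledge of a deployed network. Resuming training on task B alone overwrites the weight that encoded task A, which is a toy version of catastrophic forgetting.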

:::{prf:remark}
Recently, with the advances of deep neural networks -- ANNs with several hidden layers --, interesting results from the literature show ANNs surpassing human ability in several tasks. However, most of these results were obtained by designing and training ANNs to execute a single specific task. This is known as _narrow AI_, as opposed to _general AI_, in which an intelligent agent would be able to learn new tasks -- potentially any task that a human being can perform -- by accumulating and learning from previous experiences on the fly, for instance, by executing other tasks. Nevertheless, there is a heated debate today about whether humankind should leave general AI to science fiction or not (see {cite}`thorn2015nick`). For now, it suffices to think about some interesting questions: is there a limit on how intelligent an artificial agent can be? Conversely, which factors limit human intelligence? Note that, compared to an ant, one can argue that anyone's intelligence level is as high as Einstein's. Specifically, all humans apparently exhibit the same level of intelligence w.r.t. any ant. The crucial question is: what is intelligence, after all? Can we effectively measure it? Surprisingly, to date, there is no overall consensus on how to answer those questions. Thus, let us put the philosophical aspects aside and check how we can train and employ ANNs to solve practical problems...
:::