
**Title:** Non-Ontogenic Sparse Neural Networks

**Authors:** D. Elizondo, E. Fiesler, and J. Korczak

**Aim of the Paper:**
The paper presents an extensive study of non-ontogenic sparsely connected neural networks (SCNNs), which are defined by a static topology that remains unchanged during learning. It surveys methods for achieving sparse connectivity in neural networks and discusses their advantages over fully connected neural networks (FCNNs).

**What It Proposes:**
- The paper proposes an in-depth classification of non-ontogenic SCNNs based on different methodologies:
  - **Theoretical and Experimental Studies:** Focuses on training dynamics and performance in sparse networks.
  - **Biological Neural Networks:** Connectivity strategies inspired by biological systems.
  - **Application-Dependent Methods:** Techniques tailored to specific tasks, such as speech recognition and image processing.
  - **Modular Networks:** Divide complex tasks into smaller sub-networks.
  - **Hardware Implementation:** Explores partial connectivity strategies for analog and digital implementations of neural networks.
  - **Hybrid Methods:** Combines symbolic knowledge and neural networks, as well as genetic programming to optimize network topology.

**Key Points:**
- **Advantages of SCNNs:**
  - Reduced training and recall time.
  - Improved generalization capabilities.
  - Reduced hardware requirements.
  - A closer resemblance to biological neural systems.
  
- **Approaches:** Various methods are discussed, including pruning connections, modularity, random selection of connections, and biological connectivity patterns.
  
**Conclusion:**
Non-ontogenic sparse neural networks offer several benefits compared to fully connected ones, particularly in terms of efficiency and generalization. The paper categorizes different approaches to achieving sparse connectivity, providing a comprehensive view of the field.


# Detailed Review

### **1. Introduction**

#### **1.1 Overview of Fully Connected Neural Networks (FCNNs)**
- **Definition:** Fully connected neural networks (FCNNs) are the default configuration for most artificial neural networks: every neuron in one layer is connected to every neuron in the next layer.
- **Drawbacks:**
  - **High Complexity:** FCNNs often result in high complexity and redundancy in connections.
  - **Training Overhead:** Leads to longer training and recall times.
  - **Decreased Generalization:** The redundancy can negatively impact the model’s ability to generalize to new data.

#### **1.2 The Concept of Sparsely Connected Neural Networks (SCNNs)**
- **Definition:** SCNNs are networks with fewer connections, either designed from the start or modified during training to remove unnecessary connections.
- **Advantages of SCNNs:**
  - **Reduced Training and Recall Time:** Less complexity results in faster computations.
  - **Improved Generalization:** Sparse connections reduce the risk of overfitting.
  - **Lower Hardware Requirements:** Fewer connections mean less memory and hardware resources are required.
  - **Closer to Biological Networks:** SCNNs more closely resemble biological neural networks, which are sparsely connected.
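
To make the memory and compute argument above concrete, the following minimal sketch counts the weights in a fully connected feedforward network and in a sparse one that keeps a fixed share of each layer's links. The layer sizes and the 25% density are made-up values for illustration and do not come from the paper.

```python
def count_connections(layer_sizes):
    """Number of weights in a fully connected feedforward network."""
    return sum(a * b for a, b in zip(layer_sizes, layer_sizes[1:]))


def sparse_count(layer_sizes, density):
    """Connections left if every layer pair keeps `density` of its links."""
    return sum(round(a * b * density)
               for a, b in zip(layer_sizes, layer_sizes[1:]))


layers = [64, 32, 10]                  # hypothetical layer sizes
full = count_connections(layers)       # 64*32 + 32*10
sparse = sparse_count(layers, 0.25)    # a quarter of each layer's links

print(f"fully connected: {full} weights, sparse: {sparse} weights")
```

Fewer weights mean fewer multiply-accumulate operations per forward pass and less storage, which is the mechanism behind the reduced training and recall time and the lower hardware requirements listed above.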

#### **1.3 Ontogenic vs. Non-Ontogenic Methods**
- **Ontogenic Methods:** These modify the network topology during the learning phase, typically by pruning or adding connections as part of the learning process.
- **Non-Ontogenic Methods:** The focus of this paper. These methods define the network’s topology before training begins, and it remains static throughout the learning process.
  - **Emphasis on Non-Ontogenic SCNNs:** The paper will not cover ontogenic methods extensively and instead focuses on static, pre-defined sparse topologies.
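
One way to picture a non-ontogenic topology is as a fixed binary mask applied to the weights: the mask is chosen before training and every update respects it. The sketch below is an assumption about how this could be implemented, not a procedure taken from the paper; `masked_sgd_step`, the mask, and the gradients are all hypothetical.

```python
def masked_sgd_step(weights, grads, mask, lr=0.1):
    """One gradient-descent step that keeps absent connections at zero.

    `mask` holds 1 where a connection exists and 0 where it does not.
    It is never modified, so the topology stays static during learning.
    """
    return [
        [(w - lr * g) * m for w, g, m in zip(w_row, g_row, m_row)]
        for w_row, g_row, m_row in zip(weights, grads, mask)
    ]


mask = [[1, 0, 1],
        [0, 1, 0]]                      # topology fixed before training
weights = [[0.5, 0.0, -0.2],
           [0.0, 0.3, 0.0]]
grads = [[1.0, 1.0, 1.0],
         [1.0, 1.0, 1.0]]               # made-up gradients

updated = masked_sgd_step(weights, grads, mask)
```

An ontogenic method would, in contrast, change `mask` itself between updates by pruning or adding entries.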

#### **1.4 Classification of Non-Ontogenic SCNN Methods**
- **Theoretical and Experimental Studies:**
  - Focus on demonstrating the effects of sparse connectivity on training dynamics.
  - Typically theoretical, with limited application testing.
- **Methods Derived from Biological Neural Networks:**
  - Inspired by how biological neurons connect sparsely in real-life systems.
  - Based on biological principles like synaptic growth and random connectivity patterns.
- **Application-Specific Methods:**
  - Designed for specific tasks, such as speech recognition or image processing.
  - These methods are tailored to specific problem domains, reducing the network’s complexity by focusing on problem-specific characteristics.
- **Modular Networks:**
  - Involves dividing the problem into smaller sub-tasks, with each sub-task being handled by a specialized sub-network.
  - The sub-networks are combined to solve the overall problem.
- **Hardware Implementation Methods:**
  - Focus on how to implement SCNNs efficiently using analog or digital hardware, overcoming the limitations of full connectivity in large-scale hardware systems.
- **Hybrid Methods Combining Neural Networks and Inductive Knowledge:**
  - Use domain-specific symbolic knowledge to define initial network topology.
  - The network is then refined through learning algorithms to improve performance.


### **2. Non-Ontogenic Methods**

#### **2.1 Methods Based on Theoretical and Experimental Studies**

- **Objective:** To explore and demonstrate the consequences of partial connectivity for training dynamics. These studies use partially connected neural networks (PCNNs, the term used in this review interchangeably with SCNNs) and focus on the effects of sparse connectivity rather than on real-world applications.
  
- **Approaches:**
  - **Random Selection:** Arbitrarily selecting connections between neurons, often within a specific neighborhood.
  - **Weight Dilution:** Reducing connectivity by cutting each connection at random with a predefined probability.
  - **Varying Connectivity Levels:** Experimenting with minimal to full connectivity to find the optimal level that yields the best performance.
  - **Exhaustive Methods:** Testing all possible network topologies.
  - **High Order Connections:** Introducing high-order connections for better model performance.

- **Findings:**
  - There is an optimal connectivity level that depends on the problem being solved. Reduced connectivity can lead to better performance in specific scenarios.
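
Random dilution and the idea of varying the connectivity level can be sketched together: each possible connection is kept independently with a given probability, and the probability is swept from sparse to full. This is a minimal sketch with hypothetical names (`diluted_mask`, `density`) and arbitrary sizes. It only generates topologies; finding the best level would additionally require training and evaluating a network at each setting, which is omitted here.

```python
import random


def diluted_mask(n_in, n_out, keep_prob, rng):
    """Keep each possible connection independently with probability keep_prob."""
    return [[1 if rng.random() < keep_prob else 0 for _ in range(n_out)]
            for _ in range(n_in)]


def density(mask):
    """Fraction of possible connections that are present."""
    total = sum(len(row) for row in mask)
    return sum(map(sum, mask)) / total


rng = random.Random(0)                  # fixed seed for repeatability
for p in (0.1, 0.25, 0.5, 1.0):         # from sparse to full connectivity
    m = diluted_mask(20, 20, p, rng)
    print(f"keep_prob={p:.2f} -> realised density {density(m):.2f}")
```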
  
#### **2.2 Methods Derived from Biological Neural Networks**

- **Focus:** These methods mimic the sparse connectivity observed in biological neural systems. The goal is to study the storage capacity and dynamics of biologically inspired PCNNs.
  
- **Connectivity Strategies:**
  - **Neural Spike Simulations:** Models based on neural spike activity.
  - **Statistical Measures:** Using probabilistic models and distributions.
  - **Synaptic Growth Models:** Non-Hebbian processes that replicate synaptic growth in biological systems.
  - **Random Connection Selection:** Selecting connections randomly to imitate biological systems.

#### **2.3 Application-Dependent Methods**

- **Purpose:** Tailored to specific applications such as speech recognition and image processing.
  
- **Connectivity Approaches:**
  - **Local Connectivity:** Only connecting nearby neurons.
  - **Geometrical Relationships:** Defining connections based on the geometric patterns in input data.
  - **Shared Weights:** Using shared weights across neurons to reduce redundancy.
  
- **Limitations:** These methods are typically only applicable within their respective domains and are not generalizable to other problem areas.
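
Local connectivity and shared weights are easiest to see in one dimension. The sketch below is an illustration under assumed sizes, not code from the paper: `local_mask` connects each output unit only to inputs within a small radius, and `shared_weight_layer` reuses one small kernel at every position.

```python
def local_mask(n_in, n_out, radius):
    """Connect output unit j only to inputs within `radius` of position j."""
    return [[1 if abs(i - j) <= radius else 0 for j in range(n_out)]
            for i in range(n_in)]


def shared_weight_layer(x, kernel):
    """Apply the same small kernel at every valid position of x."""
    k = len(kernel)
    return [sum(kernel[t] * x[p + t] for t in range(k))
            for p in range(len(x) - k + 1)]


mask = local_mask(6, 6, radius=1)
n_local = sum(map(sum, mask))           # links kept out of 6 * 6 = 36

# Three shared weights replace a separate weight set per output unit.
out = shared_weight_layer([1, 2, 3, 4, 5], kernel=[1, 0, -1])
print(n_local, out)
```

With a radius of 1, the local layer keeps 16 of the 36 possible links, and the shared-weight layer needs only the three kernel values regardless of input length.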

#### **2.4 Modular-Based PCNNs**

- **Concept:** The task is divided into smaller, more manageable sub-tasks. Each sub-task is handled by its own sub-network, and the sub-networks are later combined to solve the overall problem.
  
- **Challenges:**
  - **Task Decomposition:** Critical for performance. Often, ad-hoc techniques are required for decomposing tasks in function approximation problems.
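
Seen as a connectivity pattern, a modular network is block-structured: units inside a module are connected to each other's layers, but there are no links across modules. A minimal sketch, assuming each module owns a disjoint slice of the inputs and hidden units (the module sizes are invented):

```python
def modular_mask(modules):
    """Build a block-diagonal mask from (n_inputs, n_hidden) pairs per module."""
    n_in = sum(i for i, _ in modules)
    n_hid = sum(h for _, h in modules)
    mask = [[0] * n_hid for _ in range(n_in)]
    row, col = 0, 0
    for i, h in modules:
        for r in range(row, row + i):
            for c in range(col, col + h):
                mask[r][c] = 1          # link only inside this module
        row, col = row + i, col + h
    return mask


modules = [(4, 3), (2, 2)]              # hypothetical task decomposition
mask = modular_mask(modules)
connections = sum(map(sum, mask))
print(f"{connections} links instead of {len(mask) * len(mask[0])}")
```

How the inputs are split across modules is exactly the task-decomposition choice described above, and the sketch simply takes it as given.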

#### **2.5 PCNNs for Hardware Implementation**

- **Focus:** Aimed at overcoming physical limitations in hardware, such as VLSI technology.
  
- **Advantages:** Partially connected networks help to reduce hardware complexity, improving both training and recall times.
  
- **Methods:**
  - Restructuring fully connected networks into partially connected ones to optimize hardware usage.
  - Cellular neural networks (CNNs) with local connectivity to nearby neurons.
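
The wiring savings from local connectivity on a grid can be estimated with a short count. The sketch assumes a 3x3 neighbourhood on an 8x8 grid, both chosen for illustration; the paper's own hardware examples may use different neighbourhoods.

```python
def grid_neighbours(rows, cols, r, c, radius=1):
    """Cells within `radius` rows and columns of (r, c), excluding (r, c)."""
    return [(i, j)
            for i in range(max(0, r - radius), min(rows, r + radius + 1))
            for j in range(max(0, c - radius), min(cols, c + radius + 1))
            if (i, j) != (r, c)]


rows, cols = 8, 8                       # hypothetical grid size
local = sum(len(grid_neighbours(rows, cols, r, c))
            for r in range(rows) for c in range(cols))
full = (rows * cols) * (rows * cols - 1)   # every cell wired to every other

print(f"local links: {local}, all-to-all links: {full}")
```

The local count grows linearly with the number of cells, while the all-to-all count grows quadratically, which is why full connectivity becomes impractical in large-scale hardware.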


### **3. Ontogenic Methods**

#### **3.1 Multilayer Perceptron-Based Methods**

- **Multilayer Perceptron (MLP):** One of the most popular neural networks used in supervised learning. It consists of an input layer, an output layer, and one or more hidden layers. Information flows in a feedforward manner from input to output.
- **Error Backpropagation:** The most widely used training algorithm for MLPs, which adjusts the weights of the connections by propagating the error backward from the output layer. The learning process is based on gradient descent.
  
- **Types of Ontogenic Methods:**
  - **Pruning Methods:**
    - Initially, an oversized network is trained, and redundant or unnecessary connections are gradually removed.
    - Some of the methods include:
      - **Energy Functions**
      - **Penalty Functions**
      - **Optimal Brain Damage**
      - **Weight Elimination**
  - **Growing and Pruning Methods:**
    - These methods combine adding new connections during training with pruning techniques.
    - Examples include:
      - **Units Recruitment**
      - **Cutting and Creating Connections**
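
The general shape of a pruning step can be sketched with plain magnitude-based pruning. This is a simplified stand-in, not one of the methods named above: Optimal Brain Damage, for instance, ranks weights by a second-derivative saliency estimate rather than by raw magnitude. The function name and the example weights are hypothetical.

```python
def prune_smallest(weights, fraction):
    """Zero out roughly `fraction` of the nonzero weights, smallest first.

    Weights tied with the cut-off magnitude are pruned too, so slightly
    more than `fraction` may be removed when ties occur.
    """
    flat = sorted(abs(w) for row in weights for w in row if w != 0.0)
    n_prune = int(len(flat) * fraction)
    if n_prune == 0:
        return [row[:] for row in weights]
    threshold = flat[n_prune - 1]
    return [[0.0 if abs(w) <= threshold else w for w in row]
            for row in weights]


trained = [[0.9, -0.05, 0.4],
           [0.01, -0.7, 0.2]]           # made-up weights after training
pruned = prune_smallest(trained, 0.5)
print(pruned)
```

In an ontogenic method, a step like this alternates with further training, so the topology keeps changing while the network learns.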

#### **3.2 Other Ontogenic Methods**

- **Non-MLP-Based Methods:**
  - There are other ontogenic methods that apply different neural network learning rules.
  - Methods include:
    - **Monte Carlo procedures** for architecture optimization.
    - **Limited fan-in connections** or **cascade-correlation methods**.


### **4. Hybrid Methods**

#### **4.1 Knowledge-Based PCNNs**
- **Objective:** Combine symbolic domain knowledge (e.g., rules) with neural networks to define the network's topology. The learning algorithm further refines the network based on empirical learning.
- **Methodology:**
  - The initial structure and weights of the neural network are defined by symbolic rules expressed as Horn clauses.
  - After inserting the symbolic rules, the neural network is refined through standard learning algorithms such as backpropagation.
- **Example:**
  - **KBANN System:** A system that uses a set of approximate rules to determine the structure and initial weights of the neural network. Training then refines the network, and the refined knowledge can be extracted from it afterwards.
  - **Drawbacks:**
    - Additional hidden units are often needed to improve accuracy.
    - New links might need to be added to discover previously unidentified dependencies.
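
The rule-insertion step can be illustrated with a single conjunctive Horn clause. The sketch follows the spirit of KBANN-style rule-to-network translation: positive antecedents get a weight of +ω, negated ones get -ω, and the bias is set so the unit activates only when the whole rule body holds. The value of `OMEGA`, the helper names, and the example rule are assumptions for illustration, and a hard threshold is used in place of a trainable sigmoid unit.

```python
OMEGA = 4.0  # assumed link-weight magnitude


def encode_rule(positive, negated, omega=OMEGA):
    """Translate `head :- positive..., not negated...` into weights and a bias."""
    weights = {name: omega for name in positive}
    weights.update({name: -omega for name in negated})
    bias = -(len(positive) - 0.5) * omega
    return weights, bias


def unit_fires(weights, bias, facts):
    """Threshold unit: active when the net input is positive."""
    net = bias + sum(w for name, w in weights.items()
                     if facts.get(name, False))
    return net > 0


# Hypothetical rule: can_fly :- has_wings, not is_penguin
w, b = encode_rule(["has_wings"], ["is_penguin"])
print(unit_fires(w, b, {"has_wings": True, "is_penguin": False}))
print(unit_fires(w, b, {"has_wings": True, "is_penguin": True}))
```

Only the antecedents named in a rule receive a link, so the rule set directly determines a sparse initial topology that learning can then refine.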

#### **4.2 Genetic Programming-Based Methods**
- **Objective:** Use genetic algorithms (GAs) to optimize both the topology of the neural network and its initial weights.
- **Methodology:**
  - **Genetic Algorithms (GAs):** A guided random search based on principles of natural selection and genetics, using operators like selection, crossover, mutation, and inversion.
  - The algorithms begin with a random population and evolve toward better solutions over time.
- **Key Focus:**
  - The encoding of neural networks into chromosomes for GA optimization.
  - Evaluating the fitness of individual solutions.
- **Challenges:**
  - **Excessive Computational Time:** Finding the optimal topology can be computationally expensive.
  - **Solutions:** Suggestions to mitigate this issue include using specialized neural hardware and parallel computing techniques.
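
The encoding and fitness-evaluation ideas above can be sketched with a bit-string chromosome in which each bit marks one possible connection. Everything here is a toy assumption: the fitness function rewards a hypothetical set of "required" links and penalises extra ones, whereas a real system would train and evaluate a network for every chromosome, which is where the computational cost comes from. The inversion operator is left out for brevity.

```python
import random


def fitness(chromosome, required):
    """Toy fitness: reward required links, penalise every extra link."""
    hits = sum(1 for i in required if chromosome[i])
    extras = sum(chromosome) - hits
    return 2 * hits - extras


def crossover(a, b, rng):
    """Single-point crossover of two parent bit strings."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]


def mutate(chromosome, rate, rng):
    """Flip each bit independently with probability `rate`."""
    return [bit ^ 1 if rng.random() < rate else bit for bit in chromosome]


def evolve(n_bits, required, pop_size=20, generations=40, rate=0.05, seed=0):
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    history = []                                  # best fitness per generation
    for _ in range(generations):
        pop.sort(key=lambda c: fitness(c, required), reverse=True)
        history.append(fitness(pop[0], required))
        parents = pop[: pop_size // 2]            # truncation selection
        children = [pop[0]]                       # elitism: keep the best
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            children.append(mutate(crossover(a, b, rng), rate, rng))
        pop = children
    best = max(pop, key=lambda c: fitness(c, required))
    return best, history


required = {0, 3, 5, 8}                           # hypothetical useful links
best, history = evolve(12, required)
print(best, history[-1])
```

Because each fitness call is independent of the others, the population can be evaluated in parallel, which matches the mitigation suggested above.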


### **5. Summary and Conclusions**

#### **5.1 Overview of Non-Ontogenic Methods**
- **Theoretical and Experimental Studies:** These methods explore the effects of sparse connectivity in neural networks, typically in terms of training dynamics. They are often theoretical, with limited practical applications.
  - **Methods Include:**
    - Random connection selection within a local neighborhood.
    - Connectivity levels varied from minimal to full.
    - Use of random unit selection and weight dilution to reduce connectivity.
- **Findings:** There is typically an optimal level of connectivity specific to the problem being addressed, which provides better performance with reduced complexity.

#### **5.2 Ontogenic Methods**
- **General Aim:** Ontogenic methods focus on dynamically adjusting the network topology during training to improve generalization and performance. These methods involve pruning or growing the network by modifying connections based on performance.
  - **Key Approaches:**
    - **Pruning connections:** Start with an oversized network and reduce connections to improve efficiency.
    - **Growing methods:** Add new connections during training, combined with pruning methods to create an efficient structure.
  
#### **5.3 Hybrid Methods**
- **Knowledge-Based PCNNs:**
  - These methods are in the early stages but provide a foundation for defining a network’s topology based on symbolic knowledge (rules). The topology is refined through empirical learning.
  - **Future Possibilities:** Integrating automatic rule extraction could enhance these methods by allowing networks to automatically learn rules from training data.
- **Genetic Programming-Based Methods:**
  - Genetic algorithms (GAs) are used to optimize the topology and initial weights of neural networks.
  - **Challenges:** Computational complexity remains a significant drawback. Solutions such as parallel computing and specialized hardware are proposed to mitigate this issue.

#### **5.4 Conclusions**
- **Advantages of SCNNs:**
  - Improved generalization capabilities.
  - Reduced hardware requirements.
  - Faster training and recall times.
- **Challenges:**
  - **Computational complexity** in hybrid and genetic programming-based methods.
  - **Task-specific adaptation** in application-dependent methods.
- **Future Directions:**
  - Improving hybrid methods by integrating automatic rule extraction.
  - Reducing the computational overhead of genetic programming-based methods, for example through parallel computing and specialized hardware.
