# Prototype Auto-encoder

### Code
**Repo**: 

### Intro

* **Date**: 11/19/2020


* **Why**: This particular architecture is compelling because it's trained with gradient descent, rather than with some competitive algorithm.  For an RQI, it's important to understand not only the latent space, but also the maps between the input and latent spaces.  In traditional auto-encoders, these mappings are learned, and they're effectively black boxes.  Without understanding the structure of these mappings, you can't really understand the structure of the latent space.  In this algorithm, both mappings are quite straightforward and understandable.  Even though the parameters in the mappings are learned, the purpose of each mapping is always clear.  Another issue with traditional auto-encoders is that when they interact with new data, the mappings are altered to better reconstruct that data.  This means that the latent space doesn't have a stable structure.  If you have a series of hierarchical RQIs (which is obviously the case in the brain), the meaning of the inputs to each level is very fluid, and higher levels can't as easily learn deterministic representations of lower levels.  With the prototype auto-encoder, the nature of the mapping allows the meaning of each latent variable to become more "fixed" over time, so higher levels can gain a good deterministic understanding of lower-level inputs.


* **What**: This is a one-layer network of neurons that is fully connected to an input.  Each synapse between a neuron and an input has a learned weight.  A neuron's weights form its prototype, and the loss function for this network essentially measures how well the different neuron prototypes can come together (in a weighted fashion) to reconstruct the input.  To get technical, for a given input vector, each neuron calculates a weighted average of the input components, with each component weighted by its synaptic weight.  The neuron's prototype is then multiplied by this weighted average, and the weighted prototypes from all neurons are summed.  The loss function is the sum of the squared differences between the input and the resulting reconstruction.  The network is trained using gradient descent.  Due to the weighted averages and the sum of weighted prototypes, the derivatives are a bit more involved than in a standard DNN.  Even though there isn't really an activation function at any step, this network is non-linear (and in fact, quadratic) due to the multiplication of the weighted average with each pixel value.


* **Hopes**: I hope that this network is able to learn a diverse set of prototypes for the data.  I also hope that the gradient-based learning algorithm allows the network to move towards an optimal state as quickly as possible.  Competitive prototype-learning algorithms are based on some logical process that isn't necessarily optimal, so ideally, this network is able to learn a more robust and diverse set of prototypes than those competitive learning algorithms do.


* **Limitations**: Previous experiments have shown me that if all prototypes are learned using the same type of adjustment at each step, the prototypes all tend toward the same state.  While this happened for simpler learning algorithms, it seems like the same sort of behavior could occur here.  If it does, I'll need to figure out a way to treat each prototype separately at each learning step.  The other limitation is that data is received at each neuron in an all-to-all fashion: only one flow of information passes through the neuron, rather than bursts of information passing through at different times, as is the case in spiking networks.  In other words, there's no notion of an input being "off," and therefore not affecting the output.  One way to possibly address this is to scale each weight by its synaptic input, so inputs that have a value of 0 aren't even factored into the calculation of the weighted average.  That might be cool because it would introduce even more non-linearities.  This is all I'll say before the experiment starts.
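
To make the **What** and **Limitations** notes concrete, here is a minimal sketch of the forward pass, the loss, and one gradient-descent step. It is only an illustration under stated assumptions, not the experiment's actual code. All names (`weighted_average`, `reconstruct`, `loss`, `numerical_grad`, `train_step`, `gate`, `eps`) are hypothetical. The sketch assumes the weights stay positive so the denominator of the weighted average doesn't vanish, and it uses finite differences as a stand-in for the analytic derivatives. The `gate=True` option is one possible reading of the "weight each weight by the synaptic input" idea.

```python
def weighted_average(w, x, gate=False, eps=1e-8):
    """Weighted average of the input components for one neuron.

    gate=False: each component x_i is weighted by its synaptic weight w_i.
    gate=True:  each component is weighted by w_i * x_i, so components with
                x_i == 0 drop out of both the numerator and the denominator
                (an assumed reading of the idea in the Limitations note).
    """
    eff = [wi * xi for wi, xi in zip(w, x)] if gate else list(w)
    num = sum(e * xi for e, xi in zip(eff, x))
    return num / (sum(eff) + eps)  # eps guards against division by zero


def reconstruct(W, x, gate=False):
    """Sum of every neuron's prototype, each scaled by its weighted average."""
    acts = [weighted_average(w, x, gate) for w in W]
    return [sum(a * w[i] for a, w in zip(acts, W)) for i in range(len(x))]


def loss(W, x, gate=False):
    """Sum of squared differences between the input and its reconstruction."""
    x_hat = reconstruct(W, x, gate)
    return sum((xi - hi) ** 2 for xi, hi in zip(x, x_hat))


def numerical_grad(W, x, h=1e-5):
    """Central finite-difference gradient of the loss w.r.t. every weight.

    Stand-in for the analytic derivatives, which are not derived here.
    """
    grad = [[0.0] * len(row) for row in W]
    for j, row in enumerate(W):
        for i in range(len(row)):
            orig = row[i]
            row[i] = orig + h
            up = loss(W, x)
            row[i] = orig - h
            down = loss(W, x)
            row[i] = orig  # restore the weight before moving on
            grad[j][i] = (up - down) / (2 * h)
    return grad


def train_step(W, x, lr=0.01):
    """One plain gradient-descent update; returns a new weight matrix."""
    g = numerical_grad(W, x)
    return [[w - lr * gw for w, gw in zip(row, grow)]
            for row, grow in zip(W, g)]


# Toy example: two neurons (prototypes) and a three-component input.
W = [[0.9, 0.1, 0.5], [0.2, 0.8, 0.4]]
x = [1.0, 0.0, 0.5]
before = loss(W, x)
W_next = train_step(W, x)
after = loss(W_next, x)
```

With `gate=True`, the second input component (which is 0) has no influence on the weighted average no matter what its weight is, which is the "off" behavior the Limitations note is after. Whether that variant helps or hurts prototype diversity is left for the experiment.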