## Introduction

- When we navigate our visual environment, we process high-dimensional visual information into low-D "abstractions"
- A heavily studied example of this in theoretical neuroscience is object recognition
    - High-dimensional correlations in pixels are abstracted into a low-D representation of category
    - This has been extended to category-orthogonal information as well, such as "style" (Cheung et al.)

## Motivating questions
- The brain has finite "space" (in numbers of neurons) to represent these abstractions, so how is that space allocated among them?
- Does this allocation change if our environment changes? How so?
- How do we learn to allocate representational space efficiently as a function of our inputs (e.g. visual environment)?
- How does a single network extract multi-faceted abstractions from a common input (e.g. one that extracts space, category, and style from a single input)?

## Approach

- Create an autoencoder that learns several orthogonal features
- Use semi-supervised training by evaluating its reconstructions of the input
- Analyze the represented space

### Dataset

We can generate an image dataset that contains varying amounts of spatial shift (dx, dy) and test whether the network learns this property
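
As a rough illustration of such a dataset, the sketch below places each source image on a zero-padded canvas at a random offset and keeps the offsets as labels. The function name `make_shifted_dataset`, the `max_shift` parameter, the uniform integer sampling of shifts, and the zero padding are all assumptions made for this example, not details taken from these notes.

```python
import numpy as np

def make_shifted_dataset(images, max_shift, seed=0):
    """Place each image on a zero-padded canvas at a random (dx, dy) offset.

    images:    array of shape (N, H, W)
    max_shift: largest absolute shift, in pixels, along each axis (assumed parameter)
    Returns (shifted, dx, dy), where shifted has shape
    (N, H + 2 * max_shift, W + 2 * max_shift).
    """
    rng = np.random.default_rng(seed)
    n, h, w = images.shape
    # Sample integer shifts uniformly from [-max_shift, max_shift].
    dx = rng.integers(-max_shift, max_shift + 1, size=n)
    dy = rng.integers(-max_shift, max_shift + 1, size=n)
    canvas = np.zeros((n, h + 2 * max_shift, w + 2 * max_shift), dtype=images.dtype)
    for i in range(n):
        top = max_shift + dy[i]    # row offset of the image on the canvas
        left = max_shift + dx[i]   # column offset of the image on the canvas
        canvas[i, top:top + h, left:left + w] = images[i]
    return canvas, dx, dy
```

Varying `max_shift` across datasets would be one way to produce environments with more or less spatial variation, with `dx` and `dy` serving as the property values to compare against latent activity.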

## Methods

### Representational variance explained

- We want to quantify how well the network does at abstracting a continuous environmental property
    - For example: Object location (dx,dy) or style variations within an object

- This relationship may not be (and probably isn't) linear so plain correlation may not work

- One way to measure this is to discretize the range of a unit's activity and examine the property variance in that range

- A "well abstracted" property should have a variance smaller than the property's global variance

- Define a feature vector $X_{n,t}$ that represents the activations of $n$ units in the latent space over $T$ trials

- Define a continuous property $P_t$ (e.g. dx from center of FOV) that is indexed by and varied across trials $t$

- If the network learns to represent $P$ in $X_n$'s activity level, a subset of the activity levels of $X_n$ should correspond to a subset of $P$

- Split the full activity range across all trials, $X_{n,T}$, into a discrete number of bins indexed by $b$, so that each $X_{n,b}$ is a mutually exclusive activity range and a subset of $X_{n,T}$

- For each binned level of activity, calculate the variance of the property, $\sigma(P | X_b)$ or $\sigma(P_b)$, over the trials evoking activity $X_b$

- A continuous property $P$ that is "well-represented" by the neurons should have a "narrower" variance band at each bin than the global variance of that property

- A poorly represented property would be expected to have binned variances, $\sigma(P_{b})$, similar to global variance $\sigma(P)$

- $VE_R = E[\frac{\sigma(P)-\sigma(P_b)}{\sigma(P)}]$
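
The steps above can be sketched for a single unit as follows. This is a minimal illustration, not a definitive implementation: the function name `representational_variance_explained`, the equal-width binning, the default of 10 bins, the choice to skip bins with fewer than two trials, and the unweighted mean over bins as the expectation are all assumptions made here.

```python
import numpy as np

def representational_variance_explained(x, p, n_bins=10):
    """Estimate VE_R for one unit.

    x: (T,) activity of the unit across trials
    p: (T,) value of the property P on the same trials
    """
    x = np.asarray(x, dtype=float)
    p = np.asarray(p, dtype=float)
    global_var = p.var()  # sigma(P); must be nonzero for the ratio to be defined
    # Equal-width bin edges spanning the unit's full activity range (assumption).
    edges = np.linspace(x.min(), x.max(), n_bins + 1)
    # Using only the interior edges maps every trial to a bin index in [0, n_bins - 1].
    bin_idx = np.digitize(x, edges[1:-1])
    ratios = []
    for b in range(n_bins):
        in_bin = bin_idx == b
        if in_bin.sum() < 2:
            continue  # assumption: skip empty or single-trial bins
        binned_var = p[in_bin].var()  # sigma(P | X_b)
        ratios.append((global_var - binned_var) / global_var)
    if not ratios:
        return float("nan")
    # Assumption: the expectation is an unweighted mean over occupied bins.
    return float(np.mean(ratios))
```

Under these assumptions, a unit whose activity is a monotonic but nonlinear function of $P$ (for example `np.tanh(3 * p)`) should score close to 1, while a unit whose activity is unrelated to $P$ should score close to 0. Weighting each bin by its trial count would be a reasonable alternative reading of the expectation.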

Analyze learning in environments with varying spatial variation 
  - [X] [Isomap style embedding](https://gist.github.com/elijahc/c7b2c8a9ef03148b3b4b8d2bac32c7c7#1-d-embedding)