# Sparana

This is my library. If you want to use it with a GPU, you will need to get [CuPy](https://cupy.dev/) working; see [their documentation](https://docs.cupy.dev/en/stable/install.html) for setup instructions. Some parts also use [Numba](https://numba.pydata.org/). This README may have more detail than most: it covers how the code works as well as how to use it, because this is an open source project and anyone who wants to modify or improve the code should have it explained a bit. In each section, the how-to-use material is at the top and the rest is further down. You don't need to know everything about how the code works to get it to run.

# Model.py

This file contains one class, the model class, which builds a model using layer objects. It has operations to initialize weights before training, produce outputs, convert between sparse and dense data structures, among other things. This class is passed into other Sparana classes which train and manipulate models.

## Building and initializing.

Building a model with the model object needs two inputs: the input size and a Python list of layer objects. The comp_type input can be 'CPU' or 'GPU'. It selects GPU automatically, so you don't need to define it, but it is left in here for demonstration purposes.

See the layers section for more information about layers. The important thing to note here is that the final layer size is the number of classes the model will have. This example is for MNIST, so there are 10 classes.

```python
mymodel = model(input_size = 784,
                layers = [full_relu_layer(size = 1000), 
                          full_relu_layer(size = 800),
                          full_relu_layer(size = 400),
                          full_linear_layer(size = 10)],
                comp_type = 'GPU')
```
This example can be found in the Demo-Model notebook, first code cell, line 21.

## Functions for the user

### Outputs

Pass in some inputs as a NumPy array. This returns a NumPy array containing the output of the final layer of the model.
```python
this_output = mymodel.outputs(inputs)
```
This example can be found in the Demo-Model notebook, 3rd code cell, line 3.
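
To make it concrete what a call like this computes for the layer stack built above, here is a minimal NumPy sketch of a dense forward pass. It is not Sparana's code: the function name `forward`, the list of `(weights, bias)` pairs, and the `inputs @ weights + bias` convention are all assumptions made for illustration.

```python
import numpy as np

def forward(inputs, parameters):
    """Run inputs through ReLU layers followed by a final linear layer.

    parameters is a list of (weights, bias) pairs, one per layer (assumed layout).
    """
    activations = inputs
    for weights, bias in parameters[:-1]:
        # Hidden layers: affine transform followed by ReLU.
        activations = np.maximum(activations @ weights + bias, 0.0)
    # Final layer: affine transform only, one output column per class.
    final_weights, final_bias = parameters[-1]
    return activations @ final_weights + final_bias

rng = np.random.default_rng(0)
sizes = [784, 1000, 800, 400, 10]  # input size followed by the four layer sizes
parameters = [(rng.normal(0.0, 0.1, size=(n_in, n_out)), np.full(n_out, 0.1))
              for n_in, n_out in zip(sizes[:-1], sizes[1:])]
batch = rng.random((32, 784))
example_output = forward(batch, parameters)  # shape (32, 10)
```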

### Partial outputs

This gives the outputs of one of the layers within the model. This is used for some types of selected parameter training. For example, I can get outputs from the second-last layer of a 5-layer model with:
```python
next_input = mymodel.partial_outputs(inputs, 3)
```
next_input can then be used as the input for a different final layer or layers.
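
As a rough picture of the idea, the sketch below stops a dense forward pass at a chosen hidden layer and feeds the result into a new final layer. The name `partial_forward`, the 0-indexed layer argument (inferred from the example above), and the parameter layout are assumptions, not the library's implementation.

```python
import numpy as np

def partial_forward(inputs, parameters, layer):
    """Return the ReLU output of hidden layer `layer` (0-indexed, assumed)."""
    activations = inputs
    for weights, bias in parameters[:layer + 1]:
        activations = np.maximum(activations @ weights + bias, 0.0)
    return activations

rng = np.random.default_rng(0)
sizes = [20, 16, 12, 8, 6, 3]  # input size followed by 5 layer sizes
parameters = [(rng.normal(0.0, 0.1, size=(n_in, n_out)), np.full(n_out, 0.1))
              for n_in, n_out in zip(sizes[:-1], sizes[1:])]
batch = rng.random((4, 20))

# Output of the second-last layer (index 3 of 0..4), which has 6 units.
next_input = partial_forward(batch, parameters, 3)

# A different final layer (hypothetical, 5 classes) trained on top of it.
new_head = rng.normal(0.0, 0.1, size=(6, 5))
new_logits = next_input @ new_head
```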

### Get accuracy

Given an input and a label vector, this returns the accuracy. It matches the argmax of each output to the given label.
```python
accuracy = mymodel.get_accuracy(input, labels)
```

This example can be found in the Demo-Model notebook, 3rd code cell, line 21.
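
The argmax matching can be sketched in a few lines of NumPy. The helper name `accuracy_from_outputs` is hypothetical, and accepting either one-hot rows or integer class indices is an assumption about the label format.

```python
import numpy as np

def accuracy_from_outputs(outputs, labels):
    """Fraction of rows where the argmax of the output matches the label.

    labels may be one-hot rows or integer class indices (assumed formats).
    """
    predicted = np.argmax(outputs, axis=1)
    labels = np.asarray(labels)
    if labels.ndim == 2:
        # One-hot labels: recover the class index of each row.
        labels = np.argmax(labels, axis=1)
    return float(np.mean(predicted == labels))

outputs = np.array([[0.1, 2.0, -1.0],
                    [1.5, 0.2, 0.3],
                    [0.0, 0.1, 0.9],
                    [0.7, 0.6, 0.1]])
one_hot = np.array([[0, 1, 0],
                    [1, 0, 0],
                    [0, 1, 0],
                    [1, 0, 0]])
accuracy = accuracy_from_outputs(outputs, one_hot)  # 3 of 4 rows match: 0.75
```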

### Initialize weights

This can initialize weights in several different ways:
```python
mymodel.initialize_weights('Xavier', bias_constant = 0.1)
```
This is Xavier initialization, probably the most popular method, which scales the weight variance by the input and output sizes of the layer. Biases are initialized as a constant.
```python
mymodel.initialize_weights(0.1, bias_constant = 0.1)
```
A float in place of 'Xavier' draws weights from a normal distribution centred at 0 with that float as the standard deviation (0.1 here). Biases are initialized as a constant.
```python
# One (weight, bias) pair per layer
parameter_list = [(weight1, bias1), (weight2, bias2)]

mymodel.initialize(parameter_list)
```

This loads weights and biases from a list of (weight, bias) pairs. It is one way of loading a model that I added to the library, but never actually used.

This example can be found in the Demo-Model notebook, first cell, line 31.
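
For readers who want to see what the two random initializations amount to, here is a minimal NumPy sketch for a single dense layer. The function `init_layer` is hypothetical, and the Xavier branch uses the Glorot normal form, std = sqrt(2 / (n_in + n_out)); which Xavier variant the library actually uses is an assumption here.

```python
import numpy as np

def init_layer(n_in, n_out, method, bias_constant, rng):
    """Return a (weights, bias) pair for one dense layer."""
    if method == 'Xavier':
        # Glorot/Xavier normal: variance scaled by fan-in plus fan-out.
        std = np.sqrt(2.0 / (n_in + n_out))
    else:
        # Any float is used directly as the standard deviation.
        std = float(method)
    weights = rng.normal(0.0, std, size=(n_in, n_out))
    bias = np.full(n_out, bias_constant)
    return weights, bias

rng = np.random.default_rng(0)
xavier_weights, xavier_bias = init_layer(784, 1000, 'Xavier', 0.1, rng)
normal_weights, normal_bias = init_layer(784, 1000, 0.1, 0.1, rng)
```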

### Initialize sparse weights

This initializes sparse weights that are randomly distributed. It does not work well and is slow to train, so I am not putting too much effort into explaining it. There might be some potential for structured initializations (convolutional neural nets are sort of like this), and I will update this if I add that sort of functionality.

*** Do a demo so that I can add other types of sparse inits***

### Convert comp type

Converts from GPU to CPU data structures, or the other way. This is a simple function that calls the convert layer comp type function in the layer class. Run it using:

    mymodel.convert_comp_type()

This example can be found in Demo-Model, 4th code cell.

### Convert to sparse

Converts a full/dense model into a sparse model. Do this by calling:

    mymodel.convert_to_sparse()

I have not built a module to convert models the other way. It would not be too difficult, but I find the best way of working with models is to do everything I need to on dense models, then convert them. I have not had a need to convert models the other way.

Note: Remove activations (below) only works for dense models. I just remove the activations before converting, rather than reworking that function for sparse structures.

*** Demo for this is in the pruned model notebook***
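
To illustrate what a dense-to-sparse conversion buys, the sketch below stores only the non-zero entries of a pruned weight matrix in coordinate form and multiplies with them directly. This is a plain NumPy illustration under assumed names (`dense_to_coo`, `coo_matmul`); the sparse format Sparana uses internally is not specified here and may well differ.

```python
import numpy as np

def dense_to_coo(weights):
    """Store only the non-zero entries as (rows, cols, values, shape)."""
    rows, cols = np.nonzero(weights)
    return rows, cols, weights[rows, cols], weights.shape

def coo_matmul(inputs, sparse_weights):
    """Compute inputs @ weights using only the stored non-zero entries."""
    rows, cols, values, shape = sparse_weights
    # One row per output unit, so contributions accumulate by column index.
    outputs_t = np.zeros((shape[1], inputs.shape[0]))
    # Each stored weight adds its scaled input column to one output unit.
    np.add.at(outputs_t, cols, (inputs[:, rows] * values).T)
    return outputs_t.T

dense = np.array([[0.0, 1.5, 0.0],
                  [2.0, 0.0, 0.0],
                  [0.0, 0.0, 0.0],
                  [0.0, -1.0, 3.0]])
sparse = dense_to_coo(dense)          # 4 stored values instead of 12
inputs = np.arange(8.0).reshape(2, 4)
sparse_result = coo_matmul(inputs, sparse)
```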

### Remove activations

After pruning there can be activations where all of the weights are set to 0. The only effect these have on the model overall comes from their biases, which are constant. These constants can be folded into the next layer, and the activations (matrix columns) can be removed. This is done by calling:

    mymodel.remove_activations()

Not quite finished, need to add the bit where it prints the final size of the models.

*** Pruned model notebook
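
The folding step can be sketched for a pair of dense layers as follows. This is an illustration only, assuming a ReLU layer, weights of shape (inputs, units), and a hypothetical helper name `remove_dead_units`; it is not the library's implementation.

```python
import numpy as np

def remove_dead_units(weights, bias, next_weights, next_bias):
    """Fold constant units of a ReLU layer into the next bias and drop them."""
    dead = ~np.any(weights != 0, axis=0)    # columns with every weight pruned
    constant = np.maximum(bias[dead], 0.0)  # their output is relu(bias) for any input
    # The constant outputs pass through the next layer's weights as a fixed offset.
    new_next_bias = next_bias + constant @ next_weights[dead, :]
    keep = ~dead
    return weights[:, keep], bias[keep], next_weights[keep, :], new_next_bias

rng = np.random.default_rng(0)
w1 = rng.normal(size=(5, 4))
w1[:, [1, 3]] = 0.0                         # pretend pruning zeroed two units
b1 = np.array([0.1, 0.5, -0.2, -0.3])
w2 = rng.normal(size=(4, 3))
b2 = np.zeros(3)
x = rng.normal(size=(6, 5))

before = np.maximum(x @ w1 + b1, 0.0) @ w2 + b2
s_w1, s_b1, s_w2, s_b2 = remove_dead_units(w1, b1, w2, b2)
after = np.maximum(x @ s_w1 + s_b1, 0.0) @ s_w2 + s_b2  # same outputs, smaller layers
```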

## To Do

These are things that I have not built, or finished that may be useful in the future.

### Top 5 accuracy

## Functions for other functions

Everything in this class is for use by the user.

## Attributes

These are class attributes accessed through the Python self parameter. I use the convention of a leading underscore for attributes to distinguish them from functions, e.g. input_size is self._input_size.

#### input_size
The size of the input vector, used to initialize the first weight matrix.
#### layers
The list of layer objects
#### dropout
This sets the number of parameters that are set to 0 if dropout is used during training. This is actually DropConnect, which zeros weights, whereas dropout zeros activations. DropConnect is slower and more computation- and memory-intensive. I should probably add dropout sometime soon.
#### comp_type
Computation type, lets other parts of the software know if the model is run on the CPU or the GPU. Stored as a string, 'CPU' or 'GPU'.
#### depth
The number of layers, just an easy to read reference. Makes my code easier to understand than len(model._layers).
#### layer_type
Dense or sparse, inferred from the first layer. I have not yet worked with mixed sparse/dense models, so this works fine.
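
The difference between DropConnect and dropout mentioned under the dropout attribute can be sketched in NumPy as below. The `drop_rate` fraction and all variable names are hypothetical, and rescaling of the surviving values is left out to keep the comparison short.

```python
import numpy as np

rng = np.random.default_rng(0)
inputs = rng.random((8, 6))
weights = rng.normal(size=(6, 4))
drop_rate = 0.5  # hypothetical value: fraction of entries set to zero

# DropConnect: zero individual weights, so the mask has one entry per weight.
weight_mask = rng.random(weights.shape) >= drop_rate
dropconnect_out = np.maximum(inputs @ (weights * weight_mask), 0.0)

# Dropout: zero whole activations, so the mask has one entry per activation.
activations = np.maximum(inputs @ weights, 0.0)
activation_mask = rng.random(activations.shape) >= drop_rate
dropout_out = activations * activation_mask
```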

# Layers.py

This file contains 6 classes (5 of them built so far): the layer objects that are put into a model object to make a working model. There is a different type for each activation, because the gradients are calculated within this object. This is a design that I chose; one of the reasons for building this project was to understand backpropagation better.

All classes have the same functions and attributes, mostly (I haven't put everything into the sparse layers yet).

To build and train a model, the only way you will need to use these objects is to import them and put them in the model object during initialization like this:

```python
mymodel = model(input_size = 784, 
                layers = [full_relu_layer(size = 1000), 
                          full_relu_layer(size = 800),
                          full_relu_layer(size = 400),
                          full_linear_layer(size = 10)],
                comp_type = 'GPU')
```
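
Because each layer type calculates its own gradients, a layer object needs both a forward and a backward step. The class below is a minimal sketch of that design for a dense ReLU layer; the class name, method names, and cached attributes are hypothetical and are not the attributes of the real layer classes.

```python
import numpy as np

class ReluLayerSketch:
    """Hypothetical dense ReLU layer that computes its own gradients."""

    def __init__(self, n_in, n_out, rng):
        self.weights = rng.normal(0.0, 0.1, size=(n_in, n_out))
        self.bias = np.full(n_out, 0.1)

    def forward(self, inputs):
        self.inputs = inputs  # cached for the backward pass
        self.outputs = np.maximum(inputs @ self.weights + self.bias, 0.0)
        return self.outputs

    def backward(self, grad_outputs):
        # ReLU passes gradient only where the unit was active.
        grad_pre = grad_outputs * (self.outputs > 0)
        self.grad_weights = self.inputs.T @ grad_pre
        self.grad_bias = grad_pre.sum(axis=0)
        # Gradient with respect to the inputs, handed to the previous layer.
        return grad_pre @ self.weights.T

rng = np.random.default_rng(0)
layer = ReluLayerSketch(5, 3, rng)
x = rng.normal(size=(4, 5))
layer.forward(x)
grad_inputs = layer.backward(np.ones((4, 3)))  # gradient of sum(outputs)
```

A linear layer under the same design would skip the activity mask in `backward`, which is why a separate class per activation is a natural fit.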

## Layer types

#### full_relu_layer

#### full_linear_layer

#### full_softmax_layer

#### sparse_relu_layer

#### sparse_linear_layer

#### sparse_softmax_layer

I have not built this layer yet, I will when I am working on a project that needs one.


# Lobotomizer

This contains the classes I use for pruning models, among other things. There are 3 main classes that are for the user, and some additional functions that are used by these classes. 

## Lobotomizer

Removes weights from a neural network based on one of several selection criteria. It performs a lobotomy, hence the dumb name.
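
As an example of one possible selection criterion, the sketch below zeros the weights with the smallest absolute values. Magnitude pruning is used here as an assumed criterion for illustration, and `prune_smallest` is a hypothetical helper, not a Lobotomizer method.

```python
import numpy as np

def prune_smallest(weights, fraction):
    """Return a copy of weights with the smallest-magnitude fraction set to 0."""
    n_prune = int(weights.size * fraction)
    pruned = weights.copy()
    if n_prune == 0:
        return pruned
    # Flat indices of the n_prune entries with the smallest absolute value.
    flat_order = np.argsort(np.abs(weights), axis=None)[:n_prune]
    np.put(pruned, flat_order, 0.0)
    return pruned

rng = np.random.default_rng(0)
weights = rng.normal(size=(10, 10))
pruned = prune_smallest(weights, 0.8)  # keeps the 20 largest-magnitude weights
```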


## Vulcanizer

Splits submodels off models for memory-efficient training. The name is another dumb joke: putting the submodel parameters back into the main model is a mind meld, like the Vulcans from Star Trek. I will write a full notebook about what this does somewhere else; this just explains the functions.


## Parameter selector

Selects parameters for partial model training. This is sort of a precursor to the vulcanizer: it is for training on a subset of parameters, but it is not more memory efficient. Again, I will write up a notebook about how to use it and why.

## Other functions

These are called by the user with the class objects. 

### Get MAV


### Get MAAV


### Get absolute values

# Model mixer

This is for an experiment that worked quite well. It combines the weights of different models, using several different methods. It is similar to what is done by the vulcanizer, but it is for a separate set of experiments, and they are different enough that it was worth building a separate thing.



# Optimizer

This is where I have built some optimizers. The ones that are useful are SGD and Adam. There are some others that do not work very well; they were just fun projects that were never going to work too well.
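
For reference, the textbook update rules for the two useful optimizers look like this in NumPy. The function names, the `state` dictionary, and the default hyperparameters are assumptions for illustration and do not describe the optimizer classes in this library.

```python
import numpy as np

def sgd_step(param, grad, learning_rate=0.01):
    """Plain SGD: step against the gradient."""
    return param - learning_rate * grad

def adam_step(param, grad, state, learning_rate=0.001,
              beta1=0.9, beta2=0.999, epsilon=1e-8):
    """Adam: running averages of the gradient and its square, bias corrected."""
    state['t'] += 1
    state['m'] = beta1 * state['m'] + (1 - beta1) * grad
    state['v'] = beta2 * state['v'] + (1 - beta2) * grad ** 2
    m_hat = state['m'] / (1 - beta1 ** state['t'])
    v_hat = state['v'] / (1 - beta2 ** state['t'])
    return param - learning_rate * m_hat / (np.sqrt(v_hat) + epsilon)

# Toy problem: minimise sum((w - 3)^2), whose gradient is 2 * (w - 3).
target = 3.0
w_sgd = np.zeros(4)
for _ in range(100):
    w_sgd = sgd_step(w_sgd, 2 * (w_sgd - target), learning_rate=0.1)

w_adam = np.zeros(4)
state = {'t': 0, 'm': np.zeros(4), 'v': np.zeros(4)}
for _ in range(500):
    w_adam = adam_step(w_adam, 2 * (w_adam - target), state, learning_rate=0.1)
```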

# Saver

This is for storing and saving models, pretty simple to use, but quite useful.


# Tracker

This is for tracking what goes on between layers. It has been the start of an experiment that is interesting, but I have not followed it up enough to know if it is useful. I track the cosine distance between 

# Data loader

This seems like it would be simple, but I have added a couple of different things for different experiments. Sometimes I need to split the data up in different ways. 

# Examples

I need a couple of VERY concise examples of how to use different things here

## Simple build and train


## Prune and convert


## Forgetting and not forgetting

This can demonstrate the parameter selector, and the data loader bits, and show some interesting results.