# Feature Importance in Neural Networks
This site is a companion to the final report on Feature Importance in Neural Networks.

The goal of this research was to determine whether the way the weights between the input layer and the first hidden layer of a neural network change during training can be used to determine feature importance. TL;DR: they can.
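
To make the idea concrete, here is a minimal NumPy sketch of recording the input-to-hidden weights after every epoch. It is a stand-in for illustration only, not the Keras code from the notebooks. The function name `train_and_track`, the toy data, the hidden size, the learning rate, and the "movement" summary are all assumptions; the notebooks use probability density functions rather than this summary to order the features.

```python
import numpy as np


def train_and_track(X, y, hidden_units=3, epochs=200, lr=0.05, seed=0):
    """Train a tiny two-layer linear network with plain gradient descent and
    return the input-to-hidden weights recorded after every epoch."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(scale=0.1, size=(X.shape[1], hidden_units))  # input -> hidden
    W2 = rng.normal(scale=0.1, size=(hidden_units, 1))           # hidden -> output
    history = []
    for _ in range(epochs):
        hidden = X @ W1
        error = hidden @ W2 - y              # shape: (n_samples, 1)
        grad_out = 2.0 * error / len(X)      # d(MSE) / d(prediction)
        grad_W2 = hidden.T @ grad_out
        grad_W1 = X.T @ (grad_out @ W2.T)
        W1 -= lr * grad_W1
        W2 -= lr * grad_W2
        history.append(W1.copy())            # snapshot of the first-layer weights
    return np.stack(history)                 # (epochs, n_features, hidden_units)


# Hypothetical toy data: the target depends on column 0 only; column 1 is noise.
rng = np.random.default_rng(1)
X = rng.uniform(0.0, 1.0, size=(256, 2))
y = 2.0 * X[:, [0]]

history = train_and_track(X, y)
# One naive summary (an assumption): how far each input's weights moved overall.
movement = np.abs(history[-1] - history[0]).mean(axis=1)
print(movement)
```

The returned array has one snapshot per epoch, so any per-feature statistic can be computed afterwards without retraining.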

## Structure of Site
This site contains exported copies of the notebooks used during the investigation. The following gives a brief overview of each notebook:
1. [Overview](./index.html) - Some background details about the site and the work leading to its creation
2. [Complete](./complete-example.html) - An end-to-end look at the exploratory work leading up to the use of probability density functions to determine the order of feature importance
3. [One Value, One Noise](./one-value-one-noise.html) - A comparison with the previous example, which attempts to see whether noise is completely cancelled out when there are just two features
4. [Accuracy](./accuracy.html) - A look at how accuracy affects the efficacy of the methodology
5. [Classification](./classification.html) - The earlier examples are all linear regressions, whereas this one uses a classification example to examine the effect on the outputs
6. [Titanic](./titanic.html) - An extension of the Classification example, this time using the ubiquitous Titanic data set. Importantly, it compares the results to a Random Forest classifier, which has a built-in feature importance method

## Methodology
The idea was to provide a proof of concept, so *perfect* data was used throughout development. That is, synthetic data was used everywhere except on the [Titanic](./titanic.html) classification page.

Specifically, the data followed the equation $y = mx + c$, where **m** and **c** were static values and **x** was randomly selected. An additional **noise** column was added to give the neural networks something to tune out. The hypothesis was that the methodology would rank the columns from most to least important in the following order:
1. **x** - Since this value has the single largest effect on the result, it should be clearly selected as the most important
2. **m/c** - Since these are static, they should be roughly equally weighted in the overall system
3. **noise** - Since this has no effect on the output, it should be clearly ranked lowest in importance
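
The data set described above could be generated along these lines. This is a sketch under stated assumptions: the helper name `make_linear_dataset`, the values of **m** and **c**, the value ranges, the uniform sampling, and the column order are all hypothetical choices, not taken from the notebooks. It produces four input columns, which matches the `input_shape=[4]` used in the model definition on this page.

```python
import numpy as np


def make_linear_dataset(n_samples=1000, m=3.0, c=5.0, seed=0):
    """Build features [x, m, c, noise] and the target y = m * x + c."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(0.0, 10.0, size=n_samples)      # the only varying driver of y
    noise = rng.uniform(0.0, 10.0, size=n_samples)  # unrelated to y by construction
    m_col = np.full(n_samples, m)                   # static column
    c_col = np.full(n_samples, c)                   # static column
    features = np.column_stack([x, m_col, c_col, noise])
    target = m * x + c
    return features, target


features, target = make_linear_dataset()
print(features.shape, target.shape)  # (1000, 4) (1000,)
```

Because the noise column is drawn independently of **x**, any importance a model assigns to it comes from training dynamics rather than from the data.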

## Testing out the code
To try the code in your local environment, clone the [repository](https://github.com/SkinnyPigeon/weights-in-neural-networks) and follow the instructions in the README.

## A note on Initializers

Late in development, the idea of using initializers to set the starting weights for the models was explored. The early results are promising, in that initializers seem to produce more consistent results. For those interested in running these experiments at home, it may be worth amending the model definition sections to include one, as in this example:

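    import tensorflow as tf
    from tensorflow.keras.optimizers import Adam

    # FIRST_LAYER (the number of units in the first hidden layer) is assumed
    # to be defined earlier in the notebook.
    # Start every weight in the first layer's kernel at zero.
    initializer = tf.keras.initializers.Zeros()
    model = tf.keras.Sequential()
    model.add(tf.keras.layers.Dense(units=FIRST_LAYER, input_shape=[4], kernel_initializer=initializer))
    model.add(tf.keras.layers.Dense(units=1))
    model.summary()
    opt = Adam(0.01)  # learning rate of 0.01
    model.compile(optimizer=opt, loss="mean_squared_error")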