Amro edited this page Jun 13, 2016 · 2 revisions

The data section is highlighted below:

[Image: the data section of the interface]

Each widget in this section is described below, along with how to use it:

Which dataset do you want to use?

This drop-down menu selects the dataset that the neural network is trained on, for either regression or binary classification. Each dataset is a set of randomly generated, labeled two-dimensional points whose coordinates fall within [-6, +6] in both dimensions. Positive points (+1 class) are shown in blue and negative points (-1 class) in orange.

You may choose from one of four possible datasets:

Circle XOR Gaussian Spiral

Circle

A dataset where the points with one label lie within a small radius of the center, and the points with the other label lie in a ring of larger radius around them. The two groups do not overlap.
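The description above can be sketched as follows. This is a minimal illustration, not the tool's actual generator: the function name `make_circle`, the radius of 5, the 0.5 and 0.7 radius fractions, and the choice of the inner disc as the positive class are all assumptions made for the example.

```python
import math
import random


def make_circle(n=500, radius=5.0, seed=0):
    """Return n points (x, y, label) for a circle-style dataset.

    Assumed layout: the inner disc is the +1 class and the outer
    ring is the -1 class, with a gap between them so they never overlap.
    """
    rng = random.Random(seed)
    points = []
    for i in range(n):
        if i < n // 2:
            # Inner disc: distance from the center in [0, 0.5 * radius]
            r = rng.uniform(0.0, 0.5 * radius)
            label = 1
        else:
            # Outer ring: distance from the center in [0.7 * radius, radius]
            r = rng.uniform(0.7 * radius, radius)
            label = -1
        angle = rng.uniform(0.0, 2.0 * math.pi)
        points.append((r * math.sin(angle), r * math.cos(angle), label))
    return points
```

Because the label is decided by the distance from the center alone, no straight line can separate the two classes, which is what makes this dataset a useful test for a network with hidden layers.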

XOR

A dataset laid out in a criss-cross pattern, where positive labels lie along one diagonal and negative labels lie along the other.
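One way to produce such a criss-cross layout is to label each point by the sign of the product of its coordinates. The sketch below assumes this rule, a sampling range of [-5, 5], and the hypothetical name `make_xor`; the tool itself may generate the points differently.

```python
import random


def make_xor(n=500, extent=5.0, seed=0):
    """Return n points (x, y, label) in an XOR-style layout."""
    rng = random.Random(seed)
    points = []
    for _ in range(n):
        x = rng.uniform(-extent, extent)
        y = rng.uniform(-extent, extent)
        # Same-sign quadrants (one diagonal) are +1,
        # opposite-sign quadrants (the other diagonal) are -1.
        label = 1 if x * y >= 0 else -1
        points.append((x, y, label))
    return points
```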

Gaussian

A somewhat easier dataset, where each label forms its own cluster of Gaussian-distributed points.
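A minimal sketch of two Gaussian clusters, one per label. The cluster centers at (2, 2) and (-2, -2), the standard deviation of 1, and the name `make_gaussian` are assumptions chosen for illustration.

```python
import random


def make_gaussian(n=500, center=2.0, spread=1.0, seed=0):
    """Return n points (x, y, label) drawn from two Gaussian clusters."""
    rng = random.Random(seed)
    points = []
    for i in range(n):
        label = 1 if i < n // 2 else -1
        # The +1 cluster is centered at (center, center),
        # the -1 cluster at (-center, -center).
        cx = cy = center * label
        points.append((rng.gauss(cx, spread), rng.gauss(cy, spread), label))
    return points
```

With well-separated centers, a single straight line already separates most of the points, which is why this dataset is the easiest of the four.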

Spiral

A challenging dataset in which the two classes form interleaved spirals.
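To make the shape concrete, here is a hedged sketch of a two-arm spiral. The number of turns, the maximum radius, and the name `make_spiral` are assumptions, and no noise is applied; the tool's own generator may differ.

```python
import math


def make_spiral(n=500, max_radius=5.0, turns=1.75):
    """Return n points (x, y, label) on two interleaved spiral arms."""
    points = []
    half = n // 2
    # The second arm is the first one rotated by half a turn (pi radians).
    for label, offset in ((1, 0.0), (-1, math.pi)):
        for i in range(half):
            frac = i / half
            r = frac * max_radius                       # radius grows along the arm
            t = turns * frac * 2.0 * math.pi + offset   # angle grows along the arm
            points.append((r * math.sin(t), r * math.cos(t), label))
    return points
```

Since the arms wind around each other, the decision boundary has to curl with them, which is why this dataset typically needs more neurons or layers than the others.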

Ratio of training to test data

This slider sets the percentage of the data used for training; the remainder is used for testing. 500 data points are always generated, then randomly permuted and split into training and test sets according to the slider value. The default is 50%, so half of the points are used for training and the other half for testing.
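The shuffle-then-split step can be sketched as below. The function name `split_train_test` and its parameters are hypothetical; only the behavior (random permutation, then a split by percentage) follows the description above.

```python
import random


def split_train_test(points, train_percent=50, seed=0):
    """Randomly permute the points, then split them by percentage."""
    rng = random.Random(seed)
    shuffled = list(points)          # copy so the caller's list is untouched
    rng.shuffle(shuffled)            # random permutation of all points
    n_train = len(shuffled) * train_percent // 100
    return shuffled[:n_train], shuffled[n_train:]
```

For example, with 500 points and the default of 50%, both sets contain 250 points; at 80% the training set holds 400 points and the test set 100.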

Noise

You can add random noise to the data to make training the neural network more challenging. The noise is specified as a percentage, so a value of 10 means that 10% (0.1) noise is added on top of the original data. A value of 0 gives clean data. The default value is 10% (0.1).
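The page does not say exactly how the noise is applied, so the following is only one plausible reading: each coordinate is shifted by a uniform random amount of at most `noise` times the data range. The name `add_noise`, the uniform distribution, and the `extent` of 6 are assumptions.

```python
import random


def add_noise(points, noise=0.1, extent=6.0, seed=0):
    """Shift each coordinate by a random amount of at most noise * extent.

    `noise` is a fraction, so 0.1 corresponds to a slider value of 10%.
    """
    rng = random.Random(seed)
    limit = noise * extent
    noisy = []
    for x, y, label in points:
        dx = rng.uniform(-limit, limit)
        dy = rng.uniform(-limit, limit)
        noisy.append((x + dx, y + dy, label))  # labels are left unchanged
    return noisy
```

Under this reading, a noise value of 0 returns the points unchanged, and larger values blur the boundary between the two classes.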

Batch Size

This tool trains the neural network with Stochastic Gradient Descent. At each iteration or epoch, a batch of training samples of the specified size is randomly selected, and the neural network is updated so that regression or classification of these points becomes more accurate. The batch size is set here; the default is 10.
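To illustrate the role of the batch size, here is a minimal sketch of mini-batch Stochastic Gradient Descent. It fits a single slope `w` in `y = w * x` instead of a neural network, purely to keep the example short; the function name `sgd_fit_slope`, the learning rate, and the step count are assumptions.

```python
import random


def sgd_fit_slope(data, batch_size=10, learning_rate=0.5, steps=200, seed=0):
    """Fit w in y = w * x by mini-batch SGD on the squared error."""
    rng = random.Random(seed)
    w = 0.0
    for _ in range(steps):
        # Randomly sample a batch of the requested size from the training data.
        batch = rng.sample(data, batch_size)
        # Gradient of the mean of 0.5 * (w * x - y)**2 over the batch.
        grad = sum((w * x - y) * x for x, y in batch) / batch_size
        # Move w a small step against the gradient.
        w -= learning_rate * grad
    return w


# Usage: recover the slope 3.0 from noiseless training data.
data_rng = random.Random(1)
training_data = [(x, 3.0 * x) for x in (data_rng.uniform(-1.0, 1.0) for _ in range(250))]
fitted_w = sgd_fit_slope(training_data)
```

A smaller batch makes each update cheaper but noisier, while a larger batch gives a steadier gradient estimate at a higher cost per update.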

Regenerate

Click this button to generate a new set of points and split it into fresh training and test data, so that the neural network can be trained again from scratch.