GitHub - aaiijmrtt/LEARNING: exploring machine learning: neural networks and genetic algorithms

#LEARNING

Exploring Machine Learning: Neural Networks and Genetic Algorithms

The code in this repository contains implementations of Distributed Neural Networks exploring different degrees of training parallelism. Version 1 implements batch Gradient Descent over the divided dataset. Version 2 implements stochastic Gradient Descent on the divided dataset, each set distributed to a different network, followed by a combination using Genetic Algorithms to evolve the best trained network. Both were written keeping the Map Reduce Paradigm in mind for easy extension to distributed filesystems such as Apache's Hadoop. Data is read from stdin and written to stdout to better utilize Hadoop's Streaming API. The Hadoop implementations have not been uploaded.

###v1

Shell Scripts:

driver.sh: runs training and testing scripts

Instructions:

Setting Executable Bit: $chmod +x driver.sh
Executing: $./driver.sh [datafile] [fractiontest] [chunksize] [modelfile] [eta] [alpha] [topology]

Arguments:

datafile: (string) name of file from which data is to be read
fractiontest: (float) fraction of data to be used for testing
chunksize: (int) number of training samples in each batch
modelfile: (string) name of file from/to which model is to be read/written
eta: (float) neural network learning parameter
alpha: (float) neural network learning parameter
topology: (list: int) list of integers defining the number of neurons in each layer

Input File Format:

Each line is a training or testing instance containing both inputs and outputs. Each line contains a whitespace separated sequence of numbers. The 1st and 2nd numbers are integers, denoting the number of inputs and outputs on the line, respectively. They are followed by the specified number of floating point inputs and outputs, in order.

Note:

Requires BASH (or equivalent shell).
Requires Python.

Python Scripts:

dividedata.py: divides data into training and test sets
testmap.py: mapper for testing distributed neural network
testreduce.py: reducer for testing distributed neural network
trainmap.py: mapper for training distributed neural network
trainreduce.py: reducer for training distributed neural network

Note:

Requires Python

Python Classes:

DNN.py: distributed neural network implementation

Note:

Requires Python

###v2

Python Scripts:

learn.py: runs the training and testing

Instructions:

Setting Executable Bit: $chmod +x learn.py
Executing: $./learn.py [datafile] [fractiontest] [parallel] [modelfile] [crossover] [mutation] [generations] [extinction] [eta] [alpha] [topology]

Arguments:

datafile: (string) name of file from which data is to be read
fractiontest: (float) fraction of data to be used for testing
parallel: (int) number of training samples for each network
modelfile: (string) name of file from/to which model is to be read/written
crossover: (float) genetic algorithm evolving parameter
mutation: (float) genetic algorithm evolving parameter
generations: (int) genetic algorithm evolving parameter
extinction: (float) genetic algorithm evolving parameter
eta: (float) neural network learning parameter
alpha: (float) neural network learning parameter
topology: (list: int) list of integers defining the number of neurons in each layer

Input File Format:

Each line is a training or testing instance containing both inputs and outputs. Each line contains a whitespace separated sequence of numbers. The 1st and 2nd numbers are integers, denoting the number of inputs and outputs on the line, respectively. They are followed by the specified number of floating point inputs and outputs, in order.

Note:

Requires Python.

Python Classes:

NN.py: neural network implementation
GA.py: genetic algorithm implementation

Note:

Requires Python.

Name		Name	Last commit message	Last commit date
Latest commit History 2 Commits
v1		v1
v2		v2
LICENSE		LICENSE
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

v1

v1

v2

v2

LICENSE

LICENSE

README.md

README.md

Repository files navigation

About

Releases

Packages

Languages

License

aaiijmrtt/LEARNING

Folders and files

Latest commit

History

Repository files navigation

About

Resources

License

Stars

Watchers

Forks

Languages