Senior Thesis Project: Reservoir Computing and Nonlinear Dynamics

Introduction

What Is Artificial Intelligence

Humans grow, learn, and adapt to situations based on an accumulation of past experiences, while computers follow the instructions given to them by humans. But what if computers could also learn from prior data? Artificial intelligence is the pinnacle of computing research over the last 200 years: from mechanical computers in the 1800s, to the more complex electro-mechanical computers of the early 1900s, to the programmable personal computer as we know it today.

Machine learning is a subset of artificial intelligence in which computers use statistical methods to improve models with experience: the more data a model is exposed to, the better it becomes. Models that predict height from weight, or age from features in a picture, are examples of machine learning models.
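
As a minimal sketch of this idea, the snippet below fits a simple regression model that predicts height from weight using scikit-learn. The numbers and the choice of a linear model are purely illustrative assumptions, not data or methods from this thesis.

```python
# Minimal sketch: a model that predicts height from weight.
# The training values below are made up purely for illustration.
import numpy as np
from sklearn.linear_model import LinearRegression

weights = np.array([[50.0], [60.0], [70.0], [80.0], [90.0]])  # kg
heights = np.array([155.0, 165.0, 172.0, 178.0, 185.0])       # cm

model = LinearRegression()
model.fit(weights, heights)        # "experience": the model sees the training data

print(model.predict([[75.0]]))     # predict height for an unseen weight
```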

Deep learning falls under the umbrella of machine learning and uses algorithms loosely inspired by the structure and function of the brain, called artificial neural networks (henceforth ANNs). ANNs are composed of neurons connected by weighted synapses, and adjusting the weights of the synapses changes the output of the ANN. For example, how would a human distinguish a square from another shape? One can break the square down into important features: does it have four sides? Are the four sides connected at right angles? Are the sides of equal length? In our brains, we decide which features in the image are important in determining whether the shape is a square before making an informed decision. The ANN works similarly: all the features are recognized, some are deemed more important than others, and the synaptic weights are updated according to how important each feature is. This all takes place in the training stage. One feeds in labeled images of shapes, some squares and some not, and the labels are used to update the synaptic weights so that the ANN "learns" to associate images with patterns and labels; as with all machine learning models, more data in the training stage makes the model more accurate. The second stage is running the trained model and letting the ANN tell us the labels of images it has not "seen" yet.

Neural Networks

There are three prominent types of neural networks: feed-forward, convolutional, and recurrent, each with different use cases.

Feed-forward Neural Networks

Aptly named, a feed-forward neural network connects the neurons of each layer only to the neurons of the next layer and does not form a cycle; information flows through the network in one direction. This type of network is notable for its universal approximation property: the Universal Approximation Theorem states that a feed-forward neural network can approximate any Borel measurable function to any desired non-zero error, as long as it contains an input layer, at least one hidden layer with a nonlinear activation function, and an output layer. The theorem guarantees that any non-zero error can be achieved, but it makes no statement about how large the network must be to reach that accuracy.
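
The sketch below illustrates the universal approximation idea with a small feed-forward network (one hidden layer plus a nonlinear activation) fit to samples of sin(x). The library, network size, and other hyperparameters are arbitrary choices for illustration, not settings used in this project.

```python
# Sketch: one hidden layer + tanh activation approximating sin(x).
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
X = rng.uniform(-np.pi, np.pi, size=(500, 1))   # input layer: one feature
y = np.sin(X).ravel()                            # target function to approximate

net = MLPRegressor(hidden_layer_sizes=(50,),     # a single hidden layer of 50 neurons
                   activation="tanh",            # nonlinear activation function
                   max_iter=5000, random_state=0)
net.fit(X, y)

X_test = np.linspace(-np.pi, np.pi, 5).reshape(-1, 1)
print(np.c_[np.sin(X_test), net.predict(X_test).reshape(-1, 1)])  # target vs. approximation
```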

Convolutional Neural Networks

Convolutional neural networks work in much the same way as feed-forward neural networks, except that instead of being fully connected (each neuron connected to every neuron in the following layer), they use a mathematical operation, convolution, in place of general matrix multiplication in at least one layer. This type of network is loosely based on the visual cortex of the brain and is mostly used for image and video recognition, classification, and analysis, in addition to natural language processing.
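
As a small illustration of the convolution operation that replaces matrix multiplication in such a layer, the sketch below slides an edge-detecting kernel over a toy image. The image and kernel are invented for illustration only.

```python
# Sketch: a kernel sliding over an image produces a feature map,
# the basic operation of a convolutional layer.
import numpy as np
from scipy.signal import convolve2d

image = np.zeros((6, 6))
image[:, 3:] = 1.0                    # toy "image": dark left half, bright right half

kernel = np.array([[1.0, 0.0, -1.0],
                   [1.0, 0.0, -1.0],
                   [1.0, 0.0, -1.0]]) # simple vertical-edge detector

feature_map = convolve2d(image, kernel, mode="valid")
print(feature_map)                    # strong response along the vertical edge
```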

Recurrent Neural Networks

The type of ANN employed in this thesis is the recurrent neural network (RNN), in which connections between nodes can form cycles, so the output of a node can feed back into earlier nodes, in contrast to feed-forward and convolutional networks. The advantage of a recurrent neural network is that its nodes and synapses form a directed graph along a temporal sequence, allowing it to exhibit temporal dynamic behavior. The type of RNN used in this project is a reservoir computer, a type of echo state network (ESN). ESNs still have a hidden middle layer, though it is very sparsely connected. The synapses within this hidden layer are fixed at random values, and only the synapses connecting the hidden layer to the output layer are trained, which suits this type of ANN to reproducing or generating temporal data.

Reservoir Computing

Reservoir computing is a computing framework derived from recurrent neural network models; it treats the network as a dynamical system rather than as a function, making it well suited to problems involving time series data. A recurrent neural network has essentially three layers of connected "neurons". The first is a multidimensional input, which connects to a multidimensional but "hidden" network layer, which in turn leads to the output. Additionally, some outputs may connect back to the network layer if feedback is required. The synaptic connections are represented by matrices whose dimensions depend on the vectors they connect. As with all machine learning algorithms, recurrent neural networks (and thus reservoir computers) require a training data set, used to update the connection weights so that the error between the actual and target outputs is minimized; typically, the root mean squared error is used. In the reservoir computing method of training a recurrent neural network, all weights are kept fixed except the output weights, which greatly simplifies both the implementation and the training of the network.
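
The following is a minimal echo state network sketch in NumPy of the training scheme described above: the input and reservoir weights are random and fixed, and only the output weights are fit, here by ridge regression on a one-step-ahead prediction task. The reservoir size, spectral-radius scaling, regularization, and the sine-wave training signal are all illustrative assumptions, not the settings used in this thesis.

```python
# Minimal echo state network: fixed random reservoir, trained output weights.
import numpy as np

rng = np.random.default_rng(42)
n_reservoir, n_input = 200, 1

# Fixed random input and reservoir weights (never trained)
W_in = rng.uniform(-0.5, 0.5, (n_reservoir, n_input))
W = rng.uniform(-0.5, 0.5, (n_reservoir, n_reservoir))
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))   # scale spectral radius below 1

# Training signal: one-step-ahead prediction of a sine wave
t = np.arange(0, 60, 0.1)
u = np.sin(t)
states = np.zeros((len(u) - 1, n_reservoir))
x = np.zeros(n_reservoir)
for k in range(len(u) - 1):
    x = np.tanh(W_in @ [u[k]] + W @ x)            # reservoir state update
    states[k] = x
targets = u[1:]

# Train only the output weights with ridge regression
ridge = 1e-6
W_out = np.linalg.solve(states.T @ states + ridge * np.eye(n_reservoir),
                        states.T @ targets)

pred = states @ W_out
rmse = np.sqrt(np.mean((pred - targets) ** 2))    # root mean squared error
print(f"training RMSE: {rmse:.4f}")
```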

Chaotic Systems

Since the proposed project has much more to do with reservoir computing than with chaotic systems, this subsection will be brief. In essence, chaos is the opposite of what we have learned in physics courses over the last three years: aperiodic behavior that is highly sensitive to initial conditions, nearly the exact opposite of the sinusoidal or periodic motion we have studied so extensively. As shown by Shofer as well as Lu et al., reservoir computers have successfully acted as observers for chaotic or nonlinear systems.

Project

This project aims to further investigate reservoir computing as a whole and to study reservoir computers as observers of chaotic systems. Specifically, I seek to study the Lorenz and Rössler systems and to determine why inference does not benefit from memory or nonlinearity in some cases.
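
As one way to generate trajectories from such a system, the sketch below integrates the Lorenz equations with SciPy using the standard parameter values sigma = 10, rho = 28, beta = 8/3. The time span and initial condition are arbitrary illustrative choices; a Rössler trajectory could be produced the same way by swapping in its equations.

```python
# Sketch: integrating the Lorenz system to produce time-series data,
# e.g. to drive a reservoir with x(t) and infer y(t) and z(t).
import numpy as np
from scipy.integrate import solve_ivp

def lorenz(t, state, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    x, y, z = state
    return [sigma * (y - x), x * (rho - z) - y, x * y - beta * z]

t_eval = np.arange(0.0, 50.0, 0.01)
sol = solve_ivp(lorenz, (0.0, 50.0), [1.0, 1.0, 1.0], t_eval=t_eval)

xyz = sol.y.T            # trajectory, shape (n_steps, 3)
print(xyz[:5])
```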