contains Reinforcement Learning algorithms
class that implements reinforcement learning using the synchronous action-value iteration algorithm
The implemented algorithms can be trained to find a policy for the optimal way through the following two tasks:
- 1-dimensional, with final states in the states [1] and [10]

         1   -1   -1   10

  which should result in the following policy, B indicating a blocked state, X indicating a final state:

         X    →    →    X

- 2-dimensional, with final states in the states [1] and [-10]

         0    0    0    1
         0    B    0  -10
         0    0    0    0

  which should result in the following policy, B indicating a blocked state, X indicating a final state:

         →    →    →    X
         ↑    B    ↑    X
         ↑    ←    ↑    ←
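As a rough illustration of synchronous action-value iteration, the sketch below solves only the 1-dimensional task. It is not the repository's implementation: the names `value_iteration` and `greedy_policy`, the discount factor of 0.9, the deterministic moves, and the convention that a reward is collected on entering a cell are all assumptions made for this example.

```python
GAMMA = 0.9                    # discount factor (assumed, not from the source)
REWARDS = [1, -1, -1, 10]      # the 1-dimensional task from the text
TERMINAL = {0, 3}              # indices of the final states
ACTIONS = {"←": -1, "→": +1}   # deterministic moves (assumed)


def value_iteration(rewards, terminal, gamma=GAMMA, sweeps=50):
    """Synchronous action-value iteration over a 1-D line of states."""
    n = len(rewards)
    q = [{a: 0.0 for a in ACTIONS} for _ in range(n)]
    for _ in range(sweeps):
        # Synchronous update: read only the old table, write into a copy.
        new_q = [dict(row) for row in q]
        for s in range(n):
            if s in terminal:
                continue
            for action, step in ACTIONS.items():
                nxt = min(max(s + step, 0), n - 1)
                future = 0.0 if nxt in terminal else max(q[nxt].values())
                new_q[s][action] = rewards[nxt] + gamma * future
        q = new_q
    return q


def greedy_policy(q, terminal):
    """Pick the best action per state; final states are marked with X."""
    return ["X" if s in terminal else max(q[s], key=q[s].get)
            for s in range(len(q))]


q_table = value_iteration(REWARDS, TERMINAL)
policy = greedy_policy(q_table, TERMINAL)
print(" ".join(policy))  # X → → X
```

Under these assumptions the greedy policy matches the 1-dimensional policy shown above. The 2-dimensional task would additionally need a blocked-cell check and a transition model, which this sketch leaves out.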
implements feature selection using a genetic algorithm. The goal is to reduce the number of features passed to the downstream machine learning model, which lowers its processing time and resource requirements.
Genetic algorithms can also be used to produce new, and therefore more, data. Those samples can be used to train a machine learning algorithm either with more data or with better data. Genetic algorithms cannot be used for all tasks, because not all kinds of data samples can take on random values (e.g. port numbers in network traffic, or features that were encoded, for instance by OneHotEncoding). These algorithms work similarly to Darwin's theory of evolution, simulating mutations, crossovers and natural selection.
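The loop of natural selection, crossover and mutation described above could look roughly like the following sketch for feature selection. Each individual is a 0/1 mask over the features. Everything here is hypothetical and chosen for illustration: the function name `evolve_feature_mask`, its parameters, the set `USEFUL` and the `toy_fitness` function. In a real setting the fitness would more likely be a validation score of the model trained on the selected features.

```python
import random


def evolve_feature_mask(fitness, n_features, pop_size=20, generations=30,
                        mutation_rate=0.05, seed=0):
    """Evolve a 0/1 feature mask that scores well under `fitness`."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_features)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Natural selection: keep the fitter half of the population.
        population.sort(key=fitness, reverse=True)
        survivors = population[: pop_size // 2]
        children = []
        while len(survivors) + len(children) < pop_size:
            mother, father = rng.sample(survivors, 2)
            # Crossover: splice the two parents at a random cut point.
            cut = rng.randint(1, n_features - 1)
            child = mother[:cut] + father[cut:]
            # Mutation: flip each bit with a small probability.
            child = [1 - gene if rng.random() < mutation_rate else gene
                     for gene in child]
            children.append(child)
        population = survivors + children
    return max(population, key=fitness)


USEFUL = {0, 3, 5}  # hypothetical indices of the informative features


def toy_fitness(mask):
    """Reward informative features, penalise every selected feature."""
    hits = sum(mask[i] for i in USEFUL)
    return hits - 0.1 * sum(mask)


best = evolve_feature_mask(toy_fitness, n_features=8)
print(best, toy_fitness(best))
```

Because the survivors are carried over unchanged, the best fitness in the population never decreases from one generation to the next. The per-feature penalty in `toy_fitness` is what pushes the search toward smaller feature sets.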
class that implements Hidden Markov Models (HMMs), including:
- Markov processes to estimate the probability of a given sequence
- forward and backward procedures for probability estimation
- the Viterbi algorithm for optimal state sequence estimation
- model re-estimation for calculating the optimal parameters
HMMs can be used to determine probabilities for different kinds of sequences when the individual sub-probabilities of the sequence are only partially known.
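To make the forward procedure and the Viterbi algorithm more concrete, here is a minimal sketch on a small two-state model. The model parameters (`START`, `TRANS`, `EMIT`) are an assumed textbook-style toy example and the function names are hypothetical; none of this is taken from the class described above. The backward procedure and model re-estimation are not covered.

```python
STATES = ["Healthy", "Fever"]
START = {"Healthy": 0.6, "Fever": 0.4}
TRANS = {"Healthy": {"Healthy": 0.7, "Fever": 0.3},
         "Fever": {"Healthy": 0.4, "Fever": 0.6}}
EMIT = {"Healthy": {"normal": 0.5, "cold": 0.4, "dizzy": 0.1},
        "Fever": {"normal": 0.1, "cold": 0.3, "dizzy": 0.6}}


def forward_probability(obs):
    """Forward procedure: total probability of the observation sequence."""
    alpha = {s: START[s] * EMIT[s][obs[0]] for s in STATES}
    for o in obs[1:]:
        alpha = {s: EMIT[s][o] * sum(alpha[p] * TRANS[p][s] for p in STATES)
                 for s in STATES}
    return sum(alpha.values())


def viterbi(obs):
    """Viterbi algorithm: most likely hidden state sequence and its probability."""
    delta = {s: START[s] * EMIT[s][obs[0]] for s in STATES}
    backpointers = []
    for o in obs[1:]:
        # For each state, remember the predecessor that maximises the path.
        best_prev = {s: max(STATES, key=lambda p: delta[p] * TRANS[p][s])
                     for s in STATES}
        delta = {s: delta[best_prev[s]] * TRANS[best_prev[s]][s] * EMIT[s][o]
                 for s in STATES}
        backpointers.append(best_prev)
    last = max(STATES, key=delta.get)
    path = [last]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, delta[last]


observations = ["normal", "cold", "dizzy"]
sequence_probability = forward_probability(observations)
best_path, best_path_probability = viterbi(observations)
print(best_path)  # ['Healthy', 'Healthy', 'Fever']
```

The forward procedure sums over all hidden state paths, while Viterbi keeps only the best one, so the Viterbi probability is never larger than the forward probability for the same observations.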
class that implements reinforcement learning using the Q-learning algorithm
The implemented algorithms can be trained to find a policy for the optimal way through the following two tasks:
- 1-dimensional, with final states in the states [1] and [10]

         1   -1   -1   10

  which should result in the following policy, B indicating a blocked state, X indicating a final state:

         X    →    →    X

- 2-dimensional, with final states in the states [1] and [-10]

         0    0    0    1
         0    B    0  -10
         0    0    0    0

  which should result in the following policy, B indicating a blocked state, X indicating a final state:

         →    →    →    X
         ↑    B    ↑    X
         ↑    ←    ↑    ←
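A tabular Q-learning loop for the 1-dimensional task might look like the sketch below. It is only an illustration under stated assumptions, not the class's actual code: the function name `q_learning`, the epsilon-greedy exploration, the hyperparameters (`alpha`, `gamma`, `epsilon`, `episodes`), the deterministic moves, and the convention that a reward is collected on entering a cell are all chosen for this example.

```python
import random

REWARDS = [1, -1, -1, 10]      # the 1-dimensional task from the text
TERMINAL = {0, 3}              # indices of the final states
ACTIONS = {"←": -1, "→": +1}   # deterministic moves (assumed)


def q_learning(episodes=1000, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Learn action values from sampled episodes instead of full sweeps."""
    rng = random.Random(seed)
    q = [{a: 0.0 for a in ACTIONS} for _ in REWARDS]
    start_states = [s for s in range(len(REWARDS)) if s not in TERMINAL]
    for _ in range(episodes):
        s = rng.choice(start_states)
        while s not in TERMINAL:
            # Epsilon-greedy: explore occasionally, otherwise act greedily.
            if rng.random() < epsilon:
                action = rng.choice(list(ACTIONS))
            else:
                action = max(q[s], key=q[s].get)
            nxt = s + ACTIONS[action]
            future = 0.0 if nxt in TERMINAL else max(q[nxt].values())
            target = REWARDS[nxt] + gamma * future
            # Move the estimate a fraction alpha toward the sampled target.
            q[s][action] += alpha * (target - q[s][action])
            s = nxt
    return q


q_table = q_learning()
policy = ["X" if s in TERMINAL else max(q_table[s], key=q_table[s].get)
          for s in range(len(REWARDS))]
print(" ".join(policy))  # X → → X
```

Unlike value iteration, this loop never reads the reward table for states it has not visited; it only learns from the transitions it samples, which is why some exploration is needed.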