
reinforcement_learning

Contains reinforcement learning algorithms.




Class that implements reinforcement learning using the synchronous action-value iteration algorithm. The implemented algorithm can be trained to find a policy for the optimal path through the following two tasks:

  • 1-dimensional, with the final states being the cells with the values [1] and [10]

    1 -1 -1 10

    → which should result in the following policy, B indicating a blocked state, X indicating a final state

    X X
  • 2-dimensional, with the final states being the cells with the values [1] and [-10]

    0 0 0 1
    0 B 0 -10
    0 0 0 0

    → which should result in the following policy, B indicating a blocked state, X indicating a final state

    X
    B X
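
The 1-dimensional task above can be worked through with a few lines of synchronous action-value iteration. This is a minimal sketch under stated assumptions, not the class from this folder: the names `action_value_iteration` and `greedy_policy` are hypothetical, moves are deterministic (left or right), the reward is the value of the cell that is entered, and the discount factor `gamma=0.9` is an arbitrary choice.

```python
def action_value_iteration(rewards, terminal, gamma=0.9, sweeps=50):
    """Synchronous action-value iteration on a 1-dimensional grid.

    Assumptions (not taken from the repository): deterministic moves,
    reward = value of the entered cell, terminal states have value 0.
    """
    n = len(rewards)
    moves = (-1, +1)  # action 0 = left, action 1 = right
    q = [[0.0, 0.0] for _ in range(n)]
    for _ in range(sweeps):
        # Synchronous update: every new value is computed from the old table.
        q_new = [[0.0, 0.0] for _ in range(n)]
        for s in range(n):
            if s in terminal:
                continue
            for a, step in enumerate(moves):
                nxt = min(max(s + step, 0), n - 1)
                future = 0.0 if nxt in terminal else gamma * max(q[nxt])
                q_new[s][a] = rewards[nxt] + future
        q = q_new
    return q


def greedy_policy(q, terminal):
    """Pick the best action per state; X marks a final state."""
    arrows = "←→"
    return ["X" if s in terminal else arrows[q[s].index(max(q[s]))]
            for s in range(len(q))]


q = action_value_iteration([1, -1, -1, 10], terminal={0, 3})
print(" ".join(greedy_policy(q, {0, 3})))
```

Under these assumptions both inner states prefer moving right, because the discounted reward of 10 outweighs the step cost of -1 and the nearer reward of 1. A different discount factor or reward convention can change that result.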




Implements feature selection using a genetic algorithm. The goal is to reduce the number of features passed to the downstream machine learning model, and with it the processing time and capacity the model requires.

Genetic algorithms are often used to produce new, and therefore more, data. The generated samples can be used to train a machine learning algorithm either with more data or with better data. Genetic algorithms cannot be used for every task, because not all kinds of data samples can take on random values (e.g. port numbers in network traffic, or features that were encoded by one-hot encoding). These algorithms work in a way similar to Darwin's theory of evolution: they simulate mutation, crossover and natural selection.
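
The mutation, crossover and selection steps described above can be sketched for feature selection as follows. This is a minimal sketch, not the implementation in this folder: `select_features`, `toy_fitness` and `USEFUL` are hypothetical names, each individual is a 0/1 mask over the features, and the toy fitness function stands in for what would normally be the validation score of a model trained on the selected features.

```python
import random


def select_features(n_features, fitness, pop_size=20, generations=30,
                    mutation_rate=0.05, seed=0):
    """Evolve a 0/1 feature mask that maximises fitness(mask)."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_features)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Natural selection: the fitter half survives unchanged.
        population.sort(key=fitness, reverse=True)
        survivors = population[: pop_size // 2]
        children = []
        while len(survivors) + len(children) < pop_size:
            mother, father = rng.sample(survivors, 2)
            # Crossover: head of one parent, tail of the other.
            cut = rng.randint(1, n_features - 1)
            child = mother[:cut] + father[cut:]
            # Mutation: flip each bit with a small probability.
            child = [1 - g if rng.random() < mutation_rate else g
                     for g in child]
            children.append(child)
        population = survivors + children
    return max(population, key=fitness)


# Hypothetical toy problem: features 0, 3 and 5 are informative,
# every additional selected feature only costs processing time.
USEFUL = {0, 3, 5}


def toy_fitness(mask):
    hits = sum(mask[i] for i in USEFUL)
    return 2 * hits - sum(mask)


best = select_features(8, toy_fitness)
print(best, toy_fitness(best))
```

Because the surviving half is carried over unchanged, the best fitness in the population never decreases from one generation to the next. Whether the global optimum is reached depends on the population size, the mutation rate and the number of generations.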




Class that implements Hidden Markov Models (HMMs), including:

  • Markov Processes to estimate the probability of a given sequence
  • forward + backward procedure for probability estimation
  • Viterbi-Algorithm for optimal state sequence estimation
  • model-reestimation for calculating the optimal parameters

HMMs can be used to determine the probabilities of different kinds of sequences when the individual probabilities that make up a sequence are only partially known.
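
Two of the procedures listed above, the forward procedure and the Viterbi algorithm, can be sketched in plain Python. This is a minimal sketch, not the class from this folder: the function names and the two-state model (`start`, `trans`, `emit`) are illustrative assumptions, and probabilities are multiplied directly, which is fine for short sequences but underflows for long ones (log-probabilities or scaling would be needed there).

```python
def forward(obs, start, trans, emit):
    """Forward procedure: probability of the observation sequence."""
    n = len(start)
    alpha = [start[i] * emit[i][obs[0]] for i in range(n)]
    for o in obs[1:]:
        alpha = [emit[j][o] * sum(alpha[i] * trans[i][j] for i in range(n))
                 for j in range(n)]
    return sum(alpha)


def viterbi(obs, start, trans, emit):
    """Viterbi algorithm: most likely state sequence and its probability."""
    n = len(start)
    delta = [start[i] * emit[i][obs[0]] for i in range(n)]
    back = []  # back[t][j] = best predecessor of state j at step t + 1
    for o in obs[1:]:
        prev = [max(range(n), key=lambda i: delta[i] * trans[i][j])
                for j in range(n)]
        delta = [delta[prev[j]] * trans[prev[j]][j] * emit[j][o]
                 for j in range(n)]
        back.append(prev)
    state = max(range(n), key=lambda i: delta[i])
    best_prob = delta[state]
    path = [state]
    for prev in reversed(back):  # follow the back-pointers to the start
        state = prev[state]
        path.append(state)
    return path[::-1], best_prob


# Illustrative model: 2 hidden states, 3 observation symbols.
start = [0.6, 0.4]
trans = [[0.7, 0.3], [0.4, 0.6]]
emit = [[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]]

print(forward([0, 1, 2], start, trans, emit))
print(viterbi([0, 1, 2], start, trans, emit))
```

The forward procedure sums over all hidden state paths, while Viterbi keeps only the best one, so the Viterbi probability can never exceed the forward probability for the same observations.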




Class that implements reinforcement learning using the Q-learning algorithm. The implemented algorithm can be trained to find a policy for the optimal path through the following two tasks:

  • 1-dimensional, with the final states being the cells with the values [1] and [10]

    1 -1 -1 10

    → which should result in the following policy, B indicating a blocked state, X indicating a final state

    X X
  • 2-dimensional, with the final states being the cells with the values [1] and [-10]

    0 0 0 1
    0 B 0 -10
    0 0 0 0

    → which should result in the following policy, B indicating a blocked state, X indicating a final state

    X
    B X
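
For the 1-dimensional task, tabular Q-learning with an epsilon-greedy behaviour policy could look roughly like this. It is a minimal sketch under stated assumptions, not the class from this folder: the names `q_learning` and `greedy_policy` are hypothetical, moves are deterministic, the reward is the value of the entered cell, and the hyperparameters (`alpha`, `gamma`, `epsilon`, `episodes`) are arbitrary choices.

```python
import random


def q_learning(rewards, terminal, episodes=2000, alpha=0.5, gamma=0.9,
               epsilon=0.2, max_steps=100, seed=0):
    """Tabular Q-learning on a 1-dimensional grid (assumed dynamics)."""
    rng = random.Random(seed)
    n = len(rewards)
    moves = (-1, +1)  # action 0 = left, action 1 = right
    starts = [s for s in range(n) if s not in terminal]
    q = [[0.0, 0.0] for _ in range(n)]
    for _ in range(episodes):
        s = rng.choice(starts)
        steps = 0
        while s not in terminal and steps < max_steps:
            # Epsilon-greedy: explore sometimes, otherwise act greedily.
            if rng.random() < epsilon:
                a = rng.randrange(2)
            else:
                a = q[s].index(max(q[s]))
            nxt = min(max(s + moves[a], 0), n - 1)
            future = 0.0 if nxt in terminal else gamma * max(q[nxt])
            # Move the estimate towards the one-step bootstrapped target.
            q[s][a] += alpha * (rewards[nxt] + future - q[s][a])
            s = nxt
            steps += 1
    return q


def greedy_policy(q, terminal):
    """Pick the best action per state; X marks a final state."""
    arrows = "←→"
    return ["X" if s in terminal else arrows[q[s].index(max(q[s]))]
            for s in range(len(q))]


q = q_learning([1, -1, -1, 10], terminal={0, 3})
print(" ".join(greedy_policy(q, {0, 3})))
```

Unlike a synchronous sweep over all states, Q-learning only updates the state-action pair it has just visited, so the result depends on sufficient exploration. With the settings above, the learned values should favour moving right from both inner states.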