FitML

model.fit(Machine_Learning, epochs=Inf)

What is Fit ML

Fit Machine Learning (FitML) is a blog that houses a collection of Python Machine Learning articles and examples, often focusing on Reinforcement Learning. Here you will find code related to Q-Learning, Actor-Critic, MDPs, the Bellman equation, OpenAI Gym solutions and custom-implemented approaches to some of the toughest and most interesting problems to date (yes, I am "biased").

Who is Michel Aka

Michel is an AI researcher and a graduate of the University of Montreal who currently works in the healthcare industry.

How to Use the Reinforcement Learning Algorithms

  • (Optional) Clone the repo
  • Select the algorithm you need (folders are named after the RL algorithm): Policy Gradient / Parameter Noising / Actor Critic / Selective Memory.
  • Get an instance of the algorithm with the environment you need. If the environment you are looking for isn't there, take any environment script from the algorithm folder of your choice and follow the steps below.
  • Install the dependencies
    • Usually "pip install ". Example "pip install pygal"
  • Replace the name of the environment in line 81 of the code.
 env = gym.make('BipedalWalker-v2')
 # replace with
 env = gym.make('<your-environment-name-here>')

Or set ENVIRONMENT_NAME to your environment name, e.g. ENVIRONMENT_NAME = "BipedalWalker-v2".

  • Set the environment's observation and action space variables (see the sketch after this list). If you don't know them, run the script once and they will be printed in the first lines of your output.
 num_env_variables = <number of observation variables here>
 num_env_actions = <number of action variables here>
  • (Optional) You can check your agent's progress via the .svg files written to the same directory as your script; any modern browser can view them.
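
For reference, here is a minimal sketch of the setup steps above, assuming gym is installed and using BipedalWalker-v2 as the example environment; the variable names mirror the ones used in the scripts.

 import gym

 ENVIRONMENT_NAME = "BipedalWalker-v2"  # replace with your environment name
 env = gym.make(ENVIRONMENT_NAME)

 # Run this once to see the spaces if you don't already know them.
 print("observation space:", env.observation_space)
 print("action space:", env.action_space)

 # For Box spaces the counts come straight from the space shapes.
 num_env_variables = env.observation_space.shape[0]
 num_env_actions = env.action_space.shape[0]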

RL Approaches

Optimal Policy Tree Search

This is an RL technique characterized by computing the estimated value of the expected sum of rewards for n time steps ahead. It has the advantage of yielding a better estimate of the value of following a specific policy, but it is computationally expensive and memory inefficient. Given a supercomputer and a very large amount of memory, this technique would do extremely well on discrete action-space problems/environments. I believe AlphaGo uses a variant of this technique.

See examples and find out more about Optimal Policy Tree Search here.
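
As a rough illustration (not the repository's implementation), an exhaustive n-step lookahead can be sketched as follows; step_model is a hypothetical stand-in for whatever simulator or learned model of the environment is available.

 def best_action(state, step_model, actions, depth, gamma=0.99):
     # Pick the action maximizing the estimated sum of rewards over
     # `depth` time steps ahead, by exhaustive (exponential) tree search.
     def value_of(s, d):
         if d == 0:
             return 0.0
         best = float("-inf")
         for a in actions:
             s_next, reward, done = step_model(s, a)  # hypothetical model call
             v = reward if done else reward + gamma * value_of(s_next, d - 1)
             best = max(best, v)
         return best

     scored = []
     for a in actions:
         s_next, reward, done = step_model(state, a)
         v = reward if done else reward + gamma * value_of(s_next, depth - 1)
         scored.append((v, a))
     return max(scored, key=lambda va: va[0])[1]

 # Toy usage: a 1-D world where stepping right is rewarded.
 step_model = lambda s, a: (s + a, float(a > 0), False)
 print(best_action(0, step_model, actions=[-1, +1], depth=3))  # -> +1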

Selective Memory

As far as I know, no one in the literature has implemented this technique before.

The intuition behind Policy Gradient is that it optimizes the parameters of the network in the direction of a higher expected sum of rewards. What if we could do the same in a computationally cheaper way that also turns out to be more intuitive? Enter what I am calling Selective Memory.

We choose what to commit to memory based on the actual sum of rewards collected.

Find out more here.
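
A minimal sketch of that selection step, under the assumption that each episode is stored as its (state, action) transitions plus the total reward actually collected; the keep-above-average rule here is one simple threshold choice, not necessarily the exact one used in the scripts.

 import numpy as np

 def select_memory(episodes, memory=None):
     # Keep only transitions from episodes whose actual sum of rewards
     # beats the average return over this batch of episodes.
     if memory is None:
         memory = []
     threshold = np.mean([ep["total_reward"] for ep in episodes])
     for ep in episodes:
         if ep["total_reward"] > threshold:
             memory.extend(ep["transitions"])  # (state, action) pairs worth imitating
     return memory

 # The retained pairs can then be fit directly by the policy network,
 # e.g. model.fit(states, actions), pushing it toward behaviour that
 # actually produced high returns.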

Q-Learning

Q-Learning is a well-known Reinforcement Learning approach, popularized by Google DeepMind when they used it to master multiple early console-era games. Q-Learning estimates the expected sum of rewards using the Bellman equation in order to determine which action to take. It works especially well in discrete action spaces and on problems where the mapping f(S)->Q is differentiable, which is not always the case.

Find out more about Q-Learning here.
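
For reference, a single tabular Bellman backup looks like the sketch below; the deep-learning variants replace the table with a network approximating f(S)->Q.

 import numpy as np

 def q_update(Q, s, a, r, s_next, done, alpha=0.1, gamma=0.99):
     # One Bellman backup: move Q[s, a] toward r + gamma * max_a' Q[s_next, a'].
     target = r if done else r + gamma * np.max(Q[s_next])
     Q[s, a] += alpha * (target - Q[s, a])
     return Q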

Actor-Critic Approaches

Actor-Critic is an RL technique which combines the Policy Gradient approach with a Critic (a Q-value estimator).

Find out more about Actor-Critic here.
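
A minimal sketch of one actor-critic update with linear function approximation and a TD-error critic (a state-value critic rather than a full Q estimator); this illustrates the idea, not the repository's Keras implementation.

 import numpy as np

 def actor_critic_step(theta, w, phi_s, a, r, phi_s_next, done,
                       lr_actor=0.01, lr_critic=0.1, gamma=0.99):
     # theta: policy weights (n_actions x n_features), w: value weights (n_features,)
     # Critic: the TD error is the "critique" of the action just taken.
     v_s = w @ phi_s
     v_next = 0.0 if done else w @ phi_s_next
     td_error = r + gamma * v_next - v_s
     w += lr_critic * td_error * phi_s

     # Actor: softmax policy gradient, scaled by the critic's TD error.
     logits = theta @ phi_s
     probs = np.exp(logits - logits.max())
     probs /= probs.sum()
     grad_log_pi = -np.outer(probs, phi_s)  # d log pi(a|s) / d theta
     grad_log_pi[a] += phi_s
     theta += lr_actor * td_error * grad_log_pi
     return theta, w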

Recommended Progression for the Newcomer

[coming soon]
