piyushpathak03/Reinforcement-Learning

Reinforcement Learning

Reinforcement Learning with Python takes you from basic reinforcement learning algorithms to advanced deep reinforcement learning algorithms. The repo provides easy-to-read code examples, with one file for each algorithm.

Dependencies

  1. Python 3.5
  2. TensorFlow 1.0.0
  3. Keras
  4. numpy
  5. pandas
  6. matplotlib
  7. pillow
  8. scikit-image
  9. h5py

Install Requirements

pip install -r requirements.txt
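
After installing, it can be useful to confirm that the dependencies are visible to your interpreter. The following is a minimal sketch, not part of this repo: the helper name `find_missing` is hypothetical, and the import names in `ASSUMED_IMPORT_NAMES` are assumptions about how the listed packages are imported (for example, pillow is usually imported as `PIL` and scikit-image as `skimage`). It only checks that each module can be located; it does not check versions.

```python
import importlib.util

# Assumed import names for the packages listed under Dependencies.
# These are illustrative guesses, not taken from requirements.txt.
ASSUMED_IMPORT_NAMES = (
    "tensorflow",
    "keras",
    "numpy",
    "pandas",
    "matplotlib",
    "PIL",       # pillow
    "skimage",   # scikit-image
    "h5py",
)


def find_missing(import_names):
    """Return a sorted list of top-level module names that cannot be located."""
    return sorted(
        name for name in import_names
        if importlib.util.find_spec(name) is None
    )


if __name__ == "__main__":
    missing = find_missing(ASSUMED_IMPORT_NAMES)
    if missing:
        print("Could not locate:", ", ".join(missing))
    else:
        print("All assumed dependencies were found.")
```

If anything is reported as missing, rerunning the install command above in the same environment is the first thing to try.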

Table of Contents

  • 1.1. What is Reinforcement Learning
  • 1.2. Reinforcement Learning Cycle
  • 1.3. How RL differs from other ML Paradigms?
  • 1.4. Elements of Reinforcement Learning
  • 1.5. Agent Environment Interface
  • 1.6. Types of RL Environments
  • 1.7. Reinforcement Learning Platforms
  • 1.8. Applications of Reinforcement Learning
  • 2.1. Setting Up Your Machine
  • 2.2. Installing Anaconda
  • 2.3. Installing Docker
  • 2.4. Installing OpenAI Gym and Universe
  • 2.5. Common Error Fixes
  • 2.6. OpenAI Gym
  • 3.1. Markov Chain and Markov Process
  • 3.2. Markov Decision Process
  • 3.3. Rewards and Returns
  • 3.4. Episodic and Continuous Tasks
  • 3.5. Policy Function
  • 3.6. State Value Function
  • 3.7. State-Action Value Function (Q Function)
  • 3.8. Bellman Equation and Optimality
  • 3.9. Deriving Bellman Equation for Value and Q functions
  • 3.10. Solving the Bellman Equation
  • 3.11. Dynamic Programming
  • 3.12. Solving Frozen Lake Problem using Value Iteration
  • 3.13. Solving Frozen Lake Problem using Policy Iteration
  • 4.1. Monte Carlo Methods
  • 4.2. Estimating Value of Pi Using Monte Carlo
  • 4.3. Monte Carlo Prediction
  • 4.4. First visit Monte Carlo
  • 4.5. Every visit Monte Carlo
  • 4.6. BlackJack with Monte Carlo
  • 4.7. Monte Carlo Control
  • 4.8. Monte Carlo Exploration Starts
  • 4.9. On Policy Monte Carlo Control
  • 4.10. Off Policy Monte Carlo Control
  • 5.1. Temporal Difference Learning
  • 5.2. TD Prediction
  • 5.3. TD Control
  • 5.4. Q Learning
  • 5.5. Solving the Taxi Problem using Q learning
  • 5.6. SARSA
  • 5.7. Solving the Taxi Problem using SARSA
  • 5.8. Difference Between Q learning and SARSA
  • 6.1. Multi-armed Bandit Problem
  • 6.2. Epsilon-Greedy Algorithm
  • 6.3. Softmax Exploration Algorithm
  • 6.4. Upper Confidence Bound Algorithm
  • 6.5. Thompson Sampling Algorithm
  • 6.6. Applications of MAB
  • 6.7. Identifying Right Advertisement Banner Using MAB
  • 6.8. Contextual Bandits
  • 7.1. Artificial Neurons
  • 7.2. Artificial Neural Network
  • 7.3. Activation Functions
  • 7.4. Deep Dive into ANN
  • 7.5. Gradient Descent
  • 7.6. Neural Networks in Tensorflow
  • 7.7. Recurrent Neural Network
  • 7.8. Backpropagation Through Time
  • 7.9. Long Short Term Memory RNN
  • 7.10. Generating Song Lyrics using LSTM RNN
  • 7.11. Convolutional Neural Networks
  • 7.12. CNN Architecture
  • 7.13. Classifying Fashion Products Using CNN
  • 8.1. What is Deep Q network
  • 8.2. Architecture of DQN
  • 8.3. Convolutional Network
  • 8.4. Experience Replay
  • 8.5. Target Network
  • 8.6. Clipping Rewards
  • 8.7. DQN Algorithm
  • 8.8. Building an Agent to Play Atari Games
  • 8.9. Double DQN
  • 8.10. Dueling Architecture
  • 9.1. Deep Recurrent Q Network
  • 9.2. Partially Observable MDP
  • 9.3. Architecture of DRQN
  • 9.4. Basic Doom Game
  • 9.5. Build an Agent to Play Doom Game using DRQN
  • 9.6. Deep Attention Recurrent Q Network
  • 10.1. Asynchronous Actor Critic Algorithm
  • 10.2. The three A's
  • 10.3. Architecture of A3C
  • 10.4. Working of A3C
  • 10.5. Drive up the Mountain with A3C
  • 10.6. Visualization in Tensorboard
  • 11.1. Policy Gradient
  • 11.2. Lunar Lander Using Policy Gradient
  • 11.3. Deep Deterministic Policy Gradient
  • 11.4. Swinging up the Pendulum using DDPG
  • 11.5. Trust Region Policy Optimization
  • 11.6. Proximal Policy Optimization
  • 12.1. Environment Wrapper Functions
  • 12.2. Dueling Network
  • 12.3. Replay Buffer
  • 12.4. Training the Network
  • 12.5. Car Racing
  • 13.1. Imagination Augmented Agents
  • 13.2. Learning From Human Preference
  • 13.3. Deep Q Learning From Demonstrations
  • 13.4. Hindsight Experience Replay
  • 13.5. Hierarchical Reinforcement Learning
  • 13.6. Inverse Reinforcement Learning

About me

Piyush Pathak

PORTFOLIO

GITHUB

BLOG

📫 Follow me:

LinkedIn
