Experiments with Reinforcement Learning

Ideally, Machine Learning models are of 3 categories: Supervised Learning, Unsupervised Learning, and Reinforcement Learning. While Unsupervised and Supervised are frequently discussed, Reinforcement Learning is rarely discussed.

Reinforcement Learning (RL) is a type of machine learning where an agent learns to make decisions by performing actions in an environment to maximize some notion of cumulative reward. Much like Supervised Learning, Reinforcement Learning requires labeled data for training.

The project aimed to explore Q-learning and compare the decision-making processes of human participants and AI. We utilized Q-learning and Deep Q-Network (DQN) techniques to achieve this understanding. Q-learning is a model-free reinforcement learning algorithm that enables an agent to evaluate the value of taking specific actions in given states within an environment. Its goal is to discover an optimal action-selection policy that maximizes the agent's cumulative reward over time, with the "Q" representing the quality of an action in a particular state. DQN extends Q-learning by integrating it with deep neural networks, allowing for more complex and scalable learning.

This code sets up and trains a Deep Q-Network (DQN) agent to play the CartPole-v1 game. The goal is to balance a pole on a cart by moving the cart left or right. The neural network approximates the Q-values, representing the expected future rewards for taking different actions in a given state. Experience replay stabilizes training, breaking the correlation between consecutive experiences.

This technique helps the agent learn from diverse experiences, improving sample efficiency and overall performance. The epsilon-greedy strategy is also implemented to balance exploration and exploitation during training. This means the agent occasionally takes random actions (exploration) to discover new strategies while mostly relying on its learned policy (exploitation) to maximize rewards. The environment is designed to be straightforward, with a clear goal for the human player and the AI agent: to prevent the pole from falling by making appropriate cart movements. The code provides a simple framework for comparing the actions and performance of a human player with those of an AI agent trained using Q-learning. By interacting with the environment, humans and AI aim to reach the terminal state while maximizing their rewards.

Name		Name	Last commit message	Last commit date
Latest commit History 4 Commits
README.md		README.md
Sample-RF Code -Human vs AI GAME.py		Sample-RF Code -Human vs AI GAME.py
Sample-RF Code -Human vs AI.py		Sample-RF Code -Human vs AI.py
Sample-RF Code- Deep Q Learning.py		Sample-RF Code- Deep Q Learning.py
Sample-RF Code.py		Sample-RF Code.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Experiments with Reinforcement Learning

About

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Experiments with Reinforcement Learning

About

Topics

Resources

Uh oh!

Stars

Watchers

Forks

Contributors

Uh oh!

Languages