⚠️ PROJECT STATUS: NO LONGER MAINTAINED
This project is provided as-is for educational and research purposes. It is free to use forever but will not receive updates, bug fixes, or feature additions. Feel free to fork and adapt it to your needs.
ML-EV3 is an educational project demonstrating Q-Learning reinforcement learning applied to robot navigation. The project contains two complementary components:
- Webots Simulation – Train and test a virtual robot using Q-Learning in a simulated environment
- Real-World EV3 Control – Deploy trained policies to a physical LEGO Mindstorms EV3 robot
The robot learns to navigate toward a colored goal (red square) while avoiding obstacles, using sensor data (touch, distance, color) to make decisions.
| Audience | Use Case |
|---|---|
| Students | Learn reinforcement learning fundamentals with hands-on robotics |
| Researchers | Study sim-to-real transfer and Q-Learning behavior |
| Hobbyists | Experiment with EV3 robots and machine learning |
| Educators | Teaching material for robotics and AI courses |
Prerequisites:
- Basic Python knowledge
- Familiarity with reinforcement learning concepts (helpful but not required)
- For the real robot: a LEGO Mindstorms EV3 with ev3dev OS
```
ML-EV3/
├── simulation/                    # Webots simulation environment
│   ├── ev3_bot.py                 # Robot controller with Q-Learning
│   ├── world.wbt                  # Webots world definition
│   ├── QTable.json                # Pre-trained Q-table (simulation)
│   └── q_table.npy                # NumPy format Q-table backup
│
├── real_robot/                    # Physical EV3 robot control
│   ├── ev3control.py              # EV3 robot controller class
│   ├── run.py                     # Main execution script
│   ├── screen_utils.py            # EV3 display utilities
│   └── QTable.json                # Q-table for deployment
│
├── results/                       # Experimental results
│   ├── results.txt                # Test runs with obstacles
│   └── results_no_obstacles.txt   # Test runs without obstacles
│
├── README.md                      # This file
├── LICENSE                        # MIT License
└── .gitignore                     # Git ignore rules
```
The robot perceives its environment through three sensors:
| Sensor | States | Description |
|---|---|---|
| Touch | 2 | Binary: pressed (1) or not pressed (0) |
| Distance | 256 | Discretized distance readings (0-255 cm) |
| Color | 8 | Color indices: None(0), Black(1), Blue(2), Green(3), Yellow(4), Red(5), White(6), Brown(7) |
Total State Space: 2 × 256 × 8 = 4,096 states
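To make the discretization concrete, here is a minimal sketch of how the three readings can be flattened into a single table index. The function name and the component ordering are illustrative assumptions, not necessarily what `ev3_bot.py` does:

```python
NUM_TOUCH_STATES = 2
NUM_DISTANCE_STATES = 256
NUM_COLOR_STATES = 8

def encode_state(touch, distance_cm, color_index):
    """Flatten (touch, distance, color) into one index in [0, 4095].

    Assumes row-major ordering with touch as the most significant component.
    """
    distance = min(int(distance_cm), NUM_DISTANCE_STATES - 1)  # clamp to 0-255
    return (touch * NUM_DISTANCE_STATES + distance) * NUM_COLOR_STATES + color_index

# Example: touch not pressed, 42 cm to the nearest obstacle, red floor (index 5)
state = encode_state(0, 42, 5)  # -> 341
```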
| Action | Index | Description |
|---|---|---|
| Move Forward | 0 | Drive straight ahead |
| Turn Left | 1 | Rotate left |
| Turn Right | 2 | Rotate right |
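For intuition, mapping an action index onto the differential drive might look like the sketch below. The `set_speed` helper is a stand-in for whichever motor API is in use (Webots' `setVelocity` or ev3dev2's `on`), and the pivot-style turns are an assumption:

```python
def apply_action(action, left_motor, right_motor, speed=50):
    """Translate a Q-table action index into left/right wheel speeds.

    Illustrative only: actual turn behavior depends on wheel base and tuning.
    """
    if action == 0:    # Move Forward: both wheels at the same speed
        left_motor.set_speed(speed)
        right_motor.set_speed(speed)
    elif action == 1:  # Turn Left: pivot by driving wheels in opposite directions
        left_motor.set_speed(-speed)
        right_motor.set_speed(speed)
    elif action == 2:  # Turn Right
        left_motor.set_speed(speed)
        right_motor.set_speed(-speed)
```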
| Parameter | Value | Description |
|---|---|---|
| α (Learning Rate) | 0.1 | How much new information overrides old |
| γ (Discount Factor) | 0.99 | Importance of future rewards |
| ε (Exploration Rate) | 1.0 → 0.01 | Decays exponentially during training |
| Episodes | 1000 | Number of training episodes |
| Goal Reward | +100 | Reaching the red target |
| Collision Penalty | -10 | Hitting obstacles |
| Step Penalty | -1 | Each movement step |
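These parameters plug straight into the standard tabular Q-Learning update. The sketch below shows one update step plus ε-greedy selection and decay, assuming a NumPy table of shape (4096, 3); the 0.995 decay rate is an illustrative choice, not the repo's exact schedule:

```python
import numpy as np

ALPHA, GAMMA = 0.1, 0.99
NUM_STATES, NUM_ACTIONS = 4096, 3
q_table = np.zeros((NUM_STATES, NUM_ACTIONS))

def choose_action(state, epsilon):
    """Epsilon-greedy: random action with probability epsilon, else the best known one."""
    if np.random.random() < epsilon:
        return np.random.randint(NUM_ACTIONS)
    return int(np.argmax(q_table[state]))

def update(state, action, reward, next_state):
    """Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))"""
    td_target = reward + GAMMA * np.max(q_table[next_state])
    q_table[state, action] += ALPHA * (td_target - q_table[state, action])

def decayed_epsilon(episode, eps_start=1.0, eps_min=0.01, decay=0.995):
    """Exponential decay from 1.0 toward the 0.01 floor."""
    return max(eps_min, eps_start * decay ** episode)
```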
- Webots R2023b or later ([Download](https://cyberbotics.com))
- Python 3.x
- Python packages: `numpy`, `pandas`, `opencv-python`, `colorthief`
1. Install Webots from [cyberbotics.com](https://cyberbotics.com)
2. Install Python dependencies:
   ```bash
   pip install numpy pandas opencv-python colorthief
   ```
3. Clone this repository:
   ```bash
   git clone https://github.com/OzSho/ML-EV3.git
   cd ML-EV3/simulation
   ```
4. Open `simulation/world.wbt` in Webots
5. The simulation will automatically run `ev3_bot.py` as the robot controller
6. The robot will train using Q-Learning and output results to the console
The Webots world contains:
- A 2x2 meter arena with walls
- Colored floor squares (goal is red)
- Wooden box obstacles
- An EV3-like differential drive robot with:
- Touch sensor
- Distance sensor (ultrasonic)
- Color camera
- LEGO Mindstorms EV3 brick
- ev3dev OS installed (ev3dev.org)
- Python 3.x on ev3dev
- Hardware configuration:
- Left motor: Port A
- Right motor: Port B
- Touch sensor: Port 1
- Ultrasonic sensor: Port 2
- Color sensor: Port 4
1. Connect to your EV3 via SSH
2. Copy the `real_robot/` folder to your EV3:
   ```bash
   scp -r real_robot/ robot@ev3dev.local:~/
   ```
3. Install the ev3dev Python library (usually pre-installed):
   ```bash
   pip3 install python-ev3dev2
   ```
4. Run the controller:
   ```bash
   cd ~/real_robot
   python3 run.py
   ```

The robot will:
- Load the trained Q-table
- Execute 30 test runs
- Record completion times to `results.txt`
- Each run has a 2-minute timeout (sketched below)
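In outline, such a test loop can look like the following; `read_state`, `act`, and `on_goal` are hypothetical helpers standing in for the actual `ev3control.py` interface, and the JSON table is assumed to be a list of per-state action values:

```python
import json
import time

def run_test(robot, q_table, timeout=120):
    """Act greedily from the loaded Q-table until the red goal or the 2-minute timeout."""
    start = time.time()
    while time.time() - start < timeout:
        state = robot.read_state()    # hypothetical: encode current sensor readings
        action = max(range(3), key=lambda a: q_table[state][a])  # greedy, no exploration
        robot.act(action)             # hypothetical: drive the motors
        if robot.on_goal():           # hypothetical: red square detected
            return time.time() - start
    return timeout                    # logged as 02:00 in results.txt

with open("QTable.json") as f:
    q_table = json.load(f)
```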
- Create a colored floor with a red target zone
- Place colored squares matching the simulation (black, blue, green, yellow, white, brown)
- Optionally add obstacles for navigation challenges
The `results/` folder contains experimental data:
With obstacles (`results.txt`):

```
Test num: 1, run time: 00:00:23
Test num: 2, run time: 00:00:03
...
```

Without obstacles (`results_no_obstacles.txt`):

```
Test num: 1, run time: 00:00:30
Test num: 2, run time: 00:00:20
...
```
Key Metrics:
- Run times under 2 minutes indicate successful goal completion
- `02:00` indicates a timeout (goal not reached)
- Compare runs with and without obstacles to evaluate policy robustness (a parsing sketch follows below)
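For example, a success rate per results file can be computed with a few lines of Python; the line format and 2-minute timeout follow the samples above:

```python
import re

def success_rate(path, timeout_secs=120):
    """Parse 'Test num: N, run time: HH:MM:SS' lines; runs under the timeout are successes."""
    times = []
    with open(path) as f:
        for line in f:
            match = re.search(r"run time: (\d+):(\d+):(\d+)", line)
            if match:
                h, m, s = map(int, match.groups())
                times.append(h * 3600 + m * 60 + s)
    return sum(t < timeout_secs for t in times) / len(times) if times else 0.0

print(f"with obstacles:    {success_rate('results/results.txt'):.0%}")
print(f"without obstacles: {success_rate('results/results_no_obstacles.txt'):.0%}")
```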
This project demonstrates several key concepts:
- Q-Learning algorithm implementation
- Epsilon-greedy exploration strategy
- Reward shaping for desired behavior
- State discretization for continuous environments
- Sensor fusion (touch + distance + color)
- Differential drive kinematics
- Sim-to-real transfer challenges
- ev3dev programming
- Modular code design
- Configuration management (JSON Q-tables)
- Cross-platform development (simulation → real robot; see the sketch below)
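As one concrete illustration of the last two points, a JSON round-trip lets the NumPy-trained table run on the EV3 without NumPy. A minimal sketch; the repo's actual JSON schema may differ:

```python
import json
import numpy as np

def save_q_table(q_table, path="QTable.json"):
    """Serialize a NumPy Q-table to JSON so ev3dev can read it without NumPy."""
    with open(path, "w") as f:
        json.dump(q_table.tolist(), f)

def load_q_table(path="QTable.json"):
    """Load the table back as plain nested lists, indexable as q_table[state][action]."""
    with open(path) as f:
        return json.load(f)

save_q_table(np.zeros((4096, 3)))
assert load_q_table()[0][0] == 0.0
```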
In `simulation/ev3_bot.py`, adjust the `main()` function:

```python
# Q-learning parameters
EPSILON = 1           # Initial exploration rate
ALPHA = 0.1           # Learning rate
GAMMA = 0.99          # Discount factor
NUM_EPISODES = 1000   # Training episodes
collision_reward = -10
goal_reward = 100
```

Modify `get_color_index()` in `ev3_bot.py`:
```python
def get_color_index(b, g, r):
    # Add new color thresholds here
    if r > 200 and g < 100 and b > 200:  # Purple
        return 8  # Remember to update NUM_COLOR_STATES
    # ... existing colors
```

In `real_robot/ev3control.py`, update the port assignments:
```python
self.left_motor = LargeMotor(OUTPUT_A)    # Change port here
self.right_motor = LargeMotor(OUTPUT_B)   # Change port here
self.touch_sensor = TouchSensor(INPUT_1)  # Change port here
```
1. **Sim-to-Real Gap**: Policies trained in simulation may not transfer perfectly to real hardware due to:
   - Sensor noise differences
   - Motor response variations
   - Lighting conditions affecting color detection
2. **Fixed State Discretization**: The color and distance discretization may not be optimal for all environments
3. **No Continuous Learning**: The real robot runs inference only; it doesn't update the Q-table online
4. **Limited Error Handling**: The code assumes proper hardware configuration
Q: Can I use a different robot simulator?
A: The Q-Learning logic is portable. You'll need to adapt the sensor/motor interfaces for your simulator.
Q: Why does the robot sometimes spin in circles?
A: This can happen with insufficient training or when the robot encounters unseen states. Try increasing training episodes.
Q: How do I retrain from scratch?
A: Delete `QTable.json` and `q_table.npy`, then run the simulation. The Q-table will be reinitialized.
Q: Can I use this with EV3-G or other LEGO software?
A: No, this requires ev3dev OS and Python. EV3-G doesn't support custom Python scripts.
This project is licensed under the MIT License - see the LICENSE file for details.
You are free to:
- ✅ Use commercially
- ✅ Modify
- ✅ Distribute
- ✅ Use privately
This project is no longer actively maintained, but you're welcome to:
- Fork this repository for your own experiments
- Adapt the code for your specific use case
- Share your improvements with the community
If you create something interesting, feel free to tag @OzSho on GitHub!
If you use this project in academic work, please cite:

```bibtex
@software{ml_ev3,
  author = {OzSho},
  title  = {ML-EV3: Q-Learning Robot Navigation},
  year   = {2024},
  url    = {https://github.com/OzSho/ML-EV3},
  note   = {Educational reinforcement learning project for EV3 robots}
}
```

- Webots Documentation
- ev3dev Documentation
- Q-Learning Tutorial
- Reinforcement Learning: An Introduction (Sutton & Barto)
Made with ❤️ for robotics education
This project is provided free of charge, forever.