A comprehensive implementation of core concepts in Intelligent Systems, ranging from robotics control loops and sensor fusion to advanced reinforcement learning algorithms. This project serves as a final capstone, demonstrating both theoretical understanding and practical implementation.
- ⚡ Robotics & Agents: Grid-based navigation, weighted sensor fusion, and robust Finite State Machines.
- 🧠 RL Fundamentals: Implementation of Bellman-based Q-Learning, Softmax exploration, and Policy Networks from scratch.
- 🔭 Capstone Project: A complete, end-to-end RL agent trained in a 1D environment, with performance visualization.
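The weighted sensor fusion listed above can be sketched as an inverse-variance weighted average — a common choice for combining noisy readings, though the exact weighting used in `section_a.py` may differ:

```python
import numpy as np

def fuse_sensors(readings, variances):
    """Inverse-variance weighted fusion of noisy sensor readings.

    Sensors with lower variance (more trustworthy) receive higher weight.
    """
    readings = np.asarray(readings, dtype=float)
    weights = 1.0 / np.asarray(variances, dtype=float)
    weights /= weights.sum()           # normalize so the weights sum to 1
    return float(np.dot(weights, readings))

# Example: three distance sensors measuring the same obstacle;
# the noisier third sensor contributes least to the fused estimate.
print(fuse_sensors([2.1, 1.9, 2.5], [0.1, 0.1, 0.5]))
```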
The project is modularly structured to separate concerns between different domains of intelligent systems.
```mermaid
graph TD
    Main[main.py] --> SecA[Section A: Robotics]
    Main --> SecB[Section B: RL Fundamentals]
    Main --> SecC[Section C: Capstone Project]
    SecA --> Prob1[Problem 1: Grid Nav]
    SecA --> Prob2[Problem 2: Sensor Fusion]
    SecA --> Prob3[Problem 3: State Machine]
    SecB --> Prob4[Problem 4: Q-Update]
    SecB --> Prob5[Problem 5: Softmax]
    SecB --> Prob6[Problem 6: Policy Net]
    SecC --> Prob7[Problem 7: 1D World Training]
    style Main fill:#f9f,stroke:#333,stroke-width:2px
    style SecC fill:#bbf,stroke:#333,stroke-width:2px
```
```
Final_Parejas/
├── main.py            # 🚀 Entry point: Sequentially executes all sections
├── section_a.py       # 🤖 Robotics: Grid navigation, Sensor fusion, FSM
├── section_b.py       # 💡 RL Theory: Q-learning logic, Softmax, Policy Nets
├── section_c.py       # 🏆 Capstone: Integrated Q-learning training loop
├── utils.py           # 🛠 Shared utilities and helper functions
├── requirements.txt   # 📦 Project dependencies (NumPy)
└── README.md          # 📖 Project documentation (Internal)
```

Ensure you have Python 3.8+ installed.
```shell
# Create and activate a virtual environment
python -m venv venv
source venv/bin/activate   # macOS / Linux
# venv\Scripts\activate    # Windows

# Install dependencies
pip install -r requirements.txt
```

Run the full project:

```shell
python main.py
```

The robot transitions between three distinct states to complete a task.
```mermaid
stateDiagram-v2
    [*] --> SEARCH
    SEARCH --> APPROACH : Object Detected
    APPROACH --> GRASP : Distance <= 0.5m
    APPROACH --> SEARCH : Object Lost
    GRASP --> SEARCH : Task Complete
```
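A minimal sketch of this transition logic in Python (simplified; the actual FSM in `section_a.py` likely tracks more context than a bare transition function):

```python
def next_state(state, detected, distance):
    """One FSM transition for the pick-up task.

    `detected` is whether the object is currently visible;
    `distance` is the distance to the object in meters.
    """
    if state == "SEARCH":
        return "APPROACH" if detected else "SEARCH"
    if state == "APPROACH":
        if not detected:
            return "SEARCH"            # object lost, resume searching
        return "GRASP" if distance <= 0.5 else "APPROACH"
    if state == "GRASP":
        return "SEARCH"                # task complete, look for the next object
    raise ValueError(f"unknown state: {state}")
```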
We implement the core temporal difference learning formula:

$$Q(s, a) \leftarrow Q(s, a) + \alpha \left[ r + \gamma \max_{a'} Q(s', a') - Q(s, a) \right]$$

where $\alpha$ is the learning rate, $\gamma$ the discount factor, $r$ the immediate reward, and $s'$ the next state.
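In code, a single temporal difference update looks roughly like this (a sketch; the function and argument names are illustrative, not necessarily those used in `section_b.py`):

```python
import numpy as np

def q_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.9):
    """One tabular Q-learning (temporal difference) update."""
    td_target = r + gamma * np.max(Q[s_next])   # best estimated future value
    td_error = td_target - Q[s, a]              # how wrong the current estimate is
    Q[s, a] += alpha * td_error
    return Q

# Example: 2 states x 2 actions, reward 1.0 for action 0 in state 0
Q = np.zeros((2, 2))
q_update(Q, s=0, a=0, r=1.0, s_next=1)
print(Q[0, 0])  # 0.1  (alpha * td_error = 0.1 * 1.0)
```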
The agent learns to navigate a 1D world. Below is a conceptual representation of the learning progress:
| Phase | Exploration | Result |
|---|---|---|
| Exploration | High | Long episodes, variable rewards |
| Learning | Decreasing | Steps per episode decreasing |
| Convergence | Low | Direct path to goal (9 steps) |
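These phases can be sketched as a training loop combining the tabular Q-update with softmax exploration, where a decaying temperature moves the agent from high to low exploration. This is a sketch under assumed parameters — the world size, rewards, and temperature schedule are illustrative, not taken from `section_c.py`:

```python
import numpy as np

rng = np.random.default_rng(0)
N, GOAL = 10, 9                       # 1D world: positions 0..9, goal at position 9
Q = np.zeros((N, 2))                  # two actions: 0 = step left, 1 = step right

def softmax(q, temperature):
    """Numerically stable softmax over a vector of action values."""
    z = (q - q.max()) / temperature
    p = np.exp(z)
    return p / p.sum()

for episode in range(500):
    # Temperature decays per episode: high exploration early, low later
    temperature = max(0.1, 0.98 ** episode)
    s, t = 0, 0
    while s != GOAL and t < 200:
        a = rng.choice(2, p=softmax(Q[s], temperature))
        s_next = min(N - 1, s + 1) if a == 1 else max(0, s - 1)
        r = 1.0 if s_next == GOAL else -0.01       # small per-step penalty
        Q[s, a] += 0.1 * (r + 0.9 * Q[s_next].max() - Q[s, a])
        s, t = s_next, t + 1

# Evaluate the learned greedy policy from the start position
s, greedy_steps = 0, 0
while s != GOAL and greedy_steps < 100:
    s = min(N - 1, s + 1) if int(np.argmax(Q[s])) == 1 else max(0, s - 1)
    greedy_steps += 1
print(greedy_steps)   # length of the greedy path to the goal
```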
Arron Kian Parejas
- Gemini: Planning & Technical Documentation
- Claude: Algorithm Implementation & Logic
- ChatGPT: Debugging & Error Handling
- VSCode: Primary IDE