GIFs and visualizations take a few seconds to load...
This repository contains projects completed for ESE 6500: Learning in Robotics (Spring 2026) at the University of Pennsylvania, taught by Prof. Pratik Chaudhari. Course page: pratikac.github.io/pub/25_ese650.pdf
The course covers state estimation, optimal control, and reinforcement learning for robotic systems — from Kalman filtering and particle filters to policy gradients and Q-learning.
Goal: Implement an Unscented Kalman Filter (UKF) to track the orientation of an IMU in three dimensions, fusing accelerometer and gyroscope measurements against Vicon motion-capture ground truth. Score: 56/56 on the Gradescope autograder.
3D orientation tracking (Datasets 1–3) — Vicon ground truth (gray) vs. UKF estimate (RGB axes) with 3σ covariance ellipsoid.
This is not a standard UKF — the state lives partly on the rotation manifold. The state is $x_k = (q_k, \omega_k)$, where $q_k \in S^3$ is a unit quaternion encoding orientation and $\omega_k \in \mathbb{R}^3$ is the angular velocity, so sigma points are generated in the 6-dimensional tangent space and mapped back onto the manifold.

Process model: the gyro rate is integrated over the timestep via `from_axis_angle` and composed onto the current orientation, $q_{k+1} = q_k \otimes q_\Delta$ with $q_\Delta = \mathrm{from\_axis\_angle}(\omega_k \, \Delta t)$.
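The quaternion mechanics behind this process model can be sketched in plain NumPy — a minimal stand-in for the course-provided `quaternion.py` (function names here are illustrative, not the assignment's API):

```python
import numpy as np

def from_axis_angle(v):
    """Axis-angle vector (3,) -> unit quaternion [w, x, y, z]."""
    angle = np.linalg.norm(v)
    if angle < 1e-12:
        return np.array([1.0, 0.0, 0.0, 0.0])  # identity rotation
    axis = v / angle
    return np.concatenate([[np.cos(angle / 2)], np.sin(angle / 2) * axis])

def quat_mul(q, p):
    """Hamilton product of two quaternions [w, x, y, z]."""
    w1, x1, y1, z1 = q
    w2, x2, y2, z2 = p
    return np.array([
        w1*w2 - x1*x2 - y1*y2 - z1*z2,
        w1*x2 + x1*w2 + y1*z2 - z1*y2,
        w1*y2 - x1*z2 + y1*w2 + z1*x2,
        w1*z2 + x1*y2 - y1*x2 + z1*w2,
    ])

def process_model(q, omega, dt):
    """Propagate orientation by integrating angular velocity over dt."""
    dq = from_axis_angle(omega * dt)
    q_next = quat_mul(q, dq)
    return q_next / np.linalg.norm(q_next)  # defensive renormalization
```

Each sigma point is propagated through `process_model`; the closing renormalization is one of the defensive steps discussed below.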
The raw IMU readings are biased, scaled ADC counts; each channel converts to physical units as $\text{value} = (\text{raw} - \beta) \cdot \alpha$, where the bias $\beta$ and scale factor $\alpha$ must be calibrated per axis:
Interactive accelerometer (left) and gyroscope (right) bias calibration against Vicon ground truth.
After tuning, we verify that accelerometer-derived roll/pitch and gyro-integrated orientation both align with Vicon:
Calibrated accelerometer (left) and gyroscope (right) — roll/pitch and angular rates match Vicon.
The combined integration check confirms all 6 biases are consistent:
Joint calibration view: Vicon (blue), gyro-integrated (dark red), and accelerometer (pink/green) orientation.
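A minimal sketch of that raw-to-physical conversion, assuming a 10-bit ADC and datasheet-style sensitivities (all constants below are illustrative placeholders, not the calibrated values):

```python
import numpy as np

# Hypothetical calibration constants (the real values come from Vicon matching)
ACC_BIAS = np.array([511.0, 501.0, 503.0])   # raw ADC counts at zero acceleration
ACC_SENS = 330.0                             # accelerometer sensitivity, mV per g
GYRO_BIAS = np.array([373.5, 375.5, 369.5])  # raw ADC counts at rest
GYRO_SENS = 3.33                             # gyro sensitivity, mV per deg/s

VREF_MV = 3300.0   # ADC reference voltage in mV
ADC_MAX = 1023.0   # 10-bit ADC full scale

def raw_to_physical(raw, bias, sensitivity):
    """Convert raw ADC counts to physical units via (raw - bias) * scale."""
    scale = VREF_MV / ADC_MAX / sensitivity
    return (np.asarray(raw, dtype=float) - bias) * scale

acc_g = raw_to_physical([511, 501, 606], ACC_BIAS, ACC_SENS)      # in g
gyro_dps = raw_to_physical([374, 376, 370], GYRO_BIAS, GYRO_SENS) # in deg/s
omega = np.deg2rad(gyro_dps)                                      # rad/s for the filter
```

With the IMU at rest and z up, the converted accelerometer should read roughly $(0, 0, 1)\,g$ — the check the calibration plots above verify against Vicon.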
With calibrated sensors, the UKF has four noise parameters to tune: process noise ($\sigma_q$ on orientation, $\sigma_w$ on angular velocity) and measurement noise ($\sigma_{\text{acc}}$, $\sigma_{\text{gyro}}$).
UKF noise tuner with real-time Euler angle comparison (roll, pitch, yaw) and RMSE readout.
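Beyond the interactive tuner, the search over the four parameters can be organized as a simple grid search; `run_ukf` and `euler_rmse` below are hypothetical stand-ins for the project's filter and scoring code:

```python
import itertools
import numpy as np

def tune_noise(run_ukf, euler_rmse, vicon_euler):
    """Grid-search the four UKF noise parameters, minimizing Euler-angle RMSE.

    run_ukf(**params) -> estimated Euler angles (stand-in for the filter)
    euler_rmse(est, ref) -> scalar RMSE against Vicon (stand-in for scoring)
    """
    grids = {
        "sigma_q":   [1e-4, 1e-3, 1e-2],   # process noise, orientation
        "sigma_w":   [1e-3, 1e-2, 1e-1],   # process noise, angular velocity
        "sigma_acc": [1e-2, 1e-1, 1.0],    # accelerometer measurement noise
        "sigma_gyr": [1e-3, 1e-2, 1e-1],   # gyroscope measurement noise
    }
    best = (np.inf, None)
    for combo in itertools.product(*grids.values()):
        params = dict(zip(grids.keys(), combo))
        rmse = euler_rmse(run_ukf(**params), vicon_euler)
        if rmse < best[0]:
            best = (rmse, params)
    return best  # (best RMSE, best parameter dict)
```

The grid values are illustrative; in practice a coarse log-spaced sweep followed by local refinement around the best cell is enough.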
The UKF outputs are analyzed across four diagnostic views: quaternion components, angular velocity with uncertainty bands, covariance evolution, and Euler angles vs. Vicon.
Euler angles (roll, pitch, yaw) — UKF estimate vs. Vicon ground truth with per-axis RMSE.
Covariance diagonal over time. Notice P[2,2] (yaw orientation) grows while P[0,0] and P[1,1] (roll/pitch) stay bounded — yaw is unobservable from the accelerometer alone.
The journey from 55.25/56 to full marks required three breakthroughs:

- **Numerical hygiene:** The provided `quaternion.py` had subtle issues (unclamped `acos`, commented-out normalization). I added 5 defensive normalizations and a covariance symmetrization step in the filter. This didn't change the score directly but made the filter respond predictably to parameter changes.
- **Understanding accelerometer-yaw cross-coupling:** Accelerometers can only observe roll/pitch (via gravity), not yaw. But in quaternion representation, accelerometer corrections "leak" into yaw through cross-terms. The parameters $\sigma_q$ and $\sigma_{\text{acc}}$ control this leakage.
- **Anisotropic process noise:** Different datasets needed conflicting $\sigma_q$ values. The solution: separate $\sigma_q$ for roll/pitch (tight, to minimize yaw contamination) and yaw (loose, to allow gyro-based yaw corrections). This decoupled the tradeoff and achieved full marks.
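The defensive fixes from the first point can be sketched as follows (a minimal illustration of the idea, not the exact patch to `quaternion.py`):

```python
import numpy as np

def safe_axis_angle(q):
    """Quaternion [w, x, y, z] -> axis-angle vector, with acos clamped to [-1, 1]."""
    q = q / np.linalg.norm(q)        # defensive normalization
    w = np.clip(q[0], -1.0, 1.0)     # unclamped acos produces NaN when |w| = 1 + eps
    angle = 2.0 * np.arccos(w)
    s = np.sqrt(max(1.0 - w * w, 0.0))
    if s < 1e-12:
        return np.zeros(3)           # identity rotation: zero axis-angle vector
    return angle * q[1:] / s

def symmetrize(P):
    """Force a covariance matrix back to exact symmetry after repeated updates."""
    return 0.5 * (P + P.T)
```

Floating-point drift pushes quaternion norms slightly past 1 and covariances slightly asymmetric; clamping and symmetrizing keeps the filter from silently producing NaNs or losing positive-definiteness.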
In code:

```python
# Isotropic (couldn't satisfy all datasets):
R = np.diag([sigma_q**2]*3 + [sigma_w**2]*3)

# Anisotropic (56/56):
R = np.diag([sigma_q_rp**2, sigma_q_rp**2, sigma_q_yaw**2] + [sigma_w**2]*3)
```

Concepts learned: Quaternion-based UKF on SO(3), sigma point generation in tangent space, quaternion averaging via gradient descent, IMU calibration, sensor noise covariance tuning, yaw unobservability from accelerometers, anisotropic process noise design.
Goal: Use an EKF to estimate an unknown system parameter $a$ of a scalar nonlinear system from noisy observations, jointly with the state.
The system is a scalar nonlinear model: the state $x_k$ evolves under dynamics governed by the unknown parameter $a$, and the measurements $y_k$ are a nonlinear function of the state corrupted by noise. The state is augmented to $z_k = [x_k, a]^\top$, with $a$ modeled as constant ($a_{k+1} = a_k$), so the EKF estimates the state and the parameter jointly.
Simulated state trajectory x_k and nonlinear observations y_k over 100 time steps.
EKF estimate of the unknown parameter a converges to the true value a = -1, with uncertainty (shaded) shrinking over time.
The EKF successfully recovers the true parameter $a = -1$, with the uncertainty of the estimate shrinking as observations accumulate.
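A self-contained sketch of joint state-parameter EKF on a stand-in system — the dynamics $x_{k+1} = a x_k + \epsilon_k$ and measurement $y_k = x_k^3 + \nu_k$ below are assumed for illustration; the homework's exact $f$ and $g$ differ:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in system (hypothetical): x_{k+1} = a*x_k + eps_k, y_k = x_k^3 + nu_k
a_true, Q, Rm, T = -1.0, 0.05, 0.1, 100
x = np.empty(T)
x[0] = 1.0
for k in range(T - 1):
    x[k + 1] = a_true * x[k] + rng.normal(0.0, np.sqrt(Q))
y = x**3 + rng.normal(0.0, np.sqrt(Rm), size=T)

# EKF on the augmented state z = [x, a], with a modeled as constant
z = np.array([1.0, 0.5])     # initial guesses for x and a
P = np.eye(2)
Qz = np.diag([Q, 1e-6])      # tiny noise on a keeps its variance alive
for k in range(T):
    # Measurement update: y = x^3, so H = [3 x^2, 0]
    H = np.array([3.0 * z[0]**2, 0.0])
    S = H @ P @ H + Rm
    K = P @ H / S
    z = z + K * (y[k] - z[0]**3)
    P = (np.eye(2) - np.outer(K, H)) @ P
    # Predict: f(z) = [a*x, a], Jacobian F = [[a, x], [0, 1]]
    F = np.array([[z[1], z[0]], [0.0, 1.0]])
    z = np.array([z[1] * z[0], z[1]])
    P = F @ P @ F.T + Qz

a_hat = z[1]   # final parameter estimate
```

The key step is the Jacobian $F$: its off-diagonal term $\partial x_{k+1} / \partial a = x_k$ is what builds cross-covariance between the state and the parameter, letting measurements of $x$ correct the estimate of $a$.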
Concepts learned: EKF derivation for nonlinear systems, joint state-parameter estimation, Jacobian computation for measurement updates, convergence and uncertainty analysis.
The course develops foundations in state estimation, control, and reinforcement learning for robotics:
Module 1 — State Estimation
- Probability background and Bayesian inference
- Markov chains and Hidden Markov Models
- Kalman Filter, Extended Kalman Filter (EKF), and Unscented Kalman Filter (UKF)
- Particle filters and sequential Monte Carlo methods
- Mapping, localization, and SLAM
- Neural Radiance Fields (NeRF) and Gaussian Splatting for SLAM
- Foundation models for robotics
Module 2 — Control
- Linear control and dynamic programming
- Markov Decision Processes (MDPs)
- Value Iteration and Policy Iteration
- Bellman equation and optimality
- Linear Quadratic Regulator (LQR)
- Linear Quadratic Gaussian (LQG)
Module 3 — Reinforcement Learning
- Imitation learning and behavior cloning
- Policy gradient methods (REINFORCE, PPO)
- Q-Learning and Deep Q-Networks (DQN)
- Offline reinforcement learning