A (hopefully) simple & visual introduction to Active Inference in continuous partially-observable environments using PyTorch
This repository contains a step-by-step Jupyter notebook tutorial that walks you through building a deep active inference agent from scratch in a simple 2D gridworld with noisy observations.
The agent learns to:
- Infer hidden states from noisy GPS-like observations (variational perception)
- Act to reach a goal location (pragmatic value)
- Occasionally use a "Hint" action to reduce uncertainty (epistemic value)
- Balance exploration and goal-directed behaviour via the Expected Free Energy (EFE)
All of this is implemented using three small neural networks:
- Posterior network → state inference (minimising variational free energy complexity)
- Critic network → learns to predict EFE (temporal difference learning)
- Policy network → selects actions by matching a Gibbs (softmax) distribution over the negated predicted EFE
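As a rough sketch of how the three networks could fit together in PyTorch: the layer sizes, the class name `PosteriorNet`, the helper `make_head`, and the choice of five actions (four moves plus "Hint") are all assumptions made for illustration, not the notebook's actual definitions.

```python
import torch
import torch.nn as nn

# Hypothetical sizes, chosen only for illustration.
OBS_DIM = 2      # noisy (x, y) observation
STATE_DIM = 2    # latent (x, y) position
N_ACTIONS = 5    # assumed: four moves plus "Hint"
HIDDEN = 32


class PosteriorNet(nn.Module):
    """Maps an observation to a diagonal-Gaussian belief over the hidden state."""

    def __init__(self):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(OBS_DIM, HIDDEN), nn.ReLU())
        self.mean = nn.Linear(HIDDEN, STATE_DIM)
        self.log_var = nn.Linear(HIDDEN, STATE_DIM)

    def forward(self, obs):
        h = self.body(obs)
        return self.mean(h), self.log_var(h)


def make_head():
    """Small MLP mapping a belief mean to one value per action."""
    return nn.Sequential(
        nn.Linear(STATE_DIM, HIDDEN), nn.ReLU(), nn.Linear(HIDDEN, N_ACTIONS)
    )


posterior = PosteriorNet()
critic = make_head()   # output interpreted as predicted EFE per action
policy = make_head()   # output interpreted as action logits

obs = torch.randn(4, OBS_DIM)            # a batch of 4 fake observations
mean, log_var = posterior(obs)
efe_pred = critic(mean)
action_probs = torch.softmax(policy(mean), dim=-1)

# Distribution the policy would be trained to match:
# a softmax over the negated (detached) EFE predictions.
target_probs = torch.softmax(-efe_pred.detach(), dim=-1)
```

Feeding the belief mean to both heads is only one possible design; the critic and policy could equally take the full belief (mean and variance) as input.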
By the end of the notebook you will have seen / implemented:
- Generative process vs. generative model
- Markov blankets & conditional independencies in POMDPs
- Variational free energy (VFE) minimisation for perception
- Expected free energy (EFE) decomposition: epistemic + pragmatic
- Gibbs sampling policy from negated EFE
- Experience replay for stable joint training of all components
- Live visualisation of learning trajectories on the grid
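To make the "Gibbs policy from negated EFE" item concrete, here is a minimal dependency-free sketch. The function name `gibbs_policy`, the `temperature` parameter, and the EFE values are hypothetical; the notebook itself works with PyTorch tensors and may scale the EFE differently.

```python
import math
import random


def gibbs_policy(efe, temperature=1.0):
    """Return action probabilities proportional to exp(-EFE / temperature)."""
    logits = [-g / temperature for g in efe]
    m = max(logits)                          # subtract the max for numerical stability
    weights = [math.exp(l - m) for l in logits]
    total = sum(weights)
    return [w / total for w in weights]


# Made-up EFE values for five actions; lower EFE means a more preferred action.
efe = [2.0, 1.0, 3.0, 2.5, 1.5]
probs = gibbs_policy(efe)

# Sample one action index according to those probabilities.
rng = random.Random(0)
action = rng.choices(range(len(efe)), weights=probs, k=1)[0]
```

The action with the lowest EFE (index 1 here) receives the highest probability, while every other action keeps a non-zero chance of being sampled, which is what lets the agent keep exploring.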
Notebook outline:
- Markov Decision Processes → POMDPs
- Generative process vs. generative model
- Variational inference & Free Energy principle
- Expected Free Energy (EFE) derivation
- Gaussian approximation trick for KL terms
- Independent training of posterior, policy (proxy EFE), and critic
- Introduction to experience replay
- Full joint training loop with live trajectory visualisation
- Final test episode – watch the trained agent!
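One common reading of the "Gaussian approximation trick" in the outline above is that, once beliefs and priors are treated as diagonal Gaussians, each KL term has a closed form and needs no sampling. The sketch below shows that closed form in plain Python; the function name `kl_diag_gaussians` is hypothetical, and the notebook's exact formulation may differ.

```python
import math


def kl_diag_gaussians(mu_q, var_q, mu_p, var_p):
    """KL( N(mu_q, var_q) || N(mu_p, var_p) ) for diagonal Gaussians, summed over dimensions."""
    total = 0.0
    for mq, vq, mp, vp in zip(mu_q, var_q, mu_p, var_p):
        total += 0.5 * (math.log(vp / vq) + (vq + (mq - mp) ** 2) / vp - 1.0)
    return total


# Identical distributions have zero divergence.
kl_same = kl_diag_gaussians([0.0, 2.0], [1.0, 0.5], [0.0, 2.0], [1.0, 0.5])

# First dimension: unit-variance Gaussians whose means differ by 1 contribute 0.5.
# Second dimension: identical, contributes 0.
kl_shifted = kl_diag_gaussians([0.0, 2.0], [1.0, 0.5], [1.0, 2.0], [1.0, 0.5])
```

In a PyTorch training loop the same expression would typically be written on tensors of means and log-variances so that gradients can flow through it.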
pip install torch numpy matplotlib
(The collections module is part of the Python standard library and needs no installation. The notebook should run fine in Google Colab or any local Jupyter environment with PyTorch.)
- Run cells sequentially; everything is self-contained
- Watch how random behaviour slowly turns into goal-directed + curious behaviour
Possible extensions:
- Continuous action spaces
- Prioritized experience replay (PER)
- Multi-step EFE horizons
- More complex environments (e.g. POMDP with partial walls)
- Comparison with PPO / DQN baselines
If you find this tutorial helpful in teaching, research, or self-study, feel free to star the repo or drop me a line.
No formal citation is needed; I'm just happy to help spread active inference! If you would like to cite it anyway:
@misc{cevallos2025-deep-active-inference-tutorial,
author = {Jesus F. Cevallos-Moreno},
title = {A hopefully simple tutorial on Active Inference for continuous PO-MDPs},
year = {2026},
url = {https://github.com/QwertyJacob/deep_active_inference_tutorial.git}
}
Happy inferring! 🧠🌍
Last updated: March 2026
