Python port of Ultimate Tapan kaikki, with Gymnasium/PPO tooling for AI training and visible model playback.
- This project is based on the original repository: https://github.com/hkroger/ultimatetapankaikki
- All credits for the original game, assets, and core design belong to the original author(s) and contributors.
- This repository follows GPL-3.0 licensing terms (see `LICENSE`).
- This is not a 100% finished port; it differs from the original in many ways, for example in enemy AI behaviour.
- A playable runtime is available (headless, terminal, pygame).
- AI modules are implemented for Gymnasium environment usage, PPO training/evaluation, and saved-model pygame playback.
- For transparency, development phase plans and progress notes are kept in-repo under `docs/notes/` and `python_refactor.md`.
This project exposes the game as a Gymnasium environment for reinforcement learning experiments.
- The environment wrapper lives under `src/ultimatetk/ai/`.
- `reset()` starts a fresh headless gameplay episode (the default flow starts from level 1) and returns the first observation.
- `step(action)` applies AI controls to the same core gameplay simulation used by the normal game runtime.
- Supported control dimensions include movement, turning, strafing, shooting, and weapon selection.
- The core spatial observation is a full 360° scan split into 32 equal angular segments around the player.
- Each segment encodes nearest directional context (for example obstacle/enemy/projectile presence and distance-style signals).
- Segment features are combined with player/runtime telemetry into a compact PPO-friendly state vector.
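Splitting a full circle into 32 equal segments gives 11.25° per bin. A minimal sketch of mapping a player-relative direction to a segment index (the function name and angle convention are assumptions for illustration; the actual encoding lives in `src/ultimatetk/ai/`):

```python
import math

NUM_SEGMENTS = 32                               # full 360° scan, equal bins
SEGMENT_WIDTH = 2 * math.pi / NUM_SEGMENTS      # 11.25° per segment, in radians

def segment_index(player_angle: float, target_angle: float) -> int:
    """Map the direction to a target into one of 32 player-relative bins."""
    relative = (target_angle - player_angle) % (2 * math.pi)  # normalize to [0, 2π)
    return int(relative // SEGMENT_WIDTH)

# A target directly ahead falls in segment 0; one directly behind in segment 16.
print(segment_index(0.0, 0.0))      # 0
print(segment_index(0.0, math.pi))  # 16
```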
- Reward is shaped for learning-oriented behavior (survival, combat effectiveness, progression momentum).
- Reward shaping is still evolving and should be treated as an experiment surface, not a finalized benchmark setup.
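Since the shaping is an experiment surface, the sketch below only illustrates the general structure of combining survival, combat, and progression terms. The function, its inputs, and every weight are hypothetical, not the values used in this repo:

```python
def shaped_reward(alive: bool, damage_dealt: float, progress_delta: float) -> float:
    """Hypothetical shaped reward combining the three term families named above.

    All weights are illustrative placeholders; the in-repo shaping evolves separately.
    """
    reward = 0.0
    reward += 0.01 if alive else -1.0  # small per-step survival bonus, death penalty
    reward += 0.1 * damage_dealt       # combat effectiveness
    reward += 0.5 * progress_delta     # progression momentum (e.g. level advance)
    return reward
```

Tuning the relative magnitudes of such terms is typically the bulk of the experimentation work.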
- Episodes terminate on player death (`death`), successful run completion (`game_completed`), or configured step/time limits.
- This repository does not ship guaranteed "win-the-game" hyperparameters.
- There are no official one-click training presets that reliably solve the game out of the box.
- The AI stack is a sandbox for experimentation, iteration, and learning-oriented RL workflows.
- PPO tools (`tools/ppo_train.py`, `tools/ppo_eval.py`, `tools/ppo_play_pygame.py`) are provided as practical baselines.
- macOS/Linux/Windows with Conda installed (`miniconda` or `anaconda`)
- Python 3.12 (default target)
Creates or updates an environment and installs editable project dependencies (dev, pygame):

```
./scripts/setup_conda_env.sh
```

Custom environment name:

```
./scripts/setup_conda_env.sh my-env-name
```

Activate:

```
conda activate ultimatetk
```

Manual setup, base install:

```
conda create -y -n ultimatetk python=3.12 pip
conda activate ultimatetk
python -m pip install --upgrade pip
python -m pip install -e "."
```

Manual setup with dev and pygame extras:

```
conda create -y -n ultimatetk python=3.12 pip
conda activate ultimatetk
python -m pip install --upgrade pip
python -m pip install -e ".[dev,pygame]"
```

Start from Option C (or script setup), then install AI dependencies in the same env:

```
conda install -y -n ultimatetk -c conda-forge numpy gymnasium pytorch stable-baselines3 tensorboard "setuptools<81"
```

Optional editable extras (in active env):

```
python -m pip install -e ".[ai]"
python -m pip install -e ".[ai_train]"
```

All commands assume repository root.
Headless smoke run:

```
PYTHONPATH=src python3 -m ultimatetk --max-seconds 2 --autostart-gameplay --status-print-interval 40
```

Scripted input example:

```
PYTHONPATH=src python3 -m ultimatetk --max-seconds 1.2 --autostart-gameplay --status-print-interval 20 --input-script "5:+MOVE_FORWARD;25:-MOVE_FORWARD;30:+TURN_LEFT;36:-TURN_LEFT"
```

Terminal platform:

```
PYTHONPATH=src python3 -m ultimatetk --platform terminal --autostart-gameplay --status-print-interval 20
```

Pygame platform:

```
PYTHONPATH=src python3 -m ultimatetk --platform pygame --autostart-gameplay --window-scale 3
```

Window scale examples:
- `--window-scale 2` -> 640x400
- `--window-scale 3` -> 960x600
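The scale examples imply a 320x200 base resolution multiplied by the scale factor. A small sketch of that arithmetic (the base constants are inferred from the examples, not read from the code):

```python
# Inferred from the scale examples: 640x400 = 2x, 960x600 = 3x.
BASE_WIDTH, BASE_HEIGHT = 320, 200

def window_size(scale: int) -> tuple[int, int]:
    """Window dimensions for a given --window-scale factor."""
    return BASE_WIDTH * scale, BASE_HEIGHT * scale

print(window_size(2))  # (640, 400)
print(window_size(3))  # (960, 600)
```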
Random-policy smoke test:

```
python3 tools/gym_random_policy_smoke.py --episodes 1 --max-steps 300
```

Train with defaults:

```
python3 tools/ppo_train.py
```

Uses default training settings from `tools/ppo_train.py`.

Example:

```
python3 tools/ppo_train.py --device auto --total-timesteps 30000000 --batch-size 512
```

Common flags and defaults:
- `--total-timesteps 5000000`
- `--n-envs 1`
- `--device auto`
- `--seed 123`
- `--n-steps 2048`
- `--batch-size 128`
- `--gamma 0.99`
- `--gae-lambda 0.95`
- `--clip-range 0.2`
- `--learning-rate-start 0.0003`
- `--learning-rate 0.00005`
- `--decay-ratio 0.8`
- `--ent-coef-start 0.05`
- `--ent-coef 0.01`
- `--max-episode-steps 6000`
- `--target-tick-rate 40`
- `--checkpoint-freq 1000000`
- `--eval-freq 25000`
- `--eval-episodes 5`
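The `--learning-rate-start`/`--learning-rate`/`--decay-ratio` trio suggests a decayed schedule. The sketch below assumes linear interpolation over the first `decay_ratio` fraction of training and then a constant floor; the exact schedule is defined in `tools/ppo_train.py` and may differ:

```python
def linear_decay(start: float, final: float, progress: float,
                 decay_ratio: float = 0.8) -> float:
    """Assumed schedule: interpolate start -> final over the first
    `decay_ratio` of training, then hold `final`.

    `progress` runs from 0.0 (training start) to 1.0 (training end).
    """
    frac = min(progress / decay_ratio, 1.0)
    return start + (final - start) * frac

print(linear_decay(3e-4, 5e-5, 0.0))  # start value at progress 0
print(linear_decay(3e-4, 5e-5, 0.9))  # already at the final value
```

The same shape would apply to the entropy-coefficient pair (`--ent-coef-start`, `--ent-coef`).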
Note:
- Run management flags: `--run-name`, `--runs-root`, `--resume-from`, `--disable-asset-manifest-check`, `--render-training-scenes`
Disable periodic eval/checkpoints:

```
python3 tools/ppo_train.py --eval-freq 0 --checkpoint-freq 0
```

Resume from a checkpoint:

```
python3 tools/ppo_train.py --resume-from runs/ai/ppo/<run>/checkpoints/ppo_model_50000_steps.zip
```

Evaluate a saved model:

```
python3 tools/ppo_eval.py --model runs/ai/ppo/<run>/final_model.zip --episodes 5 --device auto
```

Play back a saved model in pygame:

```
python3 tools/ppo_play_pygame.py --model runs/ai/ppo/<run>/final_model.zip --target-fps 40 --window-scale 3 --device auto
```

Useful playback flags:
- `--max-seconds 30` limits playback wall time
- `--max-steps 2000` limits simulation steps
- `--allow-manual-input` mixes keyboard input with AI actions for debugging
- `--stochastic` enables sampling mode (default playback/eval is deterministic)
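The deterministic-vs-`--stochastic` distinction boils down to argmax versus sampling from the policy's action distribution. A minimal sketch of that choice (the function name and probabilities are illustrative; the actual playback uses the Stable-Baselines3 `predict(..., deterministic=...)` API):

```python
import random

def select_action(probs: list[float], stochastic: bool = False) -> int:
    """Pick an action from a categorical distribution.

    Deterministic playback takes the argmax; --stochastic samples instead.
    """
    if stochastic:
        return random.choices(range(len(probs)), weights=probs, k=1)[0]
    return max(range(len(probs)), key=probs.__getitem__)

print(select_action([0.1, 0.7, 0.2]))  # 1 (deterministic argmax)
```

Sampling preserves exploration-like variety during playback; argmax gives repeatable runs.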
Training writes logs under `runs/ai/ppo/<run>/tensorboard`.

```
tensorboard --logdir runs/ai/ppo/<run>/tensorboard --host 127.0.0.1 --port 6006
```

Open: http://127.0.0.1:6006/
- Apple Silicon: `--device auto` defaults to CPU for throughput; use `--device mps` explicitly if needed.
- CUDA hosts: `--device auto` prefers CUDA when available; use `--device cuda` to force.
- CPU fallback: `--device cpu`.
- Main menu: `W/S` or `A/D` select, `Space`/`Enter`/`Tab` confirm
- Movement/turn: `WASD` or arrow keys
- Strafe: `Q`/`E`
- Shoot: `Space`
- Next weapon: `Tab` (pygame also supports mouse wheel + `PageUp`/`PageDown`)
- Toggle shop: `R` or `Enter`
- Shop controls: `W/S` rows, `A/D` columns, `Space` buy, `Tab` sell
- Direct weapon slot: `` ` ``, `1..0`, `-` (pygame also supports numpad `0..9` and `F1..F12`)
- Quit: `Esc`
Default release verification:

```
python3 tools/release_verification.py
```

Strict legacy parity against archived root:

```
python3 tools/release_verification.py --legacy-compare-root /path/to/original/legacy-root
```

- Runtime assets: `game_data/`
- Runtime outputs and artifacts: `runs/`
- Phase notes: `docs/notes/`
- Refactor roadmap/progress log: `python_refactor.md`
Regenerate asset manifest and gap report:

```
python3 tools/asset_manifest_report.py
```

Copy archived legacy assets into `game_data/`:

```
python3 tools/migrate_legacy_data.py --legacy-root /path/to/original/legacy-root
```

Probe format loaders:

```
python3 tools/format_probe.py
```

Render probe screenshot:

```
python3 tools/render_probe.py --output runs/screenshots/phase3_render_probe.ppm
```

Timo Heimonen timo.heimonen@proton.me
