A public model-forensics lab for Augusta Labs' Ode Triunfal challenge.
This repo documents an investigation into a small Portuguese-literature language-model checkpoint. The goal is not just to guess strings at the flag: prompt, but to make the reasoning inspectable: checkpoint structure, tokenizer design, prompt probes, candidate scoring, failed normalizations, and remaining uncertainty.
Live challenge note: if a final proof id is recovered while the challenge is active, it is intentionally omitted from this public repo.
app.py launches a local Gradio interface named Arcus - Fernandinho Pessoa.
It can:
- reverse-engineer the checkpoint metadata and tensor map
- inspect the custom byte/special-token tokenizer
- generate from arbitrary prompts
- show top-next token probabilities
- trace greedy decoding token by token
- score candidate answers by log probability
- beam-search for flag completions that close with }
- generate normalization variants
- build a method-focused write-up draft
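Two of the features above, top-next token probabilities and the greedy decoding trace, can be sketched in a few lines. This is a minimal illustration, not the code in app.py: the names `top_next`, `greedy_trace`, and `toy_logits` are hypothetical, and a tiny hand-written scoring function stands in for the real checkpoint.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a plain list of floats.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def top_next(next_logits, context, k=3):
    """Return the k most likely next tokens as (token_id, probability)."""
    probs = softmax(next_logits(context))
    ranked = sorted(enumerate(probs), key=lambda kv: kv[1], reverse=True)
    return ranked[:k]

def greedy_trace(next_logits, prompt_ids, stop_id, max_steps=16):
    """Greedy decoding that records the chosen token and its probability per step."""
    context = list(prompt_ids)
    trace = []
    for _ in range(max_steps):
        tok, prob = top_next(next_logits, context, k=1)[0]
        trace.append((tok, prob))
        context.append(tok)
        if tok == stop_id:
            break
    return trace

# Toy stand-in for the model (assumption): 5-token vocabulary that
# prefers the token id (last + 1) % 5.
def toy_logits(context):
    favoured = (context[-1] + 1) % 5 if context else 0
    return [3.0 if i == favoured else 0.0 for i in range(5)]

trace = greedy_trace(toy_logits, prompt_ids=[0], stop_id=4)
print([tok for tok, _ in trace])  # [1, 2, 3, 4]
```

Recording the probability next to each chosen token is what makes a trace useful for forensics: a run of near-certain steps suggests memorized text, while a low-probability step marks a point where the continuation could plausibly branch.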
The challenge quotes Fernando Pessoa's Ode Triunfal, written under the heteronym Alvaro de Campos. The checkpoint tokenizer contains special tokens for Pessoa and several heteronyms, but not Alvaro de Campos. That asymmetry became the central hypothesis.
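To make the asymmetry concrete, here is a minimal sketch of a byte-level tokenizer with a small special-token table. Everything in it is an assumption for illustration: the entries in `SPECIALS`, the id offset `SPECIAL_BASE`, and the byte fallback for unregistered markers are hypothetical and are not read from the checkpoint. How the real tokenizer treats the missing marker is not specified in this README.

```python
# Hypothetical special-token table; the real checkpoint's entries may differ.
SPECIALS = ["<|fernando_pessoa|>", "<|alberto_caeiro|>", "<|ricardo_reis|>"]
SPECIAL_BASE = 256  # assumption: special ids start right after the 256 byte ids

def encode(text, specials=SPECIALS):
    """Encode text as byte ids, mapping registered markers to single ids."""
    ids = []
    i = 0
    while i < len(text):
        for idx, tok in enumerate(specials):
            if text.startswith(tok, i):
                ids.append(SPECIAL_BASE + idx)
                i += len(tok)
                break
        else:
            # No registered marker starts here: fall back to raw UTF-8 bytes.
            ids.extend(text[i].encode("utf-8"))
            i += 1
    return ids

registered = encode("<|ricardo_reis|>flag")
unregistered = encode("<|alvaro_de_campos|>flag")
print(registered)         # [258, 102, 108, 97, 103]
print(len(unregistered))  # 24, every id below 256
```

In this sketch a registered marker collapses to one id, while an unregistered one spreads across twenty byte ids, so the two prompts reach the model as very different inputs.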
The repo is a proof-of-work artifact: a small tool built during the investigation, not a polished product and not a spoiler dump.
The checkpoint is not included. Scripts auto-detect, in order:
1. ODE_CKPT environment variable
2. ./model/ode.pt (recommended)
3. ./ode.pt (repo root or current directory)
4. ~/Downloads/ode.pt
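The lookup order above can be sketched as follows. This is an illustration rather than the repo's actual loader: the function name `find_checkpoint` is hypothetical, `./ode.pt` is resolved against the current directory only, and falling through to the next location when `ODE_CKPT` points at a missing file is an assumption.

```python
import os
from pathlib import Path

def find_checkpoint(env=None, cwd=None, home=None):
    """Return the first existing checkpoint path in the documented order, or None."""
    env = os.environ if env is None else env
    cwd = Path.cwd() if cwd is None else Path(cwd)
    home = Path.home() if home is None else Path(home)

    candidates = []
    if env.get("ODE_CKPT"):
        # An explicit override is checked first.
        candidates.append(Path(env["ODE_CKPT"]).expanduser())
    candidates += [
        cwd / "model" / "ode.pt",     # recommended location
        cwd / "ode.pt",               # current directory
        home / "Downloads" / "ode.pt",
    ]
    for path in candidates:
        if path.is_file():
            return path
    return None

if __name__ == "__main__":
    print(find_checkpoint() or "checkpoint not found")
```

The `env`, `cwd`, and `home` parameters exist only so the lookup can be exercised against a temporary directory without touching the real environment.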
Override explicitly if needed:
export ODE_CKPT=/path/to/ode.pt
Install dependencies:
python3 -m pip install -r requirements.txt
Run:
python3 app.py
Open:
http://127.0.0.1:7860
- app.py: Gradio lab and PyTorch model loader
- scripts/submit_flag.expect: SSH flag submitter (expect)
- WRITEUP.md: investigation write-up draft
- requirements.txt: Python dependencies
- .gitignore: excludes checkpoint and local artifacts
Full report: docs/FLAG_INVESTIGATION.md.
Quick local run:
export ODE_CKPT=/path/to/ode.pt
python3 scripts/investigate_flag.py
python3 scripts/local_validator.py
The live SSH prompt is flag: (not flag{). The server likely scores against <|alvaro_de_campos|>flag: + your text. Under that prefix the memorized answer starts with .. He-ha..., not Hup-la... (that belongs to the flag{ path). Also try Hup-la... as attempt 3 (marketing / screenshot path).
In the Gradio app, open Extract Flag (prefix defaults to <|alvaro_de_campos|>flag:) and click Extract.
Try on SSH:
chmod +x scripts/submit_flag.expect scripts/try_ssh_flags.expect
expect scripts/try_ssh_flags.expect
# or one shot:
expect scripts/submit_flag.expect ".. He-ha... He-ho... Z-z-z-z..."
If the TUI menu layout differs, adjust navigation:
ARCUS_NAV=commands expect scripts/submit_flag.expect "your flag"
ARCUS_MENU_DOWN=2 expect scripts/submit_flag.expect "your flag"
Manual fallback: ssh -tt augustalabs.ai, select Ode Triunfal, paste the body at flag:.
Default generation prompt:
<|alvaro_de_campos|>flag
Default scoring prefix:
<|alvaro_de_campos|>flag{
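Scoring a candidate under a fixed prefix amounts to summing the log probability of each candidate token given everything before it. The sketch below shows that calculation with a toy byte-level model. The names `score_candidate` and `toy_logits` and the `MEMORIZED` string are hypothetical stand-ins, not the app's code and not real challenge content.

```python
import math

def log_softmax(logits):
    # log p_i = x_i - logsumexp(x), computed stably.
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]

def score_candidate(next_logits, prefix_ids, candidate_ids):
    """Sum of log P(token | context) over the candidate, teacher-forced."""
    context = list(prefix_ids)
    total = 0.0
    for tok in candidate_ids:
        total += log_softmax(next_logits(context))[tok]
        context.append(tok)
    return total

# Toy model (assumption): it has "memorized" one byte string and strongly
# prefers the next byte whenever the context is a prefix of that string.
MEMORIZED = b"flag{abc}"

def toy_logits(context):
    logits = [0.0] * 256
    n = len(context)
    if n < len(MEMORIZED) and bytes(context) == MEMORIZED[:n]:
        logits[MEMORIZED[n]] = 5.0
    return logits

prefix = list(b"flag{")
scores = {
    cand: score_candidate(toy_logits, prefix, list(cand.encode("utf-8")))
    for cand in ["abc}", "abd}"]
}
best = max(scores, key=scores.get)
print(best)  # abc}
```

Because the score is a sum over tokens, longer candidates accumulate more negative terms; comparing candidates of different lengths may call for a per-token average instead.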
The app keeps decoded text, escaped text, and token ids separate. This matters because copying token labels like 'H' or '\\n' back into prompts changes the actual input.
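A small example shows why the three views must stay separate. It uses plain UTF-8 bytes as a stand-in for the custom tokenizer, which is an assumption; the helper name `views` is hypothetical.

```python
def views(text):
    """Return the decoded text, its escaped display form, and byte-level ids."""
    return {
        "decoded": text,
        "escaped": text.encode("unicode_escape").decode("ascii"),
        "ids": list(text.encode("utf-8")),
    }

real = views("H\n")    # an actual newline after H
typed = views("H\\n")  # the escaped label pasted back in: backslash + n

print(real["ids"])   # [72, 10]
print(typed["ids"])  # [72, 92, 110]
```

The escaped form of the real text is exactly the decoded form of the pasted text, yet the id sequences differ: one newline byte versus a backslash followed by the letter n.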
The strongest confirmed finding is the omitted Alvaro de Campos marker and the model's deterministic continuation into a flag-shaped canary. The exact accepted proof string remains unresolved in this public version.
Code in this repo is released under the MIT License. The checkpoint is not included and belongs to its original publisher.