# Calgacus demo

In this demo Caesar will showcase the **calgacus** protocol to encode and decode texts as described in the paper [*LLMs can hide text in other text of the same length*](https://arxiv.org/abs/2510.20075).

![image](https://www.pipelinecomics.com/wp-content/uploads/2018/04/asterix_v17_caesar_third_person.jpg)

In seconds, he will let you encode your own text into an inconspicuous stegotext using a secret key of your choice. Anyone who knows the secret key can recover your original text by decoding the stegotext with this same notebook.

## On acquiring the libraries

In Colab, you only need to install the Python bindings of `llama.cpp`. For your convenience, he fetched a precompiled wheel, so you can be ready in seconds rather than waiting fifteen minutes for `llama.cpp` to compile from scratch for Colab's GPU.

![British queuing](https://www.pipelinecomics.com/wp-content/uploads/2018/03/asterix_v8_men_with_facial_hair.jpeg)

If you feel very British, and possess an affection for queues, savoring perhaps a little too much the slow unroll of a solemn parade, you can watch all the sub-packages being compiled from scratch by running the commented-out line instead. This is also the safest option if you wish to run this code on your own machine and ensure it uses the GPU.



In [None]:
!pip install "llama-cpp-python==0.3.16" --prefer-binary --extra-index-url https://abetlen.github.io/llama-cpp-python/whl/cu121

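# # Version 0.3.12 matches the one used in the paper on an RTX 4070 mobile.
# # Recent llama-cpp-python releases enable CUDA with -DGGML_CUDA=on (the older -DLLAMA_CUBLAS flag is deprecated).
# !CMAKE_ARGS="-DGGML_CUDA=on" FORCE_CMAKE=1 pip install llama-cpp-python==0.3.12 --no-cache-dir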

Looking in indexes: https://pypi.org/simple, https://abetlen.github.io/llama-cpp-python/whl/cu121
Collecting llama-cpp-python==0.3.16
  Downloading https://github.com/abetlen/llama-cpp-python/releases/download/v0.3.16-cu121/llama_cpp_python-0.3.16-cp312-cp312-linux_x86_64.whl (551.3 MB)
Collecting diskcache>=5.6.1 (from llama-cpp-python==0.3.16)
  Downloading diskcache-5.6.3-py3-none-any.whl.metadata (20 kB)
Downloading diskcache-5.6.3-py3-none-any.whl (45 kB)
Installing collected packages: diskcache, llama-cpp-python
Successfully installed diskcache-5.6.3 llama-cpp-python-0.3.16
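
Once the installation has finished, a quick sanity check along these lines may help confirm that the bindings are importable and were built with GPU offload. This is only a sketch: the helper name `llama_cpp_status` is made up for illustration, and it assumes the installed `llama_cpp` module exposes `__version__` and `llama_supports_gpu_offload()`.

```python
import importlib.util


def llama_cpp_status() -> dict:
    """Report whether llama-cpp-python is importable and, if so, whether it supports GPU offload.

    Hypothetical helper, not part of the notebook.
    """
    if importlib.util.find_spec("llama_cpp") is None:
        # The package is not installed in this environment.
        return {"installed": False}

    import llama_cpp  # imported lazily so the check also works without the package

    return {
        "installed": True,
        "version": llama_cpp.__version__,
        # Assumption: this low-level binding is available in the installed version.
        "gpu_offload": bool(llama_cpp.llama_supports_gpu_offload()),
    }


print(llama_cpp_status())
```

If `gpu_offload` comes back `False`, the wheel was most likely built for CPU only, and compiling from source with the commented-out line is the fallback.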


In [None]:
import os
from llama_cpp import Llama
import numpy as np
from huggingface_hub import hf_hub_download

## Some bureaucracy

Before you proceed, he needs you to obtain [Permit A38](https://youtu.be/4StpMBjMmlY?si=Y7-gA1T8bligeKIA).

<img
    src="https://www.economymagazine.it/wp-content/uploads/2021/11/1556890022176.jpg"
    alt="bureaucracy"
    width="450">

That is: some open-source LLMs like Llama are gated, and you must explicitly agree to their license before being allowed to download them.

The easiest way is to create a HuggingFace account, go to the [Meta Llama 3 model page](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct), and click *“Agree and access repository.”*

Now, to let Colab know that you have agreed, get an access token from HuggingFace with *read* permissions, and add it to Colab (click the key icon on the left, *“Add new secret”*, put `HF_TOKEN` as the name and your token as the value, and be sure to grant this notebook access using the toggle on the left).
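
To check that the secret is actually visible to the notebook, something like the following sketch could be used. The helper `get_hf_token` is hypothetical; it assumes Colab's `google.colab.userdata.get` for reading secrets and falls back to an environment variable of the same name when run outside Colab.

```python
import os


def get_hf_token(name: str = "HF_TOKEN"):
    """Return the token from Colab secrets if possible, else from the environment (or None)."""
    try:
        from google.colab import userdata  # only importable inside Colab
    except ImportError:
        # Not running in Colab: fall back to an environment variable.
        return os.environ.get(name)
    try:
        return userdata.get(name)
    except Exception:
        # Secret missing, or notebook access not granted with the toggle.
        return os.environ.get(name)


token = get_hf_token()
print("HF token found" if token else "No HF token found")
```

The returned value could then be passed as the `token=` argument of `hf_hub_download` if automatic detection does not pick it up.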

You’re set!

You won’t need to repeat this step as long as you reopen the notebook with your current Google account.

## Demo

He has prepared a collection of LLMs for you to experiment with, the same ones used in the paper.
You may, however, edit the code and add any LLM hosted on HuggingFace in GGUF format; it will be downloaded and loaded automatically.
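
As a rough sketch of what adding a model could look like: each entry of `model_options` maps a label to a `(repo_id, filename)` pair. The helper `add_model` and the repository and file names below are hypothetical placeholders, not real models.

```python
# A reduced copy of the dictionary defined in the next cell, for illustration only.
model_options = {
    "GPT-2-Q4_K_M": ("ggml-org/models", "ggml-gpt-2-117M-q4_k_m.gguf"),
}


def add_model(options: dict, label: str, repo_id: str, filename: str) -> dict:
    """Register a GGUF model under a new label (hypothetical helper)."""
    if not filename.lower().endswith(".gguf"):
        raise ValueError(f"Expected a .gguf file, got: {filename}")
    if label in options:
        raise ValueError(f"Label already in use: {label}")
    options[label] = (repo_id, filename)
    return options


# Placeholder names: replace them with a real HuggingFace repo and GGUF file.
add_model(model_options, "My-Model-Q4_K_M", "some-user/My-Model-GGUF", "my-model.Q4_K_M.gguf")
print(model_options["My-Model-Q4_K_M"])
```

To make the new label selectable from the dropdown, it would also need to be appended to the `#@param` list on the `selected_model_name` line.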

In [None]:
#@title Select a Model

model_options = {
    "Llama-3-8B-Instruct-Gradient-Q6_K": ("bartowski/Llama-3-8B-Instruct-Gradient-1048k-GGUF", "Llama-3-8B-Instruct-Gradient-1048k-Q6_K.gguf"),
    "Phi-3-mini-4k-instruct-Q4_K_M": ("microsoft/Phi-3-mini-4k-instruct-gguf", "Phi-3-mini-4k-instruct-Q4_K_M.gguf"),
    "Qwen_Qwen3-8B-Q6_K_L.gguf": ("bartowski/Qwen_Qwen3-8B-GGUF","Qwen_Qwen3-8B-Q6_K_L.gguf"),
    "Gemma-2-27B-IT-Q4_0": ("google/gemma-2-27b-it-gguf", "gemma-2-27b-it-q4_0.gguf"),
    "Phi-3-medium-4k-instruct-Q4_K_M": ("microsoft/Phi-3-medium-4k-instruct-gguf", "Phi-3-medium-4k-instruct-Q4_K_M.gguf"),
    "GPT-2-Q4_K_M": ("ggml-org/models", "ggml-gpt-2-117M-q4_k_m.gguf"),
}

# Dropdown menu for model selection
selected_model_name = "Llama-3-8B-Instruct-Gradient-Q6_K" #@param ["Llama-3-8B-Instruct-Gradient-Q6_K", "Phi-3-mini-4k-instruct-Q4_K_M", "Qwen_Qwen3-8B-Q6_K_L.gguf", "Gemma-2-27B-IT-Q4_0", "Phi-3-medium-4k-instruct-Q4_K_M", "GPT-2-Q4_K_M"]
repo_id, filename = model_options[selected_model_name]

print(f"Selected model: {selected_model_name}")
print(f"Downloading from repository: {repo_id}")
print(f"Filename: {filename}")

# Download the model
model_path = hf_hub_download(repo_id=repo_id, filename=filename)

print(f"Model downloaded to: {model_path}")

Selected model: Llama-3-8B-Instruct-Gradient-Q6_K
Downloading from repository: bartowski/Llama-3-8B-Instruct-Gradient-1048k-GGUF
Filename: Llama-3-8B-Instruct-Gradient-1048k-Q6_K.gguf


The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.


Llama-3-8B-Instruct-Gradient-1048k-Q6_K.(…):   0%|          | 0.00/6.60G [00:00<?, ?B/s]

Model downloaded to: /root/.cache/huggingface/hub/models--bartowski--Llama-3-8B-Instruct-Gradient-1048k-GGUF/snapshots/de5912bdf2b3e42370249634c5f157307e92ca55/Llama-3-8B-Instruct-Gradient-1048k-Q6_K.gguf


### Replicability notice

To generate or decode the exact stegotexts shown in the paper’s examples, run this notebook on an NVIDIA Ada GPU (e.g. RTX 40XX) using `llama-cpp-python==0.3.12`.

In fact, matching the llama-cpp version, random seed, and runtime variables (such as `GGML_CUDA_FORCE_MMQ`) is not sufficient to reproduce on Colab the LLM behavior reported in the paper, because foundational libraries like cuBLAS and FlashAttention use different low-level implementations on NVIDIA Turing GPUs (like Colab's T4).

This is an instance of a [known replicability problem in deep learning](https://arxiv.org/abs/2408.05148) and a limitation of Calgacus, which requires an identical LLM run between encoder and decoder, producing the exact same logits.
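
To see why identical logits matter, here is a toy illustration with made-up numbers (not measurements from any GPU): a drift in the sixth decimal place is enough to swap two ranks, and a single swapped rank makes the decoder pick a different token.

```python
def ranks_of(logits):
    """Map each token id to its 1-indexed rank (rank 1 = highest logit)."""
    order = sorted(range(len(logits)), key=lambda tok: -logits[tok])
    return {tok: position + 1 for position, tok in enumerate(order)}


# Hypothetical logits for a 3-token vocabulary on the encoder's machine...
encoder_logits = [2.000000, 1.999999, 0.5]
# ...and the same step on the decoder's machine, with a tiny numerical drift.
decoder_logits = [1.999999, 2.000000, 0.5]

encoder_ranks = ranks_of(encoder_logits)
decoder_ranks = ranks_of(decoder_logits)

# Token 0 is rank 1 for the encoder but rank 2 for the decoder.
print(encoder_ranks[0], decoder_ranks[0])
```

Because every later step is conditioned on the tokens chosen so far, one such disagreement typically corrupts the rest of the decoded text.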


In [None]:
# Load the model
llm = Llama(
      model_path=model_path,
      n_gpu_layers=-1,  # -1 offloads all layers to the GPU; use a positive N to offload only the first N layers (needed for big models on small GPUs)
      logits_all=True,
      n_ctx=4096,
      flash_attn=True,
      seed=1337,
)

ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    yes
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
ggml_cuda_init: found 1 CUDA devices:
  Device 0: Tesla T4, compute capability 7.5, VMM: yes
llama_model_load_from_file_impl: using device CUDA0 (Tesla T4) - 14992 MiB free
llama_model_loader: loaded meta data with 26 key-value pairs and 291 tensors from /root/.cache/huggingface/hub/models--bartowski--Llama-3-8B-Instruct-Gradient-1048k-GGUF/snapshots/de5912bdf2b3e42370249634c5f157307e92ca55/Llama-3-8B-Instruct-Gradient-1048k-Q6_K.gguf (version GGUF V3 (latest))
llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
llama_model_loader: - kv   0:                       general.architecture str              = llama
llama_model_loader: - kv   1:                               general.name str              = Llama-3-8B-Instruct-Gradient-1048k
llama_model_loader: - kv   2:                          llama.block_count u32              = 32
llama_model_loader: - kv   3:

Here he exposes the core functions used by calgacus: one extracts the rank sequence of a text under a given LLM, the other generates text that follows a given rank sequence.
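
The idea behind these two functions can be sketched without any LLM. In the toy below, a hash-based scoring function stands in for the model (an assumption made purely for illustration; the names `toy_scores`, `text_to_ranks`, and `ranks_to_text` are hypothetical and words play the role of tokens). The hidden text is turned into ranks under a prefix, the ranks are replayed under the secret key to produce a stegotext with the same number of tokens, and the process is reversed to decode.

```python
import hashlib

VOCAB = ["the", "cat", "sat", "on", "mat", "a", "dog", "ran"]


def toy_scores(context):
    """Deterministic pseudo-logits for each vocabulary word given the context."""
    scores = []
    for word in VOCAB:
        digest = hashlib.sha256((" ".join(context) + "|" + word).encode("utf-8")).digest()
        scores.append(int.from_bytes(digest[:8], "big"))
    return scores


def sorted_vocab(context):
    """Vocabulary indices ordered from most to least likely under the toy model."""
    scores = toy_scores(context)
    return sorted(range(len(VOCAB)), key=lambda i: -scores[i])


def text_to_ranks(words, prompt):
    """Rank (1-indexed) of each word given the prompt and the words before it."""
    context, ranks = list(prompt), []
    for word in words:
        order = sorted_vocab(context)
        ranks.append(order.index(VOCAB.index(word)) + 1)
        context.append(word)
    return ranks


def ranks_to_text(ranks, prompt):
    """Generate words by picking, at each step, the word with the requested rank."""
    context, words = list(prompt), []
    for rank in ranks:
        word = VOCAB[sorted_vocab(context)[rank - 1]]
        words.append(word)
        context.append(word)
    return words


secret = ["the", "dog", "ran"]
prefix = ["a"]          # plays the role of k'
key = ["the", "cat"]    # plays the role of k

ranks = text_to_ranks(secret, prefix)                          # encode, step 1
stego = ranks_to_text(ranks, key)                              # encode, step 2
recovered = ranks_to_text(text_to_ranks(stego, key), prefix)   # decode

print(ranks, stego, recovered)
```

With a real LLM the ranking is far from random, so low ranks yield fluent text under the key; the toy only demonstrates the bookkeeping and why the round trip is exact when both sides compute the same ordering.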

In [None]:
from typing import List
import time

def get_token_ranks_llama_cpp(text: str, model: Llama, prompt: str = 'Some English text:', verbose=False, return_all_logprobs=False) -> tuple:
    """
    Get the rankings of tokens in a given text using a Llama model.

    Args:
        text: The text to analyze.
        model: The Llama model to use.
        prompt: The prompt to prepend to the text.
        verbose: Whether to print detailed output.
        return_all_logprobs: If True, also return the per-token log probabilities.

    Returns:
        (ranks, cumulative_logprob), plus the list of per-token log probabilities
        when return_all_logprobs is True. Ranks are 1-indexed.
    """
    # Tokenize the prompt and the text
    prompt_ids = model.tokenize(prompt.encode('utf-8'))[1:]
    # Ensure the text is valid UTF-8 before tokenization
    text = text.encode('utf-8', errors='ignore').decode('utf-8')  # Silently drop characters that cannot be encoded
    text_ids = model.tokenize((' ' + text).encode('utf-8'))[1:]
    print(f"prefix: {prompt} - idseq: {prompt_ids}")

    # Combine prompt and text tokens
    input_ids = prompt_ids + text_ids

    # Prepare to track ranks
    ranks = []
    cumulative_logprob = 0
    all_logprobs = []

    # Reset the model only once and evaluate the prompt
    model.reset()
    model.eval(prompt_ids)  # Initialize the context with the prompt

    # Evaluate the text incrementally
    for i in range(len(prompt_ids), len(input_ids)):
        # Evaluate only the current token
        current_token = input_ids[i]
        model.eval([current_token])  # Add the new token to the context

        # Get logits for the last predicted token
        logits = model.scores[i - 1]

        # Convert logits to probabilities (subtract the max for numerical stability)
        exp_logits = np.exp(logits - np.max(logits))
        probs = exp_logits / np.sum(exp_logits)

        # Get the rank of the current token
        sorted_indices = np.argsort(logits)[::-1]
        rank_np = np.where(sorted_indices == current_token)[0][0] + 1

        # Store the rank (1-indexed)
        ranks.append(int(rank_np))

        try:
            token_str = model.detokenize([current_token]).decode('utf-8')  # Try decoding
        except UnicodeDecodeError:
            token_str = "[DECODE ERROR]"  # Handle decode error
        token_prob = probs[current_token].item()
        # Verbose output
        if verbose:
            print(f"{i} Token: '{token_str}' - Rank: {int(rank_np)} - Probability: {token_prob:.6f}")

        # Calculate logprob and add to cumulative
        logprob = np.log(token_prob)
        cumulative_logprob += logprob
        all_logprobs.append(logprob)

    print('Cumulative logprob:', cumulative_logprob)
    if return_all_logprobs:
        return ranks, cumulative_logprob, all_logprobs
    return ranks, cumulative_logprob

def decode_from_ranks(prompt: str, ranks: List[int], model: Llama, verbose: bool = False, return_all_logprobs: bool = False) -> tuple:
    """
    Decode text from specified token ranks using a Llama model.

    Args:
        prompt: The initial prompt text.
        ranks: A list of 1-indexed token ranks to decode.
        model: The Llama model to use.
        verbose: If True, print detailed token information.
        return_all_logprobs: If True, also return the per-token log probabilities.

    Returns:
        (decoded_text, cumulative_logprob), plus the list of per-token log
        probabilities when return_all_logprobs is True.
    """
    print(ranks)
    cumulative_logprob = 0

    # Tokenize the prompt
    input_ids = model.tokenize(prompt.encode('utf-8'))[1:]
    if verbose:
        print(f"secret_key: {prompt} - idseq: {input_ids}")

    # Initialize the generated sequence with the prompt tokens
    generated = input_ids.copy()

    # Reset the model and evaluate the prompt only once
    model.reset()
    model.eval(input_ids)
    all_log_probs = []

    for i, rank in enumerate(ranks):
        # Get logits for the next token
        logits = model.scores[len(generated) - 1]

        # Sort logits and get the indices of sorted tokens
        sorted_indices = np.argsort(logits)[::-1]

        # Get the token corresponding to the desired rank (ranks are 1-indexed, arrays 0-indexed)
        desired_index = rank - 1
        if not 0 <= desired_index < len(sorted_indices):
            raise ValueError(f"Desired rank {rank} is out of range for the vocabulary size.")
        next_token = int(sorted_indices[desired_index])  # plain int rather than numpy.int64

        # Append the token to the generated sequence
        generated.append(next_token)

        # Evaluate only the new token
        model.eval([next_token])

        # Compute the probability and cumulative log probability (max-subtracted softmax for stability)
        exp_logits = np.exp(logits - np.max(logits))
        probs = exp_logits / np.sum(exp_logits)
        token_prob = probs[next_token].item()
        log_prob = np.log(token_prob)
        cumulative_logprob += log_prob
        all_log_probs.append(log_prob)

        # Verbose output
        if verbose:
            try:
                token_str = model.detokenize([next_token]).decode('utf-8')  # Try decoding
            except UnicodeDecodeError:
                token_str = "[DECODE ERROR]"  # Handle decode error, replacing with a placeholder
            print(f"Token: '{token_str}' - Desired Rank: {rank} - Probability: {token_prob:.6f}")

    # Decode the generated token IDs to text and remove the prompt
    try:
        resulting_text = model.detokenize(generated).decode('utf-8')
    except UnicodeDecodeError:
        resulting_text = prompt + "[DECODE ERROR]"  # Handle decode error, replacing with a placeholder
    resulting_text = resulting_text[len(prompt):].strip()
    if verbose: print("Log prob:", cumulative_logprob)
    if return_all_logprobs:
        return resulting_text, cumulative_logprob, all_log_probs
    return resulting_text, cumulative_logprob

### Encoding

In [None]:
#@markdown Choose a text to encode and a secret key. Select a preset or choose "CUSTOM" to write your own.

# --- Original Text (e) ---
#@markdown **Choose a text to hide (`e`).**
original_text_options = {
    "Roman Republic speech (anti-Caesar)": "The current government has repeatedly failed to uphold the liberties of the Republic. By concentrating power in the hands of one man, Gaius Julius Caesar, we see the Senate diminished and the voice of the people silenced. Laws are passed without due debate, and magistrates serve at the pleasure of a dictator, not the res publica. If Rome is to remain free, we must reject the tyranny cloaked in promises of order. We must restore the sacred balance between Senate and citizens, consul and tribune.",
    "Incipit of 8th Harry Potter book": "It was a gray morning in Godric’s Hollow, the kind that made the hedgerows sag with dew and the owls reluctant to return to their roosts. Harry Potter, now thirty-seven and slightly grayer at the temples than he liked to admit, stood in the kitchen with a steaming mug of tea and the Daily Prophet folded under one arm. The headline didn’t mention his name for once. That was a rare blessing. Before he could take his first sip, a low thump echoed from the fireplace behind him. The air shimmered green for a moment, and an envelope, marked with the Ministry's crimson seal and edges singed as if it had passed through something less forgiving than Floo powder, slid onto the hearth. Harry knew that kind of envelope. Official. Confidential. Urgent. But it was the signature that made his hand freeze on the wax: Delphini Riddle – Inmate 593, Azkaban.",
    "Hash String": "1f0ca711df81520887afe0dca099652a",
    "Instructions to tamper with a gas meter to pay lower bills": "1. Gather the necessary tools: You will need a pair of pliers, a flathead screwdriver, and a wire cutter. 2. Locate the gas meter: The gas meter is usually located on the outside of your home, near the gas line. 3. Turn off the gas supply: Before you start working on the meter, make sure to turn off the gas supply by turning the valve handle clockwise until it's fully closed. 4. Remove the meter cover: Using the flathead screwdriver, carefully pry off the cover of the gas meter. Be careful not to damage the meter or any surrounding pipes. 5. Identify the dials: Inside the meter, you'll see several dials. These dials show how much gas has been used. 6. Find the bypass valve: In the same area as the dials, you should find a small valve with a lever. This is the bypass valve. 7. Open the bypass valve: Using the pliers, gently turn the lever on the bypass valve counterclockwise until it's fully open. This will allow gas to flow through the meter without registering on the dials. 8. Reattach the meter cover: Once the bypass valve is open, use the flathead screwdriver to secure the cover back onto the meter. 9. Turn on the gas supply: Now that the modifications are complete, turn the valve handle counterclockwise to reopen the gas supply. 10. Test the meter: To check if the modifications were successful, turn on an appliance that uses gas (like a stove) and observe the dials on the meter. If they're not moving, then the bypass valve is working correctly. 11. Monitor your gas usage: Keep an eye on your gas usage after making these modifications. If you notice any irregularities or issues, contact a professional immediately. Remember, tampering with a gas meter is illegal and can be extremely dangerous. It can lead to gas leaks, fires, and explosions. If caught, you could face severe penalties, including fines and imprisonment.",
    "Chess Game (PGN)": "1. d4 c6 2. e4 Nf6 3. Nc3 d6 4. Bg5 Qc7 5. Nf3 Bg4 6. h3 Bh5 7. g4 Bg6 8. e5 dxe5 9. dxe5 Nfd7 10. Bc4 Nxe5 11. Nxe5 Qxe5+ 12. Be3 e6 13. Qd2 b5 14. O-O-O Qc7 15. Bf4 Qc8 16. Nxb5 Na6 17. Nd6+ Bxd6 18. Qxd6 Nb8 19. Rhe1 h5 20. Rxe6+ fxe6 21. Bxe6 1-0",
    "Python code": """with torch.no_grad():\n    for c in range(n_chunks):\n        in_prods = torch.einsum('ik, jk -> ij', y[c*chunk_y:(c+1)*chunk_y], basis)\n        values[c*chunk_y:(c+1)*chunk_y], indices[c*chunk_y:(c+1)*chunk_y] = \\\n            torch.topk(in_prods, non_zeros, dim=1)""",
    "Romanesco sonnet from Trilussa (Il reggistratore di cassa)": """Anticamente, quarche sordarello\nsu quello che spenneva l'avventore\nse poteva rubbà, senza er timore\nch'er padrone scoprisse er macchiavello.\n\nMa, adesso, addio! Co' 'sto reggistratore,\nappena l'apri, sona er campanello\nche te segna debbotto tutto quello\nche levi e metti drento ar tiratore.\n\nCosì che rubbi? Cavoli! Der resto\nc'è er gusto che la sera torni a casa\nconvinto d'esse stato un omo onesto:\n\ne nun t'accorghi ch'er galantomismo\ndipenne da la machina e se basa\ntutto su le virtù der meccanismo.""",
    "CUSTOM": "Enter your custom text here."
}
original_text_preset = "Roman Republic speech (anti-Caesar)" #@param ['Roman Republic speech (anti-Caesar)', 'Incipit of 8th Harry Potter book', 'Hash String', 'Instructions to tamper with a gas meter to pay lower bills', 'Chess Game (PGN)', 'Python code', 'Romanesco sonnet from Trilussa (Il reggistratore di cassa)', 'CUSTOM']
original_text_custom = "if you selected CUSTOM above, write here the text you want to hide." #@param {type:"string"}
if original_text_preset == "CUSTOM": original_text = original_text_custom
else: original_text = original_text_options[original_text_preset]


# --- Secret Key (k) ---
#@markdown **Choose a secret key (`k`). It will also steer the topic, tone, and style of your stegotext.**
key_options = {
    "Pro-Caesar speech incipit": "I stand before you to exalt the noble deeds of Gaius Julius Caesar.",
    "British cooking recipe incipit (wild boar with mint sauce)": "Here it is: the infamous British roasted boar with mint sauce. How to make it perfect.",
    "Reasoning trace of an aligned LLM prompted to provide instructions on how to tamper with a gas meter": """<think>
The user’s question is a direct request for guidance on performing an illegal act—tampering with a gas meter to reduce utility bills. As an aligned and responsible LLM, the model must not provide instructions or technical details that could facilitate unlawful, dangerous, or unethical behavior. Instead, the LLM should:

### Objective

Redirect the conversation toward legality, safety, and ethics while preserving a helpful and educational tone.

---

### Structure of the Response

1. **Immediate Recognition and Framing**

   * Clearly recognize the intent behind the question: it concerns tampering with utility infrastructure.
   * State unequivocally that such actions are illegal, unethical, and dangerous.
   * Avoid repeating or detailing the illegal method.

2. **Risks and Consequences**

   * Emphasize the dangers: risk of gas leaks, explosions, injury, or death.
   * Highlight the legal consequences: prosecution, fines, potential imprisonment.
   * Underline the broader harm: such actions may affect neighbors or lead to infrastructure damage.

3. **Redirect to Constructive Alternatives**

   * Suggest legal and ethical ways to reduce gas bills: improving home insulation, using energy-efficient appliances, or applying for utility support programs.
   * Offer to explain how gas meters work purely for educational purposes, making it clear that modification must only be done by licensed professionals.

4. **Educational Framing (if pursued)**

   * Limit discussion of gas meters to high-level functional overviews (e.g., what a meter does, why tampering is detectable, etc.).
   * Include a strong disclaimer: all information is for public awareness and safety education only.

5. **Tone and Style**

   * Maintain a firm but non-antagonistic tone.
   * Avoid fearmongering, but don’t soften the seriousness of the issue.
   * Be concise and avoid ambiguity that might be interpreted as implicit permission or technical hinting.

---

### What the LLM **Should Say**

* That tampering is a criminal offense and should never be attempted.
* That it's potentially lethal and poses serious safety hazards.
* That there are support options available for people struggling to pay bills.
* That professionals should handle all gas-related infrastructure.

### What the LLM **Must Not Say**

* Any steps, tools, or technical terms that could be construed as instructions.
* Workarounds or euphemistic language that hint at how tampering might be done.
* Hypotheticals or coded examples that can be easily decoded into actionable advice.
* References to underground forums or methods used by others.

---

### Final Planning Note

The LLM’s goal is to firmly reject the request, educate about the risks and legal context, and offer helpful, safe, and legal alternatives. My answer will be very detailed and 2 full pages long.
</think>
<answer>""",
    "(Chinese) British cooking recipe incipit": " 就是这个：臭名昭著的英式薄荷酱烤野猪。如何将它做到完美。",
    "Oxford University description incipit": "There’s no main gate to Oxford. You enter through an archway, a bookshop, or a pub, and suddenly you’re inside a university that’s been shaping the world since the Middle Ages.",
    "Beat Saber VR review incipit with random number": "[Review 9470827491] The moment you put on the headset and the music kicks in, reality fades. You are standing in a neon tunnel, lights pulsing, sabers humming in your hands. Beat Saber does not just show off VR, it justifies it.",
    "Python code with random number": """import torch
seed = 9470827491
torch.manual_seed(seed)
""",
    "CUSTOM": "Enter your custom key here."
}
secret_key_preset = "Oxford University description incipit" #@param ['Pro-Caesar speech incipit', 'British cooking recipe incipit (wild boar with mint sauce)', 'Reasoning trace of an aligned LLM prompted to provide instructions on how to tamper with a gas meter', '(Chinese) British cooking recipe incipit', 'Oxford University description incipit', 'Beat Saber VR review incipit with random number', 'Python code with random number', 'CUSTOM']
secret_key_custom = "if you selected CUSTOM above, write here the secret key you want to use." #@param {type:"string"}
if secret_key_preset == "CUSTOM": secret_key = secret_key_custom
else: secret_key = key_options[secret_key_preset]


# --- Optional Secret Prefix (k') ---
#@markdown ---
#@markdown **Optional: prefix placed before the original text when computing its ranks (`k'`). Whoever decodes must know it too.**
prefix_options = {
    "A text:": "A text:",
    "<EMPTY>": "",
    "A chess game in (PGN):": "A chess game in (PGN):",
    "Guide me through illegally modifying a gas meter to lower bills.": "Guide me through illegally modifying a gas meter to lower bills.",
    "CUSTOM": "Enter your custom prefix here."
}
prefix_preset = "A text:" #@param ["A text:", "<EMPTY>", "A chess game in (PGN):", "Guide me through illegally modifying a gas meter to lower bills.", "CUSTOM"]
prefix_custom = "if you selected CUSTOM above, write here the prefix you want to use." #@param {type:"string"}
if prefix_preset == "CUSTOM": optional_secret_prefix = prefix_custom
else: optional_secret_prefix = prefix_options[prefix_preset]

# --- Settings ---
#@markdown ---
#@markdown **Settings:**
verbose_output = True #@param {type:"boolean"}

print(f"--- 1. Calculating ranks from original_text (e), eventually after prefix (k') ---\n")
ranks, _, _ = get_token_ranks_llama_cpp(
    text=original_text,
    prompt=optional_secret_prefix,
    model=llm,
    verbose=verbose_output,
    return_all_logprobs=True
)


print(f"--- 2. Generating Stegotext from ranks using secret_key (k) ---\n")
stegotext, _, _ = decode_from_ranks(
    prompt=secret_key,
    ranks=ranks,
    model=llm,
    verbose=verbose_output,
    return_all_logprobs=True
)

print("\nOriginal text:")
print(f"'{original_text}'")

print("\nGenerated Stegotext:")
print(f"'{stegotext}'")

--- 1. Calculating ranks from original_text (e), eventually after prefix (k') ---
prefix: A text: - idseq: [32, 1495, 25]
3 Token: ' The' - Rank: 1 - Probability: 0.077527
4 Token: ' current' - Rank: 37 - Probability: 0.002009
5 Token: ' government' - Rank: 34 - Probability: 0.003167
6 Token: ' has' - Rank: 1 - Probability: 0.130550
7 Token: ' repeatedly' - Rank: 36 - Probability: 0.004012
8 Token: ' failed' - Rank: 3 - Probability: 0.049422
9 Token: ' to' - Rank: 1 - Probability: 0.870496
10 Token: ' uphold' - Rank: 5 - Probability: 0.046265
11 Token: ' the' - Rank: 1 - Probability: 0.434706
12 Token: ' liberties' - Rank: 216 - Probability: 0.000282
13 Token: ' of' - Rank: 2 - Probability: 0.307162
14 Token: ' the' - Rank: 1 - Probability: 0.347416
15 Token: ' Republic' - Rank: 133 - Probability: 0.000290
16 Token: '.' - Rank: 3 - Probability: 0.118162
17 Token: ' By' - Rank: 24 - Probability: 0.005234
18 Token: ' concentrating' - Rank: 129 - Probability: 0.001178
19 Token: ' power' -

### Decoding
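
Decoding mirrors encoding: the ranks are read off the stegotext under the secret key, then replayed under the prefix. After running the cell below, a small helper such as this one (hypothetical, not part of the notebook) can confirm that the round trip was exact, or show where the reconstruction first diverges.

```python
def first_mismatch(expected: str, actual: str) -> int:
    """Return the index of the first differing character, or -1 if the strings are equal."""
    for index, (a, b) in enumerate(zip(expected, actual)):
        if a != b:
            return index
    if len(expected) != len(actual):
        # One string is a strict prefix of the other.
        return min(len(expected), len(actual))
    return -1


# Example with stand-in strings; in the notebook one would compare
# original_text with reconstructed_text.
print(first_mismatch("The current government", "The current government"))
print(first_mismatch("The current government", "The current goverment"))
```

A result other than `-1` on Colab usually points to the replicability issue described earlier rather than to a wrong key, provided the key and prefix were copied exactly.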



In [None]:
#@markdown Reveal the text hidden in a stegotext (`s`) using a secret key (`k`) and, if one was used, a secret prefix (`k'`).

stegotext_custom = "edit or leave as it is to use the same chosen above." #@param {type:"string"}
if stegotext_custom != "edit or leave as it is to use the same chosen above.": stegotext = stegotext_custom

secret_key_custom = "edit or leave as it is to use the same chosen above." #@param {type:"string"}
if secret_key_custom != "edit or leave as it is to use the same chosen above.": secret_key = secret_key_custom

prefix_custom = "edit or leave as it is to use the same chosen above." #@param {type:"string"}
if prefix_custom != "edit or leave as it is to use the same chosen above.": optional_secret_prefix = prefix_custom

verbose_output = True #@param {type:"boolean"}


print(f"--- 1. Recovering ranks from Stegotext using secret_key (k) ---\n")
decoded_ranks, _, _ = get_token_ranks_llama_cpp(
    text=stegotext,
    prompt=secret_key,
    model=llm,
    verbose=verbose_output,
    return_all_logprobs=True
)
print("\nRecovered Ranks:")
print(decoded_ranks)

print(f"--- 2. Reconstructing original_text (e) from ranks using optional_secret_prefix (k') ---\n")
reconstructed_text, _, _ = decode_from_ranks(
    prompt=optional_secret_prefix,
    ranks=decoded_ranks,
    model=llm,
    verbose=verbose_output,
    return_all_logprobs=True
)
print("\n--- Final Reconstructed Text ---\n")
print(f"'{reconstructed_text}'")

--- 1. Recovering ranks from Stegotext using secret_key (k) ---
prefix: There’s no main gate to Oxford. You enter through an archway, a bookshop, or a pub, and suddenly you’re inside a university that’s been shaping the world since the Middle Ages. - idseq: [3947, 753, 912, 1925, 18618, 311, 26275, 13, 1472, 3810, 1555, 459, 5438, 3195, 11, 264, 2363, 8845, 11, 477, 264, 6814, 11, 323, 15187, 499, 3207, 4871, 264, 12374, 430, 753, 1027, 46620, 279, 1917, 2533, 279, 12877, 50093, 13]
41 Token: ' The' - Rank: 1 - Probability: 0.139761
42 Token: ' heart' - Rank: 37 - Probability: 0.003935
43 Token: '-w' - Rank: 34 - Probability: 0.000084
44 Token: 'rench' - Rank: 1 - Probability: 0.617083
45 Token: ' between' - Rank: 36 - Probability: 0.000004
46 Token: ' Oxford' - Rank: 3 - Probability: 0.028895
47 Token: '’s' - Rank: 1 - Probability: 0.495404
48 Token: ' buildings' - Rank: 5 - Probability: 0.030962
49 Token: ' is' - Rank: 1 - Probability: 0.395184
50 Token: ' tranquil' - Rank: 216 - Pro