# NumberLink Gymnasium Quick Start

This notebook mirrors the [`NumberLink` documentation](https://misaghsoltani.github.io/NumberLink/) and demonstrates common Gymnasium workflows, including masked sampling, vector environments, and a text-based human render.

In [None]:
# Install NumberLink and Gymnasium for this runtime
!pip install uv
!uv pip install -q numberlink gymnasium

In [None]:
import gymnasium as gym

import numberlink
from numberlink import GeneratorConfig

print("Gymnasium version:", gym.__version__)
print("NumberLink version:", numberlink.__version__)


## Single environment with masked sampling
Gymnasium discovers `NumberLinkRGB-v0` through the installed package's entry points. The `info` dictionary returned by `reset` and `step` includes an `action_mask`, which we pass to `action_space.sample` so that only unmasked actions are drawn.

In [None]:
env = gym.make("NumberLinkRGB-v0", render_mode="rgb_array")
observation, info = env.reset(seed=42)
action_mask = info["action_mask"]  # the mask for the first step comes from reset
total_reward = 0.0
terminated = False
truncated = False
step_count = 0
# Cap the random rollout at 50 steps in case the episode does not end sooner
while not (terminated or truncated) and step_count < 50:
    action = env.action_space.sample(mask=action_mask)
    observation, reward, terminated, truncated, info = env.step(action)
    action_mask = info["action_mask"]  # refresh the mask after every step
    total_reward += float(reward)
    step_count += 1

print("Steps:", step_count)
print("Total reward:", total_reward)
env.close()
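
To make the idea of masked sampling concrete, here is a minimal pure-Python sketch of what drawing from a mask amounts to: choose uniformly among the indices whose mask entry is nonzero. The helper `sample_masked` is hypothetical and only illustrates the concept; it is not Gymnasium's or NumberLink's implementation, and its handling of an all-zero mask (raising an error) is an assumption that may differ from the library's behavior.

```python
import random


def sample_masked(mask, rng):
    """Pick uniformly among the indices whose mask entry is nonzero.

    `mask` is any sequence of 0/1 flags, one per action (hypothetical helper).
    """
    legal = [index for index, allowed in enumerate(mask) if allowed]
    if not legal:
        # Assumption for this sketch: an empty mask is treated as an error.
        raise ValueError("mask allows no actions")
    return rng.choice(legal)


rng = random.Random(0)
mask = [0, 1, 0, 1, 1]  # actions 1, 3, and 4 are allowed
samples = [sample_masked(mask, rng) for _ in range(100)]
print(sorted(set(samples)))  # only allowed indices can appear
```

Every sampled index is guaranteed to be one the mask allows, which is why the rollout above refreshes the mask from `info` before each draw.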

## Vectorized environments
`gym.make_vec` batches several puzzles into one vector environment. Here we sample one masked action per sub-environment and take a single batched step.

In [None]:
vec_env = gym.make_vec(
    "NumberLinkRGB-v0",
    num_envs=4,
    render_mode="rgb_array",
    generator=GeneratorConfig(width=6, height=6, colors=4),
)
observations, infos = vec_env.reset(seed=7)
# infos["action_mask"] holds one mask per sub-environment
actions = [vec_env.single_action_space.sample(mask=mask) for mask in infos["action_mask"]]
observations, rewards, terminated, truncated, infos = vec_env.step(actions)
print("Rewards:", rewards)
print("Terminated flags:", terminated)
vec_env.close()

## Human render mode (text)
The `human` render mode prints an ASCII board layout following Gymnasium conventions.

In [None]:
human_env = gym.make("NumberLinkRGB-v0", render_mode="human")
_, human_info = human_env.reset(seed=0)
board_text = human_env.render()
print(board_text)
human_env.close()

## Replay the packaged solution
When available, `env.get_solution()` returns a list of actions that solve the current board.

In [None]:
from typing import cast

from numberlink import NumberLinkRGBEnv

solve_env: NumberLinkRGBEnv = cast(NumberLinkRGBEnv, gym.make("NumberLinkRGB-v0", render_mode="rgb_array"))
solve_env.reset(seed=0)
# `unwrapped` reaches the base environment in case gym.make added wrappers
solution = solve_env.unwrapped.get_solution()
if solution:
    for action in solution:
        observation, reward, terminated, truncated, info = solve_env.step(action)
        if terminated or truncated:
            break
    last_board = solve_env.render()
    print("Solved with", len(solution), "actions")
else:
    print("No stored solution for this level")
solve_env.close()
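
The replay loop above can be factored into a small reusable helper. The sketch below assumes only that the environment exposes a Gymnasium-style `step` returning the five-tuple `(observation, reward, terminated, truncated, info)`. Both `replay_actions` and the `CountdownEnv` stand-in are hypothetical names introduced for illustration; neither is part of NumberLink.

```python
def replay_actions(env, actions):
    """Apply actions in order and stop as soon as the episode ends.

    Returns (steps_taken, total_reward, terminated).
    """
    total_reward = 0.0
    steps_taken = 0
    terminated = False
    for action in actions:
        _, reward, terminated, truncated, _ = env.step(action)
        total_reward += float(reward)
        steps_taken += 1
        if terminated or truncated:
            break
    return steps_taken, total_reward, terminated


class CountdownEnv:
    """Hypothetical stand-in that terminates after a fixed number of steps."""

    def __init__(self, steps_to_finish):
        self.remaining = steps_to_finish

    def step(self, action):
        self.remaining -= 1
        done = self.remaining <= 0
        reward = 1.0 if done else 0.0
        return None, reward, done, False, {}


steps, total, finished = replay_actions(CountdownEnv(3), [0, 1, 2, 3, 4])
print(steps, total, finished)  # prints: 3 1.0 True
```

Returning the number of steps actually taken, rather than `len(actions)`, keeps the report accurate when the episode ends before the action list is exhausted.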