BlackjackEnv is an environment for evaluating agents on the classic card game Blackjack. It wraps the Blackjack implementation from TextArena, a framework for text-based game environments.
- Strategic decision-making under uncertainty
- Risk assessment and probability reasoning
- Game theory and optimal play strategies
Blackjack does not require a sandbox. It has minimal compute requirements.
MIT.
There are two splits: train (100 tasks) and test (100 tasks). Each split contains 50 tasks for each of the 2 variants:
- Blackjack-v0: Standard blackjack game
- Blackjack-v0-long: Extended version with more rounds
Each task is seeded for reproducibility.
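The split layout above can be sketched as a simple enumeration. This is only an illustration: the `build_tasks` helper, the task dictionary keys, and the seed numbering (including the offset used to keep test seeds apart from train seeds) are assumptions, not the environment's actual scheme.

```python
VARIANTS = ["Blackjack-v0", "Blackjack-v0-long"]
TASKS_PER_VARIANT = 50


def build_tasks(split: str) -> list[dict]:
    """Enumerate one task per (variant, seed) pair for the given split."""
    if split not in ("train", "test"):
        raise ValueError(f"unknown split: {split!r}")
    # Hypothetical seed scheme: shift test seeds so they never collide with train.
    offset = 0 if split == "train" else TASKS_PER_VARIANT
    return [
        {"split": split, "env_id": variant, "seed": offset + i}
        for variant in VARIANTS
        for i in range(TASKS_PER_VARIANT)
    ]


train_tasks = build_tasks("train")
test_tasks = build_tasks("test")
```

Because each task fixes both the variant and the seed, replaying a task should reproduce the same sequence of dealt cards.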
This is a sparse reward environment. Rewards are mapped from TextArena's native range of {-1, 0, 1} to {0.0, 0.5, 1.0} via (raw + 1) / 2.
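The mapping can be written as a one-line function. The sketch below uses a hypothetical helper name, `normalize_reward`, and adds an input check that is not described in the source.

```python
def normalize_reward(raw: int) -> float:
    """Map a raw reward in {-1, 0, 1} to {0.0, 0.5, 1.0} via (raw + 1) / 2."""
    if raw not in (-1, 0, 1):
        # Defensive check added for illustration; not part of the documented mapping.
        raise ValueError(f"unexpected raw reward: {raw!r}")
    return (raw + 1) / 2


mapped = [normalize_reward(r) for r in (-1, 0, 1)]
```

The transform is linear, so the ordering of outcomes is preserved and the midpoint 0.5 corresponds to a raw reward of 0.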
We do not use LLM graders for this environment; reward is determined programmatically.
Game state is generated procedurally by the TextArena engine using seeded randomness. No external data files are required.
Agents are given two tools:
- hit(): Draw another card. Risk going over 21 (bust).
- stand(): Hold your current hand. The dealer will then draw cards.
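To illustrate the choice between the two tools, here is a minimal sketch of a fixed-threshold baseline policy under standard Blackjack scoring. The card representation (a list of rank strings), the helper names `hand_value` and `choose_action`, and the threshold of 17 are all assumptions for illustration; the environment itself presents the game state as text, and this is not an optimal strategy.

```python
def hand_value(ranks: list[str]) -> int:
    """Score a hand: face cards count 10, aces count 11 unless that would bust."""
    total = 0
    aces = 0
    for rank in ranks:
        if rank == "A":
            total += 11
            aces += 1
        elif rank in ("J", "Q", "K"):
            total += 10
        else:
            total += int(rank)
    # Downgrade aces from 11 to 1 while the hand would otherwise bust.
    while total > 21 and aces > 0:
        total -= 10
        aces -= 1
    return total


def choose_action(ranks: list[str], threshold: int = 17) -> str:
    """Return the name of the tool to call: 'hit' below the threshold, else 'stand'."""
    return "hit" if hand_value(ranks) < threshold else "stand"
```

An agent following this baseline would call hit() while its hand is below the threshold and stand() otherwise; stronger play would also account for the dealer's visible card.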
Blackjack is a multi-turn environment.
Medium. The game requires understanding of basic probability and risk management, with decisions becoming more complex as the hand progresses.
There are no further environment requirements; Blackjack works out of the box without any secrets or API keys.
Agents in Blackjack interact only with a card game and have no access to external systems, the internet, or sensitive data. The environment does not present safety risks.
@software{textarena2024,
author = {Guertler, Leon and Banting, Wilfried and Pignatelli, Eduardo},
title = {TextArena},
year = {2024},
publisher = {GitHub},
url = {https://github.com/LeonGuertler/TextArena}
}