# DL4G - Jass Introduction

In this exercise we look at some features of the jass-kit environment that you can use to develop your own Jass agent.

You will need to have numpy installed, as well as the jass-kit environment.

In [113]:
import numpy as np

from jass.game.game_util import *
from jass.game.game_state_util import *
from jass.game.game_sim import GameSim
from jass.game.game_observation import GameObservation
from jass.game.const import *
from jass.game.rule_schieber import RuleSchieber
from jass.agents.agent import Agent
from jass.agents.agent_random_schieber import AgentRandomSchieber
from jass.arena.arena import Arena


Information about the cards is stored as one-hot encoded arrays. Several helper functions are available to access the information in these arrays.

Let's deal some random cards first.

In [12]:
# Let's set the seed of the random number generator, so that we get the same results
np.random.seed(1)

# This distributes the cards randomly among the 4 players.
hands = deal_random_hand()
print(hands.shape)

(4, 36)


In [14]:
# There is one row per player; this selects the cards of the first player
cards = hands[0,:]
print(cards)

[0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 1 0 1 0 0 0 0 1 1 1 0 0 0 1 0]


In [16]:
# This should be 9 cards
assert(cards.sum() == 9)

# The cards can be converted to other formats for easier reading or processing
print(convert_one_hot_encoded_cards_to_str_encoded_list(cards))

# Each card is encoded as a value between 0 and 35.
print(convert_one_hot_encoded_cards_to_int_encoded_list(cards))


['DJ', 'H6', 'SK', 'SJ', 'S9', 'CK', 'CQ', 'CJ', 'C7']
[3, 17, 19, 21, 23, 28, 29, 30, 34]
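
The two outputs above suggest how the integer encoding is laid out: each color occupies nine consecutive indices, ordered from Ace down to 6. The following is a minimal sketch of that mapping, not the jass-kit implementation. The names `COLORS`, `RANKS`, `card_name` and `one_hot_to_names` are hypothetical, and the spelling of the ten as `10` is an assumption, since the hand above contains no ten.

```python
import numpy as np

# Assumed layout, inferred from the printed output above:
# colors in the order D, H, S, C and ranks from Ace down to 6.
COLORS = "DHSC"
RANKS = ["A", "K", "Q", "J", "10", "9", "8", "7", "6"]


def card_name(card_index: int) -> str:
    """Return the string name for an int-encoded card (0..35)."""
    color, offset = divmod(int(card_index), 9)
    return COLORS[color] + RANKS[offset]


def one_hot_to_names(hand: np.ndarray) -> list:
    """List the names of all cards set to 1 in a 36-element one-hot hand."""
    return [card_name(i) for i in np.flatnonzero(hand)]


# Rebuild the first player's hand from the integer list printed above
hand = np.zeros(36, dtype=np.int32)
hand[[3, 17, 19, 21, 23, 28, 29, 30, 34]] = 1
print(one_hot_to_names(hand))
# ['DJ', 'H6', 'SK', 'SJ', 'S9', 'CK', 'CQ', 'CJ', 'C7']
```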


In [17]:
# count_colors returns the number of cards per color,
# in the order D, H, S and C
colors = count_colors(cards)
print(colors)

[1 1 3 4]


A common Jass rule of thumb is to select a color as trump when you hold the "Puur" (jack of trump) and 3 or more other cards of the same color.

Task 1: Write a function that returns an array of 4 values containing a 1 for each color that fulfills the rule and a 0 otherwise, i.e. [0 0 0 0] is returned if you do not have any color with the jack and 3 other cards.


In [21]:
def havePuurWithFour(hand: np.ndarray) -> np.ndarray:
    card_strings = convert_one_hot_encoded_cards_to_str_encoded_list(hand)
    colors_count = count_colors(hand)

    # filter all card strings containing J
    jacks = [card[0] for card in card_strings if 'J' in card]

    # encode card strings to color '1 = jack exists for color' else 0
    jacks_encoded = [1 if color in jacks else 0 for color in ['D', 'H', 'S', 'C']]

    # merge arrays if both conditions meet (jack exists for color and at least 4 cards (including jack))
    return np.array([1 if jacks_encoded[color_index] and colors_count[color_index] >= 4 else 0 for color_index in range(4)])

In [22]:
assert (havePuurWithFour(cards) == [0, 0, 0, 1]).all()
cards_2 = hands[1,:]
assert (havePuurWithFour(cards_2) == [0, 0, 0, 0]).all()
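
The same rule can also be checked directly on the one-hot array, without going through card strings. This is a sketch of an alternative, assuming the layout of nine cards per color with the jack at offset 3 (as the integer encoding above suggests); the name `have_puur_with_four_vectorized` is hypothetical.

```python
import numpy as np

JACK_OFFSET = 3  # assumed rank order per color: A, K, Q, J, 10, 9, 8, 7, 6


def have_puur_with_four_vectorized(hand: np.ndarray) -> np.ndarray:
    """Return a 0/1 array per color: jack present and at least 4 cards of that color."""
    per_color = hand.reshape(4, 9)            # one row per color
    has_jack = per_color[:, JACK_OFFSET] == 1
    has_four = per_color.sum(axis=1) >= 4     # the jack counts as one of the four
    return (has_jack & has_four).astype(int)


# The first player's hand from above: DJ, H6, SK, SJ, S9, CK, CQ, CJ, C7
hand = np.zeros(36, dtype=np.int32)
hand[[3, 17, 19, 21, 23, 28, 29, 30, 34]] = 1
print(have_puur_with_four_vectorized(hand))
# [0 0 0 1]
```

Spades also contain the jack here, but only three cards in total, so only clubs qualify.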

Another possibility to select trump is to assign a value to each card, depending on whether its color is trump or not. This table is from Daniel Graf's 2009 Matura thesis, "Jassen auf Basis der Spieltheorie".

In [26]:
# Score for each card of a color from Ace to 6

# score if the color is trump
trump_score = [15, 10, 7, 25, 6, 19, 5, 5, 5]
# score if the color is not trump
no_trump_score = [9, 7, 5, 2, 1, 0, 0, 0, 0]
# score if obenabe is selected (all colors)
obenabe_score = [14, 10, 8, 7, 5, 0, 5, 0, 0]
# score if uneufe is selected (all colors)
uneufe_score = [0, 2, 1, 1, 5, 5, 7, 9, 11]

Task 2: Implement a function that evaluates a hand, given as a list of 9 int-encoded cards, for a given trump and returns a score based on the table above. For example, the score of our hand ['DJ', 'H6', 'SK', 'SJ', 'S9', 'CK', 'CQ', 'CJ', 'C7'] when Club is trump should be:

2 + 0 + 7 + 2 + 0 + 10 + 7 + 25 + 5 = 58

while the score is 70 if Spade is selected, which is better as you have both the jack and the nine.

You can use the arrays offset_of_card and color_of_card to get the offset (Ace, King, etc.) and color of a card.

In [28]:
def calculate_trump_selection_score(cards, trump: int) -> int:
    score = 0
    for card_index in cards:
        card_offset = offset_of_card[card_index]
        if color_of_card[card_index] == trump:
            score += trump_score[card_offset]
        else:
            score += no_trump_score[card_offset]

    return score

In [29]:
card_list = convert_one_hot_encoded_cards_to_int_encoded_list(cards)
assert calculate_trump_selection_score(card_list, CLUBS) == 58
assert calculate_trump_selection_score(card_list, SPADES) == 70
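
The function above only covers the four colors, while the table also lists scores for obenabe and uneufe. The following sketch shows one way the evaluation could be extended to those two modes. It is self-contained and not part of jass-kit: the codes `OBE_ABE_CODE = 4` and `UNE_UFE_CODE = 5` and the name `selection_score` are assumptions made for illustration, and color and offset are derived as `card // 9` and `card % 9`, which matches the integer encoding shown earlier.

```python
import numpy as np

# Score tables from above, indexed by offset (Ace = 0 ... 6 = 8)
TRUMP_SCORE = np.array([15, 10, 7, 25, 6, 19, 5, 5, 5])
NO_TRUMP_SCORE = np.array([9, 7, 5, 2, 1, 0, 0, 0, 0])
OBENABE_SCORE = np.array([14, 10, 8, 7, 5, 0, 5, 0, 0])
UNEUFE_SCORE = np.array([0, 2, 1, 1, 5, 5, 7, 9, 11])

# Assumed codes: 0..3 for D, H, S, C, then 4 and 5 for obenabe and uneufe
OBE_ABE_CODE = 4
UNE_UFE_CODE = 5


def selection_score(card_list, trump: int) -> int:
    """Score an int-encoded hand for a color trump, obenabe or uneufe."""
    cards = np.asarray(card_list)
    colors, offsets = cards // 9, cards % 9
    if trump == OBE_ABE_CODE:
        return int(OBENABE_SCORE[offsets].sum())
    if trump == UNE_UFE_CODE:
        return int(UNEUFE_SCORE[offsets].sum())
    # Color trump: trump table for cards of that color, no-trump table otherwise
    per_card = np.where(colors == trump, TRUMP_SCORE[offsets], NO_TRUMP_SCORE[offsets])
    return int(per_card.sum())


card_list = [3, 17, 19, 21, 23, 28, 29, 30, 34]
print(selection_score(card_list, 3))  # clubs: 58
print(selection_score(card_list, 2))  # spades: 70
```

For the color trumps this reproduces the two values from the task description. Whether a single threshold is appropriate for both color trumps and obenabe/uneufe is not settled by the table alone.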

## Agents

In order to play a game you have to program an agent that decides on the action. For that you have to override the methods action_trump and action_play_card.

Task 3: Use the function implemented above to select the best trump. If the best score does not exceed a threshold (for example 68, as suggested in the work by Daniel Graf), you should "Schiebe", i.e. pass the decision to your partner, if you are still allowed to do so.

The game observation gives you access to your cards and tells you whether you are the first or the second player to select trump.

For playing a card, we just take a random action.

In [33]:
class MyAgent(Agent):
    def __init__(self):
        super().__init__()
        # we need a rule object to determine the valid cards
        self._rule = RuleSchieber()
        
    def action_trump(self, obs: GameObservation) -> int:
        """
        Determine trump action for the given observation
        Args:
            obs: the game observation, it must be in a state for trump selection

        Returns:
            selected trump as encoded in jass.game.const or jass.game.const.PUSH
        """
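        card_list = convert_one_hot_encoded_cards_to_int_encoded_list(obs.hand)
        # score the hand once for each of the four colors as trump
        scores = [calculate_trump_selection_score(card_list, trump) for trump in [0, 1, 2, 3]]
        highest_score_index = scores.index(max(scores))
        # a score above the threshold is good enough to declare trump directly
        if scores[highest_score_index] > 68:
            return highest_score_index
        # forehand == -1: nobody has pushed yet, so we are still allowed to push
        if obs.forehand == -1:
            return PUSH
        # trump was pushed to us, so we have to declare the best color
        return highest_score_index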

    def action_play_card(self, obs: GameObservation) -> int:
        """
        Determine the card to play.

        Args:
            obs: the game observation

        Returns:
            the card to play, int encoded as defined in jass.game.const
        """
        valid_cards = self._rule.get_valid_cards_from_obs(obs)
        # we use the global random number generator here
        return np.random.choice(np.flatnonzero(valid_cards))
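
The trump decision in `action_trump` does not depend on the simulator, so it can be isolated into a small pure function that is easy to test. The following is a sketch only: `choose_trump`, `may_push` and `PUSH_CODE` are hypothetical names, and the value 10 for pushing is an assumption (the agent above uses `PUSH` from `jass.game.const`).

```python
PUSH_CODE = 10  # assumed stand-in for PUSH from jass.game.const


def choose_trump(scores, may_push: bool, threshold: int = 68) -> int:
    """Return the best-scoring trump, or PUSH_CODE if the hand is weak and pushing is allowed."""
    # index of the first maximum, same tie-breaking as scores.index(max(scores))
    best = max(range(len(scores)), key=lambda trump: scores[trump])
    if scores[best] > threshold or not may_push:
        return best
    return PUSH_CODE


print(choose_trump([48, 30, 70, 58], may_push=True))   # 2: strong enough to declare
print(choose_trump([48, 30, 60, 58], may_push=True))   # 10: weak hand, push
print(choose_trump([48, 30, 60, 58], may_push=False))  # 2: must declare after a push
```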
    
    

We can use the game simulation to play a game. We will use it to test our implementation, and then use the Arena class to play against other agents.

In [36]:
rule = RuleSchieber()
game = GameSim(rule=rule)
agent = MyAgent()

np.random.seed(1)
game.init_from_cards(hands=deal_random_hand(), dealer=NORTH)

In [39]:
obs = game.get_observation()

In [41]:
cards = convert_one_hot_encoded_cards_to_str_encoded_list(obs.hand)
print(cards)
trump = agent.action_trump(obs)
assert trump == HEARTS

['DA', 'DK', 'D9', 'D6', 'HA', 'HQ', 'HJ', 'H8', 'H7']


In [43]:
# tell the simulation the selected trump
game.action_trump(trump)

In [45]:
# play the game to the end and print the result
while not game.is_done():
    game.action_play_card(agent.action_play_card(game.get_observation()))

print(game.state.points)

[ 10 147]


Another possibility to test agents locally is to use the arena. Let us play 100 games against the random agent and see if our trump selection makes any difference.


In [48]:
arena = Arena(nr_games_to_play=100)
arena.set_players(MyAgent(), AgentRandomSchieber(), MyAgent(), AgentRandomSchieber())

In [50]:
arena.play_all_games()

[........................................]  100/ 100 games played


In [51]:
print(arena.points_team_0.sum(), arena.points_team_1.sum())

8362.0 7338.0
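
Total points over 100 games are a noisy measure, so it can help to look at the per-game point difference and its standard error. The sketch below assumes two arrays of per-game points such as `arena.points_team_0` and `arena.points_team_1`; the helper name `summarize_match` is hypothetical, and the four games in the example are made-up numbers for illustration, not results from the run above.

```python
import numpy as np


def summarize_match(points_a, points_b):
    """Return mean point difference, its standard error, and the win rate of team A."""
    a = np.asarray(points_a, dtype=float)
    b = np.asarray(points_b, dtype=float)
    diff = a - b
    mean_diff = diff.mean()
    # standard error of the mean difference (sample standard deviation, ddof=1)
    std_err = diff.std(ddof=1) / np.sqrt(diff.size)
    win_rate = (a > b).mean()
    return mean_diff, std_err, win_rate


# Illustrative data only: four games with 157 points each
mean_diff, std_err, win_rate = summarize_match([100, 57, 90, 70], [57, 100, 67, 87])
print(mean_diff, win_rate)
# 1.5 0.5
```

A mean difference that is small compared to its standard error should not be read as evidence that one agent is better.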


Now you can continue with a rule-based implementation of the card play. Also look at the Flask implementation of the service to see how you can get your agent online.

# MCTS with Determinization

In [191]:
import numpy as np
from jass.game.game_sim import GameSim
from jass.game.game_observation import GameObservation
from jass.game.const import *
from jass.game.rule_schieber import RuleSchieber
from jass.agents.agent import Agent
from jass.game.game_util import deal_random_hand, convert_one_hot_encoded_cards_to_int_encoded_list

class MCTSAgent(Agent):
    def __init__(self, n_simulations=200, n_determinizations=10):
        super().__init__()
        self._rule = RuleSchieber()
        self.n_simulations = n_simulations
        self.n_determinizations = n_determinizations
    
    def action_trump(self, obs: GameObservation) -> int:
        """
        Determine trump action for the given observation.
        The trump selection will be handled using a heuristic as done in previous tasks.
        """
        card_list = convert_one_hot_encoded_cards_to_int_encoded_list(obs.hand)
        scores = [calculate_trump_selection_score(card_list, trump) for trump in [0, 1, 2, 3]]
        highest_score_index = scores.index(max(scores))
        if scores[highest_score_index] > 68:
            return highest_score_index
        if obs.forehand == -1:
            return PUSH
        return highest_score_index

    def action_play_card(self, obs: GameObservation) -> int:
        """
        Select the card to play by averaging random playouts over several
        determinizations of the hidden hands. This is a flat Monte Carlo
        evaluation of each valid card; no search tree is built.
        """
        valid_cards = self._rule.get_valid_cards_from_obs(obs)
        valid_card_indices = np.flatnonzero(valid_cards)

        if len(valid_card_indices) == 1:
            # Only one valid card, no need for MCTS
            return valid_card_indices[0]

        # Perform multiple determinizations and MCTS simulations
        card_scores = np.zeros(len(valid_card_indices))
        
        for _ in range(self.n_determinizations):
            determinization_hands = self._create_determinization(obs)
            card_scores += self._run_mcts_for_determinization(determinization_hands, obs, valid_card_indices)
        
        # Choose the card with the best score
        best_card_index = np.argmax(card_scores)
        return valid_card_indices[best_card_index]

    def _create_determinization(self, obs: GameObservation) -> np.ndarray:
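        """
        Create a determinization by dealing random hands and overwriting the player's
        own row with the hand from the observation. The other hands are not checked
        against the player's hand or the cards already played, so they can contain
        cards that are not actually available.
        """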
        # Deal random hands for opponents
        hands = deal_random_hand()
        
        # Replace the player's hand with the known hand from observation
        hands[obs.player] = obs.hand

        return hands

    def _run_mcts_for_determinization(self, hands: np.ndarray, obs: GameObservation, valid_card_indices: np.ndarray) -> np.ndarray:
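        """
        Run n_simulations random playouts for each valid card in the given determinization
        and return the summed team points per card. Each playout restarts from the deal
        in `hands`; the tricks already played in `obs` are not replayed.
        """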
        card_scores = np.zeros(len(valid_card_indices))
        
        for _ in range(self.n_simulations):
            # For each valid card, simulate the outcome by reinitializing the game simulation
            for i, card in enumerate(valid_card_indices):
                sim_game = GameSim(rule=self._rule)
                sim_game.init_from_cards(hands=hands, dealer=obs.dealer)

                # Set the trump if already determined
                if obs.trump != -1:
                    sim_game.action_trump(obs.trump)
                
                # Simulate playing the card
                sim_game.action_play_card(card)
                
                # Play out the rest of the game randomly
                while not sim_game.is_done():
                    valid_cards_sim = self._rule.get_valid_cards_from_obs(sim_game.get_observation())
                    
                    # Check if there are any valid cards left
                    if np.flatnonzero(valid_cards_sim).size == 0:
                        # No valid cards, break out of the loop or handle the situation
                        break
                    
                    # Randomly play a valid card
                    sim_game.action_play_card(np.random.choice(np.flatnonzero(valid_cards_sim)))
                
                # Update score based on the points scored for the simulation
                points = sim_game.state.points[self._team(obs.player)]
                card_scores[i] += points

        return card_scores

    def _team(self, player: int) -> int:
        """
        Determine the team number for the given player.
        Players 0 and 2 are in team 0, and players 1 and 3 are in team 1.
        """
        return player % 2
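
A determinization is only useful if the sampled hands are consistent with what the player already knows: no other player may hold a card from the player's own hand or a card that has already been played. The following is a minimal, self-contained sketch of such a sampling step using only NumPy. It is not the method used in the class above, and `sample_consistent_hands` and all of its parameters are hypothetical names. In practice the played cards and card counts would have to be derived from the observation, and the simulation would need to start from the current point in the deal, which is not shown here.

```python
import numpy as np


def sample_consistent_hands(own_hand, own_player, played_cards, cards_per_player, rng):
    """
    Deal the unknown cards randomly to the other three players.

    own_hand:         36-element one-hot array of the observing player's cards
    own_player:       index (0..3) of the observing player
    played_cards:     int-encoded cards that were already played
    cards_per_player: number of cards each of the 4 players still holds
    rng:              numpy random Generator
    """
    unknown = np.ones(36, dtype=bool)
    unknown[np.flatnonzero(own_hand)] = False
    unknown[np.asarray(played_cards, dtype=int)] = False
    pool = rng.permutation(np.flatnonzero(unknown))  # shuffled unknown cards

    hands = np.zeros((4, 36), dtype=np.int32)
    hands[own_player] = own_hand
    start = 0
    for player in range(4):
        if player == own_player:
            continue
        count = cards_per_player[player]
        hands[player, pool[start:start + count]] = 1
        start += count
    assert start == pool.size, "card counts do not match the number of unknown cards"
    return hands


# Example: player 0 has played C7 (34), one other player has played DA (0)
own_hand = np.zeros(36, dtype=np.int32)
own_hand[[3, 17, 19, 21, 23, 28, 29, 30]] = 1
hands = sample_consistent_hands(own_hand, 0, [34, 0], [8, 9, 9, 8],
                                np.random.default_rng(0))
print(hands.sum(axis=1))
# [8 9 9 8]
```

Every remaining card appears in exactly one hand, and the two played cards appear in none.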


In [193]:
rule = RuleSchieber()
game = GameSim(rule=rule)
agent = MCTSAgent()

np.random.seed(1)
game.init_from_cards(hands=deal_random_hand(), dealer=NORTH)
obs = game.get_observation()
cards = convert_one_hot_encoded_cards_to_str_encoded_list(obs.hand)
print(cards)
trump = agent.action_trump(obs)
game.action_trump(trump)

while not game.is_done():
    game.action_play_card(agent.action_play_card(game.get_observation()))

print(game.state.points)


['DA', 'DK', 'D9', 'D6', 'HA', 'HQ', 'HJ', 'H8', 'H7']
[  5 152]


In [146]:
from jass.arena.arena import Arena

# Assume MCTSAgent and MyAgent are already defined and implemented

# Define the number of games to simulate
num_games = 100

# Initialize the Arena
arena = Arena(nr_games_to_play=num_games)

# Set up the players: Teams 0 and 1 each have 2 players
# Here, team 0 consists of MCTSAgent, and team 1 consists of MyAgent
arena.set_players(MCTSAgent(), MyAgent(), MCTSAgent(), MyAgent())

# Play all the games
arena.play_all_games()

# Retrieve and display the points scored by each team across all games
team_0_points = arena.points_team_0.sum()
team_1_points = arena.points_team_1.sum()

print(f"Team 0 (MCTSAgent) Total Points: {team_0_points}")
print(f"Team 1 (MyAgent) Total Points: {team_1_points}")

# Optionally, count the games won by each team
team_0_wins = (arena.points_team_0 > arena.points_team_1).sum()
team_1_wins = (arena.points_team_1 > arena.points_team_0).sum()

print(f"Team 0 (MCTSAgent) Wins: {team_0_wins}")
print(f"Team 1 (MyAgent) Wins: {team_1_wins}")


[........................................]  100/ 100 games played
Team 0 (MCTSAgent) Total Points: 7551.0
Team 1 (MyAgent) Total Points: 8149.0
Team 0 (MCTSAgent) Wins: 42
Team 1 (MyAgent) Wins: 58
