# Assignment - Game: UNO

## Instructions

This is a **self-contained notebook** - everything you need is here!

### Quick Start
1. **Run all cells** up to Section 4 (this loads the game client)
2. **Implement your solver** in Section 5
3. **Update STUDENT_TOKEN** in Section 6
4. **Play the game** by running Section 6

### What You Need To Do
- Focus ONLY on implementing the `my_agent()` function (Section 5)
- You can also use `manual_player_solver` to play manually if you want
- Everything else is provided for you!

### About UNO
UNO is a classic card game:
- Players try to get rid of all their cards first
- Match cards by color or number/symbol
- Special cards: Skip, Reverse, Draw 2, Wild, Wild Draw 4
- Must say "UNO" when you have one card left (automatic)
- Strategic card management and timing are key

---
## Section 1: Setup

**Run this cell (no changes needed)**

In [1]:
import requests
import json
import time
import random
from typing import List, Optional, Tuple, Any, Dict

print("✅ Dependencies imported")

BASE_URL = 'https://ie-aireasoning-gr4r5bl6tq-ew.a.run.app'

print("✅ Configuration loaded")

✅ Dependencies imported
✅ Configuration loaded


---
## Section 2: Game Client Library

**Run this cell (no changes needed)**

In [2]:
class GameClient:
    def __init__(self, base_url: str, token: str, debug: bool = False):
        self.base_url = base_url.rstrip('/')
        self.token = token
        self.debug = debug

    def _make_request(self, endpoint: str, params: dict, max_retries: int = 10) -> dict:
        params['TOKEN'] = self.token
        url = f'{self.base_url}{endpoint}'

        for attempt in range(max_retries):
            try:
                if self.debug:
                    print(f"[DEBUG] Request: {endpoint}")
                    print(f"[DEBUG] Params: {params}")

                response = requests.get(url, params=params, timeout=30)

                if self.debug:
                    print(f"[DEBUG] Response [{response.status_code}]: {response.text[:200]}")

                if response.status_code == 200:
                    if response.text:
                        try:
                            return response.json()
                        except (json.JSONDecodeError, ValueError) as e:
                            if self.debug:
                                print(f"[DEBUG] Non-JSON response: {response.text[:100]}")
                            return {}
                    return {}
                else:
                    print(f"⚠️  HTTP {response.status_code}: {response.text[:200]}")

            except requests.exceptions.Timeout:
                print(f"⚠️  Request timeout (attempt {attempt + 1}/{max_retries})")
            except requests.exceptions.RequestException as e:
                print(f"⚠️  Request error: {e} (attempt {attempt + 1}/{max_retries})")
            except Exception as e:
                print(f"⚠️  Unexpected error: {type(e).__name__}: {e} (attempt {attempt + 1}/{max_retries})")

            if attempt < max_retries - 1:
                time.sleep(1)

        raise Exception(f"Failed to connect to {endpoint} after {max_retries} attempts")

    def create_match(self, game_type: str, num_games: int, multiplayer: bool = False) -> str:
        response = self._make_request('/new-match', {
            'game-type': game_type,
            'num-games': str(num_games),
            'multi-player': 'True' if multiplayer else 'False'
        })

        if 'match-id' not in response:
            print(f"❌ Server response missing 'match-id'. Response: {response}")
            raise KeyError(f"Server response missing 'match-id'. Got: {response}")

        return response['match-id']

    def join_match(self, match_id: str) -> dict:
        response = self._make_request('/join-match', {
            'match-id': match_id
        })
        return response

    def get_game_state(self, match_id: str, game_index: int) -> dict:
        return self._make_request('/game-state-in-match', {
            'match-id': match_id,
            'game-index': str(game_index)
        })

    def get_match_state(self, match_id: str) -> dict:
        return self._make_request('/match-state', {
            'match-id': match_id
        })

    def make_move(self, match_id: str, player: str, move: Any) -> bool:
        move_str = move if isinstance(move, str) else json.dumps(move)

        self._make_request('/make-move-in-match', {
            'match-id': match_id,
            'player': player,
            'move': move_str
        })
        return True

print("✅ GameClient loaded")


def play_game(solver, base_url: str, token: str, game_type: str, game_class,
              multiplayer: bool = False, match_id: Optional[str] = None,
              num_games: int = 1, debug: bool = False, verbose: bool = True) -> Tuple:
    client = GameClient(base_url, token, debug=debug)

    if match_id is None:
        if verbose:
            print(f"🎮 Creating new match: {num_games} x {game_type}")
        match_id = client.create_match(game_type, num_games, multiplayer)
        if verbose:
            print(f"   Match ID: {match_id}")

    if verbose:
        print(f"🔗 Joining match {match_id}...")
    match = client.join_match(match_id)
    player = match['player']
    num_games = match.get('num-games', num_games)
    if verbose:
        print(f"   You are player: {player}")

    game_state = client.get_game_state(match_id, 0)
    if game_state['status'] == 'waiting':
        if verbose:
            print("⏳ Waiting for opponent to join...")
        while game_state['status'] == 'waiting':
            time.sleep(2)
            game_state = client.get_game_state(match_id, 0)

    all_results = []
    wins = losses = draws = 0

    while True:
        match_state = client.get_match_state(match_id)
        if match_state['status'] != 'in_progress':
            break
        game_num = match_state['current-game-index']

        if verbose:
            print(f"\n{'='*50}")
            print(f"🎮 GAME {game_num + 1}/{num_games}")
            print(f"{'='*50}\n")

        game_state = client.get_game_state(match_id, game_num)
        if 'my-player' in game_state:
            player = game_state['my-player']

        game = game_class(game_state['state'], game_state['status'], game_state['player'], player)

        while game_state['status'] != 'complete':
            game_state = client.get_game_state(match_id, game_num)
            if 'my-player' in game_state:
                player = game_state['my-player']
            if 'winner' in game_state:
                break

            game = game_class(game_state['state'], game_state['status'], game_state['player'], player)
            if game.is_terminal():
                break

            if verbose:
                game.print_state()

            if game.current_player == player:
                if verbose:
                    print(f"🤔 Your turn (Player {player})...")
                try:
                    move = solver(game)
                    if verbose:
                        print(f"   Move: {move}")
                    client.make_move(match_id, player, move)
                except Exception as e:
                    print(f"❌ Error in solver: {e}")
                    import traceback
                    traceback.print_exc()
                    break
            else:
                if verbose:
                    print(f"⏳ Waiting for opponent (Player {game.current_player})...")
                time.sleep(2)

        if verbose:
            game.print_state()
            print("=" * 40)

        winner = game_state.get('winner')
        if winner == '-':
            if verbose:
                print("🤝 DRAW!")
            result = 'draw'
            draws += 1
        elif winner == player:
            if verbose:
                print("🎉 You WON!")
            result = 'win'
            wins += 1
        else:
            if verbose:
                print("😞 You LOST")
            result = 'loss'
            losses += 1

        all_results.append((result, player, winner))

        if verbose and num_games > 1:
            print(f"\n📊 Record: {wins}W - {losses}L - {draws}D")

    return {
        'wins': wins, 'losses': losses, 'draws': draws,
        'total_games': num_games,
        'win_rate': wins / num_games if num_games > 0 else 0,
        'player': player, 'match_id': match_id
    }, all_results

print("✅ play_game loaded")

✅ GameClient loaded
✅ play_game loaded


---
## Section 3: Game State Class

**Run this cell (no changes needed)**

In [3]:
class UnoGame:
    """Represents UNO game state."""

    def __init__(self, state: str, status: str, current_player: str, my_player: str):
        self.state_str = state
        self.status = status
        self.current_player = current_player
        self.my_player = my_player
        self._state = None

    @property
    def state(self) -> Dict:
        if self._state is None:
            self._state = json.loads(self.state_str)
        return self._state

    def get_my_hand(self) -> List[Dict]:
        return self.state['hands'].get(self.my_player, [])

    def get_hand_sizes(self) -> Dict[str, int]:
        hands = self.state.get('hands', {})
        return {player: len(hand) for player, hand in hands.items()}

    def get_current_color(self) -> str:
        return self.state.get('current_color', '')

    def get_top_card(self) -> Dict:
        discard_pile = self.state.get('discard_pile', [])
        return discard_pile[-1] if discard_pile else {}

    def get_discard_pile(self) -> List[Dict]:
        """Get the full discard pile (all cards that have been played)."""
        return self.state.get('discard_pile', [])

    def get_discard_pile_size(self) -> int:
        """Get the number of cards in the discard pile."""
        return len(self.state.get('discard_pile', []))

    def is_terminal(self) -> bool:
        return self.status == 'complete'

    def _can_play_card(self, card: Dict, top_card: Dict, current_color: str) -> bool:
        if not top_card:
            return True
        if card.get('type') == 'wild':
            return True
        if card.get('color') == current_color:
            return True
        if card.get('value') == top_card.get('value'):
            return True
        return False

    def get_valid_moves(self) -> List[Dict]:
        if self.current_player != self.my_player:
            return []

        hand = self.get_my_hand()
        top_card = self.get_top_card()
        current_color = self.get_current_color()
        valid_moves = []

        for i, card in enumerate(hand):
            if self._can_play_card(card, top_card, current_color):
                move = {'type': 'play', 'card_index': i, 'card': card, 'call_uno': len(hand) == 2}
                if card.get('type') == 'wild':
                    for color in ['red', 'blue', 'green', 'yellow']:
                        color_move = move.copy()
                        color_move['color_choice'] = color
                        valid_moves.append(color_move)
                else:
                    valid_moves.append(move)

        valid_moves.append({'type': 'draw', 'count': 1})
        return valid_moves

    def print_state(self):
        print(f"\n{'='*50}")
        print(f"Current Turn: Player {self.current_player}")
        print(f"Current Color: {self.get_current_color().upper()}")
        top_card = self.get_top_card()
        if top_card:
            print(f"Top Card: {top_card.get('color')} {top_card.get('value')}")
        hand_sizes = self.get_hand_sizes()
        print("\nHand Sizes:")
        for p in sorted(hand_sizes.keys()):
            if p != self.my_player:
                print(f"  Player {p}: {hand_sizes[p]} cards")
        my_hand = self.get_my_hand()
        print(f"\nYour Hand ({len(my_hand)} cards):")
        for i, card in enumerate(my_hand):
            print(f"  {i}: {card.get('color')} {card.get('value')}")
        print('='*50)

print("✅ UnoGame class loaded")

✅ UnoGame class loaded
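
As a rough illustration of the state that `UnoGame` parses, the sketch below builds a small JSON string with the three keys the class reads (`hands`, `current_color`, `discard_pile`) and derives hand sizes and the top card from it. The concrete values are invented for the example, and the real server payload may carry more fields.

```python
import json

# Invented example state; only the key names are taken from the UnoGame class.
state_str = json.dumps({
    'hands': {
        '1': [{'color': 'blue', 'value': '0', 'type': 'normal'}],
        '2': [{'color': 'red', 'value': '9', 'type': 'normal'},
              {'color': 'green', 'value': 'skip', 'type': 'action'}],
    },
    'current_color': 'blue',
    'discard_pile': [{'color': 'blue', 'value': '3', 'type': 'normal'}],
})

state = json.loads(state_str)

# Same derivations as get_hand_sizes() and get_top_card().
hand_sizes = {player: len(hand) for player, hand in state.get('hands', {}).items()}
discard = state.get('discard_pile', [])
top_card = discard[-1] if discard else {}

print(hand_sizes)  # {'1': 1, '2': 2}
print(top_card)
```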


---
## Section 4: Manual solver

**Run this cell (no changes needed)**

In [4]:
def manual_player_solver(game: UnoGame) -> Dict:
    """
    Interactive manual player - YOU choose your moves!
    """
    game.print_state()

    valid_moves = game.get_valid_moves()

    if not valid_moves:
        print("No valid moves!")
        return {'type': 'draw', 'count': 1}

    print(f"\n🎮 YOUR TURN (Player {game.my_player})!")
    print("\nValid moves:")

    move_list = []
    move_idx = 0

    # Display playable cards
    play_moves = [m for m in valid_moves if m.get('type') == 'play']
    if play_moves:
        print("\n🎴 Cards you can play:")
        for move in play_moves:
            card_idx = move.get('card_index')
            card = move.get('card', {})
            color = card.get('color', '?').upper()
            value = card.get('value', '?')

            if card.get('type') == 'wild':
                choice = move.get('color_choice', 'red').upper()
                print(f"  {move_idx}: Play WILD as {choice}")
            else:
                print(f"  {move_idx}: Play {color} {value} (card #{card_idx})")

            move_list.append(move)
            move_idx += 1

    # Display draw option
    print(f"\n  {move_idx}: Draw a card")
    move_list.append({'type': 'draw', 'count': 1})

    # Get player choice
    while True:
        try:
            choice = input(f"\nEnter move number (0-{len(move_list)-1}, or 'q' to quit): ").strip()

            if choice.lower() == 'q':
                raise KeyboardInterrupt()

            idx = int(choice)
            if 0 <= idx < len(move_list):
                selected_move = move_list[idx]

                # If it's a play move with a wild card, ask for color choice
                if selected_move.get('type') == 'play' and selected_move.get('card', {}).get('type') == 'wild':
                    print("\nWild card color choices:")
                    colors = ['red', 'blue', 'green', 'yellow']
                    for i, c in enumerate(colors):
                        print(f"  {i}: {c.upper()}")

                    while True:
                        try:
                            color_choice = input("Choose color (0-3): ").strip()
                            color_idx = int(color_choice)
                            if 0 <= color_idx < len(colors):
                                selected_move['color_choice'] = colors[color_idx]
                                break
                            else:
                                print(f"❌ Invalid choice! Enter 0-{len(colors)-1}")
                        except ValueError:
                            print("❌ Invalid input! Enter a number.")

                return selected_move
            else:
                print(f"❌ Invalid index! Choose 0-{len(move_list)-1}")

        except ValueError:
            print("❌ Invalid input! Enter a number or 'q' to quit.")
        except KeyboardInterrupt:
            print("\n👋 Thanks for playing!")
            raise

print("✅ Manual player solver loaded")

✅ Manual player solver loaded


---
## Section 5: YOUR SOLVER IMPLEMENTATION

**⭐ THIS IS WHERE YOU WRITE YOUR CODE! ⭐**

### Available Methods

```python
game.get_my_hand()                        # List of cards in your hand
game.get_hand_sizes()                     # Dict of player hand sizes
game.get_current_color()                  # Current active color
game.get_top_card()                       # Card on top of discard pile
game.get_discard_pile()                   # Full discard pile history (all played cards)
game.get_discard_pile_size()              # Number of cards in discard pile
game.get_valid_moves()                    # All valid moves you can make
game.is_terminal()                        # Whether game is finished
game.print_state()                        # Print current game state
```

### Move Format
- Play a card: `{'type': 'play', 'card_index': i, 'card': {...}, 'color_choice': 'red'}`
  - `card_index`: Index of card in your hand (0-indexed)
  - `color_choice`: Only required for Wild cards (one of: 'red', 'blue', 'green', 'yellow')
- Draw a card: `{'type': 'draw', 'count': 1}`
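
As a minimal sketch of the move format above, the helpers below build a play move and a draw move from a hand. The names `make_play_move` and `make_draw_move` are hypothetical (they are not part of the provided client). The `call_uno` flag mirrors what `get_valid_moves()` attaches, and the wild card dict is inferred from the notebook's printed output rather than from a documented schema.

```python
from typing import Dict, List, Optional

COLORS = ('red', 'blue', 'green', 'yellow')

def make_play_move(hand: List[Dict], card_index: int,
                   color_choice: Optional[str] = None) -> Dict:
    """Build a 'play' move for the card at card_index (hypothetical helper)."""
    card = hand[card_index]
    move = {
        'type': 'play',
        'card_index': card_index,
        'card': card,
        # Mirrors get_valid_moves(): flag UNO when this play leaves one card.
        'call_uno': len(hand) == 2,
    }
    if card.get('type') == 'wild':
        if color_choice not in COLORS:
            raise ValueError("Wild cards need a color_choice from COLORS")
        move['color_choice'] = color_choice
    return move

def make_draw_move() -> Dict:
    """Build the 'draw' move."""
    return {'type': 'draw', 'count': 1}

hand = [
    {'color': 'red', 'value': '5', 'type': 'normal'},
    # Assumed shape of a wild card, based on the "wild wild" line in the output.
    {'color': 'wild', 'value': 'wild', 'type': 'wild'},
]
normal_move = make_play_move(hand, 0)
wild_move = make_play_move(hand, 1, color_choice='blue')
print(normal_move)
print(wild_move)
print(make_draw_move())
```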

### Card Information
Card dict contains: `{'color': 'red', 'value': '5', 'type': 'normal'}`
- Colors: 'red', 'blue', 'green', 'yellow'
- Values: '0'-'9', 'skip', 'reverse', 'draw2'
- Types: 'normal', 'action', 'wild'
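
To illustrate how these card fields drive legality, here is a standalone sketch that applies the same checks, in the same order, as `UnoGame._can_play_card` from Section 3. The function names `can_play` and `playable_indices` are made up for this example.

```python
from typing import Dict, List

def can_play(card: Dict, top_card: Dict, current_color: str) -> bool:
    """Same rule order as UnoGame._can_play_card in this notebook."""
    if not top_card:
        return True                      # empty discard pile: anything goes
    if card.get('type') == 'wild':
        return True                      # wilds are always playable
    if card.get('color') == current_color:
        return True                      # color match
    return card.get('value') == top_card.get('value')  # number/symbol match

def playable_indices(hand: List[Dict], top_card: Dict, current_color: str) -> List[int]:
    """Indices of the cards in hand that can be played right now."""
    return [i for i, c in enumerate(hand) if can_play(c, top_card, current_color)]

top = {'color': 'blue', 'value': '3', 'type': 'normal'}
hand = [
    {'color': 'blue', 'value': '0', 'type': 'normal'},
    {'color': 'yellow', 'value': '6', 'type': 'normal'},
    {'color': 'red', 'value': '3', 'type': 'normal'},
    {'color': 'wild', 'value': 'wild', 'type': 'wild'},
]
print(playable_indices(hand, top, 'blue'))  # [0, 2, 3]
```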

### Strategic Tips
- **Discard Pile History**: Use `game.get_discard_pile()` to see all cards that have been played. This allows you to implement card-counting strategies!
- **Example**: Count which high cards are still in the deck vs. already played to make better decisions
- **Hand Tracking**: Use `game.get_hand_sizes()` to track how many cards opponents have
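
The card-counting tip can be sketched as follows. This assumes the server deals from a standard 108-card UNO deck (per color: one 0, two each of 1-9, Skip, Reverse, and Draw 2), which the notebook does not confirm. Wild cards are left out because their representation in the discard pile is not documented here. The function names are hypothetical.

```python
from collections import Counter
from typing import Dict, List

COLORS = ('red', 'blue', 'green', 'yellow')

def full_colored_deck() -> Counter:
    """Colored cards of a standard UNO deck (assumed, not confirmed for the server)."""
    deck = Counter()
    for color in COLORS:
        deck[(color, '0')] = 1
        for value in [str(n) for n in range(1, 10)] + ['skip', 'reverse', 'draw2']:
            deck[(color, value)] = 2
    return deck

def unseen_colored_cards(discard_pile: List[Dict], my_hand: List[Dict]) -> Counter:
    """Subtract every card we can see (discard pile + our hand) from the full deck."""
    remaining = full_colored_deck()
    for card in list(discard_pile) + list(my_hand):
        key = (card.get('color'), str(card.get('value')))
        if remaining[key] > 0:   # unknown keys (e.g. wilds) count as 0 and are skipped
            remaining[key] -= 1
    return remaining

def unseen_by_color(remaining: Counter) -> Dict[str, int]:
    """Total unseen cards per color."""
    totals = {c: 0 for c in COLORS}
    for (color, _value), n in remaining.items():
        totals[color] += n
    return totals

discard = [{'color': 'blue', 'value': '3'}, {'color': 'blue', 'value': '8'}]
hand = [{'color': 'blue', 'value': '0'}, {'color': 'red', 'value': '9'}]
totals = unseen_by_color(unseen_colored_cards(discard, hand))
print(totals)  # {'red': 24, 'blue': 22, 'green': 25, 'yellow': 25}
```

Unseen cards are split between the draw pile and opponents' hands, so these totals are an upper bound on what opponents hold, not an exact count.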

In [5]:
# This dictionary defines a numeric weight for every heuristic.
# A weight expresses how important a rule is in the decision-making process.
# This is a utility-based agent design: the agent is rational because it always plays the move with the highest utility.

W = {
    # T0: Hard constraints (±500)
    "HARD_WILD_COLOR_LOCK": 500.0,
    "HARD_UNO_FINISH": 400.0,

    # T1: Critical heuristics (±100)
    "CRIT_THREAT_DRAW4": 120.0,
    "CRIT_THREAT_DRAW2": 100.0,
    "CRIT_THREAT_SKIP_REVERSE": 90.0,
    "CRIT_THREAT_WILD": 80.0,
    "CRIT_DEFENCE_NO_DEF_LEFT": -80.0,

    # T2: Strong preferences (±30)
    "STRONG_ACTION_GENERAL": 25.0,
    "STRONG_ENDGAME_ACTION": 20.0,
    "STRONG_WILD_BEST_COLOR": 20.0,
    "STRONG_WILD_BAD_COLOR": -15.0,

    # T3: Soft preferences (±10)
    "SOFT_COLOR_DIVERSITY_LOSS": -12.0,
    "SOFT_MONOCHROME_HAND": -20.0,
    "SOFT_DISCARD_DOM_COLOR": 8.0,
    "SOFT_KEEP_STREAK_BREAK": -8.0,
    "SOFT_KEEP_STREAK_HELP": 5.0,
    "SOFT_LARGE_HAND_NUMBER": 5.0,

    # T4: Micro-signals (±3–5)
    "MICRO_TEMPO_SAFE": 5.0,
    "MICRO_TEMPO_NUMBER": 3.0,
    "MICRO_PLAY_BIAS": 0.5,
}
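
To make the role of a weight table concrete, here is a minimal sketch of utility-based selection: each candidate move lists the rules it triggers, its utility is the sum of the matching weights, and the agent picks the maximum. The `DEMO_W` table, the rule names, and the candidate moves are made-up illustrations, not the agent's real scoring.

```python
from typing import Dict, List

# Hypothetical, simplified weight table for illustration only.
DEMO_W = {"FINISH": 400.0, "THREAT_DRAW2": 100.0, "ACTION": 25.0, "PLAY_BIAS": 0.5}

def utility(fired_rules: List[str], weights: Dict[str, float]) -> float:
    """Utility of a move = sum of the weights of the rules it triggers."""
    return sum(weights[rule] for rule in fired_rules)

# Each candidate move with the rules it would trigger.
candidates = {
    "blue 5": ["PLAY_BIAS"],
    "blue draw2": ["THREAT_DRAW2", "ACTION", "PLAY_BIAS"],
    "blue skip": ["ACTION", "PLAY_BIAS"],
}

scores = {name: utility(rules, DEMO_W) for name, rules in candidates.items()}
best = max(scores, key=scores.get)   # argmax over candidate moves
print(scores)
print(best)  # blue draw2
```

Because the utilities are plain sums, a rule in a higher tier only dominates lower tiers if its weight exceeds what the lower-tier rules can add up to together.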


In [6]:
# Helper Functions (refactored for clarity + AI-style structure)
from typing import List, Dict, Optional
from dataclasses import dataclass


# This function counts how many cards of each color we're holding.
# Purpose: helps the agent know its strongest color, possible risks, choose the best wild color.
def get_hand_color_counts(hand: List[Dict]) -> Dict[str, int]:
    counts = {'red': 0, 'green': 0, 'blue': 0, 'yellow': 0}
    for card in hand:
        if card.get('type') != 'wild' and card.get('color') in counts:
            counts[card['color']] += 1
    return counts

# Returns the color we have the most.
# Used for selecting colors for wild cards.
def get_best_wild_color_choice(hand_color_counts: Dict[str, int]) -> str:
    best_color = 'red'
    max_count = -1
    for color, count in hand_color_counts.items():
        if count > max_count:
            max_count = count
            best_color = color
    return best_color

# Counts the colors of cards already played (cards that appeared in the discard pile).
# Reasoning: used as a proxy for opponent color preferences (so we can avoid them).
def get_discard_color_counts(discard_pile: List[Dict]) -> Dict[str, int]:
    counts = {'red': 0, 'green': 0, 'blue': 0, 'yellow': 0}
    for card in discard_pile:
        c = card.get('color')
        if c in counts:
            counts[c] += 1
    return counts

# Compact representation of the strategic game state.
# Here we reduce the complex world to a structured set of features for decision making.
# It includes:
# - our hand,
# - opponent hand sizes,
# - opponent danger levels (UNO or low cards),
# - color distribution in our hand,
# - best color for wild cards,
# - color trap risk (too many cards of one color),
# - discard statistics (dominant color),
# - next player danger level,
# - current color of the game.
@dataclass
class GameFeatures:
    my_hand: List[Dict]
    hand_sizes: Dict[str, int]
    opponent_min_cards: int
    any_opponent_uno: bool
    any_opponent_low: bool
    color_counts: Dict[str, int]
    best_wild_color: str
    color_trap_risk: bool
    dominant_color: Optional[str]
    discard_color_counts: Dict[str, int]
    discard_dominant_color: Optional[str]
    next_player_is_danger: bool
    current_color: str

# This function extracts and computes all strategic information from the raw game state.
# It turns the low-level environment into a "belief state" for the AI, matching the course concepts:
# - abstraction,
# - limited lookahead,
# - opponent modeling,
# - partial observability.
#
# Includes:
# - opponents' card counts (to detect threats),
# - our color counts (strength/weakness),
# - dominant discard color (environment trend),
# - color trap risk,
# - whether the next player is dangerous,
# - what the current color is.

def compute_game_features(game: UnoGame) -> GameFeatures:
    my_hand = game.get_my_hand()
    hand_sizes = game.get_hand_sizes()

    opponent_counts = [count for p, count in hand_sizes.items() if p != game.my_player]
    opponent_min_cards = min(opponent_counts) if opponent_counts else 99
    any_opponent_uno = any(count == 1 for p, count in hand_sizes.items() if p != game.my_player)
    any_opponent_low = any(count <= 2 for p, count in hand_sizes.items() if p != game.my_player)

    color_counts = get_hand_color_counts(my_hand)
    best_wild_color = get_best_wild_color_choice(color_counts)

    color_trap_risk = False
    dominant_color = None
    if len(my_hand) <= 6:
        for c, cnt in color_counts.items():
            if cnt >= len(my_hand) * 0.6:
                color_trap_risk = True
                dominant_color = c
                break

    discard_pile = game.get_discard_pile()
    discard_color_counts = get_discard_color_counts(discard_pile)
    discard_dominant_color = None
    if discard_pile:
        discard_dominant_color = max(discard_color_counts, key=lambda c: discard_color_counts[c])

    turn_order = ['1', '2', '3', '4']
    next_player_is_danger = False
    if game.my_player in turn_order:
        my_idx = turn_order.index(game.my_player)
        next_player = turn_order[(my_idx + 1) % 4]
        next_player_cards = hand_sizes.get(next_player, 99)
        next_player_is_danger = next_player_cards <= 2

    current_color = game.get_current_color()

    return GameFeatures(
        my_hand=my_hand,
        hand_sizes=hand_sizes,
        opponent_min_cards=opponent_min_cards,
        any_opponent_uno=any_opponent_uno,
        any_opponent_low=any_opponent_low,
        color_counts=color_counts,
        best_wild_color=best_wild_color,
        color_trap_risk=color_trap_risk,
        dominant_color=dominant_color,
        discard_color_counts=discard_color_counts,
        discard_dominant_color=discard_dominant_color,
        next_player_is_danger=next_player_is_danger,
        current_color=current_color,
    )

# This function simulates how our hand will look after a given move.
# This is a one-step lookahead heuristic (standard in decision-making agents).
# We do not simulate the entire game tree because it would be too costly; we only compute the next hand state.
def simulate_future_hand_after_move(my_hand: List[Dict], move: Dict) -> List[Dict]:
    if move.get('type') != 'play':
        return my_hand[:]

    future = my_hand.copy()
    card = move['card']

    try:
        future.remove(card)
    except ValueError:
        idx = move.get('card_index')
        if idx is not None and 0 <= idx < len(future):
            future.pop(idx)

    return future

# Identify wild cards.
def is_wild_card(card: Dict) -> bool:
    return card.get('type') == 'wild'

# Identify action cards: skip, reverse, draw2, wild_draw4.
# These cards have a disruptive impact on the game, which is important for adversarial reasoning.
def is_action_card(card: Dict) -> bool:
    t = card.get('type')
    v = str(card.get('value', ''))
    return (t == 'action') or (t == 'wild' and 'draw4' in v)

# Estimates the probability that opponents are strong in a given color.
# This is based on the colors that have appeared in the discard pile.
# It is not exact because the game is partially observable, but it serves as a proxy for belief.
def estimate_opponent_color_probability(color: str, features: GameFeatures) -> float:
    if not color:
        return 0.25
    total_seen = sum(features.discard_color_counts.values())
    if total_seen == 0:
        return 0.25
    return features.discard_color_counts.get(color, 0) / total_seen


# The following functions are heuristic modules.

# If we hold zero cards of the current color and the move plays a wild, give a BIG REWARD.
# Reasoning: being color-locked is dangerous, and playing a wild fixes that state.
# This is a HARD constraint → big reward.
def H_color_lock(move, features, is_wild):
    if not features.current_color:
        return 0.0
    locked = features.color_counts.get(features.current_color, 0) == 0
    if locked and is_wild:
        return W["HARD_WILD_COLOR_LOCK"]
    return 0.0

# If the move leaves us with one card (UNO), give a HUGE reward: we are one play away from winning (primary goal --> strong utility signal).
def H_uno_finish(move):
    return W["HARD_UNO_FINISH"] if move.get('call_uno') else 0.0

# Threat mode: some opponent has two or fewer cards.
# - Draw 2, Draw 4, Skip, Reverse, and Wild earn strong rewards,
# - changing away from the current color earns a reward scaled by how often that color appears in the discard pile.
# Additionally, if an opponent is at UNO:
# - strong reward for disruptive moves,
# - smaller reward for plain wilds,
# - penalty for weak (number) plays.
def H_threat_mode(move, features, is_wild, is_draw4, value, card_color):
    score = 0.0
    low = features.opponent_min_cards <= 2

    if low:
        if is_draw4: score += W["CRIT_THREAT_DRAW4"]
        if value == 'draw2': score += W["CRIT_THREAT_DRAW2"]
        if value in ['skip', 'reverse']: score += W["CRIT_THREAT_SKIP_REVERSE"]
        if is_wild and not is_draw4: score += W["CRIT_THREAT_WILD"]

        p_danger = estimate_opponent_color_probability(features.current_color, features)
        if card_color and card_color != features.current_color:
            score += 50.0 * p_danger

    if features.any_opponent_uno:
        if value in ['draw2', 'skip', 'reverse'] or is_draw4:
            score += 40.0
        elif is_wild:
            score += 20.0
        else:
            score -= 30.0

    return score

# Choosing our best color = reward.
# Choosing any other color = penalty.
# Choosing a color we barely hold (wasted wild) = penalty.
# Choosing the trap color = penalty.
# Burning a plain wild too early when a non-wild play exists = penalty.
# This is resource optimization: wild cards are powerful resources.
def H_wild_usage(move, features, is_wild, is_draw4, game: UnoGame):
    if not is_wild:
        return 0.0

    score = 0.0
    chosen_color = move.get('color_choice')
    if not chosen_color:
        return 0.0

    if chosen_color == features.best_wild_color:
        score += W["STRONG_WILD_BEST_COLOR"]
    else:
        score += W["STRONG_WILD_BAD_COLOR"]

    if features.color_counts.get(chosen_color, 0) <= 1:
        score -= 10.0

    if features.color_trap_risk and features.dominant_color == chosen_color:
        score -= 10.0

    if not is_draw4:
        has_non_wild_play = any(
            (m.get('type') == 'play' and m['card'].get('type') != 'wild')
            for m in game.get_valid_moves()
        )
        if has_non_wild_play and len(features.my_hand) > 3:
            score -= 20.0

    return score

# Small reward for action cards in general, because they give control.
def H_action_general(is_action):
    return W["STRONG_ACTION_GENERAL"] if is_action else 0.0

# When the hand has very few cards, action cards become more valuable, and playing a color we hold few of earns a larger bonus.
def H_endgame(move, features, is_action, card_color, future_hand):
    score = 0.0
    if len(features.my_hand) <= 4:
        if is_action:
            score += W["STRONG_ENDGAME_ACTION"]
        if card_color in features.color_counts:
            count = features.color_counts[card_color]
            score += 8.0 / (count + 0.5)
    return score

# Penalize ending up with a single color, because it makes us predictable and vulnerable.
def H_color_diversity(features, future_hand):
    score = 0.0
    future_colors = get_hand_color_counts(future_hand)

    nonzero = sum(cnt > 0 for cnt in future_colors.values())
    if nonzero == 1 and len(future_hand) > 1:
        score += W["SOFT_MONOCHROME_HAND"]

    if len(features.my_hand) <= 6:
        before = sum(cnt > 0 for cnt in features.color_counts.values())
        after = sum(cnt > 0 for cnt in future_colors.values())
        if after < before:
            score += W["SOFT_COLOR_DIVERSITY_LOSS"]

    return score

# If the opponents are low and our future hand has no defence = large penalty.
# Reasoning: we will be unable to respond to attacks when needed.
def H_future_defence(features, future_hand):
    if not features.any_opponent_low:
        return 0.0

    future_actions = sum(1 for c in future_hand if c.get('type') == 'action')
    future_wilds = sum(1 for c in future_hand if c.get('type') == 'wild')

    if future_actions == 0 and future_wilds == 0:
        return W["CRIT_DEFENCE_NO_DEF_LEFT"]
    return 0.0

# Streak = staying in the same color across multiple turns.
# Playing our last card of the current color ends the streak.
# Doing so while we still hold more than three cards is risky, so it gets a small penalty.
def H_keep_streak(move, features, card_color, future_hand):
    score = 0.0
    if not features.current_color or card_color != features.current_color:
        return 0.0

    before = sum(1 for c in features.my_hand if c.get('color') == features.current_color)
    after = sum(1 for c in future_hand if c.get('color') == features.current_color)

    if before > 0 and after == 0 and len(features.my_hand) > 3:
        score += W["SOFT_KEEP_STREAK_BREAK"]

    return score

# Follow the color trend if we have a mid-sized hand.
def H_discard_correlation(card_color, features):
    if features.discard_dominant_color is None:
        return 0.0
    if 4 < len(features.my_hand) <= 8 and card_color == features.discard_dominant_color:
        return W["SOFT_DISCARD_DOM_COLOR"]
    return 0.0

# If the next opponent is dangerous:
# - reward action cards,
# - penalize weak number plays,
# - penalize wild color choices we hold fewer than two cards of.
def H_next_player_danger(is_action, is_wild, move, features):
    score = 0.0
    if not features.next_player_is_danger:
        return 0.0

    if is_action:
        score += 20.0
    elif not is_wild:
        score -= 10.0

    if is_wild:
        chosen_color = move.get('color_choice')
        if chosen_color and features.color_counts.get(chosen_color, 0) < 2:
            score -= 5.0

    return score

# Fast play when safe, slower play when opponents are dangerous.
def H_tempo(move, is_action, is_wild, features, card_color):
    score = 0.0

    if not is_wild:
        score += W["MICRO_TEMPO_NUMBER"]

    if len(features.my_hand) > 6 and (not is_action) and (not is_wild):
        score += W["SOFT_LARGE_HAND_NUMBER"]

    if not features.any_opponent_low:
        score += W["MICRO_TEMPO_SAFE"]
        if not is_wild and not is_action:
            score += W["MICRO_TEMPO_NUMBER"]
    else:
        if is_action or is_wild:
            score += 2.0

    if card_color in features.color_counts:
        cnt = features.color_counts[card_color]
        score += 8.0 / (cnt + 0.5)

    return score

# Master evaluation function:
# For each move:
# 1. Compute the heuristic scores.
# 2. Sum the weighted results.
# 3. Return a single utility value.
# Reasoning: this is the core of utility-based reasoning; every move becomes a number, and the agent chooses the highest.
def score_move(move, game: UnoGame, features: GameFeatures) -> float:
    my_hand = features.my_hand
    card = move['card']
    value = str(card.get('value', ''))
    card_color = card.get('color', '')
    is_wild = is_wild_card(card)
    is_action = is_action_card(card)
    is_draw4 = is_wild and 'draw4' in value  # same substring test as is_action_card

    future_hand = simulate_future_hand_after_move(my_hand, move)

    score = 0.0
    score += H_color_lock(move, features, is_wild)
    score += H_uno_finish(move)
    score += H_threat_mode(move, features, is_wild, is_draw4, value, card_color)
    score += H_future_defence(features, future_hand)

    score += H_wild_usage(move, features, is_wild, is_draw4, game)

    score += H_action_general(is_action)
    score += H_endgame(move, features, is_action, card_color, future_hand)
    score += H_color_diversity(features, future_hand)
    score += H_discard_correlation(card_color, features)
    score += H_keep_streak(move, features, card_color, future_hand)
    score += H_next_player_danger(is_action, is_wild, move, features)
    score += H_tempo(move, is_action, is_wild, features, card_color)

    score += W["MICRO_PLAY_BIAS"]

    return score

# Brain that controls our turn:
# 1. Get valid moves.
# 2. If no card can be played, draw a card.
# 3. Compute features.
# 4. Score every move.
# 5. Select the highest-scoring one.
# This is the argmax agent --> rational_action = argmax_over_moves(Utility(move))
def my_agent(game: UnoGame) -> Dict:
    valid_moves = game.get_valid_moves()
    play_moves = [m for m in valid_moves if m.get('type') == 'play']

    draw_move = next(
        (m for m in valid_moves if m.get('type') == 'draw'),
        {'type': 'draw', 'count': 1}
    )

    if not play_moves:
        return draw_move

    features = compute_game_features(game)

    best_move = None
    best_score = -1e9

    for move in play_moves:
        score = score_move(move, game, features)
        if score > best_score:
            best_score = score
            best_move = move

    return best_move if best_move is not None else draw_move
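
`compute_game_features` looks up the next player in a fixed `['1', '2', '3', '4']` order, so it does not follow a change of direction after a Reverse. A minimal sketch of a direction-aware lookup is shown below. It assumes a hypothetical `direction` value of `1` (forward) or `-1` (reversed); the notebook does not confirm that the server state exposes such a field, and the function names are made up for this example.

```python
from typing import Dict, List

def next_player_id(my_player: str, turn_order: List[str], direction: int = 1) -> str:
    """Player who moves after us, given the play direction (1 or -1)."""
    idx = turn_order.index(my_player)
    return turn_order[(idx + direction) % len(turn_order)]

def next_player_is_danger(my_player: str, hand_sizes: Dict[str, int],
                          turn_order: List[str], direction: int = 1,
                          threshold: int = 2) -> bool:
    """True if the next player holds `threshold` cards or fewer."""
    nxt = next_player_id(my_player, turn_order, direction)
    return hand_sizes.get(nxt, 99) <= threshold

order = ['1', '2', '3', '4']
sizes = {'1': 5, '2': 6, '3': 4, '4': 2}
print(next_player_id('1', order, 1))                 # 2
print(next_player_id('1', order, -1))                # 4
print(next_player_is_danger('1', sizes, order, 1))   # False
print(next_player_is_danger('1', sizes, order, -1))  # True
```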


---
## Section 6: Play the Game!

**Update STUDENT_TOKEN below and run to play**

You can choose which solver to use:
- `my_agent` - Your AI implementation (default)
- `manual_player_solver` - Interactive manual play

In [7]:
STUDENT_TOKEN = 'YOUR-NAME'
SOLVER = my_agent  # Change to manual_player_solver to play manually!
MULTIPLAYER = False
MATCH_ID = None
NUM_GAMES = 1

try:
    stats, results = play_game(
        solver=SOLVER,
        base_url=BASE_URL,
        token=STUDENT_TOKEN,
        game_type='uno4',
        game_class=UnoGame,
        multiplayer=MULTIPLAYER,
        num_games=NUM_GAMES,
        match_id=MATCH_ID,
        verbose=True
    )

    print("\n📊 Summary:")
    print(f"   Record: {stats['wins']}W - {stats['losses']}L - {stats['draws']}D")
    print(f"   Win Rate: {stats['win_rate']*100:.1f}%")

except Exception as e:
    print(f"❌ Game error: {e}")
    import traceback
    traceback.print_exc()

🎮 Creating new match: 1 x uno4
   Match ID: 1535
🔗 Joining match 1535...
   You are player: 1

🎮 GAME 1/1


Current Turn: Player 3
Current Color: BLUE
Top Card: blue 3

Hand Sizes:
  Player 1: 7 cards
  Player 2: 7 cards
  Player 4: 7 cards

Your Hand (7 cards):
  0: blue 0
  1: yellow 6
  2: red 9
  3: blue 2
  4: wild wild
  5: blue 5
  6: blue reverse
🤔 Your turn (Player 3)...
   Move: {'type': 'play', 'card_index': 6, 'card': {'color': 'blue', 'value': 'reverse', 'type': 'action'}, 'call_uno': False}

Current Turn: Player 1
Current Color: BLUE
Top Card: blue 8

Hand Sizes:
  Player 1: 7 cards
  Player 2: 6 cards
  Player 4: 7 cards

Your Hand (6 cards):
  0: blue 0
  1: yellow 6
  2: red 9
  3: blue 2
  4: wild wild
  5: blue 5
⏳ Waiting for opponent (Player 1)...

Current Turn: Player 3
Current Color: RED
Top Card: red 8

Hand Sizes:
  Player 1: 6 cards
  Player 2: 6 cards
  Player 4: 6 cards

Your Hand (6 cards):
  0: blue 0
  1: yellow 6
  2: red 9
  3: blue 2
  4: