# Intelligent Systems 2022: 4th practical assignment

## The Schnapsen Engine


Your name: Sebastião Manuel Inácio Rosalino

Your VUnetID: sxx209

If you do not provide your name and VUnetID we will not accept your submission. 

### Learning objectives
At the end of this exercise you should be able to use the Schnapsen platform, run basic games between agents, and run tournaments in order to evaluate rational agents (also called bots). 

1. Understand the main functionality of the Schnapsen platform (playing games and running tournaments)
2. Implement your own rational agents (bots)

### Practicalities

Follow this Notebook step-by-step. 

Of course, you can do the exercises in any Programming Editor of your liking. But you do not have to. Feel free to simply write code in the Notebook. Please use your studentID+Assignment4.ipynb as the name of the Notebook.  

Note: unlike the courses dedicated to programming we will not evaluate the style of the programs. But we will, however, test your programs on other data that we provide, and your program should give the correct output to the test-data as well.

As was mentioned, the assignment is graded as pass/fail. To pass you need to have either fully working code or an explanation of what you tried and what didn't work for the tasks that you were unable to complete (you can use multi-line comments or a text cell).

## Initialising 

First, let us make sure the necessary packages are installed, and imported. Run the following code:

In [1]:
import sys, random

from api import State, engine, util

## Playing the first games

The basic engine comes with three basic bots: rand, bully and rdeep (the rest you can ignore for now). To try them out, just run the following bit of code. 

In [2]:
# Choose your first player
player1 = "rand"
player2 = "bully"
# Decide in which phase you want to start the game. 
startphase = 1
# Decide whether you want verbose output or not 
verbose=True 

#And here you run a game on the engine. 
engine.play(util.load_player(player1),util.load_player(player2), state=State.generate(phase=startphase), max_time=10000, verbose=verbose)

player1: <bots.rand.rand.Bot object at 0x0000027AC8F1FA00>
player2: <bots.bully.bully.Bot object at 0x0000027AC8F1FDF0>
*   Player 2 plays: 10H
The game is in phase: 1
Player 1's points: 0, pending: 0
Player 2's points: 0, pending: 0
The trump suit is: H
Player 1's hand: 10C 10D 10S QS JS
Player 2's hand: AC JC JD 10H KH
There are 10 cards in the stock
Player 2 has played card: 10 of H

*   Player 1 plays: QS
The game is in phase: 1
Player 1's points: 0, pending: 0
Player 2's points: 13, pending: 0
The trump suit is: H
Player 1's hand: 10C 10D KD 10S JS
Player 2's hand: AC JC JD KH QH
There are 8 cards in the stock

*   Player 2 plays: KH
The game is in phase: 1
Player 1's points: 0, pending: 0
Player 2's points: 13, pending: 0
The trump suit is: H
Player 1's hand: 10C 10D KD 10S JS
Player 2's hand: AC JC JD KH QH
There are 8 cards in the stock
Player 2 has played card: K of H

*   Player 1 plays: KD
The game is in phase: 1
Player 1's points: 0, pending: 0
Player 2's points: 21, pendin

(2, 3)

Running engine.play provides some textual output to show how the game is progressing. At every ply (half a turn) you will see what move the player made and a concise overview of the board. Something like this (when you run a game in verbose mode):

> Player 1 plays: KC<br>
> The game is in phase: 1<br>
> Player 1's points: 21, pending: 0<br>
> Player 2's points: 25, pending: 0<br>
> Player 1's hand: 10C JC AD QD QH<br>
> Player 2's hand: AC KD 10H KS QS<br>
> There are 2 cards in the stock<br>


The first line signifies that player 1 has played the King of Clubs card. Internally these cards are represented by indices from 0 to 19. To make the translation between indices and card names, use util.get_card_name(index), which returns the rank and the suit of the card as a tuple. Alternatively, use util.get_rank(index) and util.get_suit(index) for each property alone. In this case it is a King of Clubs. Have a look at the GitHub readme or at the top of __deck.py to see the convention used for encoding cards into indices.

In [3]:
util.get_card_name(2)


('K', 'C')
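
The index convention can be sketched in a few lines. This is only an illustration, not the platform's code: it assumes that ranks are ordered A, 10, K, Q, J within a suit and that suits are ordered C, D, H, S. That ordering agrees with the output above (index 2 is the King of Clubs) but should be checked against the deck file. The name `card_name` is hypothetical.

```python
# Assumed ordering; verify against the platform's deck file.
RANKS = ["A", "10", "K", "Q", "J"]
SUITS = ["C", "D", "H", "S"]


def card_name(index):
    """Return (rank, suit) for a card index in 0..19."""
    if not 0 <= index < len(RANKS) * len(SUITS):
        raise ValueError("card index must be between 0 and 19")
    # Five consecutive indices share a suit; the remainder picks the rank.
    return RANKS[index % len(RANKS)], SUITS[index // len(RANKS)]


print(card_name(2))  # prints ('K', 'C')
```

Under these assumptions, the remainder after division by 5 gives the rank and the integer quotient gives the suit.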

You can also run the Python programmes provided from the command line, or in your favourite editor. <br>
Run 
> python play.py -1 rand -2 bully 

to run the rand bot against the bully bot, or 
> python play.py -h 

to see other options. 




There is a lot of randomness involved in the game when the cards are distributed to the players and the pile. To get an accurate sense of whether one player is better than another, you'll need to play a number of different games. The following code will play a tournament between bully and rand where every pair of participants plays 10 matches. 


In [4]:
botnames = []
verbose = False 
myphase = 1
myrepeats = 10

# Create player 1
player1 = util.load_player("rand") 
player2 = util.load_player("bully") 

bots = [player1,player2]

n = len(bots)
wins = [0] * len(bots)
matches = [(p1, p2) for p1 in range(n) for p2 in range(n) if p1 < p2]

totalgames = (n*n - n)/2 * myrepeats
playedgames = 0

print('Playing {} games:'.format(int(totalgames)))
for a, b in matches:
    for r in range(myrepeats):

        if random.choice([True, False]):
            p = [a, b]
        else:
            p = [b, a]

        # Generate a state with a random seed
        state = State.generate(phase=myphase)

        winner, score = engine.play(bots[p[0]], bots[p[1]], state, 1000, verbose, True)

        if winner is not None:
            winner = p[winner - 1]
            wins[winner] += score

        playedgames += 1
        print('Played {} out of {:.0f} games ({:.0f}%): {} \r'.format(playedgames, totalgames, playedgames/float(totalgames) * 100, wins))

print('Results:')
for i in range(len(bots)):
    print('    bot {}: {} points'.format(bots[i], wins[i]))

Playing 10 games:
Played 1 out of 10 games (10%): [1, 0] 
Played 2 out of 10 games (20%): [1, 3] 
Played 3 out of 10 games (30%): [1, 5] 
Played 4 out of 10 games (40%): [2, 5] 
Played 5 out of 10 games (50%): [2, 8] 
Played 6 out of 10 games (60%): [2, 9] 
Played 7 out of 10 games (70%): [2, 10] 
Played 8 out of 10 games (80%): [3, 10] 
Played 9 out of 10 games (90%): [5, 10] 
Played 10 out of 10 games (100%): [5, 12] 
Results:
    bot <bots.rand.rand.Bot object at 0x0000027AC8C04790>: 5 points
    bot <bots.bully.bully.Bot object at 0x0000027AC8C04730>: 12 points


### Task 1: 
The previous code runs a tournament between rand and bully, but you can adapt the script by testing the performance of these bots with the third default bot, rdeep, as the opponent. The general idea of rdeep was extensively discussed under the header PIMC (Perfect Information Monte Carlo Sampling). Report in the following Cell on the results you get from two-player tournaments including rdeep, rand and bully (rdeep vs. rand; rdeep vs. bully). Describe which games you played, and who won. 

*Hint: You only have to add one single line of code.*


In [7]:
# Rdeep vs Rand

botnames = []
verbose = False 
myphase = 1
myrepeats = 10

# Create player 1
player1 = util.load_player("rand") 
player2 = util.load_player("rdeep")

bots = [player1,player2]

n = len(bots)
wins = [0] * len(bots)
matches = [(p1, p2) for p1 in range(n) for p2 in range(n) if p1 < p2]

totalgames = (n*n - n)/2 * myrepeats
playedgames = 0

print('Playing {} games:'.format(int(totalgames)))
for a, b in matches:
    for r in range(myrepeats):

        if random.choice([True, False]):
            p = [a, b]
        else:
            p = [b, a]

        # Generate a state with a random seed
        state = State.generate(phase=myphase)

        winner, score = engine.play(bots[p[0]], bots[p[1]], state, 1000, verbose, True)

        if winner is not None:
            winner = p[winner - 1]
            wins[winner] += score

        playedgames += 1
        print('Played {} out of {:.0f} games ({:.0f}%): {} \r'.format(playedgames, totalgames, playedgames/float(totalgames) * 100, wins))

print('Results:')
for i in range(len(bots)):
    print('    bot {}: {} points'.format(bots[i], wins[i]))

Playing 10 games:
Played 1 out of 10 games (10%): [0, 2] 
Played 2 out of 10 games (20%): [0, 4] 
Played 3 out of 10 games (30%): [0, 7] 
Played 4 out of 10 games (40%): [0, 9] 
Played 5 out of 10 games (50%): [0, 10] 
Played 6 out of 10 games (60%): [0, 12] 
Played 7 out of 10 games (70%): [0, 13] 
Played 8 out of 10 games (80%): [0, 15] 
Played 9 out of 10 games (90%): [0, 17] 
Played 10 out of 10 games (100%): [0, 19] 
Results:
    bot <bots.rand.rand.Bot object at 0x0000027AC8B7FD60>: 0 points
    bot <bots.rdeep.rdeep.Bot object at 0x0000027AC8B7E950>: 19 points


In [8]:
# Rdeep vs Bully

botnames = []
verbose = False 
myphase = 1
myrepeats = 10

# Create player 1
player1 = util.load_player("bully") 
player2 = util.load_player("rdeep")

bots = [player1,player2]

n = len(bots)
wins = [0] * len(bots)
matches = [(p1, p2) for p1 in range(n) for p2 in range(n) if p1 < p2]

totalgames = (n*n - n)/2 * myrepeats
playedgames = 0

print('Playing {} games:'.format(int(totalgames)))
for a, b in matches:
    for r in range(myrepeats):

        if random.choice([True, False]):
            p = [a, b]
        else:
            p = [b, a]

        # Generate a state with a random seed
        state = State.generate(phase=myphase)

        winner, score = engine.play(bots[p[0]], bots[p[1]], state, 1000, verbose, True)

        if winner is not None:
            winner = p[winner - 1]
            wins[winner] += score

        playedgames += 1
        print('Played {} out of {:.0f} games ({:.0f}%): {} \r'.format(playedgames, totalgames, playedgames/float(totalgames) * 100, wins))

print('Results:')
for i in range(len(bots)):
    print('    bot {}: {} points'.format(bots[i], wins[i]))

Playing 10 games:
Played 1 out of 10 games (10%): [0, 2] 
Played 2 out of 10 games (20%): [0, 4] 
Played 3 out of 10 games (30%): [3, 4] 
Played 4 out of 10 games (40%): [3, 6] 
Played 5 out of 10 games (50%): [3, 8] 
Played 6 out of 10 games (60%): [3, 10] 
Played 7 out of 10 games (70%): [3, 12] 
Played 8 out of 10 games (80%): [3, 14] 
Played 9 out of 10 games (90%): [6, 14] 
Played 10 out of 10 games (100%): [6, 15] 
Results:
    bot <bots.bully.bully.Bot object at 0x0000027AC8C07250>: 6 points
    bot <bots.rdeep.rdeep.Bot object at 0x0000027AC8C06260>: 15 points


In [9]:
Report1 = """

Knowing the algorithms behind the Rand and Bully bots, and knowing that Bully had beaten Rand, I investigated how the new bot Rdeep 
performs against each of them.
In the first matchup, Rand against Rdeep, 10 games were played and Rdeep won decisively (19 points to 0).
In the second matchup, Bully against Rdeep, 10 games were played again and Rdeep won once more (15 points to 6), although less 
comfortably than against Rand.
Given the element of luck and the different strategies of the three bots, these results are not surprising. They suggest that 
Rdeep is the strongest bot, followed by Bully, with Rand the weakest.

"""

### Task 2: 
The previous code runs a tournament between two bots only, but you can easily adapt the script above to play a round-robin tournament. All you have to do is add a third player to the bots list. Report in the following Cell on the results you get from a three-player tournament including rdeep, rand and bully. Add the (non-verbose) output of the script. Report on the results of the tournament and try to explain in your own words what the results mean.

*Hint: You only have to adapt one additional line of code.*
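
The pairing logic of the tournament script can be checked in isolation. The sketch below uses only the standard library and represents the bots by plain name strings, so it makes no engine calls. It shows that three participants give three pairings and, with 10 repeats, 30 games. `round_robin_pairs` is a hypothetical helper name.

```python
from itertools import combinations


def round_robin_pairs(names):
    """Every unordered pair of distinct participants, each listed once."""
    return list(combinations(names, 2))


names = ["rand", "bully", "rdeep"]
repeats = 10

pairs = round_robin_pairs(names)
total_games = len(pairs) * repeats  # same count as (n*n - n) / 2 * repeats

print(pairs)
print(total_games)
```

The list comprehension in the script (`p1 < p2`) produces the same pairs, expressed as positions in the bots list.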

In [10]:
botnames = []
verbose = False 
myphase = 1
myrepeats = 10

# Create player 1
player1 = util.load_player("rand") 
player2 = util.load_player("bully")
player3 = util.load_player("rdeep")


bots = [player1,player2,player3]

n = len(bots)
wins = [0] * len(bots)
matches = [(p1, p2) for p1 in range(n) for p2 in range(n) if p1 < p2]

totalgames = (n*n - n)/2 * myrepeats
playedgames = 0

print('Playing {} games:'.format(int(totalgames)))
for a, b in matches:
    for r in range(myrepeats):

        if random.choice([True, False]):
            p = [a, b]
        else:
            p = [b, a]

        # Generate a state with a random seed
        state = State.generate(phase=myphase)

        winner, score = engine.play(bots[p[0]], bots[p[1]], state, 1000, verbose, True)

        if winner is not None:
            winner = p[winner - 1]
            wins[winner] += score

        playedgames += 1
        print('Played {} out of {:.0f} games ({:.0f}%): {} \r'.format(playedgames, totalgames, playedgames/float(totalgames) * 100, wins))

print('Results:')
for i in range(len(bots)):
    print('    bot {}: {} points'.format(bots[i], wins[i]))

Playing 30 games:
Played 1 out of 30 games (3%): [2, 0, 0] 
Played 2 out of 30 games (7%): [2, 2, 0] 
Played 3 out of 30 games (10%): [3, 2, 0] 
Played 4 out of 30 games (13%): [5, 2, 0] 
Played 5 out of 30 games (17%): [6, 2, 0] 
Played 6 out of 30 games (20%): [6, 4, 0] 
Played 7 out of 30 games (23%): [6, 7, 0] 
Played 8 out of 30 games (27%): [7, 7, 0] 
Played 9 out of 30 games (30%): [7, 9, 0] 
Played 10 out of 30 games (33%): [7, 10, 0] 
Played 11 out of 30 games (37%): [7, 10, 2] 
Played 12 out of 30 games (40%): [7, 10, 5] 
Played 13 out of 30 games (43%): [7, 10, 8] 
Played 14 out of 30 games (47%): [7, 10, 10] 
Played 15 out of 30 games (50%): [8, 10, 10] 
Played 16 out of 30 games (53%): [8, 10, 13] 
Played 17 out of 30 games (57%): [11, 10, 13] 
Played 18 out of 30 games (60%): [12, 10, 13] 
Played 19 out of 30 games (63%): [12, 10, 15] 
Played 20 out of 30 games (67%): [12, 10, 17] 
Played 21 out of 30 games (70%): [12, 10, 18] 
Played 22 out of 30 games (73%): [12, 10, 21

In [11]:
Report2 = """

In the round-robin tournament, the ranking suspected earlier was confirmed empirically. 30 games were played, with the following 
results: first place for the Rdeep bot, second place for the Bully bot and third place for the Rand bot. These results support the 
idea that Rdeep is the strongest bot, Bully the second strongest and Rand the weakest.
This hierarchy follows from the different strategies of the bots. Rand is probably the weakest because its strategy is completely 
uninformed: among the legal moves it always plays at random. Bully has the potential to be stronger because it follows a plan that, 
although perhaps strategically non-ideal and questionable in the long run, is more informed and deliberate than Rand's (explained in 
detail in Task 3).
Finally, Rdeep has the potential to outperform both, as it applies a considerably more sophisticated strategy based on 
probabilities (Perfect Information Monte Carlo Sampling).

"""

## Inspecting the code

Now let's have a look at how the bots work. Open the file bots/rand/rand.py in PyCharm. Each bot is a class called Bot. 

> We will use more advanced features of Python than what you have seen so far in Introduction to Python (don’t worry), so for more details have a look at:
>    https://www.learnpython.org/en/Classes_and_Objects

The rand bot contains nothing but an empty constructor, and one method: get_move(self, state). This is the only method you need to implement to get a working bot. It receives a description of a game state, and returns a move. A move is always a pair of two elements, each of which can be either an integer or None. Note that it is not an option to pass, therefore (None, None) is not a valid move.

As you can see, in the rand bot, the state object does almost all the work: state.moves() gives you a list of legal moves. The rand Bot simply makes a random choice from this list using the function random.choice() from the Python standard library.

If you want to see what happens when you make a given move, just do
next_state = state.next(move)
and you get a state representing the outcome.

What else can the State object do for you? You can look at the code in api/_state.py.
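
The interface described above can be tried out without the engine. In the sketch below, `StubState` is a hypothetical stand-in that offers only `moves()`; the real State object does much more. The Bot follows the description of the rand bot: an empty constructor and a `get_move` method that picks one of the legal moves at random.

```python
import random


class StubState:
    """Hypothetical stand-in for the platform's State, for illustration only."""

    def __init__(self, legal_moves):
        self._legal_moves = list(legal_moves)

    def moves(self):
        # Return a copy so callers cannot change the stored list.
        return list(self._legal_moves)


class Bot:
    def __init__(self):
        pass

    def get_move(self, state):
        # Pick uniformly among the legal moves.
        return random.choice(state.moves())


# Each move is a pair; here every move plays a single card (second element None).
state = StubState([(0, None), (7, None), (12, None)])
move = Bot().get_move(state)
print(move)
```

Because the choice is random, the printed move differs between runs, but it is always one of the three legal moves.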

### bully.py

Bully is a deterministic bot: given the same state it will always do the same thing. We've removed part of the explanation from the comments. 

### Task 3: 
Have a look at the code and describe in your own words what strategy the bully bot uses.

In [12]:
Report3 = """

The Bully bot behaves deterministically. When it leads a trick, Bully first looks for trump cards in its hand and plays the first 
one it finds. If it holds no card of the trump suit, it simply plays the card with the highest rank in its hand. When the opponent 
has led the current trick, Bully plays the first card it finds in its hand that matches the suit of the opponent's card. Again, if 
it has no such card, it plays the card with the highest rank available. 
To evaluate a card's rank, the modulus operator (%) is used, which gives the remainder of dividing the card's index by 5. For 
example, for the card with index 18, 18 % 5 = 3 (5 * 3 = 15, and 3 more units are needed to reach the dividend 18), so the card is 
in column 3 of the index matrix and has rank 3. All cards in the same column of the index matrix have the same rank value. The 
divisor is always 5 because there are 5 different ranks.
The algorithm runs through all the available cards, evaluates their ranks and keeps the card with the lowest rank value found so 
far, since the matrix is ordered column by column from the most valuable cards to the weakest, and then plays the best card found.

"""

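The rank comparison described in the report can be sketched on its own. The helper below is hypothetical and is not taken from bully.py; it relies only on the convention stated above, that a smaller remainder of the index divided by 5 means a stronger card.

```python
def highest_rank_move(moves):
    """Return the card move whose index % 5 is smallest (the strongest rank)."""
    # Ignore moves that do not play a card (first element is None).
    card_moves = [m for m in moves if m[0] is not None]
    # min() keeps the first of several equally ranked cards.
    return min(card_moves, key=lambda m: m[0] % 5)


moves = [(18, None), (7, None), (11, None)]
print(highest_rank_move(moves))  # prints (11, None): 11 % 5 = 1 is the smallest remainder
```

With an empty list of card moves `min` raises a ValueError, so a real bot would need to handle that case.
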
### rdeep.py
The lectures discuss the hill-climbing strategy: look one move ahead and pick the move that leads to the best heuristic value. The heuristic we use is the ratio of the player's points to the total points currently assigned in the game. The higher this value, the better the state is for us. Imagine doing hill-climbing with this heuristic. This strategy would not work here. Why not? 



In order to avoid this issue, we need to look further ahead than the hill-climbing strategy does. rdeep.py does this in the simplest way we could think of: make eight random moves and look at the value of the resulting state. Do this a few times and average the values found. This method is called Perfect-Information Monte-Carlo Sampling (PIMC).
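
The "play a few random moves, evaluate, and average" idea can be sketched independently of the platform. Everything below is an assumption made for illustration: the toy game, the function names and the choice to return 0.5 when no points have been scored. It leaves out everything specific to Schnapsen, including how unknown cards are handled, so it is not the code of rdeep.py.

```python
import random


def heuristic(my_points, opponent_points):
    """Share of all points scored so far that belong to us."""
    total = my_points + opponent_points
    # Assumption: with no points scored yet, treat the state as even.
    return my_points / total if total > 0 else 0.5


def rollout_value(state, depth, samples, legal_moves, apply_move, evaluate, rng):
    """Average the evaluation reached after `depth` random moves, over `samples` runs."""
    total = 0.0
    for _ in range(samples):
        current = state
        for _ in range(depth):
            options = legal_moves(current)
            if not options:  # game over before the depth limit
                break
            current = apply_move(current, rng.choice(options))
        total += evaluate(current)
    return total / samples


# Toy game (hypothetical): players alternately score 1, 2 or 3 points.
def toy_moves(state):
    mine, theirs, _ = state
    return [1, 2, 3] if mine + theirs < 30 else []


def toy_apply(state, points):
    mine, theirs, my_turn = state
    if my_turn:
        return (mine + points, theirs, False)
    return (mine, theirs + points, True)


def toy_value(state):
    mine, theirs, _ = state
    return heuristic(mine, theirs)


rng = random.Random(0)  # fixed seed so the run is repeatable
value = rollout_value((10, 5, True), 8, 100, toy_moves, toy_apply, toy_value, rng)
print(round(value, 3))
```

To choose a move with this approach, one would compute such an average for the state reached by each legal move and play the move with the highest average.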

You just ran a tournament between rdeep and the other two bots. Most likely, rdeep will have won a few more games. But does the difference really mean rdeep is better? It might just be that rdeep is no better than rand and won by pure luck.

### Task 4 
If you wanted to provide scientific evidence that rdeep is better than rand, how would you go about it?

In [13]:

Report4 = """

The Hill Climbing algorithm would not work in this situation because it tracks a single heuristic value and improves it only locally, 
with no guarantee of reaching the optimal solution. In this problem there are two values to consider: the ratio of each player's 
points to the total points scored so far.
After the experiments conducted so far in this project, my working hypothesis is that Rdeep is stronger than Rand, because Rdeep won 
consistently over a considerable number of runs. To support this hypothesis scientifically, however, it would be necessary to play a 
large number of games between the two bots: statistically, the more games are played, the smaller the influence of luck, so the 
results converge on which of the bots is strategically superior.

"""

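One way to make "won by pure luck" measurable is a one-sided binomial test. The sketch below assumes that games are independent, that draws are left out, and that under the null hypothesis each bot wins a game with probability 0.5. The win counts used are hypothetical and are not results from this notebook; `binomial_p_value` is a hypothetical helper name.

```python
from math import comb


def binomial_p_value(wins, games):
    """Probability of at least `wins` wins in `games` games if each game is a fair coin flip."""
    if not 0 <= wins <= games:
        raise ValueError("wins must be between 0 and games")
    # Count the outcomes with `wins` or more successes among all 2**games outcomes.
    return sum(comb(games, k) for k in range(wins, games + 1)) / 2 ** games


# Hypothetical counts, for illustration only.
print(binomial_p_value(10, 10))   # 10 wins out of 10
print(binomial_p_value(60, 100))  # 60 wins out of 100
```

A small value means the observed number of wins would be unlikely if the two bots were equally strong; the threshold for "small" (for example 0.05) has to be chosen before running the games.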

### mybot.py

It's time to write your own bot. Think of a simple strategy that is easy to implement. To create the bot follow these steps:
1. Create the directory bots/mybot (in the directory, not this notebook!)
2. Add an empty file __init__.py, or copy it from one of the other bot directories.
3. Copy rand.py to the directory mybot, and rename it to mybot.py
4. Change the implementation of get_move(state). Keep the method signature (line 16) exactly as it is.

Make sure your bot always returns a pair of elements, each of which is either an int or None. Try playing a tournament against rand. See if you can get a decent margin.

If your bot has parameters (like a search depth, or a pre-programmed probability of doing nothing) you can add these to the constructor. Have a look at rdeep.py to see how this is done.
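
A constructor parameter might look like the following sketch. The parameter `explore`, the `seed` argument and the `StubState` class are all hypothetical; they are not part of the platform or of rdeep.py. Default values are given so that the class can still be created without arguments.

```python
import random


class StubState:
    """Hypothetical stand-in for the platform's State, for illustration only."""

    def __init__(self, legal_moves):
        self._legal_moves = list(legal_moves)

    def moves(self):
        return list(self._legal_moves)


class Bot:
    def __init__(self, explore=0.1, seed=None):
        # Hypothetical parameter: chance of ignoring the preferred move.
        if not 0.0 <= explore <= 1.0:
            raise ValueError("explore must be a probability")
        self.explore = explore
        self._rng = random.Random(seed)

    def get_move(self, state):
        moves = state.moves()
        if self._rng.random() < self.explore:
            return self._rng.choice(moves)  # occasional random move
        return moves[0]  # placeholder for a real strategy


state = StubState([(3, None), (8, None)])
print(Bot(explore=0.0).get_move(state))  # prints (3, None)
```

With `explore=0.0` the bot always returns the placeholder choice, which makes the effect of the parameter easy to test.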

To get some examples of how to talk to the API, see the README.md

### Task 5 
Add your implementation of get_move() and the result of a tournament against rand to your report. 

Please write your code here (in raw text, to avoid an error), as well as the results in the following cell: 


In [14]:
MyMove1 ="""

def get_move(self, state):
    # type: (State) -> tuple[int, int]
        
        # Function comment begin

        Function that gets called every turn. This is where to implement the strategies.
        Be sure to make a legal move. Illegal moves, like giving an index of a card you
        don't own or proposing an illegal marriage, will lose you the game.
        This bot follows these play options, ordered by priority:
                1) Play a card of the spades suit (the first found)
                2) Play a trump suit card (the first found)
                3) Play the highest rank card available
        :param State state: An object representing the gamestate. This includes a link to
            the states of all the cards, the trick and the points.
        :return: A tuple of integers or a tuple of an integer and None,
            indicating a move; the first indicates the card played in the trick, the second a
            potential spouse.
        
        # Function comment end
        
        #All legal moves
        moves = state.moves()
        chosen_move = moves[0]

        #This bot always tries spades first: it plays the first spades card it finds in its hand
        for index, move in enumerate(moves):
            if move[0] is not None and Deck.get_suit(move[0]) == "S":
                return move


        #If playing spades is not a possibility the bot will try to play the first trump suit card it finds on its hand
        moves_trump_suit = []

        #Get all trump suit moves available
        for index, move in enumerate(moves):

            if move[0] is not None and Deck.get_suit(move[0]) == state.get_trump_suit():
                moves_trump_suit.append(move)

        if len(moves_trump_suit) > 0:
            chosen_move = moves_trump_suit[0]
            return chosen_move

        #As a last resort the bot will play the highest rank available, of any suit
        for index, move in enumerate(moves):
            if move[0] is not None and move[0] % 5 <= chosen_move[0] % 5:
                chosen_move = move
        return chosen_move

My bot, Mybot, follows the strategy described in the comment of the get_move function: its first priority is to play the first 
spades card it finds; failing that, it plays the first trump-suit card in its hand; and if neither is possible, it plays the card 
with the highest rank available. It beat Rand by a comfortable margin (14 points against 6 in the first 10-game tournament 
played).

"""

In [18]:
# Setting up the tournament of Mybot vs Rand

botnames = []
verbose = False 
myphase = 1
myrepeats = 10

# Create player 1
player1 = util.load_player("rand") 
player2 = util.load_player("mybot")

bots = [player1,player2]

n = len(bots)
wins = [0] * len(bots)
matches = [(p1, p2) for p1 in range(n) for p2 in range(n) if p1 < p2]

totalgames = (n*n - n)/2 * myrepeats
playedgames = 0

print('Playing {} games:'.format(int(totalgames)))
for a, b in matches:
    for r in range(myrepeats):

        if random.choice([True, False]):
            p = [a, b]
        else:
            p = [b, a]

        # Generate a state with a random seed
        state = State.generate(phase=myphase)

        winner, score = engine.play(bots[p[0]], bots[p[1]], state, 1000, verbose, True)

        if winner is not None:
            winner = p[winner - 1]
            wins[winner] += score

        playedgames += 1
        print('Played {} out of {:.0f} games ({:.0f}%): {} \r'.format(playedgames, totalgames, playedgames/float(totalgames) * 100, wins))

print('Results:')
for i in range(len(bots)):
    print('    bot {}: {} points'.format(bots[i], wins[i]))

Playing 10 games:
Played 1 out of 10 games (10%): [1, 0] 
Played 2 out of 10 games (20%): [4, 0] 
Played 3 out of 10 games (30%): [4, 2] 
Played 4 out of 10 games (40%): [5, 2] 
Played 5 out of 10 games (50%): [5, 5] 
Played 6 out of 10 games (60%): [6, 5] 
Played 7 out of 10 games (70%): [6, 7] 
Played 8 out of 10 games (80%): [6, 8] 
Played 9 out of 10 games (90%): [6, 11] 
Played 10 out of 10 games (100%): [6, 14] 
Results:
    bot <bots.rand.rand.Bot object at 0x0000027AC8F1F490>: 6 points
    bot <bots.mybot.mybot.Bot object at 0x0000027AC8F1F8E0>: 14 points


## Final Task: Collect all the results

Uncomment and run this cell (and all the cells above) to generate the text file that you have to hand in together with the notebook on Canvas!

### Please hand in only the text file which is generated by this method!

In [1]:
from utils import *
exportToText("assignment4.txt", Report1, Report2, Report3, Report4, MyMove1)

NameError: name 'Report1' is not defined