# Intelligent Systems 2021: 6th practical assignment
## Machine Learning Agents

Your name: Mohammad Reshadati

Your VUnetID: 2741740

If you do not provide your name and VUnetID we will not accept your submission. 

### Preliminaries

At the end of this exercise you should be able to work with some basic Machine Learning concepts, and implement and evaluate a learning-based approach to playing Schnapsen. In this notebook we are going to create an adaptive bot. We will use the principle discussed in the machine learning lecture, but now in an agent setting. This comes down to using basic hill-climbing search, but learning the heuristic function rather than implementing it by hand. This will require a few basic ingredients:

> A script that plays games between existing bots and creates a dataset to learn from. The dataset contains each observed state, labeled with the (eventual) winner of the game. See the script train-ml-bot.py.<br>
> A function that translates a state object to a feature vector. See the function features(...) in ml.py.<br>
> An implementation of a hill-climbing bot that gets its heuristic from a machine learning model. See bots/ml/ml.py.
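The first ingredient, labeling every observed state with the eventual winner, can be sketched as follows. This is a minimal illustration and not the code in train-ml-bot.py: the function name `label_game` and the toy state tuples are hypothetical, and labeling from player 1's point of view is an assumption.

```python
def label_game(observed_states, winner, perspective=1):
    """Pair every state seen during one game with the game's final outcome.

    observed_states: states in the order they occurred (any objects).
    winner: id of the player who eventually won the game (1 or 2).
    perspective: the player whose point of view defines 'won' and 'lost'
                 (assumed to be player 1 here).
    """
    label = 'won' if winner == perspective else 'lost'
    return [(state, label) for state in observed_states]


# Toy game: each "state" is just (p1_points, p2_points); player 2 wins in the end.
game_states = [(0, 0), (0, 13), (21, 13), (21, 66)]
dataset = label_game(game_states, winner=2)
print(dataset[0])
```

Every state of the game receives the same label, even the early ones where the outcome was still open. That is what lets the model learn to estimate the chance of winning from an intermediate state.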

Feature vectors were discussed in the lecture. If you didn't get it, or you are working ahead, see:
- https://brilliant.org/wiki/feature-vector/
- https://www.youtube.com/watch?v=3Vy47dbI708



### Practicalities

Follow this Notebook step-by-step. For this course it is necessary that you modify the Python programs we provide. You can do the exercises in any editor you like. Still, please fill in the questions in this notebook as usual. You can also run tournaments in it if you want, but running them in your editor or via the command line is usually much more convenient.

Please use your studentID+Assignment6.ipynb as the name of the Notebook, and fill in the missing cells.   

Note: unlike the courses dedicated to programming, we will not evaluate the style of the programs. We will, however, test your programs on other data that we provide, and your program should give the correct output for that test data as well.

As was mentioned, the assignment is graded as pass/fail. To pass you need to have either fully working code or an explanation of what you tried and what didn't work for the tasks that you were unable to complete (you can use multi-line comments or a text cell).


## Train a Machine Learning Model 

The plan is as follows: we run the train-ml-bot.py script, which creates a model for us and places it in the bots/ml directory. All you need to do is complete the feature extraction method in bots/ml/ml.py. It returns a basic feature vector modelling the properties of the game state, or more precisely the bot’s perspective of the game state. This means that in phase 1 of the game some feature values are unknown, namely those for the cards that are either in the adversary's hand or in the pile.

To complete the function, you'll need to write some code which transforms information you get from state.py into integer values.
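As a rough sketch of what such a transformation looks like, the example below reads a few properties from a state and turns them into a flat list of numbers. `FakeState` is a hypothetical stand-in so the example runs on its own; its accessor names mirror the ones used in ml.py, but the real State class in state.py has more methods and may behave differently.

```python
class FakeState:
    """Hypothetical stand-in for the State class in state.py, for illustration only."""

    def __init__(self, points, pending, phase, stock_size):
        self._points = points          # dict: player id -> points
        self._pending = pending        # dict: player id -> pending points
        self._phase = phase            # 1 or 2
        self._stock_size = stock_size  # number of cards left in the stock

    def get_points(self, player):
        return self._points[player]

    def get_pending_points(self, player):
        return self._pending[player]

    def get_phase(self):
        return self._phase

    def get_stock_size(self):
        return self._stock_size


def basic_features(state):
    """Turn a few properties of the state into a flat list of numbers."""
    features = [
        state.get_points(1),
        state.get_points(2),
        state.get_pending_points(1),
        state.get_pending_points(2),
        state.get_stock_size(),
    ]
    # One-hot encode the phase: [1, 0] for phase 1, [0, 1] for phase 2
    features += [1, 0] if state.get_phase() == 1 else [0, 1]
    return features


state = FakeState(points={1: 0, 2: 20}, pending={1: 0, 2: 0}, phase=1, stock_size=8)
print(basic_features(state))  # [0, 20, 0, 0, 8, 1, 0]
```

The length and order of the list must be the same for every state, because the model treats each position as one fixed feature.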

To run the bots using the command line/terminal:

- If you want to play 2 bots against each other, e.g. rand and bully: python play.py -1 rand -2 bully
- To see what other options there are: python play.py --help
- If you run python tournament.py, it will play a round-robin tournament between bully, rand and rdeep in which every pair of players plays 10 matches. Run python tournament.py --help to see how you can change the players and the number of games played (if needed).

### Task 1

Fill in the missing code (all the '???' lines) and run a number of games to check whether your agent "works". You can either run the play.py script in a command line, or copy the play code from one of the previous notebooks here (do not forget to import all the necessary modules and code). 

*Hint: You need to run train-ml-bot.py after finishing the ML bot.*

*Hint 2: If there is a problem, try to run the tournament without the "fast" option*

Please copy your code in the following cell




In [11]:
%pip install joblib
%pip install sklearn

Note: you may need to restart the kernel to use updated packages.
Collecting sklearn
  Downloading sklearn-0.0.tar.gz (1.1 kB)
Collecting scikit-learn
  Downloading scikit_learn-1.0.1-cp39-cp39-win_amd64.whl (7.2 MB)
Collecting threadpoolctl>=2.0.0
  Downloading threadpoolctl-3.0.0-py3-none-any.whl (14 kB)
Building wheels for collected packages: sklearn
  Building wheel for sklearn (setup.py): started
  Building wheel for sklearn (setup.py): finished with status 'done'
  Created wheel for sklearn: filename=sklearn-0.0-py2.py3-none-any.whl size=1309 sha256=c59643fe5edc60f89458838befe572dc71a7843b3359945f8e400f34b6ea3ef9
  Stored in directory: c:\users\rmoha\appdata\local\pip\cache\wheels\e4\7b\98\b6466d71b8d738a0c547008b9eb39bf8676d1ff6ca4b22af1c
Successfully built sklearn
Installing collected packages: threadpoolctl, scikit-learn, sklearn
Successfully installed scikit-learn-1.0.1 sklearn-0.0 threadpoolctl-3.0.0
Note: you may need to restart the kernel to use updated packages.


In [1]:
!python train-ml-bot.py


Generating dataset: [                              ]   0%
Generating dataset: [                              ]   1%
Generating da

In [11]:
MyCode1 = """
#!/usr/bin/env python

# A basic adaptive bot. This is part of the third worksheet.


from api import State, util
import random, os
from itertools import chain

import joblib

# Path of the model we will use. If you make a model
# with a different name, point this line to its path.
DEFAULT_MODEL = os.path.dirname(os.path.realpath(__file__)) + '/model.pkl'

class Bot:

    __randomize = True

    __model = None

    def __init__(self, randomize=True, model_file=DEFAULT_MODEL):

        print(model_file)
        self.__randomize = randomize

        # Load the model
        self.__model = joblib.load(model_file)

    def get_move(self, state):

        val, move = self.value(state)

        return move

    def value(self, state):
       
        best_value = float('-inf') if maximizing(state) else float('inf')
        best_move = None

        moves = state.moves()

        if self.__randomize:
            random.shuffle(moves)

        for move in moves:

            next_state = state.next(move)

            # IMPLEMENT: Add a function call so that 'value' will
            # contain the predicted value of 'next_state'
            # NOTE: This is different from the line in the minimax/alphabeta bot
            value = self.heuristic(next_state)

            if maximizing(state):
                if value > best_value:
                    best_value = value
                    best_move = move
            else:
                if value < best_value:
                    best_value = value
                    best_move = move

        return best_value, best_move

    def heuristic(self, state):

        # Convert the state to a feature vector
        feature_vector = [features(state)]

        # These are the classes: ('won', 'lost')
        classes = list(self.__model.classes_)

        # Ask the model for a prediction
        # This returns a probability for each class
        prob = self.__model.predict_proba(feature_vector)[0]

        # Weigh the win/loss outcomes (-1 and 1) by their probabilities
        res = -1.0 * prob[classes.index('lost')] + 1.0 * prob[classes.index('won')]

        return res

def maximizing(state):
   
    return state.whose_turn() == 1


def features(state):
    # type: (State) -> tuple[float, ...]
   
    feature_set = []

    # Add player 1's points to feature set
    p1_points = state.get_points(1)
    feature_set.append(p1_points)

    # Add player 2's points to feature set
    p2_points = state.get_points(2)
    feature_set.append(p2_points)


    # Add player 1's pending points to feature set
    p1_pending_points = state.get_pending_points(1)
    feature_set.append(p1_pending_points)


    # Add player 2's pending points to feature set
    p2_pending_points = state.get_pending_points(2)
    feature_set.append(p2_pending_points)

    # Get trump suit
    trump_suit = state.get_trump_suit()

    # Add phase to feature set
    phase = state.get_phase()
    feature_set.append(phase)


    # Add stock size to feature set
    stock_size = state.get_stock_size()
    feature_set.append(stock_size)

    # Add leader to feature set
    leader = state.leader()
    feature_set.append(leader)

    # Add whose turn it is to feature set
    whose_turn = state.whose_turn()
    feature_set.append(whose_turn)

    # Get the opponent's played card (it is one-hot encoded further below)
    opponents_played_card = state.get_opponents_played_card()


    ################## You do not need to do anything below this line ########################

    perspective = state.get_perspective()

    # Perform one-hot encoding on the perspective.
    # Learn more about one-hot here: https://machinelearningmastery.com/how-to-one-hot-encode-sequence-data-in-python/
    perspective = [card if card != 'U'   else [1, 0, 0, 0, 0, 0] for card in perspective]
    perspective = [card if card != 'S'   else [0, 1, 0, 0, 0, 0] for card in perspective]
    perspective = [card if card != 'P1H' else [0, 0, 1, 0, 0, 0] for card in perspective]
    perspective = [card if card != 'P2H' else [0, 0, 0, 1, 0, 0] for card in perspective]
    perspective = [card if card != 'P1W' else [0, 0, 0, 0, 1, 0] for card in perspective]
    perspective = [card if card != 'P2W' else [0, 0, 0, 0, 0, 1] for card in perspective]

    # Append one-hot encoded perspective to feature_set
    feature_set += list(chain(*perspective))

    # Append normalized points to feature_set
    total_points = p1_points + p2_points
    feature_set.append(p1_points/total_points if total_points > 0 else 0.)
    feature_set.append(p2_points/total_points if total_points > 0 else 0.)

    # Append normalized pending points to feature_set
    total_pending_points = p1_pending_points + p2_pending_points
    feature_set.append(p1_pending_points/total_pending_points if total_pending_points > 0 else 0.)
    feature_set.append(p2_pending_points/total_pending_points if total_pending_points > 0 else 0.)

    # Convert trump suit to id and add to feature set
    # You don't need to add anything to this part
    suits = ["C", "D", "H", "S"]
    trump_suit_onehot = [0, 0, 0, 0]
    trump_suit_onehot[suits.index(trump_suit)] = 1
    feature_set += trump_suit_onehot

    # Append one-hot encoded phase to feature set
    feature_set += [1, 0] if phase == 1 else [0, 1]

    # Append normalized stock size to feature set
    feature_set.append(stock_size/10)

    # Append one-hot encoded leader to feature set
    feature_set += [1, 0] if leader == 1 else [0, 1]

    # Append one-hot encoded whose_turn to feature set
    feature_set += [1, 0] if whose_turn == 1 else [0, 1]

    # Append one-hot encoded opponent's card to feature set
    opponents_played_card_onehot = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
    opponents_played_card_onehot[opponents_played_card if opponents_played_card is not None else 20] = 1
    feature_set += opponents_played_card_onehot

    # Return feature set
    return feature_set

"""

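The way the heuristic turns class probabilities into a single value can be tried out in isolation. In the sketch below, `StubModel` is a hypothetical stand-in for the trained classifier that always returns the same probabilities, and `expected_outcome` is an illustrative name; only the weighting formula is taken from the bot's heuristic method.

```python
class StubModel:
    """Hypothetical stand-in for a trained classifier; returns fixed probabilities."""

    classes_ = ['lost', 'won']

    def predict_proba(self, feature_vectors):
        # One row of class probabilities per feature vector, ordered like classes_
        return [[0.25, 0.75] for _ in feature_vectors]


def expected_outcome(model, feature_vector):
    """Weigh the outcomes lost = -1 and won = +1 by their predicted probabilities."""
    classes = list(model.classes_)
    prob = model.predict_proba([feature_vector])[0]
    return -1.0 * prob[classes.index('lost')] + 1.0 * prob[classes.index('won')]


value = expected_outcome(StubModel(), [0, 20, 0, 0])
print(value)  # 0.5
```

The result always lies between -1 (certain loss) and 1 (certain win). In the bot, the player for whom `maximizing(state)` is true picks the move with the highest value and the other player picks the lowest.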

Run a tournament between rand, bully and ml, and copy the result of the tournament in the following cell. 


In [24]:
#run the tournament here
!python play.py -1 rand -2 bully

   Start state: The game is in phase: 1
Player 1's points: 0, pending: 0
Player 2's points: 0, pending: 0
The trump suit is: C
Player 1's hand: AC KC 10H KH JH
Player 2's hand: 10C AD QD QH JS
There are 10 cards in the stock

player1: <bots.rand.rand.Bot object at 0x00000206B3714430>
player2: <bots.bully.bully.Bot object at 0x00000206B36C6E50>
*   Player 1 plays: 10H
The game is in phase: 1
Player 1's points: 0, pending: 0
Player 2's points: 0, pending: 0
The trump suit is: C
Player 1's hand: AC KC 10H KH JH
Player 2's hand: 10C AD QD QH JS
There are 10 cards in the stock
Player 1 has played card: 10 of H

*   Player 2 plays: 10C
The game is in phase: 1
Player 1's points: 0, pending: 0
Player 2's points: 20, pending: 0
The trump suit is: C
Player 1's hand: AC KC KH JH QS
Player 2's hand: AD QD JD QH JS
There are 8 cards in the stock

*   Player 2 plays: AD
The game is in phase: 1
Player 1's points: 0, pending: 0
Player 2's points: 20, pending: 0
The trump suit is: C
Player 1's hand: AC

In [32]:
!python play.py -1 rand -2 ml 


D:\University\IntelligentSystems\assignments\6\bots\ml/model.pkl
D:\University\IntelligentSystems\assignments\6\bots\ml/model.pkl
   Start state: The game is in phase: 1
Player 1's points: 0, pending: 0
Player 2's points: 0, pending: 0
The trump suit is: C
Player 1's hand: JC AD QH AS QS
Player 2's hand: 10C JD JH 10S KS
There are 10 cards in the stock

player1: <bots.rand.rand.Bot object at 0x000001C654804430>
player2: <bots.ml.ml.Bot object at 0x000001C6547C6E50>
*   Player 2 plays: 10S
The game is in phase: 1
Player 1's points: 0, pending: 0
Player 2's points: 0, pending: 0
The trump suit is: C
Player 1's hand: JC AD QH AS QS
Player 2's hand: 10C JD JH 10S KS
There are 10 cards in the stock
Player 2 has played card: 10 of S

*   Player 1 plays: QH
The game is in phase: 1
Player 1's points: 0, pending: 0
Player 2's points: 13, pending: 0
The trump suit is: C
Player 1's hand: AC JC AD AS QS
Player 2's hand: 10C KD JD JH KS
There are 8 cards in the stock

*   Player 2 plays: 10C
The ga

In [35]:
!python play.py -1 bully -2 ml 

D:\University\IntelligentSystems\assignments\6\bots\ml/model.pkl
D:\University\IntelligentSystems\assignments\6\bots\ml/model.pkl
   Start state: The game is in phase: 1
Player 1's points: 0, pending: 0
Player 2's points: 0, pending: 0
The trump suit is: D
Player 1's hand: KC QC QD QH 10S
Player 2's hand: AD KD KH AS KS
There are 10 cards in the stock

player1: <bots.bully.bully.Bot object at 0x000002A5D0D45430>
player2: <bots.ml.ml.Bot object at 0x000002A5D0D16E50>
*   Player 2 plays: AS
The game is in phase: 1
Player 1's points: 0, pending: 0
Player 2's points: 0, pending: 0
The trump suit is: D
Player 1's hand: KC QC QD QH 10S
Player 2's hand: AD KD KH AS KS
There are 10 cards in the stock
Player 2 has played card: A of S

*   Player 1 plays: QD
The game is in phase: 1
Player 1's points: 14, pending: 0
Player 2's points: 0, pending: 0
The trump suit is: D
Player 1's hand: KC QC JC QH 10S
Player 2's hand: AD KD KH JH KS
There are 8 cards in the stock

*   Player 1 plays: 10S
The game

In [37]:
MyResults1 = """
Result of a game between rand (player 1) and bully (player 2), in which the winner received 2 points:

Start state: The game is in phase: 1
Player 1's points: 0, pending: 0
Player 2's points: 0, pending: 0
The trump suit is: C
Player 1's hand: AC KC 10H KH JH
Player 2's hand: 10C AD QD QH JS
There are 10 cards in the stock
player1: <bots.rand.rand.Bot object at 0x00000206B3714430>
player2: <bots.bully.bully.Bot object at 0x00000206B36C6E50>
*   Player 1 plays: 10H
The game is in phase: 1
Player 1's points: 0, pending: 0
Player 2's points: 0, pending: 0
The trump suit is: C
Player 1's hand: AC KC 10H KH JH
Player 2's hand: 10C AD QD QH JS
There are 10 cards in the stock
Player 1 has played card: 10 of H
*   Player 2 plays: 10C
The game is in phase: 1
Player 1's points: 0, pending: 0
Player 2's points: 20, pending: 0
The trump suit is: C
Player 1's hand: JC KD KH
Player 2's hand: QD JD KS
There are 0 cards in the stock
Game finished. Player 1 has won, receiving 2 points.
_________________________________________________________________________________________________________________________________

In a game between rand (player 1) and ml (player 2), ml won, receiving 2 points:

Start state: The game is in phase: 1
Player 1's points: 0, pending: 0
Player 2's points: 0, pending: 0
The trump suit is: C
Player 1's hand: JC AD QH AS QS
Player 2's hand: 10C JD JH 10S KS
There are 10 cards in the stock

player1: <bots.rand.rand.Bot object at 0x000001C654804430>
player2: <bots.ml.ml.Bot object at 0x000001C6547C6E50>
*   Player 2 plays: 10S
The game is in phase: 1
Player 1's points: 0, pending: 0
Player 2's points: 0, pending: 0
The trump suit is: C
Player 1's hand: JC AD QH AS QS
Player 2's hand: 10C JD JH 10S KS
There are 10 cards in the stock
Player 2 has played card: 10 of S

*   Player 1 plays: QH
The game is in phase: 1
Player 1's points: 0, pending: 0
Player 1's hand: AC KC KH QS
Player 2's hand: QC QD JH KS
There are 0 cards in the stock

Game finished. Player 2 has won, receiving 2 points.
_________________________________________________________________________________________________________________________________

In a game between bully (player 1) and ml (player 2), ml won, receiving 2 points:
Start state: The game is in phase: 1
Player 1's points: 0, pending: 0
Player 2's points: 0, pending: 0
The trump suit is: D
Player 1's hand: KC QC QD QH 10S
Player 2's hand: AD KD KH AS KS
There are 10 cards in the stock

player1: <bots.bully.bully.Bot object at 0x000002A5D0D45430>
player2: <bots.ml.ml.Bot object at 0x000002A5D0D16E50>
*   Player 2 plays: AS
The game is in phase: 1
Player 1's points: 0, pending: 0
Player 2's points: 0, pending: 0
The trump suit is: D
Player 1's hand: KC QC QD QH 10S
Player 2's hand: AD KD KH AS KS
There are 10 cards in the stock
Player 2 has played card: A of S

*   Player 1 plays: QD
The game is in phase: 1
Player 1's points: 14, pending: 0
Player 1's hand: QC JC
Player 2's hand: JD KS
There are 0 cards in the stock

Game finished. Player 2 has won, receiving 2 points.
"""


### Task 2

The first thing we can do to improve the bot is to improve the quality of the games it observes. Change the player in train-ml-bot.py to a kbbot and/or rdeep player, and retrain the model. You may wish to lower the number of games played in train-ml-bot.py if the games are taking a long time. In the following cell, please describe what you observe when you run a tournament like before with the retrained ml bot.


In [54]:
# Train the model with rdeep player
!python train-ml-bot.py

Starting training phase...
Iteration 1, loss = 0.62484204
Validation score: 0.667163
Iteration 2, loss = 0.59767881
Validation score: 0.676476
Iteration 3, loss = 0.58922820
Validation score: 0.680874
Iteration 4, loss = 0.58432363
Validation score: 0.683138
Iteration 5, loss = 0.58155647
Validation score: 0.682491
Iteration 6, loss = 0.57991659
Validation score: 0.684691
Iteration 7, loss = 0.57881204
Validation score: 0.686437
Iteration 8, loss = 0.57759665
Validation score: 0.687795
Iteration 9, loss = 0.57665862
Validation score: 0.687666
Iteration 10, loss = 0.57559083
Validation score: 0.689671
Iteration 11, loss = 0.57440961
Validation score: 0.687407
Iteration 12, loss = 0.57310040
Validation score: 0.691288
Iteration 13, loss = 0.57179643
Validation score: 0.693163
Iteration 14, loss = 0.56970893
Validation score: 0.695751
Iteration 15, loss = 0.56762665
Validation score: 0.697044
Iteration 16, loss = 0.56533385
Validation score: 0.696203
Iteration 17, loss = 0.56275083
Valida

In [78]:
# Train the model with kbbot player
!python train-ml-bot.py

Starting training phase...
Iteration 1, loss = 0.65596322
Validation score: 0.662764
Iteration 2, loss = 0.59740307
Validation score: 0.671690
Iteration 3, loss = 0.58861521
Validation score: 0.680551
Iteration 4, loss = 0.58328879
Validation score: 0.680680
Iteration 5, loss = 0.58050083
Validation score: 0.683268
Iteration 6, loss = 0.57861289
Validation score: 0.684302
Iteration 7, loss = 0.57716435
Validation score: 0.686178
Iteration 8, loss = 0.57587981
Validation score: 0.685790
Iteration 9, loss = 0.57478959
Validation score: 0.689089
Iteration 10, loss = 0.57374456
Validation score: 0.689218
Iteration 11, loss = 0.57272900
Validation score: 0.690318
Iteration 12, loss = 0.57128715
Validation score: 0.691741
Iteration 13, loss = 0.57034994
Validation score: 0.692323
Iteration 14, loss = 0.56879848
Validation score: 0.692517
Iteration 15, loss = 0.56737586
Validation score: 0.692517
Iteration 16, loss = 0.56568707
Validation score: 0.691935
Iteration 17, loss = 0.56391591
Valida

In [46]:
MyReport2 = """
In a nutshell, the ml bot's win rate in tournaments is higher after retraining.

First, I trained the model on games played by the default player, rand. The validation score was 0.768967.
The class counts (won and lost) in the dataset were:
{'lost': 76409, 'won': 78194}

Second, I trained the model on games played by rdeep. The validation score was 0.761335.
The class counts were:
{'lost': 76409, 'won': 78194}


Third, I trained the model on games played by kbbot. The validation score was 0.759265.
The class counts were:
{'lost': 76409, 'won': 78194}

"""

## Training in different phases

Using alphabeta for training might not be a good idea, since it has to start in phase 2 with perfect information. This may not translate so well to phase 1 gameplay. Nevertheless, it is a good idea to experiment. If you wish to do this, you have to specify in train-ml-bot.py that the training games start in phase 2.

### Task 3

Re-run the tournament. Does the machine learning bot do better? Show the output, and mention which bot was used for training.


In [48]:
!python play.py -1 rand -2 ml 

D:\University\IntelligentSystems\assignments\6\bots\ml/model.pkl
D:\University\IntelligentSystems\assignments\6\bots\ml/model.pkl
   Start state: The game is in phase: 1
Player 1's points: 0, pending: 0
Player 2's points: 0, pending: 0
The trump suit is: D
Player 1's hand: AC QC QH AS JS
Player 2's hand: 10C JC AD JD AH
There are 10 cards in the stock

player1: <bots.rand.rand.Bot object at 0x000001BFF0EB4430>
player2: <bots.ml.ml.Bot object at 0x000001BFF0E86E50>
*   Player 1 plays: AS
The game is in phase: 1
Player 1's points: 0, pending: 0
Player 2's points: 0, pending: 0
The trump suit is: D
Player 1's hand: AC QC QH AS JS
Player 2's hand: 10C JC AD JD AH
There are 10 cards in the stock
Player 1 has played card: A of S

*   Player 2 plays: JD
The game is in phase: 1
Player 1's points: 0, pending: 0
Player 2's points: 13, pending: 0
The trump suit is: D
Player 1's hand: AC QC 10H QH JS
Player 2's hand: 10C JC AD AH QS
There are 8 cards in the stock

*   Player 2 plays: 10C
The game 

In [49]:
!python play.py -1 bully -2 ml 

D:\University\IntelligentSystems\assignments\6\bots\ml/model.pkl
D:\University\IntelligentSystems\assignments\6\bots\ml/model.pkl
   Start state: The game is in phase: 1
Player 1's points: 0, pending: 0
Player 2's points: 0, pending: 0
The trump suit is: S
Player 1's hand: AD QH JH 10S QS
Player 2's hand: AC 10C KD QD JD
There are 10 cards in the stock

player1: <bots.bully.bully.Bot object at 0x0000027415773430>
player2: <bots.ml.ml.Bot object at 0x0000027415725E50>
*   Player 2 plays: KD
*   Player 2 melds a marriage between KD and QD
The game is in phase: 1
Player 1's points: 0, pending: 0
Player 2's points: 0, pending: 20
The trump suit is: S
Player 1's hand: AD QH JH 10S QS
Player 2's hand: AC 10C KD QD JD
There are 10 cards in the stock
Player 2 has played card: K of D

*   Player 1 plays: 10S
The game is in phase: 1
Player 1's points: 14, pending: 0
Player 2's points: 0, pending: 20
The trump suit is: S
Player 1's hand: AD QH JH KS QS
Player 2's hand: AC 10C QD JD AH
There are 8

In [53]:
MyReport3 = """
Tournament between rand and ml
Game finished. Player ml has won, receiving 2 points.

Detailed Result:
Player 1's points: 0, pending: 0
Player 2's points: 0, pending: 0
The trump suit is: D
Player 1's hand: AC QC QH AS JS
Player 2's hand: 10C JC AD JD AH
There are 10 cards in the stock

player1: <bots.rand.rand.Bot object at 0x000001BFF0EB4430>
player2: <bots.ml.ml.Bot object at 0x000001BFF0E86E50>
*   Player 1 plays: AS
The game is in phase: 1
Player 1's points: 0, pending: 0
Player 2's points: 0, pending: 0
The trump suit is: D
Player 1's hand: AC QC QH AS JS
Player 2's hand: 10C JC AD JD AH
There are 10 cards in the stock
Player 1 has played card: A of S

*   Player 2 plays: JD
The game is in phase: 1
Player 1's points: 0, pending: 0
Player 1's hand: 10D QD
Player 2's hand: AD KD
There are 0 cards in the stock

Game finished. Player 2 has won, receiving 2 points.

________________________________________________________________________________________________

Tournament between bully and ml
Game finished. Player ml has won, receiving 1 point.


Detailed Result:
Player 1's points: 0, pending: 0
Player 2's points: 0, pending: 0
The trump suit is: S
Player 1's hand: AD QH JH 10S QS
Player 2's hand: AC 10C KD QD JD
There are 10 cards in the stock

player1: <bots.bully.bully.Bot object at 0x0000027415773430>
player2: <bots.ml.ml.Bot object at 0x0000027415725E50>
*   Player 2 plays: KD
*   Player 2 melds a marriage between KD and QD
The game is in phase: 1
Player 1's points: 0, pending: 0
Player 2's points: 0, pending: 20
The trump suit is: S
Player 1's hand: AD QH JH 10S QS
Player 2's hand: AC 10C KD QD JD
There are 10 cards in the stock
Player 2 has played card: K of D

*   Player 1 plays: 10S
The game is in phase: 1
Player 1's hand:
Player 2's hand:
There are 0 cards in the stock

Game finished. Player 2 has won, receiving 1 points.

By observing the above results, it is clear that the machine learning bot does a better job.
(The kbbot was used to train the machine learning model above.)

"""

## Testing more than one ML agent

We will need a more robust way of testing different machine learning approaches against each other. Change the training script so that it doesn't overwrite the previous model. Now write a script that creates two ml players with different models and plays games between them. This might then look like this: 

> from bots.ml import ml<br>
player1 = ml.Bot(model_file='./models/rand-model.pkl')<br>
player2 = ml.Bot(model_file='./models/rdeep-model.pkl')

Read train-ml-bot.py carefully for inspiration.
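One possible way to make the comparison more robust is to play many games and tally the results instead of looking at a single game. The sketch below assumes that one game can be wrapped in a function returning a `(winner, points)` tuple. `run_match` and `fake_game` are hypothetical names, and `fake_game` only simulates random outcomes so the example runs without the Schnapsen framework.

```python
import random


def run_match(play_game, n_games):
    """Play n_games and tally wins and points per player.

    play_game: callable taking no arguments and returning (winner, points),
               where winner is 1 or 2.
    """
    wins = {1: 0, 2: 0}
    points = {1: 0, 2: 0}
    for _ in range(n_games):
        winner, pts = play_game()
        wins[winner] += 1
        points[winner] += pts
    return wins, points


rng = random.Random(0)  # fixed seed so the toy run is reproducible


def fake_game():
    """Hypothetical stand-in for one real game: random winner, 1 to 3 points."""
    return rng.choice([1, 2]), rng.choice([1, 2, 3])


wins, points = run_match(fake_game, n_games=100)
print(wins, points)
```

In the real setting, `play_game` would wrap a call to the game engine with the two ml players. Swapping which bot is player 1 halfway through the match is a simple way to reduce any advantage of moving first.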

### Task 4

Make three models: one by observing rand players, one by observing rdeep players, and one by observing one of the ml players you made earlier. Describe the experiments you run, and their results in the next cell. 

In [55]:
# Training again with rand to have a separate model for rand
!python train-ml-bot.py

Starting training phase...
Iteration 1, loss = 0.62270931
Validation score: 0.663670
Iteration 2, loss = 0.59660249
Validation score: 0.669362
Iteration 3, loss = 0.58850406
Validation score: 0.678546
Iteration 4, loss = 0.58413650
Validation score: 0.680228
Iteration 5, loss = 0.58182305
Validation score: 0.681068
Iteration 6, loss = 0.58014369
Validation score: 0.679840
Iteration 7, loss = 0.57901081
Validation score: 0.681327
Iteration 8, loss = 0.57776621
Validation score: 0.683462
Iteration 9, loss = 0.57687024
Validation score: 0.683009
Iteration 10, loss = 0.57581692
Validation score: 0.683979
Iteration 11, loss = 0.57467291
Validation score: 0.683074
Iteration 12, loss = 0.57399614
Validation score: 0.685855
Iteration 13, loss = 0.57278269
Validation score: 0.685208
Iteration 14, loss = 0.57143073
Validation score: 0.686502
Iteration 15, loss = 0.57004923
Validation score: 0.689800
Iteration 16, loss = 0.56880834
Validation score: 0.688507
Iteration 17, loss = 0.56739019
Valida

In [80]:
from api import State, engine, util
from bots.ml import ml

player_rand = ml.Bot(model_file='./models/rand-model.pkl')
player_rdeep = ml.Bot(model_file='./models/rdeep-model.pkl')
player_kbbot = ml.Bot(model_file='./models/kb-model.pkl')

startphase = 1
verbose = True



./models/rand-model.pkl
./models/rdeep-model.pkl
./models/kb-model.pkl


In [81]:
# Game between player 1 (ml bot trained on rand) and player 2 (ml bot trained on rdeep)
engine.play(player_rand, player_rdeep, state=State.generate(phase=startphase), max_time=10000, verbose=verbose)

player1: <bots.ml.ml.Bot object at 0x0000016B74BFD7F0>
player2: <bots.ml.ml.Bot object at 0x0000016B74B9DB80>
*   Player 2 plays: KS
The game is in phase: 1
Player 1's points: 0, pending: 0
Player 2's points: 0, pending: 0
The trump suit is: S
Player 1's hand: 10C AD 10H JH QS
Player 2's hand: AC 10D QD QH KS
There are 10 cards in the stock
Player 2 has played card: K of S

*   Player 1 plays: JH
The game is in phase: 1
Player 1's points: 0, pending: 0
Player 2's points: 6, pending: 0
The trump suit is: S
Player 1's hand: 10C AD JD 10H QS
Player 2's hand: AC QC 10D QD QH
There are 8 cards in the stock

*   Player 2 plays: 10D
The game is in phase: 1
Player 1's points: 0, pending: 0
Player 2's points: 6, pending: 0
The trump suit is: S
Player 1's hand: 10C AD JD 10H QS
Player 2's hand: AC QC 10D QD QH
There are 8 cards in the stock
Player 2 has played card: 10 of D

*   Player 1 plays: AD
The game is in phase: 1
Player 1's points: 21, pending: 0
Player 2's points: 6, pending: 0
The trum

(2, 1)

In [82]:
# Game between player 1 (ml bot trained on rand) and player 2 (ml bot trained on kbbot)
engine.play(player_rand, player_kbbot, state=State.generate(phase=startphase), max_time=10000, verbose=verbose)

player1: <bots.ml.ml.Bot object at 0x0000016B74BFD7F0>
player2: <bots.ml.ml.Bot object at 0x0000016B74B9DA90>
*   Player 2 plays: 10S
The game is in phase: 1
Player 1's points: 0, pending: 0
Player 2's points: 0, pending: 0
The trump suit is: H
Player 1's hand: 10C JC AD 10D KD
Player 2's hand: AH 10H KH AS 10S
There are 10 cards in the stock
Player 2 has played card: 10 of S

*   Player 1 plays: JC
The game is in phase: 1
Player 1's points: 0, pending: 0
Player 2's points: 12, pending: 0
The trump suit is: H
Player 1's hand: AC 10C AD 10D KD
Player 2's hand: JD AH 10H KH AS
There are 8 cards in the stock

*   Player 2 plays: JD
The game is in phase: 1
Player 1's points: 0, pending: 0
Player 2's points: 12, pending: 0
The trump suit is: H
Player 1's hand: AC 10C AD 10D KD
Player 2's hand: JD AH 10H KH AS
There are 8 cards in the stock
Player 2 has played card: J of D

*   Player 1 plays: AD
The game is in phase: 1
Player 1's points: 13, pending: 0
Player 2's points: 12, pending: 0
The 

(2, 1)

In [83]:
# Game between player 1 (ml bot trained on kbbot) and player 2 (ml bot trained on rdeep)
engine.play(player_kbbot, player_rdeep, state=State.generate(phase=startphase), max_time=10000, verbose=verbose)

player1: <bots.ml.ml.Bot object at 0x0000016B74B9DA90>
player2: <bots.ml.ml.Bot object at 0x0000016B74B9DB80>
*   Player 2 plays: 10C
The game is in phase: 1
Player 1's points: 0, pending: 0
Player 2's points: 0, pending: 0
The trump suit is: S
Player 1's hand: KC QC AH QS JS
Player 2's hand: AC 10C QD KH KS
There are 10 cards in the stock
Player 2 has played card: 10 of C

*   Player 1 plays: JS
The game is in phase: 1
Player 1's points: 12, pending: 0
Player 2's points: 0, pending: 0
The trump suit is: S
Player 1's hand: KC QC 10D AH QS
Player 2's hand: AC QD JD KH KS
There are 8 cards in the stock

*   Player 1 plays: KC
*   Player 1 melds a marriage between KC and QC
The game is in phase: 1
Player 1's points: 12, pending: 20
Player 2's points: 0, pending: 0
The trump suit is: S
Player 1's hand: KC QC 10D AH QS
Player 2's hand: AC QD JD KH KS
There are 8 cards in the stock
Player 1 has played card: K of C

*   Player 2 plays: AC
The game is in phase: 1
Player 1's points: 12, pending

(1, 1)

In [84]:
MyReport4 = """
Player KB has won against rdeep, receiving 1 point.
Player KB has won against rand, receiving 1 point.
Player Rdeep has won against rand, receiving 1 point.

All players were trained on observations of their corresponding bots, but we see that
player Kbbot performs better against the other bots.

"""

## Improving the set of features 

You may not see a lot of improvement from clever tricks like this. This is because the game has a lot of belief states, i.e. it has an extremely broad search tree. The machine learning model only sees a small proportion of them, and chances are that no “similar” game has been described in the training set. Maybe the card deck and number of won or lost points. 

To improve this, we might need better features. Think of some simple additional features and add them to the features() method in ml.py.

> Note that this means the bot can no longer use your old models, since they rely on 4-dimensional feature vectors. You'll most likely want to create a copy of ml.py for every feature-extraction strategy you would like to try, or add different feature extractors as a parameter to the bot.

Feature extraction is an art. You want to translate the information in the state into numbers in a way that makes sense to a linear model. We'll discuss this in depth in the lectures. To start with, just try to think of numbers you can compute from the state that are high if the state is good for player one and low if the state is bad. You might want to add combinations of important features to create a design matrix, as discussed in class. 
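As a rough illustration of the idea above, the sketch below turns a few state quantities into a feature vector. It does not use the real `State` API: `ToyState`, its fields and `toy_features` are hypothetical stand-ins, and in ml.py you would read the equivalent values from the state object instead.

```python
from dataclasses import dataclass


# Hypothetical stand-in for the game state; the field names are
# assumptions for illustration and are not the framework's State API.
@dataclass
class ToyState:
    p1_points: int
    p2_points: int
    p1_pending: int
    p2_pending: int
    p1_trumps_in_hand: int
    phase: int  # 1 or 2


def toy_features(state: ToyState) -> list:
    """Return numbers that tend to be high when the state is good for player 1."""
    total = state.p1_points + state.p2_points
    # Share of the points won so far by player 1 (0.5 when nobody has scored)
    point_share = state.p1_points / total if total > 0 else 0.5
    return [
        point_share,
        state.p1_points - state.p2_points,    # point lead of player 1
        state.p1_pending - state.p2_pending,  # pending (marriage) points lead
        state.p1_trumps_in_hand,              # trump cards held by player 1
        1.0 if state.phase == 2 else 0.0,     # phase indicator
    ]


print(toy_features(ToyState(12, 0, 20, 0, 2, 1)))
```

Each entry is oriented so that a larger value should favour player one, which is the property a linear model can exploit most directly.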

### Task 5
Add some simple features and show that the player improves. Describe the features you added, copy their code and copy the result of the tournament into the following cell. 

In [None]:
MyReport5 = """
Put your answer here
"""

## Feature engineering 

Finally, since coming up with features is an ad-hoc business, you'll want to test the features you come up with to see if they actually add to the performance. How would you go about this? Could there be features that depend on each other? I.e., add feature A or B separately and there's no improvement, but add them together and the bot gets better?
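One possible way to approach the question above is to play many games with each feature set and compare win rates while accounting for noise. The sketch below is only a rough check based on the binomial standard error; the function names are hypothetical and the win counts are made up for illustration, not taken from any tournament.

```python
import math


def win_rate_with_error(wins, games):
    """Return the win rate and its binomial standard error."""
    if games <= 0:
        raise ValueError("games must be positive")
    rate = wins / games
    stderr = math.sqrt(rate * (1.0 - rate) / games)
    return rate, stderr


def clearly_better(wins_a, wins_b, games):
    """Rough check: does A's win rate exceed B's by more than
    two combined standard errors?"""
    rate_a, err_a = win_rate_with_error(wins_a, games)
    rate_b, err_b = win_rate_with_error(wins_b, games)
    return rate_a - rate_b > 2.0 * math.sqrt(err_a ** 2 + err_b ** 2)


# Made-up counts for illustration only, not real tournament results
print(clearly_better(wins_a=60, wins_b=50, games=100))     # False
print(clearly_better(wins_a=600, wins_b=500, games=1000))  # True
```

The same ten-point difference in win rate is inconclusive over 100 games but clear over 1000. To look for features that only help together, the same comparison could be repeated for the bot without A and B, with A only, with B only, and with both.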

### Task 6

As shown in the lecture, adding the product of existing features (a design matrix) is a simple way to increase the power of your method without changing models.

Try to add at least 2 combined features to your feature table and evaluate it in a number of tournaments. Describe the new features and copy the code in the next cell. 

Also, copy the result of the experiments and an interpretation in your own words. 
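A minimal sketch of the product features described in this task, assuming the feature vector is a plain list of numbers. The helper name `add_product_features` is hypothetical and not part of ml.py.

```python
from itertools import combinations


def add_product_features(features):
    """Append the product of every pair of distinct features to the vector."""
    products = [a * b for a, b in combinations(features, 2)]
    return list(features) + products


print(add_product_features([2.0, 3.0, 5.0]))
# prints [2.0, 3.0, 5.0, 6.0, 10.0, 15.0]
```

With n original features this adds n(n-1)/2 products, so the vector grows quickly. Any model trained on the old vector length has to be retrained.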



In [None]:
MyReport6 = """
Put your answer here
"""

## Final Task: Collect all the results

Uncomment and run this cell (and all the cells above) to generate the text file that you have to hand in together with the notebook on Canvas!

### Please hand in only the text file which is generated by this method!

In [None]:
from utils import *
exportToText("assignment6.txt", MyCode1,MyResults1, MyReport2, MyReport3, MyReport4, MyReport5, MyReport6)

## Knock yourself out

Of course, there is a wealth of other things to explore. Here are some things you can try out, which will be similar to the things to do for the Project Intelligent Systems:

1) Have a look at the sklearn documentation: http://scikit-learn.org/stable/modules/classes.html. It's a bit complex, but maybe you can figure out how to use different machine learning models. The logistic regression we used is a very simple starting point.

2) Evaluate your model on the dataset by cross-validation. See if you can improve the performance by tweaking its parameters.
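To make the cross-validation idea in item 2 concrete, here is a small sketch that only splits sample indices into k folds using the standard library. The function name `k_fold_indices` is hypothetical. In practice sklearn's `cross_val_score` from `sklearn.model_selection` covers both the splitting and the scoring.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of (nearly) equal size
    folds = [indices[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


for train, test in k_fold_indices(n_samples=10, k=5):
    print(len(train), len(test))
```

Each sample ends up in the test fold exactly once, so averaging the k scores gives a less noisy estimate than a single train/test split.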

Have fun. To be continued in Project Intelligent Systems in Period 3. 