# Intelligent Systems 2022: 11th practical assignment
## Machine Learning Agents

Your name: Sebastião Manuel Inácio Rosalino

Your VUnetID: sxx209

If you do not provide your name and VUnetID we will not accept your submission. 

### Preliminaries

At the end of this exercise you should be able to work with some basic Machine Learning concepts, and implement and evaluate a learning-based approach to playing Schnapsen. In this notebook we are going to create an adaptive bot. We will use the principle discussed in the machine learning lecture, but now in an agent setting. This comes down to using basic hill-climbing search, but learning the heuristic function rather than implementing it by hand. This will require a few basic ingredients:

> Script that plays games between existing bots and creates a dataset to learn from. The dataset contains each observed state, labeled with the (eventual) winner of the game. See the script train-ml-bot.py.<br>
> A function that translates a state object to a feature vector. See the function features(...) in ml.py<br>
> An implementation with a hill-climbing bot that gets its heuristic from a machine learning model. See bots/ml/ml.py 

Feature vectors were discussed in the lecture. 
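
To make the idea of a learned heuristic concrete, here is a minimal sketch of a single hill-climbing step. It does not use the course framework: `ToyState`, `features`, `learned_heuristic`, `best_move` and the weights are hypothetical stand-ins chosen for illustration, and the real bot in bots/ml/ml.py may be organised differently.

```python
from typing import Dict, List, Tuple

# Hypothetical stand-in for a game state: (my_points, opponent_points).
ToyState = Tuple[int, int]


def features(state: ToyState) -> List[float]:
    """Translate a toy state into a feature vector (here: the two point totals)."""
    my_points, opponent_points = state
    return [float(my_points), float(opponent_points)]


def learned_heuristic(state: ToyState, weights: List[float]) -> float:
    """Score a state as a weighted sum of its features (a linear model)."""
    return sum(w * x for w, x in zip(weights, features(state)))


def best_move(successors: Dict[str, ToyState], weights: List[float]) -> str:
    """One hill-climbing step: pick the move whose successor state scores highest."""
    return max(successors, key=lambda move: learned_heuristic(successors[move], weights))


# Weights are made up for this sketch; in the assignment they come from training.
weights = [1.0, -1.0]
successors = {"play_ace": (11, 0), "play_jack": (2, 0), "lose_trick": (0, 10)}
chosen = best_move(successors, weights)
print(chosen)
```

The only part that is learned is the weight vector; the search itself stays a plain "take the best-scoring successor" loop.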


### Practicalities

Follow this Notebook step-by-step. For this course it is necessary that you manipulate the Python programs we provide. You can do the exercises in any programming editor of your liking. Still, please fill in the questions in this notebook as usual. You can also run tournaments in it if you want, but running them in your editor or via the command line is usually more convenient. 

Please use your studentID+Assignment11.ipynb as the name of the Notebook, and fill in the missing cells.   

Note: unlike the courses dedicated to programming, we will not evaluate the style of the programs. We will, however, test your programs on other data that we provide, and your program should give the correct output for the test data as well.

As was mentioned, the assignment is graded as pass/fail. To pass you need to have either a full working code or an explanation of what you tried and what didn't work for the tasks that you were unable to complete (you can use multi-line comments or a text cell).


## Train a Machine Learning Model 

The plan is as follows: we run the train-ml-bot.py script, which creates a model for us and places it in the bots/ml directory. All you need to do is complete the feature extraction method in bots/ml/ml.py. It returns a basic feature vector modelling the properties of the game state, or more precisely the bot's perspective of the game state (which means that in phase 1 of the game some of the feature values are unknown, namely those for the cards that are either in the adversary's hand or in the pile). 

To complete the function, you'll need to write some code which transforms information you get from state.py into integer values.
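
As an illustration of turning state information into integers, the sketch below one-hot encodes categorical values so that a linear model does not read an ordering into them. The suit labels in `SUITS`, the helper names and the argument list of `encode` are assumptions made for this example, not the framework's API.

```python
from typing import List

# Assumed suit labels for this sketch; check the framework for the real ones.
SUITS = ["C", "D", "H", "S"]


def one_hot(value: str, categories: List[str]) -> List[int]:
    """Return a list with a 1 at the position of `value` and 0 elsewhere."""
    return [1 if value == category else 0 for category in categories]


def encode(points: int, phase: int, trump_suit: str) -> List[int]:
    """Build a small integer feature vector from three pieces of state."""
    feature_set = [points]                      # numeric value, used as-is
    feature_set += one_hot(trump_suit, SUITS)   # categorical value, one-hot
    feature_set += [1 if phase == 1 else 0, 1 if phase == 2 else 0]
    return feature_set


example = encode(points=24, phase=1, trump_suit="H")
print(example)
```

Numeric quantities such as points can be appended directly, while categories such as the trump suit get one position each.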

To run the bots using the commandline/terminal:
- If you want to play 2 bots against each other, e.g. rand and bully: python play.py -1 rand -2 bully
- To see what other options there are: python play.py --help
- If you run python tournament.py it'll play a round-robin tournament between bully, rand and rdeep where every pair of players plays 10 matches. Run python tournament.py --help to see how you can change the players and the number of games played (if needed).

### Task 1

Fill in the missing code (all the '???' lines) and run a number of games to check whether your agent "works". You can either run the play.py script in a command line, or copy the play code from one of the previous notebooks here (do not forget to import all the necessary modules and code). 

*Hint: You need to run train-ml-bot.py after finishing the ML bot.*

*Hint 2: If there is a problem, try to run the tournament without the "fast" option*

Please copy your code in the following cell




In [33]:
MyCode1 = """

value = self.heuristic(next_state)

#Add player 1's points to feature set
p1_points = state.get_points(1)

# Add player 2's points to feature set
p2_points = state.get_points(2)

# Add player 1's pending points to feature set
p1_pending_points = state.get_pending_points(1)

# Add player 2's pending points to feature set
p2_pending_points = state.get_pending_points(2)

# Get trump suit
trump_suit = state.get_trump_suit()

# Add phase to feature set
phase = state.get_phase()

# Add stock size to feature set
stock_size = state.get_stock_size()

# Add leader to feature set
leader = state.leader()

# Add whose turn it is to feature set
whose_turn = state.whose_turn()

# Add opponent's played card to feature set
opponents_played_card = state.get_opponents_played_card()

"""

Run a tournament between rand, bully and ml, and copy the result of the tournament in the following cell. 

In [26]:
import sys, random

from api import State, engine, util

# Round-robin tournament Rand vs Bully vs Ml (Rand based training)

botnames = []
verbose = False 
myphase = 1
myrepeats = 30

# Create all the players
player1 = util.load_player("rand")
player2 = util.load_player("bully") 
player3 = util.load_player("ml") 


bots = [player1,player2,player3]

n = len(bots)
wins = [0] * len(bots)
matches = [(p1, p2) for p1 in range(n) for p2 in range(n) if p1 < p2]

totalgames = (n*n - n)/2 * myrepeats
playedgames = 0

print('Playing {} games:'.format(int(totalgames)))
for a, b in matches:
    for r in range(myrepeats):

        if random.choice([True, False]):
            p = [a, b]
        else:
            p = [b, a]

        # Generate a state with a random seed
        state = State.generate(phase=myphase)

        winner, score = engine.play(bots[p[0]], bots[p[1]], state, 1000, verbose, True)

        if winner is not None:
            winner = p[winner - 1]
            wins[winner] += score

        playedgames += 1
        print('Played {} out of {:.0f} games ({:.0f}%): {} \r'.format(playedgames, totalgames, playedgames/float(totalgames) * 100, wins))

print('Results:')
for i in range(len(bots)):
    print('    bot {}: {} points'.format(bots[i], wins[i]))

C:\Universidade_ISCTE\terceiro_ano\intelligent_systems\assignments\assignment_11\bots\ml/model.pkl
C:\Universidade_ISCTE\terceiro_ano\intelligent_systems\assignments\assignment_11\bots\ml/model.pkl
Playing 90 games:
Played 1 out of 90 games (1%): [1, 0, 0] 
Played 2 out of 90 games (2%): [2, 0, 0] 
Played 3 out of 90 games (3%): [2, 3, 0] 
Played 4 out of 90 games (4%): [2, 5, 0] 
Played 5 out of 90 games (6%): [2, 8, 0] 
Played 6 out of 90 games (7%): [3, 8, 0] 
Played 7 out of 90 games (8%): [4, 8, 0] 
Played 8 out of 90 games (9%): [4, 9, 0] 
Played 9 out of 90 games (10%): [5, 9, 0] 
Played 10 out of 90 games (11%): [5, 12, 0] 
Played 11 out of 90 games (12%): [6, 12, 0] 
Played 12 out of 90 games (13%): [6, 14, 0] 
Played 13 out of 90 games (14%): [8, 14, 0] 
Played 14 out of 90 games (16%): [9, 14, 0] 
Played 15 out of 90 games (17%): [10, 14, 0] 
Played 16 out of 90 games (18%): [10, 17, 0] 
Played 17 out of 90 games (19%): [12, 17, 0] 
Played 18 out of 90 games (20%): [12, 20, 

In [34]:
MyResults1 = """

Results:
    bot <bots.rand.rand.Bot object at 0x000001A8F5EFD930>: 29 points
    bot <bots.bully.bully.Bot object at 0x000001A8F5EFC280>: 39 points
    bot <bots.ml.ml.Bot object at 0x000001A8F5EFDA20>: 76 points
    
As can be seen, after generating a dataset of training games played by the Rand bot against itself and training the model on those 
instances, the ML bot was able to beat both Bully and Rand by a huge margin in a round-robin tournament with 30 repetitions.

"""

### Task 2: 

The first thing we can do to improve the bot, is to improve the quality of the games it observes. Change the player in train-ml-bot.py to a kbbot and/or rdeep player, and retrain the model. You may wish to lower the number of games played in train-ml-bot.py if the games are taking a long time. Please describe in the following cell what you can observe when running a tournament like before after training ml. 


In [27]:
# Round-robin tournament Rand vs Bully vs Ml (Rdeep based training)

botnames = []
verbose = False 
myphase = 1
myrepeats = 30

# Create all the players
player1 = util.load_player("rand")
player2 = util.load_player("bully") 
player3 = util.load_player("ml") 


bots = [player1,player2,player3]

n = len(bots)
wins = [0] * len(bots)
matches = [(p1, p2) for p1 in range(n) for p2 in range(n) if p1 < p2]

totalgames = (n*n - n)/2 * myrepeats
playedgames = 0

print('Playing {} games:'.format(int(totalgames)))
for a, b in matches:
    for r in range(myrepeats):

        if random.choice([True, False]):
            p = [a, b]
        else:
            p = [b, a]

        # Generate a state with a random seed
        state = State.generate(phase=myphase)

        winner, score = engine.play(bots[p[0]], bots[p[1]], state, 1000, verbose, True)

        if winner is not None:
            winner = p[winner - 1]
            wins[winner] += score

        playedgames += 1
        print('Played {} out of {:.0f} games ({:.0f}%): {} \r'.format(playedgames, totalgames, playedgames/float(totalgames) * 100, wins))

print('Results:')
for i in range(len(bots)):
    print('    bot {}: {} points'.format(bots[i], wins[i]))

C:\Universidade_ISCTE\terceiro_ano\intelligent_systems\assignments\assignment_11\bots\ml/model.pkl
C:\Universidade_ISCTE\terceiro_ano\intelligent_systems\assignments\assignment_11\bots\ml/model.pkl
Playing 90 games:
Played 1 out of 90 games (1%): [1, 0, 0] 
Played 2 out of 90 games (2%): [1, 2, 0] 
Played 3 out of 90 games (3%): [2, 2, 0] 
Played 4 out of 90 games (4%): [2, 5, 0] 
Played 5 out of 90 games (6%): [2, 8, 0] 
Played 6 out of 90 games (7%): [4, 8, 0] 
Played 7 out of 90 games (8%): [5, 8, 0] 
Played 8 out of 90 games (9%): [5, 11, 0] 
Played 9 out of 90 games (10%): [5, 13, 0] 
Played 10 out of 90 games (11%): [6, 13, 0] 
Played 11 out of 90 games (12%): [7, 13, 0] 
Played 12 out of 90 games (13%): [7, 16, 0] 
Played 13 out of 90 games (14%): [7, 18, 0] 
Played 14 out of 90 games (16%): [7, 20, 0] 
Played 15 out of 90 games (17%): [7, 23, 0] 
Played 16 out of 90 games (18%): [7, 26, 0] 
Played 17 out of 90 games (19%): [7, 29, 0] 
Played 18 out of 90 games (20%): [7, 32, 0]

In [36]:
MyReport2 = """

Results:
    bot <bots.rand.rand.Bot object at 0x000001A8F5FC8610>: 18 points
    bot <bots.bully.bully.Bot object at 0x000001A8F5FC9AE0>: 51 points
    bot <bots.ml.ml.Bot object at 0x000001A8F5FC8760>: 91 points

After retraining the ML model on a new dataset of Schnapsen training games played by Rdeep, 
I ran a round-robin tournament again, with 30 repetitions, between Rand, Bully and ML. The ML bot was again the winner, further increasing
its previous advantage (when using the Rand bot as its learning base) over the two remaining bots. This is no surprise because, as seen in previous 
assignments, Rdeep is strategically superior to Rand.

"""

## Training in different phases

Using alphabeta for training might not be a good idea, since it has to start in phase 2 with perfect information. This may not translate so well to phase 1 gameplay. Nevertheless, it is a good idea to experiment. If you wish to do this, you have to specify in train-ml-bot.py that the training games start in phase 2.

### Task 3

Re-run the tournament. Does the machine learning bot do better? Show the output, and mention which bot was used for training.


In [30]:
# Round-robin tournament Rand vs Bully vs Ml (Alphabeta based training)

botnames = []
verbose = False 
myphase = 1
myrepeats = 30

# Create all the players
player1 = util.load_player("rand")
player2 = util.load_player("bully") 
player3 = util.load_player("ml") 


bots = [player1,player2,player3]

n = len(bots)
wins = [0] * len(bots)
matches = [(p1, p2) for p1 in range(n) for p2 in range(n) if p1 < p2]

totalgames = (n*n - n)/2 * myrepeats
playedgames = 0

print('Playing {} games:'.format(int(totalgames)))
for a, b in matches:
    for r in range(myrepeats):

        if random.choice([True, False]):
            p = [a, b]
        else:
            p = [b, a]

        # Generate a state with a random seed
        state = State.generate(phase=myphase)

        winner, score = engine.play(bots[p[0]], bots[p[1]], state, 1000, verbose, True)

        if winner is not None:
            winner = p[winner - 1]
            wins[winner] += score

        playedgames += 1
        print('Played {} out of {:.0f} games ({:.0f}%): {} \r'.format(playedgames, totalgames, playedgames/float(totalgames) * 100, wins))

print('Results:')
for i in range(len(bots)):
    print('    bot {}: {} points'.format(bots[i], wins[i]))

C:\Universidade_ISCTE\terceiro_ano\intelligent_systems\assignments\assignment_11\bots\ml/model.pkl
C:\Universidade_ISCTE\terceiro_ano\intelligent_systems\assignments\assignment_11\bots\ml/model.pkl
Playing 90 games:
Played 1 out of 90 games (1%): [0, 2, 0] 
Played 2 out of 90 games (2%): [0, 5, 0] 
Played 3 out of 90 games (3%): [1, 5, 0] 
Played 4 out of 90 games (4%): [1, 8, 0] 
Played 5 out of 90 games (6%): [2, 8, 0] 
Played 6 out of 90 games (7%): [2, 11, 0] 
Played 7 out of 90 games (8%): [2, 12, 0] 
Played 8 out of 90 games (9%): [3, 12, 0] 
Played 9 out of 90 games (10%): [3, 15, 0] 
Played 10 out of 90 games (11%): [3, 16, 0] 
Played 11 out of 90 games (12%): [4, 16, 0] 
Played 12 out of 90 games (13%): [4, 19, 0] 
Played 13 out of 90 games (14%): [4, 20, 0] 
Played 14 out of 90 games (16%): [4, 23, 0] 
Played 15 out of 90 games (17%): [4, 26, 0] 
Played 16 out of 90 games (18%): [4, 28, 0] 
Played 17 out of 90 games (19%): [6, 28, 0] 
Played 18 out of 90 games (20%): [6, 30, 

In [37]:
MyReport3 = """

Results:
    bot <bots.rand.rand.Bot object at 0x000001A8F5FC8460>: 26 points
    bot <bots.bully.bully.Bot object at 0x000001A8F5FCA500>: 45 points
    bot <bots.ml.ml.Bot object at 0x000001A8F5FCADA0>: 86 points

To begin with, I generated a new dataset of Schnapsen games and then retrained the ML model, using the Alphabeta bot 
(starting in phase 2) to play the training games. After holding a round-robin tournament between Rand, Bully and ML, I was able to 
conclude that the ML model once again convincingly beat the other two players. Again, nothing too surprising, since, as seen 
in previous assignments, the Alphabeta bot is stronger than Rand and Bully. It should be noted that, even though it won by a large 
margin, the ML model trained on Alphabeta did not do better than the one trained on Rdeep in this tournament: it scored 5 points fewer (86 versus 91).

"""

## Testing more than one ML agent

We will need a more robust way of testing different machine learning approaches against each other. Change the training script so that it doesn't overwrite the previous model. Now write a script that creates two ml players with different models and plays games between them. This might then look like this: 

> from bots.ml import ml<br>
player1 = ml.Bot(model_file='./models/rand-model.pkl')<br>
player2 = ml.Bot(model_file='./models/rdeep-model.pkl')

Read train-ml-bot.py carefully for inspiration.
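
One way to keep earlier models around is to include the name of the observed bot in the output filename. The sketch below shows only that idea, using the standard pickle module and a temporary directory; `save_model`, `load_model`, the dictionary "models" and the naming scheme are hypothetical and not taken from train-ml-bot.py.

```python
import pickle
import tempfile
from pathlib import Path


def save_model(model: object, directory: Path, trainer_name: str) -> Path:
    """Store a model under a name derived from the bot it was trained on."""
    directory.mkdir(parents=True, exist_ok=True)
    path = directory / f"{trainer_name}-model.pkl"
    with path.open("wb") as handle:
        pickle.dump(model, handle)
    return path


def load_model(path: Path) -> object:
    """Read a previously saved model back from disk."""
    with path.open("rb") as handle:
        return pickle.load(handle)


# Placeholder "models": plain dictionaries stand in for trained classifiers.
models_dir = Path(tempfile.mkdtemp()) / "models"
rand_path = save_model({"weights": [0.1, -0.2]}, models_dir, "rand")
rdeep_path = save_model({"weights": [0.4, -0.3]}, models_dir, "rdeep")
saved_files = sorted(p.name for p in models_dir.iterdir())
print(saved_files)
```

Because each training run writes to its own file, two bots can later be constructed from two different model files and played against each other.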

### Task 4

Make three models: one by observing rand players, one by observing rdeep players, and one by observing one of the ml players you made earlier. Describe the experiments you run, and their results in the next cell. 

In [1]:
import sys, random

from api import State, engine, util

In [2]:
from bots.ml import ml

rand_trained = ml.Bot(model_file='./bots/ml/model.pkl')
rdeep_trained = ml.Bot(model_file='./bots/ml/rdeep_model.pkl')
alphabeta_trained = ml.Bot(model_file='./bots/ml/alphabeta_model.pkl')

./bots/ml/model.pkl
./bots/ml/rdeep_model.pkl
./bots/ml/alphabeta_model.pkl


In [5]:
botnames = []
verbose = False 
myphase = 1
myrepeats = 30

# Create all the players
ml_rand = rand_trained
ml_rdeep = rdeep_trained
ml_alphabeta = alphabeta_trained

bot_names = ['ML - Rand', 'ML - Rdeep', 'ML - Alphabeta']

bots = [ml_rand, ml_rdeep, ml_alphabeta]

n = len(bots)
wins = [0] * len(bots)
matches = [(p1, p2) for p1 in range(n) for p2 in range(n) if p1 < p2]

totalgames = (n*n - n)/2 * myrepeats
playedgames = 0

print('Playing {} games:'.format(int(totalgames)))
for a, b in matches:
    for r in range(myrepeats):

        if random.choice([True, False]):
            p = [a, b]
        else:
            p = [b, a]

        # Generate a state with a random seed
        state = State.generate(phase=myphase)

        winner, score = engine.play(bots[p[0]], bots[p[1]], state, 1000, verbose, True)

        if winner is not None:
            winner = p[winner - 1]
            wins[winner] += score

        playedgames += 1
        print('Played {} out of {:.0f} games ({:.0f}%): {} \r'.format(playedgames, totalgames, playedgames/float(totalgames) * 100, wins))

print('Results:')
for i in range(len(bot_names)):
    print('    bot {}: {} points'.format(bot_names[i], wins[i]))

Playing 90 games:
Played 1 out of 90 games (1%): [3, 0, 0] 
Played 2 out of 90 games (2%): [4, 0, 0] 
Played 3 out of 90 games (3%): [5, 0, 0] 
Played 4 out of 90 games (4%): [7, 0, 0] 
Played 5 out of 90 games (6%): [8, 0, 0] 
Played 6 out of 90 games (7%): [9, 0, 0] 
Played 7 out of 90 games (8%): [9, 2, 0] 
Played 8 out of 90 games (9%): [9, 4, 0] 
Played 9 out of 90 games (10%): [10, 4, 0] 
Played 10 out of 90 games (11%): [10, 5, 0] 
Played 11 out of 90 games (12%): [11, 5, 0] 
Played 12 out of 90 games (13%): [12, 5, 0] 
Played 13 out of 90 games (14%): [12, 6, 0] 
Played 14 out of 90 games (16%): [12, 7, 0] 
Played 15 out of 90 games (17%): [14, 7, 0] 
Played 16 out of 90 games (18%): [16, 7, 0] 
Played 17 out of 90 games (19%): [16, 8, 0] 
Played 18 out of 90 games (20%): [16, 9, 0] 
Played 19 out of 90 games (21%): [16, 11, 0] 
Played 20 out of 90 games (22%): [16, 13, 0] 
Played 21 out of 90 games (23%): [16, 16, 0] 
Played 22 out of 90 games (24%): [17, 16, 0] 
Played 23 out

In [39]:
MyReport4 = """

Results:
    bot ML - Rand: 44 points
    bot ML - Rdeep: 47 points
    bot ML - Alphabeta: 59 points

To begin with, I created three variables, corresponding respectively to the model that observed Rand players, the one that observed Rdeep players, and 
the one that observed Alphabeta players. After performing an experiment consisting of a round-robin tournament between the three different ML models 
with 30 repetitions, the results are visible in the output. The Alphabeta observed model came up with the win (scoring 59 points), followed by the model 
that carried out its learning according to Rdeep (scoring 47 points), and in last place was the model that carried out its learning based on Rand players
(scoring 44 points). The results are in line with the known playing strength of each bot: the Alphabeta bot is the strongest among the 
three, followed by Rdeep, which is stronger than the Rand bot.

"""

## Improving the set of features 

You may not see a lot of improvement from clever tricks like this. This is because the game has a lot of belief states, i.e. it has an extremely broad search tree. The machine learning model only sees a small proportion, and chances are that no “similar” game has been described in the training set. Maybe the card deck and number of won or lost points. 

To improve this, we might need better features. Think of some simple additional features and add them to the features() method in ml.py.

> Note that this means the bot can no longer use your old models, since they rely on 4-dimensional feature vectors. You'll most likely want to create a copy of ml.py for every feature-extraction strategy you would like to try, or add different feature extractors as a parameter to the bot.

Feature extraction is an art. You want to translate the information in the state into numbers in a way that makes sense to a linear model. We'll discuss this in-depth in the lectures. To start with, just try and think of numbers you can compute from the state that are high if the state is good for player one and low if the state is bad. You might want to add combinations of important features to create a design matrix, as discussed in class. 
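
As a small example of a number that is high when the state is good for player one and low when it is bad, the sketch below computes player one's share of all points scored so far. The function names are hypothetical and the inputs are plain integers rather than a framework state.

```python
from typing import Tuple


def point_share(p1_points: int, p2_points: int) -> float:
    """Fraction of all points held by player one: 1.0 is best, 0.0 is worst."""
    total = p1_points + p2_points
    if total == 0:
        return 0.5  # nothing scored yet, so neither player is ahead
    return p1_points / total


def perspective_points(p1_points: int, p2_points: int, me: int) -> Tuple[int, int]:
    """Reorder the two totals as (my points, opponent's points)."""
    return (p1_points, p2_points) if me == 1 else (p2_points, p1_points)


print(point_share(30, 10))
print(perspective_points(30, 10, me=2))
```

A share stays between 0 and 1 regardless of how far the game has progressed, which keeps it on a similar scale to one-hot features.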

### Task 5
Add some simple features and show that the player improves. Describe the features you added, copy their code and copy the result of the tournament into the following cell. 

In [11]:
# Round-robin ML - Rand vs ML - Rdeep vs ML - Alphabeta (improved)

botnames = []
verbose = False 
myphase = 1
myrepeats = 30

# Create all the players
ml_rand = rand_trained
ml_rdeep = rdeep_trained
ml_alphabeta = alphabeta_trained

bot_names = ['ML - Rand', 'ML - Rdeep', 'ML - Alphabeta']

bots = [ml_rand, ml_rdeep, ml_alphabeta]

n = len(bots)
wins = [0] * len(bots)
matches = [(p1, p2) for p1 in range(n) for p2 in range(n) if p1 < p2]

totalgames = (n*n - n)/2 * myrepeats
playedgames = 0

print('Playing {} games:'.format(int(totalgames)))
for a, b in matches:
    for r in range(myrepeats):

        if random.choice([True, False]):
            p = [a, b]
        else:
            p = [b, a]

        # Generate a state with a random seed
        state = State.generate(phase=myphase)

        winner, score = engine.play(bots[p[0]], bots[p[1]], state, 1000, verbose, True)

        if winner is not None:
            winner = p[winner - 1]
            wins[winner] += score

        playedgames += 1
        print('Played {} out of {:.0f} games ({:.0f}%): {} \r'.format(playedgames, totalgames, playedgames/float(totalgames) * 100, wins))

print('Results:')
for i in range(len(bot_names)):
    print('    bot {}: {} points'.format(bot_names[i], wins[i]))

Playing 90 games:
Played 1 out of 90 games (1%): [0, 1, 0] 
Played 2 out of 90 games (2%): [0, 4, 0] 
Played 3 out of 90 games (3%): [0, 5, 0] 
Played 4 out of 90 games (4%): [0, 6, 0] 
Played 5 out of 90 games (6%): [1, 6, 0] 
Played 6 out of 90 games (7%): [1, 7, 0] 
Played 7 out of 90 games (8%): [2, 7, 0] 
Played 8 out of 90 games (9%): [2, 10, 0] 
Played 9 out of 90 games (10%): [2, 11, 0] 
Played 10 out of 90 games (11%): [3, 11, 0] 
Played 11 out of 90 games (12%): [3, 13, 0] 
Played 12 out of 90 games (13%): [3, 14, 0] 
Played 13 out of 90 games (14%): [5, 14, 0] 
Played 14 out of 90 games (16%): [6, 14, 0] 
Played 15 out of 90 games (17%): [7, 14, 0] 
Played 16 out of 90 games (18%): [8, 14, 0] 
Played 17 out of 90 games (19%): [8, 15, 0] 
Played 18 out of 90 games (20%): [8, 18, 0] 
Played 19 out of 90 games (21%): [10, 18, 0] 
Played 20 out of 90 games (22%): [10, 21, 0] 
Played 21 out of 90 games (23%): [10, 24, 0] 
Played 22 out of 90 games (24%): [10, 25, 0] 
Played 23 ou

In [40]:
MyReport5 = """

Results:
    bot ML - Rand: 31 points
    bot ML - Rdeep: 45 points
    bot ML - Alphabeta: 61 points

I began by adding a feature called "feature_trump_suit". That feature stores the total number of trump suit cards in the playing bot's hand.
That seemed a promising idea, since, strategy apart, the number of trump suit cards in a player's hand throughout the game is usually a meaningful
indicator of the player's chances of winning more tricks, and therefore of collecting more points to achieve victory. 

The code I used to build my feature was the following:

    # Add the number of trump suit cards
    feature_trump_suit = 0
    hand = state.hand()
    for card in hand:
        if util.get_suit(card) == trump_suit:
            feature_trump_suit += 1
    
    # Adding my new feature to the feature set
    feature_set.append(feature_trump_suit)

Finally, I re-ran the last round-robin tournament between the model that observed Rand players, the one that observed Rdeep players, and 
the one that observed Alphabeta players. The new feature was added to the Alphabeta version of the ML bot. As is visible in the 
output, the ML - Alphabeta bot won again, and the new feature appears to have improved the bot's playing strength, since it increased its
victory margin by scoring 61 points (+2 compared to the last tournament), again with 30 repetitions.

"""

## Feature engineering 

Finally, since coming up with features is an ad-hoc business, you'll want to test features you come up with to see if they actually add to the performance. How would you go about this? Could there be features that depend on each other? I.e. add feature A or B separately and there's no improvement, but add them together and the bot gets better?
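
The question about features that only help together can be illustrated with a tiny constructed dataset. In the sketch below the label is the product of two ±1 features, so each feature alone carries no information about the label while their product predicts it perfectly. The data and the `agreement` score are invented for this illustration and have nothing to do with Schnapsen states.

```python
from itertools import product
from typing import List, Sequence


def agreement(feature: Sequence[int], labels: Sequence[int]) -> float:
    """Mean of feature * label for values in {-1, +1}.

    1.0 means the feature always matches the label, 0.0 means it matches
    exactly as often as it disagrees.
    """
    return sum(f * y for f, y in zip(feature, labels)) / len(labels)


# All four combinations of two features that are each -1 or +1.
samples = list(product([-1, 1], repeat=2))
a: List[int] = [s[0] for s in samples]
b: List[int] = [s[1] for s in samples]
labels: List[int] = [x * y for x, y in samples]   # label depends on both
combined: List[int] = [x * y for x, y in zip(a, b)]

print(agreement(a, labels), agreement(b, labels), agreement(combined, labels))
```

A test procedure that only ever adds one feature at a time would discard both A and B here, which is why it can be worth evaluating small groups of features together.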

### Task 6

As shown in the lecture, adding the product of existing features (a design matrix) is a simple way to increase the power of your method without changing models.

Try to add at least 2 combined features to your feature table and evaluate it in a number of tournaments. Describe the new features and copy the code in the next cell. 

Also, copy the result of the experiments and an interpretation in your own words. 
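
A minimal sketch of the product idea, assuming the feature vector is a plain Python list: every pair of existing features is multiplied and the products are appended. The helper name `add_pairwise_products` is hypothetical.

```python
from itertools import combinations
from typing import List


def add_pairwise_products(feature_set: List[float]) -> List[float]:
    """Return the original features followed by the product of every pair."""
    products = [x * y for x, y in combinations(feature_set, 2)]
    return feature_set + products


extended = add_pairwise_products([2.0, 3.0, 5.0])
print(extended)
```

With n original features this adds n(n-1)/2 products, so the vector grows quickly; picking a few products by hand is often the more practical choice. Any model trained on the old, shorter vectors has to be retrained.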



In [22]:
# Round-robin ML - Rand vs ML - Rdeep vs ML - Alphabeta (combined features improvement)

botnames = []
verbose = False 
myphase = 1
myrepeats = 30

# Create all the players
ml_rand = rand_trained
ml_rdeep = rdeep_trained
ml_alphabeta = alphabeta_trained

bot_names = ['ML - Rand', 'ML - Rdeep', 'ML - Alphabeta']

bots = [ml_rand, ml_rdeep, ml_alphabeta]

n = len(bots)
wins = [0] * len(bots)
matches = [(p1, p2) for p1 in range(n) for p2 in range(n) if p1 < p2]

totalgames = (n*n - n)/2 * myrepeats
playedgames = 0

print('Playing {} games:'.format(int(totalgames)))
for a, b in matches:
    for r in range(myrepeats):

        if random.choice([True, False]):
            p = [a, b]
        else:
            p = [b, a]

        # Generate a state with a random seed
        state = State.generate(phase=myphase)

        winner, score = engine.play(bots[p[0]], bots[p[1]], state, 1000, verbose, True)

        if winner is not None:
            winner = p[winner - 1]
            wins[winner] += score

        playedgames += 1
        print('Played {} out of {:.0f} games ({:.0f}%): {} \r'.format(playedgames, totalgames, playedgames/float(totalgames) * 100, wins))

print('Results:')
for i in range(len(bot_names)):
    print('    bot {}: {} points'.format(bot_names[i], wins[i]))

Playing 90 games:
Played 1 out of 90 games (1%): [1, 0, 0] 
Played 2 out of 90 games (2%): [1, 1, 0] 
Played 3 out of 90 games (3%): [2, 1, 0] 
Played 4 out of 90 games (4%): [2, 4, 0] 
Played 5 out of 90 games (6%): [2, 7, 0] 
Played 6 out of 90 games (7%): [3, 7, 0] 
Played 7 out of 90 games (8%): [4, 7, 0] 
Played 8 out of 90 games (9%): [6, 7, 0] 
Played 9 out of 90 games (10%): [6, 8, 0] 
Played 10 out of 90 games (11%): [7, 8, 0] 
Played 11 out of 90 games (12%): [7, 10, 0] 
Played 12 out of 90 games (13%): [8, 10, 0] 
Played 13 out of 90 games (14%): [9, 10, 0] 
Played 14 out of 90 games (16%): [9, 12, 0] 
Played 15 out of 90 games (17%): [9, 13, 0] 
Played 16 out of 90 games (18%): [9, 14, 0] 
Played 17 out of 90 games (19%): [9, 15, 0] 
Played 18 out of 90 games (20%): [11, 15, 0] 
Played 19 out of 90 games (21%): [11, 17, 0] 
Played 20 out of 90 games (22%): [13, 17, 0] 
Played 21 out of 90 games (23%): [14, 17, 0] 
Played 22 out of 90 games (24%): [14, 19, 0] 
Played 23 out 

In [41]:
MyReport6 = """

Results:
    bot ML - Rand: 38 points
    bot ML - Rdeep: 48 points
    bot ML - Alphabeta: 62 points

I started by adding a new feature that combines player 1's and player 2's pending points by taking their difference. This addition 
seems a good idea because, with only the separate pending points (points left to secure victory) of player 1 and player 2, the
playing bot might be misled: even though one player may have a low value of pending points, the opponent may have an even lower value, 
leading the bot to make a poor move. With the difference as a single feature, the playing bot has a more meaningful and 
reliable measure of how close it truly is to a winning situation.
    
The code I used to build my new combined feature was the following:

    # Add the difference of the pending points between the two players
    difference_pending_points = p1_pending_points - p2_pending_points

    # Adding my new combined feature to the feature set
    feature_set.append(difference_pending_points)

Finally, I re-ran the last round-robin tournament between the model that observed Rand players, the one that observed Rdeep players, and 
the one that observed Alphabeta players. All my improvements (the single feature and the combined feature) were added to the Alphabeta 
version of the ML bot. As is visible in the output, the ML - Alphabeta bot won again, and the combined feature turned out to be an 
improvement (a very small one, but still) in the bot's performance, since it increased its victory margin by scoring 62 points (+1 compared to the last 
tournament). The tournament was, once more, run with 30 repetitions.

"""

## Final Task: Collect all the results

Uncomment and run this cell (and all the cells above) to generate the text file that you have to hand in together with the notebook on Canvas!

### Please hand in only the text file which is generated by this method!

In [31]:
from utils import *
exportToText("assignment11.txt", MyCode1,MyResults1, MyReport2, MyReport3, MyReport4, MyReport5, MyReport6)

## Knock yourself out

Of course, there is a wealth of other things to explore. Here are some things you can try out which will be similar to the things to do for 

1) Have a look at the sklearn documentation: http://scikit-learn.org/stable/modules/classes.html It's a bit complex, but maybe you can figure how to use different machine learning models. The logistic regression we used is a very simple starting point.

2) Evaluate your model on the dataset by cross validation. See if you can improve the performance by tweaking its parameters.
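
For the cross-validation suggestion, the core mechanism is splitting the dataset into k folds and holding each one out in turn. The sketch below does only the index bookkeeping in plain Python; `k_fold_indices` is a hypothetical helper, and in practice the sklearn.model_selection module offers ready-made tools for this.

```python
from typing import Iterator, List, Tuple


def k_fold_indices(n_samples: int, k: int) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_indices, test_indices) for each of k consecutive folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and the number of samples")
    # Spread the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n_samples) if i < start or i >= start + size]
        yield train, test
        start += size


folds = list(k_fold_indices(7, 3))
for train, test in folds:
    print("train:", train, "test:", test)
```

For each split you would train on the `train` rows, score on the `test` rows, and average the k scores. Since consecutive states in the training data come from the same game, shuffling or splitting per game before folding is worth considering.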

Have fun. To be continued in Project Intelligent Systems in Period 3. 