Having settled on a model, we'll look at which features can serve as targets.

In [1]:
#import the necessities
import numpy as np
import pandas as pd
from sklearn.tree import DecisionTreeRegressor

In [2]:
cards = pd.read_csv('D:/DSF Files/cards.txt', sep='\t')
recent_cards = pd.read_csv('D:/DSF Files/recentcards.txt', sep = '\t')
cards.name = 'Cards'
recent_cards.name = 'Recent Cards'

In [3]:
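def d_tree_boosted(depth, data, target):
    """Boost depth-limited regression trees on `target` and print the in-sample fit."""
    print('Data used from {} dataset.'.format(data.name))
    predictions = pd.DataFrame()
    y = data[target]
    X = data.drop([target], axis=1)
    # range(0, 101) runs 101 rounds, one more than the printed "100 iterations".
    for n in range(0, 101):
        decision_tree = DecisionTreeRegressor(max_depth=depth)
        decision_tree.fit(X, y)
        predict = decision_tree.predict(X)
        predictions['predictions {}'.format(n)] = predict
        # The next tree is fitted to whatever this round left unexplained.
        y = y - predict
    # Sum the per-round predictions to get the boosted estimate.
    predicted = predictions.sum(axis=1)
    print("\nOverall R^2 for {} after 100 iterations:".format(target))
    # np.corrcoef returns the Pearson correlation r (not R^2), measured on the training rows.
    print(np.corrcoef(data[target], predicted)[0, 1])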
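
The function above is a hand-rolled form of boosting: each tree is fitted to the residual left by the earlier trees, and the per-round predictions are summed. It scores on the same rows it was fitted on, so the printed figure is an in-sample one. Here is a minimal sketch of the same loop scored on held-out rows. It uses synthetic data, and the helper name `boost_fit_predict` and the 75/25 split are assumptions for illustration, not part of the original notebook.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor


def boost_fit_predict(X_train, y_train, X_test, depth=4, rounds=100):
    """Fit trees to successive residuals; return summed train and test predictions."""
    residual = np.asarray(y_train, dtype=float).copy()
    train_sum = np.zeros(len(X_train))
    test_sum = np.zeros(len(X_test))
    for _ in range(rounds):
        tree = DecisionTreeRegressor(max_depth=depth, random_state=0)
        tree.fit(X_train, residual)
        step = tree.predict(X_train)
        train_sum += step                    # running boosted estimate on training rows
        test_sum += tree.predict(X_test)     # same trees applied to unseen rows
        residual -= step                     # what the next tree has to explain
    return train_sum, test_sum


# Synthetic stand-in data, not the card dataset.
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 5))
y = X[:, 0] * 2 + np.sin(X[:, 1]) + rng.normal(scale=0.5, size=400)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

train_pred, test_pred = boost_fit_predict(X_tr, y_tr, X_te)
train_r = np.corrcoef(y_tr, train_pred)[0, 1]
test_r = np.corrcoef(y_te, test_pred)[0, 1]
print('in-sample r:', train_r)
print('held-out r: ', test_r)
```

On data like this the in-sample correlation sits well above the held-out one, which is worth keeping in mind when reading the scores in this section.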

In [4]:
d_tree_boosted(4, cards, 'Power')
d_tree_boosted(4, cards, 'Toughness')
d_tree_boosted(4, cards, 'Legendary')
d_tree_boosted(4, cards, 'Specific Mana Cost')
d_tree_boosted(4, cards, 'EG Keywords')
d_tree_boosted(4, cards, 'Set Keywords')
d_tree_boosted(4, cards, 'Converted Mana Cost')

Data used from Cards dataset.

Overall R^2 for Power after 100 iterations:
0.9040648813065324
Data used from Cards dataset.

Overall R^2 for Toughness after 100 iterations:
0.8872318905480122
Data used from Cards dataset.

Overall R^2 for Legendary after 100 iterations:
0.7433534979016426
Data used from Cards dataset.

Overall R^2 for Specific Mana Cost after 100 iterations:
0.8555243715765988
Data used from Cards dataset.

Overall R^2 for EG Keywords after 100 iterations:
0.6056016374339477
Data used from Cards dataset.

Overall R^2 for Set Keywords after 100 iterations:
0.46695169682379195
Data used from Cards dataset.

Overall R^2 for Converted Mana Cost after 100 iterations:
0.8741628513482467


Power, Toughness, Specific Mana Cost, and Converted Mana Cost all score above 0.80 (the printed value is the correlation between the target and the summed predictions), so we'll call those satisfactory. We'll remove those four from the feature pool and rerun every target to see how the scores change.
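
The scores printed above come from `np.corrcoef`, so they are Pearson correlations rather than coefficients of determination, even though the label says R^2. The two can differ a lot when predictions are shifted or shrunk. A minimal sketch on made-up arrays (the data and the variable names are illustrative only):

```python
import numpy as np
from sklearn.metrics import r2_score

rng = np.random.default_rng(0)
y_true = rng.normal(size=200)
# Hypothetical predictions: strongly correlated with the truth, but shrunk and offset.
y_pred = 0.5 * y_true + 1.0 + rng.normal(scale=0.1, size=200)

pearson_r = np.corrcoef(y_true, y_pred)[0, 1]   # insensitive to scale and offset
r_squared = r2_score(y_true, y_pred)            # penalises scale and offset errors
print('Pearson r:', pearson_r)
print('R^2:      ', r_squared)
```

The correlation stays high here while R^2 drops sharply, because R^2 also measures how far the predictions are from the actual values.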

In [5]:
def cards_d_tree_boosted(depth, data, target):
    print('Data used from {} dataset.'.format(data.name))
    predictions = pd.DataFrame()
    y = data[target]
    # Also drop the four well-predicted columns so none of them can act as a feature.
    X = data.drop([target, 'Power', 'Toughness', 'Specific Mana Cost', 'Converted Mana Cost'], axis=1)
    for n in range(0, 101):
        decision_tree = DecisionTreeRegressor(max_depth=depth)
        decision_tree.fit(X, y)
        predict = decision_tree.predict(X)
        predictions['predictions {}'.format(n)] = predict
        y = y - predict
    predicted = predictions.sum(axis=1)
    print("\nOverall R^2 for {} after 100 iterations:".format(target))
    print(np.corrcoef(data[target], predicted)[0, 1])

In [6]:
cards_d_tree_boosted(4, cards, 'Power')
cards_d_tree_boosted(4, cards, 'Toughness')
cards_d_tree_boosted(4, cards, 'Legendary')
cards_d_tree_boosted(4, cards, 'Specific Mana Cost')
cards_d_tree_boosted(4, cards, 'EG Keywords')
cards_d_tree_boosted(4, cards, 'Set Keywords')
cards_d_tree_boosted(4, cards, 'Converted Mana Cost')

Data used from Cards dataset.

Overall R^2 for Power after 100 iterations:
0.5040054727057118
Data used from Cards dataset.

Overall R^2 for Toughness after 100 iterations:
0.4978563129151257
Data used from Cards dataset.

Overall R^2 for Legendary after 100 iterations:
0.6222321909192879
Data used from Cards dataset.

Overall R^2 for Specific Mana Cost after 100 iterations:
0.7577656945634124
Data used from Cards dataset.

Overall R^2 for EG Keywords after 100 iterations:
0.3264210120153141
Data used from Cards dataset.

Overall R^2 for Set Keywords after 100 iterations:
0.21414158565630198
Data used from Cards dataset.

Overall R^2 for Converted Mana Cost after 100 iterations:
0.4853890008828523


It looks like even Converted Mana Cost needs some of those features in order to be predicted. Because of this, and because card ratios tend not to differ from set to set, we'll keep Converted Mana Cost as a feature and use Specific Mana Cost as a target. We'll run the targets through an iterative loop, adding each target back as a feature once its iteration of the loop has finished.
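
The loop described above holds out every target at first and releases one per pass. The sketch below shows only that bookkeeping, with no model fitting. The toy frame and the `Colour Count` column are made up to stand in for the remaining feature columns.

```python
import pandas as pd

# Toy frame: 'Colour Count' is a hypothetical stand-in for the other features.
toy = pd.DataFrame({
    'Converted Mana Cost': [1, 2, 3],
    'Colour Count': [1, 1, 2],
    'Specific Mana Cost': [10, 20, 30],
    'Power': [1, 2, 3],
    'Toughness': [1, 3, 3],
})

targets = ['Specific Mana Cost', 'Power', 'Toughness']
held_out = list(targets)          # columns not yet allowed as features
feature_sets = {}
for target in targets:
    X = toy.drop(columns=held_out)
    feature_sets[target] = list(X.columns)
    held_out.pop(0)               # release this target for the next pass

for target, features in feature_sets.items():
    print(target, '<-', features)
```

Note that under this scheme a later target is predicted from the true values of the earlier targets, not from their predicted values.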

In [7]:
def cards_d_tree_boosted(depth, data):
    targets =  ['Specific Mana Cost', 'Power', 'Toughness']
    attributes = ['Specific Mana Cost', 'Power', 'Toughness']
    predictions = pd.DataFrame()
    print('Decision Tree Regression')
    for target in targets:
        y = data[target]
        X = data.drop(attributes, axis=1)
        # Release the current target so it becomes a feature for the targets that follow.
        attributes.pop(0)
        print('\nThe results for {} as the target in the Decision Tree Regessor:'.format(target))
        for n in range(0, 101):
            decision_tree = DecisionTreeRegressor(max_depth=depth)
            decision_tree.fit(X, y)
            predict = decision_tree.predict(X)
            predictions['predictions {}'.format(n)] = predict
            y = y - predict
        predicted = predictions.sum(axis=1)
        print(np.corrcoef(data[target], predicted)[0, 1])

In [8]:
cards_d_tree_boosted(4, cards)

Decision Tree Regression

The results for Specific Mana Cost as the target in the Decision Tree Regessor:
0.822506602587629

The results for Power as the target in the Decision Tree Regessor:
0.8288341188157663

The results for Toughness as the target in the Decision Tree Regessor:
0.8872318905480118


Not bad. Let's try with only cards printed recently.

In [9]:
d_tree_boosted(4, recent_cards, 'Power')
d_tree_boosted(4, recent_cards, 'Toughness')
d_tree_boosted(4, recent_cards, 'Legendary')
d_tree_boosted(4, recent_cards, 'Specific Mana Cost')
d_tree_boosted(4, recent_cards, 'EG Keywords')
d_tree_boosted(4, recent_cards, 'Set Keywords')
d_tree_boosted(4, recent_cards, 'Converted Mana Cost')

Data used from Recent Cards dataset.

Overall R^2 for Power after 100 iterations:
0.9491838642518328
Data used from Recent Cards dataset.

Overall R^2 for Toughness after 100 iterations:
0.9399120137736107
Data used from Recent Cards dataset.

Overall R^2 for Legendary after 100 iterations:
0.9123350861732801
Data used from Recent Cards dataset.

Overall R^2 for Specific Mana Cost after 100 iterations:
0.945088937935838
Data used from Recent Cards dataset.

Overall R^2 for EG Keywords after 100 iterations:
0.7990275131136388
Data used from Recent Cards dataset.

Overall R^2 for Set Keywords after 100 iterations:
0.741489350670608
Data used from Recent Cards dataset.

Overall R^2 for Converted Mana Cost after 100 iterations:
0.9421022130089292


Only Set Keywords (0.74) falls clearly short here; EG Keywords (0.80) is borderline. Let's remove all seven as features and see whether each can still be predicted from what remains.

In [10]:
def rcards_d_tree_boosted(depth, data):
    targets =  ['Specific Mana Cost', 'Power', 'Toughness', 'Legendary',
               'EG Keywords', 'Set Keywords', 'Converted Mana Cost']
    attributes = ['Specific Mana Cost', 'Power', 'Toughness', 'Legendary',
               'EG Keywords', 'Set Keywords', 'Converted Mana Cost']
    predictions = pd.DataFrame()
    print('Decision Tree Regression')
    for target in targets:
        y = data[target]
        X = data.drop(attributes, axis=1)
        print('\nThe results for {} as the target in the Decision Tree Regessor:'.format(target))
        for n in range(0, 101):
            decision_tree = DecisionTreeRegressor(max_depth=depth)
            decision_tree.fit(X, y)
            predict = decision_tree.predict(X)
            predictions['predictions {}'.format(n)] = predict
            y = y - predict
        predicted = predictions.sum(axis=1)
        print(np.corrcoef(data[target], predicted)[0, 1])
        
rcards_d_tree_boosted(4, recent_cards)

Decision Tree Regression

The results for Specific Mana Cost as the target in the Decision Tree Regessor:
0.7503263785635671

The results for Power as the target in the Decision Tree Regessor:
0.3761977781940892

The results for Toughness as the target in the Decision Tree Regessor:
0.3685369350994774

The results for Legendary as the target in the Decision Tree Regessor:
0.6296686126143884

The results for EG Keywords as the target in the Decision Tree Regessor:
0.343057146539964

The results for Set Keywords as the target in the Decision Tree Regessor:
0.21879692287532465

The results for Converted Mana Cost as the target in the Decision Tree Regessor:
0.29120808680357474


So, only Specific Mana Cost worked here. We'll add Converted Mana Cost back in as a feature and run the remaining targets again to see how they look.

In [11]:
def rcards_d_tree_boosted(depth, data):
    targets =  ['Specific Mana Cost', 'Power', 'Toughness', 'Legendary',
               'EG Keywords', 'Set Keywords']
    attributes = ['Specific Mana Cost', 'Power', 'Toughness', 'Legendary',
               'EG Keywords', 'Set Keywords']
    predictions = pd.DataFrame()
    print('Decision Tree Regression')
    for target in targets:
        y = data[target]
        X = data.drop(attributes, axis=1)
        print('\nThe results for {} as the target in the Decision Tree Regessor:'.format(target))
        for n in range(0, 101):
            decision_tree = DecisionTreeRegressor(max_depth=depth)
            decision_tree.fit(X, y)
            predict = decision_tree.predict(X)
            predictions['predictions {}'.format(n)] = predict
            y = y - predict
        predicted = predictions.sum(axis=1)
        print(np.corrcoef(data[target], predicted)[0, 1])
        
rcards_d_tree_boosted(4, recent_cards)

Decision Tree Regression

The results for Specific Mana Cost as the target in the Decision Tree Regessor:
0.8658218203322351

The results for Power as the target in the Decision Tree Regessor:
0.8431265407604026

The results for Toughness as the target in the Decision Tree Regessor:
0.7884972509027901

The results for Legendary as the target in the Decision Tree Regessor:
0.7320179508299496

The results for EG Keywords as the target in the Decision Tree Regessor:
0.556992914753479

The results for Set Keywords as the target in the Decision Tree Regessor:
0.46595684688981337


Since Specific Mana Cost and Power are now in acceptable ranges, we'll take them off the target list, which returns them to the feature pool, and see how the rest fare.

In [12]:
def rcards_d_tree_boosted(depth, data):
    targets =  ['Toughness', 'Legendary', 'EG Keywords', 'Set Keywords']
    attributes = ['Toughness', 'Legendary', 'EG Keywords', 'Set Keywords']
    predictions = pd.DataFrame()
    print('Decision Tree Regression')
    for target in targets:
        y = data[target]
        X = data.drop(attributes, axis=1)
        print('\nThe results for {} as the target in the Decision Tree Regessor:'.format(target))
        for n in range(0, 101):
            decision_tree = DecisionTreeRegressor(max_depth=depth)
            decision_tree.fit(X, y)
            predict = decision_tree.predict(X)
            predictions['predictions {}'.format(n)] = predict
            y = y - predict
        predicted = predictions.sum(axis=1)
        print(np.corrcoef(data[target], predicted)[0, 1])
        
rcards_d_tree_boosted(4, recent_cards)

Decision Tree Regression

The results for Toughness as the target in the Decision Tree Regessor:
0.9279530949812422

The results for Legendary as the target in the Decision Tree Regessor:
0.8488370021366797

The results for EG Keywords as the target in the Decision Tree Regessor:
0.7070746207635284

The results for Set Keywords as the target in the Decision Tree Regessor:
0.6106631819986321


Now only EG Keywords and Set Keywords are wanting. Once more, with feeling.

In [13]:
def rcards_d_tree_boosted(depth, data):
    targets =  ['EG Keywords', 'Set Keywords']
    attributes = ['EG Keywords', 'Set Keywords']
    predictions = pd.DataFrame()
    print('Decision Tree Regression')
    for target in targets:
        y = data[target]
        X = data.drop(attributes, axis=1)
        print('\nThe results for {} as the target in the Decision Tree Regessor:'.format(target))
        for n in range(0, 101):
            decision_tree = DecisionTreeRegressor(max_depth=depth)
            decision_tree.fit(X, y)
            predict = decision_tree.predict(X)
            predictions['predictions {}'.format(n)] = predict
            y = y - predict
        predicted = predictions.sum(axis=1)
        print(np.corrcoef(data[target], predicted)[0, 1])
        
rcards_d_tree_boosted(4, recent_cards)

Decision Tree Regression

The results for EG Keywords as the target in the Decision Tree Regessor:
0.7770950812186038

The results for Set Keywords as the target in the Decision Tree Regessor:
0.6999511242780767


Okay. With that done, the order will be Specific Mana Cost, Power, Toughness, Legendary, EG Keywords, and Set Keywords. The first three come in the same order as they did for the full card set.
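
The cells in this section redefine nearly the same function with different target lists. As a minimal sketch, one parameterised helper could cover both the "hold everything out" runs and the "release each target in turn" runs. The name `boosted_scores`, its `release` flag, and the toy columns are assumptions for illustration; the default of 101 rounds mirrors `range(0, 101)` in the cells.

```python
import numpy as np
import pandas as pd
from sklearn.tree import DecisionTreeRegressor


def boosted_scores(data, targets, depth=4, rounds=101, release=True):
    """Return {target: Pearson r} for each target, in the order given.

    With release=True, a target becomes a feature for the targets after it.
    With release=False, every listed target stays out of the feature set.
    """
    held_out = list(targets)
    scores = {}
    for target in targets:
        X = data.drop(columns=held_out)
        if release:
            held_out.pop(0)
        residual = data[target].astype(float).to_numpy().copy()
        total = np.zeros(len(data))
        for _ in range(rounds):
            tree = DecisionTreeRegressor(max_depth=depth, random_state=0)
            step = tree.fit(X, residual).predict(X)
            total += step        # running sum of per-round predictions
            residual -= step     # next tree fits what is still unexplained
        scores[target] = np.corrcoef(data[target], total)[0, 1]
    return scores


# Toy data with made-up columns, not the card dataset.
rng = np.random.default_rng(1)
toy = pd.DataFrame({'f1': rng.normal(size=200), 'f2': rng.normal(size=200)})
toy['a'] = toy['f1'] + rng.normal(scale=0.3, size=200)
toy['b'] = toy['a'] + toy['f2'] + rng.normal(scale=0.3, size=200)

scores = boosted_scores(toy, ['a', 'b'], rounds=20)
for name, value in scores.items():
    print(name, value)
```

Under these assumptions, a call along the lines of `boosted_scores(recent_cards, targets)` would stand in for the cell that follows, returning the scores instead of printing them.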

In [14]:
def rcards_d_tree_boosted(depth, data):
    targets =  ['Specific Mana Cost', 'Power', 'Toughness', 'Legendary',
               'EG Keywords', 'Set Keywords']
    attributes = ['Specific Mana Cost', 'Power', 'Toughness', 'Legendary',
               'EG Keywords', 'Set Keywords']
    predictions = pd.DataFrame()
    print('Decision Tree Regression')
    for target in targets:
        y = data[target]
        X = data.drop(attributes, axis=1)
        # Release the current target so it becomes a feature for the targets that follow.
        attributes.pop(0)
        print('\nThe results for {} as the target in the Decision Tree Regessor:'.format(target))
        for n in range(0, 101):
            decision_tree = DecisionTreeRegressor(max_depth=depth)
            decision_tree.fit(X, y)
            predict = decision_tree.predict(X)
            predictions['predictions {}'.format(n)] = predict
            y = y - predict
        predicted = predictions.sum(axis=1)
        print(np.corrcoef(data[target], predicted)[0, 1])
        
rcards_d_tree_boosted(4, recent_cards)

Decision Tree Regression

The results for Specific Mana Cost as the target in the Decision Tree Regessor:
0.8658218203322351

The results for Power as the target in the Decision Tree Regessor:
0.8577705230347077

The results for Toughness as the target in the Decision Tree Regessor:
0.9279530949812422

The results for Legendary as the target in the Decision Tree Regessor:
0.8864599603196075

The results for EG Keywords as the target in the Decision Tree Regessor:
0.7770950812186042

The results for Set Keywords as the target in the Decision Tree Regessor:
0.7414893506706078
