## GPU Chooser

This workbook helps rank GPUs according to a mixture of features (with the weights determined by the user) and graphs the resulting score against price.

### Data
First, pull in the parameters from Wikipedia for the cards under consideration (more can easily be added):

In [2]:
raw="""
name                 | sh:tx:rop | mem | bw|bus|ocl|single|double|watts| amz:brand:comment
GeForce GTX 750 1Gb  | 512:32:16 | 1024| 80|128|1.2| 1044 | 32.6 |  55 | B00IDG3NDY
GeForce GTX 750 2Gb  | 512:32:16 | 2048| 80|128|1.2| 1044 | 32.6 |  55 | B00J3ZNB04
GeForce GTX 750Ti 2Gb| 640:40:16 | 2048| 80|128|1.2| 1306 | 40.8 |  60 | B00IDG3IDO
GeForce GTX 750Ti 4Gb| 640:40:16 | 4096| 80|128|1.2| 1306 | 40.8 |  60 | B00T4RJ8FI
GeForce GTX 760 2Gb  |1152:96:32 | 2048|192|256|1.2| 2257 | 94   | 170 | B00DT5R3EO
GeForce GTX 760 4Gb  |1152:96:32 | 4096|192|256|1.2| 2257 | 94   | 170 | B00E9O28DU

GeForce GTX 960 2Gb  |1024:64:32 | 2048|112|128|1.2| 2308 | 72.1 | 120 | B00SC6HAS4
GeForce GTX 960 4Gb  |1024:64:32 | 4096|112|128|1.2| 2308 | 72.1 | 120 | B00UOYQ5LA
GeForce GTX 970      |1664:104:56| 3584|196|224|1.2| 3494 | 109  | 145 | B00NVODXR4
GeForce GTX 980      |2048:128:64| 4096|224|256|1.2| 4612 | 144  | 165 | B00NT9UT3M
GeForce GTX 980 Ti   |2816:176:96| 6144|336|384|1.2| 5632 | 176  | 250 | B00YNEIAWY
GeForce GTX Titan X  |3072:192:96|12288|336|384|1.2| 6144 | 192  | 250 | B00UXTN5P0

R9 290               |2560:160:64| 4096|320|512|2.0| 4848 | 606  | 275 | B00V4JVY1A
R9 290X              |2816:176:64| 4096|320|512|2.0| 5632 | 704  | 290 | B00FLMKQY2

R9 380 2Gb           |1792:112:32| 2048|182|256|2.1| 3476 | 217  | 190 | B00ZGL8EBK
R9 380 4Gb           |1792:112:32| 4096|182|256|2.1| 3476 | 217  | 190 | B00ZGF3TUC
R9 390               |2560:160:64| 8192|384|512|2.1| 5120 | 640  | 275 | B00ZGL8CYY
R9 390X              |2816:176:64| 8192|384|512|2.1| 5914 | 739  | 275 | B00ZGL8CFI
"""

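import re

# Split each non-empty line on '|' or ':' (swallowing the surrounding whitespace)
arr = [ re.split(r'\s*[|:]\s*',l.strip()) for l in raw.split('\n') if len(l.strip())>0]
headings = arr[0]
# Text columns stay as strings; every other column is converted to float
text_cols = ('name','amz','brand','comment')
data=[ { h:(e if h in text_cols else float(e)) for h,e in zip(headings,a) } for a in arr[1:] ]
#for d in data:print("%s|%s" % (d['name'], d['amz']))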

Now the GPU card data is a list of dictionaries, one per card, with numeric values for every field except 'name' and the Amazon item ID, in the same order as the rows of 'raw'.
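
As a quick illustration of the resulting structure, the sketch below parses a two-row excerpt of the table with the same splitting approach and looks a card up by name. The `sample` string, the `cards` list and the `by_name` index are illustrative additions, not part of the workbook.

```python
import re

# Two-row excerpt of the table above (illustrative subset)
sample = """
name                 | sh:tx:rop | mem | bw|bus|ocl|single|double|watts| amz:brand:comment
GeForce GTX 750 1Gb  | 512:32:16 | 1024| 80|128|1.2| 1044 | 32.6 |  55 | B00IDG3NDY
GeForce GTX 980      |2048:128:64| 4096|224|256|1.2| 4612 | 144  | 165 | B00NT9UT3M
"""

rows = [re.split(r'\s*[|:]\s*', line.strip()) for line in sample.split('\n') if line.strip()]
headings = rows[0]
# zip() stops at the shorter sequence, so the unused 'brand' and 'comment' headings are dropped
cards = [{h: (e if h in ('name', 'amz') else float(e)) for h, e in zip(headings, r)}
         for r in rows[1:]]

# Hypothetical convenience index keyed by card name
by_name = {c['name']: c for c in cards}
print(by_name['GeForce GTX 980']['single'])   # 4612.0
print(by_name['GeForce GTX 980']['amz'])      # B00NT9UT3M
```

Note that the `sh:tx:rop` column is split into three separate keys (`sh`, `tx`, `rop`) because ':' is treated as a separator just like '|'.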

### Equivalent cards for Additional Price data
Here, one can put additional Amazon product codes that refer to the same 
card from a Compute perspective (different manufacturer and/or different ports may make the 
cards different from a gaming user's perspective, of course).

**TODO : blend these into the price grabbing**

In [4]:
raw="""
name                 |amz:brand:comment
GeForce GTX 750 1Gb  |
GeForce GTX 750 2Gb  |
GeForce GTX 750Ti 2Gb|
GeForce GTX 750Ti 4Gb|
GeForce GTX 760 2Gb  |
GeForce GTX 760 4Gb  |
GeForce GTX 960 2Gb  |
GeForce GTX 960 4Gb  |
GeForce GTX 970      |
GeForce GTX 980      |
GeForce GTX 980 Ti   |
GeForce GTX Titan X  |
R9 290               |
R9 290X              |
R9 380 2Gb           |
R9 380 4Gb           |
R9 390               |
R9 390X              |
"""

arr = [ re.split(r'\s*[|:]\s*',l) for l in raw.split('\n') if len(l)>0]
headings = arr[0]
equivs=[ { h:e for h,e in zip(headings,a) } for a in arr[1:] ]
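
One possible way to address the TODO above, sketched under assumptions: each entry in the equivalents list carries at most one extra product code in its `amz` field, and the price used for a card is the lowest known price across its own and its equivalent codes. The helper name `best_price`, the `demo_*` variables, the product code `EQUIV0001` and its price are all made up for illustration; this is not the workbook's method.

```python
def best_price(card, equivalents, prices):
    """Return the lowest known price across a card's own and equivalent codes, or None."""
    codes = [card.get('amz')]
    codes += [e.get('amz') for e in equivalents if e.get('name') == card['name']]
    # Ignore empty codes and codes with no known price
    known = [prices[c] for c in codes if c and prices.get(c) is not None]
    return min(known) if known else None

# Illustrative data only: 'EQUIV0001' and its price are invented
demo_card = {'name': 'GeForce GTX 970', 'amz': 'B00NVODXR4'}
demo_equivs = [{'name': 'GeForce GTX 970', 'amz': 'EQUIV0001'},
               {'name': 'R9 390', 'amz': ''}]
demo_pxs = {'B00NVODXR4': 337.99, 'EQUIV0001': 329.99}

print(best_price(demo_card, demo_equivs, demo_pxs))  # 329.99
```

Taking the minimum is only one choice; a median across equivalent cards would be less sensitive to a single unusually cheap listing.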

### Add known prices from Amazon
To regenerate these, run the 'Grab prices from Amazon' cell below.  To 'cache' them back into this workbook, 
simply copy the dictionary printed by the 'cache' cell into the ``pxs=`` line of the following cell.

In [85]:
pxs={'B00ZGL8EBK': 216.53, 'B00UOYQ5LA': 239.99, 'B00IDG3IDO': 139.99, 'B00FLMKQY2': 339.99, 'B00V4JVY1A': 333.26, 'B00IDG3NDY': 114.12, 'B00YNEIAWY': 698.85, 'B00T4RJ8FI': 349.99, 'B00J3ZNB04': 149.37, 'B00UXTN5P0': 1029.99, 'B00NT9UT3M': 507.82, 'B00ZGL8CYY': 359.42, 'B00ZGF3TUC': 229.99, 'B00SC6HAS4': 199.99, 'B00ZGL8CFI': 458.63, 'B00DT5R3EO': 199.99, 'B00E9O28DU': 274.99, 'B00NVODXR4': 337.99}

for d in data:
    if d.get('amz',None) is not None and pxs.get(d['amz'],None) is not None:
        d['px'] = pxs[d['amz']]

### Grab prices from Amazon
Rather than use their API (which creates the issue of putting the keys into GitHub), just grab the pages.  NB: The page caches the prices found into the data structure to avoid doing this too often!

The price downloading/parsing requires that you have ``requests`` and ``BeautifulSoup`` installed : ``pip install requests BeautifulSoup4``


In [86]:
import requests
from bs4 import BeautifulSoup

BASE_URL = "http://www.amazon.com/exec/obidos/ASIN/"

for d in data:
    # Only fetch cards that have a product code but no cached price yet
    if d.get('px', None) is None and d.get('amz'):
        r = requests.get(BASE_URL + d['amz'])
        soup = BeautifulSoup(r.text, 'html.parser')
        price = None
        try:
            ele = soup.find(id="priceblock_ourprice")
            price = float(ele.text.replace('$','').replace(',',''))
            print("Got price for %s" % (d['name']))
        except (AttributeError, ValueError):
            # Element missing (ele is None), or its text is not a plain number
            print("Didn't find the 'price' element for %s (%s)" % (d['name'], d['amz']))
        d['px']=price
print("Finished downloading prices : Run the 'cache' script below to save the data")
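
The string-to-number step in the loop above only strips a '$' and thousands separators. A slightly more defensive variant is sketched here; `parse_price` is a hypothetical helper, and the sample strings are invented for illustration rather than taken from real Amazon pages.

```python
import re

def parse_price(text):
    """Extract the first dollar amount in text as a float, or None if there is none."""
    m = re.search(r'\$\s*(\d[\d,]*(?:\.\d+)?)', text or '')
    return float(m.group(1).replace(',', '')) if m else None

print(parse_price('$1,029.99'))              # 1029.99
print(parse_price('  $114.12 '))             # 114.12
print(parse_price('Currently unavailable'))  # None
```

Returning `None` instead of raising keeps the "no price found" case in the same form the rest of the workbook already checks for.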

### Code required to 'cache' prices found
Execute the following, and copy its output to the ```pxs=``` line above so that the page 
can remember the prices found on Amazon most recently.

In [84]:
print({ d['amz']:d['px'] for d in data if d.get('amz',None) is not None and d.get('px',None) is not None})

### Show known prices

In [87]:
for d in data:
    if d.get('px', None) is not None:
        print("%s | %6.2f" % (d['name'], d['px']))


### Graph data based on given weights

The concept here is that one can focus on a 'basecard' (for instance, one you already have, or one you've looked at closely), assign multiplicative weights to each of a GPU card's qualities, and come up with a 'relative performance' according to that weighting scheme.  Each weighted quality is divided by the base card's value for that quality and the results are summed, so the base card itself always scores the sum of the weights.  Performance can then be visualised against absolute dollar cost (the 'efficient frontier' being the envelope around the points from the upper left corner).
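
To make the weighting concrete, here is the calculation for one pair of cards, using the `single` and `mem` figures from the table above with the GTX 760 2Gb as the base card. The function name `relative_score` and the `demo_*` variables are illustrative choices; the weights are just an example.

```python
def relative_score(base, card, weights):
    """Sum of weight * (card value / base value) over the weighted qualities."""
    return sum(w * card[k] / base[k] for k, w in weights.items())

demo_base = {'single': 2257.0, 'mem': 2048.0}   # GeForce GTX 760 2Gb
demo_card = {'single': 2308.0, 'mem': 4096.0}   # GeForce GTX 960 4Gb
demo_weights = {'single': 2.0, 'mem': 1.0}

# 2.0 * 2308/2257 + 1.0 * 4096/2048 = 2.045... + 2.0
print(round(relative_score(demo_base, demo_card, demo_weights), 3))   # 4.045
# The base card compared with itself scores the sum of the weights
print(relative_score(demo_base, demo_base, demo_weights))             # 3.0
```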

In [118]:
basecard = 'GeForce GTX 760 2Gb' # Name should match a card with full data above
basedata = [ d for d in data if d['name']==basecard ][0]  

multipliers = dict(single=2., mem=1.)  # FLOPs are twice as important as memory, all else ignored
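# Keep cards with OpenCL < 2.0 and a known price under $500 (cards without a price are skipped)
data_filtered = [d for d in data if d['ocl']<2. and d.get('px',None) is not None and d['px']<500. ]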

def evaluate_card(base, d, mult):
    comp=0.
    for (k,v) in mult.items():
        if d.get(k,None) is not None and base.get(k,None) is not None:
            comp += v*d[k]/base[k]
    return comp
x=[ d.get('px',None) for d in data_filtered]
y=[ evaluate_card(basedata, d, multipliers) for d in data_filtered ]
l=[ d['name'] for d in data_filtered]
print(list(zip(l,y,x)))  # (name, score, price) triples

%matplotlib inline
import matplotlib.pyplot as plt
plt.figure(figsize=(15, 8))
plt.plot(x,y, 'ro')
for i, xy in enumerate(zip(x, y)): 
    plt.annotate('%s' % (l[i]), xy=xy, xytext=(5,.05), textcoords='offset points')
plt.show()
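
The 'efficient frontier' can also be computed rather than read off the plot. A minimal sketch, assuming a card is on the frontier when no other card costs the same or less and scores at least as high; the function name `efficient_frontier` and the sample points are invented for illustration and are not the workbook's output.

```python
def efficient_frontier(points):
    """Return the (price, score, name) points that no cheaper-or-equal card matches or beats."""
    frontier = []
    best = float('-inf')
    # Walk from cheapest to dearest; on price ties, consider the higher score first
    for price, score, name in sorted(points, key=lambda p: (p[0], -p[1])):
        if score > best:
            frontier.append((price, score, name))
            best = score
    return frontier

# Invented example points: (price in dollars, relative score, label)
demo_points = [(100, 1.5, 'A'), (150, 1.4, 'B'), (200, 3.0, 'C'),
               (200, 2.5, 'D'), (300, 3.0, 'E')]
print([name for _, _, name in efficient_frontier(demo_points)])  # ['A', 'C']
```

With the variables from the cell above, this could be called as `efficient_frontier(zip(x, y, l))`, provided every entry in `x` is a number.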