## Deep Learning Framework Power Scores 2018
## By Jeff Hale

### See [this Medium article](https://towardsdatascience.com/deep-learning-framework-power-scores-2018-23607ddf297a) for a discussion of the state of Python deep learning frameworks in 2018 featuring these charts.

I'm going to use plotly and pandas to make interactive visuals for this project.

Updated Sept. 20-21, 2018 to include Caffe, DL4J, Caffe2, and Chainer, along with several improved metrics.

# Please upvote this Kaggle kernel if you find it helpful.

In [1]:
# import the usual frameworks
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import collections
import warnings

from IPython.display import display, HTML
from sklearn.preprocessing import MinMaxScaler

import os
print(os.listdir("../input"))
    
# import plotly 
import plotly.figure_factory as ff
import plotly.graph_objs as go
import plotly.offline as py
import plotly.tools as tls

# for color scales in plotly
import colorlover as cl 

# define color scale https://plot.ly/ipython-notebooks/color-scales/
cs = cl.scales['10']['div']['RdYlGn']    # for most charts 
cs7 =  cl.scales['7']['qual']['Dark2']   # for stacked bar charts  

# configure things
warnings.filterwarnings('ignore')

pd.options.display.float_format = '{:,.2f}'.format  
pd.options.display.max_columns = 999

py.init_notebook_mode(connected=True)

%load_ext autoreload
%autoreload 2
%matplotlib inline

['ds13.csv']


List package versions for reproducibility.

In [2]:
#!pip list

Read in the data from the csv. The Google sheet that holds the data is available [here](https://docs.google.com/spreadsheets/d/1mYfHMZfuXGpZ0ggBVDot3SJMU-VsCsEGceEL8xd1QBo/edit?usp=sharing).

In [3]:
new_col_names = ['framework','indeed', 'monster', 'simply', 'linkedin', 'angel', 
                 'usage', 'search', 'medium', 'books', 'arxiv', 'stars', 
                 'watchers', 'forks', 'contribs',
                ]

df = pd.read_csv('../input/ds13.csv', 
                 skiprows=4,
                 header=None, 
                 nrows=11, 
                 thousands=',',
                 index_col=0,
                 names=new_col_names,
                 usecols=new_col_names,
                )
df

framework,indeed,monster,simply,linkedin,angel,usage,search,medium,books,arxiv,stars,watchers,forks,contribs
TensorFlow,2079,1253,1582,2610,552,29.90%,73,6200,202,3700,109576,8334,67551,1642
Keras,684,364,449,695,177,22.20%,53,9120,79,1390,33558,1847,12658,719
PyTorch,486,309,428,665,120,6.40%,19,1780,18,1560,18716,952,4474,760
Caffe,607,399,515,866,123,1.50%,4,815,14,1360,25604,2218,15633,270
Theano,356,316,279,508,95,4.90%,0,428,17,652,8477,585,2447,328
MXNET,266,154,200,298,29,1.50%,2,524,32,260,15200,1170,5498,587
CNTK,126,96,97,160,12,3.00%,0,223,1,88,15106,1368,4029,189
DeepLearning4J,17,5,9,35,3,3.40%,2,70,11,27,9615,829,4441,232
Caffe2,55,51,49,109,12,1.20%,2,335,2,67,8284,577,2102,193
Chainer,19,19,19,28,3,0.00%,2,91,3,164,4128,325,1095,182


Cool. We used the read_csv parameters to give us just what we wanted.
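To make the effect of those `read_csv` parameters concrete, here is a minimal sketch on a small in-memory CSV. The file contents and the name `toy` are made up for illustration; only the parameter names come from the cell above.

```python
import io
import pandas as pd

# Hypothetical stand-in for the exported sheet: two junk rows, a header row,
# two data rows with thousands separators, and a trailing totals row.
raw = io.StringIO(
    'title row\n'
    'notes\n'
    'Framework,Indeed,Stars\n'
    'TensorFlow,"2,079","109,576"\n'
    'Keras,684,"33,558"\n'
    'totals,2763,"143,134"\n'
)

toy = pd.read_csv(
    raw,
    skiprows=3,        # skip the junk rows and the original header
    header=None,       # the remaining rows contain no header line
    nrows=2,           # stop before the totals row
    thousands=',',     # parse "2,079" as the integer 2079
    index_col=0,       # use the framework name as the index
    names=['framework', 'indeed', 'stars'],
)
print(toy)
```

Each parameter trims one kind of clutter, so the resulting frame needs no further cleanup before the numeric columns can be used.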

## Basic Data Exploration
Let's see what the data look like.

In [4]:
df.info()

<class 'pandas.core.frame.DataFrame'>
Index: 11 entries, TensorFlow to FastAI
Data columns (total 14 columns):
indeed      11 non-null int64
monster     11 non-null int64
simply      11 non-null int64
linkedin    11 non-null int64
angel       11 non-null int64
usage       11 non-null object
search      11 non-null int64
medium      11 non-null int64
books       11 non-null int64
arxiv       11 non-null int64
stars       11 non-null int64
watchers    11 non-null int64
forks       11 non-null int64
contribs    11 non-null int64
dtypes: int64(13), object(1)
memory usage: 1.3+ KB


In [5]:
df.describe()

,indeed,monster,simply,linkedin,angel,search,medium,books,arxiv,stars,watchers,forks,contribs
count,11.0,11.0,11.0,11.0,11.0,11.0,11.0,11.0,11.0,11.0,11.0,11.0,11.0
mean,426.82,269.64,329.73,543.09,102.36,14.27,1858.55,34.45,843.55,23230.18,1694.27,11143.18,481.55
std,600.66,359.61,456.83,750.28,161.39,25.08,2980.27,59.96,1122.97,29941.39,2280.61,19252.76,443.05
min,0.0,0.0,0.0,0.0,0.0,0.0,70.0,0.0,11.0,4128.0,325.0,1095.0,182.0
25%,37.0,35.0,34.0,72.0,7.5,1.0,279.0,2.5,77.5,8380.5,581.0,2547.0,194.0
50%,266.0,154.0,200.0,298.0,29.0,2.0,524.0,14.0,260.0,15106.0,952.0,4441.0,270.0
75%,546.5,340.0,438.5,680.0,121.5,11.5,1319.0,25.0,1375.0,22160.0,1607.5,9078.0,653.0
max,2079.0,1253.0,1582.0,2610.0,552.0,73.0,9120.0,202.0,3700.0,109576.0,8334.0,67551.0,1642.0


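Looks like pandas read the usage column as a string because of its percent sign. Let's strip the sign and convert the column to a number. Note that the integer cast below truncates the decimals, so 29.90% becomes 29.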

In [6]:
df['usage'] = pd.to_numeric(df['usage'].str.strip('%'))
df['usage'] = df['usage'].astype(int)
df

framework,indeed,monster,simply,linkedin,angel,usage,search,medium,books,arxiv,stars,watchers,forks,contribs
TensorFlow,2079,1253,1582,2610,552,29,73,6200,202,3700,109576,8334,67551,1642
Keras,684,364,449,695,177,22,53,9120,79,1390,33558,1847,12658,719
PyTorch,486,309,428,665,120,6,19,1780,18,1560,18716,952,4474,760
Caffe,607,399,515,866,123,1,4,815,14,1360,25604,2218,15633,270
Theano,356,316,279,508,95,4,0,428,17,652,8477,585,2447,328
MXNET,266,154,200,298,29,1,2,524,32,260,15200,1170,5498,587
CNTK,126,96,97,160,12,3,0,223,1,88,15106,1368,4029,189
DeepLearning4J,17,5,9,35,3,3,2,70,11,27,9615,829,4441,232
Caffe2,55,51,49,109,12,1,2,335,2,67,8284,577,2102,193
Chainer,19,19,19,28,3,0,2,91,3,164,4128,325,1095,182


In [7]:
df.info()

<class 'pandas.core.frame.DataFrame'>
Index: 11 entries, TensorFlow to FastAI
Data columns (total 14 columns):
indeed      11 non-null int64
monster     11 non-null int64
simply      11 non-null int64
linkedin    11 non-null int64
angel       11 non-null int64
usage       11 non-null int64
search      11 non-null int64
medium      11 non-null int64
books       11 non-null int64
arxiv       11 non-null int64
stars       11 non-null int64
watchers    11 non-null int64
forks       11 non-null int64
contribs    11 non-null int64
dtypes: int64(14)
memory usage: 1.3+ KB


All ints! Great!
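The `astype(int)` cast in cell 6 drops the decimal part of each percentage (29.90% becomes 29). If that precision matters, one option is to skip the cast and convert straight to a fraction. A minimal sketch on a hypothetical three-row Series (`usage_raw` and `usage_fraction` are illustrative names, not part of the notebook):

```python
import pandas as pd

# A few usage values in the same string format as the raw column
usage_raw = pd.Series(
    ['29.90%', '22.20%', '0.00%'],
    index=['TensorFlow', 'Keras', 'Chainer'],
)

# Strip the trailing percent sign, parse as float, and convert to a fraction
usage_fraction = pd.to_numeric(usage_raw.str.rstrip('%')) / 100
print(usage_fraction)
```

Adopting this in the notebook would shift the usage-based scores slightly, so the tables shown here would no longer match exactly.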

# Plotly
Let's make interactive plots with plotly for each popularity category.

## Online Job Listings
I looked at how many times each framework appeared in searches on job listing websites. For more discussion, see the Medium article that accompanies this notebook.

In [8]:
# add the five job-site columns into a single hiring total
df['hiring'] = df['indeed'] + df['monster'] + df['linkedin'] + df['simply'] + df['angel']

In [9]:
df

framework,indeed,monster,simply,linkedin,angel,usage,search,medium,books,arxiv,stars,watchers,forks,contribs,hiring
TensorFlow,2079,1253,1582,2610,552,29,73,6200,202,3700,109576,8334,67551,1642,8076
Keras,684,364,449,695,177,22,53,9120,79,1390,33558,1847,12658,719,2369
PyTorch,486,309,428,665,120,6,19,1780,18,1560,18716,952,4474,760,2008
Caffe,607,399,515,866,123,1,4,815,14,1360,25604,2218,15633,270,2510
Theano,356,316,279,508,95,4,0,428,17,652,8477,585,2447,328,1554
MXNET,266,154,200,298,29,1,2,524,32,260,15200,1170,5498,587,947
CNTK,126,96,97,160,12,3,0,223,1,88,15106,1368,4029,189,491
DeepLearning4J,17,5,9,35,3,3,2,70,11,27,9615,829,4441,232,69
Caffe2,55,51,49,109,12,1,2,335,2,67,8284,577,2102,193,276
Chainer,19,19,19,28,3,0,2,91,3,164,4128,325,1095,182,88


In [10]:
data = [go.Bar(
    x=df.index,
    y=df.hiring,
    marker=dict(color=cs),
    )
]

layout = {'title': 'Online Job Listings',
          'xaxis': {'title': 'Framework'},
          'yaxis': {'title': "Quantity"},
         }

fig = go.Figure(data=data, layout=layout)
py.iplot(fig)

That's just the aggregate listings. Let's plot the job listing mentions for each website in a stacked bar chart. This will take multiple traces.

In [11]:
y_indeed = df['indeed']
y_monster = df['monster']
y_simply = df['simply']
y_linkedin = df['linkedin']
y_angel = df['angel']

In [12]:
indeed = go.Bar(x=df.index, y=y_indeed, name = 'Indeed')
simply = go.Bar(x=df.index, y=y_simply, name='Simply Hired')
monster = go.Bar(x=df.index, y=y_monster, name='Monster')
linked = go.Bar(x=df.index, y=y_linkedin, name='LinkedIn')
angel = go.Bar(x=df.index, y=y_angel, name='Angel List')

data = [linked, indeed, simply, monster, angel]
layout = go.Layout(
    barmode='stack',
    title='Online Job Listings',
    xaxis={'title': 'Framework'},
    yaxis={'title': 'Mentions', 'separatethousands': True},
    colorway=cs,
)

fig = go.Figure(data = data, layout = layout)
py.iplot(fig)

Cool. Now let's see how this data looks with grouped bars instead of stacked bars by changing the barmode to "group".

In [13]:
indeed = go.Bar(x=df.index, y=y_indeed, name = "Indeed")
simply = go.Bar(x=df.index, y=y_simply, name="Simply Hired")
monster = go.Bar(x=df.index, y=y_monster, name="Monster")
linked = go.Bar(x=df.index, y=y_linkedin, name="LinkedIn")
angel = go.Bar(x=df.index, y=y_angel, name='Angel List')

data = [linked, indeed, simply, monster, angel]
layout = go.Layout(
    barmode='group',
    title="Online Job Listings",
    xaxis={'title': 'Framework'},
    yaxis={'title': "Listings", 'separatethousands': True,
    }
)

fig = go.Figure(data=data, layout=layout)
py.iplot(fig)

## KDnuggets Usage Survey
Let's look at usage as reported in the KDnuggets 2018 survey.

In [14]:
# convert percentage points to a fraction so the '.0%' tick format renders correctly
df['usage'] = df['usage'] / 100

In [15]:
data = [
    go.Bar(
        x=df.index, 
        y=df['usage'],
        marker=dict(color=cs),
    )
]
    
layout = {
    'title': 'KDnuggets Usage Survey',
    'xaxis': {'title': 'Framework'},
    'yaxis': {'title': "% Respondents Used in Past Year", 'tickformat': '.0%'},
}

fig = go.Figure(data=data, layout=layout)
py.iplot(fig)

## Google Search Volume
As of Sept. 15, 2018.

In [16]:
data = [
    go.Bar(
        x = df.index, 
        y = df['search'],
        marker = dict(color=cs),  
    )
]
    
layout = {
    'title': 'Google Search Volume',
    'xaxis': {'title': 'Framework'},
    'yaxis': {'title': "Relative Search Volume"},
}

fig = go.Figure(data=data, layout=layout)
py.iplot(fig)

## Medium Articles
Past 12 months.

In [17]:
# Make sure you have colorlover imported as cl for color scale
# cs is defined in first cell

data = [
    go.Bar(
        x=df.index, 
        y=df['medium'],
        marker=dict(color=cs),
    )
]
    
layout = {
    'title': 'Medium Articles',
    'xaxis': {'title': 'Framework'},
    'yaxis': {'title': "Articles"},
}

fig = go.Figure(data=data, layout=layout)
py.iplot(fig)

## Amazon Books

In [18]:
data = [
    go.Bar(
        x=df.index, 
        y=df['books'],
        marker=dict(color=cs),           
    )
]
    
layout = {
    'title': 'Amazon Books',
    'xaxis': {'title': 'Framework'},
    'yaxis': {'title': "Books"},
}

fig = go.Figure(data=data, layout=layout)
py.iplot(fig)

## ArXiv Articles
Past 12 months.

In [19]:
data = [
    go.Bar(
        x=df.index, 
        y=df['arxiv'],
        marker=dict(color=cs),           
    )
]

layout = {
    'title': 'ArXiv Articles',
    'xaxis': {'title': 'Framework'},
    'yaxis': {'title': "Articles"},
}

fig = go.Figure(data=data, layout=layout)
py.iplot(fig)

## GitHub Activity
Let's make another stacked bar chart of the four GitHub categories.

In [20]:
y_stars = df['stars']
y_watchers = df['watchers']
y_forks = df['forks']
y_contribs = df['contribs']

stars = go.Bar(x = df.index, y=y_stars, name="Stars")
watchers = go.Bar(x=df.index, y=y_watchers, name="Watchers")
forks = go.Bar(x=df.index, y=y_forks, name="Forks")
contribs = go.Bar(x=df.index, y=y_contribs, name="Contributors")


data = [stars, watchers, forks, contribs]
layout = go.Layout(barmode='stack', 
    title="GitHub Activity",
    xaxis={'title': 'Framework'},
    yaxis={
        'title': "Quantity",
        'separatethousands': True,
    }
)

fig = go.Figure(data=data, layout=layout)
py.iplot(fig)

This configuration doesn't make the most sense, because there are going to be way more stars than contributors. It's not an apples to apples comparison. Let's try four subplots instead.

In [21]:
trace1 = go.Bar(
    x=df.index,
    y=df['stars'],
    name='Stars',
    marker=dict(color=cs),
)
trace2 = go.Bar(
    x=df.index,
    y=df['forks'],
    name ="Forks",
    marker=dict(color=cs)
)
trace3 = go.Bar(
    x=df.index,
    y=df['watchers'],
    name='Watchers',
    marker=dict(color=cs)
)
trace4 = go.Bar(
    x=df.index,
    y=df['contribs'],
    name='Contributors',
    marker=dict(color=cs),
)

fig = tls.make_subplots(
    rows=2, 
    cols=2, 
    subplot_titles=(
        'Stars', 
        'Forks',
        'Watchers',
        'Contributors',
    )
)

fig['layout']['yaxis3'].update(separatethousands = True)
fig['layout']['yaxis4'].update(separatethousands = True)
fig['layout']['yaxis2'].update(tickformat = ',k', separatethousands = True)
fig['layout']['yaxis1'].update(tickformat = ',k', separatethousands = True)

fig.append_trace(trace1, 1, 1)
fig.append_trace(trace2, 1, 2)
fig.append_trace(trace3, 2, 1)
fig.append_trace(trace4, 2, 2)

fig['layout'].update(title = 'GitHub Activity', showlegend = False)
py.iplot(fig)

This is the format of your plot grid:
[ (1,1) x1,y1 ]  [ (1,2) x2,y2 ]
[ (2,1) x3,y3 ]  [ (2,2) x4,y4 ]



This presentation shows the information in a more comprehensible and appropriate format.

# Scale and Aggregate for Power Scores
Scale each column. For each column we'll use MinMaxScaler, which subtracts the column minimum and divides by the column range (original max minus original min), so every column ends up between 0 and 1.
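As a rough illustration of that arithmetic, the sketch below applies the same formula by hand with pandas. It uses a hypothetical three-row subset (`toy`) of the stars and books columns, so the results differ from the full table, where the minimum and maximum come from all rows.

```python
import pandas as pd

# Hypothetical three-row subset of two columns from the data above
toy = pd.DataFrame(
    {'stars': [109576, 33558, 4128], 'books': [202, 79, 3]},
    index=['TensorFlow', 'Keras', 'Chainer'],
)

# Min-max scaling: (x - column min) / (column max - column min)
manual = (toy - toy.min()) / (toy.max() - toy.min())
print(manual.round(2))
```

The largest value in each column maps to 1 and the smallest to 0, which is why the lowest-ranked framework in a category contributes nothing to its score.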

In [22]:
df.info()

<class 'pandas.core.frame.DataFrame'>
Index: 11 entries, TensorFlow to FastAI
Data columns (total 15 columns):
indeed      11 non-null int64
monster     11 non-null int64
simply      11 non-null int64
linkedin    11 non-null int64
angel       11 non-null int64
usage       11 non-null float64
search      11 non-null int64
medium      11 non-null int64
books       11 non-null int64
arxiv       11 non-null int64
stars       11 non-null int64
watchers    11 non-null int64
forks       11 non-null int64
contribs    11 non-null int64
hiring      11 non-null int64
dtypes: float64(1), int64(14)
memory usage: 1.4+ KB


In [23]:
scale = MinMaxScaler()
scaled_df = pd.DataFrame(
    scale.fit_transform(df), 
    columns = df.columns,
    index = df.index)    

In [24]:
scaled_df

framework,indeed,monster,simply,linkedin,angel,usage,search,medium,books,arxiv,stars,watchers,forks,contribs,hiring
TensorFlow,1.0,1.0,1.0,1.0,1.0,1.0,1.0,0.68,1.0,1.0,1.0,1.0,1.0,1.0,1.0
Keras,0.33,0.29,0.28,0.27,0.32,0.76,0.73,1.0,0.39,0.37,0.28,0.19,0.17,0.37,0.29
PyTorch,0.23,0.25,0.27,0.25,0.22,0.21,0.26,0.19,0.09,0.42,0.14,0.08,0.05,0.4,0.25
Caffe,0.29,0.32,0.33,0.33,0.22,0.03,0.05,0.08,0.07,0.37,0.2,0.24,0.22,0.06,0.31
Theano,0.17,0.25,0.18,0.19,0.17,0.14,0.0,0.04,0.08,0.17,0.04,0.03,0.02,0.1,0.19
MXNET,0.13,0.12,0.13,0.11,0.05,0.03,0.03,0.05,0.16,0.07,0.1,0.11,0.07,0.28,0.12
CNTK,0.06,0.08,0.06,0.06,0.02,0.1,0.0,0.02,0.0,0.02,0.1,0.13,0.04,0.0,0.06
DeepLearning4J,0.01,0.0,0.01,0.01,0.01,0.1,0.03,0.0,0.05,0.0,0.05,0.06,0.05,0.03,0.01
Caffe2,0.03,0.04,0.03,0.04,0.02,0.03,0.03,0.03,0.01,0.02,0.04,0.03,0.02,0.01,0.03
Chainer,0.01,0.02,0.01,0.01,0.01,0.0,0.03,0.0,0.01,0.04,0.0,0.0,0.0,0.0,0.01


### Scaled Online Job Listings
Let's combine the scaled online job listing columns into a new column.

In [25]:
scaled_df['hiring_score'] = scaled_df[['indeed', 'monster', 'simply', 'linkedin', 'angel']].mean(axis=1)

In [26]:
scaled_df

framework,indeed,monster,simply,linkedin,angel,usage,search,medium,books,arxiv,stars,watchers,forks,contribs,hiring,hiring_score
TensorFlow,1.0,1.0,1.0,1.0,1.0,1.0,1.0,0.68,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0
Keras,0.33,0.29,0.28,0.27,0.32,0.76,0.73,1.0,0.39,0.37,0.28,0.19,0.17,0.37,0.29,0.3
PyTorch,0.23,0.25,0.27,0.25,0.22,0.21,0.26,0.19,0.09,0.42,0.14,0.08,0.05,0.4,0.25,0.24
Caffe,0.29,0.32,0.33,0.33,0.22,0.03,0.05,0.08,0.07,0.37,0.2,0.24,0.22,0.06,0.31,0.3
Theano,0.17,0.25,0.18,0.19,0.17,0.14,0.0,0.04,0.08,0.17,0.04,0.03,0.02,0.1,0.19,0.19
MXNET,0.13,0.12,0.13,0.11,0.05,0.03,0.03,0.05,0.16,0.07,0.1,0.11,0.07,0.28,0.12,0.11
CNTK,0.06,0.08,0.06,0.06,0.02,0.1,0.0,0.02,0.0,0.02,0.1,0.13,0.04,0.0,0.06,0.06
DeepLearning4J,0.01,0.0,0.01,0.01,0.01,0.1,0.03,0.0,0.05,0.0,0.05,0.06,0.05,0.03,0.01,0.01
Caffe2,0.03,0.04,0.03,0.04,0.02,0.03,0.03,0.03,0.01,0.02,0.04,0.03,0.02,0.01,0.03,0.03
Chainer,0.01,0.02,0.01,0.01,0.01,0.0,0.03,0.0,0.01,0.04,0.0,0.0,0.0,0.0,0.01,0.01


Now we have a hiring score.

### Scaled GitHub Activity

Let's combine the scaled GitHub columns into a new column.

In [27]:
scaled_df['github_score'] = scaled_df[['stars', 'watchers', 'forks', 'contribs']].mean(axis=1)

In [28]:
scaled_df

framework,indeed,monster,simply,linkedin,angel,usage,search,medium,books,arxiv,stars,watchers,forks,contribs,hiring,hiring_score,github_score
TensorFlow,1.0,1.0,1.0,1.0,1.0,1.0,1.0,0.68,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0
Keras,0.33,0.29,0.28,0.27,0.32,0.76,0.73,1.0,0.39,0.37,0.28,0.19,0.17,0.37,0.29,0.3,0.25
PyTorch,0.23,0.25,0.27,0.25,0.22,0.21,0.26,0.19,0.09,0.42,0.14,0.08,0.05,0.4,0.25,0.24,0.17
Caffe,0.29,0.32,0.33,0.33,0.22,0.03,0.05,0.08,0.07,0.37,0.2,0.24,0.22,0.06,0.31,0.3,0.18
Theano,0.17,0.25,0.18,0.19,0.17,0.14,0.0,0.04,0.08,0.17,0.04,0.03,0.02,0.1,0.19,0.19,0.05
MXNET,0.13,0.12,0.13,0.11,0.05,0.03,0.03,0.05,0.16,0.07,0.1,0.11,0.07,0.28,0.12,0.11,0.14
CNTK,0.06,0.08,0.06,0.06,0.02,0.1,0.0,0.02,0.0,0.02,0.1,0.13,0.04,0.0,0.06,0.06,0.07
DeepLearning4J,0.01,0.0,0.01,0.01,0.01,0.1,0.03,0.0,0.05,0.0,0.05,0.06,0.05,0.03,0.01,0.01,0.05
Caffe2,0.03,0.04,0.03,0.04,0.02,0.03,0.03,0.03,0.01,0.02,0.04,0.03,0.02,0.01,0.03,0.03,0.02
Chainer,0.01,0.02,0.01,0.01,0.01,0.0,0.03,0.0,0.01,0.04,0.0,0.0,0.0,0.0,0.01,0.01,0.0


Now we have all our aggregate columns and are ready to turn to the weights.

## Weights

Let's make a pie chart of weights by category.

In [29]:
weights = {'Online Job Listings': .3,
           'KDnuggets Usage Survey': .2,
           'GitHub Activity': .1,
           'Google Search Volume': .1,
           'Medium Articles': .1,
           'Amazon Books': .1,
           'ArXiv Articles': .1 }

In [30]:
# changing colors because we want to show these aren't the frameworks
weight_colors = cl.scales['7']['qual']['Set1'] 

common_props = dict(
    labels = list(weights.keys()),
    values = list(weights.values()),
    textfont=dict(size=16),
    marker=dict(colors=weight_colors),
    hoverinfo='none',
    showlegend=False,
)

trace1 = go.Pie(
    **common_props,
    textinfo='label',
    textposition='outside',
)

trace2 = go.Pie(
    **common_props,
    textinfo='percent',
    textposition='inside',
)

layout = go.Layout(title = 'Weights by Category')

fig = go.Figure([trace1, trace2], layout=layout)
py.iplot(fig)

## Weight the Categories

In [31]:
scaled_df['w_hiring'] = scaled_df['hiring_score'] * .3
scaled_df['w_usage'] = scaled_df['usage'] * .2
scaled_df['w_github'] = scaled_df['github_score'] * .1
scaled_df['w_search'] = scaled_df['search'] * .1
scaled_df['w_arxiv'] = scaled_df['arxiv'] * .1
scaled_df['w_books'] = scaled_df['books'] * .1
scaled_df['w_medium'] = scaled_df['medium'] * .1

In [32]:
weight_list = ['w_hiring', 'w_usage', 'w_github', 'w_search', 'w_arxiv', 'w_books', 'w_medium']
scaled_df = scaled_df[weight_list].copy()  # copy so the later column assignment doesn't act on a slice
scaled_df

framework,w_hiring,w_usage,w_github,w_search,w_arxiv,w_books,w_medium
TensorFlow,0.3,0.2,0.1,0.1,0.1,0.1,0.07
Keras,0.09,0.15,0.03,0.07,0.04,0.04,0.1
PyTorch,0.07,0.04,0.02,0.03,0.04,0.01,0.02
Caffe,0.09,0.01,0.02,0.01,0.04,0.01,0.01
Theano,0.06,0.03,0.0,0.0,0.02,0.01,0.0
MXNET,0.03,0.01,0.01,0.0,0.01,0.02,0.01
CNTK,0.02,0.02,0.01,0.0,0.0,0.0,0.0
DeepLearning4J,0.0,0.02,0.0,0.0,0.0,0.01,0.0
Caffe2,0.01,0.01,0.0,0.0,0.0,0.0,0.0
Chainer,0.0,0.0,0.0,0.0,0.0,0.0,0.0


## Power Scores
Let's make the power score column by summing the seven category scores.

In [33]:
scaled_df['ps'] = scaled_df[weight_list].sum(axis = 1)
scaled_df

framework,w_hiring,w_usage,w_github,w_search,w_arxiv,w_books,w_medium,ps
TensorFlow,0.3,0.2,0.1,0.1,0.1,0.1,0.07,0.97
Keras,0.09,0.15,0.03,0.07,0.04,0.04,0.1,0.52
PyTorch,0.07,0.04,0.02,0.03,0.04,0.01,0.02,0.23
Caffe,0.09,0.01,0.02,0.01,0.04,0.01,0.01,0.17
Theano,0.06,0.03,0.0,0.0,0.02,0.01,0.0,0.12
MXNET,0.03,0.01,0.01,0.0,0.01,0.02,0.01,0.08
CNTK,0.02,0.02,0.01,0.0,0.0,0.0,0.0,0.05
DeepLearning4J,0.0,0.02,0.0,0.0,0.0,0.01,0.0,0.04
Caffe2,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.03
Chainer,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01
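
The weighting and summing steps can also be expressed as a single dot product driven by one weights mapping, which keeps the weights in one place. A minimal sketch, assuming the two-decimal scaled scores for TensorFlow and Keras shown earlier (`scores` and `power` are illustrative names):

```python
import pandas as pd

# Category weights, keyed by the scaled column they apply to
weights = pd.Series({
    'hiring_score': .3, 'usage': .2, 'github_score': .1, 'search': .1,
    'arxiv': .1, 'books': .1, 'medium': .1,
})

# Rounded scaled scores for two frameworks, copied from the tables above
scores = pd.DataFrame(
    {
        'hiring_score': [1.0, 0.30], 'usage': [1.0, 0.76],
        'github_score': [1.0, 0.25], 'search': [1.0, 0.73],
        'arxiv': [1.0, 0.37], 'books': [1.0, 0.39], 'medium': [0.68, 1.0],
    },
    index=['TensorFlow', 'Keras'],
)

# Weighted sum across categories: one power score per framework
power = scores[weights.index].dot(weights)
print(power.round(2))
```

Because the inputs here are already rounded, the results agree with the `ps` column only to about two decimal places.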


Let's clean things up for publication.

In [36]:
p_s_df = scaled_df * 100
p_s_df = p_s_df.round(2)
p_s_df.columns = ['Job Search Listings', 'Usage Survey', 'GitHub Activity', 'Search Volume', 'ArXiv Articles', 'Amazon Books', 'Medium Articles', 'Power Score']
p_s_df.rename_axis('Framework', inplace = True)
p_s_df

Framework,Job Search Listings,Usage Survey,GitHub Activity,Search Volume,ArXiv Articles,Amazon Books,Medium Articles,Power Score
TensorFlow,30.0,20.0,10.0,10.0,10.0,10.0,6.77,96.77
Keras,8.94,15.17,2.53,7.26,3.74,3.91,10.0,51.55
PyTorch,7.34,4.14,1.66,2.6,4.2,0.89,1.89,22.72
Caffe,8.94,0.69,1.8,0.55,3.66,0.69,0.82,17.15
Theano,5.8,2.76,0.49,0.0,1.74,0.84,0.4,12.02
MXNET,3.26,0.69,1.39,0.27,0.67,1.58,0.5,8.37
CNTK,1.69,2.07,0.71,0.0,0.21,0.05,0.17,4.89
DeepLearning4J,0.22,2.07,0.5,0.27,0.04,0.54,0.0,3.65
Caffe2,0.97,0.69,0.23,0.27,0.15,0.1,0.29,2.71
Chainer,0.31,0.0,0.0,0.27,0.41,0.15,0.02,1.18


Let's make a bar chart of the power scores.

In [35]:
data = [
    go.Bar(
        x=scaled_df.index,          # plotly accepts a pandas Index directly
        y=p_s_df['Power Score'],
        marker=dict(color=cs),
        text=p_s_df['Power Score'],
        textposition='outside',
        textfont=dict(size=10)
    )
]

layout = {
    'title': 'Deep Learning Framework Power Scores 2018',
    'xaxis': {'title': 'Framework'},
    'yaxis': {'title': "Score"}
}

fig = go.Figure(data=data, layout=layout)
py.iplot(fig)

### That's the end! 
### See [this Medium article](https://towardsdatascience.com/deep-learning-framework-power-scores-2018-23607ddf297a) for a discussion of the state of Python deep learning frameworks in 2018 featuring these charts.
## Please upvote if you found this interesting or informative!