## Text8 Corpus 

The word vectors are trained on the text8 corpus. Each line of the training file contains 10,000 words.
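A split of this kind can be sketched as follows. This is only an illustration, assuming the raw text8 corpus is a single whitespace-separated stream of words; the helper name `split_into_lines` is hypothetical, and the script that actually produced `text8-split` may work differently.

```python
def split_into_lines(words, words_per_line=10000):
    """Yield lines of at most `words_per_line` space-joined words."""
    for start in range(0, len(words), words_per_line):
        yield " ".join(words[start:start + words_per_line])


# Toy demonstration with a tiny chunk size instead of 10000.
toy_words = "anarchism originated as a term of abuse".split()
toy_lines = list(split_into_lines(toy_words, words_per_line=3))
print(toy_lines)
```

For the real corpus, the same helper would be called with the default `words_per_line=10000` on the word list read from the raw file.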

# Run Benchmark

In [None]:
# inside docker
!python benchmark.py --frameworks dl4j tensorflow gensim originalc --fname /benmark_nn_frameworks/data/text8-split --epochs 4 --batch_size 32 --workers 7 --size 100 --platform local 

## Load Report File

In [None]:
import plotly.offline as py
import plotly.graph_objs as go
import json

py.init_notebook_mode()
with open('./local-report.json', 'r') as f:
    report = json.load(f)

## System Information

In [None]:
print(report['systeminfo'])

## Training Parameters

In [None]:
print(json.dumps(report['trainingparams'], indent=2))

### Platform

In [None]:
print(report['platform'])

## Commands For Each Framework

In [None]:
print(json.dumps(report['command'], indent=4))

# Generate Graphics

Training time (in seconds) and peak memory (in MiB) for each framework.

In [None]:
x, y = zip(*report['time'].items())
# y = [_ / 3600. for _ in y]  # re-scale to hours

data = [go.Bar(x=x, y=y, orientation='v')]
layout = go.Layout(
    title='Time Report',
    xaxis=dict(title='Framework'),
    yaxis=dict(title='Training time (in seconds)'),
    annotations=[
        dict(
            x=xi,y=yi,
            text=str(yi),
            xanchor='center',
            yanchor='bottom',
            showarrow=False,
        ) 
        for xi, yi in zip(x, y)]
)
fig = dict(data=data, layout=layout)
py.iplot(fig)  

In [None]:
x, y = zip(*report['memory'].items())

data = [go.Bar(x=x, y=y, orientation='v')]
layout = go.Layout(
    title='Memory Report',
    xaxis=dict(title='Framework'),
    yaxis=dict(title='Peak memory (in MiB)'),
    annotations=[
        dict(
            x=xi,y=yi,
            text=str(yi),
            xanchor='center',
            yanchor='bottom',
            showarrow=False,
        ) 
        for xi, yi in zip(x, y)]
)
fig = dict(data=data, layout=layout)
py.iplot(fig)  

Results of evaluation on the popular **Word Similarities** task. This task measures how well the word vector representations capture the human notion of word similarity. The word pairs are sorted into two ranked lists, one by human similarity judgement and one by vector-space similarity. Spearman's rank correlation (rho) between these lists is then used to indicate how well the vector space agrees with human judgement.
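To make the metric concrete, here is a minimal pure-Python sketch of Spearman's rho, computed as the Pearson correlation of the two rank vectors with tied values sharing their average rank. The function names and the scores below are hypothetical and are not taken from the benchmark or from any evaluation dataset; the benchmark script may compute the statistic with a library routine instead.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(xs, ys):
    """Spearman's rho: Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = float(len(rx))
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


# Hypothetical similarity scores for four word pairs.
human_scores = [9.0, 7.5, 3.0, 1.0]
model_scores = [0.80, 0.85, 0.30, 0.10]
rho = spearman_rho(human_scores, model_scores)
print(round(rho, 2))  # 0.8
```

Only the ordering of the scores matters here: the model swaps the top two pairs relative to the human ranking, which lowers rho from 1.0 to 0.8.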

In [None]:
data = []
for framework in report['frameworks']:
    x, y = zip(*report['wordpairs'][framework])
    trace = go.Bar(x=x, y=y, name=framework)
    data.append(trace)

layout = go.Layout(
    title='Word Pairs Evaluation Report',
    xaxis=dict(title='Dataset', tickangle=-45),
    yaxis=dict(title='Spearman\'s Rho'),
    barmode='group'
)
fig = dict(data=data, layout=layout)
py.iplot(fig) 

Results of evaluation on the popular **Word Analogy** task. The aim of this task is to find the missing word b' in the relation: a is to a' as b is to b'. In other words, we look up the word whose vector is most similar to a' + b - a (the predicted b'), compare it with the human-provided answer, and report the accuracy.
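The vector arithmetic can be sketched in a few lines of Python. This is an illustration only: the three-dimensional toy vectors and the helper names are hypothetical, cosine similarity is assumed as the similarity measure, and excluding the three query words from the candidates is a common convention that the benchmarked frameworks may or may not follow.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def solve_analogy(vectors, a, a_prime, b):
    """Return the word closest to a' + b - a, excluding the query words."""
    target = [ap + bb - aa
              for aa, ap, bb in zip(vectors[a], vectors[a_prime], vectors[b])]
    candidates = (w for w in vectors if w not in (a, a_prime, b))
    return max(candidates, key=lambda w: cosine(vectors[w], target))


# Hypothetical toy embeddings, chosen so the analogies work out exactly.
toy_vectors = {
    "man": [1.0, 0.0, 0.2],
    "woman": [1.0, 1.0, 0.2],
    "king": [0.2, 0.0, 1.0],
    "queen": [0.2, 1.0, 1.0],
    "apple": [0.9, 0.1, 0.0],
}

# Each question is (a, a', b, expected b').
questions = [("man", "woman", "king", "queen"),
             ("king", "queen", "man", "woman")]
correct = sum(solve_analogy(toy_vectors, a, ap, b) == expected
              for a, ap, b, expected in questions)
accuracy = 100.0 * correct / len(questions)
print(accuracy)  # 100.0
```

The accuracy reported in the chart below is the same kind of percentage, computed per question category over the real analogy questions.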

In [None]:
data = []
for framework in report['frameworks']:
    x, y = zip(*report['qa'][framework])
    trace = go.Bar(x=x, y=y, name=framework)
    data.append(trace)

layout = go.Layout(
    title='Analogies Task (Questions & Answers) Report',
    xaxis=dict(tickangle=-45),
    yaxis=dict(title='Accuracy (in %)'),
    barmode='group'
)
fig = dict(data=data, layout=layout)
py.iplot(fig)  