# Colab running nptsne

# <img src="http://reconstrue.com/assets/images/reconstrue_logo_brandmark.svg" width="42px" align="top" /> **Reconstrue**




## Introduction

### Legal

Copyright 2019 Reconstrue LLC.
Licensed under the Apache License, Version 2.0 (the "License").


In [0]:
# Copyright 2019 Reconstrue LLC. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================

### Basic links
"nptsne is a numpy compatible python binary package that offers a number of APIs for fast tSNE calculation."

- https://github.com/biovault/nptsne
- PyPI: [nptsne 1.0.0](https://pypi.org/project/nptsne/)
- Demo notebook: [NPTSNE_notebooktests.ipynb](https://github.com/biovault/nptsne/blob/master/examples/NPTSNE_notebooktests.ipynb)

## Notes

https://twitter.com/ThomasHollt/status/1192738112876756992

>Our multi-platform, multi-GPU GPU t-SNE (https://doi.org/10.1109/TVCG.2019.2934307) is now also available directly through pypi. Simply type
>
>`pip install nptsne`
>
>on Mac and windows. (Check https://pypi.org/project/nptsne/ for more details and Linux instructions)

## Install nptsne on Colab

[The nptsne page on pypi.org](https://pypi.org/project/nptsne/) says to just `pip install nptsne`. But the page also says that Linux, unlike Windows and macOS, requires a slightly more complicated install: on Linux you need to download the nptsne wheel built for your specific Python version, 3.6 or 3.7. That is confirmed by the following failure:

In [0]:
#!pip install nptsne
#
#ERROR: Could not find a version that satisfies the requirement nptsne (from versions: none)
#ERROR: No matching distribution found for nptsne

According to the install instructions, the wheel must match the Python minor version (3.6.x vs. 3.7.x), so check which one this runtime uses:


In [3]:
import platform
from datetime import date

print("Python runtime version: %s" % platform.python_version())
print(f'Date run: {date.today()}')

Python runtime version: 3.6.9
Date run: 2019-12-16
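
The wheel file name encodes the CPython version, so the matching file can be derived from the running interpreter. A minimal sketch, assuming the 3.7 wheel follows the same naming pattern as the `cp36` file used in this notebook; `wheel_name` is a hypothetical helper, not part of nptsne:

```python
import sys


def wheel_name(version_info=sys.version_info, package_version="1.0.0"):
    """Build the expected nptsne Linux wheel file name for a Python version.

    Assumption: every wheel follows the naming pattern of the cp36 file
    used in this notebook; only the cpXY tag changes.
    """
    python_tag = f"cp{version_info[0]}{version_info[1]}"
    return f"nptsne-{package_version}-{python_tag}-none-linux_x86_64.whl"


# Show the file name that would match the current interpreter.
print(wheel_name())
```

Whether a wheel actually exists for a given version still has to be checked against the install instructions on the PyPI page.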


So, download the 3.6 wheel for nptsne 1.0.0, `nptsne-1.0.0-cp36-none-linux_x86_64.whl`, then install it:

In [4]:
!wget --show-progress -q http://cytosplore.lumc.nl:8081/artifactory/wheels/nptsne/nptsne-1.0.0-cp36-none-linux_x86_64.whl
!echo --
!ls
!echo --
!pip install nptsne-1.0.0-cp36-none-linux_x86_64.whl

--
mnist-original.mat
nptsne-1.0.0-cp36-none-linux_x86_64.whl
nptsne-1.0.0-cp36-none-linux_x86_64.whl.1
nptsne-1.0.0-cp36-none-linux_x86_64.whl.2
nptsne-1.0.0-cp36-none-linux_x86_64.whl.3
sample_data
--
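
The `.whl.1`, `.whl.2` and `.whl.3` files in the listing are left over from re-running the cell: `wget` saves a numbered copy when the target file already exists. A minimal sketch of a download that skips files already present, using the same `os.path.isfile` check this notebook applies to the MNIST file; `download_once` and its `fetch` parameter are hypothetical names introduced here:

```python
import os
import urllib.request

# URL taken from the wget command in this notebook.
WHEEL_URL = ("http://cytosplore.lumc.nl:8081/artifactory/wheels/nptsne/"
             "nptsne-1.0.0-cp36-none-linux_x86_64.whl")


def download_once(url, path=None, fetch=urllib.request.urlretrieve):
    """Download url to path unless the file is already there.

    Returns True if a download happened, False if it was skipped.
    """
    path = path or os.path.basename(url)
    if os.path.isfile(path):
        return False  # already downloaded; avoid numbered duplicates
    fetch(url, path)
    return True
```

Calling `download_once(WHEEL_URL)` before the `pip install` step should leave a single copy of the wheel no matter how often the cell is re-run.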


In [5]:
!apt-get install libglfw3

Reading package lists... Done
Building dependency tree       
Reading state information... Done
libglfw3 is already the newest version (3.2.1-1).
The following package was automatically installed and is no longer required:
  libnvidia-common-430
Use 'apt autoremove' to remove it.
0 upgraded, 0 newly installed, 0 to remove and 7 not upgraded.


In [6]:
!pip install glfw



## Exercise nptsne on Colab

### Follow example notebook

https://github.com/biovault/nptsne/blob/master/examples/NPTSNE_notebooktests.ipynb


In [7]:
import sys
import nptsne
import matplotlib.pyplot as plt
from   matplotlib import rc
import numpy as np

from six.moves import urllib
from scipy.io import loadmat
from matplotlib import colors as mcolors
from timeit import default_timer as timer

print("Running python {}.{}".format(sys.version_info.major, sys.version_info.minor))


Running python 3.6


### nptsne.TextureTsne API

In [8]:

import nptsne
help(nptsne)

Help on package nptsne:

NAME
    nptsne

PACKAGE CONTENTS
    example
    version

SUBMODULES
    libs

CLASSES
    pybind11_builtins.pybind11_object(builtins.object)
        nptsne.libs._nptsne.KnnAlgorithm
        nptsne.libs._nptsne.TextureTsne
        nptsne.libs._nptsne.TextureTsneExtended
    
    class KnnAlgorithm(pybind11_builtins.pybind11_object)
     |  Enumeration used to select the knn algorithm used. Two possibilities are
     |  supported:
     |  
     |  :obj:`KnnAlgorithm.Flann`: Knn using FLANN - Fast Library for Approximate Nearest Neighbors
     |  
     |  :obj:`KnnAlgorithm.HNSW`: Knn using Hnswlib - fast approximate nearest neighbor search
     |  
     |  Method resolution order:
     |      KnnAlgorithm
     |      pybind11_builtins.pybind11_object
     |      builtins.object
     |  
     |  Methods defined here:
     |  
     |  __eq__(...)
     |      __eq__(self: nptsne.libs._nptsne.KnnAlgorithm, arg0: nptsne.libs._nptsne.KnnAlgorithm) -> bool
     |  
  

In [9]:
from nptsne import KnnAlgorithm
help(KnnAlgorithm)

Help on class KnnAlgorithm in module nptsne.libs._nptsne:

class KnnAlgorithm(pybind11_builtins.pybind11_object)
 |  Enumeration used to select the knn algorithm used. Two possibilities are
 |  supported:
 |  
 |  :obj:`KnnAlgorithm.Flann`: Knn using FLANN - Fast Library for Approximate Nearest Neighbors
 |  
 |  :obj:`KnnAlgorithm.HNSW`: Knn using Hnswlib - fast approximate nearest neighbor search
 |  
 |  Method resolution order:
 |      KnnAlgorithm
 |      pybind11_builtins.pybind11_object
 |      builtins.object
 |  
 |  Methods defined here:
 |  
 |  __eq__(...)
 |      __eq__(self: nptsne.libs._nptsne.KnnAlgorithm, arg0: nptsne.libs._nptsne.KnnAlgorithm) -> bool
 |  
 |  __ge__(...)
 |      __ge__(self: nptsne.libs._nptsne.KnnAlgorithm, arg0: nptsne.libs._nptsne.KnnAlgorithm) -> bool
 |  
 |  __getstate__(...)
 |      __getstate__(self: nptsne.libs._nptsne.KnnAlgorithm) -> tuple
 |  
 |  __gt__(...)
 |      __gt__(self: nptsne.libs._nptsne.KnnAlgorithm, arg0: nptsne.libs._nptsne

### Download MNIST data for use in the demos

In [10]:
import os 

mnist_path = 'mnist-original.mat'
if not os.path.isfile(mnist_path):
    mnist_alternative_url = 'https://github.com/amplab/datascience-sp14/raw/master/lab7/mldata/mnist-original.mat'
    response = urllib.request.urlopen(mnist_alternative_url)
    with open(mnist_path, 'wb') as f:
        content = response.read()
        f.write(content)
mnist_raw = loadmat(mnist_path)
mnist = {
    'data': mnist_raw['data'].T,
    'label': mnist_raw['label'][0],
    'COL_NAMES': ['label', 'data']
}
print('MNIST data dimensions: ', mnist['data'].shape)

MNIST data dimensions:  (70000, 784)


### Create a tSNE embedding of the 70000 MNIST data points & display the elapsed time


In [11]:
tsne = nptsne.TextureTsne(False,1000,2,30,800, nptsne.KnnAlgorithm.Flann)
# Can also be run with knn as HNSW: this is faster on very large datasets of lower-dimensional data (<40 dimensions)
# tsne = nptsne.TextureTsne(False,1000,2,30,800, nptsne.KnnAlgorithm.HNSW)

embedding = None
try:
   
    for i in range(1):
        start = timer()
        embedding = tsne.fit_transform(mnist['data'])
        end = timer()
        print(f'got embedding in {end - start}')
except Exception as ex:
    print('Error....')
    print(ex)

Error....
Unable to initialize GLFW.
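
GLFW needs a windowing system to create its OpenGL context, and a Colab runtime is headless, which is one plausible cause of this error (not confirmed here). A minimal diagnostic sketch, assuming an X11 or Wayland session would be advertised through the usual `DISPLAY` or `WAYLAND_DISPLAY` environment variables; `has_display` is a hypothetical helper:

```python
import os


def has_display(env=None):
    """Return True if the environment advertises an X11 or Wayland display."""
    env = os.environ if env is None else env
    return bool(env.get("DISPLAY") or env.get("WAYLAND_DISPLAY"))


if not has_display():
    # Hint only: a missing display does not prove this is the cause.
    print("No display found: GLFW-based code is likely to fail here.")
```

Running this check before constructing `nptsne.TextureTsne` would at least separate "no display available" from other GLFW initialization failures.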
