Python package to accelerate the sparse matrix multiplication and top-n similarity selection
ymwdalex
Latest commit 3cb5818 Nov 27, 2018

sparse_dot_topn:

sparse_dot_topn provides a fast way to perform a sparse matrix multiplication followed by selection of the top-n multiplication results.

Comparing very large feature vectors and picking the best matches often comes down, in practice, to a sparse matrix multiplication followed by selecting the top-n multiplication results. In this package, we implement a customized Cython function for this purpose. Compared with doing the same with SciPy and NumPy functions, our Cython approach improves the speed by about 40% and reduces memory consumption.
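
To make the two steps concrete, here is a minimal sketch of the plain SciPy and NumPy route: multiply first, then select per row. It is not the code from this package or from example/comparison.py; the function name `naive_dot_topn` and its parameters are hypothetical and chosen only for illustration.

```python
import numpy as np
from scipy.sparse import csr_matrix


def naive_dot_topn(a, b, ntop, lower_bound=0.0):
    """Hypothetical baseline: compute a @ b in full, then keep at most
    ntop values greater than lower_bound in each row."""
    product = csr_matrix(a @ b)  # the full product is materialized first
    rows, cols, vals = [], [], []
    for i in range(product.shape[0]):
        start, end = product.indptr[i], product.indptr[i + 1]
        row_vals = product.data[start:end]
        row_cols = product.indices[start:end]
        # Drop candidates that do not exceed the threshold.
        keep = row_vals > lower_bound
        row_vals, row_cols = row_vals[keep], row_cols[keep]
        # Positions of the ntop largest remaining values.
        order = np.argsort(-row_vals)[:ntop]
        rows.extend([i] * len(order))
        cols.extend(row_cols[order])
        vals.extend(row_vals[order])
    return csr_matrix(
        (np.array(vals, dtype=product.dtype),
         (np.array(rows, dtype=np.int64), np.array(cols, dtype=np.int64))),
        shape=product.shape,
    )


# Small random sparse inputs, built with NumPy only.
rng = np.random.RandomState(0)
a = csr_matrix(rng.rand(20, 50) * (rng.rand(20, 50) < 0.1))
b = csr_matrix(rng.rand(50, 30) * (rng.rand(50, 30) < 0.1))
c = naive_dot_topn(a, b, ntop=5, lower_bound=0.01)
```

The baseline builds the whole product before throwing most of it away, which is the extra work and memory that a combined multiply-and-select routine can avoid.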

This package is made by the ING Wholesale Banking Advanced Analytics team. This blog explains how we implemented it.

Example

from scipy.sparse import rand
from sparse_dot_topn import awesome_cossim_topn

# Number of top results to keep per row
N = 10

# Two random sparse matrices in CSR format with compatible shapes
a = rand(100, 1000000, density=0.005, format='csr')
b = rand(1000000, 200, density=0.005, format='csr')

# Multiply a by b, keeping at most the N largest results per row that exceed 0.01
c = awesome_cossim_topn(a, b, N, 0.01)
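
Once the result is available, the matches per row can be read straight from the sparse structure. The sketch below assumes that `c` is a SciPy CSR matrix, which this README does not state explicitly. The helper `matches_per_row` is hypothetical, and a small hand-built matrix stands in for a real result so that the snippet runs without the package installed.

```python
import numpy as np
from scipy.sparse import csr_matrix


def matches_per_row(c):
    """Hypothetical helper: map each row index to its stored
    (column, score) pairs, sorted by descending score."""
    c = csr_matrix(c)
    result = {}
    for i in range(c.shape[0]):
        start, end = c.indptr[i], c.indptr[i + 1]
        cols = c.indices[start:end]
        scores = c.data[start:end]
        order = np.argsort(-scores)  # highest score first
        result[i] = [(int(cols[j]), float(scores[j])) for j in order]
    return result


# Stand-in for a top-n result: 3 rows, at most 2 stored scores per row.
c = csr_matrix(np.array([[0.0, 0.3, 0.9],
                         [0.0, 0.0, 0.0],
                         [0.5, 0.0, 0.2]]))
matches = matches_per_row(c)
```

Rows with no match above the threshold map to an empty list.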

You can also find code that compares our accelerated method with calling the SciPy and NumPy functions directly in example/comparison.py.

Dependency and Install

Install numpy and cython before installing this package. Then,

pip install sparse_dot_topn

Uninstall

pip uninstall sparse_dot_topn