Skip to content

Theano pairwise function#9

Merged
ogrisel merged 3 commits intonumfocus:masterfrom
jaberg:master
Jul 10, 2013
Merged

Theano pairwise function#9
ogrisel merged 3 commits intonumfocus:masterfrom
jaberg:master

Conversation

@jaberg
Copy link
Copy Markdown
Contributor

@jaberg jaberg commented Jul 9, 2013

It is cheating by using BLAS and being less accurate than other implementations. Is that really cheating? It could be more accurate by centering data before computing all pairs, is that "enough" ? The way to express the looping algorithm from the other implementations in Theano is to write an Op for that, essentially passing the burden of speed along to cython / numba / numpy / julia.

@jaberg
Copy link
Copy Markdown
Contributor Author

jaberg commented Jul 9, 2013

(This addresses #8.)

@aterrel
Copy link
Copy Markdown
Member

aterrel commented Jul 9, 2013

Perhaps we could make one that has the same accuracy and one that is not cheating. At least be up front about it in the report.

I have an example that I didn't put in solving a Poisson eqn that depending on the method can span 4 orders of magnitude in runtime. It's not showing off the compiler tech by doing that unless the compiler tech can automatically detect this switch and do it for you (which I hope my sfl dsl will do someday).

@jaberg
Copy link
Copy Markdown
Contributor Author

jaberg commented Jul 10, 2013

I added a non-cheating implementation, inspired by recent reading of the Tensor Contraction Engine (how does TCE maintain tolerances for numerical accuracy??). This impl currently uses more RAM than the other implementations, but that's just a compiler implementation detail right?

@ogrisel
Copy link
Copy Markdown
Contributor

ogrisel commented Jul 10, 2013

RAM usage is not yet measured :)

Also I wanted to add an implementation that uses just python + numpy as done in sklearn: https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/metrics/pairwise.py#L176

in our case, safe_sparse_dot is just np.dot which is basically the BLAS GEMM.

@ogrisel
Copy link
Copy Markdown
Contributor

ogrisel commented Jul 10, 2013

When running this branch I get:

Traceback (most recent call last):
  File "run_benchmarks.py", line 77, in find_benchmarks
    module = __import__(abs_module_name, fromlist="dummy")
  File "/Users/ogrisel/code/python-benchmarks/pairwise/pairwise_theano.py", line 32, in 
    pairwise_theano_tensor_prepare('float64'),
  File "/Users/ogrisel/code/python-benchmarks/pairwise/pairwise_theano.py", line 11, in pairwise_theano_tensor_prepare
    TT.sqr(X[:, None, :] - X),
  File "/usr/local/Cellar/python/2.7.5/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/Theano-0.6.0rc3-py2.7.egg/theano/tensor/basic.py", line 1864, in __getitem__
    view = self.dimshuffle(pattern)
  File "/usr/local/Cellar/python/2.7.5/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/Theano-0.6.0rc3-py2.7.egg/theano/tensor/basic.py", line 1788, in dimshuffle
    return op(self)
  File "/usr/local/Cellar/python/2.7.5/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/Theano-0.6.0rc3-py2.7.egg/theano/gof/op.py", line 407, in __call__
    raise ValueError('Cannot compute test value: input %i (%s) of Op %s missing default value' % (i, ins, node))
ValueError: Cannot compute test value: input 0 () of Op DimShuffle{0,x,1}() missing default value

@jaberg
Copy link
Copy Markdown
Contributor Author

jaberg commented Jul 10, 2013

This "test_value" business is one of the things in Theano I know next-to-nothing about. I suggest disabling it. It might be in your theanorc.

@ogrisel
Copy link
Copy Markdown
Contributor

ogrisel commented Jul 10, 2013

@jaberg I don't have a theanorc. Anyway to make this benchmark work without custom config on the user environment?

@jaberg
Copy link
Copy Markdown
Contributor Author

jaberg commented Jul 10, 2013

I didn't mean you would need a custom rc, I was only suggesting to delete any customizations you had made. I can't reproduce your crash, can you tell me what version of theano you are using?

@ogrisel
Copy link
Copy Markdown
Contributor

ogrisel commented Jul 10, 2013

I was on master, I think. I will retry on the last stable release.

@ogrisel
Copy link
Copy Markdown
Contributor

ogrisel commented Jul 10, 2013

Indeed it works with the latest stable version. I am pep8-ing and merging.

@ogrisel ogrisel merged commit 2705546 into numfocus:master Jul 10, 2013
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants