When on AMD/Intel platforms, copies for unaligned arrays are disabled.
This is because AMD/Intel processors handle unaligned data very
efficiently, so avoiding the copy is generally a good thing (speed-ups
of up to 2x over doing a copy are easy to observe).
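
To make "unaligned" concrete, here is a minimal sketch (not part of this commit) built on the same kind of structured arrays that bench/unaligned-bench.py uses. The variable names and the small shape are illustrative assumptions, and `np.bool_` stands in for the benchmark's `np.bool` so that the snippet runs on current NumPy releases.

```python
import numpy as np

shape = (1000,)  # small illustrative shape; the benchmark uses a larger one

# float64 + int64: itemsize is 16, so every 'x' value starts on an
# 8-byte boundary.
z_fast = np.zeros(shape, dtype=[('x', np.float64), ('y', np.int64)])

# float64 + bool: itemsize is 9 (packed), so consecutive 'x' values are
# 9 bytes apart and most of them are not on an 8-byte boundary.
z_slow = np.zeros(shape, dtype=[('x', np.float64), ('y', np.bool_)])

x_fast = z_fast['x']
x_slow = z_slow['x']

# NumPy reports alignment through the `aligned` flag of each view.
print("fast:", x_fast.strides, x_fast.flags.aligned)
print("slow:", x_slow.strides, x_slow.flags.aligned)

# An explicit copy lands in a fresh contiguous buffer, which is aligned.
# This is the copy that the change described here tries to avoid.
x_copy = x_slow.copy()
print("copy:", x_copy.strides, x_copy.flags.aligned)
```

The `x_slow` view is the case where skipping the copy matters: its data is valid but strided by 9 bytes, so any code that insists on aligned input has to duplicate it first.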

However, this optimization is not active when Numexpr is compiled with
Intel VML (Vector Math Library).  The rationale is that, surprisingly
enough, VML does not optimize access to unaligned arrays, so the
performance of its functions can drop a lot (by as much as 10x in some
cases!).
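
The policy described above can be summarized in a small sketch. The function name `needs_aligned_copy` and its signature are hypothetical and only restate the rules in this message; the real check lives in Numexpr's internals and may be structured differently. The flag names `use_vml` and `is_cpu_amd_intel` are the ones set in numexpr/__init__.py.

```python
def needs_aligned_copy(is_aligned, is_cpu_amd_intel, use_vml):
    """Return True if an operand should be copied to an aligned buffer.

    Hypothetical helper: it only encodes the rules stated in the commit
    message, not Numexpr's actual implementation.
    """
    if is_aligned:
        return False  # aligned data never needs the copy
    if use_vml:
        return True   # VML is slow on unaligned data, so keep copying
    # Without VML, skip the copy on AMD/Intel and keep it elsewhere.
    return not is_cpu_amd_intel


# Print the decision for every unaligned case.
for amd_intel in (True, False):
    for vml in (True, False):
        decision = needs_aligned_copy(False, amd_intel, vml)
        print("amd_intel=%s vml=%s -> copy=%s" % (amd_intel, vml, decision))
```

Only one combination skips the copy for unaligned input: an AMD/Intel CPU with a build that does not use VML.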

Finally, the ``cpuinfo.py`` module has been added to the code in order
to detect AMD/Intel CPUs easily.  Thanks to Pearu Peterson for providing
this nice piece of code.
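
For readers unfamiliar with vendor detection, the following is a rough sketch of one way to do it on Linux, by reading the `vendor_id` field from /proc/cpuinfo-style text. It is an assumption for illustration only and is not taken from ``cpuinfo.py``, which exposes `cpu.is_AMD()` and `cpu.is_Intel()` and may work differently. The helper names and the sample strings are hypothetical.

```python
def vendor_from_cpuinfo(text):
    """Return the first 'vendor_id' value found in the text, or None."""
    for line in text.splitlines():
        key, sep, value = line.partition(':')
        if sep and key.strip() == 'vendor_id':
            return value.strip()
    return None


def is_amd_or_intel(text):
    """True if the vendor string is the one AMD or Intel CPUs report."""
    return vendor_from_cpuinfo(text) in ('AuthenticAMD', 'GenuineIntel')


# Hypothetical excerpts in the format used by /proc/cpuinfo on Linux.
sample_intel = "processor\t: 0\nvendor_id\t: GenuineIntel\ncpu family\t: 6\n"
sample_amd = "processor\t: 0\nvendor_id\t: AuthenticAMD\n"
sample_other = "processor\t: 0\nmodel name\t: Some other CPU\n"

print(is_amd_or_intel(sample_intel))
print(is_amd_or_intel(sample_amd))
print(is_amd_or_intel(sample_other))
```

On a real Linux system the text would come from reading /proc/cpuinfo; other platforms need a different source, which is part of what a dedicated module such as ``cpuinfo.py`` takes care of.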
FrancescAlted committed Jun 23, 2009
1 parent 85770c0 commit f7bb1a6
Showing 9 changed files with 749 additions and 23 deletions.
31 changes: 31 additions & 0 deletions LICENSES/cpuinfo.txt
@@ -0,0 +1,31 @@
Copyright statement for `cpuinfo` module.

Copyright 2002 Pearu Peterson all rights reserved,
Pearu Peterson <pearu@cens.ioc.ee>

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:

* Redistributions of source code must retain the above copyright
notice, this list of conditions and the following disclaimer.

* Redistributions in binary form must reproduce the above
copyright notice, this list of conditions and the following
disclaimer in the documentation and/or other materials provided
with the distribution.

* Neither the name of Pearu Peterson nor the names of its
contributors may be used to endorse or promote products derived
from this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY PEARU PETERSON ''AS IS'' AND ANY EXPRESS
OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL PEARU PETERSON BE LIABLE FOR ANY DIRECT,
INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING
IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
POSSIBILITY OF SUCH DAMAGE.
4 changes: 2 additions & 2 deletions bench/boolean_timing.py
@@ -114,8 +114,8 @@ def compare(expression=False):

 if __name__ == '__main__':

-    print 'Python version: %s' % sys.version
-    print "NumPy version: %s" % numpy.__version__
+    import numexpr
+    numexpr.print_versions()

     if len(sys.argv) > 1:
         expression = sys.argv[1]
3 changes: 3 additions & 0 deletions bench/timing.py
@@ -126,6 +126,9 @@ def compare(check_only=False):
     return average

 if __name__ == '__main__':
+    import numexpr
+    numexpr.print_versions()
+
     averages = []
     for i in range(iterations):
         averages.append(compare())
5 changes: 4 additions & 1 deletion bench/unaligned-bench.py
@@ -7,7 +7,10 @@
 import numexpr as ne

 niter = 10
-shape = (1000, 10000)
+#shape = (1000*10000) # unidimensional test
+shape = (1000, 10000) # multidimensional test
+
+ne.print_versions()

 Z_fast = np.zeros(shape, dtype=[('x',np.float64),('y',np.int64)])
 Z_slow = np.zeros(shape, dtype=[('x',np.float64),('y',np.bool)])
6 changes: 2 additions & 4 deletions bench/vml_timing.py
@@ -114,10 +114,8 @@ def compare(expression=False):
     print

 if __name__ == '__main__':
-
-    print 'Python version: %s' % sys.version
-    print "NumPy version: %s" % numpy.__version__
-    print "use vml %s" % numexpr.use_vml
+    import numexpr
+    numexpr.print_versions()

     numexpr.set_vml_accuracy_mode('low')
     numexpr.set_vml_num_threads(2)
6 changes: 6 additions & 0 deletions numexpr/__init__.py
@@ -18,6 +18,12 @@
 else:
     use_vml = False

+from cpuinfo import cpu
+if cpu.is_AMD() or cpu.is_Intel():
+    is_cpu_amd_intel = True
+else:
+    is_cpu_amd_intel = False
+
 import os.path
 from numexpr.expressions import E
 from numexpr.necompiler import NumExpr, disassemble, evaluate
