Import 3.3.3
plorkyeran committed Jan 1, 2014
1 parent 9041893 commit d68ca63
Showing 624 changed files with 3,676 additions and 2,513 deletions.
129 changes: 129 additions & 0 deletions ChangeLog
@@ -1,3 +1,132 @@
Sat Nov 24 22:37:54 EST 2012 stevenj@fftw.org
* fixed deadlock bug caused by bogosity flag getting out of synch between processes; thanks to Michael Pippig for the bug report

M ./kernel/ifftw.h +1
M ./kernel/planner.c -3 +6
M ./mpi/api.c +12

Wed Nov 21 18:34:29 EST 2012 athena@fftw.org
* Updated NEWS

M ./NEWS -2 +7

Wed Nov 21 18:33:15 EST 2012 athena@fftw.org
* use 2x2 AVX transposition instead of individual stores.

This seems to improve single-precision AVX on Sandy Bridge machines.


M ./simd-support/simd-avx.h -2 +14

Tue Nov 20 12:18:00 EST 2012 stevenj@fftw.org
* revert part of Taylor patch to acx_mpi.m4: do not link -lmpi if mpicc works without libraries, as -lmpi may be some completely different MPI implementation

M ./m4/acx_mpi.m4 -3 +3

Tue Nov 20 11:44:57 EST 2012 stevenj@fftw.org
* fix deadlock bug (thanks to Michael Pippig for the bug report and patch, and to Graham Dennis for the bug report) in which some processes called MPI_Alltoall and some called MPI_Alltoallv

M ./mpi/transpose-alltoall.c -3 +2

Mon Oct 29 15:20:01 EDT 2012 athena@fftw.org
* fix texinfo quirk

M ./doc/tutorial.texi -2 +2

Mon Oct 29 09:16:43 EDT 2012 athena@fftw.org
* clarify that padding only applies to in-place transforms

M ./doc/tutorial.texi -5 +10
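
Assuming the padding entry above refers to the real array of a real-to-complex transform (the entry itself does not say so), the rule can be sketched as a size calculation. `r2c_real_row_length` is a hypothetical helper for illustration, not part of the FFTW API.

```c
#include <assert.h>
#include <stddef.h>

/* Number of real (double) slots per row of the last dimension for a
   real-to-complex transform of logical length n.
   Out-of-place: the real input holds exactly n values, no padding.
   In-place: the same buffer must also hold n/2 + 1 complex outputs,
   i.e. 2 * (n/2 + 1) doubles, so each row carries one or two unused
   padding slots at its end. */
size_t r2c_real_row_length(size_t n, int in_place)
{
    assert(n > 0);
    return in_place ? 2 * (n / 2 + 1) : n;
}
```

For n = 8 this gives 10 slots in-place and 8 out-of-place; for n = 7 it gives 8 and 7.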

Sun Oct 28 18:42:48 EDT 2012 athena@fftw.org
* make the index-computation logic less paranoid

The problem is that for each K and for each expression of the form P[I
+ STRIDE * K] in a loop, most compilers will try to lift an induction
variable PK := &P[I + STRIDE * K]. In large codelets we have many
such values of K. For example, a codelet of size 32 with 4 input
pointers will generate O(128) induction variables, which will likely
overflow the register set, which is likely worse than doing the index
computation in the first place.

In the past we (wisely and correctly) assumed that compilers will do
the wrong thing, and consequently we disabled the induction-variable
"optimization" altogether by setting STRIDE ^= ZERO, where ZERO is a
value guaranteed to be 0. Since the compiler does not know that
ZERO=0, it cannot perform its "optimization" and it is forced to
behave sensibly.

With this patch, FFTW is a little bit less paranoid. FFTW now
disables the induction-variable "optimization" only when we estimate
that the codelet uses more than ESTIMATED_AVAILABLE_INDEX_REGISTERS
induction variables.

Currently we set ESTIMATED_AVAILABLE_INDEX_REGISTERS=16. 16 registers ought
to be enough for anybody (or so the amd64 and ARM ISAs seem to imply).


M ./genfft/gen_hc2c.ml -1 +1
M ./genfft/gen_hc2cdft.ml -1 +1
M ./genfft/gen_hc2cdft_c.ml -1 +1
M ./genfft/gen_hc2hc.ml -1 +1
M ./genfft/gen_notw.ml -2 +2
M ./genfft/gen_notw_c.ml -2 +2
M ./genfft/gen_r2cb.ml -3 +3
M ./genfft/gen_r2cf.ml -3 +3
M ./genfft/gen_r2r.ml -2 +2
M ./genfft/gen_twiddle.ml -1 +1
M ./genfft/gen_twiddle_c.ml -1 +1
M ./genfft/gen_twidsq.ml -2 +2
M ./genfft/gen_twidsq_c.ml -2 +2
M ./genfft/genutil.ml -1 +2
M ./kernel/ifftw.h -3 +20
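
The stride trick described in the index-computation entry above can be sketched in plain C. This is a minimal illustration, not FFTW's actual macro: `opaque_zero`, `EST_INDEX_REGISTERS`, and `strided_sum` are hypothetical names, and whether a particular compiler really backs off from lifting induction variables is an assumption the sketch cannot guarantee.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for ESTIMATED_AVAILABLE_INDEX_REGISTERS. */
#define EST_INDEX_REGISTERS 16

/* A zero the optimizer cannot assume: volatile forces a real load. */
static volatile ptrdiff_t opaque_zero = 0;

/* Sums p[i + stride * k] for 0 <= i < n and 0 <= k < nk.
   n_index_exprs is the caller's estimate of how many distinct
   P[I + STRIDE * K] expressions the loop body contains. */
double strided_sum(const double *p, ptrdiff_t stride, int nk, int n,
                   int n_index_exprs)
{
    double acc = 0.0;
    assert(p != NULL && nk >= 0 && n >= 0);
    for (int i = 0; i < n; ++i) {
        for (int k = 0; k < nk; ++k)
            acc += p[i + stride * k];
        /* Only when the estimate exceeds the register budget, rewrite
           the stride on every iteration. XOR with zero leaves the value
           unchanged, but the compiler can no longer treat stride as
           loop-invariant, so it has no basis for precomputing one
           pointer per k. */
        if (n_index_exprs > EST_INDEX_REGISTERS)
            stride ^= opaque_zero;
    }
    return acc;
}
```

Calling `strided_sum` with an estimate of 4 or of 128 returns the same sum; the only intended difference is in the code the compiler is able to generate for the loop.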

Sun Oct 28 18:33:24 EDT 2012 athena@fftw.org
* silence warnings

M ./kernel/buffered.c +1
M ./rdft/rank0.c +1

Sat Oct 27 09:58:49 EDT 2012 athena@fftw.org
* bump version to 3.3.3

M ./NEWS +7
M ./configure.ac -1 +1

Sat Oct 27 09:55:15 EDT 2012 athena@fftw.org
* evaluate plans for >1ms when using gettimeofday()

The previous limit 10ms was too paranoid, and it made life difficult
on machines without an "official" cycle counter, such as ARM.

M ./kernel/timer.c -1 +1
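
A coarse clock such as gettimeofday() reports time in microseconds, so a single short run cannot be timed reliably; the work has to be repeated until a minimum interval has passed. The following is a minimal sketch of that idea using the 1 ms threshold from the entry above. It is not FFTW's timer code: `seconds_per_call`, `bump`, and the doubling strategy are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/time.h>

/* Minimum measurement interval: 1 ms, as in the entry above. */
#define MIN_MEASURE_SECONDS 1.0e-3

static double now_seconds(void)
{
    struct timeval tv;
    gettimeofday(&tv, NULL);
    return (double)tv.tv_sec + 1.0e-6 * (double)tv.tv_usec;
}

/* Runs fn(arg) in batches, doubling the batch size until one batch
   takes at least MIN_MEASURE_SECONDS, then returns the average time
   per call for that batch. The iteration cap keeps the loop finite
   even if the wall clock misbehaves. */
double seconds_per_call(void (*fn)(void *), void *arg)
{
    assert(fn != NULL);
    for (long iters = 1; ; iters *= 2) {
        double t0 = now_seconds();
        for (long i = 0; i < iters; ++i)
            fn(arg);
        double elapsed = now_seconds() - t0;
        if (elapsed >= MIN_MEASURE_SECONDS || iters >= (1L << 30))
            return elapsed / (double)iters;
    }
}

/* Example workload: increments a counter through a volatile pointer
   so the calls are not optimized away. */
void bump(void *arg)
{
    ++*(volatile long *)arg;
}
```

A lower threshold means fewer repetitions per candidate plan, which is where the planner speed-up on machines without a cycle counter would come from.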

Sat Oct 27 09:46:04 EDT 2012 athena@fftw.org
* use 4-way NEON SIMD instead of 2-way

Kai-Uwe Bloem tried to warn me a year ago that 128-bit NEON was better
than 64-bit NEON even on machines with a 64-bit pipe, but I foolishly
did not listen. Now that 128-bit NEON pipes are starting to appear on
the market it is definitely time to switch.


M ./simd-support/simd-neon.h -55 +100

Wed Sep 26 14:21:12 EDT 2012 athena@fftw.org
* Note that fftw-3.3 includes MPI support

M ./doc/intro.texi -5 +4

Wed Jul 18 11:25:40 EDT 2012 athena@fftw.org
* remove obsolete unused function

M ./dft/bluestein.c -14

Fri Jun 29 15:57:14 EDT 2012 stevenj@fftw.org
* whoops, call omp_get_max_threads; thanks to Hanno Rein for the bug report

M ./doc/threads.texi -1 +1

Sat Apr 28 10:55:09 EDT 2012 athena@fftw.org
* Fix libfftw3/libfftw3_threads chicken-egg problem

45 changes: 34 additions & 11 deletions Makefile.in
@@ -1,4 +1,4 @@
-# Makefile.in generated by automake 1.11.3 from Makefile.am.
+# Makefile.in generated by automake 1.11.6 from Makefile.am.
# @configure_input@

# Copyright (C) 1994, 1995, 1996, 1997, 1998, 1999, 2000, 2001, 2002,
@@ -17,6 +17,23 @@


VPATH = @srcdir@
am__make_dryrun = \
{ \
am__dry=no; \
case $$MAKEFLAGS in \
*\\[\ \ ]*) \
echo 'am--echo: ; @echo "AM" OK' | $(MAKE) -f - 2>/dev/null \
| grep '^AM OK$$' >/dev/null || am__dry=yes;; \
*) \
for am__flg in $$MAKEFLAGS; do \
case $$am__flg in \
*=*|--*) ;; \
*n*) am__dry=yes; break;; \
esac; \
done;; \
esac; \
test $$am__dry = yes; \
}
pkgdatadir = $(datadir)/@PACKAGE@
pkgincludedir = $(includedir)/@PACKAGE@
pkglibdir = $(libdir)/@PACKAGE@
@@ -124,6 +141,11 @@ RECURSIVE_TARGETS = all-recursive check-recursive dvi-recursive \
install-pdf-recursive install-ps-recursive install-recursive \
installcheck-recursive installdirs-recursive pdf-recursive \
ps-recursive uninstall-recursive
am__can_run_installinfo = \
case $$AM_UPDATE_INFO_DIR in \
n|no|NO) false;; \
*) (install-info --version) >/dev/null 2>&1;; \
esac
DATA = $(pkgconfig_DATA)
RECURSIVE_CLEAN_TARGETS = mostlyclean-recursive clean-recursive \
distclean-recursive maintainer-clean-recursive
@@ -447,14 +469,15 @@ fftw.pc: $(top_builddir)/config.status $(srcdir)/fftw.pc.in
cd $(top_builddir) && $(SHELL) ./config.status $@
install-libLTLIBRARIES: $(lib_LTLIBRARIES)
@$(NORMAL_INSTALL)
-test -z "$(libdir)" || $(MKDIR_P) "$(DESTDIR)$(libdir)"
@list='$(lib_LTLIBRARIES)'; test -n "$(libdir)" || list=; \
list2=; for p in $$list; do \
if test -f $$p; then \
list2="$$list2 $$p"; \
else :; fi; \
done; \
test -z "$$list2" || { \
+echo " $(MKDIR_P) '$(DESTDIR)$(libdir)'"; \
+$(MKDIR_P) "$(DESTDIR)$(libdir)" || exit 1; \
echo " $(LIBTOOL) $(AM_LIBTOOLFLAGS) $(LIBTOOLFLAGS) --mode=install $(INSTALL) $(INSTALL_STRIP_FLAG) $$list2 '$(DESTDIR)$(libdir)'"; \
$(LIBTOOL) $(AM_LIBTOOLFLAGS) $(LIBTOOLFLAGS) --mode=install $(INSTALL) $(INSTALL_STRIP_FLAG) $$list2 "$(DESTDIR)$(libdir)"; \
}
@@ -495,8 +518,11 @@ distclean-libtool:
-rm -f libtool config.lt
install-pkgconfigDATA: $(pkgconfig_DATA)
@$(NORMAL_INSTALL)
-test -z "$(pkgconfigdir)" || $(MKDIR_P) "$(DESTDIR)$(pkgconfigdir)"
@list='$(pkgconfig_DATA)'; test -n "$(pkgconfigdir)" || list=; \
+if test -n "$$list"; then \
+echo " $(MKDIR_P) '$(DESTDIR)$(pkgconfigdir)'"; \
+$(MKDIR_P) "$(DESTDIR)$(pkgconfigdir)" || exit 1; \
+fi; \
for p in $$list; do \
if test -f "$$p"; then d=; else d="$(srcdir)/"; fi; \
echo "$$d$$p"; \
@@ -681,13 +707,10 @@ distdir: $(DISTFILES)
done
@list='$(DIST_SUBDIRS)'; for subdir in $$list; do \
if test "$$subdir" = .; then :; else \
-test -d "$(distdir)/$$subdir" \
-|| $(MKDIR_P) "$(distdir)/$$subdir" \
-|| exit 1; \
-fi; \
-done
-@list='$(DIST_SUBDIRS)'; for subdir in $$list; do \
-if test "$$subdir" = .; then :; else \
+$(am__make_dryrun) \
+|| test -d "$(distdir)/$$subdir" \
+|| $(MKDIR_P) "$(distdir)/$$subdir" \
+|| exit 1; \
dir1=$$subdir; dir2="$(distdir)/$$subdir"; \
$(am__relativize); \
new_distdir=$$reldir; \
@@ -773,7 +796,7 @@ distcheck: dist
*.zip*) \
unzip $(distdir).zip ;;\
esac
-chmod -R a-w $(distdir); chmod a+w $(distdir)
+chmod -R a-w $(distdir); chmod u+w $(distdir)
mkdir $(distdir)/_build
mkdir $(distdir)/_inst
chmod a-w $(distdir)
12 changes: 12 additions & 0 deletions NEWS
@@ -1,3 +1,15 @@
FFTW 3.3.3

* Fix deadlock bug in MPI transforms (thanks to Michael Pippig for the
bug report and patch, and to Graham Dennis for the bug report).

* Use 128-bit ARM NEON instructions instead of 64-bit ones. This change
appears to speed up even ARM processors with a 64-bit NEON pipe.

* Speed improvements for single-precision AVX.

* Speed up planner on machines without "official" cycle counters, such as ARM.

FFTW 3.3.2

* Removed an archaic stack-alignment hack that was failing with
10 changes: 5 additions & 5 deletions aclocal.m4
@@ -1,4 +1,4 @@
-# generated automatically by aclocal 1.11.3 -*- Autoconf -*-
+# generated automatically by aclocal 1.11.6 -*- Autoconf -*-

# Copyright (C) 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004,
# 2005, 2006, 2007, 2008, 2009, 2010, 2011 Free Software Foundation,
@@ -14,8 +14,8 @@

m4_ifndef([AC_AUTOCONF_VERSION],
[m4_copy([m4_PACKAGE_VERSION], [AC_AUTOCONF_VERSION])])dnl
-m4_if(m4_defn([AC_AUTOCONF_VERSION]), [2.68],,
-[m4_warning([this file was generated for autoconf 2.68.
+m4_if(m4_defn([AC_AUTOCONF_VERSION]), [2.69],,
+[m4_warning([this file was generated for autoconf 2.69.
You have another version of autoconf. It may work, but is not guaranteed to.
If you have problems, you may need to regenerate the build system entirely.
To do so, use the procedure documented by the package, typically `autoreconf'.])])
@@ -38,7 +38,7 @@ AC_DEFUN([AM_AUTOMAKE_VERSION],
[am__api_version='1.11'
dnl Some users find AM_AUTOMAKE_VERSION and mistake it for a way to
dnl require some minimum version. Point them to the right macro.
-m4_if([$1], [1.11.3], [],
+m4_if([$1], [1.11.6], [],
[AC_FATAL([Do not call $0, use AM_INIT_AUTOMAKE([$1]).])])dnl
])

@@ -54,7 +54,7 @@ m4_define([_AM_AUTOCONF_VERSION], [])
# Call AM_AUTOMAKE_VERSION and AM_AUTOMAKE_VERSION so they can be traced.
# This function is AC_REQUIREd by AM_INIT_AUTOMAKE.
AC_DEFUN([AM_SET_CURRENT_AUTOMAKE_VERSION],
-[AM_AUTOMAKE_VERSION([1.11.3])dnl
+[AM_AUTOMAKE_VERSION([1.11.6])dnl
m4_ifndef([AC_AUTOCONF_VERSION],
[m4_copy([m4_PACKAGE_VERSION], [AC_AUTOCONF_VERSION])])dnl
_AM_AUTOCONF_VERSION(m4_defn([AC_AUTOCONF_VERSION]))])
34 changes: 31 additions & 3 deletions api/Makefile.in
@@ -1,4 +1,4 @@
-# Makefile.in generated by automake 1.11.3 from Makefile.am.
+# Makefile.in generated by automake 1.11.6 from Makefile.am.
# @configure_input@

# Copyright (C) 1994, 1995, 1996, 1997, 1998, 1999, 2000, 2001, 2002,
@@ -17,6 +17,23 @@


VPATH = @srcdir@
am__make_dryrun = \
{ \
am__dry=no; \
case $$MAKEFLAGS in \
*\\[\ \ ]*) \
echo 'am--echo: ; @echo "AM" OK' | $(MAKE) -f - 2>/dev/null \
| grep '^AM OK$$' >/dev/null || am__dry=yes;; \
*) \
for am__flg in $$MAKEFLAGS; do \
case $$am__flg in \
*=*|--*) ;; \
*n*) am__dry=yes; break;; \
esac; \
done;; \
esac; \
test $$am__dry = yes; \
}
pkgdatadir = $(datadir)/@PACKAGE@
pkgincludedir = $(includedir)/@PACKAGE@
pkglibdir = $(libdir)/@PACKAGE@
@@ -97,6 +114,11 @@ LINK = $(LIBTOOL) --tag=CC $(AM_LIBTOOLFLAGS) $(LIBTOOLFLAGS) \
$(LDFLAGS) -o $@
SOURCES = $(libapi_la_SOURCES)
DIST_SOURCES = $(libapi_la_SOURCES)
am__can_run_installinfo = \
case $$AM_UPDATE_INFO_DIR in \
n|no|NO) false;; \
*) (install-info --version) >/dev/null 2>&1;; \
esac
am__vpath_adj_setup = srcdirstrip=`echo "$(srcdir)" | sed 's|.|.|g'`;
am__vpath_adj = case $$p in \
$(srcdir)/*) f=`echo "$$p" | sed "s|^$$srcdirstrip/||"`;; \
@@ -455,8 +477,11 @@ clean-libtool:
-rm -rf .libs _libs
install-includeHEADERS: $(include_HEADERS)
@$(NORMAL_INSTALL)
-test -z "$(includedir)" || $(MKDIR_P) "$(DESTDIR)$(includedir)"
@list='$(include_HEADERS)'; test -n "$(includedir)" || list=; \
+if test -n "$$list"; then \
+echo " $(MKDIR_P) '$(DESTDIR)$(includedir)'"; \
+$(MKDIR_P) "$(DESTDIR)$(includedir)" || exit 1; \
+fi; \
for p in $$list; do \
if test -f "$$p"; then d=; else d="$(srcdir)/"; fi; \
echo "$$d$$p"; \
@@ -473,8 +498,11 @@ uninstall-includeHEADERS:
dir='$(DESTDIR)$(includedir)'; $(am__uninstall_files_from_dir)
install-nodist_includeHEADERS: $(nodist_include_HEADERS)
@$(NORMAL_INSTALL)
-test -z "$(includedir)" || $(MKDIR_P) "$(DESTDIR)$(includedir)"
@list='$(nodist_include_HEADERS)'; test -n "$(includedir)" || list=; \
+if test -n "$$list"; then \
+echo " $(MKDIR_P) '$(DESTDIR)$(includedir)'"; \
+$(MKDIR_P) "$(DESTDIR)$(includedir)" || exit 1; \
+fi; \
for p in $$list; do \
if test -f "$$p"; then d=; else d="$(srcdir)/"; fi; \
echo "$$d$$p"; \