
Try compiling with a large memory model if the initial compile fails.

Documentation and comments are still not completely coherent on
this subject, but the compilation issue itself is improved.
Greg Smith authored and Greg Smith committed Jul 17, 2012
1 parent 0dbaeda commit ebfc6aa8b14d978ddb9b762079c95f7e5f172002
Showing with 53 additions and 33 deletions.
  1. +31 −23 README.rst
  2. +22 −10 stream-scaling
@@ -312,38 +312,46 @@ Bugs
====
On some systems, the amount of memory selected for the stream array
-ends up exceeding how large of a block of RAM the system is willing
-to allocate at once. This seems a particular issue on 32-bit operating
-systems, but even 64-bit ones are not immune. The program currently
-enforces an upper limit on the stream array size of 130M, which
-allocates approximately 3GB of memory just for that part (with 4GB being
-the normal limit for 32-bit structures). If your system fails to
-compile stream with an error such as this::
+ends up exceeding how large a block of RAM the operating system (or
+in some cases the compiler) is willing to allocate at once. This
+seems a particular issue on 32-bit operating systems, but even 64-bit
+ones are not immune.
+
+If your system fails to compile stream with an error such as this::
stream.c:(.text+0x34): relocation truncated to fit: R_X86_64_32S against `.bss'
-You will need to manually decrease the size of the array until the
-program will compile and link. Manual compile can be done like this::
+stream-scaling will try to compile stream using the gcc ``-mcmodel=large``
+option after hitting this error. That will let the program use larger data
+structures. If you are using a new enough version of the gcc compiler,
+believed to be at least version 4.4, the program will run normally after
+that; you can ignore these "relocation truncated" warnings.
+
+If you have both a large amount of cache (so a matching large block of
+memory is needed) and an older version of gcc, the second compile attempt
+will also fail, with the following error::
+
+ stream.c:1: sorry, unimplemented: code model ‘large’ not supported yet
+
+In that case, it is unlikely you will get accurate results from
+stream-scaling. You can try it anyway by manually decreasing the size of the
+array until the program will compile and link. A manual compile can be done
+like this::
gcc -O3 -DN=130000000 -fopenmp stream.c -o stream
And then reducing the ``-DN`` value until compilation is successful.
After that upper limit is determined, adjust the setting for
MAX_ARRAY_SIZE at the beginning of the stream-scaling program to reflect
-it.
-
-The current version of stream-scaling tries to work around this by
-using a customized version of the stream code that dynamically allocates
-these arrays. It is still possible a problem here exists, and a
-warning suggesting a workaround (an easier one than doing a manual
-compile as described above) appears if your system appears to have
-so much cache it could run into this issue.
-
-If you encounter this situation, where stream-scaling still doesn't
-work properly for you, a problem report to the author would
-be appreciated. It's not clear yet why the exact cut-off value varies
-on some systems, or if there are systems where the improved dynamic
-allocation logic may not be sufficient.
+it. An upper limit on the stream array size of 130M as shown here
+allocates approximately 3GB of memory for the test array, with 4GB being
+the normal limit for 32-bit structures.
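The reduce-and-retry search described above can be sketched as a small shell helper. Here "try_compile" is a hypothetical hook standing in for the real build command (for example ``gcc -O3 -DN=$1 -fopenmp stream.c -o stream``), and halving is just one reasonable step size:

```shell
# Halve the candidate array size until the build succeeds, then report it.
# $1 is the starting element count, $2 the name of a command or function
# that attempts the real compile with that count as its argument.
find_max_array_elements() {
    n=$1
    while [ "$n" -gt 1000000 ]; do
        if "$2" "$n"; then
            echo "$n"
            return 0
        fi
        n=$((n / 2))
    done
    return 1
}
```

The value it reports is what MAX_ARRAY_SIZE at the beginning of the stream-scaling program would then be set to.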
+
+The fixes for this issue are new, and it is possible a problem here still
+exists. If you have a gcc version >=4.4 but stream-scaling still won't
+compile correctly, a problem report to the author would be appreciated. It's
+not clear yet why the exact cut-off value varies on some systems, or if there
+are systems where the improved dynamic allocation logic may not be sufficient.
Documentation
=============
@@ -195,7 +195,7 @@ function stream_array_elements {
fi
# The array sizing code will overflow 32 bits on systems with many
- # processors having lots of cache. The crash looks like this:
+ # processors having lots of cache. The compiler error looks like this:
#
# $ gcc -O3 -DN=133823657 -fopenmp stream.c -o stream
# /tmp/ccecdC49.o: In function `checkSTREAMresults':
@@ -214,9 +214,11 @@ function stream_array_elements {
# stream.c:(.text+0x660): relocation truncated to fit: R_X86_64_32S against `.bss'
# stream.c:(.text+0x6ab): additional relocation overflows omitted from the output
# collect2: ld returned 1 exit status
-
- # Clamp the upper value to a smaller maximum size to try and avoid this
- # error. 130,000,000 makes for approximately a 3GB array.
+ #
+ # Warn about this issue, and provide a way to clamp the upper value to a smaller
+ # maximum size to try and avoid this error. 130,000,000 makes for approximately
+ # a 3GB array. The large memory model compiler option will avoid this issue
+ # if a gcc version that supports it is available.
if [ $NEEDED_SIZE -gt $MAX_ARRAY_SIZE ] ; then
#
# Size clamp code
@@ -236,18 +238,16 @@ function stream_array_elements {
fi
# Given the sizing above uses a factor of 10X cache size, this reduced size
- # is still large enough for current generation procesors up to the 48 core
+ # might still be large enough for current generation processors up to the 48 core
# range. For example, a system containing 8 Intel Xeon L7555 processors with
# 4 cores having 24576 KB cache each will suggest:
#
# Total CPU system cache: 814743552 bytes
# Computed minimum array elements needed: 370337978
#
- # So using 130,000,000 instead of 370,337,978 still an array >3X the
- # size of cache sum. Really large systems with >48 processors might overflow
- # this still, but hopefully this limitation will be addressed by the
- # underlying stream code being called here eventually, rather than
- # trying to work around it here.
+ # So using 130,000,000 instead of 370,337,978 is still an array >3X the
+ # size of the cache sum in this case. Really large systems with >48 processors
+ # might overflow this still.
echo Array elements used: $NEEDED_SIZE
eval $__resultvar="'$NEEDED_SIZE'"
@@ -307,6 +307,18 @@ if [ -f stream ] ; then
fi
gcc -O3 $ARRAY_FLAG -fopenmp stream.c -o stream
+if [ $? -ne 0 ] ; then
+ # The most likely way the program will fail to compile is if it's
+ # trying to use more memory than will fit on the standard gcc memory
+ # model. Try the large one instead. This will only work on newer
+ # gcc versions (it works on at least >=4.4), so there's no single
+ # compile option set here that will support older gcc versions
+ # and the large memory model. Just trying both ways seems both
+ # simpler and more definitive than something like checking the
+ # gcc version.
+ echo === Trying large memory model ===
+ gcc -O3 $ARRAY_FLAG -fopenmp stream.c -o stream -mcmodel=large
+fi
if [ ! -x stream ] ; then
echo Error: did not find valid stream program compiled here, aborting
