tofile() truncation on arrays >= 2**32 on 64-bit OSX (Trac #2114) #574

numpy-gitbot · 2012-10-19T15:08:42Z

Original ticket http://projects.scipy.org/numpy/ticket/2114 on 2012-04-24 by trac user embray, assigned to unknown.

After a fair bit of debugging we've tracked down a bug in OSX's fwrite() (actually in an internal function that affects fwrite(), fprintf(), and other functions that write to a file handle). This bug was originally discovered by trying to write out some large arrays with Numpy. As far as I can tell (from some Google searches) this bug isn't otherwise well known yet.

The bug is that at some point the size passed to fwrite() is stuffed into a 32-bit register. The code then checks whether that value is a multiple of 0x1000 (4096) and, if it is, branches off to a separate routine for writes that are a whole multiple of the block size.

Thus, if the size is a multiple of 4096 and >= 2**32, the size gets silently truncated to size & 0xffffffff.
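To make the arithmetic concrete, here is a minimal sketch that models the truncation described above. The function name modeled_bytes_written is hypothetical; this is only a model of the observed behavior, not the actual libc code.

```c
#include <stdint.h>

/* Hypothetical model of the faulty path (NOT the real libc code):
 * sizes that are a multiple of the 4096-byte block size are narrowed
 * to 32 bits; all other sizes pass through unchanged. For sizes below
 * 2**32 the mask is a no-op, so only sizes >= 2**32 are affected. */
static uint64_t modeled_bytes_written(uint64_t size)
{
    if (size % 4096 == 0)
        return size & 0xffffffffULL;
    return size;
}
```

Under this model, 0x100000000 maps to 0 and 0x100001000 maps to 4096, while a size such as 0x100000001 (not a multiple of 4096) is left alone.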

The attached test program illustrates the problem. It has been tested and shown to be buggy on Leopard and Lion (so presumably the bug also exists in Snow Leopard; I'm not sure about earlier OSX versions).

This is what the output looks like:

$ gcc -g -Wall -arch x86_64 -Wextra writetest.c -o writetest 
$ ./writetest 0x100000000 && ls -l test.array
size_t bytes: 8
array size: 4294967296
array size cast as size_t: 4294967296
wrote 4294967296 bytes
-rw-r--r--  1 embray  31  0 Apr 24 11:03 test.array

As you can see, fwrite() even returns that it wrote "4294967296 bytes", though in reality it wrote zero bytes. Likewise:

$ ./writetest 0x100001000 && ls -l test.array
size_t bytes: 8
array size: 4294971392
array size cast as size_t: 4294971392
wrote 4294971392 bytes
-rw-r--r--  1 embray  31  4096 Apr 24 11:04 test.array

Further testing has shown that this holds for any size >= 2**32 that is a multiple of 4096.
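Since the return value of fwrite() can't be trusted here, one way to catch the truncation is to compare the expected byte count against the size actually on disk. The following is a rough sketch, assuming a POSIX system (fileno() and fstat()); the helper name size_on_disk_matches is made up for illustration and is not part of the attached test program.

```c
#define _POSIX_C_SOURCE 200809L  /* for fileno() under strict -std modes */
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>

/* Hypothetical helper: flush the stream and compare the on-disk size
 * with the number of bytes we believe we wrote.
 * Returns 1 on match, 0 on mismatch, -1 if the size can't be read. */
static int size_on_disk_matches(FILE *fp, off_t expected)
{
    struct stat st;

    assert(fp != NULL);
    if (fflush(fp) != 0)
        return -1;
    if (fstat(fileno(fp), &st) != 0)
        return -1;
    return st.st_size == expected;
}
```

On an affected system, calling this after a 0x100000000-byte fwrite() would be expected to return 0, matching the ls -l output above.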

The fix that was implemented for #2256, where arrays are written in 2GB chunks, would also solve this problem. So I think it would probably be sufficient to just enable the same chunked write code block in PyArray_ToFile() on OSX as well.

Although the OSX bug only occurs on those 4K boundaries and only for sizes >= 2**32, for the sake of simplicity I think it's fine to just use more or less the same workaround.
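For reference, a chunked write along these lines might look roughly like the sketch below. This is not the code from #2256 or from PyArray_ToFile(); the name chunked_fwrite and its chunk parameter are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper (not NumPy's implementation): write `size` bytes
 * from `buf` in pieces of at most `chunk` bytes, so that no single
 * fwrite() call sees a size >= 2**32.
 * Returns the number of bytes fwrite() reported as written. */
static size_t chunked_fwrite(const void *buf, size_t size,
                             size_t chunk, FILE *fp)
{
    const unsigned char *p = buf;
    size_t done = 0;

    assert(chunk > 0);
    while (done < size) {
        size_t want = size - done;
        if (want > chunk)
            want = chunk;
        size_t got = fwrite(p + done, 1, want, fp);
        done += got;
        if (got < want)
            break;  /* short write; caller should check ferror(fp) */
    }
    return done;
}
```

With a chunk size of 2GB, as in the #2256 fix, every individual fwrite() call stays below 2**32 bytes, so the truncating path described above is never reached.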

numpy-gitbot · 2012-10-23T02:46:45Z

Attachment added by trac user embray on 2012-04-24: writetest.c

numpy-gitbot · 2012-10-23T02:46:45Z

@charris wrote on 2012-04-27

Arrggghhhh...

I h8 these buggy OS workarounds. But we are here to serve ;) If you can put together a pull request with a fix and test using #2256 as a template I'll put it in. And thanks for tracking it down.

numpy-gitbot · 2012-10-23T02:46:46Z

trac user embray wrote on 2012-04-27

Believe me, I hate them just as much. "OS X Lion - The world's most advanced OS...that can't write files properly."

Sure, I'll put together a pull request to fix this.

charris · 2014-05-05T18:12:30Z

I believe this is fixed in Mavericks. Please reopen if there is a continuing problem.

lukauskas mentioned this issue Dec 30, 2012

save/load and tofile/fromfile fail silently for large arrays on Mac os X #2806

Closed

pv mentioned this issue Oct 3, 2013

ERROR: test_big_arrays (test_io.TestSavezLoad) on OS X + Python 3.3 #3858

Closed

charris closed this as completed May 5, 2014
