340 Gigabyte data and error while creating .mex from mapped_tensor_shim.c #16
Comments
Hi Kerim, I've pushed an attempted fix for the compile problem under MinGW to branch iss16. Please pull that and see if you can compile successfully.
Regarding access time: in general, accessing contiguous regions of a file is fast, while accessing bits and pieces scattered through it is slow. So accessing mtVar(1,:) means reading a single byte from each of the 42654189 columns, spread across the whole 340 GB file, which is why it takes so long. Regarding the second run being much faster than the first, this is a disk access caching issue. On the first run the data is actually read from the drive / network. The OS then caches this data in memory, so the second run is reading from memory rather than from disk. This all happens behind the scenes as far as MATLAB is concerned.
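To make the difference concrete, here is an illustrative timing sketch (reusing the mtVar defined in the report below; the actual numbers depend on your drive and on what the OS has already cached):

% One column is 8240 contiguous bytes on disk: a single seek and read.
tic; x = mtVar(:,1); toc
% One row is one byte from each of 42654189 columns, scattered across
% the whole 340 GB file: millions of seeks, hence hours rather than seconds.
tic; y = mtVar(1,:); toc
% Repeating a read is fast because the OS page cache now holds the data.
tic; z = mtVar(1,1:10^5); toc   % slow the first time
tic; z = mtVar(1,1:10^5); toc   % fast: served from memory, not disk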
Thank you for the reply, Dylan. I forgot to tell you that my data is stored on an external hard drive. A few minutes ago I launched the same code in a loop, just to see whether access to the data would be faster. What do you think: if the data were stored on a local hard drive, would access to the same data take a few seconds/minutes? I can't check it now because I don't have enough space on my local hard drive. Also, is there a way to store data of a complicated format? I mean, my data is recorded as a kind of …
I think the access will definitely not be faster in a loop. Re transposing, yes, you're right. I mean transposing the data when you write it to disk, not transposing from Matlab. If you need to read the data many times, then it might be worthwhile to make a transposed copy on disk first. No, there's no way to store complicated data formats using MappedTensor.
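For illustration, here is a rough, untested sketch of making that one-time transposed copy with MappedTensor itself, so that the row you keep re-reading becomes a contiguous column in the copy. The file name data_transposed.bin and the block size are made up, and it assumes block-wise indexed assignment into a MappedTensor behaves as expected:

nR = 8240; nC = 42654189;
% Pre-allocate the destination file at full size, then map it.
fid = fopen('data_transposed.bin', 'w');
fseek(fid, nR*nC - 1, 'bof'); fwrite(fid, uint8(0)); fclose(fid);
mtSrc = MappedTensor([r_path r_file], [nR nC], 'Class', 'uint8');
mtDst = MappedTensor('data_transposed.bin', [nC nR], 'Class', 'uint8');
blk = 16384;                              % source columns per pass; tune to your RAM
for c0 = 1:blk:nC
    c1 = min(c0 + blk - 1, nC);
    mtDst(c0:c1, :) = mtSrc(:, c0:c1).';  % contiguous read, strided write, paid once
end
% Afterwards the former row 1 is a contiguous 42 MB column:
tic; mtDst(:,1); toc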
Dylan,
I use Windows 7 x64. When I try to compile the .mex function from mapped_tensor_shim.c I get an error. I deleted line 33, which contains "#define UINT64_C(c) c ## i64", and then it compiled normally and I could use MappedTensor.
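(A less destructive variant of the same fix would be to guard the definition instead of deleting it, since MinGW's stdint.h already supplies UINT64_C. This is a sketch, not necessarily what the iss16 branch does:)

/* Only define UINT64_C if the toolchain's <stdint.h> has not
   already provided it, as MinGW and other C99 compilers do. */
#ifndef UINT64_C
#define UINT64_C(c) c ## i64
#endif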
I tried to work with 340 gigabytes of data.
mtVar = MappedTensor([r_path r_file],[8240 42654189], 'Class', 'uint8');
So I have a uint8 matrix with size(mtVar) = [8240 42654189]. Then I use the following command to get the data (I need 42654189 elements, which take up about 42 megabytes):
tic; mtVar(1,:); toc
The elapsed time is almost 9 hours. By the way, it doesn't require much RAM or CPU, unlike memmapfile, which consumes 5 gigabytes of RAM (all my RAM) in 5 minutes, and then my machine hangs.
Is it possible to speed up access to such data?
By the way, if I try to get 10^5 elements:
tic; mtVar(1,1:10^5); toc
It takes about 66 seconds. If I then rerun the command, the elapsed time is less than 1 second. Why?