nptdms.TdmsFile crashes when opening large file #19
Comments
I just tried pyTDMS, which is able to open the file successfully, but there is so little documentation for that project that it would be lovely to use npTDMS instead.
Are you able to upload the file to somewhere like Dropbox for me to test? I haven't ever really looked at the memory consumption of npTDMS, but I would have thought it could open a file that big. Does it work if you use the memmap file support, by passing a memmap directory when opening the file?
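The memmap support mentioned here is worth trying before anything else. As a rough sketch of why memory-mapping helps (this uses only the standard library's `mmap` module, not npTDMS itself; check your npTDMS version's documentation for the exact argument that enables its memmap support):

```python
import mmap
import os
import struct
import tempfile

# Write a sample binary file of little-endian float64 values,
# standing in for a large channel of TDMS raw data.
path = os.path.join(tempfile.mkdtemp(), "data.bin")
values = [0.5 * i for i in range(1000)]
with open(path, "wb") as f:
    f.write(struct.pack("<%dd" % len(values), *values))

# Memory-map the file: the OS pages data in on demand instead of
# reading the whole file into RAM up front.
with open(path, "rb") as f:
    mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    # Read only the 10th value (8 bytes at offset 10 * 8) without
    # touching the rest of the file.
    (tenth,) = struct.unpack_from("<d", mm, 10 * 8)
    mm.close()
```

The same principle applies when a library memory-maps its data arrays: the process's resident memory stays proportional to what is actually accessed, not to the file size.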
I did a small test using a 12 MB file, and loading that used a maximum of about 25 MB of RAM, so there's a bit of inefficiency there.
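The thread doesn't say how these memory figures were measured; one self-contained way to take comparable measurements in pure Python is the standard library's `tracemalloc` module (the `load_data` function below is a hypothetical stand-in for file parsing):

```python
import tracemalloc

def load_data(n):
    # Stand-in for parsing a file: allocate one small object per record,
    # mimicking the per-segment objects a TDMS reader creates.
    return [{"index": i, "value": float(i)} for i in range(n)]

tracemalloc.start()
data = load_data(100000)
current, peak = tracemalloc.get_traced_memory()
tracemalloc.stop()

# 'peak' is the high-water mark of Python-level allocations, which is
# the number that matters when comparing two readers on the same file.
```

Note that `tracemalloc` only counts allocations made through Python's allocator, so it understates memory held by C extensions; for whole-process numbers a tool like the OS task manager is still needed.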
Out of interest, I tested pyTDMS: it used about 48 MB when reading my 12 MB file, so I'm surprised that npTDMS crashes on your file but pyTDMS works...
Thanks for taking a look at this. I've uploaded the problematic data file. Let me know if you have any problems accessing it.
Reading some TDMS files creates a lot of these objects, so try to reduce their memory usage. See issue #19.
I got the file, thanks. Trying to load it with npTDMS used over 60% of my 8 GB of RAM before I killed it. The above commit reduces the memory usage to 645 MB, which is still not great but is a big improvement. For comparison, pyTDMS uses 327 MB. The problem is that the segments in your TDMS file alternate between three different structures, so each segment has a different set of objects from the previous segment. Because of the way npTDMS reads the file structure, a very large number of objects are allocated to represent every TDMS object in every segment. In many TDMS files the segment structure repeats, so the objects describing the segment structure can be reused. That commit just reduces the amount of memory required by each of these objects. Ideally I could reduce the number of objects used, but that would require much bigger changes.
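The commit itself isn't shown here, but a common way to "reduce the amount of memory required by each of these objects" in Python is to declare `__slots__`, which removes the per-instance `__dict__`. A minimal illustration (the class names and attributes are hypothetical, not npTDMS's actual internals):

```python
import sys

class SegmentObjectDict:
    """Ordinary class: each instance carries a __dict__."""
    def __init__(self, path, index, length):
        self.path = path
        self.index = index
        self.length = length

class SegmentObjectSlots:
    """Slotted class: attributes live in fixed slots, no __dict__."""
    __slots__ = ("path", "index", "length")
    def __init__(self, path, index, length):
        self.path = path
        self.index = index
        self.length = length

d = SegmentObjectDict("/'group'/'channel'", 0, 100)
s = SegmentObjectSlots("/'group'/'channel'", 0, 100)

# Include the instance dict when sizing the ordinary object,
# since sys.getsizeof does not follow references.
dict_size = sys.getsizeof(d) + sys.getsizeof(d.__dict__)
slot_size = sys.getsizeof(s)
```

When a file produces hundreds of thousands of such per-segment objects, trimming a few dozen bytes from each one adds up to the kind of saving described above.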
Gives a further small memory saving. See issue #19.
That second commit reduced memory usage a little further, to 619 MB.
Thanks so much for addressing this; it appears to be working much better. Thanks again for fixing it so quickly, it is very much appreciated.
OK, thanks. I'll close it for now. The memory usage is still not ideal, but hopefully it will now be good enough to use with your files.
By the way, there's a TDMS defragment VI in LabVIEW that should clean up your TDMS files and make them much faster to read: http://zone.ni.com/reference/en-XX/help/371361H-01/glang/tdms_defrag/. I'm not sure whether this would be useful for you, but I thought it was worth mentioning.
I have a very large TDMS file (220 MB) that I am trying to open using npTDMS; when I run a script that includes only the following lines, Python (or IPython) crashes most spectacularly.
import nptdms
inputFileNameString = "HAC-20141017-093246.tdms"
tdmsFile = nptdms.TdmsFile(inputFileNameString)
I can open smaller files without a problem using npTDMS, and can also open large files in other applications (more specifically, convertTDMS.m, available on the MathWorks MATLAB File Exchange at http://www.mathworks.com/matlabcentral/fileexchange/44206-converttdms--v10-). The resulting errors are listed below. Please let me know if this is a problem that can be fixed.
I am running 32-bit Python 2.7.3 on a Windows 7 machine.
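The 32-bit interpreter is relevant here: a 32-bit process on Windows can typically address only about 2 GB, so a reader that temporarily needs several times the file size in RAM will hit `MemoryError` long before the machine's physical memory is exhausted. You can check your interpreter's bitness like this:

```python
import struct

# The size of a C pointer ("P") is 4 bytes on a 32-bit interpreter
# and 8 bytes on a 64-bit one.
bits = struct.calcsize("P") * 8
print("Running a %d-bit Python" % bits)
```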
Traceback (most recent call last):
File "dataplot.py", line 5, in <module>
myDataFrame = myFunc2.loadTDMSDataFrame(inputFileNameString)
File "C:\Folder\myFunc2.py", line 41, in loadTDMSDataFrame
tdmsFile = nptdms.TdmsFile(inputFileNameString)
File "C:\Python27\lib\site-packages\nptdms\tdms.py", line 148, in __init__
self._read_segments(tdms_file)
File "C:\Python27\lib\site-packages\nptdms\tdms.py", line 160, in _read_segments
previous_segment)
File "C:\Python27\lib\site-packages\nptdms\tdms.py", line 369, in read_metadata
segment_obj._read_metadata(f)
File "C:\Python27\lib\site-packages\nptdms\tdms.py", line 683, in _read_metadata
log.debug("Reading %d properties" % num_properties)
MemoryError