bad_alloc when loading CSVs larger than ~120MB with 200+ dimensions #102

Open
lumyx2 opened this issue Oct 17, 2016 · 3 comments

@lumyx2

lumyx2 commented Oct 17, 2016

Hi Nick,

There seems to be something funny with the memory handling. I notice that very large files crash it instantly; it simply fails to load them.

I also notice that it seems to leak memory: after building hundreds of models, the program crashes with bad_alloc. If I make it resume from where it was, it runs fine until it crashes again later on. I can't see anything wrong with my code.

I basically have something along the lines of this (names are placeholders):

for( UINT i=0; i<numTestFiles; i++ ){
    GRT::ClassificationData testData;                      // stack allocated
    testData.loadDatasetFromCSVFile( testFiles[i] );       // load the test set
    for( UINT j=0; j<numTrainFiles; j++ ){
        GRT::ClassificationData trainData;                 // stack allocated
        trainData.loadDatasetFromCSVFile( trainFiles[j] ); // load the training set
        // ...train a model using trainData and evaluate it on testData...
    }
}

Everything is on the stack.

Cheers,
M

@nickgillian
Owner
nickgillian commented Oct 18, 2016

This sounds like a bad bug in the matrix class; I'll look into it now.

Can you give me a better idea of what you are doing to cause this error, so I can recreate it? How big are the files you are trying to load? I'll try loading some really large CSV files and see if I can recreate the memory leaks.
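
Something along the lines of this minimal sketch is what I have in mind for the repro (the file name, row count, and values are just placeholders):

#include <GRT/GRT.h>
#include <fstream>
#include <iostream>
using namespace GRT;

int main(){
    // Write a CSV of roughly 200MB: a class label followed by 576 feature columns per row
    std::ofstream out( "big.csv" );
    const UINT numDims = 576;
    const UINT numRows = 90000; // ~2.3KB per row, so ~200MB in total
    for(UINT r=0; r<numRows; r++){
        out << 1;                                          // class label
        for(UINT d=0; d<numDims; d++) out << "," << 0.5;   // feature values
        out << "\n";
    }
    out.close();

    // This is the call that is reported to throw bad_alloc on large files
    ClassificationData data;
    if( !data.loadDatasetFromCSVFile( "big.csv" ) ){
        std::cout << "Failed to load CSV!" << std::endl;
    }
    return 0;
}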

@lumyx2
Author

lumyx2 commented Oct 18, 2016

As I said, there is the instant crash when loading very large files, and then there are the progressive leaks that creep in.

The instant crash happens very reliably when I try to load a CSV with 576 dimensions and a size of over 200MB.

The progressive crash is a lot more elusive. Basically some memory is not released correctly, but it's difficult to pinpoint where. It tends to happen while going through my loop. I noticed that when the generated models are larger, it tends to crash faster. I say this because I have been doing a lot of grid searches over files of various sizes, and the crashes are very dependent on file size.

M

@SurKM9

SurKM9 commented Apr 27, 2017

@lumyx2 @nickgillian Same here. I am getting “Invalid Allocation Size: 4294967295 Bytes” when I try to load a .csv file of 35MB with 17 columns. But when I reduce the file size to 746KB, it works for DTW, although only for a single class. Any solutions yet?
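
For what it's worth, 4294967295 bytes is exactly UINT_MAX, i.e. the value you get when -1, or an unsigned subtraction that underflows, ends up being used as a 32-bit allocation size. A minimal sketch of that failure mode, assuming the size is computed in unsigned 32-bit arithmetic:

#include <iostream>

int main(){
    unsigned int numRows = 0;
    unsigned int allocSize = numRows - 1; // underflows and wraps to 4294967295 (UINT_MAX)
    std::cout << allocSize << std::endl;  // prints 4294967295, the size in the error message
    return 0;
}

So the error probably points to an integer underflow in a size computation rather than the machine genuinely running out of memory.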
