MemoryError on large restore #42

Closed
slknijnenburg opened this issue May 23, 2012 · 24 comments

@slknijnenburg

After restoring the first 19050 emails to a new Gmail address (migrating from an old one), I received the following exception (Win7 x64, 6 GB RAM, gmvault 1.5-beta):

Restore email with id 1319881181591886388.
Error: .

=== Exception traceback ===
Traceback (most recent call last):
File "gmv_cmd.py", line 515, in run
File "gmv_cmd.py", line 423, in _restore
File "gmvault.pyc", line 1032, in restore
MemoryError

=== End of Exception traceback ===

When retrying, it raised the same exception after restoring only 6550 mails. I already tried to see if it is possible to adjust the amount of memory reserved for a Python application (as is possible for Java, for example), but apparently there is no such option, as Python's memory manager takes care of that.

I haven't done any Python coding myself, but my first guess at a fix would be that some resources need to be freed after a successful iteration of the for-loop starting at line 960 of gmvault.py...

@ghost ghost assigned gaubert May 23, 2012
@gaubert
Owner

gaubert commented May 23, 2012

I have restored my 30000 emails multiple times without any problems on Mac OSX and Linux (and a few times on Windows).
The problem seems to be on Windows. I need to investigate.
Thanks for reporting the problem.

@slknijnenburg
Author

Let me know if there's anything I can do to help!

@gaubert
Owner

gaubert commented May 24, 2012

@Halu I need to profile the program and it will take some time.
In the meantime, you can use the --restart option for restore. It should really be called --resume, as it resumes the restore near the last email you restored.

@slknijnenburg
Author

Perhaps another hint at what is going wrong: when continuing with the --restart option, gmvault fails within the first 50 emails, again with the same MemoryError, as if it buffers all the emails it has already restored anyway. Restarting the gmvault-shell has no effect.

@gaubert
Owner

gaubert commented May 25, 2012

OK, thanks. I don't understand your problem yet, but I am profiling Gmvault to remove memory fragmentation issues.
I should have a new version ready for testing next week. Keep your email db; it is safe.
Would you be willing to test this new version for me next week?

Thanks,

Guillaume

@slknijnenburg
Author

Sure, I'll be happy to help testing. Thanks for the quick responses!

@gaubert
Owner

gaubert commented May 26, 2012

@Halu I have prepared v1.6-dev, available from here: http://bit.ly/JtfbvD.
In this version I have fixed some memory fragmentation issues, but I am not convinced that you were affected by them. Please install the new version and test it in debug mode:
$> gmvault restore myemail@gmail.com --debug --restart

If it fails again, email me (guillaume((dot))aubert((at))gmail((dot))com) the log file, which should be under C:\Users\MyUser and called gmvault.log. Please zip it before sending it if it is big.

Waiting for your results now. Many thanks for your help.

@gaubert
Owner

gaubert commented Jun 3, 2012

@Halu Did you have time to do a test? I will close the issue in the meantime because other people could test the fix for me.

@gaubert gaubert closed this as completed Jun 3, 2012
@slknijnenburg
Author

It did not work, alas. It failed on one specific email, which I sent you as an email attachment 6 days ago :) It is unclear to me, though, whether it is still a memory issue or whether I'm encountering another bug...

@gaubert
Owner

gaubert commented Jun 3, 2012

@Halu Oops, yes, sorry. Just to be sure: it no longer eats all your memory but fails on one email. There is something special in your email that I need to analyse (probably a special character, but I need to understand what put it there).
In the meantime, you could go into your email db and remove the failing email (its file name starts with the Gmail id, a long number that looks like a timestamp). Move the .meta and .eml.gz files somewhere else and you can continue the restore.
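
The workaround above (moving the .meta and .eml.gz files of one email out of the db) could be scripted roughly as follows. This is only a sketch and not part of Gmvault: the function name `quarantine_email` and its parameters are made up for illustration, and it assumes the files are named after the Gmail id as described in this thread.

```python
import shutil
from pathlib import Path


def quarantine_email(db_root, gmail_id, quarantine_dir):
    """Move every file whose name starts with gmail_id out of the db tree.

    Hypothetical helper: db_root is the folder holding the email db,
    quarantine_dir is any folder where the moved files should end up.
    """
    db_root = Path(db_root).resolve()
    quarantine_dir = Path(quarantine_dir).resolve()
    quarantine_dir.mkdir(parents=True, exist_ok=True)

    moved = []
    # sorted() collects all matches first, so nothing is moved mid-walk.
    for path in sorted(db_root.rglob(gmail_id + "*")):
        # Skip files that already sit in the quarantine folder.
        if path.is_file() and quarantine_dir not in path.parents:
            target = quarantine_dir / path.name
            shutil.move(str(path), str(target))
            moved.append(target)
    return moved
```

Calling it with the db folder, the failing id as a string and a quarantine folder returns the list of moved files. An empty list means nothing matched that id.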

@slknijnenburg
Author

Deleting the 'faulty' .eml.gz/.meta files does help: the restore continues when I retry after deleting them. However, even though the hang no longer eats my memory, I noticed that gmv_cmd.exe hogs the CPU: it stays at a steady 50% on my dual-core CPU while it hangs. That makes me think the hanging on certain emails may still be related to the previous memory errors, except that the memory is no longer eaten :) I'll continue restoring while deleting the few mails that give rise to trouble, as it is a fine workaround.

@gaubert
Owner

gaubert commented Jun 4, 2012

@Halu If the emails are not confidential, please consider sending them to me. I will make a test suite with them.
For a future version, I will profile Gmvault to see how to fix the issues (most probably in imaplib).

@bommy

bommy commented Jun 13, 2012

v1.6beta, OSX 10.6, I'm getting this issue too. Got up to 22GB of virtual memory then the Mac freezes totally as it runs out of hard drive space.

If I delete/move the problematic email, is there an easy way for me to set the upload going again from the failed point? Is there any danger of creating duplicates?

@gaubert
Owner

gaubert commented Jun 14, 2012

Hi,

I am working on that issue, but sometimes emails contain "crap" and do
not respect the standards. I will try to find a solution, but the problem seems to be in the lower layers (SSL).
Once you have removed the faulty email, you can use the option --resume,
which will restart around where it stopped.
Could you send me the faulty email if it isn't personal?
I would like to do a test with it.

Thanks,
Guillaume


@bommy

bommy commented Jun 14, 2012

Thank you. How would I identify and open the email to see if it's anything
personal or not?

Screenshot of the failure state attached. I'm actually trying to re-do the
process on my work computer (Win 7) as the internet connection is much
faster here, so would be interesting to see if the same happens. I'm
redownloading all the emails rather than transferring my existing backup
files from Mac to PC.

Tom


@bommy

bommy commented Jun 14, 2012

Sorry, attachment bounced, uploaded here - http://bit.ly/tomwhitakerpublic

Tom


@gaubert
Owner

gaubert commented Jun 14, 2012

@bommy What operation were you doing: sync or restore?
If it was a restore, it might be due to a problematic email, and I will explain how to remove it from the database.

@bommy

bommy commented Jun 14, 2012

It was a restore, thanks @gaubert.

@gaubert
Owner

gaubert commented Jun 14, 2012

@bommy So you should have the email id in the Gmvault logs. It is the last message that was restored. For example:
[2012-06-14 12:24]:CRITICAL:gmvault:Restore email with id 1293213852590002613

The id here is 1293213852590002613.

Then, in the terminal on your Mac, you can do:
$>cd ~/gmvault-db (or where you saved your emails)
$>find . -name "1293213852590002613*"
./db/2009-01/1293213852590002613.meta
./db/2009-01/1293213852590002613.eml.gz

Move the files to quarantine:
$>mv ./db/2009-01/1293213852590002613* ./quarantine

The email itself is 1293213852590002613.eml.gz. To see its content:
$>cd ./quarantine
$>gunzip 1293213852590002613.eml.gz
$>cat 1293213852590002613.eml
You can also open it in any editor.

Thanks for the help.
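
For anyone who prefers not to do these steps by hand, the lookup and the reading step can be sketched in Python. This is an illustration only, not Gmvault code: the function names are hypothetical, and the log line format is assumed from the single example line quoted in this comment.

```python
import gzip
import re

# Assumed format, based on the example log line in this thread.
RESTORE_LINE = re.compile(r"Restore email with id (\d+)")


def last_restored_id(log_lines):
    """Return the id from the last 'Restore email with id ...' line, or None."""
    last = None
    for line in log_lines:
        match = RESTORE_LINE.search(line)
        if match:
            last = match.group(1)
    return last


def read_eml_gz(path):
    """Return the text of a .eml.gz file without unpacking it on disk."""
    # errors="replace" keeps odd bytes from aborting the read.
    with gzip.open(path, "rt", encoding="utf-8", errors="replace") as handle:
        return handle.read()
```

`last_restored_id` accepts any iterable of lines, so an open log file can be passed directly. `read_eml_gz` is a read-only alternative to gunzip followed by cat.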

@3formit

3formit commented Jul 27, 2012

I am having the MemoryError problem as well while trying to restore a large mailbox. The restore works for a while and then I get this:

Restore email with id 1354135464925655217.
Restore email with id 1354136403316027783.
Restore email with id 1354136977391387211.
Restore email with id 1354142573072803336.
Restore email with id 1354144124929707583.
Restore email with id 1354144265707605388.
Restore email with id 1354146493429344036.
Restore email with id 1354146887922003500.
Restore email with id 1354148204224160982.
Restore email with id 1354148609343052803.
Restore email with id 1354148784008556488.
Error: .

=== Exception traceback ===
Traceback (most recent call last):
File "gmv_cmd.py", line 515, in run
File "gmv_cmd.py", line 423, in _restore
File "gmvault.pyc", line 1032, in restore
MemoryError

=== End of Exception traceback ===

If I use the --restart switch it works for a while again, but after about half an hour it fails with the MemoryError.

I've been trying to restore this mailbox for 2 days and I am now getting a lot of heat for it not being done. Please advise on what I should do.

Thanks

@gaubert
Owner

gaubert commented Jul 28, 2012

@3formit What version are you using? v1.6-beta?

@3formit

3formit commented Jul 28, 2012

Correct, 1.6 beta.

@gaubert
Owner

gaubert commented Jul 29, 2012

@3formit I have version 1.7 in alpha available here: http://bit.ly/JtfbvD.
It should solve the problem and quarantine the emails that create the issues. Gmail refuses to ingest some emails, and sometimes this creates an infinite loop in the lower layers (ssl, socket). I should have fixed it.
Please have a try and let me know.
In case of problems, use --debug and send me the gmvault.log file (available in $HOME/gmvault.log on Linux and Mac, and in %HOME%/gmvault.log on Windows).
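
To illustrate the general idea of quarantining problem emails instead of aborting the whole restore, here is a minimal sketch. It is not the 1.7 implementation: `restore_all`, `upload` and `quarantine` are hypothetical names, and a hang in the ssl/socket layer would additionally need a timeout, which this sketch does not cover.

```python
def restore_all(email_ids, upload, quarantine):
    """Try each email once; hand failures to quarantine() and carry on.

    upload(email_id) and quarantine(email_id, error) are caller-supplied
    callables (hypothetical, for illustration only).
    """
    restored, quarantined = [], []
    for email_id in email_ids:
        try:
            upload(email_id)
        except Exception as error:
            # One rejected email no longer stops the remaining ones.
            quarantine(email_id, error)
            quarantined.append(email_id)
        else:
            restored.append(email_id)
    return restored, quarantined
```

The function returns two lists, so the caller can report which ids were restored and which were set aside for later inspection.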

@gaubert
Owner

gaubert commented Aug 3, 2012

@3formit Did you have time to try it, and was it better with 1.70alpha?

Thanks,
zoobert
