offlineimap confused after (suspend and) resume #56

Open
quite opened this issue Sep 30, 2013 · 35 comments

Comments

@quite

quite commented Sep 30, 2013

I have offlineimap running in tmux at (almost) all times. After resuming my computer from suspend, offlineimap has trouble reconnecting properly.
Often it just hangs for a long time (or indefinitely) while nothing seems to happen.

I worked around the problem with a resume script that sends SIGUSR2 to my running offlineimap, after which it is automatically restarted.

The problem, though, is that offlineimap sometimes takes a long time to exit upon SIGUSR2. Maybe this is again related to the lost TCP connections, timeouts, and such, and is essentially the same problem as above.

Any suggestions on how I should solve this? I have been reluctant to just send SIGKILL to the process, because I was afraid that it might cause inconsistencies in the repo or similar. But maybe that is what I need to do?
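
For illustration, the resume-script idea could be sketched roughly as below (Python 3). This is only a sketch: the function names, the grace period, and the SIGKILL fallback are assumptions rather than anything offlineimap provides, and whether SIGKILL is safe for the local repository is exactly the open question here.

```python
import os
import signal
import time


def _is_alive(pid):
    """Return True if a process with this PID still exists."""
    try:
        os.kill(pid, 0)  # signal 0 only checks for existence
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # it exists, but belongs to another user
    return True


def stop_process(pid, grace_seconds=30.0, poll_interval=0.5):
    """Send SIGUSR2 to `pid`; fall back to SIGKILL after a grace period.

    Returns "graceful" if the process went away on its own and
    "killed" if SIGKILL had to be sent. Both names are hypothetical.
    """
    os.kill(pid, signal.SIGUSR2)
    deadline = time.monotonic() + grace_seconds
    while time.monotonic() < deadline:
        if not _is_alive(pid):
            return "graceful"
        time.sleep(poll_interval)
    os.kill(pid, signal.SIGKILL)
    return "killed"
```

A resume hook would call `stop_process(pid)` with the PID of the running offlineimap and then start a fresh instance. How the PID is found is left out on purpose.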

@konvpalto
Member

Which version of OfflineIMAP are you using? Which OS?

@quite
Author

quite commented Sep 30, 2013

offlineimap 6.5.4 on an updated Arch Linux, which means basically latest vanilla versions of everything

@quite
Author

quite commented Oct 3, 2013

What typically happens after resume and upon sending offlineimap the SIGUSR2:

Terminating after this sync...
[seems to hang forever, I press ^C]
Terminating NOW (this may take a few seconds)...
[hung again....]

@aignas

aignas commented Feb 6, 2014

Just wanted to add that I see the same thing with 6.5.5 on an updated Arch Linux. Any thoughts as to why this is happening?

@Gonzih

Gonzih commented Feb 10, 2014

Same here. Offlineimap 6.5.3 on OpenSUSE. I use timeout to fix that and kill offlineimap if timeout returns nonzero. Never had issues with a broken repo because of that (happily).
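
The same idea as a wrapper around timeout(1) can be sketched in Python 3 with the standard library. The function name and the deadline are made up for the example, and it only deals with the direct child process:

```python
import subprocess


def run_with_deadline(command, deadline_seconds):
    """Run `command`; kill it if it is still running after the deadline.

    Returns the exit code, or None if the command had to be killed.
    """
    try:
        completed = subprocess.run(command, timeout=deadline_seconds)
    except subprocess.TimeoutExpired:
        # subprocess.run() has already killed and reaped the child here.
        return None
    return completed.returncode
```

From cron this could be used as `run_with_deadline(["offlineimap", "-o"], 600)`, where the 600 seconds is an arbitrary value chosen for the example.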

@quite
Author

quite commented Feb 11, 2014

How do you use timeout(1) for that? Are you running offlineimap -o (run-once mode)?

@Gonzih

Gonzih commented Feb 11, 2014

No, just calling offlineimap from cron.

Best regards,
Max

@doy

doy commented Mar 12, 2014

I'm seeing this issue too (6.5.5 on Arch Linux).

@rcorre

rcorre commented Mar 16, 2014

Same here, 6.5.5 on Arch Linux. Not using cron, just running offlineimap -o.

@treese

treese commented Apr 1, 2014

I am seeing a similar problem on a Macbook Air running Mac OS X 10.9.2 (Mavericks). No cron, just using offlineimap.

@christopherraa

Same here. Offlineimap 6.5.4, Python: 2.7.5, Debian Wheezy (Ubuntu)

@mlen

mlen commented Jun 8, 2014

You should try setting socktimeout in the [general] section of your offlineimaprc. It sets a timeout on the select call, so the process will terminate when no data is received within the timeout. Solved the problem for me.
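
To show what a per-operation socket timeout does in general, here is a small Python 3 sketch. It is not OfflineIMAP's code, and the helper name is invented: a read that would otherwise block forever on a dead connection gives up after the timeout instead.

```python
import socket


def read_with_timeout(sock, timeout_seconds):
    """Return the bytes read, or None if nothing arrived in time."""
    sock.settimeout(timeout_seconds)
    try:
        return sock.recv(4096)
    except socket.timeout:
        # No data within the timeout: the caller can now treat the
        # connection as dead instead of waiting on it indefinitely.
        return None
```

Note that the timeout applies to each individual read, so an overall operation made of many reads can still take much longer than the timeout value.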

@rcorre

rcorre commented Jun 8, 2014

Thanks @mlen -- setting socktimeout in the [general] section seems to work.

@doy

doy commented Jun 10, 2014

This workaround helps, but I'm still getting occasional hangs even with the socktimeout option set.

doy added a commit to doy/conf that referenced this issue Jun 16, 2014
without this, offlineimap hangs whenever the network goes away (due to
suspend or whatever). without this set, the socket never times out, even
if the connection no longer exists, and offlineimap defers signals until
all connections are cleaned up, so it just permanently hangs. see
OfflineIMAP/offlineimap#56.
@jbmartin

@mlen's suggestion worked for me, thanks.

@tgy

tgy commented Sep 9, 2014

what value of socktimeout did you use?

@rcorre

rcorre commented Sep 9, 2014

I use socktimeout = 10

@tgy

tgy commented Sep 9, 2014

@murphyslaw480 thanks 👍

@nicolas33
Member

This needs to be documented in the known issues.

untitaker added a commit to untitaker/dotfiles that referenced this issue Feb 7, 2015
@nicolas33
Member

Done in cd962d4.

@doy

doy commented Feb 13, 2015

As I mentioned, setting socktimeout doesn't actually fix the problem - it makes it less frequent, but it still happens to me all the time even with this set. I don't think this is a sufficient fix.

@nicolas33
Member

Ok. I know current behaviour sucks. Sadly, it's hard to handle this properly so don't expect this to be fixed soon.

@nicolas33 nicolas33 reopened this Feb 13, 2015
@nicolas33 nicolas33 modified the milestones: 6.6.0 (next major), 6.5.7 (next minor) Mar 18, 2015
@nicolas33 nicolas33 removed their assignment Mar 18, 2015
@ezyang
Contributor

ezyang commented Mar 31, 2015

There are two things I would suggest here:

  1. If the first C-c attempts a graceful exit, the second C-c should hard exit.
  2. I'm pretty sure the remaining hanging is from the timeout not being applied to all blocking calls. This is something an audit should be able to catch.
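
The first suggestion could look roughly like this in Python 3. It is a sketch only: the class name and the injectable `hard_exit` parameter are assumptions made for the example, not OfflineIMAP code.

```python
import os
import signal
import threading


class TwoStageInterrupt:
    """First SIGINT requests a graceful stop; the second exits at once."""

    def __init__(self, hard_exit=os._exit):
        # Worker code is expected to poll this event and wind down.
        self.stop_requested = threading.Event()
        self._hard_exit = hard_exit

    def __call__(self, signum, frame):
        if self.stop_requested.is_set():
            # Second interrupt: skip cleanup and leave right away.
            self._hard_exit(1)
            return
        self.stop_requested.set()

    def install(self):
        # signal.signal() must be called from the main thread.
        signal.signal(signal.SIGINT, self)
```

Making `hard_exit` a parameter keeps the handler testable. By default it calls `os._exit`, which terminates the process without running cleanup handlers.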

@nicolas33
Member

Hi Edward,

  1. Yes, I've already suggested to handle the second Ctrl-c as hard exit.
  2. On resume, the timeout may require waiting until the local time is adjusted, or until the timeout is hit. BTW, the broken socket should be handled better.
    I'm planning a deep refactoring, which should make such issues easier to fix. You might be interested in following the coming changes.

@nicolas33 nicolas33 removed this from the next major (6.6.0) milestone Nov 5, 2015
@dolohow
Member

dolohow commented Feb 16, 2016

Still happening on 6.6.1

@nicolas33
Member

A fix to force OfflineIMAP to stop on consecutive Ctrl+C presses was merged some days ago. It will be in the next release (6.7.0-rc2).
AFAIK, nobody has worked on a proper resume at wakeup.

@nicolas33 nicolas33 added bug and removed enhancement labels Feb 17, 2016
@dolohow
Member

dolohow commented Feb 17, 2016 via email

@tgy

tgy commented Feb 17, 2016

I'm also interested in someone fixing this. I have the same issue. 😅

(I have offlineimap running as a systemd user service under Arch Linux)

@nicolas33
Member

  • A naive but still effective approach would be to introduce print statements to find a blocker.
  • A more advanced way is to try strace, though it can be tricky to map the output to lines of code.
  • Python includes debugging tools that could be useful. The most appealing ones for this purpose might be worth a try.
  • Teamwork can greatly help. Do share your analysis, successes and failures.

Bear in mind there might be more than one blocker.
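
As one example of the Python tooling mentioned above, the interpreter can report where every thread currently is, which helps when looking for a blocked call. A minimal sketch follows; the helper name is invented, and `sys._current_frames()` is a CPython-specific function.

```python
import sys
import threading
import traceback


def format_thread_stacks():
    """Return a text report with the current stack of every live thread."""
    names = {t.ident: t.name for t in threading.enumerate()}
    lines = []
    for ident, frame in sys._current_frames().items():
        lines.append("Thread %s (%s):" % (names.get(ident, "?"), ident))
        # Each entry shows file, line number and function for one frame.
        lines.extend(entry.rstrip("\n") for entry in traceback.format_stack(frame))
    return "\n".join(lines)
```

Printing this report while the process appears hung shows which call each thread is waiting in. How to trigger it (a signal handler, a timer, a debug option) is a separate choice.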

@pwnage101

Assuming this will be difficult to debug, can we at least implement a SIG{INT,TERM} handler which deletes the lockfile? I currently have to delete .offlineimap/*.lock files every time I resume from suspend because when offlineimap freezes it leaves the lockfiles hanging around.
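
Roughly what I have in mind, as a Python 3 sketch. The function names and the lockfile path handling are hypothetical, not taken from the offlineimap source. Python-level signal handlers run in the main thread, so this may not help if that thread never gets a chance to run them.

```python
import os
import signal
import sys


def remove_lockfile(path):
    """Delete `path` if it exists; return True when a file was removed."""
    try:
        os.remove(path)
    except FileNotFoundError:
        return False
    return True


def install_lock_cleanup(path):
    """On SIGINT/SIGTERM, remove the lockfile and then exit."""
    def handler(signum, frame):
        remove_lockfile(path)
        sys.exit(128 + signum)

    for sig in (signal.SIGINT, signal.SIGTERM):
        signal.signal(sig, handler)
```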

@nicolas33
Member

> Assuming this will be difficult to debug, can we at least implement a SIG{INT,TERM} handler which deletes the lockfile? I currently have to delete .offlineimap/*.lock files every time I resume from suspend because when offlineimap freezes it leaves the lockfiles hanging around.

Please, try v6.7.0 or later.

@pwnage101

Ah, since I run offlineimap as a notmuch hook, when I hit CTRL-C offlineimap was getting reparented to init, returning me to the shell but without actually killing the sync threads. If I wait another couple of seconds after CTRL-C, the lockfiles disappear and I can re-run notmuch.

By the way, I am running v6.7.0, but I was wrongly blaming offlineimap for something that seems to be an issue with notmuch or my configuration.

@nicolas33
Member

@pwnage101 You might like to try SIGQUIT. Unix signals handling is explained in the manual.

@sim590

sim590 commented Oct 11, 2016

While the popular reaction to this problem seems to be to restart offlineimap, I'd point out that restarting might not be ideal. For instance, if your passwords are retrieved by a subprocess using gpg-agent, then each time the connectivity issue occurs, gpg-agent will ask for your password. That can seem weird at first if you're not aware it is caused by offlineimap restarting in the background, and it is annoying if it happens a lot. Therefore, I'm really looking forward to a solution.

wincent added a commit to wincent/wincent that referenced this issue Dec 6, 2016
I notice offlineimap sometimes hanging forever, especially on wake from
sleep. Often only one of the two configured accounts will stop syncing,
so it is hard to notice that something went wrong.

Searching led me to:

OfflineIMAP/offlineimap#56

Suggests that setting socktimeout will help, although likely won't fix
all issues. The sample config file documents it thus:

> By default, Offlineimap will not exit due to a network error until the
> operating system returns an error code.  Operating systems can sometimes take
> forever to notice this.  Here you can activate a timeout on the socket.  This
> timeout applies to individual socket reads and writes, not to an overall sync
> operation.  You could perfectly well have a 30s timeout here and your sync
> still take minutes.
>
> Values in the 30-120 second range are reasonable.
>
> The default is to have no timeout beyond the OS.  Times are given in seconds.

The key there being "no timeout beyond the OS".
@pwnage101

pwnage101 commented Dec 9, 2016

I said earlier:

> If I wait another couple of seconds after CTRL-C, the lockfiles disappear and I can re-run notmuch.

I should clarify that sometimes, when I resume from suspend, offlineimap will hang forever. Of course I can kill it manually with SIGKILL or SIGQUIT.
