This repository has been archived by the owner. It is now read-only.

14 minute periodic IMAP-idle noop() calls? #193

Closed
testbird opened this issue Jun 17, 2018 · 28 comments
@testbird (Contributor) commented Jun 17, 2018

NOOP may be cheaper than frequent re-starting of a new idle watch on the server/client,
and network interruptions should be handled better and more quickly by listening to network events (instead of frequent IDLE restarts).

I have searched the repo but was not able to find a NOOP loop.
I found the requirement for 14 minute timeouts mentioned at https://www.isode.com/whitepapers/imap-idle.html. It says:

In practice things are made more complex by the problem of timeouts occurring when there is no activity keeping the connection open. The main timeouts that will occur are:

  • IMAP server timeout: Typically occurs after 30 minutes with no activity.
  • NAT Gateway timeout: Most mobile devices access the Internet through a device operated by the mobile service provider called a NAT (Network Address Translation) gateway. These will typically time out an idle connection after 15 minutes.

The solution to this is for the IMAP client to issue a NOOP (No Operation) command at intervals, typically every 15 minutes. This will exchange a few bytes of data, and keep everything active.
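
The 14-minute figure in the issue title follows from the timeouts quoted above: the keep-alive has to fire before the shortest timeout on the path. A minimal sketch of that arithmetic, assuming a one-minute safety margin (the margin and all names here are illustrative, not taken from deltachat-core):

```java
import java.time.Duration;
import java.util.Arrays;
import java.util.Collections;

class KeepAliveInterval {
    // Timeouts as quoted from the whitepaper excerpt above.
    static final Duration SERVER_TIMEOUT = Duration.ofMinutes(30);
    static final Duration NAT_TIMEOUT = Duration.ofMinutes(15);
    // Assumption: fire one minute before the shortest timeout.
    static final Duration MARGIN = Duration.ofMinutes(1);

    /** Returns the shortest of the given timeouts minus the safety margin. */
    static Duration keepAliveInterval(Duration margin, Duration... timeouts) {
        Duration shortest = Collections.min(Arrays.asList(timeouts));
        return shortest.minus(margin);
    }

    public static void main(String[] args) {
        Duration interval = keepAliveInterval(MARGIN, SERVER_TIMEOUT, NAT_TIMEOUT);
        System.out.println(interval.toMinutes() + " minutes"); // prints "14 minutes"
    }
}
```

Without a NAT gateway in the path, the same calculation yields 29 minutes from the server timeout alone.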

@t-oster (Contributor) commented Jun 17, 2018

Shouldn't the IDLE command be sent every few minutes anyway? (As far as I can see, it is usually re-sent every 28 minutes, but right now it should be sent every minute: https://github.com/deltachat/deltachat-core/blob/master/src/mrimap.c#L921)

@testbird (Contributor, Author) commented Jun 17, 2018

Thank you for pointing me to it.

So I see that in the released version 0.17.3 IDLE_DELAY_SECONDS is (28*60), and that it was only later reduced to 23 minutes and then to 1 minute.

The intervals > 15 minutes in the release explain why the NAT timeout breaks the TCP connection if there is no activity at all. Consequently, the idle thread only gets re-established after 30 minutes.

The current 1-minute polling during imap-idle mode may prevent NAT resets, but IIUC it's just an ugly workaround until SMTP is in a separate thread, right? It somehow defeats the purpose of IDLE.

I found this, with an answer saying how messages should actually not get lost:
https://stackoverflow.com/questions/2513194/imap-idle-timeout
Together with
https://joshdata.wordpress.com/2014/08/09/how-bad-is-imap-idle/
what I take away is:

  • The imap-idle connection needs to be refreshed after 29 minutes at the latest, even if the client IP has not changed. Possibly start a new connection before closing the old one, in case some server implementations would otherwise fail to send a notification.
  • If the client has an IP in the private IP ranges (and the server has not sent an "OK Still here" after at most 14 minutes), the client has to emit a NOOP to prevent NAT resets.
    => Or, more simply, refresh the IDLE after 14 minutes already when connecting from a private IP.
  • Polling can be done, instead of IDLE, by calling NOOP more frequently.
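
The takeaways above can be sketched as a small policy: refresh IDLE after 14 minutes from a private address, after 29 minutes otherwise, and send a NOOP when a private-address client has heard nothing from the server for 14 minutes. This is only an illustration; the class and method names are hypothetical, and the private-range check relies on Java's `InetAddress.isSiteLocalAddress()`, which covers the RFC 1918 ranges but not, for example, the 100.64.0.0/10 range that carrier-grade NAT may use.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.time.Duration;

class IdleRefreshPolicy {
    static final Duration MAX_IDLE = Duration.ofMinutes(29);      // refresh limit from the list above
    static final Duration NAT_SAFE_IDLE = Duration.ofMinutes(14); // stay below a 15-minute NAT timeout

    /** Parses an IP literal; literal addresses are parsed without a DNS lookup. */
    static InetAddress parseLiteral(String ip) {
        try {
            return InetAddress.getByName(ip);
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException("not an IP literal: " + ip, e);
        }
    }

    /** How long an IDLE may run before it is refreshed, given the local address. */
    static Duration refreshInterval(InetAddress local) {
        return local.isSiteLocalAddress() ? NAT_SAFE_IDLE : MAX_IDLE;
    }

    /** True if a NOOP is due: private address and no server data for 14 minutes or more. */
    static boolean needsNoop(InetAddress local, Duration sinceLastServerData) {
        return local.isSiteLocalAddress()
                && sinceLastServerData.compareTo(NAT_SAFE_IDLE) >= 0;
    }

    public static void main(String[] args) {
        System.out.println(refreshInterval(parseLiteral("192.168.1.20")).toMinutes()); // 14
        System.out.println(refreshInterval(parseLiteral("203.0.113.7")).toMinutes());  // 29
        System.out.println(needsNoop(parseLiteral("10.0.0.5"), Duration.ofMinutes(15))); // true
    }
}
```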

@csb0730 (Contributor) commented Jun 17, 2018

My experience so far (v0.17.3 and prior):

  • With the 28*60 interval I have never seen messages fail to arrive. So I'm not sure this should really be changed (!)
  • I have seen DC sometimes get completely stuck: no messages come in, and no messages can be sent either, until DC is restarted.
  • It is possible that connection handling is not as clean as it needs to be under unstable network conditions. That is, there may be a problem when, for example, the network comes and goes, or the IP changes when switching from mobile to WLAN. Maybe this is the only problem that exists!

@r10s is now trying to make connection handling clearer and simpler, but I think this should be done with patience. A mandatory data transfer every minute would really concern me. You are right to see this as contrary to the purpose of IDLE.

But we will see. Maybe this should be tried in a field test. Maybe some user settings should be introduced so that users can test this (as K9 offers: an adjustable IDLE timeout!).

@testbird (Contributor, Author) commented Jun 18, 2018

  1. I've seen problems with the 28*60 interval (IMAP connection resets every 30 minutes with the releases), and because the interval was first reduced slightly to 23 minutes, I guess they have practically been confirmed by others as well.
    But theoretically, if the server is not sending keep-alives, NAT resets must break the connection after 15 minutes. So some action after 14 minutes will be required IF connecting to such servers from a network in a private IP range. If possible, we could first try to confirm whether the server really is not sending any connection keep-alives (which would take this burden off the mobile clients). (Could the new IMAP thread get an option to log those, and maybe more frequent log marks to see when it gets killed?)

  2. I've seen this on another person's setup as well. One possibility could be server IP / connection limits https://www.dovecot.org/list/dovecot/2015-February/099713.html, but if I recall correctly it seemed to happen on or after connecting to wifi. So maybe SMTP was blocked (possibly due to routers now blocking port 25?).

  3. To be seen with new threading. :-)

As already suggested, DC may automatically adapt the IDLE timeout to the current IP network, and the current server's keep-alive interval.

@r10s (Member) commented Jun 18, 2018

we're currently reworking some parts of the idle/thread behavior, also see the discussion on the mailing list.
the general ideas:

  • all imap is done queued in an extra thread and all smtp is done queued in an extra thread. this allows us to remove most of the locking conditions,
  • if the process is killed by the os, these threads are restarted from a timer that is executed periodically (about once per minute in device-OFF-mode, about once per 10 minutes in device-DOZE-mode)
  • we have two job queues now - one for imap and one for smtp. up to 0.17.3 there was only one. so if sending fails, receiving is still possible, and both can also happen at the same time.
  • no more threads in the core, thread handling and the loop is done in the android part

there may still be bugs, however, the overall approach is much simpler and clearer, and complicated locking and multi-thread conditions are avoided most of the time.
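
The effect of splitting the single job queue into one for imap and one for smtp can be illustrated with two independent worker threads. This is a generic sketch using `java.util.concurrent`, not the deltachat-core implementation; all names are hypothetical. A failing smtp job is logged by its own worker and does not hold up the imap job.

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

class TwoJobQueues {
    /** Starts a daemon thread that runs jobs from its own queue, one at a time. */
    static Thread startWorker(String name, BlockingQueue<Runnable> jobs,
                              List<String> log, CountDownLatch done) {
        Thread t = new Thread(() -> {
            try {
                while (true) {
                    Runnable job = jobs.take(); // blocks until a job is queued
                    try {
                        job.run();
                        log.add(name + ": ok");
                    } catch (RuntimeException e) {
                        log.add(name + ": failed (" + e.getMessage() + ")");
                    } finally {
                        done.countDown();
                    }
                }
            } catch (InterruptedException e) {
                // worker was asked to stop
            }
        }, name);
        t.setDaemon(true);
        t.start();
        return t;
    }

    /** Queues one failing smtp job and one succeeding imap job, returns the log. */
    static List<String> demo() {
        List<String> log = new CopyOnWriteArrayList<>();
        BlockingQueue<Runnable> imapJobs = new LinkedBlockingQueue<>();
        BlockingQueue<Runnable> smtpJobs = new LinkedBlockingQueue<>();
        CountDownLatch done = new CountDownLatch(2);
        startWorker("imap", imapJobs, log, done);
        startWorker("smtp", smtpJobs, log, done);

        smtpJobs.add(() -> { throw new IllegalStateException("server unreachable"); });
        imapJobs.add(() -> { /* pretend to fetch messages */ });

        try {
            done.await(5, TimeUnit.SECONDS); // wait for both jobs to finish
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return log;
    }

    public static void main(String[] args) {
        demo().forEach(System.out::println);
    }
}
```

The order of the two log lines is not deterministic, since the workers run concurrently.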

@testbird (Contributor, Author) commented Jun 18, 2018

Very nice and consistent! It should help in logging and identifying connection problems.

Hmm, if the core function of the receiving and sending threads were not a (re-integrated) part of the deltachat-core package, would every app have to create those threads? That may not make sense on all platforms, and may create problems with connector plug-ins. The reinstall timers are specific, and at the same time should only be set up once per device.

Maybe it's already time to give some thought to the possibility of separating this properly. How would you propose using the same IMAP connections for "Email-push-notifications" and "Email-chat", as in #113 (comment)? (Have deltachat-core-lib and deltachat-core-android?)

Or, just a compile time target?

@r10s (Member) commented Jun 18, 2018

would every app have to create those threads?

Yes. Depending on the environment, libs creating their own threads are hard to handle, so we decided to drop this. It's also better for mobile systems as the ui-part knows better where to set wake-locks or what to do when the device reconnects and so on.
However, handling the threads is rather straightforward; the gist, e.g. for the imap-thread, is simple:

new Thread(new Runnable() {
    @Override
    public void run()
    {
        while (true)
        {
            mrmailbox_perform_jobs();
            mrmailbox_fetch();
            mrmailbox_idle(); // may take hours and MUST be called directly after fetch; can be interrupted using mrmailbox_interrupt_idle()
        }
    }
}).start();
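
The loop above depends on an idle call that blocks for a long time but can be woken from another thread. A minimal, self-contained sketch of that contract follows, with a semaphore standing in for the real IMAP connection. `idle()` and `interruptIdle()` are hypothetical stand-ins for `mrmailbox_idle()` and `mrmailbox_interrupt_idle()`, and the explicit timeout parameter is an assumption made for illustration.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

class IdleLoopSketch {
    enum IdleResult { INTERRUPTED, TIMEOUT }

    // Each release() wakes one pending or future idle() call.
    private final Semaphore wakeUp = new Semaphore(0);

    /** Stand-in for mrmailbox_interrupt_idle(): wakes a waiting idle() call. */
    void interruptIdle() {
        wakeUp.release();
    }

    /** Stand-in for mrmailbox_idle(): blocks until interrupted or until the timeout elapses. */
    IdleResult idle(long timeout, TimeUnit unit) {
        try {
            return wakeUp.tryAcquire(timeout, unit) ? IdleResult.INTERRUPTED : IdleResult.TIMEOUT;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return IdleResult.INTERRUPTED;
        }
    }

    public static void main(String[] args) {
        IdleLoopSketch box = new IdleLoopSketch();
        System.out.println(box.idle(50, TimeUnit.MILLISECONDS)); // nothing happens: TIMEOUT
        box.interruptIdle();                                     // e.g. a new job was queued
        System.out.println(box.idle(50, TimeUnit.MILLISECONDS)); // INTERRUPTED
    }
}
```

In a loop like the one above, either result would simply lead to the next round of jobs and fetch; the timeout only bounds how long the thread stays blocked.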

@testbird (Contributor, Author) commented Jun 18, 2018

Ok, so instead of every app having to create and maintain ongoing connections, threads, timers, and doze mode exceptions/workarounds, this is probably the point where a more generic push message client service (daemon) on Android would implement a content provider through which it could pass updates to different apps:
https://stackoverflow.com/questions/15610527/notifychange-with-changed-uri-from-contentprovider-update
#113 (comment)

@testbird (Contributor, Author) commented Jun 20, 2018

I have now looked into the newest code changes and, thanks to @r10s's answers, found that some really good-looking concepts have been started to solve things properly. :)

For example, new threaded handling of IMAP communication and broadcast receivers to handle state changes.

To my understanding, though, moving the thread handling into the UI really is an unfortunate drawback, although one that could be turned into a true feature.

Could you imagine deltachat-core compiling the broadcast receiver and thread management, with all the tweaks needed to make things work, into a service to be used by UI apps? One that provides a content provider and receives or watches the foreground status changes of UI apps?

It would require some more work, yes.

But there surely are already a bunch of apps (not MUAs) with an interest in being able to receive self-sendable (IMAP) push messages. The devs of these apps may also be much more likely to contribute to the shared service than to the deltachat UI.
The very first thing the other apps would have to work on would be to filter and separate the push notification "channels" from the chat/email.

(also CC @Ampli-fier)

@testbird (Contributor, Author) commented Jun 20, 2018

Account configuration (UI) could also go into the -core service app, which could also allow selecting apps other than deltachat to access the account's content.

@t-oster (Contributor) commented Jun 20, 2018

IMHO we should not try to build a generic IMAP-based push service. This is not within the scope of DeltaChat. It should be done in a separate project if needed. It would make DC even more complex and slow down development. DC should concentrate on becoming a reliable messenger and not the solution to other problems. (No offence, but I think we should not get ahead of ourselves.)

@testbird (Contributor, Author) commented Jun 20, 2018

Things only progress step by step.

Nothing radically new to implement. Deltachat UI already provides a content provider for attachments. Deltachat is not limited to receiving only specific push messages, and of course shouldn't be. DC's scope would stay on universal Email-Chatting (while allowing other apps to make different use of, and contributions to, the shared -core service).

The architectural adaptation of compiling a supporting service in -core offers large potential for an overall benefit. It's the step that creates a point of attraction for fellow devs, a starting point to go with.

@testbird (Contributor, Author) commented Jun 20, 2018

Who else has gotten this far and is in DC's position, able to trigger something?

@testbird (Contributor, Author) commented Jun 20, 2018

Maybe there are better ideas for breaking "message passing" down to the simplest possible proof-of-concept approach...?

Could it be a content provider in deltachat-android that another app can use to watch only a specific chat (messages coming from a contact)?

Or even simply an option for chats to let DC send an intent to a selected app on verified incoming messages?

@r10s (Member) commented Jun 20, 2018

i agree with @t-oster, the scope of Delta Chat is a messenger. and the current focus should be to make the core features stable and reliable.

@testbird (Contributor, Author) commented Jun 23, 2018

With the 60 seconds timeout in v0.18 I see an idle timeout every minute (sometimes two) and already 7% battery usage. :(

Edit: Why did you choose 60 seconds when 14 minutes seems to be the recommendation?

Is there still some advanced option to have imap-idle when screen is off?

@r10s (Member) commented Jun 23, 2018

i do not think the 7% battery comes from the idle-timeout. this should not take much battery.
however, we can tweak this at some time, for now, the most important thing is to have a reliable sending+receiving system.
the one-minute timeouts help with that as, if something goes badly wrong, the imap- and smtp- jobs will be done after only one minute.

@testbird (Contributor, Author) commented Jun 24, 2018

After the screen had been off for some hours and I then turned it on, I actually saw periods of about 30 minutes without IMAP-idle interruptions in the log (only SMTP).

For example:

16:10:02 IMAP-IDLE started...
     ... only smtp-idle and -jobs ending starting every minute
16:40:15 IMAP-IDLE has data.
     ... IMAP-idle ends, jobs started, ended, fetch started
16:40:15 IMAP stream lost; we'll reconnect soon
     ... fetch failed, re-tried, 0 new messages
     

Here, I guess the "has data" was actually wrong, and it was rather the server announcing a timeout, the OS closing the connection, or libetpan reporting no response from the server? As the fetch() failed with stream lost, the connection may already have been dead since a NAT reset after the first 15 minutes of idle.

In another example it looks like a local timeout triggered a full re-start right away:

22:00:25 IMAP-IDLE started...
     ... only smtp-idle and -jobs ending starting every minute
22:28:45 IMAP-IDLE timeout.
      ... then 1275ms full fetch that logged ignored folders (this is not always the case!) 
      ... imap-idle restarts every minute...

@testbird (Contributor, Author) commented Jun 24, 2018

This is all too strange. Now I found two periods of 8 and 16 minutes in the info log with only smtp interruptions:

  • 15:12:00 IMAP-IDLE started..., ending with 15:20:25 IMAP-IDLE timeout, new fetch ok, and later

  • 15:20:48 IMAP-IDLE started..., ending with 15:36:19 IMAP-IDLE timeout, and the new fetch failing with stream lost
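
To compare these periods against the timeouts discussed in this thread, the gaps between the logged timestamps can be computed directly. A small sketch using `java.time`, with the timestamps copied from the log excerpts in this thread (the class and method names are illustrative, and gaps that cross midnight are not handled):

```java
import java.time.Duration;
import java.time.LocalTime;

class IdleGaps {
    /** Time between two HH:mm:ss log timestamps on the same day. */
    static Duration gap(String start, String end) {
        return Duration.between(LocalTime.parse(start), LocalTime.parse(end));
    }

    /** Formats a duration as minutes and seconds. */
    static String format(Duration d) {
        return d.toMinutes() + " min " + (d.getSeconds() % 60) + " s";
    }

    public static void main(String[] args) {
        System.out.println(format(gap("15:12:00", "15:20:25"))); // 8 min 25 s
        System.out.println(format(gap("15:20:48", "15:36:19"))); // 15 min 31 s
        System.out.println(format(gap("16:10:02", "16:40:15"))); // 30 min 13 s
        System.out.println(format(gap("22:00:25", "22:28:45"))); // 28 min 20 s
    }
}
```

The second gap, which ended with the stream lost, is just past the 15-minute NAT figure quoted at the top of the thread; that is suggestive rather than conclusive.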

@testbird changed the title from "14 minute periodic IMAP noop() calls?" to "14 minute periodic IMAP-idle noop() calls?" on Jun 24, 2018

@testbird (Contributor, Author) commented Jun 24, 2018

After updating the OS and rebooting, there was still a 19 minute period with only smtp interruptions.

It looks like the IMAP timeout (60 seconds) is quite flaky.

@testbird (Contributor, Author) commented Jun 24, 2018

Could it be that the permanent notification works for the smtp thread, but is ineffective for the imap idle or interrupt loop?

@testbird (Contributor, Author) commented Jun 24, 2018

Maybe only if the loop thread is started before the permanent notification or similar.

@csb0730 (Contributor) commented Jul 3, 2018

Besides the sometimes strange timings in the log, I would be interested in understanding how the new approach works in detail. Some questions:

  • Is the IMAP connection kept open for the complete timeout period (23 min?), and
  • is it only made busy after 14 min with a simple fetch for INBOX only?
  • Are the short 1-minute interruptions that can be seen in the log only internal loops to check the internal job status? Do they mean real data transfer? I think not?!

Maybe @r10s can give a short explanation?

@csb0730 (Contributor) commented Jul 3, 2018

By the way: I see battery consumption now at approx 2%. This is really good. Great 👍

@testbird (Contributor, Author) commented Jul 27, 2018

There wasn't any noticeably high battery usage here any more either. I guess the consumption I saw could have been due to unplugging the phone from the charger during phone usage or an update.

I did once see deltachat not receiving messages until the phone was woken up, though.

@r10s (Member) commented Jul 27, 2018

on my test phones here also 1% ~ 3% battery usage :)

@testbird (Contributor, Author) commented Jul 27, 2018

The remaining point is:

  • From what the logs and missed messages show, the idle renewal in the background is not always done at reliable 14-minute intervals. Notifications get through regardless most of the time, but not always. Maybe it is the server sending keep-alives that makes it work most of the time.

The #define IDLE_DELAY_SECONDS (1*60) may only work if the app is in the foreground. Aren't there further wake-up timers that are used, or could be used, in the background?

@r10s (Member) commented Nov 6, 2018

i think the core of the issue has been addressed, so this can be closed.
