Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

IDLE can't be interrupted using idleTerm() #15

Open
dinhvh opened this issue Oct 13, 2015 · 19 comments
Open

IDLE can't be interrupted using idleTerm() #15

dinhvh opened this issue Oct 13, 2015 · 19 comments

Comments

@dinhvh
Copy link

dinhvh commented Oct 13, 2015

Here's a sample code:

  idleChannel := make(chan int8)
  for {
    interrupted := false
    log.Printf("before idle\n");
    _, idleErr := c.Idle()
    log.Printf("after idle %v\n", idling);

    go func() {
      log.Printf("receiving data\n");
      c.Recv(-1)
      log.Printf("received data xx\n");
      if (!interrupted) {
        idleChannel <- 1 // received
      }
    }()

    log.Printf("select")
    timeout := time.After(2 * time.Second)
        select {
    case value := <- idleChannel:
      if (value == 1) {
        log.Printf("received data %v\n", c.Data);
      }
      c.IdleTerm()
    case timeValue := <- timeout:
      log.Printf("interrupted %v\n", timeValue)
      interrupted = true
      c.IdleTerm()
      log.Printf("interrupted done\n")
    }
    log.Printf("out of select")
  }

When calling, c.IdleTerm() it will interrupt IDLE but the execution flow will be blocked at c.IdleTerm().
The log "interrupted done" won't show.

@iragsdale has a suggestion of fix here:
boxer@796926b

And it seems like it's the right thing to do:

  • don't try to read and parse the response in c.IdleTerm()
  • let c.Recv() do the receiving and parsing work
@mxk
Copy link
Owner

mxk commented Oct 13, 2015

Please post the output of your sample code along with the raw message log (imap.DefaultLogMask = imap.LogConn | imap.LogRaw).

@dinhvh
Copy link
Author

dinhvh commented Oct 13, 2015

Here are the logs:

C: BQMUP4 SELECT "INBOX"
S: * FLAGS (\Answered \Flagged \Draft \Deleted \Seen $Forwarded $Junk $MDNSent $MailFlagBit0 $NotJunk $NotPhishing $Phishing JunkRecorded NotJunk Seen)
S: * OK [PERMANENTFLAGS (\Answered \Flagged \Draft \Deleted \Seen $Forwarded $Junk $MDNSent $MailFlagBit0 $NotJunk $NotPhishing $Phishing JunkRecorded NotJunk Seen \*)] Flags permitted.
S: * OK [UIDVALIDITY 2] UIDs valid.
S: * 189 EXISTS
S: * 0 RECENT
S: * OK [UIDNEXT 207898] Predicted next UID.
S: * OK [HIGHESTMODSEQ 17940080]
S: BQMUP4 OK [READ-WRITE] INBOX selected. (Success)
2015/10/13 15:16:42 before idle
C: BQMUP5 IDLE
S: + idling
2015/10/13 15:16:42 after idle
2015/10/13 15:16:42 select
2015/10/13 15:16:42 receiving data
2015/10/13 15:16:44 interrupted 2015-10-13 15:16:44.535483765 -0700 PDT
C: DONE
S: BQMUP5 OK IDLE terminated (Success)
2015/10/13 15:16:44 received data xx

@dinhvh
Copy link
Author

dinhvh commented Oct 13, 2015

I also tried the patch for @iragsdale but I think there's still a race condition somewhere :(

@iragsdale
Copy link

Hmm, I've been using that code for quite a while handling a lot of clients. What doesn't appear to be working correctly?

@dinhvh
Copy link
Author

dinhvh commented Oct 13, 2015

I retried and when used properly, it works :)
Which probably means that the API for now is not so great.

@iragsdale
Copy link

Yeah, a better API overall might be to have the idle command take a channel that you can send to for canceling. I think I tried to make that work and had a hard time with it, but it was a while back and I don't recall.

@mxk
Copy link
Owner

mxk commented Oct 14, 2015

The problem that you guys are running into is that it's not safe to call IdleTerm concurrently with Recv. The documentation says: "The Client and its Command objects cannot be used concurrently from multiple goroutines." You have one goroutine blocked in the Recv(-1) call and another one in IdleTerm, which also calls Recv via cmd.Result(OK). Only one will get the command completion response, while the other will remain blocked.

You aren't fixing this problem by removing the cmd.Result(OK) call. It's still not safe to call IdleTerm while a Recv is in progress. You would need to add mutexes in a number of places to ensure that any outstanding Recv calls are actually blocked on network I/O and are not in the middle of processing a new response. I started out with such design, but it quickly turned into a mess, so I went with the simpler implementation.

The best way to deal with this is to Recv for a short interval (e.g. 100ms), or to poll for new responses, and call IdleTerm from the same goroutine when you want to stop idling.

@dinhvh
Copy link
Author

dinhvh commented Oct 14, 2015

When coroutine are executing concurrently, I don't think we need mutex. We need probably need mutexes only when it's running in a thread.
Recv() for a short interval will be CPU intensive and it's likely that it won't scale.
Could you let us know what's not safe once we move the cmd.Result(OK) call?

@mxk
Copy link
Owner

mxk commented Oct 14, 2015

Run that code with the race detector on. I've never tried it, but I'm betting that it will complain about a lot of things. The rest of the client state is unprotected, starting with concurrent access to c.tags and c.cmds. What you're doing is definitely not safe and will break in strange, unpredictable, and unreproducable ways later on. 100ms is an eternity for a CPU and there is nothing wrong in doing that.

@dinhvh
Copy link
Author

dinhvh commented Oct 14, 2015

My concern is that if we run ten thousands (or more) IDLE loops, 100ms won't look like a small number any more. I'm going to investigate the race detector. Thanks!

@dinhvh
Copy link
Author

dinhvh commented Oct 14, 2015

The race detector isn't showing anything.

@iragsdale
Copy link

@dinhviethoa I do believe you want to take precautions to use this safely. What we do looks something like the following:

idleCmd, err := c.Idle()
if err != nil {
    log.Printf("Error sending idle for %v: %v", &account, err)
    return err
}

// wait for a notification or a timeout in a goroutine, because the receive will block
// this allows us to cancel the idle if needed
serverChan := make(chan bool)
doneChan := make(chan bool)
waitc := c
go func() {
    select {
    // wait for cancel of the timeout
    case _ = <-cancel:
        waitc.IdleEnd()
        // wait for the main routine to continue
        <-serverChan
        // now read the result
        idleCmd.Result(imap.OK)
    // if we got here first, the server notified us or the timeout expired
    case <-serverChan:
        // this both sends the done command AND reads the result
        waitc.IdleTerm()
    }
    doneChan <- true
}()

// now wait for a response
err = c.Recv(25 * time.Minute)
if err != nil && err != imap.ErrTimeout {
    return err
}
serverChan <- true

// wait for the idle response to be read
<-doneChan

This could probably be cleaned up a bit, but what you see here is that if we hit the timeout, we trigger the end of the IDLE by sending the command to the server. After that we wait for the main thread receive to finish before reading the result and allowing the main thread to continue. That means the ONLY place we can use the client concurrently is to trigger send the end of the idle command.

We've run this against the race checker with no issue, and we've been using this to watch many thousands of accounts concurrently with good results and no crashes. It's a bit tricky to use properly, and I'm sure we could find some way to encapsulate that pattern a bit better, but I can confirm it works.

@mxk
Copy link
Owner

mxk commented Oct 14, 2015

I just pushed a new interrupt branch, which I think may be the better solution to your problem. I haven't tested these changes, but the idea is to allow one goroutine to interrupt a blocked Recv call in another goroutine by sending an error via a channel. You would still need to use mutexes, conditions, or other channels to synchronize both goroutines so that only one is calling Client methods at any given time, but this should allow you to block on Recv indefinitely.

Test it out, let me know what you think. If you like this interface, I'll add some tests for it and merge it into master. I need to think about this a bit more, though. It's been a long time since I last looked at this code.

@dinhvh
Copy link
Author

dinhvh commented Oct 15, 2015

Thanks! I'm taking a look at it!

@iragsdale
Copy link

At first glance, this makes WAY more sense to me. I think it should be far simpler to use and also solves for the case where you send the term to the server but it isn't responding (which currently would continue to block the main thread indefinitely).

@dinhvh
Copy link
Author

dinhvh commented Oct 15, 2015

I just tried it. It works really well!

Here's an API suggestion:

Usage:

idleCmd, err := c.Idle()

idleData := make(chan bool)
go func() {
  err := c.IdlePoll();
  if (err == errorInterrupted) {
    return
  }
  idleData <- true
}()

timeout := time.After(pollDelay * time.Second)
select {
   case <- idleData:
     log.Printf("has data")
   case <- timeout:
     log.Printf("timeout")
     c.IdleInterrupt()
}

c.IdleTerm()

Approximate implementation:

// When creating the client, always register an interruption channel: c.interruptChan(c.idleChannel)

func (c *imap.Client) IdlePoll() (error err) { // I'm not sure whether we should return an imap command so that the user of the API could collect the data on the command instead of collecting it on imap.Client
        err = c.Recv(-1)
        return
}

func  (c *imap.Client) IdleInterrupt() {
        c.idleChannel <- errorInterrupted
}

@dinhvh
Copy link
Author

dinhvh commented Oct 19, 2015

@iragsdale are you planning to merge that change in https://github.com/boxer/go-imap ?

@iragsdale
Copy link

Maybe? We're not really doing any active development on the tool that uses this now, as it's currently running pretty stable and doing what we need it to do.

@mikaa123
Copy link

Late to the party, but I'm also running into this issue.
I went with polling c.Recv(100), as @mxk suggested, and it works well.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants