RBE sometimes hangs during the inbox loop #16

Closed
r-a-y opened this Issue · 1 comment

1 participant

@r-a-y
Owner

I've received reports that RBE occasionally gets stuck in an "always connected" state.

This happens at some point during the inbox check. As a result, RBE fails to stop the connection and the DB markers are not cleared. On each successive WP cron check, RBE thinks it is still connected to the inbox when it actually isn't, so incoming emails can sit unprocessed indefinitely.

@r-a-y r-a-y referenced this issue from a commit
@r-a-y Introduce failsafe mechanism into RBE. See #16.
* Add a hook - 'bp_rbe_log_already_connected' - right after the failed cronjob connect message.

* Use this hook to introduce bp_rbe_failsafe(). This function checks the last three lines of the RBE debug log (the number can be overridden with a custom define) to see whether they all match the failed cronjob message. If all lines match, bp_rbe_cleanup() is used to clear RBE's DB entries and scheduled hook so RBE can run fresh on the next scheduled run.

* bp_rbe_failsafe() uses the new bp_rbe_tail() function to grab the last N lines of a file.  This function is derived from the MIT-licensed PHP-Tail library - https://github.com/ruscoe/PHP-Tail

* bp_rbe_tail() itself uses fseek() / fopen(). I considered using the WP Filesystem API, but it was too simplistic for this and a little intensive.
d1ad634
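
The failsafe idea in this commit (read the last N lines of the debug log and reset if they all report the failed connect) can be sketched as below. This is a minimal illustration in Python, not the plugin's PHP code: `tail()`, `should_reset()`, and the `failed_message` text are hypothetical names chosen for the example, and the real log message is not shown in this thread.

```python
import os


def tail(path, num_lines, block_size=1024):
    """Return the last num_lines lines of a file, reading backwards in blocks."""
    if num_lines <= 0:
        return []
    with open(path, "rb") as fh:
        fh.seek(0, os.SEEK_END)
        position = fh.tell()
        data = b""
        # Read blocks from the end until there are more newlines than
        # requested lines, or until the start of the file is reached.
        while position > 0 and data.count(b"\n") <= num_lines:
            read_size = min(block_size, position)
            position -= read_size
            fh.seek(position)
            data = fh.read(read_size) + data
    lines = data.decode("utf-8", errors="replace").splitlines()
    return lines[-num_lines:]


def should_reset(log_path, failed_message, lines_to_check=3):
    """True if every one of the last lines_to_check log lines contains failed_message."""
    last_lines = tail(log_path, lines_to_check)
    if len(last_lines) < lines_to_check:
        # Too little history to conclude that the connection is stuck.
        return False
    return all(failed_message in line for line in last_lines)
```

A caller would run its cleanup routine only when `should_reset()` returns true, so a single failed connect never triggers a reset.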
@r-a-y r-a-y referenced this issue from a commit
@r-a-y Replace WP cron with custom version.
In this commit, we're removing integration with WordPress cron due to edge-
case scheduling issues and creating our own version of cron using
transients.

The side-effect of this is we are also able to automatically reconnect to
the inbox after an inbox check with the keep-alive value expires.  The next
commit will add an admin option to disable auto-reconnection.

See #16.
006f3b2
@r-a-y r-a-y referenced this issue from a commit
@r-a-y Add autoconnect option.
In commit 006f3b2, we replaced WP cron with our own custom version.  This
made it possible to automatically reconnect to an inbox after the keep-alive
value expires.

However, if we wanted to disable this feature, there was no admin option to
do so.

This commit adds an admin option to disable the autoconnect feature.

See #16.
1579a98
@r-a-y r-a-y referenced this issue from a commit
@r-a-y Replace transients with options saved as timestamps.
In commit 006f3b2, transients replaced the older method of using WP-cron.
However, I was using transients more as an expiration date.

This is the incorrect way to use transients, because transients are stored
in the object cache when a persistent object cache is in use.

A persistent object cache is a cache, not a data store.  The transient's
expiration date only guarantees that the data will not exist past that time.
The transient *can* expire earlier than this date, which could lead to
inaccuracies and potential double-connections.

Switching to options resolves this issue.

See #16.
5eeb88c
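
The difference described in this commit can be sketched as follows: instead of relying on a cache entry disappearing on schedule, store an explicit expiry timestamp in persistent storage and compare it against the current time. This is a minimal Python illustration under stated assumptions; `OptionStore`, `mark_connected()`, `is_connected()`, and the `connected_until` key are hypothetical and do not correspond to the plugin's actual option names.

```python
import time


class OptionStore:
    """Stand-in for a persistent key/value store such as an options table."""

    def __init__(self):
        self._options = {}

    def update(self, key, value):
        self._options[key] = value

    def get(self, key, default=None):
        return self._options.get(key, default)


def mark_connected(store, keepalive_seconds, now=None):
    """Record when the current connection should be considered finished."""
    now = time.time() if now is None else now
    store.update("connected_until", now + keepalive_seconds)


def is_connected(store, now=None):
    """Compare the stored timestamp with the clock; nothing can vanish early."""
    now = time.time() if now is None else now
    expires_at = store.get("connected_until")
    return expires_at is not None and now < expires_at
```

Because the value lives in the data store and the expiry check is done by the caller, a cache eviction cannot make the connection look finished before the keep-alive period is over.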
@r-a-y r-a-y referenced this issue from a commit
@r-a-y Tweaks to lock checking.
See commit 5eeb88c, #16.
051eaf1
@r-a-y r-a-y referenced this issue from a commit
@r-a-y Further tweaks to IMAP locking.
See #16.
dd2576e
@r-a-y r-a-y referenced this issue from a commit
@r-a-y Refactor IMAP locking system to use filesystem instead of database.
Querying the database to check for a lock is too slow and led to race
conditions where duplicate IMAP connections could occur.  Accessing the
filesystem is better, but not 100% foolproof.  WP-cron suffers from the
same problem.

In my testing, the filesystem can handle up to 12 concurrent page loads in
the same second without launching a duplicate connection.  That level of
concurrency is rare unless your site experiences a ton of traffic.

The filesystem method can be overridden by redeclaring these functions in a
plugin:
- bp_rbe_is_connecting()
- bp_rbe_add_imap_lock()
- bp_rbe_remove_imap_lock()
- bp_rbe_stop_imap()
- bp_rbe_should_stop()

Handy if you wanted to use something faster like memcached or shared memory.

See #16.
2adc4d0
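
A filesystem lock of the kind described in this commit could look roughly like the sketch below. The commit does not say how the lock file is created, so the use of exclusive file creation here is an assumption, and `add_lock()`, `is_locked()`, and `remove_lock()` are hypothetical Python names, not the plugin's PHP functions.

```python
import os


def add_lock(lock_path):
    """Try to take the lock; return False if another process already holds it."""
    try:
        # O_CREAT | O_EXCL makes creation fail if the file already exists,
        # so only one caller can succeed in creating the lock file.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    os.close(fd)
    return True


def is_locked(lock_path):
    """Report whether a lock file is currently present."""
    return os.path.exists(lock_path)


def remove_lock(lock_path):
    """Release the lock; a missing lock file is not treated as an error."""
    try:
        os.remove(lock_path)
    except FileNotFoundError:
        pass
```

A connection routine would call `add_lock()` first and only connect if it returns true, then call `remove_lock()` when the inbox check ends. A stale lock left behind by a crashed process still needs separate handling.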
@r-a-y
Owner

Closing this. Commit 5eeb88c resolved the main issue.

The commits that followed it reduced the duplicate-connection problem.

@r-a-y r-a-y closed this