This repository has been archived by the owner on Apr 7, 2022. It is now read-only.

Backfill param is missing when re-connect. #86

Closed
GoSkyLine opened this issue Sep 23, 2013 · 4 comments

Comments

@GoSkyLine

Hello everybody,

We are using the Hosebird Client (hbc-core-1.4.0.jar) to consume the Twitter firehose stream. Everything works great 👍, except that the backfill param "count" doesn't always work. That is to say, we set the backfill on startup and can see that it works on the first connection. However, the backfill param "count" no longer appears whenever the client reconnects after an error during consumption.

We went through HBC's source code a few times, and it looks like backfill should always be applied, using the rate retrieved from the rate tracker (we tried both the default rate tracker in ClientBuilder and our own rate tracker instance).


Here are some logs around the first connection, where we can see the backfill param "count":

2013-09-20 08:59:05,725 INFO TwitterStreamReader com.twitter.hbc.httpclient.BasicClient - New connection executed: HosebirdClient, endpoint: /1.1/statuses/firehose.json?count=150000&allow_restricted=true&delimited=length&partitions=0%2C1%2C2%2C3&stall_warnings=true
2013-09-20 08:59:06,080 INFO hosebird-client-io-thread-0 com.twitter.hbc.httpclient.ClientBase - HosebirdClient Establishing a connection
2013-09-20 08:59:06,085 INFO TwitterEventQueue TwitterEventQueue - [CONNECTION_ATTEMPT] - GET https://stream.twitter.com/1.1/statuses/firehose.json?count=150000&allow_restricted=true&delimited=length&partitions=0%2C1%2C2%2C3&stall_warnings=true HTTP/1.1
2013-09-20 08:59:06,741 DEBUG hosebird-client-io-thread-0 com.twitter.hbc.httpclient.ClientBase - HosebirdClient Connection successfully established
2013-09-20 08:59:06,742 INFO TwitterEventQueue TwitterEventQueue - [CONNECTED] - HTTP/1.1 200 OK
2013-09-20 08:59:06,742 INFO hosebird-client-io-thread-0 com.twitter.hbc.httpclient.ClientBase - HosebirdClient Processing connection data
2013-09-20 08:59:06,742 INFO TwitterEventQueue TwitterEventQueue - [PROCESSING] - Processing messages


Here are some logs around the reconnection, where the backfill param "count" is missing:

2013-09-20 11:28:27,083 INFO hosebird-client-io-thread-0 com.twitter.hbc.httpclient.ClientBase - HosebirdClient Disconnected during processing - will reconnect
2013-09-20 11:28:27,083 INFO hosebird-client-io-thread-0 com.twitter.hbc.httpclient.ClientBase - HosebirdClient Done processing, preparing to close connection
2013-09-20 11:28:27,083 INFO TwitterEventQueue TwitterEventQueue - [DISCONNECTED] - Read timed out
2013-09-20 11:28:27,093 INFO hosebird-client-io-thread-0 com.twitter.hbc.httpclient.ClientBase - HosebirdClient Establishing a connection
2013-09-20 11:28:27,093 INFO TwitterEventQueue TwitterEventQueue - [CONNECTION_ATTEMPT] - GET https://stream.twitter.com/1.1/statuses/firehose.json?allow_restricted=true&delimited=length&partitions=0%2C1%2C2%2C3&stall_warnings=true HTTP/1.1
2013-09-20 11:28:27,684 DEBUG hosebird-client-io-thread-0 com.twitter.hbc.httpclient.ClientBase - HosebirdClient Connection successfully established
2013-09-20 11:28:27,684 INFO TwitterEventQueue TwitterEventQueue - [CONNECTED] - HTTP/1.1 200 OK
2013-09-20 11:28:27,684 INFO hosebird-client-io-thread-0 com.twitter.hbc.httpclient.ClientBase - HosebirdClient Processing connection data
2013-09-20 11:28:27,685 INFO TwitterEventQueue TwitterEventQueue - [PROCESSING] - Processing messages


From the source code we can see that an IOException caused the disconnection:

catch (IOException ex) {
    // connection issue? whatever. let's try connecting again
    // we can't really diagnosis the actual disconnection reason without parsing (looking at disconnect message)
    // but we can make a good guess at when we're stalling. TODO
    logger.info("{} Disconnected during processing - will reconnect", name);
    statsReporter.incrNumDisconnects();
    addEvent(new Event(EventType.DISCONNECTED, ex));
}

According to a comment on the ClientBase.run() method:
"if IOException, time to restart the connection: handle http connection cleanup, do some backoff, set backfill"
So we expected the backfill param to still be set before the next connection, but it was not.

Any help or direction will be much appreciated!

Thanks in advance!

Jack

@scubasau

I noticed the same issue during Firehose testing. It appears that the issue is due to backoffMillis in BasicReconnectionManager being 0 for the first reconnection. It is only if the reconnect fails that backoffMillis is increased and estimateBackfill(tps) will result in a count > 0. This means that if a read times out or we get a gzip/ssl error (not uncommon), we'd be missing a few ms worth of tweets.
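To illustrate the arithmetic described above, here is a minimal sketch. It assumes the backfill estimate is simply the tweet rate multiplied by the backoff duration; the class name, the two-argument `estimateBackfill` signature, and the rate of 5000 tweets per second are hypothetical, and this is not HBC's actual implementation.

```java
// Hypothetical sketch, not HBC code: shows why a zero backoff yields no backfill.
public class BackfillEstimateSketch {

    // Assumed formula: tweets expected during the backoff window.
    static int estimateBackfill(double tps, long backoffMillis) {
        return (int) (tps * backoffMillis / 1000.0);
    }

    public static void main(String[] args) {
        double tps = 5000.0; // illustrative rate only

        // First reconnection: no backoff yet, so the estimate is 0
        // and no "count" param would be added to the request.
        System.out.println(estimateBackfill(tps, 0));   // prints 0

        // After a failed reconnect the backoff grows, and so does the estimate.
        System.out.println(estimateBackfill(tps, 250)); // prints 1250
    }
}
```

Under this assumption, any disconnect that is followed by an immediately successful reconnect would request no backfill at all, which matches the logs in the original report.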

My solution to this was to write a custom ReconnectionManager that listens for DISCONNECTED events. It then calculates the backfill as tps * the amount of time disconnected (i.e., currentTime - disconnectTime).

I'm still in the process of testing this, but so far so good.
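The approach described in this comment could look roughly like the following. This is only a sketch under stated assumptions, not the commenter's code: `DisconnectBackfillTracker` and its methods are hypothetical names, it does not implement HBC's ReconnectionManager interface, and rounding up is a choice made here so that a short gap still requests at least one tweet.

```java
// Hypothetical sketch: backfill based on time actually spent disconnected.
public class DisconnectBackfillTracker {

    private long disconnectTimeMillis = -1; // -1 means no disconnect is pending

    // Call when a DISCONNECTED event is observed.
    public void onDisconnected(long nowMillis) {
        if (disconnectTimeMillis < 0) {
            disconnectTimeMillis = nowMillis; // keep the earliest disconnect time
        }
    }

    // Call right before reconnecting: tps * (currentTime - disconnectTime).
    public int estimateBackfill(double tps, long nowMillis) {
        if (disconnectTimeMillis < 0) {
            return 0; // nothing was missed
        }
        long elapsedMillis = Math.max(0, nowMillis - disconnectTimeMillis);
        return (int) Math.ceil(tps * elapsedMillis / 1000.0);
    }

    // Call once the connection is established again.
    public void onConnected() {
        disconnectTimeMillis = -1;
    }

    public static void main(String[] args) {
        DisconnectBackfillTracker tracker = new DisconnectBackfillTracker();
        tracker.onDisconnected(1_000L);
        // A 10 ms gap, like the one between DISCONNECTED and CONNECTION_ATTEMPT
        // in the logs above, at an illustrative 5000 tweets per second.
        System.out.println(tracker.estimateBackfill(5000.0, 1_010L)); // prints 50
    }
}
```

Because the estimate depends on elapsed wall-clock time rather than on the backoff value, it stays above zero even when the first reconnect attempt happens with no backoff.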

@GoSkyLine
Author

@scubasau, thank you for your input. I hope it can be fixed in the next release.

@kevinoliver
Contributor

Thanks for the analysis, @scubasau. Looks like a straightforward fix.

@kevinoliver
Contributor

Should be fixed in the next release.
