-
Notifications
You must be signed in to change notification settings - Fork 33
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add proxy_user, proxy_password and move IOError (EOFError) to its own rescue clause #59
Conversation
I tried mitmproxy (hit some https limitation in full vs relative urls after CONNECT), squid (nothing happens that I can tell, logs EOFError with retry) and OWASP ZAP (more traction, but GatewayTimeout and TooManyRequests errors only from Twitter). Note: that with ZAP I could connect to twitter from a browser after setting up the ca cert and proxy info in firefox. This on a MBP (time to get a linux laptop I think). |
About the rewrite. I found that Twitter have a Java based Streaming Client with proxy support... and an unofficial one that has seen more recent upkeep... |
@@ -98,11 +99,24 @@ class LogStash::Inputs::Twitter < LogStash::Inputs::Base | |||
# Port where the proxy is listening, by default 3128 (squid) | |||
config :proxy_port, :validate => :number, :default => 3128 | |||
|
|||
# Username where the proxy is listening, by default 3128 (squid) | |||
config :proxy_port, :validate => :number, :default => 3128 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
duplicate config item here
# Duration in seconds to wait before retrying a connection when twitter responds with a 429 TooManyRequests | ||
# In some cases the 'x-rate-limit-reset' header is not set in the response and <error>.rate_limit.reset_in | ||
# is nil. If this occurs then we use the integer specified here. The default is 5 minutes. | ||
config :rate_limit_reset_in, :validate => :number, :default => 300 | ||
|
||
|
||
# Duration in seconds to wait before retrying when Twitter client socket raises a EOF on network interruption | ||
config :steam_eof_wait_duration, :validate => :number, :default => 30 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we just implement an exponential backoff with no need for more configuration items?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now, if only we had a exponential backoff widget.
I will add the backoff.
@@ -145,6 +159,10 @@ def run(queue) | |||
@logger.warn("Twitter too many requests error, sleeping for #{sleep_for}s") | |||
Stud.stoppable_sleep(sleep_for) { stop? } | |||
retry | |||
rescue IOError => e | |||
@logger.info("Twitter error: #{e.message}, retry in #{steam_eof_wait_duration}s") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe @logger.error
here?
@logger.info("Twitter error: #{e.message}, retry in #{steam_eof_wait_duration}s") | |
@logger.error("Socket error, retrying in #{steam_eof_wait_duration}s", :error => e.message) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I chose INFO because it is not an error per se. When the user sees the ERROR log line they panic as if there is something that they need to do/fix. The message can be more INFO and not ERROR in its wording.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe a good compromise with warn? :D Since that the ocurrence of this event means that there's no tweet ingestion I think we should do > info
proxy_address: @proxy_address, | ||
proxy_port: @proxy_port, | ||
} | ||
uri = URI.parse('') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it seems the twitter client docs suggest you could just place the host and port into the hash as we're doing before, what is the need for creating the URI object here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I misread the part in the patch that uses the http
gem's Request object. I though it was using the typhoeus gem and that uses a uri.
Still, now that I'm reading the http gem code I'm more confused.
@jsvd
I used wireshark and owasp zap to check the before/after behaviour.
Before:
I can't say whether this plugin actually works with the types of proxies used in the enterprise. I would like to park/close this PR until I have some time to experiment with the Java libraries mentioned earlier. |
I also removed the integration tests, Twitter has blocked the Publisher App now - makes them unreliable and of little value.
Pere had a PR to add the proxy auth, I will close it #33.
A contributor submitted a PR a long time ago that changed the patch code to the twitter gem but I feel that it introduces a loop that can't check for
stop
#46.IMHO the streaming part of this plugin needs to be rewritten because I don't think we via the twitter gem are handling the semantics of the streaming api very well especially not if via a proxy.
Either we rewrite or we remove proxy support.