Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace Time.parse with Time.iso8601 #1408

Closed
wants to merge 1 commit into from
Closed

Conversation

dwbutler
Copy link
Contributor

I noticed that the LogStash::Event constructor calls LogStash::Time.parse_iso8601. In MRI, this calls Time.parse. The comment says it all: "Warning, ruby's Time.parse is really terrible and slow." Luckily, Ruby has an ISO8601 parsing method built into the standard library: Time.iso8601.

I ran the event RSpec benchmarks in MRI Ruby 2.1.2 and here are the results:

Before: event @timestamp parse rate: 30851/sec, elapsed: 32.413648s
After: event @timestamp parse rate: 71543/sec, elapsed: 13.977632s

So the rate more than doubled.

I also created a Gist benchmarking Time.parse vs Time.iso8601 in MRI and JRuby, vs the Joda implementation in JRuby.

@wiibaa
Copy link
Contributor

wiibaa commented Jun 12, 2014

@colinsurprenant sorry to chase you but to be sure that #1405 good work is complete maybe this PR proposal would help as well in your new Logstash::Timestamp for non-jruby https://github.com/colinsurprenant/logstash/blob/feature/faster_json/lib/logstash/timestamp.rb#L69

Now that you added cool specs, it should be easy to decide :)

@colinsurprenant
Copy link
Contributor

thanks for the heads up @wiibaa and thanks for the suggestion and benchmarks @dwbutler ! curious, are you actually running LS on MRI? there's JRuby dependant stuff like Event#sprintf and the date filter.

So, I have a branch here where I did a first quick pass for MRI compatibility, for Event#sprintf I simply used Time#strftime but obviously that won't support the joda format syntax. I haven't touch the date filter yet.

So the problem is essentially to either document the format syntax difference if using MRI or JRuby. Since by default everyone will be running under JRuby, I'm not sure its pertinent to potentially confuse users with such details in the doc (also note that most LS users don't even know JRuby is running under the hood). We threw the idea of maybe writing a Joda to strftime format converter. Not sure about that. So bottom lime, very few users will be running LS on MRI.

On the other hand, I think its important to make sure LS does run on MRI as much as possible.

What do you guys think?

@colinsurprenant
Copy link
Contributor

forgot to mention, I will not apply this suggestion on #1405 but in my MRI branch which has not been PR'ed yet - but I can push the branch if you guys want to play with it, or I could also create a "early" WIP PR. Let me know.

@dwbutler
Copy link
Contributor Author

@colinsurprenant There is actually a whole ecosystem of Ruby gems using the logstash-event gem in order to write logs in logstash-compatible JSON format. (My gem, logstash-logger, is one of them.) So any Ruby implementation is fair game, not just MRI.

I noticed that if you pass a string timestamp to the LogStash::Event constructor, it will call LogStash::Time.parse_iso8601, which is why I raised this issue.

On the other hand, I think its important to make sure LS does run on MRI as much as possible.

It's good to maintain cross-Ruby compatibility, but I get the impression that logstash is heading more towards deeper dependency on Java for performance reasons.

In any case, LogStash::Event should definitely be compatible with all Rubies. It might be worth extracting LogStash::Event out to a totally separate codebase to minimize dependencies on the logstash codebase.

@colinsurprenant
Copy link
Contributor

@dwbutler good point about logstash-event. haven't really paid much attention to it, but will definitely put that on the roadmap. We'll see what makes more sense performance & maintainability wise.

Again, we do want to maintain cross-Ruby compatibility, but as you noted, we will rely more on Java for performance reasons, so some features might not be available in non-JRuby environments, unless we also provide, or someone contributes, and alternate implementation. We'll try to make Java specific implementations as modular as possible to facilitate alternate implementations.

@jordansissel
Copy link
Contributor

Regarding time parsing ,I did these tests a long time ago:

@dwbutler
Copy link
Contributor Author

@jordansissel I did find your benchmarks during my journey into the logstash code. =)

The problem with that benchmark is that it's comparing apples to oranges. Time.parse checks for all kinds of different time formats, so it will naturally be a lot slower than a method that only needs to check for ISO8601. Time.iso8601 is naturally much closer in speed to Joda's ISO8601 parsing. ("Only" 3 times slower instead of 20x slower.)

@colinsurprenant
Copy link
Contributor

@dwbutler about your gist: I added a note for using Benchmark.bmbm with JRuby...

@dwbutler
Copy link
Contributor Author

Thanks Colin, I didn't know about Benchmark.bmbm! It turns out that the results I posted were the stabilized values after running a few times, so Benchmark.bmbm actually gave similar results.

@wiibaa
Copy link
Contributor

wiibaa commented Jun 19, 2014

@colinsurprenant if I'm not wrong @jordansissel time parsing benchmark was in the context of the date filter where it is required to parse "anything" but now that Event internal timestamp is more safe-guarded to be ISO8601 or be modified only through the date filter, the proposed change would still help, no?

Also please have a look to jruby 1.7.12 new release, in particular for jruby/jruby#1319 if I understand correctly it could be a great benefit to long-running logstash instances

@colinsurprenant
Copy link
Contributor

@wiibaa I will integrate this into my MRI branch shortly. Also, we will definitely update to 1.7.12 but possibly not in 1.4.2.

@elasticsearch-release
Copy link

Can one of the admins verify this patch?

@elasticsearch-release
Copy link

Jenkins standing by to test this. If you aren't a maintainer, you can ignore this comment. Someone with commit access, please review this and clear it for Jenkins to run; then say 'jenkins, test it'.

@dwbutler
Copy link
Contributor Author

I signed the CLA. Not sure if/when the automated check will pick that up.

@elasticsearch-release
Copy link

Jenkins standing by to test this. If you aren't a maintainer, you can ignore this comment. Someone with commit access, please review this and clear it for Jenkins to run; then say 'jenkins, test it'.

@ph
Copy link
Contributor

ph commented Dec 4, 2015

Sadly we are moving away from a MRI implementation, so I think we can close this PR. :(

@ph ph closed this Dec 4, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants