-
Notifications
You must be signed in to change notification settings - Fork 415
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
TEZ-4365: Use Regex Pattern to Parse DAG ID String #172
base: master
Are you sure you want to change the base?
Conversation
💔 -1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
@belugabehr, can you go back and run the performance tests in https://issues.apache.org/jira/browse/TEZ-1526. It will be interesting to see how this performs after removing the performance optimizations. |
@jteagles I create a small driver with JMH:
Quite a bit slower, but still an impressive 1,809,324 string per second on my dated hardware. Using regex provides for fewer lines of code and makes it more readable. But your call. If you're not accepting of it, consider the unit tests update. |
This code optimization was critically import as the the event thread spends a significant time parsing task/attempt ids to dispatch messages. I would hate to lose that. I can appreciate the simplicity of REGEX though. Perhaps the regex can be used to validate the manual parsing, as the manual parsing is more error prone. And improved testing is welcome. This patch inspired a YARN ID parsing improvement that made significant improvements there as well. I've linked the jira for reference in the original. https://issues.apache.org/jira/browse/YARN-6768 |
No description provided.