TEZ-4365: Use Regex Pattern to Parse DAG ID String #172

Open
wants to merge 2 commits into
base: master

Conversation

belugabehr
Contributor

No description provided.

@tez-yetus

💔 -1 overall

Vote Subsystem Runtime Comment
+0 🆗 reexec 16m 27s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 1 new or modified test files.
_ master Compile Tests _
+1 💚 mvninstall 12m 59s master passed
+1 💚 compile 0m 23s master passed with JDK Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04
+1 💚 compile 0m 22s master passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 checkstyle 0m 53s master passed
+1 💚 javadoc 0m 33s master passed with JDK Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04
+1 💚 javadoc 0m 19s master passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+0 🆗 spotbugs 1m 0s Used deprecated FindBugs config; considering switching to SpotBugs.
+1 💚 findbugs 0m 57s master passed
_ Patch Compile Tests _
+1 💚 mvninstall 0m 13s the patch passed
+1 💚 compile 0m 13s the patch passed with JDK Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04
+1 💚 javac 0m 13s the patch passed
+1 💚 compile 0m 11s the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 javac 0m 11s the patch passed
+1 💚 checkstyle 0m 8s the patch passed
-1 ❌ whitespace 0m 0s The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix <<patch_file>>. Refer https://git-scm.com/docs/git-apply
+1 💚 javadoc 0m 12s the patch passed with JDK Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04
+1 💚 javadoc 0m 11s the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 findbugs 0m 32s the patch passed
_ Other Tests _
+1 💚 unit 0m 30s tez-common in the patch passed.
+1 💚 asflicense 0m 13s The patch does not generate ASF License warnings.
36m 4s
Subsystem Report/Notes
Docker ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-172/1/artifact/out/Dockerfile
GITHUB PR #172
JIRA Issue TEZ-4365
Optional Tests dupname asflicense javac javadoc unit spotbugs findbugs checkstyle compile
uname Linux bb1b434b01c7 4.15.0-163-generic #171-Ubuntu SMP Fri Nov 5 11:55:11 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality personality/tez.sh
git revision master / c9b8e90
Default Java Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
whitespace https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-172/1/artifact/out/whitespace-eol.txt
Test Results https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-172/1/testReport/
Max. process+thread count 91 (vs. ulimit of 5500)
modules C: tez-common U: tez-common
Console output https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-172/1/console
versions git=2.25.1 maven=3.6.3 findbugs=3.0.1
Powered by Apache Yetus 0.12.0 https://yetus.apache.org

This message was automatically generated.

@tez-yetus

🎊 +1 overall

Vote Subsystem Runtime Comment
+0 🆗 reexec 1m 37s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 1 new or modified test files.
_ master Compile Tests _
+1 💚 mvninstall 14m 42s master passed
+1 💚 compile 0m 27s master passed with JDK Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04
+1 💚 compile 0m 23s master passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 checkstyle 0m 59s master passed
+1 💚 javadoc 0m 35s master passed with JDK Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04
+1 💚 javadoc 0m 24s master passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+0 🆗 spotbugs 1m 1s Used deprecated FindBugs config; considering switching to SpotBugs.
+1 💚 findbugs 0m 59s master passed
_ Patch Compile Tests _
+1 💚 mvninstall 0m 15s the patch passed
+1 💚 compile 0m 15s the patch passed with JDK Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04
+1 💚 javac 0m 15s the patch passed
+1 💚 compile 0m 14s the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 javac 0m 14s the patch passed
+1 💚 checkstyle 0m 8s the patch passed
+1 💚 whitespace 0m 0s The patch has no whitespace issues.
+1 💚 javadoc 0m 13s the patch passed with JDK Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04
+1 💚 javadoc 0m 13s the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 findbugs 0m 38s the patch passed
_ Other Tests _
+1 💚 unit 0m 32s tez-common in the patch passed.
+1 💚 asflicense 0m 14s The patch does not generate ASF License warnings.
23m 44s
Subsystem Report/Notes
Docker ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-172/2/artifact/out/Dockerfile
GITHUB PR #172
JIRA Issue TEZ-4365
Optional Tests dupname asflicense javac javadoc unit spotbugs findbugs checkstyle compile
uname Linux 8454a0c9589f 4.15.0-163-generic #171-Ubuntu SMP Fri Nov 5 11:55:11 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality personality/tez.sh
git revision master / c9b8e90
Default Java Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
Test Results https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-172/2/testReport/
Max. process+thread count 91 (vs. ulimit of 5500)
modules C: tez-common U: tez-common
Console output https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-172/2/console
versions git=2.25.1 maven=3.6.3 findbugs=3.0.1
Powered by Apache Yetus 0.12.0 https://yetus.apache.org

This message was automatically generated.

@jteagles
Contributor

@belugabehr, can you go back and run the performance tests in https://issues.apache.org/jira/browse/TEZ-1526? It will be interesting to see how this performs after removing the performance optimizations.

@belugabehr
Contributor Author

@jteagles I created a small driver with JMH:

Benchmark           Mode  Cnt        Score        Error  Units
TestSplit.master   thrpt   30  5324642.492 ± 228078.761  ops/s
TestSplit.tez4365  thrpt   30  1809324.533 ±  37792.272  ops/s

Quite a bit slower (roughly a third of master's throughput), but still an impressive 1,809,324 strings per second on my dated hardware. Using a regex takes fewer lines of code and makes the parsing more readable. But it's your call. If you'd rather not accept the regex change, please consider taking the unit test updates.
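
For readers following along, the two styles being compared might look roughly like the sketch below. This is not the code from the patch or from master: the class name `DagIdParseSketch`, both method names, and the assumed ID layout `dag_<clusterTimestamp>_<appId>_<dagId>` are illustrative assumptions.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch; names and the ID layout are assumptions, not Tez code.
public class DagIdParseSketch {
    // Assumed layout: dag_<clusterTimestamp>_<appId>_<dagId>
    private static final Pattern DAG_ID =
        Pattern.compile("dag_(\\d+)_(\\d+)_(\\d+)");

    /** Regex-based parse: returns {clusterTimestamp, appId, dagId}. */
    static long[] parseWithRegex(String s) {
        Matcher m = DAG_ID.matcher(s);
        if (!m.matches()) {
            throw new IllegalArgumentException("Invalid DAG ID: " + s);
        }
        return new long[] {
            Long.parseLong(m.group(1)),
            Long.parseLong(m.group(2)),
            Long.parseLong(m.group(3))
        };
    }

    /** Hand-rolled parse: locates the separators with indexOf. */
    static long[] parseManually(String s) {
        if (!s.startsWith("dag_")) {
            throw new IllegalArgumentException("Invalid DAG ID: " + s);
        }
        int p1 = s.indexOf('_', 4);                      // end of timestamp
        int p2 = p1 < 0 ? -1 : s.indexOf('_', p1 + 1);   // end of app id
        if (p2 < 0 || s.indexOf('_', p2 + 1) >= 0) {     // need exactly 3 fields
            throw new IllegalArgumentException("Invalid DAG ID: " + s);
        }
        // NumberFormatException (an IllegalArgumentException) on bad digits.
        return new long[] {
            Long.parseLong(s.substring(4, p1)),
            Long.parseLong(s.substring(p1 + 1, p2)),
            Long.parseLong(s.substring(p2 + 1))
        };
    }
}
```

The manual version avoids creating a `Matcher` per call, which is one plausible source of the throughput gap, but it needs more explicit bounds checks to reject malformed input.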

@jteagles
Copy link
Contributor

This code optimization was critically important, as the event thread spends a significant amount of time parsing task/attempt IDs to dispatch messages. I would hate to lose that. I can appreciate the simplicity of the regex, though. Perhaps the regex can be used to validate the manual parsing, as the manual parsing is more error-prone. And improved testing is welcome.

The original patch inspired a YARN ID parsing improvement that delivered significant gains there as well. I've linked that JIRA in the original issue for reference: https://issues.apache.org/jira/browse/YARN-6768
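
One way to read the suggestion of validating the manual parsing with a regex is a test-only cross-check, sketched below. Everything here is hypothetical: `DagIdCrossCheck`, `fastParse`, `regexParse`, and `agree` are illustrative names, `fastParse` is only a stand-in for the real hand-written parser, and the ID layout `dag_<clusterTimestamp>_<appId>_<dagId>` is an assumption.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical test helper; not part of Tez.
public class DagIdCrossCheck {
    // Assumed layout: dag_<clusterTimestamp>_<appId>_<dagId>
    private static final Pattern DAG_ID =
        Pattern.compile("dag_(\\d+)_(\\d+)_(\\d+)");

    /** Stand-in for the fast manual parser; returns null if malformed. */
    static long[] fastParse(String s) {
        if (!s.startsWith("dag_")) return null;
        int p1 = s.indexOf('_', 4);
        int p2 = p1 < 0 ? -1 : s.indexOf('_', p1 + 1);
        if (p2 < 0 || s.indexOf('_', p2 + 1) >= 0) return null;
        try {
            return new long[] {
                Long.parseLong(s.substring(4, p1)),
                Long.parseLong(s.substring(p1 + 1, p2)),
                Long.parseLong(s.substring(p2 + 1))
            };
        } catch (NumberFormatException e) {
            return null;
        }
    }

    /** Strict reference parser built on the regex; null if malformed. */
    static long[] regexParse(String s) {
        Matcher m = DAG_ID.matcher(s);
        if (!m.matches()) return null;
        try {
            return new long[] {
                Long.parseLong(m.group(1)),
                Long.parseLong(m.group(2)),
                Long.parseLong(m.group(3))
            };
        } catch (NumberFormatException e) {
            return null; // digits only, but too large for a long
        }
    }

    /** True when both parsers accept or reject s with identical fields. */
    static boolean agree(String s) {
        return Arrays.equals(fastParse(s), regexParse(s));
    }
}
```

Running `agree` over a corpus of valid and malformed strings keeps the regex off the hot path while still catching divergences. In this sketch, for instance, `dag_+1_2_3` is accepted by the stand-in manual parser (`Long.parseLong` allows a leading sign) but rejected by the regex, so `agree` returns false for it.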
