Adopt Prism/YARP as the parser in TruffleRuby #3117

eregon · 2023-06-21T13:10:56Z

This will solve a far amount of syntax incompatibilities such as #3038, and provide us with a parser we can very easily re-import.
And it means new syntax would always be supported in YARP early, which will be very helpful when updating to a new Ruby version.

Currently we are using a fork of the parser in JRuby, but this is very difficult to update (notably due to using different types in the fork), so migrating to YARP will address this maintenance problem.

It should also help by replacing the Ripper C extension which is slow and unreliable with the YARP Ripper backend: #2767

eregon · 2024-01-28T14:52:17Z

Will be done for 24.0.

Prism is used by default since #3406 / 728da83.

eregon · 2024-01-31T16:11:29Z

Some early numbers, on jvm env, running ruby/spec fast specs in --dry-run mode to measure how much time is spent on parsing and translator. This is using --cpusampler which might be somewhat imprecise for this, I'll try a different way as well.

3 runs of each.
With the old parser:

$ jt -q test fast --dry-run -- --experimental-options --cpusampler --cpusampler.Period=1 --metrics-profile-require=summary --prism=false | grep "metrics "
 metrics parsing                                                       ||             1853ms  12.9% ||             1853ms  12.9% || (metrics)~1:0
 metrics translating                                                   ||              543ms   3.8% ||              543ms   3.8% || (metrics)~1:0

 metrics parsing                                                       ||             1897ms  12.9% ||             1897ms  12.9% || (metrics)~1:0
 metrics translating                                                   ||              532ms   3.6% ||              532ms   3.6% || (metrics)~1:0

 metrics parsing                                                      ||             1897ms  13.0% ||             1897ms  13.0% || (metrics)~1:0
 metrics translating                                                  ||              535ms   3.7% ||              535ms   3.7% || (metrics)~1:0

With Prism:

$ jt -q test fast --dry-run -- --experimental-options --cpusampler --cpusampler.Period=1 --metrics-profile-require=summary | grep "metrics "
 metrics parsing                                                      ||             1014ms   7.5% ||             1014ms   7.5% || (metrics)~1:0
 metrics translating                                                  ||              579ms   4.3% ||              579ms   4.3% || (metrics)~1:0

 metrics parsing                                                       ||             1005ms   7.5% ||             1005ms   7.5% || (metrics)~1:0
 metrics translating                                                   ||              566ms   4.2% ||              566ms   4.2% || (metrics)~1:0

 metrics parsing                                                       ||             1067ms   7.9% ||             1067ms   7.9% || (metrics)~1:0
 metrics translating                                                   ||              570ms   4.2% ||              570ms   4.2% || (metrics)~1:0

Parsing seems almost twice as fast! (~1900ms vs ~1000ms)
The translator seems similar, maybe a little bit slower (I wonder if that could be due to parsing integer&double literals in the translator again, but that's a wild guess, would need profiling).

eregon · 2024-01-31T18:46:01Z

On commit 90bfaa0

Parsing the core library is about 3-4 times faster with Prism (because there is no need to warmup the parser) and it saves about 200ms startup on JVM!

$ TRUFFLERUBY_METRICS_REPS=1 jt metrics time --experimental-options --prism=false -e0
1.763 total
    1.005 load-core
     0.330 parsing-core
     0.077 translate-core
$ TRUFFLERUBY_METRICS_REPS=1 jt metrics time --experimental-options -e0
1.533 total
    0.772 load-core
     0.093 parsing-core
     0.098 translate-core

$ TRUFFLERUBY_METRICS_REPS=1 jt -u jvm-ce-libgraal metrics time --experimental-options --prism=false -e0
2.044 total
    1.093 load-core
     0.318 parsing-core
     0.081 translate-core

$ TRUFFLERUBY_METRICS_REPS=1 jt -u jvm-ce-libgraal metrics time --experimental-options -e0
1.847 total
    0.874 load-core
     0.102 parsing-core
     0.094 translate-core

Numbers without libgraal, not so relevant:
$ TRUFFLERUBY_METRICS_REPS=1 jt -u jvm-ce metrics time --experimental-options --prism=false -e0
     1.399 parsing-core
     0.259 translate-core
$ TRUFFLERUBY_METRICS_REPS=1 jt -u jvm-ce metrics time --experimental-options -e0 
     0.293 parsing-core
     0.279 translate-core

This no-warmup advantage is very clear when looking at the first files being parsed, 10x faster for the first file and 4x (16x for jvm-ce) faster for the 3rd file (the second file is very small):

jvm-ce-libgraal:
$ TRUFFLERUBY_METRICS_REPS=1 jt -u jvm-ce-libgraal metrics time --metrics-time-parsing-file --experimental-options --prism=false -e0 
    1.185 load-core
     0.102 parsing-pre
     0.012 translate-pre
     0.027 execute-pre
     0.001 parsing-basic_object
     0.001 translate-basic_object
     0.000 execute-basic_object
     0.047 parsing-array
     0.002 translate-array
     0.022 execute-array

$ TRUFFLERUBY_METRICS_REPS=1 jt -u jvm-ce-libgraal metrics time --metrics-time-parsing-file --experimental-options -e0 
    0.997 load-core
     0.011 parsing-pre
     0.018 translate-pre
     0.026 execute-pre
     0.001 parsing-basic_object
     0.000 translate-basic_object
     0.001 execute-basic_object
     0.011 parsing-array
     0.002 translate-array
     0.022 execute-array

---
jvm-ce:
$ TRUFFLERUBY_METRICS_REPS=1 jt -u jvm-ce metrics time --metrics-time-parsing-file --experimental-options --prism=false  -e0 
    2.977 load-core
     0.153 parsing-pre
     0.023 translate-pre
     0.050 execute-pre
     0.003 parsing-basic_object
     0.001 translate-basic_object
     0.003 execute-basic_object
     0.299 parsing-array
     0.028 translate-array
     0.059 execute-array

$ TRUFFLERUBY_METRICS_REPS=1 jt -u jvm-ce metrics time --metrics-time-parsing-file --experimental-options -e0               
    2.657 load-core
     0.015 parsing-pre
     0.028 translate-pre
     0.044 execute-pre
     0.000 parsing-basic_object
     0.001 translate-basic_object
     0.000 execute-basic_object
     0.018 parsing-array
     0.014 translate-array
     0.046 execute-array

eregon · 2024-01-31T20:57:20Z

Here are some measurements for parsing and translating for 1 iteration of yjit-bench railsbench.
Parsing is 2x faster with Prism on both native-ee and jvm-ee-libgraal!
Parsing + translating is 1.5x faster on native-ee and 2x faster on jvm-ee-libgraal with Prism, saving 378ms/1133ms and 1096ms/2122ms respectively!

native-ee:
$ WARMUP_ITRS=0 MIN_BENCH_ITRS=1 MIN_BENCH_TIME=0 TRUFFLERUBY_METRICS_REPS=1 jt -u native-ee metrics time --experimental-options --prism=false --engine.UsePreInitializedContext=false -I./harness benchmarks/railsbench/benchmark.rb
itr #1: 12816ms
25.947 total
   0.974 parsing
   0.159 translate

$ WARMUP_ITRS=0 MIN_BENCH_ITRS=1 MIN_BENCH_TIME=0 TRUFFLERUBY_METRICS_REPS=1 jt -u native-ee metrics time --experimental-options --engine.UsePreInitializedContext=false -I./harness benchmarks/railsbench/benchmark.rb
itr #1: 12666ms
25.242 total
   0.480 parsing
   0.275 translate

jvm-ee-libgraal:
$ WARMUP_ITRS=0 MIN_BENCH_ITRS=1 MIN_BENCH_TIME=0 TRUFFLERUBY_METRICS_REPS=1 jt -u jvm-ee-libgraal metrics time --experimental-options --prism=false -I./harness benchmarks/railsbench/benchmark.rb
itr #1: 17819ms
59.489 total
   1.650 parsing
   0.472 translate

$ WARMUP_ITRS=0 MIN_BENCH_ITRS=1 MIN_BENCH_TIME=0 TRUFFLERUBY_METRICS_REPS=1 jt -u jvm-ee-libgraal metrics time -I./harness benchmarks/railsbench/benchmark.rb
itr #1: 17556ms
58.615 total
   0.784 parsing
   0.554 translate

Translating seems slower than before, so likely something we can optimize further with a Java profiler.

I used this patch to get aggregated parsing+translate timings for all files:

diff --git a/src/main/java/org/truffleruby/parser/TranslatorDriver.java b/src/main/java/org/truffleruby/parser/TranslatorDriver.java
index 6f5f20fa73..ebd0423f4f 100644
--- a/src/main/java/org/truffleruby/parser/TranslatorDriver.java
+++ b/src/main/java/org/truffleruby/parser/TranslatorDriver.java
@@ -479,10 +479,12 @@ public final class TranslatorDriver {
                     name = name.substring(lastSlash + 1, lastDot);
                 }
                 Metrics.printTime(id + "-" + name);
-            } else if (context.getCoreLibrary().isLoadingRubyCore()) {
-                // Only show times for core (the biggest contributor) to avoid multiple metrics with
-                // the same name, which is not supported in mx_truffleruby_benchmark.py.
-                Metrics.printTime(id + "-core");
+            } else {
+                Metrics.printTime(id);
+//            } else if (context.getCoreLibrary().isLoadingRubyCore()) {
+//                // Only show times for core (the biggest contributor) to avoid multiple metrics with
+//                // the same name, which is not supported in mx_truffleruby_benchmark.py.
+//                Metrics.printTime(id + "-core");
             }
         }
     }
diff --git a/src/main/java/org/truffleruby/parser/YARPTranslatorDriver.java b/src/main/java/org/truffleruby/parser/YARPTranslatorDriver.java
index ed1d692756..2977ce2740 100644
--- a/src/main/java/org/truffleruby/parser/YARPTranslatorDriver.java
+++ b/src/main/java/org/truffleruby/parser/YARPTranslatorDriver.java
@@ -612,10 +612,12 @@ public final class YARPTranslatorDriver {
                     name = name.substring(lastSlash + 1, lastDot);
                 }
                 Metrics.printTime(id + "-" + name);
-            } else if (context.getCoreLibrary().isLoadingRubyCore()) {
-                // Only show times for core (the biggest contributor) to avoid multiple metrics with
-                // the same name, which is not supported in mx_truffleruby_benchmark.py.
-                Metrics.printTime(id + "-core");
+            } else {
+                Metrics.printTime(id);
+//            } else if (context.getCoreLibrary().isLoadingRubyCore()) {
+//                // Only show times for core (the biggest contributor) to avoid multiple metrics with
+//                // the same name, which is not supported in mx_truffleruby_benchmark.py.
+//                Metrics.printTime(id + "-core");
             }
         }
     }

eregon added compatibility priority labels Jun 21, 2023

eregon added this to the 23.1.0 Release (September 19, 2023) milestone Jun 21, 2023

eregon assigned eregon and andrykonchin Jun 21, 2023

eregon modified the milestones: 23.1.0 Release (September 19, 2023), 24.0.0 Release (March 19, 2024) Sep 13, 2023

eregon changed the title ~~Adopt YARP as the parser in TruffleRuby~~ Adopt Prism/YARP as the parser in TruffleRuby Nov 27, 2023

This was referenced Nov 27, 2023

Ruby 3.2 support #3039

Closed

Ruby -c does not detect unsupported syntax (false positive) #3330

Closed

Full pattern matching support #3332

Closed

Support endless method definitions #3038

Closed

eregon closed this as completed Jan 28, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Adopt Prism/YARP as the parser in TruffleRuby #3117

Adopt Prism/YARP as the parser in TruffleRuby #3117

eregon commented Jun 21, 2023

eregon commented Jan 28, 2024 •

edited

eregon commented Jan 31, 2024

eregon commented Jan 31, 2024 •

edited

eregon commented Jan 31, 2024 •

edited

Adopt Prism/YARP as the parser in TruffleRuby #3117

Adopt Prism/YARP as the parser in TruffleRuby #3117

Comments

eregon commented Jun 21, 2023

eregon commented Jan 28, 2024 • edited

eregon commented Jan 31, 2024

eregon commented Jan 31, 2024 • edited

eregon commented Jan 31, 2024 • edited

eregon commented Jan 28, 2024 •

edited

eregon commented Jan 31, 2024 •

edited

eregon commented Jan 31, 2024 •

edited