ZJIT: lobsters perf burndown #833

@tekknolagi

This issue maintains a list of TODOs for speeding up the lobsters benchmark.

In Progress

Exits

Action items

Things that are already actionable:

Inlining

Fallbacks

Exits

Backlog

Fallbacks

  • send_without_block_polymorphic: We need to add proper polymorphic call support to HIR.
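
For readers unfamiliar with the term, here is a minimal sketch of a polymorphic call site, assuming the fallback name refers to a blockless send whose profile recorded more than one receiver class. The classes and the `title_of` helper are hypothetical and not taken from lobsters.

```ruby
# Hypothetical classes; the names are illustrative only.
class Story
  def title
    "story"
  end
end

class Comment
  def title
    "comment"
  end
end

# The single `item.title` call site below receives both Story and
# Comment instances, so a per-call-site profile would record two
# receiver classes rather than one.
def title_of(item)
  item.title
end

titles = [Story.new, Comment.new].map { |item| title_of(item) }
```

A call site like this cannot be specialized to a single method with one class guard, which is presumably why it needs dedicated support.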

Exits

  • unhandled_hir_insn invokebuiltin: We need to update the backend to support CCall with 6+ args.
  • unhandled_hir_insn throw: Because we use call/ret, we need to implement it differently from YJIT.
  • compile_error exception_handler: Similarly, because of call/ret, this needs to be implemented differently from YJIT.
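
As a rough illustration of where the `throw` instruction comes from, the sketch below compiles a snippet with CRuby's `RubyVM::InstructionSequence` and looks for `throw` in the disassembly. It assumes that `break` inside a block compiles to a YARV `throw`, which matches CRuby versions I am aware of; the snippet itself is made up for this example.

```ruby
# Assumption: on CRuby, `break` inside a block is emitted as the YARV
# `throw` instruction, because it has to unwind out of the block frame.
src = "[1, 2, 3].each { |x| break x * 10 if x == 2 }"

iseq = RubyVM::InstructionSequence.compile(src)

# The disassembly includes nested ISEQs, so the block body is covered.
has_throw = iseq.disasm.include?("throw")

# Evaluating the snippet returns the value passed to `break`.
result = iseq.eval
```

Printing `iseq.disasm` shows the surrounding instructions if you want to inspect the exact sequence.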

ZJIT stats

As of 2025-11-21:

***ZJIT: Printing ZJIT statistics on exit***
Top-20 not inlined C methods (55.8% of total 16,518,363):
                                                 Hash#[]=: 1,520,741 ( 9.2%)
                                               Hash#fetch: 1,206,943 ( 7.3%)
                                            Regexp#match?:   805,012 ( 4.9%)
                                                Hash#key?:   709,650 ( 4.3%)
                                           Array#include?:   499,988 ( 3.0%)
                                              String#sub!:   482,034 ( 2.9%)
                                               Kernel#dup:   430,760 ( 2.6%)
                                                String#<<:   396,153 ( 2.4%)
                                       String#start_with?:   382,228 ( 2.3%)
                               ObjectSpace::WeakKeyMap#[]:   356,672 ( 2.2%)
                                              Hash#delete:   326,510 ( 2.0%)
                                               String.new:   307,767 ( 1.9%)
                                             Set#include?:   264,322 ( 1.6%)
                                             Kernel#is_a?:   249,202 ( 1.5%)
                                    Process.clock_gettime:   228,154 ( 1.4%)
                                            String#match?:   226,352 ( 1.4%)
                                          String#downcase:   215,980 ( 1.3%)
                                              Integer#<=>:   203,974 ( 1.2%)
                                            Range#member?:   203,069 ( 1.2%)
                                          String#include?:   193,850 ( 1.2%)
Top-20 calls to C functions from JIT code (81.4% of total 140,906,871):
                             rb_vm_opt_send_without_block: 29,537,173 (21.0%)
                                rb_vm_setinstancevariable: 14,351,396 (10.2%)
                                             rb_hash_aref: 10,683,158 ( 7.6%)
                                rb_vm_getinstancevariable: 10,188,493 ( 7.2%)
                                               rb_vm_send:  9,802,031 ( 7.0%)
                                          rb_vm_env_write:  8,262,495 ( 5.9%)
                                        rb_obj_is_kind_of:  5,862,031 ( 4.2%)
                                        rb_vm_invokesuper:  4,674,941 ( 3.3%)
                                              rb_ivar_get:  3,855,672 ( 2.7%)
                                             rb_ary_entry:  3,488,651 ( 2.5%)
                               rb_vm_opt_getconstant_path:  2,136,566 ( 1.5%)
                                        rb_vm_invokeblock:  1,670,497 ( 1.2%)
                                                 Hash#[]=:  1,520,741 ( 1.1%)
                                              rb_ary_push:  1,520,109 ( 1.1%)
                                        rb_str_buf_append:  1,392,897 ( 1.0%)
                                          rb_ary_new_capa:  1,355,402 ( 1.0%)
                               rb_class_allocate_instance:  1,217,642 ( 0.9%)
                                               Hash#fetch:  1,206,943 ( 0.9%)
                                    rb_hash_new_with_size:    988,775 ( 0.7%)
                                                    _bi20:    985,762 ( 0.7%)
Top-2 not optimized method types for send (100.0% of total 3,769,708):
  iseq: 3,766,886 (99.9%)
  null:     2,822 ( 0.1%)
Top-4 not optimized method types for send_without_block (100.0% of total 1,007,921):
        optimized_send: 531,615 (52.7%)
        optimized_call: 461,022 (45.7%)
                  null:  10,984 ( 1.1%)
  optimized_block_call:   4,300 ( 0.4%)
Top-4 instructions with uncategorized fallback reason (100.0% of total 7,194,475):
             invokesuper: 4,674,941 (65.0%)
             invokeblock: 1,670,497 (23.2%)
             sendforward:   784,074 (10.9%)
  opt_send_without_block:    64,963 ( 0.9%)
Top-16 send fallback reasons (100.0% of total 46,468,716):
                          send_without_block_polymorphic: 19,727,639 (42.5%)
                                           uncategorized:  7,194,475 (15.5%)
                          send_without_block_no_profiles:  4,847,735 (10.4%)
                          send_not_optimized_method_type:  3,769,708 ( 8.1%)
                                        send_no_profiles:  3,328,774 ( 7.2%)
                            one_or_more_complex_arg_pass:  3,106,048 ( 6.7%)
                                     send_cfunc_variadic:  2,184,442 ( 4.7%)
  send_without_block_not_optimized_method_type_optimized:    996,937 ( 2.1%)
                          send_without_block_megamorphic:    548,702 ( 1.2%)
                                        send_polymorphic:    515,429 ( 1.1%)
                                   too_many_args_for_lir:    172,171 ( 0.4%)
                 send_without_block_cfunc_array_variadic:     35,517 ( 0.1%)
                                obj_to_string_not_string:     25,659 ( 0.1%)
            send_without_block_not_optimized_method_type:     10,984 ( 0.0%)
                                        send_megamorphic:      2,875 ( 0.0%)
                          ccall_with_frame_too_many_args:      1,621 ( 0.0%)
Top-6 invokeblock handler (100.0% of total 1,670,497):
        polymorphic: 826,917 (49.5%)
   monomorphic_iseq: 722,573 (43.3%)
  monomorphic_other:  57,304 ( 3.4%)
  monomorphic_ifunc:  55,505 ( 3.3%)
        megamorphic:   4,269 ( 0.3%)
        no_profiles:   3,929 ( 0.2%)
Top-9 popular complex argument-parameter features not optimized (100.0% of total 3,371,359):
       caller_kwarg: 838,631 (24.9%)
           param_kw: 758,394 (22.5%)
  param_forwardable: 666,977 (19.8%)
        param_block: 652,002 (19.3%)
         param_rest: 285,193 ( 8.5%)
       param_kwrest: 122,536 ( 3.6%)
       caller_splat:  46,226 ( 1.4%)
    caller_blockarg:     803 ( 0.0%)
    caller_kw_splat:     597 ( 0.0%)
Top-1 compile error reasons (100.0% of total 249,772):
  exception_handler: 249,772 (100.0%)
Top-7 unhandled YARV insns (100.0% of total 188,890):
       getblockparam: 102,072 (54.0%)
  invokesuperforward:  81,668 (43.2%)
       setblockparam:   2,837 ( 1.5%)
         getconstant:   1,538 ( 0.8%)
         expandarray:     360 ( 0.2%)
          checkmatch:     298 ( 0.2%)
                once:     117 ( 0.1%)
Top-4 unhandled HIR insns (100.0% of total 301,570):
          throw: 258,507 (85.7%)
  invokebuiltin:  35,373 (11.7%)
     fixnum_div:   4,971 ( 1.6%)
      array_max:   2,719 ( 0.9%)
Top-18 side exit reasons (100.0% of total 14,154,325):
                   guard_type_failure: 7,148,808 (50.5%)
                  guard_shape_failure: 4,156,516 (29.4%)
  block_param_proxy_not_iseq_or_ifunc: 1,229,391 ( 8.7%)
     patchpoint_stable_constant_names:   351,319 ( 2.5%)
                   unhandled_hir_insn:   301,570 ( 2.1%)
                        compile_error:   249,772 ( 1.8%)
        patchpoint_no_singleton_class:   231,816 ( 1.6%)
          patchpoint_method_redefined:   219,034 ( 1.5%)
                  unhandled_yarv_insn:   188,890 ( 1.3%)
           block_param_proxy_modified:    29,122 ( 0.2%)
         unhandled_newarray_send_pack:    14,481 ( 0.1%)
                 fixnum_mult_overflow:    10,866 ( 0.1%)
               fixnum_lshift_overflow:    10,085 ( 0.1%)
              patchpoint_no_ep_escape:     7,820 ( 0.1%)
             guard_bit_equals_failure:     4,533 ( 0.0%)
               obj_to_string_fallback:       173 ( 0.0%)
                            interrupt:       107 ( 0.0%)
               guard_type_not_failure:        22 ( 0.0%)
                             send_count: 150,524,302
                     dynamic_send_count:  46,468,716 (30.9%)
                   optimized_send_count: 104,055,586 (69.1%)
              iseq_optimized_send_count:  38,193,646 (25.4%)
      inline_cfunc_optimized_send_count:  42,890,355 (28.5%)
       inline_iseq_optimized_send_count:   3,818,113 ( 2.5%)
non_variadic_cfunc_optimized_send_count:  13,127,033 ( 8.7%)
    variadic_cfunc_optimized_send_count:   6,026,439 ( 4.0%)
dynamic_getivar_count:                       14,044,165
dynamic_setivar_count:                       14,627,818
compiled_iseq_count:                              5,155
failed_iseq_count:                                    0
compile_time:                                  14,453ms
profile_time:                                      66ms
gc_time:                                           64ms
invalidation_time:                                356ms
vm_write_pc_count:                          142,163,889
vm_write_sp_count:                          193,484,568
vm_write_locals_count:                      136,619,361
vm_write_stack_count:                       136,619,361
vm_write_to_parent_iseq_local_count:            544,020
vm_read_from_parent_iseq_local_count:        15,095,338
guard_type_count:                           148,590,649
guard_type_exit_ratio:                             4.8%
guard_shape_count:                           44,301,439
guard_shape_exit_ratio:                            9.4%
code_region_bytes:                           37,421,056
zjit_alloc_bytes:                            19,150,634
total_mem_bytes:                             56,571,690
side_exit_count:                             14,154,325
total_insn_count:                           924,811,964
vm_insn_count:                              152,438,526
zjit_insn_count:                            772,373,438
ratio_in_zjit:                                    83.5%
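
Several of the derived figures above can be re-derived from the raw counters. The sketch below does that arithmetic, assuming the exit ratios are the matching side-exit count divided by the guard count; that reading is an assumption, but it is consistent with the printed values. The `pct` helper is hypothetical.

```ruby
# Counter values copied from the stats dump above.
stats = {
  total_insn_count:    924_811_964,
  vm_insn_count:       152_438_526,
  zjit_insn_count:     772_373_438,
  send_count:          150_524_302,
  dynamic_send_count:   46_468_716,
  guard_type_count:    148_590_649,
  guard_type_failure:    7_148_808,
  guard_shape_count:    44_301_439,
  guard_shape_failure:   4_156_516,
  code_region_bytes:    37_421_056,
  zjit_alloc_bytes:     19_150_634,
}

# Hypothetical helper: percentage rounded to one decimal place.
def pct(part, whole)
  (100.0 * part / whole).round(1)
end

ratio_in_zjit     = pct(stats[:zjit_insn_count], stats[:total_insn_count])
dynamic_send_pct  = pct(stats[:dynamic_send_count], stats[:send_count])
type_exit_ratio   = pct(stats[:guard_type_failure], stats[:guard_type_count])
shape_exit_ratio  = pct(stats[:guard_shape_failure], stats[:guard_shape_count])
total_mem_bytes   = stats[:code_region_bytes] + stats[:zjit_alloc_bytes]
insn_sum          = stats[:vm_insn_count] + stats[:zjit_insn_count]
```

Under these assumptions the results are 83.5, 30.9, 4.8, and 9.4 percent, and the two sums equal `total_mem_bytes` and `total_insn_count` as printed.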
