Skip RBS rewriter when file does not contain RBS syntax by dejmedus · Pull Request #917 · Shopify/spoom

dejmedus · 2026-05-12T19:39:33Z

Closes #916 (Skip RBS rewriter when a file doesn't contain any RBS syntax).
Moved from tapioca#2616 (Skip RBS rewrite when no markers are present).

If a file does not contain a typed sigil (e.g. # typed: true) or does not contain RBS syntax, we can skip attempting to translate it. If a file has not been rewritten, we can exclude it from the count of translated files.

amomchilov · 2026-05-12T20:17:48Z

          RB
        end

+        def test_should_rewrite_returns_true_for_supported_typed_sigils


Two more cases to test:

Don't trigger if # typed: true exists later in the file. This is unlikely, but it's a performance improvement to ensure we're not scanning through the whole file.

Trigger if there are multiple magic comments, but typed isn't first:

# frozen_string_literal: true
# typed: true

~~> Don't trigger if # typed: true exists later in the file~~

~~Good call! I updated to only check for the typed marker in the magic comments block~~

Refactored to instead use strictness_in_content and valid_strictness checks from /sorbet/sigils.rb
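The two test cases above can be illustrated with a minimal sketch. `sigil_in_header?` and `SIGIL` are hypothetical names, not Spoom's API (the PR ended up reusing the checks in sorbet/sigils.rb instead), and the sketch assumes the magic-comment block is the leading run of comment and blank lines.

```ruby
# Hypothetical helper: look for a `# typed:` sigil only in the leading
# block of comments, so the rest of the file is never scanned.
SIGIL = /\A#\s*typed:\s*(ignore|false|true|strict|strong)\s*\z/

def sigil_in_header?(source)
  source.each_line do |line|
    stripped = line.strip
    next if stripped.empty?                        # blank lines may separate magic comments
    return false unless stripped.start_with?("#")  # first line of code ends the header
    return true if stripped.match?(SIGIL)
  end
  false
end

after_other_magic_comment = "# frozen_string_literal: true\n# typed: true\n\nclass Foo; end\n"
sigil_after_code = "class Foo; end\n# typed: true\n"

puts sigil_in_header?(after_other_magic_comment) # => true
puts sigil_in_header?(sigil_after_code)          # => false
```

Stopping at the first non-comment line is what keeps the check cheap on large files.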

dejmedus · 2026-05-12T21:33:50Z

+              next if new_contents == contents
+
+              File.write(file, new_contents)
              transformed_count += 1


With this change, only files that were actually changed are counted in the "Translated signatures in <x> files." message, but I'm happy to drop the commit if we want to leave it as is.
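As a rough sketch of the counting behaviour in the diff above: `translate_files` is a hypothetical stand-in for the CLI loop, and the block passed to it is a toy "translation", not the real rewriter.

```ruby
require "tmpdir"

# Hypothetical stand-in for the CLI loop: rewrite each file with the given
# block and count only the files whose contents actually changed.
def translate_files(files)
  transformed_count = 0
  files.each do |file|
    contents = File.read(file)
    new_contents = yield(contents)
    next if new_contents == contents # untouched file: no write, no count

    File.write(file, new_contents)
    transformed_count += 1
  end
  transformed_count
end

Dir.mktmpdir do |dir|
  with_rbs = File.join(dir, "a.rb")
  without_rbs = File.join(dir, "b.rb")
  File.write(with_rbs, "#: -> void\ndef foo; end\n")
  File.write(without_rbs, "def bar; end\n")

  # Toy "translation" that only touches RBS comment lines.
  count = translate_files([with_rbs, without_rbs]) { |src| src.gsub(/^#: .*$/, "sig { void }") }
  puts "Translated signatures in #{count} files."
end
```

Only `a.rb` is rewritten here, so the reported count reflects one file rather than two.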

paracycle

Shouldn't all this logic be encapsulated inside the RBSCommentsToSorbetSigs class, which already stores the ruby_contents as an ivar and wouldn't need to pass it to a should_rewrite? method? The downside is just an extra object allocation only to return the same contents, but I think that's fine.

amomchilov

Few small points, but it's shaping up!

amomchilov · 2026-05-12T23:17:27Z

+  end
 end

+Spoom::Sorbet::Translate::RBSCommentsToSorbetSigs::RBS_ANNOTATION_MARKERS = T.let(T.unsafe(nil), Array)


Let's mark these with private_constant where they're defined, so we don't expose them to the public.
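A minimal sketch of what that looks like. The module and class names are placeholders rather than Spoom's real ones; the two marker strings are taken from this thread.

```ruby
module Translate
  class Rewriter
    # Marker list kept internal to the class.
    MARKERS = ["# @override", "# @overridable"].freeze
    private_constant :MARKERS

    # Code inside the class can still reference the constant unqualified.
    def self.marker?(line)
      MARKERS.include?(line.strip)
    end
  end
end

puts Translate::Rewriter.marker?("# @override") # => true

begin
  Translate::Rewriter::MARKERS
rescue NameError => e
  # Qualified access from outside the class raises NameError.
  puts "#{e.class}: #{e.message}"
end
```

Marking the constant private also keeps it out of the public surface that a generated RBI has to describe.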

amomchilov

LGTM pending the unresolved comments

amomchilov · 2026-05-13T15:03:30Z

+          end
+        end
+
+        def test_contains_rbs_syntax_returns_true_when_typed_sigils_follow_magic_comments


Suggested change

-        def test_contains_rbs_syntax_returns_true_when_typed_sigils_follow_magic_comments
+        def test_contains_rbs_syntax_returns_true_when_typed_sigil_is_after_other_magic_comments

amomchilov · 2026-05-13T15:03:50Z

+          RB
+        end
+
+        def test_contains_rbs_syntax_returns_false_for_unrelated_yard_tags


Morriar · 2026-05-13T17:07:51Z

+          end
+
+          #: (String ruby_contents, file: String, ?max_line_length: Integer?) -> String
+          def rewrite(ruby_contents, file:, max_line_length: nil)


Suggested change

-          def rewrite(ruby_contents, file:, max_line_length: nil)
+          def rewrite_if_needed(ruby_contents, file:, max_line_length: nil)

or something like this

Morriar · 2026-05-13T17:12:08Z


        def test_translate_to_rbi_method_sigs
          contents = <<~RB
+            # typed: true


Can we instead call the underlying new.rewrite instead so we don't have to rewrite all the tests?

~~Can do:) Updated (I just left the few in the cli sigs_test.rb file)~~

Sorry, scratch that. The newest commit moves the RBS check logic into new.rewrite, so the tests would again need a sigil (unless we wanted to drop the sigil check and just look for RBS?). I could probably make a test helper to insert sigils into contents, but I wonder if that would be more confusing. What do you think?

paracycle · 2026-05-13T19:38:06Z

+        class << self
+          #: (String source) -> bool
+          def contains_rbs_syntax?(source)
+            Sigils.contains_valid_sigil?(source) && source.match?(RBS_REWRITE_PATTERN)
+          end
+
+          #: (String ruby_contents, file: String, ?max_line_length: Integer?) -> String
+          def rewrite_if_needed(ruby_contents, file:, max_line_length: nil)
+            return ruby_contents unless contains_rbs_syntax?(ruby_contents)
+
+            new(ruby_contents, file:, max_line_length:).rewrite
+          end
+        end
+


I am not opinionated about this but I would have preferred:

Suggested change

-        class << self
-          #: (String source) -> bool
-          def contains_rbs_syntax?(source)
-            Sigils.contains_valid_sigil?(source) && source.match?(RBS_REWRITE_PATTERN)
-          end
-
-          #: (String ruby_contents, file: String, ?max_line_length: Integer?) -> String
-          def rewrite_if_needed(ruby_contents, file:, max_line_length: nil)
-            return ruby_contents unless contains_rbs_syntax?(ruby_contents)
-
-            new(ruby_contents, file:, max_line_length:).rewrite
-          end
-        end
+        #: () -> bool
+        def contains_rbs_syntax?
+          Sigils.contains_valid_sigil?(@ruby_contents) && @ruby_contents.match?(RBS_REWRITE_PATTERN)
+        end
+
+        # @override
+        #: () -> String
+        def rewrite
+          return @ruby_contents unless contains_rbs_syntax?
+
+          super
+        end

In summary, I think we are unnecessarily pulling logic into class methods and passing values around when the value we are interested in is being passed into the constructor of our class, and no-one cares what exactly rewrite does internally (i.e. in this case it decides to return the original buffer intact).

I notice that we eagerly do parsing in the initialize of Translator, but we could also short-circuit the call to super in this class's initialize method by doing the sigil and rewrite checks in the initializer.

I think I was thinking that I needed to exit early before we initialized to prevent hitting parse_ruby and that rewrite needed that path to have run already, but I think this is clicking for me now, thanks very much for the examples!

@dejmedus No problem. I am glad to hear it is helpful.

As for initialize, there is nothing magic about that method, and returning early doesn't change behaviour in any way different than in other methods. Even an empty initialize method still initializes the object, so you can add any logic in initialize that you want.

amomchilov · 2026-05-13T20:05:49Z

+          "# @override",
+          "# @overridable",


Random thought I had: does factoring out the common prefixes produce a faster regular expression? E.g. overrid(?:e|able) instead of checking for the two whole words separately.

In short, yes, but barely. Onigmo doesn't optimize out common prefixes in the NFA it creates, but it doesn't make a real world difference here.

Got Claude to benchmark it for me:

    require "benchmark"

    # loading corpus... 101996 files, 616MB, loaded in 15.09s
    # sanity-checking match equivalence... 19104 total matches across corpus
    #
    #
    #                user     system      total        real
    # flat       0.536375   0.006953   0.543328 (  0.552176)
    # factored   0.530740   0.005332   0.536072 (  0.544158)
    #
    # factored is 1.01x faster than flat (real time)

    MARKERS = [
      "# @abstract",
      "# @interface",
      "# @sealed",
      "# @final",
      "# @requires_ancestor:",
      "# @override",
      "# @overridable",
      "# @without_runtime",
    ].freeze

    FLAT = Regexp.union(*MARKERS)
    FACTORED = /\# @(?:abstract|interface|sealed|final|requires_ancestor:|overrid(?:e|able)|without_runtime)/

    raise "regexes diverge on hits" unless MARKERS.all? { |m| FLAT =~ m && FACTORED =~ m }
    raise "regexes diverge on misses" if FLAT =~ "# @other" || FACTORED =~ "# @other"

    CORPUS_GLOB = "/your/test/corpus/**/*.rb"

    print "loading corpus... "
    load_start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    CORPUS = Dir.glob(CORPUS_GLOB).map { |path| File.read(path) rescue nil }.compact.freeze
    load_elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - load_start
    total_bytes = CORPUS.sum(&:bytesize)
    puts "#{CORPUS.size} files, #{total_bytes / 1_000_000}MB, loaded in #{load_elapsed.round(2)}s"

    print "sanity-checking match equivalence... "
    flat_hits = CORPUS.sum { |s| s.scan(FLAT).size }
    factored_hits = CORPUS.sum { |s| s.scan(FACTORED).size }
    raise "regexes diverge: flat=#{flat_hits} factored=#{factored_hits}" unless flat_hits == factored_hits
    puts "#{flat_hits} total matches across corpus"

    flat_time = Benchmark.measure("flat") { CORPUS.each { |s| s.scan(FLAT) } }
    factored_time = Benchmark.measure("factored") { CORPUS.each { |s| s.scan(FACTORED) } }

    slower, faster = [flat_time, factored_time].sort_by(&:real).reverse
    ratio = slower.real / faster.real
    faster_name = faster.equal?(flat_time) ? "flat" : "factored"
    slower_name = slower.equal?(flat_time) ? "flat" : "factored"

    puts <<~RESULTS

      #{Benchmark::CAPTION}
      flat     #{flat_time.to_s.strip}
      factored #{factored_time.to_s.strip}

      #{faster_name} is #{ratio.round(2)}x faster than #{slower_name} (real time)
    RESULTS

factored is 1.01x faster than flat

Interesting!

Morriar

design is wrong, see comment

Morriar · 2026-05-14T16:23:15Z

-          super(ruby_contents, file: file)
+          @ruby_contents = ruby_contents
+          if contains_rbs_syntax?
+            super(ruby_contents, file: file)


So if contains_rbs_syntax? returns false we still initialize the object but do not call super. This means every instance method called after that is responsible for knowing which state the instance is in. It's bad design.

Why do we even instantiate the Translator if we don't need to rewrite? This check should be made earlier.

Let's extract this to a maybe_rewrite singleton method.

If I'm understanding correctly, I believe this is what we originally had here but it was discussed that we could prevent needing to pass around values if the logic was moved inside

The conditional call to super is a code smell, the fact that we can't test translation without the sigils is another one.

The main problems:

Fragile: forgetting which condition triggers super leads to half-initialized objects. Future maintainers have to reason about two different initialization paths.

Violates Liskov: subclass instances behave differently depending on whether the parent was initialized, which breaks substitutability.

Hard to test: you need to cover both branches, and bugs in the "no super" path tend to surface late as nil errors.

Instead of short-circuiting the super call we should just short-circuit the class instantiation altogether since we don't need it.
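The shape being asked for can be sketched with toy classes. `Translator`, `RBSRewriter`, `RBS_PATTERN`, and the fake "translation" are all stand-ins assumed for illustration, not the real Spoom implementation; the point is only where the check lives.

```ruby
# Toy stand-ins for the real classes; names and the RBS pattern are assumptions.
class Translator
  def initialize(ruby_contents)
    @ruby_contents = ruby_contents
    @lines = ruby_contents.lines # stands in for the eager parse
  end
end

class RBSRewriter < Translator
  RBS_PATTERN = /^\s*#: /

  class << self
    # Decide before instantiating anything; callers get the input back untouched.
    def rewrite_if_needed(ruby_contents)
      return ruby_contents unless ruby_contents.match?(RBS_PATTERN)

      new(ruby_contents).rewrite
    end
  end

  # Always runs on a fully initialized instance, so there is only one state to reason about.
  def rewrite
    @lines.map { |l| l.match?(RBS_PATTERN) ? "# (translated) #{l.strip}\n" : l }.join
  end
end

plain = "def foo; end\n"
puts RBSRewriter.rewrite_if_needed(plain).equal?(plain) # => true (same object, no instance created)
puts RBSRewriter.rewrite_if_needed("#: -> void\ndef foo; end\n")
```

With this split, tests can call `RBSRewriter.new(contents).rewrite` directly and exercise translation without going through the guard.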

Ah, I understand, thank you. I dropped the commit, and we are now exiting before initialization and calling new.rewrite in tests.

Morriar

Thanks! Sorry for the back and forth about the design.

amomchilov

shipit

dejmedus force-pushed the jb-rbs-marker-gaurd branch from 82e90b7 to 07a9394 Compare May 12, 2026 19:51

amomchilov reviewed May 12, 2026

dejmedus force-pushed the jb-rbs-marker-gaurd branch from 07a9394 to 47118ab Compare May 12, 2026 20:57

github-advanced-security AI found potential problems May 12, 2026

Comment thread lib/spoom/sorbet/translate/rbs_comments_to_sorbet_sigs.rb Fixed

Comment thread lib/spoom/sorbet/translate/rbs_comments_to_sorbet_sigs.rb Fixed

dejmedus force-pushed the jb-rbs-marker-gaurd branch 2 times, most recently from cf00c3f to d3125ac Compare May 12, 2026 21:23

dejmedus commented May 12, 2026

dejmedus force-pushed the jb-rbs-marker-gaurd branch from d3125ac to f25aa67 Compare May 12, 2026 22:26

dejmedus marked this pull request as ready for review May 12, 2026 22:46

dejmedus requested a review from a team as a code owner May 12, 2026 22:46

dejmedus requested a review from amomchilov May 12, 2026 22:47

paracycle reviewed May 12, 2026

amomchilov reviewed May 12, 2026

Comment thread lib/spoom/sorbet/translate/rbs_comments_to_sorbet_sigs.rb

amomchilov reviewed May 12, 2026

dejmedus force-pushed the jb-rbs-marker-gaurd branch from f25aa67 to 8244d5a Compare May 13, 2026 02:19

dejmedus requested review from amomchilov and paracycle May 13, 2026 02:58

amomchilov approved these changes May 13, 2026

Morriar reviewed May 13, 2026

dejmedus and others added 3 commits May 13, 2026 12:56

Skip RBS rewrite when file contains no RBS syntax

b3862c1

When a file is not typed or contains no RBS comment syntax, we can skip running the RBS rewriter on it Co-authored-by: Matt Kubej <matt.kubej@shopify.com>

Add typed: true to test example files

ca8a1fd

Don't count unchanged files in translated file number

5d5eb6a

dejmedus force-pushed the jb-rbs-marker-gaurd branch from 8244d5a to 5d5eb6a Compare May 13, 2026 19:07

dejmedus requested a review from Morriar May 13, 2026 19:17

paracycle reviewed May 13, 2026

amomchilov reviewed May 13, 2026

paracycle reviewed May 13, 2026

Comment thread lib/spoom/sorbet/translate/rbs_comments_to_sorbet_sigs.rb Outdated

dejmedus force-pushed the jb-rbs-marker-gaurd branch from 9a52a1f to 0b8d325 Compare May 13, 2026 21:25

paracycle approved these changes May 13, 2026

Morriar requested changes May 14, 2026

dejmedus force-pushed the jb-rbs-marker-gaurd branch from 0b8d325 to 5d5eb6a Compare May 14, 2026 18:18

dejmedus requested a review from Morriar May 14, 2026 18:27

Morriar approved these changes May 14, 2026

amomchilov approved these changes May 14, 2026

dejmedus merged commit 21e8e9b into main May 14, 2026
23 checks passed

dejmedus deleted the jb-rbs-marker-gaurd branch May 14, 2026 19:50
