[bot] Fix Lint/MixedCaseRange#134

Merged
6[bot] merged 3 commits into `main` from `fix/lint-mixed_case_range-23409151474`
Mar 22, 2026

Conversation

6[bot] commented on Mar 22, 2026

Automated fix for Lint/MixedCaseRange corpus conformance.

Details

Cop: Lint/MixedCaseRange
Backend: codex
Mode: fix
Turns: 1
Run: #60

Result

Updated [mixed_case_range.rs](/home/runner/work/nitrocop/nitrocop/src/cop/lint/mixed_case_range.rs) to match RuboCop’s range behavior for single-character bounds: a range is now unsafe whenever the two bounds fall into different ASCII letter buckets, including letter vs digit/symbol cases like `('0'..'z')`, `(' '..'z')`, `('['..'z')`, and `("\x21".."\x5A")`. I also added the requested `///` note documenting the FN root cause and fix.

Updated [offense.rb](/home/runner/work/nitrocop/nitrocop/tests/fixtures/cops/lint/mixed_case_range/offense.rb) with the four missed corpus patterns and one additional verified RuboCop offense on the same `PRINTABLE` line: `(':'..'Z')`. `cargo test --lib -- cop::lint::mixed_case_range` now passes. Commit: `3fb02ed9` (`Fix Lint/MixedCaseRange non-letter bounds`).
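
The bucket comparison described above can be sketched roughly as follows. This is a minimal illustration, not the committed patch: the names `Bucket`, `bucket`, `is_unsafe_range_object`, and `is_unsafe_regexp_range` are hypothetical, and the split between range objects and regexp classes is an assumption based on the RuboCop source quoted in the task prompt, where `regexp_range` returns nil unless both bounds are letters.

```rust
/// Letter bucket of a single character. Non-letters have no bucket,
/// mirroring RuboCop's `range_for`, which returns nil for them.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Bucket {
    Lower,
    Upper,
}

/// Hypothetical helper: classify one bound of a range.
fn bucket(c: char) -> Option<Bucket> {
    if c.is_ascii_lowercase() {
        Some(Bucket::Lower)
    } else if c.is_ascii_uppercase() {
        Some(Bucket::Upper)
    } else {
        None // digits, symbols, whitespace, non-ASCII
    }
}

/// Range objects such as `('0'..'z')`: unsafe whenever the two buckets
/// differ, which includes letter vs. non-letter bounds.
fn is_unsafe_range_object(start: char, end: char) -> bool {
    bucket(start) != bucket(end)
}

/// Regexp character classes: assumed to stay letter-to-letter only,
/// because RuboCop skips the offense when no replacement can be built.
fn is_unsafe_regexp_range(start: char, end: char) -> bool {
    matches!((bucket(start), bucket(end)), (Some(a), Some(b)) if a != b)
}
```

Under this sketch, `('!'..'9')` and `('{'..'~')` stay clean because both bounds have no bucket, while `(':'..'Z')` and `('['..'z')` are reported.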

Changes

2 files changed

Agent conversation excerpt

I’m checking the existing cop and fixtures first, then I’ll add the missing FN cases to the fixture and verify the test fails before touching detection logic.

exec_command: sed -n '1,240p' src/cop/lint/mixed_case_range.rs

exec_command: sed -n '1,220p' tests/fixtures/cops/lint/mixed_case_range/offense.rb

exec_command: sed -n '1,220p' tests/fixtures/cops/lint/mixed_case_range/no_offense.rb

exec_command: git status --short

The fixture already has the four FN snippets in the worktree, so I’m treating that as existing user-side state and checking how the current cop behaves before I edit anything. I’m also verifying RuboCop on the relevant range shapes so the fix matches ground truth, not just the corpus examples.

exec_command: printf "%s " "POTENTIAL_BYTES = (' '..'z').to_a" > /tmp/mixed_case_range_fn1.rb && rubocop --only Lint/MixedCaseRange /tmp/mixed_case_range_fn1.rb

exec_command: printf "%s " "PRINTABLE = ('['..'z').to_a" > /tmp/mixed_case_range_fn2.rb && rubocop --only Lint/MixedCaseRange /tmp/mixed_case_range_fn2.rb

exec_command: printf "%s " "chars = (\"\\x21\"..\"\\x5A\").to_a" > /tmp/mixed_case_range_fn3.rb && rubocop --only Lint/MixedCaseRange /tmp/mixed_case_range_fn3.rb

exec_command: printf "%s " "CHARS = ('0'..'z').to_a" > /tmp/mixed_case_range_fn4.rb && rubocop --only Lint/MixedCaseRange /tmp/mixed_case_range_fn4.rb

exec_command: cargo test --lib -- cop::lint::mixed_case_range

I’ve confirmed the new cases are in the fixture and I’m waiting on two things in parallel: RuboCop’s verdict on each missed pattern, and the targeted Rust test to see whether failure is from detection or just bad annotations.

write_stdin: ``

write_stdin: ``

write_stdin: ``

write_stdin: ``

write_stdin: ``

RuboCop is consistent with the corpus: it flags all four missing range shapes, including non-letter-to-letter bounds like '0'..'z' and '['..'z'. The current test run is still finishing, but the detection gap already looks clear: our range logic only treats opposite-case letters as unsafe and misses “symbol/digit to letter” bounds that RuboCop treats as unsafe too.
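
One way to see why these non-letter-to-letter bounds are risky is to list the non-alphanumeric characters each range sweeps in. The helper below is a hypothetical illustration, not part of nitrocop or RuboCop.

```rust
/// Hypothetical helper: collect every non-alphanumeric character covered
/// by the inclusive range `start..=end`.
fn non_alphanumeric_in_range(start: char, end: char) -> String {
    (start..=end)
        .filter(|c| !c.is_ascii_alphanumeric())
        .collect()
}
```

For `'0'..'z'` this returns thirteen symbols: the seven between `9` and `A`, and the six between `Z` and `a`. For `'A'..'Z'` it returns an empty string.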

write_stdin: ``

I have enough context to make the fix narrowly.

Plan:

  1. Correct the new offense.rb annotations so the fixture reflects RuboCop’s actual offense spans, then rerun the targeted test to lock in a failing baseline.
  2. Update src/cop/lint/mixed_case_range.rs so range objects follow RuboCop’s range_for(start) != range_for(end) behavior, not just “upper vs lower letter” comparisons.
  3. Add the required /// note on the cop struct, rerun cargo test --lib -- cop::lint::mixed_case_range, then commit only the cop file and its fixture.

write_stdin: ``

write_stdin: ``

write_stdin: ``

The baseline failure is clean: expected 10 offenses, actual 6. That confirms a real detection bug rather than config noise. I’m reading the fixture parser next so I can correct the new marker spans precisely once the cop reports the missing ranges.

exec_command: sed -n '320,430p' src/testutil.rs

exec_command: ruby -e 'line=%q{PRINTABLE = ("!".."9").to_a + (":".."Z").to_a + ("[".."z").to_a + ("{".."~").to_a}; puts line.index(%q{("[".."z")})'

exec_command: ruby -e 'line=%q{POTENTIAL_BYTES = (" ".."z").to_a}; puts line.index(%q{" ".."z"})'

I’m measuring the exact start columns on the real fixture lines now, then I’ll patch the fixture and the cop together so the next test run is meaningful instead of bouncing on annotation mismatches.

exec_command: ruby -e 'line = %q|PRINTABLE = ("!".."9").to_a + (":".."Z").to_a + (\'['..\'z\').to_a + (\'{\'..\'~\').to_a|; puts line.index(%q|\'['..\'z\'|)'

exec_command: ruby -e 'line = %q|chars = ("\x21".."\x5A").to_a|; puts line.index(%q|"\x21".."\x5A"|)'

exec_command: ruby -e 'line = %q|CHARS = (\'0\'..\'z\').to_a|; puts line.index(%q|\'0\'..\'z\'|)'

exec_command: ruby -e "line = %q|PRINTABLE = (\"!\"..\"9\").to_a + (':'..'Z').to_a + ('['..'z').to_a + ('{'..'~').to_a|; puts line.index(%q|'['..'z'|)"

exec_command: ruby -e "line = %q|CHARS = ('0'..'z').to_a|; puts line.index(%q|'0'..'z'|)"

exec_command: ruby -e "line = %q|POTENTIAL_BYTES = (' '..'z').to_a|; puts line.index(%q|' '..'z'|)"

I’m editing the cop now. The code change is minimal: make “unsafe” mean the two single-character bounds belong to different ASCII letter buckets, including None for digits/symbols, which is what RuboCop does. I’m also fixing the marker columns in offense.rb so the fixture matches the real spans.

apply_patch

apply_patch

The patch is in. I’m rerunning the targeted cop test now; if it still fails, it should be down to exact marker columns or an edge case in the updated range comparison, not the original FN.

exec_command: cargo test --lib -- cop::lint::mixed_case_range

write_stdin: ``

write_stdin: ``

The targeted test is in the verification phase now. If it comes back green, I’ll inspect the exact diff and make the requested single-cop commit; if not, I’ll use the failure to correct only the remaining marker alignment.

write_stdin: ``

write_stdin: ``

write_stdin:

Task prompt (6732 tokens)

Fix Lint/MixedCaseRange — 0 FP, 4 FN

Instructions

You are fixing ONE cop in nitrocop, a Rust Ruby linter that uses Prism for parsing.

Current state: 68 matches, 0 false positives, 4 false negatives.
Focus on: FN (RuboCop flags code nitrocop misses).

Workflow

  1. Read the Pre-diagnostic Results and Corpus FP/FN Examples sections below first
  2. Verify with RuboCop first (for FP fixes): before writing any code, confirm RuboCop's
    behavior on BOTH the specific FP case AND the general pattern:
    echo '<specific FP case>' > /tmp/test.rb && rubocop --only Lint/MixedCaseRange /tmp/test.rb
    echo '<general pattern>' > /tmp/test.rb && rubocop --only Lint/MixedCaseRange /tmp/test.rb
    If RuboCop flags the general pattern, your fix must be narrow enough to not suppress it.
  3. Add a test case FIRST:
    • FN fix: add the missed pattern to tests/fixtures/cops/lint/mixed_case_range/offense.rb with ^ annotation
    • FP fix: add the false-positive pattern to tests/fixtures/cops/lint/mixed_case_range/no_offense.rb
  4. Verify test fails: cargo test --lib -- cop::lint::mixed_case_range
  5. Fix src/cop/lint/mixed_case_range.rs
  6. Verify test passes: cargo test --lib -- cop::lint::mixed_case_range
  7. Add a /// doc comment on the cop struct documenting what you found and fixed
  8. Commit only your cop's files

Fixture Format

Mark offenses with ^ markers on the line AFTER the offending source line:

x = 1
     ^^ Lint/MixedCaseRange: Trailing whitespace detected.

The ^ characters must align with the offending columns. The message format is Lint/MixedCaseRange: <message text>.
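
Getting the marker alignment right by hand is error-prone, so it may help to generate the annotation line from the source line. The sketch below is a hypothetical helper (`annotation_line` is not part of nitrocop). It assumes columns are counted in characters and that the `^` run should span the whole offending text; the real fixture parser in `src/testutil.rs` may differ.

```rust
/// Hypothetical helper: build the `^` annotation line for the first
/// occurrence of `needle` in `source_line`. Returns `None` if the needle
/// does not occur.
fn annotation_line(
    source_line: &str,
    needle: &str,
    cop: &str,
    message: &str,
) -> Option<String> {
    let byte_col = source_line.find(needle)?;
    // Count characters rather than bytes so that non-ASCII text before
    // the needle does not shift the markers (an assumption about how
    // the fixture format measures columns).
    let col = source_line[..byte_col].chars().count();
    let width = needle.chars().count();
    Some(format!(
        "{}{} {}: {}",
        " ".repeat(col),
        "^".repeat(width),
        cop,
        message
    ))
}
```

For `CHARS = ('0'..'z').to_a` and the needle `'0'..'z'`, this places eight `^` characters starting at column 9, directly under the range and excluding the parentheses.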

If your test passes immediately

If you add a test case and it passes without code changes, the corpus mismatch is
caused by config/context differences, not a detection bug.
Do NOT loop trying to make the test fail. Instead:

  1. Investigate config resolution (Include/Exclude, cop enablement, disable comments)
  2. The fix is likely in src/config/ or the cop's config handling, not detection logic
  3. If you cannot determine the root cause within 5 minutes, document your findings as
    a /// comment on the cop struct and commit

CRITICAL: Avoid regressions in the opposite direction

When fixing FPs, your change MUST NOT suppress legitimate detections. When fixing FNs,
your change MUST NOT flag code that RuboCop accepts. A fix that eliminates a few issues
in one direction but introduces hundreds in the other is a catastrophic regression.

Before exempting a category of patterns, verify with RuboCop that the general case
is still an offense:

rubocop --only Lint/MixedCaseRange /tmp/test.rb

If RuboCop flags the general pattern but not your specific case, the difference is in
a narrow context (e.g., enclosing structure, receiver type, argument count) — your fix
must target that specific context, not the broad category.

Rule of thumb: if your fix adds an early return or continue that skips a whole
node type, operator class, or naming pattern, it's probably too broad. Prefer adding a
condition that matches the SPECIFIC differentiating context.

Rules

  • Only modify src/cop/lint/mixed_case_range.rs and tests/fixtures/cops/lint/mixed_case_range/
  • Run cargo test --lib -- cop::lint::mixed_case_range to verify your fix (do NOT run the full test suite)
  • Do NOT touch unrelated files
  • Do NOT use git stash

Pre-diagnostic Results

Diagnosis Summary

Each example was tested by running nitrocop on the extracted source in isolation
with --force-default-config to determine if the issue is a code bug or config issue.
Note: source context is truncated and may not parse perfectly. If a diagnosis
seems wrong (e.g., your test passes immediately for a 'CODE BUG'), treat it as
a config/context issue instead.

  • FN: 4 code bug(s), 0 config/context issue(s)

FN #1: ffleming__timing_attack__a7f4e5e: lib/timing_attack/brute_forcer.rb:13

NOT DETECTED — CODE BUG
The cop fails to detect this pattern. Fix the detection logic.

Message: Ranges from upper to lower case ASCII letters may include unintended characters. Instead of `A-z` (which also includes several symbols) specify each range individually: `A-Za-z` and individually specify any symbols.

Ready-made test snippet (add to offense.rb, adjust ^ count):

    POTENTIAL_BYTES = (' '..'z').to_a
^ Lint/MixedCaseRange: Ranges from upper to lower case ASCII letters may include unintended characters. Instead of `A-z` (which also includes several symbols) specify each range individually: `A-Za-z` and individually specify any symbols.

Full source context:

      super(options: options)
      @known = ""
    end

    private

    attr_reader :known
    POTENTIAL_BYTES = (' '..'z').to_a
    def attack!
      begin
        while(true)
          attack_byte!
        end
      rescue Errors::BruteForcerError => e
        puts "\n#{e.message}"

FN #2: gdelugre__origami__ac1df80: lib/origami/obfuscation.rb:29

NOT DETECTED — CODE BUG
The cop fails to detect this pattern. Fix the detection logic.

Enclosing structure: module body (line: module Obfuscator)
The offense is inside this structure — the cop may need
to handle this context to detect the pattern.

Message: Ranges from upper to lower case ASCII letters may include unintended characters. Instead of `A-z` (which also includes several symbols) specify each range individually: `A-Za-z` and individually specify any symbols.

Ready-made test snippet (add to offense.rb, adjust ^ count):

        PRINTABLE = ("!".."9").to_a + (':'..'Z').to_a + ('['..'z').to_a + ('{'..'~').to_a
^ Lint/MixedCaseRange: Ranges from upper to lower case ASCII letters may include unintended characters. Instead of `A-z` (which also includes several symbols) specify each range individually: `A-Za-z` and individually specify any symbols.

Full source context:

    module Obfuscator
        using TypeConversion

        WHITECHARS = [ " ", "\t", "\r", "\n", "\0" ]
        OBJECTS = [ Array, Boolean, Dictionary, Integer, Name, Null, Stream, String, Real, Reference ]
        MAX_INT = 0xFFFFFFFF
        PRINTABLE = ("!".."9").to_a + (':'..'Z').to_a + ('['..'z').to_a + ('{'..'~').to_a
        FILTERS = [ :FlateDecode, :RunLengthDecode, :LZWDecode, :ASCIIHexDecode, :ASCII85Decode ]

        def self.junk_spaces(max_size = 3)
            length = rand(max_size) + 1

            ::Array.new(length) { WHITECHARS[rand(WHITECHARS.size)] }.join
        end

FN #3: mjackson__citrus__e68d892: test/grammars/email_test.rb:109

NOT DETECTED — CODE BUG
The cop fails to detect this pattern. Fix the detection logic.

Enclosing structure: method body (line: def test_dtext)
The offense is inside this structure — the cop may need
to handle this context to detect the pattern.

Message: Ranges from upper to lower case ASCII letters may include unintended characters. Instead of `A-z` (which also includes several symbols) specify each range individually: `A-Za-z` and individually specify any symbols.

Ready-made test snippet (add to offense.rb, adjust ^ count):

    chars  = ("\x21".."\x5A").to_a
^ Lint/MixedCaseRange: Ranges from upper to lower case ASCII letters may include unintended characters. Instead of `A-z` (which also includes several symbols) specify each range individually: `A-Za-z` and individually specify any symbols.

Full source context:

    end
  end

  # dtext           =   %d33-90 /          ; Printable US-ASCII
  #                     %d94-126 /         ;  characters not including
  #                     obs-dtext          ;  "[", "]", or "\"
  def test_dtext
    chars  = ("\x21".."\x5A").to_a
    chars += ("\x5E".."\x7E").to_a

    # obs-dtext
    chars += chars_obs_no_ws_ctl
    chars += chars_quoted_pair

    chars.each do |c|

FN #4: neo4jrb__neo4j-core__4b649fc: Rakefile:35

NOT DETECTED — CODE BUG
The cop fails to detect this pattern. Fix the detection logic.

Message: Ranges from upper to lower case ASCII letters may include unintended characters. Instead of `A-z` (which also includes several symbols) specify each range individually: `A-Za-z` and individually specify any symbols.

Ready-made test snippet (add to offense.rb, adjust ^ count):

CHARS = ('0'..'z').to_a
^ Lint/MixedCaseRange: Ranges from upper to lower case ASCII letters may include unintended characters. Instead of `A-z` (which also includes several symbols) specify each range individually: `A-Za-z` and individually specify any symbols.

Full source context:

  task = Rake::Task['spec']
  task.reenable
  task.invoke
end

task default: [:spec]

CHARS = ('0'..'z').to_a
def string
  Array.new(rand(1_000)) { CHARS.sample }.join
end

MAX_NUM = 10_00 * 999_999
HALF_MAX_NUM = MAX_NUM.fdiv(2)
def int

Current Rust Implementation

src/cop/lint/mixed_case_range.rs

use crate::cop::node_type::{
    INTERPOLATED_REGULAR_EXPRESSION_NODE, RANGE_NODE, REGULAR_EXPRESSION_NODE,
};
use crate::cop::{Cop, CopConfig};
use crate::diagnostic::{Diagnostic, Severity};
use crate::parse::source::SourceFile;

/// Checks for mixed-case character ranges that include unintended characters.
/// For example, `('A'..'z')` includes `[`, `\`, `]`, `^`, `_`, `` ` ``.
///
/// ## Corpus investigation (2026-03-10)
///
/// Corpus oracle reported FP=0, FN=27.
///
/// FN:
/// - The original implementation only looked at Ruby `RangeNode`s like `('A'..'z')`.
///   Most corpus misses are regexp character classes such as `/[a-zA-z0-9]/`.
/// - Some remaining misses came from interpolated regexps where the unsafe range lives in a
///   literal segment around `#{...}`.
/// - Unicode property escapes like `\p{InLatin_Extended-A}` contain `-` inside the property
///   name; those must be skipped as atomic escapes instead of scanned as `d-A`.
pub struct MixedCaseRange;

const MSG: &str = "Ranges from upper to lower case ASCII letters may include unintended characters. Instead of `A-z` (which also includes several symbols) specify each range individually: `A-Za-z` and individually specify any symbols.";

impl Cop for MixedCaseRange {
    fn name(&self) -> &'static str {
        "Lint/MixedCaseRange"
    }

    fn default_severity(&self) -> Severity {
        Severity::Warning
    }

    fn interested_node_types(&self) -> &'static [u8] {
        &[
            INTERPOLATED_REGULAR_EXPRESSION_NODE,
            RANGE_NODE,
            REGULAR_EXPRESSION_NODE,
        ]
    }

    fn check_node(
        &self,
        source: &SourceFile,
        node: &ruby_prism::Node<'_>,
        _parse_result: &ruby_prism::ParseResult<'_>,
        _config: &CopConfig,
        diagnostics: &mut Vec<Diagnostic>,
        _corrections: Option<&mut Vec<crate::correction::Correction>>,
    ) {
        if let Some(range) = node.as_range_node() {
            diagnostics.extend(self.check_range(source, range));
            return;
        }

        if let Some(regexp) = node.as_regular_expression_node() {
            self.check_regexp(source, regexp, diagnostics);
            return;
        }

        if let Some(regexp) = node.as_interpolated_regular_expression_node() {
            self.check_interpolated_regexp(source, regexp, diagnostics);
        }
    }
}

impl MixedCaseRange {
    fn check_range(
        &self,
        source: &SourceFile,
        range: ruby_prism::RangeNode<'_>,
    ) -> Vec<Diagnostic> {
        let left = match range.left() {
            Some(l) => l,
            None => return Vec::new(),
        };
        let right = match range.right() {
            Some(r) => r,
            None => return Vec::new(),
        };

        // Both must be string literals
        let left_str = match left.as_string_node() {
            Some(s) => s,
            None => return Vec::new(),
        };
        let right_str = match right.as_string_node() {
            Some(s) => s,
            None => return Vec::new(),
        };

        let left_val = left_str.unescaped();
        let right_val = right_str.unescaped();

        // Must be single characters
        if left_val.len() != 1 || right_val.len() != 1 {
            return Vec::new();
        }

        let left_char = left_val[0] as char;
        let right_char = right_val[0] as char;

        if is_unsafe_range(left_char, right_char) {
            let loc = range.location();
            let (line, column) = source.offset_to_line_col(loc.start_offset());
            return vec![self.diagnostic(source, line, column, MSG.to_string())];
        }

        Vec::new()
    }

    fn check_regexp(
        &self,
        source: &SourceFile,
        regexp: ruby_prism::RegularExpressionNode<'_>,
        diagnostics: &mut Vec<Diagnostic>,
    ) {
        let Ok(content) = std::str::from_utf8(regexp.content_loc().as_slice()) else {
            return;
        };

        let mut offsets = Vec::new();
        let mut offset = regexp.content_loc().start_offset();
        for ch in content.chars() {
            offsets.push(Some(offset));
            offset += ch.len_utf8();
        }

        self.check_regexp_chars(
            source,
            &content.chars().collect::<Vec<_>>(),
            &offsets,
            diagnostics,
        );
    }

    fn check_interpolated_regexp(
        &self,
        source: &SourceFile,
        regexp: ruby_prism::InterpolatedRegularExpressionNode<'_>,
        diagnostics: &mut Vec<Diagnostic>,
    ) {
        let mut chars = Vec::new();
        let mut offsets = Vec::new();

        for part in regexp.parts().iter() {
            if let Some(string) = part.as_string_node() {
                let Ok(content) = std::str::from_utf8(string.content_loc().as_slice()) else {
                    return;
                };
                let mut offset = string.content_loc().start_offset();
                for ch in content.chars() {
                    chars.push(ch);
                    offsets.push(Some(offset));
                    offset += ch.len_utf8();
                }
                continue;
            }

            chars.push('\0');
            offsets.push(None);
        }

        self.check_regexp_chars(source, &chars, &offsets, diagnostics);
    }

    fn check_regexp_chars(
        &self,
        source: &SourceFile,
        chars: &[char],
        offsets: &[Option<usize>],
        diagnostics: &mut Vec<Diagnostic>,
    ) {
        debug_assert_eq!(chars.len(), offsets.len());

        let mut i = 0;
        while i < chars.len() {
            if chars[i] == '[' && (i == 0 || chars[i - 1] != '\\') {
                let Some(class_end) = find_char_class_end(chars, i) else {
                    i += 1;
                    continue;
                };
                self.check_regexp_class(source, chars, offsets, i + 1, class_end, diagnostics);
                i = class_end + 1;
            } else {
                i += 1;
            }
        }
    }

    fn check_regexp_class(
        &self,
        source: &SourceFile,
        chars: &[char],
        offsets: &[Option<usize>],
        start: usize,
        end: usize,
        diagnostics: &mut Vec<Diagnostic>,
    ) {
        let mut i = start;
        if i < end && chars[i] == '^' {
            i += 1;
        }

        while i < end {
            if chars[i] == '[' {
                if let Some(nested_end) = find_char_class_end(chars, i) {
                    i = nested_end + 1;
                    continue;
                }
                i += 1;
                continue;
            }

            if chars[i] == '\\' {
                i += escape_sequence_len(chars, i);
                continue;
            }

            if i + 2 < end && chars[i + 1] == '-' && chars[i + 2] != ']' {
                let range_end = chars[i + 2];
                if range_end == '\\' || range_end == '[' {
                    i += 1;
                    continue;
                }

                if is_unsafe_range(chars[i], range_end) {
                    if let Some(abs_offset) = offsets.get(i).copied().flatten() {
                        let (line, column) = source.offset_to_line_col(abs_offset);
                        diagnostics.push(self.diagnostic(source, line, column, MSG.to_string()));
                    }
                }

                i += 3;
            } else {
                i += 1;
            }
        }
    }
}

fn find_char_class_end(chars: &[char], open: usize) -> Option<usize> {
    let mut i = open + 1;
    if i < chars.len() && chars[i] == '^' {
        i += 1;
    }
    if i < chars.len() && chars[i] == ']' {
        i += 1;
    }

    while i < chars.len() {
        if chars[i] == '\\' {
            i += escape_sequence_len(chars, i);
        } else if chars[i] == '[' {
            if i + 1 < chars.len() && chars[i + 1] == ':' {
                i += 2;
                while i + 1 < chars.len() {
                    if chars[i] == ':' && chars[i + 1] == ']' {
                        i += 2;
                        break;
                    }
                    i += 1;
                }
            } else if let Some(nested_end) = find_char_class_end(chars, i) {
                i = nested_end + 1;
            } else {
                i += 1;
            }
        } else if chars[i] == ']' {
            return Some(i);
        } else {
            i += 1;
        }
    }

    None
}

fn escape_sequence_len(chars: &[char], start: usize) -> usize {
    if start + 1 >= chars.len() {
        return 1;
    }

    match chars[start + 1] {
        'x' => {
            let mut count = 2;
            let mut i = start + 2;
            while i < chars.len() && count < 4 && chars[i].is_ascii_hexdigit() {
                count += 1;
                i += 1;
            }
            count
        }
        'u' => {
            if start + 2 < chars.len() && chars[start + 2] == '{' {
                let mut i = start + 3;
                while i < chars.len() && chars[i] != '}' {
                    i += 1;
                }
                if i < chars.len() {
                    i + 1 - start
                } else {
                    i - start
                }
            } else {
                let mut count = 2;
                let mut i = start + 2;
                while i < chars.len() && count < 6 && chars[i].is_ascii_hexdigit() {
                    count += 1;
                    i += 1;
                }
                count
            }
        }
        '0'..='7' => {
            let mut count = 2;
            let mut i = start + 2;
            while i < chars.len() && count < 4 && matches!(chars[i], '0'..='7') {
                count += 1;
                i += 1;
            }
            count
        }
        'c' => {
            if start + 2 < chars.len() {
                3
            } else {
                2
            }
        }
        'p' | 'P' => {
            if start + 2 < chars.len() && chars[start + 2] == '{' {
                let mut i = start + 3;
                while i < chars.len() && chars[i] != '}' {
                    i += 1;
                }
                if i < chars.len() {
                    i + 1 - start
                } else {
                    i - start
                }
            } else {
                2
            }
        }
        _ => 2,
    }
}

fn char_range(c: char) -> Option<u8> {
    if c.is_ascii_lowercase() {
        Some(0) // a-z
    } else if c.is_ascii_uppercase() {
        Some(1) // A-Z
    } else {
        None
    }
}

fn is_unsafe_range(start: char, end: char) -> bool {
    let start_range = char_range(start);
    let end_range = char_range(end);

    match (start_range, end_range) {
        (Some(a), Some(b)) => a != b,
        _ => false,
    }
}

#[cfg(test)]
mod tests {
    use super::*;
    crate::cop_fixture_tests!(MixedCaseRange, "cops/lint/mixed_case_range");
}

RuboCop Ruby Implementation (ground truth)

vendor/rubocop/lib/rubocop/cop/lint/mixed_case_range.rb

# frozen_string_literal: true

module RuboCop
  module Cop
    module Lint
      # Checks for mixed-case character ranges since they include likely unintended characters.
      #
      # Offenses are registered for regexp character classes like `/[A-z]/`
      # as well as range objects like `('A'..'z')`.
      #
      # NOTE: `Range` objects cannot be autocorrected.
      #
      # @safety
      #   The cop autocorrects regexp character classes
      #   by replacing one character range with two: `A-z` becomes `A-Za-z`.
      #   In most cases this is probably what was originally intended
      #   but it changes the regexp to no longer match symbols it used to include.
      #   For this reason, this cop's autocorrect is unsafe (it will
      #   change the behavior of the code).
      #
      # @example
      #
      #   # bad
      #   r = /[A-z]/
      #
      #   # good
      #   r = /[A-Za-z]/
      class MixedCaseRange < Base
        extend AutoCorrector
        include RangeHelp

        MSG = 'Ranges from upper to lower case ASCII letters may include unintended ' \
              'characters. Instead of `A-z` (which also includes several symbols) ' \
              'specify each range individually: `A-Za-z` and individually specify any symbols.'
        RANGES = [('a'..'z').freeze, ('A'..'Z').freeze].freeze

        def on_irange(node)
          return unless node.children.compact.all?(&:str_type?)
          return if node.begin.nil? || node.end.nil?

          add_offense(node) if unsafe_range?(node.begin.value, node.end.value)
        end
        alias on_erange on_irange

        def on_regexp(node)
          each_unsafe_regexp_range(node) do |loc|
            next unless (replacement = regexp_range(loc.source))

            add_offense(loc) do |corrector|
              corrector.replace(loc, replacement)
            end
          end
        end

        def each_unsafe_regexp_range(node)
          node.parsed_tree&.each_expression do |expr|
            next if skip_expression?(expr)

            range_pairs(expr).reject do |range_start, range_end|
              next if skip_range?(range_start, range_end)

              next unless unsafe_range?(range_start.text, range_end.text)

              yield(build_source_range(range_start, range_end))
            end
          end
        end

        private

        def build_source_range(range_start, range_end)
          range_between(range_start.expression.begin_pos, range_end.expression.end_pos)
        end

        def range_for(char)
          RANGES.detect do |range|
            range.include?(char)
          end
        end

        def range_pairs(expr)
          expr.expressions.filter_map { |e| [e.expressions[0], e.expressions[1]] if e.type == :set }
        end

        def unsafe_range?(range_start, range_end)
          return false if range_start.length != 1 || range_end.length != 1

          range_for(range_start) != range_for(range_end)
        end

        def skip_expression?(expr)
          !(expr.type == :set && expr.token == :character)
        end

        def skip_range?(range_start, range_end)
          [range_start, range_end].any? do |bound|
            bound&.type != :literal
          end
        end

        def regexp_range(source)
          open, close = source.split('-')
          return unless (open_range = range_for(open))
          return unless (close_range = range_for(close))

          first = [open, open_range.end]
          second = [close_range.begin, close]
          "#{first.uniq.join('-')}#{second.uniq.join('-')}"
        end
      end
    end
  end
end
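
To illustrate why letter-versus-non-letter bounds count as unsafe in the source above, here is a minimal standalone sketch. It copies `RANGES`, `range_for`, and `unsafe_range?` out of the cop class so they can be run without RuboCop. Running them at top level is an assumption made for illustration, and this is not nitrocop's Rust implementation. A non-letter bound maps to `nil`, so comparing it with a letter bucket reports a mismatch, while two non-letter bounds compare `nil` with `nil` and pass.

```ruby
# Standalone copy of the bucket comparison from the cop source above
# (extracted from the cop class for illustration only).
RANGES = [('a'..'z').freeze, ('A'..'Z').freeze].freeze

# Returns the ASCII letter range containing char, or nil for non-letters.
def range_for(char)
  RANGES.detect { |range| range.include?(char) }
end

# A range is unsafe when its single-character bounds land in different buckets.
def unsafe_range?(range_start, range_end)
  return false if range_start.length != 1 || range_end.length != 1

  range_for(range_start) != range_for(range_end)
end

unsafe_range?('A', 'z')  # => true  (uppercase bucket vs lowercase bucket)
unsafe_range?('0', 'z')  # => true  (nil vs lowercase bucket)
unsafe_range?(':', 'Z')  # => true  (nil vs uppercase bucket)
unsafe_range?('0', '9')  # => false (nil vs nil)
unsafe_range?('A', 'Za') # => false (multi-character bound is skipped)
```

These results line up with the fixtures: `('0'..'9')` and `('A'..'Za')` are listed in no_offense.rb, while `('0'..'z')` and `(':'..'Z')` are among the patterns the PR description adds to offense.rb.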

RuboCop Test Excerpts

vendor/rubocop/spec/rubocop/cop/lint/mixed_case_range_spec.rb

  it 'registers an offense for an overly broad character range' do

    expect_offense(<<~RUBY)
      foo = 'A'..'z'
            ^^^^^^^^ #{message}
    RUBY

  it 'registers an offense for an overly broad exclusive character range' do

    expect_offense(<<~RUBY)
      foo = 'A'...'z'
            ^^^^^^^^^ #{message}
    RUBY

  it 'does not register an offense for an acceptable range' do

    expect_no_offenses(<<~RUBY)
      foo = 'A'..'Z'
    RUBY

  it 'does not register an offense when the number of characters at the start of range is other than 1' do

    expect_no_offenses(<<~RUBY)
      foo = 'aa'..'z'
    RUBY

  it 'does not register an offense when the number of characters at the end of range is other than 1' do

    expect_no_offenses(<<~RUBY)
      foo = 'a'..'zz'
    RUBY

  context 'ruby > 2.6', :ruby27 do

    it 'does not register an offense for a beginless range' do

      expect_no_offenses(<<~RUBY)
        (..'z')
      RUBY

  it 'does not register an offense for an endless range' do

    expect_no_offenses(<<~RUBY)
      ('a'..)
    RUBY

  it 'registers an offense for an overly broad range' do

    expect_offense(<<~RUBY)
      foo = /[A-z]/
              ^^^ #{message}
    RUBY

  it 'registers an offense for an overly broad range between interpolations' do

    expect_offense(<<~'RUBY'.sub(/\#{message}/, message))
      foo = /[#{A-z}A-z#{y}]/
                    ^^^ #{message}
    RUBY

  it 'registers an offense for each of multiple unsafe ranges' do

Current Fixture: offense.rb

tests/fixtures/cops/lint/mixed_case_range/offense.rb

r = ('A'..'z')
     ^^^^^^^^ Lint/MixedCaseRange: Ranges from upper to lower case ASCII letters may include unintended characters. Instead of `A-z` (which also includes several symbols) specify each range individually: `A-Za-z` and individually specify any symbols.
x = ('a'..'Z')
     ^^^^^^^^ Lint/MixedCaseRange: Ranges from upper to lower case ASCII letters may include unintended characters. Instead of `A-z` (which also includes several symbols) specify each range individually: `A-Za-z` and individually specify any symbols.
y = ('B'..'f')
     ^^^^^^^^ Lint/MixedCaseRange: Ranges from upper to lower case ASCII letters may include unintended characters. Instead of `A-z` (which also includes several symbols) specify each range individually: `A-Za-z` and individually specify any symbols.

re = /[A-z]/
       ^^^ Lint/MixedCaseRange: Ranges from upper to lower case ASCII letters may include unintended characters. Instead of `A-z` (which also includes several symbols) specify each range individually: `A-Za-z` and individually specify any symbols.

chars = /[a-zA-z0-9]{0,32}/
             ^^^ Lint/MixedCaseRange: Ranges from upper to lower case ASCII letters may include unintended characters. Instead of `A-z` (which also includes several symbols) specify each range individually: `A-Za-z` and individually specify any symbols.

regexp = /[#{prefix}A-z#{suffix}]/
                    ^^^ Lint/MixedCaseRange: Ranges from upper to lower case ASCII letters may include unintended characters. Instead of `A-z` (which also includes several symbols) specify each range individually: `A-Za-z` and individually specify any symbols.

Current Fixture: no_offense.rb

tests/fixtures/cops/lint/mixed_case_range/no_offense.rb

r = ('A'..'Z')
r = ('a'..'z')
x = ('0'..'9')
y = (1..10)
z = ('A'..'Za')
re = /[A-Za-z]/
escaped = /[A\-z]/
unicode = /[\p{InLatin_Extended-A}]/

@6 6 bot force-pushed the fix/lint-mixed_case_range-23409151474 branch from 47f87c7 to 9516a2e Compare March 22, 2026 18:17
@6 6 bot marked this pull request as ready for review March 22, 2026 18:17
@6 6 bot enabled auto-merge (rebase) March 22, 2026 18:17
@6
Contributor Author

6 bot commented Mar 22, 2026

Auto-repair Started

  • Checks run: #23409394577
  • Route: hard
  • Backend: codex
  • Reason: cop-check: Check cops against corpus baseline
  • Repair workflow: #23409506986

@6
Contributor Author

6 bot commented Mar 22, 2026

Auto-repair Succeeded

  • Checks run: #23409394577
  • Backend: codex
  • Repair workflow: #23409506986
  • Repair commit: 45165132be611f1189b653b5f7a1ac6e544e2394

Auto-repair succeeded with backend codex.

Validated locally before push using /tmp/repair-verify.sh.

Owner

@6 6 left a comment

Reviewed locally: the fix correctly broadens Ruby RangeNode handling while keeping regexp classes on the narrower accepted behavior, matching the added fixtures and green cop-check.
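
One way to read the "narrower" regexp behavior is through `regexp_range` in the RuboCop source quoted earlier: `on_regexp` skips any range for which it returns `nil`, and that happens whenever either bound is not an ASCII letter. The sketch below copies `range_for` and `regexp_range` out of the cop class so they run standalone. Extracting them this way is an assumption for illustration, not how nitrocop implements the check.

```ruby
# Standalone copy of the regexp replacement helper from the RuboCop source
# (extracted from the cop class for illustration only).
RANGES = [('a'..'z').freeze, ('A'..'Z').freeze].freeze

# Returns the ASCII letter range containing char, or nil for non-letters.
def range_for(char)
  RANGES.detect { |range| range.include?(char) }
end

# Builds the split replacement for a mixed-case regexp range,
# or returns nil when either bound is not a letter.
def regexp_range(source)
  open, close = source.split('-')
  return unless (open_range = range_for(open))
  return unless (close_range = range_for(close))

  first = [open, open_range.end]
  second = [close_range.begin, close]
  "#{first.uniq.join('-')}#{second.uniq.join('-')}"
end

regexp_range('A-z') # => "A-Za-z"
regexp_range('0-z') # => nil, so on_regexp registers no offense for it
```

By contrast, the Ruby range path (`on_irange`) calls `unsafe_range?` directly with no such guard, which is why `('0'..'z')` is flagged there.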

@6 6 bot merged commit ab64fba into main Mar 22, 2026
4 checks passed
@6 6 bot deleted the fix/lint-mixed_case_range-23409151474 branch March 22, 2026 21:21