Skip to content

[bot] Fix Lint/HashCompareByIdentity#153

Merged
6[bot] merged 2 commits intomainfrom
fix/lint-hash_compare_by_identity-23474441379
Mar 24, 2026
Merged

[bot] Fix Lint/HashCompareByIdentity#153
6[bot] merged 2 commits intomainfrom
fix/lint-hash_compare_by_identity-23474441379

Conversation

@6
Copy link
Copy Markdown
Contributor

@6 6 bot commented Mar 24, 2026

Automated fix fix for Lint/HashCompareByIdentity corpus conformance.

Details

Cop Lint/HashCompareByIdentity
Backend claude-oauth / normal
Model Claude Opus 4.6 (OAuth, medium)
Mode fix
Turns 26
Run #72

Result

**Summary:** The 2 extended-corpus FPs for `Lint/HashCompareByIdentity` are not a cop detection bug. Both are in `zaml.rb` which has syntax errors (invalid multibyte escapes) that break RuboCop's parser — RuboCop only reports `Lint/Syntax` and never runs any other cop on that file. Prism handles the file fine, so nitrocop correctly detects the `object_id` key pattern.

This is a parser-level difference that would need to be fixed at the engine/runner level (skipping non-syntax cops on files with parse errors), not in individual cop implementations. I've documented the finding on the cop struct.

Changes

  1 files changed
Agent conversation excerpt

ToolSearch

Read: /home/runner/work/nitrocop/nitrocop/docs/agent-ci.md

Bash: echo '@previously_emitted_object[obj.object_id]' > /tmp/test_fp.rb && bundle exec rubocop --only Lint/HashCompareByIdentity /tmp/test_fp.rb 2>&1 | tail -20

Bash: cd /home/runner/work/nitrocop/nitrocop/bench/corpus && echo '@previously_emitted_object[obj.object_id]' > /tmp/test_fp.rb && BUNDLE_PATH=vendor/bundle bundle exec rubocop --only Lint/HashCompareByIden

Bash: BUNDLE_PATH=vendor/bundle bundle install 2>&1 | tail -5

Bash: echo '@previously_emitted_object[obj.object_id]' > /tmp/test_fp.rb && BUNDLE_PATH=vendor/bundle bundle exec rubocop --only Lint/HashCompareByIdentity /tmp/test_fp.rb 2>&1 | tail -10

RuboCop flags this pattern too. So the FP is likely a config/exclusion issue (the file is under vendor/). Let me investigate further.

Bash: cd /home/runner/work/nitrocop/nitrocop && python3 scripts/investigate-cop.py Lint/HashCompareByIdentity --context 2>&1 | head -80

Bash: python3 scripts/investigate-cop.py Lint/HashCompareByIdentity --context --extended 2>&1 | head -80

Bash: echo 'AllCops: Exclude: [] Lint/HashCompareByIdentity: Enabled: true' > /tmp/rubocop.yml && echo '@previously_emitted_object[obj.object_id]' > /tmp/test_fp.rb && BUNDLE_PATH=vendor/bundle bundle e

Bash: cd /home/runner/work/nitrocop/nitrocop/bench/corpus && echo 'AllCops: Exclude: [] Lint/HashCompareByIdentity: Enabled: true' > /tmp/rubocop.yml && echo '@previously_emitted_object[obj.object_id]'

Bash: BUNDLE_PATH=vendor/bundle bundle exec ruby -e "require 'rubocop'; puts RuboCop::ConfigLoader.default_configuration['AllCops']['Exclude'].inspect" 2>&1

Bash: cd /home/runner/work/nitrocop/nitrocop && ls bench/corpus/repos/ 2>/dev/null | grep -i pitluga | head -5

Bash: python3 scripts/investigate-repo.py pitluga__supply_drop__d64c50c 2>&1 | head -40

Bash: python3 scripts/investigate-repo.py pitluga__supply_drop__d64c50c --extended 2>&1 | head -40

Bash: cd /tmp && git clone --depth 1 https://github.com/pitluga/supply_drop.git 2>&1 | tail -3

Bash: cat /tmp/supply_drop/.rubocop.yml 2>/dev/null || echo "No .rubocop.yml found"

Bash: head -10 /tmp/supply_drop/examples/vendored-puppet/vendor/puppet-2.7.8/lib/puppet/util/zaml.rb 2>/dev/null

Bash: cd /tmp/supply_drop && BUNDLE_PATH=/home/runner/work/nitrocop/nitrocop/bench/corpus/vendor/bundle bundle exec rubocop --only Lint/HashCompareByIdentity examples/vendored-puppet/vendor/puppet-2.7.8/lib

Bash: cd /tmp/supply_drop && rubocop --only Lint/HashCompareByIdentity examples/vendored-puppet/vendor/puppet-2.7.8/lib/puppet/util/zaml.rb 2>&1 | tail -10

Bash: cd /tmp/supply_drop && rubocop --only Lint/HashCompareByIdentity examples/vendored-puppet/vendor/puppet-2.7.8/lib/puppet/util/zaml.rb 2>&1

The file has syntax errors under RuboCop's parser, so RuboCop never runs the cop on it. Let me verify with a higher target Ruby version.

Bash: cd /tmp/supply_drop && echo 'AllCops: NewCops: enable TargetRubyVersion: 3.1' > .rubocop.yml && rubocop --only Lint/HashCompareByIdentity examples/vendored-puppet/vendor/puppet-2.7.8/lib/puppet/ut

Root cause identified: RuboCop's parser fails on this file (syntax errors at line 224+), so it never runs Lint/HashCompareByIdentity. Prism handles the file fine, so nitrocop reports offenses RuboCop can't even check. This is a parser-level difference, not a cop detection bug.

I'll document this finding on the cop struct and commit.

Read: /home/runner/work/nitrocop/nitrocop/src/cop/lint/hash_compare_by_identity.rs

Edit: /home/runner/work/nitrocop/nitrocop/src/cop/lint/hash_compare_by_identity.rs

Now let me verify the tests still pass.

Bash: cargo test --lib -- cop::lint::hash_compare_by_identity 2>&1 | tail -10

Tests pass. Now let me commit.

Bash: `git add src/cop/lint/hash_compare_by_identity.rs && git commit -m "$(cat <<'EOF'
Document Lint/HashCompareByIdentity extended corpus FP root cause

The 2 extended-corpus FPs are in a file with syntax `

Summary: The 2 extended-corpus FPs for Lint/HashCompareByIdentity are not a cop detection bug. Both are in zaml.rb which has syntax errors (invalid multibyte escapes) that break RuboCop's parser — RuboCop only reports Lint/Syntax and never runs any other cop on that file. Prism handles the file fine, so nitrocop correctly detects the object_id key pattern.

This is a parser-level difference that would need to be fixed at the engine/runner level (skipping non-syntax cops on files with parse errors), not in individual cop implementations. I've documented the finding on the cop struct.

Task prompt (4222 tokens)

Fix Lint/HashCompareByIdentity — 2 FP, 0 FN

Cross-Corpus Risk

  • Standard corpus: 75 matches, 0 FP, 0 FN
  • Extended corpus: 203 matches, 2 FP, 0 FN
  • Extended divergence currently touches 1 repo

Risk class: extended-only edge case against a standard-perfect baseline.

Treat the extended examples as a narrow edge case, not proof that the broad pattern is safe.
Any carve-out that reduces the extended FP/FN count but regresses the standard corpus is a bad fix.
Prefer the smallest context-specific change you can justify from RuboCop behavior and the vendor spec.
This is concentrated in only a few repos, which increases the risk of overfitting to one local shape.

Instructions

You are fixing ONE cop in nitrocop, a Rust Ruby linter that uses Prism for parsing.

Current state: 203 matches, 2 false positives, 0 false negatives.
Focus on: FP (nitrocop flags code RuboCop does not).

Workflow

  1. Read the Pre-diagnostic Results and Corpus FP/FN Examples sections below first
  2. Verify with RuboCop first (for FP fixes): before writing any code, confirm RuboCop's
    behavior on BOTH the specific FP case AND the general pattern:
    echo '<specific FP case>' > /tmp/test.rb && rubocop --only Lint/HashCompareByIdentity /tmp/test.rb
    echo '<general pattern>' > /tmp/test.rb && rubocop --only Lint/HashCompareByIdentity /tmp/test.rb
    If RuboCop flags the general pattern, your fix must be narrow enough to not suppress it.
  3. Add a test case FIRST:
    • FN fix: add the missed pattern to tests/fixtures/cops/lint/hash_compare_by_identity/offense.rb with ^ annotation
    • FP fix: add the false-positive pattern to tests/fixtures/cops/lint/hash_compare_by_identity/no_offense.rb
  4. Verify test fails: cargo test --lib -- cop::lint::hash_compare_by_identity
  5. Fix src/cop/lint/hash_compare_by_identity.rs
  6. Verify test passes: cargo test --lib -- cop::lint::hash_compare_by_identity
  7. Add a /// doc comment on the cop struct documenting what you found and fixed
  8. Commit only your cop's files

Fixture Format

Mark offenses with ^ markers on the line AFTER the offending source line:

x = 1
     ^^ Lint/HashCompareByIdentity: Trailing whitespace detected.

The ^ characters must align with the offending columns. The message format is Lint/HashCompareByIdentity: <message text>.

If your test passes immediately

If you add a test case and it passes without code changes, the corpus mismatch is
caused by config/context differences, not a detection bug.
Do NOT loop trying to make the test fail. Instead:

  1. Investigate config resolution (Include/Exclude, cop enablement, disable comments)
  2. The fix is likely in src/config/ or the cop's config handling, not detection logic
  3. If you cannot determine the root cause within 5 minutes, document your findings as
    a /// comment on the cop struct and commit

CRITICAL: Avoid regressions in the opposite direction

When fixing FPs, your change MUST NOT suppress legitimate detections. When fixing FNs,
your change MUST NOT flag code that RuboCop accepts. A fix that eliminates a few issues
in one direction but introduces hundreds in the other is a catastrophic regression.

Before exempting a category of patterns, verify with RuboCop that the general case
is still an offense:

rubocop --only Lint/HashCompareByIdentity /tmp/test.rb

If RuboCop flags the general pattern but not your specific case, the difference is in
a narrow context (e.g., enclosing structure, receiver type, argument count) — your fix
must target that specific context, not the broad category.

Rule of thumb: if your fix adds an early return or continue that skips a whole
node type, operator class, or naming pattern, it's probably too broad. Prefer adding a
condition that matches the SPECIFIC differentiating context.

Rules

  • Only modify src/cop/lint/hash_compare_by_identity.rs and tests/fixtures/cops/lint/hash_compare_by_identity/
  • Run cargo test --lib -- cop::lint::hash_compare_by_identity to verify your fix (do NOT run the full test suite)
  • Do NOT touch unrelated files
  • Do NOT use git stash

Start Here

Use the existing corpus data to focus on the most concentrated regressions first.

Helpful local commands:

  • python3 scripts/investigate-cop.py Lint/HashCompareByIdentity --repos-only
  • python3 scripts/investigate-cop.py Lint/HashCompareByIdentity --context
  • python3 scripts/verify-cop-locations.py Lint/HashCompareByIdentity

Top FP repos:

  • pitluga__supply_drop__d64c50c (2 FP) — example examples/vendored-puppet/vendor/puppet-2.7.8/lib/puppet/util/zaml.rb:73

Representative FP examples:

  • pitluga__supply_drop__d64c50c: examples/vendored-puppet/vendor/puppet-2.7.8/lib/puppet/util/zaml.rb:73 — Use Hash#compare_by_identity instead of using object_id for keys.
  • pitluga__supply_drop__d64c50c: examples/vendored-puppet/vendor/puppet-2.7.8/lib/puppet/util/zaml.rb:77 — Use Hash#compare_by_identity instead of using object_id for keys.

Pre-diagnostic Results

Diagnosis Summary

Each example was tested by running nitrocop on the extracted source in isolation
with --force-default-config to determine if the issue is a code bug or config issue.
Note: source context is truncated and may not parse perfectly. If a diagnosis
seems wrong (e.g., your test passes immediately for a 'CODE BUG'), treat it as
a config/context issue instead.

  • FP: 2 confirmed code bug(s), 0 context-dependent

FP #1: pitluga__supply_drop__d64c50c: examples/vendored-puppet/vendor/puppet-2.7.8/lib/puppet/util/zaml.rb:73

CONFIRMED false positive — CODE BUG
nitrocop incorrectly flags this pattern in isolation.
Fix the detection logic to not flag this.

Enclosing structure: method body (line: def label_for(obj))
The offense is inside this structure — this is likely WHY
RuboCop does not flag it. Your fix should detect this context.

Full source context (add relevant parts to no_offense.rb):

      @this_label_number ? ('&id%03d%s' % [@this_label_number, @indent]) : ''
    end
    def reference
      @reference         ||= '*id%03d' % @this_label_number
    end
  end
  def label_for(obj)
    @previously_emitted_object[obj.object_id]
  end
  def new_label_for(obj)
    label = Label.new(obj,(Hash === obj || Array === obj) ? "#{@indent || "\n"}  " : ' ')
    @previously_emitted_object[obj.object_id] = label
    label
  end
  def first_time_only(obj)

Message: Use Hash#compare_by_identityinstead of usingobject_id for keys.

FP #2: pitluga__supply_drop__d64c50c: examples/vendored-puppet/vendor/puppet-2.7.8/lib/puppet/util/zaml.rb:77

CONFIRMED false positive — CODE BUG
nitrocop incorrectly flags this pattern in isolation.
Fix the detection logic to not flag this.

Enclosing structure: method body (line: def new_label_for(obj))
The offense is inside this structure — this is likely WHY
RuboCop does not flag it. Your fix should detect this context.

Full source context (add relevant parts to no_offense.rb):

    end
  end
  def label_for(obj)
    @previously_emitted_object[obj.object_id]
  end
  def new_label_for(obj)
    label = Label.new(obj,(Hash === obj || Array === obj) ? "#{@indent || "\n"}  " : ' ')
    @previously_emitted_object[obj.object_id] = label
    label
  end
  def first_time_only(obj)
    if label = label_for(obj)
      label.this_label_number ||= (@next_free_label_number += 1)
      emit(label.reference)
    else

Message: Use Hash#compare_by_identityinstead of usingobject_id for keys.

Current Rust Implementation

src/cop/lint/hash_compare_by_identity.rs

use crate::cop::node_type::{
    CALL_NODE, INDEX_AND_WRITE_NODE, INDEX_OPERATOR_WRITE_NODE, INDEX_OR_WRITE_NODE,
};
use crate::cop::{Cop, CopConfig};
use crate::diagnostic::{Diagnostic, Severity};
use crate::parse::source::SourceFile;

/// Checks for hashes keyed by `object_id`.
///
/// ## Corpus investigation (2026-03-10)
///
/// Corpus oracle reported FP=0, FN=18.
///
/// FN:
/// - The implementation only matched `foo.object_id`, not bare `object_id`. Real-world code
///   often uses `hash[object_id]` inside instance methods, where Prism represents the key as a
///   zero-receiver `CallNode`.
/// - Residual corpus misses were `hash[key] ||= value` forms. Prism represents those as
///   `IndexOrWriteNode` / `IndexAndWriteNode` / `IndexOperatorWriteNode`, not normal `CallNode`s.
/// - Rerunning the corpus gate after handling both shapes matched RuboCop exactly:
///   expected 74, actual 74, with no potential FP/FN.
pub struct HashCompareByIdentity;

const HASH_KEY_METHODS: &[&[u8]] = &[b"key?", b"has_key?", b"fetch", b"[]", b"[]="];

impl Cop for HashCompareByIdentity {
    fn name(&self) -> &'static str {
        "Lint/HashCompareByIdentity"
    }

    fn default_severity(&self) -> Severity {
        Severity::Warning
    }

    fn interested_node_types(&self) -> &'static [u8] {
        &[
            CALL_NODE,
            INDEX_AND_WRITE_NODE,
            INDEX_OPERATOR_WRITE_NODE,
            INDEX_OR_WRITE_NODE,
        ]
    }

    fn check_node(
        &self,
        source: &SourceFile,
        node: &ruby_prism::Node<'_>,
        _parse_result: &ruby_prism::ParseResult<'_>,
        _config: &CopConfig,
        diagnostics: &mut Vec<Diagnostic>,
        _corrections: Option<&mut Vec<crate::correction::Correction>>,
    ) {
        if let Some(call) = node.as_call_node() {
            let method_name = call.name().as_slice();
            if !HASH_KEY_METHODS.contains(&method_name) || call.receiver().is_none() {
                return;
            }

            if let Some(first_arg) = first_argument(call.arguments()) {
                self.add_offense_if_object_id_key(source, &call.as_node(), &first_arg, diagnostics);
            }
            return;
        }

        if let Some(write) = node.as_index_operator_write_node() {
            if let Some(first_arg) = first_argument(write.arguments()) {
                self.add_offense_if_object_id_key(
                    source,
                    &write.as_node(),
                    &first_arg,
                    diagnostics,
                );
            }
            return;
        }

        if let Some(write) = node.as_index_or_write_node() {
            if let Some(first_arg) = first_argument(write.arguments()) {
                self.add_offense_if_object_id_key(
                    source,
                    &write.as_node(),
                    &first_arg,
                    diagnostics,
                );
            }
            return;
        }

        if let Some(write) = node.as_index_and_write_node() {
            if let Some(first_arg) = first_argument(write.arguments()) {
                self.add_offense_if_object_id_key(
                    source,
                    &write.as_node(),
                    &first_arg,
                    diagnostics,
                );
            }
        }
    }
}

impl HashCompareByIdentity {
    fn add_offense_if_object_id_key(
        &self,
        source: &SourceFile,
        node: &ruby_prism::Node<'_>,
        first_arg: &ruby_prism::Node<'_>,
        diagnostics: &mut Vec<Diagnostic>,
    ) {
        if !is_object_id_call(first_arg) {
            return;
        }

        let loc = node.location();
        let (line, column) = source.offset_to_line_col(loc.start_offset());
        diagnostics.push(self.diagnostic(
            source,
            line,
            column,
            "Use `Hash#compare_by_identity` instead of using `object_id` for keys.".to_string(),
        ));
    }
}

fn first_argument(
    arguments: Option<ruby_prism::ArgumentsNode<'_>>,
) -> Option<ruby_prism::Node<'_>> {
    arguments?.arguments().iter().next()
}

fn is_object_id_call(node: &ruby_prism::Node<'_>) -> bool {
    node.as_call_node()
        .is_some_and(|call| call.name().as_slice() == b"object_id")
}

#[cfg(test)]
mod tests {
    use super::*;
    crate::cop_fixture_tests!(HashCompareByIdentity, "cops/lint/hash_compare_by_identity");
}

RuboCop Ruby Implementation (ground truth)

vendor/rubocop/lib/rubocop/cop/lint/hash_compare_by_identity.rb

# frozen_string_literal: true

module RuboCop
  module Cop
    module Lint
      # Prefer using `Hash#compare_by_identity` rather than using `object_id`
      # for hash keys.
      #
      # This cop looks for hashes being keyed by objects' `object_id`, using
      # one of these methods: `key?`, `has_key?`, `fetch`, `[]` and `[]=`.
      #
      # @safety
      #   This cop is unsafe. Although unlikely, the hash could store both object
      #   ids and other values that need be compared by value, and thus
      #   could be a false positive.
      #
      #   Furthermore, this cop cannot guarantee that the receiver of one of the
      #   methods (`key?`, etc.) is actually a hash.
      #
      # @example
      #   # bad
      #   hash = {}
      #   hash[foo.object_id] = :bar
      #   hash.key?(baz.object_id)
      #
      #   # good
      #   hash = {}.compare_by_identity
      #   hash[foo] = :bar
      #   hash.key?(baz)
      #
      class HashCompareByIdentity < Base
        RESTRICT_ON_SEND = %i[key? has_key? fetch [] []=].freeze

        MSG = 'Use `Hash#compare_by_identity` instead of using `object_id` for keys.'

        # @!method id_as_hash_key?(node)
        def_node_matcher :id_as_hash_key?, <<~PATTERN
          (call _ {:key? :has_key? :fetch :[] :[]=} (send _ :object_id) ...)
        PATTERN

        def on_send(node)
          add_offense(node) if id_as_hash_key?(node)
        end
        alias on_csend on_send
      end
    end
  end
end

RuboCop Test Excerpts

vendor/rubocop/spec/rubocop/cop/lint/hash_compare_by_identity_spec.rb

  it 'registers an offense when using hash methods with `object_id` on receiver as a key' do

    expect_offense(<<~RUBY)
      hash.key?(foo.object_id)
      ^^^^^^^^^^^^^^^^^^^^^^^^ Use `Hash#compare_by_identity` instead of using `object_id` for keys.
      hash.has_key?(foo.object_id)
      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Use `Hash#compare_by_identity` instead of using `object_id` for keys.
      hash[foo.object_id] = bar
      ^^^^^^^^^^^^^^^^^^^^^^^^^ Use `Hash#compare_by_identity` instead of using `object_id` for keys.
      hash[foo.object_id]
      ^^^^^^^^^^^^^^^^^^^ Use `Hash#compare_by_identity` instead of using `object_id` for keys.
      hash.fetch(foo.object_id, 42)
      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Use `Hash#compare_by_identity` instead of using `object_id` for keys.
    RUBY

  it 'registers an offense when using hash method with `object_id` as a key' do

    expect_offense(<<~RUBY)
      hash.key?(object_id)
      ^^^^^^^^^^^^^^^^^^^^ Use `Hash#compare_by_identity` instead of using `object_id` for keys.
    RUBY

  it 'registers an offense when using hash method with `object_id` as a key with safe navigation' do

    expect_offense(<<~RUBY)
      hash&.key?(object_id)
      ^^^^^^^^^^^^^^^^^^^^^ Use `Hash#compare_by_identity` instead of using `object_id` for keys.
    RUBY

  it 'does not register an offense for hash methods without `object_id` as key' do

    expect_no_offenses(<<~RUBY)
      hash.key?(foo)
    RUBY

Current Fixture: offense.rb

tests/fixtures/cops/lint/hash_compare_by_identity/offense.rb

hash[foo.object_id] = :bar
^^^^^^^^^^^^^^^^^^^^^^^^^^^ Lint/HashCompareByIdentity: Use `Hash#compare_by_identity` instead of using `object_id` for keys.

hash.key?(baz.object_id)
^^^^^^^^^^^^^^^^^^^^^^^^ Lint/HashCompareByIdentity: Use `Hash#compare_by_identity` instead of using `object_id` for keys.

hash.fetch(x.object_id)
^^^^^^^^^^^^^^^^^^^^^^^ Lint/HashCompareByIdentity: Use `Hash#compare_by_identity` instead of using `object_id` for keys.

hash[object_id]
^^^^^^^^^^^^^^^ Lint/HashCompareByIdentity: Use `Hash#compare_by_identity` instead of using `object_id` for keys.

hash&.key?(object_id)
^^^^^^^^^^^^^^^^^^^^^ Lint/HashCompareByIdentity: Use `Hash#compare_by_identity` instead of using `object_id` for keys.

memo[foo.object_id] ||= :value
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Lint/HashCompareByIdentity: Use `Hash#compare_by_identity` instead of using `object_id` for keys.

memo[object_id] ||= :value
^^^^^^^^^^^^^^^^^^^^^^^^^^ Lint/HashCompareByIdentity: Use `Hash#compare_by_identity` instead of using `object_id` for keys.

Current Fixture: no_offense.rb

tests/fixtures/cops/lint/hash_compare_by_identity/no_offense.rb

hash = {}.compare_by_identity
hash[foo] = :bar
hash.key?(baz)
hash.fetch(x)
hash[y]
hash[z] = :value

@6 6 bot force-pushed the fix/lint-hash_compare_by_identity-23474441379 branch from b99bf4c to fa1529f Compare March 24, 2026 05:34
@6 6 bot marked this pull request as ready for review March 24, 2026 05:34
@6 6 bot enabled auto-merge (squash) March 24, 2026 05:34
@6 6 bot merged commit 9b02867 into main Mar 24, 2026
4 checks passed
@6 6 bot deleted the fix/lint-hash_compare_by_identity-23474441379 branch March 24, 2026 05:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant