owen-mc (Contributor) commented Nov 14, 2025

These dataset check errors were accidentally introduced in #20623. We can't reproduce them locally and we don't know why. The cause seems to be that the path transformer isn't applied in one place where a file label is created. I have moved the call to the path transformer into the function that makes file labels, so we no longer have to remember to transform the path at every call site.
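
To illustrate the suspected failure mode described above, here is a minimal sketch. It is not the extractor's code: `transformPath`, `labeler`, and `rawFileLabel` are hypothetical stand-ins, and the prefix rewrite and the `file:` key format are assumptions made only for this example. It shows how one call site that skips the transformation ends up with a different label for the same file.

```go
package main

import (
	"fmt"
	"strings"
)

// transformPath is a hypothetical stand-in for a path transformer.
// It rewrites one path prefix; the real transformer may behave differently.
func transformPath(path string) string {
	const from, to = "/tmp/build/", "/src/"
	if strings.HasPrefix(path, from) {
		return to + strings.TrimPrefix(path, from)
	}
	return path
}

// labeler hands out one numeric label per distinct key (hypothetical type).
type labeler struct {
	next   int
	labels map[string]int
}

func newLabeler() *labeler {
	return &labeler{labels: make(map[string]int)}
}

// rawFileLabel keys the label on the path exactly as given, so every
// caller is responsible for transforming the path beforehand.
func (l *labeler) rawFileLabel(path string) int {
	key := "file:" + path // assumed key format, for illustration only
	if id, ok := l.labels[key]; ok {
		return id
	}
	l.next++
	l.labels[key] = l.next
	return l.next
}

func main() {
	l := newLabeler()
	file := "/tmp/build/pkg/a.go"

	// Call site 1 remembers to transform the path.
	first := l.rawFileLabel(transformPath(file))
	// Call site 2 forgets, so the same file gets a second, distinct label.
	second := l.rawFileLabel(file)

	fmt.Println(first == second) // prints "false": the call sites disagree
}
```

In this sketch the two labels refer to the same file but no longer match, which is the kind of inconsistency a dataset check would plausibly flag.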

Note: this is mostly a reversion of this commit from that PR, followed by a different way of achieving the same thing. It would have been cleaner to make an explicit revert commit and then put the TransformPath call in a different place. But I don't want to change it now, because then it would no longer be clear that the DCA run covers the code that will be merged.

owen-mc added the no-change-note-required label ("This PR does not need a change note") Nov 14, 2025
github-actions bot added the Go label Nov 14, 2025
owen-mc (Contributor, Author) commented Nov 14, 2025

DCA shows that this gets rid of the dataset check errors 🎉. The number of extraction errors increases, but only back to the level from before the overlay PR was merged, so presumably those errors were being masked by the database inconsistencies.

owen-mc marked this pull request as ready for review November 14, 2025 12:07
owen-mc requested review from a team as code owners November 14, 2025 12:07
owen-mc requested review from Copilot and nickrolfe November 14, 2025 12:07
Copilot finished reviewing on behalf of owen-mc November 14, 2025 12:09
Copilot AI (Contributor) left a comment

Pull Request Overview

This PR fixes dataset check errors by centralizing path transformation logic within the FileLabelFor function. The issue was that path transformation wasn't consistently applied at all call sites where file labels were created. By moving srcarchive.TransformPath() into FileLabelFor, the transformation is now automatically applied whenever a file label is generated, eliminating the need to manually transform paths at each call site.

Key Changes:

  • Moved path transformation logic from call sites into the FileLabelFor method itself
  • Updated all FileLabelFor call sites to pass untransformed paths instead of pre-transformed paths
  • Simplified the FileLabel method to pass the raw path directly to FileLabelFor
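
The changes listed above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the PR: `labeler`, `fileLabelFor`, and `transformPath` are hypothetical names, and the prefix rewrite and `file:` key format are made up for the example. The point is only that the label function transforms the path itself, so callers pass the raw path.

```go
package main

import (
	"fmt"
	"strings"
)

// transformPath is a hypothetical stand-in for a path transformer.
func transformPath(path string) string {
	const from, to = "/tmp/build/", "/src/"
	if strings.HasPrefix(path, from) {
		return to + strings.TrimPrefix(path, from)
	}
	return path
}

// labeler hands out one numeric label per distinct key (hypothetical type).
type labeler struct {
	next   int
	labels map[string]int
}

func newLabeler() *labeler {
	return &labeler{labels: make(map[string]int)}
}

// fileLabelFor applies the transformation internally before building the
// key, so no call site has to remember to do it.
func (l *labeler) fileLabelFor(path string) int {
	key := "file:" + transformPath(path) // assumed key format
	if id, ok := l.labels[key]; ok {
		return id
	}
	l.next++
	l.labels[key] = l.next
	return l.next
}

func main() {
	l := newLabeler()

	// Both call sites pass whatever path they have; the label function
	// normalizes it, so they end up with the same label.
	a := l.fileLabelFor("/tmp/build/pkg/a.go")
	b := l.fileLabelFor("/src/pkg/a.go")

	fmt.Println(a == b) // prints "true"
}
```

One design consequence worth noting: once the transformation lives inside the label function, call sites should stop pre-transforming. In this sketch a second application happens to be harmless, but that need not hold for every transformer, which may be why the call sites were updated to pass untransformed paths.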

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File | Description
go/extractor/trap/labels.go | Centralized path transformation by moving srcarchive.TransformPath() into FileLabelFor and removed it from FileLabel
go/extractor/extractor.go | Updated extractFileInfo to pass untransformed file parameter to FileLabelFor instead of the pre-transformed path variable

return l.fileLabel
}

// FileLabelFor returns the label for the file for which the trap writer `tw` is associated
Copilot AI Nov 14, 2025

[nitpick] The comment describes FileLabelFor as returning "the label for the file for which the trap writer tw is associated", but this function is now a general utility that can be used with any file path, not just the one associated with the trap writer. Consider updating the comment to reflect that it takes an arbitrary path parameter and transforms it. For example: "FileLabelFor returns the label for a file at the given path (after applying path transformation)".

Suggested change
- // FileLabelFor returns the label for the file for which the trap writer `tw` is associated
+ // FileLabelFor returns the label for a file at the given path (after applying path transformation).

owen-mc merged commit fabcd04 into github:main Nov 14, 2025
21 checks passed
owen-mc deleted the go/fix/dataset-check-errors-sourcefile branch November 14, 2025 21:04