Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix LSIF indexing handling of duplicate analyzer references #68423

Conversation

jasonmalinowski
Copy link
Member

@jasonmalinowski jasonmalinowski commented Jun 2, 2023

The LSIF tool supports many types of inputs, but two in particular:

  1. A JSON file that contains the command line arguments passed to the compiler.
  2. A binlog which will be searched for invocations.

When the binlog support was implemented, it called into the implementation for JSON files to do the heavy lifting of actually parsing the command line. But oddly, that code somewhat reinvented the implementation of our CommandLineProject type, so it could also support some specific path mapping logic. Unfortunately implementation for JSON inputs has a bunch of bugs that don't exist in the CommandLineProject code. Specifically, duplicate analyzer references are not correctly handled which triggered this investigation.

In this commit, I'm fixing some bugs in CommandLineProject and then making the binlog code just directly call into CommandLineProject which removes the buggy code for processing JSON inputs from the binlog path. As a follow up, we should either also move the JSON input handling to CommandLineProject after adding path mapping support to it, or just deprecate that support entirely.

The LSIF tool supports many types of inputs, but two in particular:

1. A JSON file that contains the command line arguments passed to
   the compiler.
2. A binlog which will be searched for invocations.

When the binlog support was implemented, it called into the
implementation for JSON files to do the heavy lifting of actually
parsing the command line. But oddly, that code somewhat reinvented the
implementation of our CommandLineProject type, so it could also support
some specific path mapping logic. Unfortunately implementation for
JSON inputs has a bunch of bugs that don't exist in the
CommandLineProject code Specifically, duplicate analyzer
references are not correctly handled which triggered this investigation.

In this commit, I'm fixing some bugs in CommandLineProject and then
making the binlog code just directly call into CommandLineProject which
removes the buggy code for processing JSON inputs from the binlog path.
As a follow up, we should either also move the JSON input handling to
CommandLineProject after adding path mapping support to it, or
just deprecate that support entirely.
@jasonmalinowski jasonmalinowski requested a review from a team as a code owner June 2, 2023 03:56
@dotnet-issue-labeler dotnet-issue-labeler bot added Area-IDE untriaged Issues and PRs which have not yet been triaged by a lead labels Jun 2, 2023
@jasonmalinowski jasonmalinowski changed the title Fix lsif indexing handling of duplicate references Fix LSIF indexing handling of duplicate references Jun 2, 2023
@jasonmalinowski jasonmalinowski self-assigned this Jun 2, 2023
@jasonmalinowski jasonmalinowski changed the title Fix LSIF indexing handling of duplicate references Fix LSIF indexing handling of duplicate analyzer references Jun 2, 2023
hostObjectType: null);

return projectInfo;

IList<DocumentInfo> CreateDocuments(ImmutableArray<CommandLineSourceFile> files)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this just extracting the helper and additionally uses it for analyzerConfigDocuments? Is there other changes in how those documents are created?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@genlu: yeah for some reason it was copied/pasted for additional files. The only change I saw was special handling around regular/script, but since the compiler's inputs will just count additional files as "regular" that wouldn't be a problem in the first place.

Copy link
Member

@genlu genlu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit:

@jasonmalinowski jasonmalinowski merged commit e006309 into dotnet:main Jun 5, 2023
25 checks passed
@jasonmalinowski jasonmalinowski deleted the fix-lsif-indexing-handling-of-duplicate-references branch June 5, 2023 21:23
@ghost ghost added this to the Next milestone Jun 5, 2023
@RikkiGibson RikkiGibson modified the milestones: Next, 17.7 P3 Jun 28, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area-IDE untriaged Issues and PRs which have not yet been triaged by a lead
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants