Skip to content

Fix incorrect model files on case-insensitive file systems#3148

Merged
koesie10 merged 2 commits intomainfrom
koesie10/model-editor-capitalization
Dec 18, 2023
Merged

Fix incorrect model files on case-insensitive file systems#3148
koesie10 merged 2 commits intomainfrom
koesie10/model-editor-capitalization

Conversation

@koesie10
Copy link
Copy Markdown
Member

This fixes some incorrect model files on case-insensitive file systems when the package names are the same but the capitalization is different.

For example, when there are two packages Volo.Abp.TestApp.MongoDb and Volo.Abp.TestApp.MongoDB, there would be 1 model file for each package. However, on case-insensitive file systems, the second file would overwrite the first file. This results in missing models. This fixes it by canonicalizing the filenames to lowercase and writing all files with the same package name to the same file.

Screenshot 2023-12-15 at 15 10 32

Checklist

  • CHANGELOG.md has been updated to incorporate all user visible changes made by this pull request.
  • Issues have been created for any UI or other user-facing changes made by this pull request.
  • [Maintainers only] If this pull request makes user-facing changes that require documentation changes, open a corresponding docs pull request in the github/codeql repo and add the ready-for-doc-review label there.

This fixes some incorrect model files on case-insensitive file systems
when the package names are the same but the capitalization is different.

For example, when there are two packages `Volo.Abp.TestApp.MongoDb` and
`Volo.Abp.TestApp.MongoDB`, there would be 1 model file for each
package. However, on case-insensitive file systems, the second file
would overwrite the first file. This results in missing models. This
fixes it by canonicalizing the filenames to lowercase and writing all
files with the same package name to the same file.
@koesie10 koesie10 marked this pull request as ready for review December 15, 2023 14:24
@koesie10 koesie10 requested a review from a team as a code owner December 15, 2023 14:24
Comment thread extensions/ql-vscode/src/model-editor/yaml.ts Outdated

// Ensure that if a file exists on disk, we use the same capitalization
// as the original file.
actualFilenameByCanonicalFilename[canonicalFilename] = filename;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering what will happen if two files with the same canonicalization already exist on disk.

What I believe will happen is that we'll arbitrarily pick one of them as the true filename. Which one we pick will depend on the order they were listed by codeql resolve extensions when we listed existing packs. We'll merge their data together into one file (unless they contain data for the same method signature, in which the canonical file will win) and write all of this data back out to the canonical file. The non-canonical file will remain untouched.

This could be quite bad actually. When the extension or CodeQL reads in data extensions it'll read data from both the canonical and non-canonical files, but when the extension writes data it'll only update the canonical file. Effectively the non-canonical file will because invisible and untouchable by the model editor, but it'll continue to exist and supply model data to CodeQL.

Do you agree with my understanding here, or have I got it wrong?

I'm not sure what else we could do here. Should we display a warning? Or could we go as far as to delete the other file after we've merged its data into the canonical file?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you're right, it'll just pick one arbitrarily and merge all data into that file. I'm not sure if we should be deleting the file because we could be deleting data that is not in the merged file (this can happen if the CodeQL query doesn't return the method). This can only happen on Linux or on a (non-default) case-sensitive filesystem on Windows/macOS.

I'm not sure if we should be handling this case because having two files with the same canonical filename will already fail on Windows/macOS. For example, if you create a model pack with a.yml and A.yml, I suspect that the CodeQL CLI will already arbitrarily pick one of these files on case-insensitive filesystems. Starting from this PR, we should never create filenames with different capitalizations, and I don't really think this is something the model editor should check.

This bug is also very unlikely to happen in Java since package names are usually lowercase, and even in C# it should be very unlikely. The only reason that I ran into this bug is because one database contained multiple test projects with different capitalization.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This bug is also very unlikely to happen in Java since package names are usually lowercase, and even in C# it should be very unlikely.

That's a good point. It is unlikely to happen in practice. It's likely only to happen if people manually create a conflicting file. Given we don't have many users right now and it's unlikely someone hit this already I think we can ignore it for now.

we could be deleting data that is not in the merged file

Doesn't change our decision, but I don't think this is true. For every method we're saving we'll load the full contents of any files that canonicalise that way, and then write the canonical files back out again.

Anyway, let's move on for now. We're agreed we want to do this transformation to the canonical file. If any users hit bugs or confusion around this then we can add extra logging or warnings later on.

Comment thread extensions/ql-vscode/test/unit-tests/model-editor/yaml.test.ts
Copy link
Copy Markdown
Contributor

@robertbrignull robertbrignull left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This LGTM as a fix for case-insensitive file systems.

There could still be edge cases. We could cover this in more unit tests. But maybe we should avoid getting bogged down and instead just go ahead and assess further if we see bugs.

@koesie10 koesie10 merged commit 82f17c4 into main Dec 18, 2023
@koesie10 koesie10 deleted the koesie10/model-editor-capitalization branch December 18, 2023 15:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants