Skip to content

Extension should notify users when their QL libraries are out of date #495

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
aeisenberg opened this issue Jul 10, 2020 · 5 comments
Open
Labels
enhancement New feature or request VSCode

Comments

@aeisenberg
Copy link
Contributor

Background

Most users consume QL through the vscode-codeql-starter workspace. Some cli and ide features not work unless you have up to date QL libraries. Most users aren't aware of this and there is no mechanism to notify users that updates need to happen. The only situation where users are notified is when they add a new database built from a later version of the QL libraries. However, this is not sufficient.

Proposed solution

The extension should periodically check to see if there are new QL libraries available and warn the user if there are. Optionally, the extension could try to install the changes automatically. Otherwise, clear manual instructions should be given.

Caveats and questions

  1. This will only work for the vscode-codeql-starter workspace.
  2. What is considered a new version? We probably don't want to assume any change in master is new. Likely, we want to parse tags and look for new version numbers. Or we can do this through branches.
  3. We can attempt to automatically upgrade. This is just going to be pulling the submodules, which probably won't do any harm unless a user has made changes or commits to the core queries.

It might look like this:

  1. On restart or periodically, fetch the tags for submodules
  2. Check to see if any new tagged version exists.
  3. If so prompt the user to update their libraries
  4. Ensure there is clear instuctions on how to do this.
@aeisenberg aeisenberg added the enhancement New feature or request label Jul 10, 2020
@dbartol
Copy link
Contributor

dbartol commented Jul 11, 2020

As a security researcher playing with a one-off query, I'd probably be fine with always using the latest official QL libraries. As a QL query author developing a query to be used by the rest of my team or company, I really need my libraries to match the ones that will actually be used by my teammates and CI systems. As a QL library author, I need my libraries to match the ones I expect consumers of my library to use. And in any of these scenarios, if the latest official library contains a breaking change, I don't want the extension to automatically update the libraries and break me, I don't want the extension to prompt me "Do you want to upgrade to enable these cool IDE features? Warning: this may break your code!", and I still want to use the cool IDE features if I can.

It appears that the main reason we would need to have the user update the libraries is because we've chosen to distribute the IDE queries as part of the standard library. However, it looks like the IDE queries we have today depend on a relatively small surface area of the standard libraries. The PrintAst.ql IDE query just imports PrintAST.qll, which has been in the library for a year or two, for C++ anyway. The definitions.ql and its supporting definitions.qll library seem to be even older. What we can do today, or at least with not too much work, is have the IDE queries in a separate qlpack, with a dependency on a particular range of standard library versions. We would have the IDE queries either packaged as part of the extension, or downloaded automatically as separate packs. The former is probably fine for now. If the user's library version is compatible with the IDE query, then that feature works as expected. If the user's library version is of an incompatible version, we would have to degrade gracefully. Most likely, we're not likely to make the sort of changes that would break these queries anytime soon, and even if we did, we could just bundle both an old and an updated version of the IDE query, and choose which one to use based on the available library version.

In the future, a combination of non-destructive database upgrades and true query packages will probably make this easier, but I think we can still make the experience better today without having to ask users to upgrade the libraries their code depends on because their IDE wants to depend on a different library version.

Actual work required:

  • Move (or even just copy) the IDE queries into a separate qlpack that is deployed with the extension. Mark it as dependent upon a particular version or version range of the appropriate standard library.
  • When deciding whether a given IDE feature should be available, check the user's library version to see if it's compatible with the one the query needs. If not, fail gracefully, possibly prompting the user to upgrade.
  • Update the version number of each standard library each time we release a new lgtm.com/Code Scanning/starter repo dist upgrade. The versions are all 0.0.0 right now, so we could probably just bump to something like 0.0.1 and keep incrementing the patch version for each release, since 0.0.* versions aren't expected to preserve compatibility in general. This lets us avoid the question of what constitutes a SemVer breaking change until we're ready to actually deal with that.

@aeisenberg
Copy link
Contributor Author

That's a great idea. It is kind of odd to have extension features be dependent on the sources in your workspace. Your proposal is not going to completely get around that, but at least it puts extension-specific bits in the extension repo.

This will make our logic for generating qlpacks slightly more complex, but still doable.

One question:

Are version checks already implemented? If so, I think the extension wouldn't need to check versions, it just attempts to run the query and if the compiler fails due to bad version constraints, then we report an understandable message back to the user.

@aeisenberg
Copy link
Contributor Author

So, there is a small concern I have with your proposal. I'm not sure how much to make of it since this architectural decision was made before I started.

The way that contextual queries are implemented now, the extension doesn't need to know anything about which languages, libraries, or extractors are available. The extension just knows to run query X, with some bound externals and using the default query pack.

So, as long as the standard libraries for a given language implement the query correctly, the contextual queries can run properly. The extension doesn't need to change to fully support Ruby (for example).

I can see the advantages for separating responsibilities in this manner, but I'm not arguing too strongly for it since I also see the advantage of leaving IDE-things in the IDE. Though, I've had a similar conversation with @p0 and @jcreedcmu in the past., so they may have something different to say.

@dbartol
Copy link
Contributor

dbartol commented Jul 13, 2020

Side note: We should be imagining a world in which the starter repo is not necessary in order to write or use a query. At some point, the process of writing a new query will be basically:

codeql pack init --language ruby
emacs MyQuery.ql
codeql query run ...

(or the IDE equivalent, of course).

The starter repo may still exist, and may still be a useful way to play around with and modify existing queries, but we shouldn't assume that the user has a checkout of the starter repo at all. If the user is writing a query, there will at least be a copy of the standard library installed at some point via the package manager, so we'd still have access to that if we needed it. If the user is merely fixing alerts from an existing query, or just exploring a source code base using CodeQL-assisted contextual queries, there may not be a standard library on the user's machine at all. I don't know how much we care about this last scenario, though.

@dbartol
Copy link
Contributor

dbartol commented Jul 13, 2020

I might be overthinking this. If we include the contextual queries themselves in the standard library, what happens at each of the following events?

Changes to the standard library without affecting the contextual query itself
A new standard library version is released with no changes to the contextual query. This has no effect on users of the contextual query.

Changes to the standard library that affect the results of the contextual query, without changing the contextual query itself
A new standard library version is released. Users of the old version still get the old results, and anyone who updates to the new version of the standard library gets the new results. I think this is exactly how we want it.

Changes to the contextual query without changing the standard library itself
Forces an update (probably just a patch update, though) to the standard library. Forces the user to upgrade to a new standard library version to get the fix, but it should be a safe upgrade.

CodeQL for VS Code implements support for a new contextual query
Users can't use the new UI without updating to a new version of the standard library. This is my main issue with bundling the contextual queries in the standard library, but its impact depends on how often we add new support and how often users upgrade standard library versions. If we implement lots of these, then it's a problem. If we're just going to implement jump-to-def, PrintAst, and maybe PrintIR, it's not that big a deal.

Here's a thought: What if, whenever we start to support a new contextual query, we write the implementation for each language that we care about, and include that implementation within the extension so it's always available, and have it depend on the lowest version of the standard library that it can reasonably support. We also commit that same implementation to the repo for the relevant standard library, or, in the case of an external language, submit a PR to that language's repo. If the user's standard library already has an implementation of the query, we use that. Otherwise, we fall back to the one bundled with the extension if one exists. If we don't have one of those, either, we fail gracefully. Users get to use the new query immediately, implementers of other languages can add support at their leisure, although it will force their users to upgrade to a new standard library to take advantage of it. And if a language implementer does want to allow existing users to use the query, they can submit a PR to the VS Code extension to add the bundled copy. If we ever get to the point where this happens so often that we want a more sophisticated way of automatically discovering a separate query pack containing the implementations of contextual queries for a particular language without bundling them, I think we'll be clearly into "nice problem to have" territory, and can figure it out then.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request VSCode
Projects
None yet
Development

No branches or pull requests

3 participants