Upload database with commit SHA and branch name #1399

Open
PavelBansky opened this issue Nov 28, 2022 · 8 comments
Labels
CodeQL Action This repo! Helps for internal planning enhancement New feature or request

Comments

@PavelBansky

Would it be possible to extend the CodeQL database upload/download API to include the commit SHA and branch name?
A database is not very useful if you don't know which version of the source code it belongs to.

This code is from src/database-upload.ts, which looks like the perfect place to pass the SHA and branch:

        `POST https://uploads.github.com/repos/:owner/:repo/code-scanning/codeql/databases/:language?name=:name`,
        {
          owner: repositoryNwo.owner,
          repo: repositoryNwo.repo,
          language,
          name: `${language}-database`,
          data: payload,
          headers: {
            authorization: `token ${apiDetails.auth}`,
            "Content-Type": "application/zip",
          },
        }

When calling the database list API:
https://api.github.com/repos/{repository full name}/code-scanning/codeql/databases

It would be nice to see the branch name and commit SHA in the response.

[
  {
    "id": 11071980,
    "name": "javascript-database",
    "language": "javascript",
    "uploader": {
      --- REMOVED TO REDUCE COMPLEXITY ---
    },
    "content_type": "application/zip",
    "size": 5680496,
    "created_at": "2022-11-28T14:19:59Z",
    "updated_at": "2022-11-28T14:19:59Z",
    "url": "https://api.github.com/repositories/553492177/code-scanning/codeql/databases/javascript"
  }
]
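
For readers who want to reproduce the request, here is a minimal TypeScript sketch of the list call and of what a client can read from the response today. The names `buildListRequest`, `summarizeDatabases`, and `CodeQLDatabaseEntry` are hypothetical and not part of the CodeQL Action. The interface covers only the fields shown in the response above, and the `token` authorization scheme simply mirrors the upload snippet.

```typescript
// Fields of one list entry, limited to those visible in the response above.
interface CodeQLDatabaseEntry {
  id: number;
  name: string;
  language: string;
  content_type: string;
  size: number;
  created_at: string;
  updated_at: string;
  url: string;
}

// Hypothetical request description; pass it to whichever HTTP client you use.
interface ListRequest {
  method: "GET";
  url: string;
  headers: { authorization: string };
}

// Builds the list request for the endpoint quoted above.
function buildListRequest(owner: string, repo: string, token: string): ListRequest {
  return {
    method: "GET",
    url: `https://api.github.com/repos/${owner}/${repo}/code-scanning/codeql/databases`,
    headers: { authorization: `token ${token}` },
  };
}

// Parses a response body and keeps the only change signals available in the
// fields shown above: the language, the creation timestamp, and the size.
function summarizeDatabases(body: string): string[] {
  const entries = JSON.parse(body) as CodeQLDatabaseEntry[];
  return entries.map(
    (entry) => `${entry.language}: created ${entry.created_at}, ${entry.size} bytes`
  );
}
```

Nothing in that summary identifies the source revision, which is the gap this issue is about.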
@aeisenberg
Contributor

Thanks for your feedback. I think this information is already available, though perhaps not quite in an easily accessible format:

  1. The SHA that the database was created from is available in the creationMetadata.sha property of the codeql-database.yml file inside the downloaded database.
  2. The branch is always the repository's default branch.

Is this the information you are looking for? We can do better at exposing this information and we are discussing internally.
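
As a rough illustration of point 1, the sketch below pulls the keys nested under `creationMetadata` out of the text of an already extracted codeql-database.yml. It is a line-based reader, not a YAML parser, and it assumes the block is written as plain indented `key: value` lines. The function name `readCreationMetadata` is hypothetical.

```typescript
// Reads the scalar keys nested directly under `creationMetadata:` from the
// text of a codeql-database.yml file. Line-based sketch only: it assumes
// simple indented `key: value` lines and does not handle general YAML.
function readCreationMetadata(yamlText: string): Record<string, string> {
  const result: Record<string, string> = {};
  let inBlock = false;
  for (const line of yamlText.split(/\r?\n/)) {
    if (/^creationMetadata:\s*$/.test(line)) {
      inBlock = true;
      continue;
    }
    if (!inBlock) {
      continue;
    }
    // An indented `key: value` line belongs to the block.
    const match = /^\s+([A-Za-z0-9_]+):\s*(.*)$/.exec(line);
    if (match) {
      // Drop one pair of surrounding quotes, if present.
      result[match[1]] = match[2].trim().replace(/^["']|["']$/g, "");
    } else if (line.trim() !== "") {
      // The first non-indented, non-empty line ends the block.
      break;
    }
  }
  return result;
}

// Usage with an invented file body; only the `sha` key is confirmed above.
const exampleYaml = ["creationMetadata:", "  sha: 0123456789abcdef"].join("\n");
const exampleSha = readCreationMetadata(exampleYaml).sha;
```

This still requires downloading and unzipping the database first, so it only helps once the bundle is on disk.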

@aeisenberg aeisenberg added enhancement New feature or request CodeQL Action This repo! Helps for internal planning labels Nov 28, 2022
@aeisenberg aeisenberg self-assigned this Nov 28, 2022
@PavelBansky
Author

PavelBansky commented Nov 28, 2022

@aeisenberg, thank you for your response.

  1. OK, that is good to know, but it forces me to download the DB and unzip it first, which is a bit cumbersome.

  2. I know that the branch SHOULD always be the default, but it's not a hard requirement, right? Nothing prevents the caller of the API from uploading a database from a user branch.

It would be cool if you could expose this information in the API response. We are working on an integration service, and being able to quickly tell whether a database was built from the same or a different version of the source code would be extremely helpful.

@aeisenberg
Contributor

Something to clarify here. The upload database request is not a public API and is restricted to the default branch. We have no intention of changing that. The list databases request is a public API, and you will receive at most one database per language. When a newer database is uploaded for a language, the older one is no longer available.

@PavelBansky
Author

@aeisenberg, it's good that only the default branch can be uploaded, but I see the following check in src/database-upload.ts:

  if (!(await actionsUtil.isAnalyzingDefaultBranch())) {
    // We only want to upload a database if we are analyzing the default branch.
    logger.debug("Not analyzing default branch. Skipping upload.");
    return;
  }

It looks like it is not the API restricting the user branch upload.

I know that only one database per language can be uploaded and I really like it.
But I still can't tell if the database changed from the API response. The created_at field will be different, but that doesn't mean the source code inside is different.

We have a GitHub App that checks for CodeQL databases and downloads them into our internal archive. Right now, it's hard to tell from the API response whether the database has changed unless I download it and check creationMetadata.sha.

@aeisenberg
Contributor

> It looks like it is not the API restricting the user branch upload.

This is an internal, undocumented request. You should not be explicitly calling this request. I know the server side does some checking on the request data, but I am not sure exactly to what extent it checks.

Are you downloading databases from third party repos and concerned that they may be calling this API in the wrong way?

> But I still can't tell if the database changed from the API response. The created_at field will be different, but that doesn't mean the source code inside is different.

We're discussing internally.

@PavelBansky
Author

PavelBansky commented Nov 29, 2022

@aeisenberg, you are correct. We are downloading databases from accounts and repos in our Enterprise GitHub, basically creating an archive of databases. In case of a security incident, we can easily query all of the databases for a specific code pattern.

The branch is not a big concern. The commit SHA (and ideally the CodeQL version) in the API response would be really nice, though...

@norascheuch
Contributor

The commit SHA is now part of the API response when getting CodeQL databases! Every newly created database will have the SHA. It will take some time until older databases, for which we hadn't stored the commit SHA, are replaced by newer builds that include it.
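
For an archiving service like the one described earlier in the thread, the new field could drive the download decision roughly as sketched below. The thread does not name the response field. The sketch assumes it is called `commit_oid`, which matches my reading of the current REST documentation but should be verified. `ArchivedDatabase`, `commitSha`, and `decideDownload` are hypothetical names for the archive side.

```typescript
// One entry from the list response. `commit_oid` is an assumed field name
// and may be missing or null on older databases.
interface ListedDatabase {
  language: string;
  created_at: string;
  commit_oid?: string | null;
}

// Hypothetical record of what the archive already holds for a language.
interface ArchivedDatabase {
  language: string;
  created_at: string;
  commitSha?: string;
}

type DownloadDecision = "download" | "skip";

// Decides whether the listed database needs to be fetched again.
function decideDownload(
  listed: ListedDatabase,
  archived: ArchivedDatabase | undefined
): DownloadDecision {
  if (archived === undefined) {
    return "download"; // nothing archived yet for this language
  }
  if (listed.commit_oid && archived.commitSha) {
    // Both sides know the commit: compare source revisions directly.
    return listed.commit_oid === archived.commitSha ? "skip" : "download";
  }
  // No SHA on one side: fall back to the timestamp, which can trigger
  // downloads of databases built from unchanged source.
  return listed.created_at === archived.created_at ? "skip" : "download";
}
```

Comparing only the SHA skips rebuilds of the same commit made with a newer CodeQL version. A service that cares about that would still need to read the version from codeql-database.yml.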

@aeisenberg
Contributor

The CodeQL version used to create the database is available in the codeql-database.yml file. Is this sufficient for your purposes?
