Speed-up database downloads #1805

Open
aibaars opened this issue Nov 29, 2022 · 0 comments
Labels: enhancement, VSCode

aibaars commented Nov 29, 2022

Is your feature request related to a problem? Please describe.

I feel like it takes too long to download a CodeQL database from GitHub into VSCode.

Describe the solution you'd like

Use multi-threaded downloads to speed things up.

Describe alternatives you've considered
N/A

Additional context

For example, the QL database from github/codeql is only 160MB, but it takes 2 minutes to download. If I download the file as 10 concurrent chunks, the download takes less than 10 seconds. I wrote a small bash script to demonstrate.
A single 160MB chunk:

time sh script.sh github/codeql ql 1
gh api -H Accept: application/zip -H Range: bytes=0-165712932 /repos/github/codeql/code-scanning/codeql/databases/ql

real	2m9.894s
user	0m0.439s
sys	0m1.426s

and a download with 10 chunks of 16MB:

time sh script.sh github/codeql ql 10
gh api -H Accept: application/zip -H Range: bytes=0-16571293 /repos/github/codeql/code-scanning/codeql/databases/ql
gh api -H Accept: application/zip -H Range: bytes=16571294-33142587 /repos/github/codeql/code-scanning/codeql/databases/ql
gh api -H Accept: application/zip -H Range: bytes=33142588-49713881 /repos/github/codeql/code-scanning/codeql/databases/ql
gh api -H Accept: application/zip -H Range: bytes=49713882-66285175 /repos/github/codeql/code-scanning/codeql/databases/ql
gh api -H Accept: application/zip -H Range: bytes=66285176-82856469 /repos/github/codeql/code-scanning/codeql/databases/ql
gh api -H Accept: application/zip -H Range: bytes=82856470-99427763 /repos/github/codeql/code-scanning/codeql/databases/ql
gh api -H Accept: application/zip -H Range: bytes=99427764-115999057 /repos/github/codeql/code-scanning/codeql/databases/ql
gh api -H Accept: application/zip -H Range: bytes=115999058-132570351 /repos/github/codeql/code-scanning/codeql/databases/ql
gh api -H Accept: application/zip -H Range: bytes=132570352-149141645 /repos/github/codeql/code-scanning/codeql/databases/ql
gh api -H Accept: application/zip -H Range: bytes=149141646-165712932 /repos/github/codeql/code-scanning/codeql/databases/ql

real	0m9.752s
user	0m1.069s
sys	0m2.009s
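
To put the two timings side by side, here is a rough throughput calculation. It is only a sketch: it uses the total size and the `real` times from the two runs above, and the helper name `throughput_summary` is made up for illustration. The 10-chunk time also includes the extra request that probes the file size, so the actual transfer rate is probably slightly higher than computed here.

```shell
#!/bin/sh
# Hypothetical helper: compares the two runs shown above.
# size is the total byte count from the Range headers; t1 and t10 are
# the wall-clock ("real") times of the 1-chunk and 10-chunk runs.
throughput_summary() {
  awk -v size=165712932 -v t1=129.894 -v t10=9.752 'BEGIN {
    printf "single chunk: %.2f MB/s\n", size / t1 / 1e6
    printf "10 chunks: %.2f MB/s\n", size / t10 / 1e6
    printf "speed-up: %.1fx\n", t1 / t10
  }'
}

throughput_summary
```

So the single request runs at a little over 1 MB/s, and splitting the download into 10 ranges gives roughly a 13x improvement on this connection.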

The script

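#!/bin/bash
# Usage: sh script.sh <owner/repo> <language> <number of chunks>

nwo="$1"
lang="$2"
count="$3"

URL="/repos/${nwo}/code-scanning/codeql/databases/${lang}"

# Request only the first two bytes and read the total size from the
# Content-Range response header ("bytes 0-1/<size>"). Header names are
# case-insensitive, hence grep -i.
SIZE=$(gh api -H "Accept: application/zip" -H "Range: bytes=0-1" -i "${URL}" | tr -d '\r' | grep -i "Content-Range: bytes 0-1/" | cut -d / -f 2)
CHUNK_SIZE=$((SIZE / count))

start=0
parts=""
# Start the first count-1 chunks as background jobs.
for i in $(seq $((count - 1)))
do
  end=$((start + CHUNK_SIZE))
  echo gh api -H "Accept: application/zip" -H "Range: bytes=${start}-${end}" "${URL}"
  gh api -H "Accept: application/zip" -H "Range: bytes=${start}-${end}" "${URL}" > "part-$i" &
  start=$((end + 1))
  parts="${parts}part-${i} "
done

# Fetch the remainder in the foreground. The end offset is SIZE rather than
# SIZE-1; a range end past the last byte is treated as "up to the end".
if [ "${start}" -lt "${SIZE}" ] ; then
  echo gh api -H "Accept: application/zip" -H "Range: bytes=${start}-${SIZE}" "${URL}"
  gh api -H "Accept: application/zip" -H "Range: bytes=${start}-${SIZE}" "${URL}" > "part-${count}"
  parts="${parts}part-${count}"
fi
# Wait for the background downloads to finish.
wait

# Concatenate the parts in order. $parts is deliberately left unquoted so
# that it splits into one argument per file name.
cat $parts > database.zip
rm -f $parts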
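
The range arithmetic in the script can also be isolated into a small function, which makes it easy to check without touching the network. The sketch below is a variant, not the script's exact logic: `byte_ranges` is a hypothetical helper, every chunk except the last is exactly `size / count` bytes, and the last range ends at `size - 1` instead of relying on the server to clamp it. It assumes `size` is at least `count`.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original script): print COUNT
# inclusive byte ranges that together cover bytes 0..SIZE-1 exactly once.
# Assumes SIZE >= COUNT so that no range is empty.
byte_ranges() {
  size="$1"
  count="$2"
  chunk=$((size / count))
  start=0
  i=1
  while [ "$i" -le "$count" ]; do
    if [ "$i" -eq "$count" ]; then
      end=$((size - 1))            # last range absorbs the remainder
    else
      end=$((start + chunk - 1))   # ranges are inclusive at both ends
    fi
    echo "${start}-${end}"
    start=$((end + 1))
    i=$((i + 1))
  done
}

# Example: a 10-byte file in 3 chunks prints 0-2, 3-5 and 6-9.
byte_ranges 10 3
```

Each printed line could then be passed as the value of a `Range: bytes=...` header, in the same way the script builds its requests.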
aibaars added the enhancement label Nov 29, 2022