Binary datasets #3513
Comments
We currently keep binary datasets under the dataset folder for extension testing purposes.
Well, they wouldn't be CI artifacts locally, but there are a couple of options I think:
That said, for the tinysnb dataset used by the extension, is it necessary that it be shared between different platforms? The binary-demo dataset was added explicitly for testing stability across multiple platforms, but the tinysnb one seems like it could just always be generated as part of the
You download them first. See "Passing data between jobs in a workflow" in the GitHub docs.
Instead of storing binary datasets in the git repository, it might be a better idea to generate them as CI artifacts and share them between pipelines when they are used for compatibility tests.
That would remove the need to regenerate them and would avoid cluttering the repository with large binary artifacts (it would also mean that the compatibility tests could use larger datasets more easily).
On the other hand, it would add some CI pipeline dependencies that may slow the pipelines down overall: the job generating the database would need to run before any job that uses the database starts, and that job would need to build kuzu. Most of the build steps only take about a minute, though, so it shouldn't be too much of a slowdown.
Edit: It's also really annoying to deal with the binary database tests locally when making changes that break storage: short of updating the version number every time you make a change, nothing prevents the tests from running on the old database and potentially allocating arbitrarily large amounts of memory when data isn't where it should be. It might be better to have those tests disabled by default and only run in CI.
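As a rough illustration of the "disabled by default, run in CI" idea, here is a minimal Python sketch of the gating logic. It relies on GitHub Actions setting the `CI` environment variable to `true`. The opt-in variable `RUN_BINARY_DATASET_TESTS` and the function name are hypothetical, made up for this example, and are not part of kuzu or its test suite.

```python
import os


def should_run_binary_dataset_tests(env=None):
    """Decide whether the binary-dataset tests should run.

    Runs them in CI (GitHub Actions sets CI=true) or when a developer
    opts in locally via RUN_BINARY_DATASET_TESTS, a hypothetical
    variable name used only for this sketch.
    """
    env = os.environ if env is None else env

    # Explicit local opt-in takes priority over CI detection.
    opt_in = env.get("RUN_BINARY_DATASET_TESTS", "").strip().lower()
    if opt_in in ("1", "true", "yes"):
        return True

    # Otherwise run only when the CI variable is set to "true".
    return env.get("CI", "").strip().lower() == "true"


if __name__ == "__main__":
    if should_run_binary_dataset_tests():
        print("running binary dataset tests")
    else:
        print("skipping binary dataset tests (not in CI, no opt-in)")
```

A test runner could call something like this once at startup and skip the binary database suite when it returns `False`, so a local change that breaks storage never touches a stale database unless the developer asks for it.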