Add sgr cloud download (download query as CSV) #587

Merged
mildbyte merged 7 commits into master from feature/CU-1nwr3yk-sgr-cloud-export on Dec 17, 2021

@mildbyte (Contributor)

Use the GQL Export API to download query results as a .csv.gz file.

Sample usage:

$ cat test.sql
SELECT * FROM "some/repo".some_table

$ sgr cloud download "$(cat test.sql)"
⣯ (SUCCESS) Waiting for task ID 5ebe8857-5592-4271-a6af-0360f2b74692
query-e2abcf218bbc964ff9999b08c06c97447018c395-20211217-134600.csv.gz: 100%|██████████████████████████████████████| 104M/104M [00:07<00:00, 13.6MB/s]
Downloaded query results to query-e2abcf218bbc964ff9999b08c06c97447018c395-20211217-134600.csv.gz.

(splitgraph-3.9.4)  ~ $ cat query-e2abcf218bbc964ff9999b08c06c97447018c395-20211217-134600.csv.gz | gunzip | wc -l
12232602
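
The `wc -l` check above counts newline characters, which matches the number of CSV records only when no field contains an embedded line break. As a rough alternative, the sketch below streams the downloaded `.csv.gz` through Python's `gzip` and `csv` modules and counts parsed records instead. The function name `count_csv_rows` is made up for illustration, and the assumptions that the export is UTF-8 and starts with a header row are not confirmed by this PR.

```python
import csv
import gzip


def count_csv_rows(path, has_header=True):
    """Count CSV records in a gzip-compressed file without unpacking it to disk.

    Assumes UTF-8 text. `has_header` is an assumption about the export format:
    set it to False if the file turns out to have no header row.
    """
    # newline="" lets the csv module handle line breaks inside quoted fields.
    with gzip.open(path, mode="rt", newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        if has_header:
            next(reader, None)  # skip the header row if there is one
        return sum(1 for _ in reader)
```

Calling `count_csv_rows("query-....csv.gz")` on the downloaded file would then give the number of data rows, which can differ from the `wc -l` figure by the header line and by any multi-line fields.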

mildbyte merged commit 16ef8f6 into master on Dec 17, 2021
mildbyte deleted the feature/CU-1nwr3yk-sgr-cloud-export branch on December 17, 2021 at 15:37
mildbyte added a commit that referenced this pull request on Dec 17, 2021:

Fleshing out the `splitgraph.yml` (aka `repositories.yml`) format that defines a Splitgraph Cloud "project" (datasets, their sources and metadata).

Existing users of `repositories.yml` don't need to change anything, though note that `sgr cloud` commands using the YAML format will now default to `splitgraph.yml` unless explicitly set to `repositories.yml`.


New sgr cloud commands:

See #582 and #587

These let users manipulate Splitgraph Cloud and ingestion jobs from the CLI:

  * `sgr cloud status`: view the status of ingestion jobs in the current project
  * `sgr cloud logs`: view job logs
  * `sgr cloud upload`: upload a CSV file to Splitgraph Cloud (without using the engine)
  * `sgr cloud sync`: trigger a one-off load of a dataset
  * `sgr cloud stub`: generate a `splitgraph.yml` file
  * `sgr cloud seed`: generate a Splitgraph Cloud project with a `splitgraph.yml`, GitHub Actions, dbt, etc.
  * `sgr cloud validate`: merge multiple project files and output the result (like `docker-compose config`)
  * `sgr cloud download`: download a query result from Splitgraph Cloud as a CSV file, bypassing time/query size limits.


repositories.yml/splitgraph.yml format:

Change the commands that use `repositories.yml` to default to `splitgraph.yml` instead. Allow "mixing in" multiple `.yml` files Docker Compose-style, which is useful for keeping credentials (which shouldn't be checked in) separate from data settings.
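
To make the "mixing in" idea concrete, here is a minimal sketch of the general layering pattern: later files override earlier ones, and nested mappings are merged key by key. This is not the `sgr` implementation. The helper `merge_projects`, the rule that lists are replaced wholesale, and the keys in the sample dicts are all assumptions for illustration and do not reflect the real `splitgraph.yml` schema; the linked documentation is the authority on how conflicts are actually resolved.

```python
from copy import deepcopy
from functools import reduce


def merge_projects(base, override):
    """Return a new dict in which `override` is layered on top of `base`.

    Nested dicts are merged key by key; any other value (including lists)
    in `override` replaces the one in `base`. Inputs are left unmodified.
    """
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_projects(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


# Hypothetical contents of two project files, already parsed into dicts.
# The keys are invented for this example.
data_settings = {
    "repositories": {"some/repo": {"plugin": "csv", "schedule": "daily"}},
}
credentials = {
    "credentials": {"my_bucket": {"access_key": "REDACTED"}},
    "repositories": {"some/repo": {"schedule": "hourly"}},
}

# Layer the files left to right, as one would pass them on a command line.
project = reduce(merge_projects, [data_settings, credentials], {})
```

In this sketch the data settings can be committed to version control while the credentials file stays local, and the merged `project` contains both.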

Temporary location for the new full documentation on `splitgraph.yml`: https://github.com/splitgraph/splitgraph.com/blob/f7ac524cb5023091832e8bf51b277991c435f241/content/docs/0900_splitgraph-cloud/0500_splitgraph-yml.mdx


Miscellaneous:

  * Initial backend support for "transforming" Splitgraph plugins, including dbt (#574)
  * Dump scheduled ingestion/transformation jobs with `sgr cloud dump` (#577)