Convert schema to JSON for Gemini #14

grighetto · 2020-09-24T21:11:06Z

┆Issue is synchronized with this Asana task by Unito

grighetto · 2020-09-24T21:14:18Z

DEPENDS ON #12

Let's first see the outcome of #12 before we go down this route.

aboudreault · 2020-09-28T19:10:22Z

@grighetto about the --drop-schema option: it basically just ensures that the keyspace ks1 (the default name) is dropped and recreated at startup. If I create a custom ks1.table1 table and omit the -d option, Gemini fails to execute the statements because it tries to use its default schema (unless I provide one).

So I'm afraid we'll need to implement a --json option in our anonymizer if we want to use Gemini. You have probably looked at Gemini more than I have; am I missing something here? If not, our options are:

1- Implement the --json option in the anonymizer. One concern I have is that we don't have a clear picture of what the Gemini schema format supports, so it's hard to know whether we'll run into unsupported features later, such as table options or data types.
2- Help the NoSQLBench devs get what we need for providing a schema as input and generating statements automatically.

grighetto · 2020-09-28T19:34:25Z

@aboudreault right, we need to generate the JSON file either way if we want to use Gemini (more details in #12). I only suggested using --drop-schema=false to make sure Gemini does not recreate the schema; that is, it will use the schema we create beforehand from the anonymizer output. This way we guarantee that NoSQLBench and Gemini operate on the exact same schema.

To answer your questions, I think the simplest solution at the moment is probably generating the JSON file for Gemini. You can gain some insight into the format by letting Gemini generate a random schema, which it does by default if you don't provide one, and then checking the JSON schema it prints to the console at startup.
There's already some work happening on the NoSQLBench side to generate the statements automatically, but that's more involved. Let's start with Gemini and add NoSQLBench later.
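
To make the conversion idea concrete, here is a minimal sketch of turning one simple CREATE TABLE statement into a JSON document. The key names (`keyspace`, `tables`, `partition_keys`, `clustering_keys`, `columns`) are assumptions for illustration and should be checked against the JSON that Gemini prints at startup. The helper `table_to_schema_dict` is hypothetical, not the anonymizer's code, and it only handles flat native-typed columns with a trailing PRIMARY KEY clause (no table options, collections, or UDTs).

```python
import json
import re


def table_to_schema_dict(cql: str) -> dict:
    """Convert one simple CREATE TABLE statement into a schema dict.

    The output key names are ASSUMED; compare them with the schema
    Gemini prints at startup before relying on them.
    """
    m = re.search(r"CREATE\s+TABLE\s+(\w+)\.(\w+)\s*\((.*)\)", cql, re.I | re.S)
    if m is None:
        raise ValueError("unsupported statement")
    keyspace, table, body = m.groups()

    pk = re.search(r"PRIMARY\s+KEY\s*\((.*)\)\s*$", body, re.I | re.S)
    if pk is None:
        raise ValueError("expected a trailing PRIMARY KEY clause")

    # Column definitions are everything before the PRIMARY KEY clause.
    types = {}
    for part in body[: pk.start()].split(","):
        part = part.strip()
        if part:
            name, cql_type = part.split(None, 1)
            types[name] = cql_type.strip()

    # PRIMARY KEY ((a, b), c) or PRIMARY KEY (a, c)
    inner = pk.group(1).strip()
    if inner.startswith("("):
        part_cols, _, rest = inner[1:].partition(")")
        partition = [p.strip() for p in part_cols.split(",")]
        clustering = [c.strip() for c in rest.split(",") if c.strip()]
    else:
        names = [n.strip() for n in inner.split(",")]
        partition, clustering = names[:1], names[1:]

    def cols(names):
        return [{"name": n, "type": types[n]} for n in names]

    keys = set(partition) | set(clustering)
    return {
        "keyspace": {"name": keyspace},
        "tables": [
            {
                "name": table,
                "partition_keys": cols(partition),
                "clustering_keys": cols(clustering),
                "columns": cols([n for n in types if n not in keys]),
            }
        ],
    }


if __name__ == "__main__":
    stmt = (
        "CREATE TABLE ks1.table1 "
        "(pk0 int, ck0 text, v0 bigint, PRIMARY KEY ((pk0), ck0));"
    )
    print(json.dumps(table_to_schema_dict(stmt), indent=2))
```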

aboudreault · 2020-09-28T19:46:31Z

Ok I see now what you were suggesting.

1- Use the anonymizer to generate the CQL schema file.
2- Load the CQL schema file on the clusters.
3- Use Gemini with -d=false so it uses the schema already loaded in the cluster. This ensures Gemini doesn't recreate anything, which could otherwise create inconsistencies between the CQL schema and the generated one.
4- Provide the appropriate schema.json to Gemini, so it is able to generate statements etc.
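
The steps above could be scripted roughly as follows. This is only a sketch that builds the command lines without running them. The `--drop-schema` flag comes from this thread; the `--schema`, `--test-cluster`, and `--oracle-cluster` flags are assumptions that should be verified with `gemini --help`. The function name, host addresses, and file names are hypothetical.

```python
from typing import List


def build_workflow(
    test_host: str, oracle_host: str, cql_file: str, json_file: str
) -> List[List[str]]:
    """Return the commands for steps 2 to 4 as argv lists (nothing is executed)."""
    commands = []

    # Step 2: load the anonymizer's CQL schema on both clusters.
    for host in (test_host, oracle_host):
        commands.append(["cqlsh", host, "-f", cql_file])

    # Steps 3 and 4: run Gemini against the pre-loaded schema.
    commands.append(
        [
            "gemini",
            "--test-cluster", test_host,      # assumed flag name
            "--oracle-cluster", oracle_host,  # assumed flag name
            "--schema", json_file,            # assumed flag name
            "--drop-schema=false",            # keep the schema we loaded
        ]
    )
    return commands


if __name__ == "__main__":
    for argv in build_workflow("10.0.0.1", "10.0.0.2", "schema.cql", "schema.json"):
        print(" ".join(argv))
```

Returning argv lists rather than shell strings keeps the sketch easy to inspect and to pass to `subprocess.run` later if desired.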

Sounds good. I will start looking at this.

aboudreault · 2020-09-30T16:03:41Z

For the MVP, UDTs (user-defined types) are going to be skipped.
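
One possible way to detect columns to skip, sketched under the assumption that any type name that is neither a native CQL type nor a collection wrapper refers to a UDT. The thread does not say how the anonymizer actually does this, and the helper names `references_udt` and `drop_udt_columns` are hypothetical.

```python
import re

# Native CQL data types.
NATIVE_TYPES = {
    "ascii", "bigint", "blob", "boolean", "counter", "date", "decimal",
    "double", "duration", "float", "inet", "int", "smallint", "text",
    "time", "timestamp", "timeuuid", "tinyint", "uuid", "varchar", "varint",
}
# Wrappers that may surround other types.
WRAPPERS = {"frozen", "list", "set", "map", "tuple"}


def references_udt(cql_type: str) -> bool:
    """Return True if any identifier in the type is not native or a wrapper."""
    names = re.findall(r"\w+", cql_type.lower())
    return any(n not in NATIVE_TYPES and n not in WRAPPERS for n in names)


def drop_udt_columns(columns: list) -> list:
    """Keep only the columns whose type does not reference a UDT."""
    return [c for c in columns if not references_udt(c["type"])]


if __name__ == "__main__":
    cols = [
        {"name": "v0", "type": "int"},
        {"name": "v1", "type": "frozen<address>"},
        {"name": "v2", "type": "map<text, int>"},
    ]
    print(drop_udt_columns(cols))
```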

grighetto added this to the Gemini Robustness milestone Sep 24, 2020

grighetto mentioned this issue Sep 26, 2020

Make Gemini work with an existing schema #12

Closed

grighetto added the medium label Sep 28, 2020

aboudreault mentioned this issue Sep 28, 2020

#8: Initial work to provide a schema.cql as input #11

Merged

aboudreault self-assigned this Sep 28, 2020

aboudreault modified the milestones: Gemini Robustness, MVP Sep 30, 2020

aboudreault mentioned this issue Oct 1, 2020

Add gemini schema support to anonymizer #22

Merged

aboudreault closed this as completed Oct 6, 2020

grighetto mentioned this issue Oct 16, 2020

Convert a given anonymized schema file to Gemini JSON #27

Closed
