Full support of SERIAL data type #1628

Closed
ray6080 opened this issue Jun 5, 2023 · 1 comment

Comments

@ray6080
Contributor

ray6080 commented Jun 5, 2023

For now, SERIAL has been added as an experimental feature.
Users can declare the primary key property of a node table as SERIAL to avoid creating a HashIndex for that table, which can speed up data ingestion considerably (no index construction or lookups).

Internally, SERIAL is interpreted as INT64, and its values are auto-incremented as new nodes are appended.
Essentially, SERIAL is an alias for a SEQUENCE of type INT64 that starts at 0 and increments by 1.
For node tuples, we assume their SERIAL values are aligned with our internal node offsets.
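
To make the alignment assumption concrete, here is a minimal Python sketch of the idea, not the actual storage code. The names `SerialColumn`, `NodeTable`, `append_node` and `lookup` are hypothetical and only model the behaviour described above: an int64 counter that starts at 0, increments by 1, and always equals the node offset.

```python
from dataclasses import dataclass, field


@dataclass
class SerialColumn:
    """Hypothetical model of SERIAL: a sequence with start 0 and increment 1."""
    next_value: int = 0

    def append(self) -> int:
        value = self.next_value
        self.next_value += 1
        return value


@dataclass
class NodeTable:
    """Toy node table in which the row position doubles as the node offset."""
    rows: list = field(default_factory=list)
    serial: SerialColumn = field(default_factory=SerialColumn)

    def append_node(self, **props) -> int:
        offset = len(self.rows)
        serial_id = self.serial.append()
        # The assumption from the text: SERIAL value == internal node offset.
        assert serial_id == offset
        self.rows.append({"ID": serial_id, **props})
        return offset

    def lookup(self, serial_id: int) -> dict:
        # The key is the offset itself, so no hash index is consulted.
        return self.rows[serial_id]
```

Because the primary key is the offset, a lookup in this sketch is a plain positional access, which is why no index has to be built or probed during ingestion.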

Take a node table person with the schema (ID SERIAL, age INT64, name STRING, PRIMARY KEY(ID)).
To fully support SERIAL, we should consider the following cases:

  1. Support copying into a specified subset of table properties. For example, COPY person(name) FROM 'person_names.csv' copies a csv file with a single column ("name") into the name property only. ID and age then take their default values: ID is auto-incremented as a SERIAL starting from 0 (assuming the table is empty), and age is set to 0 for every row.
  2. Support copying into a SERIAL property from csv files. This handles the case where a table contains only SERIAL properties, for example CREATE NODE TABLE n(ID SERIAL, PRIMARY KEY(ID)). In this case, the values in the csv file must match the auto-increment pattern.
  3. Handle the creation of node tuples after deletions. Since we reuse deleted node offsets, we reuse the corresponding deleted SERIAL values too.
  4. Consider adding support for SEQUENCE in the future, if necessary.
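
Cases 1 to 3 can be sketched roughly as follows. This is an illustrative Python model under stated assumptions, not a proposal for the real implementation: `copy_partial`, `check_serial_column` and `SerialAllocator` are hypothetical names, the csv file is assumed to carry a header row, and the order in which deleted offsets are reused (most recently deleted first) is an arbitrary choice made for the sketch.

```python
import csv
import io

# Default for properties that are not listed in the COPY statement (from case 1).
DEFAULTS = {"age": 0}


def copy_partial(csv_text, start=0):
    """Case 1: copy only the listed columns; ID is auto-incremented from `start`."""
    reader = csv.DictReader(io.StringIO(csv_text))  # assumes a header row
    rows = []
    for serial_id, record in enumerate(reader, start=start):
        rows.append({"ID": serial_id, **DEFAULTS, **record})
    return rows


def check_serial_column(values, start=0):
    """Case 2: reject csv values that do not follow start, start + 1, ..."""
    for expected, raw in enumerate(values, start=start):
        if int(raw) != expected:
            raise ValueError(
                f"row {expected - start}: expected SERIAL value {expected}, got {raw}"
            )


class SerialAllocator:
    """Case 3: reuse deleted offsets (and their SERIAL values) before growing."""

    def __init__(self):
        self.next_offset = 0
        self.free_offsets = []  # offsets released by deletions

    def allocate(self):
        if self.free_offsets:
            return self.free_offsets.pop()  # reuse order is an assumption
        offset = self.next_offset
        self.next_offset += 1
        return offset

    def release(self, offset):
        self.free_offsets.append(offset)
```

In this model, a deletion followed by a creation hands the same SERIAL value back out, so the values stay aligned with node offsets but are no longer unique over the lifetime of the table.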

This is still debatable. Let me know if there are any ideas around this.

@andyfengHKU
Contributor

Closing as a duplicate of #1496.
