dolt table import requires an empty column for AUTO_INCREMENT columns #5848

alecstein · 2023-05-01T17:30:52Z

This works.

create table bodegas (
  bodega_id INT AUTO_INCREMENT,
  bodega_name VARCHAR(200),
  primary key (`bodega_id`)
  );
  
insert into bodegas values (NULL, 'The Bodega of Cats')

This also works.

create table bodegas (
  bodega_id INT AUTO_INCREMENT,
  bodega_name VARCHAR(200),
  primary key (`bodega_id`)
  );
  
insert into bodegas (bodega_name) values ('The Bodega of Cats')

However this does not work:

create table bodegas (
  bodega_id INT AUTO_INCREMENT,
  bodega_name VARCHAR(200),
  primary key (`bodega_id`)
  );

bodegas.csv

bodega_name
The Bodega of Cats

> dolt table import -u bodegas.csv
Error determining the output schema.
cause: input primary keys do not match primary keys of existing table
When attempting to move data from csv file:bodega.csv to bodegas, could not determine the output schema.
Schema File: ""
explicit pks: ""

To restate the point more concisely, you can do insert into bodegas (bodega_name) values ('The Bodega of Cats') without drawing a foul, but you can't insert the same data

bodega_name
The Bodega of Cats

from a CSV.

You should be allowed to insert files via CSV without providing the bodega_id column.

The text was updated successfully, but these errors were encountered:

fulghum · 2023-05-02T16:32:15Z

It seems like dolt table import's logic to map from the source csv to the destination table schema is smart enough to know that any columns with default values don't need to be included in the csv. Changing the repro in issue #5855 to use a column default (instead of a trigger) allowed dolt table import to happily import the csv.

dolt table import doesn't seem to make the same decision for an auto_increment column though. If we're going to do that for cols with default values, it seems fair to do the same for auto_increment columns.

The docs for dolt table import don't explain this detail today either:

If the schema for the existing table does not match the schema for the new file, the import will be aborted by default. To overwrite both the table and the schema, use -c -f.

Seems like we should make two changes:

treat auto_increment columns as optional to specify in the input file for dolt table import, just like we do for default columns.
update the docs for dolt table import to explain how we handle columns with default values that don't need to be supplied.

nicktobey · 2023-05-03T23:34:01Z

This appears to be a case of overzealous checks. import.go::newImportSqlEngineMover has a check that every PK exists in the csv and errors if it doesn't. Removing that check prints a warning further down the line:

Warning: There are fewer columns in the import file's schema than the table's schema.
If unintentional, check for any typos in the import file's header.

But all rows get successfully added.

We probably want to fix this check to skip auto incrementing rows.

What's curious is why the workaround from #5855 worked, if this check exists. It turns out, our SQL engine appears to consider a result table column to not be a primary key if it has a default value!

alecstein · 2023-05-04T15:28:05Z

What's curious is why the workaround from #5855 worked, if this check exists. It turns out, our SQL engine appears to consider a result table column to not be a primary key if it has a default value!

You can correct me if I'm wrong, but here's how I interpret the CSV import logic: "make sure that the CSV has all the necessary primary key columns, except if it has a default value for one of those columns. In that case, replace the column with the default value." This is equivalent to saying "if it has a default value, consider this column to not be a primary key, for the purpose of this initial check." Which seems reasonable to me.

However I still think that the "correct" way to import a CSV is to check that the inserted row has all of its primary key values upon inserting, and not upon reading. Whether this can be done efficiently I don't really know.

nicktobey · 2023-05-04T20:59:38Z

There's definitely a bug on our end where tables with default values have an incorrect schema. I filed it as dolthub/go-mysql-server#1753

I agree with your reasoning here though: we perform this check as soon as the CSV is read, but it's overbroad and we should instead be running checks like this when we're actually performing the inserts... and we are. This check appears to be redundant, in order to fail early and return a more helpful error message. But it's not helpful if it's wrong.

I could relax the check, but I don't think we're going to be able to fix #5855 unless we remove it entirely. So that's what I'm going to do.

nicktobey · 2023-05-08T18:42:27Z

Should be fixed as of #5881

alecstein added the bug Something isn't working label May 1, 2023

alecstein changed the title ~~dolt table import requires an empty column fro AUTO_INCREMENT columns~~ dolt table import requires an empty column for AUTO_INCREMENT columns May 1, 2023

alecstein added the enhancement New feature or request label May 1, 2023

alecstein mentioned this issue May 2, 2023

Trigger requires an empty column in order to set new row value with dolt table import #5855

Closed

timsehn added the cli label May 3, 2023

nicktobey self-assigned this May 3, 2023

nicktobey closed this as completed May 8, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

dolt table import requires an empty column for AUTO_INCREMENT columns #5848

dolt table import requires an empty column for AUTO_INCREMENT columns #5848

alecstein commented May 1, 2023 •

edited

fulghum commented May 2, 2023

nicktobey commented May 3, 2023

alecstein commented May 4, 2023

nicktobey commented May 4, 2023

nicktobey commented May 8, 2023

dolt table import requires an empty column for AUTO_INCREMENT columns #5848

dolt table import requires an empty column for AUTO_INCREMENT columns #5848

Comments

alecstein commented May 1, 2023 • edited

fulghum commented May 2, 2023

nicktobey commented May 3, 2023

alecstein commented May 4, 2023

nicktobey commented May 4, 2023

nicktobey commented May 8, 2023

alecstein commented May 1, 2023 •

edited