INSERT INTO SQL failing on CSV-backed table #10324

Open
singularsyntax opened this issue May 1, 2024 · 3 comments
Labels
bug Something isn't working

Comments

@singularsyntax

Describe the bug

Hello,

When I try to insert data with the INSERT INTO SQL syntax (see reproduction code below), I get the error: Inserting query must have the same schema with the table.

[2024-05-01T00:48:23Z INFO] TABLE SCHEMA: DFSchema { fields: [DFField { qualifier: Some(Bare { table: "test" }), field: Field { name: "k", data_type: Utf8, nullable: false, dict_id: 0, dict_is_ordered: false, metadata: {} } }, DFField { qualifier: Some(Bare { table: "test" }), field: Field { name: "v", data_type: Utf8, nullable: false, dict_id: 0, dict_is_ordered: false, metadata: {} } }], metadata: {}, functional_dependencies: FunctionalDependencies { deps: [FunctionalDependence { source_indices: [0], target_indices: [0, 1], nullable: false, mode: Single }] } }
[2024-05-01T00:48:23Z INFO] DATAFRAME SCHEMA: DFSchema { fields: [DFField { qualifier: None, field: Field { name: "k", data_type: Utf8, nullable: false, dict_id: 0, dict_is_ordered: false, metadata: {} } }, DFField { qualifier: None, field: Field { name: "v", data_type: Utf8, nullable: false, dict_id: 0, dict_is_ordered: false, metadata: {} } }], metadata: {}, functional_dependencies: FunctionalDependencies { deps: [] } }
thread 'main' panicked at src/main.rs:317:88:
called `Result::unwrap()` on an `Err` value: Plan("Inserting query must have the same schema with the table.")

As logged above, the problem seems to be a discrepancy between the table schema, whose fields are qualified with the table name, and the query schema, whose fields are unqualified.
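To make the suspected discrepancy concrete, here is a minimal, self-contained sketch. It does not use DataFusion at all: the `Field` struct and both comparison functions are hypothetical stand-ins, and this is not a claim about how DataFusion actually performs its check. It only shows how a comparison that includes the qualifier would reject two schemas that otherwise agree on name, type, and nullability.

```rust
// Hypothetical, simplified stand-in for a schema field.
// This is NOT DataFusion's DFField; it only mirrors the parts logged above.
#[derive(Debug, Clone, PartialEq)]
struct Field {
    qualifier: Option<String>, // Some("test") for a table-qualified field
    name: String,
    data_type: String,
    nullable: bool,
}

// Helper that builds a non-nullable Utf8 field, matching the logged schemas.
fn field(qualifier: Option<&str>, name: &str) -> Field {
    Field {
        qualifier: qualifier.map(str::to_string),
        name: name.to_string(),
        data_type: "Utf8".to_string(),
        nullable: false,
    }
}

// Strict comparison: the qualifier takes part in equality.
fn same_schema_strict(a: &[Field], b: &[Field]) -> bool {
    a == b
}

// Lenient comparison: only name, type, and nullability are compared.
fn same_schema_ignoring_qualifier(a: &[Field], b: &[Field]) -> bool {
    a.len() == b.len()
        && a.iter().zip(b).all(|(x, y)| {
            x.name == y.name && x.data_type == y.data_type && x.nullable == y.nullable
        })
}

fn main() {
    // Table schema: fields qualified with the table name.
    let table = vec![field(Some("test"), "k"), field(Some("test"), "v")];
    // Query schema: same fields, no qualifier.
    let query = vec![field(None, "k"), field(None, "v")];

    println!("strict: {}", same_schema_strict(&table, &query));
    println!(
        "ignoring qualifier: {}",
        same_schema_ignoring_qualifier(&table, &query)
    );
}
```

Note that the two logged schemas also differ in their functional dependencies (the table schema carries one from the primary key, the query schema carries none), so the qualifier may not be the only difference that matters.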

The code I'm using is about as simple as I can imagine. Am I missing something? Is there some example code that demonstrates how to use INSERT INTO SQL correctly? Or is this a bug?

To Reproduce

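use datafusion::dataframe::DataFrameWriteOptions;
use datafusion::prelude::SessionContext;
use log::info; // assumed: the info! macro comes from the `log` crate

async fn df_test() {
    let ctx = SessionContext::new();

    // Register a CSV-backed external table with two non-nullable string columns.
    let sql = "CREATE EXTERNAL TABLE test (k VARCHAR PRIMARY KEY NOT NULL, v VARCHAR NOT NULL) STORED AS CSV LOCATION './store/test/'";
    let df = ctx.sql(sql).await.unwrap();

    df.collect().await.unwrap();

    // Log the schema of the registered table (fields are qualified with "test").
    let table_df = ctx.table("test").await.unwrap();
    info!("TABLE SCHEMA: {:?}", table_df.schema());

    // Plan the INSERT and log the schema of the resulting DataFrame (fields are unqualified).
    let sql = "INSERT INTO test (k, v) VALUES ('foo', 'bar')";
    let query_df = ctx.sql(sql).await.unwrap();
    info!("DATAFRAME SCHEMA: {:?}", query_df.schema());

    // Both schemas are logged before the panic, so this is the unwrap() that fails.
    let _result = query_df.write_table("test", DataFrameWriteOptions::default()).await.unwrap();
}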

Expected behavior

Insertion of the row ('foo', 'bar') succeeds, and DataFusion writes a CSV file containing the inserted data to the table's location on the filesystem.

Additional context

[dependencies]
datafusion = "37.1.0"
@singularsyntax singularsyntax added the bug Something isn't working label May 1, 2024
@singularsyntax
Author

Additional information:

If I replace the call to write_table() with write_csv():

let _result = query_df.write_csv("foo", DataFrameWriteOptions::default(), None).await.unwrap();

I get the following error:

thread 'main' panicked at ~/.cargo/registry/src/index.crates.io-6f17d22bba15001f/datafusion-physical-plan-37.1.0/src/insert.rs:127:9:
assertion `left == right` failed
  left: 2
 right: 1

@phillipleblanc
Contributor

This looks like a bug. I wonder if this is a regression from #9595?

@yyy1000
Contributor

yyy1000 commented May 3, 2024

I think it's a latent bug that isn't related to #9595; I tested with the version 36 code.
I can try to help and see what's wrong with it. :)
