The decimal data type is transformed after the data is inserted. #751
Comments
Hello @as10128, can you share with us a minimal code example that reproduces this error?
Hello @ndrluis, I wrote some test code. I used MinIO and a Hive catalog; you will need to modify the aws_access_key, aws_secret_key, s3_endpoint, hive_thrift_uri, and warehouse_uri parameters. The results are annotated in the code.
The problem is not with append; it occurs when model_copy(deep=True) is called on the metadata object. While debugging, I couldn't find where the update was happening, so as a sanity check I created a table and then called table.metadata.model_copy(deep=True), which reproduced the error:

    from pyiceberg.catalog.sql import SqlCatalog
    from pyiceberg.schema import Schema
    from pyiceberg.typedef import Identifier
    from pyiceberg.types import DecimalType, IntegerType, NestedField

    def test_schema_mutation(catalog: SqlCatalog, table_schema_decimal: Schema, random_identifier: Identifier) -> None:
        database_name, _table_name = random_identifier
        catalog.create_namespace(database_name)
        table = catalog.create_table(random_identifier, table_schema_decimal)
        schema = table.schema()
        # Deep-copy the table metadata and compare its first schema with the original.
        schema_copy = table.metadata.model_copy(deep=True).schemas[0]
        assert schema == schema_copy

Where table_schema_decimal is:

    Schema(
        NestedField(field_id=1, name="integer", field_type=IntegerType(), required=True),
        NestedField(field_id=2, name="decimal_32_2", field_type=DecimalType(32, 2), required=False),
        NestedField(field_id=3, name="decimal_32_16", field_type=DecimalType(32, 16), required=False),
        schema_id=0,
        identifier_field_ids=[1],
    )

I will continue trying to understand what is happening, but perhaps another contributor can help me pinpoint where the problem is.
I have discovered the problem, but I don't know how to solve it. Removing the Singleton inheritance from the PrimitiveType class makes the problem go away. While debugging, I found that when model_copy runs, the Singleton initializer for DecimalType does not receive any arguments; args is an empty tuple (https://github.com/apache/iceberg-python/blob/main/pyiceberg/utils/singleton.py#L45). I believe this is the cause. @Fokko, can you help with a suggestion to solve this?
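To make the mechanism described above concrete, here is a minimal standalone sketch using only the standard library. It does not use PyIceberg or Pydantic: the Singleton, DecimalLike, and DecimalWithNewArgs classes are hypothetical stand-ins, and the assumption is that the deep copy rebuilds the object by calling `__new__` with no arguments, as observed in the debugger. The `__getnewargs__` variant is only one possible workaround, not necessarily the fix adopted by the project.

```python
import copy


class Singleton:
    """Hypothetical stand-in: caches one instance per (class, args, kwargs)."""

    _instances = {}

    def __new__(cls, *args, **kwargs):
        key = (cls, args, tuple(sorted(kwargs.items())))
        if key not in cls._instances:
            cls._instances[key] = super().__new__(cls)
        return cls._instances[key]


class DecimalLike(Singleton):
    """Hypothetical parameterized type, loosely modeled on a decimal type."""

    def __init__(self, precision, scale):
        self.precision = precision
        self.scale = scale


a = DecimalLike(32, 2)
b = DecimalLike(32, 16)

# deepcopy rebuilds each object via cls.__new__(cls) with NO arguments,
# so both copies resolve to the same cached instance keyed by empty args.
copy_a = copy.deepcopy(a)
copy_b = copy.deepcopy(b)

# The shared instance holds whatever state was restored last.
print(copy_a is copy_b)            # the two copies are one object
print(copy_a.scale, copy_b.scale)  # both report the scale of the last copy


class DecimalWithNewArgs(Singleton):
    """Same type, but tells copy/pickle which arguments to pass to __new__."""

    def __init__(self, precision, scale):
        self.precision = precision
        self.scale = scale

    def __getnewargs__(self):
        return (self.precision, self.scale)


c = DecimalWithNewArgs(32, 2)
d = DecimalWithNewArgs(32, 16)
copy_c = copy.deepcopy(c)
copy_d = copy.deepcopy(d)
print(copy_c.scale, copy_d.scale)  # each copy keeps its own scale
```

In this sketch the originals a and b are unaffected; only the copies collapse into one instance, which matches the observation that the schema read back from the copied metadata differs from the one that was created.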
Apache Iceberg version
0.6.0 (latest release)
Please describe the bug 🐞
Version: PyIceberg 0.6.1
I created a table with multiple columns of type decimal: decimal(32,16) and decimal(32,2).
When I then insert data, an error is reported. I still want to use the schema I created as the schema for my Iceberg table; how do I do that? After the table is created successfully, the metadata.json file shows this,
After using the append method to insert data, all of the decimal columns become decimal(32,2).