[bug] Encountered Chinese garbled code issue during Iceberg rest HTTP transmission #7821

Closed
hxgdada opened this issue Jun 12, 2023 · 3 comments · Fixed by #8046

hxgdada commented Jun 12, 2023

Apache Iceberg version

ALL

Query engine

None

Please describe the bug 🐞

org.apache.iceberg.rest.HTTPClient

Original:
private StringEntity toJson(Object requestBody) {
  try {
    return new StringEntity(mapper.writeValueAsString(requestBody));
  } catch (JsonProcessingException e) {
    throw new RESTException(e, "Failed to write request body: %s", requestBody);
  }
}

After modification:
private StringEntity toJson(Object requestBody) {
  try {
    return new StringEntity(mapper.writeValueAsString(requestBody), Charset.forName("UTF-8"));
  } catch (JsonProcessingException e) {
    throw new RESTException(e, "Failed to write request body: %s", requestBody);
  }
}
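
The effect of the missing charset can be reproduced without any HTTP machinery. The sketch below is illustrative only: it assumes the entity falls back to ISO-8859-1 when no charset is passed (the issue itself does not state which default is used), and the class and method names `CharsetRoundTrip` and `roundTrip` are hypothetical, not part of Iceberg.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class CharsetRoundTrip {
    // Encode the text to bytes and decode it again with the same charset,
    // imitating a request body being written by a client and read by a server.
    static String roundTrip(String text, Charset charset) {
        // Characters the charset cannot represent are replaced during encoding.
        byte[] wire = text.getBytes(charset);
        return new String(wire, charset);
    }

    public static void main(String[] args) {
        String comment = "unique ID 中文";
        // Assumed default (ISO-8859-1): the Chinese characters are lost.
        System.out.println(roundTrip(comment, StandardCharsets.ISO_8859_1));
        // Explicit UTF-8, as in the modified toJson: the text survives intact.
        System.out.println(roundTrip(comment, StandardCharsets.UTF_8));
    }
}
```

Under that assumption, the ISO-8859-1 round trip returns the text with each Chinese character replaced by `?`, while the UTF-8 round trip returns the original string unchanged.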

hxgdada changed the title from "Encountered Chinese garbled code issue during Iceberg rest HTTP transmission" to "[bug] Encountered Chinese garbled code issue during Iceberg rest HTTP transmission" on Jun 12, 2023

nastra commented Jun 29, 2023

This probably makes sense. @hxgdada do you have a small reproducible example?


hxgdada commented Jun 30, 2023

> This probably makes sense. @hxgdada do you have a small reproducible example?

Here is the case I ran into: I used Iceberg REST to create a table in Trino and found that the Chinese fields became garbled. Through debugging, I found that the Iceberg REST service converted the Chinese text to a different encoding during HTTP transmission, as shown in the screenshots below. I then added Charset.forName("UTF-8") to the HTTP method of Iceberg REST, which solved the problem.

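Before modification

[Screenshot: before modification]

[Screenshot: creating the table before modification]

[Screenshot: showing the table before modification]

After modification

[Screenshot: after modification]

[Screenshot: creating the table after modification]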

nastra added a commit to nastra/iceberg that referenced this issue Jul 12, 2023
fixes apache#7821

Without the fix, tests would fail with
```
expected: struct<1: id: required int (unique ID 😀), 2: data: required string>
 but was: struct<1: id: required int (unique ID ?), 2: data: required string>
```

nastra commented Jul 12, 2023

@hxgdada thanks for reporting this. I've opened #8046 that should fix it.

nastra added a commit to nastra/iceberg that referenced this issue Jul 12, 2023
fixes apache#7821

Without the fix, tests would fail with
```
expected: struct<1: id: required int (unique ID 🤪), 2: data: required string>
 but was: struct<1: id: required int (unique ID ?), 2: data: required string>
```
nastra self-assigned this Jul 12, 2023
nastra added a commit to nastra/iceberg that referenced this issue Jul 13, 2023
danielcweeks pushed a commit that referenced this issue Jul 13, 2023
fixes #7821

Without the fix, tests would fail with
```
expected: struct<1: id: required int (unique ID 🤪), 2: data: required string>
 but was: struct<1: id: required int (unique ID ?), 2: data: required string>
```