Query with external data over http ended up with error in 21.3 #21953

Closed
zhicwu opened this issue Mar 22, 2021 · 5 comments
Labels
bug (Confirmed user-visible misbehaviour in official release), comp-http (http protocol related), duplicate

Comments

@zhicwu
Contributor

zhicwu commented Mar 22, 2021

The test case below stopped working on 21.3 (example) with the error "You have carriage return (\r, 0x0D, ASCII 13) at end of first row.", but it works just fine on 20.8, 21.2, and latest.

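ClickHouseStatement stmt = connection.createStatement();
// Join against the external table "groups", whose rows are sent in the HTTP request body.
// Note: there is no space between "as g" and "any", so the alias goes out as "gany"
// (visible in the POST line of the debug log).
ResultSet rs = stmt.executeQuery(
        "select UserName, GroupName " +
                "from (select 'User' as UserName, 1 as GroupId) as g" +
                "any left join groups using GroupId",
        null,
        Collections.singletonList(new ClickHouseExternalData(
                "groups",
                // Single tab-separated row with no trailing '\n'.
                new ByteArrayInputStream("1\tGroup".getBytes())
        ).withStructure("GroupId UInt8, GroupName String"))
);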
Debug logs:
...
2021-03-22 12:46:46.046 [main] [DEBUG] {headers:136} - http-outgoing-0 >> Content-Length: 238
2021-03-22 12:46:46.046 [main] [DEBUG] {headers:136} - http-outgoing-0 >> Content-Type: multipart/form-data; boundary=BV364k1DPnMYurbjnAECXUALtNo2LvPxv7BLwz
2021-03-22 12:46:46.046 [main] [DEBUG] {headers:136} - http-outgoing-0 >> Host: localhost:49459
2021-03-22 12:46:46.046 [main] [DEBUG] {headers:136} - http-outgoing-0 >> Connection: Keep-Alive
2021-03-22 12:46:46.046 [main] [DEBUG] {headers:136} - http-outgoing-0 >> User-Agent: Apache-HttpClient/4.5.13 (Java/11.0.8)
2021-03-22 12:46:46.046 [main] [DEBUG] {wire:73} - http-outgoing-0 >> "POST /?query=select+UserName%2C+GroupName+from+%28select+%27User%27+as+UserName%2C+1+as+GroupId%29+as+gany+left+join+groups+using+GroupId%0AFORMAT+TabSeparatedWithNamesAndTypes%3B&groups_structure=GroupId+UInt8%2C+GroupName+String&compress=1&database=default&extremes=0&query_id=91b6dae5-03d9-4289-a5d7-9ccbd3818cef HTTP/1.1[\r][\n]"
2021-03-22 12:46:46.046 [main] [DEBUG] {wire:73} - http-outgoing-0 >> "Content-Length: 238[\r][\n]"
2021-03-22 12:46:46.046 [main] [DEBUG] {wire:73} - http-outgoing-0 >> "Content-Type: multipart/form-data; boundary=BV364k1DPnMYurbjnAECXUALtNo2LvPxv7BLwz[\r][\n]"
2021-03-22 12:46:46.046 [main] [DEBUG] {wire:73} - http-outgoing-0 >> "Host: localhost:49459[\r][\n]"
2021-03-22 12:46:46.046 [main] [DEBUG] {wire:73} - http-outgoing-0 >> "Connection: Keep-Alive[\r][\n]"
2021-03-22 12:46:46.046 [main] [DEBUG] {wire:73} - http-outgoing-0 >> "User-Agent: Apache-HttpClient/4.5.13 (Java/11.0.8)[\r][\n]"
2021-03-22 12:46:46.046 [main] [DEBUG] {wire:73} - http-outgoing-0 >> "[\r][\n]"
2021-03-22 12:46:46.046 [main] [DEBUG] {wire:73} - http-outgoing-0 >> "--BV364k1DPnMYurbjnAECXUALtNo2LvPxv7BLwz[\r][\n]"
2021-03-22 12:46:46.046 [main] [DEBUG] {wire:73} - http-outgoing-0 >> "Content-Disposition: form-data; name="groups"; filename="groups"[\r][\n]"
2021-03-22 12:46:46.046 [main] [DEBUG] {wire:73} - http-outgoing-0 >> "Content-Type: application/octet-stream[\r][\n]"
2021-03-22 12:46:46.046 [main] [DEBUG] {wire:73} - http-outgoing-0 >> "Content-Transfer-Encoding: binary[\r][\n]"
2021-03-22 12:46:46.046 [main] [DEBUG] {wire:73} - http-outgoing-0 >> "[\r][\n]"
2021-03-22 12:46:46.046 [main] [DEBUG] {wire:73} - http-outgoing-0 >> "1[0x9]Group[\r][\n]"
2021-03-22 12:46:46.046 [main] [DEBUG] {wire:73} - http-outgoing-0 >> "--BV364k1DPnMYurbjnAECXUALtNo2LvPxv7BLwz--[\r][\n]"
2021-03-22 12:46:46.046 [main] [DEBUG] {wire:73} - http-outgoing-0 << "HTTP/1.1 400 Bad Request[\r][\n]"
2021-03-22 12:46:46.046 [main] [DEBUG] {wire:73} - http-outgoing-0 << "Date: Mon, 22 Mar 2021 04:46:46 GMT[\r][\n]"
2021-03-22 12:46:46.046 [main] [DEBUG] {wire:73} - http-outgoing-0 << "Connection: Keep-Alive[\r][\n]"
2021-03-22 12:46:46.046 [main] [DEBUG] {wire:73} - http-outgoing-0 << "Content-Type: text/plain; charset=UTF-8[\r][\n]"
2021-03-22 12:46:46.046 [main] [DEBUG] {wire:73} - http-outgoing-0 << "X-ClickHouse-Server-Display-Name: 66ddceb76e47[\r][\n]"
2021-03-22 12:46:46.046 [main] [DEBUG] {wire:73} - http-outgoing-0 << "Transfer-Encoding: chunked[\r][\n]"
2021-03-22 12:46:46.046 [main] [DEBUG] {wire:73} - http-outgoing-0 << "X-ClickHouse-Exception-Code: 117[\r][\n]"
2021-03-22 12:46:46.046 [main] [DEBUG] {wire:73} - http-outgoing-0 << "Keep-Alive: timeout=3[\r][\n]"
2021-03-22 12:46:46.046 [main] [DEBUG] {wire:73} - http-outgoing-0 << "X-ClickHouse-Summary: {"read_rows":"0","read_bytes":"0","written_rows":"0","written_bytes":"0","total_rows_to_read":"0"}[\r][\n]"
2021-03-22 12:46:46.046 [main] [DEBUG] {wire:73} - http-outgoing-0 << "[\r][\n]"
2021-03-22 12:46:46.046 [main] [DEBUG] {wire:73} - http-outgoing-0 << "230[\r][\n]"
2021-03-22 12:46:46.046 [main] [DEBUG] {wire:73} - http-outgoing-0 << "F [0xd0]%[0xf5][\r][0x14]z"[0xed]A[0x81][\r][0xd7][0x15][0xe9][0x82] [0x2][0x0][0x0]a[0x2][0x0][0x0][0xf2][0xba]Code: 117, e.displayText() = DB::Exception: [\n]"
2021-03-22 12:46:46.046 [main] [DEBUG] {wire:73} - http-outgoing-0 << "You have carriage return (\r, 0x0D, ASCII 13) at end of first row.[\n]"
...
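In the wire log above, the data line is sent as 1[0x9]Group[\r][\n] followed by the closing boundary. Under the multipart framing rules (RFC 2046), the CRLF that precedes a boundary line belongs to the delimiter, not to the part's content, so the payload the client intended is 1\tGroup with no line terminator at all. The sketch below rebuilds the logged request body and extracts the part content under that rule. The class and method names are hypothetical, and this is not the server's parser; it only illustrates where the \r in the error message can come from and does not by itself explain the 21.3 regression.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration of multipart/form-data framing; not ClickHouse or JDBC driver code.
final class MultipartPayload {

    /** Builds a single-part body laid out like the one in the wire log. */
    static String buildBody(String boundary, String name, String content) {
        return "--" + boundary + "\r\n"
             + "Content-Disposition: form-data; name=\"" + name + "\"; filename=\"" + name + "\"\r\n"
             + "Content-Type: application/octet-stream\r\n"
             + "Content-Transfer-Encoding: binary\r\n"
             + "\r\n"
             + content + "\r\n"              // this CRLF is part of the next delimiter
             + "--" + boundary + "--\r\n";
    }

    /** Returns the content of the first part, treating "\r\n--boundary" as the delimiter. */
    static String firstPartContent(String body, String boundary) {
        String headerEnd = "\r\n\r\n";           // blank line between part headers and content
        String delimiter = "\r\n--" + boundary;   // CRLF is attached to the boundary line
        int headers = body.indexOf(headerEnd);
        if (headers < 0) throw new IllegalArgumentException("no part headers found");
        int start = headers + headerEnd.length();
        int end = body.indexOf(delimiter, start);
        if (end < 0) throw new IllegalArgumentException("no closing delimiter found");
        return body.substring(start, end);
    }

    public static void main(String[] args) {
        String boundary = "BV364k1DPnMYurbjnAECXUALtNo2LvPxv7BLwz";
        String body = buildBody(boundary, "groups", "1\tGroup");
        // Size of the rebuilt body, to compare with the Content-Length header in the log.
        System.out.println(body.getBytes(StandardCharsets.ISO_8859_1).length);
        String content = firstPartContent(body, boundary);
        // The extracted payload has neither '\r' nor '\n' at its end.
        System.out.println(content.endsWith("\r") || content.endsWith("\n"));
    }
}
```

A parser that stops at "--boundary" instead of "\r\n--boundary" would hand 1\tGroup\r\n to the TSV reader, which is one way a carriage return could end up "at end of first row".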
@zhicwu zhicwu added the bug Confirmed user-visible misbehaviour in official release label Mar 22, 2021
@filimonov filimonov added the comp-http http protocol related label Mar 22, 2021
@filimonov
Contributor

Maybe related: #21936

@Felixoid
Member

I'll repost here the reproducible case for a POST request from the linked ticket:

As I've mentioned in the dev-chat, I've found a request that fails even on localhost.

  • I run docker run --rm --net=host --name=clickhouse yandex/clickhouse-server:21.3 in one terminal.
  • Then in another I run cat POST.txt | nc localhost 8123 | tail -c 153 | gzip -d, using the file POST.txt.
  • Sometimes it gives 400 Bad Request, sometimes 500 Internal Server Error, and sometimes it succeeds, but the success rate is very low.

Here's my lo (loopback interface) configuration, just in case:

$ ip a l dev lo
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever

@den-crane
Contributor

den-crane commented Mar 22, 2021

BTW, maybe this test exploited UB; it seems the data should end with \n: "1\tGroup\n"

20.8

printf '' | clickhouse-local --query="SELECT count(*) FROM table"  --structure='a String'
0

printf '\n' | clickhouse-local --query="SELECT count(*) FROM table"  --structure='a String'
1

printf '1' | clickhouse-local --query="SELECT count(*) FROM table"  --structure='a String'
1

printf '1\n\n' | clickhouse-local --query="SELECT count(*) FROM table"  --structure='a String'
2

It seems a text stream should always end with \n. CH never considers a trailing \n to be one more row.
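The four clickhouse-local results above are consistent with a simple rule: every \n terminates a row, and a non-empty final line without \n still counts as a row. The following is a minimal sketch of that rule for a single String column. The class name is hypothetical, and it only models the observed 20.8 counts; it is not ClickHouse's actual parser and ignores escaping.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical model of the row counts shown above; not ClickHouse code.
final class TsvRowCount {

    /** Counts rows so that a final '\n' ends the last row instead of starting a new one. */
    static int countRows(byte[] data) {
        int rows = 0;
        for (byte b : data) {
            if (b == '\n') {
                rows++;                 // each line feed closes one row
            }
        }
        // An unterminated, non-empty last line still counts as a row.
        if (data.length > 0 && data[data.length - 1] != '\n') {
            rows++;
        }
        return rows;
    }

    public static void main(String[] args) {
        String[] inputs = {"", "\n", "1", "1\n\n"};
        for (String s : inputs) {
            // Prints 0, 1, 1, 2, matching the clickhouse-local results.
            System.out.println(countRows(s.getBytes(StandardCharsets.UTF_8)));
        }
    }
}
```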

@zhicwu
Contributor Author

zhicwu commented Mar 23, 2021

BTW, maybe this test exploited UB; it seems the data should end with \n: "1\tGroup\n"

Yes, appending \n works. I extended the test to cover both cases, but excluded the one without a trailing newline when testing against 21.3.3.14 in order to fix the build failure.

It seems a text stream should always end with \n. CH never considers a trailing \n to be one more row.

I reported this as a bug. If the behaviour was changed on purpose and is not going to be rolled back, I'll update JDBC driver 0.3.1 accordingly by appending \n to the end of the text stream when it is missing.
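A minimal sketch of that driver-side workaround, assuming it is done by wrapping the external data stream: the wrapper passes bytes through and emits one extra \n at end of stream if the last byte was not already \n. The class name is hypothetical and this is not the JDBC driver's actual code. Empty streams are left untouched, since an empty input and a lone \n produce different row counts (0 and 1) in the examples above.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical wrapper; not part of the ClickHouse JDBC driver.
final class NewlineTerminatedInputStream extends FilterInputStream {
    private int last = -1;   // last byte read from the wrapped stream, -1 if none yet
    private boolean done;    // set once end of stream (and any padding) has been handled

    NewlineTerminatedInputStream(InputStream in) {
        super(in);
    }

    @Override
    public int read() throws IOException {
        if (done) return -1;
        int b = in.read();
        if (b >= 0) {
            last = b;
            return b;
        }
        done = true;
        // Pad only non-empty streams that do not already end with '\n'.
        return (last >= 0 && last != '\n') ? '\n' : -1;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        if (len == 0) return 0;
        if (done) return -1;
        int n = in.read(buf, off, len);
        if (n > 0) last = buf[off + n - 1] & 0xFF;
        if (n >= 0) return n;
        done = true;
        if (last >= 0 && last != '\n') {
            buf[off] = '\n';
            return 1;
        }
        return -1;
    }

    @Override
    public boolean markSupported() {
        return false;        // mark/reset would bypass the padding logic
    }

    public static void main(String[] args) throws IOException {
        InputStream raw = new ByteArrayInputStream("1\tGroup".getBytes(StandardCharsets.UTF_8));
        byte[] out = new NewlineTerminatedInputStream(raw).readAllBytes(); // Java 9+
        System.out.println(out.length); // 8: the 7 original bytes plus the appended '\n'
    }
}
```

In the test case from this issue, the stream passed to ClickHouseExternalData could be wrapped this way so that the same data works whether or not the caller remembered the trailing newline.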

@Felixoid, I also got the two errors below:

Error:  ru.yandex.clickhouse.integration.ClickHouseStatementImplTest.testExternalDataStream  Time elapsed: 0.204 s  <<< FAILURE!
ru.yandex.clickhouse.except.ClickHouseException: 
ClickHouse exception, code: 27, host: localhost, port: 49156; Code: 27, e.displayText() = DB::ParsingException: Cannot parse input: expected '\t' before: '�\0��': 
Row 1:
Column 0,   name: id, type: UInt64, ERROR: text "�<ASCII NUL>�<0x17>" is not like UInt64

: While executing SourceFromInputStream: (at row 1)
 (version 21.3.3.14 (official build))

	at ru.yandex.clickhouse.integration.ClickHouseStatementImplTest.testExternalDataStream(ClickHouseStatementImplTest.java:194)
Caused by: java.lang.Throwable: 
Code: 27, e.displayText() = DB::ParsingException: Cannot parse input: expected '\t' before: '�\0��': 
Row 1:
Column 0,   name: id, type: UInt64, ERROR: text "�<ASCII NUL>�<0x17>" is not like UInt64

: While executing SourceFromInputStream: (at row 1)
 (version 21.3.3.14 (official build))

	at ru.yandex.clickhouse.integration.ClickHouseStatementImplTest.testExternalDataStream(ClickHouseStatementImplTest.java:194)

Error:  ru.yandex.clickhouse.integration.ClickhouseLZ4StreamTest.testBigBatchCompressedInsert  Time elapsed: 1.514 s  <<< FAILURE!
ru.yandex.clickhouse.except.ClickHouseUnknownException: ClickHouse exception, code: 1002, host: localhost, port: 49156; Broken pipe (Write failed)
	at ru.yandex.clickhouse.integration.ClickhouseLZ4StreamTest.testBigBatchCompressedInsert(ClickhouseLZ4StreamTest.java:45)
Caused by: java.net.SocketException: Broken pipe (Write failed)
	at ru.yandex.clickhouse.integration.ClickhouseLZ4StreamTest.testBigBatchCompressedInsert(ClickhouseLZ4StreamTest.java:45)

The latter might be the same issue you encountered. I ran into a 500 error in most cases and sometimes a 400.

@den-crane
Contributor

So it's a duplicate of #20244
