ClickHouse reads 40 million rows and writes them into MySQL; when about 35 million rows have been read, an error is reported. #1550
Good day, @T-M-L-C! Thanks in advance!
There is a similar issue, but it is intermittent; it occurs mostly when the client side is under pressure (several statements in flight).

ClickHouse:

Client side:

Any thoughts on what could be the reason? And how to avoid such failures?
@TimonSP What are your client / server versions?
@chernser, sorry, the link does not work for me (Page not found). ClickHouse version 23.8.14.6
I'm seeing this same issue using:

Similar to @TimonSP, we see this intermittently but frequently. We are not able to reproduce it on demand, but we could turn on any debug logging that might be helpful. It always occurs when reading large numbers of records from ClickHouse (> 10M). We see this both when using the ClickHouse JDBC driver and when using the native HTTP client.

One guess I had is that this could be related to TCP send/recv buffers. Is this a plausible explanation? How does ClickHouse handle the TCP send buffer being full?

Also, the link to the issue does not work for me either. I think this is because it's in the "clickhouse-private" repo. Can you give some more information about what that issue references?
@chernser could you please copy-paste the ClickHouse issue at [internal link]?
I managed to reproduce the issue using Thread.sleep as a load simulation:

```java
import com.clickhouse.jdbc.ClickHouseDataSource;

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Properties;

public class ConsoleTestException {
    public static void main(String[] args) {
        String userName = "***";
        String userPassword = "***";
        String url = "jdbc:clickhouse://***:8123";
        // Stream 10M rows from the server.
        String query = "SELECT toString(cityHash64(number)) FROM numbers(10000000)";
        SimpleDateFormat sdf = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss.SSS");
        Properties properties = new Properties();
        int idx = 1;
        try {
            ClickHouseDataSource dataSource = new ClickHouseDataSource(url, properties);
            Connection connection = dataSource.getConnection(userName, userPassword);
            PreparedStatement statement = connection.prepareStatement(query);
            ResultSet rs = statement.executeQuery();
            while (rs.next()) {
                if (idx % 100 == 0) {
                    System.out.println(String.format("[%s] %s", sdf.format(new Date()), idx));
                    // Simulate a slow consumer: pause twice while the server
                    // still has most of the result set left to send.
                    if (idx == 1000 || idx == 2000) {
                        try {
                            Thread.sleep(25000);
                        } catch (InterruptedException e) {
                            e.printStackTrace();
                        }
                        System.out.println(String.format("[%s] sleep completed", sdf.format(new Date())));
                    }
                }
                idx++;
            }
            System.out.println(String.format("[%s] total %s", sdf.format(new Date()), idx));
        } catch (SQLException e) {
            System.out.println(String.format("[%s] last %s", sdf.format(new Date()), idx));
            e.printStackTrace();
        }
    }
}
```

But I'm afraid it is a ClickHouse issue; it is reproducible with the Python clickhouse_connect client too:

```python
import clickhouse_connect
import time

if __name__ == '__main__':
    client = clickhouse_connect.get_client(
        host="***",
        port=8123,
        username="***",
        password="***"
    )
    with client.query_rows_stream('SELECT toString(cityHash64(number)) FROM numbers(10000000)') as stream:
        idx = 1
        for row in stream:
            if idx % 1000 == 0:
                print(idx)
                # Simulate a slow consumer, as in the Java example above.
                if idx == 1000 or idx == 2000:
                    time.sleep(25)
            idx += 1
```

Fails with clickhouse_connect.driver.exceptions.StreamFailureError
The same code on 24.3.7.30 produced another (clearer) exception:

That helped to find what looks like the root cause of the issue: http_send_timeout. But it has to be set in the default profile, and a restart is required; see ClickHouse/ClickHouse#64731 for details. I have a bit of a problem checking this out (as I don't have access to change the default profile).
I can confirm that we fixed this problem by increasing the http_send_timeout in the default profile.
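For anyone checking whether this applies to their server: a minimal sketch, assuming JDBC access and a ClickHouse version where these timeouts are exposed in system.settings, that reads the values the active profile actually uses. The host, port, and credentials below are placeholders.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

// Illustrative helper: print the server-side HTTP timeouts (in seconds)
// so you can verify what the active profile is actually running with.
public class ShowHttpTimeouts {
    public static void main(String[] args) throws Exception {
        String url = "jdbc:clickhouse://localhost:8123"; // placeholder host/port
        try (Connection conn = DriverManager.getConnection(url, "default", "");
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(
                     "SELECT name, value, changed FROM system.settings " +
                     "WHERE name IN ('http_send_timeout', 'http_receive_timeout')")) {
            while (rs.next()) {
                System.out.printf("%s = %s (changed: %s)%n",
                        rs.getString("name"), rs.getString("value"), rs.getString("changed"));
            }
        }
    }
}
```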
@TimonSP @wallacms sorry, it is an internal link.
@TimonSP I read it like this: http_send_timeout defines how long the server will hold unsent data before giving up. In your example code, the Thread.sleep pauses stop the client from reading for long enough that the server hits this timeout while trying to send.

Summary
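If raising the server timeout is not an option, one generic client-side mitigation (my sketch, not something proposed in this thread) is to drain the ResultSet on a dedicated thread into a bounded queue, so slow downstream processing does not leave the server blocked on a send. All class and variable names here are made up for illustration.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.Statement;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Sketch: a reader thread keeps pulling rows off the socket while the
// main thread processes them at its own pace. If processing pauses, the
// reader keeps the connection moving (until the queue itself fills up).
public class BufferedDrain {
    private static final String POISON = "\u0000EOF"; // end-of-stream marker

    public static void drain(Connection conn, String query) throws Exception {
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(100_000);

        Thread reader = new Thread(() -> {
            try (Statement stmt = conn.createStatement();
                 ResultSet rs = stmt.executeQuery(query)) {
                while (rs.next()) {
                    queue.put(rs.getString(1)); // blocks only when the queue is full
                }
            } catch (Exception e) {
                e.printStackTrace();
            } finally {
                try {
                    queue.put(POISON); // always signal end-of-stream
                } catch (InterruptedException ignored) {
                }
            }
        });
        reader.start();

        String row;
        while (!(row = queue.take()).equals(POISON)) {
            process(row); // slow downstream work happens here, off the socket path
        }
        reader.join();
    }

    private static void process(String row) {
        // placeholder for real work
    }
}
```

Note this only helps when pauses are shorter than the time it takes to fill the queue; a long enough stall will still back up to the server eventually.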
@chernser We've observed this problem primarily in the context of accessing ClickHouse through Dremio. Since Dremio is a distributed execution engine, it can have long pauses in reading results from ClickHouse while it is servicing other queries or doing upstream processing on the results it gets from ClickHouse. Thus, in our case, increasing the timeout is a reasonable thing to do because we expect long pauses.
@chernser The whole process repeats; it is a third-party tool for us, so we cannot control its behavior. BTW, I've found that the setting was decreased in 23.6: ClickHouse/ClickHouse#51171
@TimonSP - which BI tool are you using?
clickhouse-jdbc: version 0.4.6
error message: java.sql.SQLException: java.io.StreamCorruptedException: Reached end of input stream after reading 6719 of 10487 bytes

clickhouse: version 23.3.2.1
error message: Code: 24. DB::Exception: Cannot write to ostream at offset 1082130432: While executing ParallelFormattingOutputFormat. (CANNOT_WRITE_TO_OSTREAM)
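Since the failure is intermittent and surfaces to JDBC as a SQLException, a blunt but common client-side mitigation is to retry the whole read. A minimal sketch, assuming the query is an idempotent SELECT and restarting from row 0 is acceptable (names are illustrative):

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import javax.sql.DataSource;

// Sketch: retry the full streaming read a few times when the stream
// breaks mid-flight. Only sensible for idempotent SELECTs where
// re-reading from the start is acceptable.
public class RetryingRead {
    public static long countRows(DataSource ds, String query, int maxAttempts) throws SQLException {
        SQLException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try (Connection conn = ds.getConnection();
                 Statement stmt = conn.createStatement();
                 ResultSet rs = stmt.executeQuery(query)) {
                long rows = 0;
                while (rs.next()) {
                    rows++; // consume promptly; see the timeout discussion above
                }
                return rows;
            } catch (SQLException e) {
                last = e; // e.g. a wrapped StreamCorruptedException
                System.err.printf("attempt %d/%d failed: %s%n", attempt, maxAttempts, e.getMessage());
            }
        }
        throw last;
    }
}
```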