Occasional E1008 timeout errors when accessing a service over short-lived OpenSSL connections #3098
Description
Describe the bug
When a service is accessed over short-lived OpenSSL connections, requests occasionally fail with an E1008 timeout error.
To Reproduce
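The problem is relatively easy to trigger when the response payload is between 64K and 200K. With larger or smaller payloads it rarely occurs.
Log analysis shows that whenever the client reports E1008, the following holds in IOBuf::cut_multiple_into_SSL_channel:
nw equals the full message length, while BIO_flush(wbio) <= 0, BIO_should_write(wbio) > 0, and BIO_fd_non_fatal_error(errno) > 0.
This means that data is still sitting in the SSL buffer.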
iobuf.cpp
IOBuf::cut_multiple_into_SSL_channel
#ifndef USE_MESALINK
    // Flush remaining data inside the BIO buffer layer
    BIO* wbio = SSL_get_wbio(ssl);
    if (BIO_wpending(wbio) > 0) {
        int rc = BIO_flush(wbio);
        if (rc <= 0 && BIO_fd_non_fatal_error(errno) == 0) {
            // Fatal error during BIO_flush
            *ssl_error = SSL_ERROR_SYSCALL;
            return rc;
        }
        // <<<<< Execution takes this path: BIO_flush returned <= 0 with a
        // non-fatal error, so the function falls through and returns nw.
    }
#else
    int rc = SSL_flush(ssl);
    if (rc <= 0) {
        *ssl_error = SSL_ERROR_SYSCALL;
        return rc;
    }
#endif

    return nw;
}
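The state described above (a write that reports the full length while the flush cannot finish) can be reproduced in a toy model. This is only a sketch: `ToySocket`, `ToyBufferedWriter`, `StallResult`, and `demo_stalled_flush` are hypothetical names invented for illustration, not OpenSSL or brpc APIs, and the 64K socket capacity is an assumed value chosen to match the payload sizes in this report.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for a non-blocking socket: accepts at most
// `capacity` more bytes, then reports "would block" by returning -1.
struct ToySocket {
    std::size_t capacity;
    std::string received;

    long write(const char* data, std::size_t len) {
        if (capacity == 0) return -1;  // models a non-fatal EAGAIN
        std::size_t n = std::min(len, capacity);
        received.append(data, n);
        capacity -= n;
        return static_cast<long>(n);
    }
};

// Hypothetical stand-in for a buffering write layer, i.e. the role the
// BIO buffer plays in this report.
struct ToyBufferedWriter {
    ToySocket* sock;
    std::string pending;

    // Accepts the whole message into its own buffer, tries to flush, and
    // reports the full length even if the flush could not complete.
    long write(const std::string& msg) {
        pending += msg;
        flush();  // a non-fatal flush failure is deliberately ignored
        return static_cast<long>(msg.size());
    }

    // Returns 1 when the buffer is empty, -1 when the socket would block.
    int flush() {
        while (!pending.empty()) {
            long n = sock->write(pending.data(), pending.size());
            if (n < 0) return -1;
            assert(n > 0);
            pending.erase(0, static_cast<std::size_t>(n));
        }
        return 1;
    }
};

struct StallResult {
    long nw;              // value reported by the write call
    int flush_rc;         // result of a follow-up flush
    std::size_t pending;  // bytes still held in the buffer layer
    std::size_t on_wire;  // bytes that actually reached the socket
};

// Writes a 100K message through a socket that only has room for 64K.
StallResult demo_stalled_flush() {
    ToySocket sock{std::size_t{64} * 1024, {}};
    ToyBufferedWriter writer{&sock, {}};
    const std::string msg(std::size_t{100} * 1024, 'x');

    long nw = writer.write(msg);
    int rc = writer.flush();
    return {nw, rc, writer.pending.size(), sock.received.size()};
}
```

In this model the caller sees `nw` equal to the full 100K, yet 36K is still held in the buffer layer and only 64K has reached the socket. A caller that tracks progress through `nw` alone has no reason to come back and flush.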
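Because all of the data has already been written into the SSL session, IsWriteComplete returns true in KeepWrite.
Execution therefore never re-enters DoWrite and cut_multiple_into_SSL_channel, and BIO_flush is not called again to send the data remaining in the SSL buffer.
As a result, the last segment of data never reaches the client, which triggers the E1008 timeout error.
Calling BIO_flush directly after the IsWriteComplete check in KeepWrite resolves the problem,
but I am not sure whether it introduces other issues.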
socket.cpp
void* Socket::KeepWrite(void* void_arg)
        if (NULL == cur_tail) {
            for (cur_tail = req; cur_tail->next != NULL;
                 cur_tail = cur_tail->next);
        }
        // Return when there's no more WriteRequests and req is completely
        // written.
        if (s->IsWriteComplete(cur_tail, (req == cur_tail), &cur_tail)) {
            CHECK_EQ(cur_tail, req);
            s->ReturnSuccessfulWriteRequest(req);
            return NULL;
        }
    } while (1);

    // Error occurred, release all requests until no new requests.
    s->ReleaseAllFailedWriteRequests(req);
    return NULL;
}
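The proposed change (flush again once the write is considered complete) can be sketched with a toy model. This is not the actual brpc patch: `ToySocket`, `ToyBufferedWriter`, `toy_keep_write`, and the `drain_before_return` flag are hypothetical names, and `make_writable()` merely stands in for waiting until the fd becomes writable again, which real code would do by blocking on a writability event rather than looping.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical non-blocking sink: takes at most `window` bytes per
// "writable" period. make_writable() models the fd becoming writable again.
struct ToySocket {
    std::size_t window;
    std::size_t room;
    std::string received;

    void make_writable() { room = window; }

    long write(const char* data, std::size_t len) {
        if (room == 0) return -1;  // models a non-fatal EAGAIN
        std::size_t n = std::min(len, room);
        received.append(data, n);
        room -= n;
        return static_cast<long>(n);
    }
};

// Hypothetical buffering layer, playing the role of the BIO buffer.
struct ToyBufferedWriter {
    ToySocket* sock;
    std::string pending;

    long write(const std::string& msg) {
        pending += msg;
        flush();  // a non-fatal flush failure is deliberately ignored
        return static_cast<long>(msg.size());
    }

    // Returns 1 when the buffer is empty, -1 when the socket would block.
    int flush() {
        while (!pending.empty()) {
            long n = sock->write(pending.data(), pending.size());
            if (n < 0) return -1;
            pending.erase(0, static_cast<std::size_t>(n));
        }
        return 1;
    }
};

// Toy analogue of the write loop. Returns the number of bytes that reached
// the socket by the time the loop would report success.
std::size_t toy_keep_write(const std::string& msg, bool drain_before_return) {
    const std::size_t window = std::size_t{64} * 1024;
    ToySocket sock{window, window, {}};
    ToyBufferedWriter writer{&sock, {}};

    std::size_t unwritten = msg.size();
    unwritten -= static_cast<std::size_t>(writer.write(msg));

    // Analogue of the completeness check: it only looks at request bytes
    // not yet handed to the SSL layer, so it passes immediately.
    assert(unwritten == 0);

    if (drain_before_return) {
        // Proposed change: keep flushing until the buffer layer is empty.
        while (writer.flush() < 0) {
            sock.make_writable();  // placeholder for a writability wait
        }
    }
    return sock.received.size();
}
```

With a 100K message, `toy_keep_write(msg, false)` reports success after only 64K has reached the socket, while `toy_keep_write(msg, true)` delivers all 100K. Whether an extra flush at this point is safe in the real KeepWrite (for example with respect to concurrent writers or error handling) is exactly the open question raised above, and the model does not answer it.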
Expected behavior
Versions
OS:rhel8/rhel9
Compiler:
brpc:1.9-1.14
protobuf:
Additional context/screenshots