-
-
Notifications
You must be signed in to change notification settings - Fork 2.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Sporadic 'index out of bounds' panic in tls client #15226
Comments
Now we have both http client and server, let's also gain a TLS server in addition to TLS client, and then we can add some unit testing and fuzz testing for all 4 of these APIs. |
zig version: 0.11.0-dev.4006+bf827d0b5 I had to changed the test code to in order to run on latest zig master. const std = @import("std");
pub fn main() !void {
var gpa = std.heap.GeneralPurposeAllocator(.{ .never_unmap = true, .retain_metadata = true }){};
defer _ = gpa.deinit();
const allocator = gpa.allocator();
var args = try std.process.argsWithAllocator(allocator);
defer args.deinit();
_ = args.next();
const uri = try std.Uri.parse(args.next() orelse unreachable);
var client = std.http.Client{ .allocator = allocator };
defer client.deinit();
var req = try client.request(.GET, uri, .{ .allocator = allocator }, .{});
defer req.deinit();
try req.start();
try req.wait();
const body1 = try req.reader().readAllAlloc(std.heap.page_allocator, 8192 * 1024 * 1024);
std.debug.print("{d} {s}\n", .{ body1.len, body1[0..10] });
defer std.heap.page_allocator.free(body1);
}
|
Thanks @sgwong, same result here. Tested several URLs in a loop with no issues. |
This is still an issue. The following code should eventually panic, but it can take a very long time. Might just be my imagination, but I feel if it doesn't happen in the first 5 minutes, it's best to stop and restart it. const std = @import("std");
const net = std.net;
const tls = std.crypto.tls;
pub fn main() !void {
var general_purpose_allocator = std.heap.GeneralPurposeAllocator(.{}){};
const allocator = general_purpose_allocator.allocator();
var bundle = std.crypto.Certificate.Bundle{};
try bundle.rescan(allocator);
const stream = try net.tcpConnectToHost(allocator, "ws-feed.exchange.coinbase.com", 443);
var client = try tls.Client.init(stream, bundle, "ws-feed.exchange.coinbase.com");
// Websocket Handshake.
try client.writeAll(stream, &.{71, 69, 84, 32, 47, 32, 72, 84, 84, 80, 47, 49, 46, 49, 13, 10, 99, 111, 110, 116, 101, 110, 116, 45, 108, 101, 110, 103, 116, 104, 58, 32, 48, 13, 10, 117, 112, 103, 114, 97, 100, 101, 58, 32, 119, 101, 98, 115, 111, 99, 107, 101, 116, 13, 10, 115, 101, 99, 45, 119, 101, 98, 115, 111, 99, 107, 101, 116, 45, 118, 101, 114, 115, 105, 111, 110, 58, 32, 49, 51, 13, 10, 99, 111, 110, 110, 101, 99, 116, 105, 111, 110, 58, 32, 117, 112, 103, 114, 97, 100, 101, 13, 10, 115, 101, 99, 45, 119, 101, 98, 115, 111, 99, 107, 101, 116, 45, 107, 101, 121, 58, 32, 103, 57, 83, 108, 115, 76, 55, 50, 65, 90, 69, 66, 97, 69, 50, 99, 101, 106, 108, 43, 106, 81, 61, 61, 13, 10, 72, 111, 115, 116, 58, 32, 119, 115, 45, 102, 101, 101, 100, 46, 101, 120, 99, 104, 97, 110, 103, 101, 46, 99, 111, 105, 110, 98, 97, 115, 101, 46, 99, 111, 109, 13, 10, 13, 10});
// This sends a subscribe message. The payload is JSON, but it's framed and masked
// as a websocket frame. It's taken from the first example at:
// https://docs.cloud.coinbase.com/exchange/docs/websocket-overview
try client.writeAll(stream, &.{129, 254, 0, 218, 38, 123, 44, 129, 93, 113, 37, 163, 82, 2, 92, 228, 4, 65, 12, 163, 85, 14, 78, 242, 69, 9, 69, 227, 67, 89, 0, 139, 47, 89, 92, 243, 73, 31, 89, 226, 82, 36, 69, 229, 85, 89, 22, 161, 125, 113, 37, 136, 47, 89, 105, 213, 110, 86, 121, 210, 98, 89, 0, 139, 47, 114, 37, 163, 99, 47, 100, 172, 99, 46, 126, 163, 44, 114, 113, 173, 44, 114, 14, 226, 78, 26, 66, 239, 67, 23, 95, 163, 28, 91, 119, 139, 47, 114, 37, 163, 74, 30, 90, 228, 74, 73, 14, 173, 44, 114, 37, 136, 4, 19, 73, 224, 84, 15, 78, 228, 71, 15, 14, 173, 44, 114, 37, 136, 93, 113, 37, 136, 47, 114, 37, 163, 72, 26, 65, 228, 4, 65, 12, 163, 82, 18, 79, 234, 67, 9, 14, 173, 44, 114, 37, 136, 47, 114, 14, 241, 84, 20, 72, 244, 69, 15, 115, 232, 66, 8, 14, 187, 6, 32, 38, 136, 47, 114, 37, 136, 47, 114, 14, 196, 114, 51, 1, 195, 114, 56, 14, 173, 44, 114, 37, 136, 47, 114, 37, 136, 4, 62, 120, 201, 11, 46, 127, 197, 4, 113, 37, 136, 47, 114, 37, 220, 44, 114, 37, 136, 91, 113, 37, 220, 44, 6});
var buf: [512]u8 = undefined;
while (true) {
const n = try client.read(stream, buf[0..]);
if (n == 0) {
break;
}
std.debug.print("{s}\n", .{buf[0..n]});
}
} All this program does is send a websocket handshake, and then sends a framed and masked json subscribed message (from the coinbase documentation). It then just continues to read from the socket and print the output. If you run this and notice that every json message is prefixed with something like "�~�", that's the websocket framing data. The sample program doesn't try to parse the data, just print it, so that's normal. Eventually, you get:
|
I'm trying to reproduce another possible bug with the TLS client, and in doing so, I might have found a slightly more reliable way to reproduce this. It seems to happen a lot more often around network hiccups. What I did was run the above code in a docker container, and use:
Some time between 30-45 seconds works most reliably on my computer. This is the Dockerfile I used:
where test.zig is the above code. |
Another stack trace crashing on line 1271 instead of 1233:
I wish I could speculate where the error is, but it takes a minute to reproduce and |
This issue may or may not be related to another issue: reading works, but fills the buffer with garbage data (maybe it's encrypted data, I dunno). I have not been able to reproduce it reliably at all. For one, the Getting out of bounds is one thing....but having it succeed but with corrupt data seems like a much worse issue. |
I've figured out a bit more. Crash happens after Example bad
|
Made a repo which consistently reproduces panic:
|
When calculating how much ciphertext from the stream can fit into user and internal buffers we should also take into account ciphertext data which are already in internal buffer. Fixes: 15226 Tested with [this](ziglang#15226 (comment)). Using client with different read buffers until I, hopefully, understood what is happening. Not relevant to this fix, but this [part](https://github.com/ziglang/zig/blob/95d9292a7a09ed883e65510ec054619747315c48/lib/std/crypto/tls/Client.zig#L988-L991) is still mystery to me. Why we don't use free_size in buf_cap calculation. Seems like rudiment from previous implementation without iovec.
When calculating how much ciphertext from the stream can fit into user and internal buffers we should also take into account ciphertext data which are already in internal buffer. Fixes: 15226 Tested with [this](#15226 (comment)). Using client with different read buffers until I, hopefully, understood what is happening. Not relevant to this fix, but this [part](https://github.com/ziglang/zig/blob/95d9292a7a09ed883e65510ec054619747315c48/lib/std/crypto/tls/Client.zig#L988-L991) is still mystery to me. Why we don't use free_size in buf_cap calculation. Seems like rudiment from previous implementation without iovec.
When calculating how much ciphertext from the stream can fit into user and internal buffers we should also take into account ciphertext data which are already in internal buffer. Fixes: 15226 Tested with [this](ziglang#15226 (comment)). Using client with different read buffers until I, hopefully, understood what is happening. Not relevant to this fix, but this [part](https://github.com/ziglang/zig/blob/95d9292a7a09ed883e65510ec054619747315c48/lib/std/crypto/tls/Client.zig#L988-L991) is still mystery to me. Why we don't use free_size in buf_cap calculation. Seems like rudiment from previous implementation without iovec.
When calculating how much ciphertext from the stream can fit into user and internal buffers we should also take into account ciphertext data which are already in internal buffer. Fixes: 15226 Tested with [this](ziglang#15226 (comment)). Using client with different read buffers until I, hopefully, understood what is happening. Not relevant to this fix, but this [part](https://github.com/ziglang/zig/blob/95d9292a7a09ed883e65510ec054619747315c48/lib/std/crypto/tls/Client.zig#L988-L991) is still mystery to me. Why we don't use free_size in buf_cap calculation. Seems like rudiment from previous implementation without iovec.
When calculating how much ciphertext from the stream can fit into user and internal buffers we should also take into account ciphertext data which are already in internal buffer. Fixes: 15226 Tested with [this](ziglang#15226 (comment)). Using client with different read buffers until I, hopefully, understood what is happening. Not relevant to this fix, but this [part](https://github.com/ziglang/zig/blob/95d9292a7a09ed883e65510ec054619747315c48/lib/std/crypto/tls/Client.zig#L988-L991) is still mystery to me. Why we don't use free_size in buf_cap calculation. Seems like rudiment from previous implementation without iovec.
Zig Version
0.11.0-dev.2477+2ee328995
Steps to Reproduce and Observed Behavior
Using the test program I received from @truemedian to test a fix for the slow TLS bug:
zig build-exe test-https.zig
Then execute this in a loop in a shell:
Usually it works, but sporadically the program either hangs or panics with:
Expected Behavior
No
panic:
and no hangsNote
The response headers may vary in length for the same URL. Some responses can have different lengths of various ID fields, sometimes the response is chunked, sometimes with response-length.
Both chunked and non-chunked responses sometimes work, so that's not the issue by itself.
The text was updated successfully, but these errors were encountered: