Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

method might not refer to a location within buffer #124

Closed
jkarneges opened this issue Oct 12, 2022 · 2 comments · Fixed by #136
Closed

method might not refer to a location within buffer #124

jkarneges opened this issue Oct 12, 2022 · 2 comments · Fixed by #136

Comments

@jkarneges
Copy link

Our app parses a request but doesn't need to do anything with it until the body is also received. In order to allow the underlying buffer to be used for reading the body, and also to avoid parsing the request twice, the app converts the various slices (method, path, headers) into integer indexes within the underlying buffer and then drops the Request in order to unborrow the buffer. Later on, the slices can be reestablished using the indexes.

As of httparse 1.8, our app started failing due to slice locations potentially existing outside of the buffer, likely due to "GET" and "POST" now being returned as static strings.

In hindsight I suppose we were abusing the API. httparse never guaranteed the slices would always point within the buffer. It was just an easy assumption to make since httparse is known to do in-place parsing without copying. I'm not sure if anything should be changed in httparse and we will look at reworking our code. Posting this in case anyone else ran into the same issue.

@jkarneges
Copy link
Author

Following up on this. I came up with three possible solutions:

  • Parse the request twice, once before reading the body and once after. This is likely the simplest approach.

  • Keep doing the slice->index conversion thing we were doing, but take into account that a slice might not be in the buffer. Each tracked field could be an enum, something like enum FieldRef { RangeInBuf(u32, u32), Slice(&'static str) }. This is likely the closest solution to what we were already doing.

  • Use a different buffer for subsequent reads. After parsing the request, copy any remaining bytes into a second buffer, and then start using that second buffer for reading the body. The original buffer stays borrowed by the Request. Further, when it's time to read the next request, read it from the second buffer. This is likely the most optimal solution, if there happens to be an unused buffer lying around. In our case we did have this: an idle write buffer.

I ended up going with the last approach. For each connection we were allocating two Vec<u8> buffers, one for reading and one for writing, but the write buffer was not used at all during the reading process and so it was available. The implementation was a bit tricky, with the buffers switching places for each request, but it seems to work.

@AaronO
Copy link
Contributor

AaronO commented Apr 24, 2023

@jkarneges I'll fix this for the next release, my fault, it makes sense to return buffer pointers instead of static strings.

AaronO added a commit to AaronO/httparse that referenced this issue Apr 24, 2023
Instead of static constants for GET/POST, fixes seanmonstar#124
seanmonstar pushed a commit that referenced this issue Apr 24, 2023
Instead of static constants for GET/POST, fixes #124
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants