Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve memory use and performance of multipart parsing #745

Merged
merged 2 commits into from Aug 31, 2021

Conversation

quinnj
Copy link
Member

@quinnj quinnj commented Aug 31, 2021

Memory usage and performance noted in
https://discourse.julialang.org/t/http-parse-multipart-form-does-a-lot-of-memory-allocations/66661.
This uses some of the same regex tricks we use in the other http parsing
code that operates on SubStrings and avoids excessive allocations. We
also streamline things to avoid allocating when comparing byte buffers
and the unnecessary allocated arrays of the content disposition
processing.

On my machine for current master, I see 35.022 μs (866 allocations: 70.27 KiB) for parsing a simple multipart request from the tests, and
with this PR, I see 3.603 μs (42 allocations: 2.55 KiB).

Memory usage and performance noted in
https://discourse.julialang.org/t/http-parse-multipart-form-does-a-lot-of-memory-allocations/66661.
This uses some of the same regex tricks we use in the other http parsing
code that operates on SubStrings and avoids excessive allocations. We
also streamline things to avoid allocating when comparing byte buffers
and the unnecessary allocated arrays of the content disposition
processing.

On my machine for current master, I see `35.022 μs (866 allocations:
70.27 KiB)` for parsing a simple multipart request from the tests, and
with this PR, I see `3.603 μs (42 allocations: 2.55 KiB)`.

"""
find_multipart_boundary(bytes, boundaryDelimiter; start::Int=1)

Find the first and last index of the next boundary delimiting a part, and if
the discovered boundary is the terminating boundary.
"""
function find_multipart_boundary(bytes::AbstractVector{UInt8}, boundaryDelimiter::AbstractVector{UInt8}; start::Int=1)
@inline function find_multipart_boundary(bytes::AbstractVector{UInt8}, boundaryDelimiter::AbstractVector{UInt8}; start::Int=1)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm kind of amazed that inlining this function has any performance improvement.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, it doesn't. This was from an iteration where I was returning a SubString instead of indices and it could avoid the allocation if it was inlined.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So should it have been removed?

@mbauman
Copy link

mbauman commented Aug 31, 2021

I can confirm this significantly improves the original request that was being tested/reported on Discourse — it is much larger and has an even larger benefit.

Before:

julia> @benchmark HTTP.parse_multipart_form($r)
BenchmarkTools.Trial: 26 samples with 1 evaluation.
 Range (min  max):  181.312 ms  235.950 ms  ┊ GC (min  max): 3.11%  3.33%
 Time  (median):     195.296 ms               ┊ GC (median):    3.16%
 Time  (mean ± σ):   197.687 ms ±  13.029 ms  ┊ GC (mean ± σ):  3.20% ± 0.16%

 Memory estimate: 584.19 MiB, allocs estimate: 4785830.

After:

julia> @benchmark HTTP.parse_multipart_form($r)
BenchmarkTools.Trial: 917 samples with 1 evaluation.
 Range (min  max):  5.131 ms    9.564 ms  ┊ GC (min  max): 0.00%  0.00%
 Time  (median):     5.309 ms               ┊ GC (median):    0.00%
 Time  (mean ± σ):   5.447 ms ± 448.252 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

 Memory estimate: 2.03 KiB, allocs estimate: 32.

const CR_BYTE = 0x0d # \r
const LF_BYTE = 0x0a # \n
const DASH_BYTE = 0x2d # -
const HTAB_BYTE = 0x09 # \t
const SPACE_BYTE = 0x20
const RETURN_BYTES = [CR_BYTE, LF_BYTE]
const SEMICOLON_BYTE = UInt8(';')
const CRLFCRLF = [CR_BYTE, LF_BYTE, CR_BYTE, LF_BYTE]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make into a tuple (allows the compiler to know it's length etc)? I don't think it is mutated anywhere?

@quinnj
Copy link
Member Author

quinnj commented Aug 31, 2021

Ok, I unexported closewrite (and friends) so nightly isn't broken due to Julia Base now exporting closewrite. I also fixed the websocket tests which were failing because echo.websocket.org doesnt' exist anymore. Let's see how CI fares now.

@codecov-commenter
Copy link

codecov-commenter commented Aug 31, 2021

Codecov Report

Merging #745 (bbff1de) into master (7a3d9c9) will decrease coverage by 0.04%.
The diff coverage is 78.78%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #745      +/-   ##
==========================================
- Coverage   76.89%   76.84%   -0.05%     
==========================================
  Files          38       38              
  Lines        2445     2514      +69     
==========================================
+ Hits         1880     1932      +52     
- Misses        565      582      +17     
Impacted Files Coverage Δ
src/HTTP.jl 97.87% <ø> (+0.04%) ⬆️
src/IOExtras.jl 72.22% <0.00%> (-9.03%) ⬇️
src/parsemultipart.jl 90.90% <89.47%> (-3.92%) ⬇️
src/Messages.jl 87.94% <100.00%> (+0.17%) ⬆️
src/TimeoutRequest.jl 80.00% <0.00%> (-4.22%) ⬇️
src/debug.jl 46.42% <0.00%> (-3.58%) ⬇️
src/ConnectionRequest.jl 80.88% <0.00%> (-2.46%) ⬇️
src/CookieRequest.jl 85.29% <0.00%> (-1.81%) ⬇️
src/Servers.jl 85.97% <0.00%> (-0.36%) ⬇️
src/Handlers.jl 62.65% <0.00%> (-0.32%) ⬇️
... and 19 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update a2b467e...bbff1de. Read the comment docs.

Also clean up usages of startread, closeread, startwrite, closewrite to
quality to the IOExtras module. Also fix the websocket tests that were
hanging since echo.websocket.org doesn't exist anymore. I found pie
socket that offers a free websocket testing server if you sign up and
get an api key, so that's been added as a repository secret now. If
anyone wants the api key to run the tests on their local machine, I'm
happy to share.
@quinnj quinnj merged commit 0d037f6 into master Aug 31, 2021
@quinnj quinnj deleted the jq/multipart_parsing_perf branch August 31, 2021 19:24
closeread(io) = nothing
if isdefined(Base, :startwrite)
"$_doc"
Base.startwrite(io) = nothing
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is type piracy now

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, we'll need to make a plan around this since Base doesn't define a fallback for closewrite. Either we add closewrite(io) = nothing to Base and we can check if it's defined here, or it's also possible we don't need the fallback anymore. I left this as-is for now since it ensures non-breaking.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why extend the function at all? You can just keep using it as a separate function and qualify it (or the Base one).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because then that's breaking; we have all these code examples all over the package where we tell people to write streaming servers doing closewrite(stream).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can still export it. Downstreams users just must then decide if they want to use Base.closewrite or HTTP.closewrite.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally, I would strongly prefer that we remove the type piracy and just make a breaking release of HTTP.jl.

(I don't think it's actually breaking. I think the burden is on the downstream users to make sure that they properly qualify any names that they use.

But if this is considering breaking, then I'd rather just tag the breaking release so that we can get rid of the type piracy.)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants