Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix parsing of uneven stream bodies #1764

Merged
merged 5 commits into from
Apr 4, 2018
Merged
Show file tree
Hide file tree
Changes from 2 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
38 changes: 15 additions & 23 deletions core/src/main/scala/org/http4s/multipart/MultipartParser.scala
Original file line number Diff line number Diff line change
Expand Up @@ -222,7 +222,6 @@ object MultipartParser {
*
*/
private def splitCompleteMatch[F[_]: Sync](
state: Int,
middleChunked: Boolean,
sti: Int,
i: Int,
Expand All @@ -236,12 +235,6 @@ object MultipartParser {
//Emit the partial match as well
acc ++ carry ++ Stream.chunk(c.take(i - sti)),
Stream.chunk(c.drop(i))) //Emit after the match
} else if (state == 0) {
(
sti,
//Ignore the partial match (carry)
acc ++ Stream.chunk(c.take(i - sti)),
Stream.chunk(c.drop(i)))
} else {
(
sti,
Expand All @@ -265,7 +258,6 @@ object MultipartParser {
*
*/
def splitPartialMatch[F[_]: Sync](
state: Int,
middleChunked: Boolean,
currState: Int,
i: Int,
Expand All @@ -274,7 +266,7 @@ object MultipartParser {
c: Chunk[Byte]
): (Int, Stream[F, Byte], Stream[F, Byte]) = {
val ixx = i - currState
if (middleChunked || state == 0) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

state == 0 condition was removed because it is not longer necessary: this case will be picked up as a middleChunked case, now always.

if (middleChunked) {
val (lchunk, rchunk) = c.splitAt(ixx)
(currState, acc ++ carry ++ Stream.chunk(lchunk), Stream.chunk(rchunk))
} else {
Expand Down Expand Up @@ -305,25 +297,28 @@ object MultipartParser {
var i = 0
var currState = state
val len = values.length
var middleChunked = false
while (currState < len && i < c.size) {
if (c(i) == values(currState)) {
currState += 1
} else if (c(i) == values(0)) {
middleChunked = true
currState = 1
} else {
currState = 0
}
i += 1
}
//It will only be zero if
//the chunk matches from the very beginning,
//since currstate can never be greater than
//(i + state).
val middleChunked = i + state - currState > 0
Copy link
Contributor Author

@jmcardon jmcardon Apr 3, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some background on this condition to help review:

state is essentially a counter to the last index of a particular byte sequence seen. This is necessary since boundaries may be spread across chunks. I also denote carry as a possible partial match on a boundary, which gets re-emitted if it happens that it wasn't, or ignored if it completed a boundary.

middleChunked is simply the condition of whether the particular match, for example a boundary match, starts somewhere after the start of this chunk (index 0). If that is the case, then we know that the previous possible partial match on the boundary was a false positive and we can reemit it.

i - (currState - state) > 0 is essentially the condition we are after: This means if the amount of the value to match is equal to i (which is the counter over the whole chunk until possibly completing the match), then clearly the match started at the beginning of the chunk, in which case we can ignore the previous carry. If this condition is greater than 0, this means that this boundary match began somewhere in the midde, thus it is middleChunked


if (currState == 0) {
(0, acc ++ carry ++ Stream.chunk(c), Stream.empty)
} else if (currState == len) {
splitCompleteMatch(state, middleChunked, currState, i, acc, carry, c)
splitCompleteMatch(middleChunked, currState, i, acc, carry, c)
} else {
splitPartialMatch(state, middleChunked, currState, i, acc, carry, c)
splitPartialMatch(middleChunked, currState, i, acc, carry, c)
}
}

Expand Down Expand Up @@ -540,13 +535,6 @@ object MultipartParser {
//Emit after the match
Stream.chunk(c.drop(i)),
state + i - sti)
} else if (state == 0) {
(
sti,
//Ignore the partial match (carry)
acc ++ Stream.chunk(c.take(i - sti)),
Stream.chunk(c.drop(i)),
i - sti)
} else {
(
sti,
Expand Down Expand Up @@ -580,7 +568,7 @@ object MultipartParser {
c: Chunk[Byte]
): (Int, Stream[F, Byte], Stream[F, Byte], Int) = {
val ixx = i - currState
if (middleChunked || state == 0) {
if (middleChunked) {
val (lchunk, rchunk) = c.splitAt(ixx)
(
currState,
Expand All @@ -602,19 +590,23 @@ object MultipartParser {
var i = 0
var currState = state
val len = values.length
var middleChunked = false
while (currState < len && i < c.size) {
if (c(i) == values(currState)) {
currState += 1
} else if (c(i) == values(0)) {
middleChunked = true
currState = 1
} else {
currState = 0
}
i += 1
}

//It will only be zero if
//the chunk matches from the very beginning,
//since currstate can never be greater than
//(i + state).
val middleChunked = i + state - currState > 0

if (currState == 0) {
(0, acc ++ carry ++ Stream.chunk(c), Stream.empty, i)
} else if (currState == len) {
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -669,6 +669,44 @@ object MultipartParserSpec extends Specification {
bodies.attempt.unsafeRunSync() must beRight("bar")
}

"parse uneven input properly" in {
val unprocessed =
Stream
.segment(
List(
"--_5PHqf8_Pl1FCzBuT5o_mVZg36k67UYI\n",
"Content-Disposition: form-data; name=\"upload\"; filename=\"integration.txt\"\n",
"""Content-Type: application/octet-stream
|Content-Transfer-Encoding: binary
|
|this is a test
|here's another test
|catch me if you can!
|
|--_5PHqf8_Pl1FCzBuT5o_mVZg36k67UYI
|Content-Disposition: form-data; name="foo"
|
|""".stripMargin,
"""bar
|--_5PHqf8_Pl1FCzBuT5o_mVZg36k67UYI--""".stripMargin
).map(_.replaceAllLiterally("\n", "\r\n"))
.map(str => Segment.chunk(Chunk.bytes(str.getBytes)))
.foldLeft(Segment.empty[Byte])(_ ++ _)
)
.covary[IO]

val results = unprocessed.through(MultipartParser.parseStreamed(boundary))
val multipartMaterialized = results.compile.last.map(_.get).unsafeRunSync()
val bodies = multipartMaterialized
.parts(1)
.body
.through(asciiDecode)
.compile
.fold("")(_ ++ _)

bodies.attempt.unsafeRunSync() must beRight("bar")
}

"produce the correct headers from a two part input" in {
val unprocessedInput =
"""
Expand Down