
feat: add gzip_partial_decode function and update documentation#248

Closed
HaimLC wants to merge 1 commit into projectdiscovery:main from HaimLC:feat/gzip_partial_decode

Conversation


@HaimLC HaimLC commented May 13, 2025

implemented #247

@ehsandeep ehsandeep requested a review from dwisiswant0 May 13, 2025 23:59
Member

@dwisiswant0 dwisiswant0 left a comment


hi, thanks for the PR! i appreciate the effort, but i don’t think it makes sense to have two separate functions that essentially do the same thing. there’s no need to add gzip_partial_decode; instead, we can simply refactor gzip_decode to take an optional limit parameter - if the user provides it, we can use io.LimitReader to read only the specified N bytes. if it’s not provided, we can fall back to the DefaultMaxDecompressionSize var as the limit. this way, we keep things clean and avoid duplication.

also, the test fails.

@HaimLC
Author

HaimLC commented May 14, 2025

i'll fix the pipeline; locally it worked well :)
They are very different: the current gzip_decode returns an empty string if the gzip stream is cut in the middle,
while the new one returns the data.
original:

data, err := io.ReadAll(limitReader)
if err != nil {
  _ = reader.Close()
  return "", err
}

partial:

data, err := io.ReadAll(limitReader)
if err != nil && err != io.ErrUnexpectedEOF {
  _ = reader.Close()
  return "", err
}

Why is this important?
When the gzip stream is incomplete, io.ReadAll returns io.ErrUnexpectedEOF.
In the original code, this results in an empty string being returned.
In the new code, we handle io.ErrUnexpectedEOF separately, so we return the partially decoded data instead.
for example, if we have gzip for 'hello' and it has been cut, we would still get 'hell'; with the original we would get ''.

@dwisiswant0
Member

i noticed that you’re only handling the EOF case, but i didn’t see any parameter being supplied to limit the reader to N bytes - i.e., you’re still relying on the DefaultMaxDecompressionSize var. i’ll go ahead and take over this and refactor it, since i also noticed a few DSL functions that depend on DefaultMaxDecompressionSize and could benefit from being more flexible.

thanks for your contribution, really appreciate the effort!

@dwisiswant0
Member

Superseded by #249.

@HaimLC
Author

HaimLC commented May 14, 2025

The limit happens in the template :)
for example:
https://github.com/dk4trin/templates-nuclei/blob/main/springboot-heapdump-v2.yaml#L27C1-L28C1
i wanted to improve the template by using gzip_decode, to reduce false positives,
but because the bytes are cut in the middle, i can't search for java_profile after gzip_decode, since it returns "" on an unexpected EOF.

@dwisiswant0
Member

the max-size prop in the template isn’t the same as a read limit. how are you simulating truncated data and passing it to the DSL function? AFAIK, it’s not possible since DSL functions don’t work in a streaming manner, they need the full literal data upfront. unless i’m missing something or got it wrong? could you explain how you’re doing it, @HaimLC?

@dwisiswant0 dwisiswant0 added the question Further information is requested label May 14, 2025
