Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(httpclientx): allow configuring max-response-body size #1588

Merged
merged 91 commits into from
May 3, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
91 commits
Select commit Hold shift + click to select a range
75ef7fd
refactor: consolidate httpx and httpapi
bassosimone Apr 22, 2024
f9210ec
refactor to make testing the whole package easier
bassosimone Apr 23, 2024
587290c
Merge branch 'master' into issue/2700
bassosimone Apr 23, 2024
af394c2
Merge branch 'master' into issue/2700
bassosimone Apr 23, 2024
c6f2f5a
Merge branch 'master' into issue/2700
bassosimone Apr 23, 2024
68c9779
Merge branch 'issue/2700' of github.com:ooni/probe-cli into issue/2700
bassosimone Apr 23, 2024
57e29da
Merge branch 'master' into issue/2700
bassosimone Apr 23, 2024
5c953f0
x
bassosimone Apr 23, 2024
e03e810
x
bassosimone Apr 23, 2024
a6046fd
x
bassosimone Apr 23, 2024
341fcf2
x
bassosimone Apr 23, 2024
8c34524
x
bassosimone Apr 23, 2024
4b464ff
try to entirely remove httpx usages
bassosimone Apr 23, 2024
6d57184
fix: make sure there is nil safety
bassosimone Apr 23, 2024
9c2a226
oxford comma: yes/no?
bassosimone Apr 23, 2024
1123b4e
x
bassosimone Apr 23, 2024
d421d24
fix: unit test needs to be adapted
bassosimone Apr 24, 2024
67e0a10
chore: improve testing for cloudflare IP lookup
bassosimone Apr 24, 2024
a69d981
chore: improve the ubuntu IP lookup tests
bassosimone Apr 24, 2024
cd25c56
Merge branch 'master' into issue/2700
bassosimone Apr 24, 2024
642ae5c
x
bassosimone Apr 24, 2024
548e6bc
doc: document oonirun/v2_test.go tests
bassosimone Apr 24, 2024
40db0e5
Merge branch 'master' into issue/2700
bassosimone Apr 24, 2024
4cf3566
start improving probeservices tests
bassosimone Apr 24, 2024
e736e42
Merge branch 'master' into issue/2700
bassosimone Apr 26, 2024
e8471c4
x
bassosimone Apr 26, 2024
aa1c836
Merge branch 'master' into issue/2700
bassosimone Apr 26, 2024
08e81a9
x
bassosimone Apr 26, 2024
fa74b48
x
bassosimone Apr 26, 2024
a7e748f
x
bassosimone Apr 26, 2024
87146cc
x
bassosimone Apr 26, 2024
dac7b8f
x
bassosimone Apr 26, 2024
04b0071
Merge branch 'master' into issue/2700
bassosimone Apr 26, 2024
79d1fee
Merge branch 'master' into issue/2700
bassosimone Apr 29, 2024
88b399d
Merge branch 'master' into issue/2700
bassosimone Apr 29, 2024
de23e7d
x
bassosimone Apr 29, 2024
9d87673
Merge branch 'master' into issue/2700
bassosimone Apr 29, 2024
a436f1e
x
bassosimone Apr 29, 2024
08f8ca9
Merge branch 'master' into issue/2700
bassosimone Apr 29, 2024
25140f3
x
bassosimone Apr 29, 2024
1bbe0b7
chore: write tests for oonicollector.go
bassosimone Apr 30, 2024
6707d61
Merge branch 'master' into issue/2700
bassosimone Apr 30, 2024
4ddd507
Merge branch 'master' into issue/2700
bassosimone Apr 30, 2024
c453ee2
x
bassosimone Apr 30, 2024
ad3d84f
Merge branch 'master' into issue/2700
bassosimone May 2, 2024
28d64f1
feat(probeservices): use httpclientx for check-in
bassosimone May 2, 2024
2107750
cleanup: remove check-in from ooapi
bassosimone May 2, 2024
c2c8ebf
Merge branch 'master' into issue/2700
bassosimone May 2, 2024
36610a8
feat: start moving TH call into engine/session.go
bassosimone May 2, 2024
b7ccf2f
Merge branch 'master' into issue/2700
bassosimone May 2, 2024
b94a8b8
x
bassosimone May 2, 2024
5f1994c
x
bassosimone May 2, 2024
6e16369
Merge branch 'master' into issue/2700
bassosimone May 2, 2024
8400bde
x
bassosimone May 2, 2024
117fcc2
Merge branch 'master' into issue/2700
bassosimone May 2, 2024
17f9b83
x
bassosimone May 2, 2024
8ca93f0
x
bassosimone May 2, 2024
854da9a
Merge branch 'master' into issue/2700
bassosimone May 2, 2024
f78e32d
x
bassosimone May 2, 2024
5a32450
fix(httpclientx): fast fallback on immediate failure
bassosimone May 2, 2024
eb47e28
x
bassosimone May 2, 2024
dbb7d6e
x
bassosimone May 2, 2024
c63e68d
x
bassosimone May 2, 2024
1c55710
try to write algorithm without deadlocks
bassosimone May 2, 2024
b461aa4
x
bassosimone May 2, 2024
421f179
x
bassosimone May 2, 2024
2a44cd5
x
bassosimone May 2, 2024
13777ba
x
bassosimone May 2, 2024
a4fb49e
x
bassosimone May 2, 2024
cdd0e72
x
bassosimone May 3, 2024
0594703
x
bassosimone May 3, 2024
0b693ce
x
bassosimone May 3, 2024
bf35e09
x
bassosimone May 3, 2024
9c94348
x
bassosimone May 3, 2024
8de9e6b
x
bassosimone May 3, 2024
857ff1f
x
bassosimone May 3, 2024
08691b3
x
bassosimone May 3, 2024
b5ef086
x
bassosimone May 3, 2024
9c87724
x
bassosimone May 3, 2024
c857375
x
bassosimone May 3, 2024
9051155
x
bassosimone May 3, 2024
60749d0
Merge branch 'master' into issue/2700
bassosimone May 3, 2024
d99bb15
x
bassosimone May 3, 2024
fad03f7
x
bassosimone May 3, 2024
8fc01d8
x
bassosimone May 3, 2024
210393d
Merge branch 'master' into issue/2700
bassosimone May 3, 2024
2b56f26
x
bassosimone May 3, 2024
c8d0e66
x
bassosimone May 3, 2024
3623db4
x
bassosimone May 3, 2024
9158901
x
bassosimone May 3, 2024
88b40b0
x
bassosimone May 3, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
12 changes: 4 additions & 8 deletions internal/httpclientx/DESIGN.md
Original file line number Diff line number Diff line change
Expand Up @@ -212,7 +212,7 @@ We compare to `httpapi.Call` and `httpx.GetJSONWithQuery`.
| handle gzip encoding | yes | yes | NO |
| limit io.Reader | yes | yes | yes |
| netxlite.ReadAllContext() | yes | yes | yes |
| handle truncated body | NO | yes | NO |
| handle truncated body | yes | yes | NO |
| log response body | yes | yes | yes |
| handle non-200 response | ️ yes | yes* | yes* |
| unmarshal JSON | yes | yes | yes |
Expand All @@ -232,10 +232,6 @@ introducing the new `./internal/urlx` package to handle this situation).
3. Setting the `Accept` header does not seem to matter in out context because we mostly
call API for which there's no need for content negotiation.

4. It's difficult to say whether a body size was exactly the amount specified for truncation
or the body has been truncated. While this is a corner case, it seems perhaps wiser to let
the caller try parsing the body and failing if it is indeed truncated.

#### GetRaw

Here we're comparing to `httpapi.Call` and `httpx.FetchResource`.
Expand All @@ -257,7 +253,7 @@ Here we're comparing to `httpapi.Call` and `httpx.FetchResource`.
| handle gzip encoding | yes | yes | NO |
| limit io.Reader | yes | yes | yes |
| netxlite.ReadAllContext() | yes | yes | yes |
| handle truncated body | NO | yes | NO |
| handle truncated body | yes | yes | NO |
| log response body | yes | yes | yes |
| handle non-200 response | ️ yes | yes* | yes |

Expand Down Expand Up @@ -285,7 +281,7 @@ two APIs, the caller would need to fetch a raw body and then manually parse XML.
| handle gzip encoding | yes | N/A | N/A |
| limit io.Reader | yes | N/A | N/A |
| netxlite.ReadAllContext() | yes | N/A | N/A |
| handle truncated body | NO | N/A | N/A |
| handle truncated body | yes | N/A | N/A |
| log response body | yes | N/A | N/A |
| handle non-200 response | ️ yes | N/A | N/A |
| unmarshal XML | yes | N/A | N/A |
Expand Down Expand Up @@ -316,7 +312,7 @@ Here we're comparing to `httpapi.Call` and `httpx.PostJSON`.
| handle gzip encoding | yes | yes | NO |
| limit io.Reader | yes | yes | yes |
| netxlite.ReadAllContext() | yes | yes | yes |
| handle truncated body | NO | yes | NO |
| handle truncated body | yes | yes | NO |
| log response body | yes | yes | yes |
| handle non-200 response | ️ yes | yes* | yes* |
| unmarshal JSON | yes | yes | yes |
Expand Down
15 changes: 15 additions & 0 deletions internal/httpclientx/config.go
Original file line number Diff line number Diff line change
Expand Up @@ -15,6 +15,21 @@ type Config struct {
// Logger is the MANDATORY [model.Logger] to use.
Logger model.Logger

// MaxResponseBodySize OPTIONALLY limits the maximum body size. If not set, we
// use the [DefaultMaxResponseBodySize] value.
MaxResponseBodySize int64

// UserAgent is the MANDATORY User-Agent header value to use.
UserAgent string
}

// DefaultMaxResponseBodySize is the default maximum response body size.
const DefaultMaxResponseBodySize = 1 << 24

func (c *Config) maxResponseBodySize() (value int64) {
value = c.MaxResponseBodySize
if value <= 0 {
value = DefaultMaxResponseBodySize
}
return
}
21 changes: 21 additions & 0 deletions internal/httpclientx/config_test.go
Original file line number Diff line number Diff line change
@@ -0,0 +1,21 @@
package httpclientx

import "testing"

func TestConfigMaxResponseBodySize(t *testing.T) {
t.Run("the default returned value corresponds to the constant default", func(t *testing.T) {
config := &Config{}
if value := config.maxResponseBodySize(); value != DefaultMaxResponseBodySize {
t.Fatal("unexpected maxResponseBodySize()", value)
}
})

t.Run("we can override the default", func(t *testing.T) {
config := &Config{}
const expectedValue = DefaultMaxResponseBodySize / 2
config.MaxResponseBodySize = expectedValue
if value := config.maxResponseBodySize(); value != expectedValue {
t.Fatal("unexpected maxResponseBodySize()", value)
}
})
}
21 changes: 15 additions & 6 deletions internal/httpclientx/httpclientx.go
Original file line number Diff line number Diff line change
Expand Up @@ -8,6 +8,7 @@ package httpclientx
import (
"compress/gzip"
"context"
"errors"
"io"
"net/http"

Expand All @@ -34,11 +35,11 @@ func zeroValue[T any]() T {
return *new(T)
}

// newLimitReader is a wrapper for [io.LimitReader] that automatically
// sets the maximum readable amount of bytes.
func newLimitReader(r io.Reader) io.Reader {
return io.LimitReader(r, 1<<24)
}
// ErrTruncated indicates we truncated the response body.
//
// Note: we SHOULD NOT change the error string because this error string was previously
// used by the httpapi package and it's better to keep the same strings.
var ErrTruncated = errors.New("httpapi: truncated response body")

// do is the internal function to finish preparing the request and getting a raw response.
func do(ctx context.Context, req *http.Request, epnt *Endpoint, config *Config) ([]byte, error) {
Expand Down Expand Up @@ -84,7 +85,10 @@ func do(ctx context.Context, req *http.Request, epnt *Endpoint, config *Config)
}

// protect against unreasonably large response bodies
limitReader := newLimitReader(baseReader)
//
// read one more byte than the maximum allowed size so we can
// always tell whether it was truncated here
limitReader := io.LimitReader(baseReader, config.maxResponseBodySize()+1)

// read the raw body
rawrespbody, err := netxlite.ReadAllContext(ctx, limitReader)
Expand All @@ -94,6 +98,11 @@ func do(ctx context.Context, req *http.Request, epnt *Endpoint, config *Config)
return nil, err
}

// handle the case of truncated body
if int64(len(rawrespbody)) > config.maxResponseBodySize() {
return nil, ErrTruncated
}

// log the response body for debugging purposes
config.Logger.Debugf("%s %s: raw response body: %s", req.Method, req.URL.String(), string(rawrespbody))

Expand Down
111 changes: 111 additions & 0 deletions internal/httpclientx/httpclientx_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -15,6 +15,20 @@ import (
"github.com/ooni/probe-cli/v3/internal/testingx"
)

// createGzipBomb creates a gzip bomb with the given size.
func createGzipBomb(size int) []byte {
input := make([]byte, size)
runtimex.Assert(len(input) == size, "unexpected input length")
var buf bytes.Buffer
gz := runtimex.Try1(gzip.NewWriterLevel(&buf, gzip.BestCompression))
_ = runtimex.Try1(gz.Write(input))
runtimex.Try0(gz.Close())
return buf.Bytes()
}

// gzipBomb is a gzip bomb containing 1 megabyte of zeroes
var gzipBomb = createGzipBomb(1 << 20)

func TestGzipDecompression(t *testing.T) {
t.Run("we correctly handle gzip encoding", func(t *testing.T) {
expected := []byte(`Bonsoir, Elliot!!!`)
Expand Down Expand Up @@ -83,6 +97,36 @@ func TestGzipDecompression(t *testing.T) {
t.Fatal("expected nil response body")
}
})

t.Run("we can correctly decode a large body", func(t *testing.T) {
// create a server returning compressed content
server := testingx.MustNewHTTPServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.Header().Add("Content-Encoding", "gzip")
w.Write(gzipBomb)
}))
defer server.Close()

// make sure we can read it
respbody, err := GetRaw(
context.Background(),
NewEndpoint(server.URL),
&Config{
Client: http.DefaultClient,
Logger: model.DiscardLogger,
UserAgent: model.HTTPHeaderUserAgent,
})

//t.Log(respbody) // maybe this operation is a bit expensive to be the default
t.Log(err)

if err != nil {
t.Fatal(err)
}

if length := len(respbody); length != 1<<20 {
t.Fatal("unexpected response body length", length)
}
})
}

func TestHTTPStatusCodeHandling(t *testing.T) {
Expand Down Expand Up @@ -142,3 +186,70 @@ func TestHTTPReadBodyErrorsHandling(t *testing.T) {
t.Fatal("expected nil response body")
}
}

func TestLimitMaximumBodySize(t *testing.T) {
t.Run("we can correctly avoid receiving a large body when uncompressed", func(t *testing.T) {
// create a server returning uncompressed content
server := testingx.MustNewHTTPServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.Write(make([]byte, 1<<20))
}))
defer server.Close()

// make sure we can read it
//
// note: here we're using a small max body size, definitely smaller than what we send
respbody, err := GetRaw(
context.Background(),
NewEndpoint(server.URL),
&Config{
Client: http.DefaultClient,
Logger: model.DiscardLogger,
MaxResponseBodySize: 1 << 10,
UserAgent: model.HTTPHeaderUserAgent,
})

t.Log(respbody)
t.Log(err)

if !errors.Is(err, ErrTruncated) {
t.Fatal("unexpected error", err)
}

if len(respbody) != 0 {
t.Fatal("expected zero length response body length")
}
})

t.Run("we can correctly avoid receiving a large body when compressed", func(t *testing.T) {
// create a server returning compressed content
server := testingx.MustNewHTTPServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.Header().Add("Content-Encoding", "gzip")
w.Write(gzipBomb)
}))
defer server.Close()

// make sure we can read it
//
// note: here we're using a small max body size, definitely smaller than the gzip bomb
respbody, err := GetRaw(
context.Background(),
NewEndpoint(server.URL),
&Config{
Client: http.DefaultClient,
Logger: model.DiscardLogger,
MaxResponseBodySize: 1 << 10,
UserAgent: model.HTTPHeaderUserAgent,
})

t.Log(respbody)
t.Log(err)

if !errors.Is(err, ErrTruncated) {
t.Fatal("unexpected error", err)
}

if len(respbody) != 0 {
t.Fatal("expected zero length response body length")
}
})
}
Loading