Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't load files in tarball exceeding size_limit_bytes #6516

Merged

Conversation

anderseknert
Copy link
Member

@anderseknert anderseknert commented Jan 9, 2024

Previously we'd check the size limit after the file was read, which mostly defeats the point of the limit. Now we check the size from the header in the tar archive and exit early if it exceeds the configured limit.

In order to do this, I had to extend the DirectoryLoader interface with a method to set the max size. While I added implementations for the other (than tarball) loader types, the limit is not currently set anywhere for those. Perhaps we'll want to do that at some later point but it feels like this is mainly relevant when files are loaded via remote bundles.

Fixes #6514

@anderseknert
Copy link
Member Author

Running OPA with the following configuration:

services:
  test:
    url: https://dl.styra.com

bundles:
  test:
    resource: load/bundle-opa-200.tar.gz
    size_limit_bytes: 1024

Fails in both cases, as the file of ~200MB is way bigger than the configured limit. The difference is that now the failure preceeds reading the file in the first place.

Before fix:
Screenshot 2024-01-09 at 14 57 17

After fix:
Screenshot 2024-01-09 at 14 57 56

Copy link
Member

@ashutosh-narkar ashutosh-narkar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The approach lgtm. Few comments 👇

bundle/file.go Outdated Show resolved Hide resolved
@@ -206,6 +214,9 @@ func (d *dirLoader) NextFile() (*Descriptor, error) {
if d.filter != nil && d.filter(filepath.ToSlash(path), info, getdepth(path, false)) {
return nil
}
if d.maxSizeLimitBytes > 0 && info.Size() > d.maxSizeLimitBytes {
return fmt.Errorf("file %s size %d exceeds limit of %d", path, info.Size(), d.maxSizeLimitBytes)
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: It would be great if we could apply a common format for the error message for this and the tar one.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had to revert this as files like data.json and .signatures would not pass through that path.. at least not in our tests.

@@ -343,6 +362,11 @@ func (t *tarballLoader) NextFile() (*Descriptor, error) {
}
}

if t.maxSizeLimitBytes > 0 && header.Size > t.maxSizeLimitBytes {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The description for the header.Size is Logical file size in bytes and Header.Size determines how many bytes can be read for the next file. It seems this would reflect the actual file size, correct?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think so.

bundle/filefs.go Outdated Show resolved Hide resolved
Comment on lines +1468 to +1478
if bb, ok := f.reader.(*bytes.Buffer); ok {
_ = f.Close() // always close, even on error

if int64(bb.Len()) >= sizeLimitBytes {
return *bb, fmt.Errorf("bundle file '%v' size (%d bytes) exceeded max size (%v bytes)",
strings.TrimPrefix(f.Path(), "/"), bb.Len(), sizeLimitBytes-1)
}

return *bb, nil
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not clear to me why we need this check here. By the time this method is called we've already called NextFile which has the size check. So if we have an offending file we would have already failed. We can keep the changes limited to the loaders looks like.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, removing the size check from here.

@anderseknert anderseknert force-pushed the loader-size-limit branch 2 times, most recently from f8aa25d to baaadd1 Compare January 15, 2024 12:50
Copy link
Member

@ashutosh-narkar ashutosh-narkar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@anderseknert the changes lgtm. Not sure if we need to update the readFile method and change only the loaders. Please let me know if I missed something. Thanks.

@anderseknert
Copy link
Member Author

@ashutosh-narkar sorry, I had apparently forgot to submit my comments 😅 Removing the check from there broke checking other types of files than .rego.

Previously we'd check the size limit *after* the file was read, which
mostly defeats the point of the limit. Now we check the size from the
header in the tar archive and exit early if it exceeds the configured
limit.

In order to do this, I had to extend the `DirectoryLoader` interface
with a method to set the max size. While I added implementations for
the other (than tarball) loader types, the limit is not currently set
anywhere for those. Perhaps we'll want to do that at some later point
but it feels like this is mainly relevant when files are loaded via
remote bundles.

Fixes open-policy-agent#6514

Signed-off-by: Anders Eknert <anders@styra.com>
@ashutosh-narkar ashutosh-narkar merged commit c0589c1 into open-policy-agent:main Jan 18, 2024
25 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

OPA process OOM when a bundle contains a large file despite having size_limit_bytes
2 participants