-
Notifications
You must be signed in to change notification settings - Fork 1.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[refactor] - Mime Type Detection #3084
base: perf-optimize-chunker
Are you sure you want to change the base?
Conversation
3364dbe
to
63229d5
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm pretty confused by adding undocumented file type detection way down in the call stack (specific comments inline), as well as the thrown-away MIME type detection.
@@ -89,7 +89,7 @@ func (h *archiveHandler) openArchive(ctx logContext.Context, depth int, reader f | |||
|
|||
if reader.format == nil { | |||
if depth > 0 { | |||
mtr, err := newSizedMimetypeReaderFromFileReader(reader) | |||
mtr, err := newSizedReaderFromFileReader(reader) | |||
if err != nil { | |||
return fmt.Errorf("error reading MIME type: %w", err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should this error message be updated?
@@ -44,7 +44,7 @@ func (h *defaultHandler) HandleFile(ctx logContext.Context, input fileReader) (c | |||
h.metrics.incFilesProcessed() | |||
}() | |||
|
|||
mtr, err := newSizedMimetypeReaderFromFileReader(input) | |||
mtr, err := newSizedReaderFromFileReader(input) | |||
if err != nil { | |||
ctx.Logger().Error(err, "error reading MIME type") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should this error message be updated?
return sizedReader{}, errEmptyReader | ||
} | ||
|
||
bufReader, _, err := determineMIMEType(r) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now you determine the MIME type but immediately throw it away? This seems to reverse all the optimizations you've been adding :/
// This function is particularly useful for specialized archive handlers | ||
// that need to pass extracted content to the default handler without modifying the original reader. | ||
func newSizedMimeTypeReader(r io.Reader, size int64) (sizedMimeTypeReader, error) { | ||
func newSizedReader(r io.Reader, size int64) (sizedReader, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would never occur to me that this function could fail because of an unsupported file type - that behavior isn't hinted at in the name or the doc comment.
if errors.Is(err, errEmptyReader) { | ||
ctx.Logger().V(5).Info("empty reader, skipping file") | ||
return nil | ||
} else if errors.Is(err, errUnsupportedMIME) { | ||
ctx.Logger().V(5).Info("skipping file") | ||
return nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not use handleReaderError
here?
Description:
This PR refactors the handling of MIME type detection during file processing. Instead of performing this check at the final stage of the file handling in
handleNonArchiveContent
, it is moved to an earlier stage to avoid unnecessary operations.Checklist:
make test-community
)?make lint
this requires golangci-lint)?