Fix KeyError in S3BasedDocs when prefix path has no common prefixes #190
Merged
acarbonetto merged 1 commit into awslabs:main on Apr 9, 2026
S3 list_objects_v2 paginator omits the CommonPrefixes key entirely when there are no sub-prefixes in the response. Both S3DocDownloader and S3ChunkDownloader accessed this key directly, causing a KeyError when querying an empty or non-existent S3 path. Replace dict key access with .get() and empty list default at both locations (lines 82 and 290).
acarbonetto approved these changes on Apr 9, 2026
Description
Fixes a KeyError: 'CommonPrefixes' crash in both S3DocDownloader.download() and S3ChunkDownloader.download() when the S3 prefix path is empty or has no sub-prefixes.
Problem
The S3 list_objects_v2 paginator omits the CommonPrefixes key entirely from the response when there are no common prefixes. Both downloaders accessed this key directly, which raises a KeyError when querying an empty or non-existent S3 path.
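A minimal sketch of the failure mode, using plain dicts that mimic paginator pages rather than a live boto3 call. The function name collect_prefixes_direct and the sample prefixes are hypothetical and only illustrate the pattern; they are not taken from the project's downloader code.

```python
# Hypothetical stand-ins for pages yielded by a list_objects_v2 paginator.
page_with_prefixes = {
    "IsTruncated": False,
    "CommonPrefixes": [{"Prefix": "docs/a/"}, {"Prefix": "docs/b/"}],
}
# When there are no sub-prefixes, the "CommonPrefixes" key is absent, not empty.
page_without_prefixes = {"IsTruncated": False}


def collect_prefixes_direct(pages):
    """Collect sub-prefixes using direct key access (the failing pattern)."""
    prefixes = []
    for page in pages:
        # Raises KeyError as soon as a page lacks the "CommonPrefixes" key.
        for entry in page["CommonPrefixes"]:
            prefixes.append(entry["Prefix"])
    return prefixes


try:
    collect_prefixes_direct([page_with_prefixes, page_without_prefixes])
    raised = False
    missing_key = None
except KeyError as exc:
    raised = True
    missing_key = exc.args[0]

print(raised, missing_key)
```

A page that does carry the key is handled fine, so the bug only shows up for empty or non-existent paths, or for individual pages without sub-prefixes.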
Fix
Replace direct dict access with .get() and an empty list default at both locations (lines 82 and 290).
Tests
Added 4 unit tests covering:
- S3DocDownloader with empty S3 response (no CommonPrefixes key)
- S3ChunkDownloader with empty S3 response
- S3DocDownloader with mixed pages (some with prefixes, some without)
- S3ChunkDownloader with mixed pages
All 21 tests in test_s3_based_docs.py pass.
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.
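The .get() pattern and the empty and mixed-page cases can be sketched together as follows. This is an illustration under assumptions: collect_prefixes_safe and the sample pages are hypothetical, and the dicts stand in for paginator output instead of calling S3. It is not a copy of the tests in test_s3_based_docs.py.

```python
def collect_prefixes_safe(pages):
    """Collect sub-prefixes, treating a missing "CommonPrefixes" key as empty."""
    prefixes = []
    for page in pages:
        # .get() with an empty-list default avoids KeyError on pages
        # that have no sub-prefixes.
        for entry in page.get("CommonPrefixes", []):
            prefixes.append(entry["Prefix"])
    return prefixes


# Hypothetical pages: the second one has no "CommonPrefixes" key at all.
mixed_pages = [
    {"CommonPrefixes": [{"Prefix": "docs/a/"}]},
    {"IsTruncated": True},
    {"CommonPrefixes": [{"Prefix": "docs/b/"}]},
]

empty_result = collect_prefixes_safe([{"IsTruncated": False}])
mixed_result = collect_prefixes_safe(mixed_pages)

print(empty_result)
print(mixed_result)
```

Using an empty-list default keeps the loop body unchanged, so pages with and without sub-prefixes go through the same code path.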