CodeCreator/s3grep

 
 

s3grep - Fast Parallel grep for S3

I use grep almost daily but often deal with unstructured content and logs on S3, so I wrote an easy way to grep S3!

s3grep is a parallel CLI tool for searching logs and unstructured content in Amazon S3 buckets. It decompresses .gz objects, shows progress bars, and handles binary files and decompression errors gracefully, which makes it well suited to log analysis on S3.


Features

  • Concurrent search across S3 objects, with a configurable number of tasks
  • Supports plain text and .gz compressed files
  • Progress bars for files and bytes processed
  • Case-sensitive and insensitive search
  • Line number output option
  • Graceful handling of binary files and decompression errors
  • Colorized match highlighting
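The matching features above (case-sensitive or insensitive search, optional line numbers) can be illustrated with a minimal sketch. This is not s3grep's actual code: `Match` and `find_matches` are hypothetical names, and the sketch assumes the pattern is a plain substring, which may differ from how s3grep interprets `--pattern`.

```rust
/// One matching line: 1-based line number plus the line text.
#[derive(Debug, PartialEq)]
struct Match {
    line_number: usize,
    text: String,
}

/// Hypothetical helper (not s3grep's API): scan `content` line by line and
/// collect every line that contains `pattern` as a plain substring.
fn find_matches(content: &str, pattern: &str, case_sensitive: bool) -> Vec<Match> {
    // For case-insensitive search, lowercase the pattern once up front.
    let needle = if case_sensitive {
        pattern.to_string()
    } else {
        pattern.to_lowercase()
    };

    content
        .lines()
        .enumerate()
        .filter(|(_, line)| {
            if case_sensitive {
                line.contains(&needle)
            } else {
                // Lowercase each line so "error" and "ERROR" both match.
                line.to_lowercase().contains(&needle)
            }
        })
        .map(|(idx, line)| Match {
            line_number: idx + 1, // report 1-based line numbers, like grep -n
            text: line.to_string(),
        })
        .collect()
}
```

With the input "INFO start\nerror: disk full\nERROR timeout\n" and the pattern "ERROR", a case-insensitive call returns lines 2 and 3, while a case-sensitive call returns only line 3.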

Installation

crates.io

cargo install s3grep

Usage

s3grep --pattern "ERROR" --bucket my-logs-bucket --prefix logs/ --concurrent-tasks 16

CLI Options

Flag                      Description
-p, --pattern             Search pattern (required)
-b, --bucket              S3 bucket name (required)
-z, --prefix              S3 prefix to search in (default: "")
-c, --concurrent-tasks    Number of concurrent tasks (default: 8)
-i, --case-sensitive      Case sensitive search
-q, --quiet               Hide progress bar
-n, --line-number         Show line numbers in output
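To give a feel for what a task limit like --concurrent-tasks controls, here is a rough sketch of a bounded worker pool using only the Rust standard library. It is an illustration under stated assumptions, not s3grep's implementation: `Object` and `search_objects` are hypothetical names, in-memory strings stand in for downloaded S3 objects, and matching is plain substring search.

```rust
use std::collections::VecDeque;
use std::sync::{mpsc, Arc, Mutex};
use std::thread;

/// Hypothetical stand-in for an S3 object: a key plus its already-downloaded text.
type Object = (String, String);

/// Search `objects` for `pattern` with at most `concurrent_tasks` worker threads.
/// Returns (key, 1-based line number, line) per match, sorted by key then line number.
fn search_objects(
    objects: Vec<Object>,
    pattern: &str,
    concurrent_tasks: usize,
) -> Vec<(String, usize, String)> {
    // Shared work queue: each worker pulls the next object when it is free.
    let queue = Arc::new(Mutex::new(VecDeque::from(objects)));
    let (tx, rx) = mpsc::channel();

    let workers: Vec<_> = (0..concurrent_tasks.max(1))
        .map(|_| {
            let queue = Arc::clone(&queue);
            let tx = tx.clone();
            let pattern = pattern.to_string();
            thread::spawn(move || loop {
                // Hold the lock only long enough to take one object.
                let next = queue.lock().unwrap().pop_front();
                let (key, body) = match next {
                    Some(object) => object,
                    None => break, // queue drained, worker exits
                };
                for (idx, line) in body.lines().enumerate() {
                    if line.contains(&pattern) {
                        tx.send((key.clone(), idx + 1, line.to_string())).unwrap();
                    }
                }
            })
        })
        .collect();

    // Drop the original sender so the receiver ends once all workers exit.
    drop(tx);
    for worker in workers {
        worker.join().unwrap();
    }

    let mut results: Vec<_> = rx.iter().collect();
    results.sort(); // worker completion order is nondeterministic
    results
}
```

The number of workers caps how many objects are processed at once, which is the trade-off a setting like --concurrent-tasks exposes: more tasks can raise throughput, at the cost of more simultaneous requests and memory.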

Example

s3grep --pattern "timeout" --bucket my-bucket --prefix logs/2025/06/ --concurrent-tasks 12 --line-number

Testing

Integration tests use Localstack to mock S3. See CONTRIBUTING.md for details.


Contributing

Contributions are welcome! Please see CONTRIBUTING.md for guidelines.


License

This project is licensed under the MIT License.


Acknowledgments

  • Inspired by daily use of grep and the need for cloud-native log search.
  • Built with Rust, aws-sdk-rust, and Localstack for testing.
