░██████╗██╗░░██╗███████╗██████╗░██╗░░░░░░█████╗░░█████╗░██╗░░██╗
██╔════╝██║░░██║██╔════╝██╔══██╗██║░░░░░██╔══██╗██╔══██╗██║░██╔╝
╚█████╗░███████║█████╗░░██████╔╝██║░░░░░██║░░██║██║░░╚═╝█████═╝░
░╚═══██╗██╔══██║██╔══╝░░██╔══██╗██║░░░░░██║░░██║██║░░██╗██╔═██╗░
██████╔╝██║░░██║███████╗██║░░██║███████╗╚█████╔╝╚█████╔╝██║░╚██╗
╚═════╝░╚═╝░░╚═╝╚══════╝╚═╝░░╚═╝╚══════╝░╚════╝░░╚════╝░╚═╝░░╚═╝
sherlock detects file types and extracts metadata from raw bytes — images, documents, archives, video, executables, and more.

Install with `go get ella.to/sherlock`.

    data, _ := os.ReadFile("photo.jpg")
    meta, err := sherlock.BytesDetect(data)
    if err != nil {
        log.Fatal(err)
    }
    fmt.Println(meta["mime"])   // ["image/jpeg"]
    fmt.Println(meta["width"])  // ["4032"]
    fmt.Println(meta["height"]) // ["3024"]

There are two detection functions. Both return `map[string][]string` with the extracted metadata.

    // From a byte slice
    meta, err := sherlock.BytesDetect(data)

    // From any io.Reader
    meta, err := sherlock.ReaderDetect(reader)

For cases where data arrives in chunks (uploads, network streams), use `StreamDetector`. It implements `io.WriteCloser`, so you can write data to it incrementally.
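
Because the detector is described as an `io.WriteCloser`, it should also work as the destination of standard helpers such as `io.Copy`, not only hand-written `Write` calls. The sketch below illustrates that pattern without the library: `chunkRecorder` and `chunkSource` are hypothetical stand-ins invented for this example (a byte counter and a reader that returns one chunk per call), and `feed` is an illustrative helper, not part of sherlock.

```go
package main

import (
	"fmt"
	"io"
)

// chunkRecorder is a hypothetical stand-in for a streaming detector. It only
// counts what it receives; it is not sherlock's StreamDetector.
type chunkRecorder struct {
	writes int  // number of Write calls seen
	bytes  int  // total bytes received
	closed bool // set once Close has been called
}

// Write accepts one chunk, as an upload handler would hand it over.
func (c *chunkRecorder) Write(p []byte) (int, error) {
	if c.closed {
		return 0, io.ErrClosedPipe
	}
	c.writes++
	c.bytes += len(p)
	return len(p), nil
}

// Close signals the end of the input.
func (c *chunkRecorder) Close() error {
	c.closed = true
	return nil
}

// chunkSource is a hypothetical reader that returns one pre-cut chunk per
// Read call, imitating data that trickles in from the network.
type chunkSource struct{ chunks [][]byte }

func (s *chunkSource) Read(p []byte) (int, error) {
	if len(s.chunks) == 0 {
		return 0, io.EOF
	}
	n := copy(p, s.chunks[0])
	s.chunks[0] = s.chunks[0][n:]
	if len(s.chunks[0]) == 0 {
		s.chunks = s.chunks[1:]
	}
	return n, nil
}

// feed streams src into dst and always closes dst, so a detector sitting in
// dst's place would know the input is complete.
func feed(dst io.WriteCloser, src io.Reader) (int64, error) {
	n, err := io.Copy(dst, src)
	if cerr := dst.Close(); err == nil {
		err = cerr
	}
	return n, err
}

func main() {
	src := &chunkSource{chunks: [][]byte{
		[]byte("id,na"), []byte("me\n1,"), []byte("Ada\n"),
	}}
	rec := &chunkRecorder{}

	n, err := feed(rec, src)
	if err != nil {
		fmt.Println("feed failed:", err)
		return
	}
	fmt.Printf("copied %d bytes in %d writes, closed=%v\n", n, rec.writes, rec.closed)
}
```

Replacing the stand-in with the value returned by `sherlock.NewStreamDetector()` should follow the same shape, assuming its `Write` method behaves like an ordinary `io.Writer`.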
Create a detector, write each chunk as it arrives, then close it to collect the result:

    detector := sherlock.NewStreamDetector()

    // Write data as it arrives
    detector.Write(chunk1)
    detector.Write(chunk2)
    detector.Write(chunk3)

    // Get the result (closes the detector and returns metadata)
    meta, err := detector.CloseAndResult()
    fmt.Println(meta["mime"]) // ["text/csv"]

Results are cached: calling `Result()` multiple times after closing returns the same value without re-processing.

    detector.Close()
    meta1, _ := detector.Result()
    meta2, _ := detector.Result() // same object, no extra work

PNG, JPEG, GIF, BMP, TIFF, WebP, HEIC/HEIF, PSD, SVG, ICO
Image metadata includes dimensions (width, height), and for JPEG/HEIC files with EXIF data: camera make/model, GPS coordinates, and timestamps.
PDF, DOCX, XLSX, PPTX, EPUB, OLE-based Office files (DOC, XLS, PPT)
PDF metadata includes version, encryption status, linearization, and approximate page count.
ZIP, GZIP, TAR
Archive metadata includes entry names, entry counts, and compressed/uncompressed sizes.
MP4, QuickTime, HEIF containers, FLV, Ogg, FLAC, WebM
Video metadata includes container format, major brand, and compatible brands.
ELF (Linux), PE (Windows), Mach-O (macOS), DMG (Apple disk images), DOS COM
Executable metadata includes architecture bits, endianness, and format-specific details.
CSV, shell scripts (via shebang detection), BitTorrent files
Every detection result includes these baseline fields:
| Key | Description |
|---|---|
| `detector_version` | Version of the detection engine |
| `size_bytes` | Size of the input data, in bytes |
| `mime` | Detected MIME type |
| `type` | Primary type (e.g. `image`, `video`, `application`) |
| `subtype` | Subtype (e.g. `png`, `pdf`, `zip`) |
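
Since every value in the result is a `[]string`, reading one baseline field means indexing the slice with a guard for missing keys. A minimal sketch of that, assuming the map shape shown in the examples above; `first` and `summarize` are hypothetical helpers written for this illustration, and the sample values are made up rather than taken from real library output.

```go
package main

import "fmt"

// first returns the first value stored under key, or fallback when the key
// is missing or its slice is empty.
func first(meta map[string][]string, key, fallback string) string {
	if vals := meta[key]; len(vals) > 0 {
		return vals[0]
	}
	return fallback
}

// summarize renders the baseline fields as a single line.
func summarize(meta map[string][]string) string {
	return fmt.Sprintf("%s (%s/%s), %s bytes",
		first(meta, "mime", "unknown"),
		first(meta, "type", "?"),
		first(meta, "subtype", "?"),
		first(meta, "size_bytes", "?"),
	)
}

func main() {
	// Hand-written sample in the shape the detection functions return;
	// the values are illustrative, not real library output.
	meta := map[string][]string{
		"mime":       {"image/png"},
		"type":       {"image"},
		"subtype":    {"png"},
		"size_bytes": {"2048"},
	}
	fmt.Println(summarize(meta))
	fmt.Println(summarize(nil)) // missing keys fall back to placeholders
}
```

Returning a fallback instead of indexing `meta["mime"][0]` directly avoids a panic when a key is absent.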
Additional keys depend on the file type:
- Images: `width`, `height`, `image_format`, `resolution`, `datetime`, `camera_make`, `camera_model`, `location_latitude`, `location_longitude`
- CSV: `csv_rows`, `csv_columns`, `csv_consistent_columns`, `csv_header`
- PDF: `pdf_version`, `pdf_encrypted`, `pdf_linearized`, `pdf_pages_approx`
- Archives: `zip_entries`, `zip_entry_name`, `tar_entries_sampled`, `gzip_name`
- Executables: `executable_format`, `architecture_bits`, `endianness`
- Video: `video_major_brand`, `video_compatible_brand`, `video_container`
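
Type-specific values arrive as strings too (the quick start shows `width` as `["4032"]`), so numeric fields need parsing before use. The sketch below assumes decimal strings under `width` and `height`; `intField` and `dimensions` are hypothetical helpers for this example, not library functions.

```go
package main

import (
	"fmt"
	"strconv"
)

// intField parses the first value under key as a base-10 integer. The bool
// is false when the key is absent or the value is not a number.
func intField(meta map[string][]string, key string) (int, bool) {
	vals := meta[key]
	if len(vals) == 0 {
		return 0, false
	}
	n, err := strconv.Atoi(vals[0])
	if err != nil {
		return 0, false
	}
	return n, true
}

// dimensions reports width and height in pixels when both keys parse.
func dimensions(meta map[string][]string) (int, int, bool) {
	w, okW := intField(meta, "width")
	h, okH := intField(meta, "height")
	if !okW || !okH {
		return 0, 0, false
	}
	return w, h, true
}

func main() {
	// Sample values taken from the quick-start comments; the map itself is
	// hand-written here rather than produced by the library.
	photo := map[string][]string{"width": {"4032"}, "height": {"3024"}}

	if w, h, ok := dimensions(photo); ok {
		fmt.Printf("%dx%d (%.1f MP)\n", w, h, float64(w*h)/1e6)
	}

	// A result without image keys, such as one for a CSV file.
	if _, _, ok := dimensions(map[string][]string{"csv_rows": {"10"}}); !ok {
		fmt.Println("no dimensions available")
	}
}
```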
Sherlock can be compiled to WebAssembly for use in browsers or WASI runtimes. See the examples directory for:
- WASI: CLI tool that reads a file and outputs JSON metadata. Build with `GOOS=wasip1 GOARCH=wasm`.
- Browser WASM: exposes a `sherlockDetectBase64()` function to JavaScript. Build with `GOOS=js GOARCH=wasm`.
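
The WASI example is described as a CLI that reads a file and prints JSON. A rough sketch of that shape follows; it is not the code in the examples directory. `detect` is a placeholder standing in for the real detection call so the program builds without the library, and `toJSON` is a hypothetical helper.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"strconv"
)

// detect is a placeholder for the real detection call (sherlock.BytesDetect
// in the examples above). It reports only the input size so the sketch has
// no external dependencies.
func detect(data []byte) map[string][]string {
	return map[string][]string{"size_bytes": {strconv.Itoa(len(data))}}
}

// toJSON encodes the metadata map as compact JSON. encoding/json writes map
// keys in sorted order, so the output is stable between runs.
func toJSON(meta map[string][]string) (string, error) {
	b, err := json.Marshal(meta)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	// With no argument, fall back to an in-memory sample so the sketch
	// runs as-is.
	data := []byte("hello")
	if len(os.Args) > 1 {
		var err error
		if data, err = os.ReadFile(os.Args[1]); err != nil {
			fmt.Fprintln(os.Stderr, "read failed:", err)
			os.Exit(1)
		}
	}

	out, err := toJSON(detect(data))
	if err != nil {
		fmt.Fprintln(os.Stderr, "encode failed:", err)
		os.Exit(1)
	}
	fmt.Println(out)
}
```

When a program like this is built with `GOOS=wasip1 GOARCH=wasm`, the WASI runtime usually has to be granted access to the directory that holds the input file before `os.ReadFile` can open it.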
MIT — see LICENSE for details.