filescan.sh is a Bash utility that scans the files in a directory and prints their contents with structured annotations. It skips binary files and files larger than 1MB, and produces clean output suitable for documentation and code analysis.
- Download the script:
curl -O https://your-repo/filescan.sh
- Source it in your shell profile for persistent access:
# Add to ~/.bashrc or ~/.zshrc
source /path/to/filescan.sh
- Or source it temporarily in your current session:
source ./filescan.sh
Note: This script is designed to be sourced, not executed. You don't need to make it executable with chmod +x.
Simply source the script and call the filescan function:
# Source the script
source ./filescan.sh
# Use the function directly
filescan # Scan current directory
filescan /path/to/directory # Scan specific directory
- directory (optional): The target directory to scan. If not provided, defaults to the current directory (.).
# Source the script first
source ./filescan.sh
# Scan current directory
filescan
# Scan specific directory
filescan /path/to/project
# Scan and save to file
filescan > project_contents.txt
# Scan multiple directories
for dir in src tests docs; do
echo "=== Scanning $dir ===" >> output.txt
filescan "$dir" >> output.txt
done
# Use in a pipeline
filescan src | grep -A 10 "//begin.*\.js$"
# Combine with other tools
filescan | pbcopy # Copy to clipboard (macOS)
#!/usr/bin/env bash
# Source filescan
source /path/to/filescan.sh
# Use it in your workflow
echo "Analyzing project structure..."
filescan src > codebase_snapshot.txt
# You can also use helper functions
if filescan_is_binary "myfile.dat"; then
echo "File is binary"
fi
The script automatically:
- Ignores common directories and files: node_modules, .git, dist, .DS_Store, and *.log files
- Detects binary files: Using file extensions, size checks, and the file command
- Handles large files: Files larger than 1MB are skipped to prevent performance issues
The script processes text files while skipping:
- Binary extensions: jpg, jpeg, png, gif, mp3, mp4, zip, tar, gz, pdf, exe, o, so, dylib, dll
- Large files: Files exceeding 1MB in size
- Non-text files: Files identified as binary by the file command
The script generates output with clear file pointer annotations. Each file is wrapped with begin/end markers that include the relative path from your current working directory (cwd):
//begin relative/path/to/file.txt
[file content here]
//end relative/path/to/file.txt
Important: The paths in the annotations are always relative to your current working directory, not the target directory being scanned. This ensures consistent, predictable paths regardless of where you're scanning.
Example:
# If your cwd is /home/user/project
# And you run: filescan src
# The output will show:
//begin src/components/Button.js
[content]
//end src/components/Button.js
# Not just: components/Button.js
Each file is separated by empty lines for better readability.
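Because every file is bracketed by //begin and //end lines, a snapshot can be post-processed with standard tools. The following is a minimal sketch, assuming the marker format shown above; list_snapshot_files is a hypothetical helper name and is not part of filescan.sh.

```bash
# Hypothetical helper (not part of filescan.sh): print the path of every
# file recorded in a snapshot, based on its "//begin <path>" marker lines.
# Reads the files given as arguments, or stdin when called without any.
list_snapshot_files() {
  sed -n 's|^//begin ||p' "$@"
}

# Demo on an inline sample that mimics the documented output format.
list_snapshot_files <<'EOF'
//begin src/components/Button.js
export const Button = () => null;
//end src/components/Button.js

//begin src/index.js
import { Button } from "./components/Button.js";
//end src/index.js
EOF
```

The demo prints src/components/Button.js and src/index.js, one per line. Note that a scanned file whose own content contains a line starting with "//begin " would also be listed, so treat this as a convenience rather than a strict parser.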
When you source the script, you gain access to these functions:
filescan: Main function to scan and print file contents with relative path annotations.
Path Behavior: All file paths in the output are relative to your current working directory when filescan is called.
filescan_is_binary: Check if a file is binary. Returns 0 (true) if binary, 1 (false) if text.
if filescan_is_binary "myfile.txt"; then
echo "Binary file"
else
echo "Text file"
fi
filescan_should_ignore: Check if a file matches ignore patterns. Returns 0 (true) if the file should be ignored.
if filescan_should_ignore "node_modules/package.json"; then
echo "This file will be ignored"
fi
filescan_process_file: Process a single file with annotations. Used internally but available for custom workflows.
Parameters:
- file: Absolute or relative path to the file
- cwd: Current working directory (used to calculate relative paths in output)
- Extension Check: Compares file extensions against known binary types
- Size Validation: Skips files larger than 1MB (1048576 bytes)
- Content Analysis: Uses the file command to detect text vs binary content
- Cross-Platform Support: Compatible with both GNU and BSD stat commands
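The checks listed above could be chained roughly as follows. This is an illustrative sketch only, not the script's actual code: sketch_is_binary is a hypothetical name, the ordering of the checks is an assumption, and the real filescan_is_binary may differ in detail.

```bash
# Illustrative sketch (hypothetical, not the real filescan_is_binary).
# Returns 0 if the file should be treated as binary or skipped, 1 if text.
sketch_is_binary() {
  local f=$1 size

  # 1. Extension check against the documented binary types.
  case "$f" in
    *.jpg|*.jpeg|*.png|*.gif|*.mp3|*.mp4|*.zip|*.tar|*.gz|*.pdf) return 0 ;;
    *.exe|*.o|*.so|*.dylib|*.dll) return 0 ;;
  esac

  # 2. Size check. GNU stat takes -c %s, BSD/macOS stat takes -f %z.
  #    If neither form works (for example, the file is missing), skip it.
  size=$(stat -c %s "$f" 2>/dev/null || stat -f %z "$f" 2>/dev/null) || return 0
  [ "$size" -gt 1048576 ] && return 0

  # 3. Content check: file(1) reports the encoding "binary" for non-text data.
  #    Assumption: when file(1) is not installed, this stage is skipped.
  if command -v file >/dev/null 2>&1; then
    [ "$(file -b --mime-encoding "$f")" = "binary" ] && return 0
  fi

  return 1
}
```

Running the cheap extension and size checks first means the file command is only invoked for files that pass them.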
- The script captures your current working directory when filescan is invoked
- All file paths in the //begin and //end annotations are calculated relative to this directory
- This ensures that the output is consistent and paths are meaningful in context
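One simple way to derive such relative paths is to strip the working-directory prefix from an absolute path. The sketch below assumes that approach; sketch_emit_file is a hypothetical helper written for illustration, and the script's own filescan_process_file may compute paths differently.

```bash
# Illustrative sketch (hypothetical helper, not the script's own code):
# wrap one file in //begin and //end markers, with the path shown relative
# to the directory passed as the second argument.
sketch_emit_file() {
  local file=$1 cwd=$2 abs rel

  # Make the path absolute so relative and absolute inputs behave the same.
  case "$file" in
    /*) abs=$file ;;
    *)  abs=$cwd/$file ;;
  esac

  # Strip the "<cwd>/" prefix. Paths outside cwd are left absolute.
  rel=${abs#"$cwd"/}

  printf '//begin %s\n' "$rel"
  cat "$abs"
  printf '//end %s\n\n' "$rel"
}

# Usage: capture the working directory once, then emit files against it.
#   cwd=$(pwd)
#   sketch_emit_file src/components/Button.js "$cwd"
```

Capturing the directory once, before any traversal, keeps the annotations stable even if later code changes directory.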
The script automatically ignores:
- Development directories (node_modules, .git, dist)
- System files (.DS_Store)
- Log files (*.log)
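The ignore rules above can be expressed as a single case statement. This is a sketch under the assumption that matching is done on path components; sketch_should_ignore is a hypothetical stand-in, and the real filescan_should_ignore may match differently.

```bash
# Illustrative sketch (hypothetical stand-in for filescan_should_ignore).
# Returns 0 if the path matches one of the documented ignore patterns.
sketch_should_ignore() {
  # Prefix a slash so a leading directory such as "node_modules/..." is
  # matched the same way as one nested deeper in the tree.
  case "/$1" in
    */node_modules/*|*/.git/*|*/dist/*) return 0 ;;  # files inside ignored dirs
    */node_modules|*/.git|*/dist)       return 0 ;;  # the directories themselves
    */.DS_Store|*.log)                  return 0 ;;  # system and log files
  esac
  return 1
}
```

Matching on whole components means that names which merely start with an ignored name, such as .gitignore or distribution/, are not excluded.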
- Gracefully handles files that cannot be accessed
- Provides informative error messages to stderr
- Continues processing other files when errors occur
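The behavior described above follows a common shell pattern: report the problem on stderr, then move on to the next file. A minimal sketch of that pattern follows; the function name and the warning text are assumptions made for illustration and are not taken from filescan.sh.

```bash
# Illustrative sketch of the skip-and-continue pattern (hypothetical code).
sketch_print_files() {
  local f
  for f in "$@"; do
    if [ ! -r "$f" ]; then
      # Warnings go to stderr so they never mix with redirected output.
      printf 'warning: cannot read %s, skipping\n' "$f" >&2
      continue
    fi
    printf '//begin %s\n' "$f"
    cat -- "$f"
    printf '//end %s\n\n' "$f"
  done
}
```

Because warnings are written to stderr, a command such as `sketch_print_files a.txt missing.txt > out.txt` still shows them in the terminal while out.txt contains only file contents.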
source ./filescan.sh
filescan src > codebase_documentation.txt

source ./filescan.sh
filescan . > project_snapshot_$(date +%Y%m%d).txt

source ./filescan.sh
# filescan takes a single directory, so scan each one in turn
{ filescan src; filescan tests; } | grep -v "test fixtures" > review.txt

source ./filescan.sh
filescan src > context_for_ai.txt
# Share this file with AI tools for better code understanding
# The relative paths make it easy for AI to understand your project structure
Make sure you've sourced the script:
source ./filescan.sh
The script uses the file command. Ensure it's installed:
# macOS (usually pre-installed)
which file
# Linux
sudo apt-get install file # Debian/Ubuntu
sudo yum install file # RedHat/CentOS
Remember that paths are relative to your current working directory when you call filescan, not relative to the target directory. If you want cleaner paths, cd into the parent directory first:
# Instead of this from /home/user:
filescan project/src # Shows: project/src/file.js
# Do this:
cd project
filescan src # Shows: src/file.js
- Add to your shell profile for permanent access:
echo "source ~/scripts/filescan.sh" >> ~/.bashrc
- Create aliases for common operations:
alias scan-src='filescan src'
alias scan-all='filescan .'
- Combine with version control:
# Scan only tracked files
git ls-files | while IFS= read -r f; do filescan_process_file "$f" "$(pwd)"; done
- Use with AI tools: The structured output format with relative paths from your cwd is ideal for providing context to AI assistants and LLMs, making it clear how files relate to your project structure.
Proprietary Software. See LICENSE file for terms.
Copyright © 2025 Imre Toth tothimre@gmail.com