FastFileScrape v0.1.0 [ALPHA] — Ultra‑Fast File Tree & Content Scraper for Java

⚡ Scrape and process large file trees at high speed with native (JNI) acceleration.

FastFileScrape is the high‑speed file scraping module of the FastJava ecosystem.
It provides two core capabilities:

FastFileTree — build complete directory trees with include/exclude rules
FastFileScrapeContent — extract file contents with chunking for LLMs and agents

Key Features

🟩 FastFileTree — Directory Structure Engine

Recursive directory walking
Include/Exclude glob filters
Sorted output (folders → files)
JSON or ASCII tree output
Git‑ignore aware (optional)
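
To make the "sorted output (folders → files)" and ASCII tree ideas concrete, here is a minimal sketch that uses only the JDK. It is not the FastFileTree implementation: the class name `TreeSketch`, the `render` method, and the `|--` connectors are assumptions chosen for illustration, and there is no glob filtering or gitignore handling.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.LinkOption;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical stand-in for illustration only; not the FastFileTree API.
class TreeSketch {

    // Symbolic links are not followed, so link cycles cannot cause endless recursion.
    private static boolean isDir(Path p) {
        return Files.isDirectory(p, LinkOption.NOFOLLOW_LINKS);
    }

    /** Renders the tree under root as ASCII text, folders before files. */
    static String render(Path root) throws IOException {
        StringBuilder out = new StringBuilder();
        Path name = root.getFileName();
        out.append(name == null ? root.toString() : name.toString()).append('\n');
        walk(root, "", out);
        return out.toString();
    }

    private static void walk(Path dir, String indent, StringBuilder out) throws IOException {
        List<Path> children;
        try (Stream<Path> entries = Files.list(dir)) {
            children = entries
                .sorted(Comparator
                    .comparing((Path p) -> !isDir(p))                 // false sorts first, so folders lead
                    .thenComparing(p -> p.getFileName().toString()))  // then alphabetical by name
                .collect(Collectors.toList());
        }
        for (int i = 0; i < children.size(); i++) {
            Path child = children.get(i);
            boolean last = (i == children.size() - 1);
            out.append(indent).append(last ? "`-- " : "|-- ")
               .append(child.getFileName()).append('\n');
            if (isDir(child)) {
                // Keep the vertical bar only while more siblings follow at this level.
                walk(child, indent + (last ? "    " : "|   "), out);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.print(render(Path.of(args.length > 0 ? args[0] : ".")));
    }
}
```

Sorting on the negated "is directory" flag first is what puts folders ahead of files; the name comparison only breaks ties within each group.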

🟧 FastFileScrapeContent — File Content Engine

Extracts file contents with UTF‑8 safety
Chunking by byte size or newline boundaries
Include/Exclude patterns
JSONL or plain text output
Ideal for LLM context ingestion
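
The chunking rules above (a byte budget, newline boundaries, UTF‑8 safety) can be sketched in plain Java as follows. This is an illustrative approximation, not the library's algorithm: `ChunkSketch`, `chunk`, and `maxBytes` are hypothetical names, and the sketch assumes the file has already been decoded into a `String`.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of the FastFileScrape API.
class ChunkSketch {

    /** Number of bytes a single code point occupies in UTF-8. */
    static int utf8Bytes(int codePoint) {
        if (codePoint < 0x80) return 1;
        if (codePoint < 0x800) return 2;
        if (codePoint < 0x10000) return 3;
        return 4;
    }

    /** Splits text into chunks of at most maxBytes UTF-8 bytes, preferring newline boundaries. */
    static List<String> chunk(String text, int maxBytes) {
        if (maxBytes < 4) {
            throw new IllegalArgumentException("maxBytes must be at least 4 (largest UTF-8 code point)");
        }
        List<String> chunks = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int currentBytes = 0;
        int pos = 0;
        while (pos < text.length()) {
            int nl = text.indexOf('\n', pos);
            int end = (nl < 0) ? text.length() : nl + 1;   // keep '\n' attached to its line
            String line = text.substring(pos, end);
            pos = end;
            int lineBytes = line.codePoints().map(ChunkSketch::utf8Bytes).sum();

            // Close the current chunk if this line would push it over the budget.
            if (currentBytes > 0 && currentBytes + lineBytes > maxBytes) {
                chunks.add(current.toString());
                current.setLength(0);
                currentBytes = 0;
            }
            if (lineBytes <= maxBytes) {
                current.append(line);
                currentBytes += lineBytes;
                continue;
            }
            // A single line larger than the budget: split between code points,
            // so a multi-byte character is never cut in half.
            for (int i = 0; i < line.length(); ) {
                int cp = line.codePointAt(i);
                int n = utf8Bytes(cp);
                if (currentBytes + n > maxBytes) {
                    chunks.add(current.toString());
                    current.setLength(0);
                    currentBytes = 0;
                }
                current.appendCodePoint(cp);
                currentBytes += n;
                i += Character.charCount(cp);
            }
        }
        if (currentBytes > 0) chunks.add(current.toString());
        return chunks;
    }

    public static void main(String[] args) {
        chunk("first line\nsecond line\nthird line\n", 24)
            .forEach(c -> System.out.print("--- chunk ---\n" + c));
    }
}
```

Concatenating the returned chunks reproduces the input exactly, which makes the split easy to verify in a pipeline.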

🟦 CLI Tool — `fastfilescrape`

tree → structure only
content → file contents only
all → both combined
Output to stdout or file
JSONL mode for AI pipelines
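
For readers new to JSONL: each line of the output is one complete JSON object, so downstream tools can stream it record by record. The sketch below shows how one chunk could be serialized by hand. The field names `file`, `chunk`, and `text` are assumptions taken from the callback shape in the Java demo; the actual schema written by `fastfilescrape` is not documented here and may differ.

```java
// Hypothetical serializer for illustration; the real JSONL schema may differ.
class JsonlSketch {

    /** Escapes a string for use inside a JSON string literal. */
    static String escape(String s) {
        StringBuilder sb = new StringBuilder(s.length() + 16);
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;   // matters for Windows paths
                case '\n': sb.append("\\n");  break;
                case '\r': sb.append("\\r");  break;
                case '\t': sb.append("\\t");  break;
                default:
                    if (c < 0x20) {
                        // Remaining control characters must use the \\uXXXX form.
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.toString();
    }

    /** Builds one JSONL record; the result never contains a raw line break. */
    static String record(String file, int chunk, String text) {
        return "{\"file\":\"" + escape(file) + "\",\"chunk\":" + chunk
             + ",\"text\":\"" + escape(text) + "\"}";
    }

    public static void main(String[] args) {
        System.out.println(record("src\\Demo.java", 0, "class Demo {\n}\n"));
    }
}
```

Because newlines inside the text are escaped, every record stays on a single physical line, which is the property that makes JSONL safe to split on line breaks.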

Quick Start

# Show directory tree
fastfilescrape tree --root . --include "**/*.java"

# Extract file contents
fastfilescrape content --root . --include "**/*.java" --out repo.txt

# Tree + Content in JSONL
fastfilescrape all --root . --include "**/*.java" --format jsonl --out repo.jsonl
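
If you want to check what a pattern such as `**/*.java` selects before running the commands above, the JDK's own glob matcher gives a rough preview. This is a sketch under assumptions: `GlobSketch`, `glob`, and `included` are hypothetical names, the rule "an empty include list means include everything" is a guess, and FastGLOB's matching rules may differ from the JDK's. With the JDK matcher, for example, `**/*.java` needs at least one directory component and so does not match a top-level `Demo.java`.

```java
import java.nio.file.FileSystems;
import java.nio.file.Path;
import java.nio.file.PathMatcher;
import java.util.List;

// Hypothetical preview helper built on the JDK glob matcher; not the FastGLOB API.
class GlobSketch {

    /** Compiles a glob pattern using the default file system's syntax. */
    static PathMatcher glob(String pattern) {
        return FileSystems.getDefault().getPathMatcher("glob:" + pattern);
    }

    /** A path is kept if it matches any include (or there are none) and no exclude. */
    static boolean included(Path relative, List<PathMatcher> includes, List<PathMatcher> excludes) {
        boolean in = includes.isEmpty() || includes.stream().anyMatch(m -> m.matches(relative));
        boolean out = excludes.stream().anyMatch(m -> m.matches(relative));
        return in && !out;
    }

    public static void main(String[] args) {
        List<PathMatcher> includes = List.of(glob("**/*.java"));
        List<PathMatcher> excludes = List.of(glob("**/test/**"));
        for (String p : List.of("src/main/Demo.java", "src/test/DemoTest.java", "Demo.java")) {
            System.out.println(p + " -> " + included(Path.of(p), includes, excludes));
        }
    }
}
```

Paths should be matched relative to the scan root; matching absolute paths would make patterns depend on where the repository is checked out.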

Demo (Java)

import fastfilescrape.*;

import java.nio.file.Path;
import java.util.List;

public class Demo {
    public static void main(String[] args) throws Exception {

        // Tree
        var tcfg = new FastFileTree.Config();
        tcfg.root = Path.of(".");
        var tree = FastFileTree.build(tcfg);
        FastFileTree.printTree(tree, System.out);

        // Content
        var ccfg = new FastFileScrapeContent.Config();
        ccfg.root = Path.of(".");
        ccfg.includeGlobs = List.of("**/*.java");

        FastFileScrapeContent.scrape(ccfg, (file, chunk, text) -> {
            System.out.println("=== " + file + " (chunk " + chunk + ") ===");
            System.out.println(text);
        });
    }
}

Installation

Option 1: Maven (Recommended)

Add the JitPack repository and the dependencies to your pom.xml:

<repositories>
    <repository>
        <id>jitpack.io</id>
        <url>https://jitpack.io</url>
    </repository>
</repositories>
<dependencies>
    <dependency>
        <groupId>com.github.andrestubbe</groupId>
        <artifactId>FastFileScrape</artifactId>
        <version>v0.1.0</version>
    </dependency>
    <dependency>
        <groupId>com.github.andrestubbe</groupId>
        <artifactId>FastGLOB</artifactId>
        <version>v0.1.0</version>
    </dependency>
    <dependency>
        <groupId>com.github.andrestubbe</groupId>
        <artifactId>FastCore</artifactId>
        <version>v1.0.0</version>
    </dependency>
</dependencies>

Option 2: Gradle (via JitPack)

repositories {
    maven { url 'https://jitpack.io' }
}
dependencies {
    implementation 'com.github.andrestubbe:FastFileScrape:v0.1.0'
    implementation 'com.github.andrestubbe:FastGLOB:v0.1.0'
    implementation 'com.github.andrestubbe:FastCore:v1.0.0'
}

Option 3: Direct Download (No Build Tool)

Download the pre-compiled JARs and add them to your classpath:

📦 FastFileScrape-v0.1.0.jar (The Scraper Core Library)
📦 FastGlob-v0.1.0.jar (The Native Glob Matching Library)
⚙️ fastcore-v1.0.0.jar (The Mandatory JNI Loader)

Important

Since FastFileScrape is natively accelerated, all three JARs must be present in your classpath for the JNI-accelerated directory walking to operate correctly on Windows.

API Reference

FastFileTree

| Method | Description |
|---|---|
| `Node build(Config cfg)` | Builds the directory tree |
| `printTree(Node, Appendable)` | Prints ASCII tree |

FastFileScrapeContent

| Method | Description |
|---|---|
| `scrape(Config cfg, Sink sink)` | Reads files and emits chunks |

Documentation

COMPILE.md: Full compilation guide (MSVC C++17 build chain + JNI Setup).
REFERENCE.md: Full API descriptions, border configurations, and codepoint index.
PHILOSOPHIE.md: The engineering rationale for zero-allocation performance.
ROADMAP.md: Future milestones and planned features.

Platform Support

| Platform | Status |
|---|---|
| Windows 10/11 | ✅ Fully Supported |
| Linux | 🚧 Planned |
| macOS | 🚧 Planned |

License

MIT License — See LICENSE file for details.

Related Projects

FastFileIndex — Ultra-fast filesystem scanner
FastFileContentIndex — High-speed in-file text indexing
FastFileWatch — High-performance directory watch service using USN Journal
FastFileSearch — Ultra-fast indexed file prefix trie search
FastGLOB — Ultra-fast native Win32 glob matching and traversal
FastFileSystem — Unified filesystem operations (Index, Search, Watch, Scrape) in one API

Part of the FastJava Ecosystem — Making the JVM faster. Small package. Maximum speed. Zero bloat. 🚀
