Skip to content

Fichero is a tool that processes archival materials, and converts them (for now) to transcribed Word documents with the right verso page as the image and the recto page as the text.

License

Notifications You must be signed in to change notification settings

Jrardila/fichero

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

45 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Fichero

Fichero processes archival materials (documents in JPG, PDF, TIFF format), and crops, splits, enhances contrast, removes backgrounds, and then transcribes text using AI LLMs, before exporting them to Word documents, with the image of the document on the right verso page, and the recto page as the text.

Follow along on the Fichero website

Fichero:

  • Processes archival materials (scanned documents, images, etc.)
  • Splits multi-page materials into single pages
  • Enhances image quality and removes backgrounds
  • Transcribe text using:
    • Qwen Max (full document or segmented processing)
    • LM Studio Models (full document or segmented processing)
  • Cleans and format transcriptions
  • Generate Word documents with side-by-side layout
  • Processes text files using LLMs with configurable prompts.
  • Converts output to formatted Word (.docx) documents.

Installation

For command-line usage or contributing to development:

  1. Clone the repository:
git clone https://github.com/dtubb/fichero.git
cd fichero
  1. Install system dependencies:

    On macOS:

    brew install poppler  # Required for PDF processing
    brew install libjxl   # Optional: For JPEG XL support
    brew install libheif  # Optional: For HEIC/HEIF support
    brew install libraw   # Optional: For RAW format support
    brew install exiftool # Optional: For metadata handling
  2. Create and activate a virtual environment:

    Option 1 - Using venv:

    python -m venv venv
    source venv/bin/activate
  3. Install Python dependencies:

pip install -r requirements.txt

Running Fichero

Quick Start (GUI Mode - No Briefcase Required)

Run the GUI directly without Briefcase:

# From the project root directory
python -m src/fichero

This launches the GUI application directly using Toga, without needing to use Briefcase.

Command Line Interface (CLI Mode)

Run Fichero from the command line for advanced usage:

# From the src directory
cd src

# Show all available commands
python -m fichero --help

# Process a single folder
python -m fichero process-folders /path/to/input /path/to/output

# Process folders with a specific workflow
python -m fichero process-folders /path/to/input /path/to/output default

# Prepare folders for processing (first phase)
python -m fichero prepare /path/to/input /path/to/output

# Check worker status
python -m fichero worker-status

# See example usage
python -m fichero example

Available CLI Commands

  • process-folders: Main command to process documents with AI transcription
  • prepare: Prepare folders by copying and organizing files
  • worker-status: Check status of background processing workers
  • reset-workers: Restart all background workers
  • stop-workers: Stop all background workers
  • purge-tasks: Clear all pending processing tasks
  • example: Show detailed usage examples

Building the Native App (Optional)

Fichero can be built using BeeWare Briefcase, which packages Python apps as native applications for multiple platforms.

Development Mode

Run the app in development mode (faster iteration):

For GUI development, install BeeWare Briefcase.

pip install briefcase
# Run the GUI app directly
briefcase dev

Building for Distribution

Create a native macOS app:

# Create the app bundle
briefcase create

# Build the app (compile and package)
briefcase build

# Create a distributable package (.dmg) of the console app.
briefcase package

Platform Support

The app should be able to be built for multiple platforms. But, I've only tested macOS.

  • macOS: .app bundle, .dmg installer
  • Windows: .exe application, .msi installer
  • Linux: AppImage, native packages

For more information, see the BeeWare documentation.

The easiest way to use Fichero:

  1. Run the app directly: briefcase dev
  2. Click "Choose Folder" to select a folder containing your documents
  3. Click "Process" to start processing
  4. The app will:
    • Process all documents in the selected folder
    • Save processed files to your Desktop in a Fichero_Output_[folder_name] folder

Configuration

Plan.yml files

Fichero supports multiple formats and features.You can configure this in your plan.yml file stored in /src/resources/plans/plan.yml:

Alibaba API Key Setup

To transcribe with Alibababa features, you'll need to set up your DashScope API key:

  1. Sign up or log in to your Alibaba Cloud account
  2. Navigate to the DashScope console
  3. Create an API key
  4. Create a .env file in the project root:
touch .env
  1. Open the file with TextEdit:
open -a TextEdit .env
  1. Add your API key:
DASHSCOPE_API_KEY=your_api_key_here
  1. Save the file

Note: The DashScope API costs money.

LM Studio Setup

To use the LM Studio workflows, you'll need to:

  1. Download and install LM Studio from lmstudio.ai
  2. Download the Qwen 2.5 VL 7B model (or another VL model) in LM Studio:
    • Open LM Studio
    • Go to the "Models" tab
    • Search for "Qwen2.5-VL-3B-Instruct-8bit"
    • Download the model
  3. Start the LM Studio server:
    • Go to the "Local Server" tab
    • Click "Start Server"
    • The server will run on http://localhost:1234 by default
    • The API endpoint for chat completions is http://localhost:1234/v1/chat/completions

Citation

Citation for Fichero: Tubb, Daniel, and Andrew Janco. "Fichero: Document Processing and Transcription." GitHub, May 9, 2025. https://github.com/dtubb/fichero.

About

Fichero is a tool that processes archival materials, and converts them (for now) to transcribed Word documents with the right verso page as the image and the recto page as the text.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages