A powerful PySide6-based GUI application for batch processing text files using Ollama LLM models. Supports translation, audiobook formatting, and intelligent text paraphrasing with customizable pipeline operations.
- Translation: Professional translation between languages with context preservation
- Audiobook Formatting: Optimize text for text-to-speech systems
- Paraphrasing: Improve flow, simplify language, remove idioms, adjust tone
- Pipeline Processing: Chain multiple operations in custom order
- Smart Chunking: Intelligent text splitting with overlap and boundary detection
- Progressive Saving: Each pipeline step saved to a separate file
- Batch Processing: Process multiple files sequentially
- Model Selection: Use a different Ollama model per operation
- Python 3.8+
- Ollama installed and running
- At least one Ollama model installed
Download and install Ollama from [ollama.ai](https://ollama.ai).
```
# Install a model
ollama pull mistral
# or
ollama pull llama3.2
ollama pull aya-expanse:32b
```

Install the Python dependencies:

```
pip install PySide6 aiohttp qasync ollama
```

```
# Start Ollama server (in a separate terminal)
ollama serve

# Run the application
python main.py
```

Edit `config.json` to customize:
- Ollama host: Default `http://localhost:11434`
- Chunking presets: Adjust chunk sizes and overlap
- Operation settings: Modify prompts, icons, defaults
- UI settings: Window size, titles
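As an illustration of how such defaults might be merged with user settings at load time (the key names below are assumptions for the sketch, not the app's actual schema):

```python
import json

# Illustrative defaults; the real key names live in config.json
DEFAULTS = {
    "ollama_host": "http://localhost:11434",
    "chunk_size": 2500,
    "chunk_overlap": 200,
}

def load_config(path="config.json"):
    """Merge user settings over defaults so missing keys fall back safely."""
    try:
        with open(path, encoding="utf-8") as fh:
            user = json.load(fh)
    except FileNotFoundError:
        user = {}
    return {**DEFAULTS, **user}
```

With this pattern, a partial `config.json` only needs to list the keys it overrides.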
- Start Ollama: Run `ollama serve` in a terminal
- Launch App: Run `python main.py`
- Add Files: Click "Add File" and select .txt files
- Configure Pipeline:
  - Check operations to enable (Translation, Audiobook, Paraphrase)
  - Drag operations to reorder
  - Configure each operation's settings
- Select Models: Choose an Ollama model for each operation
- Set Chunking: Select a preset or enable "Process entire file"
- Start Processing: Click "START"
- Monitor Progress: Watch the Activity Log and progress bar
- Access Outputs: Find processed files and step files in the output directory
- Set source and target languages
- Handles idioms intelligently
- Maintains consistency across chunks
- Temperature: 0.2-0.4 recommended
Enable options:
- Expand Contractions: "don't" → "do not"
- Spell Out Numbers: "123" → "one hundred twenty-three"
- Remove Special Characters: Clean non-standard symbols
- Add Reading Markers: Insert TTS-friendly markers
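These audiobook transforms are plain string operations; below is a minimal sketch of two of them, with a deliberately tiny contraction table (the app's real tables and rules will be broader):

```python
import re

# Small illustrative mapping; a full implementation would cover many more forms
CONTRACTIONS = {
    "don't": "do not",
    "can't": "cannot",
    "it's": "it is",
    "won't": "will not",
}

def expand_contractions(text):
    """Replace known contractions, matching case-insensitively on word boundaries."""
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(c) for c in CONTRACTIONS) + r")\b",
        re.IGNORECASE,
    )
    return pattern.sub(lambda m: CONTRACTIONS[m.group(0).lower()], text)

def remove_special_characters(text):
    """Strip symbols that commonly trip up TTS engines, keeping basic punctuation."""
    return re.sub(r"[^\w\s.,;:!?'\"-]", "", text)
```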
Enable sub-operations:
- Improve Flow: Better sentence structure and transitions
- Simplify Language: Make complex text accessible
- Remove Idioms: Convert figurative to literal language
- Adjust Tone: Formal, casual, professional, or conversational
Presets (chunk size / overlap, in characters):
- Fast (2000/150): Quick processing
- Balanced (2500/200): Default, good quality
- High Context (3000/250): Better continuity
- Large (4000/300): Fewer API calls
- Extra Large (6000/400): Maximum context
Process Entire File: Disable chunking for small files (< 2500 chars)
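The app's actual splitter lives in Translator.py; as a rough sketch of overlap-plus-boundary chunking (function name and exact heuristics here are illustrative):

```python
def chunk_text(text, chunk_size=2500, overlap=200):
    """Split text into ~chunk_size pieces, preferring sentence boundaries,
    and repeat the last `overlap` characters so context carries across chunks."""
    if len(text) <= chunk_size:
        return [text]
    chunks, start = [], 0
    while start < len(text):
        end = min(start + chunk_size, len(text))
        if end < len(text):
            # Back up to the nearest sentence end inside the window, if any
            boundary = max(text.rfind(p, start, end) for p in (". ", "! ", "? ", "\n"))
            if boundary > start:
                end = boundary + 1
        chunks.append(text[start:end])
        if end >= len(text):
            break
        start = max(end - overlap, start + 1)  # step forward, keeping overlap
    return chunks
```

Each chunk begins with the tail of the previous one, which is why boundary deduplication matters downstream.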
The application creates multiple output files:
```
input.txt                      # Original file
input_step_01_translated.txt   # After translation
input_step_02_audiobook.txt    # After audiobook formatting
input_processed.txt            # Final output
```
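Names of this shape are easy to derive with pathlib; a sketch (helper names are hypothetical, not the app's actual API):

```python
from pathlib import Path

def step_filename(input_path, step, operation):
    """Derive the per-step output name, e.g. input_step_01_translated.txt."""
    p = Path(input_path)
    return p.with_name(f"{p.stem}_step_{step:02d}_{operation}{p.suffix}")

def final_filename(input_path):
    """Derive the final output name, e.g. input_processed.txt."""
    p = Path(input_path)
    return p.with_name(f"{p.stem}_processed{p.suffix}")
```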
Check:
- Is Ollama running? → `ollama serve`
- Is a model installed? → `ollama list`
- Did you add input files?
- Is at least one operation checked?
- Is a model selected in the dropdown?
```
# Test Ollama
ollama list

# If empty, install a model
ollama pull mistral

# Test inference
ollama run mistral "hello"
```

Large models (30B+) can take 1-2 minutes to load on first use. Watch the console output for [DEBUG] messages.
- Smaller chunks: Faster processing, less context
- Larger chunks: Slower but better quality
- Combined operations: More efficient than separate runs
- Fast models: Use smaller quantized models for speed
- GPU: Ensure Ollama uses GPU for better performance
Edit the operation prompts in `config.json`:

```json
{
  "operations": {
    "translation": {
      "prompts": {
        "system_first": "Your custom system prompt...",
        "user_first": "Your custom user prompt..."
      }
    }
  }
}
```

Recommended temperature ranges:

- Translation: 0.2-0.4 (deterministic)
- Paraphrasing: 0.4-0.6 (more creative)
- Creative Writing: 0.7-1.0 (very creative)
Operations execute in order from top to bottom. Typical workflows:
- Translation → Audiobook: Translate, then optimize for TTS
- Paraphrase → Simplify → Remove Idioms: Multi-step text cleanup
- Translation → Paraphrase (tone): Translate and adjust formality
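Mechanically, the pipeline is a fold over the enabled operations: each step consumes the previous step's output. A sketch with toy stand-ins for the real model-backed operations:

```python
def run_pipeline(text, operations):
    """Apply operations top to bottom; each step consumes the previous output.
    Returns every intermediate result keyed by step name (mirroring the
    progressive-saving behaviour)."""
    results = {}
    for name, op in operations:
        text = op(text)
        results[name] = text
    return results

# Toy stand-ins for the real model-backed operations
steps = [
    ("translated", str.upper),
    ("audiobook", lambda t: t.replace("!", ".")),
]
```

Because steps are chained, reordering them changes the result, which is why the UI lets you drag operations into a custom order.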
- Ctrl+O: Add files
- Ctrl+S: Select output directory
- Ctrl+R: Start processing
- Esc: Stop processing
```
python test_ollama.py
```

Shows detailed connection diagnostics and model availability.
Console output shows [DEBUG] messages for:
- Model calls with parameters
- Response types and content length
- Error tracebacks
- Pipeline execution flow
- Only .txt files supported
- No real-time progress within chunks (model inference time varies)
- Large models require significant RAM/VRAM
- UI can become sluggish during heavy processing, despite async operations
- Pre-process text: Remove excessive whitespace, fix encoding
- Use appropriate models: Match model size to task complexity
- Test with small files first: Verify settings before batch processing
- Monitor first chunk: Shows model loading time and quality
- Enable deduplication: Removes duplicate paragraphs at chunk boundaries
- Save intermediate steps: Useful for debugging and iterative refinement
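Boundary deduplication can be as simple as dropping consecutive repeated paragraphs, which chunk overlap tends to produce when the model re-emits the overlapping text. A sketch (the app's actual heuristic may differ):

```python
def dedupe_paragraphs(text):
    """Drop consecutive duplicate paragraphs left behind by chunk overlap."""
    kept = []
    for para in text.split("\n\n"):
        if not kept or para.strip() != kept[-1].strip():
            kept.append(para)
    return "\n\n".join(kept)
```

Note this only removes adjacent duplicates, so legitimate repeats elsewhere in the document are preserved.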
- main.py: PySide6 GUI and application logic
- Translator.py: Ollama API interface and text processing
- config.json: Configuration and prompts
- qasync: Async event loop integration with Qt
MIT License - feel free to modify and distribute.
Improvements welcome! Focus areas:
- Additional operation types
- Better error recovery
- Real-time streaming output
- Support for other file formats
- Memory optimization for large files
Built with PySide6, qasync, aiohttp, and Ollama.
Version: 1.0
Last Updated: October 2025