A web-based tool for extracting and analyzing scenes from screenplay PDFs. It processes screenplay documents to identify scenes, characters, and locations, and provides detailed analysis with export capabilities.
- PDF Text Extraction: Uses PDF.js to extract text from screenplay PDFs
- Scene Detection: Automatically identifies scene breaks using slugline patterns
- Character Recognition: Extracts character names from dialogue
- Location Analysis: Identifies and categorizes shooting locations
- Time Canonicalization: Normalizes time-of-day indicators
- Length Estimation: Calculates scene lengths in 1/8-page units
- Deterministic Scene IDs: Generates consistent scene identifiers
- CSV/JSON Export: Multiple export formats for further analysis
- Comprehensive Testing: Full test suite with 95%+ pass rate
- Drag & Drop Upload: Simple file upload interface
- Real-time Progress: Visual progress indicators during processing
- Tabbed Results: Organized display of scenes, characters, and locations
- Interactive Demo: Built-in sample analysis demonstration
- Responsive Design: Works on desktop and mobile devices
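As a rough illustration of the length estimation and deterministic scene ID features listed above, here is a minimal sketch. The function names, the 55-lines-per-page constant, and the choice of an FNV-1a style hash are all assumptions made for this example; the project's own implementation may differ.

```javascript
// Hypothetical helpers for illustration only; not the project's actual API.

// Assumption: a screenplay page holds roughly 55 lines of text.
const LINES_PER_PAGE = 55;

// Estimate a scene's length in 1/8-page units, rounding up, minimum 1.
function estimateEighths(sceneText, linesPerPage = LINES_PER_PAGE) {
  const lineCount = sceneText.split(/\r?\n/).length;
  return Math.max(1, Math.ceil((lineCount / linesPerPage) * 8));
}

// Derive an ID from the scene's own text, so the same input always yields
// the same ID. Uses a 32-bit FNV-1a style hash over UTF-16 code units.
function sceneId(slugline, content) {
  const input = `${slugline.trim().toUpperCase()}\n${content.trim()}`;
  let hash = 0x811c9dc5; // FNV offset basis
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0; // multiply by FNV prime, keep unsigned
  }
  return `scene-${hash.toString(16).padStart(8, "0")}`;
}
```

Because the ID depends only on the slugline and content, re-running the analysis on the same PDF would give the same identifiers, which is what makes exported data comparable between runs.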
Visit the live demo at: https://rleitzell.github.io
- Clone the repository:
  git clone https://github.com/rleitzell/rleitzell.github.io.git
  cd rleitzell.github.io
- Start a local web server:
  python3 -m http.server 8000
  # or
  npx http-server -p 8000
- Open your browser to http://localhost:8000
- Upload Screenplay: Drag and drop a PDF screenplay file or click to select
- Processing: Watch real-time progress as the tool extracts and analyzes content
- Review Results: Browse scenes, characters, and locations in the tabbed interface
- Export Data: Download analysis as CSV or JSON for further processing
Visit /demo.html to see the tool in action with sample screenplay text.
Run the comprehensive test suite at /tests/ to verify functionality.
- models.js: Data models for Scene, Character, and Location entities
- textProcessor.js: Text parsing and scene extraction logic
- pdfExtractor.js: PDF processing using PDF.js library
- sceneAnalyzer.js: Scene analysis and aggregation engine
- exportUtils.js: Data export utilities for CSV/JSON/XML
- main.js: Main application controller and UI logic
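To give a sense of what the PDF processing step involves, the sketch below pulls plain text from each page with PDF.js. It is not taken from pdfExtractor.js: the function name `extractPageTexts` is hypothetical, and the PDF.js module is passed in as a parameter (an assumption made so the example does not depend on how the library is loaded).

```javascript
// Hypothetical sketch; pdfExtractor.js may be structured differently.
// `pdfjsLib` is the loaded PDF.js module, `data` is the PDF as a Uint8Array.
async function extractPageTexts(pdfjsLib, data) {
  const pdf = await pdfjsLib.getDocument({ data }).promise;
  const pages = [];
  for (let pageNumber = 1; pageNumber <= pdf.numPages; pageNumber++) {
    const page = await pdf.getPage(pageNumber);
    const content = await page.getTextContent();
    // Each text item carries a `str` fragment. Joining with spaces loses line
    // breaks; rebuilding lines from item positions is left out of this sketch.
    pages.push(content.items.map((item) => item.str ?? "").join(" "));
  }
  return pages;
}
```

In the browser, the `data` argument could come from the uploaded file, for example `new Uint8Array(await file.arrayBuffer())`.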
{
  id: "deterministic-hash",
  number: 1,
  slugline: "INT. COFFEE SHOP - DAY",
  location: "COFFEE SHOP",
  timeOfDay: "DAY",
  characters: ["SARAH", "MIKE"],
  content: "Scene dialogue and action...",
  estimatedLength: 2, // in 1/8 page units
  pageNumber: 1
}

{
  name: "SARAH",
  scenes: ["scene-id-1", "scene-id-2"],
  totalAppearances: 2
}

{
  name: "COFFEE SHOP",
  scenes: ["scene-id-1", "scene-id-2"],
  totalUses: 2
}

- Recognizes standard slugline formats (INT./EXT. LOCATION - TIME)
- Handles scene numbers and variations
- Processes multiple screenplay formatting styles
- Resolves duplicate scene numbers automatically
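The slugline handling described above could look something like the following. This is a simplified sketch, not the logic in textProcessor.js: the regular expression, the `parseSlugline` name, and the time-of-day alias table are assumptions chosen for illustration, and real screenplays contain more variations than this pattern covers.

```javascript
// Hypothetical sketch. Assumed grammar: [number] INT./EXT. LOCATION - TIME
// The time part may not contain a hyphen, so "HOUSE - KITCHEN - NIGHT"
// splits at the last " - ".
const SLUGLINE = /^(?:(\d+[A-Z]?)\s+)?(INT\.\/EXT\.|INT\.|EXT\.)\s+(.+?)(?:\s+-\s+([^-]+))?$/;

// Assumed canonical values; the project's own mapping may differ.
const TIME_ALIASES = {
  DAY: "DAY", MORNING: "DAY", AFTERNOON: "DAY",
  NIGHT: "NIGHT", EVENING: "NIGHT",
  DAWN: "DAWN", SUNRISE: "DAWN",
  DUSK: "DUSK", SUNSET: "DUSK",
};

// Returns the parsed parts of a slugline, or null if the line is not one.
function parseSlugline(line) {
  const match = SLUGLINE.exec(line.trim().toUpperCase());
  if (!match) return null;
  const [, number = null, prefix, location, rawTime = null] = match;
  const time = rawTime === null ? null : rawTime.trim();
  return {
    number,
    prefix,
    location: location.trim(),
    // Unknown time markers (e.g. CONTINUOUS) are passed through unchanged.
    timeOfDay: time === null ? null : TIME_ALIASES[time] ?? time,
  };
}
```

A line that returns `null` is treated as ordinary scene content; a non-null result marks the start of a new scene.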
- Identifies character names from dialogue formatting
- Filters out non-character elements (camera directions, etc.)
- Tracks character appearances across scenes
- Provides character-based scene analysis
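One way to approximate the character recognition described above is to look for short all-caps lines that are directly followed by dialogue. The sketch below works on that assumption; the `extractCharacters` name, the exclusion pattern, and the 40-character limit are illustrative choices rather than the project's actual rules.

```javascript
// Hypothetical sketch. Lines that look like cues but are sluglines,
// transitions, or camera directions (an assumed, incomplete list).
const NON_CHARACTER = /^(?:INT\.|EXT\.|FADE |CUT TO|DISSOLVE|SMASH CUT|CLOSE ON|ANGLE ON|THE END)|(?:TO:|IN:|OUT\.)$/;

// Returns the sorted, de-duplicated character names found in a scene.
function extractCharacters(sceneText) {
  const lines = sceneText.split(/\r?\n/);
  const names = new Set();
  for (let i = 0; i < lines.length - 1; i++) {
    const cue = lines[i].trim();
    const next = lines[i + 1].trim();
    // A cue is short, all caps, and immediately followed by a dialogue line.
    if (cue === "" || next === "" || cue.length > 40) continue;
    if (cue !== cue.toUpperCase() || !/[A-Z]/.test(cue)) continue;
    if (NON_CHARACTER.test(cue)) continue;
    // Strip trailing extensions such as (V.O.), (O.S.), or (CONT'D).
    const name = cue.replace(/\s*\([^)]*\)\s*$/, "").trim();
    if (name !== "") names.add(name);
  }
  return [...names].sort();
}
```

Running this per scene and collecting the scene IDs under each name would give the per-character appearance tracking mentioned in the list.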
- Extracts locations from sluglines
- Normalizes location names
- Tracks location usage frequency
- Supports location-based filtering
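The location tracking above can be pictured as a small aggregation over parsed scenes. The following sketch builds entries shaped like the location object shown earlier (`name`, `scenes`, `totalUses`); the helper names and the normalization rules are assumptions for illustration.

```javascript
// Hypothetical sketch; normalization rules here are assumed, not the project's.
// Uppercases, collapses whitespace, and drops trailing punctuation.
function normalizeLocation(raw) {
  return raw.trim().toUpperCase().replace(/\s+/g, " ").replace(/[.,]+$/, "");
}

// Groups scenes by normalized location, most frequently used first.
function aggregateLocations(scenes) {
  const byName = new Map();
  for (const scene of scenes) {
    const name = normalizeLocation(scene.location);
    if (!byName.has(name)) byName.set(name, { name, scenes: [], totalUses: 0 });
    const entry = byName.get(name);
    entry.scenes.push(scene.id);
    entry.totalUses += 1;
  }
  return [...byName.values()].sort((a, b) => b.totalUses - a.totalUses);
}
```

Location-based filtering then reduces to comparing normalized names, for example `scenes.filter((s) => normalizeLocation(s.location) === "COFFEE SHOP")`.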
- CSV Export: Scenes, characters, and locations as separate CSV files
- JSON Export: Complete analysis data in structured format
- XML Export: Alternative structured format
- Batch Export: Download multiple formats simultaneously
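For the CSV export, the main detail to get right is quoting fields that contain commas, quotes, or line breaks. Here is a minimal sketch that serializes scene objects shaped like the scene model shown earlier; `csvField`, `scenesToCsv`, and the default column list are hypothetical and not taken from exportUtils.js.

```javascript
// Hypothetical sketch of CSV serialization for scene objects.

// Quote a field when it contains a comma, quote, or line break,
// doubling any embedded quotes.
function csvField(value) {
  const text = Array.isArray(value) ? value.join("; ") : String(value ?? "");
  return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
}

// Assumed column set, based on the fields of the scene object.
const SCENE_COLUMNS = [
  "number", "slugline", "location", "timeOfDay",
  "characters", "estimatedLength", "pageNumber",
];

// Returns a CSV string with a header row followed by one row per scene.
function scenesToCsv(scenes, columns = SCENE_COLUMNS) {
  const rows = scenes.map((scene) =>
    columns.map((column) => csvField(scene[column])).join(",")
  );
  return [columns.join(","), ...rows].join("\r\n");
}
```

The JSON export needs no such handling, since `JSON.stringify(analysis, null, 2)` already escapes strings correctly.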
The project includes a comprehensive test suite covering:
- Data model functionality
- Text processing algorithms
- Scene analysis logic
- Export utilities
- Integration scenarios
Run tests by visiting /tests/ in your browser.
- Enhanced scene review interface
- Drag/drop grouping for characters/locations
- Duplicate scene number resolution UI
- Mapping import/export functionality
- Advanced PDF layout analysis
- Bounding box-based page calculations
- Layout-derived length computations
- Multi-language support
- Collaborative editing
- Version comparison
- Integration with screenplay software
- Chrome/Chromium 80+
- Firefox 75+
- Safari 13+
- Edge 80+
Requires JavaScript enabled and modern browser support for:
- PDF.js library
- File API
- ES6+ features
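If you want to check these requirements at runtime before accepting an upload, a feature test along these lines is one option. It is a sketch only: `missingFeatures` is a hypothetical helper, and it assumes PDF.js has been loaded in a way that exposes a `pdfjsLib` global.

```javascript
// Hypothetical sketch: report which required capabilities are missing.
// `env` defaults to the global object; pass a stub to test the logic.
function missingFeatures(env = globalThis) {
  const checks = {
    "File API": typeof env.File === "function" && typeof env.FileReader === "function",
    // Assumption: the PDF.js build in use exposes a `pdfjsLib` global.
    "PDF.js": typeof env.pdfjsLib === "object" && env.pdfjsLib !== null,
    "ES6+ (Promise, Map)": typeof env.Promise === "function" && typeof env.Map === "function",
  };
  return Object.keys(checks).filter((name) => !checks[name]);
}
```

An empty array means the page can proceed; otherwise the returned names can be shown to the user in place of the upload area.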
- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
This project is open source and available under the MIT License.
- PDF.js library for PDF processing
- Screenplay formatting standards
- Community feedback and testing


