Forth Documentation Scraper & Implementation Tracker

This project provides tools to scrape Forth word documentation from the official Forth-2012 standard website and to track implementation progress for the Forth.js JavaScript interpreter. I wanted the core words categorized so that I could work on some parts of the interpreter before others. This repository contains the scripts I wrote to do that, along with their output.

🛠️ Components

1. Web Scraper (`main.py`)

Scrapes all 133 core Forth words from forth-standard.org.

Features:

Fetches word descriptions and test examples
Progress tracking with percentage completion
Polite rate limiting (0.5s delay between requests)
Saves results to JSON
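
The features above can be sketched as a small loop. This is a minimal illustration, not the code in `main.py`: the names `scrape_words`, `save_results`, and the `fetch` callable are hypothetical, and the real script's request and parsing logic is not shown here. The `fetch` argument stands in for whatever downloads and parses one word's page, so the loop can be tried without network access.

```python
import json
import time


def scrape_words(words, fetch, delay=0.5, sleep=time.sleep):
    """Fetch documentation for each word, pausing between requests.

    `fetch` is a hypothetical callable taking a word name and returning
    a JSON-serializable record (for example, description and tests).
    """
    results = {}
    total = len(words)
    for index, word in enumerate(words, start=1):
        results[word] = fetch(word)
        percent = 100 * index / total
        print(f"[{index}/{total}] {percent:5.1f}% {word}")
        if index < total:
            sleep(delay)  # polite rate limiting; no pause after the last word
    return results


def save_results(results, path):
    """Write the scraped records to a JSON file."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(results, handle, indent=2, ensure_ascii=False)
```

Passing `sleep` as a parameter is a design choice for this sketch only: it lets the delay be replaced by a no-op when experimenting.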

Usage:

# Activate virtual environment (see Quick Start for how to create it)
source venv/bin/activate

# Install dependencies
pip install requests beautifulsoup4 lxml

# Run scraper
python main.py

Output: forth_words_scraped.json - Raw scraped data with 133 Forth words

2. Categorization Script (`categorize_words.py`)

Analyzes the scraped data and categorizes words by which stack they affect.

Categories:

Data Stack (50 words) - Arithmetic, stack manipulation, comparisons
Memory (25 words) - Storage, allocation, address operations
Control Flow (12 words) - IF/THEN, loops, conditionals
Compilation (11 words) - Colon definitions, immediate words
Input Stream (10 words) - Parsing and input operations
I/O (9 words) - Input/output operations
Return Stack (6 words) - >R, R>, R@, etc.
Pictured Numeric Output (6 words) - #, #>, <#, etc.
System/Environment (4 words) - System-level operations

Output: forth_words_categorized.json - Organized by category
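
One simple way to group words into categories like these is a lookup table. The sketch below assumes that approach; how `categorize_words.py` actually decides a word's category is not described here, and the names `CATEGORY_BY_WORD`, `categorize`, and `category_counts` are hypothetical. The table lists only the words this README itself names as examples.

```python
from collections import defaultdict

# Hypothetical lookup table, limited to words named in this README.
CATEGORY_BY_WORD = {
    ">R": "Return Stack",
    "R>": "Return Stack",
    "R@": "Return Stack",
    "#": "Pictured Numeric Output",
    "#>": "Pictured Numeric Output",
    "<#": "Pictured Numeric Output",
    "IF": "Control Flow",
    "THEN": "Control Flow",
}


def categorize(words, category_by_word=CATEGORY_BY_WORD, default="Uncategorized"):
    """Group word names by category; unknown words go to `default`."""
    grouped = defaultdict(list)
    for word in words:
        # Standard Forth word names are case-insensitive in most systems,
        # so the lookup is done on the upper-cased name (an assumption).
        grouped[category_by_word.get(word.upper(), default)].append(word)
    return dict(grouped)


def category_counts(grouped):
    """Return the number of words in each category."""
    return {category: len(members) for category, members in grouped.items()}
```

For example, `category_counts(categorize([">R", "IF", "THEN"]))` gives one word under Return Stack and two under Control Flow.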

3. Implementation Tracker (Static Website)

Interactive web interface to track which Forth words you've implemented in your interpreter. See the live tracker at https://taus9.github.io/forth_tracker/

Features:

✅ Visual progress tracking with color-coded cards
📊 Real-time statistics (Total, Implemented, Tested)
🔍 Search and filter by word name or description
💾 Auto-save progress to browser localStorage
📥 Export progress as JSON
🎨 Dark theme
📱 Responsive design (3 columns on desktop, 2 on tablet, 1 on mobile)

Card States:

Default - Not yet implemented (gray border)
Implemented - Code written but not tested (green border)
Tested - Implementation with passing tests (blue border)
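
The three card states map naturally onto the statistics shown in the tracker. The sketch below is an assumption about how such counts could be computed from an exported progress file; the export format (a mapping from word name to state string) and the function name `progress_stats` are hypothetical. The README does not say whether a tested word also counts as implemented, so this sketch counts the two states separately.

```python
from collections import Counter

# State names taken from the card states above, lower-cased (an assumption).
STATES = ("default", "implemented", "tested")


def progress_stats(progress, total_words):
    """Summarize a {word: state} mapping into tracker-style statistics."""
    counts = Counter(progress.values())
    unknown = set(counts) - set(STATES)
    if unknown:
        raise ValueError(f"unknown states: {sorted(unknown)}")
    implemented = counts["implemented"]
    tested = counts["tested"]
    return {
        "total": total_words,
        "implemented": implemented,
        "tested": tested,
        # Words absent from the mapping are treated as not yet implemented.
        "remaining": total_words - implemented - tested,
    }
```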

🚀 Quick Start

Setting Up the Scraper

# Create virtual environment
python3 -m venv venv
source venv/bin/activate

# Install dependencies
pip install requests beautifulsoup4 lxml

# Run scraper
python main.py

# Run categorization
python categorize_words.py

Using the Tracker Locally

# Start local server
python3 -m http.server 8000

# Open in browser (Firefox shown; any browser works)
firefox http://localhost:8000/index.html

📄 License

MIT License - see LICENSE file for details.
