A Next.js application that cleans WordPress-exported content for Webflow compatibility, using Airtable as the data source. It automates the conversion of WordPress API JSON wrappers, HTML entities, and WordPress-specific markup into Webflow-compatible content.
- WordPress Content Cleanup: Removes JSON wrappers, decodes HTML entities, and cleans WordPress-specific markup
- Airtable Integration: Connects to Airtable to fetch and update content records
- Progress Tracking: Real-time progress monitoring with detailed statistics
- Rate Limiting: Respects Airtable API rate limits by updating records in batches of 10 with a 200ms delay between batches
- Quality Assurance: Provides QA notes and validation for cleaned content
- Webflow Cloud Ready: Optimized for deployment on Webflow Cloud
- Character Encoding: Converts curly quotes to straight quotes, normalizes escape sequences
- WordPress JSON Wrappers: Removes {"rendered":"content"} formatting
- HTML Entities: Decodes &amp;amp;, &amp;lt;, &amp;gt;, etc. to readable characters
- HTML Tags: Removes disallowed tags (div, section, etc.) and wraps embed tags
- Image URLs: Fixes relative image paths to absolute URLs
- Code Blocks: Cleans syntax highlighting bloat from WordPress code blocks
- Figure Tags: Updates to Webflow-compatible classes
- HTML Spacing: Normalizes whitespace and removes unnecessary spacing
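A few of the rules above can be sketched in TypeScript as follows. This is a minimal illustration, not the app's actual implementation: every function name here is hypothetical, the entity table covers only a handful of common entities, and the image regex only handles double-quoted `src` attributes that start with a single slash.

```typescript
// Illustrative sketch only: all names below are hypothetical and are not
// taken from the app's source.

// Unwrap a WordPress REST API field of the form {"rendered":"..."}.
function unwrapRendered(raw: string): string {
  const trimmed = raw.trim();
  if (trimmed.charAt(0) !== "{") return raw;
  try {
    const parsed = JSON.parse(trimmed) as { rendered?: unknown } | null;
    if (parsed !== null && typeof parsed === "object" && typeof parsed.rendered === "string") {
      return parsed.rendered;
    }
  } catch {
    // Not valid JSON: treat the input as plain HTML.
  }
  return raw;
}

// Small subset of HTML entities; a real decoder would cover many more.
const ENTITIES: Record<string, string> = {
  "&amp;": "&",
  "&lt;": "<",
  "&gt;": ">",
  "&quot;": '"',
  "&#39;": "'",
};

// Decode in a single pass so "&amp;lt;" becomes "&lt;" rather than "<".
function decodeEntities(input: string): string {
  return input.replace(/&(?:amp|lt|gt|quot|#39);/g, (match) => ENTITIES[match] ?? match);
}

// Convert curly single and double quotes to straight quotes.
function straightenQuotes(input: string): string {
  return input.replace(/[\u2018\u2019]/g, "'").replace(/[\u201C\u201D]/g, '"');
}

// Prefix root-relative <img src="/..."> paths with the given domain.
// Protocol-relative URLs ("//cdn...") are left untouched.
function absolutizeImageUrls(html: string, imageDomain: string): string {
  const base = imageDomain.replace(/\/+$/, "");
  return html.replace(
    /(<img\b[^>]*?\bsrc=")(\/(?!\/)[^"]*)"/gi,
    (_match, prefix: string, path: string) => `${prefix}${base}${path}"`,
  );
}

// Apply the steps in order: unwrap, decode, normalize quotes, fix images.
function cleanContent(raw: string, imageDomain: string): string {
  const html = straightenQuotes(decodeEntities(unwrapRendered(raw)));
  return absolutizeImageUrls(html, imageDomain);
}
```

Calling `cleanContent(rawFieldValue, "https://example.com")` on a field value would run all four steps in sequence; the domain here is a placeholder.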
- Node.js 18+
- Airtable API key
- Airtable base with content to clean
- Clone the repository:
  git clone https://github.com/8020admin/wordpress-html-cleanup.git
  cd wordpress-html-cleanup
- Install dependencies:
  npm install
- Run the development server:
  npm run dev
- Open http://localhost:3000 in your browser.
- Enter your Airtable API key (found in your Airtable account settings)
- The app will validate the API key before proceeding
- Select your Airtable base and table
- Choose the source field containing WordPress content
- Select the output field for cleaned content
- Optionally choose a notes field for QA information
- The app will fetch all records from the selected table
- Clean each record's content according to the specification
- Update the records in batches with rate limiting
- Show real-time progress and statistics
- View a comprehensive summary of the cleanup process
- See which records had issues and what was cleaned
- Start a new cleanup or review the results
This app is optimized for Webflow Cloud deployment:
- Build the app:
  npm run build
- Deploy to Webflow Cloud:
  - Connect your GitHub repository to Webflow Cloud
  - Set environment variables in Webflow Cloud dashboard
  - Deploy using the Webflow CLI or dashboard
- Environment Variables:
  - Set any required environment variables in Webflow Cloud
  - API keys should be stored securely in Webflow Cloud variables
The app can also be deployed to:
- Vercel
- Netlify
- AWS
- Any Node.js hosting platform
- GET: Fetch available Airtable bases
- Headers: x-airtable-api-key
- GET: Fetch tables for a specific base
- Headers: x-airtable-api-key
- Query: baseId
- GET: Fetch fields for a specific table
- Headers: x-airtable-api-key
- Query: baseId, tableId
- POST: Start the cleanup process
- Headers: x-airtable-api-key, Content-Type: application/json
- Body: Setup configuration with base, table, and field selections
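As a rough sketch of how a client might call these endpoints, the helper below assembles the URL, the `x-airtable-api-key` header, and an optional JSON body. The helper name `buildApiRequest` is hypothetical, and the route path is passed in as a parameter because the exact paths are not assumed here.

```typescript
// Hypothetical helper, not part of the app: builds the pieces of a request
// for one of the API routes described above.
interface ApiRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body?: string };
}

function buildApiRequest(
  origin: string, // e.g. "http://localhost:3000"
  routePath: string, // caller supplies the route; no path is assumed here
  apiKey: string,
  options: { query?: Record<string, string>; body?: unknown } = {},
): ApiRequest {
  const query = options.query ?? {};
  const queryString = Object.keys(query)
    .map((key) => `${encodeURIComponent(key)}=${encodeURIComponent(query[key] ?? "")}`)
    .join("&");
  const url = queryString ? `${origin}${routePath}?${queryString}` : `${origin}${routePath}`;

  // Every endpoint expects the Airtable key in this header.
  const headers: Record<string, string> = { "x-airtable-api-key": apiKey };
  const init: ApiRequest["init"] = { method: "GET", headers };

  // A body implies the POST endpoint, which also needs a JSON content type.
  if (options.body !== undefined) {
    init.method = "POST";
    headers["Content-Type"] = "application/json";
    init.body = JSON.stringify(options.body);
  }
  return { url, init };
}
```

The result can be passed straight to `fetch(request.url, request.init)`. Keeping request construction separate from the network call makes the header and query handling easy to check in isolation.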
The cleanup process can be customized through the configuration object:
interface CleanupConfig {
  imageDomain?: string; // Domain for fixing image URLs
  allowedTags?: Set<string>; // HTML tags to preserve
  disallowedTags?: string[]; // HTML tags to remove
  embedTags?: string[]; // Tags to wrap in embed containers
  preserveSchema?: boolean; // Whether to preserve schema fields
  strictMode?: boolean; // Enable strict validation
  maxContentLength?: number; // Maximum content length to process
  enableQANotes?: boolean; // Generate QA notes
}
The app implements intelligent rate limiting to respect Airtable's API limits:
- Processes records in batches of 10
- 200ms delay between batches
- Automatic retry logic for failed requests
- Progress tracking with error handling
- Graceful degradation when cleanup fails
- Detailed error messages and logging
- Retry mechanisms for transient failures
- Validation of input data and API responses
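The batching, delay, and retry behavior described above could look roughly like the sketch below. The batch size of 10 and the 200ms delay come from this README; the function names, the retry count of 3, and the linear backoff are assumptions made for illustration. Airtable's REST API accepts at most 10 records per update request and allows 5 requests per second per base, which is consistent with these two values.

```typescript
// Illustrative sketch only: names, retry count, and backoff are assumptions.

// Split an array into consecutive groups of at most `size` items.
function chunk<T>(items: T[], size: number): T[][] {
  const out: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}

const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Retry a failing async call, waiting a little longer after each attempt.
async function withRetry<T>(
  fn: () => Promise<T>,
  maxAttempts = 3, // assumed value
  baseDelayMs = 500, // assumed value
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      if (attempt < maxAttempts) await sleep(baseDelayMs * attempt);
    }
  }
  throw lastError;
}

interface BatchResult {
  updated: number;
  failed: number;
}

// Update records batch by batch, pausing between batches and reporting
// progress. A batch that still fails after retries is counted, not fatal.
async function updateInBatches<T>(
  records: T[],
  updateBatch: (batch: T[]) => Promise<void>,
  onProgress: (done: number, total: number) => void = () => {},
  batchSize = 10,
  delayMs = 200,
): Promise<BatchResult> {
  const result: BatchResult = { updated: 0, failed: 0 };
  const batches = chunk(records, batchSize);
  let index = 0;
  for (const batch of batches) {
    try {
      await withRetry(() => updateBatch(batch));
      result.updated += batch.length;
    } catch {
      result.failed += batch.length;
    }
    onProgress(result.updated + result.failed, records.length);
    index++;
    if (index < batches.length) await sleep(delayMs);
  }
  return result;
}
```

Here `updateBatch` stands in for whatever function sends one Airtable update request. Counting a failed batch instead of throwing lets the remaining batches proceed, which matches the graceful-degradation goal listed above.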
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests if applicable
- Submit a pull request
MIT License - see LICENSE file for details
For issues and questions:
- Create an issue on GitHub
- Check the WordPress to Webflow Cleanup Specification
- Based on the comprehensive WordPress to Webflow cleanup specification
- Built with Next.js, TypeScript, and Tailwind CSS
- Uses Airtable API for data management
- Optimized for Webflow Cloud deployment