luismr/python-confluence-migration-tool

Confluence Migration Tool

A comprehensive Python tool for migrating Confluence pages, including content, attachments, and hierarchical structure, from one Confluence instance to another. The tool uses a two-step process to ensure safe and reliable migration.

Features

  • Two-Step Migration Process: Separate read and publish phases for safe testing
  • Complete Page Migration: Migrates page content, attachments, and hierarchical structure
  • Parent Page Inclusion: Migrates the parent page along with all its children
  • Duplicate Handling: Automatically handles duplicate page titles with timestamps
  • Attachment Support: Downloads and uploads all page attachments (images, documents, etc.)
  • Published Pages: Creates pages as published (not drafts) with proper edit capabilities
  • Dry Run Mode: Test publishing without making actual changes
  • Rate Limiting: Handles Confluence API rate limits automatically
  • Error Recovery: Robust error handling with detailed logging

Requirements

  • Python 3.11+ (configured with pyenv)
  • Confluence Cloud instances (source and destination)
  • API tokens for both Confluence instances
  • Appropriate permissions in both spaces

Installation

  1. Clone or download the project:

    cd confluence_migration
  2. Set up Python environment:

    # Python 3.13.9 is configured via .python-version
    python -m venv venv
    source venv/bin/activate  # On Windows: venv\Scripts\activate
  3. Install dependencies:

    pip install -r requirements.txt

Configuration

The tool takes all configuration from command-line arguments, with environment variables as an optional source for credentials. No hardcoded values need to be changed.

Required Information

Before running the migration, gather the following information:

Source Confluence

  • Base URL: Your source Confluence URL (e.g., https://source.atlassian.net/wiki)
  • API Token: Generate from Atlassian Account Settings → Security → API tokens
  • Email: Your Atlassian account email
  • Parent Page ID: The ID of the parent page to migrate; the page itself and all of its children are migrated

Destination Confluence

  • Base URL: Your destination Confluence URL (e.g., https://destination.atlassian.net/wiki)
  • API Token: Generate from Atlassian Account Settings → Security → API tokens
  • Email: Your Atlassian account email
  • Parent Page ID: The ID of the page under which migrated pages will be created
  • OR Space Key: Alternatively, specify a space key to create pages at space root

Creating API Tokens

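API tokens are required to authenticate with Confluence. A token belongs to an Atlassian account, so you need one token per account: two tokens if the source and destination instances are accessed with different accounts, one if the same account is used for both.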

Step-by-Step Guide:

  1. Log into your Atlassian account and open Account Settings → Security → API tokens.

  2. Create a new token:

    • Click "Create API token"
    • Enter a descriptive label (e.g., "Confluence Migration Tool - Source" or "Confluence Migration - Destination")
    • Click "Create"
  3. Copy and save the token:

    • ⚠️ Important: Copy the token immediately - you won't be able to see it again
    • Store it securely (password manager, environment variables, etc.)
    • The token format looks like: ATATT3xFfGF0KFkgI5j5DegaKJ_Dpt0FHWPrrL0vGk8j...
  4. Repeat for each Confluence instance:

    • Create separate tokens for source and destination if they're different Atlassian accounts
    • If using the same account for both, you can use the same token

Token Security Best Practices:

  • Never commit tokens to version control
  • Use environment variables for production usage
  • Create descriptive labels to identify token purposes
  • Regularly rotate tokens (delete old ones, create new ones)
  • Use separate tokens for different tools/purposes
  • Revoke unused tokens from the same security page

Required Permissions:

Your Atlassian account must have:

  • Source Confluence: 'View' permission for the space and pages
  • Destination Confluence: 'Create' and 'Edit' permissions in the target space
  • Space Admin permissions may be required for some operations

Finding Page IDs

To find a page ID:

  1. Open the page in Confluence
  2. Look at the URL: https://yoursite.atlassian.net/wiki/spaces/SPACE/pages/123456789/Page+Title
  3. The number 123456789 is the page ID

Alternative Methods:

  • Page Information: Go to page → ⋯ (More actions) → Page Information → Look for "Page ID"
  • Edit URL: When editing a page, the URL contains the page ID
  • API Call: Use GET /api/v2/pages with title filter to find the ID
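The URL-based lookup described above can be automated. The following is a minimal sketch, not part of the tool itself: the helper name `page_id_from_url` is hypothetical, and the optional `edit-v2/` segment is an assumption about how edit URLs are shaped.

```python
import re
from typing import Optional

# Matches the numeric ID that follows "/pages/" in a page URL.
# The optional "edit-v2/" segment is an assumption about edit URLs.
_PAGE_ID_RE = re.compile(r"/pages/(?:edit-v2/)?(\d+)")


def page_id_from_url(url: str) -> Optional[str]:
    """Return the page ID embedded in a Confluence page URL, or None."""
    match = _PAGE_ID_RE.search(url)
    return match.group(1) if match else None


if __name__ == "__main__":
    example = "https://yoursite.atlassian.net/wiki/spaces/SPACE/pages/123456789/Page+Title"
    print(page_id_from_url(example))
```

The ID is returned as a string because the command-line flags accept it as text and it is never used arithmetically.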

Usage

The migration process consists of two steps:

Step 1: Read All Pages (Read-Only)

This step reads all pages, content, and attachments from the source Confluence and saves them locally. It's safe to run multiple times.

python main.py step1 \
  --source-url https://source.atlassian.net/wiki \
  --source-email your-email@example.com \
  --source-token YOUR_SOURCE_API_TOKEN \
  --source-parent-id 123456789

What it does:

  • Fetches the parent page and all its children recursively
  • Downloads all page content in Confluence storage format
  • Downloads all attachments to local attachments_download/ directory
  • Saves everything to migration_data.json for step 2
  • Shows statistics (total pages, attachments, etc.)
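The traversal in step 1 can be pictured roughly as follows. This is a sketch under stated assumptions, not the code in main.py: `fetch_page` and `fetch_children` are hypothetical callables standing in for the API requests, and the `{"pages": [...]}` layout of the saved file is assumed, since the real schema of migration_data.json is not documented here.

```python
import json
from typing import Callable, Iterable, List, Optional, Tuple


def collect_tree(
    root_id: str,
    fetch_page: Callable[[str], dict],
    fetch_children: Callable[[str], Iterable[str]],
) -> List[dict]:
    """Walk the page tree depth-first, listing each parent before its children."""
    pages: List[dict] = []
    stack: List[Tuple[str, Optional[str]]] = [(root_id, None)]
    while stack:
        page_id, parent_id = stack.pop()
        record = {**fetch_page(page_id), "id": page_id, "parent_id": parent_id}
        pages.append(record)
        # Push children in reverse so they are popped in their original order.
        for child_id in reversed(list(fetch_children(page_id))):
            stack.append((child_id, page_id))
    return pages


def save_migration_data(pages: List[dict], path: str = "migration_data.json") -> None:
    """Write the collected pages to disk (file layout is an assumption)."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump({"pages": pages}, handle, indent=2)
```

An explicit stack is used instead of recursion so that very deep hierarchies cannot hit Python's recursion limit. Recording `parent_id` for every page is what lets the publish step rebuild the hierarchy later.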

Step 2: Publish All Pages

This step publishes all the saved pages to the destination Confluence.

python main.py step2 \
  --dest-url https://destination.atlassian.net/wiki \
  --dest-email your-email@example.com \
  --dest-token YOUR_DEST_API_TOKEN \
  --dest-parent-id 987654321

What it does:

  • Creates the parent page under the specified destination parent
  • Creates all child pages maintaining the original hierarchy
  • Uploads all attachments
  • Ensures pages are published (not drafts)
  • Removes page restrictions to make them editable
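Preserving the hierarchy comes down to creating each parent before its children and remembering which new ID each source page received. The sketch below illustrates that idea only; `create_page` is a hypothetical callable standing in for the real API request, and the page records are assumed to carry `id`, `parent_id`, `title`, and `body` keys.

```python
from typing import Callable, Dict, List, Optional


def publish_tree(
    pages: List[dict],
    create_page: Callable[[str, str, str], str],
    dest_parent_id: str,
) -> Dict[str, str]:
    """Create pages parent-first and return a source-ID -> destination-ID map.

    Assumes `pages` lists every parent before its children.
    """
    id_map: Dict[str, str] = {}
    for page in pages:
        source_parent: Optional[str] = page["parent_id"]
        if source_parent is None:
            # The migrated root goes under the chosen destination parent.
            new_parent = dest_parent_id
        else:
            # Raises KeyError if the ordering assumption is violated.
            new_parent = id_map[source_parent]
        new_id = create_page(page["title"], page.get("body", ""), new_parent)
        id_map[page["id"]] = new_id
    return id_map
```

Failing loudly on a missing parent is a deliberate choice in this sketch: silently attaching an orphan to the destination root would flatten part of the tree without warning.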

Step 2: Dry Run (Recommended)

Test the publishing process without making actual changes:

python main.py step2 --dry-run \
  --dest-url https://destination.atlassian.net/wiki \
  --dest-email your-email@example.com \
  --dest-token YOUR_DEST_API_TOKEN \
  --dest-parent-id 987654321
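One common way to implement a dry run is to wrap the function that performs the write, so the rest of the publish logic runs unchanged. The sketch below shows that pattern; it is an illustration, not the tool's actual implementation, and `make_creator` and the placeholder ID format are hypothetical.

```python
import itertools
from typing import Callable, List, Tuple


def make_creator(
    real_create: Callable[[str, str, str], str],
    dry_run: bool,
) -> Tuple[Callable[[str, str, str], str], List[Tuple[str, str]]]:
    """Return a create function plus the list of planned (title, parent) pairs.

    In dry-run mode the real function is never called; each call is
    recorded and a placeholder ID is returned so children can still
    be linked to their parents.
    """
    planned: List[Tuple[str, str]] = []
    counter = itertools.count(1)

    def create(title: str, body: str, parent_id: str) -> str:
        if dry_run:
            planned.append((title, parent_id))
            return f"dry-run-{next(counter)}"
        return real_create(title, body, parent_id)

    return create, planned
```

Returning placeholder IDs matters because child pages need some parent ID to refer to, even when nothing has been created.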

Command Line Options

Step 1 (Read) Options

  • --source-url: Source Confluence base URL (required)
  • --source-email: Source API email (optional, can use env var)
  • --source-token: Source API token (required, can use env var)
  • --source-use-bearer: Use Bearer token auth instead of Basic Auth
  • --source-parent-id: Source parent page ID (required)

Step 2 (Publish) Options

  • --dest-url: Destination Confluence base URL (required)
  • --dest-email: Destination API email (optional, can use env var)
  • --dest-token: Destination API token (required, can use env var)
  • --dest-use-bearer: Use Bearer token auth instead of Basic Auth
  • --dest-parent-id: Destination parent page ID (required if no space key)
  • --dest-space-key: Destination space key (alternative to parent ID)
  • --dry-run: Simulate publishing without making changes
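The two authentication modes differ only in the `Authorization` header that is sent. As a rough illustration (the function name `auth_header` is hypothetical and the tool's internals may differ), Basic Auth encodes `email:token` in Base64, while Bearer auth sends the token on its own:

```python
import base64
from typing import Dict, Optional


def auth_header(token: str, email: Optional[str] = None, use_bearer: bool = False) -> Dict[str, str]:
    """Build the Authorization header for Basic or Bearer authentication."""
    if use_bearer:
        return {"Authorization": f"Bearer {token}"}
    if not email:
        raise ValueError("Basic Auth needs an email address as the username")
    credentials = f"{email}:{token}".encode("utf-8")
    encoded = base64.b64encode(credentials).decode("ascii")
    return {"Authorization": f"Basic {encoded}"}
```

This also explains why the email is needed for Basic Auth but plays no part when a Bearer flag is set.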

Common Options

  • --data-file: Migration data file (default: migration_data.json)
  • --attach-dir: Attachments directory (default: attachments_download)
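For readers who want to see how these options fit together, here is a sketch of an equivalent `argparse` setup. It mirrors the flags and defaults listed above, but it is a reconstruction rather than the parser in main.py. In particular, treating `--dest-parent-id` and `--dest-space-key` as mutually exclusive is an assumption, and `build_parser` is a hypothetical name.

```python
import argparse
import os
from typing import Mapping, Optional


def build_parser(env: Optional[Mapping[str, str]] = None) -> argparse.ArgumentParser:
    """Build a two-subcommand parser whose credentials fall back to env vars."""
    env = os.environ if env is None else env

    # Options shared by both steps.
    common = argparse.ArgumentParser(add_help=False)
    common.add_argument("--data-file", default="migration_data.json")
    common.add_argument("--attach-dir", default="attachments_download")

    parser = argparse.ArgumentParser(prog="main.py")
    sub = parser.add_subparsers(dest="command", required=True)

    step1 = sub.add_parser("step1", parents=[common])
    step1.add_argument("--source-url", required=True)
    step1.add_argument("--source-email", default=env.get("SOURCE_API_EMAIL"))
    step1.add_argument("--source-token", default=env.get("SOURCE_API_TOKEN"))
    step1.add_argument("--source-use-bearer", action="store_true")
    step1.add_argument("--source-parent-id", required=True)

    step2 = sub.add_parser("step2", parents=[common])
    step2.add_argument("--dest-url", required=True)
    step2.add_argument("--dest-email", default=env.get("DEST_API_EMAIL"))
    step2.add_argument("--dest-token", default=env.get("DEST_API_TOKEN"))
    step2.add_argument("--dest-use-bearer", action="store_true")
    step2.add_argument("--dry-run", action="store_true")
    # Assumption: exactly one destination target must be given.
    target = step2.add_mutually_exclusive_group(required=True)
    target.add_argument("--dest-parent-id")
    target.add_argument("--dest-space-key")
    return parser
```

Because the tokens default to environment variables instead of being marked `required=True`, a caller would still need to check after parsing that a token was actually supplied by one route or the other.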

Environment Variables

For security, you can use environment variables for sensitive data:

# Set environment variables (recommended for security)
export SOURCE_API_TOKEN="ATATT3xFfGF0KFkgI5j5DegaKJ_Dpt0FHWPrrL0vGk8j..."
export SOURCE_API_EMAIL="your-email@example.com"
export DEST_API_TOKEN="ATATT3xFfGF0KFkgI5j5DegaKJ_Dpt0FHWPrrL0vGk8j..."
export DEST_API_EMAIL="your-email@example.com"

# Then run without the token and email flags
python main.py step1 --source-url https://source.atlassian.net/wiki --source-parent-id 123456789

Creating a .env file (Alternative):

Create a .env file in the project directory:

# .env file (add to .gitignore!)
SOURCE_API_TOKEN=ATATT3xFfGF0KFkgI5j5DegaKJ_Dpt0FHWPrrL0vGk8j...
SOURCE_API_EMAIL=your-email@example.com
DEST_API_TOKEN=ATATT3xFfGF0KFkgI5j5DegaKJ_Dpt0FHWPrrL0vGk8j...
DEST_API_EMAIL=your-email@example.com

Then load it before running:

# Load environment variables from .env file
# (set -a exports every variable assigned while it is active; comment lines are ignored)
set -a; source .env; set +a
python main.py step1 --source-url https://source.atlassian.net/wiki --source-parent-id 123456789
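If you would rather not depend on shell behaviour, a .env file of the simple `KEY=value` form shown above can also be read from Python. The following is a minimal sketch using only the standard library; the function names are hypothetical, and it deliberately handles only plain assignments, comments, blank lines, an optional `export` prefix, and surrounding quotes.

```python
import os
from typing import Dict, Iterable, MutableMapping


def parse_env_lines(lines: Iterable[str]) -> Dict[str, str]:
    """Parse KEY=value lines, skipping blanks and '#' comments."""
    values: Dict[str, str] = {}
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("export "):
            line = line[len("export "):]
        key, separator, value = line.partition("=")
        if not separator:
            continue  # not an assignment; ignore it
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def load_env_file(path: str = ".env", environ: MutableMapping[str, str] = os.environ) -> None:
    """Load a .env file without overriding variables that are already set."""
    with open(path, encoding="utf-8") as handle:
        for key, value in parse_env_lines(handle).items():
            environ.setdefault(key, value)
```

Using `setdefault` means a variable exported in the shell takes precedence over the file, which is usually what you want when overriding a value for a single run.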

Examples

Complete Migration Example

# Step 1: Read all pages from source
python main.py step1 \
  --source-url https://mycompany.atlassian.net/wiki \
  --source-email john.doe@mycompany.com \
  --source-token ATATT3xFfGF0... \
  --source-parent-id 2280718350

# Step 2: Test with dry run
python main.py step2 --dry-run \
  --dest-url https://newcompany.atlassian.net/wiki \
  --dest-email john.doe@newcompany.com \
  --dest-token ATATT3xFfGF0... \
  --dest-parent-id 4556062765

# Step 2 (for real): Actual migration
python main.py step2 \
  --dest-url https://newcompany.atlassian.net/wiki \
  --dest-email john.doe@newcompany.com \
  --dest-token ATATT3xFfGF0... \
  --dest-parent-id 4556062765

Using Space Root Instead of Parent Page

python main.py step2 \
  --dest-url https://destination.atlassian.net/wiki \
  --dest-email your-email@example.com \
  --dest-token YOUR_TOKEN \
  --dest-space-key MYSPACE

File Structure

After running the tool, you'll have:

confluence_migration/
├── main.py                  # Main migration script
├── requirements.txt         # Python dependencies
├── .python-version          # Python version (3.13.9)
├── .gitignore               # Git ignore rules
├── migration_data.json      # Saved migration data (created by step1)
├── attachments_download/    # Downloaded attachments (created by step1)
│   ├── page_123456/
│   │   ├── image1.png
│   │   └── document.pdf
│   └── page_789012/
│       └── diagram.jpg
└── README.md                # This file
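The attachment layout above, one `page_<id>/` folder per source page, can be expressed as a small path helper. This sketch is inferred from the directory listing, not taken from main.py, and `attachment_path` is a hypothetical name.

```python
from pathlib import Path


def attachment_path(attach_dir: str, page_id: str, filename: str) -> Path:
    """Return <attach_dir>/page_<page_id>/<filename> for a downloaded attachment."""
    # Keep only the final path component so a crafted attachment name
    # cannot escape the page folder.
    safe_name = Path(filename).name
    if safe_name in ("", ".", ".."):
        raise ValueError(f"unusable attachment file name: {filename!r}")
    return Path(attach_dir) / f"page_{page_id}" / safe_name
```

Grouping files by page ID keeps attachments with identical names on different pages from overwriting each other.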

Troubleshooting

Pages Appear in Edit Mode

If migrated pages appear stuck in edit mode:

  1. Check that your user has 'Edit' permission in the destination space
  2. Verify no page-level restrictions are inherited from parent pages
  3. Try refreshing the page or clearing browser cache
  4. Check Confluence collaborative editing settings

Attachment Issues

If attachments don't display properly:

  1. Verify attachments were downloaded in step 1 (check attachments_download/ directory)
  2. Ensure the destination space allows file uploads
  3. Check file size limits in destination Confluence
  4. Verify attachment file types are allowed

Permission Errors

If you get permission errors:

  1. Verify API tokens are valid and not expired
  2. Check that your user has appropriate permissions:
    • Source: 'View' permission for pages and space
    • Destination: 'Create' and 'Edit' permissions in space
  3. Ensure you're using the correct space key and page IDs

Rate Limiting

The tool automatically handles rate limiting, but if you encounter issues:

  1. The tool will automatically wait and retry when rate limited
  2. You can run step 1 multiple times safely
  3. If step 2 fails partway through, you can re-run it (it will handle duplicates)
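The wait-and-retry behaviour can be sketched generically. The example below is an assumption about how such handling typically works, not the tool's actual code: it treats HTTP 429 as the rate-limit signal, honours a numeric `Retry-After` header when one is present, and otherwise falls back to exponential backoff. `send` is a hypothetical callable returning `(status, headers, body)`.

```python
import time
from typing import Any, Callable, Mapping, Tuple

Response = Tuple[int, Mapping[str, str], Any]


def call_with_retry(
    send: Callable[[], Response],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, Any]:
    """Call `send` until it returns a status other than 429 (rate limited)."""
    for attempt in range(max_attempts):
        status, headers, body = send()
        if status != 429:
            return status, body
        if attempt == max_attempts - 1:
            break  # no point sleeping after the final attempt
        try:
            # Prefer the server's hint when it is a plain number of seconds.
            delay = float(headers.get("Retry-After"))
        except (TypeError, ValueError):
            # Header missing or not numeric: back off exponentially.
            delay = base_delay * (2 ** attempt)
        sleep(delay)
    raise RuntimeError(f"still rate limited after {max_attempts} attempts")
```

Passing `sleep` as a parameter keeps the function easy to test without real waiting.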

Duplicate Page Titles

If pages with the same title already exist in the destination:

  1. The tool automatically appends timestamps: Page Title (20241120_153327)
  2. This ensures no conflicts while preserving the original content
  3. You can manually rename pages after migration if needed
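The timestamp suffix in the example above corresponds to the `strftime` pattern `%Y%m%d_%H%M%S`. A minimal sketch of the renaming rule, assuming the set of existing titles is already known (the helper name `unique_title` is hypothetical):

```python
from datetime import datetime
from typing import Collection, Optional


def unique_title(title: str, existing_titles: Collection[str], now: Optional[datetime] = None) -> str:
    """Return `title`, or `title (YYYYMMDD_HHMMSS)` if it is already taken."""
    if title not in existing_titles:
        return title
    stamp = (now or datetime.now()).strftime("%Y%m%d_%H%M%S")
    return f"{title} ({stamp})"


if __name__ == "__main__":
    taken = {"Page Title"}
    print(unique_title("Page Title", taken, now=datetime(2024, 11, 20, 15, 33, 27)))
```

With second-level resolution, two duplicates of the same title created within the same second would still collide, so a real implementation may need an extra check.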

API Documentation

This tool uses the Confluence Cloud REST API v2 for all operations.

Security Notes

  • API tokens are sensitive - use environment variables in production
  • The tool creates published pages by default
  • Page restrictions are removed to ensure editability
  • All API calls use HTTPS

Limitations

  • Only works with Confluence Cloud (not Server/Data Center)
  • Requires API tokens (not username/password)
  • Large migrations may take time due to API rate limits
  • Some advanced page features may not migrate perfectly
  • Comments and page history are not migrated

Support

For issues or questions:

  1. Check the troubleshooting section above
  2. Verify your API tokens and permissions
  3. Test with a small page hierarchy first
  4. Use dry-run mode to test before actual migration

License

This tool is provided as-is for migration purposes. Please test thoroughly before using in production environments.
