Confluence Migration Tool

A comprehensive Python tool for migrating Confluence pages, including content, attachments, and hierarchical structure, from one Confluence instance to another. The tool uses a two-step process to ensure safe and reliable migration.

Features

Two-Step Migration Process: Separate read and publish phases for safe testing
Complete Page Migration: Migrates page content, attachments, and hierarchical structure
Parent Page Inclusion: Migrates the parent page along with all its children
Duplicate Handling: Automatically handles duplicate page titles with timestamps
Attachment Support: Downloads and uploads all page attachments (images, documents, etc.)
Published Pages: Creates pages as published (not drafts) with proper edit capabilities
Dry Run Mode: Test publishing without making actual changes
Rate Limiting: Handles Confluence API rate limits automatically
Error Recovery: Robust error handling with detailed logging

Requirements

Python 3.11+ (configured with pyenv)
Confluence Cloud instances (source and destination)
API tokens for both Confluence instances
Appropriate permissions in both spaces

Installation

Clone or download the project, then change into its directory:
```
cd confluence_migration
```

Set up Python environment:

```
# Python 3.13.9 is configured via .python-version
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
```

Install dependencies:
```
pip install -r requirements.txt
```

Configuration

The tool uses command-line arguments for all configuration. No hardcoded values need to be changed.

Required Information

Before running the migration, gather the following information:

Source Confluence

Base URL: Your source Confluence URL (e.g., https://source.atlassian.net/wiki)
API Token: Generate from Atlassian Account Settings → Security → API tokens
Email: Your Atlassian account email
Parent Page ID: The ID of the page at the top of the hierarchy to migrate (the page itself and all of its children are migrated)

Destination Confluence

Base URL: Your destination Confluence URL (e.g., https://destination.atlassian.net/wiki)
API Token: Generate from Atlassian Account Settings → Security → API tokens
Email: Your Atlassian account email
Parent Page ID: The ID of the page under which migrated pages will be created
Space Key: Alternative to the Parent Page ID; pages are created at the root of the given space

Creating API Tokens

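API tokens are required to authenticate with Confluence. You need one token for each Atlassian account involved: separate tokens if the source and destination instances use different accounts, or a single token if the same account can access both.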

Step-by-Step Guide:

Log into your Atlassian account:
- Go to https://id.atlassian.com/manage-profile/security/api-tokens
- Or navigate to: Atlassian Account → Settings → Security → API tokens
Create a new token:
- Click "Create API token"
- Enter a descriptive label (e.g., "Confluence Migration Tool - Source" or "Confluence Migration - Destination")
- Click "Create"
Copy and save the token:
- ⚠️ Important: Copy the token immediately - you won't be able to see it again
- Store it securely (password manager, environment variables, etc.)
- The token format looks like: ATATT3xFfGF0KFkgI5j5DegaKJ_Dpt0FHWPrrL0vGk8j...
Repeat for each Confluence instance:
- Create separate tokens for source and destination if they're different Atlassian accounts
- If using the same account for both, you can use the same token

Token Security Best Practices:

Never commit tokens to version control
Use environment variables for production usage
Create descriptive labels to identify token purposes
Regularly rotate tokens (delete old ones, create new ones)
Use separate tokens for different tools/purposes
Revoke unused tokens from the same security page

Required Permissions:

Your Atlassian account must have:

Source Confluence: 'View' permission for the space and pages
Destination Confluence: 'Create' and 'Edit' permissions in the target space
Space Admin permissions may be required for some operations

Finding Page IDs

To find a page ID:

Open the page in Confluence
Look at the URL: https://yoursite.atlassian.net/wiki/spaces/SPACE/pages/123456789/Page+Title
The number 123456789 is the page ID

Alternative Methods:

Page Information: Go to page → ⋯ (More actions) → Page Information → Look for "Page ID"
Edit URL: When editing a page, the URL contains the page ID
API Call: Use GET /api/v2/pages with title filter to find the ID
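
For the API route, the request URL is the base URL followed by `/api/v2/pages` and a `title` query parameter. The sketch below only builds that URL; `build_page_lookup_url` is a hypothetical helper, and the request itself still needs the same authentication as the migration commands.

```python
from urllib.parse import urlencode


def build_page_lookup_url(base_url: str, title: str) -> str:
    """Build the REST API v2 URL that lists pages matching a title."""
    query = urlencode({"title": title})  # spaces and symbols are escaped here
    return f"{base_url.rstrip('/')}/api/v2/pages?{query}"


print(build_page_lookup_url("https://yoursite.atlassian.net/wiki", "Page Title"))
# https://yoursite.atlassian.net/wiki/api/v2/pages?title=Page+Title
```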

Usage

The migration process consists of two steps:

Step 1: Read All Pages (Read-Only)

This step reads all pages, content, and attachments from the source Confluence and saves them locally. It only reads from the source and writes local files, so it is safe to run multiple times.

```
python main.py step1 \
  --source-url https://source.atlassian.net/wiki \
  --source-email your-email@example.com \
  --source-token YOUR_SOURCE_API_TOKEN \
  --source-parent-id 123456789
```

What it does:

Fetches the parent page and all its children recursively
Downloads all page content in Confluence storage format
Downloads all attachments to local attachments_download/ directory
Saves everything to migration_data.json for step 2
Shows statistics (total pages, attachments, etc.)

Step 2: Publish All Pages

This step publishes all the saved pages to the destination Confluence.

```
python main.py step2 \
  --dest-url https://destination.atlassian.net/wiki \
  --dest-email your-email@example.com \
  --dest-token YOUR_DEST_API_TOKEN \
  --dest-parent-id 987654321
```

What it does:

Creates the parent page under the specified destination parent
Creates all child pages maintaining the original hierarchy
Uploads all attachments
Ensures pages are published (not drafts)
Removes page restrictions to make them editable
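
Keeping the hierarchy intact means each page must be created after its parent, under the parent's new ID in the destination. A minimal sketch of that ordering follows, assuming the saved pages are listed parents-first; `publish_tree` and the `create_page` callback are hypothetical names, and the `dry_run` branch only illustrates the idea of simulating without writing.

```python
from typing import Callable


def publish_tree(
    pages: list[dict],
    dest_parent_id: str,
    create_page: Callable[[str, str], str],
    dry_run: bool = False,
) -> dict[str, str]:
    """Create pages parent-first; return a source-ID to destination-ID map."""
    id_map: dict[str, str] = {}
    for page in pages:  # assumed to be ordered with parents before children
        source_parent = page["parent_id"]
        # The root goes under the chosen destination parent; every other page
        # goes under the already-created copy of its source parent.
        new_parent = dest_parent_id if source_parent is None else id_map[source_parent]
        if dry_run:
            new_id = f"dry-run-{page['id']}"  # placeholder, nothing is created
        else:
            new_id = create_page(page["title"], new_parent)
        id_map[page["id"]] = new_id
    return id_map


created: list[tuple[str, str]] = []


def fake_create(title: str, parent_id: str) -> str:
    """Stand-in for the real API call; records what would be created."""
    created.append((title, parent_id))
    return f"new-{len(created)}"


pages = [
    {"id": "1", "parent_id": None, "title": "Root"},
    {"id": "2", "parent_id": "1", "title": "Child"},
]
id_map = publish_tree(pages, "987654321", fake_create)
print(id_map)   # {'1': 'new-1', '2': 'new-2'}
print(created)  # [('Root', '987654321'), ('Child', 'new-1')]
```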

Step 2: Dry Run (Recommended)

Test the publishing process without making actual changes:

```
python main.py step2 --dry-run \
  --dest-url https://destination.atlassian.net/wiki \
  --dest-email your-email@example.com \
  --dest-token YOUR_DEST_API_TOKEN \
  --dest-parent-id 987654321
```

Command Line Options

Step 1 (Read) Options

--source-url: Source Confluence base URL (required)
--source-email: Source API email (optional; falls back to the SOURCE_API_EMAIL environment variable)
--source-token: Source API token (required unless the SOURCE_API_TOKEN environment variable is set)
--source-use-bearer: Use Bearer token auth instead of Basic Auth
--source-parent-id: Source parent page ID (required)

Step 2 (Publish) Options

--dest-url: Destination Confluence base URL (required)
--dest-email: Destination API email (optional; falls back to the DEST_API_EMAIL environment variable)
--dest-token: Destination API token (required unless the DEST_API_TOKEN environment variable is set)
--dest-use-bearer: Use Bearer token auth instead of Basic Auth
--dest-parent-id: Destination parent page ID (required if no space key)
--dest-space-key: Destination space key (alternative to parent ID)
--dry-run: Simulate publishing without making changes

Common Options

--data-file: Migration data file (default: migration_data.json)
--attach-dir: Attachments directory (default: attachments_download)

Environment Variables

For security, you can use environment variables for sensitive data:

```
# Set environment variables (recommended for security)
export SOURCE_API_TOKEN="ATATT3xFfGF0KFkgI5j5DegaKJ_Dpt0FHWPrrL0vGk8j..."
export SOURCE_API_EMAIL="your-email@example.com"
export DEST_API_TOKEN="ATATT3xFfGF0KFkgI5j5DegaKJ_Dpt0FHWPrrL0vGk8j..."
export DEST_API_EMAIL="your-email@example.com"

# Then run without --source-token and --dest-token flags
python main.py step1 --source-url https://source.atlassian.net/wiki --source-parent-id 123456789
```

Creating a .env file (Alternative):

Create a .env file in the project directory:

```
# .env file (add to .gitignore!)
SOURCE_API_TOKEN=ATATT3xFfGF0KFkgI5j5DegaKJ_Dpt0FHWPrrL0vGk8j...
SOURCE_API_EMAIL=your-email@example.com
DEST_API_TOKEN=ATATT3xFfGF0KFkgI5j5DegaKJ_Dpt0FHWPrrL0vGk8j...
DEST_API_EMAIL=your-email@example.com
```

Then load it before running:

```
# Load and export every variable in .env (comment lines are skipped)
set -a; . ./.env; set +a
python main.py step1 --source-url https://source.atlassian.net/wiki --source-parent-id 123456789
```
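
The same file can also be read from Python instead of the shell. This is a minimal sketch, assuming simple `KEY=VALUE` lines; `parse_env_lines` is a hypothetical helper and is not part of `main.py`. It skips blank lines and `#` comment lines, such as the first line of the sample file.

```python
def parse_env_lines(lines: list[str]) -> dict[str, str]:
    """Parse KEY=VALUE lines, ignoring blanks, comments, and malformed lines."""
    values: dict[str, str] = {}
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


sample = [
    "# .env file (add to .gitignore!)",
    "SOURCE_API_EMAIL=your-email@example.com",
    'DEST_API_EMAIL="your-email@example.com"',
    "",
]
print(parse_env_lines(sample))
# {'SOURCE_API_EMAIL': 'your-email@example.com', 'DEST_API_EMAIL': 'your-email@example.com'}
```

The result can be merged into the process environment with `os.environ.update(...)` before the arguments are parsed.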

Examples

Complete Migration Example

```
# Step 1: Read all pages from source
python main.py step1 \
  --source-url https://mycompany.atlassian.net/wiki \
  --source-email john.doe@mycompany.com \
  --source-token ATATT3xFfGF0... \
  --source-parent-id 2280718350

# Step 2 (dry run): Test publishing without making changes
python main.py step2 --dry-run \
  --dest-url https://newcompany.atlassian.net/wiki \
  --dest-email john.doe@newcompany.com \
  --dest-token ATATT3xFfGF0... \
  --dest-parent-id 4556062765

# Step 2 (actual run): Publish to the destination
python main.py step2 \
  --dest-url https://newcompany.atlassian.net/wiki \
  --dest-email john.doe@newcompany.com \
  --dest-token ATATT3xFfGF0... \
  --dest-parent-id 4556062765
```

Using Space Root Instead of Parent Page

```
python main.py step2 \
  --dest-url https://destination.atlassian.net/wiki \
  --dest-email your-email@example.com \
  --dest-token YOUR_TOKEN \
  --dest-space-key MYSPACE
```

File Structure

After running the tool, you'll have:

```
confluence_migration/
├── main.py                  # Main migration script
├── requirements.txt         # Python dependencies
├── .python-version          # Python version (3.13.9)
├── .gitignore               # Git ignore rules
├── migration_data.json      # Saved migration data (created by step1)
├── attachments_download/    # Downloaded attachments (created by step1)
│   ├── page_123456/
│   │   ├── image1.png
│   │   └── document.pdf
│   └── page_789012/
│       └── diagram.jpg
└── README.md                # This file
```

Troubleshooting

Pages Appear in Edit Mode

If migrated pages appear stuck in edit mode:

Check that your user has 'Edit' permission in the destination space
Verify no page-level restrictions are inherited from parent pages
Try refreshing the page or clearing browser cache
Check Confluence collaborative editing settings

Attachment Issues

If attachments don't display properly:

Verify attachments were downloaded in step 1 (check attachments_download/ directory)
Ensure the destination space allows file uploads
Check file size limits in destination Confluence
Verify attachment file types are allowed

Permission Errors

If you get permission errors:

Verify API tokens are valid and not expired
Check that your user has appropriate permissions:
- Source: 'View' permission for pages and space
- Destination: 'Create' and 'Edit' permissions in space
Ensure you're using the correct space keys and page IDs

Rate Limiting

The tool waits and retries automatically when the API rate-limits it. If you still encounter issues:

You can run step 1 multiple times safely
If step 2 fails partway through, you can re-run it; titles that already exist in the destination are handled as described under Duplicate Page Titles
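
The wait-and-retry behavior can be pictured as follows. This sketch is not taken from the tool's source: it assumes rate limiting is signaled with HTTP status 429, honors a numeric `Retry-After` header when present, and otherwise backs off exponentially. `call_with_retry` and the `send` callback are hypothetical names.

```python
import time
from typing import Callable

Response = tuple[int, dict[str, str], str]  # (status code, headers, body)


def call_with_retry(
    send: Callable[[], Response],
    max_attempts: int = 5,
    base_wait: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call send() until it is not rate limited or attempts run out."""
    for attempt in range(max_attempts):
        response = send()
        status, headers, _body = response
        if status != 429:
            return response
        if attempt == max_attempts - 1:
            break  # out of attempts, do not sleep again
        try:
            wait = float(headers["Retry-After"])
        except (KeyError, ValueError):
            wait = base_wait * 2 ** attempt  # exponential backoff fallback
        sleep(wait)
    raise RuntimeError(f"still rate limited after {max_attempts} attempts")


# Demo with canned responses and a fake sleep that records the waits.
responses = iter([(429, {"Retry-After": "2"}, ""), (429, {}, ""), (200, {}, "ok")])
waits: list[float] = []
result = call_with_retry(lambda: next(responses), sleep=waits.append)
print(result, waits)  # (200, {}, 'ok') [2.0, 2.0]
```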

Duplicate Page Titles

If pages with the same title already exist in the destination:

The tool automatically appends timestamps: Page Title (20241120_153327)
This ensures no conflicts while preserving the original content
You can manually rename pages after migration if needed

API Documentation

This tool uses the Confluence Cloud REST API v2 for all operations.

Security Notes

API tokens are sensitive - use environment variables in production
The tool creates published pages by default
Page restrictions are removed to ensure editability
All API calls use HTTPS

Limitations

Only works with Confluence Cloud (not Server/Data Center)
Requires API tokens (not username/password)
Large migrations may take time due to API rate limits
Some advanced page features may not migrate perfectly
Comments and page history are not migrated

Support

For issues or questions:

Check the troubleshooting section above
Verify your API tokens and permissions
Test with a small page hierarchy first
Use dry-run mode to test before actual migration

License

This tool is provided as-is for migration purposes. Please test thoroughly before using in production environments.
