# Enterprise Edition - Basic Redundancy with Replication Groups (Read-Only)
## Practical Setup Guide (No Failover)

This Snowflake Notebook is a practical checklist for setting up **basic redundancy** across Snowflake accounts using **replication groups**.

- **Goal**: create **read-only replicas** of one or more databases in one or more target accounts.
- **Works on**: Standard and Enterprise editions.

If you need **business continuity / failover** (promote a target to primary, replicate account objects like users/roles/warehouses, etc.), use the separate Business Critical guide:
- `tools/replication-workbook/business_critical_business_continuity_guide.ipynb`

### What this guide covers
- Replication groups for **databases** (and optionally shares)
- Scheduled refresh (RPO)
- Creating a **secondary replication group** in the target account
- Manual refresh and monitoring (progress/history/usage)

### What this guide does NOT cover
- Failover groups and promotion (Business Critical edition)
- Replicating account objects beyond databases/shares (Business Critical edition)

---
## Prerequisites (Enterprise / Standard)

Before starting, verify you have:

### Account prerequisites
- [ ] Source and target accounts are in the same Snowflake organization
- [ ] ORGADMIN has enabled replication for both accounts
- [ ] In the source account, you have a role with:
  - CREATE REPLICATION GROUP on the account (ACCOUNTADMIN has this by default)
  - MONITOR on each database you plan to include
- [ ] If replicating shares, you have OWNERSHIP on each share you plan to include

### Database prerequisites
- [ ] Databases to replicate are not created from a share
- [ ] Databases are permanent or transient (not temporary)
- [ ] You understand replicas in target accounts are read-only (basic redundancy)

### Network / region prerequisites
- [ ] Target accounts can receive replication traffic
- [ ] If replicating across regions/clouds, those regions are supported for replication

### Business Critical-only capabilities (not covered here)
- Failover groups / promotion to primary
- Replication of users/roles/warehouses/network policies/account parameters

---
## Step 1 (Source account): Discover replication-enabled accounts

In the source account, identify which accounts in your organization are enabled for replication. You will use these values in `ALLOWED_ACCOUNTS` when creating the replication group.

In [None]:
from snowflake.snowpark.context import get_active_session

session = get_active_session()

accounts_df = session.sql("SHOW REPLICATION ACCOUNTS").to_pandas()

# Defensive normalization: Snowflake Notebooks sometimes return column names wrapped in quotes.
# Example: '"snowflake_region"' instead of 'snowflake_region'.
accounts_df.columns = [str(c).strip().replace('"', '').lower() for c in accounts_df.columns]

print("\n=== Replication-enabled accounts ===")

expected_cols = ["snowflake_region", "account_name", "organization_name"]
existing_cols = [c for c in expected_cols if c in accounts_df.columns]
missing_cols = [c for c in expected_cols if c not in accounts_df.columns]

if missing_cols:
    print("\nWarning: expected columns missing:", missing_cols)
    print("Available columns:", list(accounts_df.columns))

if existing_cols:
    print(accounts_df[existing_cols].to_string(index=False))
else:
    print(accounts_df.to_string(index=False))

print(f"\nTotal accounts available: {len(accounts_df)}")

### 📋 Checklist: Account Discovery
- [ ] All target accounts appear in the list above
- [ ] Note the organization_name and account_name for target accounts
- [ ] Verify regions support your replication requirements
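
The `<organization_name>.<account_name>` values needed for `ALLOWED_ACCOUNTS` can be assembled from the Step 1 output. The following is a minimal sketch that assumes the rows are available as plain dictionaries keyed by the normalized column names (for example via `accounts_df.to_dict("records")`). The helper name `account_identifiers` and the sample rows are hypothetical.

```python
def account_identifiers(rows):
    """Return sorted '<org>.<account>' identifiers from SHOW REPLICATION ACCOUNTS rows.

    `rows` is a list of dicts keyed by the normalized (lowercase) column names.
    Rows that lack either value are skipped.
    """
    identifiers = set()
    for row in rows:
        org = (row.get("organization_name") or "").strip()
        account = (row.get("account_name") or "").strip()
        if org and account:
            identifiers.add(f"{org}.{account}")
    return sorted(identifiers)


# Hypothetical sample rows; in the notebook these would come from accounts_df.
sample_rows = [
    {"organization_name": "MYORG", "account_name": "SOURCE_ACCOUNT"},
    {"organization_name": "MYORG", "account_name": "TARGET_ACCOUNT_1"},
    {"organization_name": "MYORG", "account_name": None},
]
print(account_identifiers(sample_rows))
```

The returned strings can be pasted into `target_accounts` in Step 2 after removing the source account from the list.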

---
## Step 2: Create a Replication Group

A replication group defines which objects to replicate and which accounts can receive them.

In [None]:
replication_group_name = "MY_REPLICATION_GROUP"

# Use the <organization_name>.<account_name> form, taken from the
# SHOW REPLICATION ACCOUNTS output in Step 1:
#   - organization_name from the 'organization_name' column
#   - account_name from the 'account_name' column
source_account_identifier = "MYORG.SOURCE_ACCOUNT"
target_accounts = [
    "MYORG.TARGET_ACCOUNT_1",
    # "MYORG.TARGET_ACCOUNT_2",
]

databases_to_replicate = ["MYDB"]

primary_group_identifier = f"{source_account_identifier}.{replication_group_name}"

print("Configuration:")
print(f"  Replication group name (source): {replication_group_name}")
print(f"  Primary group identifier: {primary_group_identifier}")
print(f"  Target accounts: {', '.join(target_accounts)}")
print(f"  Databases: {', '.join(databases_to_replicate)}")
print("\nReview the configuration above before executing the next cell.")

In [None]:
target_list = ", ".join(target_accounts)
create_sql = f"""
CREATE REPLICATION GROUP IF NOT EXISTS {replication_group_name}
  OBJECT_TYPES = DATABASES
  ALLOWED_DATABASES = ({', '.join(databases_to_replicate)})
  ALLOWED_ACCOUNTS = ({target_list})
  REPLICATION_SCHEDULE = '10 MINUTE';
""".strip()

print("Executing in SOURCE account:\n")
print(create_sql)

session.sql(create_sql).collect()

print("\nCreated (or already existed):", replication_group_name)

# Optional quick visibility check
show_df = session.sql(f"SHOW REPLICATION GROUPS LIKE '{replication_group_name}'").to_pandas()
show_df.columns = [str(c).strip().replace('"', '').lower() for c in show_df.columns]
print("\nSHOW REPLICATION GROUPS (filtered):")
print(show_df.to_string(index=False))

### 📋 Checklist: Replication Group Creation
- [ ] Replication group created without errors
- [ ] Replication schedule configured (10 minutes in example)
- [ ] All target accounts are specified
- [ ] All databases to replicate are included
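
Because the `CREATE REPLICATION GROUP` statement is assembled with an f-string, a typo in a database or account name only surfaces when Snowflake rejects the statement. The sketch below validates the names before building the SQL. It assumes plain unquoted identifiers only (letters, digits, `_`, `$`, not starting with a digit); quoted identifiers would need different handling. The helper names `check_identifier` and `build_create_group_sql` are hypothetical.

```python
import re

# Assumption: only plain unquoted identifiers are used.
_UNQUOTED = re.compile(r"[A-Za-z_][A-Za-z0-9_$]*")


def check_identifier(name, parts):
    """Return `name` if it has exactly `parts` dot-separated unquoted identifiers."""
    pieces = name.split(".")
    if len(pieces) != parts or not all(_UNQUOTED.fullmatch(p) for p in pieces):
        raise ValueError(f"Unexpected identifier: {name!r}")
    return name


def build_create_group_sql(group, databases, accounts, schedule="10 MINUTE"):
    """Build the primary-group statement used in Step 2 from validated names."""
    if not databases or not accounts:
        raise ValueError("At least one database and one target account are required")
    if "'" in schedule:
        raise ValueError("Schedule must not contain a single quote")
    check_identifier(group, parts=1)
    dbs = ", ".join(check_identifier(d, parts=1) for d in databases)
    accts = ", ".join(check_identifier(a, parts=2) for a in accounts)
    return (
        f"CREATE REPLICATION GROUP IF NOT EXISTS {group}\n"
        f"  OBJECT_TYPES = DATABASES\n"
        f"  ALLOWED_DATABASES = ({dbs})\n"
        f"  ALLOWED_ACCOUNTS = ({accts})\n"
        f"  REPLICATION_SCHEDULE = '{schedule}'"
    )


print(build_create_group_sql(
    "MY_REPLICATION_GROUP", ["MYDB"], ["MYORG.TARGET_ACCOUNT_1"]
))
```

The result can be passed to `session.sql(...)` in place of the hand-built string.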

---
## Step 3: Business continuity (failover) is Business Critical-only

This guide intentionally stops at **read-only replicas** (basic redundancy).

If you need the ability to **promote** a target account to become primary (read-write) during an outage, you must use **failover groups** (Business Critical edition or higher).

Use:
- `tools/replication-workbook/business_critical_business_continuity_guide.ipynb`

In [None]:
print("This guide is Enterprise/Standard focused (basic redundancy, read-only replicas).")
print("\nFailover and account-object replication (users/roles/warehouses/etc.) require Business Critical edition and use FAILOVER GROUPS.")
print("\nUse: tools/replication-workbook/business_critical_business_continuity_guide.ipynb")

### Checklist: If you need failover
- [ ] Confirm the account is Business Critical (or higher)
- [ ] Use the Business Critical guide to create a FAILOVER GROUP and test promotion
- [ ] Document RPO/RTO and the operational failover runbook

---
## Step 4: View Replication Groups

Verify the replication group configuration.

In [None]:
groups_df = session.sql("SHOW REPLICATION GROUPS").to_pandas()
groups_df.columns = [str(c).strip().replace('"', '').lower() for c in groups_df.columns]

print("\n=== Replication groups (this account) ===")
if groups_df.empty:
    print("No replication groups found")
else:
    cols = [c for c in ["name", "type", "snowflake_region", "allowed_accounts", "replication_schedule", "is_primary", "primary"] if c in groups_df.columns]
    print(groups_df[cols].to_string(index=False))

print("\n=== Databases in this replication group (source account view) ===")
try:
    members_df = session.sql(f"SHOW DATABASES IN REPLICATION GROUP {replication_group_name}").to_pandas()
    members_df.columns = [str(c).strip().replace('"', '').lower() for c in members_df.columns]
    print(members_df.to_string(index=False))
except Exception as e:
    print("Could not list group membership (this can fail if the group does not exist yet):")
    print(e)

### 📋 Checklist: Verification
- [ ] Replication group appears in SHOW output
- [ ] Primary databases are listed
- [ ] Allowed accounts match your configuration
- [ ] Replication schedule is correct

---
## Step 5 (Target account): Create the secondary replication group

Execute this section in each TARGET account to create the **secondary replication group**. This creates the read-only replicas in the target account.

### Important notes
- Sign in to the target account before running.
- Use the same group name in target as in source (recommended).
- The source group identifier must be in the form `<org_name>.<source_account_name>.<group_name>`.
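
The three-part identifier form above can be checked before it is interpolated into the `AS REPLICA OF` clause. This is a minimal sketch that assumes none of the parts is a quoted identifier containing a dot. `GroupIdentifier`, `parse_group_identifier`, and `build_secondary_sql` are hypothetical names.

```python
from typing import NamedTuple


class GroupIdentifier(NamedTuple):
    organization: str
    account: str
    group: str


def parse_group_identifier(identifier):
    """Split '<org_name>.<source_account_name>.<group_name>' into its parts."""
    parts = [p.strip() for p in identifier.split(".")]
    if len(parts) != 3 or not all(parts):
        raise ValueError(
            f"Expected <org_name>.<source_account_name>.<group_name>, got {identifier!r}"
        )
    return GroupIdentifier(*parts)


def build_secondary_sql(identifier, secondary_name=None):
    """Build the secondary-group statement; defaults to the primary group's name."""
    parsed = parse_group_identifier(identifier)
    name = secondary_name or parsed.group
    return (
        f"CREATE REPLICATION GROUP IF NOT EXISTS {name}\n"
        f"  AS REPLICA OF {parsed.organization}.{parsed.account}.{parsed.group}"
    )


print(build_secondary_sql("MYORG.SOURCE_ACCOUNT.MY_REPLICATION_GROUP"))
```

Defaulting the secondary name to the primary group's name follows the recommendation to keep both names identical.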

In [None]:
print("\n=== TARGET ACCOUNT: Create a secondary replication group ===")
print("\nRun this cell in the TARGET account (the account that will receive the read-only replicas).\n")

# IMPORTANT: If you run this cell directly in the TARGET account (without first running
# the Step 2 configuration cell in the same session), define these variables to match your SOURCE account setup:
# replication_group_name = "MY_REPLICATION_GROUP"
# primary_group_identifier = "MYORG.SOURCE_ACCOUNT.MY_REPLICATION_GROUP"

secondary_group_name = replication_group_name

create_secondary_sql = f"""
CREATE REPLICATION GROUP IF NOT EXISTS {secondary_group_name}
  AS REPLICA OF {primary_group_identifier};
""".strip()

print("Executing in TARGET account:\n")
print(create_secondary_sql)

session.sql(create_secondary_sql).collect()

print("\nCreated (or already existed):", secondary_group_name)
print("\nNext: refresh (manual) or rely on REPLICATION_SCHEDULE.")

### Checklist: Secondary replication group creation
- [ ] Secondary replication group created in each target account
- [ ] Secondary group name matches the primary group name (recommended)
- [ ] Initial refresh completed successfully (automatic on creation, or manual refresh executed)

---
## Step 6 (Target account): Manual refresh (optional)

If you set `REPLICATION_SCHEDULE` on the primary group, refreshes run automatically.

Execute this step in the TARGET account only when you want to force a refresh for validation or troubleshooting.

In [None]:
print("\n=== TARGET ACCOUNT: Refresh the secondary replication group (manual) ===")
print("\nIf you configured REPLICATION_SCHEDULE on the primary group, refreshes happen automatically.")
print("Use this manual refresh for validation or troubleshooting.\n")

refresh_sql = f"ALTER REPLICATION GROUP {replication_group_name} REFRESH;"

print("Executing:\n")
print(refresh_sql)

# Optional: increase timeout for large refreshes
# session.sql("ALTER SESSION SET STATEMENT_TIMEOUT_IN_SECONDS = 604800").collect()

session.sql(refresh_sql).collect()
print("\nRefresh submitted.")

### Checklist: Manual refresh
- [ ] `ALTER REPLICATION GROUP ... REFRESH` executed successfully in the target account
- [ ] No timeout errors (increase statement timeout if needed)
- [ ] Refresh progress/history show SUCCEEDED phases

---
## Step 7: Monitor refresh progress

Track the status of refresh operations using the Information Schema table function:
- `INFORMATION_SCHEMA.REPLICATION_GROUP_REFRESH_PROGRESS('<secondary_group_name>')`

In [None]:
secondary_group_name = replication_group_name

progress_sql = f"""
SELECT
    phase_name,
    start_time,
    end_time,
    progress,
    details
FROM TABLE(INFORMATION_SCHEMA.REPLICATION_GROUP_REFRESH_PROGRESS('{secondary_group_name}'))
ORDER BY start_time DESC;
""".strip()

print("=== Replication group refresh progress (current or most recent refresh) ===")
print(progress_sql)

progress_df = session.sql(progress_sql).to_pandas()
progress_df.columns = [str(c).strip().replace('"', '').lower() for c in progress_df.columns]

if progress_df.empty:
    print("No progress data available (no refresh in last 14 days, or insufficient privileges).")
else:
    print(progress_df.to_string(index=False))

### Checklist: Progress monitoring
- [ ] Refresh phases are visible for the secondary group
- [ ] Latest refresh ends in COMPLETED / SUCCEEDED (as appropriate)
- [ ] `end_time` is populated for completed phases
- [ ] `progress` reaches 100% for long-running phases
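
The checklist items above can be reduced to a small classification over the progress rows. The following sketch assumes each row is a dictionary with `phase_name`, `start_time`, and `end_time`, and that a missing timestamp is represented as `None` (pandas `NaT` values would need converting first). The function name and the phase names in the sample are placeholders, not values taken from Snowflake output.

```python
def summarize_progress(rows):
    """Classify a refresh from REPLICATION_GROUP_REFRESH_PROGRESS-style rows.

    Returns (state, running), where state is 'no_data', 'in_progress' or
    'finished', and running lists the phases that started but have no end_time.
    """
    started = [r for r in rows if r.get("start_time") is not None]
    if not started:
        return "no_data", []
    running = [r["phase_name"] for r in started if r.get("end_time") is None]
    return ("in_progress" if running else "finished"), running


# Placeholder rows for illustration only.
sample_rows = [
    {"phase_name": "PHASE_A", "start_time": "2024-01-01 10:00:00",
     "end_time": "2024-01-01 10:01:00"},
    {"phase_name": "PHASE_B", "start_time": "2024-01-01 10:01:00",
     "end_time": None},
]
print(summarize_progress(sample_rows))
```

Note that `finished` here only means every started phase has an `end_time`; whether the refresh succeeded still has to be read from the final phase and the `details` column.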

---
## Step 8: View refresh history

Review past refresh operations using:
- `INFORMATION_SCHEMA.REPLICATION_GROUP_REFRESH_HISTORY('<secondary_group_name>')`

In [None]:
secondary_group_name = replication_group_name

history_sql = f"""
SELECT
    phase_name,
    start_time,
    end_time,
    total_bytes,
    object_count
FROM TABLE(INFORMATION_SCHEMA.REPLICATION_GROUP_REFRESH_HISTORY('{secondary_group_name}'))
ORDER BY start_time DESC
LIMIT 20;
""".strip()

print("=== Replication group refresh history (last 14 days) ===")
print(history_sql)

history_df = session.sql(history_sql).to_pandas()
history_df.columns = [str(c).strip().replace('"', '').lower() for c in history_df.columns]

if history_df.empty:
    print("No history data available (no refresh in last 14 days, or insufficient privileges).")
else:
    print(history_df.to_string(index=False))

### Checklist: History review
- [ ] Historical refreshes completed successfully
- [ ] No recurring failures
- [ ] Refresh frequency matches your RPO requirements
- [ ] Total bytes / object counts look reasonable for your databases
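
To check whether the observed refresh cadence matches your RPO, one rough approach is to look at the largest gap between consecutive refresh completion times. The sketch below assumes you have already extracted one completion timestamp per refresh from the history output; the function names and sample timestamps are hypothetical, and the gap between completions is only a proxy for actual data staleness.

```python
from datetime import datetime, timedelta


def max_refresh_gap(completion_times):
    """Return the largest gap between consecutive completions, or None if fewer than two."""
    ordered = sorted(completion_times)
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    return max(gaps) if gaps else None


def meets_rpo(completion_times, rpo):
    """True if every gap between consecutive completions is within the RPO."""
    gap = max_refresh_gap(completion_times)
    return gap is not None and gap <= rpo


# Hypothetical completion times for four refreshes.
completions = [
    datetime(2024, 1, 1, 10, 0),
    datetime(2024, 1, 1, 10, 10),
    datetime(2024, 1, 1, 10, 35),
    datetime(2024, 1, 1, 10, 45),
]
print("Largest gap:", max_refresh_gap(completions))
print("Meets 15-minute RPO:", meets_rpo(completions, timedelta(minutes=15)))
```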

---
## Step 9: Monitor usage (credits and bytes)

Track credits and bytes transferred using:
- `INFORMATION_SCHEMA.REPLICATION_GROUP_USAGE_HISTORY(...)` (last 14 days)

For longer retention, use the `REPLICATION_GROUP_USAGE_HISTORY` views in the `SNOWFLAKE.ACCOUNT_USAGE` and `SNOWFLAKE.ORGANIZATION_USAGE` schemas, as described in the Snowflake docs.

In [None]:
secondary_group_name = replication_group_name

# Note: INFORMATION_SCHEMA.REPLICATION_GROUP_USAGE_HISTORY only returns the last 14 days.
cost_sql = f"""
SELECT
  start_time,
  end_time,
  replication_group_name,
  credits_used,
  bytes_transferred
FROM TABLE(information_schema.replication_group_usage_history(
  date_range_start => dateadd('day', -14, current_timestamp()),
  replication_group_name => '{secondary_group_name}'
))
ORDER BY start_time DESC;
""".strip()

print("=== Replication group usage (last 14 days) ===")
print(cost_sql)

try:
    cost_df = session.sql(cost_sql).to_pandas()
    cost_df.columns = [str(c).strip().replace('"', '').lower() for c in cost_df.columns]

    if cost_df.empty:
        print("No usage data available (or insufficient privileges).")
    else:
        # credits_used is documented as TEXT, so coerce before summing
        cost_df["credits_used"] = cost_df["credits_used"].astype(float)
        cost_df["gb_transferred"] = cost_df["bytes_transferred"] / 1024 / 1024 / 1024

        print(cost_df.to_string(index=False))
        print(f"\nTotal credits (window): {cost_df['credits_used'].sum():.4f}")
        print(f"Total data (window): {cost_df['gb_transferred'].sum():.4f} GB")
except Exception as e:
    print("Unable to query replication-group usage history:")
    print(e)

### 📋 Checklist: Cost Monitoring
- [ ] Replication costs are within budget
- [ ] No unexpected spikes in data transfer
- [ ] Credit usage trends are acceptable
- [ ] Cost monitoring alerts configured
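
One way to look for unexpected spikes is to total credits per day and flag days well above the median. This is a sketch only: it assumes rows are dictionaries with a `datetime` in `start_time` and a numeric or text `credits_used`, and the factor of 2 is an arbitrary illustrative threshold, not a Snowflake recommendation. The function names and sample rows are hypothetical.

```python
from collections import defaultdict
from datetime import datetime
from statistics import median


def daily_credit_totals(rows):
    """Sum credits_used per calendar day of start_time."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["start_time"].date()] += float(row["credits_used"])
    return dict(totals)


def spike_days(totals, factor=2.0):
    """Return the days whose total exceeds `factor` times the median daily total."""
    if not totals:
        return []
    baseline = median(totals.values())
    return sorted(day for day, credits in totals.items()
                  if baseline > 0 and credits > factor * baseline)


# Hypothetical usage rows; credits_used arrives as text in the usage function.
sample_rows = [
    {"start_time": datetime(2024, 1, 1, 9), "credits_used": "0.5"},
    {"start_time": datetime(2024, 1, 1, 15), "credits_used": "0.5"},
    {"start_time": datetime(2024, 1, 2, 9), "credits_used": "1.2"},
    {"start_time": datetime(2024, 1, 3, 9), "credits_used": "4.0"},
]
totals = daily_credit_totals(sample_rows)
print(spike_days(totals))
```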

---
## Step 10: Validate Data Consistency (Optional)

Use `HASH_AGG` to verify that table data in the secondary matches the primary as of the replicated snapshot.

In [None]:
print("=== Data consistency validation (optional) ===")
print("""
To validate consistency between primary and secondary replicas:

1. In the TARGET account (secondary replication group):
   - Get the primary snapshot timestamp from REPLICATION_GROUP_REFRESH_PROGRESS(details)
   - Compute HASH_AGG(*) on a table in the replicated database.

2. In the SOURCE account (primary):
   - Compute HASH_AGG(*) on the same table using Time Travel
     AT(TIMESTAMP => '<primarySnapshotTimestamp>'::TIMESTAMP)

3. Compare hash values - they should match.
""")

validation_example = f"""
-- STEP 1: Run this in TARGET account to get the snapshot timestamp:
SELECT PARSE_JSON(details)['primarySnapshotTimestamp'] AS primary_snapshot_ts
FROM TABLE(INFORMATION_SCHEMA.REPLICATION_GROUP_REFRESH_PROGRESS('{replication_group_name}'))
WHERE phase_name = 'PRIMARY_UPLOADING_METADATA';

-- STEP 2: Run this in TARGET account to hash the replicated table:
-- SELECT HASH_AGG(*) FROM <db>.<schema>.<table>;

-- STEP 3: Run this in SOURCE account to hash the primary table at the snapshot timestamp:
-- SELECT HASH_AGG(*) FROM <db>.<schema>.<table>
--   AT(TIMESTAMP => '<primarySnapshotTimestamp>'::TIMESTAMP);

-- STEP 4: Compare the hash values from STEP 2 and STEP 3 - they should match exactly.
""".strip()

print(validation_example)

### 📋 Checklist: Data Validation
- [ ] Sample tables selected for validation
- [ ] Hash values match between primary and secondary
- [ ] Row counts verified
- [ ] Critical data validated
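
Once the hashes have been collected in both accounts, the comparison itself is simple bookkeeping. The sketch below assumes you have gathered the `HASH_AGG(*)` results into two dictionaries keyed by fully qualified table name; the table names and hash values are made up for illustration, and `primary_hash_sql` only reproduces the Time Travel query form shown in the cell above.

```python
def primary_hash_sql(table, snapshot_ts):
    """Build the SOURCE-account query that hashes a table at the snapshot timestamp."""
    return (
        f"SELECT HASH_AGG(*) FROM {table}\n"
        f"  AT(TIMESTAMP => '{snapshot_ts}'::TIMESTAMP)"
    )


def compare_hashes(primary, secondary):
    """Return {table: (primary_hash, secondary_hash)} for tables that differ.

    A table present on only one side is reported with None for the other side.
    """
    mismatches = {}
    for table in sorted(set(primary) | set(secondary)):
        p, s = primary.get(table), secondary.get(table)
        if p != s:
            mismatches[table] = (p, s)
    return mismatches


# Made-up hash values for illustration.
primary_hashes = {"MYDB.PUBLIC.ORDERS": 111, "MYDB.PUBLIC.CUSTOMERS": 222}
secondary_hashes = {"MYDB.PUBLIC.ORDERS": 111, "MYDB.PUBLIC.CUSTOMERS": 999}
print(compare_hashes(primary_hashes, secondary_hashes))
```

An empty result means every sampled table matched at the snapshot timestamp.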

---
## Step 11: Configure the replication schedule (recommended)

Replication groups support scheduled refresh via `REPLICATION_SCHEDULE` on the primary group.

You typically do not need Snowflake tasks to trigger refreshes for basic redundancy; the schedule on the primary group is enough.

### Notes
- Set or change the schedule in the SOURCE account with `ALTER REPLICATION GROUP ... SET REPLICATION_SCHEDULE = ...`.
- In a TARGET account, you can pause/resume scheduled replication with `ALTER REPLICATION GROUP ... SUSPEND/RESUME`.

In [None]:
print("\n=== Scheduled refresh (recommended) ===")
print("Replication groups support automatic refresh via REPLICATION_SCHEDULE on the PRIMARY group.")
print("Tasks are typically unnecessary for basic redundancy.\n")

print("To change the schedule (run in SOURCE account):")
alter_schedule_sql = f"""
ALTER REPLICATION GROUP {replication_group_name} SET
  REPLICATION_SCHEDULE = '10 MINUTE';
""".strip()
print(alter_schedule_sql)

print("\nTo pause/resume scheduled replication (run in TARGET account):")
print(f"ALTER REPLICATION GROUP {replication_group_name} SUSPEND;")
print(f"ALTER REPLICATION GROUP {replication_group_name} RESUME;")

### Checklist: Replication schedule
- [ ] `REPLICATION_SCHEDULE` is configured on the primary group (or explicitly left unset)
- [ ] Schedule matches RPO requirements
- [ ] Target accounts are receiving refreshes on the expected cadence
- [ ] You can suspend/resume scheduled refresh in target accounts when needed
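
A schedule interval can be derived from an RPO target with a simple heuristic: worst-case staleness is roughly the refresh interval plus the time a refresh takes, so the interval should be the RPO minus the expected refresh duration. The sketch below encodes only that heuristic; it is an assumption for illustration and does not account for how Snowflake times scheduled refreshes internally. The function names are hypothetical.

```python
from datetime import timedelta


def schedule_for_rpo(rpo, expected_refresh_duration=timedelta(0)):
    """Return a '<n> MINUTE' schedule string that leaves room for the refresh itself."""
    budget = rpo - expected_refresh_duration
    minutes = int(budget.total_seconds() // 60)
    if minutes < 1:
        raise ValueError("RPO leaves no room for an interval of at least 1 minute")
    return f"{minutes} MINUTE"


def build_alter_schedule_sql(group, schedule):
    """Build the SOURCE-account statement that sets the schedule."""
    return (
        f"ALTER REPLICATION GROUP {group} SET\n"
        f"  REPLICATION_SCHEDULE = '{schedule}'"
    )


schedule = schedule_for_rpo(timedelta(minutes=30), timedelta(minutes=5))
print(build_alter_schedule_sql("MY_REPLICATION_GROUP", schedule))
```

The refresh history from Step 8 is a reasonable source for the expected refresh duration.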

---
## Step 12: Check replication group status

Execute in the source or target account to verify schedule state, next scheduled refresh, and which account is primary/secondary.

In [None]:
print("\n=== Replication group status (current account) ===")

show_df = session.sql(f"SHOW REPLICATION GROUPS LIKE '{replication_group_name}'").to_pandas()
show_df.columns = [str(c).strip().replace('"', '').lower() for c in show_df.columns]

if show_df.empty:
    print("Group not found in this account.")
else:
    cols = [c for c in [
        "name",
        "type",
        "is_primary",
        "primary",
        "allowed_accounts",
        "replication_schedule",
        "secondary_state",
        "next_scheduled_refresh",
    ] if c in show_df.columns]
    print(show_df[cols].to_string(index=False))

print("\nFor detailed refresh progress/history, use the progress and history steps above (INFORMATION_SCHEMA.REPLICATION_GROUP_REFRESH_*).")

### Checklist: Ongoing monitoring
- [ ] Refreshes are succeeding on schedule
- [ ] No recurring failures in refresh history
- [ ] Duration and transferred bytes are within expected ranges
- [ ] Alerts exist for refresh failures (optional)
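
The status row from `SHOW REPLICATION GROUPS` can be turned into a short list of warnings. This sketch assumes the row is a dictionary keyed by the normalized column names used in this step, that `is_primary` arrives as a boolean or as `true`/`false` text, and that a suspended secondary reports `SUSPENDED` in `secondary_state`. Those value formats are assumptions to verify against your own output; the function name is hypothetical.

```python
def schedule_warnings(row):
    """Return human-readable warnings for one SHOW REPLICATION GROUPS row."""
    warnings = []
    is_primary = str(row.get("is_primary", "")).strip().lower() == "true"
    schedule = row.get("replication_schedule")
    state = str(row.get("secondary_state") or "").strip().upper()

    if is_primary and not schedule:
        # Source account: without a schedule, refreshes only happen manually.
        warnings.append("no REPLICATION_SCHEDULE set on the primary group")
    if not is_primary and state == "SUSPENDED":
        # Target account: scheduled refresh has been paused.
        warnings.append("scheduled refresh is suspended in this account")
    return warnings


# Hypothetical rows for a primary without a schedule and a suspended secondary.
print(schedule_warnings({"is_primary": "true", "replication_schedule": ""}))
print(schedule_warnings({"is_primary": "false", "secondary_state": "SUSPENDED"}))
```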

---
## Troubleshooting guide

### Common issues and solutions

#### Issue: "Account not enabled for replication"
- Confirm both accounts are in the same organization.
- Have ORGADMIN enable replication for the accounts (Snowflake docs: enable replication for accounts).

#### Issue: "Database cannot be replicated" / "Database missing from group"
- Confirm the database is not created from a share.
- Confirm the role creating the group has `MONITOR` on the database.
- Confirm the database does not contain unsupported objects for replication (see Snowflake limitations).

#### Issue: "Timeout during a refresh"
- Increase statement timeout for the session running the refresh:

```sql
ALTER SESSION SET STATEMENT_TIMEOUT_IN_SECONDS = 604800;
```

#### Issue: "Scheduled refresh not running"
- In the SOURCE account, confirm `REPLICATION_SCHEDULE` is set on the primary group.
- In the TARGET account, confirm scheduled refresh is not suspended (`SHOW REPLICATION GROUPS LIKE '<name>'`).
- Ensure the group owner role has the privileges required for refresh.

#### Issue: "High replication costs"
- Increase the refresh interval.
- Reduce the number/size of databases in the group.
- Use `INFORMATION_SCHEMA.REPLICATION_GROUP_USAGE_HISTORY` (or Account Usage views) to quantify credits/bytes.

#### Issue: "Refresh fails"
- Use `INFORMATION_SCHEMA.REPLICATION_GROUP_REFRESH_PROGRESS` and `..._HISTORY` to get phase-level details.
- Check for dangling references with `INFORMATION_SCHEMA.REPLICATION_GROUP_DANGLING_REFERENCES`.
- If needed, run a manual refresh (`ALTER REPLICATION GROUP <name> REFRESH`).

---
## Best practices (Enterprise / basic redundancy)

### 1. Naming
- Name replication groups clearly (for example, `PROD_REDUNDANCY_RG`).
- Use the same replication group name in source and target accounts.
- Document the primary group identifier (`<org>.<account>.<group>`).

### 2. Scheduling (RPO)
- Match refresh frequency to your RPO.
- Avoid over-frequent refreshes if you do not need them (cost).

### 3. Monitoring
- Use:
  - `INFORMATION_SCHEMA.REPLICATION_GROUP_REFRESH_PROGRESS` (current/most recent)
  - `INFORMATION_SCHEMA.REPLICATION_GROUP_REFRESH_HISTORY` (last 14 days)
  - `INFORMATION_SCHEMA.REPLICATION_GROUP_USAGE_HISTORY` (credits/bytes, last 14 days)
- Add alerts for failed refreshes (optional).

### 4. Security and access control
- Use least privilege:
  - `CREATE REPLICATION GROUP` on the account (or ACCOUNTADMIN)
  - `MONITOR` on each database you include
  - `REPLICATE` on the group for roles that need to refresh a secondary group

### 5. Performance and reliability
- Increase statement timeout for large refreshes when needed.
- Keep the group focused (only replicate what you actually need).

### 6. Documentation
- Maintain a list of replicated databases and target accounts.
- Document the intended RPO and monitoring approach.
- Keep a short runbook for manual refresh and common failure modes.

---
## Final checklist (Enterprise / basic redundancy)

### Configuration
- [ ] All target accounts identified and enabled for replication
- [ ] Primary replication group created with correct settings
- [ ] Secondary replication group created in each target account
- [ ] Scheduled refresh configured (or explicitly omitted) on the primary group

### Verification
- [ ] Manual refresh tested at least once in a target account
- [ ] Refresh progress/history show successful completion
- [ ] Usage history (credits/bytes) is visible for the secondary group

### Documentation and operations
- [ ] Replication architecture documented (accounts, databases, schedule)
- [ ] Contact list and access model documented
- [ ] Runbook exists for manual refresh and common failure modes
- [ ] Team knows where to monitor refreshes and usage

---
## Additional resources

### Snowflake documentation
- `https://docs.snowflake.com/en/user-guide/account-replication-intro`
- `https://docs.snowflake.com/en/user-guide/account-replication-config`

### Key SQL commands (replication groups)
- `SHOW REPLICATION ACCOUNTS`
- `CREATE REPLICATION GROUP`
- `ALTER REPLICATION GROUP`
- `DROP REPLICATION GROUP`
- `SHOW REPLICATION GROUPS`
- `SHOW DATABASES IN REPLICATION GROUP <group_name>`

### Monitoring (Information Schema table functions)
- `INFORMATION_SCHEMA.REPLICATION_GROUP_REFRESH_PROGRESS('<secondary_group_name>')`
- `INFORMATION_SCHEMA.REPLICATION_GROUP_REFRESH_HISTORY('<secondary_group_name>')`
- `INFORMATION_SCHEMA.REPLICATION_GROUP_USAGE_HISTORY(...)`
- `INFORMATION_SCHEMA.REPLICATION_GROUP_DANGLING_REFERENCES('<group_name>')`

### Business continuity (Business Critical)
Failover groups and promotion are covered in:
- `tools/replication-workbook/business_critical_business_continuity_guide.ipynb`

---
## Cleanup (Optional)

Use this section to remove replication configuration if needed.
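
Cleanup is order-sensitive: every secondary group has to be dropped before the primary group can be. The sketch below only produces an ordered list of (account, statement) pairs to run by hand; it executes nothing. The function name and account names are hypothetical.

```python
def drop_plan(group, source_account, target_accounts):
    """Return (account, statement) pairs in a safe order: targets first, source last."""
    statement = f"DROP REPLICATION GROUP IF EXISTS {group}"
    steps = [(account, statement) for account in target_accounts]
    steps.append((source_account, statement))
    return steps


for account, statement in drop_plan(
    "MY_REPLICATION_GROUP",
    "MYORG.SOURCE_ACCOUNT",
    ["MYORG.TARGET_ACCOUNT_1"],
):
    print(f"[{account}] {statement};")
```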

In [None]:
print("\n=== Cleanup (replication groups) ===")
print("\nWARNING: Dropping groups changes replica protection semantics.")
print("- Dropping a secondary replication group can make replicated databases writable in the target account.")
print("- A primary replication group cannot be dropped until all linked secondary groups are dropped.\n")

cleanup_sql = f"""
-- TARGET account: drop the secondary replication group first
-- DROP REPLICATION GROUP IF EXISTS {replication_group_name};

-- SOURCE account: after all secondary groups are dropped
-- DROP REPLICATION GROUP IF EXISTS {replication_group_name};
""".strip()

print(cleanup_sql)
print("\nUncomment and execute carefully.")

In [None]:
print("Notebook complete.")