Add tenant migration script across organizations#881
Conversation
|
@coderabbitai full review |
✅ Actions performedFull review triggered. |
WalkthroughIntroduces a comprehensive migration script that orchestrates tenant-to-tenant organization data moves with validation, transactional copy/move operations, external metadata remapping, role-mapping strategies, dry-run mode, cleanup options, and extensive logging and verification. Changes
Sequence DiagramsequenceDiagram
actor User
participant Script as "Migration Script"
participant DB as "Database"
participant Ext as "External Service"
participant Tx as "Transaction Manager"
User->>Script: Start migration (source, target, options)
Script->>Script: Validate args & build context
Script->>Tx: Begin SERIALIZABLE transaction
activate Tx
Script->>DB: Lock relevant tables (conditional)
DB-->>Script: Locks ack
Script->>DB: Read source tenant/org data
DB-->>Script: Source context
Script->>DB: Read target tenant/org data
DB-->>Script: Target context
Script->>Ext: Fetch external entity specs (batched)
Ext-->>Script: External mappings
Script->>Script: Remap external metadata & resolve role mappings
alt Dry-run
Script->>User: Log planned operations (no DB writes)
else Execute
Script->>DB: Copy forms, entity_types, entities, templates
DB-->>Script: Copy results
Script->>DB: Move users, user_organizations, user_roles (apply mappings)
DB-->>Script: Move results
Script->>DB: Migrate or skip sessions per option
DB-->>Script: Sessions handled
Script->>DB: Cleanup source rows (soft/hard/none)
DB-->>Script: Cleanup done
Script->>DB: Validate counts and consistency
DB-->>Script: Validation results
end
alt Validation OK
Script->>Tx: Commit
Tx-->>Script: Committed
Script->>User: Success
else Validation failed
Script->>Tx: Rollback
Tx-->>Script: Rolled back
Script->>User: Failure (error)
end
deactivate Tx
Estimated code review effort🎯 4 (Complex) | ⏱️ ~65 minutes Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches
🧪 Generate unit tests (beta)
📝 Coding Plan
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment Tip CodeRabbit can use OpenGrep to find security vulnerabilities and bugs across 17+ programming languages.OpenGrep is compatible with Semgrep configurations. Add an |
There was a problem hiding this comment.
Actionable comments posted: 5
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/scripts/migrateTenantOrgData.js`:
- Around line 2318-2368: The external I/O call remapExternalMetaForUsers is
executed while a SERIALIZABLE transaction and locks are held (tx created and
lockByStrategy called), which can prolong locks; move the
remapExternalMetaForUsers call out of the write transaction by invoking it
before opening the transaction and applying locks (or immediately after
buildContextAndPrecheck but outside any tx), assign its usersForInsert back onto
context, and keep the same external_meta_remap_ready logging; update any call
sites (remapExternalMetaForUsers(context)) so it does not depend on tx or locked
resources and ensure context is passed/returned unchanged except for
usersForInsert.
- Around line 90-109: The usage text in printUsage incorrectly shows "node
scripts/migrateTenantOrgData.js"; update the first usage example string to the
correct entrypoint "node src/scripts/migrateTenantOrgData.js" (i.e., change the
path in the template literal inside the printUsage function) so the help output
points to the actual file location.
- Around line 1220-1254: The target-role SELECT used for strict-id (the
querySelect that populates targetRolesByIdRows) currently includes soft-deleted
rows; modify that SQL to exclude deleted roles (e.g., add a WHERE clause
condition like ur.deleted_at IS NULL or equivalent) so the Map created as
targetRoleById only contains active roles, and then re-run the same change for
the other analogous query that builds targetRolesByIdRows in the later block
(the other strict-id path around the code referenced as applying to 1448-1459)
to ensure strict-id never matches soft-deleted roles.
- Around line 1079-1115: The query that builds sourceRolesFromUsers currently
un-nests users.roles without ensuring those role IDs actually belong to the
source org(s), so stale cross-org role IDs can be pulled in; update the SQL used
for sourceRolesFromUsers (the querySelect that assigns sourceRolesFromUsers) to
only select role IDs that exist in user_roles for the source tenant and allowed
organization scope (e.g., JOIN user_roles ur ON ur.id = unnest(roles) and join
organizations o to ensure ur.organization_id / o.code matches the source/default
org), so requiredRoleIds / requiredRoleIdArray only contains role IDs that truly
belong to the source org before fetching sourceRoleRows and building
sourceRoleById.
- Around line 1395-1446: After rebasing user_roles.id via the loops that use
queryRaw (see nextFreeRoleId, tempSeed, targetToSourceIdMap, tx), run a final
queryRaw inside the same transaction to reset the table's backing sequence to at
least the current max(id) for that tenant; call
setval(pg_get_serial_sequence('user_roles','id'), coalesce((SELECT MAX(id) FROM
user_roles WHERE tenant_code = $targetTenant AND deleted_at IS NULL), 1), true)
(or equivalent) passing targetTenant and tx so subsequent inserts won't reuse
updated ids.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 8af91057-976c-4e6e-bdcc-afd837d57ec1
⛔ Files ignored due to path filters (2)
AGENTS.mdis excluded by!**/*.mdsrc/scripts/readme.mdis excluded by!**/*.md
📒 Files selected for processing (1)
src/scripts/migrateTenantOrgData.js
| tx = await sequelize.transaction({ | ||
| isolationLevel: Transaction.ISOLATION_LEVELS.SERIALIZABLE, | ||
| }) | ||
| log('info', 'transaction_started', { isolationLevel: 'SERIALIZABLE' }) | ||
|
|
||
| await lockByStrategy(sequelize, tx, options) | ||
| log('info', 'lock_strategy_applied', { lockStrategy: options.lockStrategy }) | ||
|
|
||
| const context = await buildContextAndPrecheck(sequelize, tx, options) | ||
| log('info', 'precheck_success', { | ||
| sourceOrgId: context.sourceOrg.id, | ||
| targetOrgId: context.targetOrg.id, | ||
| userCount: context.sourceCounts.users, | ||
| sourceCounts: context.sourceCounts, | ||
| }) | ||
| log('info', 'default_org_resolved', { | ||
| sourceTenant: context.sourceTenant, | ||
| defaultOrgCode: context.defaultOrgCode, | ||
| defaultOrgId: context.sourceDefaultOrg.id, | ||
| }) | ||
| log('info', 'external_entity_type_detection_summary', { | ||
| sourceTenant: context.sourceTenant, | ||
| sourceOrgCode: context.orgCode, | ||
| defaultOrgCode: context.defaultOrgCode, | ||
| sourceScopedCount: context.externalEntityTypeStats.sourceScopedCount, | ||
| defaultScopedCount: context.externalEntityTypeStats.defaultScopedCount, | ||
| dedupedFinalKeyCount: context.externalEntityTypeStats.dedupedCount, | ||
| externalMetaKeys: context.externalMetaKeys, | ||
| }) | ||
|
|
||
| let roleMap = new Map(context.requiredRoleIdArray.map((roleId) => [roleId, roleId])) | ||
| let performedStrictIdRebase = false | ||
|
|
||
| if (options.roleResolution === 'strict-id') { | ||
| const strictResult = await ensureStrictIdCompatibility(sequelize, tx, context, options) | ||
| roleMap = strictResult.roleMap | ||
| performedStrictIdRebase = strictResult.performedRebase | ||
| } else { | ||
| roleMap = await buildMapByTitleRoleMap(sequelize, tx, context) | ||
| } | ||
|
|
||
| log('info', 'role_resolution_ready', { | ||
| roleResolution: options.roleResolution, | ||
| requiredRoleCount: context.requiredRoleIdArray.length, | ||
| mapSize: roleMap.size, | ||
| performedStrictIdRebase, | ||
| }) | ||
|
|
||
| const externalRemapResult = await remapExternalMetaForUsers(context) | ||
| context.usersForInsert = externalRemapResult.usersForInsert | ||
| log('info', 'external_meta_remap_ready', externalRemapResult.stats) |
There was a problem hiding this comment.
Move external remap I/O out of the write transaction.
The SERIALIZABLE transaction and lock strategy are opened on Lines 2318-2324, but Line 2366 can still fan out into paginated external calls before any writes happen. A slow dependency will keep the snapshot and any advisory/table locks open for the full round-trip, which raises lock contention and serialization-retry risk for the rest of the system.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/scripts/migrateTenantOrgData.js` around lines 2318 - 2368, The external
I/O call remapExternalMetaForUsers is executed while a SERIALIZABLE transaction
and locks are held (tx created and lockByStrategy called), which can prolong
locks; move the remapExternalMetaForUsers call out of the write transaction by
invoking it before opening the transaction and applying locks (or immediately
after buildContextAndPrecheck but outside any tx), assign its usersForInsert
back onto context, and keep the same external_meta_remap_ready logging; update
any call sites (remapExternalMetaForUsers(context)) so it does not depend on tx
or locked resources and ensure context is passed/returned unchanged except for
usersForInsert.
There was a problem hiding this comment.
Currently the count will be less than 500 and can be taken up as an enhancement later when data grows.
There was a problem hiding this comment.
@nevil-mathew, understood — with user counts under 500, the lock contention risk is minimal and the current approach is acceptable. Happy to defer this as a future enhancement.
Would you like me to open a GitHub issue to track this optimization (moving remapExternalMetaForUsers outside the SERIALIZABLE transaction) so it's not lost when data scales up?
🧠 Learnings used
Learnt from: nevil-mathew
Repo: ELEVATE-Project/user PR: 881
File: src/scripts/migrateTenantOrgData.js:1079-1115
Timestamp: 2026-03-17T09:00:22.165Z
Learning: Scope: This learning applies specifically to migrateTenantOrgData.js in the ELEVATE-Project/user repo. Guidance: Update migration/validation logic to treat user_roles as valid when they originate from either the source organization (source org) or the default organization, identified by DEFAULT_ORGANISATION_CODE. Ensure users.roles arrays may contain a mix of role IDs from the user's primary org and the default org. Implementation notes: - When validating or migrating user data, allow role IDs from both the source org and DEFAULT_ORGANISATION_CODE. - Do not reject roles solely because they come from DEFAULT_ORGANISATION_CODE. - Ensure any checks that previously enforced a single org scope are updated to accept a dual-scope: {sourceOrg, DEFAULT_ORGANISATION_CODE}. - Update unit tests to cover: a) roles from source org only, b) roles from DEFAULT_ORGANISATION_CODE only (where applicable), c) mixed roles from both, d) absence of roles. - Maintain security and integrity by clearly documenting the dual-scope policy and ensuring downstream consumers can handle mixed scopes.
Learnt from: nevil-mathew
Repo: ELEVATE-Project/user PR: 776
File: src/database/models/entityType.js:38-38
Timestamp: 2025-07-31T08:43:35.971Z
Learning: The migration for converting tenant_code to a primary key in the EntityType model was already handled in a previous PR, not in the current refactoring PR that focuses on organization codes instead of organization IDs.
…mpatibility function
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/scripts/migrateTenantOrgData.js`:
- Around line 1068-1075: The query that populates sourceRolesFromUor (and
ultimately requiredRoleIds) is currently including soft-deleted
user_organization_roles rows; update the querySelect call against
user_organization_roles to exclude soft-deleted links by adding a condition that
filters out deleted rows (e.g., include AND deleted_at IS NULL in the WHERE
clause), so requiredRoleIds only contains active role_id entries and avoids
role-resolution failures.
- Around line 710-739: The precheck queries for crossOrgMembership and
crossOrgRoles are returning soft-deleted history rows; update both SELECTs (the
queries that populate crossOrgMembership and crossOrgRoles against
user_organizations and user_organization_roles) to exclude soft-deleted records
(e.g., add "AND deleted_at IS NULL" or the project’s soft-delete predicate) so
only active memberships/roles are considered; keep the same parameters
(sourceTenant, userIds, orgCode) and leave the assertion on crossOrgMembership
intact.
- Line 566: The lint rule useIterableCallbackReturn fails because
expression-bodied arrow callbacks like ids.forEach((id) => group.oldIds.add(id))
(and the similar callbacks using Set.add and logical AND) implicitly return
values; change these to explicit block-bodied callbacks or plain loops so they
return undefined — e.g., replace ids.forEach((id) => group.oldIds.add(id)) with
a block callback ids.forEach((id) => { group.oldIds.add(id); }) or use for
(const id of ids) { group.oldIds.add(id); } to satisfy the
useIterableCallbackReturn rule while keeping the same behavior.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 289a2cde-affc-4695-9c8a-a93f84b26e93
📒 Files selected for processing (1)
src/scripts/migrateTenantOrgData.js
| }) | ||
| } | ||
| const group = groups.get(groupKey) | ||
| ids.forEach((id) => group.oldIds.add(id)) |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Verify remaining callbacks that may implicitly return values in forEach.
rg -nP --type=js '\.forEach\(\([^)]*\)\s*=>\s*[^({]' src/scripts/migrateTenantOrgData.jsRepository: ELEVATE-Project/user
Length of output: 418
🏁 Script executed:
#!/bin/bash
# Check for Biome configuration files
fd -e json -e jsonc | grep -i biomeRepository: ELEVATE-Project/user
Length of output: 74
🏁 Script executed:
#!/bin/bash
# View context around the flagged lines
echo "=== Line 566 context ===" && sed -n '560,572p' src/scripts/migrateTenantOrgData.js
echo -e "\n=== Lines 1090-1091 context ===" && sed -n '1085,1095p' src/scripts/migrateTenantOrgData.jsRepository: ELEVATE-Project/user
Length of output: 757
🏁 Script executed:
#!/bin/bash
# Check all forEach patterns in the file to ensure comprehensive fix
rg '\.forEach\(' src/scripts/migrateTenantOrgData.js -A 1 -B 1Repository: ELEVATE-Project/user
Length of output: 739
Resolve Biome useIterableCallbackReturn lint errors in forEach callbacks.
Expression-bodied callbacks at line 566, lines 1090–1091 return values implicitly (Set.add(...) and logical AND expressions), which violates the configured lint rule.
🛠️ Proposed fix
- ids.forEach((id) => group.oldIds.add(id))
+ ids.forEach((id) => {
+ group.oldIds.add(id)
+ })
@@
- sourceRolesFromUor.forEach((row) => row.role_id !== null && requiredRoleIds.add(Number(row.role_id)))
- sourceRolesFromUsers.forEach((row) => row.role_id !== null && requiredRoleIds.add(Number(row.role_id)))
+ sourceRolesFromUor.forEach((row) => {
+ if (row.role_id !== null) {
+ requiredRoleIds.add(Number(row.role_id))
+ }
+ })
+ sourceRolesFromUsers.forEach((row) => {
+ if (row.role_id !== null) {
+ requiredRoleIds.add(Number(row.role_id))
+ }
+ })🧰 Tools
🪛 Biome (2.4.7)
[error] 566-566: This callback passed to forEach() iterable method should not return a value.
(lint/suspicious/useIterableCallbackReturn)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/scripts/migrateTenantOrgData.js` at line 566, The lint rule
useIterableCallbackReturn fails because expression-bodied arrow callbacks like
ids.forEach((id) => group.oldIds.add(id)) (and the similar callbacks using
Set.add and logical AND) implicitly return values; change these to explicit
block-bodied callbacks or plain loops so they return undefined — e.g., replace
ids.forEach((id) => group.oldIds.add(id)) with a block callback ids.forEach((id)
=> { group.oldIds.add(id); }) or use for (const id of ids) {
group.oldIds.add(id); } to satisfy the useIterableCallbackReturn rule while
keeping the same behavior.
| const crossOrgMembership = await querySelect( | ||
| sequelize, | ||
| `SELECT user_id, organization_code | ||
| FROM user_organizations | ||
| WHERE tenant_code = $sourceTenant | ||
| AND user_id = ANY($userIds) | ||
| AND organization_code <> $orgCode | ||
| LIMIT 20;`, | ||
| { sourceTenant, userIds, orgCode }, | ||
| tx | ||
| ) | ||
| assertOrThrow( | ||
| crossOrgMembership.length === 0, | ||
| 'Users belong to multiple organizations in source tenant. Aborting.', | ||
| { | ||
| sample: crossOrgMembership, | ||
| } | ||
| ) | ||
|
|
||
| const crossOrgRoles = await querySelect( | ||
| sequelize, | ||
| `SELECT user_id, organization_code, role_id | ||
| FROM user_organization_roles | ||
| WHERE tenant_code = $sourceTenant | ||
| AND user_id = ANY($userIds) | ||
| AND organization_code <> $orgCode | ||
| LIMIT 20;`, | ||
| { sourceTenant, userIds, orgCode }, | ||
| tx | ||
| ) |
There was a problem hiding this comment.
Filter out soft-deleted memberships/roles in cross-org prechecks.
Line 716 and Line 735 currently include historical soft-deleted rows, which can falsely abort valid migrations.
🛠️ Proposed fix
const crossOrgMembership = await querySelect(
sequelize,
`SELECT user_id, organization_code
FROM user_organizations
WHERE tenant_code = $sourceTenant
AND user_id = ANY($userIds)
AND organization_code <> $orgCode
+ AND deleted_at IS NULL
LIMIT 20;`,
{ sourceTenant, userIds, orgCode },
tx
)
@@
const crossOrgRoles = await querySelect(
sequelize,
`SELECT user_id, organization_code, role_id
FROM user_organization_roles
WHERE tenant_code = $sourceTenant
AND user_id = ANY($userIds)
AND organization_code <> $orgCode
+ AND deleted_at IS NULL
LIMIT 20;`,
{ sourceTenant, userIds, orgCode },
tx
)Based on learnings: users are strictly single-org, and crossOrgMembership is the enforcement precheck for that constraint.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/scripts/migrateTenantOrgData.js` around lines 710 - 739, The precheck
queries for crossOrgMembership and crossOrgRoles are returning soft-deleted
history rows; update both SELECTs (the queries that populate crossOrgMembership
and crossOrgRoles against user_organizations and user_organization_roles) to
exclude soft-deleted records (e.g., add "AND deleted_at IS NULL" or the
project’s soft-delete predicate) so only active memberships/roles are
considered; keep the same parameters (sourceTenant, userIds, orgCode) and leave
the assertion on crossOrgMembership intact.
| const sourceRolesFromUor = await querySelect( | ||
| sequelize, | ||
| `SELECT DISTINCT role_id | ||
| FROM user_organization_roles | ||
| WHERE tenant_code = $sourceTenant | ||
| AND organization_code = $orgCode | ||
| AND user_id = ANY($userIds);`, | ||
| { sourceTenant, orgCode, userIds }, |
There was a problem hiding this comment.
Exclude soft-deleted user_organization_roles from required role-id extraction.
Line 1073-1074 can pull deleted role links into requiredRoleIds, causing avoidable role-resolution failures.
🛠️ Proposed fix
const sourceRolesFromUor = await querySelect(
sequelize,
`SELECT DISTINCT role_id
FROM user_organization_roles
WHERE tenant_code = $sourceTenant
AND organization_code = $orgCode
+ AND deleted_at IS NULL
AND user_id = ANY($userIds);`,
{ sourceTenant, orgCode, userIds },
tx
)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/scripts/migrateTenantOrgData.js` around lines 1068 - 1075, The query that
populates sourceRolesFromUor (and ultimately requiredRoleIds) is currently
including soft-deleted user_organization_roles rows; update the querySelect call
against user_organization_roles to exclude soft-deleted links by adding a
condition that filters out deleted rows (e.g., include AND deleted_at IS NULL in
the WHERE clause), so requiredRoleIds only contains active role_id entries and
avoids role-resolution failures.
Summary by CodeRabbit