@eliykat eliykat commented Jan 1, 2026

🎟️ Tracking

https://bitwarden.atlassian.net/browse/PM-28555

📔 Objective

Users are only permitted 1 default user collection ("My Items") per organization. Today, this is enforced by querying the database and skipping users who already have such a collection; however, this is susceptible to time-of-check time-of-use (TOCTOU) bugs because there is no atomic check or constraint that prevents duplicate collections from being created. This PR adds that guarantee.

A significant challenge is that the unique columns are spread across 2 tables: Collection.Type (to identify default collections only) and CollectionUser.OrganizationUserId. Additionally, we have many complex sprocs for the Collection table, which makes it hard to modify without making a large impact throughout the codebase.

I went with a solution suggested by @mkincaid-bw: a semaphore table. Advantages:

  • optimistic rather than pessimistic (i.e. reduces locking)
  • no impact on existing uses of the Collection table
  • easy to migrate existing data to semaphores
  • EDD compliant: if we roll back the deployment, the old code will ignore the semaphore table

Changes in this PR:

  • create the semaphore table
  • migrate existing data to semaphores (Dapper/MSSQL only, as this feature has not been released for self-hosted yet; if this PR is not merged by 9 Jan we will have to add EF migrations as well)
  • add a new repository method to insert n default collections with semaphores; existing uses of ICollectionRepository.CreateAsync are migrated to this
  • update existing SqlBulkCopy method to use semaphores
  • DRY up default collection arrangement logic (currently repeated between repository implementations)

Design decisions worth pointing out:

  • today, the repository filters out users who already have a default collection. This PR makes that the caller's responsibility, mostly so that I could properly test the semaphore behaviour. However, I also think that the repository method was doing too much, so simplifying the repository method and letting callers handle this where needed makes sense.
  • the semaphore table contains the minimal number of columns needed: only the OrganizationUserId, and a creation date for debugging purposes. We could have a FK to the Collection as well, but it's not strictly required, so I haven't included it for now.
  • even though we don't expect any unique constraint violations, any that do occur are caught and transformed into a consistent exception, which gives a user-facing error and easier debugging
  • we are left with 2 repository methods, one which uses a sproc and one which uses SqlBulkCopy. They are used inconsistently, and this needs to be addressed later; ideally we would use the sproc for 1-1000 insertions and SqlBulkCopy for 1000+. That is out of scope - the only goal here was to get everyone using the semaphore table.
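
As a rough illustration of the semaphore idea described above, the following T-SQL sketch shows a table keyed on OrganizationUserId and an insert that claims the semaphore in the same transaction as the collection rows. The table name, constraint name, column types, and the use of XACT_ABORT are assumptions made for illustration and are not taken from this PR's actual migration.

```sql
-- Hypothetical semaphore table: at most one row per organization user.
-- All names and types here are illustrative assumptions.
CREATE TABLE [dbo].[DefaultCollectionSemaphore] (
    [OrganizationUserId] UNIQUEIDENTIFIER NOT NULL,
    [CreationDate]       DATETIME2 (7)    NOT NULL,
    CONSTRAINT [PK_DefaultCollectionSemaphore] PRIMARY KEY CLUSTERED ([OrganizationUserId])
);
GO

-- Claim the semaphore in the same transaction as the collection rows.
-- A second attempt for the same user fails on the primary key (error 2627),
-- and with XACT_ABORT ON that error rolls back the whole transaction.
SET XACT_ABORT ON;
DECLARE @OrganizationUserId UNIQUEIDENTIFIER = NEWID(); -- placeholder value

BEGIN TRANSACTION;
    INSERT INTO [dbo].[DefaultCollectionSemaphore] ([OrganizationUserId], [CreationDate])
    VALUES (@OrganizationUserId, SYSUTCDATETIME());

    -- INSERT INTO [dbo].[Collection] ... and [dbo].[CollectionUser] ... would follow here.
COMMIT TRANSACTION;
```

Because the primary key does the checking, no prior SELECT is needed, which is what makes the approach optimistic: the insert either succeeds or fails atomically.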

📸 Screenshots

⏰ Reminders before review

  • Contributor guidelines followed
  • All formatters and local linters executed and passed
  • Written new unit and / or integration tests where applicable
  • Protected functional changes with optionality (feature flags)
  • Used internationalization (i18n) for all UI strings
  • CI builds passed
  • Communicated to DevOps any deployment requirements
  • Updated any necessary documentation (Confluence, contributing docs) or informed the documentation team

🦮 Reviewer guidelines

  • 👍 (:+1:) or similar for great changes
  • 📝 (:memo:) or ℹ️ (:information_source:) for notes or general info
  • ❓ (:question:) for questions
  • 🤔 (:thinking:) or 💭 (:thought_balloon:) for more open inquiry that's not quite a confirmed issue and could potentially benefit from discussion
  • 🎨 (:art:) for suggestions / improvements
  • ❌ (:x:) or ⚠️ (:warning:) for more significant problems or concerns needing attention
  • 🌱 (:seedling:) or ♻️ (:recycle:) for future improvements or indications of technical debt
  • ⛏ (:pick:) for minor or nitpick changes

codecov bot commented Jan 1, 2026

Codecov Report

❌ Patch coverage is 91.11111% with 8 lines in your changes missing coverage. Please review.
✅ Project coverage is 58.87%. Comparing base (35868c2) to head (88036ea).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...ucture.Dapper/Repositories/CollectionRepository.cs | 75.75% | 7 Missing and 1 partial ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #6791      +/-   ##
==========================================
+ Coverage   54.93%   58.87%   +3.93%     
==========================================
  Files        1927     1928       +1     
  Lines       85457    85454       -3     
  Branches     7648     7649       +1     
==========================================
+ Hits        46949    50312    +3363     
+ Misses      36723    33274    -3449     
- Partials     1785     1868      +83     

github-actions bot commented Jan 1, 2026

Checkmarx One – Scan Summary & Details 7b51ba4b-6929-4c8b-bb70-ba80b55ffb0b

New Issues (3)

Checkmarx found the following issues in this Pull Request

| # | Severity | Issue | Source File / Package | Checkmarx Insight |
|---|---|---|---|---|
| 1 | MEDIUM | CSRF | /src/Api/Vault/Controllers/CiphersController.cs: 1511 | Method at line 1511 of /src/Api/Vault/Controllers/CiphersController.cs gets a parameter from a user request from id. This parameter value flows ... |
| 2 | MEDIUM | CSRF | /src/Api/Vault/Controllers/CiphersController.cs: 1387 | Method at line 1387 of /src/Api/Vault/Controllers/CiphersController.cs gets a parameter from a user request from id. This parameter value flows ... |
| 3 | MEDIUM | CSRF | /src/Api/Vault/Controllers/CiphersController.cs: 245 | Method at line 245 of /src/Api/Vault/Controllers/CiphersController.cs gets a parameter from a user request from id. This parameter value flows t... |

Fixed Issues (2)

Great job! The following issues were fixed in this Pull Request

| Severity | Issue | Source File / Package |
|---|---|---|
| MEDIUM | CSRF | /src/Api/Billing/Controllers/VNext/AccountBillingVNextController.cs: 98 |
| MEDIUM | CSRF | /src/Api/Vault/Controllers/CiphersController.cs: 293 |

@eliykat eliykat added the ai-review Request a Claude code review label Jan 2, 2026
Co-authored-by: claude[bot] <209825114+claude[bot]@users.noreply.github.com>
r-tome previously approved these changes Jan 5, 2026
@r-tome r-tome left a comment

Great work!

@withinfocus

I reached out to @bitwarden/dept-dbops this morning and haven't heard back yet, but I don't yet understand this pattern and usage of the database. Logistically speaking, I think it's better to name such a thing after its association and to drop the "default" language (we always lead with the domain too), but fundamentally, why is the database used for this rather than a persistent cache? I'd like to hear the rationale and what was assessed to address this problem; we don't do this anywhere else that I know of, and personally I have never used a relational database table for this kind of lock. We should strive to keep logic out of the data tier.

eliykat commented Jan 5, 2026

@withinfocus The naming can be adjusted, no problem. In terms of the fundamental approach, the thinking was that a database can enforce uniqueness constraints as a matter of data integrity, the same as enforcing a uniqueness constraint on the User.Email column. The difficulty here is that the relevant columns are spread across multiple tables, hence the use of the separate table, but that's just a different way of enforcing the same principle.

Let me know how you go with dbops - if you want to revisit this approach we can, but probably best that we have a meeting about it to get alignment before spending more time on it.

@eliykat eliykat removed the ai-review Request a Claude code review label Jan 6, 2026
@eliykat eliykat marked this pull request as draft January 6, 2026 04:16
@eliykat eliykat added the ai-review Request a Claude code review label Jan 6, 2026
Comment on lines 797 to 822
  public async Task CreateDefaultCollectionsAsync(Guid organizationId, IEnumerable<Guid> organizationUserIds, string defaultCollectionName)
  {
      organizationUserIds = organizationUserIds.ToList();
      if (!organizationUserIds.Any())
      {
          return;
      }

      using var scope = ServiceScopeFactory.CreateScope();
      var dbContext = GetDatabaseContext(scope);

      var orgUserIdWithDefaultCollection = await GetOrgUserIdsWithDefaultCollectionAsync(dbContext, organizationId);
      var missingDefaultCollectionUserIds = organizationUserIds.Except(orgUserIdWithDefaultCollection);

-     var (collectionUsers, collections) = BuildDefaultCollectionForUsers(organizationId, missingDefaultCollectionUserIds, defaultCollectionName);
+     var (collections, collectionUsers) = CollectionUtils.BuildDefaultUserCollections(organizationId, missingDefaultCollectionUserIds, defaultCollectionName);

-     if (!collectionUsers.Any() || !collections.Any())
+     if (!collections.Any() || !collectionUsers.Any())
      {
          return;
      }

-     await dbContext.BulkCopyAsync(collections);
-     await dbContext.BulkCopyAsync(collectionUsers);
+     await dbContext.BulkCopyAsync(Mapper.Map<IEnumerable<Collection>>(collections));
+     await dbContext.BulkCopyAsync(Mapper.Map<IEnumerable<CollectionUser>>(collectionUsers));

      await dbContext.SaveChangesAsync();
  }

CRITICAL: Missing transaction wrapper for atomic operations

This method performs multiple database operations (BulkCopyAsync for Collections, BulkCopyAsync for CollectionUsers) without wrapping them in an explicit transaction. This violates the atomicity requirement stated in the stored procedure comment: "this MUST be executed in a single transaction to ensure consistency."

Risk: Partial failures could leave the database in an inconsistent state where collections are created without their corresponding collection users, or vice versa.

Evidence: The Dapper implementation correctly wraps these operations in an explicit transaction (lines 410-436 in Dapper CollectionRepository.cs):

await using var transaction = connection.BeginTransaction();
try {
    // operations
    await transaction.CommitAsync();
} catch {
    await transaction.RollbackAsync();
    throw;
}

Fix: Wrap the entire method body in a transaction similar to the Dapper implementation, or document why EF's implicit transaction handling is sufficient here (though BulkCopyAsync typically bypasses EF's change tracker).

Comment on lines 402 to 437
  public async Task CreateDefaultCollectionsBulkAsync(Guid organizationId, IEnumerable<Guid> organizationUserIds, string defaultCollectionName)
  {
      organizationUserIds = organizationUserIds.ToList();
      if (!organizationUserIds.Any())
      {
          return;
      }

      await using var connection = new SqlConnection(ConnectionString);
      connection.Open();
      await using var transaction = connection.BeginTransaction();
      try
      {
          var orgUserIdWithDefaultCollection = await GetOrgUserIdsWithDefaultCollectionAsync(connection, transaction, organizationId);

          var missingDefaultCollectionUserIds = organizationUserIds.Except(orgUserIdWithDefaultCollection);

-         var (collectionUsers, collections) = BuildDefaultCollectionForUsers(organizationId, missingDefaultCollectionUserIds, defaultCollectionName);
+         var (collections, collectionUsers) =
+             CollectionUtils.BuildDefaultUserCollections(organizationId, missingDefaultCollectionUserIds, defaultCollectionName);

          if (!collectionUsers.Any() || !collections.Any())
          {
              return;
          }

          await BulkResourceCreationService.CreateCollectionsAsync(connection, transaction, collections);
          await BulkResourceCreationService.CreateCollectionsUsersAsync(connection, transaction, collectionUsers);

-         transaction.Commit();
+         await transaction.CommitAsync();
      }
      catch
      {
-         transaction.Rollback();
+         await transaction.RollbackAsync();
          throw;
      }
  }

⚠️ IMPORTANT: Time-of-check time-of-use (TOCTOU) vulnerability in bulk method

This method still has a TOCTOU race condition:

  1. Line 415: Read existing collections via GetOrgUserIdsWithDefaultCollectionAsync (inside transaction)
  2. Line 417: Filter in C# to find missing collections
  3. Lines 427-428: Insert collections and collection users (same transaction)

Race condition scenario:
If two concurrent calls both execute step 1 at approximately the same time for overlapping user sets, both could determine the same user is "missing" a collection, then both would attempt to insert, causing duplicate key violations or duplicate collections.

Why the transaction doesn't prevent this: While the transaction ensures atomicity of the inserts within a single call, it doesn't provide isolation between concurrent calls at the SERIALIZABLE level needed to prevent this race.

Comparison with non-bulk method: The CreateDefaultCollectionsAsync method (lines 364-400) delegates filtering to the stored procedure, which performs the NOT EXISTS check and INSERT in one operation. While this is better, it still lacks a unique constraint to fully prevent the race at the database level.

Recommendation:

  1. Add a unique constraint on CollectionUser(OrganizationUserId) where Collection.Type = DefaultUserCollection
  2. Or use SERIALIZABLE isolation level
  3. Or accept potential duplicate key errors and handle them gracefully

Comment on lines +14 to +25
  -- Filter to only users who don't have default collections
  SELECT ids.Id1, ids.Id2
  INTO #FilteredIds
  FROM @OrganizationUserCollectionIds ids
  WHERE NOT EXISTS (
      SELECT 1
      FROM [dbo].[CollectionUser] cu
      INNER JOIN [dbo].[Collection] c ON c.Id = cu.CollectionId
      WHERE c.OrganizationId = @OrganizationId
          AND c.[Type] = 1 -- CollectionType.DefaultUserCollection
          AND cu.OrganizationUserId = ids.Id1
  );

⚠️ IMPORTANT: Race condition not fully prevented by transaction alone

The NOT EXISTS check followed by INSERT operations creates a potential race condition that transactions alone cannot prevent under default READ COMMITTED isolation level.

Race scenario:

  1. Call A executes NOT EXISTS for User1 → returns true (no collection exists)
  2. Call B executes NOT EXISTS for User1 → returns true (no collection exists)
  3. Call A inserts Collection and CollectionUser for User1
  4. Call B inserts Collection and CollectionUser for User1 → Duplicate or error

Why transaction wrapper isn't sufficient: The caller wraps this in a transaction (line 378 in Dapper CollectionRepository.cs), but SQL Server's default READ COMMITTED isolation doesn't prevent phantom reads between the NOT EXISTS check and the INSERT.

Current safeguards:

  • Collection table has PK on Id (prevents duplicate collections since IDs are pre-generated)
  • But CollectionUser table may allow duplicates if no unique constraint exists on (CollectionId, OrganizationUserId)

Recommendations (choose one):

  1. Preferred: Add unique constraint (OrganizationUserId) filtered where Collection.Type = 1 - this makes the database enforce the business rule
  2. Use SERIALIZABLE isolation level in the transaction wrapper
  3. Handle unique constraint violations gracefully and return success (idempotent behavior)
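
The third option could look roughly like the T-SQL sketch below, which treats a duplicate-key error as "already created" and rethrows everything else. It assumes that some unique index or constraint exists to raise the error in the first place, which this thread does not confirm, and the inserts are elided.

```sql
BEGIN TRY
    BEGIN TRANSACTION;
        -- INSERT INTO [dbo].[Collection] ... and [dbo].[CollectionUser] ... would go here.
    COMMIT TRANSACTION;
END TRY
BEGIN CATCH
    IF @@TRANCOUNT > 0
        ROLLBACK TRANSACTION;

    -- 2601: duplicate key row in a unique index.
    -- 2627: violation of a PRIMARY KEY or UNIQUE constraint.
    IF ERROR_NUMBER() NOT IN (2601, 2627)
        THROW; -- anything else is a genuine failure

    -- Otherwise another caller created the rows first; treat this as success.
END CATCH;
```

Note that the rollback discards the whole batch, not just the conflicting row, so a bulk caller would need to re-query and retry for the users who are still missing a collection.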

Question for maintainers: Does the CollectionUser table have a unique constraint on (CollectionId, OrganizationUserId)? If yes, this would cause violations under concurrent load rather than silent duplicates.

eliykat commented Jan 6, 2026

After some discussions, we are going to simplify this approach and not seek a database-level guarantee to avoid duplicates. We are going to rely on querying for existing default collections first. If we experience TOCTOU issues in practice then we can look at increasing the isolation level of the transaction, but in any case, we are not going with the semaphore table.

I will open a new PR so there is a tidier commit history and better context for everyone.
