Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature/soft lock #262

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

daveboulard
Copy link
Contributor

When automating the migration process this tool might really comes in handy and alleviate a lot of pain.
Especially in a micro-services architecture with a lot of environments.
Nevertheless, depending on the deployment process, one might have to trigger the automated migration just right before the start of the application.
And if there are multiple applications running in parallel, multiple deployment might occur in parallel. Thus, the risk of having migration running in parallel is real and can lead to a lot of drama.

So, in order to avoid that, I propose a little feature which I would call "soft lock" relying on the TTL indexes mongodb provides (https://docs.mongodb.com/manual/core/index-ttl/).

The idea is to have a collection which should contain maximum one document. A fetch on this collection is done at the start of a migration.

  • If no document is found in this lock collection, the migration will start and add one document to the lock collection.
    When the process finishes - or has failed - this document is removed.
  • If a document is found in this lock collection at the start of the migration, the migration is then cancelled.
    If anything goes wrong and the lock is not cleared, the TTL index will take care of it.
  • npm test passes and has 100% coverage
  • README.md is updated

@coveralls
Copy link

coveralls commented Oct 14, 2020

Coverage Status

Coverage remained the same at 100.0% when pulling cb3a8b7 on letsbuilders:feature/soft-lock into 4906ff2 on seppevs:master.

@shweshi
Copy link

shweshi commented Mar 27, 2021

This can be very useful.

@dlecan
Copy link

dlecan commented Mar 31, 2021

This feature is very interesting!
What is missing to merge this PR?

@shweshi
Copy link

shweshi commented Mar 31, 2021

@daveboulard There is a conflict with the README.md file. Please resolve this.
@seppevs Can we merge this pull request, feature is very interesting and useful, development side also I feel the PR is complete and ready to merge. If anything is blocking from getting merge please give your opinion.

@daveboulard
Copy link
Contributor Author

Thank you for your feedback !

@shweshi @seppevs : branch up to date and conflict resolved.

@Igor-Techsee
Copy link

@daveboulard Great feature, thank you!
@seppevs please merge

@mat250
Copy link

mat250 commented Nov 8, 2021

@daveboulard Thanks a lot for this feature !
Need this also when you spin up multiple NodeJS server inside a Kubernetes at the same time :)

@theogravity
Copy link

I've integrated your PR into my fork:

https://github.com/theogravity/migrate-mongo-alt

@chromey
Copy link

chromey commented Jun 13, 2022

Could you shed some light on the status of this feature?

@jscheffner
Copy link

What is missing to merge this? It's 3 years since the PR was opened. I would be very interested in this feature.

@mg-83
Copy link

mg-83 commented Nov 10, 2023

@seppevs please merge, will help a lot of people running multi instance environments

@SeanReece
Copy link

I'm actually surprised that a migration tool does not have locking functionality. This means that migrations will run multiple times in any horizontally scaled environment, not only is that inefficient but could possibly break the migrations if the migration script does not account for this.

That being said I think the implementation here is fundamentally flawed in that it introduces a mutex lock that is not thread safe, introducing a race condition where multiple instances of a service can create multiple locks and enter the locked section of code.

What I believe would provide an atomic lock operation is using upsert with $setOnInsert

async function activate(db) {
     const lockCollection = await getLockCollection(db)
     if (!lockCollection) throw new Error('Cannot get lock collection')
     // This command simultaneously checks if a lock exists and inserts one if no lock exists in a single atomic operation
     const { upsertedCount } = await lockCollection.update(
       { lock: true }, // Some static value to match on
       { $setOnInsert: { lock: true, createdAt: new Date() } }, // $setOnInsert will not update any existing locks
       { upsert: true }
     )
     return upsertedCount === 1
 }

Then when using this lock we only need to perform a single action to 1. Check if the lock exists and 2. Create the lock if it doesn't exist

const lockObtained = await lock.activate(db)
if (!lockObtained) {
     throw new Error("Could not migrate up, a lock is in place.");
}
console.log('This code is now locked and thread safe')
await lock.clear(db)

@theogravity
Copy link

theogravity commented Nov 17, 2023

note: I'm not the author of this PR or know the author.

Node is single-threaded, so I don't think the argument applies here.

We've been running this code for over a year or more at this point in a h-scaled environment without any issues.

Without the code, we do experience the issue this code was designed to fix (which is why we started to use it).

@SeanReece
Copy link

@theogravity Yes Node is single threaded, but multiple node instances are not "thread safe" when operating on an external DB. In the case of a single node instance, locks are not needed.

For example, when using node cluster. Each worker is operating on a separate thread. So it is possible that 2 instances will reach the await lock.exists(db) call at approximately the same time. Both workers will call the DB, conclude that there are no locks, then they will both create a lock and each execute the migration.

The same applies for any horizontally scaled environment such as micro services.

I'm not saying this implementation will not work, I'm sure it will work fine >99% of the time. BUT there is still a race condition here and could lead to unexpected issues.

@theogravity
Copy link

theogravity commented Nov 17, 2023

I don't think that's an issue when you look into how mongodb does locking:

https://www.mongodb.com/docs/manual/faq/concurrency/#what-locks-are-taken-by-some-common-client-operations-

Doing an update would use an intent exclusive write lock against the collection

https://stackoverflow.com/questions/34613068/what-are-intention-shared-and-intention-exclusive-locks-in-mongodb

"This means, that if the update operation would take a considerable amount of time, any find operation would be blocked during the duration of the exclusive lock is in place."

It says "considerable amount of time", but doesn't specify what happens if it doesn't take a considerable amount of time though.

All in all, it definitely doesn't hurt to include your code as extra safety.

Considering we have an h-scaled application that performs a ton of read / writes outside of migrations from multiple node processes, you would think this would cause a huge amount of issues, but we just don't see them and that's probably thanks to mongo's locking mechanism under the hood.

@jscheffner
Copy link

Yes, mongodb locks while doing the update which is why the solution proposed by SeanReece is safe. It doesn't lock beween the find and the insert so the current code change isn't really safe.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.