Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use stable sort when ordering files by timestamp #813

Closed
hempels opened this issue Jun 25, 2021 · 13 comments · Fixed by #818
Closed

Use stable sort when ordering files by timestamp #813

hempels opened this issue Jun 25, 2021 · 13 comments · Fixed by #818
Assignees
Labels
old-label enhancement Functionality that enhances existing features

Comments

@hempels
Copy link

hempels commented Jun 25, 2021

The unstable sort here means different systems may produce a different run order when two or more migrations have matching timestamps:

).sort((m1, m2) => m1.timestamp - m2.timestamp)

@Shinigami92 Shinigami92 self-assigned this Jul 21, 2021
@Shinigami92 Shinigami92 added the old-label enhancement Functionality that enhances existing features label Jul 21, 2021
@littlewhywhat
Copy link
Contributor

littlewhywhat commented Sep 6, 2021

hi @hempels! thanks for creating an issue.

Could you say what is the use case to have several migrations with same timestamp?

From my point of view - the order of migrations is meant to be based on a number - timestamp. The developer needs to be careful by creating a right timestamp but that's actually in usual circumstances much easier then checking the alphabetical order of the names for example. So at the moment, I don't really see the problem here. We assume the migrations have different timestamps and it's quite hard to create unintentionally two migrations with same timestamp.

cc: @Shinigami92

@littlewhywhat
Copy link
Contributor

Closing but feel free to reply and reopen. Thanks!

@hempels
Copy link
Author

hempels commented Nov 6, 2021

I'm unsure of the circumstances that cause it. We have multiple developers working in a single project with hundreds of migrations, merging in multiple branches, and it isn't possible to monitor the naming of each one to make sure the timestamp portion is unique. If I had to guess, I'd say rather than running create, they sometimes copy a migration file and change the suffix thinking that's what matters for ordering. Fixing them once they're already committed incorrectly is a nightmare because people have already applied them.

Performing a stable sort here would be a relatively easy way to eliminate negative impacts of inadvertent overlap. Certainly much easier than checking the file names of every migration in every commit.

@Shinigami92
Copy link
Collaborator

Beside that, my fix is fairly easy to resolve this issue

@hempels
Copy link
Author

hempels commented Dec 5, 2021

Closing but feel free to reply and reopen. Thanks!

@littlewhywhat What do you mean by "reopen"? You want a new issue that goes back over the same ground?

The PR (#818) created by @Shinigami92 is simple, unlikely to introduce any bugs, and won't have any effect for the case where there isn't a problem. It will, however, effortlessly resolve the issue for cases where there currently is a problem.

@littlewhywhat littlewhywhat reopened this Dec 6, 2021
@goce-cz
Copy link
Contributor

goce-cz commented Dec 9, 2021

Honestly I'd be more inclined to fail fast in these situations. As soon as we find two migrations with the same timestamp we should fail - thus avoid the situation in the first place. Trying to work around an erroneous state isn't really a solution to the problem.

I understand your current pain @hempels, though. A general problem with this library is that once the migrations are applied, there's no way back, unless you dare to fiddle with the DB directly. Also establishing a stable order right now might not fully solve your current issues since that order might not reflect how the previous migrations were already applied.

I suggest you do you homework to solve the current problems and we'll introduce the check to avoid them in the future. How would you feel about that?

@hempels
Copy link
Author

hempels commented Dec 10, 2021

Frankly, I'm finding the resistance to this change very frustrating. It has no impact for anyone not experiencing the issue and for those that are, it solves the problem without any ambiguity. There is no need to fail in these cases. So long as the order of migrations is deterministic, the timestamp of any given migration is mostly immaterial. Even if all of the migrations have the same timestamp and only vary by suffix, it would still be fine so long as their order is stable and correct. If the order of a new entry is incorrect, attempting to apply it to an up-to-date DB will fail because it comes before an already run migration, so the issue will be caught quickly (fail fast, as you say.)

@goce-cz
Copy link
Contributor

goce-cz commented Dec 10, 2021

I suggested to fail fast before we apply anything to the DB thus avoiding this:

Fixing them once they're already committed incorrectly is a nightmare because people have already applied them.

Anyway neither I nor @littlewhywhat see any technical problem with the PR. What bothers us is, that this library is meant to be used with unique time-stamps - that's why the create command is there and that's why you have the problems. For that reason I'd rather communicate this explicitly (hence my suggestion for uniqueness check) than soften the rules and basically allow a flow that is not according to the initial design.

If we apply the fix, then we could easily say that there's no need for any timestamp anymore... your developers could basically start doing: 11111111-migration-A, 11111111-migration-B, 11111111-migration-C etc... but then what's the point of the timestamp? I'm not saying that the timestamp is the greatest solution to the problem - there might be better ones. However this is a fundamental change to the concept of the how the migrations are supposed to be handled.

That all said, I guess we'll merge the PR silently to keep you happy. But in the next major release, I'd love to see the duplicity check so that no other new user gets into your shoes. We will introduce a backwards compatibility flag though... something like --ignore-duplicate-timestamps.

@hempels
Copy link
Author

hempels commented Dec 10, 2021

I've thought through your argument more and in theory I am content with the idea of failing immediately if a duplicate timestamp is detected. That would fix the problem and would shift the onus for resolving it to the person who created the problem.

The practical issue is that would introduce a breaking change given our existing migrations where some of those duplicates are hundreds of migrations up the line. We would be forced to rename any old migrations with duplicate timestamps and deal with the renaming challenges, which are very awkward. Adding a compatibility flag helps, but then we still need the stable sort PR applied.

@goce-cz
Copy link
Contributor

goce-cz commented Dec 13, 2021

That would fix the problem and would shift the onus for resolving it to the person who created the problem.

Exactly my point. I'm happy you agree.

would introduce a breaking change

We could possibly introduce the change in not-so-breaking way... I assume we're always fetching the already applied migrations from the DB, are we @littlewhywhat? In that case we could strip those from the full list and do the dupe test only on the non-applied ones (maybe including the latest applied one). Thus we would prevent the problem with any future migration, but still allow the already applied cripples to pass.

stable sort PR applied

Let's apply it, since the new major version will take some time anyway. Agreed @littlewhywhat ?

@littlewhywhat
Copy link
Contributor

Let's apply it

ok i will reopen the PR and take a look, thanks!

I assume we're always fetching the already applied migrations from the DB

we load all migrations files and fetch all migration records to check the order each time if that's what you mean.

@littlewhywhat
Copy link
Contributor

@hempels FYI even with the PR applied, if the current migrations table (on your master branch, I guess) doesn't have migrations sorted alphabetically then you would need to fix it as our check gonna fail with error that some migration A precedes migration B. So it may not solve your issue so effortlessly.

@littlewhywhat
Copy link
Contributor

@hempels FYI 6.1.0 is now released with the fix for this issue https://github.com/salsita/node-pg-migrate/releases/tag/v6.1.0

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
old-label enhancement Functionality that enhances existing features
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants