Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Difference between lock.mtime and stat.mtime precision #82

Closed
pimlie opened this issue Mar 23, 2019 · 14 comments
Closed

Difference between lock.mtime and stat.mtime precision #82

pimlie opened this issue Mar 23, 2019 · 14 comments

Comments

@pimlie
Copy link
Contributor

@pimlie pimlie commented Mar 23, 2019

Possibly related to #79

We are still getting a lot of reports about stale locks, often on CI platforms (and docker). One possible cause might be users who are using Node <v9. It appears that in Node v9 there can be a difference in precision about the lock.mtime which is set with what stat.mtime reports back:

updateLock 2019-03-23T09:30:18.519Z 2019-03-23T09:30:18.519Z                                                             10:30:33
precision ms                                                                                                             10:30:33
createdMtimeChecked 1553333418519 1553333418519                                                                          10:30:33
mtime set to 2019-03-23T09:30:33.523Z                                                                                    10:30:33
updateLock 2019-03-23T09:30:33.523Z 2019-03-23T09:30:33.000Z                                                             10:30:48
precision ms                                                                                                             10:30:48
createdMtimeChecked 1553333433523 1553333433000           

(lines above are: <function.name> <lock.mtime> <stat.mtime>)

After the last line the lock gets compromised due to the 523ms difference

-- edit --
I removed Node v9 from the title because actually I am not sure if this is really a Node v9 issue only. It might very well be a Node v10+ issue as well which just doesnt show as often / repeatedly due to the general (performance) improvements made in v10+

@pimlie pimlie changed the title Difference between lock.mtime and stat.mtime precision on Node v9 Difference between lock.mtime and stat.mtime precision Mar 25, 2019
@alanshaw

This comment has been minimized.

Copy link
Contributor

@alanshaw alanshaw commented Mar 29, 2019

Can you check you're using the latest version? This issue should have been resolved already.

@pimlie

This comment has been minimized.

Copy link
Contributor Author

@pimlie pimlie commented Mar 29, 2019

@alanshaw See screenshot from vim node_modules/proper-lockfile/package.json, just tried again and the lock gets compromised again

createdMtimeChecked 1553860346774 1553860346774                                                                          12:52:41
createdMtimeChecked 1553860361781 1553860361000                                                                          12:52:56

image

@alanshaw

This comment has been minimized.

Copy link
Contributor

@alanshaw alanshaw commented Mar 29, 2019

Where are you logging createdMtimeChecked?

@pimlie

This comment has been minimized.

Copy link
Contributor Author

@pimlie pimlie commented Mar 29, 2019

node_modules/proper-lockfile/lib/lockfile.js line 330

@alanshaw

This comment has been minimized.

Copy link
Contributor

@alanshaw alanshaw commented Mar 29, 2019

Oh interesting, maybe on your system utimes is second precision but mkdir is ms precision.

@alanshaw

This comment has been minimized.

Copy link
Contributor

@alanshaw alanshaw commented Mar 29, 2019

@satazor a simple fix to this would be to do an initial utimes after the lock directory is created here

options.fs.mkdir(getLockFile(file, options), (err) => {
- adds another FS call...any better ideas?

@pimlie

This comment has been minimized.

Copy link
Contributor Author

@pimlie pimlie commented Mar 29, 2019

I dont think its my fs, its just Linux with ext4. The problem occurs always with Node v9, but in my case never with v10+

@satazor

This comment has been minimized.

Copy link
Contributor

@satazor satazor commented Mar 29, 2019

@alanshaw perhaps it would be better to implement the strategy in #79 without calling detect after the mkdir call?

@pimlie

This comment has been minimized.

Copy link
Contributor Author

@pimlie pimlie commented Mar 29, 2019

Another option would be to just call fs.stat immediately after fs.utimes so we know for sure the next stale check compares the correct mtime.

@satazor

This comment has been minimized.

Copy link
Contributor

@satazor satazor commented Mar 29, 2019

@pimlie right, but then there’s one unecessary io syscall

@pimlie

This comment has been minimized.

Copy link
Contributor Author

@pimlie pimlie commented Mar 29, 2019

@satazor Its not really unnecessary when it squashes buggy behaviour I guess (and there is no other workaround)

I spent a fair amount of time trying to debug this but am afraid although I can reproduce it consistently in one of my Nuxt projects I still dont understand what causes it. Could be linux/fs behaviour, could be node. Could be related to high cpu usage or high io usage. Sorry to say I havent been able to put my finger on it yet. Any suggestions what I could do to get to the bottom of this would really be appreciated though :)

@satazor

This comment has been minimized.

Copy link
Contributor

@satazor satazor commented Mar 29, 2019

@pimlie what I’m saying is that there’s an alternative solution that we discussed in another PR that would solve this issue while not doing any additional io syscall.

You may read the suggestion in the link I gave above but @alanshaw has more context.

Anyway, I will also accept a PR that does the extra syscall to resolve the bug.

@satazor

This comment has been minimized.

Copy link
Contributor

@satazor satazor commented Apr 2, 2019

@pimlie I have a fix in #88. Could you test it out and report back?

It would make things easier for me, otherwise I will have to spawn a VM with ext4 and test it out.

@pimlie

This comment has been minimized.

Copy link
Contributor Author

@pimlie pimlie commented Apr 3, 2019

Thanks for the quick work. Can confirm the fix seems to work:

precision s 1554276465737 1554276465000                                                                                  09:28:00
precision s 1554276480000 1554276480000                                                                                  09:28:15

I also checked whether I still get a warning when I manually change the mtime once the lock has been created and that works too:

precision s 1554276687945 978282000000                                                                                   09:31:42

 WARN  Unable to update lock within the stale threshold                                                                  09:31:42
satazor added a commit that referenced this issue Apr 3, 2019
Closes #82, #87
satazor added a commit that referenced this issue Apr 3, 2019
Closes #82, #87
satazor added a commit that referenced this issue Apr 3, 2019
Closes #82, #87
satazor added a commit that referenced this issue Apr 3, 2019
Closes #82, #87
satazor added a commit that referenced this issue Apr 3, 2019
Closes #82, #87
satazor added a commit that referenced this issue Apr 3, 2019
Closes #82, #87
satazor added a commit that referenced this issue Apr 3, 2019
Closes #82, #87
@satazor satazor closed this in #88 Apr 3, 2019
satazor added a commit that referenced this issue Apr 3, 2019
Closes #82, #87
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants
You can’t perform that action at this time.