Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix behavior for OOB move followed by IB move #63

Merged
merged 4 commits into from
Apr 9, 2023

Conversation

dlqqq
Copy link
Collaborator

@dlqqq dlqqq commented Apr 7, 2023

The reason this bug exists is because LocalFIDM didn't consider mixing out-of-band and in-band filesystem operations. This meant that originally, we did not call _sync_file() in move(), because we assumed the file was not moved out-of-band after it was indexed. This optimization was done for performance reasons, as the _sync_* methods are a little expensive.

However, as @dleen describes in the original issue, this error state is too easy to reach, and should be properly handled by LocalFIDM.

@dlqqq dlqqq added the enhancement New feature or request label Apr 7, 2023
@codecov
Copy link

codecov bot commented Apr 7, 2023

Codecov Report

Patch coverage: 100.00% and project coverage change: -0.54 ⚠️

Comparison is base (9506f61) 86.03% compared to head (dc68c39) 85.49%.

Additional details and impacted files
@@            Coverage Diff             @@
##             main      #63      +/-   ##
==========================================
- Coverage   86.03%   85.49%   -0.54%     
==========================================
  Files           5        5              
  Lines         537      531       -6     
  Branches       69       68       -1     
==========================================
- Hits          462      454       -8     
- Misses         53       55       +2     
  Partials       22       22              
Impacted Files Coverage Δ
jupyter_server_fileid/manager.py 90.15% <100.00%> (-0.65%) ⬇️

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report in Codecov by Sentry.
📢 Do you have feedback about the report comment? Let us know in this issue.

@dlqqq
Copy link
Collaborator Author

dlqqq commented Apr 7, 2023

Oh wait, this gets a little more tricky because this is technically a breaking change. Let me explain:

Imagine a user indexes a file at old_path, deleting it, and then creating a new one at new_path. Previously, if the user called local_fidm.move(old_path, new_path), we would return the ID of old_path. However, because now we call _sync_file() instead, platforms that can distinguish between a move and a deletion/creation (Windows and MacOS) will actually generate a new ID for new_path. The old behavior was asserted by this test:

# test for disjoint move handling
# disjoint move: any out-of-band move that does not preserve stat info
def test_disjoint_move_indexed(any_fid_manager, old_path, new_path, fs_helpers):
    old_id = any_fid_manager.index(old_path)

    fs_helpers.delete(old_path)
    fs_helpers.touch(new_path, dir=True)
    new_id = any_fid_manager.move(old_path, new_path)

    assert old_id == new_id

I think this is fine, given that a proper filesystem move should be done before calling local_fidm.move() anyways, and not some weird combination of deleting an indexed file and creating a new one at a different path. I'll remove the test and make sure this gets released in 0.9.0 instead of 0.8.1.

@dlqqq
Copy link
Collaborator Author

dlqqq commented Apr 7, 2023

Honestly, sometimes I'm blown away by how smart _sync_file() is 😂

@dlqqq
Copy link
Collaborator Author

dlqqq commented Apr 7, 2023

@dleen Could you pull this branch locally and verify this fixes your issue?

@dleen
Copy link

dleen commented Apr 7, 2023

Very nice @dlqqq ! I took a quick look and I noticed a small issue - the path doesn't seem to get updated in the DB. So after I rename to foo3 in my example the DB still shows foo1:

82f41128-1598-4448-9d53-79ab56ce070e|/Users/dleen/code/catalog/foo1.ipynb|91561383|1680642664149532160|1680811746190785697|0

I think you might be able to see this if you include fetching by path in your test? Or does fetching by path have the side effect of updating the DB?

@dleen
Copy link

dleen commented Apr 7, 2023

I'm checking locally but I think in the test:

    # out-of-band
    fs_helpers.move(test_path, new_path_1)
    # in-band
    fs_helpers.move(new_path_1, new_path_2)
    fid_manager.move(new_path_1, new_path_2)

I think there are two moves in the # in-band section and I'm not sure what the impact of using fs_helpers is on my repo. I'm checking to see what happens when I remove this.

Edit: nevermind I see that's supposed to be the equivalent of a contents manager move

@dleen
Copy link

dleen commented Apr 7, 2023

Figured it out - it's because commit was no longer being called. See: dleen@8338090

@dleen
Copy link

dleen commented Apr 7, 2023

@dlqqq not sure how to update this PR with my change - feel free to cherry-pick it.

Not sure why the test didn't catch this...

@dleen
Copy link

dleen commented Apr 7, 2023

Confirming that with my additional patch it works as expected

@dlqqq
Copy link
Collaborator Author

dlqqq commented Apr 9, 2023

@dleen Thank you for reporting the bug and finding that edge case I missed! 🤗 I'll merge this and cut a release for you.

Not sure why the test didn't catch this...

The SQLite client will allow you to query uncommitted transactions so long as it's done on the same DB connection. Hence, you can imagine that this is quite tricky to test; I think a good compromise is implementing a declarative way of automatically committing after public method invocation via decorators. This would provide us a strong guarantee of File ID behaving properly without having to write complex testing logic. See here: #64

@dlqqq dlqqq merged commit c2d7e78 into jupyter-server:main Apr 9, 2023
19 of 20 checks passed
@dlqqq dlqqq deleted the fix-oob-ib-move branch April 9, 2023 17:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

UNIQUE constraint failed when moving file
2 participants