Tantivy fails with Failed purge deletes error #324

Closed
winding-lines opened this issue Jun 20, 2018 · 16 comments
@winding-lines

I am getting a failure when starting my application:

Failed purge deletes: Error(PathDoesNotExist("/SOME_PATH/c475e13ef3ca45128b1f8d9ee42fe994.term")

This is on OS X. My computer crashed recently. What's a good way to move forward here? Happy to help writing some code if that helps :)

@fulmicoton fulmicoton added the bug label Jun 20, 2018
@fulmicoton
Collaborator

This might be an actual bug.

Which version of tantivy is this?
Can you give more details on what you were doing?

Did you remove any files manually, for example?

@fulmicoton
Collaborator

Also is it something you can reproduce?

@fulmicoton
Collaborator

I'm also interested in your meta.json if you still have it as well as logs.

@fulmicoton fulmicoton added this to the 0.6.0 milestone Jun 20, 2018
@fulmicoton fulmicoton self-assigned this Jun 20, 2018
@winding-lines
Author

@fulmicoton I did not delete files manually but my computer did crash.

I am using rev = "432d49d8147624", just some hash I used at that time.

In normal operation I am indexing HTML documents; the fields are the url, title and body. When adding a new document I delete the previous one. There is also a button to mark documents as private.

My implementation is dumb: I initialize searchers for every request. Here is my meta.json file.

meta.json.zip

I do not have other logs.

@fulmicoton
Collaborator

This sounds like a bug so your report is super helpful.

If you somehow find a way to reproduce it, RUST_LOG=info will log all of the updates on the segment sets (merges, commits, etc.).

@fulmicoton
Collaborator

Oh wait... And you are saying that you get this on application startup?
I had totally missed that. This is very, very peculiar.

The index itself is not something you can share, is it?
Can you attach the output of ls?

@winding-lines
Author

I don't think I can share the index right now but I've added a listing of the folder
ls.out.zip

I am trying to reproduce this, but it may need to wait until the weekend. I can send you an invite to the repo; it's on GitLab.

@fulmicoton
Collaborator

fulmicoton commented Jun 21, 2018

@winding-lines

Can you retrace exactly what happened and describe the current state, please?

My understanding of the events so far:

  • You created an index, then added and removed documents.
  • Your computer crashed once.
  • You restarted your program.
  • Very quickly it panicked and died, showing the following error: Failed purge deletes: Error(PathDoesNotExist("/SOME_PATH/c475e13ef3ca45128b1f8d9ee42fe994.term")
  • If you restart your program right now, it works normally but has lost a bunch of its updates. (!?)

Can you especially confirm that last point?

@fulmicoton
Collaborator

@winding-lines Thanks for the invite. I see the two projects listed, but I cannot see the source code somehow.

@winding-lines
Author

Sorry @fulmicoton, I am not very familiar with GitLab. I've changed the settings some more; let me know if things work better.

I will work over the weekend to try to isolate the problem in a smaller test case.

@winding-lines
Author

I had a bug in my startup code: I was swallowing an Err() on process start.

DEBUG 2018-xx-xxT16:19:54Z: tantivy::directory::mmap_directory: Open Read "09dd5ceaa17542c09c42bc1f4cd58644.term"
ERROR 2018-xx-xxT16:19:54Z: error open index
ERROR 2018-xx-xxT16:19:54Z: caused by: path does not exist: '"/Users/SOME_PATH/09dd5ceaa17542c09c42bc1f4cd58644.term"'

Happy to add some more debugging statements if that helps :)
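
For anyone hitting the same symptom, here is a minimal sketch of the difference between swallowing a startup error and propagating it, using only the standard library. `OpenError`, `open_index`, `start_swallowing` and `start_propagating` are hypothetical names made up for illustration; they are not tantivy's API, and the path is a placeholder.

```rust
use std::fmt;
use std::path::{Path, PathBuf};

/// Hypothetical error type standing in for whatever the index library returns.
#[derive(Debug)]
enum OpenError {
    PathDoesNotExist(PathBuf),
}

impl fmt::Display for OpenError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            OpenError::PathDoesNotExist(p) => write!(f, "path does not exist: {:?}", p),
        }
    }
}

impl std::error::Error for OpenError {}

/// Hypothetical opener: fails if a file the index needs is missing.
fn open_index(required_file: &Path) -> Result<(), OpenError> {
    if required_file.exists() {
        Ok(())
    } else {
        Err(OpenError::PathDoesNotExist(required_file.to_path_buf()))
    }
}

/// Anti-pattern: the Result is discarded, so startup "succeeds" and the
/// missing file only shows up later, in a less obvious place.
fn start_swallowing(required_file: &Path) -> bool {
    let _ = open_index(required_file);
    true
}

/// Preferred: propagate the error with `?` so the caller sees the root cause.
fn start_propagating(required_file: &Path) -> Result<(), Box<dyn std::error::Error>> {
    open_index(required_file)?;
    Ok(())
}

fn main() {
    // Placeholder path that is assumed not to exist on the machine running this.
    let missing = Path::new("/nonexistent-dir-for-example/segment.term");

    println!("swallowing variant started: {}", start_swallowing(missing));

    if let Err(e) = start_propagating(missing) {
        eprintln!("error open index, caused by: {}", e);
    }
}
```

With the propagating variant, the missing file is reported at startup instead of surfacing later as a failure in an unrelated operation.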

@fulmicoton
Collaborator

fulmicoton commented Jun 21, 2018

@winding-lines Sorry, I am still not entirely sure I understand what you are experiencing and the current state of your index.

I still don't understand whether your index is currently in a corrupted state or not.

Ideally could you go through the list I wrote above, copy paste and edit it?

@fulmicoton fulmicoton modified the milestones: 0.6.0, 0.7.0 Jun 22, 2018
@winding-lines
Author

@fulmicoton I deleted the index for the time being and I am adding more logging in case this happens again. I think it's OK to close this issue since I cannot provide more information to help debug.

Sorry about the trouble.

Marius

@fulmicoton
Collaborator

@winding-lines Reopening the issue.

If tantivy were working according to spec, your error message could only be provoked manually by removing a file, which you haven't done. There is most likely a severe issue here (thank you very much for reporting it).

If someone manages to reproduce it, please document it here.
Please set up an env_logger in your app to get tantivy logs.

@winding-lines
Author

winding-lines commented Jun 23, 2018 via email

fulmicoton added a commit that referenced this issue Jun 23, 2018
The `SegmentEntry` objects for a merge are not requested atomically.
Because of that, the list of entries may change while
`start_merge` is executing.

The `segment_entries` fetched for each segment id can end
up at different revisions. As a result, we end up protecting
`.del` files associated with different opstamps.

We select the delete queue opstamp by looking at the first segment entry
only.

If GC also kicks in, the `.del` files required to catch up with the
delete queue may be deleted.
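
The race described in the commit message can be illustrated with a small, single-threaded sketch. This is not tantivy's code: `Entry`, `Registry`, `fetch_one_by_one` and `fetch_snapshot` are hypothetical names, and reading all entries under one lock is only one plausible way to get a consistent view, not necessarily how the actual fix works. The `between` callback stands in for another thread that updates the entries between two lookups.

```rust
use std::collections::HashMap;
use std::sync::Mutex;

/// Hypothetical, simplified stand-in for a segment entry: only the
/// delete opstamp matters for this illustration.
#[derive(Clone, Debug, PartialEq)]
struct Entry {
    delete_opstamp: u64,
}

/// Hypothetical registry mapping segment ids to their entries.
struct Registry {
    entries: Mutex<HashMap<u32, Entry>>,
}

impl Registry {
    fn new(ids: &[u32], opstamp: u64) -> Registry {
        let map: HashMap<u32, Entry> = ids
            .iter()
            .map(|&id| (id, Entry { delete_opstamp: opstamp }))
            .collect();
        Registry { entries: Mutex::new(map) }
    }

    /// Simulates a writer advancing the deletes of every segment.
    fn advance_all(&self, opstamp: u64) {
        let mut guard = self.entries.lock().unwrap();
        for entry in guard.values_mut() {
            entry.delete_opstamp = opstamp;
        }
    }

    /// Racy: the lock is taken once per id, so `between` (standing in for
    /// another thread) can modify the registry between two lookups.
    fn fetch_one_by_one(&self, ids: &[u32], mut between: impl FnMut()) -> Vec<Entry> {
        let mut out = Vec::new();
        for &id in ids {
            let entry = self.entries.lock().unwrap().get(&id).cloned();
            out.extend(entry);
            between();
        }
        out
    }

    /// Consistent: every entry is read under a single lock acquisition.
    fn fetch_snapshot(&self, ids: &[u32]) -> Vec<Entry> {
        let guard = self.entries.lock().unwrap();
        let out: Vec<Entry> = ids.iter().filter_map(|id| guard.get(id).cloned()).collect();
        out
    }
}

/// True when all fetched entries agree on the delete opstamp.
fn all_same_opstamp(entries: &[Entry]) -> bool {
    entries.windows(2).all(|w| w[0].delete_opstamp == w[1].delete_opstamp)
}

fn main() {
    let ids: [u32; 2] = [1, 2];
    let registry = Registry::new(&ids, 10);

    // The "other thread" advances the opstamp after each lookup.
    let racy = registry.fetch_one_by_one(&ids, || registry.advance_all(20));
    println!("one-by-one fetch consistent: {}", all_same_opstamp(&racy));

    let snapshot = registry.fetch_snapshot(&ids);
    println!("snapshot fetch consistent: {}", all_same_opstamp(&snapshot));
}
```

In the one-by-one variant the two entries are observed at different opstamps, which mirrors the commit message: a delete opstamp picked from the first entry alone no longer matches the rest.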
@fulmicoton
Copy link
Collaborator

Fix has been confirmed. Will be published as a hotfix in 0.6.1 today.

@fulmicoton fulmicoton modified the milestones: 0.7.0, 0.6.1 Jul 10, 2018