Removal of 20 second delay in overwriting index #63

Closed
chris1ance opened this issue Jan 19, 2024 · 4 comments
Labels
ongoing (Feature is currently being worked on), question (Further information is requested)

Comments

@chris1ance

Hello,

When I run

from ragatouille import RAGPretrainedModel
from ragatouille.data import CorpusProcessor

RAG = RAGPretrainedModel.from_pretrained("colbert-ir/colbertv2.0")
processor = CorpusProcessor()
my_documents = processor.process_corpus(segments)
index_path = RAG.index(index_name="my_index", collection=my_documents, split_documents=True, max_document_length=256 * 4)

where segments is a list of strings. If my_index already exists, I get the message:

[Jan 18, 17:36:43] #> Note: Output directory .ragatouille/colbert/indexes/my_index already exists

[Jan 18, 17:36:43] #> Will delete 10 files already at .ragatouille/colbert/indexes/my_index in 20 seconds...

Is there a way to eliminate this 20-second delay?

@bclavie
Collaborator

bclavie commented Jan 24, 2024

Hey, there's actually an argument that can be used internally to bypass the 20-second delay, but it's not currently exposed because I wanted to avoid people accidentally deleting indices! If this is useful, I'll make it a passable arg in an upcoming release!

@bclavie added the "question" and "ongoing" labels on Jan 24, 2024
@chris1ance
Author

Thanks for your reply! Makes sense. I would find it useful to be able to switch it on/off, or alternatively to set the number of seconds myself.

@bclavie
Collaborator

bclavie commented Jan 27, 2024

Done in 0.0.6a1! You can pass overwrite_index="force_silent_overwrite" to an index() call to immediately (and permanently) delete any existing index at the target path.
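
For readers landing on this thread, here is a minimal sketch of how that argument might be wired into the original snippet. The helper name build_index and its force_overwrite flag are hypothetical and not part of RAGatouille; only the overwrite_index="force_silent_overwrite" value comes from the comment above, and the other index() arguments are copied from the original question. The helper accepts any object exposing an index() method, so it assumes you pass in an already loaded RAGPretrainedModel.

```python
from typing import Any, Sequence


def build_index(
    rag: Any,
    documents: Sequence[Any],
    index_name: str,
    force_overwrite: bool = False,
) -> Any:
    """Hypothetical wrapper around rag.index() with an opt-in silent overwrite."""
    # Arguments taken from the snippet in the original question.
    kwargs = {
        "index_name": index_name,
        "collection": documents,
        "split_documents": True,
        "max_document_length": 256 * 4,
    }
    if force_overwrite:
        # Value quoted from the maintainer's comment (RAGatouille 0.0.6a1).
        # An existing index at the target path is deleted immediately and
        # permanently, with no 20-second grace period.
        kwargs["overwrite_index"] = "force_silent_overwrite"
    return rag.index(**kwargs)
```

With the objects from the question, the call would look like build_index(RAG, my_documents, "my_index", force_overwrite=True). Leaving force_overwrite at its default omits the argument entirely, so the library's default behaviour (including the delay) should be unchanged.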

@bclavie closed this as completed on Jan 27, 2024
@chris1ance
Author

Thank you!
