Select collection when using CLI generate #1398

Closed
bolaft opened this issue Feb 12, 2024 · 2 comments

bolaft commented Feb 12, 2024

I must be missing something. The instructions in the CLI README show how to use a "UserData" db, but I can't find how to create different dbs and select the one I want to use with generate.py. If I set --collection_name=abc with make_db.py, a db_dir_abc folder gets created, but how do I select that db with generate.py? Changing neither the langchain_mode nor the langchain_modes parameter seems to work: I just get the "Did not generate db for UserData since no sources" message and a chat with a regular LLM with no document context. I must be doing something wrong, but what?

Edit: actually, even with "UserData" as the collection name, the two-step example in the README doesn't work. The only way I can interact with my documents in CLI mode is by passing the --user_path parameter directly to generate.py.

To clarify, this works:

python generate.py --base_model=gptj --cli=True --langchain_mode=UserData --user_path=user_path --answer_with_sources=False
> Enter an instruction: what language shall be used?
> Based on the information provided in the document context, the primary language for the contract and related documents should be Arabic. (...) => correct

But this doesn't:

python src/make_db.py --user_path=user_path --collection_name=UserData
python generate.py --base_model=gptj --cli=True --langchain_mode=UserData --answer_with_sources=False

> Enter an instruction: what language shall be used?
> Did not generate db for UserData since no sources
> English is the language that will be used for our interaction. The document context provided, as well as any questions asked, will also be in English. => incorrect
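
One way to narrow down a failure like this is to check that the folder written by make_db.py actually exists, and is non-empty, in the directory generate.py is launched from. The following is only a minimal sketch: it assumes the db_dir_<collection_name> naming reported above (db_dir_abc for --collection_name=abc), and the helper find_collection_db is hypothetical, not part of the project.

```python
from pathlib import Path
from typing import Optional


def find_collection_db(collection_name: str, base_dir: str = ".") -> Optional[Path]:
    """Return the db folder for a collection if it exists and is non-empty.

    Assumption: the folder is named db_dir_<collection_name>, as reported
    in this issue. This helper is hypothetical and not part of the project.
    """
    db_dir = Path(base_dir) / f"db_dir_{collection_name}"
    # An existing but empty folder is treated the same as a missing one.
    if db_dir.is_dir() and any(db_dir.iterdir()):
        return db_dir
    return None


if __name__ == "__main__":
    # Run from the same working directory used for make_db.py and generate.py.
    for name in ("UserData", "abc"):
        found = find_collection_db(name)
        print(f"{name}: {found if found else 'no populated db folder found'}")
```

If the folder is missing or empty, the problem is on the make_db.py side; if it is present, the problem is more likely in how generate.py is invoked or in the environment itself.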

bolaft commented Feb 12, 2024

Recreating a fresh conda environment solved the issue.

bolaft closed this as completed Feb 12, 2024
@pseudotensor
Collaborator

Not sure why a new conda env would help. But the CLI control is a bit complex/confusing, so I give various examples.
