[Bug]: Chroma collection is always empty for local persistence #683

Closed
jma7889 opened this issue Jun 9, 2023 · 2 comments
Labels
bug Something isn't working

Comments


jma7889 commented Jun 9, 2023

What happened?

I am using local persistence, and the files are saved. However, each time I load, it creates new files, and the list_collections function always returns an empty result.

import os

import chromadb
from chromadb.config import Settings

WORKING_DIR = '../..'
CHROMA_STORAGE_PATH = os.path.join(WORKING_DIR, 'bin', 'chromadb')

# Resolve the relative path so it is clear which directory is used.
full_path = os.path.abspath(CHROMA_STORAGE_PATH)
print(full_path)

assert os.path.exists(full_path), f"The path {full_path} does not exist."

chroma_client = chromadb.Client(
    Settings(
        chroma_db_impl="duckdb+parquet",
        persist_directory=CHROMA_STORAGE_PATH))
collections = chroma_client.list_collections()

if not collections:
    print('Chroma collection is empty')
for collection in collections:
    # print() does not interpolate %s placeholders, so format explicitly.
    print('Retrieved chroma collection with name: %s %s'
          % (collection.id, collection.name))

Versions

chromadb 0.3.26

Relevant log output

Chroma collection is empty

jma7889 added the bug label Jun 9, 2023

jeffchuber (Contributor) commented

@jma7889 are you using many clients? This usually happens because of the way atexit handles cleanup: it auto-persists the first client last, which makes it look like it is wiping the db.

Sorry about this! We are considering taking out the magic entirely since it seems to bite a lot of people.
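
To make the ordering problem concrete, here is a minimal sketch that does not use chromadb at all. `FakePersistentClient` and `simulate_exit` are hypothetical stand-ins, and the assumption is that each client loads the on-disk state when it is created and writes its whole in-memory state back when it persists. Python's atexit runs handlers in reverse registration order, so the client created first persists last.

```python
class FakePersistentClient:
    """Hypothetical stand-in: loads state on creation, overwrites it on persist."""

    def __init__(self, disk):
        self.disk = disk  # shared dict standing in for the persist directory
        self.collections = list(disk.get("collections", []))

    def create_collection(self, name):
        self.collections.append(name)

    def persist(self):
        # Writes the full in-memory snapshot, replacing whatever is on disk.
        self.disk["collections"] = list(self.collections)


def simulate_exit(clients_in_creation_order):
    # atexit calls handlers last-registered-first, so the first client goes last.
    for client in reversed(clients_in_creation_order):
        client.persist()


disk = {}
first = FakePersistentClient(disk)   # created first, sees an empty store
second = FakePersistentClient(disk)  # created later, also starts empty
second.create_collection("docs")

simulate_exit([first, second])
print(disk["collections"])  # [] because the stale first client persisted last
```

Under these assumptions, the collection written by the second client is replaced by the first client's empty snapshot, which matches the "looks like it is wiping the db" symptom.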


jma7889 commented Jun 11, 2023

Update 1:
It seems the code that gets chroma_client can only be called once; otherwise, it creates a new database. This is confusing. Once I call the code below only once, I can see that the collection is not empty. The chromadb.Client function does not get a client; it creates an instance of the database!

Original:
@jeffchuber I don't intend to use more than one client. The only place I get a client is the code above. I do use it in a Jupyter notebook, so the code might be called more than once. Each time it seems to create a new set of files, but list_collections always returns an empty result.

chroma_client = chromadb.Client(
    Settings(
        chroma_db_impl="duckdb+parquet",
        persist_directory=CHROMA_STORAGE_PATH))

Is the above code allowed to be called more than once? Or does it create a client as well as a new database every time?
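
One possible way to avoid constructing the client more than once is to cache it per storage path. This is only a sketch: `get_client`, `_clients`, and the `factory` argument are hypothetical names, not part of chromadb, and the factory is passed in so the caching logic can be shown without assuming anything about the library itself.

```python
import os

# Hypothetical module-level cache: one client object per absolute storage path.
_clients = {}


def get_client(persist_directory, factory):
    """Return the cached client for this directory, creating it only once."""
    key = os.path.abspath(persist_directory)
    if key not in _clients:
        _clients[key] = factory(key)
    return _clients[key]


# Possible usage with the settings from this thread (chromadb 0.3.x):
# chroma_client = get_client(
#     CHROMA_STORAGE_PATH,
#     lambda path: chromadb.Client(
#         Settings(chroma_db_impl="duckdb+parquet", persist_directory=path)))
```

In a notebook, this helper would need to live in an imported module rather than in a cell, since re-running a cell that contains `_clients = {}` resets the cache.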

jma7889 closed this as completed Jun 11, 2023