upgrade chroma to 0.4.0 #7749

Merged: baskaryan merged 7 commits into langchain-ai:master on Jul 19, 2023

jeffchuber (Contributor) commented Jul 15, 2023

** This should land Monday the 17th **

Chroma is upgrading from 0.3.29 to 0.4.0. 0.4.0 is easier to build, more durable, faster, smaller, and more extensible. This comes with a few changes:

  1. A simplified and improved client setup. Instead of having to remember weird settings, users can just do EphemeralClient, PersistentClient or HttpClient (the underlying direct Client implementation is also still accessible)

  2. We migrated data stores away from duckdb and clickhouse. This changes the API for the PersistentClient, which used to reference chroma_db_impl="duckdb+parquet". Now we simply set is_persistent=True. is_persistent is set to True for you if you use PersistentClient.

  3. Because we migrated away from duckdb and clickhouse, users need to migrate their data into the new layout and schema. Chroma is committed to providing extensive notification and tooling around any schema and data migrations (for example - this PR!).

After upgrading to 0.4.0 - if users try to access data that was stored in the previous regime, the system will throw an Exception and instruct them how to use the migration assistant to migrate their data. The migration assistant is a pip-installable CLI: pip install chroma-migrate. It is run by calling chroma-migrate.

-- TODO ADD here is a short video demonstrating how it works.

Please reference the readme at chroma-core/chroma-migrate to see a full write-up of our philosophy on migrations as well as more details about this particular migration.

Please direct any users facing issues upgrading to our Discord channel called #get-help. We have also created an email listserv to notify developers directly in the future about breaking changes.

TODO

  • Migrated any duckdb+parquet strings to the new format
  • Notified users about the breaking change (this PR, other suggestions?)
  • Move pypi target away from feature-branch and to 0.4.0 after 0.4.0 is released.
  • We need to merge a more flexible version range for fastapi into the chroma branch; this is in progress in chroma-core/chroma#807 ("test this range")

@dosubot dosubot bot added Ɑ: vector store Related to vector store module 🤖:improvement Medium size change to existing code to handle new use-cases labels Jul 15, 2023
jeffchuber (Contributor, Author) commented

This requires more work - first resolving the fastapi version issue.

HammadB pushed a commit to chroma-core/chroma that referenced this pull request Jul 17, 2023
@jeffchuber jeffchuber marked this pull request as ready for review July 18, 2023 00:17
    @@ -95,7 +95,7 @@ def __init__(
                     _client_settings = client_settings
                 elif persist_directory:
                     _client_settings = chromadb.config.Settings(
    -                    chroma_db_impl="duckdb+parquet",
    +                    is_persistent=True,
Collaborator

Is this update backwards compatible with old config settings? I.e., if someone was passing in

    _client_settings = chromadb.config.Settings(
        chroma_db_impl="duckdb+parquet",
        persist_directory=persist_directory,
    )

directly before, would the behavior they're seeing change?

Contributor Author

They will see this error:

You are using a deprecated configuration of Chroma. Please pip install chroma-migrate and run chroma-migrate to upgrade your configuration. See https://docs.trychroma.com/migration for more information or join our discord at https://discord.gg/8g5FESbj for help!

More here: https://docs.trychroma.com/migration
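
For callers who build `chromadb.config.Settings` themselves, one way to avoid passing the deprecated configuration is to choose the keyword arguments from the installed Chroma version. The sketch below is only an illustration of that idea: the helper `chroma_settings_kwargs` is hypothetical, and the 0.4 cutoff is taken from the PR description rather than from Chroma's own code. It changes configuration only; data written by an older version still has to be migrated with the migration assistant.

```python
from typing import Any, Dict


def chroma_settings_kwargs(chroma_version: str, persist_directory: str) -> Dict[str, Any]:
    """Return Settings keyword arguments suited to the given Chroma version.

    Hypothetical helper: versions before 0.4 get the legacy
    chroma_db_impl string, 0.4 and later get is_persistent=True.
    """
    # Only the major and minor components matter for the cutoff.
    major, minor = (int(part) for part in chroma_version.split(".")[:2])

    kwargs: Dict[str, Any] = {"persist_directory": persist_directory}
    if (major, minor) < (0, 4):
        kwargs["chroma_db_impl"] = "duckdb+parquet"
    else:
        kwargs["is_persistent"] = True
    return kwargs


# Example usage (requires chromadb to be installed):
#   import chromadb
#   settings = chromadb.config.Settings(
#       **chroma_settings_kwargs(chromadb.__version__, "./my_chroma_data")
#   )
```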

@baskaryan baskaryan mentioned this pull request Jul 18, 2023
@baskaryan baskaryan added the lgtm PR looks good. Use to confirm that a PR is ready for merging. label Jul 19, 2023
@baskaryan baskaryan merged commit 2139d01 into langchain-ai:master Jul 19, 2023
14 checks passed