Qdrant: add helm charts #334

camdencheek · 2023-08-18T22:01:30Z

This implements helm charts to add a Qdrant deployment to our cloud deployments. The intent is to replace the embeddings service entirely and to stop depending on blobstore for embeddings storage.

Checklist

Follow the manual testing process
Update changelog

Test plan

I ran the cluster locally and confirmed that I can generate and search embeddings with Qdrant.

I tested with the following override.yaml:

# Disable SC creation
storageClass:
  create: false
  name: standard

# Disable resources requests/limits
sourcegraph:
  localDevMode: true
  image:
    defaultTag: insiders
    useGlobalTagAsDefault: true
# More values to be added in order to test your change

qdrant:
  enabled: true

gitserver:
  env:
    SRC_REPOS_DESIRED_PERCENT_FREE:
      value: "0"

frontend:
  env:
    QDRANT_ENDPOINT:
      value: "qdrant:6334"

camdencheek · 2023-08-18T22:02:47Z

charts/sourcegraph/values.yaml

+  extraVolumeMounts: {}
+  extraVolumes: {}
+  # -- PVC Storage Request for `qdrant` data volume
+  storageSize: 100Gi


How difficult is it to increase the size of a PVC?

It looks like we allow volume expansion by default, but I'm honestly not sure what the ramifications are. 100Gi is large enough for a large chunk of customers, but certainly not all.

cloud prefers to start small and scale up as needed (scaling up is easy and we have monitoring for it)
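
For installs that outgrow the default, the request could be raised from an override file. This is a minimal sketch, assuming the `storageSize` key shown above sits under `qdrant:`; the 200Gi figure is purely illustrative. Whether changing this value resizes an existing claim depends on how the chart renders the PVC, which was not verified here.

```yaml
# override.yaml (illustrative values, not a recommendation)
qdrant:
  enabled: true
  # PVC storage request for the qdrant data volume; the default in this PR is 100Gi.
  # Kubernetes can only grow an existing PVC if its StorageClass sets
  # allowVolumeExpansion: true, and it does not support shrinking a PVC afterwards.
  storageSize: 200Gi
```

Because a PVC cannot be shrunk, starting small and growing on demand is the lower-risk direction, which matches the cloud preference noted above.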

camdencheek · 2023-08-18T22:06:33Z

charts/sourcegraph/values.yaml

+    # -- Docker image name for the `embeddings` image
+    name: "qdrant"
+    # -- Docker image tag for the `embeddings` image
+    defaultTag: "239247_2023-08-18_5.1-433e1b1c997f@sha256:eafcd7af2aca699fa9c9ce8e6aa674cc0470441f794baf031296d5d1cdadd0bc"


Pointing to the latest build from main. Should be updated at release.

camdencheek · 2023-08-18T22:06:43Z

charts/sourcegraph/values.yaml


+qdrant:
+  # -- Enable `qdrant`
+  enabled: false


Disabled by default, just like embeddings

camdencheek · 2023-08-22T04:09:20Z

charts/sourcegraph/templates/qdrant/qdrant.ConfigMap.yaml

+    performance:
+      max_optimization_threads: 4
+    optimizers:
+      max_optimization_threads: 4
+      mmap_threshold_kb: 1
+      indexing_threshold_kb: 0
+    hnsw_index:
+      m: 8
+      ef_construct: 100
+      full_scan_threshold: 10
+      max_indexing_threads: 4
+      on_disk: true
+      payload_m: 8


I'll probably pull these out to make them configurable, but they're just default values that can be overridden at collection creation, which the application handles.

let's add a note in the manifest and say they're basically unused

camdencheek · 2023-08-22T04:22:05Z

charts/sourcegraph/templates/qdrant/qdrant.Service.yaml

+    sourcegraph.prometheus/scrape: "true"
+    prometheus.io/port: "6333"


I tested that these populate metrics in Grafana. I'll need to add a basic dashboard for qdrant

camdencheek · 2023-08-22T04:29:05Z

charts/sourcegraph/values.yaml

    # -- Docker image tag for the `embeddings` image
    defaultTag: "5.1.6@sha256:e849f52e38637882e5d2ba3d7d27a656d897c4b4e2905e1fdb843536d9c948ab"
-  # -- Resource requests & limits for the `worker` container,
+  # -- Resource requests & limits for the `embeddings` container,


unrelated, just fixing a couple of typos

camdencheek · 2023-08-22T04:35:31Z

charts/sourcegraph/values.yaml

+    limits:
+      cpu: "2"
+      memory: 8G
+    requests:
+      cpu: "500m"
+      memory: 2G


Are there any guidelines for sizing on cloud, other than just "whatever is needed for perf"?

in cloud, we will provide our own override.

https://github.com/sourcegraph/cloud/blob/main/overrides/global.override.yaml

https://github.com/sourcegraph/cloud/blob/main/overrides/small-resources.override.yaml

generally start small and increase as needed

what you put as default here would serve as the recommendation for on-prem customers, not cloud

got it, thanks!
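
An on-prem operator who needs more headroom than the chart defaults could supply their own override in the same way. This is a sketch only: the `qdrant.resources` path is an assumption (the snippet above shows just the `limits`/`requests` block, not its parent key), the limits mirror the single-pod figure (32GB memory, 4 cores) reported elsewhere in this thread, and the requests are made up for illustration.

```yaml
# override.yaml for an on-prem install (illustrative sizing, not a recommendation)
qdrant:
  enabled: true
  # Assumption: the limits/requests block from values.yaml lives under qdrant.resources.
  resources:
    limits:
      cpu: "4"
      memory: 32G
    requests:
      # Hypothetical request values; tune them against observed usage.
      cpu: "2"
      memory: 8G
```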

michaellzc

overall lgtm

question to @camdencheek

qdrant seems to support cluster mode. how does it work? do we need it to support higher traffic? it's okay to say this is out of scope for the MVP.

question to @sourcegraph/team-release

how do we decide whether to use an upstream chart or roll our own? https://github.com/qdrant/qdrant-helm/

michaellzc · 2023-08-22T15:47:05Z

charts/sourcegraph/templates/qdrant/qdrant.ConfigMap.yaml

+    debug: true
+    log_level: INFO
+    storage:
+      storage_path: /data
+      snapshots_path: /data/storage
+      on_disk_payload: true


these seem to be global flags; we should parameterize them

Do we need to parameterize them even if we don't want them to be configurable? We control the volume mount location as well, so I can't think of any reason to move the storage path elsewhere

at least debug should be configurable, no?

I'm not sure what debug: true implies for qdrant, but we can expose only some of the fields here, e.g., debug and log_level

Oooh, gotcha, yeah definitely. No idea why my eyes focused on just the storage path 😂
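
Exposing only those two fields might look roughly like the template fragment below. It is a sketch, not the change that was merged: the value paths `.Values.qdrant.config.debug` and `.Values.qdrant.config.logLevel` and the `config.yaml` data key are hypothetical names, and it assumes values.yaml defines defaults for both values.

```yaml
# Fragment of a ConfigMap template (Helm/Go templating).
# Hypothetical: the .Values.qdrant.config.* paths and the config.yaml key name.
data:
  config.yaml: |
    debug: {{ .Values.qdrant.config.debug }}
    log_level: {{ .Values.qdrant.config.logLevel | quote }}
    storage:
      # Left hard-coded on purpose: the chart also controls the volume mount location.
      storage_path: /data
      snapshots_path: /data/storage
      on_disk_payload: true
```

Keeping the storage paths fixed follows the reasoning above: they are tied to the mount the chart itself defines, so there is little reason to make them configurable.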

camdencheek · 2023-08-22T16:03:00Z

do we need it to support higher traffic?

I don't think we'll need cluster mode. A single pod with 32GB memory and 4 cores was able to handle sourcegraph.com-level size and traffic in my tests. Most of the scaling benefit comes from being able to build an index rather than horizontal scaling.

It's possible that we'll want to support cluster mode in the future, but a single pod should be plenty for the MVP (and beyond).

camdencheek · 2023-08-22T16:08:07Z

how do we decide whether to use an upstream chart or roll our own?

Oh, I kinda assumed that our chart shouldn't have dependencies and that we'd want to follow the same patterns across all our services. I guess that wasn't a clear assumption 😄

michaellzc · 2023-08-22T16:19:04Z

how do we decide whether to use an upstream chart or roll our own?

Oh, I kinda assumed that our chart shouldn't have dependencies and that we'd want to follow the same patterns across all our services. I guess that wasn't a clear assumption 😄

yeah, that's a question for @sourcegraph/team-release how they want to approach this in the future.

but it shouldn't be blocking us from merging this experimental thing.

jdpleiness · 2023-08-22T16:33:42Z

how do we decide whether to use an upstream chart or roll our own?

Oh, I kinda assumed that our chart shouldn't have dependencies and that we'd want to follow the same patterns across all our services. I guess that wasn't a clear assumption 😄

yeah, that's a question for https://github.com/orgs/sourcegraph/teams/team-release how they want to approach this in the future.

but it shouldn't be blocking us from merging this experimental thing.

Yeah, this is how we've been doing it, so it's fine for now 👍

We're still working things like this over and deciding on the best way forward.

michaellzc

👍

camdencheek added 7 commits August 18, 2023 10:40

wip

d3ae212

wip maybe complete

4d69593

fix typo

c94d333

wip permissions failing

4adad9a

working

f00027b

add TODO for readiness

766e4d7

update readme

6d1aed6

camdencheek force-pushed the cc/qdrant branch from d4a2b79 to 6d1aed6 Compare August 18, 2023 22:02

camdencheek force-pushed the cc/qdrant branch from 94c995d to 63601ee Compare August 18, 2023 22:07

make more standard

17b5399

camdencheek force-pushed the cc/qdrant branch from 63601ee to 17b5399 Compare August 18, 2023 22:11

camdencheek changed the title ~~Cc/qdrant~~ Qdrant: add helm charts Aug 21, 2023

camdencheek added 2 commits August 21, 2023 14:50

minimal working

c2e3893

update docs

7856769

fix prometheus scraping

d5b7479

camdencheek added 2 commits August 21, 2023 22:24

remove outdated TODO

3303bcc

add liveness and readiness

7b2814d

camdencheek marked this pull request as ready for review August 22, 2023 04:37

update changelog

a5c2ae7

camdencheek requested review from a team and ggilmore August 22, 2023 04:40

camdencheek mentioned this pull request Aug 22, 2023

Migrate to qdrant sourcegraph/sourcegraph-public-snapshot#55527

Open

michaellzc requested a review from a team August 22, 2023 15:36

michaellzc reviewed Aug 22, 2023

jdpleiness approved these changes Aug 22, 2023

add config for debug and log_level

ba0fdec

michaellzc approved these changes Aug 22, 2023

update docs

34f82d7

camdencheek merged commit 34ff63d into main Aug 22, 2023

camdencheek deleted the cc/qdrant branch August 22, 2023 17:06
