Skip to content

Conversation

@camdencheek
Copy link
Member

@camdencheek camdencheek commented Aug 18, 2023

This implements helm charts to add a Qdrant deployment to our cloud deployments. The intent is to replace the embeddings service entirely and to stop depending on blobstore for embeddings storage.

Checklist

Test plan

I ran the cluster locally, ensuring that I can generate and search embeddings with Qdrant.

I tested with the following override.yaml:

# Disable SC creation
storageClass:
  create: false
  name: standard

# Disable resources requests/limits
sourcegraph:
  localDevMode: true
  image:
    defaultTag: insiders
    useGlobalTagAsDefault: true
# More values to be added in order to test your change

qdrant:
  enabled: true

gitserver:
  env:
    SRC_REPOS_DESIRED_PERCENT_FREE:
      value: "0"

frontend:
  env:
    QDRANT_ENDPOINT:
      value: "qdrant:6334"

extraVolumeMounts: {}
extraVolumes: {}
# -- PVC Storage Request for `qdrant` data volume
storageSize: 100Gi
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How difficult is it to increase the size of a PVC?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like we allow volume expansion by default, but I'm honestly not sure what the ramificaitons are. 100Gi is large enough for a large chunk of customers, but certainly not all.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cloud prefers to start small and scale up as needed (scaling up is easy and we have monitoring for it)

# -- Docker image name for the `embeddings` image
name: "qdrant"
# -- Docker image tag for the `embeddings` image
defaultTag: "239247_2023-08-18_5.1-433e1b1c997f@sha256:eafcd7af2aca699fa9c9ce8e6aa674cc0470441f794baf031296d5d1cdadd0bc"
Copy link
Member Author

@camdencheek camdencheek Aug 18, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pointing to the latest build from main. Should be updated at release.


qdrant:
# -- Enable `qdrant`
enabled: false
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Disabled by default, just like embeddings

@camdencheek camdencheek changed the title Cc/qdrant Qdrant: add helm charts Aug 21, 2023
Comment on lines 17 to 29
performance:
max_optimization_threads: 4
optimizers:
max_optimization_threads: 4
mmap_threshold_kb: 1
indexing_threshold_kb: 0
hnsw_index:
m: 8
ef_construct: 100
full_scan_threshold: 10
max_indexing_threads: 4
on_disk: true
payload_m: 8
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will probably pull these out to make them configurable, but they are just default values that can be overridden on collection creation which is handled in the application.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's add a note in the manifest and say they're basically unused

Comment on lines +6 to +7
sourcegraph.prometheus/scrape: "true"
prometheus.io/port: "6333"
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tested that these populate metrics in Grafana. I'll need to add a basic dashboard for qdrant

# -- Docker image tag for the `embeddings` image
defaultTag: "5.1.6@sha256:e849f52e38637882e5d2ba3d7d27a656d897c4b4e2905e1fdb843536d9c948ab"
# -- Resource requests & limits for the `worker` container,
# -- Resource requests & limits for the `embeddings` container,
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unrelated, just fixing a couple of typos

Comment on lines +338 to +343
limits:
cpu: "2"
memory: 8G
requests:
cpu: "500m"
memory: 2G
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are there any guidelines for sizing on cloud, other than just "whatever is needed for perf"?

Copy link
Member

@michaellzc michaellzc Aug 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in cloud, we will provide our own override.

generally start small and increase as needed

what you put as default here would serve as the recommendation for on-prem customer, not cloud

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

got it, thanks!

@camdencheek camdencheek marked this pull request as ready for review August 22, 2023 04:37
Copy link
Member

@michaellzc michaellzc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

overall lgtm

question to @camdencheek

  • qdrant seems to support cluster mode. how does it work? do we need it to support higher traffic? it's okay to say this out of the scope for MVP.

question to @sourcegraph/team-release

Comment on lines 11 to 16
debug: true
log_level: INFO
storage:
storage_path: /data
snapshots_path: /data/storage
on_disk_payload: true
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these seem to be global flags; we should parameterize them

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to parameterize them even if we don't want them to be configurable? We control the volume mount location as well, so I can't think of any reason to move the storage path elsewhere

Copy link
Member

@michaellzc michaellzc Aug 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

at least debug should be configurable, no?

I'm not sure the implication of debug: true means for qdrant, but we can only expose some of the fields here, e.g., debug and log_level

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oooh, gotcha, yeah definitely. No idea why my eyes focused on just the storage path 😂

@camdencheek
Copy link
Member Author

do we need it to support higher traffic?

I don't think we'll need cluster mode. A single pod with 32GB memory and 4 cores was able to handle sourcegraph.com-level size and traffic in my tests. Most of the scaling benefit comes from being able to build an index rather than horizontal scaling.

It's possible that we'll want to support cluster mode in the future, but a single pod should be plenty for the MVP (and beyond).

@camdencheek
Copy link
Member Author

how do we decide whether to use an upstream chart or roll our own?

Oh, I kinda assumed that our chart shouldn't have dependencies and that we'd want to follow the same patterns across all our services. I guess that wasn't a clear assumption 😄

@michaellzc
Copy link
Member

michaellzc commented Aug 22, 2023

how do we decide whether to use an upstream chart or roll our own?

Oh, I kinda assumed that our chart shouldn't have dependencies and that we'd want to follow the same patterns across all our services. I guess that wasn't a clear assumption 😄

yeah, that's a question for @sourcegraph/team-release how they want to approach this in the future.

but it shoudln't be blocking us from merging this experimental thing.

@jdpleiness
Copy link
Contributor

how do we decide whether to use an upstream chart or roll our own?

Oh, I kinda assumed that our chart shouldn't have dependencies and that we'd want to follow the same patterns across all our services. I guess that wasn't a clear assumption 😄

yeah, that's a question for https://github.com/orgs/sourcegraph/teams/team-release how they want to approach this in the future.

but it shoudln't be blocking us from merging this experimental thing.

Yeah, this is how we've been doing it, so it's fine for now 👍

We're still working things like this over and deciding on the best way forward.

Copy link
Member

@michaellzc michaellzc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@camdencheek camdencheek merged commit 34ff63d into main Aug 22, 2023
@camdencheek camdencheek deleted the cc/qdrant branch August 22, 2023 17:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants