Add e2e test with WebCrawler + embeddings + vector sink + completions #444

Merged: 1 commit merged into main from test-webcrawler on Sep 19, 2023

nicoloboschi

It uses a bunch of agents:

  • webcrawler source (uses minio for status)
  • text-extractor
  • language-detector
  • text-splitter
  • document-to-json
  • compute
  • compute-ai-embeddings
  • query (astra + vector)
  • ai-chat-completions
  • vector-db-sink (astra)

It's disabled on CI. To run it, you have to export all of these:

export OPEN_AI_ACCESS_KEY=
export OPEN_AI_URL=
export OPEN_AI_EMBEDDINGS_MODEL=
export OPEN_AI_CHAT_COMPLETIONS_MODEL=
export OPEN_AI_PROVIDER=
export ASTRA_TOKEN=
export ASTRA_CLIENT_ID=
export ASTRA_SECRET=
export ASTRA_SECURE_BUNDLE=
export ASTRA_ENVIRONMENT=
export ASTRA_DATABASE=
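
Since the test only makes sense when every variable above is set, a small pre-flight check can report what is missing before the run starts. The following is a minimal sketch, not code from this PR: the class name `RequiredEnv` and the method `missing` are hypothetical, and only the variable names come from the list above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of the PR: reports which required variables are unset.
class RequiredEnv {
    // Variable names copied from the export list in the PR description.
    static final List<String> REQUIRED = List.of(
            "OPEN_AI_ACCESS_KEY", "OPEN_AI_URL", "OPEN_AI_EMBEDDINGS_MODEL",
            "OPEN_AI_CHAT_COMPLETIONS_MODEL", "OPEN_AI_PROVIDER",
            "ASTRA_TOKEN", "ASTRA_CLIENT_ID", "ASTRA_SECRET",
            "ASTRA_SECURE_BUNDLE", "ASTRA_ENVIRONMENT", "ASTRA_DATABASE");

    // Returns the required names that are absent or blank in the given environment map.
    static List<String> missing(Map<String, String> env) {
        List<String> result = new ArrayList<>();
        for (String name : REQUIRED) {
            String value = env.get(name);
            if (value == null || value.isBlank()) {
                result.add(name);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> unset = missing(System.getenv());
        if (unset.isEmpty()) {
            System.out.println("All required variables are set.");
        } else {
            System.out.println("Missing: " + String.join(", ", unset));
        }
    }
}
```

Taking the environment as a `Map` argument keeps the check testable without touching the real process environment.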


eolivelli left a comment

+1

I have left some minor feedback.

@@ -281,7 +281,11 @@ protected static ConsumeGatewayMessage consumeOneMessageFromGateway(
"bin/langstream gateway consume %s %s %s -n 1"
.formatted(applicationId, gatewayId, String.join(" ", extraArgs));
final String response = executeCommandOnClient(command);
-final String secondLine = response.lines().collect(Collectors.toList()).get(1);
+final List<String> lines = response.lines().collect(Collectors.toList());
+if (lines.size() <= 1) {

What is in the first line? Should we make some assertions?
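
For reference, the guard under discussion could look roughly like the sketch below. It is an illustration under assumptions, not the merged code: the hunk is cut off after the `if`, so throwing `IllegalStateException` is a guess at the failure behavior, and `GatewayResponse.secondLine` is a hypothetical name. The thread does not say what the first line contains, so the sketch makes no assertion about it.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper illustrating the size check before reading the second line.
class GatewayResponse {
    // Returns the second line of the CLI output, or throws if there is no second line.
    static String secondLine(String response) {
        final List<String> lines = response.lines().collect(Collectors.toList());
        if (lines.size() <= 1) {
            // Assumption: fail loudly with the raw output instead of an index error.
            throw new IllegalStateException(
                    "Expected at least 2 lines from gateway consume, got: " + response);
        }
        return lines.get(1);
    }
}
```

Compared with calling `get(1)` directly, this turns an `IndexOutOfBoundsException` into an error message that includes the raw command output, which is usually easier to debug in an e2e run.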

- name: "log-topic"
creation-mode: create-if-not-exists
errors:
on-failure: "skip"

Maybe we can drop this and let it fail.

nicoloboschi merged commit 9bba4f0 into main Sep 19, 2023
15 of 16 checks passed
nicoloboschi deleted the test-webcrawler branch September 19, 2023 14:59
benfrank241 pushed a commit to vectorize-io/langstream that referenced this pull request May 2, 2024