Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

e2e test added #134

Conversation

gecBurton
Copy link
Collaborator

Context

As an Engineer I want an e2e test to know that a change i have made in one service has not adversely affected another.

Changes proposed in this pull request

  1. A new test has been added that tests that on uploading a file at least one embedding happens
  2. embed_sentences has been moved in model_db.py

Note: Whilst trying to remove all the hardcoded model names I have identified a problem whereby the docker image always preloads the default paraphrase-albert-small-v2 so any attempt to use a smaller model actually makes this worse by adding another, albeit smaller, model. I have removed the smaller model until we find a way to resolve this.

It is not possible to run this test in CI at this time as the disk runs out of space.

Guidance to review

docker compose down
cp .env.example .env
docker compose up -d ingest embed core-api
make test-integration

Relevant links

https://technologyprogramme.atlassian.net/browse/REDBOX-44

Things to check

  • I have added any new ENV vars in all deployed environments
  • I have tested any code added or changed

@@ -39,6 +39,12 @@ test-ingest:
poetry run pytest ingest/tests --cov=ingest -v --cov-report=term-missing --cov-fail-under=40

test-django:
docker compose up -d db --wait db
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

duplicated db statement

core_api/src/app.py Show resolved Hide resolved
core_api/tests/test_app.py Outdated Show resolved Hide resolved
download_embedder.py Show resolved Hide resolved
embed/src/app.py Show resolved Hide resolved
model_db.py Show resolved Hide resolved
model_db.py Show resolved Hide resolved
data=reformatted_embeddings,
embedding_id=str(uuid4()),
model=model,
model_info=self.get_model_info(model),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we need this model property if we're also giving back themodel_info that includes the model name as well?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can leave it in by all means if you think it would be useful by consuming functions, just spitballing

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lmwilki will explain! (i think this is a standard way of representing the response)

pyproject.toml Outdated Show resolved Hide resolved
tests/test_e2e.py Show resolved Hide resolved
Makefile Show resolved Hide resolved
@gecBurton gecBurton force-pushed the feature/REDBOX-44-test-end-to-end-file-processing-from-core-api-to-elastic-4 branch from ba6af26 to f635e35 Compare March 21, 2024 14:47
@gecBurton gecBurton merged commit 8c5bc57 into main Mar 21, 2024
3 checks passed
@gecBurton gecBurton deleted the feature/REDBOX-44-test-end-to-end-file-processing-from-core-api-to-elastic-4 branch March 21, 2024 14:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants