feat: add pytest-vcr for recording HTTP interactions in integration tests (#2445)

Using `pytest-vcr` in integration tests has several benefits. Firstly,
it removes the need to mock external services, as VCR records and
replays HTTP interactions on the fly. Secondly, it simplifies the
integration test setup by eliminating the need to set up and tear down
external services in some cases. Finally, it allows for more reliable
and deterministic integration tests by ensuring that HTTP interactions
are always replayed with the same response.
Overall, `pytest-vcr` is a valuable tool for simplifying integration
test setup and improving test reliability.

This commit adds the `pytest-vcr` package as a dependency for
integration tests in the `pyproject.toml` file. It also introduces two
new fixtures in `tests/integration_tests/conftest.py` for managing
cassette directories and VCR configurations.

In addition, the
`tests/integration_tests/vectorstores/test_elasticsearch.py` file has
been updated to use the `@pytest.mark.vcr` decorator for recording and
replaying HTTP interactions.

Finally, this commit removes the `documents` fixture from the
`test_elasticsearch.py` file and replaces it with a new fixture defined
in `tests/integration_tests/vectorstores/conftest.py` that yields a list
of documents to use in any other tests.

This also includes my second attempt to fix issue #2386.

Maybe related #2484
sergerdn committed Apr 7, 2023
1 parent c9f93f5 commit 6dc86ad
Showing 11 changed files with 1,945 additions and 16 deletions.
4 changes: 2 additions & 2 deletions Dockerfile
@@ -27,15 +27,15 @@ FROM builder AS dependencies
 COPY pyproject.toml poetry.lock poetry.toml ./

 # Install the Poetry dependencies (this layer will be cached as long as the dependencies don't change)
-RUN $POETRY_HOME/bin/poetry install --no-interaction --no-ansi
+RUN $POETRY_HOME/bin/poetry install --no-interaction --no-ansi --with test

 # Use a multi-stage build to run tests
 FROM dependencies AS tests

 # Copy the rest of the app source code (this layer will be invalidated and rebuilt whenever the source code changes)
 COPY . .

-RUN /opt/poetry/bin/poetry install --no-interaction --no-ansi
+RUN /opt/poetry/bin/poetry install --no-interaction --no-ansi --with test

 # Set the entrypoint to run tests using Poetry
 ENTRYPOINT ["/opt/poetry/bin/poetry", "run", "pytest"]
25 changes: 22 additions & 3 deletions langchain/vectorstores/elastic_vector_search.py
@@ -146,6 +146,7 @@ def add_texts(
             List of ids from adding the texts into the vectorstore.
         """
         try:
+            from elasticsearch.exceptions import NotFoundError
             from elasticsearch.helpers import bulk
         except ImportError:
             raise ValueError(
@@ -155,6 +156,17 @@ def add_texts(
         requests = []
         ids = []
         embeddings = self.embedding.embed_documents(list(texts))
+        dim = len(embeddings[0])
+        mapping = _default_text_mapping(dim)
+
+        # check to see if the index already exists
+        try:
+            self.client.indices.get(index=self.index_name)
+        except NotFoundError:
+            # TODO would be nice to create index before embedding,
+            # just to save expensive steps for last
+            self.client.indices.create(index=self.index_name, mappings=mapping)
+
         for i, text in enumerate(texts):
             metadata = metadatas[i] if metadatas else {}
             _id = str(uuid.uuid4())
@@ -229,6 +241,7 @@ def from_texts(
         )
         try:
             import elasticsearch
+            from elasticsearch.exceptions import NotFoundError
             from elasticsearch.helpers import bulk
         except ImportError:
             raise ValueError(
@@ -245,9 +258,15 @@ def from_texts(
         embeddings = embedding.embed_documents(texts)
         dim = len(embeddings[0])
         mapping = _default_text_mapping(dim)
-        # TODO would be nice to create index before embedding,
-        # just to save expensive steps for last
-        client.indices.create(index=index_name, mappings=mapping)
+
+        # check to see if the index already exists
+        try:
+            client.indices.get(index=index_name)
+        except NotFoundError:
+            # TODO would be nice to create index before embedding,
+            # just to save expensive steps for last
+            client.indices.create(index=index_name, mappings=mapping)
+
         requests = []
         for i, text in enumerate(texts):
             metadata = metadatas[i] if metadatas else {}
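
The change to `add_texts` and `from_texts` follows a check-then-create pattern: look the index up, and create it only when the lookup raises `NotFoundError`. Below is a minimal sketch of that control flow, using an in-memory stand-in instead of a real Elasticsearch client. The `FakeIndices` class, the local `NotFoundError`, and the `ensure_index` helper are all hypothetical and exist only for illustration.

```python
from typing import Dict


class NotFoundError(Exception):
    """Local stand-in for elasticsearch.exceptions.NotFoundError."""


class FakeIndices:
    """In-memory stand-in for the client's `indices` namespace (assumption)."""

    def __init__(self) -> None:
        self.store: Dict[str, dict] = {}

    def get(self, index: str) -> dict:
        # Raise when the index is unknown, as the diff expects of the real client.
        if index not in self.store:
            raise NotFoundError(index)
        return self.store[index]

    def create(self, index: str, mappings: dict) -> None:
        self.store[index] = mappings


def ensure_index(indices: FakeIndices, index_name: str, mapping: dict) -> bool:
    """Create the index only if it does not exist; return True when created."""
    try:
        indices.get(index=index_name)
    except NotFoundError:
        indices.create(index=index_name, mappings=mapping)
        return True
    return False


indices = FakeIndices()
first = ensure_index(indices, "demo-index", {"properties": {}})
second = ensure_index(indices, "demo-index", {"properties": {}})
print(first, second)
```

With this pattern, a repeated call reuses the existing index instead of attempting to create it again.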
46 changes: 40 additions & 6 deletions poetry.lock

Generated file; diff not rendered.

2 changes: 2 additions & 0 deletions pyproject.toml
@@ -90,6 +90,8 @@ optional = true
 [tool.poetry.group.test_integration.dependencies]
 openai = "^0.27.4"
 elasticsearch = {extras = ["async"], version = "^8.6.2"}
+pytest-vcr = "^1.0.2"
+wrapt = "^1.15.0"

 [tool.poetry.group.lint.dependencies]
 ruff = "^0.0.249"
57 changes: 57 additions & 0 deletions tests/README.md
@@ -0,0 +1,57 @@
# Readme tests (draft)

## Integration Tests

### Prepare

This repository contains functional tests for several search engines and databases. The
tests aim to verify the correct behavior of the engines and databases according to their
specifications and requirements.

To run some integration tests, such as tests located in
`tests/integration_tests/vectorstores/`, you will need to install the following
software:

- Docker
- Python 3.8.1 or later

The `pyproject.toml` file defines an optional `test_integration` group. This group
holds the dependencies for the integration tests and can be installed with:

```bash
poetry install --with test_integration
```

Any new dependencies should be added by running:

```bash
poetry add some_new_deps --group "test_integration"
```

Before running a test module, start the Docker container that provides the service it
depends on. For instance, `test_elasticsearch.py` uses the `elasticsearch.yml` Compose
file:

```bash
cd tests/integration_tests/vectorstores/docker-compose
docker-compose -f elasticsearch.yml up
```

Some integration tests also require environment variables to be set, such as
`OPENAI_API_KEY`. Set any required variables before running the tests.
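
One way to catch a missing variable early is a small pre-flight check. The sketch below is illustrative only: the helper name `missing_env_vars` is hypothetical, and the list of required names is an assumption (the text above names only `OPENAI_API_KEY` as an example).

```python
import os
from typing import Iterable, List, Mapping


def missing_env_vars(
    required: Iterable[str], environ: Mapping[str, str] = os.environ
) -> List[str]:
    """Return the names in `required` that are unset or empty (hypothetical helper)."""
    return [name for name in required if not environ.get(name)]


# Assumed requirement list for illustration; adjust it per test module.
missing = missing_env_vars(["OPENAI_API_KEY"])
if missing:
    print("Set these variables before running the tests: " + ", ".join(missing))
```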

### Recording HTTP interactions with pytest-vcr

Some of the integration tests in this repository make HTTP requests to external
services. To avoid sending these requests on every test run, we use `pytest-vcr` to
record and replay HTTP interactions.

When running tests in a CI/CD pipeline, you may not want to modify the existing
cassettes. Pass the `--vcr-record=none` command-line option to disable recording of
new cassettes:

```bash
pytest tests/integration_tests/vectorstores/test_elasticsearch.py --vcr-record=none
```
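
Cassettes are stored per test module. The `vcr_cassette_dir` fixture in `tests/integration_tests/conftest.py` derives the directory from the test module's path. The sketch below restates that layout as a plain function so the resulting location is easy to see. The function name `cassette_dir_for` is an illustrative assumption and is not part of the repository.

```python
import os


def cassette_dir_for(module_file: str) -> str:
    """Return the cassette directory for a test module (hypothetical helper).

    Follows the same layout as the vcr_cassette_dir fixture:
    <module dir>/cassettes/<module name without .py>
    """
    module_dir = os.path.dirname(module_file)
    module_name = os.path.splitext(os.path.basename(module_file))[0]
    return os.path.join(module_dir, "cassettes", module_name)


# Path of the Elasticsearch test module, relative to the repository root.
example = cassette_dir_for(
    os.path.join("tests", "integration_tests", "vectorstores", "test_elasticsearch.py")
)
print(example)
```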
32 changes: 32 additions & 0 deletions tests/integration_tests/conftest.py
@@ -0,0 +1,32 @@
import os

import pytest

# Getting the absolute path of the current file's directory
ABS_PATH = os.path.dirname(os.path.abspath(__file__))


# This fixture returns a string containing the path to the cassette directory for the
# current module
@pytest.fixture(scope="module")
def vcr_cassette_dir(request: pytest.FixtureRequest) -> str:
    return os.path.join(
        os.path.dirname(request.module.__file__),
        "cassettes",
        os.path.basename(request.module.__file__).replace(".py", ""),
    )


# This fixture returns a dictionary containing filter_headers options
# for replacing certain headers with dummy values during cassette recording.
# Specifically, it replaces the authorization header with a dummy value to
# prevent sensitive data from being recorded in the cassette.
@pytest.fixture(scope="module")
def vcr_config() -> dict:
    return {
        "filter_headers": [
            ("authorization", "authorization-DUMMY"),
            ("X-OpenAI-Client-User-Agent", "X-OpenAI-Client-User-Agent-DUMMY"),
            ("User-Agent", "User-Agent-DUMMY"),
        ],
    }
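
To make the effect of `filter_headers` concrete, here is a minimal plain-Python sketch of replacing sensitive request headers with placeholders before they would be written to a cassette. It is not vcrpy's implementation. The helper name `scrub_headers` and the sample header values are assumptions made for illustration, and header names are matched case-insensitively here for simplicity.

```python
from typing import Dict, List, Tuple

# Same (header, replacement) pairs as in vcr_config above.
FILTER_HEADERS: List[Tuple[str, str]] = [
    ("authorization", "authorization-DUMMY"),
    ("X-OpenAI-Client-User-Agent", "X-OpenAI-Client-User-Agent-DUMMY"),
    ("User-Agent", "User-Agent-DUMMY"),
]


def scrub_headers(headers: Dict[str, str]) -> Dict[str, str]:
    """Return a copy of headers with filtered values replaced (hypothetical helper)."""
    replacements = {name.lower(): value for name, value in FILTER_HEADERS}
    return {
        name: replacements.get(name.lower(), value)
        for name, value in headers.items()
    }


# Made-up request headers; the key below is not a real credential.
recorded = scrub_headers(
    {"Authorization": "Bearer sk-made-up-key", "Content-Type": "application/json"}
)
print(recorded)
```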
