LCORE-417: Convert tests to pytest by max-svistunov · Pull Request #42 · lightspeed-core/rag-content

max-svistunov · 2025-10-21T12:55:40Z

Description

Type of change

Related Tickets & Documents

Related Issue # LCORE-417
Closes # LCORE-417

Checklist before requesting a review

I have performed a self-review of my code.
PR has passed all pre-merge test jobs.
If it is a core feature, I have added thorough tests.

Testing

Please provide detailed steps to perform tests related to this code change.
How were the fix/results from this change verified? Please provide relevant screenshots or results.

Summary by CodeRabbit

Tests
- Test suite migrated from unittest to pytest across converters, CLI, and processors; uses fixtures, pytest-style assertions, and pytest-mock for mocking.
- Introduced a deterministic mock embedding fixture and removed legacy test utility helpers.
Chores
- Dev tooling updated: added pytest-mock and linting rule to discourage unittest usage.

coderabbitai · 2025-10-21T12:55:57Z

Walkthrough

Migrates tests from unittest to pytest (fixtures, mocker, caplog, pytest.raises), consolidates test helpers by moving RagMockEmbedding into tests/conftest.py and removing tests/utils.py, adds pytest-mock to dev dependencies, and applies minor formatting edits to scripts/remove_pytorch_cpu_pyproject.py. No production API or runtime control-flow changes.

Changes

Cohort / File(s)	Summary
Test infra — consolidation `tests/conftest.py`, `tests/utils.py`	Adds `RagMockEmbedding` to `tests/conftest.py`; deletes `tests/utils.py` (removes previous `RagMockEmbedding`, `subtest` decorator, and `TestCase` test-helper class).
Asciidoc tests `tests/asciidoc/test__main__.py`, `tests/asciidoc/test_asciidoc_converter.py`	Converts tests from unittest to pytest: introduces fixtures, replaces `unittest` patching with `mocker.patch`, switches to pytest assertions, and adapts tests to new fixture-driven args.
Document-processor tests `tests/test_document_processor.py`, `tests/test_document_processor_llama_index.py`, `tests/test_document_processor_llama_stack.py`	Replaces class-based `setUp` with pytest fixtures, migrates mocks to `mocker.patch`, converts assertions to plain `assert`, and adds parametrization / fixture-driven setups.
Metadata / OKP / utils tests `tests/test_metadata_processor.py`, `tests/test_okp.py`, `tests/test_utils.py`	Migrates from `unittest` to pytest: adds fixtures, swaps `@patch` for `mocker.patch`, updates assertions to pytest style, and uses `caplog` / `pytest.raises` where applicable.
Test helper addition `tests/conftest.py`	New deterministic `RagMockEmbedding` providing 768-dim mock embeddings for tests.
Dev dependencies `pyproject.toml`	Adds `pytest-mock>=3.15.1` to dev dependencies and a banned-api entry for `unittest`/`unittest.mock`.
Build script formatting `scripts/remove_pytorch_cpu_pyproject.py`	Minor docstring and import-order formatting changes only; no behavioral changes.

Sequence Diagram(s)

None — changes are limited to test harness refactors, dev-dependency updates, and minor formatting; no runtime control-flow modifications warranting a sequence diagram.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

Llama stack dbs #28 — overlaps in areas touching document processor, llama-stack support, and pyproject/dependency changes; likely related at code/test surface.

Suggested reviewers

umago
tisnik
lpiwowar

Poem

🐰 I hopped through tests at break of dawn,
Swapped classes for fixtures on the lawn.
Mocker stitched seams with pytest thread,
Conftest cradles embeddings in its bed.
Hooray — the test-burrow’s tidy and gone!

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check	✅ Passed	The PR title "LCORE-417: Convert tests to pytest" is fully related to the main change in the changeset and clearly summarizes the primary objective. The changeset predominantly consists of migrations from unittest-based test style to pytest across multiple test files (test__main__.py, test_asciidoc_converter.py, test_document_processor.py, test_document_processor_llama_index.py, test_document_processor_llama_stack.py, test_metadata_processor.py, test_okp.py, test_utils.py), along with supporting changes like moving RagMockEmbedding to conftest.py, updating pyproject.toml to ban unittest, and adding pytest-mock as a dependency. The title is concise, specific, and directly communicates the core refactoring effort without vague language.
Docstring Coverage	✅ Passed	No functions found in the changes. Docstring coverage check skipped.

✨ Finishing touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 03f4d47 and 16f2c1a.

📒 Files selected for processing (1)

pyproject.toml (2 hunks)

🚧 Files skipped from review as they are similar to previous changes (1)

pyproject.toml

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)

GitHub Check: build-and-push-dev
GitHub Check: Pylinter

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 4

🧹 Nitpick comments (4)

tests/asciidoc/test_asciidoc_converter.py (1)

116-132: Consider more specific caplog assertion.

The assertion assert "WARNING" in caplog.text is broad and could match unrelated warnings. For more precise verification, consider checking for the specific warning message:
     with caplog.at_level(logging.WARNING):
         adoc_text_converter.convert(converter_data["input_file"], mock_output_file)
-        assert "WARNING" in caplog.text
+        assert any("exists" in record.message and "overwritten" in record.message 
+                   for record in caplog.records)
or simply check that a warning was logged:
-        assert "WARNING" in caplog.text
+        assert len(caplog.records) > 0
+        assert caplog.records[0].levelname == "WARNING"

tests/test_document_processor.py (1)

68-90: Consider using monkeypatch for environment variables.

The manual cleanup of environment variables with os.environ.pop works but could be more robust. Pytest's monkeypatch fixture provides automatic cleanup:

-def test_init_default(self, mock_processor):
+def test_init_default(self, mock_processor, monkeypatch):
     """Test DocumentProcessor initialization with default vector store type (faiss)."""
+    monkeypatch.delenv("HF_HOME", raising=False)
+    monkeypatch.delenv("TRANSFORMERS_OFFLINE", raising=False)
+    
     doc_processor = document_processor.DocumentProcessor(**mock_processor["params"])
     
     # ... rest of test ...
     
     assert expected_params["embeddings_model_dir"] == os.environ["HF_HOME"]
     assert os.environ["TRANSFORMERS_OFFLINE"] == "1"
-    os.environ.pop("TRANSFORMERS_OFFLINE", None)

This ensures cleanup even if the test fails midway.

tests/test_document_processor_llama_index.py (2)

28-57: Consider using a structured return type for better type safety.

The fixture returns a dictionary with string keys, which works but lacks type hints and IDE support. For better maintainability, consider using a dataclass, NamedTuple, or SimpleNamespace:

Example with dataclass:

from dataclasses import dataclass

@dataclass
class ProcessorFixture:
    processor: document_processor.DocumentProcessor
    model_name: str
    chunk_size: int
    chunk_overlap: int
    num_workers: int
    embeddings_model_dir: str

@pytest.fixture
def doc_processor(mocker) -> ProcessorFixture:
    # ... setup code ...
    return ProcessorFixture(
        processor=processor,
        model_name=model_name,
        chunk_size=chunk_size,
        chunk_overlap=chunk_overlap,
        num_workers=num_workers,
        embeddings_model_dir=embeddings_model_dir,
    )

Then access as doc_processor.processor instead of doc_processor["processor"].

213-233: Use pytest-mock's mocker.patch.dict for consistency.

This test uses @mock.patch.dict (unittest style) while all other tests in this file use pytest-mock's mocker fixture. For consistency with the pytest migration, consider using mocker.patch.dict instead.

Apply this diff:

-    @mock.patch.dict(
-        os.environ,
-        {
-            "POSTGRES_USER": "postgres",
-            "POSTGRES_PASSWORD": "somesecret",
-            "POSTGRES_HOST": "localhost",
-            "POSTGRES_PORT": "15432",
-            "POSTGRES_DATABASE": "postgres",
-        },
-    )
-    def test_pgvector(self, doc_processor):
+    def test_pgvector(self, mocker, doc_processor):
         """Test that DocumentProcessor initializes successfully with postgres vector store."""
+        mocker.patch.dict(
+            os.environ,
+            {
+                "POSTGRES_USER": "postgres",
+                "POSTGRES_PASSWORD": "somesecret",
+                "POSTGRES_HOST": "localhost",
+                "POSTGRES_PORT": "15432",
+                "POSTGRES_DATABASE": "postgres",
+            },
+        )
         proc = document_processor.DocumentProcessor(

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a346c2f and 07a8aa9.

📒 Files selected for processing (11)

scripts/remove_pytorch_cpu_pyproject.py (2 hunks)
tests/asciidoc/test__main__.py (2 hunks)
tests/asciidoc/test_asciidoc_converter.py (2 hunks)
tests/conftest.py (1 hunks)
tests/test_document_processor.py (2 hunks)
tests/test_document_processor_llama_index.py (3 hunks)
tests/test_document_processor_llama_stack.py (7 hunks)
tests/test_metadata_processor.py (1 hunks)
tests/test_okp.py (7 hunks)
tests/test_utils.py (2 hunks)
tests/utils.py (0 hunks)

💤 Files with no reviewable changes (1)

tests/utils.py

🧰 Additional context used

🧬 Code graph analysis (8)

tests/asciidoc/test_asciidoc_converter.py (1)

src/lightspeed_rag_content/asciidoc/asciidoctor_converter.py (3)

AsciidoctorConverter (63-188)

_get_converter_file (104-119)

_get_attribute_list (132-146)

tests/test_metadata_processor.py (2)

src/lightspeed_rag_content/metadata_processor.py (4)

MetadataProcessor (26-96)

ping_url (44-56)

get_file_title (34-42)

populate (58-91)

tests/test_okp.py (1)

test_get_file_title (194-198)

tests/test_document_processor_llama_stack.py (2)

tests/conftest.py (1)

RagMockEmbedding (19-28)

src/lightspeed_rag_content/document_processor.py (6)

_Config (44-61)

_LlamaStackDB (184-378)

write_yaml_config (288-307)

_start_llama_stack (309-319)

add_docs (137-140)

add_docs (321-350)

tests/test_document_processor.py (2)

tests/conftest.py (1)

RagMockEmbedding (19-28)

src/lightspeed_rag_content/document_processor.py (6)

DocumentProcessor (381-490)

_Config (44-61)

_check_config (425-430)

process (443-485)

add_docs (137-140)

add_docs (321-350)

tests/test_document_processor_llama_index.py (2)

tests/conftest.py (1)

RagMockEmbedding (19-28)

src/lightspeed_rag_content/document_processor.py (9)

DocumentProcessor (381-490)

_got_whitespace (77-82)

_filter_out_invalid_nodes (85-94)

_save_index (149-156)

_save_metadata (158-181)

process (443-485)

save (142-147)

save (352-378)

save (487-490)

tests/test_utils.py (1)

src/lightspeed_rag_content/utils.py (1)

get_common_arg_parser (19-71)

tests/asciidoc/test__main__.py (1)

src/lightspeed_rag_content/asciidoc/__main__.py (3)

main_convert (38-52)

main_get_structure (55-76)

get_argument_parser (79-140)

tests/test_okp.py (1)

src/lightspeed_rag_content/okp.py (7)

metadata_has_url_and_title (55-67)

is_file_related_to_projects (28-52)

parse_metadata (112-138)

yield_files_related_to_projects (70-109)

OKPMetadataProcessor (141-152)

url_function (144-147)

get_file_title (149-152)

🔇 Additional comments (23)

scripts/remove_pytorch_cpu_pyproject.py (1)

1-50: Formatting and import order changes look good.

The docstring formatting and import reordering are purely cosmetic. The remove_sections function logic is sound — the for-else+break pattern correctly handles cases where intermediate section keys are missing, ensuring safe removal without errors. Type hints and the # type: ignore annotation are appropriate for tomlkit's API.

tests/asciidoc/test_asciidoc_converter.py (3)

15-44: LGTM! Clean fixture-based test data setup.

The migration to pytest fixtures is well-structured. The converter_data fixture provides a clear, reusable test data dictionary.

47-114: LGTM! Proper pytest-mock usage.

The tests correctly use mocker.patch and verify subprocess invocations with fixture data. The migration from unittest mocks to pytest-mock is clean.

133-191: LGTM! Complete pytest migration.

The remaining tests properly use pytest patterns including pytest.raises for exception testing and mocker for file I/O. The migration is consistent and correct.

tests/asciidoc/test__main__.py (3)

31-54: LGTM! Well-structured test fixtures.

The main_data fixture and get_mock_parsed_args helper provide clean test setup for the main module tests.

57-84: LGTM! Correct pytest migration.

This test properly uses mocker and verifies the subprocess call with fixture data.

113-133: LGTM! Proper test structure.

The test correctly mocks dependencies and verifies the expected subprocess invocation.

tests/test_utils.py (1)

16-64: LGTM! Clean pytest migration.

The test class correctly uses pytest assertions and pytest.raises for exception testing. The migration is straightforward and correct.

tests/conftest.py (1)

19-28: LGTM! Proper relocation of test utility.

The RagMockEmbedding class is correctly moved from tests/utils.py to tests/conftest.py, which is the appropriate location for shared test fixtures. The implementation provides deterministic mock embeddings for testing.

tests/test_document_processor.py (2)

25-63: LGTM! Well-designed pytest fixture.

The mock_processor fixture provides a clean, reusable setup for mocking document processor dependencies. The pattern of yielding a dictionary with mocks and test parameters is effective.

91-197: LGTM! Excellent use of pytest parametrization.

The tests effectively use pytest.mark.parametrize to cover multiple vector store types and chunking strategies. The mocking patterns and assertions are correct and thorough.

tests/test_metadata_processor.py (1)

25-147: LGTM! Thorough pytest migration.

The metadata processor tests are well-migrated to pytest with proper use of fixtures, mocker, and caplog. The test coverage is maintained and the assertions are appropriate.

tests/test_document_processor_llama_stack.py (4)

26-52: LGTM! Comprehensive test fixture.

The llama_stack_processor fixture properly mocks all dependencies for testing the Llama Stack DB processor. The fixture design is clean and provides reusable test configuration.

57-101: LGTM! Thorough initialization testing.

Both init tests properly verify the processor initialization with different model directory configurations. The environment variable checks and mock assertions are correct.

102-213: LGTM! Comprehensive YAML configuration testing.

Both tests properly verify the YAML configuration generation for different vector store backends. The use of mock_open and assertion of the complete expected output is correct.

214-355: LGTM! Complete test coverage.

The tests thoroughly cover document addition and saving for both manual and automatic chunking modes. The helper function _test_save reduces duplication effectively.

tests/test_okp.py (1)

16-198: LGTM! Excellent pytest migration with autouse fixture.

The OKP tests are well-migrated to pytest. The use of an autouse fixture at lines 175-186 to mock parse_metadata for all tests in TestOKPMetadataProcessor is a clean pattern that reduces duplication. All mocking and assertions are correct.

tests/test_document_processor_llama_index.py (6)

16-25: LGTM!

The imports are correctly updated for the pytest migration. The RagMockEmbedding is now imported from tests.conftest, and pytest is properly imported. Retaining unittest.mock for mock.Mock and mock.sentinel usage is acceptable.

63-85: LGTM!

The whitespace and filter node tests are correctly migrated to pytest style with proper fixture usage and assertions.

87-125: LGTM!

The _save_index and _save_metadata tests properly use mocker for patching and verify the expected behavior. The use of mock.sentinel for test values is appropriate.

127-179: LGTM!

The process method tests correctly handle the different unreachable document scenarios (normal, drop, fail) and use appropriate patching and assertions.

181-211: LGTM!

The tests for failure on unreachable documents and the save method are correctly implemented with proper use of pytest.raises and fixture-based mocking.

235-245: LGTM!

The test correctly verifies that an invalid vector store type raises a RuntimeError using pytest.raises.

coderabbitai

Actionable comments posted: 2

♻️ Duplicate comments (4)

tests/asciidoc/test__main__.py (4)

98-100: Move assertion outside pytest.raises context.

The assertion at line 100 never executes because main_convert(mock_args) raises SystemExit and immediately exits the context manager. The captured exception is only available after the with block completes.

Apply this diff:

     with pytest.raises(SystemExit) as e:
         main_convert(mock_args)
-        assert e.value.code != 0
+    assert e.value.code != 0

109-111: Move assertion outside pytest.raises context.

Same issue: the assertion at line 111 is unreachable inside the context manager.

Apply this diff:

     with pytest.raises(SystemExit) as e:
         main_convert(mock_args)
-        assert e.value.code != 0
+    assert e.value.code != 0

134-146: Fix mock patch target and assertion placement.

Two issues:

The patch targets asciidoctor_converter.subprocess.run, but main_get_structure calls subprocess.run from __main__.py
The assertion at line 146 is inside the pytest.raises context and never executes

Apply this diff:

     def test_main_incorrect_asciidoctor_cmd(self, mocker, main_data):
         mock_run = mocker.patch(
-            "lightspeed_rag_content.asciidoc.asciidoctor_converter.subprocess.run"
+            "lightspeed_rag_content.asciidoc.__main__.subprocess.run"
         )
         mock_run.side_effect = subprocess.CalledProcessError(
             cmd=main_data["asciidoctor_cmd"], returncode=1
         )
         mock_args = Mock()
         mock_args.input_file = main_data["input_file"]

         with pytest.raises(SystemExit) as e:
             main_get_structure(mock_args)
-            assert e.value.code != 0
+    assert e.value.code != 0

148-160: Fix mock patch target and assertion placement.

Two issues:

The patch targets asciidoctor_converter.shutil.which, but main_get_structure calls shutil.which from __main__.py
Both assertions (lines 159-160) are inside the nested context managers and never execute

Apply this diff:

     def test_main_missing_asciidoctor_cmd(self, mocker, main_data, caplog):
         mock_which = mocker.patch(
-            "lightspeed_rag_content.asciidoc.asciidoctor_converter.shutil.which"
+            "lightspeed_rag_content.asciidoc.__main__.shutil.which"
         )
         mock_which.return_value = ""
         mock_args = Mock()
         mock_args.input_file = main_data["input_file"]

-        with pytest.raises(SystemExit) as e:
-            with caplog.at_level(logging.ERROR):
-                main_get_structure(mock_args)
-                assert e.value.code != 0
-                assert "ERROR" in caplog.text
+        with caplog.at_level(logging.ERROR):
+            with pytest.raises(SystemExit) as e:
+                main_get_structure(mock_args)
+        assert e.value.code != 0
+        assert "ERROR" in caplog.text

🧹 Nitpick comments (1)

tests/test_okp.py (1)

118-163: Consider using Path objects in the mock for accuracy.

The mock returns strings (["file1.md", "file2.md", ...]) but Path.glob() returns Path objects in the actual implementation. While the test passes, it doesn't accurately simulate the real behavior where filepath would be a Path object that gets converted via str(filepath) before being passed to parse_metadata.

Apply this diff to better match the real behavior:

+from pathlib import Path
+
 def test_yield_files_related_to_projects(self, mocker):
     """Test yielding files related to specific projects."""
     mock_glob = mocker.patch("lightspeed_rag_content.okp.Path.glob")
     mock_glob.return_value = [
-        "file1.md",
-        "file2.md",
-        "file3.md",  # Should be ignored, missing metadata
+        Path("file1.md"),
+        Path("file2.md"),
+        Path("file3.md"),  # Should be ignored, missing metadata
     ]
 
     # ... (rest of mock setup)
     
     projects = ["foo", "bar"]
     files = list(okp.yield_files_related_to_projects("/fake", projects))
 
     # Check that the correct files are yielded
     assert len(files) == 2
-    assert "file1.md" in files
-    assert "file2.md" in files
+    assert Path("file1.md") in files
+    assert Path("file2.md") in files

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 07a8aa9 and c5355b6.

📒 Files selected for processing (11)

scripts/remove_pytorch_cpu_pyproject.py (2 hunks)
tests/asciidoc/test__main__.py (2 hunks)
tests/asciidoc/test_asciidoc_converter.py (2 hunks)
tests/conftest.py (1 hunks)
tests/test_document_processor.py (2 hunks)
tests/test_document_processor_llama_index.py (3 hunks)
tests/test_document_processor_llama_stack.py (7 hunks)
tests/test_metadata_processor.py (1 hunks)
tests/test_okp.py (7 hunks)
tests/test_utils.py (2 hunks)
tests/utils.py (0 hunks)

💤 Files with no reviewable changes (1)

tests/utils.py

🚧 Files skipped from review as they are similar to previous changes (2)

tests/conftest.py
tests/test_document_processor.py

🧰 Additional context used

🧬 Code graph analysis (7)

tests/asciidoc/test__main__.py (1)

src/lightspeed_rag_content/asciidoc/__main__.py (3)

main_convert (38-52)

main_get_structure (55-76)

get_argument_parser (79-140)

tests/test_document_processor_llama_index.py (2)

tests/conftest.py (1)

RagMockEmbedding (19-28)

src/lightspeed_rag_content/document_processor.py (9)

DocumentProcessor (384-495)

_got_whitespace (80-85)

_filter_out_invalid_nodes (88-97)

_save_index (152-159)

_save_metadata (161-184)

process (448-490)

save (145-150)

save (355-381)

save (492-495)

tests/asciidoc/test_asciidoc_converter.py (1)

src/lightspeed_rag_content/asciidoc/asciidoctor_converter.py (3)

AsciidoctorConverter (63-188)

_get_converter_file (104-119)

_get_attribute_list (132-146)

tests/test_utils.py (1)

src/lightspeed_rag_content/utils.py (1)

get_common_arg_parser (19-78)

tests/test_okp.py (1)

src/lightspeed_rag_content/okp.py (7)

metadata_has_url_and_title (55-67)

is_file_related_to_projects (28-52)

parse_metadata (112-138)

yield_files_related_to_projects (70-109)

OKPMetadataProcessor (141-152)

url_function (144-147)

get_file_title (149-152)

tests/test_metadata_processor.py (2)

src/lightspeed_rag_content/metadata_processor.py (4)

MetadataProcessor (26-96)

ping_url (44-56)

get_file_title (34-42)

populate (58-91)

tests/test_okp.py (1)

test_get_file_title (194-198)

tests/test_document_processor_llama_stack.py (2)

tests/conftest.py (1)

RagMockEmbedding (19-28)

src/lightspeed_rag_content/document_processor.py (6)

_Config (45-62)

_LlamaStackDB (187-381)

write_yaml_config (291-310)

_start_llama_stack (312-322)

add_docs (140-143)

add_docs (324-353)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)

GitHub Check: Pylinter
GitHub Check: build-and-push-dev

🔇 Additional comments (16)

scripts/remove_pytorch_cpu_pyproject.py (1)

1-50: LGTM! Minor formatting improvements.

The formatting changes (docstring alignment and alphabetical import ordering) are harmless and follow common conventions. No functional impact.

Note: This file appears to be a utility script for removing PyTorch CPU dependencies, which doesn't seem directly related to the PR objective of converting tests to pytest. If this cleanup is intentional, consider noting it in the PR description for clarity.

tests/test_document_processor_llama_stack.py (1)

19-356: Excellent pytest migration!

The conversion from unittest to pytest is well-executed:

Clean fixture-based setup replacing setUp methods

Proper use of mocker from pytest-mock for patching

Consistent pytest-style assertions throughout

Maintains full test coverage while improving readability

tests/test_document_processor_llama_index.py (1)

20-245: Well-structured pytest migration!

The migration demonstrates excellent pytest practices:

Fixture returns dictionary for flexible access to processor and configuration

Consistent use of mocker.patch.object for targeted mocking

Proper use of pytest.raises for exception assertions (lines 198, 237)

Appropriate use of mock.patch.dict decorator for environment variable testing (line 213)

tests/test_metadata_processor.py (4)

25-38: Well-structured fixtures!

The fixtures provide clean separation of concerns and make the tests more maintainable by centralizing test data and setup.

44-73: Excellent test coverage for ping_url!

The tests properly cover the success, failure, and exception scenarios using pytest-mock. The migration is clean and correct.

75-93: Correct migration of file mocking!

The use of mocker.mock_open properly simulates file operations and the exception handling test verifies graceful degradation.

95-147: Thorough testing of populate method!

The tests correctly mock the abstract url_function method and other dependencies using mocker.patch.object. The use of caplog for logging assertions follows pytest best practices and properly verifies warning messages for unreachable URLs.

tests/test_utils.py (2)

17-22: LGTM! Clean migration to pytest.

The addition of the pytest import and removal of unittest.TestCase inheritance correctly converts this test class to pytest style. Class-based tests with instance methods remain fully supported in pytest.

29-64: LGTM! All assertions correctly migrated.

All unittest assertions have been properly converted to pytest equivalents:

assertIsInstance → assert isinstance(...)

assertEqual → assert ... ==

assertRaises(SystemExit) → pytest.raises(SystemExit) context manager

assertFalse → assert not ...

assertTrue → assert ...

The conversions are semantically equivalent and maintain the original test behavior.

tests/asciidoc/test_asciidoc_converter.py (1)

28-191: Excellent pytest migration!

The test migration from unittest to pytest is well-executed:

Clean fixture-based test data organization

Proper use of mocker for all mocking

Correct pytest.raises and caplog usage

All assertions properly placed outside context managers

tests/test_okp.py (6)

16-21: LGTM! Clean pytest migration.

The import and class declaration correctly follow pytest conventions by removing the unittest.TestCase inheritance and importing pytest for fixture support.

24-62: LGTM! Assertions converted correctly.

All unittest-style assertions have been properly converted to pytest-style plain assert statements. The test logic is preserved and the assertions are clear and idiomatic.

64-116: LGTM! Proper mocker fixture usage.

The test correctly uses the mocker fixture to mock file I/O operations. The mock_open with read_data appropriately simulates reading the TOML metadata from the file, and the assertion validates the complete parsed structure.

166-169: LGTM! Well-structured fixture.

The okp_mp fixture is properly defined and provides a clean way to instantiate the OKPMetadataProcessor for tests. This follows pytest best practices for test setup.

172-186: LGTM! Excellent use of autouse fixture.

The autouse=True fixture efficiently mocks parse_metadata for all tests in the class, providing consistent test data. This is a clean pattern that reduces code duplication and ensures all tests in this class have the necessary mocking in place.

188-198: LGTM! Clean fixture-based tests.

Both tests properly utilize the okp_mp fixture parameter and rely on the autouse fixture for mocking. The tests are concise and follow pytest conventions correctly.

coderabbitai

Actionable comments posted: 2

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c984c7f and bec475e.

📒 Files selected for processing (2)

tests/asciidoc/test__main__.py (2 hunks)
tests/test_metadata_processor.py (1 hunks)

🚧 Files skipped from review as they are similar to previous changes (1)

tests/test_metadata_processor.py

🧰 Additional context used

🧬 Code graph analysis (1)

tests/asciidoc/test__main__.py (1)

src/lightspeed_rag_content/asciidoc/__main__.py (3)

main_convert (38-52)

main_get_structure (55-76)

get_argument_parser (79-140)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)

GitHub Check: mypy
GitHub Check: build-and-push-dev
GitHub Check: Pylinter

🔇 Additional comments (7)

tests/asciidoc/test__main__.py (7)

15-29: LGTM: Imports updated for pytest.

The imports correctly include pytest, logging for caplog usage, and Mock from unittest.mock. The test targets are properly imported from the main module.

31-43: LGTM: Well-structured pytest fixture.

The main_data fixture provides centralized test data that's reused across multiple tests. Good pytest practice.

45-53: LGTM: Helper reduces test duplication.

The get_mock_parsed_args helper appropriately constructs mock arguments from fixture data, reducing boilerplate across tests.

57-83: LGTM: Correct mock targets for main_convert.

The test correctly patches asciidoctor_converter.shutil.which and asciidoctor_converter.subprocess.run since main_convert invokes AsciidoctorConverter which uses these modules.

85-111: LGTM: Error scenarios correctly tested.

Both error tests properly use pytest.raises with assertions outside the context block (addressing past review feedback). Mock targets are appropriate for testing main_convert error paths.

113-132: LGTM: Correct patch targets for main_get_structure.

The test correctly patches __main__.shutil.which and __main__.subprocess.run since main_get_structure uses these directly from the __main__ module (addressing past review feedback).

162-165: LGTM: Simple assertion correctly checks parser type.

The pytest-style assertion appropriately verifies the return type.

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (8)

tests/test_metadata_processor.py (8)

24-27: Fixture may break if class becomes an ABC; consider a tiny concrete subclass.

If MetadataProcessor is ever changed to subclass abc.ABC, instantiating it here will fail. A minimal test subclass that implements url_function (or patching the instance method) makes the fixture future-proof.

43-53: Patch where used and assert call details (robustness).

Patch the imported symbol in the target module and assert the single-call/timeout behavior.

-        mock_get = mocker.patch("requests.get")
+        mock_get = mocker.patch("lightspeed_rag_content.metadata_processor.requests.get")
@@
         result = md_processor.ping_url(processor_data["url"])
 
         assert result is True
+        assert mock_get.call_count == 1
+        assert mock_get.call_args == ((processor_data["url"],), {"timeout": 30})

54-64: Verify retries and patch at the module path.

Given ping_url retries on non-200, assert it tried three times and used the expected args.

-        mock_get = mocker.patch("requests.get")
+        mock_get = mocker.patch("lightspeed_rag_content.metadata_processor.requests.get")
@@
         result = md_processor.ping_url(processor_data["url"])
 
         assert result is False
+        assert mock_get.call_count == 3
+        for args, kwargs in mock_get.call_args_list:
+            assert args == (processor_data["url"],)
+            assert kwargs == {"timeout": 30}

65-73: Also assert retries on exceptions and patch at the module path.

-        mock_get = mocker.patch("requests.get")
+        mock_get = mocker.patch("lightspeed_rag_content.metadata_processor.requests.get")
@@
         result = md_processor.ping_url(processor_data["url"])
 
         assert result is False
+        assert mock_get.call_count == 3
+        for args, kwargs in mock_get.call_args_list:
+            assert args == (processor_data["url"],)
+            assert kwargs == {"timeout": 30}

74-84: Include a newline in mocked file data to mirror real files.

This exercises rstrip("\n") as in production reads.

-            read_data=f'# {processor_data["title"]}',
+            read_data=f'# {processor_data["title"]}\n',

85-93: Assert open() was called with the expected encoding.

Strengthens the contract that get_file_title reads as UTF‑8.

         result = md_processor.get_file_title(processor_data["file_path"])
 
         assert "" == result
+        mock_file.assert_called_once_with(
+            processor_data["file_path"], "r", encoding="utf-8"
+        )

94-118: Patch instance methods to simplify call assertions and verify interactions.

Patching on the instance avoids self in call args and lets you assert exact inputs.

-        mock_url_func = mocker.patch.object(
-            metadata_processor.MetadataProcessor, "url_function"
-        )
-        mock_get_title = mocker.patch.object(
-            metadata_processor.MetadataProcessor, "get_file_title"
-        )
-        mock_ping_url = mocker.patch.object(
-            metadata_processor.MetadataProcessor, "ping_url"
-        )
+        mock_url_func = mocker.patch.object(md_processor, "url_function")
+        mock_get_title = mocker.patch.object(md_processor, "get_file_title")
+        mock_ping_url = mocker.patch.object(md_processor, "ping_url")
@@
         result = md_processor.populate(processor_data["file_path"])
 
         expected_result = {
@@
         }
         assert expected_result == result
+        mock_url_func.assert_called_once_with(processor_data["file_path"])
+        mock_get_title.assert_called_once_with(processor_data["file_path"])
+        mock_ping_url.assert_called_once_with(processor_data["url"])

119-146: Target the module logger in caplog and strengthen assertions.

Make logging capture deterministic and verify message contents and interactions.

-        mock_url_func = mocker.patch.object(
-            metadata_processor.MetadataProcessor, "url_function"
-        )
-        mock_get_title = mocker.patch.object(
-            metadata_processor.MetadataProcessor, "get_file_title"
-        )
-        mock_ping_url = mocker.patch.object(
-            metadata_processor.MetadataProcessor, "ping_url"
-        )
+        mock_url_func = mocker.patch.object(md_processor, "url_function")
+        mock_get_title = mocker.patch.object(md_processor, "get_file_title")
+        mock_ping_url = mocker.patch.object(md_processor, "ping_url")
@@
-        with caplog.at_level(logging.WARNING):
+        with caplog.at_level(logging.WARNING, logger=metadata_processor.__name__):
             result = md_processor.populate(processor_data["file_path"])
@@
         assert expected_result == result
-        assert "URL not reachable" in caplog.text
+        assert "URL not reachable" in caplog.text
+        assert processor_data["url"] in caplog.text
+        assert processor_data["title"] in caplog.text
+        mock_url_func.assert_called_once_with(processor_data["file_path"])
+        mock_get_title.assert_called_once_with(processor_data["file_path"])
+        mock_ping_url.assert_called_once_with(processor_data["url"])

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between bec475e and 9fe59eb.

📒 Files selected for processing (2)

tests/asciidoc/test__main__.py (2 hunks)
tests/test_metadata_processor.py (1 hunks)

🚧 Files skipped from review as they are similar to previous changes (1)

tests/asciidoc/test__main__.py

🧰 Additional context used

🧬 Code graph analysis (1)

tests/test_metadata_processor.py (1)

src/lightspeed_rag_content/metadata_processor.py (4)

MetadataProcessor (26-96)

ping_url (44-56)

get_file_title (34-42)

populate (58-91)

🔇 Additional comments (1)

tests/test_metadata_processor.py (1)

40-146: Overall: migration looks solid.

Good pytest idioms, clear fixtures, and coverage of success/failure paths. With the minor robustness tweaks above, this suite will be very resilient.

tisnik

LGTM

coderabbitai Bot reviewed Oct 21, 2025

View reviewed changes

Comment thread tests/asciidoc/test__main__.py Outdated

Comment thread tests/asciidoc/test__main__.py Outdated

Comment thread tests/asciidoc/test__main__.py Outdated

Comment thread tests/asciidoc/test__main__.py Outdated

max-svistunov added 2 commits October 21, 2025 15:26

Convert tests to pytest

c86b2cb

Re-introduce doc_type to tests

c5355b6

max-svistunov force-pushed the lcore-417-convert-to-pytest branch from 07a8aa9 to c5355b6 Compare October 21, 2025 13:45

coderabbitai Bot reviewed Oct 21, 2025

View reviewed changes

Comment thread tests/asciidoc/test__main__.py

Comment thread tests/test_metadata_processor.py Outdated

max-svistunov added 2 commits October 21, 2025 16:03

Add the pytest-mock dependency

c984c7f

Fix assertion placement

22ae4ee

coderabbitai Bot reviewed Oct 21, 2025

View reviewed changes

Comment thread tests/asciidoc/test__main__.py

Comment thread tests/asciidoc/test__main__.py

Fix mock patch targets; add missing mock; remove unneeded import

9fe59eb

max-svistunov force-pushed the lcore-417-convert-to-pytest branch from bec475e to 9fe59eb Compare October 21, 2025 14:51

coderabbitai Bot reviewed Oct 21, 2025

View reviewed changes

Implement Coderabbit's nitpicks for better test robustness

03f4d47

max-svistunov force-pushed the lcore-417-convert-to-pytest branch from da6d3e3 to 03f4d47 Compare October 21, 2025 15:19

Disallow importing unittest using ruff

1337d7b

max-svistunov force-pushed the lcore-417-convert-to-pytest branch from f274973 to 1337d7b Compare October 21, 2025 20:53

tisnik approved these changes Oct 22, 2025

View reviewed changes

tisnik reviewed Oct 22, 2025

View reviewed changes

Comment thread pyproject.toml

Move unittest banning to the existing [tool.ruff] section

16f2c1a

tisnik merged commit 0d4c9c8 into lightspeed-core:main Oct 23, 2025
14 checks passed

coderabbitai Bot mentioned this pull request Jan 15, 2026

[LCORE-1035] Add HTML support for BYOK #62

Merged

18 tasks

coderabbitai Bot mentioned this pull request Jan 27, 2026

LCORE-1240: Bump up llama-stack to 0.4.3, fixing the vector registration issues #70

Merged

18 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

LCORE-417: Convert tests to pytest#42

LCORE-417: Convert tests to pytest#42
tisnik merged 8 commits intolightspeed-core:mainfrom
max-svistunov:lcore-417-convert-to-pytest

max-svistunov commented Oct 21, 2025 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented Oct 21, 2025 •

edited

Loading

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Uh oh!

tisnik left a comment

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

max-svistunov commented Oct 21, 2025 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

Type of change

Related Tickets & Documents

Checklist before requesting a review

Testing

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Oct 21, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Suggested reviewers

Poem

Pre-merge checks and finishing touches

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

tisnik left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

max-svistunov commented Oct 21, 2025 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Oct 21, 2025 •

edited

Loading