feat: Set up Python testing infrastructure with Poetry and pytest #65

Open: llbbl wants to merge 1 commit into master

@llbbl llbbl commented Jun 27, 2025

Add Python Testing Infrastructure

Summary

This PR sets up testing infrastructure for the pointer-summarizer project, using Poetry for package management and pytest as the test framework. It covers test discovery, shared fixtures, and coverage reporting; it does not yet add tests for the project's own code.

Changes Made

Package Management

  • Poetry Configuration: Created pyproject.toml with Poetry package management configuration
  • Dependencies: Added testing dependencies as development dependencies:
    • pytest (^7.4.0) - Core testing framework
    • pytest-cov (^4.1.0) - Coverage reporting plugin
    • pytest-mock (^3.11.0) - Mocking utilities

Testing Configuration

  • pytest Settings:

    • Configured test discovery patterns for test_*.py and *_test.py files
    • Set up custom markers: unit, integration, and slow
    • Enabled strict marker enforcement
    • Configured verbose output with short traceback format
  • Coverage Settings:

    • Source directories: data_util and training_ptr_gen
    • Coverage reports: HTML, XML, and terminal output
    • Excluded Python 2 syntax file (data_util/data.py) temporarily
    • Coverage threshold set to 0% initially (to be increased as tests are added)

Directory Structure

tests/
├── __init__.py
├── conftest.py              # Shared pytest fixtures
├── test_setup_validation.py # Validation tests
├── unit/
│   └── __init__.py
└── integration/
    └── __init__.py

Shared Fixtures (conftest.py)

conftest.py defines fixtures for common testing needs:

  • temp_dir - Temporary directory management
  • mock_config - Mock configuration dictionary
  • sample_vocab - Mock vocabulary object
  • sample_batch_data - Sample batch data for model testing
  • device - PyTorch device detection
  • reset_random_seeds - Reproducible test runs
  • mock_data_path, mock_model_path, mock_log_path - Mock directory structures
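
The fixture bodies are not shown in this description. As a rough, stdlib-only sketch of what temp_dir and reset_random_seeds might do internally, the following uses two hypothetical helpers, temporary_directory and seed_everything; the names and the default seed are assumptions, not taken from the PR. A fixture for this project would probably also seed NumPy and PyTorch, which is left out here.

```python
import random
import shutil
import tempfile
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def temporary_directory():
    """Yield a fresh directory and remove it afterwards (hypothetical helper)."""
    path = Path(tempfile.mkdtemp())
    try:
        yield path
    finally:
        # Clean up even if the test body raised.
        shutil.rmtree(path, ignore_errors=True)


def seed_everything(seed=1234):
    """Seed the stdlib RNG; the default seed is an assumed value."""
    random.seed(seed)


# In conftest.py these could be exposed as fixtures, for example:
#
#   @pytest.fixture
#   def temp_dir():
#       with temporary_directory() as path:
#           yield path
#
#   @pytest.fixture
#   def reset_random_seeds():
#       seed_everything()
```

Keeping the logic in plain functions and wrapping them in fixtures is one possible design; it lets the helpers be called outside pytest as well.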

Development Workflow

  • Updated .gitignore with testing artifacts, virtual environments, and IDE files
  • Note: poetry.lock is NOT gitignored to ensure reproducible builds

How to Use

Installation

# Install Poetry if not already installed
curl -sSL https://install.python-poetry.org | python3 -

# Install project dependencies
poetry install --with dev

Running Tests

Both commands work identically:

poetry run test
# or
poetry run tests
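
Poetry exposes commands like these through script entries in pyproject.toml that point at a Python callable. The callable is not shown in this description; one plausible shape, assuming a hypothetical run_tests function that both test and tests map to, is:

```python
import subprocess
import sys


def build_pytest_command(extra_args):
    """Return the command line that runs pytest under the current interpreter."""
    return [sys.executable, "-m", "pytest", *extra_args]


def run_tests():
    """Hypothetical entry point; forwards command-line arguments to pytest."""
    command = build_pytest_command(sys.argv[1:])
    # Propagate pytest's exit code so callers such as CI see failures.
    sys.exit(subprocess.call(command))
```

Under this assumption, poetry run test -m unit would end up invoking python -m pytest -m unit.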

Test Options

All standard pytest options are available:

# Run only unit tests
poetry run test -m unit

# Run with specific verbosity
poetry run test -v

# Run specific test file
poetry run test tests/test_setup_validation.py

# Run without coverage
poetry run test --no-cov

Coverage Reports

After running tests, coverage reports are available:

  • HTML Report: htmlcov/index.html
  • XML Report: coverage.xml
  • Terminal Report: Displayed after test run
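
The XML report produced by coverage.py follows a Cobertura-style layout whose root element carries a line-rate attribute. A small sketch for reading the overall line coverage, for example in a CI step, might look like this; the function name line_coverage_percent and the sample XML are illustrative only.

```python
import xml.etree.ElementTree as ET


def line_coverage_percent(xml_text):
    """Return overall line coverage (0-100) from coverage.xml content."""
    root = ET.fromstring(xml_text)
    # The root <coverage> element stores the covered fraction as "line-rate".
    return float(root.attrib["line-rate"]) * 100.0


# Trimmed-down sample; a real report also lists packages, classes and lines.
SAMPLE = '<coverage line-rate="0.75" branch-rate="0"><packages/></coverage>'

if __name__ == "__main__":
    print(f"{line_coverage_percent(SAMPLE):.1f}%")
```

For the real file, the same value can be read by passing the contents of coverage.xml to the function.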

Notes

  1. Python Version: The project appears to have some Python 2 syntax (e.g., data_util/data.py). These files are temporarily excluded from coverage until migrated.

  2. pyrouge Dependency: The original pyrouge package is not available on PyPI. It needs to be installed separately following its specific installation instructions.

  3. Coverage Threshold: Currently set to 0% so that test runs do not fail on coverage before any real tests exist. This should be increased (e.g., to 80%) as actual tests are added.

  4. Validation Tests: The PR includes validation tests that verify the testing infrastructure is working correctly. These are not unit tests for the actual codebase.

Next Steps

With this testing infrastructure in place, developers can now:

  1. Write unit tests for individual modules
  2. Add integration tests for end-to-end workflows
  3. Gradually increase the coverage threshold
  4. Consider migrating Python 2 code to Python 3 for full coverage

- Initialize Poetry package manager with pyproject.toml configuration
- Add pytest, pytest-cov, and pytest-mock as dev dependencies
- Configure pytest with custom markers (unit, integration, slow)
- Set up coverage reporting with HTML and XML output
- Create test directory structure with shared fixtures in conftest.py
- Add validation tests to verify the testing setup works correctly
- Update .gitignore with testing and development artifacts
- Configure test commands accessible via `poetry run test` or `poetry run tests`