codecheckers/codecheck-py

codecheck-py

Binder

Python-based template for writing CODECHECK certificates.

Note that this is an alpha version; please report any issues in the issue tracker.

Try it Online

🚀 Launch on Binder: Click the badge above to try this template in your browser without installing anything! The Binder environment includes all dependencies and opens an interactive notebook to explore the template.

What You Can Do on Binder

  • Explore the template: Run the certificate generation notebook
  • Test validation: Try the validation system with example data
  • Experiment: Modify code and see results immediately
  • Learn: Follow the workflow without local setup

Note: Binder environments are temporary and reset when idle. For actual CODECHECK work, clone the repository locally.

Usage

The files in this repository are meant to be placed into the codecheck directory of a repository that has been forked for a CODECHECK. The codechecker fills out the Jupyter notebook codecheck.ipynb, which automatically generates boilerplate content from the codecheck.yml file and includes the figures and a summary of the CSV files regenerated by the codechecker (see below and the comments in codecheck.py).

To generate a report from the notebook (which by default hides all the code used for the automatic content generation), run:

jupyter nbconvert --to pdf --no-input --no-prompt --execute --LatexExporter.template_file nbconvert_template.tex.j2 codecheck.ipynb

The environment.yml file defines a conda environment that can be used to install all the necessary packages:

conda env create -f environment.yml
conda activate codecheck-env

The (optional) nbconvert_template.tex.j2 template overrides some of the default nbconvert settings: it slightly reduces the font size and margins, and enables line breaks in verbatim output (useful, for example, when including command-line output or log files in the report).

This template makes a few assumptions:

  • The repository root contains a codecheck.yml file (i.e. ../codecheck.yml from the point of view of the Jupyter notebook) with all the required information stated in the configuration file specification.
  • Files referenced in the manifest of the configuration file are reproduced in the outputs subdirectory of codecheck, with their relative paths preserved. For example, if the manifest references a file figures/image.png, the reproduced image needs to be placed into codecheck/outputs/figures/image.png.
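
The path convention above can be sketched as a small helper. This is an illustration only: `reproduced_path` is a hypothetical name and is not taken from codecheck.py.

```python
from pathlib import Path, PurePosixPath


def reproduced_path(manifest_file: str, codecheck_dir: str = "codecheck") -> Path:
    """Map a manifest entry to the location where its reproduced copy is expected.

    Hypothetical helper: assumes manifest paths are relative to the
    repository root and use "/" as the separator.
    """
    relative = PurePosixPath(manifest_file)
    return Path(codecheck_dir, "outputs", *relative.parts)


print(reproduced_path("figures/image.png").as_posix())
```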

For an example use of this template in a CODECHECK, see https://github.com/codecheckers/causality-review/

Repository Structure

When using this template for a CODECHECK, your repository should follow this structure:

repository-root/
├── codecheck.yml                    # Configuration file (at root level)
├── figures/                         # Original figures from paper
│   ├── plot1.png
│   └── plot2.pdf
├── data/                           # Original data files
│   └── results.csv
├── code/                           # Original code to reproduce results
│   ├── analysis.py
│   └── generate_figures.R
└── codecheck/                      # CODECHECK materials (this template)
    ├── codecheck.py                # Helper module
    ├── codecheck.ipynb             # Certificate notebook
    ├── validation.py               # Validation module (NEW)
    ├── validation_config.py        # Validation configuration (NEW)
    ├── manifest.py                 # Manifest processing (NEW)
    ├── environment.yml             # Conda environment
    ├── nbconvert_template.tex.j2   # LaTeX template
    ├── codecheck_logo.png          # CODECHECK logo
    ├── codecheck.pdf               # Generated certificate (output)
    └── outputs/                    # Reproduced files from manifest
        ├── figures/
        │   ├── plot1.png           # Reproduced version
        │   └── plot2.pdf           # Reproduced version
        └── data/
            └── results.csv         # Reproduced version

Key Points:

  1. codecheck.yml is at the repository root, not inside the codecheck/ directory
  2. Original files (from the paper) live in the main repository structure
  3. Reproduced files (regenerated by the codechecker) go in codecheck/outputs/
  4. The directory structure within outputs/ should mirror the paths specified in the manifest
  5. File paths in codecheck.yml manifest are relative to the repository root

Example Workflow

# 1. Fork the repository to be checked
# 2. Copy this template into a codecheck/ directory
# 3. Create codecheck.yml at root with manifest entries
# 4. Run the authors' code to reproduce outputs
# 5. Copy reproduced files to codecheck/outputs/
mkdir -p codecheck/outputs/figures codecheck/outputs/data
cp figures/plot1.png codecheck/outputs/figures/plot1.png
cp data/results.csv codecheck/outputs/data/results.csv

# 6. Fill out codecheck.ipynb with your notes
# 7. Generate the certificate PDF
cd codecheck
jupyter nbconvert --to pdf --no-input --no-prompt --execute \
  --LatexExporter.template_file nbconvert_template.tex.j2 codecheck.ipynb

Validation Features

This template includes validation checks for your codecheck.yml configuration that can be run before generating the certificate. Validation helps catch errors early and helps ensure your CODECHECK meets the specification.

Quick Start with Validation

from codecheck import Codecheck

# Initialize with validation enabled
check = Codecheck(validate=True, strict=False)

# Or validate manually
check = Codecheck()
passed, issues = check.validate(
    check_manifest=True,
    check_register=True,  # Check GitHub register (default: True)
    strict=False
)
check.validation_report()

Validation Checks Performed

The validation system performs the following checks on your codecheck.yml file:

1. YAML Syntax Validation

  • ✓ Valid YAML structure
  • ✓ Proper indentation
  • ✓ No syntax errors
  • ✓ File is readable and parseable

2. Field Completeness

  • Mandatory fields (must be present):

    • manifest - List of reproduced files
    • codechecker - Name and ORCID of checker
    • report - DOI or URL of certificate report
  • Recommended fields (warnings if missing):

    • version - Config specification version
    • paper - Paper metadata (title, authors, reference)
    • repository - Code repository URL
    • check_time - When the check was performed
    • certificate - Certificate ID (YYYY-NNN format)
  • Optional fields:

    • summary - Summary of findings
    • source - Additional source information
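
The three tiers above could be checked roughly as follows. This is a minimal sketch, not the code in validation.py: the function name `check_fields` and the message wording are assumptions, and only the field names come from the lists above.

```python
from __future__ import annotations

# Field names as listed above; everything else in this sketch is hypothetical.
MANDATORY = ("manifest", "codechecker", "report")
RECOMMENDED = ("version", "paper", "repository", "check_time", "certificate")


def check_fields(config: dict) -> tuple[list[str], list[str]]:
    """Return (errors, warnings) for missing top-level fields."""
    errors = [
        f"Mandatory field '{name}' is missing"
        for name in MANDATORY
        if name not in config
    ]
    warnings = [
        f"Recommended field '{name}' is missing"
        for name in RECOMMENDED
        if name not in config
    ]
    return errors, warnings


errors, warnings = check_fields({"manifest": [], "report": "10.5281/zenodo.XXXXXX"})
print(errors)
print(len(warnings), "warnings")
```

Optional fields are not checked at all in this sketch, since their absence is not reported.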

3. Placeholder Detection

  • ✓ Detects common placeholder patterns:
    • "FIXME", "TODO", "template", "example"
    • "XXXXX", "placeholder"
  • ✓ Flags incomplete configuration values
  • ✓ Helps ensure all fields are filled in
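
One way such a scan could work is a recursive walk over the parsed configuration. The sketch below assumes case-insensitive substring matching, which the list above does not specify; `find_placeholders` is a hypothetical name.

```python
import re

# Patterns from the list above. Case-insensitive substring matching is an
# assumption of this sketch and may not match the template's behavior.
PLACEHOLDER_RE = re.compile(
    r"FIXME|TODO|template|example|XXXXX|placeholder", re.IGNORECASE
)


def find_placeholders(value, path=""):
    """Yield (path, value) for every string that looks like a placeholder."""
    if isinstance(value, dict):
        for key, item in value.items():
            child = f"{path}.{key}" if path else str(key)
            yield from find_placeholders(item, child)
    elif isinstance(value, list):
        for index, item in enumerate(value):
            yield from find_placeholders(item, f"{path}[{index}]")
    elif isinstance(value, str) and PLACEHOLDER_RE.search(value):
        yield path, value


config = {
    "report": "https://doi.org/10.5281/zenodo.XXXXXX",
    "paper": {"title": "FIXME", "authors": [{"name": "Ada Lovelace"}]},
}
for field, text in find_placeholders(config):
    print(f"{field}: {text!r}")
```

Substring matching is deliberately broad: a genuine paper title containing the word "example" would also be flagged, so findings of this kind are better treated as warnings than as errors.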

4. Certificate ID Format

  • ✓ Format: YYYY-NNN (e.g., 2023-001)
  • ✓ Year must be 4 digits
  • ✓ Number must be 3 digits
  • ✓ Detects placeholder certificates:
    • YYYY-001, 0000-001, 9999-001
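
The format rule and the placeholder list translate into a short check. This sketch uses hypothetical names and return values; only the YYYY-NNN pattern and the three placeholder IDs come from the list above.

```python
import re

CERTIFICATE_RE = re.compile(r"[0-9]{4}-[0-9]{3}")  # YYYY-NNN
PLACEHOLDER_CERTIFICATES = {"YYYY-001", "0000-001", "9999-001"}


def check_certificate_id(certificate_id: str) -> str:
    """Return 'placeholder', 'ok', or 'invalid' (labels chosen for this sketch)."""
    if certificate_id in PLACEHOLDER_CERTIFICATES:
        return "placeholder"
    if CERTIFICATE_RE.fullmatch(certificate_id):
        return "ok"
    return "invalid"


for candidate in ("2023-001", "YYYY-001", "23-1"):
    print(candidate, "->", check_certificate_id(candidate))
```

The placeholder test runs first because 0000-001 and 9999-001 would otherwise pass the format check.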

5. Report DOI/URL Validation

  • ✓ Must be a valid URL or DOI
  • ✓ Detects placeholder DOIs:
    • 10.5281/zenodo.XXXXXX
    • URLs containing "placeholder" or "example"
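
A rough version of this check using only the standard library is shown below. The DOI pattern is a commonly used approximation rather than an official grammar, `check_report` is a hypothetical name, and the Zenodo record number in the demo is made up.

```python
import re
from urllib.parse import urlparse

# Approximate shape of a bare DOI: "10.", a registrant code, "/", a suffix.
DOI_RE = re.compile(r"10\.[0-9]{4,9}/\S+")


def check_report(report: str) -> str:
    """Return 'placeholder', 'ok', or 'invalid' (labels chosen for this sketch)."""
    lowered = report.lower()
    if any(marker in lowered for marker in ("xxxxx", "placeholder", "example")):
        return "placeholder"
    if DOI_RE.fullmatch(report):
        return "ok"
    parsed = urlparse(report)
    if parsed.scheme in ("http", "https") and parsed.netloc:
        return "ok"
    return "invalid"


for candidate in (
    "10.5281/zenodo.XXXXXX",
    "https://doi.org/10.5281/zenodo.1234567",  # illustrative number only
    "not a url",
):
    print(candidate, "->", check_report(candidate))
```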

6. ORCID Format Validation

  • ✓ Format: 0000-0000-0000-0000 (or ending in X)
  • ✓ Validates for all authors
  • ✓ Validates for codechecker(s)
  • ✓ Detects invalid ORCID patterns
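
The format check can be written as a regular expression. The sketch below additionally verifies the final character using the ISO 7064 MOD 11-2 check digit that ORCID iDs carry; whether the template performs that second step is not stated above, so it is optional here. Function names are hypothetical.

```python
import re

ORCID_RE = re.compile(r"([0-9]{4})-([0-9]{4})-([0-9]{4})-([0-9]{3}[0-9X])")


def orcid_check_digit(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for char in base_digits:
        total = (total + int(char)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid: str, verify_checksum: bool = True) -> bool:
    """Check the 0000-0000-0000-000X shape and, optionally, the check digit."""
    match = ORCID_RE.fullmatch(orcid)
    if match is None:
        return False
    if not verify_checksum:
        return True
    digits = "".join(match.groups())
    return orcid_check_digit(digits[:15]) == digits[15]


print(is_valid_orcid("0000-0002-1825-0097"))
print(is_valid_orcid("0000-0002-1825-0098"))
```

The first iD is the sample used in ORCID's own documentation; the second differs only in the last digit and therefore fails the checksum while still matching the format.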

7. Date/Time Format Validation

  • check_time must be in ISO 8601 format
  • ✓ Format: YYYY-MM-DDTHH:MM:SS
  • ✓ Example: 2023-11-15T14:30:00
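
A sketch of this check with the standard library follows. It combines a strict pattern (because `datetime.strptime` alone also accepts non-padded values such as 2023-1-5T1:2:3) with a parse to reject impossible dates. It also accepts values that a YAML loader may already have converted to `datetime` objects. `is_valid_check_time` is a hypothetical name.

```python
import re
from datetime import datetime

CHECK_TIME_RE = re.compile(
    r"[0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}:[0-9]{2}:[0-9]{2}"
)


def is_valid_check_time(value) -> bool:
    """Accept YYYY-MM-DDTHH:MM:SS strings that denote a real date and time."""
    # Unquoted timestamps may arrive already parsed, depending on the YAML loader.
    if isinstance(value, datetime):
        return True
    if not isinstance(value, str) or not CHECK_TIME_RE.fullmatch(value):
        return False
    try:
        datetime.strptime(value, "%Y-%m-%dT%H:%M:%S")
    except ValueError:
        return False
    return True


for candidate in ("2023-11-15T14:30:00", "2023-11-15 14:30:00", "2023-02-30T00:00:00"):
    print(candidate, "->", is_valid_check_time(candidate))
```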

8. Paper Structure Validation

  • ✓ Paper section must be a dictionary
  • ✓ Must contain: title, authors, reference
  • ✓ Authors must be a list
  • ✓ Each author must have name field
  • ✓ ORCID recommended for each author

9. Codechecker Structure Validation

  • ✓ Must be a dictionary
  • ✓ Must contain name field
  • ✓ ORCID strongly recommended

10. Manifest Structure Validation

  • ✓ Must be a list (not empty)
  • ✓ Each entry must be a dictionary
  • ✓ Each entry must have file field
  • comment field is optional but recommended
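
The structural rules above amount to a few type checks on the parsed manifest. A minimal sketch, with a hypothetical function name and message wording:

```python
from __future__ import annotations


def check_manifest_structure(manifest) -> list[str]:
    """Return a list of problems; an empty list means the structure is fine."""
    if not isinstance(manifest, list) or not manifest:
        return ["manifest must be a non-empty list"]
    problems = []
    for index, entry in enumerate(manifest):
        if not isinstance(entry, dict):
            problems.append(f"manifest[{index}] must be a dictionary")
        elif not entry.get("file"):
            problems.append(f"manifest[{index}] is missing the 'file' field")
    return problems


manifest = [
    {"file": "figures/plot1.png", "comment": "Figure 1"},
    {"comment": "entry without a file"},
    "data/results.csv",
]
for problem in check_manifest_structure(manifest):
    print(problem)
```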

11. Manifest File Existence

  • ✓ Checks all files exist in codecheck/outputs/
  • ✓ Reports missing files
  • ✓ Validates paths are safe (no path traversal)
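
The existence and path-safety checks can be combined by resolving each manifest path against the outputs directory and rejecting anything that ends up outside it. This is a sketch under stated assumptions: `check_manifest_files` is a hypothetical name, and `Path.is_relative_to` requires Python 3.9 or later.

```python
import tempfile
from pathlib import Path


def check_manifest_files(files, outputs_dir):
    """Return (missing, unsafe) lists for the given manifest file paths."""
    root = Path(outputs_dir).resolve()
    missing, unsafe = [], []
    for name in files:
        candidate = (root / name).resolve()
        # "../x" or an absolute path resolves outside outputs/: path traversal.
        if not candidate.is_relative_to(root):
            unsafe.append(name)
        elif not candidate.is_file():
            missing.append(name)
    return missing, unsafe


# Demo in a throwaway directory: one file present, one missing, one unsafe.
with tempfile.TemporaryDirectory() as tmp:
    outputs = Path(tmp, "codecheck", "outputs")
    (outputs / "figures").mkdir(parents=True)
    (outputs / "figures" / "plot1.png").write_bytes(b"")
    missing, unsafe = check_manifest_files(
        ["figures/plot1.png", "data/results.csv", "../secret.txt"], outputs
    )

print("missing:", missing)
print("unsafe:", unsafe)
```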

12. GitHub Register Issue Verification (NEW)

  • ✓ Checks if a GitHub issue exists in codecheckers/register
  • ✓ Searches for issue with certificate ID in title
  • ERROR if no matching issue found
  • WARNING if issue is closed
  • WARNING if issue is unassigned
  • ✓ Can be disabled with check_register=False
  • ✓ Gracefully handles network errors (warns but doesn't fail)
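
The decision rules above can be separated from the network call. The sketch below assumes the issues have already been fetched and are dictionaries shaped like GitHub REST API issue objects (with `title`, `state`, and `assignees` keys); fetching and network error handling are left out. `classify_register_issues` and the message texts are hypothetical.

```python
def classify_register_issues(certificate_id, issues):
    """Return a list of (level, message) findings for the register check."""
    matches = [
        issue for issue in issues if certificate_id in issue.get("title", "")
    ]
    if not matches:
        return [("ERROR", f"No register issue found for certificate {certificate_id}")]
    issue = matches[0]
    findings = []
    if issue.get("state") == "closed":
        findings.append(("WARNING", f"Register issue for {certificate_id} is closed"))
    if not issue.get("assignees"):
        findings.append(("WARNING", f"Register issue for {certificate_id} is unassigned"))
    return findings


# Made-up issue data for illustration.
issues = [
    {"title": "Some paper | 2023-001", "state": "closed", "assignees": []},
    {"title": "Another paper | 2023-002", "state": "open", "assignees": [{"login": "checker"}]},
]
print(classify_register_issues("2023-001", issues))
print(classify_register_issues("2023-002", issues))
print(classify_register_issues("2023-003", issues))
```

Keeping the classification free of network access also makes it straightforward to test with fixed data instead of live API calls.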

Validation Modes

Non-strict mode (default):

  • Reports errors and warnings
  • Only fails on errors
  • Allows generation with warnings

Strict mode:

  • Treats warnings as failures
  • Ensures complete configuration
  • Use for final validation
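
The difference between the two modes comes down to which severities count as blocking. A minimal sketch, assuming issues are represented as (level, message) pairs, which may differ from the objects that check.validate() returns:

```python
def validation_passed(issues, strict=False):
    """Return True if no blocking issue is present.

    Hypothetical helper: `issues` is a list of (level, message) pairs with
    level "ERROR" or "WARNING".
    """
    blocking = {"ERROR", "WARNING"} if strict else {"ERROR"}
    return not any(level in blocking for level, _message in issues)


only_warnings = [("WARNING", "Author 1 ORCID is missing")]
print(validation_passed(only_warnings))               # non-strict mode
print(validation_passed(only_warnings, strict=True))  # strict mode
```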

Example Validation Output

## ❌ Errors (2)

- **manifest**: Missing 2 file(s) in outputs/: figures/plot1.png, data/results.csv
  - *Suggestion*: Copy all manifest files to codecheck/outputs/ directory

- **codechecker.name**: Codechecker name is missing
  - *Suggestion*: Add name field for codechecker

## ⚠️  Warnings (3)

- **certificate**: Certificate ID 'YYYY-001' appears to be a placeholder
  - *Suggestion*: Replace with actual certificate ID (format: YYYY-NNN)

- **paper.authors[0].ORCID**: Author 1 ORCID is missing
  - *Suggestion*: Add ORCID for complete author information

- **summary**: Recommended field 'summary' is missing
  - *Suggestion*: Consider adding 'summary' for a complete certificate

Disabling Register Checks

If you need to validate without checking the GitHub register (e.g., for offline work or testing):

# Disable register check
passed, issues = check.validate(
    check_manifest=True,
    check_register=False,  # Skip GitHub API call
    strict=False
)

Testing

This template includes a comprehensive test suite. To run tests:

# Install test dependencies
conda env create -f environment.yml
conda activate codecheck-env

# Run tests
pytest tests/ -v

# Run with coverage
pytest tests/ -v --cov=. --cov-report=term-missing

The test suite includes:

  • 50+ unit tests for validation functions
  • 14 tests for GitHub register issue verification
  • Integration tests for the complete workflow
  • Tests with various invalid configurations
  • Fixture-based testing with example configurations
  • Mock-based testing for GitHub API interactions

Continuous Integration

Tests run automatically on GitHub Actions for every push to the main branch. See .github/workflows/test.yml for the CI configuration.

License

This repository is licensed under the MIT License; see the LICENSE file for details.
