Contributor

Copilot AI commented Nov 9, 2025

ExamKit Pull Request

Description

Built a complete offline exam-preparation system that transforms lecture materials (video, transcripts, slides, exams) into cited study PDFs using a local LLM and RAG pipeline.

Core pipeline: Ingestion → NLP (embeddings + FAISS) → LLM synthesis (Ollama) → PDF rendering (Typst/Pandoc)

Implementation

Ingestion (6 modules)

  • Multi-format parsing: VTT/SRT/TXT transcripts, PPTX/PDF slides, PDF exams
  • FFmpeg audio extraction, Tesseract OCR fallback
  • Normalized JSONL output with manifest validation
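For illustration, the VTT-to-JSONL normalization step can be sketched as follows. The function name and segment schema here are hypothetical; the actual transcript_normalizer module may differ:

```python
import json
import re

def normalize_vtt(vtt_text: str) -> list[dict]:
    """Parse WebVTT cues into normalized segment dicts (start, end, text)."""
    cue_re = re.compile(
        r"(\d{2}:\d{2}:\d{2})\.\d{3}\s-->\s(\d{2}:\d{2}:\d{2})\.\d{3}\n(.+?)(?:\n\n|\Z)",
        re.DOTALL,
    )
    segments = []
    for start, end, text in cue_re.findall(vtt_text):
        # Collapse internal whitespace/newlines inside a cue into single spaces
        segments.append({"start": start, "end": end, "text": " ".join(text.split())})
    return segments

vtt = """WEBVTT

00:00:01.000 --> 00:00:04.000
Welcome to the lecture.

00:00:04.500 --> 00:00:09.000
Today we cover gradient descent.
"""
segments = normalize_vtt(vtt)
jsonl = "\n".join(json.dumps(s) for s in segments)
```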

NLP (5 modules)

  • Embeddings: sentence-transformers (all-MiniLM-L6-v2, 384-dim)
  • Vector search: FAISS indexing with semantic retrieval
  • Topic mapping with coverage metrics
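The retrieval step can be sketched with a plain NumPy stand-in for FAISS's inner-product index (toy 4-dimensional vectors in place of the 384-dim MiniLM embeddings; function names are illustrative, not the repo's API):

```python
import numpy as np

def build_index(vectors: np.ndarray) -> np.ndarray:
    """L2-normalize rows so dot product equals cosine similarity (mirrors IndexFlatIP usage)."""
    norms = np.linalg.norm(vectors, axis=1, keepdims=True)
    return vectors / np.clip(norms, 1e-12, None)

def search(index: np.ndarray, query: np.ndarray, top_k: int = 8) -> list[int]:
    """Return indices of the top_k most similar chunks for a query vector."""
    q = query / max(np.linalg.norm(query), 1e-12)
    scores = index @ q
    return np.argsort(-scores)[:top_k].tolist()

# Toy "embeddings": two similar chunks about one topic, one unrelated chunk
chunks = np.array([[1.0, 0.0, 0.0, 0.0], [0.9, 0.1, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]])
index = build_index(chunks)
hits = search(index, np.array([1.0, 0.05, 0.0, 0.0]), top_k=2)
```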

Synthesis (5 modules)

  • Ollama client for local LLM (llama3.2:8b)
  • Jinja2 prompt templates: definition, derivation, mistakes, revision
  • Citation tracking: [vid HH:MM:SS], [slide N], [exam Q2b]
  • Graphviz diagram generation
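The citation formats above can be produced by small helpers along these lines (hypothetical names; the real citations module may be structured differently):

```python
def format_video_citation(seconds: float) -> str:
    """Render a transcript timestamp as a [vid HH:MM:SS] citation."""
    h, rem = divmod(int(seconds), 3600)
    m, s = divmod(rem, 60)
    return f"[vid {h:02d}:{m:02d}:{s:02d}]"

def format_slide_citation(slide: int) -> str:
    """Render a slide number as a [slide N] citation."""
    return f"[slide {slide}]"

def format_exam_citation(question_id: str) -> str:
    """Render an exam question id as an [exam Qxx] citation."""
    return f"[exam {question_id}]"
```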

Rendering (3 modules)

  • Typst compilation (primary), Pandoc fallback
  • Professional templates with ToC, formulas, styling
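A minimal sketch of the primary/fallback selection, assuming Typst and Pandoc are invoked as external commands (names and flags are illustrative, not the repo's actual renderer API):

```python
import shutil
from pathlib import Path

def compile_pdf_command(source: Path, out: Path) -> list[str]:
    """Build the compile command, preferring Typst and falling back to Pandoc.

    The caller would execute it with subprocess.run(cmd, check=True).
    """
    if shutil.which("typst"):
        cmd = ["typst", "compile", str(source), str(out)]
    else:
        cmd = ["pandoc", str(source), "-o", str(out), "--pdf-engine=xelatex"]
    return cmd

cmd = compile_pdf_command(Path("notes.typ"), Path("notes.pdf"))
```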

QA & Reports (4 modules)

  • Formula validation, link checking, keyword coverage
  • Coverage CSV, citations JSON export
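Two of these checks can be sketched as follows (illustrative helpers, not the repo's exact implementations):

```python
import re

def keyword_coverage(text: str, keywords: list[str]) -> float:
    """Fraction of expected keywords that appear in the generated notes (case-insensitive)."""
    lowered = text.lower()
    hits = sum(1 for kw in keywords if kw.lower() in lowered)
    return hits / len(keywords) if keywords else 1.0

def find_citations(text: str) -> list[str]:
    """Detect citation markers like [vid 00:01:30], [slide 4], [exam Q2b]."""
    return re.findall(r"\[(?:vid \d{2}:\d{2}:\d{2}|slide \d+|exam Q\w+)\]", text)

notes = "Gradient descent [vid 00:01:30] minimizes loss [slide 4]; see past paper [exam Q2b]."
coverage = keyword_coverage(notes, ["gradient descent", "loss", "momentum"])
cites = find_citations(notes)
```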

CLI Commands

# Process inputs
examkit ingest --manifest manifest.json

# Generate PDF with citations
examkit build --config config.yml --out notes.pdf --offline

# Coverage report
examkit report --session demo --open

Configuration

asr:
  engine: faster-whisper
  model: small
llm:
  engine: ollama
  model: llama3.2:8b
  temperature: 0.2
embedding:
  model: all-MiniLM-L6-v2
  dim: 384
retrieval:
  top_k: 8
pdf:
  engine: typst

Type of Change

  • New feature
  • Bug fix
  • Breaking change
  • Documentation update

Checklist

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes

Testing

  • VTT/SRT/TXT transcript normalization (3 formats × edge cases)
  • PPTX/PDF slides parsing with OCR fallback
  • Exam structure extraction with marks detection
  • Topic mapping and coverage calculation
  • Citation formatting (video, slides, exam)
  • Template rendering (Typst, Jinja2)
  • Formula validation and link checking

Additional Context

Project Stats: 38 Python modules, 5 test suites, 9 templates, ~5K LOC

Security: CodeQL scan passed with 0 vulnerabilities. Offline mode enforced: no network calls during processing.

Dependencies: All free/OSS (faster-whisper, FAISS, spaCy, Ollama, Typst). System deps via Homebrew.

Docs: README (installation/usage), ARCHITECTURE (system design), CONTRIBUTING (dev guidelines), IMPLEMENTATION_SUMMARY (requirements checklist).

Original prompt

This section details the original issue you should resolve

<issue_title>Master Build: Production-Grade Python "ExamKit" Project Generator (macOS, Offline, OSS)</issue_title>
<issue_description>Perfect—here’s a ready-to-use, master prompt you can paste into ChatGPT (or your favorite code-gen assistant) to generate the full Python project from the requirements we just defined.


✅ Master Build Prompt — “ExamKit” (Python, Local-Only, macOS)

Role: You are a senior Python engineer. Generate a complete, production-grade Python project called examkit that runs fully offline on macOS using only free/open-source tools. The project ingests lecture video, transcripts, slides, session topics, exam topics, and exam papers and outputs an exam-ready PDF with citations, formulas, diagrams, and a coverage report.

Follow every instruction precisely. Produce all files exactly as specified, with type hints, docstrings, and clear comments.


1) Objectives & Constraints

  • Local-only (offline): No network calls during processing. Everything must run on macOS with Apple Silicon/Intel.
  • Free/Open-source: Use faster-whisper, PyMuPDF, python-pptx, tesseract, ffmpeg, faiss-cpu, sentence-transformers, spaCy, matplotlib, jinja2, Typst (preferred) OR pandoc+wkhtmltopdf fallback. Use Ollama for local LLM (llama3.2:8b default).
  • Reproducible CLI pipeline with config (config/config.yml) and deterministic outputs.
  • Traceability: Every paragraph in PDF must cite sources (video timecodes, slide numbers, exam question ids).
  • Portability: No Docker required. Poetry environment or uv is fine.

2) Deliverables (Create all these files)

Project root

examkit/
  pyproject.toml
  README.md
  LICENSE
  Makefile
  .gitignore
  .env.example
  examkit/                      # Python package
    __init__.py
    cli.py                      # Typer-based CLI (or Click), entrypoint
    config.py                   # Pydantic models for config
    logging_utils.py
    utils/
      __init__.py
      io_utils.py
      text_utils.py
      timecode.py
      math_utils.py
    ingestion/
      __init__.py
      ingest.py                 # Manifest, validation, ffmpeg extract
      transcript_normalizer.py  # VTT/SRT/TXT → jsonl segments
      slides_parser.py          # PPTX→JSONL, images; PDF→JSONL via PyMuPDF+OCR
      exam_parser.py            # Exam paper structure/marks extraction
      ocr.py                    # Tesseract helper
    asr/
      __init__.py
      whisper_runner.py         # faster-whisper wrapper (offline)
    nlp/
      __init__.py
      splitter.py               # sentence/paragraph segmentation
      embeddings.py             # sentence-transformers; FAISS index
      topic_mapping.py          # syllabus mapping, coverage matrix
      retrieval.py              # RAG over FAISS
      spaCy_nlp.py              # NER, cleanup (en_core_web_sm)
    synthesis/
      __init__.py
      prompts.py                # Jinja templates for prompts
      ollama_client.py          # local LLM calls via subprocess/http
      composer.py               # section builders: def/intuit/derivation/examples/common mistakes
      citations.py              # manage refs: [vid hh:mm:ss][slide N][exam Q2b]
      diagrams.py               # Graphviz/Mermaid helpers
    render/
      __init__.py
      templater.py              # Jinja2 → Markdown/Typst
      typst_renderer.py         # Typst compile
      pandoc_renderer.py        # Fallback path
    qa/
      __init__.py
      checks.py                 # formulas compile, link checker, keyword recall
    reports/
      __init__.py
      coverage.py               # topic coverage csv/json
      export.py                 # write citations.json, coverage.csv
  config/
    config.yml
    templates/
      typst/
        main.typ                # Typst main template
        theme.typ               # typography/theme
      markdown/
        section.md.j2           # per-topic section template
        pdf_main.md.j2          # stitched MD template
      prompts/
        definition.j2
        derivation.j2
        mistakes.j2
        compare.j2
        fast_revision.j2
  input/
    sample/
      video/sample.mp4          # (stub, small or placeholder note)
      transcript/sample.vtt
      slides/sample.pptx
      exam/sample_exam.pdf
      topics/session_topics.yml
      topics/exam_topics.yml
  out/                          # build artifacts
  cache/
  logs/
  tests/
    test_ingestion.py
    test_parsers.py
    test_topic_mapping.py
    test_render.py

3) pyproject.toml (Poetry) — Required Dependencies

Include at least:

  • typer[all], rich, pydantic, pyyaml, tqdm
  • faster-whisper, ffmpeg-python
  • pymupdf, pdfminer.six, python-pptx
  • pytesseract, Pillow
  • sentence-transformers, faiss-cpu, spacy (en_core_web_sm in README)
  • `m...


Summary by CodeRabbit

Release Notes

  • New Features

    • CLI tool for ingesting and processing educational content (videos, transcripts, slides, exams)
    • Offline-first PDF generation for exam preparation materials
    • Semantic search and automatic topic mapping with coverage analysis
    • Quality assurance checks for generated content
    • Citation tracking and coverage reporting
  • Documentation

    • Architecture guide, contribution guidelines, and implementation summary
    • Configuration templates and sample workflows
  • Chores

    • Project structure, dependencies, and CI configuration

@coderabbitai

coderabbitai bot commented Nov 9, 2025

Important

Review skipped

Bot user detected.

To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Walkthrough

Introduces ExamKit, a comprehensive Python application for generating exam preparation materials offline. Creates a complete project structure including CLI, ingestion pipeline for videos/transcripts/slides/exams, NLP processing with embeddings and topic mapping, content synthesis via local Ollama LLM, PDF rendering with Typst, QA checks, and reporting—all with Jinja2 templates, Poetry dependency management, and extensive documentation.

Changes

  • Configuration & Environment (.env.example, config/config.yml, pyproject.toml, Makefile): Environment template with logging/Ollama settings; YAML config defaults for ASR, LLM, embeddings, retrieval, PDF, diagrams, offline mode; Poetry project manifest with dev tools; Makefile with setup, test, lint, format, build-demo, clean targets
  • Jinja2 Templates (config/templates/markdown/pdf_main.md.j2, config/templates/markdown/section.md.j2, config/templates/prompts/*.j2, config/templates/typst/main.typ, config/templates/typst/theme.typ): Markdown document template with metadata, TOC, coverage summary; section template for topics (definition, formulas, derivation, examples, mistakes, revision); 6 prompt templates (definition, derivation, mistakes, compare, revision, examples) for LLM generation; Typst document configuration function and theme system with colors, boxes, and citation styling
  • GitHub & Project Meta (.github/PULL_REQUEST_TEMPLATE.md, .gitignore, LICENSE, README.md): PR template with description/checklist sections; comprehensive ignore patterns for Python/IDE/media artifacts; MIT License; project overview and setup instructions
  • Documentation (ARCHITECTURE.md, CONTRIBUTING.md, IMPLEMENTATION_SUMMARY.md): Architecture diagram and module descriptions with data flow; contribution guidelines with code standards, PR workflow, testing, setup; implementation completion summary with deliverables, dependencies, acceptance criteria
  • Core CLI & Config (examkit/__init__.py, examkit/cli.py, examkit/config.py, examkit/logging_utils.py): Package metadata (version 0.1.0, contributors); Typer CLI with ingest, build, report, cache commands and session management; Pydantic config models for ASR/LLM/embedding/retrieval/PDF with YAML I/O; centralized logging setup with Rich console support
  • Ingestion Pipeline (examkit/ingestion/__init__.py, examkit/ingestion/ingest.py, examkit/ingestion/exam_parser.py, examkit/ingestion/slides_parser.py, examkit/ingestion/transcript_normalizer.py, examkit/ingestion/ocr.py): Manifest validation and orchestration pipeline with ffmpeg audio extraction, transcript normalization (VTT/SRT/TXT); PDF exam parser extracting marks/sections/questions; PPTX/PDF slide parsing with OCR fallback; Tesseract-based OCR with availability guards
  • NLP & Embeddings (examkit/nlp/__init__.py, examkit/nlp/embeddings.py, examkit/nlp/retrieval.py, examkit/nlp/splitter.py, examkit/nlp/spacy_nlp.py, examkit/nlp/topic_mapping.py): Sentence-transformers embeddings with FAISS indexing; RAG retrieval with deduplication, source diversity ranking, confidence filtering; spaCy-based NLP (entities, phrases, lemmatization, language patterns); text chunking/merging; topic loading, chunk-to-topic mapping, coverage calculation, gap identification
  • Synthesis & LLM (examkit/synthesis/__init__.py, examkit/synthesis/composer.py, examkit/synthesis/ollama_client.py, examkit/synthesis/prompts.py, examkit/synthesis/citations.py, examkit/synthesis/diagrams.py): Main orchestration pipeline: load data, embed chunks, map to topics, retrieve context, generate content via LLM, render sections, compile PDF, export reports; Ollama HTTP client with availability checks, completion/chat generation, model pulling; Jinja prompt template rendering; CitationManager for tracking sources (transcript/slides/exam); Graphviz/Mermaid diagram generation with fallback handling
  • Rendering (examkit/render/__init__.py, examkit/render/templater.py, examkit/render/typst_renderer.py, examkit/render/pandoc_renderer.py): Jinja environment setup and Markdown/Typst document rendering; Typst compilation with Markdown wrapping; Pandoc fallback PDF generation with XeLaTeX; template and section rendering helpers
  • QA & Reporting (examkit/qa/__init__.py, examkit/qa/checks.py, examkit/reports/__init__.py, examkit/reports/coverage.py, examkit/reports/export.py): Quality checks: LaTeX formula validation, markdown link verification, keyword recall, citation detection, equation consistency; coverage reporting with CSV export and gap identification; session report generation with coverage/QA/citations aggregation and text/JSON export
  • Utilities (examkit/utils/__init__.py, examkit/utils/io_utils.py, examkit/utils/text_utils.py, examkit/utils/math_utils.py, examkit/utils/timecode.py): File I/O (JSON, JSONL, text, directory management); text processing (cleaning, tokenization, keyword extraction, truncation); LaTeX formula extraction/validation, coverage calculation, score normalization, symbol extraction; video timecode conversion and citation formatting
  • ASR Module (examkit/asr/__init__.py, examkit/asr/whisper_runner.py): Faster-Whisper offline ASR wrapper with 16kHz mono WAV conversion, segment-to-timecode mapping, VTT export
  • Test Suite (tests/__init__.py, tests/test_ingestion.py, tests/test_parsers.py, tests/test_render.py, tests/test_topic_mapping.py): Unit tests for transcript parsing (VTT/SRT/TXT), manifest validation, exam/text/math utilities, rendering, config loading, coverage calculation, topic mapping, segmentation, QA checks
  • Sample Fixtures (input/sample/manifest.json, input/sample/exam/README.md, input/sample/slides/README.md, input/sample/transcript/sample.vtt, input/sample/video/README.md, input/sample/topics/exam_topics.yml, input/sample/topics/session_topics.yml): Manifest metadata for lecture session; sample exam structure with sections and marks; sample WebVTT transcript; topic definitions for session and exam; READMEs explaining sample content and generation

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant CLI
    participant Ingest as Ingestion
    participant NLP
    participant Synth as Synthesis
    participant Render
    participant Report

    User->>CLI: ingest --manifest
    CLI->>Ingest: validate_manifest
    Ingest->>Ingest: extract_audio_from_video (ffmpeg)
    Ingest->>Ingest: normalize_transcript (VTT/SRT/TXT)
    Ingest->>Ingest: parse_slides (PPTX/PDF + OCR)
    Ingest->>Ingest: parse_exam (marks, questions)
    Ingest-->>CLI: cache → segments.jsonl

    User->>CLI: build --config --session_id
    CLI->>NLP: generate_embeddings
    NLP->>NLP: load FAISS index
    NLP-->>Synth: embeddings, index
    Synth->>NLP: map_chunks_to_topics
    NLP-->>Synth: topic_mapping, coverage
    
    Synth->>Synth: For each topic:<br/>retrieve_context_for_topic
    Synth->>Synth: RAG + Ollama LLM<br/>generate definition/derivation/etc
    Synth->>Synth: CitationManager.add_citation
    Synth-->>Synth: sections with citations

    Synth->>Render: render_markdown_document
    Render->>Render: setup Jinja environment
    Render->>Render: render section templates
    Render-->>Synth: markdown content
    Synth->>Render: compile_typst_to_pdf
    Render->>Render: Typst or Pandoc compile
    Render-->>Synth: out/session.pdf

    Synth->>Report: generate_report
    Report->>Report: run_all_checks
    Report-->>Report: coverage.csv, citations.json, notes.md
    CLI-->>User: ✓ Pipeline complete

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~75 minutes

  • Scope & heterogeneity: 100+ files spanning configuration, templates, 8 Python packages with distinct responsibilities (ingestion, NLP, synthesis, rendering, QA, reporting, utilities, ASR), external tool integration, and tests. While patterns are consistent, each domain requires separate reasoning.
  • Logic density: Moderate to high in orchestration points (composer.py, ingest.py, embeddings/FAISS integration, Ollama client, rendering pipeline). Straightforward utility functions balance heavier logic.
  • External dependencies: Multiple third-party tools (ffmpeg, Tesseract, Ollama, Typst/Pandoc, FAISS, sentence-transformers) with availability guards and fallback strategies require careful verification.
  • Specific attention areas:
    • examkit/synthesis/composer.py — orchestrates entire pipeline; complex error handling per topic
    • examkit/ingestion/ingest.py — ffmpeg subprocess calls and file I/O error paths
    • examkit/nlp/embeddings.py — FAISS index creation and search logic
    • examkit/synthesis/ollama_client.py — HTTP client with offline validation
    • examkit/render/typst_renderer.py — subprocess management and Markdown-to-Typst conversion
    • Citation tracking and chunk metadata flow across modules
    • Template rendering contexts and Jinja2 safety

Poem

🐰 Behold! A kit for exams so bright,
With whispered words and topics right,
From video streams to PDF dreams,
Citations flow through every seam,
Offline magic, local and true—
ExamKit's here to prep for you! 📚✨

Pre-merge checks and finishing touches

✅ Passed checks (5 passed)
  • Description Check: ✅ Passed. Check skipped; CodeRabbit's high-level summary is enabled.
  • Title check: ✅ Passed. The PR title clearly and concisely summarizes the main deliverable: a production-grade, offline exam prep toolkit with RAG pipeline named ExamKit.
  • Linked Issues check: ✅ Passed. The PR comprehensively implements all major requirements from #1: complete offline ingestion pipeline (VTT/SRT/TXT/PPTX/PDF/exam parsing), NLP (embeddings, FAISS, topic mapping), synthesis (Ollama, Jinja templates, citations), rendering (Typst/Pandoc), QA checks, CLI, config, tests, and documentation.
  • Out of Scope Changes check: ✅ Passed. All changes are directly aligned with issue #1 requirements: configuration templates, ingestion modules, NLP components, synthesis pipeline, rendering infrastructure, QA framework, utilities, CLI, tests, documentation, and sample fixtures are all specified deliverables.
  • Docstring Coverage: ✅ Passed. Docstring coverage is 100.00%, which is sufficient; the required threshold is 80.00%.

Comment @coderabbitai help to get the list of available commands and usage tips.

Copilot AI and others added 5 commits November 9, 2025 17:04
…, nlp, synthesis)

Co-authored-by: thecoder8890 <33773206+thecoder8890@users.noreply.github.com>
Copilot AI changed the title [WIP] Add production-grade Python ExamKit project generator Implement production-grade ExamKit: offline exam prep toolkit with RAG pipeline Nov 9, 2025
Copilot finished work on behalf of thecoder8890 November 9, 2025 17:22
Copilot AI requested a review from thecoder8890 November 9, 2025 17:22
@thecoder8890
Owner

@coderabbitai review

@coderabbitai

coderabbitai bot commented Nov 10, 2025

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@thecoder8890 thecoder8890 marked this pull request as ready for review November 10, 2025 04:13
@thecoder8890
Owner

@coderabbitai review

@coderabbitai

coderabbitai bot commented Nov 10, 2025

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@coderabbitai

coderabbitai bot commented Nov 10, 2025

Note

Docstrings generation - SUCCESS
Generated docstrings for this pull request at #3

coderabbitai bot added a commit that referenced this pull request Nov 10, 2025
Docstrings generation was requested by @thecoder8890.

* #2 (comment)

The following files were modified:

* `examkit/asr/whisper_runner.py`
* `examkit/cli.py`
* `examkit/config.py`
* `examkit/ingestion/exam_parser.py`
* `examkit/ingestion/ingest.py`
* `examkit/ingestion/ocr.py`
* `examkit/ingestion/slides_parser.py`
* `examkit/ingestion/transcript_normalizer.py`
* `examkit/logging_utils.py`
* `examkit/nlp/embeddings.py`
* `examkit/nlp/retrieval.py`
* `examkit/nlp/spacy_nlp.py`
* `examkit/nlp/splitter.py`
* `examkit/nlp/topic_mapping.py`
* `examkit/qa/checks.py`
* `examkit/render/pandoc_renderer.py`
* `examkit/render/templater.py`
* `examkit/render/typst_renderer.py`
* `examkit/reports/coverage.py`
* `examkit/reports/export.py`
* `examkit/synthesis/citations.py`
* `examkit/synthesis/composer.py`
* `examkit/synthesis/diagrams.py`
* `examkit/synthesis/ollama_client.py`
* `examkit/synthesis/prompts.py`
* `examkit/utils/io_utils.py`
* `examkit/utils/math_utils.py`
* `examkit/utils/text_utils.py`
* `examkit/utils/timecode.py`
* `tests/test_render.py`

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 19

🧹 Nitpick comments (43)
pyproject.toml (1)

61-66: Consider stricter MyPy configuration for production-grade code.

The PR objectives emphasize "production-grade" development with type hints, but the current MyPy settings are quite permissive:

  • disallow_untyped_defs = false allows functions without type annotations
  • ignore_missing_imports = true suppresses import resolution errors

For a production-grade codebase, consider gradually tightening these settings as the project matures.

Apply this diff to enable stricter type checking:

 [tool.mypy]
 python_version = "3.11"
 warn_return_any = true
 warn_unused_configs = true
-disallow_untyped_defs = false
-ignore_missing_imports = true
+disallow_untyped_defs = true
+ignore_missing_imports = false
+# Temporarily allow untyped definitions for gradual adoption
+# disallow_untyped_defs = false

Alternatively, keep the current permissive settings but add a comment indicating plans for future strictness, and consider enabling per-module overrides as modules become fully typed.

examkit/ingestion/ocr.py (2)

16-43: Consider using logging.exception for better error diagnostics.

The current error logging provides the error message but not the full traceback. Using logging.exception would include the stack trace, making debugging easier.

Apply this diff:

     except Exception as e:
-        logger.error(f"OCR failed for {image_path}: {e}")
+        logger.exception(f"OCR failed for {image_path}: {e}")
         return ""

46-72: Consider using logging.exception for better error diagnostics.

Similar to extract_text_with_ocr, using logging.exception would provide more debugging context when OCR confidence calculation fails.

Apply this diff:

     except Exception as e:
-        logger.error(f"Failed to get OCR confidence for {image_path}: {e}")
+        logger.exception(f"Failed to get OCR confidence for {image_path}: {e}")
         return 0.0
examkit/utils/math_utils.py (1)

6-6: Remove unused import.

The Optional type is imported but never used in any function signature.

Apply this diff:

-from typing import List, Optional
+from typing import List
examkit/nlp/spacy_nlp.py (1)

75-98: Non-deterministic ordering of key phrases.

Line 98 converts phrases to a set() and then slices, which produces arbitrary ordering since sets are unordered in Python. If consistent ordering is important, consider preserving insertion order.

If deterministic ordering is desired, apply this diff:

-    # Return unique phrases, limited to top_n
-    return list(set(phrases))[:top_n]
+    # Return unique phrases, limited to top_n (preserve order)
+    seen = set()
+    unique_phrases = []
+    for phrase in phrases:
+        if phrase not in seen:
+            seen.add(phrase)
+            unique_phrases.append(phrase)
+    return unique_phrases[:top_n]
README.md (2)

21-69: Consider adding language specifiers for better rendering.

The ASCII architecture diagram renders correctly but adding a language specifier would improve syntax highlighting and rendering consistency.

Apply this change:

-```
+```text
 ┌─────────────────┐
 │  Input Sources  │
 │ Video, Slides,  │

259-284: Consider adding language specifier for project structure.

The file tree would benefit from a language specifier for consistent rendering.

Apply this change:

-```
+```text
 examkit/
 ├── examkit/              # Main package
examkit/synthesis/ollama_client.py (7)

44-52: Use explicit Optional type annotation for logger parameter.

PEP 484 recommends using explicit Optional[T] or T | None rather than implicit Optional.

Apply this diff:

 def generate_completion(
     prompt: str,
     model: str = "llama3.2:8b",
     system_prompt: Optional[str] = None,
     temperature: float = 0.2,
     max_tokens: int = 900,
     offline: bool = True,
-    logger: logging.Logger = None
+    logger: Optional[logging.Logger] = None
 ) -> str:

101-104: Use logging.exception and preserve exception chain.

Replace logging.error with logging.exception to include the stack trace, and use raise ... from e to preserve the exception chain.

Apply this diff:

     except requests.exceptions.RequestException as e:
         if logger:
-            logger.error(f"Ollama request failed: {e}")
-        raise RuntimeError(f"Failed to generate completion: {e}")
+            logger.exception("Ollama request failed")
+        raise RuntimeError(f"Failed to generate completion: {e}") from e

107-113: Consider adding offline parameter for consistency.

generate_completion has an offline parameter to enforce availability checks, but generate_chat_completion always checks availability. Consider adding the parameter for API consistency.

Apply this diff if you want consistency:

 def generate_chat_completion(
     messages: list,
     model: str = "llama3.2:8b",
     temperature: float = 0.2,
     max_tokens: int = 900,
-    logger: logging.Logger = None
+    offline: bool = True,
+    logger: Optional[logging.Logger] = None
 ) -> str:

Then update the availability check to respect the parameter:

-    if not check_ollama_available():
+    if offline and not check_ollama_available():
         raise RuntimeError("Ollama not available")

107-113: Use explicit Optional type annotation for logger parameter.

Same issue as in generate_completion.

Apply this diff:

 def generate_chat_completion(
     messages: list,
     model: str = "llama3.2:8b",
     temperature: float = 0.2,
     max_tokens: int = 900,
-    logger: logging.Logger = None
+    logger: Optional[logging.Logger] = None
 ) -> str:

149-152: Use logging.exception and preserve exception chain.

Same logging issue as in generate_completion.

Apply this diff:

     except requests.exceptions.RequestException as e:
         if logger:
-            logger.error(f"Ollama chat request failed: {e}")
-        raise RuntimeError(f"Failed to generate chat completion: {e}")
+            logger.exception("Ollama chat request failed")
+        raise RuntimeError(f"Failed to generate chat completion: {e}") from e

155-165: Use explicit Optional type annotation for logger parameter.

Same issue in the pull_model function.

Apply this diff:

-def pull_model(model: str, logger: logging.Logger = None) -> bool:
+def pull_model(model: str, logger: Optional[logging.Logger] = None) -> bool:
     """
     Pull a model using Ollama CLI.

177-180: Use logging.exception for better error diagnostics.

Using logging.exception instead of logging.error will automatically include the stack trace.

Apply this diff:

     except Exception as e:
         if logger:
-            logger.error(f"Failed to pull model: {e}")
+            logger.exception("Failed to pull model")
         return False
examkit/synthesis/diagrams.py (2)

151-154: Use explicit Optional type annotation for logger parameter.

PEP 484 recommends explicit Optional[T] or T | None rather than implicit Optional.

Apply this diff:

 def generate_mermaid_diagram(
     mermaid_code: str,
     output_path: Path,
-    logger: logging.Logger = None
+    logger: Optional[logging.Logger] = None
 ) -> bool:

189-192: Use logging.exception for better error diagnostics.

Using logging.exception instead of logging.error automatically includes the stack trace.

This is addressed in the previous comment's diff, but if applied separately:

     except subprocess.CalledProcessError as e:
         if logger:
-            logger.error(f"Failed to generate Mermaid diagram: {e}")
+            logger.exception("Failed to generate Mermaid diagram")
         return False
config/templates/markdown/pdf_main.md.j2 (1)

10-12: Enhance anchor generation to handle special characters.

The anchor generation using {{ section.topic | lower | replace(' ', '-') }} only handles spaces. Topics containing special characters (parentheses, slashes, apostrophes, colons, etc.) could produce invalid or non-unique anchors, resulting in broken TOC links.

Consider creating a custom Jinja2 filter for robust anchor generation that:

  1. Converts to lowercase
  2. Replaces spaces and special characters with hyphens
  3. Removes or escapes problematic characters
  4. Ensures uniqueness (e.g., by appending a counter for duplicates)

Example implementation in the templating module:

import re

def slugify(text: str) -> str:
    """Convert text to a valid anchor slug."""
    # Convert to lowercase and replace spaces/special chars with hyphens
    slug = re.sub(r'[^\w\s-]', '', text.lower())
    slug = re.sub(r'[-\s]+', '-', slug)
    return slug.strip('-')

Then register it in your Jinja2 environment and use: {{ section.topic | slugify }}

tests/test_topic_mapping.py (5)

24-37: Consider adding edge case tests.

While the current test validates basic coverage calculation, consider adding tests for edge cases such as topics with zero chunks or 100% coverage to ensure robustness.


40-47: Tighten the assertion for more precise validation.

The assertion len(chunks) > 2 is quite weak. Given the 155-character second segment with max_chunk_size=50, you should expect at least 4-5 chunks total. Consider asserting a more specific range like assert len(chunks) >= 4.


50-58: Strengthen the assertion to validate exact merge behavior.

The assertion len(merged) < len(segments) is weak. Since the first two short segments should merge into one and the third remains separate, you should expect exactly 2 segments: assert len(merged) == 2.


61-73: LGTM! Consider testing additional citation types.

The test appropriately validates video citation formatting. For more comprehensive coverage, consider adding tests for slide and exam citation types mentioned in the PR objectives.


76-86: LGTM! Consider adding tests for invalid formulas.

The test validates basic QA checks. For better coverage, consider adding tests for invalid LaTeX syntax and multiple citation types to ensure robust error detection.

tests/test_parsers.py (2)

10-15: LGTM! Test covers common mark formats.

The test appropriately validates extraction of marks in different bracket styles and the absence case. For additional robustness, consider edge cases like multiple marks in one string.
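A sketch of such an edge-case test, using a stand-in for the parser's mark extraction (the real test should import it from examkit.ingestion.exam_parser; the regex here is an assumption about the bracket styles):

```python
import re


# Stand-in for the parser's mark extraction; matches "[5 marks]" and
# "(10 marks)" styles. Illustrative only.
def extract_marks(text: str) -> list:
    return [int(m) for m in
            re.findall(r'[\[\(](\d+)\s*marks?[\]\)]', text, re.IGNORECASE)]


def test_multiple_marks_in_one_string():
    assert extract_marks("Q1 (a) [5 marks] (b) (10 marks)") == [5, 10]


def test_no_marks():
    assert extract_marks("Define entropy.") == []
```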


18-35: Strengthen assertions to validate complete exam structure.

The assertion len(questions) >= 1 is weak for an exam with 2 questions and sub-parts. Consider asserting len(questions) >= 2 and validating that sub-parts (a, b) are correctly parsed.

examkit/logging_utils.py (1)

16-73: Validate log_file is not a directory before creating FileHandler.

If log_file points to an existing directory rather than a file path, FileHandler(log_file) at line 64 will fail with a confusing error. Consider adding a check:

if log_file:
    if log_file.exists() and log_file.is_dir():
        raise ValueError(f"log_file must be a file path, not a directory: {log_file}")
    log_file.parent.mkdir(parents=True, exist_ok=True)
    file_handler = logging.FileHandler(log_file)
ARCHITECTURE.md (1)

400-400: Minor: Optional style improvement for Docker section.

The static analysis tool flagged "Could be containerized" as lacking a subject. While acceptable for documentation, you could optionally revise to "The application could be containerized..." for more formal writing.

tests/test_render.py (5)

12-27: LGTM! Consider expanding test coverage for other section types.

The test validates basic markdown rendering. For more comprehensive coverage, consider testing additional section types mentioned in the PR (Derivation, Examples, Common Mistakes, Quick Revision).


30-57: Consider using pytest's tmp_path fixture for cleaner temp file handling.

The manual tempfile creation and cleanup works, but pytest's tmp_path fixture provides automatic cleanup and is more idiomatic:

def test_typst_wrapper_creation(tmp_path):
    """Test Typst wrapper creation."""
    from examkit.render.typst_renderer import create_typst_wrapper_for_markdown
    
    md_content = """# Test Title
...
"""
    temp_path = tmp_path / "test.md"
    temp_path.write_text(md_content)
    
    typst_content = create_typst_wrapper_for_markdown(temp_path)
    assert "= Test Title" in typst_content
    assert "== Section 1" in typst_content
    assert "=== Subsection" in typst_content

60-81: Consider using pytest's tmp_path fixture here as well.

Similar to the previous test, using tmp_path would simplify the temp file handling:

def test_config_loading(tmp_path):
    """Test configuration loading."""
    import yaml
    
    config_data = {
        "asr": {"model": "small"},
        "llm": {"model": "llama3.2:8b"},
        "offline": True
    }
    
    temp_path = tmp_path / "config.yml"
    temp_path.write_text(yaml.dump(config_data))
    
    config = ExamKitConfig.from_yaml(temp_path)
    assert config.asr.model == "small"
    assert config.llm.model == "llama3.2:8b"
    assert config.offline is True

84-89: Strengthen Jinja environment test to verify actual functionality.

The current test only checks that setup returns a non-None value. Consider verifying the environment's template loader path, or better yet, test that it can actually load and render a template:

def test_jinja_template_setup():
    """Test Jinja2 template environment setup."""
    from examkit.render.templater import setup_jinja_environment
    
    env = setup_jinja_environment()
    assert env is not None
    
    # Verify we can list templates
    templates = env.list_templates()
    assert len(templates) > 0
    
    # Or verify loader points to correct directory
    assert env.loader is not None

92-109: Tighten the tolerance for coverage mean calculation.

Line 103 uses pytest.approx(48.33, rel=0.1), which allows 10% relative error. For exact arithmetic (145/3 = 48.333...), a much tighter tolerance such as rel=0.01, with more decimal places in the expected value, is appropriate:

assert stats["mean"] == pytest.approx(48.333, rel=0.01)
examkit/config.py (2)

81-94: Add error handling for file I/O and YAML parsing.

The method lacks error handling for common failure cases. Consider adding:

@classmethod
def from_yaml(cls, path: Path) -> "ExamKitConfig":
    """
    Load configuration from a YAML file.

    Args:
        path: Path to the YAML configuration file.

    Returns:
        ExamKitConfig instance.
        
    Raises:
        FileNotFoundError: If the config file doesn't exist.
        ValueError: If the YAML is invalid or doesn't contain a dict.
    """
    if not path.exists():
        raise FileNotFoundError(f"Config file not found: {path}")
    
    with open(path, "r") as f:
        data = yaml.safe_load(f)
    
    if not isinstance(data, dict):
        raise ValueError(f"Config file must contain a YAML dict, got {type(data)}")
    
    return cls(**data)

96-104: Add directory creation and improve error handling.

The method should ensure the parent directory exists before writing:

def to_yaml(self, path: Path) -> None:
    """
    Save configuration to a YAML file.

    Args:
        path: Path to save the YAML configuration file.
        
    Raises:
        OSError: If writing the file fails.
    """
    path.parent.mkdir(parents=True, exist_ok=True)
    
    with open(path, "w") as f:
        yaml.dump(self.model_dump(), f, default_flow_style=False, sort_keys=False)
examkit/render/templater.py (9)

15-15: Use explicit Path | None type annotation.

Per PEP 484, implicit Optional is prohibited. Update the type hint to be explicit.

Apply this diff:

-def setup_jinja_environment(templates_dir: Path = None) -> Environment:
+def setup_jinja_environment(templates_dir: Path | None = None) -> Environment:

28-32: Consider autoescape setting for template security.

While this module generates Markdown/Typst (not HTML), explicitly setting autoescape=True or using select_autoescape() is a security best practice if templates might ever include user-controlled content.

Based on learnings
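A sketch of what that could look like (the templates path is illustrative):

```python
from jinja2 import Environment, FileSystemLoader, select_autoescape

# Escape only templates whose names look like HTML/XML; Markdown and Typst
# templates (e.g. *.md.j2, *.typ) stay unescaped, so current output is
# unaffected while any future HTML template is protected by default.
env = Environment(
    loader=FileSystemLoader("config/templates"),
    autoescape=select_autoescape(
        enabled_extensions=("html", "htm", "xml"),
        default_for_string=False,
        default=False,
    ),
)
```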


37-41: Remove unused config parameter or document its intended purpose.

The config parameter is declared but never used in the function body. Either remove it or add a comment explaining why it's reserved for future use.


54-61: Remove unnecessary f-string prefixes.

Lines 56, 58, 59, and 60 use f-strings without any placeholders. Use regular strings instead for clarity and minor performance improvement.

Apply this diff:

         f"# Exam Preparation Notes - {session_id}",
-        f"",
+        "",
         f"**Generated:** {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}",
-        f"",
-        f"---",
-        f""
+        "",
+        "---",
+        ""

77-101: Remove unnecessary f-string prefixes in section rendering.

Lines 78, 84, 88, 92, 96, and 100 use f-strings without placeholders.

Apply this diff to the subsection headers:

         if section.get("definition"):
-            lines.append(f"### Definition\n")
+            lines.append("### Definition\n")
             lines.append(f"{section['definition']}\n")
             if citations:
                 lines.append(f"*Sources: {citations}*\n")
 
         if section.get("key_formulas"):
-            lines.append(f"### Key Formulas\n")
+            lines.append("### Key Formulas\n")
             lines.append(f"{section['key_formulas']}\n")
 
         if section.get("derivation"):
-            lines.append(f"### Derivation\n")
+            lines.append("### Derivation\n")
             lines.append(f"{section['derivation']}\n")
 
         if section.get("examples"):
-            lines.append(f"### Worked Examples\n")
+            lines.append("### Worked Examples\n")
             lines.append(f"{section['examples']}\n")
 
         if section.get("mistakes"):
-            lines.append(f"### Common Mistakes\n")
+            lines.append("### Common Mistakes\n")
             lines.append(f"{section['mistakes']}\n")
 
         if section.get("revision"):
-            lines.append(f"### Quick Revision\n")
+            lines.append("### Quick Revision\n")
             lines.append(f"{section['revision']}\n")

108-112: Remove unused config parameter or document its intended purpose.

Same issue as in render_markdown_document - the config parameter is unused.


125-133: Remove unnecessary f-string prefixes.

Lines 128, 130, and 131 use f-strings without placeholders.

Apply this diff:

         "#import \"theme.typ\": *",
         "",
-        f"#show: doc => conf(",
-        f"  title: \"Exam Notes - {session_id}\",",
-        f"  date: datetime.today().display(),",
-        f"  doc",
+        "#show: doc => conf(",
+        f"  title: \"Exam Notes - {session_id}\",",
+        "  date: datetime.today().display(),",
+        "  doc",

158-158: Use explicit Path | None type annotation.

Same issue as line 15 - use explicit optional type per PEP 484.

Apply this diff:

-def load_template(template_name: str, templates_dir: Path = None) -> Template:
+def load_template(template_name: str, templates_dir: Path | None = None) -> Template:

176-176: Use explicit Path | None type annotation.

Same PEP 484 issue as lines 15 and 158.

Apply this diff:

 def render_section_template(
     template_name: str,
     context: Dict[str, Any],
-    templates_dir: Path = None
+    templates_dir: Path | None = None
 ) -> str:
examkit/qa/checks.py (1)

13-13: Use explicit logging.Logger | None type annotation.

Per PEP 484, implicit Optional is prohibited. This applies to all logger parameters in this file.

Apply this diff pattern to lines 13, 44, 86, 119, 150, 194, 195:

-def check_formula_compilation(content: str, logger: logging.Logger = None) -> Dict[str, Any]:
+def check_formula_compilation(content: str, logger: logging.Logger | None = None) -> Dict[str, Any]:
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3fe2ee4 and 83b53b0.

📒 Files selected for processing (70)
  • .env.example (1 hunks)
  • .github/PULL_REQUEST_TEMPLATE.md (1 hunks)
  • .gitignore (1 hunks)
  • ARCHITECTURE.md (1 hunks)
  • CONTRIBUTING.md (1 hunks)
  • IMPLEMENTATION_SUMMARY.md (1 hunks)
  • LICENSE (1 hunks)
  • Makefile (1 hunks)
  • README.md (1 hunks)
  • config/config.yml (1 hunks)
  • config/templates/markdown/pdf_main.md.j2 (1 hunks)
  • config/templates/markdown/section.md.j2 (1 hunks)
  • config/templates/prompts/compare.j2 (1 hunks)
  • config/templates/prompts/definition.j2 (1 hunks)
  • config/templates/prompts/derivation.j2 (1 hunks)
  • config/templates/prompts/fast_revision.j2 (1 hunks)
  • config/templates/prompts/mistakes.j2 (1 hunks)
  • config/templates/typst/main.typ (1 hunks)
  • config/templates/typst/theme.typ (1 hunks)
  • examkit/__init__.py (1 hunks)
  • examkit/asr/__init__.py (1 hunks)
  • examkit/asr/whisper_runner.py (1 hunks)
  • examkit/cli.py (1 hunks)
  • examkit/config.py (1 hunks)
  • examkit/ingestion/__init__.py (1 hunks)
  • examkit/ingestion/exam_parser.py (1 hunks)
  • examkit/ingestion/ingest.py (1 hunks)
  • examkit/ingestion/ocr.py (1 hunks)
  • examkit/ingestion/slides_parser.py (1 hunks)
  • examkit/ingestion/transcript_normalizer.py (1 hunks)
  • examkit/logging_utils.py (1 hunks)
  • examkit/nlp/__init__.py (1 hunks)
  • examkit/nlp/embeddings.py (1 hunks)
  • examkit/nlp/retrieval.py (1 hunks)
  • examkit/nlp/spacy_nlp.py (1 hunks)
  • examkit/nlp/splitter.py (1 hunks)
  • examkit/nlp/topic_mapping.py (1 hunks)
  • examkit/qa/__init__.py (1 hunks)
  • examkit/qa/checks.py (1 hunks)
  • examkit/render/__init__.py (1 hunks)
  • examkit/render/pandoc_renderer.py (1 hunks)
  • examkit/render/templater.py (1 hunks)
  • examkit/render/typst_renderer.py (1 hunks)
  • examkit/reports/__init__.py (1 hunks)
  • examkit/reports/coverage.py (1 hunks)
  • examkit/reports/export.py (1 hunks)
  • examkit/synthesis/__init__.py (1 hunks)
  • examkit/synthesis/citations.py (1 hunks)
  • examkit/synthesis/composer.py (1 hunks)
  • examkit/synthesis/diagrams.py (1 hunks)
  • examkit/synthesis/ollama_client.py (1 hunks)
  • examkit/synthesis/prompts.py (1 hunks)
  • examkit/utils/__init__.py (1 hunks)
  • examkit/utils/io_utils.py (1 hunks)
  • examkit/utils/math_utils.py (1 hunks)
  • examkit/utils/text_utils.py (1 hunks)
  • examkit/utils/timecode.py (1 hunks)
  • input/sample/exam/README.md (1 hunks)
  • input/sample/manifest.json (1 hunks)
  • input/sample/slides/README.md (1 hunks)
  • input/sample/topics/exam_topics.yml (1 hunks)
  • input/sample/topics/session_topics.yml (1 hunks)
  • input/sample/transcript/sample.vtt (1 hunks)
  • input/sample/video/README.md (1 hunks)
  • pyproject.toml (1 hunks)
  • tests/__init__.py (1 hunks)
  • tests/test_ingestion.py (1 hunks)
  • tests/test_parsers.py (1 hunks)
  • tests/test_render.py (1 hunks)
  • tests/test_topic_mapping.py (1 hunks)
🧰 Additional context used
🪛 checkmake (0.2.2)
Makefile

[warning] 3-3: Target body for "help" exceeds allowed length of 5 (9).

(maxbodylength)


[warning] 54-54: Target body for "clean" exceeds allowed length of 5 (6).

(maxbodylength)


[warning] 1-1: Missing required phony target "all"

(minphony)

🪛 dotenv-linter (4.0.0)
.env.example

[warning] 13-13: [UnorderedKey] The MAX_WORKERS key should go before the OFFLINE_MODE key

(UnorderedKey)


[warning] 18-18: [UnorderedKey] The LOGS_DIR key should go before the OUTPUT_DIR key

(UnorderedKey)

🪛 LanguageTool
CONTRIBUTING.md

[grammar] ~98-~98: Use a hyphen to join words.
Context: ...tion signatures - Docstrings: Google style docstrings for all public function...

(QB_NEW_EN_HYPHEN)

.github/PULL_REQUEST_TEMPLATE.md

[style] ~5-~5: Consider using a different verb for a more formal wording.
Context: ...mmary of the changes and which issue is fixed. ## Type of Change - [ ] Bug fix - [ ...

(FIX_RESOLVE)

ARCHITECTURE.md

[style] ~400-~400: To form a complete sentence, be sure to include a subject.
Context: ...opriate resources ### Docker (Future) Could be containerized with all dependencies ...

(MISSING_IT_THERE)

IMPLEMENTATION_SUMMARY.md

[style] ~178-~178: ‘vid’ is informal. Consider replacing it.
Context: ...ples, Mistakes, Revision ✅ - Citations: [vid hh:mm:ss], [slide N], [exam Q2b] ✅...

(VID)

🪛 markdownlint-cli2 (0.18.1)
README.md

3-3: Emphasis used instead of a heading

(MD036, no-emphasis-as-heading)


21-21: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


259-259: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


326-326: Emphasis used instead of a heading

(MD036, no-emphasis-as-heading)


336-336: Emphasis used instead of a heading

(MD036, no-emphasis-as-heading)


349-349: Emphasis used instead of a heading

(MD036, no-emphasis-as-heading)


355-355: Emphasis used instead of a heading

(MD036, no-emphasis-as-heading)


361-361: Emphasis used instead of a heading

(MD036, no-emphasis-as-heading)


407-407: Emphasis used instead of a heading

(MD036, no-emphasis-as-heading)

IMPLEMENTATION_SUMMARY.md

24-24: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🪛 Ruff (0.14.3)
examkit/synthesis/composer.py

97-97: Avoid specifying long messages outside the exception class

(TRY003)


190-190: Do not catch blind exception: Exception

(BLE001)


191-191: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


205-205: Do not catch blind exception: Exception

(BLE001)


206-206: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


220-220: Do not catch blind exception: Exception

(BLE001)


221-221: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


235-235: Do not catch blind exception: Exception

(BLE001)


236-236: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


250-250: Do not catch blind exception: Exception

(BLE001)


251-251: Use logging.exception instead of logging.error

Replace with exception

(TRY400)

examkit/nlp/topic_mapping.py

35-35: Unused function argument: chunks

(ARG001)


40-40: PEP 484 prohibits implicit Optional

Convert to T | None

(RUF013)

examkit/asr/whisper_runner.py

21-21: PEP 484 prohibits implicit Optional

Convert to T | None

(RUF013)


37-37: Avoid specifying long messages outside the exception class

(TRY003)


75-75: PEP 484 prohibits implicit Optional

Convert to T | None

(RUF013)

examkit/utils/math_utils.py

133-133: Consider moving this statement to an else block

(TRY300)

examkit/nlp/retrieval.py

17-17: PEP 484 prohibits implicit Optional

Convert to T | None

(RUF013)


45-45: Unused function argument: similarity_threshold

(ARG001)

examkit/synthesis/citations.py

24-24: PEP 484 prohibits implicit Optional

Convert to T | None

(RUF013)

examkit/cli.py

29-37: Do not perform function call typer.Option in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable

(B008)


38-43: Do not perform function call typer.Option in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable

(B008)


70-70: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


75-81: Do not perform function call typer.Option in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable

(B008)


82-87: Do not perform function call typer.Option in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable

(B008)


127-127: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


173-173: f-string without any placeholders

Remove extraneous f prefix

(F541)


180-180: subprocess call: check for execution of untrusted input

(S603)


180-180: Starting a process with a partial executable path

(S607)


185-185: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)

examkit/qa/checks.py

13-13: PEP 484 prohibits implicit Optional

Convert to T | None

(RUF013)


44-44: PEP 484 prohibits implicit Optional

Convert to T | None

(RUF013)


86-86: PEP 484 prohibits implicit Optional

Convert to T | None

(RUF013)


119-119: PEP 484 prohibits implicit Optional

Convert to T | None

(RUF013)


150-150: PEP 484 prohibits implicit Optional

Convert to T | None

(RUF013)


175-175: Loop control variable symbol not used within loop body

(B007)


194-194: PEP 484 prohibits implicit Optional

Convert to T | None

(RUF013)


195-195: PEP 484 prohibits implicit Optional

Convert to T | None

(RUF013)

examkit/synthesis/diagrams.py

154-154: PEP 484 prohibits implicit Optional

Convert to T | None

(RUF013)


169-169: Starting a process with a partial executable path

(S607)


182-182: subprocess call: check for execution of untrusted input

(S603)


183-183: Starting a process with a partial executable path

(S607)


188-188: Consider moving this statement to an else block

(TRY300)


191-191: Use logging.exception instead of logging.error

Replace with exception

(TRY400)

examkit/ingestion/ingest.py

28-28: Avoid specifying long messages outside the exception class

(TRY003)


32-32: Prefer TypeError exception for invalid type

(TRY004)


32-32: Avoid specifying long messages outside the exception class

(TRY003)


66-66: Consider moving this statement to an else block

(TRY300)


69-69: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


108-108: Unnecessary key check before dictionary access

Replace with dict.get

(RUF019)


118-118: Unnecessary key check before dictionary access

Replace with dict.get

(RUF019)


131-131: Unnecessary key check before dictionary access

Replace with dict.get

(RUF019)


144-144: Unnecessary key check before dictionary access

Replace with dict.get

(RUF019)

examkit/ingestion/ocr.py

39-39: Consider moving this statement to an else block

(TRY300)


41-41: Do not catch blind exception: Exception

(BLE001)


42-42: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


68-68: Consider moving this statement to an else block

(TRY300)


70-70: Do not catch blind exception: Exception

(BLE001)


71-71: Use logging.exception instead of logging.error

Replace with exception

(TRY400)

examkit/render/pandoc_renderer.py

15-15: PEP 484 prohibits implicit Optional

Convert to T | None

(RUF013)


43-43: subprocess call: check for execution of untrusted input

(S603)


59-59: Do not catch blind exception: Exception

(BLE001)


61-61: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


74-74: Starting a process with a partial executable path

(S607)


78-78: Consider moving this statement to an else block

(TRY300)


79-79: Do not use bare except

(E722)

examkit/reports/coverage.py

15-15: PEP 484 prohibits implicit Optional

Convert to T | None

(RUF013)

examkit/render/typst_renderer.py

22-22: Starting a process with a partial executable path

(S607)


27-27: Consider moving this statement to an else block

(TRY300)


89-89: subprocess call: check for execution of untrusted input

(S603)


90-90: Starting a process with a partial executable path

(S607)


104-104: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


106-106: Do not catch blind exception: Exception

(BLE001)


107-107: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


176-176: Starting a process with a partial executable path

(S607)


178-178: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


182-182: subprocess call: check for execution of untrusted input

(S603)


183-192: Starting a process with a partial executable path

(S607)


206-206: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


208-208: Do not catch blind exception: Exception

(BLE001)


209-209: Use logging.exception instead of logging.error

Replace with exception

(TRY400)

examkit/ingestion/slides_parser.py

33-33: Local variable images_dir is assigned to but never used

Remove assignment to unused variable images_dir

(F841)


120-120: Do not catch blind exception: Exception

(BLE001)


160-160: Avoid specifying long messages outside the exception class

(TRY003)

examkit/nlp/spacy_nlp.py

15-15: PEP 484 prohibits implicit Optional

Convert to T | None

(RUF013)

examkit/synthesis/ollama_client.py

22-22: Consider moving this statement to an else block

(TRY300)


23-23: Do not use bare except

(E722)


39-39: Do not use bare except

(E722)


39-40: try-except-pass detected, consider logging the exception

(S110)


51-51: PEP 484 prohibits implicit Optional

Convert to T | None

(RUF013)


69-69: Avoid specifying long messages outside the exception class

(TRY003)


99-99: Consider moving this statement to an else block

(TRY300)


103-103: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


104-104: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


104-104: Avoid specifying long messages outside the exception class

(TRY003)


112-112: PEP 484 prohibits implicit Optional

Convert to T | None

(RUF013)


128-128: Avoid specifying long messages outside the exception class

(TRY003)


151-151: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


152-152: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


152-152: Avoid specifying long messages outside the exception class

(TRY003)


155-155: PEP 484 prohibits implicit Optional

Convert to T | None

(RUF013)


170-170: subprocess call: check for execution of untrusted input

(S603)


171-171: Starting a process with a partial executable path

(S607)


176-176: Consider moving this statement to an else block

(TRY300)


177-177: Do not catch blind exception: Exception

(BLE001)


179-179: Use logging.exception instead of logging.error

Replace with exception

(TRY400)

examkit/nlp/splitter.py

15-15: PEP 484 prohibits implicit Optional

Convert to T | None

(RUF013)


27-27: Avoid specifying long messages outside the exception class

(TRY003)


33-33: Consider moving this statement to an else block

(TRY300)


36-36: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


58-58: PEP 484 prohibits implicit Optional

Convert to T | None

(RUF013)

examkit/render/templater.py

15-15: PEP 484 prohibits implicit Optional

Convert to T | None

(RUF013)


28-28: By default, jinja2 sets autoescape to False. Consider using autoescape=True or the select_autoescape function to mitigate XSS vulnerabilities.

(S701)


40-40: Unused function argument: config

(ARG001)


56-56: f-string without any placeholders

Remove extraneous f prefix

(F541)


58-58: f-string without any placeholders

Remove extraneous f prefix

(F541)


59-59: f-string without any placeholders

Remove extraneous f prefix

(F541)


60-60: f-string without any placeholders

Remove extraneous f prefix

(F541)


78-78: f-string without any placeholders

Remove extraneous f prefix

(F541)


84-84: f-string without any placeholders

Remove extraneous f prefix

(F541)


88-88: f-string without any placeholders

Remove extraneous f prefix

(F541)


92-92: f-string without any placeholders

Remove extraneous f prefix

(F541)


96-96: f-string without any placeholders

Remove extraneous f prefix

(F541)


100-100: f-string without any placeholders

Remove extraneous f prefix

(F541)


111-111: Unused function argument: config

(ARG001)


128-128: f-string without any placeholders

Remove extraneous f prefix

(F541)


130-130: f-string without any placeholders

Remove extraneous f prefix

(F541)


131-131: f-string without any placeholders

Remove extraneous f prefix

(F541)


158-158: PEP 484 prohibits implicit Optional

Convert to T | None

(RUF013)


176-176: PEP 484 prohibits implicit Optional

Convert to T | None

(RUF013)

examkit/nlp/embeddings.py

25-25: PEP 484 prohibits implicit Optional

Convert to T | None

(RUF013)


37-37: Avoid specifying long messages outside the exception class

(TRY003)


50-50: PEP 484 prohibits implicit Optional

Convert to T | None

(RUF013)


80-80: PEP 484 prohibits implicit Optional

Convert to T | None

(RUF013)


94-94: Avoid specifying long messages outside the exception class

(TRY003)


138-138: pickle and modules that wrap it can be unsafe when used to deserialize untrusted data, possible security issue

(S301)


171-171: zip() without an explicit strict= parameter

Add explicit value for parameter strict=

(B905)

🔇 Additional comments (85)
tests/__init__.py (1)

1-1: LGTM!

Standard package initializer with clear docstring.

examkit/qa/__init__.py (1)

1-1: LGTM!

Standard package initializer with clear docstring for the QA module.

examkit/reports/__init__.py (1)

1-1: LGTM!

Standard package initializer with clear docstring for the reports module.

examkit/synthesis/__init__.py (1)

1-1: LGTM!

Standard package initializer with clear, descriptive docstring for the synthesis module.

examkit/asr/__init__.py (1)

1-1: LGTM!

Standard package initializer with clear docstring for the ASR module.

examkit/render/__init__.py (1)

1-1: LGTM!

Standard package initializer with clear, descriptive docstring for the rendering module.

LICENSE (1)

1-21: LGTM!

Standard MIT License with appropriate copyright notice for the project.

.env.example (1)

1-21: LGTM! Well-organized environment template.

The configuration is logically grouped by category (Logging, Ollama, Processing, Cache), which is more maintainable than alphabetical ordering. The static analysis warnings about key ordering are pedantic; the current structure enhances readability.

examkit/nlp/__init__.py (1)

1-1: LGTM!

Clean package initializer with appropriate docstring.

input/sample/slides/README.md (1)

1-33: LGTM!

Clear documentation for sample slide structure. The content aligns well with the session topics defined in the PR.

input/sample/topics/session_topics.yml (1)

1-55: LGTM!

Well-structured topic data with consistent formatting. The keyword selections and weights are appropriate for the respective topics.

config/templates/prompts/derivation.j2 (1)

1-14: LGTM!

Clean Jinja2 template with clear instructions for derivation generation. The structured requirements ensure comprehensive output with proper citations.

.github/PULL_REQUEST_TEMPLATE.md (1)

1-33: LGTM!

Standard, comprehensive PR template. The static analysis suggestion about "fixed" vs "resolved" is overly pedantic and can be safely ignored.

examkit/__init__.py (1)

1-9: LGTM!

Clean package initialization with appropriate metadata. Version number matches pyproject.toml.

.gitignore (1)

1-68: LGTM!

Comprehensive .gitignore with well-organized sections. The PDF exception pattern (line 68) correctly preserves sample exam PDFs while ignoring generated outputs.

config/templates/prompts/fast_revision.j2 (1)

1-14: LGTM!

The template is well-structured with clear instructions for generating quick revision summaries. The format follows best practices for LLM prompting with structured context and explicit output requirements.

CONTRIBUTING.md (1)

1-177: LGTM!

The contribution guidelines are comprehensive, well-structured, and provide clear instructions for contributors. The document includes practical examples, conventional commit guidelines, and detailed development setup instructions.

Makefile (1)

1-60: LGTM!

The Makefile provides a comprehensive set of targets for development workflow automation. The commands are well-structured, use appropriate tools (Poetry, pytest, ruff, black), and include proper error handling in cleanup operations.

IMPLEMENTATION_SUMMARY.md (1)

1-373: LGTM!

The implementation summary provides a comprehensive and well-organized overview of the project's completion status. The document effectively communicates all delivered features, modules, tests, and acceptance criteria.

examkit/utils/math_utils.py (1)

9-23: Verify that the formula extraction handles edge cases correctly.

The regex patterns may produce unexpected results:

  • The inline pattern $...$ also matches the dollar delimiters inside a display block $$...$$, so display formulas can be double-counted as inline matches.
  • Nested or escaped dollar signs (\$) aren't handled.

Please verify that this basic extraction approach meets the requirements. If more robust LaTeX parsing is needed, consider using a dedicated LaTeX parser library.
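One way to avoid the double-counting, sketched with hypothetical names (the module's actual function signature may differ): extract display math first, blank out those spans, then scan the remainder for inline math.

```python
import re


def extract_formulas(content: str) -> dict:
    """Extract display math first, then remove those spans before the
    inline scan so $$...$$ blocks are not also counted as $...$ matches."""
    display = re.findall(r'\$\$(.+?)\$\$', content, flags=re.DOTALL)
    # Blank out display blocks so their delimiters can't match inline math.
    remainder = re.sub(r'\$\$.+?\$\$', ' ', content, flags=re.DOTALL)
    inline = re.findall(r'\$([^$\n]+?)\$', remainder)
    return {"display": display, "inline": inline}
```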

examkit/nlp/spacy_nlp.py (1)

136-143: Verify imperative detection logic.

Line 140 checks token.i == 0, which tests if the token is at document position 0, not sentence position 0. This means only the very first token in the entire document can trigger has_imperatives, not the first token of each sentence.

If the intent is to detect imperatives at the start of any sentence (not just the document), the logic should check sentence-relative position:

"has_imperatives": any(
    token.tag_ == "VB" and token.i == sent.start 
    for sent in doc.sents 
    for token in sent
)

Please confirm whether the current behavior is intentional or if this is a bug.

examkit/utils/io_utils.py (1)

1-124: LGTM!

The I/O utilities provide clean, well-documented wrappers around standard library operations. The functions appropriately ensure directories exist before write operations and follow consistent patterns.

examkit/ingestion/__init__.py (1)

1-1: LGTM!

Standard package initialization with appropriate docstring.

examkit/utils/__init__.py (1)

1-1: LGTM!

Standard package initialization with appropriate docstring.

input/sample/video/README.md (1)

1-20: LGTM!

Clear documentation for sample video assets with helpful ffmpeg test command. The note about .gitignore exclusion is valuable context.

config/templates/prompts/mistakes.j2 (1)

1-14: LGTM!

Well-structured Jinja2 template with clear instructions for mistake identification. The context iteration and citation requirements align with the RAG pipeline objectives.

config/templates/markdown/section.md.j2 (1)

1-43: LGTM!

Well-structured section template with appropriate conditional rendering. The sources attribution is limited to the definition block, which appears intentional given that citations are likely embedded in the section content itself per the RAG pipeline design.

README.md (1)

1-407: Excellent comprehensive documentation.

The README provides thorough coverage of features, architecture, installation, usage, and troubleshooting. Well-structured and accessible for new users.

examkit/synthesis/diagrams.py (4)

17-57: LGTM!

Clean flowchart generation with appropriate fallback when Graphviz is unavailable. Directory creation and rendering logic are correct.


60-104: LGTM!

Concept map generation follows the same clean pattern as flowchart creation. Space replacement in node IDs ensures valid Graphviz identifiers.


107-148: LGTM!

Hierarchy diagram generation is well-implemented with appropriate styling. Consistent with other diagram functions.


195-221: LGTM!

Simple and effective keyword-based diagram type detection. The heuristics are reasonable for identifying common diagram patterns.

input/sample/topics/exam_topics.yml (2)

1-5: LGTM! Clear exam metadata structure.

The exam metadata is well-structured with all necessary fields for test data.


8-24: Question ID assignments in exam_topics.yml are not processed by the application.

The codebase does not validate or use question ID assignments from the YAML topics file. topic_mapping.py maps text chunks (not questions) to topics using embeddings and calculates coverage based on chunk indices, and coverage.py works with chunk counts, not question assignments. The question IDs (Q1a, Q2a, Q2b, etc.) are metadata never referenced in any Python code. The split questions (Q2a in linear_algebra, Q2b in calculus) therefore have no downstream impact: the system maps text to topics via embeddings rather than assigning questions to topics.

Likely an incorrect or invalid review comment.

input/sample/transcript/sample.vtt (1)

1-41: LGTM! Valid WebVTT format with appropriate test content.

The transcript follows proper WebVTT formatting with sequential timestamps and content that aligns well with the exam topics defined in the project. The gaps between some segments appear intentional (representing pauses in the lecture).

tests/test_ingestion.py (4)

11-27: LGTM! Comprehensive VTT parsing test.

The test properly validates VTT parsing including segment count, text content, and timestamp conversion. The assertions cover the key aspects of the parsed output.


30-44: LGTM! Proper SRT parsing test.

The test validates SRT format parsing with appropriate assertions for segment structure and content.


46-55: LGTM! Clean plain text parsing test.

The test correctly validates paragraph-based text parsing with clear assertions.


58-74: LGTM! Good coverage of manifest validation.

The test validates both positive and negative cases for manifest validation, properly using pytest.raises for error conditions.

input/sample/exam/README.md (1)

1-27: LGTM! Clear documentation for sample exam structure.

The documentation provides a well-structured example of exam format that aligns with the exam topics and testing requirements. The note at the end gives clear guidance for creating test fixtures.

config/templates/prompts/definition.j2 (1)

1-14: LGTM! Well-structured prompt template for definitions.

The Jinja2 template properly iterates over context chunks, provides clear instructions to the LLM, and includes appropriate constraints for generating exam-focused definitions with proper citations.

config/templates/prompts/compare.j2 (1)

1-19: LGTM! Well-designed comparison prompt template.

The template correctly handles two separate topic contexts and provides clear criteria for generating meaningful comparisons. The structure supports proper citation and contextual analysis.

config/templates/markdown/pdf_main.md.j2 (1)

37-43: LGTM! Proper conditional rendering for coverage data.

The conditional check for coverage_data and the table structure are well-designed, ensuring the coverage summary only renders when data is available.

input/sample/manifest.json (1)

11-11: No action needed; the review comment is incorrect.

The script output confirms both exam_topics.yml and session_topics.yml exist in input/sample/topics/. The manifest correctly references session_topics.yml, which exists and is also referenced in examkit/synthesis/composer.py. The IMPLEMENTATION_SUMMARY.md documents both files as intentional parts of the PR. No file-not-found error would occur.

Likely an incorrect or invalid review comment.

config/config.yml (6)

1-5: LGTM! ASR defaults are appropriate for offline transcription.

The faster-whisper engine with the "small" model provides a good balance between speed and accuracy for English transcription, and enabling VAD helps filter silence.


14-17: LGTM! Embedding configuration is well-suited for offline operation.

The all-MiniLM-L6-v2 model is an excellent choice for local embeddings—compact, fast, and provides good semantic similarity for retrieval tasks.


19-21: Ensure retrieval context fits within model's window.

With max_context_tokens at 2000 and max_tokens at 900, verify that the combined ~2900 tokens (plus prompt template overhead) fits within llama3.2:8b's context window. If the model's window is 8K or larger, this should be fine.
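A quick back-of-the-envelope check (the 8192-token window and ~200-token template overhead are assumptions to be verified against the actual model and templates):

```python
# Budgets from config.yml; window and overhead are illustrative guesses.
context_window = 8192          # assumed model context length
max_context_tokens = 2000      # retrieval context budget
max_tokens = 900               # generation budget
template_overhead = 200        # hypothetical prompt-template cost

used = max_context_tokens + max_tokens + template_overhead
headroom = context_window - used
```

Under these assumptions the configuration leaves ample headroom; it only becomes tight if the runtime context window is much smaller (e.g., a 4096-token default).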


23-27: LGTM! PDF configuration is appropriate.

Typst as the primary engine is a modern, performant choice for PDF generation, and the classic theme with 11pt font provides good readability.


29-35: LGTM! Remaining configuration defaults are well-chosen.

Enabling Graphviz for diagrams, enforcing offline mode, and setting INFO-level logging align perfectly with the project's offline-first, production-grade objectives.


7-12: Verify 900-token limit aligns with typical derivation complexity in your use cases.

The max_tokens value of 900 applies per-section (derivation, worked example, etc. are generated independently), not per-topic. For standard derivations with step-by-step explanations, source citations, and logic clarifications, this budget may be tight for complex topics. Test with your target curriculum to confirm whether derivations are consistently completing within this limit, and consider increasing it if advanced topics frequently produce truncated explanations.

tests/test_topic_mapping.py (1)

12-21: LGTM! Test covers both explicit and auto-generated topic IDs.

The test appropriately validates that topics without an explicit id field receive an auto-generated identifier based on the name.

tests/test_parsers.py (2)

38-44: LGTM! Timecode conversion tests are comprehensive.

The test validates both directions of conversion and handles different time formats (with and without hours).


56-65: LGTM! Math utilities tests cover key scenarios.

The test validates extraction of both inline and display formulas, and appropriately tests both valid and invalid LaTeX syntax.

examkit/logging_utils.py (1)

76-86: LGTM! Module logger pattern follows best practices.

The function correctly creates hierarchical loggers that inherit configuration from the root "examkit" logger.

config/templates/typst/main.typ (6)

1-12: LGTM! Template structure and imports are well-organized.

The conf function signature with sensible defaults provides good flexibility for document configuration.


39-50: LGTM! Typography settings are appropriate for academic documents.

New Computer Modern font with justified paragraphs provides a professional, LaTeX-like appearance suitable for exam notes.


52-69: LGTM! Heading hierarchy is well-defined.

The font size progression (18pt → 14pt → 12pt) and weak page breaks for level 1 headings create good visual structure.


71-92: LGTM! Element styling follows academic conventions.

Code block formatting, list indentation, link colors, and equation numbering are all appropriately styled.


94-105: LGTM! Title page design is clean and professional.

Conditional rendering ensures the title page only appears when a title is provided, and the vertical spacing creates good visual balance.


107-118: LGTM! Table of contents and body layout are appropriate.

Depth 2 outline with automatic indentation and a page break before the body creates good document flow.

ARCHITECTURE.md (1)

1-424: LGTM! Architecture documentation is comprehensive and well-structured.

The documentation thoroughly covers all pipeline stages, modules, design decisions, and extension points. The ASCII diagram and detailed module descriptions provide excellent reference material.

examkit/config.py (2)

12-67: LGTM! Config classes are well-structured with appropriate type hints.

The use of Pydantic BaseModel with Literal for constrained choices (PDFConfig.engine) provides good type safety and validation.


69-80: LGTM! Main config class properly composes sub-configs.

Using Field(default_factory=...) for nested Pydantic models is the correct pattern to avoid shared mutable state.

examkit/nlp/topic_mapping.py (3)

12-31: LGTM!

The topic normalization logic is well-designed with appropriate fallbacks for missing fields.


75-108: LGTM!

Coverage calculation logic is correct with appropriate zero-division handling.


111-129: LGTM!

Simple and effective gap identification logic.

examkit/synthesis/composer.py (1)

28-64: LGTM!

Data loading logic is clean and well-structured with appropriate logging.

examkit/reports/export.py (3)

12-63: LGTM!

The report generation logic correctly aggregates coverage, citations, and QA metrics. The conditional pandas import on line 36 is appropriate for optional dependency handling.


66-104: LGTM!

Clean text formatting with appropriate sectioning and formatting.


107-116: LGTM!

Simple and effective JSON export.

examkit/utils/text_utils.py (1)

9-128: LGTM!

The text utility functions are well-implemented with appropriate use of regex for text processing. The comment on line 36 correctly notes that sentence splitting could be enhanced with spaCy for production use, which aligns with the project's NLP dependencies.

examkit/synthesis/citations.py (2)

52-101: LGTM!

Citation formatting logic correctly handles different source types and produces well-formatted citation strings. The deduplication in format_multiple_citations is appropriate for the expected citation counts.


103-151: LGTM!

Export and query methods are straightforward and correct.

examkit/nlp/embeddings.py (3)

124-140: Consider security implications of pickle deserialization.

Line 138 uses pickle.load() which can execute arbitrary code if loading untrusted data. While loading from the app's own cache is likely safe, document this assumption or add validation.

Consider adding a comment documenting the trust assumption:

     with open(metadata_path, 'rb') as f:
+        # NOTE: Loading from app-generated cache; do not use with untrusted files
         metadata = pickle.load(f)

25-103: LGTM!

Embedding and indexing functions are well-implemented with appropriate guards for optional dependencies. The flat L2 index choice on line 100 is suitable for simplicity as noted in the comment.


143-177: LGTM!

Search functionality correctly encodes the query, performs FAISS search, and enriches results with metadata and ranking.

examkit/synthesis/prompts.py (2)

8-119: LGTM!

The prompt templates are well-structured for generating exam preparation content. Each template provides clear instructions and maintains consistent citation format guidance.


122-155: LGTM!

Render functions correctly instantiate Jinja2 templates and pass through parameters. The implementation is clean and straightforward.

examkit/nlp/retrieval.py (3)

11-42: LGTM!

Context retrieval correctly constructs a composite query from topic metadata and delegates to the search function.


78-122: LGTM!

Source diversity ranking correctly interleaves chunks from different sources according to priority. The logic handles both prioritized and non-prioritized sources appropriately.


125-144: LGTM!

Confidence filtering is straightforward and correct.

examkit/qa/checks.py (4)

56-69: LGTM: Internal link checking logic is sound.

The link pattern matching and anchor generation correctly mirror the TOC generation in templater.py (line 67), ensuring consistency between link creation and validation.


99-111: LGTM: Keyword recall logic correctly handles edge cases.

The division-by-zero case is properly guarded with the ternary operator on line 110, defaulting to 100% when no keywords are required.


130-142: LGTM: Citation detection and counting logic is correct.

The regex pattern correctly extracts citation types, and the count() method accurately tallies each type in the results list.


208-228: LGTM: Check orchestration logic correctly aggregates results.

The all() expression with .get("passed", True) properly handles checks that may not have a "passed" field (like check_citation_presence), and the overall status aggregation is sound.

Comment on lines +14 to +37
set page(
paper: "a4",
margin: (x: 1in, y: 1in),
header: [
#set text(size: 9pt)
#grid(
columns: (1fr, 1fr),
align: (left, right),
[#title],
[Exam Preparation Notes]
)
#line(length: 100%, stroke: 0.5pt)
],
footer: [
#line(length: 100%, stroke: 0.5pt)
#set text(size: 9pt)
#grid(
columns: (1fr, 1fr),
align: (left, right),
[#date],
[Page #counter(page).display("1 of 1", both: true)]
)
]
)


⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

Verify page counter format is correct.

Line 34 uses counter(page).display("1 of 1", both: true). The format string "1 of 1" seems incorrect—it should likely be "1" or a pattern that displays current/total pages correctly. Verify this syntax against Typst documentation.


🌐 Web query:

What is the correct syntax for displaying page numbers with total pages in Typst counter.display function?

💡 Result:

Use the page counter's display with a numbering pattern and set both: true. Example:

#counter(page).display("1 / 1", both: true)

This displays current and total pages (pattern can be e.g. "1/1", "1 of I", etc.). [1][2]


Fix page counter format on line 34.

The format string "1 of 1" is hardcoded and will always display literally as "1 of 1" regardless of the current page or total pages. Use a numbering pattern like "1 / 1" instead, which properly displays current and total pages when both: true.

Change:

[Page #counter(page).display("1 of 1", both: true)]

To:

[Page #counter(page).display("1 / 1", both: true)]
🤖 Prompt for AI Agents
In config/templates/typst/main.typ around lines 14 to 37, the page footer
currently uses a hardcoded format string "1 of 1" which will always render
literally; replace the format passed to counter(page).display with a dynamic
numbering pattern such as "1 / 1" so the current and total page numbers render
correctly when both: true is used — update the Footer grid entry to call
counter(page).display with "1 / 1" instead of "1 of 1".

Comment on lines +68 to +83
question_match = re.match(r'(?:Question\s+)?(\d+)[.)]', line, re.IGNORECASE)
if question_match:
if current_question:
questions.append(current_question)

question_num = question_match.group(1)
current_question = {
"source": "exam",
"section": current_section,
"question_id": f"Q{question_num}",
"question_number": int(question_num),
"text": line,
"parts": [],
"marks": extract_marks(line)
}
current_part = None

⚠️ Potential issue | 🟠 Major

Broaden question detection to cover “Q1” headings.

parse_exam_structure advertises support for “Question 1, Q1, 1.” but the regex demands a trailing . or ); plain “Q1”/“Q12” (very common in exam PDFs) will slip through and their text will be glued onto the prior question instead of starting a new entry. Please relax the pattern so it also matches Q1, Q1:, etc., while keeping the existing cases.

-        question_match = re.match(r'(?:Question\s+)?(\d+)[.)]', line, re.IGNORECASE)
+        question_match = re.match(
+            r'^(?:Question\s+|Q\s*)?(\d+)(?:[:.)\s]|$)',
+            line,
+            re.IGNORECASE,
+        )
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
question_match = re.match(r'(?:Question\s+)?(\d+)[.)]', line, re.IGNORECASE)
if question_match:
if current_question:
questions.append(current_question)
question_num = question_match.group(1)
current_question = {
"source": "exam",
"section": current_section,
"question_id": f"Q{question_num}",
"question_number": int(question_num),
"text": line,
"parts": [],
"marks": extract_marks(line)
}
current_part = None
question_match = re.match(
r'^(?:Question\s+|Q\s*)?(\d+)(?:[:.)\s]|$)',
line,
re.IGNORECASE,
)
if question_match:
if current_question:
questions.append(current_question)
question_num = question_match.group(1)
current_question = {
"source": "exam",
"section": current_section,
"question_id": f"Q{question_num}",
"question_number": int(question_num),
"text": line,
"parts": [],
"marks": extract_marks(line)
}
current_part = None

Comment on lines +75 to +103
def preprocess_image_for_ocr(image_path: Path, output_path: Path) -> Path:
"""
Preprocess image to improve OCR accuracy.

Args:
image_path: Path to input image.
output_path: Path for preprocessed image.

Returns:
Path to preprocessed image.
"""
if not TESSERACT_AVAILABLE:
return image_path

from PIL import ImageEnhance, ImageFilter

# Open and convert to grayscale
image = Image.open(image_path).convert('L')

# Enhance contrast
enhancer = ImageEnhance.Contrast(image)
image = enhancer.enhance(2.0)

# Sharpen
image = image.filter(ImageFilter.SHARPEN)

# Save preprocessed image
image.save(output_path)
return output_path

⚠️ Potential issue | 🟠 Major

Add error handling for image preprocessing operations.

The function lacks error handling for PIL operations, which could fail due to file I/O issues, corrupt images, or unsupported formats. This could cause the function to crash rather than gracefully degrading.

Apply this diff:

     from PIL import ImageEnhance, ImageFilter
 
-    # Open and convert to grayscale
-    image = Image.open(image_path).convert('L')
-
-    # Enhance contrast
-    enhancer = ImageEnhance.Contrast(image)
-    image = enhancer.enhance(2.0)
-
-    # Sharpen
-    image = image.filter(ImageFilter.SHARPEN)
-
-    # Save preprocessed image
-    image.save(output_path)
-    return output_path
+    try:
+        # Open and convert to grayscale
+        image = Image.open(image_path).convert('L')
+
+        # Enhance contrast
+        enhancer = ImageEnhance.Contrast(image)
+        image = enhancer.enhance(2.0)
+
+        # Sharpen
+        image = image.filter(ImageFilter.SHARPEN)
+
+        # Save preprocessed image
+        image.save(output_path)
+        return output_path
+    except Exception as e:
+        # Log error and return original path as fallback
+        import logging
+        logging.getLogger(__name__).error(f"Image preprocessing failed for {image_path}: {e}")
+        return image_path
🤖 Prompt for AI Agents
In examkit/ingestion/ocr.py around lines 75 to 103, the PIL preprocessing steps
(opening, converting, enhancing, filtering, and saving) have no error handling;
wrap the image operations in a try/except block (use with Image.open(...) as
image to ensure closure) and catch OSError and PIL.UnidentifiedImageError (or a
broad Exception as fallback), ensure output_path.parent.exists() or create it
before saving, log the error (or raise a descriptive exception) and on failure
return the original image_path so OCR can continue gracefully; also keep the
early return when TESSERACT_AVAILABLE is False.

Comment on lines +33 to +68
images_dir = ensure_dir(cache_dir / "slide_images")

for slide_num, slide in enumerate(prs.slides, start=1):
slide_info = {
"source": "slides",
"type": "pptx",
"slide_number": slide_num,
"title": "",
"content": [],
"notes": "",
"images": []
}

# Extract title
if slide.shapes.title:
slide_info["title"] = slide.shapes.title.text

# Extract text content
for shape in slide.shapes:
if hasattr(shape, "text") and shape.text:
text = shape.text.strip()
if text and text != slide_info["title"]:
slide_info["content"].append(text)

# Extract notes
if slide.has_notes_slide:
notes_slide = slide.notes_slide
if notes_slide.notes_text_frame:
slide_info["notes"] = notes_slide.notes_text_frame.text

# Extract images (basic - just note their presence)
for shape in slide.shapes:
if shape.shape_type == 13: # Picture type
image_name = f"slide_{slide_num}_img_{len(slide_info['images'])}.png"
slide_info["images"].append(image_name)


⚠️ Potential issue | 🟠 Major

Save actual slide images instead of placeholder names.

Both PPTX and PDF branches currently emit fabricated filenames like slide_1_img_0.png / img_42 without ever writing the underlying image data. Downstream rendering and QA stages will therefore try to include diagrams that were never materialized, causing broken links in the generated study packs. Please persist the extracted blobs into cache_dir/slide_images (and return the real paths) when you encounter picture shapes or embedded PDF images.

@@
-import fitz  # PyMuPDF
-from PIL import Image
-from pptx import Presentation
+import fitz  # PyMuPDF
+from pptx import Presentation
+from pptx.enum.shapes import MSO_SHAPE_TYPE
@@
-    images_dir = ensure_dir(cache_dir / "slide_images")
+    images_dir = ensure_dir(cache_dir / "slide_images")
@@
-        for shape in slide.shapes:
-            if shape.shape_type == 13:  # Picture type
-                image_name = f"slide_{slide_num}_img_{len(slide_info['images'])}.png"
-                slide_info["images"].append(image_name)
+        for shape in slide.shapes:
+            if shape.shape_type == MSO_SHAPE_TYPE.PICTURE:
+                image = shape.image
+                ext = image.ext or "png"
+                image_path = images_dir / f"slide_{slide_num}_img_{len(slide_info['images'])}.{ext}"
+                with open(image_path, "wb") as outfile:
+                    outfile.write(image.blob)
+                slide_info["images"].append(str(image_path))
@@
-    doc = fitz.open(str(path))
-    slides_data = []
+    doc = fitz.open(str(path))
+    slides_data = []
+    images_dir = ensure_dir(cache_dir / "slide_images")
@@
-        image_list = page.get_images()
-        slide_info["images"] = [f"img_{img[0]}" for img in image_list]
+        image_list = page.get_images(full=True)
+        for img_index, img in enumerate(image_list):
+            xref = img[0]
+            pix = fitz.Pixmap(doc, xref)
+            try:
+                image_path = images_dir / f"slide_{page_num + 1}_img_{img_index}.png"
+                pix.save(str(image_path))
+                slide_info["images"].append(str(image_path))
+            finally:
+                pix = None

Also applies to: 131-133

🧰 Tools
🪛 Ruff (0.14.3)

33-33: Local variable images_dir is assigned to but never used

Remove assignment to unused variable images_dir

(F841)

Comment on lines +45 to +75
def deduplicate_chunks(chunks: List[Dict[str, Any]], similarity_threshold: float = 0.95) -> List[Dict[str, Any]]:
"""
Remove duplicate or highly similar chunks.

Args:
chunks: List of chunks.
similarity_threshold: Threshold for considering chunks as duplicates.

Returns:
Deduplicated chunks.
"""
if not chunks:
return []

unique_chunks = [chunks[0]]

for chunk in chunks[1:]:
text = chunk.get("text", "")
is_duplicate = False

for unique_chunk in unique_chunks:
unique_text = unique_chunk.get("text", "")
# Simple similarity check based on text overlap
if text == unique_text:
is_duplicate = True
break

if not is_duplicate:
unique_chunks.append(chunk)

return unique_chunks

⚠️ Potential issue | 🟡 Minor

Unused similarity_threshold parameter - implement or remove.

The function declares a similarity_threshold parameter (line 45) but only performs exact text matching (line 68), ignoring the threshold. The docstring mentions "highly similar chunks," suggesting semantic similarity was intended.

Either implement similarity-based deduplication or simplify the API:

-def deduplicate_chunks(chunks: List[Dict[str, Any]], similarity_threshold: float = 0.95) -> List[Dict[str, Any]]:
+def deduplicate_chunks(chunks: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
     """
-    Remove duplicate or highly similar chunks.
+    Remove duplicate chunks based on exact text matching.

     Args:
         chunks: List of chunks.
-        similarity_threshold: Threshold for considering chunks as duplicates.

     Returns:
         Deduplicated chunks.
🧰 Tools
🪛 Ruff (0.14.3)

45-45: Unused function argument: similarity_threshold

(ARG001)

🤖 Prompt for AI Agents
In examkit/nlp/retrieval.py around lines 45 to 75, the similarity_threshold
parameter is declared but unused and the function only does exact text equality;
either implement a similarity-based check using the threshold (e.g., compute a
normalized string similarity or cosine similarity over embeddings and treat
chunks as duplicates when similarity >= similarity_threshold) and document which
similarity metric is used, or remove the similarity_threshold parameter and
update the docstring and signature to reflect exact-match deduplication; ensure
tests and callers are updated accordingly.
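One way to honor the threshold without pulling in the embedding stack is a cheap string-similarity pass (difflib ratio as a stand-in for cosine similarity over embeddings, sketched here under that assumption):

```python
from difflib import SequenceMatcher
from typing import Any, Dict, List

def deduplicate_chunks(chunks: List[Dict[str, Any]],
                       similarity_threshold: float = 0.95) -> List[Dict[str, Any]]:
    """Drop chunks whose text is near-identical to an already-kept chunk.

    difflib's ratio is used as a cheap proxy for semantic similarity;
    swap in embedding cosine similarity for production use.
    """
    unique: List[Dict[str, Any]] = []
    for chunk in chunks:
        text = chunk.get("text", "")
        is_duplicate = any(
            SequenceMatcher(None, text, kept.get("text", "")).ratio()
            >= similarity_threshold
            for kept in unique
        )
        if not is_duplicate:
            unique.append(chunk)
    return unique
```

Note the quadratic pairwise comparison is fine at expected chunk counts but would need blocking or an index for large corpora.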

Comment on lines +175 to +192
# Write Mermaid code to temp file
temp_file = output_path.with_suffix('.mmd')
with open(temp_file, 'w') as f:
f.write(mermaid_code)

# Generate diagram
try:
subprocess.run(
["mmdc", "-i", str(temp_file), "-o", str(output_path)],
capture_output=True,
check=True
)
temp_file.unlink() # Clean up temp file
return True
except subprocess.CalledProcessError as e:
if logger:
logger.error(f"Failed to generate Mermaid diagram: {e}")
return False

⚠️ Potential issue | 🟠 Major

Ensure temp file cleanup on subprocess failure.

The temp file is only deleted in the success path (line 187). If the subprocess fails, the .mmd file will remain on disk.

Apply this diff to ensure cleanup:

     # Write Mermaid code to temp file
     temp_file = output_path.with_suffix('.mmd')
-    with open(temp_file, 'w') as f:
-        f.write(mermaid_code)
-
-    # Generate diagram
     try:
+        with open(temp_file, 'w') as f:
+            f.write(mermaid_code)
+
+        # Generate diagram
         subprocess.run(
             ["mmdc", "-i", str(temp_file), "-o", str(output_path)],
             capture_output=True,
             check=True
         )
-        temp_file.unlink()  # Clean up temp file
         return True
     except subprocess.CalledProcessError as e:
         if logger:
-            logger.error(f"Failed to generate Mermaid diagram: {e}")
+            logger.exception("Failed to generate Mermaid diagram")
         return False
+    finally:
+        # Always clean up temp file
+        if temp_file.exists():
+            temp_file.unlink()
🧰 Tools
🪛 Ruff (0.14.3)

182-182: subprocess call: check for execution of untrusted input

(S603)


183-183: Starting a process with a partial executable path

(S607)


188-188: Consider moving this statement to an else block

(TRY300)


191-191: Use logging.exception instead of logging.error

Replace with exception

(TRY400)

🤖 Prompt for AI Agents
In examkit/synthesis/diagrams.py around lines 175 to 192, the temp .mmd file is
only removed on the success path so it will remain if subprocess.run raises
CalledProcessError; modify the flow to always attempt to unlink the temp file by
moving temp_file.unlink() into a finally block (or call unlink() in both success
and except paths) so the temporary file is removed regardless of subprocess
outcome, and keep logging and the boolean return behavior intact.

Comment on lines +20 to +24
try:
response = requests.get("http://localhost:11434/api/tags", timeout=2)
return response.status_code == 200
except:
return False

⚠️ Potential issue | 🟠 Major

Replace bare except with specific exception handling.

The bare except clause catches all exceptions including system exits and keyboard interrupts, which can mask serious issues and make debugging difficult.

Apply this diff:

     try:
         response = requests.get("http://localhost:11434/api/tags", timeout=2)
         return response.status_code == 200
-    except:
+    except (requests.exceptions.RequestException, OSError):
         return False
📝 Committable suggestion


Suggested change
try:
response = requests.get("http://localhost:11434/api/tags", timeout=2)
return response.status_code == 200
except:
return False
try:
response = requests.get("http://localhost:11434/api/tags", timeout=2)
return response.status_code == 200
except (requests.exceptions.RequestException, OSError):
return False
🧰 Tools
🪛 Ruff (0.14.3)

22-22: Consider moving this statement to an else block

(TRY300)


23-23: Do not use bare except

(E722)

🤖 Prompt for AI Agents
In examkit/synthesis/ollama_client.py around lines 20 to 24, replace the bare
except with targeted handling for HTTP/request-related errors: catch
requests.exceptions.RequestException (or subclass like Timeout, ConnectionError)
instead of a bare except, e.g. use "except
requests.exceptions.RequestException:" (or "except
requests.exceptions.RequestException as e:" if you want to log the error) and
return False; this ensures only network/request errors are swallowed while
allowing system exceptions to propagate.

Comment on lines +34 to +41
try:
response = requests.get("http://localhost:11434/api/tags", timeout=5)
if response.status_code == 200:
data = response.json()
return [model["name"] for model in data.get("models", [])]
except:
pass
return []

⚠️ Potential issue | 🟠 Major

Replace bare except with specific exception handling and add logging.

The bare except silently suppresses all exceptions. Consider catching specific exceptions and logging failures for debugging.

Apply this diff:

     try:
         response = requests.get("http://localhost:11434/api/tags", timeout=5)
         if response.status_code == 200:
             data = response.json()
             return [model["name"] for model in data.get("models", [])]
-    except:
-        pass
+    except requests.exceptions.RequestException:
+        # Ollama not available or connection failed
+        pass
     return []
🧰 Tools
🪛 Ruff (0.14.3)

39-39: Do not use bare except

(E722)


39-40: try-except-pass detected, consider logging the exception

(S110)
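A standalone sketch of the logged variant suggested by S110 (the function name and logger setup are illustrative assumptions, not the examkit implementation):

```python
import logging

import requests

logger = logging.getLogger(__name__)


def list_ollama_models(base_url: str = "http://localhost:11434") -> list[str]:
    """Return model names from a local Ollama server, or [] on failure."""
    try:
        response = requests.get(f"{base_url}/api/tags", timeout=5)
        if response.status_code == 200:
            data = response.json()
            return [model["name"] for model in data.get("models", [])]
    except requests.exceptions.RequestException as exc:
        # S110: record the failure instead of silently passing
        logger.debug("Ollama not reachable: %s", exc)
    return []
```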

Comment on lines +10 to +33
    [tool.poetry.dependencies]
    python = ">=3.11,<4.0"
    typer = {extras = ["all"], version = "^0.9.0"}
    rich = "^13.7.0"
    pydantic = "^2.5.0"
    pyyaml = "^6.0.1"
    tqdm = "^4.66.1"
    faster-whisper = "^0.10.0"
    ffmpeg-python = "^0.2.0"
    pymupdf = "^1.23.8"
    pdfminer-six = "^20221105"
    python-pptx = "^0.6.23"
    pytesseract = "^0.3.10"
    Pillow = "^10.1.0"
    sentence-transformers = "^2.2.2"
    faiss-cpu = "^1.7.4"
    spacy = "^3.7.2"
    matplotlib = "^3.8.2"
    pandas = "^2.1.4"
    numpy = "^1.26.2"
    scikit-learn = "^1.3.2"
    jinja2 = "^3.1.2"
    requests = "^2.31.0"
    graphviz = "^0.20.1"
⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

Verify dependency versions for security and compatibility.

Several dependencies are pinned to specific minor versions from 2023-2024. Ensure these versions are still maintained and free from known vulnerabilities, especially for security-sensitive packages like requests, Pillow, and PyMuPDF.

Run the following script to check for security advisories:


🏁 Script executed:

#!/bin/bash
# Description: Check for security advisories on key dependencies

echo "Checking security advisories for key dependencies..."

# Check requests
gh api graphql -f query='
{
  securityVulnerabilities(first: 3, ecosystem: PIP, package: "requests") {
    nodes {
      advisory { summary severity publishedAt }
      vulnerableVersionRange
      firstPatchedVersion { identifier }
    }
  }
}'

# Check Pillow
gh api graphql -f query='
{
  securityVulnerabilities(first: 3, ecosystem: PIP, package: "Pillow") {
    nodes {
      advisory { summary severity publishedAt }
      vulnerableVersionRange
      firstPatchedVersion { identifier }
    }
  }
}'

# Check PyMuPDF
gh api graphql -f query='
{
  securityVulnerabilities(first: 3, ecosystem: PIP, package: "PyMuPDF") {
    nodes {
      advisory { summary severity publishedAt }
      vulnerableVersionRange
      firstPatchedVersion { identifier }
    }
  }
}'

Length of output: 1556


Update the requests dependency to fix an active security vulnerability.

The project pins requests = "^2.31.0", which is affected by a credential-leak vulnerability published in June 2025 (CVE affecting < 2.32.4). Update to at least "^2.32.4" to include the required security patch.

Additionally, the current range excludes the fix for a session verification bypass (< 2.32.0, May 2024). Updating to "^2.32.4" addresses both vulnerabilities.

Pillow and PyMuPDF have no currently known vulnerabilities at their pinned versions.

🤖 Prompt for AI Agents
In pyproject.toml around lines 10 to 33, the pinned requests dependency
(^2.31.0) is vulnerable and must be updated; change the requests entry to use at
least version 2.32.4 (e.g. "^2.32.4") so the CVE fixes for credential leak and
session verification bypass are included, then run poetry lock/install to update
the lockfile and CI to ensure the new version is used.
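Under that recommendation, the pyproject.toml entry would read (a sketch of the suggested change, not a verified-compatible pin):

```toml
[tool.poetry.dependencies]
# Raised from ^2.31.0 per the security advisory above
requests = "^2.32.4"
```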

Comment on lines +47 to +53
    def test_text_cleaning():
        """Test text utilities."""
        from examkit.utils.text_utils import clean_text, normalize_whitespace

        dirty_text = " Hello world \n\n "
        assert clean_text(dirty_text) == "Hello world"
        assert normalize_whitespace(dirty_text) == "Hello world"
⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

Verify functional differences between clean_text and normalize_whitespace.

Both functions appear to produce identical output in this test. If they serve different purposes (e.g., one removes special chars, the other only handles whitespace), consider tests that demonstrate those differences.


🏁 Script executed:

#!/bin/bash
# Description: Check the implementation differences between clean_text and normalize_whitespace

# Find the text_utils module and examine both functions
ast-grep --pattern $'def clean_text($$$) {
  $$$
}'

ast-grep --pattern $'def normalize_whitespace($$$) {
  $$$
}'

# Look for their implementations
rg -A 10 "^def clean_text" examkit/utils/text_utils.py
rg -A 10 "^def normalize_whitespace" examkit/utils/text_utils.py

Length of output: 406



---

🏁 Script executed:

#!/bin/bash
# Get the full implementation of both functions
# First, find the file location
fd --type f text_utils.py

# Then view enough context to see complete implementations
rg -A 20 "^def clean_text"
rg -A 15 "^def normalize_whitespace"

Length of output: 2006


Test does not demonstrate the functional differences between clean_text and normalize_whitespace.

clean_text removes excessive whitespace, control characters ([\x00-\x1f\x7f-\x9f]), and strips edges, while normalize_whitespace only normalizes whitespace via ' '.join(text.split()). The test input contains no control characters, so both functions produce identical output. Consider adding a test case with control characters to verify clean_text properly removes them, or separate the assertions to document their distinct responsibilities.

🤖 Prompt for AI Agents
In tests/test_parsers.py around lines 47 to 53, the test uses input without
control characters so clean_text and normalize_whitespace behave identically;
update the test to include a separate case with control characters (e.g. include
'\x00' or '\x07' in the dirty input) and assert that clean_text removes those
control characters while normalize_whitespace does not (or preserves them),
and/or split the current assertions into two focused cases so each function's
distinct behavior is verified and documented.
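A sketch of such a differentiating test, assuming implementations matching the descriptions above (control-character regex removal in clean_text, whitespace-only ' '.join(text.split()) in normalize_whitespace); the inlined helpers are illustrative stand-ins for examkit.utils.text_utils:

```python
import re


def normalize_whitespace(text: str) -> str:
    # Collapses runs of whitespace to single spaces and strips edges
    return " ".join(text.split())


def clean_text(text: str) -> str:
    # Removes control characters, then normalizes whitespace
    text = re.sub(r"[\x00-\x1f\x7f-\x9f]", "", text)
    return " ".join(text.split())


dirty = "  Hello\x00 world \x07 \n\n"
assert clean_text(dirty) == "Hello world"
# normalize_whitespace leaves the control characters in place
assert normalize_whitespace(dirty) == "Hello\x00 world \x07"
```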

@thecoder8890 thecoder8890 merged commit 8233866 into main Nov 10, 2025
1 check passed

Development

Successfully merging this pull request may close these issues.

Master Build: Production-Grade Python "ExamKit" Project Generator (macOS, Offline, OSS)

2 participants