@elisafalk (Collaborator) commented Nov 20, 2025

This PR introduces the foundational structure for the Natural Language Generation (NLG) engine.

Changes

  • Created the new directory backend/app/services/nlg/ to house NLG-related components.
  • Added nlg_engine.py which defines the base NLGEngine class.
  • Added abstract generate_section_text and generate_full_report methods as the extension points for future LLM provider integrations.
  • Added a _format_output helper so that every output is a structured JSON string the orchestrator can consume.

Summary by CodeRabbit

  • New Features
    • Introduced an extensible natural language generation framework for report creation supporting multiple provider implementations.
    • Added standardized interfaces for generating text for individual report sections and complete reports.
    • Ensured consistent JSON-formatted output across generated content for improved interoperability.
    • Included clear documentation for the new interfaces and expected output structures.

coderabbitai bot commented Nov 20, 2025

Walkthrough

Adds a new abstract base class NLGEngine in backend/app/services/nlg/nlg_engine.py with abstract methods for generating section-level and full-report text and a helper to JSON-format outputs.

Changes

Cohort / File(s) | Summary
NLG Engine Interface: backend/app/services/nlg/nlg_engine.py | New abstract base class NLGEngine with abstract methods generate_section_text(section_id: str, raw_data: dict) -> str and generate_full_report(data: dict) -> str, plus helper _format_output(content: dict) -> str. Docstrings describe arguments and expected JSON-formatted return structures.
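
The interface summarized above could look roughly like the sketch below. It is reconstructed only from the signatures quoted in this walkthrough and is not the merged file; the docstring wording and the instantiation check at the end are illustrative assumptions.

```python
import json
from abc import ABC, abstractmethod


class NLGEngine(ABC):
    """Base interface for pluggable NLG providers (sketch, not the merged code)."""

    @abstractmethod
    def generate_section_text(self, section_id: str, raw_data: dict) -> str:
        """Return a JSON-formatted string for a single report section."""

    @abstractmethod
    def generate_full_report(self, data: dict) -> str:
        """Return a JSON-formatted string for the complete report."""

    def _format_output(self, content: dict) -> str:
        """Serialize content so that every output is a JSON string."""
        return json.dumps(content)


# An ABC with unimplemented abstract methods cannot be instantiated,
# which forces each provider to supply both generate_* methods.
try:
    NLGEngine()
    instantiated = True
except TypeError:
    instantiated = False
```

Keeping _format_output concrete on the base class means providers only decide what goes into the dictionary, not how it is serialized.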

Sequence Diagram(s)

sequenceDiagram
    participant Caller
    participant NLGEngine
    participant JSON

    Caller->>NLGEngine: generate_section_text(section_id, raw_data)
    activate NLGEngine
    NLGEngine-->>NLGEngine: validate inputs\ngenerate content (abstract)
    NLGEngine->>JSON: _format_output(content)
    JSON-->>NLGEngine: json_string
    NLGEngine-->>Caller: json_string
    deactivate NLGEngine

    rect rgb(240,255,240)
      Note right of NLGEngine: generate_full_report follows same flow\nbut accepts full report data
    end

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~8 minutes

  • Check abstract method signatures and docstrings for clarity.
  • Verify _format_output() handles serialization and edge cases (non-serializable values).

Poem

🐰 A tiny engine, clean and bright,
I hop and stitch the words just right,
Sections, reports, wrapped in JSON cheer,
A rabbit's whisper, crisp and clear,
Pluggable magic, springing near.

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name | Status | Explanation
Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled.
Title check | ✅ Passed | The title 'Feat: Implement NLG Engine base structure' directly and clearly summarizes the main change: introducing the foundational base class structure for the NLG engine as described in the PR objectives.
Docstring Coverage | ✅ Passed | Docstring coverage is 100.00%, which is sufficient. The required threshold is 80.00%.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 3

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4f13a1c and 16a6e76.

📒 Files selected for processing (1)
  • backend/app/services/nlg/nlg_engine.py (1 hunks)

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
backend/app/services/nlg/nlg_engine.py (1)

41-45: Consider adding error handling for robustness.

The _format_output helper is straightforward, but json.dumps() will raise a TypeError if content contains non-serializable objects (e.g., datetime, custom objects). While this may be acceptable for a base class where implementations control the content, consider adding error handling or documentation about serialization requirements.

Optional enhancement:

 def _format_output(self, content: dict) -> str:
     """
     Helper method to ensure all outputs are structured as JSON.
+    
+    Args:
+        content (dict): A dictionary containing only JSON-serializable types.
+    
+    Raises:
+        TypeError: If content contains non-serializable objects.
     """
     return json.dumps(content)

Alternatively, you could add a custom serialization handler:

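def _format_output(self, content: dict) -> str:
    """
    Helper method to ensure all outputs are structured as JSON.
    """
    try:
        # default=str stringifies values json cannot encode (e.g. datetime).
        return json.dumps(content, default=str)
    except TypeError as e:
        # Still raised for unsupported dict keys, which default= does not cover.
        raise ValueError(f"Content contains non-serializable data: {e}") from e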
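
To make the nitpick concrete, the sketch below shows the behavior it describes using only the standard library. The payload keys (section_id, generated_at) are hypothetical and do not come from the PR.

```python
import json
from datetime import datetime

# Hypothetical payload; the key names are assumptions for illustration.
payload = {"section_id": "summary", "generated_at": datetime(2025, 11, 20, 8, 18)}

# Plain json.dumps rejects the datetime value with a TypeError.
try:
    json.dumps(payload)
    plain_ok = True
except TypeError:
    plain_ok = False

# default=str converts any value json cannot encode by calling str() on it.
fallback = json.dumps(payload, default=str)

# default= is not consulted for dictionary keys, so a tuple key still fails.
try:
    json.dumps({("a", "b"): 1}, default=str)
    tuple_key_ok = True
except TypeError:
    tuple_key_ok = False
```

The trade-off with default=str is that unexpected objects are silently stringified instead of surfacing as errors, so documenting the serialization requirement (the first option above) may be the safer choice for a base class.
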
📜 Review details

📥 Commits

Reviewing files that changed from the base of the PR and between 16a6e76 and d64dc6b.

📒 Files selected for processing (1)
  • backend/app/services/nlg/nlg_engine.py (1 hunks)
🔇 Additional comments (4)
backend/app/services/nlg/nlg_engine.py (4)

1-9: Well-structured foundation for the NLG engine.

The abstract base class pattern is appropriate for defining a pluggable interface, and the docstring clearly communicates the purpose. The imports are minimal and correct.


11-24: Excellent improvements to return type and documentation.

The return type annotation has been corrected to -> str, and the docstring now explicitly states that the method returns "A JSON-formatted string" with a clear example showing the usage of _format_output. This addresses all the concerns raised in previous reviews.


26-39: Well-documented abstract method with clear contract.

The return type has been corrected to -> str, and the docstring provides a comprehensive example showing the expected JSON structure. The example clearly demonstrates how implementations should use _format_output for consistent JSON serialization.
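
As an illustration of the contract described in these two comments, a provider might subclass the base class along the following lines. EchoNLGEngine, the section_id/text/sections keys, and the sample data are all hypothetical; the base class is a stand-in reduced to the signatures quoted in the walkthrough, and the actual docstring example in the PR may use a different structure.

```python
import json
from abc import ABC, abstractmethod


class NLGEngine(ABC):
    # Stand-in for the reviewed base class, reduced to its signatures.
    @abstractmethod
    def generate_section_text(self, section_id: str, raw_data: dict) -> str: ...

    @abstractmethod
    def generate_full_report(self, data: dict) -> str: ...

    def _format_output(self, content: dict) -> str:
        return json.dumps(content)


class EchoNLGEngine(NLGEngine):
    """Hypothetical provider that builds text without calling an LLM."""

    def generate_section_text(self, section_id: str, raw_data: dict) -> str:
        # Render the raw data as "key: value" pairs in key order.
        text = ", ".join(f"{key}: {value}" for key, value in sorted(raw_data.items()))
        return self._format_output({"section_id": section_id, "text": text})

    def generate_full_report(self, data: dict) -> str:
        # Reuse the section method, then wrap all sections in one JSON document.
        sections = [
            json.loads(self.generate_section_text(section_id, raw))
            for section_id, raw in data.items()
        ]
        return self._format_output({"sections": sections})


engine = EchoNLGEngine()
section_json = engine.generate_section_text("overview", {"status": "ok", "count": 3})
report_json = engine.generate_full_report({"overview": {"count": 3}})
```

Because both methods return strings rather than dictionaries, a caller such as the orchestrator would run json.loads on the result before reading individual fields.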


1-45: Package structure is correctly configured.

The NLG package is properly set up with __init__.py files at all required levels (backend/app/__init__.py, backend/app/services/__init__.py, and backend/app/services/nlg/__init__.py). The module structure supports proper imports of NLGEngine and related components.

@felixjordandev (Collaborator) commented

the structure for the NLG engine looks solid; merging it now! 🚀

@felixjordandev merged commit 207ebb5 into main on Nov 20, 2025
1 check passed
@felixjordandev deleted the feat/nlg-engine-setup branch on November 20, 2025 08:18

3 participants