
Refactor the HWA Jira issues CSV generator script #2

Merged
glaubergedoz merged 1 commit into main from refactor/hwa-jira-issues-generator
Dec 2, 2025


@glaubergedoz
Member

Refactor: HelloWorldAtlas Jira Issues CSV Generator

Overview

This change refactors the Python script used to generate Jira issues from a CSV file containing the applications in the HelloWorldAtlas product line from GedozTech Lab.

The goal is to keep the script simple and focused on its original purpose, while aligning it with modern Python best practices and making it easier to maintain, understand, and evolve.

Goals

  • Improve readability and maintainability.
  • Reduce duplication (DRY) and make the structure more declarative.
  • Align with Python best practices (PEP 8, type hints, clear function boundaries).
  • Make CLI usage explicit and robust.
  • Preserve the original behavior and Jira import semantics.

Key Changes

1. Structure and CLI

  • Introduced argparse for command-line parsing instead of manually using sys.argv:

    • Adds --help support with usage documentation.
    • Provides more robust handling of required arguments.
    • Makes the script easier to extend with new flags or options in the future.
  • Extracted logic into dedicated functions:

    • split_multi() — parses semicolon-separated multi-value fields.
    • read_source_rows() — loads the input CSV into memory.
    • compute_max_multi_counts() — determines the maximum number of Stack and Label values across all rows.
    • build_header() — builds the CSV header dynamically based on the computed counts.
    • build_base_row() — constructs the fixed part of each issue row.
    • generate_issue_rows_for_application() — generates all issues for a single application.

    This separation makes the script much easier to follow and test in isolation.
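As a rough illustration of the CLI change, the argparse setup might look like the sketch below. The function name `parse_args` and the argument names `input_csv` and `output_csv` are assumptions made for this example; only the two positional paths and `--help` support are described above.

```python
import argparse
from pathlib import Path
from typing import List, Optional


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    """Parse the two positional paths (argument names are hypothetical)."""
    parser = argparse.ArgumentParser(
        description="Generate a Jira import CSV from a HelloWorldAtlas applications CSV."
    )
    # Positional arguments are required by default, so argparse reports
    # a usage error on its own when one of them is missing.
    parser.add_argument("input_csv", type=Path, help="source CSV with the applications")
    parser.add_argument("output_csv", type=Path, help="destination CSV for Jira import")
    return parser.parse_args(argv)


# Passing an explicit list keeps the example independent of sys.argv.
args = parse_args(["input.csv", "output.csv"])
print(args.input_csv, args.output_csv)
```

Accepting an optional `argv` list is a design choice for this sketch: it lets the parser be exercised in tests without touching the real command line.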

2. Readability and Maintainability

  • Introduced constants for input column names:

    COL_APPLICATION = "Application"
    COL_PRODUCT_CATEGORY = "Product Category"
    COL_PRODUCT_TYPE = "Product Type"
    COL_PLATFORM = "Platform"
    COL_LAYER = "Layer"
    COL_OS = "OS"
    COL_COMPLEXITY = "Complexity"
    COL_LANGUAGE = "Language"
    COL_STACK = "Stack"
    COL_LABELS = "Labels"

    This helps avoid typos and centralizes any future schema changes in a single place.

  • Replaced repeated issue blocks with a data-driven structure:

    Previously, each of the five issues per application was implemented as a separate block of code, repeating the same boilerplate for building rows.

    The refactored version uses a simple list of issue definitions:

    issue_definitions = [
        (
            "Task",
            "[{app}] Conduct product and tech discovery",
            "...",
            "Discovery",
        ),
        (
            "User Story",
            "[{app}] Implement the solution",
            "...",
            "Delivery",
        ),
        ...
    ]

    Each entry holds (issue_type, summary_template, description, components), and the script loops over this list to generate the rows. The {app} placeholder is filled dynamically for each application. This approach reduces duplication and makes it easy to add, remove, or modify issues in the future.

  • Centralized the construction of the base issue row:

    The build_base_row() helper builds the fixed part of a row. Only issue_type, summary, description, and components vary per issue; every other field comes from the application itself:

    base_row = build_base_row(
        issue_type=issue_type,
        summary=summary,
        description=description,
        app=app,
        product_category=product_category,
        product_type=product_type,
        platform=platform,
        layer=layer,
        os_value=os_value,
        complexity=complexity,
        language=language,
        components=components,
        priority=priority,
    )

    This keeps the logic consistent across all issues.
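The data-driven loop described above could be sketched as follows. This is a simplified stand-in, not the script's actual code: `build_base_row()` here takes only a few of the fields listed above, the signature of `generate_issue_rows_for_application()` is assumed, and the descriptions stay elided as in the PR text.

```python
from typing import List, Tuple

# Each entry is (issue_type, summary_template, description, components).
# Only two of the five definitions are shown; descriptions are placeholders.
ISSUE_DEFINITIONS: List[Tuple[str, str, str, str]] = [
    ("Task", "[{app}] Conduct product and tech discovery", "...", "Discovery"),
    ("User Story", "[{app}] Implement the solution", "...", "Delivery"),
]


def build_base_row(
    issue_type: str, summary: str, description: str, app: str, components: str
) -> List[str]:
    """Simplified stand-in: the real helper also receives platform, layer, etc."""
    return [issue_type, summary, description, app, components]


def generate_issue_rows_for_application(app: str) -> List[List[str]]:
    """Build one row per issue definition for a single application."""
    rows = []
    for issue_type, summary_template, description, components in ISSUE_DEFINITIONS:
        # Fill the {app} placeholder for this application.
        summary = summary_template.format(app=app)
        rows.append(build_base_row(issue_type, summary, description, app, components))
    return rows


rows = generate_issue_rows_for_application("HelloWorld CLI")
for row in rows:
    print(row)
```

Adding a sixth issue then means appending one tuple to the list rather than copying a block of row-building code.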

3. PEP 8, Type Hints, and Style

  • Added type hints throughout the script:

    • For example: split_multi(value: str | None) -> List[str] and read_source_rows(input_path: Path) -> List[Dict[str, str]].
    • This makes the script easier to understand and improves editor/IDE support and static analysis.
  • Improved code formatting and line-wrapping:

    • Broke long descriptions and strings into multiple lines for better readability.
    • Ensured consistent spacing and layout between top-level functions.
  • Imports are kept minimal and standard-library only (argparse, csv, sys, pathlib, typing, collections.abc).

4. Robustness and Error Handling

  • Added explicit error handling for file operations:

    • If the input CSV file does not exist, the script prints a clear error message and exits with a non-zero status code.
    • If there is a permission issue, the script handles it similarly.
  • Graceful handling of empty input:

    • If the CSV has no data rows, the script prints a message and exits without attempting to generate an output file with zero issues.
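A minimal sketch of this error handling, under stated assumptions: the wrapper name `load_rows_or_exit`, the message wording, the UTF-8 encoding, and the exit code 0 for empty input are all illustrative. The description above only specifies a non-zero status for file errors.

```python
import csv
import sys
from pathlib import Path
from typing import Dict, List


def read_source_rows(input_path: Path) -> List[Dict[str, str]]:
    """Load every data row of the input CSV into memory."""
    # newline="" is what the csv module documentation recommends for file objects.
    with input_path.open(newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


def load_rows_or_exit(input_path: Path) -> List[Dict[str, str]]:
    """Hypothetical wrapper: exit with a clear message instead of a traceback."""
    try:
        rows = read_source_rows(input_path)
    except FileNotFoundError:
        print(f"Error: input file not found: {input_path}", file=sys.stderr)
        sys.exit(1)
    except PermissionError:
        print(f"Error: no permission to read: {input_path}", file=sys.stderr)
        sys.exit(1)
    if not rows:
        # Assumption: empty input is not treated as a failure here.
        print("No data rows found; nothing to generate.")
        sys.exit(0)
    return rows
```

Catching the two specific exceptions, rather than a broad `OSError`, keeps each message precise about what went wrong.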

5. Jira-Specific Behavior (Stack and Label Columns)

  • Preserved the multi-value behavior for Jira fields:

    • The output header repeats the column names "Stack" and "Label", once for each value slot needed:

      stack_columns = ["Stack"] * max_stack_count
      label_columns = ["Label"] * max_label_count
    • This is intentional: Jira treats repeated column names as multiple values for the same field, consolidating them into the Stack and Labels fields of the issue.

    • Comments were added in the code to document this behavior and avoid confusion for future readers.

  • Multi-value parsing remains robust:

    • The split_multi() helper:

      • Accepts None or empty strings.
      • Splits by semicolons.
      • Trims whitespace.
      • Ignores empty entries.
    • This ensures that values like "; Go ; Python;; JS " are normalized to ["Go", "Python", "JS"].
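The pieces in this section fit together roughly as sketched below. `split_multi()` follows the four rules listed above; the other functions are simplified guesses at the real helpers. In particular, the two fixed columns, the `pad()` helper, and the sample rows are assumptions for illustration. The sketch uses `csv.writer` with plain lists, since a dict-based row cannot hold two values under the same column name.

```python
import csv
import io
from typing import Dict, List, Optional, Tuple


def split_multi(value: Optional[str]) -> List[str]:
    """Split on semicolons, trim whitespace, and drop empty entries."""
    if not value:
        return []
    return [part.strip() for part in value.split(";") if part.strip()]


def compute_max_multi_counts(rows: List[Dict[str, str]]) -> Tuple[int, int]:
    """Return the largest number of Stack and Labels values in any row."""
    max_stack = max((len(split_multi(row.get("Stack"))) for row in rows), default=0)
    max_label = max((len(split_multi(row.get("Labels"))) for row in rows), default=0)
    return max_stack, max_label


def build_header(max_stack_count: int, max_label_count: int) -> List[str]:
    """Repeat "Stack" and "Label" once per value slot."""
    fixed_columns = ["Issue Type", "Summary"]  # assumption: the real header has more
    return fixed_columns + ["Stack"] * max_stack_count + ["Label"] * max_label_count


def pad(values: List[str], width: int) -> List[str]:
    """Hypothetical helper: right-pad with empty cells so columns stay aligned."""
    return values + [""] * (width - len(values))


source_rows = [
    {"Application": "A", "Stack": "; Go ; Python;; JS ", "Labels": "cli"},
    {"Application": "B", "Stack": "Rust", "Labels": "web; api"},
]
max_stack, max_label = compute_max_multi_counts(source_rows)

buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(build_header(max_stack, max_label))
for row in source_rows:
    writer.writerow(
        ["Task", f"[{row['Application']}] Example"]
        + pad(split_multi(row["Stack"]), max_stack)
        + pad(split_multi(row["Labels"]), max_label)
    )
print(buffer.getvalue())
```

Rows with fewer values than the maximum get empty cells in the unused slots, so every row has the same number of columns as the header.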

6. How to Run

After the refactor, usage remains simple and explicit:

python generate_helloworldatlas_issues.py input.csv output.csv
  • input.csv: the source spreadsheet with HelloWorldAtlas applications.
  • output.csv: the generated Jira issues ready for import.

Use --help to display usage information:

python generate_helloworldatlas_issues.py --help

@glaubergedoz glaubergedoz merged commit 7dea7a1 into main Dec 2, 2025
@glaubergedoz glaubergedoz deleted the refactor/hwa-jira-issues-generator branch December 2, 2025 21:50