@shbhmexe
Contributor

This pull request introduces several improvements and fixes to data handling and output formatting across multiple scripts, primarily focused on ensuring compatibility with binary file operations, improving error handling, and enhancing output data completeness.

File handling and compatibility improvements:

  • Changed file operations for committags.db to use binary mode ('wb' and 'rb') in both src/committags.py and src/linetags.py, ensuring compatibility with Python 3's pickle module requirements.
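
A minimal sketch of why binary mode matters here. On Python 3, `pickle.dump` writes bytes and `pickle.load` reads bytes, so the file has to be opened with `'wb'` and `'rb'`. The helper names `save_tags` and `load_tags` and the sample dictionary are hypothetical and are not taken from the gitdm sources; only the file name committags.db comes from the description above.

```python
import os
import pickle
import tempfile


def save_tags(tags, path):
    """Hypothetical helper: pickle the tag mapping to disk."""
    # pickle.dump emits bytes, so the file must be opened in binary mode.
    with open(path, "wb") as db:
        pickle.dump(tags, db)


def load_tags(path):
    """Hypothetical helper: read the pickled tag mapping back."""
    # pickle.load expects a binary file object as well.
    with open(path, "rb") as db:
        return pickle.load(db)


# Round trip through a temporary directory; the contents are made up.
db_path = os.path.join(tempfile.mkdtemp(), "committags.db")
sample_tags = {"abc123": "v1.0"}
save_tags(sample_tags, db_path)
restored = load_tags(db_path)
```

Opening the same file with `'w'` instead would make `pickle.dump` fail with a `TypeError` on Python 3, because a text-mode file only accepts `str`.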

Error handling and debugging:

  • In src/cncfdm.py, improved handling of missing affiliations by conditionally invoking the debugger only if DebugHalt is set; otherwise, a descriptive error message is written to stderr and the entry is skipped.
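
The pattern described above could look roughly like the following sketch. Only the `DebugHalt` name comes from the description; the function `resolve_affiliation`, its parameters, and the message text are assumptions made for illustration, not the actual code in src/cncfdm.py.

```python
import pdb
import sys

# Assumed to be switched on by a command-line option in the real script.
DebugHalt = False


def resolve_affiliation(email, affiliations, debug_halt=DebugHalt, err=sys.stderr):
    """Hypothetical helper: return the employer for email, or None if unknown."""
    employer = affiliations.get(email)
    if employer is None:
        if debug_halt:
            # Drop into the debugger only when explicitly requested.
            pdb.set_trace()
        else:
            # Otherwise report the problem and let the caller skip the entry.
            err.write("No affiliation found for %s, skipping\n" % email)
        return None
    return employer


# Made-up data: one known address and one unknown address.
known = {"dev@example.com": "Example Corp"}
emails = ["dev@example.com", "unknown@example.com"]
resolved = [
    employer
    for employer in (resolve_affiliation(mail, known) for mail in emails)
    if employer is not None
]
```

With the flag off, an unattended run keeps going past a missing affiliation instead of stopping at an interactive prompt.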

Output and data completeness:

  • Updated the CSV output in AllAffsCSV (src/database.py) to include the source field, providing more comprehensive data in each row.
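
As an illustration of keeping every row the same width, here is a small sketch using the standard `csv` module. The column names, the `write_affiliations` function, and the sample rows are all hypothetical; the real layout of AllAffsCSV in src/database.py may differ.

```python
import csv
import io


def write_affiliations(rows, out):
    """Hypothetical writer: emit one CSV row per affiliation, source included."""
    writer = csv.writer(out)
    writer.writerow(["email", "name", "company", "source"])
    for row in rows:
        # Always write the source cell so every row has the same number of columns.
        writer.writerow(
            [row["email"], row["name"], row["company"], row.get("source", "")]
        )


# Made-up rows: a primary address and an alias address for the same person.
buf = io.StringIO()
write_affiliations(
    [
        {"email": "dev@example.com", "name": "Dev One",
         "company": "Example Corp", "source": "config"},
        {"email": "dev@alias.example.com", "name": "Dev One",
         "company": "Example Corp", "source": "alias"},
    ],
    buf,
)
lines = buf.getvalue().splitlines()
```

A consumer that reads the file with `csv.DictReader` then sees the same keys on every row, alias rows included.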

Default behavior and consistency:

  • Modified the default date range for logparser.LogPatchSplitter in src/gitdm.py to match the wide-open range used in cncfdm.py, ensuring consistent patch processing across scripts.

…anging outputs

  • reports.py: fix NameError in ReportByReports (use reported instead of undefined report).
  • gitdm.py: pass default date range to LogPatchSplitter to match current API and avoid TypeError.
  • database.py: include source column for alias rows in AllAffsCSV to keep CSV shape consistent.
  • committags.py: write pickle in binary mode (wb) for compatibility.
  • linetags.py: read pickle in binary mode (rb) for compatibility.
  • cncfdm.py: avoid unconditional pdb on missing affiliation; only break into debugger with -X, otherwise log and continue.

These changes are no-ops for normal outputs, with one exception: AllAffsCSV alias rows now include the source column as intended.

Signed-off-by: Shubham Shukla <shubhushukla586@gmail.com>
@lukaszgryglicki lukaszgryglicki self-assigned this Nov 24, 2025
Member

@lukaszgryglicki lukaszgryglicki left a comment

/lgtm

@lukaszgryglicki lukaszgryglicki merged commit 8b69f6a into cncf:master Nov 24, 2025
1 check passed
@shbhmexe shbhmexe deleted the fix/logparser-call-reports-bug-csv-pickle-pdb branch November 24, 2025 06:49
@shbhmexe
Contributor Author

/lgtm

Hey Łukasz, I’ve fixed all the bugs and issues I could find in the repo and made the improvements I could, and I’ll continue contributing whenever I spot more. If there are any features you’d like added, or any medium-level issues or help-wanted tasks, feel free to let me know. I really enjoy working on this project and would be happy to take them up.

@lukaszgryglicki
Member

Hi, thanks for this.
