Title: IndexError: list index out of range in mardown_parser.py:54 when README has no parseable header content #897

@priya-gitTest

Description

Environment
SOMEF version: 0.9.13
Python: 3.11.14
OS: Ubuntu (GitHub Actions ubuntu-latest)
Summary
For certain README files, header_analysis.py crashes with an IndexError raised inside mardown_parser.extract_content_per_header. The traceback is then obscured by a secondary crash in the logging call (in logging/__init__.py), which makes the problem very hard to diagnose.

Error (from stderr)

--- Logging error ---
Traceback (most recent call last):
  File ".../somef/header_analysis.py", line 212, in extract_categories
    data, none_header_content = extract_header_content(repo_data)
  File ".../somef/header_analysis.py", line 103, in extract_header_content
    content, none_header_content = mardown_parser.extract_content_per_header(text, headers)
  File ".../somef/parser/mardown_parser.py", line 54, in extract_content_per_header
    top = keys[0]
          ~~~~^^^
IndexError: list index out of range

During handling of the above exception, another exception occurred:

  File ".../logging/__init__.py", line 377, in getMessage
    msg = msg % self.args
TypeError: not all arguments converted during string formatting

Root cause
mardown_parser.extract_content_per_header (line 54) does:

top = keys[0]

without first checking whether keys is empty. When a README has headers that produce no parseable content (e.g. headers with only badge/image lines, or very short single-line READMEs), keys is an empty list and the index access raises the IndexError shown above ("list index out of range").
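The failure mode can be reproduced outside SOMEF. The following is a minimal sketch, assuming keys is a plain Python list built from the parsed headers; the helper names first_key and try_first_key are hypothetical and are not part of SOMEF.

```python
def first_key(keys):
    """Mimic the unguarded access: index 0 without checking for emptiness."""
    return keys[0]


def try_first_key(keys):
    """Return the first key, or the error text if the list is empty."""
    try:
        return first_key(keys)
    except IndexError as exc:
        return f"IndexError: {exc}"


print(try_first_key(["Installation", "Usage"]))  # a README with headers
print(try_first_key([]))                         # a README with no parseable headers
```

The second call produces the same message as the traceback in this report, which is consistent with keys being empty at line 54.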

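The TypeError in the logging handler is a secondary issue. "not all arguments converted during string formatting" is what msg % self.args raises when a logging call receives more arguments than its message has %-style placeholders, so the message and the arguments passed when logging the exception do not match. The failed formatting masks the real traceback in normal log output.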
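The secondary crash can be illustrated with the standard library alone. This is a sketch under an assumption: the actual SOMEF logging call is not shown in this report, so the example assumes the exception was passed as an extra argument with no matching placeholder. The logger name and message text are hypothetical.

```python
import contextlib
import io
import logging

log_output = io.StringIO()
logger = logging.getLogger("somef_logging_sketch")  # hypothetical logger name
logger.addHandler(logging.StreamHandler(log_output))
logger.setLevel(logging.ERROR)
logger.propagate = False

exc = IndexError("list index out of range")

# Mismatched call: one argument but no %-placeholder in the message.
# logging reports the formatting failure on stderr and the record is not written.
captured_stderr = io.StringIO()
with contextlib.redirect_stderr(captured_stderr):
    logger.error("Error while parsing headers", exc)
output_after_bad_call = log_output.getvalue()

# Matching call: the %s placeholder consumes the argument.
logger.error("Error while parsing headers: %s", exc)
output_after_good_call = log_output.getvalue()
```

With the mismatched call, the "--- Logging error ---" banner goes to stderr and the intended log line never reaches the handler's stream. Passing a placeholder for each argument (or using logger.exception inside the except block) keeps the original error visible.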

Suggested fix

mardown_parser.py line ~53

if not keys:
    return {}, none_header_content  # or whatever the appropriate empty return is
top = keys[0]
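The guard can be exercised in isolation. The sketch below uses a hypothetical stand-in function, not the SOMEF implementation: the real signature, the per-header processing, and the appropriate empty return value of extract_content_per_header are not shown in this report and may differ.

```python
def extract_content_per_header_sketch(content_by_header, none_header_content=""):
    """Hypothetical stand-in, NOT the SOMEF implementation.

    content_by_header: dict mapping header text to the content under it.
    Returns a (content, none_header_content) tuple, mirroring the tuple
    unpacked in the traceback.
    """
    keys = list(content_by_header.keys())
    if not keys:
        # Nothing parseable: return an empty result instead of indexing keys[0].
        return {}, none_header_content
    top = keys[0]
    # Placeholder for the real per-header processing, which is not shown here.
    return {top: content_by_header[top]}, none_header_content


# An empty input no longer raises; a populated input behaves as before.
print(extract_content_per_header_sketch({}))
print(extract_content_per_header_sketch({"Usage": "run it"}, "intro"))
```

A regression test along these lines, fed with a README that contains only badge or image lines, would confirm that the empty case returns cleanly.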

Labels: bug