How to print a statement to the user when a list index is out of range when using the section object from python-docx #1079

@madden99

I am working on some code to extract the footer and header from a Word document. I am using Python 3.10.4 and the python-docx library (version 0.8.1.1) in PyCharm (Community Edition 2021.3.3). Only the footer and header from the second section are extracted. This works as long as the document has two sections (created via 1x Section Break (Next Page)).

If a document that does not have a minimum of two sections is accessed by the code, an error message is displayed by the console. This message is as follows: IndexError: list index out of range (the full message is shown directly below this paragraph). This is as expected as the document does not have the required number of sections for the code to execute properly.

Traceback (most recent call last):
  File "C:\Users\madden99\PycharmProjects\Test3\T3_header_footer.py", line 23, in <module>
    section2 = WordFile.sections[1]  # access the 2nd section
  File "C:\Users\madden99\PycharmProjects\Test3\venv\lib\site-packages\docx\section.py", line 30, in __getitem__
    return Section(self._document_elm.sectPr_lst[key], self._document_part)
IndexError: list index out of range

According to the python-docx documentation, in this situation one should apply "a len() check or try block to avoid an uncaught IndexError exception stopping your program". I have tried both approaches with the code shown directly below, but the error message still occurs:

# note: the if statement and the try-except block were implemented in the code separately
if len(WordFile.sections) != 3:
    print("Wrong number of sections in document")

try:
    section2 == WordFile.section[0]
except IndexError as error:
    print("Hey user, an error occurred!")

I would like to know whether it is possible to print a message (e.g. "Doc has less than 2 sections") to the user when the document causes such an error. Any form of help would be appreciated. If there are any questions about the code, please ask. The code used to extract the header and footer is shown directly below this sentence.

from docx import Document

# obtain file name from user
filename = input("Enter file name (e.g. my_file.docx): ")
print()
# access Word document file
WordFile = Document(filename)

# access the 2nd section
# [1] refers to the second section of the Word doc
section2 = WordFile.sections[1]

# loop through the 2nd section and print the header
header2 = section2.header
section2header = sorted(set())
for paragraph in header2.paragraphs:
    for run in paragraph.runs:
        # check for duplicates and append unique values to the sorted list section2header
        if paragraph.text not in section2header:
            section2header.append(paragraph.text)
print("Section 2 Header:", section2header)

# loop through the 2nd section and print the footer
footer2 = section2.footer
section2footer = sorted(set())
for paragraph in footer2.paragraphs:
    for run in paragraph.runs:
        # check for duplicates and append unique values to the sorted list section2footer
        if paragraph.text not in section2footer:
            section2footer.append(paragraph.text)
print("Section 2 Footer:", section2footer)
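For reference, here is a minimal sketch of how the "len() check or try block" advice quoted above might be applied. The helper names (`get_section_or_none`, `get_section_try`, `second_section_message`) are hypothetical and not part of python-docx. Plain lists stand in for `WordFile.sections`, on the assumption that it supports `len()` and indexing as the documentation quote implies. Two details in the attempts above may be worth noting: the `if` statement only prints and then lets the script carry on to the failing index access, and the `try` block uses `==` (comparison) rather than `=` (assignment) and reads `.section` rather than `.sections`.

```python
def get_section_or_none(sections, index):
    """len() check: return sections[index], or None if there are too few items."""
    if len(sections) <= index:
        return None
    return sections[index]


def get_section_try(sections, index):
    """try block: same result, but lets the lookup fail and catches IndexError."""
    try:
        return sections[index]
    except IndexError:
        return None


def second_section_message(sections):
    """Build the message to show the user, depending on whether section 2 exists."""
    section2 = get_section_or_none(sections, 1)  # [1] is the second section
    if section2 is None:
        return "Doc has less than 2 sections"
    return "Second section found"


# Plain lists stand in for a document's sections collection here.
print(second_section_message(["section 1"]))
print(second_section_message(["section 1", "section 2"]))
```

In the script above, the same idea would presumably mean calling `get_section_or_none(WordFile.sections, 1)` in place of `WordFile.sections[1]`, and running the header and footer loops only when the result is not `None`.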
