_next_numId takes too long when generating a big document #940

@MagicPwn

In my use case I have to generate a long document with many tables. During the process, I found that the time it takes to insert a table into the document gradually increases. I ran cProfile, and it shows that the time spent in docx/oxml/numbering.py:121(_next_numId) keeps growing until it reaches an unacceptable level.
After 2000 tables have been added to the document instance, inserting the next 100 tables takes 80 seconds, of which _next_numId accounts for 42 seconds.
_next_numId is coded as:

    @property
    def _next_numId(self):
        """
        The first ``numId`` unused by a ``<w:num>`` element, starting at
        1 and filling any gaps in numbering between existing ``<w:num>``
        elements.
        """
        numId_strs = self.xpath('./w:num/@w:numId')
        num_ids = [int(numId_str) for numId_str in numId_strs]
        for num in range(1, len(num_ids)+2):
            if num not in num_ids:
                break
        return num

When I change it to

    @property
    def _next_numId(self):
        numId_strs = self.xpath('./w:num/@w:numId')
        num_ids = [int(numId_str) for numId_str in numId_strs]
        return max(num_ids)+1

the increase in time cost is greatly reduced.
But I don't know whether the "filling any gaps in numbering between existing <w:num> elements" logic is needed elsewhere.
For my use case, I only need to generate a document.
If this change doesn't affect anything else, I think I can submit a PR soon.
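
If the gap-filling behavior does turn out to matter, a possible middle ground is to keep it and only change the membership test. In the original loop, `num not in num_ids` scans a list, so a single call can cost roughly O(n²) when the ids are dense. The sketch below uses a set instead. It is a standalone illustration, not python-docx code: `first_unused_num_id` is a hypothetical helper name, and it takes a plain list of integers in place of the result of the `xpath` call.

```python
def first_unused_num_id(num_ids):
    """Return the smallest positive integer not present in num_ids.

    Hypothetical helper (not part of python-docx). It keeps the
    gap-filling semantics of the original _next_numId but uses a set
    so each membership test is O(1) on average instead of O(n).
    """
    used = set(num_ids)
    candidate = 1
    # At most len(used) + 1 iterations: one of 1..len(used)+1 must be free.
    while candidate in used:
        candidate += 1
    return candidate


# Gaps are still filled, and an empty numbering part yields 1.
print(first_unused_num_id([1, 2, 4]))
print(first_unused_num_id([]))
```

Unlike `max(num_ids)+1`, this also handles the case where there are no `<w:num>` elements yet, since `max()` raises `ValueError` on an empty list. The `xpath` query itself would still run on every call, so this would probably reduce the cost per insert rather than remove it entirely.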
