pdf2txt: clean up construction of LAParams from arguments #682

Merged on Jan 25, 2022 (13 commits)

Commits on Aug 16, 2021

  1. Fix pdf2txt --boxes-flow=disabled

    Fixes:
    ```
    $ pdf2txt.py --boxes-flow=disabled test.pdf
    Traceback (most recent call last):
      File "tools/pdf2txt.py", line 204, in <module>
        sys.exit(main())
      File "tools/pdf2txt.py", line 198, in main
        outfp = extract_text(**vars(A))
      File "tools/pdf2txt.py", line 66, in extract_text
        pdfminer.high_level.extract_text_to_fp(fp, **locals())
      File "pdfminer/high_level.py", line 85, in extract_text_to_fp
        interpreter.process_page(page)
      File "pdfminer/pdfinterp.py", line 896, in process_page
        self.device.end_page(page)
      File "pdfminer/converter.py", line 51, in end_page
        self.cur_item.analyze(self.laparams)
      File "pdfminer/layout.py", line 822, in analyze
        group.analyze(laparams)
      File "pdfminer/layout.py", line 575, in analyze
        LTTextGroup.analyze(self, laparams)
      File "pdfminer/layout.py", line 362, in analyze
        obj.analyze(laparams)
      File "pdfminer/layout.py", line 575, in analyze
        LTTextGroup.analyze(self, laparams)
      File "pdfminer/layout.py", line 362, in analyze
        obj.analyze(laparams)
      File "pdfminer/layout.py", line 575, in analyze
        LTTextGroup.analyze(self, laparams)
      File "pdfminer/layout.py", line 362, in analyze
        obj.analyze(laparams)
      File "pdfminer/layout.py", line 577, in analyze
        self._objs.sort(
      File "pdfminer/layout.py", line 578, in <lambda>
        key=lambda obj: (1 - laparams.boxes_flow) * obj.x0
    TypeError: unsupported operand type(s) for -: 'int' and 'str'
    ```
    
    Related: Issue pdfminer#477, PR pdfminer#479
    0xabu committed Aug 16, 2021
    5e52132
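
The traceback above comes down to `--boxes-flow` reaching the layout code as the string `disabled`, so `1 - laparams.boxes_flow` fails with a `TypeError`. A minimal sketch of one way to convert the value at parse time, assuming a hypothetical converter named `float_or_disabled` and an illustrative default of `0.5` (neither is confirmed by this page):

```python
import argparse
from typing import Optional


def float_or_disabled(value: str) -> Optional[float]:
    """Convert a CLI string to a float, or None for the literal 'disabled'.

    Hypothetical helper for illustration; the name is an assumption.
    """
    if value.strip().lower() == "disabled":
        return None
    try:
        return float(value)
    except ValueError:
        raise argparse.ArgumentTypeError(f"invalid float value: {value!r}")


parser = argparse.ArgumentParser()
# The default here is illustrative only.
parser.add_argument("--boxes-flow", type=float_or_disabled, default=0.5)

args = parser.parse_args(["--boxes-flow=disabled"])
# args.boxes_flow is now None rather than the string "disabled", so code
# downstream can test `is None` instead of doing arithmetic on a str.
```

Converting inside the `type=` callable keeps the string form from ever leaving the argument parser, which is the point at which the crash shown above could have been avoided.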

Commits on Aug 18, 2021

  1. update CHANGELOG

    0xabu committed Aug 18, 2021
    fbb10b8

Commits on Sep 29, 2021

  1. d3d7d0e

Commits on Oct 9, 2021

  1. 1f90cc2

Commits on Oct 14, 2021

  1. 92749b7
  2. merge CHANGELOG

    0xabu committed Oct 14, 2021
    de280fa
  3. pdf2txt: clean up handling of layout parameter arguments

     * avoid specifying default values twice
     * construct LAParams earlier, rather than passing its components around
     * fix crash with --boxes_flow=disabled
    0xabu committed Oct 14, 2021
    896da5e
  4. update CHANGELOG

    0xabu committed Oct 14, 2021
    12e8a1b
  5. 60c949d
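
The first two bullets of the "clean up handling of layout parameter arguments" commit can be sketched as follows. This is an illustration under stated assumptions, not the code from the PR: `LayoutParams` is a hypothetical stand-in for `LAParams`, and its fields, default values, and the helper names are chosen only for the example.

```python
import argparse
from dataclasses import dataclass
from typing import Optional


@dataclass
class LayoutParams:
    """Hypothetical stand-in for LAParams; fields and defaults are illustrative."""
    line_margin: float = 0.5
    char_margin: float = 2.0
    boxes_flow: Optional[float] = 0.5


def float_or_none(value: str) -> Optional[float]:
    # Hypothetical converter: "disabled" becomes None, anything else a float.
    return None if value == "disabled" else float(value)


def build_parser() -> argparse.ArgumentParser:
    # Defaults are read from one instance, so they are not written out twice.
    defaults = LayoutParams()
    parser = argparse.ArgumentParser()
    parser.add_argument("--line-margin", type=float, default=defaults.line_margin)
    parser.add_argument("--char-margin", type=float, default=defaults.char_margin)
    parser.add_argument("--boxes-flow", type=float_or_none, default=defaults.boxes_flow)
    return parser


def extract(params: LayoutParams) -> str:
    # Downstream code receives one object instead of many loose keyword values.
    flow = "disabled" if params.boxes_flow is None else f"{params.boxes_flow:g}"
    return f"line_margin={params.line_margin:g} boxes_flow={flow}"


args = build_parser().parse_args(["--boxes-flow", "disabled"])
# Construct the parameter object right after parsing, then pass it along.
params = LayoutParams(args.line_margin, args.char_margin, args.boxes_flow)
summary = extract(params)
```

Reading the defaults from an instance of the parameter class means a change to a default only has to be made in one place.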

Commits on Jan 25, 2022

  1. 1fa998c
  2. Improve readability of setting LAParams by explicitly copying them from parsed_args into init of LAParams. And move all parsed_args post processing to the parse_args() method.

    pietermarsman committed Jan 25, 2022
    d8296f6
  3. 2215fa0
  4. c5b015c
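
The "Improve readability" commit describes two ideas: copying each layout value by name from `parsed_args` into the `LAParams` constructor, and keeping all post-processing inside `parse_args()`. A rough sketch of that shape, assuming a hypothetical `LayoutParams` stand-in and illustrative flag names (the actual fields and flags touched by the PR are not shown on this page):

```python
import argparse
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class LayoutParams:
    """Hypothetical stand-in for LAParams with illustrative fields."""
    char_margin: float = 2.0
    all_texts: bool = False


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    defaults = LayoutParams()
    parser = argparse.ArgumentParser()
    parser.add_argument("--no-laparams", action="store_true")
    parser.add_argument("--char-margin", type=float, default=defaults.char_margin)
    parser.add_argument("--all-texts", action="store_true", default=defaults.all_texts)
    parsed = parser.parse_args(argv)

    # All post-processing happens here, so callers get ready-to-use values.
    if parsed.no_laparams:
        parsed.laparams = None
    else:
        # Copy each field by name rather than forwarding **vars(parsed),
        # which makes it obvious which arguments feed the layout object.
        parsed.laparams = LayoutParams(
            char_margin=parsed.char_margin,
            all_texts=parsed.all_texts,
        )
    return parsed


with_layout = parse_args(["--char-margin", "3"])
without_layout = parse_args(["--no-laparams"])
```

Naming each keyword explicitly costs a few lines but avoids accidentally passing unrelated arguments into the constructor.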