Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix parse() arguments to match the documented types #77

Merged
merged 2 commits into from
May 23, 2024

Conversation

moshest
Copy link

@moshest moshest commented Jan 4, 2024

Problem

What problem are you trying to solve? What issue does this close?

The pdf2md say it will receive the following type:

@param {string|TypedArray|DocumentInitParameters|PDFDataRangeTransport} pdfBuffer
 * Passed to `pdfjs.getDocument()` to read a PDF document for conversion

However, it only works with BinaryData type.

Solution

How did you solve the problem?

I add a type check and make the first argument convert to docOptions depends on the type given:

if (typeof docOptions === "string" || docOptions instanceof URL) {
  docOptions = { url: docOptions };
} else if (docOptions instanceof ArrayBuffer || ArrayBuffer.isView(docOptions)) {
  docOptions = { data: docOptions };
}

This should match the docs of pdfjs:
https://mozilla.github.io/pdf.js/api/draft/module-pdfjsLib.html

Copy link

@LoneRifle LoneRifle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

Copy link

@LoneRifle LoneRifle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hang on. it's possible for docOptions to be a string but not a URL, ie, a binary string that represents the PDF blob itself. Could there be a stronger check to validate docOptions as something we can actually supply as a url to pdfjs?

@moshest
Copy link
Author

moshest commented Jan 11, 2024

I don't think we need to be holier than the pope:
https://github.com/mozilla/pdf.js/blob/12875359c387d7e2d312c50748833b3c52d986aa/src/display/api.js#L234

function getDocument(src) {
   // ...

   if (typeof src === "string" || src instanceof URL) {
      src = { url: src };
    } else if (isArrayBuffer(src)) {
      src = { data: src };
    }

    // ...
}

@LoneRifle LoneRifle merged commit b46fcb2 into opengovsg:master May 23, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants