Error when saving a pdf containing only an embedPng #307

Closed
jwoodrow opened this issue Jan 6, 2020 · 8 comments

@jwoodrow
Contributor

jwoodrow commented Jan 6, 2020

Hi @Hopding,

I've encountered a strange issue when using pdf-lib to "convert" images to PDF in my project. Usually this works fine, but for one specific image I'm getting an error in the catch handler of my pdf.save() call.

const PDFLib = require('pdf-lib'),
  fs = require('fs');
/**
 * Converts a given image to a pdf in a Promise
 * @param  {Object} file    Image file to convert to pdf.
 * @param  {string} newFile Path for the newly converted pdf file.
 * @return {Promise}        A promise for the completion of the conversion.
 */
const toPdfPromise = (file, newFile) => {
  return new Promise((resolve, reject) => {
    PDFLib.PDFDocument.create()
      .then((pdf) => {
        const page = pdf.addPage(PDFLib.PageSizes.A4);
        const imageUInt8Array = fs.readFileSync(file.path);
        let imagePromise;
        if (file.mimeType === 'image/jpeg') {
          imagePromise = pdf.embedJpg(imageUInt8Array);
        } else if (file.mimeType === 'image/png') {
          imagePromise = pdf.embedPng(imageUInt8Array);
        }
        if (typeof(imagePromise) !== 'undefined') {
          imagePromise
            .then((image) => {
              const [a4_width, a4_height] = PDFLib.PageSizes.A4;
              const dimensions = imageDimensionToFit(image, { width: a4_width, height: a4_height });
              page.drawImage(image, {
                x: (a4_width - dimensions.width) / 2,
                y: (a4_height - dimensions.height) / 2,
                width: dimensions.width,
                height: dimensions.height
              });
              pdf.save()
                .then((pdfBytes) => {
                  fs.writeFileSync(newFile, pdfBytes);
                  file.oldPath = file.path;
                  file.path = newFile;
                  file.extension = 'pdf';
                  file.mimeType = 'application/pdf';
                  resolve(file);
                })
                .catch((err) => {
                  console.error(new Error(`Problem converting ${file.filename} to PDF - Could not save`));
                  reject(err);
                });
            })
            .catch((err) => {
              console.error(new Error(`Problem converting ${file.filename} to PDF - Could not embed`));
              reject(err);
            });
        } else {
          console.error(new Error(`Problem converting ${file.filename} to PDF - Invalid file type`));
          reject(new Error('Not an image file format'));
        }
      })
      .catch((err) => {
        console.error(new Error(`Problem converting ${file.filename} to PDF - Could not create PDF`));
        reject(err);
      });
  });
};

Above is my code, which converts an input file to a PDF file (stored at newFile).

Here's some extra context in case it helps:

  • file is an object structured like this:
{
  path: 'path/to/image/file',
  mimeType: 'image/png', // for example
}

imageDimensionToFit scales an image down to fit inside the A4 page if it is too wide or too tall. It looks like this:

/**
 * Return valid dimensions for an image to fit a defined container size
 * @param  {Object} image         Image to fit in container.
 * @param  {Object} container     Container in which the image must fit.
 * @return {Object}               Dimensions the image must take to fit.
 */
const imageDimensionToFit = (image, container) => {
  if (Math.min(image.width, container.width) === image.width && Math.min(image.height, container.height) === image.height)
    return { width: image.width, height: image.height };
  const image_ratio = image.width / image.height;
  const container_ratio = container.width / container.height;
  if (container_ratio > image_ratio) {
    return {
      width: image.width * container.height / image.height,
      height: container.height
    };
  } else {
    return {
      width: container.width,
      height: image.height * container.width / image.width
    };
  }
};
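
To make the three cases of the helper concrete, here is a small self-contained sketch. It uses a condensed copy of imageDimensionToFit so it runs on its own; the A4 size is assumed to be 595.28 x 841.89 points (the values of PDFLib.PageSizes.A4), and the image sizes are hypothetical, chosen only for illustration.

```javascript
// Condensed copy of imageDimensionToFit so the example runs on its own.
const imageDimensionToFit = (image, container) => {
  // Image already fits: keep its original size.
  if (image.width <= container.width && image.height <= container.height)
    return { width: image.width, height: image.height };
  const imageRatio = image.width / image.height;
  const containerRatio = container.width / container.height;
  // Container is relatively wider than the image: height is the limiting side.
  // Otherwise width is the limiting side.
  return containerRatio > imageRatio
    ? { width: image.width * container.height / image.height, height: container.height }
    : { width: container.width, height: image.height * container.width / image.width };
};

// Assumed A4 size in PDF points.
const a4 = { width: 595.28, height: 841.89 };

// Hypothetical images, for illustration only.
const small = imageDimensionToFit({ width: 300, height: 200 }, a4);      // fits, left unchanged
const landscape = imageDimensionToFit({ width: 1200, height: 600 }, a4); // limited by page width
const portrait = imageDimensionToFit({ width: 600, height: 2400 }, a4);  // limited by page height

console.log(small, landscape, portrait);
```

The aspect ratio is preserved in every case; only the limiting side is pinned to the page size.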

Since this is a professional project and this is happening with a client's file, I would prefer to send you the file privately if possible/needed (hopefully I'm not using pdf-lib in a wrong way haha).

@jwoodrow jwoodrow changed the title Error When saving a pdf containing only an embedPng Error when saving a pdf containing only an embedPng Jan 6, 2020
@Hopding
Owner

Hopding commented Jan 6, 2020

@jwoodrow What exactly is the error you're encountering? Can you share the error message and stacktrace?

@jwoodrow
Contributor Author

jwoodrow commented Jan 6, 2020

@Hopding Completely forgot the stack trace, my bad!

2|archival-service  | [2020-01-06T13:19:30.811Z] Error: Invalid filter algorithm: 28
2|archival-service  |     at PNG.decodePixels (/space/www/archival-service/node_modules/png-ts/lib/png.js:142:31)
2|archival-service  |     at PngEmbedder.<anonymous> (/space/www/archival-service/node_modules/pdf-lib/cjs/core/embedders/PngEmbedder.js:117:37)
2|archival-service  |     at step (/space/www/archival-service/node_modules/tslib/tslib.js:136:27)
2|archival-service  |     at Object.next (/space/www/archival-service/node_modules/tslib/tslib.js:117:57)
2|archival-service  |     at /space/www/archival-service/node_modules/tslib/tslib.js:110:75
2|archival-service  |     at new Promise (<anonymous>)
2|archival-service  |     at Object.__awaiter (/space/www/archival-service/node_modules/tslib/tslib.js:106:16)
2|archival-service  |     at PngEmbedder.splitAlphaChannel (/space/www/archival-service/node_modules/pdf-lib/cjs/core/embedders/PngEmbedder.js:113:24)
2|archival-service  |     at PngEmbedder.<anonymous> (/space/www/archival-service/node_modules/pdf-lib/cjs/core/embedders/PngEmbedder.js:54:51)
2|archival-service  |     at step (/space/www/archival-service/node_modules/tslib/tslib.js:136:27)
2|archival-service  |     at Object.next (/space/www/archival-service/node_modules/tslib/tslib.js:117:57)
2|archival-service  |     at /space/www/archival-service/node_modules/tslib/tslib.js:110:75
2|archival-service  |     at new Promise (<anonymous>)
2|archival-service  |     at Object.__awaiter (/space/www/archival-service/node_modules/tslib/tslib.js:106:16)
2|archival-service  |     at PngEmbedder.embedIntoContext (/space/www/archival-service/node_modules/pdf-lib/cjs/core/embedders/PngEmbedder.js:41:24)
2|archival-service  |     at PDFImage.<anonymous> (/space/www/archival-service/node_modules/pdf-lib/cjs/api/PDFImage.js:70:60)
2|archival-service  |     at step (/space/www/archival-service/node_modules/tslib/tslib.js:136:27)
2|archival-service  |     at Object.next (/space/www/archival-service/node_modules/tslib/tslib.js:117:57)
2|archival-service  |     at /space/www/archival-service/node_modules/tslib/tslib.js:110:75
2|archival-service  |     at new Promise (<anonymous>)
2|archival-service  |     at Object.__awaiter (/space/www/archival-service/node_modules/tslib/tslib.js:106:16)
2|archival-service  |     at PDFImage.embed (/space/www/archival-service/node_modules/pdf-lib/cjs/api/PDFImage.js:65:24)

EDIT:
And here's my custom error, which confirms where the error is happening (in the save catch):

2|archival-service  | [2020-01-06T13:19:30.803Z] Error: Problem converting [redacted] to PDF - Could not save
2|archival-service  |     at /space/www/archival-service/modules/images_to_pdf.js:75:38
2|archival-service  |     at runMicrotasks (<anonymous>)
2|archival-service  |     at processTicksAndRejections (internal/process/task_queues.js:93:5)

@jwoodrow
Contributor Author

jwoodrow commented Jan 6, 2020

@Hopding

I use file-type to get the MIME type that I set in the file's mimeType field (to avoid misinterpreting files a user may have renamed, from jpeg to png for example).
So I think the image file itself should be valid, which adds to my confusion because this has been working fine for the past few months (I've been using pdf-lib to merge PDFs and images into a single file for export).

@Hopding
Owner

Hopding commented Jan 7, 2020

I traced the error down to this line in png-ts:

default: {
  throw new Error(`Invalid filter algorithm: ${data[pos - 1]}`);
}

https://github.com/Hopding/png-ts/blob/master/src/png.ts#L302

png-ts is a fork of png.js that I created for use in pdf-lib. I found a few issues on png.js (and pdfkit, which is a dependent) reporting similar errors. They traced it down to the lack of support for interlaced images in png.js. So I suspect if you analyzed the image you're having trouble embedding, you would find that it is interlaced. (@jwoodrow please verify whether or not this is the case).

It looks like at the time I forked png.js it did not have support for interlaced PNG images. However, support for them has since been added: foliojs/png.js@60f296a. It should be fairly straightforward to pull these changes into png-ts. It would mostly involve porting the JS code to TypeScript. Then a new version of png-ts can be released, and pdf-lib can pull it in.

Or, alternatively, it might be better to switch to a different library entirely. For example, upng-js and pngjs both look like more widely used and better maintained libraries.

I'll work on this as soon as I have time. But it may be several days before I do. If anybody would like to work on this sooner, I'm happy to answer questions. I haven't dug into this any further than I've shared above, but my initial inclination is to swap out png-ts for upng-js and just add some type declarations for it (assuming upng-js provides the necessary functionality).

@Hopding Hopding added the bug label Jan 7, 2020
@jwoodrow
Contributor Author

jwoodrow commented Jan 7, 2020

Hi @Hopding, thanks for looking into this. You are perfectly correct:

file problematic_file.png
# => problematic_file.png: PNG image data, 750 x 1334, 16-bit/color RGBA, interlaced

In the meantime I'll try to use a library to detect interlaced images and modify them (users of my project should not be using interlaced images in the first place, so I'll address this issue beforehand). Once the PNG library has been changed, I'll undo this tweak so I can use pdf-lib without any extra preprocessing.
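
A minimal sketch of the detection step, assuming the file has been read into a Node Buffer (for example with fs.readFileSync). `isInterlacedPng` is a hypothetical helper name, not part of pdf-lib or png-ts. It only reads the interlace byte from the PNG header and does not re-encode anything, so converting the image to a non-interlaced one would still need a separate tool.

```javascript
// The 8-byte signature that starts every PNG file.
const PNG_SIGNATURE = Buffer.from([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a]);

/**
 * Hypothetical helper: reports whether a PNG buffer is interlaced.
 * @param  {Buffer} bytes  Raw file contents.
 * @return {boolean}       True when the IHDR interlace method is 1 (Adam7).
 */
const isInterlacedPng = (bytes) => {
  // signature (8) + chunk length (4) + chunk type (4) + IHDR data (13) = 29 bytes
  if (bytes.length < 29) throw new Error('Too short to be a PNG');
  if (Buffer.compare(bytes.subarray(0, 8), PNG_SIGNATURE) !== 0) {
    throw new Error('Not a PNG file');
  }
  if (bytes.toString('ascii', 12, 16) !== 'IHDR') {
    throw new Error('Missing IHDR chunk');
  }
  // IHDR data starts at offset 16: width (4), height (4), bit depth (1),
  // color type (1), compression (1), filter (1), interlace (1).
  // The interlace byte therefore sits at offset 28.
  return bytes[28] === 1;
};
```

In the conversion code this could be called on the bytes read from file.path before embedPng, rejecting or rerouting the file when it returns true.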

@Hopding
Owner

Hopding commented Feb 18, 2020

@jwoodrow Just wanted to update you on this. Today I finished swapping out png-ts for upng-js (see the UpdatePngLib branch). It has worked quite nicely thus far. I need to add more tests and then I plan to merge it. This fix should go out in the next release of pdf-lib.

Are you able to share any of the images you ran into trouble with? I'd like to include them in my test suite. Or at least test them on my new branch to make sure it will resolve the issue for you.

@Hopding Hopding mentioned this issue Feb 21, 2020
@Hopding
Owner

Hopding commented Feb 23, 2020

@jwoodrow Version 1.3.2 is now published. It contains the fix for this issue. The full release notes are available here.

You can install this new version with npm:

npm install pdf-lib@1.3.2

It's also available on unpkg:

As well as jsDelivr:

@Hopding Hopding closed this as completed Feb 23, 2020
@MuhammadAbbasAkhtar

When I use pdf-lib@1.3.2, I get this error:

{
  "type": "TypeError",
  "message": "pdfDoc.getForm is not a function",
  "stack": "TypeError: pdfDoc.getForm is not a function"
}
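
The TypeError indicates that the PDFDocument object in this version has no getForm method. The assumption here, not confirmed anywhere in this thread, is that the form API only exists in later pdf-lib releases, in which case upgrading would be the real fix. Until then, a feature check along these lines avoids the crash; `getFormIfSupported` is a hypothetical helper name.

```javascript
/**
 * Hypothetical guard: returns the document's form, or null when the
 * installed pdf-lib build does not expose getForm().
 * @param  {Object} pdfDoc  A PDFDocument instance (or any object).
 * @return {Object|null}    The form, or null if getForm is unavailable.
 */
const getFormIfSupported = (pdfDoc) =>
  typeof pdfDoc.getForm === 'function' ? pdfDoc.getForm() : null;
```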
