Next JS ReferenceError: nodeUtil is not defined #303

ajith-ab · 2023-06-12T18:15:42Z

Hi all,
Parsing a PDF in Next.js throws the following error: ReferenceError: nodeUtil is not defined. Does anyone have any idea what causes this?

{
  "pdf2json": "^3.0.4",
  "next": "13.4.5"
}

Thanks in advance.

mattb-prg · 2023-06-18T23:32:42Z

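Try adding this to your Next.js config (next.config.js):

const nextConfig = {
  experimental: {
    // Exclude pdf2json from server bundling so it is loaded with Node's own require.
    serverComponentsExternalPackages: ['pdf2json'],
  },
};

module.exports = nextConfig;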

OferElfassi · 2023-06-21T13:57:04Z

It's not an ideal solution, but it works for me.
I had the same issue when using this library in an Electron app.
I ended up replacing each nodeUtil.p2j* call with the matching console.* call (warn, error, info, log).

As a result, the library functions as expected, but the console output is not as nice as it was before.
I disable console output in production anyway, so that's not a problem for me.

The modified version can be found here: https://github.com/OferElfassi/pdf2jsonForElectron.git
Alternatively, you can install it directly:
yarn add https://github.com/OferElfassi/pdf2jsonForElectron.git

dprothero · 2023-07-04T00:36:36Z

@OferElfassi I came here with the same issue with Electron! Thanks for sharing your module.

dprothero · 2023-07-08T21:22:50Z

@OferElfassi your solution builds in Electron, but have you actually successfully run pdf2json from within Electron? I'm trying to run it from the main process and it just hangs, never firing pdfParser_dataReady nor pdfParser_dataError.

const pdfData: PdfData = await new Promise<PdfData>((resolve, reject) => {
  const pdfParser = new PDFParser();
  pdfParser.on("pdfParser_dataError", (errData: unknown) => {
    reject(errData);
  });
  pdfParser.on("pdfParser_dataReady", async (pdfData: PdfData) => {
    resolve(pdfData);
  });
  pdfParser.loadPDF(localPdf);
});

The console output shows as follows:

Load OK: C:\Users\david\Dropbox\DnD\Campaigns\Empire of the Chromatic Conclave\NPCs\zombie-minion.pdf
Warning: Setting up fake worker.
PDF loaded. pagesCount = 1
start to parse page:1
Skipped: tiny fill: 0 x 0

I can run the very same code in a stand-alone node process and it works just fine on the same PDF file with the following output:

Load OK: C:\Users\david\Dropbox\DnD\Campaigns\Empire of the Chromatic Conclave\NPCs\zombie-minion.pdf
Warning: Setting up fake worker.
PDF loaded. pagesCount = 1
start to parse page:1
Skipped: tiny fill: 0 x 0
Success: Page 1
complete parsing page:1

Conspicuously absent from the Electron debug output are the last two lines, which show successful parsing of the page.

So, running it from Electron, it's hanging somewhere in parsing the PDF. 😢
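One way to keep a stalled parse from blocking the caller forever is to race the two parser events against a timer. The following is a minimal sketch, not a fix for the underlying hang: the `ParserLike` interface, the `parseWithTimeout` helper, and the `start` callback are hypothetical names introduced here, and only the two event names come from the snippet above.

```typescript
// Minimal shape this sketch relies on (an assumption, not pdf2json's typings).
interface ParserLike {
  on(event: string, listener: (payload: unknown) => void): unknown;
}

// Resolve with the dataReady payload, reject on dataError, and reject with a
// timeout error if neither event fires within timeoutMs.
function parseWithTimeout<T>(
  parser: ParserLike,
  start: () => void,
  timeoutMs: number,
): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => {
      reject(new Error(`PDF parsing timed out after ${timeoutMs} ms`));
    }, timeoutMs);

    parser.on("pdfParser_dataError", (errData) => {
      clearTimeout(timer);
      reject(errData);
    });
    parser.on("pdfParser_dataReady", (pdfData) => {
      clearTimeout(timer);
      resolve(pdfData as T);
    });

    // Start parsing only after both listeners are attached.
    try {
      start();
    } catch (err) {
      clearTimeout(timer);
      reject(err);
    }
  });
}
```

With the code above, the call would become something like `parseWithTimeout<PdfData>(pdfParser, () => pdfParser.loadPDF(localPdf), 30000)`. The timeout only unblocks the awaiting code; it does not stop whatever work the parser is still doing in the background.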

OferElfassi · 2023-07-08T23:06:20Z

Hey @dprothero, I'm glad my solution helped you.
I did use this in the main process in Electron, and it worked fine.
Here is the exact code:

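import fs from "fs";
import PDFParser from "pdf2json"; // or the modified build, depending on what is installed

const extractFromPdf = (pdfPath: string): Promise<string[]> => {
    return new Promise((resolve, reject) => {
        const pdfParser = new PDFParser();
        // Reject the promise if the parser reports an error.
        pdfParser.on('pdfParser_dataError', errData => {
            console.error(errData);
            reject(errData);
        });
        // Resolve once the whole document has been parsed.
        pdfParser.on('pdfParser_dataReady', pdfData => {
            const processedData: string[] = [];
            // ... code to extract text and other content from pdfData ...

            resolve(processedData);
        });
        // Read the file up front and hand the parser a buffer instead of a path.
        const dataBuffer = fs.readFileSync(pdfPath);
        pdfParser.parseBuffer(dataBuffer);
    });
};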

However, I still had some errors in production, so I ended up using the library as a standalone module, outside package.json. You can find the modified version here:
https://github.com/OferElfassi/pdf2json_standalone.git
You will also need to install the "@xmldom/xmldom" package to make it work.
I copied the folder into my project and used it like this:

import PDFParser from "./pdf2json_standalone/pdfparser";
// ... the rest of the code is the same as above.

Hope this helps ☺
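For readers wondering what the elided extraction step inside the pdfParser_dataReady handler might look like, here is a minimal sketch. It assumes the payload exposes `Pages`, each with `Texts`, each with runs `R` whose `T` field holds URI-encoded text; that shape and the `extractTextBlocks` name are assumptions for illustration, not taken from this thread, so check them against the output of the pdf2json version you run.

```typescript
// Assumed minimal shape of the dataReady payload (only the fields used here).
interface TextRun { T: string }
interface TextBlock { R: TextRun[] }
interface Page { Texts: TextBlock[] }
interface PdfData { Pages: Page[] }

// Return one decoded string per text block, in document order.
function extractTextBlocks(pdfData: PdfData): string[] {
  const blocks: string[] = [];
  for (const page of pdfData.Pages) {
    for (const text of page.Texts) {
      // Each block can consist of several runs; decode and join them.
      const joined = text.R.map((run) => decodeURIComponent(run.T)).join("");
      blocks.push(joined);
    }
  }
  return blocks;
}
```

Under those assumptions, the handler body would reduce to `resolve(extractTextBlocks(pdfData))`. Note that `decodeURIComponent` throws on malformed escape sequences, so a production version may want to guard that call.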

darklight9811 · 2023-10-10T01:58:15Z

Quoting @mattb-prg: "Try adding this to your nextjs config:"

const nextConfig = {
  experimental: {
    serverComponentsExternalPackages: ['pdf2json'],
  },
};

With that change in place, it now throws the following error:

 ⨯ src\server\generation\index.ts (127:17) @ eval
 ⨯ TypeError: pdf2json__WEBPACK_IMPORTED_MODULE_1__.default is not a constructor
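An error of the form "default is not a constructor" often points to a CommonJS/ESM interop mismatch, where the class ends up on the module object itself or one `.default` level deeper than the import expects. Whether that is the cause here is not confirmed in this thread. The sketch below shows one defensive way to locate the constructor; `resolveConstructor` and `Ctor` are hypothetical names introduced for illustration.

```typescript
type Ctor<T> = new (...args: unknown[]) => T;

// Check the module itself, then up to two levels of `.default`, and return
// the first function found.
function resolveConstructor<T>(mod: unknown): Ctor<T> {
  let candidate: unknown = mod;
  for (let depth = 0; depth < 3; depth++) {
    if (typeof candidate === "function") {
      return candidate as Ctor<T>;
    }
    if (candidate === null || typeof candidate !== "object") {
      break;
    }
    candidate = (candidate as { default?: unknown }).default;
  }
  throw new TypeError("No constructor found on module or its default export");
}
```

It would be used by importing the whole module (for example `import * as pdf2json from "pdf2json"`) and then calling `new (resolveConstructor(pdf2json))()`. This only works around the import shape; it does not address the hangs discussed elsewhere in the thread.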

niemal · 2023-10-27T22:12:16Z

Quoting @dprothero's comment of 2023-07-08 above (pdf2json hangs in the Electron main process after "Skipped: tiny fill: 0 x 0" and never fires pdfParser_dataReady or pdfParser_dataError):

I am getting the same behavior on specific PDF files. @dprothero, did you fix it? I am also using the approach @OferElfassi suggested; it occasionally works, but most of the time it just hangs. Perhaps pdf2json has been updated since, and the standalone version needs updating too? I would appreciate it if you took a look!

dprothero · 2023-10-30T01:30:55Z

@niemal I ran into the same issue... it would work with some PDFs and hang with others.

I switched to pdf-parse-fork and it's been working great.

niemal · 2023-10-30T11:59:48Z

Yeah, it turns out that's a memory leak, and the whole thing is unusable in production. pdf-parse seems to be much more resilient.

kleenkanteen · 2023-12-08T06:58:10Z

@niemal how do you know it's a memory leak?

jscardona12 · 2023-12-20T18:30:24Z

@darklight9811 were you able to fix this issue?

modesty · 2024-05-04T19:10:37Z

Please try v3.1.2.

niemal mentioned this issue on Aug 12, 2023, in adrienjoly/npm-pdfreader#135 (open): "It is not supporting the TS type declaration."

modesty closed this as completed May 4, 2024
