
Unable to decompress large file #103

Closed
pquoctuanno1 opened this issue Dec 5, 2022 · 12 comments

Comments

@pquoctuanno1

I am extracting a data file of about 5 GB. I tried to filter out only the files I need, but after unzipping for a while I get the following message.

[screenshot of the error]

@pquoctuanno1
Author

pquoctuanno1 commented Dec 5, 2022

import path from "path";
import fs from "fs-extra";
import unrar from "node-unrar-js";

export default class RAR {
  filePath: string;

  constructor(_filePath: string) {
    this.filePath = _filePath;
  }

  // Reads the entire archive into one in-memory Buffer.
  readStreamFile = (_filePath: string): Promise<Buffer> =>
    new Promise((resolve, reject) => {
      let buff = Buffer.alloc(0);
      const readStream = fs.createReadStream(_filePath, {
        highWaterMark: 128 * 1024 * 1024,
      });
      readStream.on("data", (nextBuff) => {
        buff = Buffer.concat([buff, nextBuff as Buffer]);
      });
      readStream.on("end", () => resolve(buff));
      readStream.on("error", (error) => reject(error));
    });

  extractAllTo = async (targetPath: string, options?: IOptions) => {
    try {
      const buff = await this.readStreamFile(this.filePath);
      const extractor = await unrar.createExtractorFromData({
        data: Uint8Array.from(buff).buffer,
        password: options?.password,
      });
      const { files } = extractor.extract({
        // Only extract plain .txt files; skip directories.
        files: (fileHeader) =>
          !fileHeader.flags.directory && /\.txt$/.test(fileHeader.name),
      });
      for (const { fileHeader, extraction } of files) {
        try {
          if (fileHeader.flags.directory) {
            fs.ensureDirSync(path.join(targetPath, fileHeader.name));
          } else {
            fs.outputFileSync(
              path.join(targetPath, fileHeader.name),
              extraction
            );
          }
        } catch {
          // Ignore failures on individual entries.
        }
      }
      return true;
    } catch {
      return false;
    }
  };
}

export interface IOptions {
  password?: string;
}

@YuJianrong
Owner

Quoting the Node.js docs:

EPIPE: A write on a pipe, socket, or FIFO for which there is no process to read the data. Commonly encountered at the net and http layers, indicative that the remote side of the stream being written to has been closed.

I guess it's because the file is too big to fit in memory. You can try downloading the file to the file system, then use createExtractorFromFile instead of createExtractorFromData to extract the files. That API does not load all the contents into memory.
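A minimal sketch of the suggested file-based approach, assuming node-unrar-js's documented `createExtractorFromFile` options (`filepath`, `targetPath`, optional `password`); `extractLarge` and the path arguments are hypothetical names, not from the thread:

```javascript
// Sketch of the file-based approach suggested above: let node-unrar-js
// read the archive from disk and write entries straight to targetPath,
// instead of buffering the whole archive in one Buffer.
async function extractLarge(archivePath, targetPath, password) {
  // Load the library lazily so the sketch only needs it when actually run.
  const { createExtractorFromFile } = await import("node-unrar-js");
  const extractor = await createExtractorFromFile({
    filepath: archivePath,
    targetPath,
    password, // optional
  });
  // extract() returns lazy generators; iterating `files` performs the
  // extraction to disk, one entry at a time.
  const { files } = extractor.extract();
  for (const { fileHeader } of files) {
    console.log(`extracted: ${fileHeader.name}`);
  }
}
```

Because entries are written to disk as the generator is consumed, no multi-gigabyte Buffer ever has to exist in the Node process.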

@pquoctuanno1
Author

> I guess it's because the file is too big to fit in memory. You can try downloading the file to the file system, then use createExtractorFromFile instead of createExtractorFromData to extract the files. That API does not load all the contents into memory.

Your idea worked for me, thanks for that.

@pquoctuanno1
Author

Sorry, I checked again. After doing as you said, it only works if the extracted data is less than 2 GB. I tried extracting a 1.9 GB file whose contents are over 2 GB after unzipping, and I got the earlier error again.

[screenshot of the error]

@pquoctuanno1
Author

I have tried decompressing quite a few files; whenever the data after decompression is larger than 2 GB, the same thing happens. I also tried increasing the heap size with the command below, but it still errors.

pm2 start dist/index.mjs --node-args="--max-old-space-size=32768"

[screenshots of the error]

@YuJianrong
Owner

Why is the error in the call stack thrown from request, which is not even imported in your code?
Can you remove all the unnecessary layers (like running under pm2) to get the real call stack?

@pquoctuanno1
Author

These errors occur when I run it under pm2.

[screenshots of the pm2 error output]

Repository owner deleted a comment from Jianrong-Yu Dec 5, 2022
@YuJianrong
Owner

It's still wrapped by a third-party dependency, node_modules/request. Can you run it without pm2 to see what the error looks like?

@pquoctuanno1
Author

> It's still wrapped by a third-party dependency, node_modules/request. Can you run it without pm2 to see what the error looks like?

I'll run it again in development mode and be right back. Please stay around, bro, I need it urgently.

@pquoctuanno1
Author

I really don't know where it went wrong @@
[screenshots of the error]

@YuJianrong
Owner

Why is it wrapped by Vite this time? Can you just run a small example with tsx, or with pure Node + JavaScript?

Like test.mjs:

import RAR from 'your-module';
const rar = new RAR('largeFile.rar');
await rar.extractAllTo('targetPath');

and run it directly with node.

@pquoctuanno1
Author

> Why is it wrapped by Vite this time? Can you just run a small example with tsx, or with pure Node + JavaScript?

I tried what you said, and it works. I think the problem is in my own code.
