When I tried to run the pipeline, paper.csv was generated from Miner-Papertxt (about 2.2G). The resulting paper.csv grew past 1.7T, but my computer has only about 2T of storage, so the run fails every time. Do you know how to fix this?
I'm surprised it's so big. I didn't catalog file sizes, but I don't remember anything being even close to 1T in size. IIRC, I was able to store everything on a machine with only 500G. It's been a while though, so I may be misremembering.
You could try modifying the code that writes the file to compress it first. I think pandas supports writing in compressed formats via extra kwargs.
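A minimal sketch of that suggestion, assuming the pipeline writes paper.csv with `DataFrame.to_csv`. The helper names, column names, and sample data below are hypothetical and not taken from the project; only the `compression` and `chunksize` keyword arguments are real pandas options. How much space this saves depends on how repetitive the data is, so it may or may not bring a 1.7T file down to a workable size.

```python
import os
import tempfile

import pandas as pd


def write_compressed_csv(df: pd.DataFrame, path: str) -> None:
    """Write df as a gzip-compressed CSV (hypothetical helper)."""
    df.to_csv(path, index=False, compression="gzip")


def read_compressed_csv(path: str, chunksize: int = 100_000):
    """Yield DataFrame chunks so the whole file never sits in memory."""
    yield from pd.read_csv(path, compression="gzip", chunksize=chunksize)


if __name__ == "__main__":
    # Stand-in data; the real paper.csv columns are not known here.
    df = pd.DataFrame(
        {"paper_id": range(1000), "text": ["lorem ipsum " * 50] * 1000}
    )
    with tempfile.TemporaryDirectory() as tmp:
        plain = os.path.join(tmp, "paper.csv")
        packed = os.path.join(tmp, "paper.csv.gz")
        df.to_csv(plain, index=False)
        write_compressed_csv(df, packed)
        # Compare on-disk sizes of the plain and compressed files.
        print(os.path.getsize(plain), os.path.getsize(packed))
```

Any later step that reads paper.csv would need the same `compression` argument (or a `.gz` file name, which pandas infers by default), and reading in chunks keeps memory use bounded for a file this large.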