Is this planned, or is it already possible somehow? Since paratext will most likely be used for reading large CSV files that might not fit in memory (pandas is usually already fast enough for small ones), this would be very useful in my opinion.
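For reference, the kind of API being requested could be sketched in pure Python like this. This is only an illustrative sketch of the chunked-reading pattern (the function name `read_csv_in_chunks` is hypothetical, not part of paratext), using the standard-library `csv` module so that only `chunksize` rows are ever held in memory at once:

```python
import csv
from itertools import islice

def read_csv_in_chunks(path, chunksize):
    """Hypothetical sketch: yield (header, rows) batches of at most
    `chunksize` rows each, so the whole file never needs to fit in memory."""
    with open(path, newline="") as f:
        reader = csv.reader(f)
        header = next(reader)          # read the header row once
        while True:
            rows = list(islice(reader, chunksize))  # pull the next batch
            if not rows:
                break
            yield header, rows
```

A caller would then iterate over the chunks much like over pandas' `TextFileReader`, processing each batch and discarding it before the next is read.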
Thank you for your feature request. I completely agree that this would be very useful, and it is doable. It would require changes at the C++ level.
It would involve first running the ParaText::Generic::Chunker as before, but finding num_chunks*num_threads chunks instead. Then, spawning num_threads threads over the first set of chunks, returning the results back to C++, and proceeding to the next batch of num_threads chunks, repeating this process num_chunks times. We will add this feature to the roadmap.
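The batching scheme described above can be sketched roughly as follows. This is Python pseudocode standing in for the C++ logic, not paratext's actual implementation: `parse_chunk` is a hypothetical placeholder for the per-chunk parsing work, and the chunks are assumed to have already been produced by the chunker.

```python
from concurrent.futures import ThreadPoolExecutor

def parse_chunk(chunk):
    # Placeholder for the real per-chunk parsing done at the C++ level.
    # Here we just sum the chunk so the control flow is observable.
    return sum(chunk)

def read_in_batches(chunks, num_threads):
    """Process num_chunks * num_threads chunks in batches of num_threads:
    each batch runs in parallel, and the next batch starts only after the
    previous one has returned its results."""
    results = []
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        for start in range(0, len(chunks), num_threads):
            batch = chunks[start:start + num_threads]
            # pool.map preserves chunk order; extending `results` blocks
            # until the whole batch is done before the next batch begins.
            results.extend(pool.map(parse_chunk, batch))
    return results
```

The point of the batching is that only num_threads chunks are in flight at a time, which is what bounds memory for files that do not fit in RAM.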
tdenniston added a commit to tdenniston/paratext that referenced this issue on Dec 7, 2017:
Similar to the chunksize parameter of pandas.read_csv()...