Currently we can manipulate the batch size of a DataFrame in two ways:
DataFrame::parallelize
DataFrame::collect
The first one splits Rows into smaller batches; the second one merges small Rows into a bigger batch.
Even though both work fine, it is not always easy to determine which one to use, and
their names might be confusing to users who don't understand how DataFrame works.
We can solve both problems by adding:
DataFrame::batchSize(int $size) : self
It would split or merge the processed rows so that each batch matches the given size.
It should also be more intuitive for users.
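
To illustrate the intended split-or-merge behavior, here is a minimal standalone sketch. It is not Flow's implementation: `rebatch()` is a hypothetical helper, and plain PHP arrays stand in for Rows. It only shows how a single size parameter can cover both cases.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper (not part of the library): regroups incoming batches
 * so that every emitted batch holds exactly $size rows; only the last batch
 * may be smaller.
 *
 * @param iterable<array<int, mixed>> $batches incoming batches of any size
 *
 * @return \Generator<int, array<int, mixed>>
 */
function rebatch(iterable $batches, int $size) : \Generator
{
    if ($size < 1) {
        throw new \InvalidArgumentException('Batch size must be at least 1.');
    }

    $buffer = [];

    foreach ($batches as $batch) {
        foreach ($batch as $row) {
            $buffer[] = $row;

            // Emit a batch as soon as the buffer reaches the requested size.
            if (\count($buffer) === $size) {
                yield $buffer;
                $buffer = [];
            }
        }
    }

    // Flush whatever is left as a final, smaller batch.
    if ($buffer !== []) {
        yield $buffer;
    }
}

// Splitting: one batch of 5 rows becomes batches of 2, 2 and 1.
$split = \iterator_to_array(rebatch([[1, 2, 3, 4, 5]], 2), false);

// Merging: five single-row batches become batches of 3 and 2.
$merged = \iterator_to_array(rebatch([[1], [2], [3], [4], [5]], 3), false);
```

Because the same buffering loop handles both directions, the caller never has to decide between splitting and merging; they only state the batch size they want.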