Characterize impact of partition size on pileup creation #163

Closed
fnothaft opened this Issue Mar 5, 2014 · 2 comments

@fnothaft
Member

fnothaft commented Mar 5, 2014

The performance of pileup creation seems to be strongly impacted by partition size/count. This isn't entirely surprising, as we need to do a large groupBy that has fairly good data locality. This performance needs to be characterized, ideally so that we can come up with guidelines for storage parameters.
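
For context, here is a minimal sketch of the shape of this conversion, using toy record types rather than ADAM's actual schemas:

```scala
import org.apache.spark.rdd.RDD

// Toy stand-ins for ADAM's record types, just to show the shape of the job.
case class Read(start: Long, sequence: String)
case class PileupBase(referencePosition: Long, base: Char)

def readsToPileups(reads: RDD[Read]): RDD[(Long, Iterable[PileupBase])] =
  reads
    // Explode each read into one record per covered reference position.
    .flatMap { r =>
      r.sequence.zipWithIndex.map { case (b, i) => PileupBase(r.start + i, b) }
    }
    // The large groupBy in question: all bases covering a position must end up
    // together, so shuffle cost is sensitive to how reads are partitioned.
    .groupBy(_.referencePosition)
```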

@fnothaft fnothaft added the question label Mar 5, 2014

@fnothaft fnothaft self-assigned this Mar 5, 2014

@fnothaft
Member

fnothaft commented Mar 10, 2014

After some more debugging and reasoning about the pileup conversion algorithm, here are some thoughts:

  • Per partition, expected shuffle traffic is O(1): after read->pileup conversion, only the pileups that span a partition boundary need to move, which is roughly read_length * coverage pileups per boundary
  • Aggregate shuffle traffic is therefore O(p), where p is the number of partitions and the constant factor is the read_length * coverage term above

This should be fairly performant. However, we don't see this performance in practice, because the Spark shuffle engine apparently shuffles the entire dataset when doing a repartition. Depending on the dataset size, we should only be shuffling approximately 0.01% of pileups, so we are incurring roughly 10,000x more shuffle traffic than is necessary.
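
As a rough sanity check on that 0.01% figure (the read length, coverage, and partition span below are illustrative assumptions, not measurements from this issue):

  • pileups that must move per boundary ≈ read_length * coverage ≈ 100 * 30 = 3,000
  • pileups held per partition ≈ partition_span * coverage ≈ 1,000,000 * 30 = 30,000,000
  • fraction that actually needs to move ≈ 3,000 / 30,000,000 = 0.01%, i.e. shuffling everything is ~10,000x more traffic than necessary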

To fix this, I plan to revise the pileup conversion engine to aggregate the pileups that have been identified as needing to be shuffled (via a mapPartitionsWithIndex), followed by a mapPartitionsWithIndex that performs the "move".
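
A minimal sketch of what that two-pass approach could look like (the Pileup type, the range partitioning, and all names here are hypothetical stand-ins, not ADAM's actual code; the relocation itself is expressed with partitionBy, since the strays still need some shuffle mechanism to reach their owning partitions):

```scala
import org.apache.spark.HashPartitioner
import org.apache.spark.rdd.RDD

// Hypothetical pileup record, standing in for ADAM's pileup type.
case class Pileup(referencePosition: Long, base: Char)

object PileupMove {

  // Illustrative range partitioning: which partition "owns" a reference position.
  def ownerOf(pos: Long, basesPerPartition: Long): Int =
    (pos / basesPerPartition).toInt

  def movePileups(pileups: RDD[Pileup], basesPerPartition: Long): RDD[Pileup] = {
    pileups.cache() // traversed twice below

    // Pass 1: per partition, pick out only the pileups that landed on the wrong
    // partition (the ~read_length * coverage strays near each boundary), keyed by owner.
    val strays: RDD[(Int, Pileup)] = pileups.mapPartitionsWithIndex { (idx, iter) =>
      iter.collect {
        case p if ownerOf(p.referencePosition, basesPerPartition) != idx =>
          (ownerOf(p.referencePosition, basesPerPartition), p)
      }
    }

    // Everything else stays put and is never shuffled.
    val stayers: RDD[Pileup] = pileups.mapPartitionsWithIndex { (idx, iter) =>
      iter.filter(p => ownerOf(p.referencePosition, basesPerPartition) == idx)
    }

    // The "move": shuffle only the strays to their owning partitions.
    // (Int keys hash to themselves, so HashPartitioner sends key i to partition i.)
    val moved: RDD[Pileup] = strays
      .partitionBy(new HashPartitioner(pileups.getNumPartitions))
      .values

    stayers.union(moved)
  }
}
```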

Pileup creation also appears to experience memory back-pressure; I am looking into ways to fix this. A simple approach may be to allow users to specify "projections" for the conversion, so that only the fields that are actually needed get materialized.

@heuermh
Member

heuermh commented Mar 24, 2016

The pileup command has been removed from ADAM, ok to close?

@fnothaft fnothaft closed this Mar 24, 2016
