Improve TypedSimilarity algorithm and update test. #983

Merged: 1 commit merged into develop from klin_typed_sim_imp on Jul 30, 2014

reconditesea
Contributor

Adds one optimization to the DISCO/DIMSUM algorithms: filter the edges before the large self-join rather than after it.
Updates the test to use TypedTsv.
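
To illustrate why filtering before the self-join helps, here is a minimal sketch using plain Scala collections rather than Scalding pipes. It is not the code from this PR: the `Edge` case class, the two predicates, and the `selfJoin` helper are hypothetical stand-ins, and the sketch assumes the left side only needs destinations passing `smallpred` and the right side only those passing `bigpred`. Both functions return the same pairs, but `filterBefore` never materializes pairs that would be discarded.

```scala
// Hypothetical stand-in for a graph edge; not the Scalding Edge class.
final case class Edge(from: Int, to: Int)

object FilterBeforeSelfJoin {
  // Hypothetical predicates, chosen only for illustration.
  val smallpred: Int => Boolean = _ < 3
  val bigpred: Int => Boolean = _ >= 8

  // Join two edge lists on the source node and emit pairs of destinations.
  def selfJoin(left: Seq[Edge], right: Seq[Edge]): Seq[(Int, Int)] = {
    val rightBySrc = right.groupBy(_.from)
    for {
      l <- left
      r <- rightBySrc.getOrElse(l.from, Seq.empty)
    } yield (l.to, r.to)
  }

  // Filter after the join: every pair sharing a source is built first.
  def filterAfter(graph: Seq[Edge]): Seq[(Int, Int)] =
    selfJoin(graph, graph).filter { case (a, b) => smallpred(a) && bigpred(b) }

  // Filter before the join: each side is reduced before any pair is built.
  def filterBefore(graph: Seq[Edge]): Seq[(Int, Int)] =
    selfJoin(
      graph.filter(e => smallpred(e.to)),
      graph.filter(e => bigpred(e.to))
    )
}
```

On a source node with n outgoing edges, the unfiltered self-join builds n² pairs, so dropping irrelevant edges first shrinks the join input on both sides.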

val groupedOnSrc = graph
  .filter { e => smallpred(e.to) || bigpred(e.to) }
val bigGroupedOnSrc = graph
  .filter { e => bigpred(e.to) }
Contributor

Hmm, I wonder whether this is one of the cases where a .forceToDisk or a .fork would be prudent. Thoughts, @johnynek?

Also, you could do the map before applying the predicates, since we map them to the exact same thing.

Contributor Author

I can do the map first, before creating the two pipes with the predicates, and add a .forceToDisk there. Is that the recommended approach?

Collaborator

I don't think we need to add another. These two paths are separate since they are grouped differently.
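
The "map first, then apply the predicates" idea raised in the thread above can be sketched with plain Scala collections. This is an illustration only, not the merged code: `Edge`, the predicates, and the helper names are hypothetical, and keying each edge by its source is an assumption based on the names `groupedOnSrc` and `bigGroupedOnSrc`, since the actual map is not shown in the diff excerpt. In Scalding, the shared intermediate result is where a `.fork` or `.forceToDisk` would go if one were wanted.

```scala
// Hypothetical stand-in for a graph edge; not the Scalding Edge class.
final case class Edge(from: Int, to: Int)

object MapThenSplit {
  // Hypothetical predicates, chosen only for illustration.
  val smallpred: Int => Boolean = _ < 3
  val bigpred: Int => Boolean = _ >= 8

  // Key every edge by its source once; both branches reuse this result.
  def keyBySrc(graph: Seq[Edge]): Seq[(Int, Edge)] =
    graph.map(e => (e.from, e))

  // Collect keyed edges into a map from source node to its edges.
  def groupBySrc(kvs: Seq[(Int, Edge)]): Map[Int, Seq[Edge]] =
    kvs.groupBy(_._1).map { case (src, pairs) => (src, pairs.map(_._2)) }

  // Returns (edges passing either predicate, edges passing bigpred),
  // each grouped by source node.
  def split(graph: Seq[Edge]): (Map[Int, Seq[Edge]], Map[Int, Seq[Edge]]) = {
    val keyed = keyBySrc(graph)
    val either = groupBySrc(keyed.filter { case (_, e) => smallpred(e.to) || bigpred(e.to) })
    val big = groupBySrc(keyed.filter { case (_, e) => bigpred(e.to) })
    (either, big)
  }
}
```

With in-memory collections the shared `keyed` value is computed once by construction; on a distributed pipe, whether the shared step is recomputed depends on how the planner handles the two downstream branches, which is what the question in the thread is about.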

jcoveney added a commit that referenced this pull request Jul 30, 2014
Improve TypedSimilarity algorithm and update test.
jcoveney merged commit 9b0f3dc into develop Jul 30, 2014
jcoveney deleted the klin_typed_sim_imp branch July 30, 2014 16:09
3 participants