
Optimized matching #144

Merged
gdementen merged 24 commits into liam2:master from TaxIPP-Life:optimized_matching
Nov 7, 2014

Conversation

@AlexisEidelman

Contributor

No description provided.

Comment thread src/matching.py

Member


Because it doesn't make sense: we take the best score, independently of the order of individuals in set 2.

Contributor Author


True. :)

@gdementen gdementen added this to the 0.9 milestone Sep 28, 2014
@gdementen gdementen self-assigned this Sep 28, 2014
@gdementen

Member

This looks like a great contribution (again). I wonder whether the optimized version could return exactly the same results as the old method (possibly as an option, if doing so has a non-negligible cost). If my understanding is correct, it is a matter of getting "df_by_cell" to return the list of ids in their "original" order within each cell, no?

If that is indeed possible, we could simply remove the old method. There are probably cases with enough distinct combinations of variables that the additional groupby makes the new method slower rather than faster. I believe those cases should be relatively rare, but it would be nice to know the threshold, i.e. the point at which it is no longer worth it. Have you done any tests in that area?
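
To make the ordering question concrete, here is a minimal pure-Python sketch of grouping individuals into cells while keeping the ids in their original order within each cell. It is not the PR's actual `df_by_cell`: the helper name `ids_by_cell`, the field names and the sample rows are all assumptions made up for illustration.

```python
from collections import defaultdict


def ids_by_cell(rows, by):
    """Map each cell (a unique combination of the `by` fields) to its ids.

    Hypothetical helper, not the PR's df_by_cell. Ids are appended in
    iteration order, so each list keeps the original order within its cell.
    """
    cells = defaultdict(list)
    for row in rows:
        key = tuple(row[field] for field in by)
        cells[key].append(row["id"])
    return dict(cells)


# Made-up sample data, for illustration only.
rows = [
    {"id": 7, "age": 30, "gender": "F"},
    {"id": 3, "age": 25, "gender": "M"},
    {"id": 9, "age": 30, "gender": "F"},
    {"id": 1, "age": 25, "gender": "M"},
    {"id": 5, "age": 30, "gender": "F"},
]

cells = ids_by_cell(rows, by=("age", "gender"))
print(cells)
```

With five rows falling into two cells, a score would only need to be evaluated per cell rather than per individual. If nearly every row ended up in its own cell, the grouping step would add work without reducing the number of evaluations, which is the situation the threshold question is about.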

@gdementen

Member

Merged into the optimized_matching branch. I will clean it up before merging to master.

@gdementen gdementen merged commit 67fc8e6 into liam2:master Nov 7, 2014

2 participants