Optimized matching#144
No tests, because there is nothing to do except add other matching options in simulation.yml.
Oops, there is something wrong with change/version_0_8_2.rst. I'll fix it in an upcoming commit.
TODO: - deal with orderby - keep only optimized matching - test performance
…d_matching Conflicts: doc/usersguide/source/changes.rst
Because it doesn't make sense: we take the best score, independently of the order of individuals in set 2.
This looks like a great contribution (again). I wonder whether it is possible to make the optimized version return exactly the same results as the old method (possibly as an option if it has a non-negligible cost). If my understanding is correct, I guess it is a matter of getting "df_by_cell" to return a list of ids in the "original" order within each cell, no? If that is indeed possible, we could simply remove the old method. There are probably cases where there are enough different combinations of variables that the additional groupby makes the new method slower rather than faster. I believe those cases should be relatively rare, but it would be nice to know where the threshold is, that is, at what point it is no longer worth it. Have you done some tests in that area?
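To make the ordering question concrete, here is a minimal pure-Python sketch, not the code from this PR. It assumes a "cell" is one distinct combination of matching variables and that the old method picks the first candidate with the best score. The names `naive_match`, `group_by_cell` and `cell_match` are hypothetical, and `group_by_cell` only stands in for whatever "df_by_cell" actually does.

```python
def naive_match(candidates, score):
    """Score every candidate; return the id of the first one with the best score."""
    best_id, best_score = None, float("-inf")
    for cand_id, values in candidates:
        s = score(values)
        if s > best_score:  # strict '>' keeps the earliest candidate on ties
            best_id, best_score = cand_id, s
    return best_id


def group_by_cell(candidates):
    """Group candidates by their combination of values (hypothetical helper).

    Each cell keeps (position, id) pairs in the original input order.
    """
    cells = {}
    for position, (cand_id, values) in enumerate(candidates):
        cells.setdefault(values, []).append((position, cand_id))
    return cells


def cell_match(candidates, score):
    """Score each cell once; break ties by earliest original position."""
    best_key, best_id = None, None
    for values, members in group_by_cell(candidates).items():
        first_pos, first_id = members[0]  # earliest member of this cell
        key = (score(values), -first_pos)
        if best_key is None or key > best_key:
            best_key, best_id = key, first_id
    return best_id


# Four candidates (id, (age, gender)) falling into two cells.
candidates = [(10, (30, "F")), (11, (25, "M")), (12, (30, "F")), (13, (25, "M"))]


def closeness(values):
    """Illustrative score: the closer the age is to 26, the higher the score."""
    return -abs(values[0] - 26)


assert naive_match(candidates, closeness) == cell_match(candidates, closeness)
```

In this sketch the score is evaluated once per cell rather than once per candidate, which is only a gain when there are noticeably fewer cells than candidates. With as many cells as candidates, the grouping step is pure overhead.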
Merged into the optimized_matching branch. I will clean it up before merging to master.