-
Notifications
You must be signed in to change notification settings - Fork 8
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Feature delete mapping #458
Conversation
This is a breaking change for the database initialisatiom.
Yes, I was going through most files and cleaned slightly.
…he creator of the project. This commit correct at the same time a bug: everyone seems to have been able to get a group result, not only the project creator.
Realtes to issue #457
Also removed an old section.
def test_run_status_with_clks_2p(requests, mapping_project): | ||
size = mapping_project['size'] | ||
ensure_run_progressing(requests, mapping_project, size) | ||
def test_permutations_run_status_with_clks_2p(requests, permutations_project): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I may not have understood the reasons of these tests: it was done only on mapping but not on permutations and similarity scores... so I removed the mapping but add the two others.
backend/entityservice/tests/util.py
Outdated
@@ -263,7 +263,7 @@ def wait_approx_run_time(size, assumed_rate=1_000_000): | |||
""" | |||
number_comparisons = sum( | |||
size0 * size1 for size0, size1 in itertools.combinations(size, 2)) | |||
time.sleep(5) | |||
time.sleep(1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Relates to issue #457
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why 1 sec? Given that there is another sleep
right after it, we might just remove this one?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As I wasn't sure why it has been introduced, I reduced it to 1 second and opened an issue. I'll check what happens without this wait (f7601d1), if the tests fail, I'll revert it, otherwise keep it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
well, that must have been fun...
some minor comments inline.
@@ -113,8 +94,10 @@ def permute_mapping_data(project_id, run_id, len_filters1, len_filters2, parent_ | |||
random.shuffle(remaining_new_indexes) | |||
log.info("Assigning random indexes for {} matched entities".format(number_in_common)) | |||
|
|||
for mapping_number, a_index in enumerate(mapping): | |||
b_index = mapping[a_index] | |||
for mapping_number, group in enumerate(groups): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd rename mapping_number
. Maybe group_index
or group_number
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nicely spotted. 👍
* `"permutations"` - Creates random permutations and a mask. | ||
* `"similarity_scores"` - Outputs a list of similarity scores of `[indexA, indexB, score]`, where `score` | ||
represents the likelihood that `indexA = indexB`. | ||
* `"groups"` - Outputs a list of groups of records, where each group represents one entity. | ||
|
||
Only `"groups"` supports multi-party linkage. `"mapping"`, `"permutations"`, and `"similarity_scores"` only support linkage |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
line 43 and 71 mentioning four different output types. -> change to three.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
backend/entityservice/tests/util.py
Outdated
@@ -263,7 +263,7 @@ def wait_approx_run_time(size, assumed_rate=1_000_000): | |||
""" | |||
number_comparisons = sum( | |||
size0 * size1 for size0, size1 in itertools.combinations(size, 2)) | |||
time.sleep(5) | |||
time.sleep(1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why 1 sec? Given that there is another sleep
right after it, we might just remove this one?
Nicely spotted during the PR review.
relates to #457
The goal of this PR is to remove the
mapping
output type from the entity-service. Themapping
output type is indeed redundant with thegroups
output type for 2 parties, however, the output representation is slightly different.Few notes:
testing.es.data61.xyz
) which makes them impossible to keep up to date in the same PR as the one modifying the service