merge command returned nothing when input msdfs had no confidence column #350
I made this decision. Does this seem logical, @matentzn?
sssom/util.py
Outdated
@@ -273,6 +273,8 @@ def filter_redundant_rows(
:param ignore_predicate: If true, the predicate_id column is ignored, defaults to False
:return: Filtered pandas DataFrame
"""
if CONFIDENCE not in df.columns:
What would happen if we assumed that a missing confidence means 100% confidence? I think this assumption is more often true. Also, if you do add the IF clause here, won't the dataframe end up unsorted?
I just realized that line 280 has assign_default_confidence, which adds a new column 'confidence' if it is absent, BUT initializes it to np.NaN. Is this what we still intend to do?
I think so, but it is important that the absence of a confidence value does not lead to the removal of the entire mapping! While it is not the same as confidence 1.0, it should probably be interpreted as such during reconciliation. Unless you see a strong reason against it?
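The interpretation described above can be sketched in plain Python. This is only an illustration, not the sssom implementation: the helper names (effective_confidence, keep_most_confident), the choice of (subject_id, object_id) as the grouping key, and the use of dicts in place of DataFrame rows are all assumptions. A missing confidence is treated as 1.0 for comparison only, and the stored value is left untouched.

```python
import math


def effective_confidence(value):
    """Confidence used for comparison only; a missing value counts as 1.0."""
    if value is None or (isinstance(value, float) and math.isnan(value)):
        return 1.0
    return value


def keep_most_confident(rows):
    """Keep, per (subject_id, object_id), the rows with the highest effective confidence."""
    best = {}
    for row in rows:
        key = (row["subject_id"], row["object_id"])
        conf = effective_confidence(row.get("confidence"))
        best[key] = max(best.get(key, 0.0), conf)
    return [
        row
        for row in rows
        if effective_confidence(row.get("confidence"))
        >= best[(row["subject_id"], row["object_id"])]
    ]


# Hypothetical mappings: two have no confidence value (NaN).
rows = [
    {"subject_id": "A:1", "object_id": "B:1", "confidence": math.nan},
    {"subject_id": "A:1", "object_id": "B:1", "confidence": 0.8},
    {"subject_id": "A:2", "object_id": "B:2", "confidence": math.nan},
]
kept = keep_most_confident(rows)
```

Under these assumptions, the mappings without a confidence value survive reconciliation and still carry NaN afterwards, so "missing" is never silently rewritten to 1.0 in the data.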
The presence of np.NaN is causing the returned df to be empty in the code between lines 297 and 311.
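One plausible mechanism for this, sketched in plain Python rather than taken from the code at those lines: any ordering comparison involving NaN evaluates to False, so a filter of the form "keep rows whose confidence is at least the maximum for their key" keeps nothing when every confidence is NaN. The row layout and the grouping key below are assumptions made for illustration.

```python
import math

# Hypothetical mappings whose confidence was defaulted to NaN.
rows = [
    {"subject_id": "A:1", "object_id": "B:1", "confidence": math.nan},
    {"subject_id": "A:2", "object_id": "B:2", "confidence": math.nan},
]

# Highest confidence per (subject_id, object_id); with only NaN values it stays NaN.
max_conf = {}
for row in rows:
    key = (row["subject_id"], row["object_id"])
    max_conf[key] = max(max_conf.get(key, row["confidence"]), row["confidence"])

# NaN >= NaN is False, so no row passes the filter and the result is empty.
filtered = [
    row
    for row in rows
    if row["confidence"] >= max_conf[(row["subject_id"], row["object_id"])]
]
```

If the real filter compares confidences in a similar way, replacing or masking NaN before the comparison would avoid the empty result.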
Would it make sense to initialize it to "0.0" instead of np.NaN?
Added a flag:
Left a comment
Co-authored-by: Nico Matentzoglu <nicolas.matentzoglu@gmail.com>
Good catch! Sorry about that!
Can you add a test case for reconcile false?
Done
Left a comment
Nice! Thanks!
Fixes #348