Modified search to take in multiple strings#4650
Conversation
|
I have modified the |
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## develop #4650 +/- ##
===========================================
- Coverage 99.22% 99.22% -0.01%
===========================================
Files 303 303
Lines 23070 23102 +32
===========================================
+ Hits 22891 22922 +31
- Misses 179 180 +1 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
kratman
left a comment
There was a problem hiding this comment.
I did not test the code changes locally, but they were just renamings.
Can you add a test with multiple keys in the search? It looks like you only fixed the test for the output formatting
|
I have implemented the suggested changes and added tests for searching multiple strings as well. |
brosaplanella
left a comment
There was a problem hiding this comment.
Looks good to me, just needs an entry to the CHANGELOG before merging
agriyakhetarpal
left a comment
There was a problem hiding this comment.
Thanks, @medha-14! Happy to approve/merge after these suggestions.
|
Actually, in the case of a partial match, would it be better to indicate that it is such? For example, model.variables.search(["NotAVariable", "concentration"])now returns but we could say something like because we do have a match here for |
|
In cases where some keys have an exact match while others only have a best match, what should the expected result be? Should we only prioritize the exact matches in such cases or should we also have the best matches printed separately? |
Do you mean the case where we have an exact match for If I understood your question correctly, then an input as follows: model.variables.search(["Electrolyte concentration", "Electrolite concentration"])should return something, in my opinion, like: We can figure out the best way to display the output later. There is also a case to be made to say that this improvement to the search functionality to accept multiple strings means that the result is returned for only the string that does return a match, but we are not really implementing a search engine, so I feel it is acceptable to have all results for all input strings (as if we are looping over them in the search). @brosaplanella, what do you think? |
|
For now i have modified the method to search for exact matches having all the model.variables.search(["electrolyte", "concentration"])Since both terms are present together in a single key, the result will be: Results for 'Electrolyte concentration': ['Electrolyte concentration']For the cases where there are no such matches it will iterate over each string individually and give results as such: model.variables.search(["RandomKey", "elecrtolyte concentration","electrolite"])will give results as: |
agriyakhetarpal
left a comment
There was a problem hiding this comment.
Almost there. It's looking good, but I tested with a few adversarial inputs – we should raise a helpful error where applicable for them, and an associated test case that catches it as well:
For example, for:
model.variables = {
"Concentration [mol.m-3]": 0,
"Surface concentration [mol.m-3]": 1,
"Flux [mol.m-2.s-1]": 2,
}model.variables.search([""]) returns Results for '': ['Concentration [mol.m-3]', 'Flux [mol.m-2.s-1]', 'Surface concentration [mol.m-3]'], but it should ask the user to input a non-empty string instead.
Another case, here I tried the following (with the same model.variables):
In [4]: model.variables.search(["abcd", "concentr"])
No matches found for 'abcd'.
Exact matches for 'concentr': ['Concentration [mol.m-3]', 'Surface concentration [mol.m-3]']
In [5]: model.variables.search(["abcd", "concent"])
No matches found for 'abcd'.
Exact matches for 'concent': ['Concentration [mol.m-3]', 'Surface concentration [mol.m-3]']
In [6]: model.variables.search(["abcd", "concen"])
No matches found for 'abcd'.
Exact matches for 'concen': ['Concentration [mol.m-3]', 'Surface concentration [mol.m-3]']
In [7]: model.variables.search(["abcd", "conce"])
No matches found for 'abcd'.
Exact matches for 'conce': ['Concentration [mol.m-3]', 'Surface concentration [mol.m-3]']
In [8]: model.variables.search(["abcd", "conc"])
No matches found for 'abcd'.
Exact matches for 'conc': ['Concentration [mol.m-3]', 'Surface concentration [mol.m-3]']
In [9]: model.variables.search(["abcd", "con"])
No matches found for 'abcd'.
Exact matches for 'con': ['Concentration [mol.m-3]', 'Surface concentration [mol.m-3]']
In [10]: model.variables.search(["abcd", "co"])
No matches found for 'abcd'.
Exact matches for 'co': ['Concentration [mol.m-3]', 'Surface concentration [mol.m-3]']
In [11]: model.variables.search(["abcd", "c"])
No matches found for 'abcd'.
Exact matches for 'c': ['Concentration [mol.m-3]', 'Surface concentration [mol.m-3]']
and the last few of these don't really make a lot of sense. However, this case is wrong on the main branch as well, so it probably doesn't need to be addressed in this PR itself. The best approach would be to use difflib to determine what part of the search string corresponds to at least a significant value (maybe 50%?) of the search results. Please feel free to take this up in a follow-up PR if you'd like to.
So, I'm happy to approve once we manage the empty string case (even that is currently failing on the main branch, but it's quite easy to handle). Thanks for your work!
|
Thank you for the detailed review! I’ve modified the method to handle the empty string case . I’ve also added a tests to cover the same.Regarding the issue with partial matches and using |
3a1b130
|
Thanks everyone for the detailed reviews and suggestions, I have made the necessary changes accordingly. Please take a look and let me know if anything else needs attention. |
Saransh-cpp
left a comment
There was a problem hiding this comment.
Thanks, @medha-14! This looks amazing!
Most of the suggestions below are "good practices" and they should be applied to the PR. Approving this as I am on holiday from tomorrow, and I don't want my review to block the merge :)
|
@medha-14 Thanks for working on this, I will re-review after you finish @Saransh-cpp's suggestions |
6f36d36
|
I will merge this after the tests pass |
Description
Fixes #4629
Type of change
Please add a line in the relevant section of CHANGELOG.md to document the change (include PR #) - note reverse order of PR #s. If necessary, also add to the list of breaking changes.
Key checklist:
$ pre-commit run(or$ nox -s pre-commit) (see CONTRIBUTING.md for how to set this up to run automatically when committing locally, in just two lines of code)$ python -m pytest(or$ nox -s tests)$ python -m pytest --doctest-plus src(or$ nox -s doctests)You can run integration tests, unit tests, and doctests together at once, using
$ nox -s quick.Further checks: