
Client-side specimen ID matching #585

Merged: 46 commits into biocore:master on Sep 12, 2019

Conversation


@fedarko fedarko commented Aug 30, 2019

Progress on #520 / #553. (Both of those issues have a couple moving parts, but this should address what I think is the "main" thing with both.)

Summary of changes

  • Added getSubstringMatches(), a simple function that, given a query string and an array of strings, returns all strings in the array that include the query string

    • Added corresponding unit tests
  • Added get_active_samples() (named analogously to the existing get_active_studies()). This function just gets all sample IDs associated with all active studies. (It technically returns a Promise that resolves to an array of sample IDs, since the requests are done asynchronously.)

    • Made some modifications in the back-end code to enable this -- in particular, exposing a limit parameter that allows the front-end to issue a request for all samples associated with a study instead of just 20.
    • Added various tests to verify that limits are being handled properly.
    • Thanks to @charles-cowart for help with this!
    • Also created merge_sample_responses(), which is just a few lines of code I abstracted out from autocomplete_search_samples() so that I could use it in both that function and in get_active_samples().
  • Added an "error" function in autocomplete_search_samples() (which didn't previously have one). This function just creates a bootstrapAlert() explaining the situation.

    • Also added two other error functions to the two other instances in plateViewer.js that use .then() (i.e. Promises).
    • See the commit message of ed55343 for more details about this than are probably necessary.
  • PlateViewer.modifyWell() now uses get_active_samples() and getSubstringMatches() to perform client-side matching: given an input specimen ID, this tries to match it up with an active sample ID.

    • There are three possible results of this:
      • exactly 1 sample matches --> great, change the well contents accordingly
      • 0 samples match --> ok, this well will just be labelled "unknown"
      • 2 or more samples match --> assign the new class "indeterminate" (it's the yellow one) to this well, indicating that manual resolution is needed. On reloading a plate, these wells will show up as red (i.e. "unknown") -- see caveats below.
    • The main AJAX call that PlateViewer.modifyWell() used to do is now done in PlateViewer.patchWell(). I initially did this so I could call patchWell() from multiple points, but this ended up not being needed. (I still like having modifyWell() and patchWell() be separate functions, though. Feel free to ask for them to be renamed :)
    • Changed the "intermediate" well legend to reflect the behavior on reloading.
  • Various small fixes around the codebase (adding missing docs in process.py, removed an unused variable declaration in plate.js, added Travis/Coveralls badges to the README, etc.)
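As a rough illustration, getSubstringMatches() boils down to a filter over the array of sample IDs. This is a sketch, not the exact implementation in the PR; edge-case handling (e.g. empty queries, case sensitivity) may differ in the real function.

```javascript
// Sketch of getSubstringMatches(): return every string in sampleIDs
// that contains query as a substring. (Signature matches the
// description in this PR; the body here is an illustrative guess.)
// Note: an empty query matches every ID with this logic; the real
// function's edge-case handling may differ.
function getSubstringMatches(query, sampleIDs) {
  return sampleIDs.filter(function (id) {
    return id.indexOf(query) !== -1;
  });
}
```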

Caveats

There are some things I still think could be improved in the proposed changes -- noting them here because most of these are sorta subjective, so if you have any feedback I'd appreciate it.

  • Client-side matching is fairly quick (at least on my computer), but it could be faster if we did stuff like caching the list of active sample IDs. The gold standard here would be applying all of these changes in bulk (instead of doing everything through modifyWell(), which is why you see samples' statuses change in individual "blips")... but this would probably necessitate some significant restructuring of how the plating interface works.

  • Some of the uses of promises in the code could be handled better (e.g. throwing a bootstrapAlert if a promise fails to resolve). Update: resolved as of ed55343.

  • This is the main caveat: after spending some time going through the code that handles well classes, it seems like adding an indeterminate class to wells on the front-end is pretty simple (as is done here). However, making that class persist on the back-end (so that reloading a plate reloads the class for each well) is going to be a pretty heavy amount of work on top of what's been done here: it seems like we'd need to modify the database schema to store this information, or at least figure out a way to use a SQL query to detect in python which wells are indeterminate. Took a few stabs at this last night; didn't get very far.

    • To complicate things, indeterminate wells are technically "unknown", since they don't have a valid value. From what I can tell, making these wells not "unknown" seems like it'd involve a lot of going through the code to modify things.
    • For now, the behavior is that (as mentioned above) the indeterminate class isn't preserved when reloading a plate—indeterminate cells will just show up on a reload as "unknown", which is technically kinda accurate since they don't contain exact sample IDs. Although technically accurate, this is fairly undesirable. (As mentioned, I updated the indeterminate legend to match, so that users understand why these cells change on reloading.)
    • I'm sure making the indeterminate class persistent is doable—and I'm happy to work when I get back next week on this—but I'm not sure that the effort is worth it, especially since the yellow cells are an indicator of "hey you need to resolve this manually". Of course if you disagree that is fine, I'm not the expert here :)

Anyway—sorry if this PR isn't comprehensive enough; I mainly wanted to make sure I had this ready to show you all where this is at today. Let me know what you think!

[Screenshot attached: "Screen Shot 2019-08-30 at 8 34 02 AM"]

This is an important part of implementing client-side multi-matching
(i.e. getting all the IDs needed to perform matching).

Pair-programmed with @charles-cowart :)
get_active_samples() is based on autocomplete_search_samples(), so
I abstracted some of the code for processing responses into its own
function that can be shared.

This seems to work alright! Would really like to test this with
multiple studies, though (biocore#437).

For future reference: you can get the value of a promise returned by
get_active_samples() as follows:

get_active_samples().then(
    function(v) {
        // Now, v is the array of active sample IDs
        console.log(v);
    }
);

This is 1/3 of the work towards multi-select (just gonna tag biocore#520).

What remains are the following:
2. On a paste operation when the multi-select checkbox is checked,
   identify all "new" specimen IDs. Get an array or something of
   these, preferably. (Or we can do this on a
   specimen-ID-by-specimen-ID basis, doesn't really matter.)
3. For each new specimen ID, attempt to match it with the active
   samples. Color cells / update values accordingly based on the
   following cases:
   *   0 matches (color cell red)
   *   1 match (leave cell uncolored, and change text to matching ID)
   * >=2 matches (color cell yellow)

And ...I think that should be it, knock on wood.
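The three cases above can be sketched as a tiny pure function. The helper name classifyMatches() is hypothetical (not part of this PR); the state names mirror the well classes discussed here ("unknown" is red, "indeterminate" is yellow).

```javascript
// Hypothetical sketch of the three-way matching decision described
// above, given the array of active sample IDs that matched a pasted
// specimen ID.
function classifyMatches(matches) {
  if (matches.length === 1) {
    // Exactly one active sample matches: use it as the well's contents.
    return { state: "matched", value: matches[0] };
  } else if (matches.length === 0) {
    // No matches: the well gets labelled "unknown" (red).
    return { state: "unknown", value: null };
  } else {
    // Two or more matches: flag for manual resolution (yellow).
    return { state: "indeterminate", value: null };
  }
}
```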
TODOs:

- Better then() failure handling on get_active_samples()
- Apply yellow well color (well-indeterminate CSS class IIRC) if multiple
  matches are possible.
- Remove debug console.log() stmts
I think this was from back before I fixed biocore#567
Also prettified the code, since I forgot to do that previously.
This avoids having to call getActiveStudy() twice, and removes an
avenue for a potential funky error where multiple calls could trigger
different results.
this includes some discussion of todos, etc
Right now it works ok if you don't reload the well. But ideally
this should save the indeterminate state on the backend? Which means
that I think I'll have to pretty much redo this js logic in python..?
tldr: it's slow, and could be faster
This change can definitely be ignored/reverted if desired, but this
makes it clear that if the user doesn't resolve this then these cells
will turn red when you reload the plate.
These will alert the user if something goes wrong. This sort of
paradigm is already in use in this codebase, so adding these is pretty
simple -- essentially just giving a short summary of the issue,
then dumping the error message.
This necessitated some messing around with each of the three
onRejected functions I added in 77cb5a6 in order to figure out which
arguments were passed where.

ERROR FUNCTIONS 1 AND 2

For the most part, any error functions that fail due to requests (i.e.
the error functions I just created in autocomplete_search_samples() and
in get_active_samples()) are passed a jqXHR, etc. as arguments -- so we
can just show the responseText.

(These error functions, if ever called,
should be called with a jqXHR since all they're really doing is waiting
on a series of requests to finish. I *guess* it's conceivably possible
they could fail not due to a request if jQuery's code somehow explodes,
but I doubt it.)

ERROR FUNCTION 3

The callback on the promise returned by get_active_samples() -- that
is, the error function declared in modifyWell() -- just receives a
normal rejectionReason as a string if there's an error in the success
function within get_active_samples(). This makes sense: if any of the
requests failed in get_active_samples(), that would trigger
get_active_samples()' declared error function (with a jqXHR, etc. as
arguments, since a request caused the failure) instead of the error
function on the promise returned by get_active_samples().
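The jqXHR-vs-string distinction above suggests writing onRejected callbacks defensively. Here's a hedged sketch of that pattern (the factory name makeErrorCallback() is hypothetical; bootstrapAlert() exists in this codebase, but its exact signature here is an assumption):

```javascript
// Sketch of a defensive onRejected callback: a failed AJAX request
// passes a jqXHR (which has a responseText), while a rejection raised
// inside a success handler passes a plain reason string.
function makeErrorCallback(context) {
  return function (reason) {
    var detail =
      reason && reason.responseText !== undefined
        ? reason.responseText // request failure: show the server's text
        : String(reason); // non-request failure: show the raw reason
    var message = "Error in " + context + ": " + detail;
    // bootstrapAlert() is this codebase's alert helper; fall back to
    // console.error when it isn't defined (e.g. in this sketch).
    if (typeof bootstrapAlert === "function") {
      bootstrapAlert(message);
    } else {
      console.error(message);
    }
  };
}
```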

TLDR

Phew! That was a lot of long paragraphs. The gist of things here is:
I tested each of the 3 new error callbacks locally, saw what the
responses I was getting were, and adjusted the bootstrapAlert() calls
accordingly to be more informative.
From manual testing and thinking things through, these adjustments
should be correct, but even if they aren't (and, say, error function 1
gets passed a string instead of a jqXHR) then the outcome is the same:
an error pops up with some sort of message. (The difference in that
case is that we'll see something like
"[Nice error message:] undefined" instead of
"[Nice error message:] [further information]", which isn't too
horrifying.)

fedarko commented Sep 6, 2019

As of now, I think this is ok to be reviewed—now that error handling is added in, the two remaining "caveats" are general, subjective issues that I'd like feedback on.

@AmandaBirmingham AmandaBirmingham requested review from AmandaBirmingham and removed request for charles-cowart September 6, 2019 22:25
@AmandaBirmingham
Collaborator

@fedarko Hiya--I'm stepping into Charlie's shoes for the moment :) I see that the Travis build for this PR did not succeed due to three failing unit tests. Please modify the PR to address these failures; I can do a review on it once the build passes.

Thanks for all the effort you are putting in on LabControl!

Adjusting tests re: the default limit no longer being 20.
Also adding some more tests now that "limit" is an exposed parameter.

A TODO is adding tests to labcontrol/db/tests/test_study.py that
check this functionality more closely (e.g. trying negative limits).
Previously in this PR, this wasn't directly tested, and things were
kind of funky -- Study.samples() said that it expected limit to be an
int, but the limit obtained from get_argument() in
StudySamplesHandler.get() was actually a str.

Major changes:

- Study.samples() now explicitly requires that limit is already an
  int.

- Added test_samples_with_limit() to TestStudy accordingly.

- StudySamplesHandler.get() now converts limit to an int (if it's been
  specified), before calling Study.samples(). If int() fails, then
  it'll raise a ValueError. (Similarly, Study.samples() raises a
  ValueError for invalid limits.) If StudySamplesHandler.get() runs
  into a ValueError, it sets a status code of 400 ("Bad Request").

- Added various invalid limit tests to
  test_get_study_samples_handler() accordingly.

Running tests seems to explode the database on my system
(I'll look into getting this fixed soon -- since I've been working on
the front-end mostly, this hasn't been a problem), so I'm going to just
kick this up to Travis for the time being.
For some reason, I forgot that there were 9 "SKM" samples in the
test dataset. The thing I was trying to test with

response = self.get('/study/1/samples?term=SKM&limit=30')

was that a limit greater than the total amount of matching output still
has no effect, even when a term is specified. This test should be fixed
now.

fedarko commented Sep 10, 2019

@AmandaBirmingham Thank you so much for the many helpful comments! I'm not going to be able to respond to them all today, but I'll get to work on hopefully addressing all of these tomorrow.

That is, this just tries to see if int(limit) works successfully.

I also overhauled the tests accordingly, and accounted for a few
silly corner cases (e.g. someone passes in float('inf') as a limit).

This should address a few of @AmandaBirmingham's comments in biocore#585.
-Specify originating functions in the "new" error messages added
 to the plateViewer in biocore#585
-Reword misleading sentence re: "assuming" in the docstring of
 merge_sample_responses()

Towards addressing @AmandaBirmingham's comments

fedarko commented Sep 11, 2019

@AmandaBirmingham I think I've addressed all of your comments here (I left a few comments "unresolved" where I wasn't sure if you'd like me to do more).

Please let me know if you'd like me to do anything else before we merge this in—thanks again!


@AmandaBirmingham AmandaBirmingham left a comment

Just one minor comment that I don't think requires action right now. Ready to merge after a second reviewer approves?

(Review comment on labcontrol/db/study.py -- marked outdated/resolved)

fedarko commented Sep 11, 2019

Since @gwarmstrong is out right now, should I request a review from someone else?


fedarko commented Sep 11, 2019

@qiyunzhu and I just sat down and walked through the code changes, and he has given his verbal approval of this PR.

While going over the code I noticed one small case that isn't explicitly tested in the Study.samples() unit test (strings of negative ints), so I added a check for that in the latest commit.

Anyway, once the latest commit passes on Travis (or at least fails solely due to the intermittent and unrelated download error mentioned in #588), I can merge this PR in.


fedarko commented Sep 12, 2019

Build passed—merging in now. Thanks @AmandaBirmingham and @qiyunzhu for looking over this!

@fedarko fedarko merged commit 9c1867d into biocore:master Sep 12, 2019