
Client-side specimen ID matching #585

Merged: 46 commits into biocore:master on Sep 12, 2019

Conversation


@fedarko fedarko commented Aug 30, 2019

Progress on #520 / #553. (Both of those issues have a couple moving parts, but this should address what I think is the "main" thing with both.)

Summary of changes

  • Added getSubstringMatches(), a simple function that, given a query string and an array of strings, returns all strings in the array that include the query string

    • Added corresponding unit tests
  • Added get_active_samples() (named analogously to the existing get_active_studies()). This function just gets all sample IDs associated with all active studies. (It technically returns a Promise that resolves to an array of sample IDs, since the requests are done asynchronously.)

    • Made some modifications in the back-end code to enable this -- in particular, exposing a limit parameter that allows the front-end to issue a request for all samples associated with a study instead of just 20.
    • Added various tests to verify that limits are being handled properly.
    • Thanks to @charles-cowart for help with this!
    • Also created merge_sample_responses(), which is just a few lines of code I abstracted out from autocomplete_search_samples() so that I could use it in both that function and in get_active_samples().
  • Added an "error" function in autocomplete_search_samples() (which didn't previously have one). This function just creates a bootstrapAlert() explaining the situation.

    • Also added two other error functions to the two other instances in plateViewer.js that use .then() (i.e. Promises).
    • See the commit message of ed55343 for more details about this than are probably necessary.
  • PlateViewer.modifyWell() now uses get_active_samples() and getSubstringMatches() to perform client-side matching: given an input specimen ID, this tries to match it up with an active sample ID.

    • There are three possible results of this:
      • exactly 1 sample matches --> great, change the well contents accordingly
      • 0 samples match --> ok, this well will just be labelled "unknown"
      • 2 or more samples match --> assign the new class "indeterminate" (it's the yellow one) to this well, indicating that manual resolution is needed. On reloading a plate, these wells will show up as red (i.e. "unknown") -- see caveats below.
    • The main AJAX call that PlateViewer.modifyWell() used to do is now done in PlateViewer.patchWell(). I initially did this so I could call patchWell() from multiple points, but this ended up not being needed. (I still like having modifyWell() and patchWell() be separate functions, though. Feel free to ask for them to be renamed :)
    • Changed the "intermediate" well legend to reflect the behavior on reloading.
  • Various small fixes around the codebase (adding missing docs in process.py, removed an unused variable declaration in plate.js, added Travis/Coveralls badges to the README, etc.)
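As a rough illustration, getSubstringMatches() boils down to a filter over the array of sample IDs. This is a sketch, not the exact implementation in the PR; edge-case handling (e.g. empty queries, case sensitivity) may differ in the real function.

```javascript
// Sketch of getSubstringMatches(): return every string in sampleIDs
// that contains query as a substring. (Signature matches the
// description in this PR; the body here is an illustrative guess.)
// Note: an empty query matches every ID with this logic; the real
// function's edge-case handling may differ.
function getSubstringMatches(query, sampleIDs) {
  return sampleIDs.filter(function (id) {
    return id.indexOf(query) !== -1;
  });
}
```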

Caveats

There are some things I still think could be improved in the proposed changes -- noting them here because most of these are sorta subjective, so if you have any feedback I'd appreciate it.

  • Client-side matching is fairly quick (at least on my computer), but it could be faster if we did stuff like caching the list of active sample IDs. The gold standard here would be applying all of these changes in bulk (instead of doing everything through modifyWell(), which is why you see samples' statuses change in individual "blips")... but this would probably necessitate some significant restructuring of how the plating interface works.

  • Some of the uses of promises in the code could be handled better (e.g. throwing a bootstrapAlert if a promise fails to resolve). Update: resolved as of ed55343.

  • This is the main caveat: after spending some time going through the code that handles well classes, it seems like adding an indeterminate class to wells on the front-end is pretty simple (as is done here). However, making that class persist on the back-end (so that reloading a plate reloads the class for each well) is going to be a pretty heavy amount of work on top of what's been done here: it seems like we'd need to modify the database schema to store this information, or at least figure out a way to use a SQL query to detect in python which wells are indeterminate. Took a few stabs at this last night; didn't get very far.

    • To complicate things, indeterminate wells are technically "unknown", since they don't have a valid value. From what I can tell, making these wells not "unknown" seems like it'd involve a lot of going through the code to modify things.
    • For now, the behavior is that (as mentioned above) the indeterminate class isn't preserved when reloading a plate—indeterminate cells will just show up on a reload as "unknown", which is technically kinda accurate since they don't contain exact sample IDs. Although technically accurate, this is fairly undesirable. (As mentioned, I updated the indeterminate legend to match, so that users understand why these cells change on reloading.)
    • I'm sure making the indeterminate class persistent is doable—and I'm happy to work when I get back next week on this—but I'm not sure that the effort is worth it, especially since the yellow cells are an indicator of "hey you need to resolve this manually". Of course if you disagree that is fine, I'm not the expert here :)

Anyway—sorry if this PR isn't comprehensive enough; I mainly wanted to make sure I had this ready to show you all where this is at today. Let me know what you think!

[Screenshot attached: "Screen Shot 2019-08-30 at 8 34 02 AM"]

This is an important part of implementing client-side multi-matching
(i.e. getting all the IDs needed to perform matching).

Pair-programmed with @charles-cowart :)
get_active_samples() is based on autocomplete_search_samples(), so
I abstracted some of the code for processing responses into its own
function that can be shared.

This seems to work alright! Would really like to test this with
multiple studies, though (biocore#437).

For future reference: you can get the value of a promise returned by
get_active_samples() as follows:

get_active_samples().then(
    function(v) {
        // Now, v is the array of active sample IDs
        console.log(v);
    }
);

This is 1/3 of the work towards multi-select (just gonna tag biocore#520).

What remains are the following:
2. On a paste operation when the multi-select checkbox is checked,
   identify all "new" specimen IDs. Get an array or something of
   these, preferably. (Or we can do this on a
   specimen-ID-by-specimen-ID basis, doesn't really matter.)
3. For each new specimen ID, attempt to match it with the active
   samples. Color cells / update values accordingly based on the
   following cases:
   *   0 matches (color cell red)
   *   1 match (leave cell uncolored, and change text to matching ID)
   * >=2 matches (color cell yellow)

And ...I think that should be it, knock on wood.
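The three cases above can be sketched as a tiny pure function. The helper name classifyMatches() is hypothetical (not part of this PR); the state names mirror the well classes discussed here ("unknown" is red, "indeterminate" is yellow).

```javascript
// Hypothetical sketch of the three-way matching decision described
// above, given the array of active sample IDs that matched a pasted
// specimen ID.
function classifyMatches(matches) {
  if (matches.length === 1) {
    // Exactly one active sample matches: use it as the well's contents.
    return { state: "matched", value: matches[0] };
  } else if (matches.length === 0) {
    // No matches: the well gets labelled "unknown" (red).
    return { state: "unknown", value: null };
  } else {
    // Two or more matches: flag for manual resolution (yellow).
    return { state: "indeterminate", value: null };
  }
}
```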
TODOs:

- Better then() failure handling on get_active_samples()
- Apply yellow well color (well-indeterminate CSS class IIRC) if multiple
  matches are possible.
- Remove debug console.log() stmts
I think this was from back before I fixed biocore#567
Also prettified the code, since I forgot to do that previously.
This avoids having to call getActiveStudy() twice, and removes an
avenue for a potential funky error where multiple calls could trigger
different results.
this includes some discussion of todos, etc
Right now it works ok if you don't reload the well. But ideally
this should save the indeterminate state on the backend? Which means
that I think I'll have to pretty much redo this js logic in python..?
tldr: it's slow, and could be faster
This change can definitely be ignored/reverted if desired, but this
makes it clear that if the user doesn't resolve this then these cells
will turn red when you reload the plate.
These will alert the user if something goes wrong. This sort of
paradigm is already in use in this codebase, so adding these is pretty
simple -- essentially just giving a short summary of the issue,
then dumping the error message.
This necessitated some messing around with each of the three
onRejected functions I added in 77cb5a6 in order to figure out which
arguments were passed where.

ERROR FUNCTIONS 1 AND 2

For the most part, any error functions that fail due to requests (i.e.
the error functions I just created in autocomplete_search_samples() and
in get_active_samples()) are passed a jqXHR, etc. as arguments -- so we
can just show the responseText.

(These error functions, if ever called,
should be called with a jqXHR since all they're really doing is waiting
on a series of requests to finish. I *guess* it's conceivably possible
they could fail not due to a request if jQuery's code somehow explodes,
but I doubt it.)

ERROR FUNCTION 3

The callback on the promise returned by get_active_samples() -- that
is, the error function declared in modifyWell() -- just receives a
normal rejectionReason as a string if there's an error in the success
function within get_active_samples(). This makes sense: if any of the
requests failed in get_active_samples(), that would trigger
get_active_samples()' declared error function (with a jqXHR, etc. as
arguments, since a request caused the failure) instead of the error
function on the promise returned by get_active_samples().
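The jqXHR-vs-string distinction above suggests writing onRejected callbacks defensively. Here's a hedged sketch of that pattern (the factory name makeErrorCallback() is hypothetical; bootstrapAlert() exists in this codebase, but its exact signature here is an assumption):

```javascript
// Sketch of a defensive onRejected callback: a failed AJAX request
// passes a jqXHR (which has a responseText), while a rejection raised
// inside a success handler passes a plain reason string.
function makeErrorCallback(context) {
  return function (reason) {
    var detail =
      reason && reason.responseText !== undefined
        ? reason.responseText // request failure: show the server's text
        : String(reason); // non-request failure: show the raw reason
    var message = "Error in " + context + ": " + detail;
    // bootstrapAlert() is this codebase's alert helper; fall back to
    // console.error when it isn't defined (e.g. in this sketch).
    if (typeof bootstrapAlert === "function") {
      bootstrapAlert(message);
    } else {
      console.error(message);
    }
  };
}
```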

TLDR

Phew! That was a lot of long paragraphs. The gist of things here is:
I tested each of the 3 new error callbacks locally, saw what the
responses I was getting were, and adjusted the bootstrapAlert() calls
accordingly to be more informative.
From manual testing and thinking things through, these adjustments
should be correct, but even if they aren't (and, say, error function 1
gets passed a string instead of a jqXHR) then the outcome is the same:
an error pops up with some sort of message. (The difference in that
case is that we'll see something like
"[Nice error message:] undefined" instead of
"[Nice error message:] [further information]", which isn't too
horrifying.)

fedarko commented Sep 6, 2019

As of now, I think this is ok to be reviewed—now that error handling is added in, the two remaining "caveats" are general, subjective issues that I'd like feedback on.

@AmandaBirmingham AmandaBirmingham requested review from AmandaBirmingham and removed request for charles-cowart September 6, 2019 22:25
@AmandaBirmingham
Collaborator

@fedarko Hiya--I'm stepping into Charlie's shoes for the moment :) I see that the Travis build for this PR did not succeed due to three failing unit tests. Please modify the PR to address these failures; I can do a review on it once the build passes.

Thanks for all the effort you are putting in on LabControl!

Adjusting tests re: the default limit no longer being 20.
Also adding some more tests now that "limit" is an exposed parameter.

A TODO is adding tests to labcontrol/db/tests/test_study.py that
check this functionality more closely (e.g. trying negative limits).
Previously in this PR, this wasn't directly tested, and things were
kind of funky -- Study.samples() said that it expected limit to be an
int, but the limit obtained from get_argument() in
StudySamplesHandler.get() was actually a str.

Major changes:

- Study.samples() now explicitly requires that limit is already an
  int.

- Added test_samples_with_limit() to TestStudy accordingly.

- StudySamplesHandler.get() now converts limit to an int (if it's been
  specified), before calling Study.samples(). If int() fails, then
  it'll raise a ValueError. (Similarly, Study.samples() raises a
  ValueError for invalid limits.) If StudySamplesHandler.get() runs
  into a ValueError, it sets a status code of 400 ("Bad Request").

- Added various invalid limit tests to
  test_get_study_samples_handler() accordingly.

Running tests seems to explode the database on my system
(I'll look into getting this fixed soon -- since I've been working on
the front-end mostly, this hasn't been a problem), so I'm going to just
kick this up to Travis for the time being.
For some reason, I forgot that there were 9 "SKM" samples in the
test dataset. The thing I was trying to test with

response = self.get('/study/1/samples?term=SKM&limit=30')

was that a limit greater than the total amount of matching output still
has no effect, even when a term is specified. This test should be fixed
now.

fedarko commented Sep 10, 2019

@AmandaBirmingham Thank you so much for the many helpful comments! I'm not going to be able to respond to them all today, but I'll get to work on hopefully addressing all of these tomorrow.

That is, this just tries to see if int(limit) works successfully.

I also overhauled the tests accordingly, and accounted for a few
silly corner cases (e.g. someone passes in float('inf') as a limit).

This should address a few of @AmandaBirmingham's comments in biocore#585.
-Specify originating functions in the "new" error messages added
 to the plateViewer in biocore#585
-Reword misleading sentence re: "assuming" in the docstring of
 merge_sample_responses()

Towards addressing @AmandaBirmingham's comments

fedarko commented Sep 11, 2019

@AmandaBirmingham I think I've addressed all of your comments here (I left a few comments "unresolved" where I wasn't sure if you'd like me to do more).

Please let me know if you'd like me to do anything else before we merge this in—thanks again!


@AmandaBirmingham AmandaBirmingham left a comment

Just one minor comment that I don't think requires action right now. Ready to merge after a second reviewer approves?

(Review comment on labcontrol/db/study.py -- marked outdated/resolved)

fedarko commented Sep 11, 2019

Since @gwarmstrong is out right now, should I request a review from someone else?


fedarko commented Sep 11, 2019

@qiyunzhu and I just sat down and walked through the code changes, and he has given his verbal approval of this PR.

While going over the code I noticed one small case that isn't explicitly tested in the Study.samples() unit test (strings of negative ints), so I added a check for that in the latest commit.

Anyway, once the latest commit passes on Travis (or at least fails solely due to the intermittent and unrelated download error mentioned in #588), I can merge this PR in.


fedarko commented Sep 12, 2019

Build passed—merging in now. Thanks @AmandaBirmingham and @qiyunzhu for looking over this!

@fedarko fedarko merged commit 9c1867d into biocore:master Sep 12, 2019