Revamp dataset selection UI #62
Looks very nice! I think this is a good direction. Instead of using this table on the splash page, I do think it'd be nicer to keep a small number of selected tiles followed by a link to "see more…" that goes to another page which uses this table view to show all builds/datasets (nextstrain.org/core or something). |
I'd be happy with something like this. I had thought about something like nextstrain.org/pathogens or nextstrain.org/datasets to give full listing of core and non-core, but that nextstrain.org/core could just show full listing of core datasets, the same way that nextstrain.org/groups/grubaughlab would show listing of grubaughlab datasets. I imagine also allowing fields to be sorted and having an elastic search box to filter rows. |
Seems good to be thinking about this. |
Notes from talking to @eharkins this morning while reviewing #271. (We felt that big UI changes to the components that multiple pages will use, e.g. /builds, /community, /flu, /sars-cov-2, were beyond the scope of that PR). General concept for the React component(s), which can be utilised across many pages: |
Thanks for the detailed sketch @jameshadfield. I think these are really interesting thoughts. I have a few small comments, but I'll try to spend more time throwing together an alternative design. I do think spitballing a few options here is the best starting point. Specific comments:
In the flu example our paths are:
I'd argue that the dataset UI should mirror URL paths in order to work correctly for arbitrary datasets uploaded to a group's S3 bucket or posted to GitHub. In this case we actually have: "animal flu type" / "subtype" / "gene" / "resolution" and "animal flu type" / "subtype" / "gene". But for the flu example, you've split the top level into separate entries. I don't see how we can generically know that this is the right cut for an arbitrary set of hierarchical paths.
I'm assuming this is then effectively the current interface of a series of datasets in alphabetical order in columns, i.e. I don't like how attempting a hierarchical interface breaks down when encountering a diverse array of datasets. I want a single UI that could smoothly accommodate all datasets on nextstrain.org. This current sketch would have one UI for the neherlab group (with single datasets in columns) and a separate boxy layout for seasonal flu.
Thanks again for sketching things out. I'll try to put something together for criticism. |
Thanks for the comments @trvrb! The hierarchical UI implementation very much depends on the hierarchy being encoded in the YAML rather than being inferred from paths. So in that sense it is as flexible as the encoding in the YAML and could allow a YAML made up of a mix of hierarchical and non-hierarchical groups in the same listing, and would only display hierarchies where they are encoded. I will say that while implementing the more structured layout, I had some thoughts about how much more flexible a less structured UI could potentially be. The question is: do we lose anything we care about in the name of flexibility? As an extreme example, we could list all datasets at the same level (in something collapsible so that it didn't take up too much space on the page) with the main way of finding things being via search / filter. The only major flaw I can see with this is exploring when you don't know what you're looking for. But we can suggest filters that help with this. I would imagine this taking the best of both the Auspice filter UI: You could imagine with a really good dataset search, pages like /influenza are not necessary at all until we have more bespoke resources and ways of exploring that pathogen or group of datasets - for example, the map for sars-cov-2, or something analogous for flu which illustrates the hierarchy visually. Instead, the splash page could just feature the dataset search / filter UI, replacing cards in the "Explore pathogens" section with suggested filters/searches (could still have the nice art from the cards, but when you click them it scrolls to the dataset-search section and applies the appropriate search/filter). I can try to implement something more on this end of the spectrum to contrast the more structured approach if you think it would be worth seeing. It's more difficult to implement, though, so perhaps I could start with a sketch for feedback. |
This is definitely the direction I was thinking towards. "flu" pages could just be a filter for "flu" in the dataset name that belong to "core" Nextstrain datasets. I'd want to expose some metadata (maintained by, date updated, tip count) as well as title in this flat list. Just like the Auspice intersecting filter UI you could filter to "flu" in dataset name and also belonging to "blab". Or have union filters like "blab" and "neherlab" when they apply to the same metadata element also like in Auspice filter. And I really like the idea of keeping a few small tiles (or new UI elements) that would serve to apply suggested filters. |
Okay. I've made a simple sketch with Here, I've made a simple line list that has three elements: dataset name (synonymous with splitting JSON by This lends itself to a simple filtering by name in dataset (matching for Hovering over a dataset would reveal a bit more metadata using the same style of hover panel we do in Auspice: Here, in addition to a unified dataset select UI that would include core, community and groups, each Groups page would contain the same dataset select UI but only for datasets belonging to this group. Generally, I think this would be a nicely flexible and simple approach. |
A couple further considerations on this flat approach:
or https://nextstrain.org/groups/blab/ncov would bring up a dataset select UI with only
This would replace the current need for a
|
As Eli alludes to above, I would update the splash page to keep a few highlight tiles under "Explore pathogens" and "From the community" but mostly push prominent links to new URLs |
Thanks for taking the time to sketch this out @trvrb! Judging by the proposal of a |
I guess it can be a combination of approaches for sourcing a dataset listing, e.g. must be in the yaml for community datasets, vs s3 presence for nextstrain datasets (modified by |
Thanks Eli. I am indeed suggesting that, for this generic dataset selection UI, we'd generate the data on the fly by scraping relevant S3 buckets. This could be generated into a JSON file or a YAML file that the nextstrain.org front-end would reference when displaying datasets, but the "source of truth" would be the S3 buckets. For Community datasets, the script to trawl GitHub would be a bit more complicated, but the end result would be a JSON or YAML file similar to the one produced by the S3 trawling. The |
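As a rough illustration of the "bucket is the source of truth" idea above, here is a minimal Python sketch that turns a listing of object keys into a JSON file the front-end could read. Everything in it is an assumption for illustration: the function name `build_manifest`, the manifest shape, the rule that every `.json` key is a dataset, and the example keys. Fetching the key listing from S3 (or GitHub) is deliberately left out.

```python
import json


def build_manifest(keys_by_source):
    """Turn per-source object keys into a JSON dataset listing.

    `keys_by_source` maps a source label (hypothetical, e.g. "core" or a
    group name) to the object keys found in that source's bucket.
    Assumption: every key ending in ".json" is one dataset.
    """
    datasets = []
    for source, keys in sorted(keys_by_source.items()):
        for key in sorted(keys):
            if key.endswith(".json"):
                # Strip the extension to get the dataset name.
                datasets.append({"source": source, "name": key[: -len(".json")]})
    return json.dumps({"datasets": datasets}, indent=2)


if __name__ == "__main__":
    # Hypothetical key listings; a real script would obtain these from the buckets.
    print(build_manifest({"core": ["zika.json", "README.md"], "blab": ["ncov.json"]}))
```

Because the listing is regenerated from the keys each time, a dataset that disappears from a bucket also disappears from the listing without any manual edits.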
I can further clarify how I imagine this view component would work (in terms of things like "infinite-scroll", etc...). Edit: In the below screenshots I've clarified that there's an internal vertical scrollbar with "infinite scroll". |
The general search bar is useful if you know what you are looking for but may not be great for general browsing. Can we have some dynamic dropdown boxes that would "guide" the dataset filtering? (This is similar to how auspice dataset selection currently works). The "hierarchy" is hidden within these dynamic dropdown boxes, but the display of the datasets would be the infinite scroll list proposed above. |
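One way to read the dynamic dropdown idea above is that each dropdown offers only the values that still lead to at least one dataset, given the earlier choices. The Python sketch below assumes dataset names are "/"-separated paths; the function names and the example dataset names are hypothetical.

```python
def next_options(datasets, chosen):
    """Values to offer in the next dropdown, given values chosen so far."""
    depth = len(chosen)
    options = set()
    for name in datasets:
        parts = name.split("/")
        # Keep only datasets consistent with the earlier dropdowns
        # that still have a further level to choose from.
        if parts[:depth] == list(chosen) and len(parts) > depth:
            options.add(parts[depth])
    return sorted(options)


def matching_datasets(datasets, chosen):
    """Datasets to show in the flat list for the current dropdown state."""
    depth = len(chosen)
    return [d for d in datasets if d.split("/")[:depth] == list(chosen)]


# Hypothetical dataset names, for illustration only.
DATASETS = [
    "flu/seasonal/h3n2/ha/2y",
    "flu/seasonal/h3n2/na/2y",
    "flu/seasonal/vic/ha/2y",
    "ncov/global",
    "zika",
]

if __name__ == "__main__":
    print(next_options(DATASETS, []))
    print(next_options(DATASETS, ["flu", "seasonal"]))
    print(matching_datasets(DATASETS, ["flu", "seasonal", "h3n2"]))
```

The hierarchy stays implicit in the paths here; the list of datasets itself remains flat.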
Agreed re: Jover's comments about discovery. The dynamic dropdowns remind me of faceted search interfaces, which are commonly used for filtering but also play a nice role in enabling discovery by surfacing properties and values of potential interest. I wonder about using facet lists for each hierarchy level, so you can a) pick more than one value and b) see more values to choose from at once without having to disclose a dropdown. Facet lists could also be used for other dataset properties, like contributor, last updated date range (within last week, within last month, within last 6 months, etc.). As a convenient, rough example, here's one example of a faceted search I've previously built: You can click on any value, which immediately updates the results list (not shown) as well as the counts for the values in other facets. Values within a single facet are ORed, values across facets are ANDed. There's also a freeform text filter (not shown) which is ANDed with any facet selections. |
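The facet semantics described above (OR within a facet, AND across facets, free text ANDed on top) can be sketched in a few lines of Python. This is not the implementation behind the example mentioned in the comment; the record fields, function names, and sample datasets are assumptions, and counting a facet's values while ignoring that facet's own selection is one common convention rather than something stated in the thread.

```python
from collections import Counter


def matches(dataset, selections, text=""):
    """True if the dataset passes every facet (AND) via any selected value (OR)."""
    for facet, values in selections.items():
        # An empty selection for a facet means "no restriction".
        if values and dataset.get(facet) not in values:
            return False
    # The free-text filter is ANDed with the facet selections.
    if text and text.lower() not in dataset["name"].lower():
        return False
    return True


def facet_counts(datasets, selections, facet, text=""):
    """Count values of one facet among datasets passing all *other* facets."""
    others = {f: v for f, v in selections.items() if f != facet}
    return Counter(d[facet] for d in datasets if matches(d, others, text))


# Hypothetical records, for illustration only.
DATASETS = [
    {"name": "flu/seasonal/h3n2/ha", "contributor": "core", "updated": "last week"},
    {"name": "zika", "contributor": "core", "updated": "last 6 months"},
    {"name": "ncov", "contributor": "blab", "updated": "last week"},
    {"name": "flu/seasonal/h1n1pdm/ha", "contributor": "neherlab", "updated": "last month"},
]

if __name__ == "__main__":
    picked = {"contributor": {"blab", "neherlab"}, "updated": {"last week"}}
    print([d["name"] for d in DATASETS if matches(d, picked)])
    print(facet_counts(DATASETS, picked, "contributor"))
```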
this offers an alternative way to list builds that is more flexible - listing them in a flat (non-hierarchical) interface that can be filtered. The listing can easily be converted into a table or other. see #62 (comment) for more details
I see what both Jover and Tom are getting at and I appreciate the feedback, but I think there are a couple of things going on. The primary one is that we don't actually have labels for any of the levels of hierarchy, so we can't have a normal faceted interface, i.e. we don't know about "cohort", "tissue", etc. In the case of flu, Eli has gone in and made a manual curation of these levels of "subtype", "segment", etc., but generally we won't have this. We just have the word in the dataset file name and we have words separated by "/". My intent here is to treat this filtering as a bag-of-words model, so that the dataset I think that just having dataset filtering provides the same sort of discoverability as the series of dropdowns that Jover proposes above. Here's how I imagine that typing "h5n" into the filter box would proceed. You'd get the same sort of Auspice dropdown with autocomplete of all the words that match "h5n...", in this case, "h5n1" and "h5nx". Selecting "h5nx" in the dropdown creates a blue pillbox with "h5nx" and the same eyeball / trash can as used in the Auspice interface. This filters to just datasets possessing the word "h5nx" in their bag-of-words. Notice that it doesn't matter where in the list "h5nx" appears: it will filter to both If one were to subsequently filter to "pb1", you'd get two pillboxes under "Filtered to" and the dataset list would contain just their intersection, in this case just However, I take the general point about discoverability and I'd imagine you could do something like the following: This is just picking the most common words across all the bag-of-words datasets and giving a list of these words as suggestions. Clicking on "flu" would add a filter to "flu" and update the word counts (à la Tom's faceted UI above, just that things are necessarily flat). Sketch file of the above is available as |
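To make the bag-of-words behaviour concrete, here is a small Python sketch covering the three pieces described above: autocomplete on a typed prefix, intersecting filters, and suggestions from the most common words. It assumes dataset names are "/"-separated; the function names and the example dataset names are hypothetical and are not the datasets shown in the sketches.

```python
from collections import Counter


def bag_of_words(name):
    """Split a dataset name on "/" into an unordered set of lowercase words."""
    return frozenset(w for w in name.lower().split("/") if w)


def autocomplete(datasets, typed):
    """Sorted words, across all datasets, that start with the typed text."""
    words = set().union(*(bag_of_words(d) for d in datasets))
    return sorted(w for w in words if w.startswith(typed.lower()))


def apply_filters(datasets, selected):
    """Keep datasets whose bag-of-words contains every selected word."""
    wanted = {w.lower() for w in selected}
    return [d for d in datasets if wanted <= bag_of_words(d)]


def suggest_words(datasets, selected=(), limit=5):
    """Most common words among the currently filtered datasets."""
    remaining = apply_filters(datasets, selected)
    counts = Counter(w for d in remaining for w in bag_of_words(d))
    for w in selected:
        counts.pop(w.lower(), None)  # do not suggest words already applied
    # Sort by count (descending), then alphabetically, for a stable order.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:limit]


# Hypothetical dataset names, for illustration only.
DATASETS = [
    "flu/avian/h5nx/pb1",
    "flu/avian/h5nx/ha",
    "flu/avian/h5n1/pb1",
    "groups/example/h5nx/na",
    "zika",
]

if __name__ == "__main__":
    print(autocomplete(DATASETS, "h5n"))
    print(apply_filters(DATASETS, ["h5nx", "pb1"]))
    print(suggest_words(DATASETS, ["flu"], limit=2))
```

Because each name is reduced to a set, the position of a word in the path does not affect matching, which is the property the comment relies on.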
The current "tile" UI to navigate to datasets on the splash page works well for a few items; it's attractive and approachable. However, I can imagine aiming for ~20 core datasets alongside about as many non-core datasets. I'd propose a general UI for dataset selection. This would get used on the splash page, but could also be used on, say, nextstrain.org/groups/blab/ to display blab-specific datasets.
I imagine this working by crawling the S3 bucket at some interval and collecting datasets and the meta portion of the JSON (primarily "updated").
I wanted to combine datasets that begin with the same filestem the way we do now, otherwise the 40 seasonal flu datasets will overwhelm the single Zika dataset. After playing around for a little while, here's what I came up with:
The filestem is used to collect datasets under a single umbrella. The dataset owner is displayed alongside logo as well as date updated for the most recent dataset. The toggle on the left can be used to display a list of all datasets collected under the same umbrella:
This reuses the "list" styling in Auspice built for filters.
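The grouping described above can be sketched in Python as follows. The assumptions are loud ones: that the "filestem" is the part of the dataset name before the first "_" separator, that "updated" is an ISO date string (so string comparison orders dates correctly), and that the function name, output shape, and example records are all hypothetical.

```python
from collections import defaultdict


def group_by_filestem(datasets, sep="_"):
    """Collect datasets under their filestem and report the latest update."""
    groups = defaultdict(list)
    for d in datasets:
        # Assumption: the filestem is everything before the first separator.
        stem = d["name"].split(sep, 1)[0]
        groups[stem].append(d)
    summary = []
    for stem in sorted(groups):
        members = sorted(groups[stem], key=lambda d: d["name"])
        summary.append({
            "filestem": stem,
            # ISO dates compare correctly as strings.
            "updated": max(m["updated"] for m in members),
            "datasets": [m["name"] for m in members],
        })
    return summary


# Hypothetical records, for illustration only.
DATASETS = [
    {"name": "flu_seasonal_h3n2_ha_2y", "updated": "2018-05-01"},
    {"name": "flu_seasonal_h1n1pdm_ha_2y", "updated": "2018-05-03"},
    {"name": "zika", "updated": "2018-04-12"},
]

if __name__ == "__main__":
    for row in group_by_filestem(DATASETS):
        print(row["filestem"], row["updated"], len(row["datasets"]))
```

Each summary row corresponds to one umbrella entry, and its "datasets" list is what the toggle would expand.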
This is related to issue #48 (manifest). Sketch mock can be found on Google Drive here.