Improved colour themes from new data sources #419
After some discussion with @hyanwong and with possible users of this feature, I wanted to open a discussion about how it could work and specify a small project around it that can be costed. It is clear that the most sensible approach for highlighting regions of the tree depends on how simple it is to define those regions, and also on how much of the tree is involved in the highlights. For simple cases where only one species or clade (possibly including all direct ancestor nodes) is to be highlighted, it's relatively easy to pass around the information defining that region, and the tree can be expanded to include the highlighted area.

What we're interested in scoping out here is a solution for highlighting any generic region of the tree, e.g. a species list consisting of tens of thousands of species. One approach is to expand the database with another field for leaves and nodes to tell if they are in this region or not, but that's really very ugly, obliges OneZoom to play a big role in defining the regions (which may be managed by and for third parties) and doesn't scale well to cover large numbers of regions with different use cases. That's out, but I'm recording the possibility as part of the thought process.

The solution suggested is to have a string of digits (rather like the tree topology and cut map) that lines up with ordered leaves and ordered nodes and tells us immediately, for every leaf and node, if it is within the highlighted region or not. These strings can be stored in compressed files (I'll call it a tree region file) rather like the tree topology file we already have - this makes them static and easy to load and express on a tree view. The downside is that tree region files will need to be remade whenever the tree itself is remade, as the leaf and node positions will change.
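To make the "string of digits" idea concrete, here is a minimal sketch in Python. It assumes one digit per leaf (`1` = in the region, `0` = not) and gzip as the compression; the function names, the example leaf IDs and the choice of gzip are all illustrative assumptions, not the existing OneZoom topology-file format.

```python
import gzip


def build_region_string(ordered_ids, region_ids):
    """Return one digit per position in the tree ordering:
    '1' if that leaf/node is in the region, otherwise '0'."""
    region = set(region_ids)
    return "".join("1" if item in region else "0" for item in ordered_ids)


def compress_region(region_string):
    """Compress the digit string so it can be served as a static file."""
    return gzip.compress(region_string.encode("ascii"))


def load_region(blob):
    """Inverse of compress_region."""
    return gzip.decompress(blob).decode("ascii")


def in_region(region_string, position):
    """Constant-time membership test by position in the ordering."""
    return region_string[position] == "1"


# Hypothetical leaf ordering (made-up IDs) and a made-up highlighted set.
ordered_leaves = [101, 102, 103, 104, 105, 106]
highlighted = {102, 105, 106}

leaf_string = build_region_string(ordered_leaves, highlighted)
blob = compress_region(leaf_string)
restored = load_region(blob)
```

Because membership is looked up purely by position, the string is only valid for the exact tree build it was generated against, which is the rebuild cost noted above. A second string of the same shape would cover the ordered interior nodes.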
We'd need unit tests to make sure it all works, because a small mistake in the data could easily lead to nonsensical highlights that bear no resemblance to the intended region of the tree. Finally, if we later had more than one tree (e.g. a bigger tree including extinct species and maybe even a third tree including selected extinct species but not all) then we'd need three versions of these highlighted area static files and would need to serve the correct one to the visualisation.

In terms of the graphic approach to highlighting, I think what we have now for advanced search works OK, but there should be an option to add flags to highlighted species as well, and there should be an option to fully colour the tree by these highlights (rather than allow any colour scheme and attempt the highlights over the top, as it is now for advanced search and common ancestor marking). I guess that graphical part is probably reasonably clear and well defined.

We'd also need some kind of JSON format for the source information defining the tree region (I'll call it a tree region source) from which the tree region files can be built - I think it's fine just to make the tree region source an ordered list of OTT IDs with a pair of flags on each to say whether you include all descendants / ancestors of that OTT in the region as well.

The most thorny issue, and the part that Yan and I had most difficulty specifying, is the process of creating and updating the tree region files, which need to get rebuilt whenever the tree region source from the third party changes and also whenever the tree is rebuilt. We'd need to create an API to which you pass the URL of a JSON tree region source and get back a region map file. One key issue there is how tree region files would get cached and the extent to which we need to manage that process to make it performant vs. let the server do it all itself.
Should it be that we store all the tree region source files and rebuild all the tree region files as static files whenever the tree is built? Should it be that we don't store the tree region sources but do store the tree region files on the OneZoom server? Or should it be that we provide a service to create the tree region files and let third parties store them / rebuild them as needed?
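As a rough illustration of the tree region source proposed above (an ordered list of OTT IDs, each with a pair of flags for descendants and ancestors), the sketch below expands such a list into the full set of IDs in the region. The JSON field names (`ott`, `descendants`, `ancestors`), the toy tree and the ID values are all assumptions made for the example; no format has been agreed.

```python
import json

# Toy tree as child -> parent, keyed by made-up IDs (the root has parent None).
PARENT = {1: None, 2: 1, 3: 1, 4: 2, 5: 2, 6: 3, 7: 3}

# Hypothetical tree region source: field names are illustrative only.
SOURCE_JSON = """
[
  {"ott": 2, "descendants": true,  "ancestors": false},
  {"ott": 6, "descendants": false, "ancestors": true}
]
"""


def children_of(parent_map):
    """Invert a child -> parent map into parent -> [children]."""
    children = {}
    for child, parent in parent_map.items():
        if parent is not None:
            children.setdefault(parent, []).append(child)
    return children


def expand_region(source, parent_map):
    """Return the set of IDs covered by the source entries and their flags."""
    children = children_of(parent_map)
    region = set()
    for entry in source:
        ott = entry["ott"]
        if ott not in parent_map:
            continue  # ID not present in this tree build: skip it
        region.add(ott)
        if entry.get("descendants"):
            stack = list(children.get(ott, []))
            while stack:
                node = stack.pop()
                region.add(node)
                stack.extend(children.get(node, []))
        if entry.get("ancestors"):
            node = parent_map[ott]
            while node is not None:
                region.add(node)
                node = parent_map[node]
    return region


region = expand_region(json.loads(SOURCE_JSON), PARENT)
```

Whichever of the three storage options is chosen, this expansion step is the part that has to be rerun when either the source or the tree changes, so it is also the natural place to hang the unit tests mentioned earlier.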
Scoping this out further for a specific project.
Highlighting a bunch of individual things (vs. things and their descendants) has come up already, but a binary map would only make sense if you're highlighting a significant proportion of the tree, which seems unlikely (but I could be wrong). Having an external URL to define the list of OTTs (vs. stuffing in an incredibly long URL) sounds sensible, even if it creates some weirdness when that list is refreshed. Presumably it would need to be Pinpoints rather than OTTs to cope with highlighting nodes without an OTT? Caching the results of the Pinpoint -> OZid API lookup (possibly manually rather than via NGINX, so we have more control of expiry) would be a lot simpler than building this into the tree-building pipeline, and could potentially benefit other parts of the site too where we do bulk Pinpoint lookups.
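A manual cache with explicit expiry, as suggested here, might look something like the sketch below. The `lookup` callable stands in for whatever performs the bulk Pinpoint -> OZid request; it, the class name, the placeholder pinpoint strings and the OZid values are all hypothetical and not taken from the OneZoom codebase.

```python
import time


class ExpiringLookupCache:
    """Caches bulk lookup results, each entry expiring after ttl_seconds."""

    def __init__(self, lookup, ttl_seconds, clock=time.monotonic):
        self._lookup = lookup      # callable: list of keys -> {key: value}
        self._ttl = ttl_seconds
        self._clock = clock        # injectable so expiry can be tested
        self._entries = {}         # key -> (expiry_time, value)

    def get_many(self, pinpoints):
        now = self._clock()
        # Keys that are absent or expired (deduplicated, order preserved).
        missing = [p for p in dict.fromkeys(pinpoints)
                   if p not in self._entries or self._entries[p][0] <= now]
        for p in missing:
            self._entries.pop(p, None)  # drop stale entries before refreshing
        if missing:
            for key, value in self._lookup(missing).items():
                self._entries[key] = (now + self._ttl, value)
        return {p: self._entries[p][1] for p in pinpoints if p in self._entries}

    def expire_all(self):
        """Manual invalidation, e.g. after the tree is rebuilt."""
        self._entries.clear()


# Stand-in for the real lookup; records each batch it is asked for.
calls = []


def fake_lookup(pinpoints):
    calls.append(list(pinpoints))
    table = {"@taxon_a": 1001, "@taxon_b": 2002}  # placeholder data
    return {p: table[p] for p in pinpoints if p in table}


now = [0.0]
cache = ExpiringLookupCache(fake_lookup, ttl_seconds=60, clock=lambda: now[0])
first = cache.get_many(["@taxon_a", "@taxon_b"])  # triggers one lookup
second = cache.get_many(["@taxon_a"])             # served from the cache
now[0] = 61.0
third = cache.get_many(["@taxon_a"])              # expired, so looked up again
```

Calling `expire_all()` from the tree-rebuild step would cover the case where OZids shift between builds, which is the kind of control over expiry that a generic NGINX cache would not give as easily.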
We think this is actually a colour scheme, not a highlight scheme (rule of thumb: a colour scheme is a data visualisation tool, a highlight is a mechanism to find things). We want a colour scheme in which, based on a list of OTTs (non-OTT nodes won't be supported), we can assign a bunch of "traits" to a node/leaf and then map these to a colour (or CBF colour), for both a branch and the point itself. In our case the traits would inherit, as highlights do, and apply to the branches, but that may not be the case in general. Here we're looking at 50k OTTs, but that could be higher (imagine the IUCN colour scheme being implemented through the same mechanism).
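One way to picture the trait-to-colour mapping with optional inheritance is the sketch below. It assumes a "nearest ancestor with a trait wins" rule, a single trait per node, and a separate palette for the CBF variant; the trait names, hex values, toy tree and function names are all invented for illustration.

```python
# Toy tree as child -> parent, keyed by made-up OTT IDs.
PARENT = {1: None, 2: 1, 3: 1, 4: 2, 5: 2, 6: 3}

# Hypothetical traits supplied by a third party, keyed by OTT.
TRAITS_BY_OTT = {2: "marine", 5: "freshwater"}

# Illustrative palettes: a default one and a colour-blind-friendly (CBF) one.
PALETTE = {"marine": "#1f77b4", "freshwater": "#2ca02c"}
CBF_PALETTE = {"marine": "#0072b2", "freshwater": "#e69f00"}
DEFAULT_COLOUR = "#888888"


def resolve_trait(ott, parent_map, traits, inherit=True):
    """Return the node's own trait, or (if inherit) the nearest ancestor's."""
    node = ott
    while node is not None:
        if node in traits:
            return traits[node]
        if not inherit:
            return None
        node = parent_map[node]
    return None


def colour_for(ott, palette=PALETTE, inherit=True):
    """Map a node or leaf to a colour via its (possibly inherited) trait."""
    trait = resolve_trait(ott, PARENT, TRAITS_BY_OTT, inherit=inherit)
    return palette.get(trait, DEFAULT_COLOUR)


inherited = colour_for(4)                      # takes "marine" from node 2
overridden = colour_for(5)                     # own trait beats the ancestor's
untagged = colour_for(6)                       # no trait anywhere above it
not_inherited = colour_for(4, inherit=False)   # inheritance switched off
cbf = colour_for(4, palette=CBF_PALETTE)
```

Walking up the tree per node is fine for a sketch, but at 50k OTTs or more a single top-down pass that pushes traits onto descendants would likely be preferable.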
There's 2 places the inheritance logic could sit:
@jrosindell On a leaf we put the IUCN status in text once you zoom in enough. Of course this makes sense for the default extinction risk colour scheme, but if you select "popularity", should we actually be displaying the relative popularity in that place, and similarly for other colour schemes?
These could be from the advanced search or from other features.
For example, we could add a pin on the tree itself and/or colour all branches descending from a higher taxon if that higher taxon was selected for highlight.