Enhancement: add private data sources to the associations page heatmap #566
The current associations page has a heatmap-like visualization that only displays scores at the data type level. When custom data is loaded, it is often added as an extra data source under an existing data type. To draw the user's attention to the fact that there is custom association data in the associations page, it would be nice to have a feature that:
@pieterlukasse, just wanted to post a few quick UX observations about this change - and apologies for the length of this comment, but I figured it was best to put all my thoughts here. :)
The screenshots provided in the pull request are of the disease association page and on that page, it is easier to accommodate extra columns because the first column has the HGNC symbol. Below you will see screenshots from the target associations page, which has disease names in the first column.
Adding columns significantly reduces the width of the disease name column as well as the width of each individual score column. While the width of the first column could be fixed, it would likely impact the overall responsiveness and usability of the heat map because the width would need to be optimised to work on both the target and disease associations pages - the same table component is shared on both views.
And on the subject of disease names, currently, if a disease name is truncated, users can hover over it to see the full name. But if users want to compare multiple diseases with truncated names - as most will be if extra columns are added - it will be difficult because only one full name can be shown at a time with the on-hover action.
And although the width of each individual column can be reduced from its current width of 49px, the labels and their position require a width of at least 25px to be readable. Below that, the column names move too close together and it can be difficult to distinguish which column name belongs to which column. Also, as the column labels move closer together due to reduced column width (e.g. at 30px wide), it becomes more important to have a visual hierarchy so that users can distinguish between the different levels of association evidence - the overall association score, the overall data type score, and the individual data source score. A lack of visual hierarchy will likely cause users to misidentify the relationship between the different cells and could cause them to inadvertently believe that some associations are stronger or weaker than others.
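To make the width constraint concrete, here is a rough Python sketch of the arithmetic. The 49px and 25px figures come from the paragraph above; the function names, the assumption that score columns share the remaining width equally, and the table and first-column widths used in the example are all hypothetical and not taken from the webapp.

```python
CURRENT_COLUMN_WIDTH_PX = 49  # current score-column width, as stated above
MIN_READABLE_WIDTH_PX = 25    # below this, column labels become hard to read


def score_column_width(table_width_px, first_column_px, n_score_columns):
    """Width each score column gets once the name column is subtracted.

    Hypothetical helper: assumes all score columns split the remaining
    width equally, which may not match the real table component.
    """
    if n_score_columns <= 0:
        raise ValueError("need at least one score column")
    remaining = max(table_width_px - first_column_px, 0)
    return remaining / n_score_columns


def max_readable_columns(table_width_px, first_column_px,
                         min_width_px=MIN_READABLE_WIDTH_PX):
    """Largest number of score columns that stay at or above min_width_px."""
    remaining = max(table_width_px - first_column_px, 0)
    return remaining // min_width_px
```

For an assumed 1000px table with a 300px disease name column, 14 score columns would get 50px each, and no more than 28 columns would fit before labels drop below the 25px limit. Widening the name column to avoid truncation lowers that ceiling directly.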
White cells due to no data
Another observation is that the amount of white space that results from showing all specific data sources will present a usability issue - it can be difficult for users to follow the row if there are large blocks of white space. In the screenshot below, you'll see a large amount of white space because the given target does not have that data source and/or type.
In its current form, the heat map functions as a qualitative assessment of evidence based on the shade of blue. While users can make a quantitative assessment by hovering over each cell, this is not immediately obvious, so many users rely on the shade of blue to judge the strength of the association. I raise this because in conversations with users, I heard that seeing white cells in a row (even if the overall association score is 1.00) causes some users to believe this is a weaker association if a row above or below has more or all cells with some shade of blue (and the same overall association score of 1.00). For users familiar with our scoring method, it's less of a concern, but to new users it's not immediately apparent that the Platform uses a harmonic sum, which might be why the data source cells are a lighter blue or white while the data type cell is a darker blue.
And depending on the nature of the data being added, it may not be relevant to users in different therapeutic areas. For example, if the data is oncology related and would fit under the somatic mutations data type, a user conducting a search for a non-cancer gene (e.g. IL13) or disease (e.g. asthma) would see columns of white cells because that data source may not be relevant. I would caution against using
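As a small illustration of why two rows can share an overall score of 1.00 while looking very different, here is a sketch of a harmonic sum. It assumes the commonly described form (scores sorted in descending order, the i-th score divided by i squared, result capped at 1); the exact formula and normalisation used by the Platform are not stated in this thread, so treat the function and the example rows as assumptions.

```python
def harmonic_sum(scores, cap=1.0):
    """Harmonic sum of scores, capped at `cap`.

    Assumed form: sort descending, divide the i-th score by i**2, add up.
    The strongest score dominates; weaker ones contribute progressively less.
    """
    ordered = sorted(scores, reverse=True)
    total = sum(s / (i ** 2) for i, s in enumerate(ordered, start=1))
    return min(total, cap)


# Hypothetical rows: one strong cell and three empty (white) cells,
# versus a row where every cell has some shade of blue.
sparse_row = [1.0, 0.0, 0.0, 0.0]
dense_row = [1.0, 0.6, 0.4, 0.3]

print(harmonic_sum(sparse_row), harmonic_sum(dense_row))
```

Under these assumptions both rows reach the cap, so the mostly white row is not a weaker association even though it looks like one.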
For the rewrite project, I've been exploring the idea of freezing the first columns and rows for all data tables (including heatmaps). This would be similar to the Freeze Panes functionality in Excel and will help users keep the context of the row/column they are exploring.
I've also been exploring the idea of having a show-hide button for each data type so that users could see the underlying data sources and scores if they wanted.
And just wondering, will all data types have their individual sources shown - for example, showing all the genetic associations, somatic mutations, and pathways sources? Or will only the types with private data sources be expanded? I haven't investigated this directly, but based on research I've done for other features, my hypothesis is that users would expect to see all of the individual sources shown - or none - but not a mix of both.
@andrewhercules thanks for the detailed review! Good to see that you are already considering this feature for a future implementation. Freezing the first columns similar to Excel is a good idea. Having an option to let users expand the data type columns into their data source "sub-columns" is also a nice idea, I think. For my current implementation, however, I wanted to keep it simple and only expand the data type for which we have added a custom data source. The goal is to highlight this internal data source to users and allow them to sort the heatmap based on it. There is no requirement yet to expand other data types, hence this minimal implementation to start with. I think that once you have your own implementation ready, we will be happy to present it to our users. Not sure if auto expansion will still be needed by then. Perhaps a new way of highlighting custom data sources can be found in the new rewrite implementation.
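The minimal behaviour described above (expand only the data type that contains a custom data source) could be sketched roughly as follows. This is not the code from the PR; the function, the column descriptors, and the data type and source names are hypothetical, and it assumes that expanding a data type shows all of its sources rather than only the custom one.

```python
def heatmap_columns(data_types, custom_sources):
    """Build the ordered list of heatmap columns.

    data_types: dict mapping a data type name to its list of source names.
    custom_sources: set of source names that were loaded as custom data.
    Returns (kind, name) tuples, with kind being "overall", "datatype"
    or "datasource".
    """
    columns = [("overall", "overall")]
    for data_type, sources in data_types.items():
        columns.append(("datatype", data_type))
        # Expand into per-source sub-columns only when this data type
        # contains at least one custom source.
        if any(source in custom_sources for source in sources):
            columns.extend(("datasource", source) for source in sources)
    return columns


# Hypothetical configuration with one private source.
example_types = {
    "genetic_association": ["public_source_a"],
    "somatic_mutation": ["public_source_b", "my_private_source"],
}
example_columns = heatmap_columns(example_types, {"my_private_source"})
```

In this example only the somatic_mutation data type gains sub-columns, which is the mixed state of expanded and collapsed data types that the question above about user expectations refers to.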
For the current PR I made (opentargets/webapp#306): are there any parts that you think should be changed in this PR? Or are you OK with merging it as it is now?