RxRx19b is the second component of the RxRx19 dataset series released by Recursion. It shares data from a high-dimensional human cellular assay for COVID-19-associated disease and models the COVID-19-associated cytokine storm. For more information about RxRx19b, please visit RxRx.ai and see the associated preprint, Functional immune mapping with deep-learning enabled phenomics applied to immunomodulatory and COVID-19 drug discovery.
RxRx19b is part of a larger set of Recursion datasets that can be found at RxRx.ai and on GitHub. For questions about this dataset or any of the others, please email info@rxrx.ai.
The metadata can be found in metadata.csv and downloaded from here. The schema of the metadata is as follows:
| Attribute | Description |
|---|---|
| site_id | Unique identifier of a given site |
| well_id | Unique identifier of a given well |
| cell_type | Cell type tested |
| experiment | Experiment identifier |
| plate | Plate number within the experiment |
| well | Location on the plate |
| site | Indication of the location in the well where the image was taken (always 1 in RxRx19b) |
| disease_condition | The disease condition tested in the well: healthy (healthy cytokine cocktail), storm-severe (severe cytokine storm cocktail), or blank (no cytokines) |
| treatment | Compound tested in the well (if any) |
| treatment_conc | Compound concentration tested (in µM) |
| SMILES | Formula of tested compound (as CXSMILES/ChemAxon Extended SMILES) |
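
As a minimal sketch of working with this schema, the snippet below reads the metadata with Python's standard csv module and selects wells by disease_condition. The helper names (load_metadata, select_condition), the sample rows, the site_id and well_id formats, and the assumption that treatment_conc is empty when no compound was tested are all illustrative and not taken from the dataset itself.

```python
import csv
import io


def load_metadata(fileobj):
    """Read metadata rows into a list of dicts keyed by the schema's column names."""
    rows = []
    for row in csv.DictReader(fileobj):
        # Assumption: treatment_conc is empty when no compound was tested.
        conc = row.get("treatment_conc")
        row["treatment_conc"] = float(conc) if conc else None
        rows.append(row)
    return rows


def select_condition(rows, disease_condition):
    """Keep only the rows whose disease_condition matches the given value."""
    return [r for r in rows if r["disease_condition"] == disease_condition]


# Hypothetical two-row sample standing in for metadata.csv (values are made up).
SAMPLE = io.StringIO(
    "site_id,well_id,cell_type,experiment,plate,well,site,"
    "disease_condition,treatment,treatment_conc,SMILES\n"
    "HUVEC-1_1_AA02_1,HUVEC-1_1_AA02,HUVEC,HUVEC-1,1,AA02,1,storm-severe,,,\n"
    "HUVEC-1_1_AA03_1,HUVEC-1_1_AA03,HUVEC,HUVEC-1,1,AA03,1,healthy,,,\n"
)

rows = load_metadata(SAMPLE)
storm = select_condition(rows, "storm-severe")
print(len(rows), "rows,", len(storm), "in the storm-severe condition")
```

With the real file, the same function could be called as `load_metadata(open("metadata.csv", newline=""))`, assuming the file has a header row matching the attribute names above.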
The images are found in images/* and can be downloaded from here (n.b. this is 409GB).
The image data are 2048x2048 8-bit PNG files. The image paths, such as HUVEC-1/Plate1/AA02_s1_w3.png, can be read as:
- Experiment name: cell type and experiment number (HUVEC experiment 1)
- Plate number (1)
- Well location on plate (column AA, row 2)
- Site (1)
- Channel (3)
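
The path layout described above can be decoded with a regular expression. This is a sketch only: the pattern is inferred from the single example path, and the function name parse_image_path and the field names in the returned dict are hypothetical.

```python
import re

# Pattern inferred from the example path HUVEC-1/Plate1/AA02_s1_w3.png.
IMAGE_PATH = re.compile(
    r"^(?P<experiment>(?P<cell_type>[^/]+)-(?P<experiment_number>\d+))"
    r"/Plate(?P<plate>\d+)"
    r"/(?P<well>[A-Z]+\d+)_s(?P<site>\d+)_w(?P<channel>\d+)\.png$"
)


def parse_image_path(path):
    """Split an image path into experiment, plate, well, site, and channel."""
    match = IMAGE_PATH.match(path)
    if match is None:
        raise ValueError(f"unrecognized image path: {path!r}")
    fields = match.groupdict()
    # Convert the numeric parts; the well label stays a string such as "AA02".
    for key in ("experiment_number", "plate", "site", "channel"):
        fields[key] = int(fields[key])
    return fields


info = parse_image_path("HUVEC-1/Plate1/AA02_s1_w3.png")
print(info["experiment"], info["plate"], info["well"], info["site"], info["channel"])
```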
All six channels (w1 - w6) make up a single image of a given site.
Physical resolution: 0.65 micron/pixel.
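
To illustrate how the six channel files and the physical resolution fit together, the sketch below lists the channel paths for one site and converts the image width from pixels to microns. The helper names are hypothetical, and the paths are assumed to be relative to the images directory.

```python
IMAGE_SIZE_PX = 2048        # images are 2048x2048 pixels
MICRONS_PER_PIXEL = 0.65    # physical resolution stated above
CHANNELS = range(1, 7)      # channels w1 through w6


def channel_paths(experiment, plate, well, site=1):
    """Return the six per-channel file paths that make up one site image."""
    return [
        f"{experiment}/Plate{plate}/{well}_s{site}_w{channel}.png"
        for channel in CHANNELS
    ]


def field_of_view_microns():
    """Width (and height) of one image in microns: pixels times microns per pixel."""
    return IMAGE_SIZE_PX * MICRONS_PER_PIXEL


paths = channel_paths("HUVEC-1", 1, "AA02")
fov = field_of_view_microns()
print(paths[2])
print(f"field of view is about {fov:.1f} microns per side")
```

At 0.65 micron/pixel, a 2048-pixel edge corresponds to 2048 x 0.65 = 1331.2 microns, or roughly 1.33 mm per side.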
The deep learning embeddings can be found in embeddings.csv and downloaded from here (n.b. this is 41MB).
Each row in the csv has a site_id as described in the metadata schema. The remaining 128 columns are the embedding for that respective site.
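
A minimal sketch of reading the embeddings into a dictionary keyed by site_id follows. It assumes embeddings.csv has a header row with a site_id column; the names of the 128 embedding columns are not given here, so the code simply takes every other column in file order. The function name and the sample data are hypothetical.

```python
import csv
import io

EMBEDDING_DIM = 128  # number of embedding columns per row, as described above


def load_embeddings(fileobj):
    """Map each site_id to its embedding, stored as a list of floats."""
    embeddings = {}
    for row in csv.DictReader(fileobj):
        site_id = row.pop("site_id")
        # Remaining columns are taken in file order as the embedding vector.
        vector = [float(value) for value in row.values()]
        if len(vector) != EMBEDDING_DIM:
            raise ValueError(f"{site_id}: expected {EMBEDDING_DIM} values, got {len(vector)}")
        embeddings[site_id] = vector
    return embeddings


# Hypothetical one-row sample standing in for embeddings.csv (values are made up).
header = "site_id," + ",".join(f"feature_{i}" for i in range(EMBEDDING_DIM))
values = "site_a," + ",".join(str(i / 10) for i in range(EMBEDDING_DIM))
embeddings = load_embeddings(io.StringIO(header + "\n" + values + "\n"))
print(len(embeddings["site_a"]))
```

Because the keys are site_id values, each embedding can be matched to its row in the metadata through the site_id column of the schema.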
- August 2020: initial release
This work is licensed under the Creative Commons Attribution 4.0 International License. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/ or send a letter to Creative Commons, PO Box 1866, Mountain View, CA 94042, USA.
