The data and scripts in this repository generate the figures in the main text and supplementary text of the following manuscript: Moderated designs can balance between batch-effect mitigation and cell loss due to hashtag-assisted pooling in single-cell experiments. Budha Chatterjee, Katrina Gorga, Carly Blair, Yuko Ohta, Elizabeth M Hill, Christopher T Boughter, Martin Meier-Schellersheim, Nevil J Singh. bioRxiv. 2025.10.16.682943.
The goal is to read, transform and integrate single-cell count matrices from the different pools of an experiment (kept in the directory ./data) and to extract different experimental designs from them. These designs are then evaluated for how well they mitigate batch effects and for the cell loss caused by demultiplexing.
This R markdown file reads the h5 files from the directory ./data, carries out demultiplexing, extracts the different designs, and applies different transformation and integration steps to each design. It then saves the normalized, integrated objects in the directory ./SObjects. Create this directory locally before running the script. Once the script has run, the rds objects saved in this directory take up close to 10 GB of disk space. These objects are used by subsequent scripts, but they are not provided here (due to GitHub space constraints) and need to be generated locally by running this script.
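The setup step described above can be sketched in base R. This is a minimal illustration, not the script's actual code: the helper name `prepare_dirs` is hypothetical, and only the directory names ./data and ./SObjects come from the text.

```r
# Hypothetical helper: create the output directory and list the input h5 files.
prepare_dirs <- function(data_dir = "./data", out_dir = "./SObjects") {
  # Create the output directory if it does not exist yet.
  if (!dir.exists(out_dir)) {
    dir.create(out_dir, recursive = TRUE)
  }
  # Collect all h5 files in the data directory (empty vector if none are found).
  h5_files <- list.files(data_dir, pattern = "\\.h5$", full.names = TRUE)
  list(out_dir = out_dir, h5_files = h5_files)
}

paths <- prepare_dirs()
print(paths$h5_files)
```

Running this from the repository root before the first script should be enough to make sure ./SObjects exists and to check that the h5 files are where the script expects them.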
This R markdown file calculates the batch effects across all the processed objects (designs I-VI). Per-cell entropy measurements are calculated for each design and each analytical pipeline and saved as rds objects (lists) in the directory ./BEffects. These objects are small enough to be provided in the repository, and they are used by other scripts (see below).
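As a rough illustration of a per-cell entropy score, the sketch below computes the Shannon entropy of the pool labels among each cell's nearest neighbours. It is an assumption about the general approach, not the repository's implementation: the function names, the neighbour-index matrix `nn_idx`, and the toy labels are all hypothetical.

```r
# Shannon entropy (in bits) of a vector of batch labels.
shannon_entropy <- function(labels) {
  p <- table(labels) / length(labels)
  p <- p[p > 0]              # drop empty categories so 0 * log2(0) never occurs
  -sum(p * log2(p))
}

# Per-cell entropy: for each cell, look up the batch labels of its neighbours.
# nn_idx is a cells x k matrix of neighbour row indices (hypothetical input).
per_cell_entropy <- function(nn_idx, batch) {
  apply(nn_idx, 1, function(idx) shannon_entropy(batch[idx]))
}

# Toy example: 4 cells, 2 neighbours each, two pools "P1" and "P2".
batch  <- c("P1", "P1", "P2", "P2")
nn_idx <- matrix(c(2, 3,
                   1, 2,
                   1, 4,
                   3, 4), ncol = 2, byrow = TRUE)
print(per_cell_entropy(nn_idx, batch))
```

Under this definition, a cell whose neighbours all come from one pool scores 0, and the score grows as the neighbourhood draws more evenly from several pools.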
This markdown uses the outputs saved by Hash_well_01 and Hash_well_02. It generates several of the figures in figure 3 of the manuscript and the series of supplementary figures showing the UMAP visualizations.
This markdown carries out the differential expression analysis between different samples within the same pool, or between the same samples coming from different pools.
This markdown generates the plots of GDE and IHE in figure 4, as well as supplementary figure S5.
This markdown estimates the x and y parameters from the hashtag data.