Run doublet detection on ground truth data #454

Merged
merged 43 commits into from
May 24, 2024
Commits
47e58ee
conda environment with scrublet
sjspielman May 21, 2024
eb7d62d
Merge upstream main
sjspielman May 22, 2024
3306608
anndata to conda
sjspielman May 22, 2024
46df73b
remove outdated notebook
sjspielman May 22, 2024
26d449f
Scripts to download and format the data, and associated documentation
sjspielman May 22, 2024
b28074e
lock file update
sjspielman May 22, 2024
32b0ec9
remove scripts/.gitkeep
sjspielman May 22, 2024
40ef01e
Add scripts to run doublet detection and associated documentation
sjspielman May 22, 2024
1c2fafa
cores option
sjspielman May 22, 2024
555a649
readme updated
sjspielman May 22, 2024
a88de1b
add module bash script
sjspielman May 22, 2024
18102e8
Merge branch 'AlexsLemonade:main' into sjspielman/446-run-methods
sjspielman May 22, 2024
8f2df99
conda update
sjspielman May 22, 2024
e74f95a
newline for github
sjspielman May 22, 2024
371794d
Merge branch 'main' into sjspielman/446-run-methods
sjspielman May 22, 2024
54dafd2
Update analyses/doublet-detection/environment.yml
sjspielman May 22, 2024
178dec5
Apply suggestions from code review
sjspielman May 23, 2024
3fff9dc
Merge branch 'main' into sjspielman/446-run-methods
sjspielman May 23, 2024
0c81d92
make sure pandas is locked down
sjspielman May 23, 2024
519f061
update python script based on review comments
sjspielman May 23, 2024
30c9b2f
add zenodo to top comments and use outdir variable
sjspielman May 23, 2024
de07fe8
better path handling
sjspielman May 23, 2024
0dd2bdc
Actually, only do 1 file at a time
sjspielman May 23, 2024
6550615
too many inputs
sjspielman May 23, 2024
e5b8866
totally forgot a seed here
sjspielman May 23, 2024
53f7c9e
reproducibilify scdblfinder script
sjspielman May 23, 2024
d85a174
consolidate some code
sjspielman May 23, 2024
7dd5008
single script
sjspielman May 23, 2024
43e7f49
module run script massively updated for modularity
sjspielman May 23, 2024
0d0a64b
final.final environment
sjspielman May 23, 2024
85d0d71
documentation update
sjspielman May 23, 2024
8714800
results needs a readme
sjspielman May 23, 2024
e03fc28
formatted directory for each dataset, and put the original ones in raw
sjspielman May 23, 2024
c988dcb
Update analyses/doublet-detection/scripts/01a_detect-doublets.R
sjspielman May 23, 2024
9987a29
new line bork
sjspielman May 23, 2024
fbcf313
Apply suggestions from code review
sjspielman May 24, 2024
f997544
Apply suggestions from code review
sjspielman May 24, 2024
4db360e
change script names to reflect specific method
sjspielman May 24, 2024
9500945
Better argument handling and checking, and update Path specification
sjspielman May 24, 2024
1d03455
Update analyses/doublet-detection/scripts/01b_run-scrublet.py
sjspielman May 24, 2024
a9f81ff
spacing and better arg checking
sjspielman May 24, 2024
2b1410c
make the script a tad more flexible for future us by accommodating mu…
sjspielman May 24, 2024
d812619
Merge branch 'sjspielman/446-run-methods' of github.com:sjspielman/Op…
sjspielman May 24, 2024
61 changes: 0 additions & 61 deletions analyses/doublet-detection/01_compare-doublet-methods.Rmd

This file was deleted.

30 changes: 23 additions & 7 deletions analyses/doublet-detection/README.md
@@ -7,28 +7,44 @@ This module explores doublet detection across ScPCA datasets.
Methods used in this module include the following:

- [`scDblFinder`](https://bioconductor.org/packages/release/bioc/html/scDblFinder.html)
- [`scds`](https://bioconductor.org/packages/release/bioc/html/scds.html)
Please provide a description of your module, including:
- [`scrublet`](https://github.com/swolock/scrublet)


## Usage

_Forthcoming._
To run this module, first create the `openscpca-doublet-detection` conda environment, and then activate it:

```sh
# create the environment
conda-lock install --name openscpca-doublet-detection conda-lock.yml

# activate the environment
conda activate openscpca-doublet-detection
```

Then, run the following bash script:

```sh
./run_doublet-detection.sh
```

## Input files

_Forthcoming._
This module currently uses input data from [a Zenodo repository](https://doi.org/10.5281/zenodo.4562782) to explore doublet detection methods.
Specifically, these datasets are used: `hm-6k`, `pbmc-1B-dm`, `pdx-MULTI`, and `HMEC-orig-MULTI`.

Eventually, we'd like to run all ScPCA datasets through doublet detection, but this is still TBD for this specific module.

## Output files

_Forthcoming._
- `results/benchmark_results`
- `{dataset_name}_sce.tsv`: TSV files with `scDblFinder` inferences
- `{dataset_name}_scrublet.tsv`: TSV files with `scrublet` inferences
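
As a rough sanity check after a run, the sketch below lists the output files one would expect from the naming pattern above and reports any that are missing. It assumes the script is run from the module directory, that `results/benchmark_results` is relative to that directory, and that each of the four datasets named under "Input files" produces exactly one file per method; the `expected_outputs` helper is hypothetical and not part of the module.

```sh
# Dataset names as listed under "Input files" (assumed to match {dataset_name})
datasets="hm-6k pbmc-1B-dm pdx-MULTI HMEC-orig-MULTI"
# Output directory as listed under "Output files" (assumed relative to the module root)
results_dir="results/benchmark_results"

# Hypothetical helper: print each expected output path, one per line
expected_outputs() {
  for dataset in $datasets; do
    for suffix in sce scrublet; do
      echo "${results_dir}/${dataset}_${suffix}.tsv"
    done
  done
}

# Count and report expected files that are not present
missing=0
for path in $(expected_outputs); do
  if [ ! -f "$path" ]; then
    echo "missing: $path"
    missing=$((missing + 1))
  fi
done
echo "${missing} expected file(s) missing"
```

If any files are reported missing, rerunning `./run_doublet-detection.sh` inside the activated conda environment is the first thing to try.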

## Software requirements

This module uses `renv` to manage software dependencies.
A Dockerfile created using [these guidelines](https://openscpca.readthedocs.io/en/latest/software-platforms/docker/docker-images/#r-based-images) is also provided.
This module uses both `renv` and `conda` to manage software dependencies.
TODO: NEEDS UPDATING! A Dockerfile created using [these guidelines](https://openscpca.readthedocs.io/en/latest/software-platforms/docker/docker-images/#r-based-images) is also provided.

## Computational resources
