GHA for doublet-detection module #462

sjspielman · 2024-05-24T16:15:27Z

Purpose/implementation Section

Please link to the GitHub issue that this pull request addresses.

Note that this PR should be considered "stacked on" #454.

What is the goal of this pull request?

This PR adds a GHA for the module, including a comment placeholder marking where we'll probably want to download data later in this module's development.

Is there anything that you want to discuss further?

It may be too early in this module's development to be filing this PR, so it's fine if it hangs out for a little bit since there shouldn't be any conflicts!

Author checklists

Analysis module and review

This analysis module uses the analysis template and has the expected directory structure.
The analysis module README.md has been updated to reflect code changes in this pull request.
The analytical code is documented and contains comments.
Any results and/or plots this code produces have been added to your S3 bucket for review.

Reproducibility checklist

Code in this pull request has been added to the GitHub Action workflow that runs this module.
The dependencies required to run the code in this pull request have been added to the analysis module Dockerfile.
If applicable, the dependencies required to run the code in this pull request have been added to the analysis module conda environment.yml file.
If applicable, R package dependencies required to run the code in this pull request have been added to the analysis module renv.lock file.

.github/workflows/run_doublet-detection.yml

Co-authored-by: Joshua Shapiro <josh.shapiro@ccdatalab.org>

allyhawkins · 2024-05-24T16:53:35Z

Just noting that I don't think this check actually ran because you didn't change any of the files in the module directory. To test, you probably just want to make a minor change and then revert before merging.

sjspielman · 2024-05-24T16:58:04Z

> To test, you probably just want to make a minor change and then revert before merging.

Yes, definitely! I probably should have marked this as a Draft PR, since it can't be tested until #454 goes in. I'll go ahead and draft this now.

jashapiro · 2024-05-24T19:46:36Z

While I support adding an Rproj file in 72c0048, this is also an argument for using rprojroot::find_root(rprojroot::is_renv_project) or similar for more explicit root finding than here::here().

sjspielman · 2024-05-24T19:50:44Z

> this is also an argument for using rprojroot::find_root(rprojroot::is_renv_project) or similar for more explicit root

Indeed!!! Will wait for after this build. If renv doesn't get cached...oooof.

sjspielman · 2024-05-28T14:48:58Z

@jashapiro do you have any insight into why the conda-lock environment doesn't seem to be fully activated/available here? https://github.com/AlexsLemonade/OpenScPCA-analysis/actions/runs/9270012159/job/25502032871#step:9:15143

.github/workflows/run_doublet-detection.yml

Co-authored-by: Joshua Shapiro <josh.shapiro@ccdatalab.org>

sjspielman · 2024-05-28T15:42:28Z

GHA passed! ✅

allyhawkins

LGTM!

sjspielman · 2024-05-28T17:13:43Z

Wanted to get an opinion here before merging @allyhawkins @jashapiro - running this workflow takes quite a bit of time! I'm wondering if, for now, we want to remove the trigger to run on every change to this module, and instead just add it to the workflow that runs all modules. Notably, I did not actually add it to the run-all workflow in this PR, so I suppose I'm also seeking opinions on whether I should go ahead and do that!

jashapiro · 2024-05-28T17:37:45Z

> Wanted to get an opinion here before merging @allyhawkins @jashapiro - running this workflow takes quite a bit of time! I'm wondering if, for now, we want to remove the trigger to run on every change to this module, and instead just add it to the workflow that runs all modules. Notably, I did not actually add it to the run-all workflow in this PR, so I suppose I'm also seeking opinions on whether I should go ahead and do that!

You are running the workflow on all your benchmarking datasets, in full. This is not something we want to have happen in general. The goal of these actions is to see that the code works with example data, so they should not be running with real data in GHA.

If you have a small dataset that you can download and run instead, that would be the preferred solution: you can set an ENV variable to select that option when running your wrapper script. At a minimum, you should run the script on only one dataset, but it would be better to find something completely different, if you can.

For now, I would probably completely disable this action. In general, I don't think benchmarking is going to be something we want running repeatedly in GHA, so it will not be a big problem. When you get to running on the ScPCA samples, you will use simulated data anyway.
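As a rough illustration of the environment-variable approach suggested above, here is a minimal sketch in Python. The module's actual wrapper script is not shown in this thread, so everything here is an assumption: the variable name `USE_TEST_DATA`, the dataset labels, and the function `select_datasets` are all hypothetical and chosen only to show the pattern of switching between full benchmark data and a small test dataset.

```python
import os

# Hypothetical dataset labels: none of these names come from the module itself.
FULL_DATASETS = ["benchmark-a", "benchmark-b", "benchmark-c"]
TEST_DATASETS = ["small-example"]


def select_datasets(env=None):
    """Return the datasets to process, based on a hypothetical USE_TEST_DATA flag."""
    env = os.environ if env is None else env
    # Treat "1", "true", or "yes" (any case) as turning test mode on.
    flag = env.get("USE_TEST_DATA", "").strip().lower()
    use_test = flag in {"1", "true", "yes"}
    return list(TEST_DATASETS) if use_test else list(FULL_DATASETS)


if __name__ == "__main__":
    for dataset in select_datasets():
        print(f"Would process dataset: {dataset}")
```

With a pattern like this, a GitHub Actions step could set the variable in its `env:` block so that CI runs only the small dataset, while a local run with the variable unset would still process everything.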

sjspielman · 2024-05-28T17:44:34Z

> If you have a small dataset that you can download and run instead, that would be the preferred solution: you can set an ENV variable to select that option when running your wrapper script.

I had been wondering about this too. I think this is a separate PR: rather than automatically downloading the full benchmark zip, include an option to use a smaller zip file with dummy data. I'd also have to change the dataset array here, too.

For now, I'll just turn it off. Thanks!

doublet-detection gha

538e1aa

sjspielman requested a review from allyhawkins as a code owner May 24, 2024 16:15

jashapiro reviewed May 24, 2024

.github/workflows/run_doublet-detection.yml

Update .github/workflows/run_doublet-detection.yml

5eb3a98

Co-authored-by: Joshua Shapiro <josh.shapiro@ccdatalab.org>

sjspielman marked this pull request as draft May 24, 2024 16:58

sjspielman added 2 commits May 24, 2024 17:30

Merge branch 'main' into sjspielman/461-doublet-gha

9e15d61

lil readme modification to trigger gha

bf4c374

sjspielman removed the request for review from allyhawkins May 24, 2024 17:33

sjspielman added 5 commits May 24, 2024 14:03

install a couple dependencies

4d088d1

names

4297800

update dependencies

d618710

igraph dep

38e669d

Rproj file needs to be added

72c0048

sjspielman added 3 commits May 28, 2024 09:08

try actually activating

09a27fb

find renv root

9a0adfc

Merge branch 'main' into sjspielman/461-doublet-gha

7541780

jashapiro reviewed May 28, 2024

.github/workflows/run_doublet-detection.yml

Update .github/workflows/run_doublet-detection.yml

1b606b6

Co-authored-by: Joshua Shapiro <josh.shapiro@ccdatalab.org>

sjspielman marked this pull request as ready for review May 28, 2024 15:41

sjspielman requested a review from allyhawkins May 28, 2024 15:42

allyhawkins approved these changes May 28, 2024

Merge branch 'main' into sjspielman/461-doublet-gha

b76b13e

sjspielman mentioned this pull request May 28, 2024

Test benchmark data for the doublet GHA #467

Open

turn off PR trigger

63ec223

sjspielman merged commit ce1b734 into AlexsLemonade:main May 28, 2024
2 checks passed

sjspielman deleted the sjspielman/461-doublet-gha branch May 28, 2024 18:50

sjspielman mentioned this pull request May 30, 2024

Add GHA for doublet-detection module #461

Closed
