build: upgrade tool versions #155

Merged
merged 12 commits into from
Feb 3, 2024

Conversation


uniqueg commented Feb 1, 2024

Description

  • Upgraded most tools in the Conda environments in workflow/envs to their latest versions
  • Upgraded the Docker containers accordingly
  • Upgraded the Ubuntu Docker images from Focal Fossa to Jammy Jellyfish
  • Switched to the sra-tools and pigz images from BioContainers (all images except the Ubuntu-based ones now come from BioContainers)

Fixes #154

Type of change

  • Maintenance

Conventional Commits guidelines

Checklist:

  • My code changes follow the style of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing tests pass locally with my changes
  • I have updated the project's documentation

@uniqueg uniqueg requested review from balajtimate, mkatsanto, dominikburri and ninsch3000 and removed request for dominikburri February 1, 2024 13:57

uniqueg commented Feb 1, 2024

Expecting this to fail, of course - at the very least the expected outputs have to be updated. The main questions for now are (a) whether the tests run through at all and (b) whether the results are compatible.


uniqueg commented Feb 2, 2024

In all tests, we currently get an error in rule create_kallisto_index. I will restore the old version of kallisto and see whether the error goes away.

@uniqueg uniqueg marked this pull request as ready for review February 2, 2024 19:40

uniqueg commented Feb 2, 2024

The problem with kallisto index likely stems from this release that introduced a new and improved kallisto index:
https://github.com/pachterlab/kallisto/releases/tag/v0.50.0

I have now pinned kallisto to 0.48.0, the last version before the offending release (0.50.0) and an upgrade from the 0.46.2 we used previously. With that, all test workflows run through. As expected, the tests still fail because many of the outputs changed (different MD5 sums).
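
The choice of version described here (take the newest release that is still older than the first incompatible one) could be sketched roughly as follows. This is only an illustration: `parse_version` and `latest_before` are hypothetical helper names, and the candidate list contains just the three versions mentioned in this thread, not the full kallisto release history.

```python
def parse_version(version: str) -> tuple[int, ...]:
    """Turn a dotted version string such as '0.48.0' into a comparable tuple."""
    return tuple(int(part) for part in version.split("."))


def latest_before(candidates: list[str], first_bad: str) -> str:
    """Return the newest candidate that is strictly older than `first_bad`."""
    bound = parse_version(first_bad)
    older = [v for v in candidates if parse_version(v) < bound]
    if not older:
        raise ValueError(f"no candidate version is older than {first_bad}")
    return max(older, key=parse_version)


# Only the versions named in this thread; not a complete release list.
mentioned = ["0.46.2", "0.48.0", "0.50.0"]
pinned = latest_before(mentioned, first_bad="0.50.0")
print(pinned)
```

Comparing integer tuples rather than raw strings avoids surprises such as "0.9.0" sorting after "0.10.0".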

Now the question is whether they changed in a way that we believe is worse than before. Do we have any way of assessing the changes quantitatively? How have we done this previously when upgrading tools (I'm sure this is not the first time)? Did we just run the pipeline on one or more known samples, check whether the results look roughly the same and, if so, upgrade? I would like to do that here as well. What do you think, @mkatsanto?
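
One possible way to put a number on "roughly the same" is to correlate per-feature abundances from a run before and a run after the upgrade. The sketch below is an assumption about how such a check might look, not something this project already does: `compare_quant` is a hypothetical helper, the inputs are plain dictionaries mapping a feature ID to an abundance value, and the toy values are made up purely to show the call.

```python
import math


def pearson(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient of two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def compare_quant(old: dict[str, float], new: dict[str, float]) -> float:
    """Correlate log2(x + 1) abundances for features present in both runs."""
    shared = sorted(old.keys() & new.keys())
    old_log = [math.log2(old[key] + 1.0) for key in shared]
    new_log = [math.log2(new[key] + 1.0) for key in shared]
    return pearson(old_log, new_log)


# Made-up toy values, purely to show the call; not real pipeline output.
old_run = {"tx1": 10.0, "tx2": 0.0, "tx3": 250.5, "tx4": 3.2}
new_run = {"tx1": 10.4, "tx2": 0.0, "tx3": 248.9, "tx4": 3.0}
r = compare_quant(old_run, new_run)
print(f"Pearson r on log2(x + 1): {r:.4f}")
```

A correlation alone can hide systematic shifts, so in practice it would probably be paired with a look at the largest per-feature differences.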

@balajtimate As discussed, I rolled back the changes to the HTSinfer Docker image from this PR, so that it will still pull the latest image from the Zavolab registry after merging. We can open a separate PR for that once the image is fixed on BioContainers.


uniqueg commented Feb 2, 2024

Actually, looking at the 57 files whose MD5 sum changed, it's not too bad at all:

  • 56 are related to FastQC, so nothing downstream (alignments, quantification, etc.) depends on them
  • 1 is a file with version info (expected to change)

So I think the question about still getting compatible outputs doesn't arise.
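
For reference, an MD5 check of this kind can be sketched as below. This is a minimal illustration and not the project's actual test code: `md5sum` and `changed_files` are hypothetical helpers, the expected checksums are assumed to be available as a mapping from relative path to hex digest, and the demo runs on throwaway files with made-up names.

```python
import hashlib
import tempfile
from pathlib import Path


def md5sum(path: Path, chunk_size: int = 1 << 20) -> str:
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def changed_files(expected: dict[str, str], root: Path) -> list[str]:
    """Return relative paths that are missing or whose MD5 differs from the expected one."""
    failed = []
    for rel_path, want in sorted(expected.items()):
        target = root / rel_path
        if not target.is_file() or md5sum(target) != want:
            failed.append(rel_path)
    return failed


# Self-contained demo on throwaway files; names and contents are made up.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "unchanged.txt").write_bytes(b"same content")
    (root / "changed.txt").write_bytes(b"new content")
    expected = {
        "unchanged.txt": hashlib.md5(b"same content").hexdigest(),
        "changed.txt": hashlib.md5(b"old content").hexdigest(),
    }
    failed = changed_files(expected, root)

print(failed)
```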

Here's a list of the files whose MD5 check failed (renamed and grouped to make them easier to read/process):

fastqc/fq1/single/fastqc_data.txt
fastqc/fq1/single/fastqc.fo
fastqc/fq1/single/Images/adapter_content.png
fastqc/fq1/single/Images/duplication_levels.png
fastqc/fq1/single/Images/per_base_n_content.png
fastqc/fq1/single/Images/per_base_sequence_content.png
fastqc/fq1/single/Images/per_sequence_gc_content.png
fastqc/fq1/single/Images/per_sequence_quality.png
fastqc/fq1/single/Images/sequence_length_distribution.png
fastqc/fq1/paired/fastqc_data.txt
fastqc/fq1/paired/fastqc.fo
fastqc/fq1/paired/Images/adapter_content.png
fastqc/fq1/paired/Images/duplication_levels.png
fastqc/fq1/paired/Images/per_base_n_content.png
fastqc/fq1/paired/Images/per_base_sequence_content.png
fastqc/fq1/paired/Images/per_sequence_gc_content.png
fastqc/fq1/paired/Images/per_sequence_quality.png
fastqc/fq1/paired/Images/sequence_length_distribution.png
fastqc/fq2/paired/fastqc_data.txt
fastqc/fq2/paired/fastqc.fo
fastqc/fq2/paired/Images/adapter_content.png
fastqc/fq2/paired/Images/duplication_levels.png
fastqc/fq2/paired/Images/per_base_n_content.png
fastqc/fq2/paired/Images/per_base_sequence_content.png
fastqc/fq2/paired/Images/per_sequence_gc_content.png
fastqc/fq2/paired/Images/per_sequence_quality.png
fastqc/fq2/paired/Images/sequence_length_distribution.png
fastqc_trimmed/fq1/single/fastqc_data.txt
fastqc_trimmed/fq1/single/fastqc.fo
fastqc_trimmed/fq1/single/Images/adapter_content.png
fastqc_trimmed/fq1/single/Images/duplication_levels.png
fastqc_trimmed/fq1/single/Images/per_base_n_content.png
fastqc_trimmed/fq1/single/Images/per_base_sequence_content.png
fastqc_trimmed/fq1/single/Images/per_sequence_gc_content.png
fastqc_trimmed/fq1/single/Images/per_sequence_quality.png
fastqc_trimmed/fq1/single/Images/sequence_length_distribution.png
fastqc_trimmed/fq1/paired/fastqc_data.txt
fastqc_trimmed/fq1/paired/fastqc.fo
fastqc_trimmed/fq1/paired/Images/adapter_content.png
fastqc_trimmed/fq1/paired/Images/duplication_levels.png
fastqc_trimmed/fq1/paired/Images/per_base_n_content.png
fastqc_trimmed/fq1/paired/Images/per_base_sequence_content.png
fastqc_trimmed/fq1/paired/Images/per_sequence_gc_content.png
fastqc_trimmed/fq1/paired/Images/per_sequence_quality.png
fastqc_trimmed/fq1/paired/Images/sequence_length_distribution.png
fastqc_trimmed/fq2/paired/fastqc_data.txt
fastqc_trimmed/fq2/paired/fastqc.fo
fastqc_trimmed/fq2/paired/Images/adapter_content.png
fastqc_trimmed/fq2/paired/Images/duplication_levels.png
fastqc_trimmed/fq2/paired/Images/per_base_n_content.png
fastqc_trimmed/fq2/paired/Images/per_base_sequence_content.png
fastqc_trimmed/fq2/paired/Images/per_sequence_gc_content.png
fastqc_trimmed/fq2/paired/Images/per_sequence_quality.png
fastqc_trimmed/fq2/paired/Images/sequence_length_distribution.png
multiqc_summary/multiqc_data/multiqc_fastqc_fastqc_raw.txt
multiqc_summary/multiqc_data/multiqc_fastqc_fastqc_trimmed.txt
salmon_indexes/homo_sapiens/31/salmon.idx/versionInfo.json
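
The grouping behind the 56 + 1 breakdown can be reproduced with a short script. The sketch below rebuilds the list above from its regular structure and counts the files per category. The `classify` rules are my own assumption about how the files were grouped (MultiQC tables with "fastqc" in their name are counted as FastQC-related), not something taken from the test suite.

```python
from collections import Counter

# Files that appear once per FastQC report in the list above.
REPORT_FILES = ["fastqc_data.txt", "fastqc.fo"] + [
    f"Images/{name}.png"
    for name in (
        "adapter_content",
        "duplication_levels",
        "per_base_n_content",
        "per_base_sequence_content",
        "per_sequence_gc_content",
        "per_sequence_quality",
        "sequence_length_distribution",
    )
]

failed = [
    f"{stage}/{sample}/{name}"
    for stage in ("fastqc", "fastqc_trimmed")
    for sample in ("fq1/single", "fq1/paired", "fq2/paired")
    for name in REPORT_FILES
] + [
    "multiqc_summary/multiqc_data/multiqc_fastqc_fastqc_raw.txt",
    "multiqc_summary/multiqc_data/multiqc_fastqc_fastqc_trimmed.txt",
    "salmon_indexes/homo_sapiens/31/salmon.idx/versionInfo.json",
]


def classify(path: str) -> str:
    """Assign a failed file to a coarse category (assumed grouping rules)."""
    top = path.split("/", 1)[0]
    if top in {"fastqc", "fastqc_trimmed"}:
        return "FastQC"
    if top == "multiqc_summary" and "fastqc" in path:
        return "FastQC"
    if path.endswith("versionInfo.json"):
        return "version info"
    return "other"


counts = Counter(classify(path) for path in failed)
print(len(failed), dict(counts))
```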

@uniqueg uniqueg enabled auto-merge (squash) February 2, 2024 21:08
@uniqueg uniqueg merged commit a82ac5d into dev Feb 3, 2024
11 checks passed
@uniqueg uniqueg deleted the upgrade_tool_versions branch February 3, 2024 17:50