Amalgkit 0.6.0.0 changelog #49

Closed
Hego-CCTB opened this issue Jun 29, 2021 · 10 comments

@Hego-CCTB
Collaborator

This is going to be a bigger update, affecting multiple currently open issues. So I'll post the changelog in here and refer to this from the other issues.

@Hego-CCTB
Collaborator Author

Changelog Amalgkit ver. 0.6.0.0

amalgkit csca

  • New subcommand for cross-species correlation analysis
  • USAGE:
amalgkit csca \
--out_dir PATH_TO_WORKING_DIRECTORY \
--file_species_tree PATH_TO_NWK_FILE \
--file_singlecopy PATH_TO_ORTHOFINDER_FILE \
--file_orthogroup PATH_TO_ORTHOFINDER_FILE \
--dir_uncorrected_curate_group_mean PATH_TO_CURATE_TABLES \
--dir_curate_group_mean PATH_TO_CURATE_TABLES \
--dir_sra PATH_TO_CURATE_TABLES \
--dir_tc PATH_TO_CURATE_TABLES \
--curate_group 'root,flower,leaf'
  • Note: This was tested on a 9-species plant dataset retrieved, quantified and curated by amalgkit. Further testing is still needed; the gene name format in particular can cause issues.
  • Note: --dir_uncorrected_curate_group_mean, --dir_curate_group_mean, --dir_sra and --dir_tc all point to the same directory if the input is unchanged curate output. These arguments are therefore inferred by default: if there is a curate/tables folder in the --out_dir path, amalgkit will find those files on its own.
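For readers who want a picture of the default inference described in the note above, here is a minimal sketch. It is not amalgkit's actual code: the helper name `infer_csca_dirs`, the `explicit` dictionary and the error behaviour are assumptions made for illustration. Only the four argument names and the `curate/tables` location come from the changelog.

```python
from pathlib import Path

# The four directory arguments named in the changelog.
CSCA_DIR_ARGS = (
    "dir_uncorrected_curate_group_mean",
    "dir_curate_group_mean",
    "dir_sra",
    "dir_tc",
)


def infer_csca_dirs(out_dir, explicit=None):
    """Hypothetical helper: resolve the four csca directory arguments.

    Explicitly given values win. Anything left unset falls back to
    <out_dir>/curate/tables, provided that folder exists.
    """
    explicit = explicit or {}
    default_dir = Path(out_dir) / "curate" / "tables"
    resolved = {}
    for name in CSCA_DIR_ARGS:
        value = explicit.get(name)
        if value is not None:
            resolved[name] = Path(value)
        elif default_dir.is_dir():
            resolved[name] = default_dir
        else:
            # Assumed behaviour: fail loudly instead of guessing a path.
            raise FileNotFoundError(
                f"--{name} was not given and {default_dir} does not exist"
            )
    return resolved
```

With unchanged curate output, calling `infer_csca_dirs(out_dir)` would return the same `curate/tables` path for all four arguments, which is why they can usually be omitted on the command line.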

amalgkit curate

  • Now throws a warning when transforming with TPM
  • Now throws an error when cstmm output files are detected (parsed from path) in combination with TPM transformation
  • Now includes the option --one_outlier_per_iter yes|no, which allows at most 1 sample per bioproject or per tissue to be removed in each iteration of the outlier removal
  • check_within_tissue_correlation() now removes samples with a Pearson's r below 0.2 (currently hard-coded, but can be made an optional input in the future)
  • --cleanup 0|1 is now plot_intermediate yes|no. "yes" calculates and prints the SVA correction after every single iteration of outlier removal. This can drastically increase runtimes.
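To illustrate the 0.2 correlation cutoff mentioned above, here is a rough Python sketch. It is not the code of check_within_tissue_correlation(), which lives in the R script. In particular, the changelog does not say what each sample is correlated against; this sketch assumes the mean Pearson's r against the other samples of the same group. The function names and the input layout are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # assumption: treat a constant profile as uncorrelated
    return cov / sqrt(var_x * var_y)


def low_correlation_samples(expression, groups, min_r=0.2):
    """Flag samples whose mean within-group Pearson's r is below min_r.

    expression: dict of sample name -> expression values (same gene order)
    groups:     dict of sample name -> group label (e.g. tissue)
    """
    flagged = []
    for sample, values in expression.items():
        peers = [
            s for s in expression
            if s != sample and groups[s] == groups[sample]
        ]
        if not peers:
            continue  # nothing to compare against in this group
        mean_r = sum(pearson_r(values, expression[p]) for p in peers) / len(peers)
        if mean_r < min_r:
            flagged.append(sample)
    return flagged
```

Exposing `min_r` as a parameter mirrors the remark that the hard-coded threshold could become an optional input later.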

amalgkit getfastq

  • Truncated the updated_metadata output files to only the columns essential for curate. This has two benefits: smaller file size (which very slightly improves curate performance) and, more importantly, the same number of columns across all individual files
  • obsoleted --ascp and all related options
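The column truncation above could look roughly like the following sketch. The column list is an assumption for illustration only, since the changelog does not say which columns count as essential, and `truncate_metadata` is a hypothetical helper, not an amalgkit function.

```python
import csv
import io

# Assumed column set for illustration; the real list is not given in this issue.
ESSENTIAL_COLUMNS = ["run", "scientific_name", "tissue", "bioproject"]


def truncate_metadata(tsv_text, keep=ESSENTIAL_COLUMNS):
    """Return tab-separated text containing only the columns in `keep`.

    Columns missing from the input are written as empty fields, so every
    output file ends up with the same number of columns.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=list(keep), delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    for row in reader:
        writer.writerow({col: row.get(col) or "" for col in keep})
    return out.getvalue()
```

Writing missing columns as empty fields, rather than dropping them, is what keeps the column count identical across files.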

amalgkit

  • added amalgkit csca subparsers

This should go up later today. I'm still debugging and I have to merge with the other updates today.

@kfuku52
Owner

kfuku52 commented Jun 29, 2021

Is there any option like --curate_group all to include all curate_group in the metadata table?

@Hego-CCTB
Collaborator Author

If --curate_group is left at none, it should parse out all unique values from the curate_group column and use those as input.
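As a rough sketch of that fallback (not the actual implementation; the function name `resolve_curate_groups` and the tab-separated input are assumptions made for illustration):

```python
import csv
import io


def resolve_curate_groups(curate_group_arg, metadata_tsv):
    """Hypothetical helper: decide which curate groups to analyse.

    A comma-separated argument such as 'root,flower,leaf' is used as given.
    If the argument is unset or 'none', all unique non-empty values of the
    curate_group column are returned in first-seen order.
    """
    if curate_group_arg and curate_group_arg.lower() != "none":
        return [g.strip() for g in curate_group_arg.split(",") if g.strip()]
    reader = csv.DictReader(io.StringIO(metadata_tsv), delimiter="\t")
    seen = []
    for row in reader:
        group = (row.get("curate_group") or "").strip()
        if group and group not in seen:
            seen.append(group)
    return seen
```

Note that this only works if the metadata table actually contains a curate_group column; without it the sketch returns an empty list.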

@kfuku52
Owner

kfuku52 commented Jun 29, 2021

Sounds good!

Hego-CCTB added a commit that referenced this issue Jun 29, 2021
- see #49 for the full change log

Signed-off-by: Hego_CCTB <matthias_freund@outlook.com>
@Hego-CCTB
Collaborator Author

Update is now live.
cbd6852

@kfuku52
Owner

kfuku52 commented Jun 30, 2021

The curate_group column is missing in the metadata table. Could you update amalgkit metadata?

@Hego-CCTB
Collaborator Author

Ah, it seems the column doesn't survive the last metadata step. There are 3 metadata sheets as output. curate_group is in the second output, but not in the third.

I'll investigate that.

@kfuku52
Owner

kfuku52 commented Jul 5, 2021

It seems that curate_group isn't used at all in transcriptome_curation.r. Am I missing something?

@Hego-CCTB
Collaborator Author

Yeah, you are right. I'm gonna need to replace any reference to tissue with curate_group.

@Hego-CCTB
Collaborator Author

Hego-CCTB commented Jul 9, 2021

Yeah, you are right. I'm gonna need to replace any reference to tissue with curate_group.

Amalgkit ver. 0.6.2.3

  • Replaced every 'tissue' or 'tissues' with 'curate_group' or 'curate_groups', including variables
    2174567
