add HGTrobustness_parsing
This script, HGTrobustness_parsing.py, is designed to facilitate phylogenetic validation of putative Horizontal Gene Transfer (HGT) events, based on protein counts and taxonomic ratios within the sister branches of phylogenetic trees.
Usage
It is run from the command line and requires several arguments:
--fasttree_tree_results or -a: This is the file path to fasttree_tree_results from AvP results. This argument is required.
--fasttree or -t: This is the directory path containing trees in Newick format. This argument is required.
--nb_prot or -n: This is the total number of proteins in the sister branch and ancestral sister branch. This argument is optional and defaults to 3 if not provided.
--clade_ratio or -n: This is the clade ratio, i.e. the taxonomic ratio evaluated within the sister branch and ancestral sister branch. This argument is optional and defaults to 0.8 if not provided.
--output_names or -o: This is the name of the output files. This argument is required.
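The argument list above could be declared with `argparse` roughly as follows. This is a minimal sketch, not the script's actual code: the function name `build_parser`, the help strings, and the placeholder file names are assumptions made for illustration. In particular, the description lists `-n` as the short flag for both `--nb_prot` and `--clade_ratio`, and `argparse` rejects duplicate option strings, so the sketch uses a hypothetical `-r` for `--clade_ratio`.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented arguments (illustrative sketch)."""
    parser = argparse.ArgumentParser(
        description="Phylogenetic validation of putative HGT events."
    )
    parser.add_argument("-a", "--fasttree_tree_results", required=True,
                        help="path to fasttree_tree_results from AvP results")
    parser.add_argument("-t", "--fasttree", required=True,
                        help="directory containing trees in Newick format")
    parser.add_argument("-n", "--nb_prot", type=int, default=3,
                        help="number of proteins in the sister and ancestral sister branch")
    # Assumption: "-r" is a hypothetical short flag; the description reuses "-n" here.
    parser.add_argument("-r", "--clade_ratio", type=float, default=0.8,
                        help="clade ratio")
    parser.add_argument("-o", "--output_names", required=True,
                        help="name of the output files")
    return parser


# Parse an explicit argument list (placeholder values) instead of sys.argv.
args = build_parser().parse_args(
    ["-a", "fasttree_tree_results.txt", "-t", "trees", "-o", "hgt_validation"]
)
print(args.nb_prot, args.clade_ratio)
```

Omitting `--nb_prot` and `--clade_ratio`, as in the call above, leaves them at the documented defaults of 3 and 0.8.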
The script also includes a function import_tree that imports a tree structure from a Newick file and roots the tree at the midpoint. This function takes as an argument the name of the file containing the tree structure in Newick format.
To run the script, navigate to the directory containing the script and use the following command, replacing the argument values with those appropriate for your use case:
$ python HGTrobustness_parsing.py --fasttree_tree_results [your_results_file/file path] --fasttree [your_tree_directory/dir path] --nb_prot [your_protein_number/int] --clade_ratio [your_clade_ratio/float] --output_names [your_output_file_name/path]