https://apptainer.org/docs/user/latest/
https://github.com/tseemann/abricate
https://hpcdocs.hpc.arizona.edu/software/containers/using_containers/

Run Time = 24 ; Core count on a single node = 35 ; Memory per core = 5

In [2]:
# Refresh until Apptainer is loaded
!apptainer --version

apptainer version 1.3.0-1.el7


In [3]:
# Chunk 1: Import necessary libraries
import os
from concurrent.futures import ThreadPoolExecutor, as_completed

In [4]:
# Chunk 2: Define a function to list all .fasta files
def list_fasta_files(directory):
    return [os.path.join(directory, f) for f in os.listdir(directory) if f.endswith('.fasta')]

In [5]:
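# Chunk 3: Define a function to run abricate on a single file for all databases
import subprocess

def run_abricate(file_path):
    databases = ['ncbi', 'card', 'resfinder', 'argannot', 'megares', 'vfdb']
    for db in databases:
        output_file = f"{file_path}_{db}.tab"
        # Pass the arguments as a list (no shell) so paths with spaces are safe;
        # check=True raises CalledProcessError on a non-zero exit, so failures
        # reach the caller instead of being silently ignored as with os.system
        command = ["apptainer", "exec", "abricate.sif", "abricate", "--db", db, file_path]
        with open(output_file, "w") as out:
            subprocess.run(command, stdout=out, check=True)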
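
Chunk 3 names each report by appending the database to the full FASTA path, so a report ends in `.fasta_ncbi.tab`. If shorter names are preferred, the extension could be dropped first. The sketch below is one way to do that; `abricate_output_path` is a hypothetical helper, not part of the notebook or of abricate.

```python
import os

def abricate_output_path(file_path, db):
    # Hypothetical helper: strip the FASTA extension before appending the
    # database name, e.g. sample.fasta -> sample_<db>.tab
    stem, _ext = os.path.splitext(file_path)
    return f"{stem}_{db}.tab"
```

Reports named this way would still match the `*_*.tab` pattern that Chunk 8 uses for the summary.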

In [6]:
# Chunk 4: Main processing logic using ThreadPoolExecutor
def process_files_in_parallel(file_paths):
    with ThreadPoolExecutor(max_workers=4) as executor:
        future_to_file = {executor.submit(run_abricate, file_path): file_path for file_path in file_paths}
        for future in as_completed(future_to_file):
            file_path = future_to_file[future]
            try:
                future.result()
                print(f"Processing completed for {file_path}")
            except Exception as exc:
                print(f"Error for {file_path}: {exc}")
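
The job header requests 35 cores, but Chunk 4 fixes `max_workers=4`. A minimal sketch of deriving the worker count from the scheduler follows. It assumes the session runs under Slurm and that `SLURM_CPUS_PER_TASK` is set, which is not guaranteed for every job type; `pick_worker_count` and its parameters are hypothetical names introduced here.

```python
import os

def pick_worker_count(threads_per_task=1, fallback_cores=4):
    # Assumption: Slurm exports SLURM_CPUS_PER_TASK for this job.
    # If it is missing or malformed, fall back to a fixed core count.
    raw = os.environ.get("SLURM_CPUS_PER_TASK")
    try:
        cores = int(raw) if raw else fallback_cores
    except ValueError:
        cores = fallback_cores
    # Leave room for tools that use several threads per run
    return max(1, cores // threads_per_task)
```

The result could then be passed as `ThreadPoolExecutor(max_workers=pick_worker_count())` in place of the hard-coded 4.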

In [7]:
# Chunk 5: Main function to orchestrate the processing
def main(directory):
    fasta_files = list_fasta_files(directory)
    process_files_in_parallel(fasta_files)

In [8]:
# Chunk 6: Run the main function with the directory containing your .fasta files
# (If job does not finish within time limits, move finished files to another directory before restarting)
directory_path = '/xdisk/kcooper/caparicio/tree-fruit/08abricate'
main(directory_path)

Using nucl database ncbi:  5386 sequences -  2023-Nov-4
Processing: /xdisk/kcooper/caparicio/tree-fruit/08abricate/peach192_contig.fasta
Using nucl database ncbi:  5386 sequences -  2023-Nov-4
Processing: /xdisk/kcooper/caparicio/tree-fruit/08abricate/apple338_contig.fasta
Using nucl database ncbi:  5386 sequences -  2023-Nov-4
Processing: /xdisk/kcooper/caparicio/tree-fruit/08abricate/peach304_contig.fasta
Using nucl database ncbi:  5386 sequences -  2023-Nov-4
Processing: /xdisk/kcooper/caparicio/tree-fruit/08abricate/apple341_contig.fasta
Found 9 genes in /xdisk/kcooper/caparicio/tree-fruit/08abricate/peach304_contig.fasta
Tip: found a bug in abricate? Post it at https://github.com/tseemann/abricate/issues.
Done.
Using nucl database card:  2631 sequences -  2023-Nov-4
Processing: /xdisk/kcooper/caparicio/tree-fruit/08abricate/peach304_contig.fasta
Found 9 genes in /xdisk/kcooper/caparicio/tree-fruit/08abricate/peach304_contig.fasta
Tip: it's important to realise abricate uses DNA ma

Processing completed for /xdisk/kcooper/caparicio/tree-fruit/08abricate/peach304_contig.fasta


Using nucl database ncbi:  5386 sequences -  2023-Nov-4
Processing: /xdisk/kcooper/caparicio/tree-fruit/08abricate/peach284_contig.fasta
Found 1 genes in /xdisk/kcooper/caparicio/tree-fruit/08abricate/peach284_contig.fasta
Tip: it's important to realise abricate uses DNA matching, not AA.
Done.
Using nucl database card:  2631 sequences -  2023-Nov-4
Processing: /xdisk/kcooper/caparicio/tree-fruit/08abricate/peach284_contig.fasta
Found 1 genes in /xdisk/kcooper/caparicio/tree-fruit/08abricate/peach284_contig.fasta
Tip: have a suggestion for abricate? Tell me at https://github.com/tseemann/abricate/issues
Done.
Using nucl database resfinder:  3077 sequences -  2023-Nov-4
Processing: /xdisk/kcooper/caparicio/tree-fruit/08abricate/peach284_contig.fasta
Found 1 genes in /xdisk/kcooper/caparicio/tree-fruit/08abricate/peach284_contig.fasta
Tip: it's important to realise abricate uses DNA matching, not AA.
Done.
Using nucl database argannot:  2223 sequences -  2023-Nov-4
Processing: /xdisk/kco

Processing completed for /xdisk/kcooper/caparicio/tree-fruit/08abricate/peach284_contig.fasta


Using nucl database ncbi:  5386 sequences -  2023-Nov-4
Processing: /xdisk/kcooper/caparicio/tree-fruit/08abricate/orange415_contig.fasta
Found 0 genes in /xdisk/kcooper/caparicio/tree-fruit/08abricate/peach192_contig.fasta
Tip: it's important to realise abricate uses DNA matching, not AA.
Done.
Using nucl database vfdb:  2597 sequences -  2023-Nov-4
Processing: /xdisk/kcooper/caparicio/tree-fruit/08abricate/peach192_contig.fasta
Found 0 genes in /xdisk/kcooper/caparicio/tree-fruit/08abricate/orange415_contig.fasta
Tip: did you know? abricate was named after 'A'nti 'B'acterial 'R'esistiance
Done.
Using nucl database card:  2631 sequences -  2023-Nov-4
Processing: /xdisk/kcooper/caparicio/tree-fruit/08abricate/orange415_contig.fasta
Found 3 genes in /xdisk/kcooper/caparicio/tree-fruit/08abricate/peach192_contig.fasta
Tip: found a bug in abricate? Post it at https://github.com/tseemann/abricate/issues.
Done.


Processing completed for /xdisk/kcooper/caparicio/tree-fruit/08abricate/peach192_contig.fasta


Using nucl database ncbi:  5386 sequences -  2023-Nov-4
Processing: /xdisk/kcooper/caparicio/tree-fruit/08abricate/orange412_contig.fasta
Found 0 genes in /xdisk/kcooper/caparicio/tree-fruit/08abricate/orange415_contig.fasta
Tip: the abricate manual is at https://github.com/tseemann/abricate/blob/master/README.md
Done.
Using nucl database resfinder:  3077 sequences -  2023-Nov-4
Processing: /xdisk/kcooper/caparicio/tree-fruit/08abricate/orange415_contig.fasta
Found 4 genes in /xdisk/kcooper/caparicio/tree-fruit/08abricate/apple338_contig.fasta
Tip: the --fofn option allows you to feed in a big list of files to run on.
Done.
Using nucl database card:  2631 sequences -  2023-Nov-4
Processing: /xdisk/kcooper/caparicio/tree-fruit/08abricate/apple338_contig.fasta
Found 0 genes in /xdisk/kcooper/caparicio/tree-fruit/08abricate/orange415_contig.fasta
Tip: found a bug in abricate? Post it at https://github.com/tseemann/abricate/issues.
Done.
Using nucl database argannot:  2223 sequences -  202

Processing completed for /xdisk/kcooper/caparicio/tree-fruit/08abricate/orange415_contig.fasta


Using nucl database ncbi:  5386 sequences -  2023-Nov-4
Processing: /xdisk/kcooper/caparicio/tree-fruit/08abricate/orange363_contig.fasta
Found 0 genes in /xdisk/kcooper/caparicio/tree-fruit/08abricate/orange412_contig.fasta
Tip: the abricate manual is at https://github.com/tseemann/abricate/blob/master/README.md
Done.
Using nucl database megares:  6635 sequences -  2023-Nov-4
Processing: /xdisk/kcooper/caparicio/tree-fruit/08abricate/orange412_contig.fasta
Found 10 genes in /xdisk/kcooper/caparicio/tree-fruit/08abricate/apple338_contig.fasta
Tip: did you know? abricate was named after 'A'nti 'B'acterial 'R'esistiance
Done.
Using nucl database resfinder:  3077 sequences -  2023-Nov-4
Processing: /xdisk/kcooper/caparicio/tree-fruit/08abricate/apple338_contig.fasta
Found 0 genes in /xdisk/kcooper/caparicio/tree-fruit/08abricate/orange363_contig.fasta
Tip: you can use the --summary option to combine reports in a presence/absence matrix.
Done.
Using nucl database card:  2631 sequences -  2

Processing completed for /xdisk/kcooper/caparicio/tree-fruit/08abricate/orange363_contig.fasta


Using nucl database ncbi:  5386 sequences -  2023-Nov-4
Processing: /xdisk/kcooper/caparicio/tree-fruit/08abricate/peach244_contis.fasta
Found 89 genes in /xdisk/kcooper/caparicio/tree-fruit/08abricate/apple341_contig.fasta
Tip: found a bug in abricate? Post it at https://github.com/tseemann/abricate/issues.
Done.
Using nucl database resfinder:  3077 sequences -  2023-Nov-4
Processing: /xdisk/kcooper/caparicio/tree-fruit/08abricate/apple341_contig.fasta
Found 3 genes in /xdisk/kcooper/caparicio/tree-fruit/08abricate/apple338_contig.fasta
Tip: the abricate manual is at https://github.com/tseemann/abricate/blob/master/README.md
Done.
Using nucl database argannot:  2223 sequences -  2023-Nov-4
Processing: /xdisk/kcooper/caparicio/tree-fruit/08abricate/apple338_contig.fasta
Found 0 genes in /xdisk/kcooper/caparicio/tree-fruit/08abricate/orange412_contig.fasta
Tip: you can use the --summary option to combine reports in a presence/absence matrix.
Done.


Processing completed for /xdisk/kcooper/caparicio/tree-fruit/08abricate/orange412_contig.fasta


Using nucl database ncbi:  5386 sequences -  2023-Nov-4
Processing: /xdisk/kcooper/caparicio/tree-fruit/08abricate/orange370_contigs.fasta
Found 0 genes in /xdisk/kcooper/caparicio/tree-fruit/08abricate/orange370_contigs.fasta
Tip: did you know? abricate was named after 'A'nti 'B'acterial 'R'esistiance
Done.
Using nucl database card:  2631 sequences -  2023-Nov-4
Processing: /xdisk/kcooper/caparicio/tree-fruit/08abricate/orange370_contigs.fasta
Found 0 genes in /xdisk/kcooper/caparicio/tree-fruit/08abricate/orange370_contigs.fasta
Tip: found a bug in abricate? Post it at https://github.com/tseemann/abricate/issues.
Done.
Using nucl database resfinder:  3077 sequences -  2023-Nov-4
Processing: /xdisk/kcooper/caparicio/tree-fruit/08abricate/orange370_contigs.fasta
Found 0 genes in /xdisk/kcooper/caparicio/tree-fruit/08abricate/peach244_contis.fasta
Tip: did you know? abricate was named after 'A'nti 'B'acterial 'R'esistiance
Done.
Using nucl database card:  2631 sequences -  2023-Nov-4
Pr

Processing completed for /xdisk/kcooper/caparicio/tree-fruit/08abricate/orange370_contigs.fasta


Using nucl database ncbi:  5386 sequences -  2023-Nov-4
Processing: /xdisk/kcooper/caparicio/tree-fruit/08abricate/peach254_contig.fasta
Found 0 genes in /xdisk/kcooper/caparicio/tree-fruit/08abricate/peach244_contis.fasta
Tip: did you know? abricate was named after 'A'nti 'B'acterial 'R'esistiance
Done.
Using nucl database argannot:  2223 sequences -  2023-Nov-4
Processing: /xdisk/kcooper/caparicio/tree-fruit/08abricate/peach244_contis.fasta
Found 0 genes in /xdisk/kcooper/caparicio/tree-fruit/08abricate/peach244_contis.fasta
Tip: abricate can also find virulence factors; use --list to see all supported databases.
Done.
Using nucl database megares:  6635 sequences -  2023-Nov-4
Processing: /xdisk/kcooper/caparicio/tree-fruit/08abricate/peach244_contis.fasta
Found 0 genes in /xdisk/kcooper/caparicio/tree-fruit/08abricate/peach254_contig.fasta
Tip: did you know? abricate was named after 'A'nti 'B'acterial 'R'esistiance
Done.
Using nucl database card:  2631 sequences -  2023-Nov-4
Proces

Processing completed for /xdisk/kcooper/caparicio/tree-fruit/08abricate/peach244_contis.fasta


Using nucl database ncbi:  5386 sequences -  2023-Nov-4
Processing: /xdisk/kcooper/caparicio/tree-fruit/08abricate/orange400_contig.fasta
Found 3 genes in /xdisk/kcooper/caparicio/tree-fruit/08abricate/peach254_contig.fasta
Tip: found a bug in abricate? Post it at https://github.com/tseemann/abricate/issues.
Done.
Using nucl database vfdb:  2597 sequences -  2023-Nov-4
Processing: /xdisk/kcooper/caparicio/tree-fruit/08abricate/peach254_contig.fasta
Found 0 genes in /xdisk/kcooper/caparicio/tree-fruit/08abricate/orange400_contig.fasta
Tip: it's important to realise abricate uses DNA matching, not AA.
Done.
Using nucl database card:  2631 sequences -  2023-Nov-4
Processing: /xdisk/kcooper/caparicio/tree-fruit/08abricate/orange400_contig.fasta
Found 19 genes in /xdisk/kcooper/caparicio/tree-fruit/08abricate/peach254_contig.fasta
Tip: remember that abricate is unable to find AMR-mediated SNPs.
Done.


Processing completed for /xdisk/kcooper/caparicio/tree-fruit/08abricate/peach254_contig.fasta


Using nucl database ncbi:  5386 sequences -  2023-Nov-4
Processing: /xdisk/kcooper/caparicio/tree-fruit/08abricate/orange403_contig.fasta
Found 0 genes in /xdisk/kcooper/caparicio/tree-fruit/08abricate/orange400_contig.fasta
Tip: have a suggestion for abricate? Tell me at https://github.com/tseemann/abricate/issues
Done.
Using nucl database resfinder:  3077 sequences -  2023-Nov-4
Processing: /xdisk/kcooper/caparicio/tree-fruit/08abricate/orange400_contig.fasta
Found 20 genes in /xdisk/kcooper/caparicio/tree-fruit/08abricate/apple341_contig.fasta
Tip: you can use the --summary option to combine reports in a presence/absence matrix.
Done.
Using nucl database megares:  6635 sequences -  2023-Nov-4
Processing: /xdisk/kcooper/caparicio/tree-fruit/08abricate/apple341_contig.fasta
Found 0 genes in /xdisk/kcooper/caparicio/tree-fruit/08abricate/orange400_contig.fasta
Tip: remember that abricate is unable to find AMR-mediated SNPs.
Done.
Using nucl database argannot:  2223 sequences -  2023-No

Processing completed for /xdisk/kcooper/caparicio/tree-fruit/08abricate/orange400_contig.fasta


Using nucl database ncbi:  5386 sequences -  2023-Nov-4
Processing: /xdisk/kcooper/caparicio/tree-fruit/08abricate/orange418_contig.fasta
Found 0 genes in /xdisk/kcooper/caparicio/tree-fruit/08abricate/orange403_contig.fasta
Tip: the --fofn option allows you to feed in a big list of files to run on.
Done.
Using nucl database argannot:  2223 sequences -  2023-Nov-4
Processing: /xdisk/kcooper/caparicio/tree-fruit/08abricate/orange403_contig.fasta
Found 0 genes in /xdisk/kcooper/caparicio/tree-fruit/08abricate/orange418_contig.fasta
Tip: the --fofn option allows you to feed in a big list of files to run on.
Done.
Using nucl database card:  2631 sequences -  2023-Nov-4
Processing: /xdisk/kcooper/caparicio/tree-fruit/08abricate/orange418_contig.fasta
Found 0 genes in /xdisk/kcooper/caparicio/tree-fruit/08abricate/orange403_contig.fasta
Tip: you can use the --summary option to combine reports in a presence/absence matrix.
Done.
Using nucl database megares:  6635 sequences -  2023-Nov-4
Proce

Processing completed for /xdisk/kcooper/caparicio/tree-fruit/08abricate/apple338_contig.fasta


Using nucl database ncbi:  5386 sequences -  2023-Nov-4
Processing: /xdisk/kcooper/caparicio/tree-fruit/08abricate/peach205_contig.fasta
Found 0 genes in /xdisk/kcooper/caparicio/tree-fruit/08abricate/peach205_contig.fasta
Tip: remember that abricate is unable to find AMR-mediated SNPs.
Done.
Using nucl database card:  2631 sequences -  2023-Nov-4
Processing: /xdisk/kcooper/caparicio/tree-fruit/08abricate/peach205_contig.fasta
Found 0 genes in /xdisk/kcooper/caparicio/tree-fruit/08abricate/peach205_contig.fasta
Tip: remember that abricate is unable to find AMR-mediated SNPs.
Done.
Using nucl database resfinder:  3077 sequences -  2023-Nov-4
Processing: /xdisk/kcooper/caparicio/tree-fruit/08abricate/peach205_contig.fasta
Found 0 genes in /xdisk/kcooper/caparicio/tree-fruit/08abricate/orange418_contig.fasta
Tip: you can use the --summary option to combine reports in a presence/absence matrix.
Done.
Using nucl database megares:  6635 sequences -  2023-Nov-4
Processing: /xdisk/kcooper/capa

Processing completed for /xdisk/kcooper/caparicio/tree-fruit/08abricate/peach205_contig.fasta


Using nucl database ncbi:  5386 sequences -  2023-Nov-4
Processing: /xdisk/kcooper/caparicio/tree-fruit/08abricate/orange347_contig.fasta
Found 2 genes in /xdisk/kcooper/caparicio/tree-fruit/08abricate/orange403_contig.fasta
Tip: abricate can also find virulence factors; use --list to see all supported databases.
Done.


Processing completed for /xdisk/kcooper/caparicio/tree-fruit/08abricate/orange403_contig.fasta


Using nucl database ncbi:  5386 sequences -  2023-Nov-4
Processing: /xdisk/kcooper/caparicio/tree-fruit/08abricate/orange382_contig.fasta
Found 0 genes in /xdisk/kcooper/caparicio/tree-fruit/08abricate/orange418_contig.fasta
Tip: the abricate manual is at https://github.com/tseemann/abricate/blob/master/README.md
Done.
Using nucl database vfdb:  2597 sequences -  2023-Nov-4
Processing: /xdisk/kcooper/caparicio/tree-fruit/08abricate/orange418_contig.fasta
Found 103 genes in /xdisk/kcooper/caparicio/tree-fruit/08abricate/apple341_contig.fasta
Tip: remember that abricate is unable to find AMR-mediated SNPs.
Done.
Using nucl database vfdb:  2597 sequences -  2023-Nov-4
Processing: /xdisk/kcooper/caparicio/tree-fruit/08abricate/apple341_contig.fasta
Found 0 genes in /xdisk/kcooper/caparicio/tree-fruit/08abricate/orange347_contig.fasta
Tip: it's important to realise abricate uses DNA matching, not AA.
Done.
Using nucl database card:  2631 sequences -  2023-Nov-4
Processing: /xdisk/kcooper/ca

Processing completed for /xdisk/kcooper/caparicio/tree-fruit/08abricate/orange418_contig.fasta


Using nucl database ncbi:  5386 sequences -  2023-Nov-4
Processing: /xdisk/kcooper/caparicio/tree-fruit/08abricate/peach252_contig.fasta
Found 0 genes in /xdisk/kcooper/caparicio/tree-fruit/08abricate/orange382_contig.fasta
Tip: remember that abricate is unable to find AMR-mediated SNPs.
Done.
Using nucl database resfinder:  3077 sequences -  2023-Nov-4
Processing: /xdisk/kcooper/caparicio/tree-fruit/08abricate/orange382_contig.fasta
Found 0 genes in /xdisk/kcooper/caparicio/tree-fruit/08abricate/peach252_contig.fasta
Tip: abricate can also find virulence factors; use --list to see all supported databases.
Done.
Using nucl database card:  2631 sequences -  2023-Nov-4
Processing: /xdisk/kcooper/caparicio/tree-fruit/08abricate/peach252_contig.fasta
Found 0 genes in /xdisk/kcooper/caparicio/tree-fruit/08abricate/orange347_contig.fasta
Tip: have a suggestion for abricate? Tell me at https://github.com/tseemann/abricate/issues
Done.
Using nucl database resfinder:  3077 sequences -  2023-Nov

Processing completed for /xdisk/kcooper/caparicio/tree-fruit/08abricate/apple341_contig.fasta


Using nucl database ncbi:  5386 sequences -  2023-Nov-4
Processing: /xdisk/kcooper/caparicio/tree-fruit/08abricate/orange371_contig.fasta
Found 10 genes in /xdisk/kcooper/caparicio/tree-fruit/08abricate/peach252_contig.fasta
Tip: you can use the --summary option to combine reports in a presence/absence matrix.
Done.


Processing completed for /xdisk/kcooper/caparicio/tree-fruit/08abricate/peach252_contig.fasta


Using nucl database ncbi:  5386 sequences -  2023-Nov-4
Processing: /xdisk/kcooper/caparicio/tree-fruit/08abricate/orange402_contig.fasta
Found 0 genes in /xdisk/kcooper/caparicio/tree-fruit/08abricate/orange382_contig.fasta
Tip: the --fofn option allows you to feed in a big list of files to run on.
Done.


Processing completed for /xdisk/kcooper/caparicio/tree-fruit/08abricate/orange382_contig.fasta


Using nucl database ncbi:  5386 sequences -  2023-Nov-4
Processing: /xdisk/kcooper/caparicio/tree-fruit/08abricate/orange365_contig.fasta
Found 0 genes in /xdisk/kcooper/caparicio/tree-fruit/08abricate/orange371_contig.fasta
Tip: the abricate manual is at https://github.com/tseemann/abricate/blob/master/README.md
Done.
Using nucl database card:  2631 sequences -  2023-Nov-4
Processing: /xdisk/kcooper/caparicio/tree-fruit/08abricate/orange371_contig.fasta
Found 0 genes in /xdisk/kcooper/caparicio/tree-fruit/08abricate/orange402_contig.fasta
Tip: the abricate manual is at https://github.com/tseemann/abricate/blob/master/README.md
Done.
Using nucl database card:  2631 sequences -  2023-Nov-4
Processing: /xdisk/kcooper/caparicio/tree-fruit/08abricate/orange402_contig.fasta
Found 0 genes in /xdisk/kcooper/caparicio/tree-fruit/08abricate/orange365_contig.fasta
Tip: remember that abricate is unable to find AMR-mediated SNPs.
Done.
Using nucl database card:  2631 sequences -  2023-Nov-4
Proces

Processing completed for /xdisk/kcooper/caparicio/tree-fruit/08abricate/orange347_contig.fasta


Found 0 genes in /xdisk/kcooper/caparicio/tree-fruit/08abricate/orange402_contig.fasta
Tip: it's important to realise abricate uses DNA matching, not AA.
Done.
Using nucl database vfdb:  2597 sequences -  2023-Nov-4
Processing: /xdisk/kcooper/caparicio/tree-fruit/08abricate/orange402_contig.fasta
Found 0 genes in /xdisk/kcooper/caparicio/tree-fruit/08abricate/orange365_contig.fasta
Tip: you can use the --summary option to combine reports in a presence/absence matrix.
Done.
Using nucl database vfdb:  2597 sequences -  2023-Nov-4
Processing: /xdisk/kcooper/caparicio/tree-fruit/08abricate/orange365_contig.fasta
Found 0 genes in /xdisk/kcooper/caparicio/tree-fruit/08abricate/orange371_contig.fasta
Tip: you can use the --summary option to combine reports in a presence/absence matrix.
Done.
Using nucl database vfdb:  2597 sequences -  2023-Nov-4
Processing: /xdisk/kcooper/caparicio/tree-fruit/08abricate/orange371_contig.fasta
Found 0 genes in /xdisk/kcooper/caparicio/tree-fruit/08abricate/or

Processing completed for /xdisk/kcooper/caparicio/tree-fruit/08abricate/orange365_contig.fasta


Found 0 genes in /xdisk/kcooper/caparicio/tree-fruit/08abricate/orange402_contig.fasta
Tip: have a suggestion for abricate? Tell me at https://github.com/tseemann/abricate/issues
Done.


Processing completed for /xdisk/kcooper/caparicio/tree-fruit/08abricate/orange402_contig.fasta
Processing completed for /xdisk/kcooper/caparicio/tree-fruit/08abricate/orange371_contig.fasta


Found 0 genes in /xdisk/kcooper/caparicio/tree-fruit/08abricate/orange371_contig.fasta
Tip: found a bug in abricate? Post it at https://github.com/tseemann/abricate/issues.
Done.
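
The note in Chunk 6 suggests moving finished files to another directory before restarting a job that ran out of time. An alternative is to skip inputs whose reports already exist. The sketch below assumes the `{file_path}_{db}.tab` naming from Chunk 3 and treats a non-empty report as finished, which may not hold for a run that was killed partway through a file; `pending_fasta_files` is a hypothetical helper.

```python
import os

def pending_fasta_files(directory, databases):
    # Return the .fasta files that still lack a non-empty report
    # for at least one of the given databases
    pending = []
    for name in sorted(os.listdir(directory)):
        if not name.endswith(".fasta"):
            continue
        file_path = os.path.join(directory, name)
        reports = [f"{file_path}_{db}.tab" for db in databases]
        finished = all(
            os.path.isfile(report) and os.path.getsize(report) > 0
            for report in reports
        )
        if not finished:
            pending.append(file_path)
    return pending
```

On a restart, the returned list could be passed to `process_files_in_parallel` instead of the full listing from `list_fasta_files`.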


In [13]:
# Chunk 7: Import glob to collect the per-database result files
import glob

In [14]:
# Chunk 8: Define a function to summarize results
import subprocess

def summarize_abricate_results(directory):
    # Reports are named {fasta_path}_{database}.tab (see Chunk 3).
    # summary.tab contains no underscore, so a rerun does not pick it up.
    files = sorted(glob.glob(f"{directory}/*_*.tab"))
    summary_file = f"{directory}/summary.tab"
    summary_csv = f"{directory}/summary.csv"

    # Arguments are passed as a list (no shell), so paths need no quoting
    base_command = ["apptainer", "exec", "abricate.sif", "abricate", "--summary"]

    # Write the tab-separated summary
    with open(summary_file, "w") as out:
        subprocess.run(base_command + files, stdout=out, check=True)

    # Rerun the summary with --csv to get a comma-separated copy for Phyloseq
    with open(summary_csv, "w") as out:
        subprocess.run(base_command + ["--csv"] + files, stdout=out, check=True)

    print("Summarization and CSV conversion completed.")

In [16]:
# Chunk 9: Call the Summarization Function
summarize_abricate_results(directory_path)

Summarization and CSV conversion completed.
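
A quick way to check the summary before loading it elsewhere is to read the gene count per report. The sketch below assumes `summary.csv` starts with the `#FILE` and `NUM_FOUND` columns that the abricate README describes for `--summary` output; `genes_found_per_report` is a hypothetical helper.

```python
import csv

def genes_found_per_report(summary_csv):
    # Map each report file to its NUM_FOUND value.
    # Assumption: the header row contains "#FILE" and "NUM_FOUND".
    counts = {}
    with open(summary_csv, newline="") as handle:
        for row in csv.DictReader(handle):
            counts[row["#FILE"]] = int(row["NUM_FOUND"])
    return counts
```

For example, `genes_found_per_report(f"{directory_path}/summary.csv")` would return a dictionary keyed by report path.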
