### Notebook to analyzse the efficiency of minimap mapping against a mock community

Starting points from Tavish  
reference_dataframe at '/media/MassStorage/tmp/TE/honours/analysis/Stats/reference_dataframe.csv'  
custom_database at '/media/MassStorage/tmp/TE/honours/database/custom_database_labelled.fasta'  
taxonomy_file at '/media/MassStorage/tmp/TE/honours/analysis/Stats/taxonomy_file.csv'

#### workflow

* Two databases
* subsample 15000 reads per each mock community species. Save those out.
* map reads against both databases with minimap safe out data in paf format.
* get best hit per species (see what this means while looking at the data).
* add the full taxonomy to each best match using the taxonomy file.
* summarize data at different taxonomic ranks for each species.
* pull this all together somehow (summary across all the samples? focus on species of interest e.g. deleted from analyis?)

In [1]:
from Bio import SeqIO
import os
import random
import subprocess
import pandas as pd

#### Initial data

In [2]:
reference_dataframe_fn = os.path.abspath('/media/MassStorage/tmp/TE/honours/analysis/Stats/reference_dataframe.csv')
max_custom_database_fn = os.path.abspath('/media/MassStorage/tmp/TE/honours/database/custom_database_labelled.fasta')
taxonomy_file_fn = os.path.abspath('/media/MassStorage/tmp/TE/honours/analysis/Stats/taxonomy_file.csv')

In [3]:
INPUT_BASEDIR = os.path.abspath('/media/MassStorage/tmp/TE/honours')

In [4]:
OUT_DIR = os.path.abspath('../../analysis/Mapping_mock_gsref')
if not os.path.exists(OUT_DIR):
    os.mkdir(OUT_DIR)

In [5]:
### list of species in the max database
max_species = ['Puccinia_striiformis_tritici',
 'Zymoseptoria_tritici',
 'Pyrenophora_tritici-repentis',
 'Fusarium_oxysporum',
 'Tuber_brumale',
 'Cortinarius_globuliformis',
 'Aspergillus_niger',
 'Clavispora_lusitaniae',
 'Cryptococcus_neoformans',
 'Penicillium_chrysogenum',
 'Rhodotorula_mucilaginosa',
 'Scedosporium_boydii',
 'Blastobotrys_proliferans',
 'Candida_zeylanoides',
 'Galactomyces_geotrichum',
 'Kodamaea_ohmeri',
 'Meyerozyma_guillermondii',
 'Wickerhamomyces_anomalus',
 'Yamadazyma_mexicana',
 'Yamadazyma_scolyti',
 'Yarrowia_lipolytica',
 'Zygoascus_hellenicus',
 'Aspergillus_flavus',
 'Cryptococcus_zero',
 'Aspergillus_sp.',
 'CCL067',
 'Diaporthe_sp.',
 'Tapesia_yallundae_CCL031',
 'Tapesia_yallundae_CCL029',
 'Dothiorella_vidmadera',
 'Quambalaria_cyanescens',
 'Entoleuca_sp.',
 'CCL060',
 'CCL068',
 'Saccharomyces_cerevisiae',
 'Cladophialophora_sp.',
 'Candida_albicans',
 'Candida_metapsilosis',
 'Candida_orthopsilosis',
 'Candida_parapsilosis',
 'Geotrichum_candidum',
 'Kluyveromyces_lactis',
 'Kluyveromyces_marxianus',
 'Pichia_kudriavzevii',
 'Pichia_membranifaciens']

In [6]:
###Removed from second test databes
species_delete = ['Candida_orthopsilosis',
                 'Candida_metapsilosis',
                 'Aspergillus_niger']

In [7]:
###species to be searched against both databases
mock_community = ['Penicillium_chrysogenum',
 'Aspergillus_flavus',
 'Aspergillus_niger',
 'Pichia_kudriavzevii',
 'Pichia_membranifaciens',
 'Candida_albicans',
 'Candida_parapsilosis',
 'Candida_orthopsilosis',
 'Candida_metapsilosis']

In [8]:
fixed_old_names = ['Kluyveromyces_lactis',
                   'Candida_zeylanoides',
                   'Cladophialophora_sp.',
                   'Diaporthe_sp.',
                   'CCL060',
                   'CCL068',
                   'CCL067',
                   'Aspergillus_sp.',
                   'Entoleuca_sp.',
                   'Tapesia_yallundae_CCL029',
                   'Tapesia_yallundae_CCL031',
                   'Cryptococcus_neoformans']

In [9]:
fixed_new_names = ['candida_unidentified',
                   'debaryomyces_unidentified',
                   'cladophialophora_unidentified',
                   'diaporthe_unidentified',
                   'asteroma_ccl060',
                   'asteroma_ccl068',
                   'diaporthe_ccl067',
                   'aspergillus_unidentified',
                   'entoleuca_unidentified',
                   'oculimacula_yallundae-ccl029',
                   'oculimacula_yallundae-ccl031',
                   'kluyveromyces_unidentified']

In [10]:
old_to_new_names = dict(zip(fixed_old_names, fixed_new_names))

In [11]:
old_to_new_names

{'Kluyveromyces_lactis': 'candida_unidentified',
 'Candida_zeylanoides': 'debaryomyces_unidentified',
 'Cladophialophora_sp.': 'cladophialophora_unidentified',
 'Diaporthe_sp.': 'diaporthe_unidentified',
 'CCL060': 'asteroma_ccl060',
 'CCL068': 'asteroma_ccl068',
 'CCL067': 'diaporthe_ccl067',
 'Aspergillus_sp.': 'aspergillus_unidentified',
 'Entoleuca_sp.': 'entoleuca_unidentified',
 'Tapesia_yallundae_CCL029': 'oculimacula_yallundae-ccl029',
 'Tapesia_yallundae_CCL031': 'oculimacula_yallundae-ccl031',
 'Cryptococcus_neoformans': 'kluyveromyces_unidentified'}

### Fix databases and names

In [12]:
ref_df = pd.read_csv(reference_dataframe_fn)
ref_df['name_species'] = ref_df['genus'] +"_"+ ref_df['species']

In [13]:
ref_df.name_species.tolist()

['puccinia_striiformis-tritici',
 'zymoseptoria_tritici',
 'pyrenophora_tritici-repentis',
 'fusarium_oxysporum',
 'tuber_brumale',
 'cortinarius_globuliformis',
 'aspergillus_niger',
 'clavispora_lusitaniae',
 'kluyveromyces_unidentified',
 'penicillium_chrysogenum',
 'rhodotorula_mucilaginosa',
 'scedosporium_boydii',
 'blastobotrys_proliferans',
 'debaryomyces_unidentified',
 'galactomyces_geotrichum',
 'kodamaea_ohmeri',
 'meyerozyma_guillermondii',
 'wickerhamomyces_anomalus',
 'yamadazyma_mexicana',
 'yamadazyma_scolyti',
 'yarrowia_lipolytica',
 'zygoascus_hellenicus',
 'aspergillus_flavus',
 'cryptococcus_zero',
 'aspergillus_unidentified',
 'diaporthe_ccl067',
 'diaporthe_unidentified',
 'oculimacula_yallundae-ccl031',
 'oculimacula_yallundae-ccl029',
 'dothiorella_vidmadera',
 'quambalaria_cyanescens',
 'entoleuca_unidentified',
 'asteroma_ccl060',
 'asteroma_ccl068',
 'saccharomyces_cerevisiae',
 'cladophialophora_unidentified',
 'candida_albicans',
 'candida_metapsilosis',


In [14]:
new_db_fn = os.path.join(OUT_DIR, 'gsref.db.fasta')

In [15]:
new_db_list = []
old_db_list = []
for seq in SeqIO.parse(max_custom_database_fn, 'fasta'):
    old_db_list.append(seq.id)
    if seq.id in old_to_new_names.keys():
        #print(seq.id)
        seq.id = seq.name = seq.description = old_to_new_names[seq.id]
        new_db_list.append(seq)
    elif seq.id.lower() in ref_df.name_species.tolist():
        #print(seq.id)
        seq.id = seq.name = seq.description = seq.id.lower()
        new_db_list.append(seq)
    else:
        print(seq.id)

Cryptococcus_gattii
Geotrichum_candidum


In [16]:
if len(new_db_list) == len(old_db_list) -2:
    SeqIO.write(new_db_list, new_db_fn, 'fasta')
else:
    print("please check!")

In [17]:
sub_db_fn = os.path.join(OUT_DIR, 'gsref.subdb.fasta')
sub_db_list = []
for seq in new_db_list:
    if seq.id not in [x.lower() for x in species_delete]:
        sub_db_list.append(seq)

In [18]:
if len(sub_db_list) + len(species_delete) == len(new_db_list):
    SeqIO.write(sub_db_list, sub_db_fn, 'fasta' )
else:
    print("please check!")

In [19]:
[x.id for x in sub_db_list]

['puccinia_striiformis-tritici',
 'zymoseptoria_tritici',
 'pyrenophora_tritici-repentis',
 'fusarium_oxysporum',
 'tuber_brumale',
 'cortinarius_globuliformis',
 'clavispora_lusitaniae',
 'kluyveromyces_unidentified',
 'penicillium_chrysogenum',
 'rhodotorula_mucilaginosa',
 'scedosporium_boydii',
 'blastobotrys_proliferans',
 'debaryomyces_unidentified',
 'galactomyces_geotrichum',
 'kodamaea_ohmeri',
 'meyerozyma_guillermondii',
 'wickerhamomyces_anomalus',
 'yamadazyma_mexicana',
 'yamadazyma_scolyti',
 'yarrowia_lipolytica',
 'zygoascus_hellenicus',
 'aspergillus_flavus',
 'cryptococcus_zero',
 'aspergillus_unidentified',
 'diaporthe_ccl067',
 'diaporthe_unidentified',
 'oculimacula_yallundae-ccl031',
 'oculimacula_yallundae-ccl029',
 'dothiorella_vidmadera',
 'quambalaria_cyanescens',
 'entoleuca_unidentified',
 'asteroma_ccl060',
 'asteroma_ccl068',
 'saccharomyces_cerevisiae',
 'cladophialophora_unidentified',
 'candida_albicans',
 'candida_parapsilosis',
 'candida_unidentified

In [20]:
mock_community = [x.lower() for x in mock_community]

In [21]:
mock_community

['penicillium_chrysogenum',
 'aspergillus_flavus',
 'aspergillus_niger',
 'pichia_kudriavzevii',
 'pichia_membranifaciens',
 'candida_albicans',
 'candida_parapsilosis',
 'candida_orthopsilosis',
 'candida_metapsilosis']

### Subsample reads

In [22]:
def subsamplereads(in_fn, out_fn, n_reads):
    command = F'reformat.sh samplereadstarget={n_reads} in={in_fn} out={out_fn}'
    out = subprocess.getstatusoutput(command)
    if out[0] == 1:
        print(F":)Completed {command}\n")
    else:
        print(F":(check one {command}!!\n")

In [23]:
n_reads = 15000

In [24]:
MC_READ_DIR = os.path.join(OUT_DIR, 'MC_READS')
if not os.path.exists(MC_READ_DIR):
    os.mkdir(MC_READ_DIR)

In [25]:
ref_df.columns

Index(['Unnamed: 0', 'species', 'genus', 'family', 'order', 'class', 'phylum',
       'kingdom', '# raw reads', '# reads after homology filtering',
       '# reads after length filtering', '# for use', 'path to raw reads',
       'path to homology filtering', 'path to length filtering',
       'path for use', 'name_species'],
      dtype='object')

In [26]:
fn_subsampling = {}
for x in mock_community:
    fn_subsampling[x] = (ref_df[(ref_df['species'] == x.split('_')[1]) & (ref_df['genus'] == x.split('_')[0])]['path for use'].tolist()[0])
    fn_subsampling[x] = os.path.join(INPUT_BASEDIR, fn_subsampling[x])
fn_subsampling

{'penicillium_chrysogenum': '/media/MassStorage/tmp/TE/honours/analysis/Length_Filtered/20171103_FAH15473/barcode10/length_restricted_for_use.fasta',
 'aspergillus_flavus': '/media/MassStorage/tmp/TE/honours/analysis/Length_Filtered/20171207_FAH18654/barcode12/length_restricted_for_use.fasta',
 'aspergillus_niger': '/media/MassStorage/tmp/TE/honours/analysis/Length_Filtered/20171103_FAH15473/barcode07/length_restricted_for_use.fasta',
 'pichia_kudriavzevii': '/media/MassStorage/tmp/TE/honours/analysis/Length_Filtered/20180108_FAH18647/barcode11/length_restricted_for_use.fasta',
 'pichia_membranifaciens': '/media/MassStorage/tmp/TE/honours/analysis/Length_Filtered/20180108_FAH18647/barcode12/length_restricted_for_use.fasta',
 'candida_albicans': '/media/MassStorage/tmp/TE/honours/analysis/Length_Filtered/20180108_FAH18647/barcode03/length_restricted_for_use.fasta',
 'candida_parapsilosis': '/media/MassStorage/tmp/TE/honours/analysis/Length_Filtered/20180108_FAH18647/barcode06/length_res

In [27]:
sub_reads_fn = {}
for key, value in fn_subsampling.items():
    species = key
    in_fn = value
    out_fn = os.path.join(MC_READ_DIR, F'{species}.{n_reads}.fasta')
    subsamplereads(in_fn, out_fn, n_reads)
    sub_reads_fn[species] = out_fn

:)Completed reformat.sh samplereadstarget=15000 in=/media/MassStorage/tmp/TE/honours/analysis/Length_Filtered/20171103_FAH15473/barcode10/length_restricted_for_use.fasta out=/media/WorkingStorage/ben.working/students/tavish/analysis/Mapping_mock_gsref/MC_READS/penicillium_chrysogenum.15000.fasta

:)Completed reformat.sh samplereadstarget=15000 in=/media/MassStorage/tmp/TE/honours/analysis/Length_Filtered/20171207_FAH18654/barcode12/length_restricted_for_use.fasta out=/media/WorkingStorage/ben.working/students/tavish/analysis/Mapping_mock_gsref/MC_READS/aspergillus_flavus.15000.fasta

:)Completed reformat.sh samplereadstarget=15000 in=/media/MassStorage/tmp/TE/honours/analysis/Length_Filtered/20171103_FAH15473/barcode07/length_restricted_for_use.fasta out=/media/WorkingStorage/ben.working/students/tavish/analysis/Mapping_mock_gsref/MC_READS/aspergillus_niger.15000.fasta

:)Completed reformat.sh samplereadstarget=15000 in=/media/MassStorage/tmp/TE/honours/analysis/Length_Filtered/2018010

### Map with minimap against both databases

In [28]:
def minimapmapping(fasta_fn, ref_fn, out_fn):
    command = F"minimap2 -x map-ont -t 6 {ref_fn} {fasta_fn} -o {out_fn}"
    out = subprocess.getstatusoutput(command)
    print(out)

In [29]:
dbases_fn = {}
for x in [sub_db_fn, new_db_fn]:
    dbases_fn[x] = os.path.join(OUT_DIR, os.path.basename(x).replace('.fasta', '').replace('.','_'))
    if not os.path.exists(dbases_fn[x]):
        os.mkdir(dbases_fn[x])
dbases_fn

{'/media/WorkingStorage/ben.working/students/tavish/analysis/Mapping_mock_gsref/gsref.subdb.fasta': '/media/WorkingStorage/ben.working/students/tavish/analysis/Mapping_mock_gsref/gsref_subdb',
 '/media/WorkingStorage/ben.working/students/tavish/analysis/Mapping_mock_gsref/gsref.db.fasta': '/media/WorkingStorage/ben.working/students/tavish/analysis/Mapping_mock_gsref/gsref_db'}

In [30]:
db_fn = sub_db_fn
sub_db_mapping_fn = {}
for species, fasta_fn in sub_reads_fn.items():
    tmp_out = dbases_fn[db_fn]
    db_name = os.path.basename(db_fn).replace('.fasta', '')
    out_fn = os.path.join(tmp_out, F"{db_name}.{species}.minimap2.paf")
    sub_db_mapping_fn[species] = out_fn
    minimapmapping(fasta_fn, db_fn, out_fn)

(0, '[M::mm_idx_gen::0.009*1.19] collected minimizers\n[M::mm_idx_gen::0.012*2.11] sorted minimizers\n[M::main::0.012*2.10] loaded/built the index for 41 target sequence(s)\n[M::mm_mapopt_update::0.013*2.01] mid_occ = 42\n[M::mm_idx_stat] kmer size: 15; skip: 10; is_hpc: 0; #seq: 41\n[M::mm_idx_stat::0.014*1.94] distinct minimizers: 9557 (72.65% are singletons); average occurrences: 2.332; average spacing: 5.427\n[M::worker_pipeline::0.982*5.32] mapped 15000 sequences\n[M::main] Version: 2.17-r941\n[M::main] CMD: minimap2 -x map-ont -t 6 -o /media/WorkingStorage/ben.working/students/tavish/analysis/Mapping_mock_gsref/gsref_subdb/gsref.subdb.penicillium_chrysogenum.minimap2.paf /media/WorkingStorage/ben.working/students/tavish/analysis/Mapping_mock_gsref/gsref.subdb.fasta /media/WorkingStorage/ben.working/students/tavish/analysis/Mapping_mock_gsref/MC_READS/penicillium_chrysogenum.15000.fasta\n[M::main] Real time: 0.983 sec; CPU: 5.226 sec; Peak RSS: 0.094 GB')
(0, '[M::mm_idx_gen::0.00

In [31]:
sub_db_mapping_fn

{'penicillium_chrysogenum': '/media/WorkingStorage/ben.working/students/tavish/analysis/Mapping_mock_gsref/gsref_subdb/gsref.subdb.penicillium_chrysogenum.minimap2.paf',
 'aspergillus_flavus': '/media/WorkingStorage/ben.working/students/tavish/analysis/Mapping_mock_gsref/gsref_subdb/gsref.subdb.aspergillus_flavus.minimap2.paf',
 'aspergillus_niger': '/media/WorkingStorage/ben.working/students/tavish/analysis/Mapping_mock_gsref/gsref_subdb/gsref.subdb.aspergillus_niger.minimap2.paf',
 'pichia_kudriavzevii': '/media/WorkingStorage/ben.working/students/tavish/analysis/Mapping_mock_gsref/gsref_subdb/gsref.subdb.pichia_kudriavzevii.minimap2.paf',
 'pichia_membranifaciens': '/media/WorkingStorage/ben.working/students/tavish/analysis/Mapping_mock_gsref/gsref_subdb/gsref.subdb.pichia_membranifaciens.minimap2.paf',
 'candida_albicans': '/media/WorkingStorage/ben.working/students/tavish/analysis/Mapping_mock_gsref/gsref_subdb/gsref.subdb.candida_albicans.minimap2.paf',
 'candida_parapsilosis': '

In [32]:
db_fn = new_db_fn
new_db_mapping_fn = {}
for species, fasta_fn in sub_reads_fn.items():
    tmp_out = dbases_fn[db_fn]
    db_name = os.path.basename(db_fn).replace('.fasta', '')
    out_fn = os.path.join(tmp_out, F"{db_name}.{species}.minimap2.paf")
    new_db_mapping_fn[species] = out_fn
    minimapmapping(fasta_fn, db_fn, out_fn)

(0, '[M::mm_idx_gen::0.005*1.20] collected minimizers\n[M::mm_idx_gen::0.007*2.30] sorted minimizers\n[M::main::0.007*2.30] loaded/built the index for 44 target sequence(s)\n[M::mm_mapopt_update::0.008*2.18] mid_occ = 45\n[M::mm_idx_stat] kmer size: 15; skip: 10; is_hpc: 0; #seq: 44\n[M::mm_idx_stat::0.009*2.10] distinct minimizers: 9696 (71.01% are singletons); average occurrences: 2.462; average spacing: 5.428\n[M::worker_pipeline::1.029*5.38] mapped 15000 sequences\n[M::main] Version: 2.17-r941\n[M::main] CMD: minimap2 -x map-ont -t 6 -o /media/WorkingStorage/ben.working/students/tavish/analysis/Mapping_mock_gsref/gsref_db/gsref.db.penicillium_chrysogenum.minimap2.paf /media/WorkingStorage/ben.working/students/tavish/analysis/Mapping_mock_gsref/gsref.db.fasta /media/WorkingStorage/ben.working/students/tavish/analysis/Mapping_mock_gsref/MC_READS/penicillium_chrysogenum.15000.fasta\n[M::main] Real time: 1.031 sec; CPU: 5.542 sec; Peak RSS: 0.098 GB')
(0, '[M::mm_idx_gen::0.008*1.15] c

### Look at mapping results

In [33]:
def mapping_results(fn, species):
    min_header = ['qseqid', 'qlen', 'qstart', 'qstop', 'strand', 'tname', 'tlen', 'tstart', 'tend', 'nmatch', 'alen', 'mquality']
    tmp_df = pd.read_csv(fn, sep='\t', header = None, usecols=[x for x in range(0,12)], names=min_header)
    sub_df = tmp_df[tmp_df['mquality'] == tmp_df.groupby('qseqid')['mquality'].transform(max)].reset_index(drop=True)
    sub_df = sub_df[tmp_df['nmatch'] == tmp_df.groupby('qseqid')['nmatch'].transform(max)].reset_index(drop=True)
    hit_series = pd.Series(sub_df.groupby('tname')['mquality'].count().tolist()/sub_df.groupby('tname')['mquality'].count().sum(),
                      sub_df.groupby('tname')['mquality'].count().index)
    hit_series.sort_values(ascending=False, inplace=True)
    print('##########\n')
    print(F"This was the query species: {species}\n")
    print(F"These are the results:")
    print(hit_series,'\n')

In [34]:
###this is running the reads against the full database
for species, hit_fn in new_db_mapping_fn.items():
    mapping_results(hit_fn, species)

##########

This was the query species: penicillium_chrysogenum

These are the results:
tname
penicillium_chrysogenum          0.987957
aspergillus_niger                0.002348
aspergillus_flavus               0.001894
aspergillus_unidentified         0.001515
tuber_brumale                    0.000682
kluyveromyces_unidentified       0.000606
kluyveromyces_marxianus          0.000606
zymoseptoria_tritici             0.000454
cortinarius_globuliformis        0.000454
pyrenophora_tritici-repentis     0.000303
puccinia_striiformis-tritici     0.000303
clavispora_lusitaniae            0.000303
meyerozyma_guillermondii         0.000227
kodamaea_ohmeri                  0.000227
dothiorella_vidmadera            0.000151
asteroma_ccl060                  0.000151
cladophialophora_unidentified    0.000151
debaryomyces_unidentified        0.000151
rhodotorula_mucilaginosa         0.000151
oculimacula_yallundae-ccl031     0.000151
zygoascus_hellenicus             0.000076
candida_metapsilosis    

  """


##########

This was the query species: aspergillus_niger

These are the results:
tname
aspergillus_niger                0.770105
aspergillus_unidentified         0.167564
aspergillus_flavus               0.049178
penicillium_chrysogenum          0.007175
entoleuca_unidentified           0.000747
yamadazyma_mexicana              0.000598
oculimacula_yallundae-ccl031     0.000598
dothiorella_vidmadera            0.000448
meyerozyma_guillermondii         0.000448
pyrenophora_tritici-repentis     0.000448
diaporthe_ccl067                 0.000299
asteroma_ccl060                  0.000299
pichia_membranifaciens           0.000149
yarrowia_lipolytica              0.000149
yamadazyma_scolyti               0.000149
candida_metapsilosis             0.000149
cladophialophora_unidentified    0.000149
debaryomyces_unidentified        0.000149
tuber_brumale                    0.000149
saccharomyces_cerevisiae         0.000149
galactomyces_geotrichum          0.000149
kodamaea_ohmeri               

In [35]:
###this is running against a database that have ['Candida_orthopsilosis', 'Candida_metapsilosis', 'Aspergillus_niger'] deleted
for species, hit_fn in sub_db_mapping_fn.items():
    mapping_results(hit_fn, species)

##########

This was the query species: penicillium_chrysogenum

These are the results:
tname
penicillium_chrysogenum          0.989973
aspergillus_unidentified         0.002302
aspergillus_flavus               0.001783
kluyveromyces_marxianus          0.000743
tuber_brumale                    0.000594
zymoseptoria_tritici             0.000520
kluyveromyces_unidentified       0.000446
cortinarius_globuliformis        0.000446
puccinia_striiformis-tritici     0.000371
pyrenophora_tritici-repentis     0.000371
rhodotorula_mucilaginosa         0.000297
meyerozyma_guillermondii         0.000223
kodamaea_ohmeri                  0.000223
clavispora_lusitaniae            0.000223
cladophialophora_unidentified    0.000223
asteroma_ccl068                  0.000149
debaryomyces_unidentified        0.000149
oculimacula_yallundae-ccl031     0.000149
quambalaria_cyanescens           0.000074
candida_parapsilosis             0.000074
scedosporium_boydii              0.000074
asteroma_ccl060         

  """


##########

This was the query species: aspergillus_niger

These are the results:
tname
aspergillus_unidentified         0.940982
aspergillus_flavus               0.040535
penicillium_chrysogenum          0.011217
dothiorella_vidmadera            0.001530
zymoseptoria_tritici             0.001147
meyerozyma_guillermondii         0.000510
pyrenophora_tritici-repentis     0.000510
entoleuca_unidentified           0.000510
cladophialophora_unidentified    0.000382
tuber_brumale                    0.000255
diaporthe_ccl067                 0.000255
yamadazyma_mexicana              0.000255
zygoascus_hellenicus             0.000255
oculimacula_yallundae-ccl031     0.000255
puccinia_striiformis-tritici     0.000255
oculimacula_yallundae-ccl029     0.000255
saccharomyces_cerevisiae         0.000127
diaporthe_unidentified           0.000127
wickerhamomyces_anomalus         0.000127
debaryomyces_unidentified        0.000127
candida_albicans                 0.000127
asteroma_ccl068               