# Interactias

The impact of an invasive species on other biodiversity largely depends on how it interacts with local species. Invasive species can prey on or parasitize local species, disrupt pollination networks, compete with local species for resources, and interact in many other ways.

Nevertheless, while much of the research on invasive alien species (IAS) concerns modelling spread and distribution, interactions are not as easy to study. Species interactions can happen very quickly yet have large consequences for the actors, e.g. predation. They can happen only rarely, or only at certain times of the day or season. It is also difficult to quantify the impact of an interaction on the interacting organisms.

The aim of this notebook is to try to quantify the actual or potential impact of an invasive species through our knowledge of its interactions and the abundance of its actors.

I will harvest species interaction data from GLoBI (https://www.globalbioticinteractions.org/) to discover the species that interact with an invasive species. I will then compare these interactions with the occupancy of the interacting species in a particular area.

This notebook takes considerable inspiration and code from Yikang Li's project on GLoBI (https://curiositydata.org/part1_globi_access/).

In [2]:
import pandas as pd
#import pytaxize
import re
import matplotlib.pyplot as plt

## Load the GLoBI data

The current snapshot of GLoBI was taken on 2019-11-05 from https://depot.globalbioticinteractions.org/snapshot/target/data/tsv/interactions.tsv.gz


In [4]:
# This takes a few minutes to load.
# low_memory=False gets rid of the mixed-dtype warning, but will not help if there is really no memory left
data = pd.read_csv('../data/interactions.tsv', delimiter='\t', encoding='utf-8', low_memory=False)
len(data)

FileNotFoundError: [Errno 2] File b'../data/interactions.tsv' does not exist: b'../data/interactions.tsv'
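
The `FileNotFoundError` above means the snapshot was not at `../data/interactions.tsv` when the cell was run. A minimal sketch of a more defensive loader follows. `load_interactions` and `WANTED_COLUMNS` are hypothetical names introduced here, and the column selection is only an assumption about what the rest of the notebook needs. Reading a subset of columns with `usecols` should reduce memory use, and pandas infers gzip compression from the file extension, so the downloaded `.tsv.gz` could be read without unpacking it.

```python
from pathlib import Path

import pandas as pd

# Hypothetical subset of columns; adjust to what the analysis needs.
WANTED_COLUMNS = [
    "sourceTaxonId", "sourceTaxonName", "sourceTaxonFamilyName",
    "interactionTypeName", "targetTaxonId", "targetTaxonName",
]


def load_interactions(path, columns=WANTED_COLUMNS):
    """Load a GLoBI interactions TSV (plain or gzipped), keeping only `columns`."""
    path = Path(path)
    if not path.exists():
        # Fail early with a message that says what to do next
        raise FileNotFoundError(f"{path} not found; download the snapshot first.")
    # Compression is inferred from the extension, so .tsv and .tsv.gz both work
    return pd.read_csv(path, sep="\t", encoding="utf-8",
                       usecols=columns, low_memory=False)
```

With the snapshot in place, `data = load_interactions('../data/interactions.tsv')` would stand in for the `read_csv` call above, at the cost of dropping every column not listed.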

In [3]:
# Take a little look at the data to make sure it makes sense
data.head()

Unnamed: 0,sourceTaxonId,sourceTaxonIds,sourceTaxonName,sourceTaxonRank,sourceTaxonPathNames,sourceTaxonPathIds,sourceTaxonPathRankNames,sourceTaxonSpeciesName,sourceTaxonSpeciesId,sourceTaxonGenusName,...,eventDateUnixEpoch,argumentTypeId,referenceCitation,referenceDoi,referenceUrl,sourceCitation,sourceNamespace,sourceArchiveURI,sourceDOI,sourceLastSeenAtUnixEpoch
0,EOL:12001247,EOL:12001247 | OTT:133330 | IRMNG:11733708 | N...,Leptoconchus massini,species,Animalia | Mollusca | Gastropoda | Neogastropo...,EOL:1 | EOL:2195 | EOL:2366 | EOL:2447 | EOL:4...,kingdom | phylum | class | order | superfamily...,Leptoconchus massini,EOL:12001247,Leptoconchus,...,,https://en.wiktionary.org/wiki/support,"Gittenberger, A., Gittenberger, E. (2011). Cry...",10.1007/s13127-011-0039-1,,Jorrit H. Poelen. 2014. Species associations m...,FloraVincent/template-dataset,https://github.com/FloraVincent/template-datas...,,2019-03-30T23:08:44.205Z
1,EOL:12001247,EOL:12001247 | OTT:133330 | IRMNG:11733708 | N...,Leptoconchus massini,species,Animalia | Mollusca | Gastropoda | Neogastropo...,EOL:1 | EOL:2195 | EOL:2366 | EOL:2447 | EOL:4...,kingdom | phylum | class | order | superfamily...,Leptoconchus massini,EOL:12001247,Leptoconchus,...,,https://en.wiktionary.org/wiki/support,"Gittenberger, A., Gittenberger, E. (2011). Cry...",10.1007/s13127-011-0039-1,,Jorrit H. Poelen. 2014. Species associations m...,FloraVincent/template-dataset,https://github.com/FloraVincent/template-datas...,,2019-03-30T23:08:44.205Z
2,EOL:12001243,EOL:12001243 | WD:Q13393577 | OTT:550603 | WOR...,Leptoconchus inpleuractis,species,Animalia | Mollusca | Gastropoda | Neogastropo...,EOL:1 | EOL:2195 | EOL:2366 | EOL:2447 | EOL:4...,kingdom | phylum | class | order | superfamily...,Leptoconchus inpleuractis,EOL:12001243,Leptoconchus,...,,https://en.wiktionary.org/wiki/support,"Gittenberger, A., Gittenberger, E. (2011). Cry...",10.1007/s13127-011-0039-1,,Jorrit H. Poelen. 2014. Species associations m...,FloraVincent/template-dataset,https://github.com/FloraVincent/template-datas...,,2019-03-30T23:08:44.205Z
3,EOL:12001243,EOL:12001243 | WD:Q13393577 | OTT:550603 | WOR...,Leptoconchus inpleuractis,species,Animalia | Mollusca | Gastropoda | Neogastropo...,EOL:1 | EOL:2195 | EOL:2366 | EOL:2447 | EOL:4...,kingdom | phylum | class | order | superfamily...,Leptoconchus inpleuractis,EOL:12001243,Leptoconchus,...,,https://en.wiktionary.org/wiki/support,"Gittenberger, A., Gittenberger, E. (2011). Cry...",10.1007/s13127-011-0039-1,,Jorrit H. Poelen. 2014. Species associations m...,FloraVincent/template-dataset,https://github.com/FloraVincent/template-datas...,,2019-03-30T23:08:44.205Z
4,EOL:12001243,EOL:12001243 | WD:Q13393577 | OTT:550603 | WOR...,Leptoconchus inpleuractis,species,Animalia | Mollusca | Gastropoda | Neogastropo...,EOL:1 | EOL:2195 | EOL:2366 | EOL:2447 | EOL:4...,kingdom | phylum | class | order | superfamily...,Leptoconchus inpleuractis,EOL:12001243,Leptoconchus,...,,https://en.wiktionary.org/wiki/support,"Gittenberger, A., Gittenberger, E. (2011). Cry...",10.1007/s13127-011-0039-1,,Jorrit H. Poelen. 2014. Species associations m...,FloraVincent/template-dataset,https://github.com/FloraVincent/template-datas...,,2019-03-30T23:08:44.205Z


In [4]:
# List all the interaction types
data['interactionTypeName'].unique()

array(['parasiteOf', 'symbiontOf', 'commensalistOf', 'mutualistOf',
       'eats', 'interactsWith', 'hasHost', 'pathogenOf', 'preysOn',
       'pollinates', 'coOccursWith', 'visitsFlowersOf', 'adjacentTo',
       'dispersalVectorOf', 'endoparasitoidOf', 'endoparasiteOf',
       'hasVector', 'ectoParasiteOf', 'livesOn', 'livesNear',
       'parasitoidOf', 'guestOf', 'livesInsideOf', 'farms',
       'ectoParasitoid', 'inhabits', 'kills', 'hasDispersalVector',
       'livesUnder', 'kleptoparasiteOf', 'laysEggsOn', 'visits',
       'ecologicallyRelatedTo'], dtype=object)

## Drop duplicates

The line below removes duplicate interactions, i.e. records that share the same source taxon, interaction type and target taxon. I currently can't see a reason to keep them, but this should perhaps be checked.
More common interactions might have more support in the literature and therefore more records, so deduplicating gives rare and common interactions equal weight.
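
If the number of records per interaction is useful as a rough measure of support, it can be counted before the duplicates are dropped. This is a minimal sketch on a toy table, assuming an interaction is identified by its source taxon, interaction type and target taxon. `n_records` is a hypothetical column name and the toy rows are invented for illustration.

```python
import pandas as pd

KEYS = ["sourceTaxonId", "interactionTypeName", "targetTaxonId"]

# Toy rows, invented for illustration only
toy = pd.DataFrame({
    "sourceTaxonId": ["A", "A", "A", "B"],
    "interactionTypeName": ["eats", "eats", "pollinates", "eats"],
    "targetTaxonId": ["X", "X", "X", "X"],
})

# Count how many records support each unique interaction...
support = toy.groupby(KEYS).size().rename("n_records").reset_index()

# ...then deduplicate and attach the count, so the information is not lost
deduplicated = toy.drop_duplicates(subset=KEYS).merge(support, on=KEYS)
print(deduplicated)
```

The deduplicated table keeps one row per interaction, while `n_records` still shows which interactions were reported more often.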

In [5]:
data.drop_duplicates(['sourceTaxonId', 'interactionTypeName', 'targetTaxonId'], inplace = True)

In [6]:
# Check how many rows are left
len(data)

1101736

Define the key taxon for which the notebook will find all interactions.


In [20]:
#taxon = "Oxalis pes-caprae"
#taxon = "Lantana camara"
taxon = "Cirsium vulgare"

In [21]:
# What are all the types of interactions involving the chosen taxon as source taxon?
data[data['sourceTaxonName'] == taxon]['interactionTypeName'].unique()

array(['interactsWith', 'visitsFlowersOf'], dtype=object)

In [22]:
# What are all the types of interactions involving the chosen taxon as target taxon?
data[data['targetTaxonName'] == taxon]['interactionTypeName'].unique()

array(['interactsWith', 'eats', 'visitsFlowersOf', 'hasHost',
       'parasiteOf', 'livesInsideOf', 'preysOn', 'pollinates'],
      dtype=object)

How many records have the taxon as the source?

In [23]:
len(data[data['sourceTaxonName'] == taxon])

130

How many records have the taxon as the target?

In [24]:
len(data[data['targetTaxonName'] == taxon])

338
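
The four cells above could be condensed into a single summary table. This is a minimal sketch, assuming the `sourceTaxonName`, `targetTaxonName` and `interactionTypeName` columns used above. `interaction_summary` is a hypothetical helper and the toy rows are invented for illustration.

```python
import pandas as pd


def interaction_summary(df, taxon):
    """Count records per interaction type, split by the role the taxon plays."""
    as_source = df.loc[df["sourceTaxonName"] == taxon, "interactionTypeName"].value_counts()
    as_target = df.loc[df["targetTaxonName"] == taxon, "interactionTypeName"].value_counts()
    # Align the two counts on interaction type; types missing from one role become 0
    summary = pd.concat({"as_source": as_source, "as_target": as_target}, axis=1)
    return summary.fillna(0).astype(int)


# Toy rows, invented for illustration only
toy = pd.DataFrame({
    "sourceTaxonName": ["Bombus terrestris", "Cirsium vulgare", "Apis mellifera"],
    "interactionTypeName": ["visitsFlowersOf", "interactsWith", "visitsFlowersOf"],
    "targetTaxonName": ["Cirsium vulgare", "Apis mellifera", "Cirsium vulgare"],
})
print(interaction_summary(toy, "Cirsium vulgare"))
```

On the real table, `interaction_summary(data, taxon)` would give the interaction types and the record counts for both roles in one view.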

Gather together all the data where the target is the taxon in question.

In [25]:
interacts_data = data[(data['targetTaxonName'] == taxon)]

In [26]:
interacts_data.head()

Unnamed: 0,sourceTaxonId,sourceTaxonIds,sourceTaxonName,sourceTaxonRank,sourceTaxonPathNames,sourceTaxonPathIds,sourceTaxonPathRankNames,sourceTaxonSpeciesName,sourceTaxonSpeciesId,sourceTaxonGenusName,...,eventDateUnixEpoch,argumentTypeId,referenceCitation,referenceDoi,referenceUrl,sourceCitation,sourceNamespace,sourceArchiveURI,sourceDOI,sourceLastSeenAtUnixEpoch
117947,EOL:6447088,EOL:6447088 | OTT:138038 | GBIF:3483411 | GBIF...,Golovinomyces cichoracearum,species,Fungi | Ascomycota | Leotiomycetes | Erysiphal...,EOL_V2:5559 | EOL:5577 | EOL:2861056 | EOL:562...,kingdom | phylum | class | order | family | ge...,Golovinomyces cichoracearum,EOL:6447088,Golovinomyces,...,,https://en.wiktionary.org/wiki/support,A. Thessen. 2014. Species associations extract...,,,A. Thessen. 2014. Species associations extract...,Bhanditz/pseudonitzchia,https://github.com/Bhanditz/pseudonitzchia/arc...,,2019-11-02T23:25:50.260Z
117948,EOL:4095553,EOL:4095553 | OTT:144679 | GBIF:5137800 | EOL:...,Pontia protodice,species,Animalia | Arthropoda | Insecta | Lepidoptera ...,EOL:1 | EOL:164 | EOL:344 | EOL:747 | EOL:854 ...,kingdom | phylum | class | order | superfamily...,Pontia protodice,EOL:4095553,Pontia,...,,https://en.wiktionary.org/wiki/support,A. Thessen. 2014. Species associations extract...,,,A. Thessen. 2014. Species associations extract...,Bhanditz/pseudonitzchia,https://github.com/Bhanditz/pseudonitzchia/arc...,,2019-11-02T23:25:50.260Z
117949,EOL_V2:3639930,EOL_V2:3639930 | NBN:NBNSYS0000006982 | WD:Q27...,Cheilosia grossa,species,Cellular organisms | Eukaryota | Opisthokonta ...,EOL:6061725 | EOL:2908256 | EOL:2910700 | EOL:...,| | | kingdom | | | | | | phylum | | ...,Cheilosia grossa,EOL_V2:3639930,Cheilosia,...,,https://en.wiktionary.org/wiki/support,A. Thessen. 2014. Species associations extract...,,,A. Thessen. 2014. Species associations extract...,Bhanditz/pseudonitzchia,https://github.com/Bhanditz/pseudonitzchia/arc...,,2019-11-02T23:25:50.260Z
117950,EOL:2742935,EOL:2742935 | OTT:487756 | ITIS:757769 | NCBI:...,Halictus ligatus,species,Animalia | Arthropoda | Insecta | Hymenoptera ...,EOL:1 | EOL:164 | EOL:344 | EOL:648 | EOL:676 ...,kingdom | phylum | class | order | superfamily...,Halictus ligatus,EOL:2742935,Halictus,...,,https://en.wiktionary.org/wiki/support,A. Thessen. 2014. Species associations extract...,,,A. Thessen. 2014. Species associations extract...,Bhanditz/pseudonitzchia,https://github.com/Bhanditz/pseudonitzchia/arc...,,2019-11-02T23:25:50.260Z
117951,EOL_V2:2668031,EOL_V2:2668031 | EOL_V2:2668031 | IRMNG:110385...,Rhinocyllus conicus,species,Animalia | Bilateria | Protostomia | Ecdysozoa...,EOL:1 | EOL:3014411 | EOL:10459935 | EOL:88807...,kingdom | subkingdom | infrakingdom | superphy...,Rhinocyllus conicus,EOL_V2:2668031,Rhinocyllus,...,,https://en.wiktionary.org/wiki/support,A. Thessen. 2014. Species associations extract...,,,A. Thessen. 2014. Species associations extract...,Bhanditz/pseudonitzchia,https://github.com/Bhanditz/pseudonitzchia/arc...,,2019-11-02T23:25:50.260Z


In [27]:
# What are the columns of this dataset?
data.columns

Index(['sourceTaxonId', 'sourceTaxonIds', 'sourceTaxonName', 'sourceTaxonRank',
       'sourceTaxonPathNames', 'sourceTaxonPathIds',
       'sourceTaxonPathRankNames', 'sourceTaxonSpeciesName',
       'sourceTaxonSpeciesId', 'sourceTaxonGenusName', 'sourceTaxonGenusId',
       'sourceTaxonFamilyName', 'sourceTaxonFamilyId', 'sourceTaxonOrderName',
       'sourceTaxonOrderId', 'sourceTaxonClassName', 'sourceTaxonClassId',
       'sourceTaxonPhylumName', 'sourceTaxonPhylumId',
       'sourceTaxonKingdomName', 'sourceTaxonKingdomId', 'sourceId',
       'sourceOccurrenceId', 'sourceCatalogNumber', 'sourceBasisOfRecordId',
       'sourceBasisOfRecordName', 'sourceLifeStageId', 'sourceLifeStageName',
       'sourceBodyPartId', 'sourceBodyPartName', 'sourcePhysiologicalStateId',
       'sourcePhysiologicalStateName', 'interactionTypeName',
       'interactionTypeId', 'targetTaxonId', 'targetTaxonIds',
       'targetTaxonName', 'targetTaxonRank', 'targetTaxonPathNames',
       'targetTaxonPath

Simplify the table to make it readable

In [28]:
interacts_data = interacts_data[['sourceTaxonId', 'sourceTaxonName', 'sourceTaxonRank',
       'sourceTaxonPathNames', 'sourceTaxonFamilyName', 'interactionTypeName',
       'interactionTypeId', 'targetTaxonId',
       'targetTaxonName','targetTaxonPathNames',
       'targetTaxonPathIds', 'targetTaxonPathRankNames',
       'targetTaxonSpeciesName', 'targetTaxonSpeciesId',
       'targetTaxonGenusName', 'targetTaxonGenusId', 'targetTaxonFamilyName',
       'targetTaxonFamilyId', 'targetTaxonOrderName', 'targetTaxonOrderId',
       'targetTaxonClassName', 'targetTaxonClassId', 'targetTaxonPhylumName',
       'targetTaxonPhylumId', 'targetTaxonKingdomName', 'targetTaxonKingdomId', 'referenceDoi', 'decimalLatitude', 'decimalLongitude'
        ]].dropna(subset=['targetTaxonId', 'targetTaxonName','targetTaxonPathNames','targetTaxonPathIds'])
interacts_data.head()

Unnamed: 0,sourceTaxonId,sourceTaxonName,sourceTaxonRank,sourceTaxonPathNames,sourceTaxonFamilyName,interactionTypeName,interactionTypeId,targetTaxonId,targetTaxonName,targetTaxonPathNames,...,targetTaxonOrderId,targetTaxonClassName,targetTaxonClassId,targetTaxonPhylumName,targetTaxonPhylumId,targetTaxonKingdomName,targetTaxonKingdomId,referenceDoi,decimalLatitude,decimalLongitude
117947,EOL:6447088,Golovinomyces cichoracearum,species,Fungi | Ascomycota | Leotiomycetes | Erysiphal...,Erysiphaceae,interactsWith,http://purl.obolibrary.org/obo/RO_0002437,EOL:468830,Cirsium vulgare,Plantae | Tracheophyta | Magnoliopsida | Aster...,...,EOL:4205,Magnoliopsida,EOL:283,Tracheophyta,EOL:4077,Plantae,EOL_V2:281,,,
117948,EOL:4095553,Pontia protodice,species,Animalia | Arthropoda | Insecta | Lepidoptera ...,Pieridae,interactsWith,http://purl.obolibrary.org/obo/RO_0002437,EOL:468830,Cirsium vulgare,Plantae | Tracheophyta | Magnoliopsida | Aster...,...,EOL:4205,Magnoliopsida,EOL:283,Tracheophyta,EOL:4077,Plantae,EOL_V2:281,,,
117949,EOL_V2:3639930,Cheilosia grossa,species,Cellular organisms | Eukaryota | Opisthokonta ...,Syrphidae,interactsWith,http://purl.obolibrary.org/obo/RO_0002437,EOL:468830,Cirsium vulgare,Plantae | Tracheophyta | Magnoliopsida | Aster...,...,EOL:4205,Magnoliopsida,EOL:283,Tracheophyta,EOL:4077,Plantae,EOL_V2:281,,,
117950,EOL:2742935,Halictus ligatus,species,Animalia | Arthropoda | Insecta | Hymenoptera ...,Halictidae,interactsWith,http://purl.obolibrary.org/obo/RO_0002437,EOL:468830,Cirsium vulgare,Plantae | Tracheophyta | Magnoliopsida | Aster...,...,EOL:4205,Magnoliopsida,EOL:283,Tracheophyta,EOL:4077,Plantae,EOL_V2:281,,,
117951,EOL_V2:2668031,Rhinocyllus conicus,species,Animalia | Bilateria | Protostomia | Ecdysozoa...,Curculionidae,interactsWith,http://purl.obolibrary.org/obo/RO_0002437,EOL:468830,Cirsium vulgare,Plantae | Tracheophyta | Magnoliopsida | Aster...,...,EOL:4205,Magnoliopsida,EOL:283,Tracheophyta,EOL:4077,Plantae,EOL_V2:281,,,


In [29]:
interacts_data.groupby(interacts_data['sourceTaxonFamilyName']).size().sort_values(ascending = False)

sourceTaxonFamilyName
Apidae                43
Megachilidae          23
Tephritidae           18
Halictidae            14
Nymphalidae           10
Syrphidae             10
Eulophidae             7
Curculionidae          5
Pucciniaceae           5
Aphididae              5
Pieridae               4
Erysiphaceae           4
Eurytomidae            4
Hesperiidae            4
Leptosphaeriaceae      4
Colletidae             3
Cicadellidae           3
Agromyzidae            3
Papilionidae           3
Pteromalidae           3
Phaeosphaeriaceae      3
Miridae                2
Tingidae               2
Apionidae              2
Bombyliidae            2
Calliphoridae          2
Cerambycidae           2
Chrysomelidae          2
Peronosporaceae        2
Trichogrammatidae      2
Nitidulidae            2
Torymidae              2
Mycosphaerellaceae     2
Lygaeidae              2
Fagaceae               1
Andrenidae             1
Anthomyiidae           1
Melittidae             1
Aphrophoridae          1
Tac
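
The counts above are numbers of deduplicated records, so a species recorded with two different interaction types contributes twice to its family. A minimal sketch of counting distinct source taxa per family alongside the record count follows. The toy rows are invented for illustration, and `n_records` and `n_taxa` are hypothetical column names.

```python
import pandas as pd

# Toy rows, invented for illustration only
toy = pd.DataFrame({
    "sourceTaxonName": ["Bombus terrestris", "Bombus terrestris",
                        "Apis mellifera", "Urophora stylata"],
    "sourceTaxonFamilyName": ["Apidae", "Apidae", "Apidae", "Tephritidae"],
    "interactionTypeName": ["visitsFlowersOf", "pollinates",
                            "visitsFlowersOf", "eats"],
})

by_family = toy.groupby("sourceTaxonFamilyName")

comparison = pd.DataFrame({
    "n_records": by_family.size(),                     # rows per family
    "n_taxa": by_family["sourceTaxonName"].nunique(),  # distinct taxa per family
}).sort_values("n_records", ascending=False)
print(comparison)
```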

## Linking to Wikipedia
A nice trick by Yikang Li is to make the results clickable, so that you can figure out what sort of organisms are interacting.

In [30]:
def make_clickable_both(val): 
    name, url = val.split('#')
    return f'<a href="{url}">{name}</a>'
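
To see what this helper produces, here is a small usage sketch. The function is repeated so the example runs on its own, and the input string is an illustrative value in the `name#url` form the function expects.

```python
def make_clickable_both(val):
    # Same function as in the cell above, repeated so this example is self-contained
    name, url = val.split('#')
    return f'<a href="{url}">{name}</a>'


link = make_clickable_both('Apidae#https://en.wikipedia.org/wiki/Apidae')
print(link)  # <a href="https://en.wikipedia.org/wiki/Apidae">Apidae</a>
```

Because the string is split on every `#`, a name or URL that itself contains `#` (for example a URL with a fragment) makes the unpacking fail with a `ValueError`.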

In [31]:
def sourcesWithWiki(taxonIn):
    """Find the taxa that interact with a taxon of interest and link their families to Wikipedia.

    Args:
        taxonIn: the target taxon that we are interested in (matched against
            targetTaxonName); it can be at any taxonomic level.

    Returns:
        A styled table of the source taxon families that interact with the target taxon,
        each with a clickable Wikipedia link, in descending order of number of records.
    """
    
    d = data[data['targetTaxonName'] == taxonIn]
    d_cleaned = d[['sourceTaxonId', 'sourceTaxonName', 'sourceTaxonRank',
       'sourceTaxonPathNames', 'sourceTaxonFamilyName', 'interactionTypeName',
       'interactionTypeId', 'targetTaxonId',
       'targetTaxonName','targetTaxonPathNames',
       'targetTaxonPathIds', 'targetTaxonPathRankNames',
       'targetTaxonSpeciesName', 'targetTaxonSpeciesId',
       'targetTaxonGenusName', 'targetTaxonGenusId', 'targetTaxonFamilyName',
       'targetTaxonFamilyId', 'targetTaxonOrderName', 'targetTaxonOrderId',
       'targetTaxonClassName', 'targetTaxonClassId', 'targetTaxonPhylumName',
       'targetTaxonPhylumId', 'targetTaxonKingdomName', 'targetTaxonKingdomId', 'referenceDoi', 'decimalLatitude', 'decimalLongitude'
        ]].dropna(subset=['targetTaxonId', 'targetTaxonName','targetTaxonPathNames','targetTaxonPathIds'])

    result = d_cleaned.groupby(d_cleaned['sourceTaxonFamilyName']).size().sort_values(ascending = False)
    target_df = pd.DataFrame(result)
    target_df.columns = ['count']

    urls = dict(name= list(target_df.index), 
    url= ['https://en.wikipedia.org/wiki/' + str(i) for i in list(target_df.index)])
    target_df.index = [i + '#' + j for i,j in zip(urls['name'], urls['url'])]
    index_list = list(target_df.index)
    target_df.index =[make_clickable_both(i) for i in index_list]
    df = target_df.style.format({'wiki': make_clickable_both})
    
    return df

In [32]:
sourcesWithWiki(taxon)

Unnamed: 0,count
Apidae,43
Megachilidae,23
Tephritidae,18
Halictidae,14
Nymphalidae,10
Syrphidae,10
Eulophidae,7
Curculionidae,5
Pucciniaceae,5
Aphididae,5
