# Interactias Geo Selected Network

One way of examining the impact of invasive species is to look at all their interactions and the interactions those organisms have with each other. This full interaction network gives you an indication of whether a species might be a "keystone" species and therefore have a disproportionately large impact.

At this stage in the script's evolution, it is being adapted to quantify species by their occupancy.

I will harvest species interaction data from GloBI (https://www.globalbioticinteractions.org/) to discover the species that interact with an invasive species.
I will then harvest all the interactions for those species to create two tiers of interactions.
I will then count all the occurrences of these species in the Belgian data cube.
I will then create a network diagram to visualize this.

This notebook takes considerable inspiration and code from Yikang Li's project on GloBI (https://curiositydata.org/part1_globi_access/).
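
The two-tier harvest described above can be sketched on a toy table before running it against the real database. This is a minimal illustration only: the in-memory table, the three-column layout and the taxon names (`Bat X`, `Fig Y` and so on) are hypothetical stand-ins, not the real GloBI schema or data.

```python
import sqlite3

# Hypothetical toy rows: (sourceTaxonName, interactionTypeName, targetTaxonName)
TOY_ROWS = [
    ("Virus A", "pathogenOf", "Bat X"),
    ("Bat X", "eats", "Fig Y"),
    ("Wasp Z", "pollinates", "Fig Y"),
    ("Moth Q", "eats", "Oak R"),  # unrelated to the focal taxon
]


def build_toy_db():
    """Create an in-memory table that mimics three columns of the globi table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE globi (sourceTaxonName TEXT, "
        "interactionTypeName TEXT, targetTaxonName TEXT)"
    )
    conn.executemany("INSERT INTO globi VALUES (?, ?, ?)", TOY_ROWS)
    return conn


def interactions_of(conn, name):
    """All rows in which the taxon appears as either source or target."""
    cur = conn.execute(
        "SELECT sourceTaxonName, interactionTypeName, targetTaxonName "
        "FROM globi WHERE sourceTaxonName = ? OR targetTaxonName = ?",
        (name, name),
    )
    return cur.fetchall()


def two_tier_interactions(conn, focal):
    """Tier 1: interactions of the focal taxon. Tier 2: interactions of its partners."""
    primary = interactions_of(conn, focal)
    partners = {n for s, _, t in primary for n in (s, t)} - {focal}
    rows = set(primary)
    for partner in sorted(partners):
        rows.update(interactions_of(conn, partner))
    return sorted(rows)


toy_conn = build_toy_db()
toy_network = two_tier_interactions(toy_conn, "Bat X")
for row in toy_network:
    print(row)
```

The wasp is picked up only through the second tier, because it interacts with the fig and not with the bat directly, while the unrelated moth and oak never enter the network.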

In [1]:
import sys
print(sys.version)

#Python 3.7.4 (default, Aug 9 2019, 18:34:13) [MSC v.1915 64 bit (AMD64)]
#pygbif 0.3.0

3.7.3 (default, Mar 27 2019, 17:13:21) [MSC v.1915 64 bit (AMD64)]


In [2]:
import pandas as pd
import re
import matplotlib.pyplot as plt
from pygbif import species
from pygbif import occurrences as occ
import sqlite3
from sqlite3 import Error
import pydot

### Setting up some parameters

In [3]:
# There is no point accepting all the species that have more than one record.
# There are too many casual records of plants and birds
thresholdForOccNum = 5
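# To exclude interaction types, list them comma-separated here.
# Note: the queries below bind this string to a single "?" placeholder, so as written only one type can be excluded.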
interactionsToExclude = "interactsWith"
#interactionsToExclude = ""

#ranksToExclude = '"Kingdom","subclass","Phyllum","class","order","superclass,"infraclass,"infraorder","parvclass"'
ranksToExclude = "subclass,kingdom"
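
A single `?` placeholder in `sqlite3` binds exactly one value, so a string such as `"a,b"` is compared as one literal and not as two list items. A possible way to honour a comma-separated setting is to generate one placeholder per value. The sketch below uses a hypothetical in-memory table and a hypothetical helper name, `excluded_placeholders`; it is not part of the notebook.

```python
import sqlite3


def excluded_placeholders(comma_separated):
    """Split a comma-separated setting into values and a matching '?,?,...' string."""
    values = [v.strip() for v in comma_separated.split(",") if v.strip()]
    placeholders = ",".join("?" for _ in values)
    return values, placeholders


# Hypothetical toy table with only the two columns needed here
conn_demo = sqlite3.connect(":memory:")
conn_demo.execute("CREATE TABLE globi (sourceTaxonName TEXT, interactionTypeName TEXT)")
conn_demo.executemany(
    "INSERT INTO globi VALUES (?, ?)",
    [("A", "eats"), ("A", "interactsWith"), ("A", "coOccursWith")],
)

values, placeholders = excluded_placeholders("interactsWith, coOccursWith")
sql = "SELECT interactionTypeName FROM globi WHERE sourceTaxonName = ?"
if values:
    # Only the placeholder string is formatted into the SQL; the values are still bound
    sql += " AND interactionTypeName NOT IN ({})".format(placeholders)

kept = [row[0] for row in conn_demo.execute(sql, ["A"] + values)]
print(kept)
```

Only the `?` markers are interpolated into the statement, so the values themselves remain bound parameters.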

In [4]:

## Define the country of interest
country  = 'BE'

## Define the year from where to consider records for the occupancy
year = 2000

## Define the place to find the data cube for occupancy data
## Currently the cube contains only Belgian data so that is all that can be used
database = r"..\data\cube.db" 

## Define the place to find the interaction data
globiDB = r"..\data\globi.db"

### Define the key taxon for the notebook for which to find all interactions


In [5]:
#taxon = "Pipistrellus pipistrellus" # Common Pipistrelle
#taxon = "Rhinolophus sinicus" # Chinese rufous horseshoe bat
#taxon = "Rousettus leschenaulti" #Leschenault's rousette (a fruit bat)
taxon = "Rousettus aegyptiacus" #Egyptian fruit bat


## Check to see if the taxon exists in GBIF

In [6]:
try:
    # NOTE: name_suggest and name_backbone treat the gender of Latin names differently.
    # If name_backbone is given a name in one gender it can still return the best match from the GBIF backbone
    # even if that name is spelled as if it were another gender.
    #key = species.name_suggest(q=taxon, limit = 1)
    match = species.name_backbone(name=taxon, limit = 1)
    #print(key)
    
    # If there is no match, .name_backbone returns {'confidence': 100, 'matchType': 'NONE', 'synonym': False}
    if match['matchType'] == 'NONE':
        raise ValueError("TAXON NOT FOUND ON GBIF!")
    else:
        key = match['usageKey']
except ValueError as ve:
    print(ve)
    exit(1)
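
The decision made in the cell above depends only on the shape of the returned dictionary, so it can be tried offline. The sketch below is an illustration under assumptions: `usage_key_from_match` is a hypothetical helper, the "no match" dictionary is the one quoted in the comment above, and the second dictionary (including its key `12345`) is invented for the example.

```python
def usage_key_from_match(match):
    """Return the GBIF usage key, or None when the backbone reports no match."""
    if match.get("matchType") == "NONE":
        return None
    return match.get("usageKey")


# The response quoted in the notebook for an unmatched name
no_match = {'confidence': 100, 'matchType': 'NONE', 'synonym': False}

# Hypothetical successful response; the key and name are made up for illustration
hypothetical_hit = {'matchType': 'EXACT', 'usageKey': 12345, 'scientificName': 'Example name'}

print(usage_key_from_match(no_match))
print(usage_key_from_match(hypothetical_hit))
```

Returning `None` and letting the caller decide how to react avoids calling `exit` from inside a notebook cell.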




In [7]:

print('The taxon to be studied is ' + match['scientificName'])

The taxon to be studied is Rousettus aegyptiacus (E.Geoffroy, 1810)


### Take a look at the interactions that the taxon has

In [8]:
# What are all the types of interactions involving taxon as source taxon?
#data[data['sourceTaxonName'] == taxon]['interactionTypeName'].unique()
try:
    connGlobi = sqlite3.connect(globiDB)
except Error as e:
    print(e)

In [9]:
curGlobi = connGlobi.cursor()
curGlobi.execute("SELECT interactionTypeName from globi  WHERE sourceTaxonName = ? GROUP BY interactionTypeName;", (taxon,))
interactDataTaxon = curGlobi.fetchall()

In [10]:
interactDataTaxon

[('coOccursWith',), ('eats',), ('visitsFlowersOf',)]

In [11]:
curGlobi = connGlobi.cursor()
curGlobi.execute("SELECT interactionTypeName from globi  WHERE targetTaxonName = ? GROUP BY interactionTypeName;", (taxon,))
interactDataTaxon = curGlobi.fetchall()
interactDataTaxon

[('ectoparasiteOf',), ('hasHost',), ('kills',), ('pathogenOf',), ('preysOn',)]

## Get the primary interaction data for the species in question

In [12]:
curGlobi = connGlobi.cursor()
# curGlobi.execute("SELECT * from globi  WHERE targetTaxonName = ? and interactionTypeName NOT IN (?) and targetTaxonRank NOT IN('kingdom','phyllum','class','subclass','infraclass') and sourceTaxonRank NOT IN('kingdom','phyllum','class','subclass','infraclass');", \
#                 (taxon,interactionsToExclude,))
curGlobi.execute("SELECT * from globi  WHERE targetTaxonName = ? and interactionTypeName NOT IN (?) and targetTaxonRank ='species' and sourceTaxonRank = 'species';", \
                 (taxon,interactionsToExclude,))
interactDataTaxon = curGlobi.fetchall()

In [13]:
curGlobi = connGlobi.cursor()
#curGlobi.execute("SELECT * from globi  WHERE sourceTaxonName = ? and interactionTypeName NOT IN (?) and targetTaxonRank NOT IN('kingdom','phyllum','class','subclass','infraclass') and sourceTaxonRank NOT IN('kingdom','phyllum','class','subclass','infraclass');", \
#                 (taxon,interactionsToExclude,))
curGlobi.execute("SELECT * from globi  WHERE sourceTaxonName = ? and interactionTypeName NOT IN (?) and targetTaxonRank ='species' and sourceTaxonRank = 'species';", \
                 (taxon,interactionsToExclude,))
sources = curGlobi.fetchall()

In [14]:
interactDataTaxon.extend(sources)
len(interactDataTaxon)

276

In [15]:
# Convert to a Pandas dataframe
interactDataTaxon = pd.DataFrame(interactDataTaxon)

In [16]:
# Add column names
interactDataTaxon.columns = ['sourceTaxonId', \
                                'sourceTaxonIds','sourceTaxonName','sourceTaxonRank','sourceTaxonPathNames', \
                                'sourceTaxonPathIds','sourceTaxonPathRankNames','sourceTaxonSpeciesName','sourceTaxonSpeciesId',\
                                'sourceTaxonGenusName','sourceTaxonGenusId','sourceTaxonFamilyName','sourceTaxonFamilyId',\
                                'sourceTaxonOrderName','sourceTaxonOrderId','sourceTaxonClassName','sourceTaxonClassId',\
                                'sourceTaxonPhylumName','sourceTaxonPhylumId','sourceTaxonKingdomName','sourceTaxonKingdomId',\
                                'sourceId','sourceOccurrenceId','sourceCatalogNumber','sourceBasisOfRecordId',\
                                'sourceBasisOfRecordName','sourceLifeStageId','sourceLifeStageName','sourceBodyPartId',\
                                'sourceBodyPartName','sourcePhysiologicalStateId','sourcePhysiologicalStateName', \
                                'sourceSexId', 'sourceSexName','interactionTypeName',\
                                'interactionTypeId','targetTaxonId','targetTaxonIds','targetTaxonName',\
                                'targetTaxonRank','targetTaxonPathNames','targetTaxonPathIds','targetTaxonPathRankNames',\
                                'targetTaxonSpeciesName','targetTaxonSpeciesId','targetTaxonGenusName','targetTaxonGenusId',\
                                'targetTaxonFamilyName','targetTaxonFamilyId','targetTaxonOrderName','targetTaxonOrderId',\
                                'targetTaxonClassName','targetTaxonClassId','targetTaxonPhylumName','targetTaxonPhylumId',\
                                'targetTaxonKingdomName','targetTaxonKingdomId','targetId','targetOccurrenceId',\
                                'targetCatalogNumber','targetBasisOfRecordId','targetBasisOfRecordName','targetLifeStageId',\
                                'targetLifeStageName','targetBodyPartId','targetBodyPartName','targetPhysiologicalStateId',\
                                'targetPhysiologicalStateName', 'targetSexId', 'targetSexName',\
                                'decimalLatitude','decimalLongitude','localityId',\
                                'localityName','eventDateUnixEpoch','argumentTypeId','referenceCitation',\
                                'referenceDoi','referenceUrl','sourceCitation','sourceNamespace',\
                                'sourceArchiveURI','sourceDOI','sourceLastSeenAtUnixEpoch']
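
Typing the column list by hand is easy to get out of step with the table. If the `globi` table was created with named columns (an assumption, since its schema is not shown here), the names could be read from the cursor instead. A minimal sketch on a hypothetical three-column table:

```python
import sqlite3

# Hypothetical toy table; the real globi table has many more columns
conn_demo = sqlite3.connect(":memory:")
conn_demo.execute(
    "CREATE TABLE globi (sourceTaxonName TEXT, "
    "interactionTypeName TEXT, targetTaxonName TEXT)"
)
conn_demo.execute("INSERT INTO globi VALUES ('A', 'eats', 'B')")

cur = conn_demo.execute("SELECT * FROM globi")
# cursor.description holds one 7-item tuple per result column; item 0 is the column name
column_names = [description[0] for description in cur.description]
rows = cur.fetchall()

# Pair each value with its column name
records = [dict(zip(column_names, row)) for row in rows]
print(column_names)
print(records[0])
```

With pandas, the same list could be passed directly as `pd.DataFrame(rows, columns=column_names)`.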

## Get a list of all the primary interacting species

In [17]:
interactingTaxaData = interactDataTaxon.drop_duplicates()

In [18]:
interactingTaxaData

Unnamed: 0,sourceTaxonId,sourceTaxonIds,sourceTaxonName,sourceTaxonRank,sourceTaxonPathNames,sourceTaxonPathIds,sourceTaxonPathRankNames,sourceTaxonSpeciesName,sourceTaxonSpeciesId,sourceTaxonGenusName,...,eventDateUnixEpoch,argumentTypeId,referenceCitation,referenceDoi,referenceUrl,sourceCitation,sourceNamespace,sourceArchiveURI,sourceDOI,sourceLastSeenAtUnixEpoch
0,NCBI:28875,NCBI:28875,Rotavirus A,species,Viruses | dsRNA viruses | Reoviridae | Sedoreo...,NCBI:10239 | NCBI:35325 | NCBI:10880 | NCBI:68...,| | family | subfamily | genus | species,Rotavirus A,NCBI:28875,Rotavirus,...,1417392000000,https://en.wiktionary.org/wiki/support,"Sasaki M, Kajihara M, Changula K, Mori-Kajihar...",,,"Chen L, Liu B, Yang J, Jin Q, 2014. DBatVir: t...",globalbioticinteractions/dbatvir,https://github.com/globalbioticinteractions/db...,,2020-05-08T00:37:21.750Z\n
1,NCBI:1904411,NCBI:1904411,Rousettus aegyptiacus polyomavirus 1,species,"Viruses | dsDNA viruses, no RNA stage | Polyom...",NCBI:10239 | NCBI:35237 | NCBI:151341 | NCBI:1...,| | family | | species,Rousettus aegyptiacus polyomavirus 1,NCBI:1904411,,...,1325376000000,https://en.wiktionary.org/wiki/support,"Carr M, Gonzalez G, Sasaki M, Ito K, Ishii A, ...",,,"Chen L, Liu B, Yang J, Jin Q, 2014. DBatVir: t...",globalbioticinteractions/dbatvir,https://github.com/globalbioticinteractions/db...,,2020-05-08T00:37:21.750Z\n
2,GBIF:8879703,GBIF:8879703 | NCBI:11269 | OTT:4888314 | WD:Q...,Marburg marburgvirus,species,| Mononegavirales | Filoviridae | Marburgvirus...,GBIF:8 | GBIF:842 | GBIF:7759 | GBIF:9173584 |...,kingdom | order | family | genus | species,Marburg marburgvirus,GBIF:8879703,Marburgvirus,...,1199145600000,https://en.wiktionary.org/wiki/support,"Towner JS, Amman BR, Sealy TK, Carroll SA, Com...",,,"Chen L, Liu B, Yang J, Jin Q, 2014. DBatVir: t...",globalbioticinteractions/dbatvir,https://github.com/globalbioticinteractions/db...,,2020-05-08T00:37:21.750Z\n
5,GBIF:8879703,GBIF:8879703 | NCBI:11269 | OTT:4888314 | WD:Q...,Marburg marburgvirus,species,| Mononegavirales | Filoviridae | Marburgvirus...,GBIF:8 | GBIF:842 | GBIF:7759 | GBIF:9173584 |...,kingdom | order | family | genus | species,Marburg marburgvirus,GBIF:8879703,Marburgvirus,...,1167609600000,https://en.wiktionary.org/wiki/support,"Towner JS, Amman BR, Sealy TK, Carroll SA, Com...",,,"Chen L, Liu B, Yang J, Jin Q, 2014. DBatVir: t...",globalbioticinteractions/dbatvir,https://github.com/globalbioticinteractions/db...,,2020-05-08T00:37:21.750Z\n
15,GBIF:8879703,GBIF:8879703 | NCBI:11269 | OTT:4888314 | WD:Q...,Marburg marburgvirus,species,| Mononegavirales | Filoviridae | Marburgvirus...,GBIF:8 | GBIF:842 | GBIF:7759 | GBIF:9173584 |...,kingdom | order | family | genus | species,Marburg marburgvirus,GBIF:8879703,Marburgvirus,...,1372636800000,https://en.wiktionary.org/wiki/support,"Kajihara M, Hang'ombe BM, Changula K, Harima H...",,,"Chen L, Liu B, Yang J, Jin Q, 2014. DBatVir: t...",globalbioticinteractions/dbatvir,https://github.com/globalbioticinteractions/db...,,2020-05-08T00:37:21.750Z\n
16,GBIF:8879703,GBIF:8879703 | NCBI:11269 | OTT:4888314 | WD:Q...,Marburg marburgvirus,species,| Mononegavirales | Filoviridae | Marburgvirus...,GBIF:8 | GBIF:842 | GBIF:7759 | GBIF:9173584 |...,kingdom | order | family | genus | species,Marburg marburgvirus,GBIF:8879703,Marburgvirus,...,1535932800000,https://en.wiktionary.org/wiki/support,"Kajihara M, Hang'ombe BM, Changula K, Harima H...",,,"Chen L, Liu B, Yang J, Jin Q, 2014. DBatVir: t...",globalbioticinteractions/dbatvir,https://github.com/globalbioticinteractions/db...,,2020-05-08T00:37:21.750Z\n
18,GBIF:8879703,GBIF:8879703 | NCBI:11269 | OTT:4888314 | WD:Q...,Marburg marburgvirus,species,| Mononegavirales | Filoviridae | Marburgvirus...,GBIF:8 | GBIF:842 | GBIF:7759 | GBIF:9173584 |...,kingdom | order | family | genus | species,Marburg marburgvirus,GBIF:8879703,Marburgvirus,...,1183507200000,https://en.wiktionary.org/wiki/support,"Kuzmin IV, Niezgoda M, Franka R, Agwanda B, Ma...",,,"Chen L, Liu B, Yang J, Jin Q, 2014. DBatVir: t...",globalbioticinteractions/dbatvir,https://github.com/globalbioticinteractions/db...,,2020-05-08T00:37:21.750Z\n
19,GBIF:8879703,GBIF:8879703 | NCBI:11269 | OTT:4888314 | WD:Q...,Marburg marburgvirus,species,| Mononegavirales | Filoviridae | Marburgvirus...,GBIF:8 | GBIF:842 | GBIF:7759 | GBIF:9173584 |...,kingdom | order | family | genus | species,Marburg marburgvirus,GBIF:8879703,Marburgvirus,...,1104537600000,https://en.wiktionary.org/wiki/support,"Towner JS, Pourrut X, Albariño CG, Nkogue CN, ...",,,"Chen L, Liu B, Yang J, Jin Q, 2014. DBatVir: t...",globalbioticinteractions/dbatvir,https://github.com/globalbioticinteractions/db...,,2020-05-08T00:37:21.750Z\n
22,GBIF:8879703,GBIF:8879703 | NCBI:11269 | OTT:4888314 | WD:Q...,Marburg marburgvirus,species,| Mononegavirales | Filoviridae | Marburgvirus...,GBIF:8 | GBIF:842 | GBIF:7759 | GBIF:9173584 |...,kingdom | order | family | genus | species,Marburg marburgvirus,GBIF:8879703,Marburgvirus,...,1246492800000,https://en.wiktionary.org/wiki/support,"Maganga GD, Bourgarel M, Ella GE, Drexler JF, ...",,,"Chen L, Liu B, Yang J, Jin Q, 2014. DBatVir: t...",globalbioticinteractions/dbatvir,https://github.com/globalbioticinteractions/db...,,2020-05-08T00:37:21.750Z\n
23,GBIF:8879703,GBIF:8879703 | NCBI:11269 | OTT:4888314 | WD:Q...,Marburg marburgvirus,species,| Mononegavirales | Filoviridae | Marburgvirus...,GBIF:8 | GBIF:842 | GBIF:7759 | GBIF:9173584 |...,kingdom | order | family | genus | species,Marburg marburgvirus,GBIF:8879703,Marburgvirus,...,1325376000000,https://en.wiktionary.org/wiki/support,"Amman BR, Nyakarahuka L, McElroy AK, Dodd KA, ...",,,"Chen L, Liu B, Yang J, Jin Q, 2014. DBatVir: t...",globalbioticinteractions/dbatvir,https://github.com/globalbioticinteractions/db...,,2020-05-08T00:37:21.750Z\n


In [19]:
primaryInteractingTaxa = interactingTaxaData['sourceTaxonName'].drop_duplicates()
len(primaryInteractingTaxa)

19

In [20]:
# pd.concat works across pandas versions; Series.append is not available in recent releases
primaryInteractingTaxa = pd.concat([primaryInteractingTaxa, interactingTaxaData['targetTaxonName']]).drop_duplicates()
len(primaryInteractingTaxa)

58

In [21]:
primaryInteractingTaxa

0                               Rotavirus A
1      Rousettus aegyptiacus polyomavirus 1
2                      Marburg marburgvirus
82                     Lagos bat lyssavirus
85                              Yogue virus
87                            Bat rotavirus
89                          Bat coronavirus
99           Rousettus bat coronavirus HKU9
134                      Bat mastadenovirus
148                            Homo sapiens
198                              Flavivirus
199                         Lagos bat virus
201                          Shamonda virus
202                        Zaire ebolavirus
204                 European bat lyssavirus
205                       Chikungunya virus
212                             Felis catus
213                   Eucampsipoda africana
214                   Rousettus aegyptiacus
214                      Ekebergia capensis
215                    Elaeodendron croceum
216               Acokanthera oppositifolia
217                Miniopterus s

## A function to query the globi database

In [22]:
def secondaryDataInGlobi(conn, intaxon):
    cur = conn.cursor()
    cur.execute("SELECT * from globi where (targetTaxonName = ? OR sourceTaxonName = ?) and interactionTypeName NOT IN (?) and targetTaxonRank ='species' and sourceTaxonRank = 'species';", \
                (intaxon,intaxon,interactionsToExclude,))
    return(cur.fetchall())

## Get all the secondary interactions

In [23]:
try:
    conn = sqlite3.connect(globiDB)
except Error as e:
    print(e)

In [None]:
secondarylist = []

for name in primaryInteractingTaxa:
    temp = secondaryDataInGlobi(conn, name)
    secondarylist.extend(temp)

In [None]:
# Convert to a Pandas dataframe
secondarylist = pd.DataFrame(secondarylist)

In [None]:
secondarylist.columns = ['sourceTaxonId', \
                                'sourceTaxonIds','sourceTaxonName','sourceTaxonRank','sourceTaxonPathNames', \
                                'sourceTaxonPathIds','sourceTaxonPathRankNames','sourceTaxonSpeciesName','sourceTaxonSpeciesId',\
                                'sourceTaxonGenusName','sourceTaxonGenusId','sourceTaxonFamilyName','sourceTaxonFamilyId',\
                                'sourceTaxonOrderName','sourceTaxonOrderId','sourceTaxonClassName','sourceTaxonClassId',\
                                'sourceTaxonPhylumName','sourceTaxonPhylumId','sourceTaxonKingdomName','sourceTaxonKingdomId',\
                                'sourceId','sourceOccurrenceId','sourceCatalogNumber','sourceBasisOfRecordId',\
                                'sourceBasisOfRecordName','sourceLifeStageId','sourceLifeStageName','sourceBodyPartId',\
                                'sourceBodyPartName','sourcePhysiologicalStateId','sourcePhysiologicalStateName',\
                                'sourceSexId', 'sourceSexName', 'interactionTypeName',\
                                'interactionTypeId','targetTaxonId','targetTaxonIds','targetTaxonName',\
                                'targetTaxonRank','targetTaxonPathNames','targetTaxonPathIds','targetTaxonPathRankNames',\
                                'targetTaxonSpeciesName','targetTaxonSpeciesId','targetTaxonGenusName','targetTaxonGenusId',\
                                'targetTaxonFamilyName','targetTaxonFamilyId','targetTaxonOrderName','targetTaxonOrderId',\
                                'targetTaxonClassName','targetTaxonClassId','targetTaxonPhylumName','targetTaxonPhylumId',\
                                'targetTaxonKingdomName','targetTaxonKingdomId','targetId','targetOccurrenceId',\
                                'targetCatalogNumber','targetBasisOfRecordId','targetBasisOfRecordName','targetLifeStageId',\
                                'targetLifeStageName','targetBodyPartId','targetBodyPartName','targetPhysiologicalStateId',\
                                'targetPhysiologicalStateName', 'targetSexId', 'targetSexName', \
                                'decimalLatitude','decimalLongitude','localityId',\
                                'localityName','eventDateUnixEpoch','argumentTypeId','referenceCitation',\
                                'referenceDoi','referenceUrl','sourceCitation','sourceNamespace',\
                                'sourceArchiveURI','sourceDOI','sourceLastSeenAtUnixEpoch']

In [None]:
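# pd.concat works across pandas versions; DataFrame.append is not available in recent releases
allInteractionsData = pd.concat([interactingTaxaData, secondarylist])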

In [None]:
allInteractionsData = allInteractionsData.drop_duplicates()


In [None]:
cleanInteractDataTaxon = allInteractionsData[['sourceTaxonId', 'sourceTaxonName', 'sourceTaxonRank',
       'sourceTaxonFamilyName', 'interactionTypeName',
       'targetTaxonName','targetTaxonRank', 'sourceCitation', 'sourceDOI'
        ]].dropna(subset=['targetTaxonName','sourceTaxonName'])

In [None]:
allSourceInteractingTaxa = cleanInteractDataTaxon['sourceTaxonName'].drop_duplicates()
allTargetInteractingTaxa = cleanInteractDataTaxon['targetTaxonName'].drop_duplicates()

In [None]:
allInteractingTaxa = pd.concat([allSourceInteractingTaxa, allTargetInteractingTaxa]).drop_duplicates()
allInteractingTaxa.count()

In [None]:
allInteractingTaxa.head()

In [None]:
# How many different sorts of interaction do I have left?
# Count the records for each interaction type
cleanInteractDataTaxon.groupby(cleanInteractDataTaxon['interactionTypeName']).size().sort_values(ascending = False)

### This function takes a name string and checks on GBIF to see if the name exists there.

In [None]:
def speciesExistsInGBIF(name, rank):
    try:
        match = species.name_backbone(name=name, rank=rank, limit = 1)

        # if there is no match this is returned from .name_backbone {'confidence': 100, 'matchType': 'NONE', 'synonym': False}
        if match['matchType'] == 'NONE':
            return False
        else:
            return match
    except ValueError as ve:
        print(ve)
        exit(1)

### Check that the species in question is actually found on GBIF

In [None]:
if(speciesExistsInGBIF(taxon, "species") == False):
    print("##### {0} has not been found on GBIF #####".format(taxon))

### Check to see which taxa in the interaction network are found in GBIF and list those ones that are not

In [None]:
taxaFound = {}

print('Taxa from GLoBI, but not found in GBIF')
for name in allInteractingTaxa.items():
    GBIFName = speciesExistsInGBIF(name[1], "species")
    if GBIFName == False:
        taxaFound[name[1]] = False
        print(name[1])
    else:
        taxaFound[name[1]] = GBIFName['usageKey']
    

In [None]:
# Convert to a Pandas dataframe
taxaFound = pd.DataFrame.from_dict(taxaFound, orient='index')

In [None]:
len(taxaFound)

In [None]:
taxaFound

## Drawing a network of the interactions

Now that I have a list of all the species in the country I can use this as my nodes list for the network diagram.

In [None]:
#networkx seems to be a leading network tool in Python
import networkx as nx
import matplotlib.pyplot as plt

try:
    import pygraphviz
    from networkx.drawing.nx_agraph import write_dot
    print("using package pygraphviz")
except ImportError:
    try:
        import pydot
        from networkx.drawing.nx_pydot import write_dot
        print("using package pydot")
    except ImportError:
        print()
        print("Neither pygraphviz nor pydot was found")
        print("see  https://networkx.github.io/documentation/latest/reference/drawing.html")
        print()

In [None]:
# Create graphic object
G = nx.DiGraph()

In [None]:
# Match colours to interactions to distinguish them on the graph
colorInteractions = {'interaction':['pollinates', 'mutualistOf', 'eats', 'visitsFlowersOf', 'hasHost', 'parasiteOf', 'pathogenOf'],
        'colour':['r', 'g', 'b', 'y', 'm', 'w', 'c']}  

colorInteractionsDf = pd.DataFrame(colorInteractions)

#len(list(G.nodes))
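
One way the colour table above could be used is as a plain dictionary lookup with a fallback for interaction types that have no colour assigned. This is a sketch only: the helper `colour_for` and the default `'k'` (black in matplotlib) are assumptions, not something the notebook defines.

```python
# The same interaction/colour pairs as in the cell above, as a dictionary
interaction_colours = dict(zip(
    ['pollinates', 'mutualistOf', 'eats', 'visitsFlowersOf', 'hasHost', 'parasiteOf', 'pathogenOf'],
    ['r', 'g', 'b', 'y', 'm', 'w', 'c'],
))


def colour_for(interaction, default='k'):
    """Look up the colour for an interaction type, falling back to a default."""
    return interaction_colours.get(interaction, default)


edge_labels_demo = ['eats', 'pathogenOf', 'coOccursWith']
edge_colours_demo = [colour_for(label) for label in edge_labels_demo]
print(edge_colours_demo)
```

A list built this way, in the same order as the edges, is the kind of value that networkx drawing functions accept for `edge_color`.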

## A quick look at the interaction data to see if it is what is expected

In [None]:
#cleanInteractDataTaxon.loc[(cleanInteractDataTaxon["sourceTaxonName"] == 'Apis mellifera') & (cleanInteractDataTaxon["targetTaxonName"] == 'Procyon lotor')]
cleanInteractDataTaxon.loc[(cleanInteractDataTaxon["targetTaxonName"] == 'Pipistrellus pipistrellus')]

### Add the nodes to the graph

In [None]:
for index, row in taxaFound.iterrows():
    # access data using column names
    #print('A: ', row.name, row[0])
    G.add_node(row.name, gbifkey=row[0])
    #create a list of node sizes scaled for the network visualization

### Add edges to the graph

In [None]:
# Iterate over the interacting species that are in GBIF and in the country with iterrows().
# Find the taxa found in the country that are in the source taxon name of the interaction data,
# then add the edge if the target species is in the country too.

taxaFound_copy = taxaFound.copy()

for index, row in taxaFound.iterrows():
    # loop over all the taxa finding if any of them are mentioned in the sourceTaxonName field
    for edge in cleanInteractDataTaxon.iterrows():
        if row.name == edge[1]['sourceTaxonName']:
            #print('B: ', edge[1]['sourceTaxonName'], edge[1]['targetTaxonName'])
            # Some of the target species will not be in GBIF or in the country, so only add an edge where they are.
            for index2, row2 in taxaFound_copy.iterrows():
                #print('E: ', row2.name, edge[1]['targetTaxonName'])
                if row2.name == edge[1]['targetTaxonName']:
                    print('C: ', edge[1]['targetTaxonName'], row.name, edge[1]['interactionTypeName'])
                    G.add_edge(edge[1]['targetTaxonName'], row.name, label = edge[1]['interactionTypeName'])
                    
#len(list(G.nodes))


In [None]:
# iterate over rows with iterrows()

# Find the taxa found in the country that are in the target taxon name of the interaction data,
# then add the edge if the source species is in the country too.

for index, row in taxaFound.iterrows():
    for edge in cleanInteractDataTaxon.iterrows():
        if row.name == edge[1]['targetTaxonName']:
            #print('D: ', edge[1]['sourceTaxonName'], edge[1]['targetTaxonName'])
            #G.add_node(edge[1]['sourceTaxonName'], gbifkey=row['key'])
            #dictOfNodeSizes[edge[1]['sourceTaxonName']] = int(row['count']/maxRecords*100)
            for index2, row2 in taxaFound.iterrows():
                #print('E: ', row2['species'])
                if row2.name == edge[1]['sourceTaxonName']: 
                    print('F: ', edge[1]['sourceTaxonName'],edge[1]['targetTaxonName'])
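                    # row2.name equals the source taxon here, so link it to the target taxon
                    # (same orientation as the previous cell) to avoid creating a self-loop
                    G.add_edge(edge[1]['targetTaxonName'], row2.name, label = edge[1]['interactionTypeName'])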
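
The nested loops above compare every found taxon with every interaction row. The same selection can be sketched with a set membership test, which needs a single pass over the interactions. This is an illustration on hypothetical toy data; the function name and the source-to-target edge orientation are assumptions and may differ from the orientation used in the cells above.

```python
def edges_between_found_taxa(found_names, interactions):
    """Keep interactions whose source and target are both found, skipping self-loops.

    `interactions` is an iterable of (source, interaction_type, target) tuples.
    Returns (source, target, interaction_type) tuples.
    """
    found = set(found_names)  # set lookup is constant time on average
    edges = []
    for source, interaction, target in interactions:
        if source in found and target in found and source != target:
            edges.append((source, target, interaction))
    return edges


# Hypothetical toy data
found_demo = ["Bat X", "Fig Y", "Virus A"]
interactions_demo = [
    ("Virus A", "pathogenOf", "Bat X"),
    ("Bat X", "eats", "Fig Y"),
    ("Wasp Z", "pollinates", "Fig Y"),    # source not found, so skipped
    ("Bat X", "interactsWith", "Bat X"),  # self-loop, so skipped
]

edges_demo = edges_between_found_taxa(found_demo, interactions_demo)
for edge in edges_demo:
    print(edge)
```

Each returned tuple could then be added with `G.add_edge(source, target, label=interaction)`.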

In [None]:
print("Number of nodes = {0}".format(G.number_of_nodes()))
print("Number of edges = {0}".format(G.number_of_edges()))

## Remove any nodes that have no edges.
This happens because some of the linking nodes have few supporting observations and so have been weeded out.


In [None]:
#for n in G.nodes:
#    if G.degree(n) == 0:
#        G.remove_node(n)
        
G.remove_nodes_from(list(nx.isolates(G)))

In [None]:
len(G.nodes)

## Remove any selfloop edges

In [None]:
# Build the list first so the graph is not modified while it is being iterated over
G.remove_edges_from(list(nx.selfloop_edges(G)))

## A network of all the interacting taxa

In [None]:
plt.figure(figsize=(15,15))
edge_labels = nx.get_edge_attributes(G,'label')

pos = nx.spring_layout(G, iterations=50, k=50) 
#pos = nx.spring_layout(G)
#pos = nx.random_layout(G)
#pos = nx.circular_layout(G)
#pos = nx.spectral_layout(G)
#pos = nx.shell_layout(G, scale=1)

nodeColors = nx.get_node_attributes(G,'color')

nx.draw_networkx_edge_labels(G,pos, edge_labels = edge_labels, font_size=10, font_color='blue')

#nx.draw_networkx_nodes(G, pos, node_color=nodeColors.values())

nx.draw_networkx(G, pos, with_labels=True, node_size = 12, node_color='b', alpha= 1, arrows=True, 
                    linewidths=1, font_color="black", font_size=10, style = 'dashed')

plt.axis('off')
plt.tight_layout()
plt.show()

In [None]:
with open(taxon+country+".html", "w") as file:
    file.write(" \
<!DOCTYPE html> \
<html> \
<head> \
<script src='../../../GitHub/cytoscape.js/dist/cytoscape.min.js'></script> \
<script src='https://unpkg.com/layout-base/layout-base.js'></script> \
<script src='https://unpkg.com/cose-base/cose-base.js'></script> \
<script src='../../../GitHub/cytoscape.js-cose-bilkent/cytoscape-cose-bilkent.js'></script> \
<style>#cy {width: 90%; height: 90%; position: absolute; top: 50px; left: 150px;}\
body {font-family: 'times'; font-size: 6px;}\
</style> \
</head> \
<body> \
<h1><em style='font-style: italic;'>"+taxon+"</em></h1>")

### Write out the details of the species

In [None]:
with open(taxon+country+".html", "a") as file:
    file.write("<table align='left' style='margin-left: 0px'><tbody><tr><td width='20%'>")
    file.write("<table><tr><th>Species</th></tr>")
    

In [None]:
# Note: this rebinds `species`, shadowing the pygbif module imported earlier
species = G.nodes
with open(taxon+country+".html", "a") as file:
    for n in species:
        file.write("<tr><td><a target='_blank' href='https://www.gbif.org/species/"+str(nx.get_node_attributes(G, 'gbifkey')[n])+"'>"+n+"</a></td></tr>\n")
        #file.write("<a href=https://www.gbif.org/species/"+str(nx.get_node_attributes(G, 'gbifkey')[n])+">"+n+"</a>, "+str(dictOfNodeSizes[n])+"\n")

In [None]:
with open(taxon+country+".html", "a") as file:
    file.write("</table>")
    file.write("</td><td width='80%'>")

In [None]:
with open(taxon+country+".html", "a") as file:
    file.write(" \
<div id='cy'></div> \
<script> \
var cy = cytoscape({ \
  container: document.getElementById('cy'), \n \
  elements: [ \
")

### Write nodes to file

In [None]:
file = open(taxon+country+".html", "a")
for n in species:
    file.write("{ data: { id: '"+n+"', href: 'https://www.gbif.org/species/"+str(nx.get_node_attributes(G, 'gbifkey')[n])+"', occnum: 2 }, },\n")
file.close()

### Write edges to file

In [None]:
file = open(taxon+country+".html", "a")
for edge in G.edges:
    file.write("{data: {id: '"+edge[0]+edge[1]+"', source: '"+edge[0]+"', target: '"+edge[1]+"', label: '"+nx.get_edge_attributes(G, 'label')[edge]+"'}},\n")
file.close()
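
Building the element list by string concatenation breaks as soon as a taxon name contains an apostrophe. A possible alternative is to build Python dictionaries and serialise them with `json.dumps`, since JSON text is generally usable as a JavaScript literal. The sketch below is an assumption-laden illustration: `cytoscape_elements` is a hypothetical helper, and the names and keys in the demo call are invented.

```python
import json


def cytoscape_elements(nodes, edges):
    """Serialise nodes and edges into a JSON array of Cytoscape-style elements.

    `nodes` is an iterable of (name, gbif_key); `edges` of (source, target, label).
    """
    elements = []
    for name, gbif_key in nodes:
        elements.append({
            "data": {
                "id": name,
                "href": "https://www.gbif.org/species/" + str(gbif_key),
            }
        })
    for source, target, label in edges:
        elements.append({
            "data": {
                "id": source + target,
                "source": source,
                "target": target,
                "label": label,
            }
        })
    # json.dumps takes care of quoting and escaping
    return json.dumps(elements, indent=1)


# Hypothetical names and keys; the apostrophe would break hand-built quoting
demo_json = cytoscape_elements(
    [("Taxon O'Example", 1), ("Fig Y", 2)],
    [("Taxon O'Example", "Fig Y", "eats")],
)
print(demo_json)
```

The resulting string could be written to the HTML file in place of the hand-assembled `elements: [ ... ]` body.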
    

In [None]:
with open(taxon+country+".html", "a") as file:
    file.write("], \
style: [ \n\
        { \n\
            selector: 'node', \n\
            style: { \n\
                shape: 'circle', \n\
                'background-color': 'darkgreen', \n\
                label: 'data(id)', \n\
                'font-family': 'helvetica', \n\
                'font-style': 'italic', \n\
                'font-size': '8px', \n\
                'width': 'mapData(occnum, 0, 400, 3, 150)', \n\
                'height': 'mapData(occnum, 0, 400, 3, 150)' \n\
            } \n\
        },  \n\
        {  \n\
            selector: 'edge',  \n\
            style: {  \n\
                label: 'data(label)', \n\
                'font-family': 'helvetica', \n\
                'font-size': '8px', \n\
                'color': 'blue', \n\
                'curve-style': 'bezier', \n\
                'target-arrow-shape': 'triangle',  \n\
                'width': '1' \n\
                } \n\
            },  \n\
            {  \n\
              selector: ':selected',   \n\
              css: {  \n\
                'line-color': 'red',  \n\
                'background-color': 'red'  \n\
            }  \n\
        }], \n\
layout:  { \n\
            name: 'circle', padding: 10, animate: true, gravity: 30, animationDuration: 1000 \n\
     } \n\
} \n\
); \n\
cy.userZoomingEnabled( true ); \n\
</script> \n\
")

In [None]:
with open(taxon+country+".html", "a") as file:
    file.write("</td></tr></tbody></table>\n")
    file.write("<h2>References</h2><ul>\n")

In [None]:
citations = cleanInteractDataTaxon['sourceCitation'].unique()
file = open(taxon+country+".html", "a")
for ref in citations:
    file.write("<li>"+str(ref)+"</li>\n")
file.close()

In [None]:
with open(taxon+country+".html", "a") as file:
    file.write("</ul> \
        </body> \
        </html>")

## Output a CSV file for import into Gephi

In [None]:
for node in G.edges:
    print(node[1])

In [None]:
with open(taxon+"_nodes.csv", "w") as file:
    # One row per node; the taxon name serves as both Id and Label
    file.write("Id,Label\n")
    for node in G.nodes:
        file.write('"'+node+'","'+node+'"\n')

In [None]:
with open(taxon+"edges.csv", "w") as file:
    # Gephi reads the Source and Target columns as the edge endpoints
    file.write("Id,Source,Target,Label\n")
    for edge in G.edges:
        file.write('"'+edge[0]+edge[1]+'","'+edge[0]+'","'+edge[1]+'","'+nx.get_edge_attributes(G, 'label')[edge]+'"\n')

In [None]:
#with open(taxon+".csv", "w") as file:
write_dot(G, taxon+".dot")