# Interactias Geo Selected Network

One way of examining the impact of invasive species is to look at all their interactions and the interactions those organisms have with each other. This full interaction network gives you an indication of whether a species might be a "keystone" species and therefore have a disproportionately large impact.

In this version of the script, the species in the network are quantified by their occupancy.

I will harvest species interaction data from GloBI (https://www.globalbioticinteractions.org/) to discover the species that interact with an invasive species.
I will then harvest all the interactions of those species to create two tiers of interactions.
I will then count the occurrences of these species in the Belgian data cube.
I will then visualize this.
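
The two-tier harvest described above can be sketched on a toy edge list before touching the real database. This is only an illustration: the function names `interactions_involving` and `two_tier_network` are hypothetical, and the records use placeholder labels rather than real GloBI data.

```python
# Toy interaction records as (source, interactionType, target).
# The labels are placeholders invented for this sketch, not GloBI data.
TOY_RECORDS = [
    ("A", "preysOn", "focal"),
    ("focal", "eats", "B"),
    ("A", "preysOn", "C"),
    ("D", "eats", "B"),
    ("E", "eats", "F"),
]


def interactions_involving(records, taxon):
    """Return every record in which the taxon is either source or target."""
    return [r for r in records if taxon in (r[0], r[2])]


def two_tier_network(records, focal):
    """Collect the focal taxon's interactions, then those of its partners."""
    primary = interactions_involving(records, focal)
    # Every taxon named in a primary record (this includes the focal taxon).
    primary_taxa = {name for s, _, t in primary for name in (s, t)}
    secondary = []
    for taxon in sorted(primary_taxa):
        secondary.extend(interactions_involving(records, taxon))
    # Remove duplicate records while keeping first-seen order.
    all_edges = list(dict.fromkeys(primary + secondary))
    return primary_taxa, all_edges


primary_taxa, all_edges = two_tier_network(TOY_RECORDS, "focal")
print(sorted(primary_taxa))
print(len(all_edges))
```

The record between `E` and `F` is never reached because neither taxon interacts directly with the focal taxon, which is the intended cut-off of a two-tier network.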

This notebook takes considerable inspiration and code from Yikang Li's project on GloBI (https://curiositydata.org/part1_globi_access/).


### For information: This is the list of invasive alien species of Union concern
|Species (A to H)|Species (H to O)|Species (O to V)|
|--|--|--|
|Acacia saligna|Heracleum sosnowskyi|Orconectes virilis|
|Acridotheres tristis|Herpestes javanicus|Oxyura jamaicensis|
|Ailanthus altissima|Humulus scandens|Pacifastacus leniusculus|
|Alopochen aegyptiaca|Hydrocotyle ranunculoides|Parthenium hysterophorus|
|Alternanthera philoxeroides|Impatiens glandulifera|Pennisetum setaceum|
|Andropogon virginicus|Lagarosiphon major|Perccottus glenii|
|Arthurdendyus triangulatus|Lepomis gibbosus|Persicaria perfoliata|
|Asclepias syriaca|Lespedeza cuneata|Plotosus lineatus|
|Baccharis halimifolia|Lithobates catesbeianus|Procambarus clarkii|
|Cabomba caroliniana|Ludwigia grandiflora|Procambarus fallax|
|Callosciurus erythraeus|Ludwigia peploides|Procyon lotor|
|Cardiospermum grandiflorum|Lygodium japonicum|Prosopis juliflora|
|Cortaderia jubata|Lysichiton americanus|Pseudorasbora parva|
|Corvus splendens|Microstegium vimineum|Pueraria montana|
|Ehrharta calycina|Muntiacus reevesi|Salvinia molesta|
|Eichhornia crassipes|Myocastor coypus|Sciurus carolinensis|
|Elodea nuttallii|Myriophyllum aquaticum|Sciurus niger|
|Eriocheir sinensis|Myriophyllum heterophyllum|Tamias sibiricus|
|Gunnera tinctoria|Nasua nasua|Threskiornis aethiopicus|
|Gymnocoronis spilanthoides|Nyctereutes procyonoides|Trachemys scripta|
|Heracleum mantegazzianum|Ondatra zibethicus|Triadica sebifera|
|Heracleum persicum|Orconectes limosus|Vespa velutina|

In [16]:
import sys
print(sys.version)

#Python 3.7.4 (default, Aug 9 2019, 18:34:13) [MSC v.1915 64 bit (AMD64)]
#pygbif 0.3.0

3.7.3 (default, Mar 27 2019, 17:13:21) [MSC v.1915 64 bit (AMD64)]


In [17]:
import pandas as pd
import re
import matplotlib.pyplot as plt
from pygbif import species
from pygbif import occurrences as occ
import sqlite3
from sqlite3 import Error
import pydot

### Setting up some parameters

In [18]:
# A species is only kept if its count in the cube exceeds this threshold.
# Accepting every species with more than one record lets in too many casual records of plants and birds.
thresholdForOccNum = 5
# If you want to exclude an interaction add it here
#interactionsToExclude = "interactsWith"
interactionsToExclude = ""

In [68]:

## Define the country of interest
country  = 'BE'

## Define the year after which records count towards occupancy (the cube query uses year > ?, so this year itself is excluded)
year = 2000

## Define the place to find the data cube for occupancy data
## Currently the cube contains only Belgian data so that is all that can be used
database = r"..\..\data\cube.db" 

## Define the place to find the interaction data
globiDB = r"..\..\globi\globi.db"

### Define the key taxon for the notebook for which to find all interactions


In [20]:
#taxon = "Oxalis corniculata" #creeping woodsorrel
#taxon = "Oxalis pes-caprae" #Bermuda buttercup
#taxon = "Abramis brama" #common bream
#taxon = "Dikerogammarus villosus" #killer shrimp
#taxon = "Lantanophaga pusillidactyla" #lantana plume moth
#taxon = "Lantana camara" #common lantana
#taxon = "Cirsium vulgare" #spear thistle
#taxon = "Solenopsis invicta" #red imported fire ant
#taxon = "Linepithema humile" #Argentine ant
#taxon = "Procyon lotor" # raccoon
#taxon = "Carpobrotus edulis" #Hottentot-fig
taxon = "Sciurus carolinensis" # Eastern grey squirrel
#taxon = "Pipistrellus pipistrellus" # Common Pipistrelle
#taxon = "Rousettus aegyptiacus" #Egyptian fruit bat
#taxon = "Ailanthus altissima" #tree of heaven
#taxon = "Triturus carnifex" #Italian crested newt
#taxon = "Xenopus laevis" #African clawed frog
#taxon = "Rhinella marina" #cane toad
#taxon = "Cynops pyrrhogaster" #Japanese Newt
#taxon = "Pachytriton labiatus" #paddle-tail newt
#taxon = "Pleurodeles waltl" #Iberian ribbed newt
#taxon = "Podarcis sicula" #Italian wall lizard

#taxon=input()


In [21]:
taxon = taxon.strip()

## Check to see if the taxon exists in GBIF

In [22]:
try:
    # NOTE: name_suggest and name_backbone treat the gender of Latin names differently.
    # If name_backbone is given a name in one gender it can still return the best match from the GBIF backbone,
    # even if the backbone spells that name in another gender.
    #key = species.name_suggest(q=taxon, limit = 1)
    match = species.name_backbone(name=taxon, limit = 1)
    #print(key)
    
    # If there is no match, .name_backbone returns {'confidence': 100, 'matchType': 'NONE', 'synonym': False}
    if match['matchType'] == 'NONE':
        raise ValueError("TAXON NOT FOUND ON GBIF!")
    else:
        key = match['usageKey']
except ValueError as ve:
    print(ve)
    exit(1)
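
The matching logic in the cell above can be reduced to a small helper that is testable without network access. This is a sketch, not the notebook's own code: `usage_key_from_match` is a hypothetical name, the unmatched dictionary is the one quoted in the comment above, and the matched dictionary is a cut-down assumption about the response (only `usageKey`, `matchType` and `scientificName` are shown).

```python
def usage_key_from_match(match):
    """Return the GBIF usageKey from a name_backbone-style dict, or None if unmatched."""
    # A missing matchType is treated like 'NONE' (an assumption of this sketch).
    if match.get("matchType", "NONE") == "NONE":
        return None
    return match.get("usageKey")


# The response documented above for a name with no match.
no_match = {"confidence": 100, "matchType": "NONE", "synonym": False}

# Assumed, cut-down shape of a successful match; real responses carry more fields.
matched = {
    "usageKey": 5219681,
    "matchType": "EXACT",
    "scientificName": "Sciurus carolinensis Gmelin, 1788",
}

print(usage_key_from_match(no_match))
print(usage_key_from_match(matched))
```

Returning `None` instead of raising lets the caller decide whether an unmatched name should stop the run or simply be skipped.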




In [23]:

print('The taxon to be studied is ' + match['scientificName'])

The taxon to be studied is Sciurus carolinensis Gmelin, 1788


### Just for information take a look at the interactions that the taxon has

In [24]:
# What are all the types of interactions involving taxon as source taxon?
#data[data['sourceTaxonName'] == taxon]['interactionTypeName'].unique()
try:
    connGlobi = sqlite3.connect(globiDB)
except Error as e:
    print(e)

In [25]:
curGlobi = connGlobi.cursor()
curGlobi.execute("SELECT interactionTypeName from globi  WHERE sourceTaxonSpeciesName = ? GROUP BY interactionTypeName;", (taxon,))
interactDataTaxon = curGlobi.fetchall()

In [26]:
interactDataTaxon

[('eats',), ('interactsWith',), ('preysOn',)]

In [27]:
curGlobi = connGlobi.cursor()
curGlobi.execute("SELECT interactionTypeName from globi  WHERE targetTaxonSpeciesName = ? GROUP BY interactionTypeName;", (taxon,))
interactDataTaxon = curGlobi.fetchall()
interactDataTaxon

[('eats',),
 ('ectoparasiteOf',),
 ('endoparasiteOf',),
 ('hasHost',),
 ('interactsWith',),
 ('parasiteOf',),
 ('pathogenOf',),
 ('preysOn',)]

## Get the primary interaction data for the species in question

This is limited to the rank of species because otherwise the network can get very large with rather irrelevant higher taxa.
However, if no interactions are found at the species level it might be worth removing this restriction.

In [28]:
curGlobi = connGlobi.cursor()
#curGlobi.execute("SELECT * from globi  WHERE targetTaxonName = ? and interactionTypeName NOT IN (?);", \
if interactionsToExclude == "":
    curGlobi.execute("SELECT * from globi  WHERE targetTaxonSpeciesName = ?;", \
                 (taxon,))
else:
    curGlobi.execute("SELECT * from globi  WHERE targetTaxonSpeciesName = ? and interactionTypeName NOT IN (?);", \
                 (taxon,interactionsToExclude,))
interactDataTaxon = curGlobi.fetchall()

In [29]:
curGlobi = connGlobi.cursor()
#curGlobi.execute("SELECT * from globi  WHERE sourceTaxonName = ? and interactionTypeName NOT IN (?);", \
if interactionsToExclude == "":
    curGlobi.execute("SELECT * from globi  WHERE sourceTaxonSpeciesName = ?;", \
                 (taxon,))
else:
    curGlobi.execute("SELECT * from globi  WHERE sourceTaxonSpeciesName = ? and interactionTypeName NOT IN (?);", \
                 (taxon,interactionsToExclude,))
sources = curGlobi.fetchall()
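
Because the queries above bind a single `?` inside `NOT IN (?)`, `interactionsToExclude` can only name one interaction type. A minimal sketch of how several types might be excluded follows. It assumes the exclusions are held in a list, and the helper names `build_exclusion_query` and `fetch_interactions` are hypothetical. The demo table has only three of the real table's columns.

```python
import sqlite3

ALLOWED_COLUMNS = ("sourceTaxonSpeciesName", "targetTaxonSpeciesName")


def build_exclusion_query(column, excluded_types):
    """Build the SELECT with one placeholder per excluded interaction type."""
    base = f"SELECT * FROM globi WHERE {column} = ?"
    if not excluded_types:
        return base + ";"
    placeholders = ", ".join("?" for _ in excluded_types)
    return f"{base} AND interactionTypeName NOT IN ({placeholders});"


def fetch_interactions(conn, column, taxon, excluded_types=()):
    """Fetch rows for a taxon, skipping any of the excluded interaction types."""
    # The column name is interpolated into the SQL, so only known names are accepted.
    if column not in ALLOWED_COLUMNS:
        raise ValueError(f"unexpected column: {column}")
    sql = build_exclusion_query(column, excluded_types)
    return conn.execute(sql, (taxon, *excluded_types)).fetchall()


# In-memory stand-in for globi.db with placeholder rows (not real GloBI data).
demo = sqlite3.connect(":memory:")
demo.execute(
    "CREATE TABLE globi (sourceTaxonSpeciesName TEXT, "
    "interactionTypeName TEXT, targetTaxonSpeciesName TEXT)"
)
demo.executemany(
    "INSERT INTO globi VALUES (?, ?, ?)",
    [
        ("A", "eats", "focal"),
        ("B", "interactsWith", "focal"),
        ("C", "visits", "focal"),
    ],
)

kept = fetch_interactions(
    demo, "targetTaxonSpeciesName", "focal", ["interactsWith", "visits"]
)
print(kept)
```

Only the values are bound as parameters. The column name is checked against a fixed tuple because SQLite placeholders cannot stand in for identifiers.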

In [30]:
interactDataTaxon.extend(sources)
len(interactDataTaxon)

937

In [31]:
# Convert to a Pandas dataframe
interactDataTaxon = pd.DataFrame(interactDataTaxon)

In [32]:
# Add column names
interactDataTaxon
interactDataTaxon.columns = ['sourceTaxonId', \
                                'sourceTaxonIds','sourceTaxonName','sourceTaxonRank','sourceTaxonPathNames', \
                                'sourceTaxonPathIds','sourceTaxonPathRankNames','sourceTaxonSpeciesName','sourceTaxonSpeciesId',\
                                'sourceTaxonGenusName','sourceTaxonGenusId','sourceTaxonFamilyName','sourceTaxonFamilyId',\
                                'sourceTaxonOrderName','sourceTaxonOrderId','sourceTaxonClassName','sourceTaxonClassId',\
                                'sourceTaxonPhylumName','sourceTaxonPhylumId','sourceTaxonKingdomName','sourceTaxonKingdomId',\
                                'sourceId','sourceOccurrenceId','sourceCatalogNumber','sourceBasisOfRecordId',\
                                'sourceBasisOfRecordName','sourceLifeStageId','sourceLifeStageName','sourceBodyPartId',\
                                'sourceBodyPartName','sourcePhysiologicalStateId','sourcePhysiologicalStateName', \
                                'sourceSexId', 'sourceSexName','interactionTypeName',\
                                'interactionTypeId','targetTaxonId','targetTaxonIds','targetTaxonName',\
                                'targetTaxonRank','targetTaxonPathNames','targetTaxonPathIds','targetTaxonPathRankNames',\
                                'targetTaxonSpeciesName','targetTaxonSpeciesId','targetTaxonGenusName','targetTaxonGenusId',\
                                'targetTaxonFamilyName','targetTaxonFamilyId','targetTaxonOrderName','targetTaxonOrderId',\
                                'targetTaxonClassName','targetTaxonClassId','targetTaxonPhylumName','targetTaxonPhylumId',\
                                'targetTaxonKingdomName','targetTaxonKingdomId','targetId','targetOccurrenceId',\
                                'targetCatalogNumber','targetBasisOfRecordId','targetBasisOfRecordName','targetLifeStageId',\
                                'targetLifeStageName','targetBodyPartId','targetBodyPartName','targetPhysiologicalStateId',\
                                'targetPhysiologicalStateName', 'targetSexId', 'targetSexName',\
                                'decimalLatitude','decimalLongitude','localityId',\
                                'localityName','eventDateUnixEpoch','argumentTypeId','referenceCitation',\
                                'referenceDoi','referenceUrl','sourceCitation','sourceNamespace',\
                                'sourceArchiveURI','sourceDOI','sourceLastSeenAtUnixEpoch']
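
The long hand-written column list has to stay in step with the table definition. As a hedged alternative, the names can be read from the cursor itself, assuming the `globi` table was created with named columns (the queries above suggest it was). The helper name `rows_with_column_names` is hypothetical and the demo table is a three-column stand-in.

```python
import sqlite3


def rows_with_column_names(cursor):
    """Return (column names, rows) for a cursor that has just run a SELECT."""
    # cursor.description holds one 7-tuple per column; the name is item 0.
    names = [d[0] for d in cursor.description]
    return names, cursor.fetchall()


# In-memory stand-in for globi.db with a placeholder row (not real GloBI data).
demo = sqlite3.connect(":memory:")
demo.execute(
    "CREATE TABLE globi (sourceTaxonName TEXT, "
    "interactionTypeName TEXT, targetTaxonName TEXT)"
)
demo.execute("INSERT INTO globi VALUES ('A', 'eats', 'B')")

cur = demo.execute("SELECT * FROM globi")
names, rows = rows_with_column_names(cur)
records = [dict(zip(names, row)) for row in rows]
print(names)
print(records)
```

In the notebook the same `names` list could be passed as `pd.DataFrame(rows, columns=names)`, which would remove the need to repeat the column list for the secondary interactions.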

## Get a list of all the primary interacting species

In [33]:
interactingTaxaData = interactDataTaxon.drop_duplicates()

In [34]:
primaryInteractingTaxa = interactingTaxaData['sourceTaxonName'].drop_duplicates()
len(primaryInteractingTaxa)

125

In [35]:
primaryInteractingTaxa = (primaryInteractingTaxa.append(interactingTaxaData['targetTaxonName']).drop_duplicates())
len(primaryInteractingTaxa)

172

In [36]:
primaryInteractingTaxa

0                           Strix varia
2                      Bubo virginianus
3                        Cathartes aura
4                    Accipiter gentilis
6              Haliaeetus leucocephalus
10                       Megascops asio
12                    Buteo jamaicensis
13                       Buteo lineatus
15                   Accipiter cooperii
16                    Buteo platypterus
86                 Sciurus carolinensis
88                        Canis latrans
98                           Lynx rufus
104              Buteogallus urubitinga
112                    Coragyps atratus
123                     Lynx canadensis
124                      Neovison vison
125                         Canis lupus
126                             Mustela
127                       Vulpes vulpes
194                   Orchopeas howardi
197                       Ceratophyllus
200               Spilopsyllus cuniculi
201    Amalaraeus penicilliger mustelae
206                      Ctenophthalmus


## A function to query the globi database

In [37]:
def secondaryDataInGlobi(conn, intaxon):
    cur = conn.cursor()
    if interactionsToExclude == "":
        cur.execute("SELECT * from globi  WHERE (targetTaxonSpeciesName = ? OR sourceTaxonSpeciesName = ?);", \
                 (intaxon,intaxon,))
    else:
        cur.execute("SELECT * from globi where (targetTaxonSpeciesName = ? OR sourceTaxonSpeciesName = ?) and interactionTypeName NOT IN (?);", \
                (intaxon,intaxon,interactionsToExclude,))
    return(cur.fetchall())

## Get all the secondary interactions

In [38]:
try:
    conn = sqlite3.connect(globiDB)
except Error as e:
    print(e)

In [39]:
secondarylist = []

for name in primaryInteractingTaxa:
    temp = secondaryDataInGlobi(conn, name)
    secondarylist.extend(temp)

In [40]:
# Convert to a Pandas dataframe
secondarylist = pd.DataFrame(secondarylist)

In [41]:
secondarylist.columns = ['sourceTaxonId', \
                                'sourceTaxonIds','sourceTaxonName','sourceTaxonRank','sourceTaxonPathNames', \
                                'sourceTaxonPathIds','sourceTaxonPathRankNames','sourceTaxonSpeciesName','sourceTaxonSpeciesId',\
                                'sourceTaxonGenusName','sourceTaxonGenusId','sourceTaxonFamilyName','sourceTaxonFamilyId',\
                                'sourceTaxonOrderName','sourceTaxonOrderId','sourceTaxonClassName','sourceTaxonClassId',\
                                'sourceTaxonPhylumName','sourceTaxonPhylumId','sourceTaxonKingdomName','sourceTaxonKingdomId',\
                                'sourceId','sourceOccurrenceId','sourceCatalogNumber','sourceBasisOfRecordId',\
                                'sourceBasisOfRecordName','sourceLifeStageId','sourceLifeStageName','sourceBodyPartId',\
                                'sourceBodyPartName','sourcePhysiologicalStateId','sourcePhysiologicalStateName',\
                                'sourceSexId', 'sourceSexName', 'interactionTypeName',\
                                'interactionTypeId','targetTaxonId','targetTaxonIds','targetTaxonName',\
                                'targetTaxonRank','targetTaxonPathNames','targetTaxonPathIds','targetTaxonPathRankNames',\
                                'targetTaxonSpeciesName','targetTaxonSpeciesId','targetTaxonGenusName','targetTaxonGenusId',\
                                'targetTaxonFamilyName','targetTaxonFamilyId','targetTaxonOrderName','targetTaxonOrderId',\
                                'targetTaxonClassName','targetTaxonClassId','targetTaxonPhylumName','targetTaxonPhylumId',\
                                'targetTaxonKingdomName','targetTaxonKingdomId','targetId','targetOccurrenceId',\
                                'targetCatalogNumber','targetBasisOfRecordId','targetBasisOfRecordName','targetLifeStageId',\
                                'targetLifeStageName','targetBodyPartId','targetBodyPartName','targetPhysiologicalStateId',\
                                'targetPhysiologicalStateName', 'targetSexId', 'targetSexName', \
                                'decimalLatitude','decimalLongitude','localityId',\
                                'localityName','eventDateUnixEpoch','argumentTypeId','referenceCitation',\
                                'referenceDoi','referenceUrl','sourceCitation','sourceNamespace',\
                                'sourceArchiveURI','sourceDOI','sourceLastSeenAtUnixEpoch']

In [42]:
allInteractionsData = interactingTaxaData.append(secondarylist)

In [43]:
allInteractionsData = allInteractionsData.drop_duplicates()


In [44]:
cleanInteractDataTaxon = allInteractionsData[[
    'sourceTaxonId', 'sourceTaxonName', 'sourceTaxonSpeciesName', 'sourceTaxonRank',
    'sourceTaxonFamilyName', 'interactionTypeName',
    'targetTaxonName', 'targetTaxonSpeciesName', 'targetTaxonRank', 'sourceCitation', 'sourceDOI'
]].dropna(subset=['targetTaxonSpeciesName', 'sourceTaxonSpeciesName'])

In [45]:
allSourceInteractingTaxa = cleanInteractDataTaxon['sourceTaxonSpeciesName'].drop_duplicates()
allTargetInteractingTaxa = cleanInteractDataTaxon['targetTaxonSpeciesName'].drop_duplicates()

In [46]:
allInteractingTaxa = allSourceInteractingTaxa.append(allTargetInteractingTaxa).drop_duplicates()
allInteractingTaxa.count()

6045

In [47]:
allInteractingTaxa.head()

0                 Strix varia
2            Bubo virginianus
3              Cathartes aura
4          Accipiter gentilis
6    Haliaeetus leucocephalus
dtype: object

In [48]:
# How many different sorts of interaction do I have left?
# Checking out all the interaction types
cleanInteractDataTaxon.groupby(cleanInteractDataTaxon['interactionTypeName']).size().sort_values(ascending = False)

interactionTypeName
interactsWith         44838
hasHost               36908
eats                  35294
parasiteOf            11451
pathogenOf             1257
ectoparasiteOf          985
preysOn                 621
adjacentTo              443
endoparasiteOf          348
mutualistOf             239
pollinates              203
visitsFlowersOf         161
visits                   44
kills                    38
symbiontOf               33
hasDispersalVector       16
hasVector                 7
livesOn                   3
livesInsideOf             3
parasitoidOf              1
commensalistOf            1
dtype: int64

### This function takes a name string and checks on GBIF to see if the name exists there.

In [49]:
def speciesExistsInGBIF(name, rank):
    try:
        match = species.name_backbone(name=name, rank=rank, limit = 1)

        # if there is no match this is returned from .name_backbone {'confidence': 100, 'matchType': 'NONE', 'synonym': False}
        if match['matchType'] == 'NONE':
            return False
        else:
            return match
    except ValueError as ve:
        print(ve)
        exit(1)

### Check that the species in question is actually found on GBIF

In [50]:
if(speciesExistsInGBIF(taxon, "species") == False):
    print("##### {0} has not been found on GBIF #####".format(taxon))

### Check to see which taxa in the interaction network are found in GBIF and list those ones that are not

In [51]:
taxaFound = {}

print('Taxa from GLoBI, but not found in GBIF')
for name in allInteractingTaxa.items():
    GBIFName = speciesExistsInGBIF(name[1], "species")
    if GBIFName == False:
        taxaFound[name[1]] = False
        print(name[1])
    else:
        taxaFound[name[1]] = GBIFName['usageKey']
    

Taxa from GLoBI, but not found in GBIF

Vesicular stomatitis New Jersey virus
Theilovirus
Sendai virus
Murine pneumonia virus
Murid herpesvirus 1
Encephalomyocarditis virus
California encephalitis virus
Rabies virus
Minute virus of mice
Paruterininae
Newcastle disease virus
Habronematinae
Leucocytozoon toddi
Ascaroidea
Columbid herpesvirus 1
Canine distemper virus
Canine parvovirus
Capillaria putorii
Eimeria canis
Sarcocystis fusiformis
ALARIA (A.) MARCIANAE
Alphacoronavirus
Feline panleukopenia virus
ASCAROIDEA
Gammacoronavirus Avian coronavirus
Lynx canadensis faeces associated genomovirus CL1 48
Lynx canadensis faeces associated genomovirus CL1 148
Lynx canadensis associated microvirus CLP 9413
Alphacoronavirus Alphacoronavirus 2
Alphacoronavirus Mink coronavirus 1
Aleutian mink disease virus
Mink astrovirus
Capillaria mucronata
Canine parainfluenza virus
Mink calicivirus
Trypanosoma cruzi
Sarcocystis cruzi
Trypanosoma congolense
Mokola virus
Human herpesvirus 4
Trypanosoma evansi
C

In [52]:
# Convert to a Pandas dataframe
taxaFound = pd.DataFrame.from_dict(taxaFound, orient='index')

In [53]:
len(taxaFound)

6045

In [54]:
taxaFound

Unnamed: 0,0
Strix varia,2497541
Bubo virginianus,5959118
Cathartes aura,2481930
Accipiter gentilis,2480589
Haliaeetus leucocephalus,2480446
Megascops asio,2497415
Buteo jamaicensis,2480542
Buteo lineatus,2480529
Accipiter cooperii,2480621
Buteo platypterus,2480538


### This function takes a GBIF species key and counts how many distinct grid cells (eea_cell_code) the species occupies in the data cube

In [69]:
def speciesCountInCube(conn, key):
    cur = conn.cursor()
    cur.execute("SELECT COUNT(taxonKey) from (SELECT taxonKey FROM cube WHERE year > ? and taxonKey = ? GROUP BY eea_cell_code)", (year, key,))
    return(cur.fetchall())
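
The grouped subquery above counts one row per grid cell, which is the occupancy of the species. A sketch of an equivalent, more direct query is shown below. It assumes `eea_cell_code` is never NULL and uses a toy in-memory cube with only the three columns the query needs. The function name `occupied_cell_count` and the cell codes are made up for illustration.

```python
import sqlite3


def occupied_cell_count(conn, key, since_year):
    """Count the distinct grid cells with records of a taxon after since_year."""
    cur = conn.execute(
        "SELECT COUNT(DISTINCT eea_cell_code) FROM cube "
        "WHERE year > ? AND taxonKey = ?",
        (since_year, key),
    )
    return cur.fetchone()[0]


def grouped_cell_count(conn, key, since_year):
    """The notebook's formulation, kept here only to compare results."""
    cur = conn.execute(
        "SELECT COUNT(taxonKey) FROM (SELECT taxonKey FROM cube "
        "WHERE year > ? AND taxonKey = ? GROUP BY eea_cell_code)",
        (since_year, key),
    )
    return cur.fetchone()[0]


# Toy cube: the taxon keys and cell codes are placeholders, not real data.
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE cube (year INTEGER, eea_cell_code TEXT, taxonKey INTEGER)")
demo.executemany(
    "INSERT INTO cube VALUES (?, ?, ?)",
    [
        (2001, "cell_1", 1),
        (2005, "cell_1", 1),  # same cell again, must not be counted twice
        (2010, "cell_2", 1),
        (1995, "cell_3", 1),  # too old for since_year=2000
        (2003, "cell_1", 2),  # different taxon
    ],
)

print(occupied_cell_count(demo, 1, 2000))
print(grouped_cell_count(demo, 1, 2000))
```

Both forms use `year > ?`, so records from the threshold year itself are excluded.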

### Loop over all the taxa that are in the interaction network and are in GBIF to find the ones that have been found in the country

In [70]:
taxaFoundInCountry = []

try:
    conn = sqlite3.connect(database)
except Error as e:
    print(e)

In [71]:
year

1970

In [None]:
for GBIFtaxon in taxaFound.iterrows():
    GBIFOccCount = speciesCountInCube(conn, GBIFtaxon[1][0])
    print(GBIFtaxon[0])
    if GBIFOccCount[0][0] > thresholdForOccNum and GBIFtaxon[0] != taxon: # The threshold is set above one because of the many casual records
        taxaFoundInCountry.append({'key': GBIFtaxon[1][0], 'species': GBIFtaxon[0], 'count': GBIFOccCount[0][0]})
        print('{0} with key {1} occurs in {2} km squares.'.format(GBIFtaxon[0],GBIFtaxon[1][0],GBIFOccCount[0][0]))
    elif GBIFtaxon[0] == taxon:
        taxaFoundInCountry.append({'key': GBIFtaxon[1][0], 'species': GBIFtaxon[0], 'count': GBIFOccCount[0][0]})
        print('{0} with key {1} occurs in {2} km squares.'.format(GBIFtaxon[0],GBIFtaxon[1][0],GBIFOccCount[0][0]))

Strix varia
Bubo virginianus
Cathartes aura
Accipiter gentilis
Accipiter gentilis with key 2480589 occurs in 8740 km squares.
Haliaeetus leucocephalus
Haliaeetus leucocephalus with key 2480446 occurs in 13 km squares.
Megascops asio
Buteo jamaicensis
Buteo jamaicensis with key 2480542 occurs in 9 km squares.
Buteo lineatus
Accipiter cooperii
Buteo platypterus
Sciurus carolinensis
Sciurus carolinensis with key 5219681 occurs in 19 km squares.
Canis latrans
Lynx rufus
Buteogallus urubitinga
Coragyps atratus
Lynx canadensis
Neovison vison
Neovison vison with key 2433652 occurs in 41 km squares.
Canis lupus
Canis lupus with key 5219173 occurs in 14 km squares.

Vulpes vulpes
Vulpes vulpes with key 5219243 occurs in 50 km squares.
Orchopeas howardi
Spilopsyllus cuniculi
Amalaraeus penicilliger
Peromyscopsylla spectabilis
Nosopsyllus Gerbillophilus
Neohaematopinus sciuri
Hoplopleura sciuricola
Enderleinellus longiceps
Baylisascaris procyonis
Moniliformis clarki
Hymenolepis nana
Trichostrongy

Ficedula hypoleuca
Ficedula hypoleuca with key 2492606 occurs in 4502 km squares.
Setophaga ruticilla
Tringa flavipes
Tringa flavipes with key 2481721 occurs in 163 km squares.
Cyrtonyx montezumae
Lagopus lagopus
Falco tinnunculus
Falco tinnunculus with key 9616058 occurs in 19447 km squares.
Accipiter striatus
Accipiter nisus
Accipiter nisus with key 2480637 occurs in 18644 km squares.
Garrulus glandarius
Garrulus glandarius with key 5229493 occurs in 13701 km squares.
Milvus milvus
Milvus milvus with key 5229168 occurs in 7387 km squares.
Tamiasciurus douglasii
Sciurus aberti
Spermophilus lateralis
Spermophilus brunneus
Cynomys mexicanus
Peromyscus boylii
Ochotona princeps
Microtetrameres accipiter
Colpocephalum flavescens
Colpocephalum turbinatum
Craspedorrhynchus halieti
Degeeriella discocephalus
Influenza A virus
Clostridium botulinum
Sarcocystis falcatula
Eastern equine encephalitis virus
Cyprinodon variegatus
Chen canagica
Chen canagica with key 2498162 occurs in 121 km squares.

Anas platyrhynchos
Anas platyrhynchos with key 9761484 occurs in 14470 km squares.
Oncicola canis
Taenia multiceps
Athesmia heterolecithodes
Trichuris vulpis
Capillaria hepatica
Heterobilharzia americana
Eurytrema pancreaticum
Echinococcus granulosus
Pachysentis canicola
Taenia serialis
Mesocestoides corti
Rictularia cahirensis
Dermatoxys veligera
Alaria arisaemoides
Trichinella spiralis
Sarcoptes scabiei
Sarcocystis neurona
Mycobacterium tuberculosis
Mycobacterium intracellulare
Mycobacterium avium
Lawsonia intracellularis
Francisella tularensis
Rickettsia rickettsii
Taenia krabbei
Passalurus nonannulatus
Multiceps packii
Mesocestoides kirbyi
Dipylidium caninum
Capillaria aerophila
Toxocara canis
Hammondia heydorni
Capillaria plica
Capillaria putorii
Eimeria canis
Isospora ohioensis
Bartonella vinsonii
Bartonella rochalimae
Taenia ovis
Crenosoma vulpis
Sarcocystis fusiformis
Cediopsylla simplex
Neorickettsia risticii
Borrelia burgdorferi
Alaria mustelae
Mesocestoides lineatus
Molineus

Diphyllobothrium alascense
Echinococcus sibiricensis
Diphyllobothrium osmeri
Diphyllobothrium dalliae
Diphyllobothrium lanceolatum
Taenia polyacantha
Baylisiella tecta
Rauschoides arctica
Malassezia pachydermatis
Pasteurella multocida
Ehrlichia canis
Leishmania infantum
Alphacoronavirus 1
Leptospira noguchii
Leptospira santarosai
Babesia gibsoni
Bergeyella zoohelcum
Leishmania braziliensis
Campylobacter jejuni
Trypanosoma cruzi
Leishmania donovani
Blastomyces dermatitidis
Helicobacter pylori
Clostridium perfringens
Brucella canis
Staphylococcus intermedius
Babesia canis
Streptococcus dysgalactiae
Trypanosoma brucei
Staphylococcus pseudintermedius
Enterococcus faecalis
Histoplasma capsulatum
Rotavirus A
Trichophyton mentagrophytes
Cryptococcus neoformans
Anaplasma platys
Dirofilaria repens
Bordetella bronchiseptica
Ehrlichia chaffeensis
Streptococcus canis
Clonorchis sinensis
Strongyloides stercoralis
Capnocytophaga cynodegmi
Enterococcus faecium
Necator americanus
Campylobacter upsalie

Calidris alba
Calidris alba with key 2481748 occurs in 1570 km squares.
Naemorhedus caudatus
Hemitragus hylocrius
Tetracerus quadricornis
Procapra przewalskii
Pseudois nayaur
Bubo scandiacus
Bubo scandiacus with key 5959143 occurs in 162 km squares.
Castor fiber
Castor fiber with key 4409131 occurs in 49 km squares.
Lepus nigricollis
Ochotona ladacensis
Redunca fulvorufula
Rupicapra rupicapra
Bos grunniens
Bos taurus
Capra sibirica
Capra nubiana
Capra ibex
Capra cylindricornis
Ovis ammon
Panthera pardus
Alces alces
Cervus nippon
Cervus nippon with key 2440954 occurs in 22 km squares.
Phoca largha
Lynx lynx
Martes americana
Gulo gulo
Ursus americanus
Marmota broweri
Atherurus macrourus
Macrotis lagotis
Trichosurus arnhemensis
Spermophilus madrensis
Marmota himalayana
Puma concolor
Lepus oiostolus
Equus hemionus
Equus kiang
Semnopithecus entellus
Cervus albirostris
Axis porcinus
Lepus castroviejoi
Gazella dorcas
Merops ornatus
Stilbella erythrocephala
Mortierella reticulata
Papilio rutul

Orchestes signifer
Lymantor coryli
Luperus flavipes
Gonioctena pallida
Dryocoetes alni
Dryocoetes alni with key 1242976 occurs in 6 km squares.
Curculio nucum
Cryptocephalus sexpunctatus
Cryptocephalus nitidulus
Cryptocephalus labiatus
Altica brevicollis
Agrilus laticornis
Agrilus laticornis with key 4432706 occurs in 27 km squares.
Agrilus angustulus
Agrilus angustulus with key 4432673 occurs in 65 km squares.
Phellinus lundellii
Pentatoma rufipes
Pentatoma rufipes with key 5758737 occurs in 2333 km squares.
Palomena prasina
Palomena prasina with key 4485776 occurs in 3082 km squares.
Orthotylus tenellus
Orthotylus tenellus with key 4488049 occurs in 8 km squares.
Malacocoris chlorizans
Malacocoris chlorizans with key 2009226 occurs in 69 km squares.
Elasmostethus interstinctus
Elasmostethus interstinctus with key 2079270 occurs in 1026 km squares.
Veronaea botryosa
Diplococcium lawrencei
Cryptocephalus primarius
Cryptocephalus coryli
Polygonia c-album
Polygonia c-album with key 18985

Fringilla coelebs
Fringilla coelebs with key 2494422 occurs in 15620 km squares.
Psathyrella narcotica
Phleogena faginea
Melanospora longisetosa
Eichleriella deglubens
Daedalea quercina
Ceraceomyces borealis
Bolbitius
Auricularia auricula-judae
Auricularia auricula-judae with key 5249271 occurs in 17 km squares.
Rutstroemia sydowiana
Nematus fagi
Rhabdospora
Fusicoccum macrosporum
Cryptodiaporthe galericulata
Stictoleptura scutellata
Stictoleptura scutellata with key 4458838 occurs in 38 km squares.
Prionus coriarius
Prionus coriarius with key 7887483 occurs in 88 km squares.
Anoplodera sexguttata
Anoplodera sexguttata with key 8216511 occurs in 43 km squares.
Xyleborus dryographus
Xyleborus dispar
Xyleborinus saxesenii
Trypodendron signatum
Trypodendron signatum with key 1244655 occurs in 83 km squares.
Trypodendron domesticum
Trypodendron domesticum with key 1244634 occurs in 54 km squares.
Taphrorychus bicolor
Taphrorychus bicolor with key 1241015 occurs in 46 km squares.
Scolytus i

Placusa incompleta
Tachyta nana
Cryptarcha strigata
Cryptarcha strigata with key 4453153 occurs in 23 km squares.
Quedius brevicornis
Henoticus serratus
Silusa rubiginosa
Dendrophagus crenatus
Soronia grisea
Ampedus triangulum
Ampedus vandalitae
Brachygonus dubius
Allecula rhenana
Clypastraea pusilla
Vincenzellus ruficollis
Vincenzellus ruficollis with key 4456161 occurs in 16 km squares.
Aderus populneus
Mycetochara flavipes
Phymatura brevicollis
Laemophloeus monilis
Colydium filiforme
Corticeus unicolor
Corticeus unicolor with key 4455173 occurs in 26 km squares.
Cerylon histeroides
Mycetochara axillaris
Epuraea distincta
Rhizophagus nitidulus
Rhizophagus nitidulus with key 5748847 occurs in 19 km squares.
Placusa tachyporoides
Placusa tachyporoides with key 1042420 occurs in 10 km squares.
Corticeus bicolor
Cyanostolus aeneus
Medon rufiventris
Dasytes nigrocyaneus
Bolitochara obliqua
Bolitochara obliqua with key 6097673 occurs in 16 km squares.
Euryusa sinuata
Soronia punctatissima


Salix repens
Salix repens with key 9148812 occurs in 1863 km squares.
Russula sardonia
Pinus sylvestris
Pinus sylvestris with key 5285637 occurs in 3153 km squares.
Russula melitodes
Russula maculata
Russula violacea
Russula albonigra
Pseudotsuga menziesii
Pseudotsuga menziesii with key 2685796 occurs in 596 km squares.
Russula nitida
Russula grisea
Calcarisporium arbuscula
Russula emetica
Russula badia
Quercus ilex
Quercus ilex with key 2879098 occurs in 54 km squares.
Falco columbarius
Falco columbarius with key 9813242 occurs in 5495 km squares.
Glaucidium gnoma
Hohorstiella paladinella
Columbicola baculoides
Physconelloides zenaidurae
Columbicola macrourae
Bonomiella columbae
Baruscapillaria obsignata
Ornithostrongylus quadriradiatus
Gigantobilharzia huronensis
Plasmodium relictum
Haemorhous mexicanus
Haemorhous mexicanus with key 8323485 occurs in 24 km squares.
Tyrannus vociferans
Icterus cucullatus
Callipepla californica
Callipepla californica with key 5228080 occurs in 13 km sq

Eupeodes americanus
Dinotiscus colon
Metablastothrix claripennis
Chrysocharis occidentalis
Brachys aerosus
Ophiostoma ulmi
Ceratomia amyntor
Ectoedemia ulmella
Agraulis vanillae
Papilio troilus
Octotoma plicatula
Sabulodes caberata
Ceratomia undulosa
Syritta pipiens
Syritta pipiens with key 1544431 occurs in 10 km squares.
Delphinia picta
Helicobia rapax
Ravinia stimulans
Manduca sexta
Archilochus colubris
Lucilia sericata
Milichiella lucidula
Stomoxys calcitrans
Formica fusca
Formica fusca with key 1314881 occurs in 2139 km squares.
Crematogaster lineolata
Formica schaufussi
Tytthus vagus
Micrutalis calva
Hemiberlesia lataniae
Cavariella pastinacae
Oceanaspidiotus spinosus
Abgrallaspis cyanophylli
Myzus persicae
Myzus persicae with key 2076179 occurs in 27 km squares.
Phoebis sennae
Bombus pensylvanicus
Vespula maculifrons
Ancistrocerus gazella
Ancistrocerus gazella with key 5037434 occurs in 94 km squares.
Dolichovespula maculata
Chalybion californicum
Chrysis angolensis
Polistes exc

Diadasia enavata
Diadasia diminuta
Bombus morrisoni
Bombus huntii
Pseudopanugus sp. E4
Pseudopanugus sp. E3
Pseudopanugus n.sp. (aff. irregularis)
Perdita phymatae
Perdita calloleuca
Perdita verbesinae
Perdita albipennis
Perdita moabensis
Calliopsis philiphunteri
Andrena ramaleyi
Andrena pecosana
Andrena helianthi
Andrena accepta
Illinoia masoni
Pieris rapae
Pieris rapae with key 1920496 occurs in 15284 km squares.
Chlosyne nycteis
Tobacco streak virus
Bacillus cereus
Lygus elisus
Bidens mottle virus
Gibellulopsis nigrescens
Lygus borealis
Lygus lineolaris
Bemisia tabaci
Pustula tragopogonis
Pseudopanurgus rugosus
Halictus ligatus
Halictus confusus
Halictus confusus with key 1353363 occurs in 365 km squares.
Diabrotica undecimpunctata
Bombus bimaculatus
Bombus griseocollis
Bombus vagans
Bombus variabilis
Lewia infectoria
Cochliobolus hawaiiensis
Speyeria cybele
Chauliognathus pennsylvanicus
Megachile mendica
Dieunomia heteropoda
Svastra atripes
Melissodes bimaculata
Acalymma vittatum
A

Stegophylla quercifoliae
Stegophylla davisi
Allokermes essigi
Eriococcus quercus
Chnaurococcus villosa
Parthenolecanium quercifex
Quernaspis quercus
Protodiaspis agrifoliae
Aspidaspis densiflorae
Stegophylla quercicola
Paraproba hamata
Tupiocoris notatus
Closterotomus norwegicus
Leiothlypis celata
Pipilo maculatus
Lonchura punctulata
Melozone crissalis
Otala lactea
Omphalotus olivascens
Pseudohemihyalea edwardsii
Melanerpes formicivorus
Oecanthus californicus
Adelpha californica
Callirhytis apicalis
Bloomeria crocea
Gymnopilus ventricosus
Henricus umbrabasana
Xanthomendoza hasseana
Dendromecon rigida
Prenolepis imparis
Amanita ocreata
Hesperoyucca whipplei
Agaricus hondensis
Russula cremoricolor
Tricholoma dryophilum
Fraxinus dipetala
Artemisia californica
Baccharis pilularis
Achnatherum miliaceum
Achnatherum miliaceum with key 4141737 occurs in 16 km squares.
Orgyia vetusta
Celastrina echo
Nemapogon granella
Monomorium ergatogyna
Leptothorax andrei
Camponotus clarithorax
Abutilon indi

Pica pica
Pica pica with key 5229490 occurs in 13979 km squares.
Cistothorus palustris
Python molurus
Machaerilaemus laticorpus
Philopterus agelaii
Ischnura gemina
Dolichonyx oryzivorus
Botaurus lentiginosus
Boydaia agelaii
Xanthocephalus xanthocephalus
PLAGIORCHIS (P.) NOBLEI
Parechinostoma
Oxyspirura mansoni
Microtetrameres centuri
Harpirhynchus nidulans
Gigantobilharzia gyravli
Ecnhinostomum
Cnemidocoptes mutans
Anonchotaenia globata
Anaxyrus americanus
Aceria fraxiniflora
Sicya macularia
Sympistis chionanthi
Phigalia titea
Harrisimemna trisignata
Agrilus planipennis
Olceclostera seraphica
Anacamptodes ephyraria
Ennomos magnaria
Nematocampa filamentaria
Caloptilia arsenievi
Marmara fraxinicola
Marmara corticola
Marmara basidendroca
Orthosia hibisci
Callosamia promethea
Cydia ingrata
Pseudosciaphila duplex
Caloptilia fraxinella
Eupareophora parca
Tethida barda
Quercus laurifolia
Platanus occidentalis
Nyssa aquatica
Carya tomentosa
Bisulcopsallus huachucae
Salicopsallus lucidus
Tropid

Celastrina argiolus
Celastrina argiolus with key 1925918 occurs in 12273 km squares.
Callophrys rubi
Callophrys rubi with key 1932411 occurs in 1588 km squares.
Brenthis ino
Brenthis ino with key 1902749 occurs in 78 km squares.
Brenthis daphne
Brenthis daphne with key 1902791 occurs in 27 km squares.
Phragmidium rubi-idaei
Cladius brullei
Elasmucha ferrugata
Callimorpha dominula
Phenacephorus auriculatus
Lonchodes everetti
Onchestus rentzi
Rhamphosipyloidea palumensis
Acrophylla wuelfingi
Apatelodes torrefacta
Spilosoma lubricipeda
Spilosoma lubricipeda with key 1811896 occurs in 10 km squares.
Blastobasis lacticolella
Carposina adreptella
Pseudothyatira cymatophoroides
Thyatira batis
Cleora cinctaria
Crocallis elinguaria
Plagodis pulveraria
Synchlora aerata
Anticlea vasiliata
Eupithecia miserulata
Eupithecia pusillata
Eupithecia subfuscata
Xanthorhoe lacustrata
Incurvaria rubiella
Lasiocampa quercus
Lasiocampa quercus with key 1734826 occurs in 35 km squares.
Orgyia recens
Protodelto

Lamium amplexicaule
Lamium amplexicaule with key 2926679 occurs in 3333 km squares.
Vespula alascensis
Nymphalis californica
Paradejeania rutilioides
Ochlodes agricola
Leucospermum cordifolium
Metopium toxiferum
Arachis glabrata
Amelanchier canadensis
Juglans hindsii
Catharus ustulatus
Ixoreus naevius
Perisoreus canadensis
Sorex palustris
Passerella iliaca
Sylvilagus transitionalis
Scalopus aquaticus
Sorex fumeus
Dytiscus alaskanus
Rabbit coronavirus HKU14
Quercus sinuata
Gallus gallus
Gallus gallus with key 9326020 occurs in 347 km squares.
Plectrophenax nivalis
Plectrophenax nivalis with key 2491719 occurs in 963 km squares.
Aythya marila
Aythya marila with key 2498265 occurs in 2175 km squares.


In [None]:
# Convert to a Pandas dataframe
taxaFoundInCountry = pd.DataFrame(taxaFoundInCountry)    

In [None]:
print("The number of species left in the network is {0}".format(len(taxaFoundInCountry)))

In [None]:
taxaFoundInCountry

## Drawing a network of the interactions

Now that I have a list of all the species in the country I can use this as my nodes list for the network diagram.

In [None]:
#networkx seems to be a leading network tool in Python
import networkx as nx
import matplotlib.pyplot as plt

try:
    import pygraphviz
    from networkx.drawing.nx_agraph import write_dot
    print("using package pygraphviz")
except ImportError:
    try:
        import pydot
        from networkx.drawing.nx_pydot import write_dot
        print("using package pydot")
    except ImportError:
        print()
        print("Neither pygraphviz nor pydot was found, so write_dot is not available")
        print("see  https://networkx.github.io/documentation/latest/reference/drawing.html")
        print()

In [None]:
# Create a directed graph object
G = nx.DiGraph()

In [None]:
# Match colours to interactions to distinguish them on the graph
colorInteractions = {'interaction':['pollinates', 'mutualistOf', 'eats', 'visitsFlowersOf', 'hasHost', 'parasiteOf', 'pathogenOf'],
        'colour':['r', 'g', 'b', 'y', 'm', 'w', 'c']}  

colorInteractionsDf = pd.DataFrame(colorInteractions)

#len(list(G.nodes))
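
The colour table above is defined but is not applied anywhere later in the notebook. As a minimal sketch of how it could be used, assuming each edge carries its interaction type in a `label` attribute (as the edges added further down do), the table can be turned into a lookup and applied per edge. The toy graph, the placeholder species names, `colourLookup` and the fallback colour `'k'` are illustrative assumptions, not part of the original analysis.

```python
import networkx as nx
import pandas as pd

colorInteractions = {'interaction': ['pollinates', 'mutualistOf', 'eats', 'visitsFlowersOf',
                                     'hasHost', 'parasiteOf', 'pathogenOf'],
                     'colour': ['r', 'g', 'b', 'y', 'm', 'w', 'c']}
colorInteractionsDf = pd.DataFrame(colorInteractions)

# Turn the two-column table into a plain dict: interaction type -> colour code
colourLookup = dict(zip(colorInteractionsDf['interaction'], colorInteractionsDf['colour']))

# Toy graph standing in for G (placeholder names, for illustration only)
toyG = nx.DiGraph()
toyG.add_edge('Species A', 'Species B', label='pollinates')
toyG.add_edge('Species C', 'Species D', label='eats')
toyG.add_edge('Species E', 'Species F', label='interactsWith')

# One colour per edge, in the same order as toyG.edges;
# interaction types missing from the table fall back to black ('k')
edgeColours = [colourLookup.get(label, 'k') for _, _, label in toyG.edges(data='label')]
print(edgeColours)
```

A list built this way could be passed as `edge_color` to `nx.draw_networkx_edges`, as long as it is rebuilt after any edges are removed so that the order still matches.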

## A quick look at the interaction data to see if it is what is expected

In [None]:
#cleanInteractDataTaxon.loc[(cleanInteractDataTaxon["sourceTaxonName"] == 'Apis mellifera') & (cleanInteractDataTaxon["targetTaxonName"] == 'Procyon lotor')]
cleanInteractDataTaxon.loc[(cleanInteractDataTaxon["sourceTaxonSpeciesName"] == 'Rousettus aegyptiacus')]

## Figure out which node has the most records. This is so that the graphic can be scaled.

In [None]:
if len(taxaFoundInCountry) > 0:
    # Take the maximum of the count column only, rather than of every column
    maxRecords = taxaFoundInCountry["count"].max()
    print(maxRecords)

### Add the nodes to the graph

In [None]:
dictOfNodeSizes = {}

for index, row in taxaFoundInCountry.iterrows():
    # access data using column names
    #print('A: ', row['species'], row['count'], row['key'])
    G.add_node(row['species'], gbifkey=row['key'], occupancy = row['count'])
    # record the node size, scaled to 0-100, for the network visualization
    dictOfNodeSizes[row['species']] = int(row['count']/maxRecords*100)
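
With the linear scaling used above, any taxon whose occupancy is below 1% of the maximum is truncated to size 0 and will not be visible in the plot. The sketch below shows one possible alternative, assuming a square-root scale and a minimum size are acceptable for the visualization. `scaled_node_size`, `min_size` and `max_size` are hypothetical names introduced here. The three occupancy values are taken from the output printed earlier, and the maximum is taken over this small sample only.

```python
import math

def scaled_node_size(count, max_count, min_size=5, max_size=100):
    """Scale an occupancy count into [min_size, max_size] on a square-root scale."""
    if max_count <= 0:
        return min_size
    fraction = math.sqrt(count / max_count)
    return int(round(min_size + fraction * (max_size - min_size)))

# Occupancy (number of km squares) for three taxa from the output above
occupancy = {'Pieris rapae': 15284, 'Formica fusca': 2139, 'Syritta pipiens': 10}
maxRecords = max(occupancy.values())

sizes = {name: scaled_node_size(n, maxRecords) for name, n in occupancy.items()}
linearSizes = {name: int(n / maxRecords * 100) for name, n in occupancy.items()}

print(sizes)
print(linearSizes)
```

The square-root scale compresses the range between very common and very rare taxa, so both remain readable on one figure. Whether that distortion is acceptable depends on what the node size is meant to communicate.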

### Add edges to the graph

In [None]:
# Iterate over the interacting species that are in GBIF and in the country with iterrows().
# Find the taxa found in the country that appear as the source taxon in the interaction data,
# then add the edge if the target species is in the country too.

taxaFoundInCountry_copy = taxaFoundInCountry.copy()

for index, row in taxaFoundInCountry.iterrows():
    # loop over all the taxa, checking whether any of them appear in the sourceTaxonSpeciesName field
    for edge in cleanInteractDataTaxon.iterrows():
        if row['species'] == edge[1]['sourceTaxonSpeciesName']:
            #print('B: ', edge[1]['sourceTaxonName'], edge[1]['targetTaxonName'])
            # Some of the target species will not be in GBIF or in the country, so only add an edge where they are.
            for index2, row2 in taxaFoundInCountry_copy.iterrows():
                #print('E: ', row2['species'], edge[1]['targetTaxonName'])
                if row2['species'] == edge[1]['targetTaxonSpeciesName']: 
                    print('C: ', edge[1]['targetTaxonSpeciesName'], row['species'], edge[1]['interactionTypeName'])
                    G.add_edge(row['species'], edge[1]['targetTaxonSpeciesName'], label = edge[1]['interactionTypeName'])
                    
#len(list(G.nodes))


In [None]:
# iterate over rows with iterrows()

# Find the taxa found in the country that appear as the target taxon in the interaction data,
# then add the edge if the source species is in the country too.

for index, row in taxaFoundInCountry.iterrows():
    for edge in cleanInteractDataTaxon.iterrows():
        if row['species'] == edge[1]['targetTaxonName']:
            #print('D: ', edge[1]['sourceTaxonName'], edge[1]['targetTaxonName'])
            #G.add_node(edge[1]['sourceTaxonName'], gbifkey=row['key'])
            #dictOfNodeSizes[edge[1]['sourceTaxonName']] = int(row['count']/maxRecords*100)
            for index2, row2 in taxaFoundInCountry.iterrows():
                #print('E: ', row2['species'])
                if row2['species'] == edge[1]['sourceTaxonSpeciesName']: 
                    print('F: ', edge[1]['sourceTaxonSpeciesName'],edge[1]['targetTaxonSpeciesName'],edge[1]['interactionTypeName'])
                    # row2 is the source and row is the target; using row2 for both ends would create a self-loop
                    G.add_edge(edge[1]['sourceTaxonSpeciesName'], row['species'], label = edge[1]['interactionTypeName'])
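
The nested `iterrows()` loops compare every taxon against every interaction record and then against every taxon again, which becomes slow as the tables grow. A minimal sketch of an alternative is given below, assuming that `sourceTaxonSpeciesName` and `targetTaxonSpeciesName` are the columns to match on. It filters the interaction table once with `isin` and then adds the surviving rows as edges. The two small data frames are placeholder stand-ins for `taxaFoundInCountry` and `cleanInteractDataTaxon`, and `present` and `both` are names introduced here.

```python
import networkx as nx
import pandas as pd

# Toy stand-ins for the notebook's data frames (placeholder rows)
taxaFoundInCountry = pd.DataFrame({'species': ['Species A', 'Species B', 'Species C'],
                                   'key': [1, 2, 3],
                                   'count': [10, 20, 30]})
cleanInteractDataTaxon = pd.DataFrame({
    'sourceTaxonSpeciesName': ['Species A', 'Species B', 'Species D'],
    'targetTaxonSpeciesName': ['Species B', 'Species Z', 'Species C'],
    'interactionTypeName': ['eats', 'pollinates', 'parasiteOf']})

G = nx.DiGraph()
G.add_nodes_from(taxaFoundInCountry['species'])

# Keep only interactions where BOTH ends are taxa found in the country
present = set(taxaFoundInCountry['species'])
both = cleanInteractDataTaxon[
    cleanInteractDataTaxon['sourceTaxonSpeciesName'].isin(present)
    & cleanInteractDataTaxon['targetTaxonSpeciesName'].isin(present)]

# Add one directed edge per remaining interaction record
for source, target, kind in zip(both['sourceTaxonSpeciesName'],
                                both['targetTaxonSpeciesName'],
                                both['interactionTypeName']):
    G.add_edge(source, target, label=kind)

print(list(G.edges(data='label')))
```

Because a `DiGraph` holds at most one edge per ordered pair, a later record for the same pair overwrites the `label` of an earlier one, which is also how the loops above behave.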

In [None]:
print("Number of nodes = {0}".format(G.number_of_nodes()))
print("Number of edges = {0}".format(G.number_of_edges()))

In [None]:
len(dictOfNodeSizes)

## Remove any nodes that have no edges.
This happens because some of the linking nodes have too few supporting observations and so have been weeded out.


In [None]:
#for n in G.nodes:
#    if G.degree(n) == 0:
#        G.remove_node(n)
        
G.remove_nodes_from(list(nx.isolates(G)))

In [None]:
len(G.nodes)

## Remove any self-loop edges

In [None]:
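# selfloop_edges is a networkx function rather than a graph method in current releases.
# Collect the self-loops into a list first so the graph is not modified while it is being iterated.
G.remove_edges_from(list(nx.selfloop_edges(G)))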
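
A node whose only edge is a self-loop does not count as an isolate until that loop has been removed, so the order of the two clean-up steps matters. The sketch below illustrates this on a toy graph. `toyG` and its single-letter node names are placeholders, not data from this analysis.

```python
import networkx as nx

toyG = nx.DiGraph()
toyG.add_edges_from([('A', 'B'), ('C', 'C')])   # 'C' only has a self-loop
toyG.add_node('D')                               # 'D' has no edges at all

# Before any clean-up only 'D' is an isolate; 'C' still has its self-loop
isolatesBefore = sorted(nx.isolates(toyG))

# Materialise the self-loops into a list before removing them
toyG.remove_edges_from(list(nx.selfloop_edges(toyG)))

# 'C' has now become isolated too, so drop isolates after the self-loops
toyG.remove_nodes_from(list(nx.isolates(toyG)))

print(isolatesBefore)
print(sorted(toyG.nodes))
```

Running the isolate removal only before the self-loop removal would leave nodes like `C` in the graph with no edges attached.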

## Run through the list of nodes again and construct a list of the node sizes in the correct order

In [None]:
listOfNodeSizes = []

for node in G.nodes:
    listOfNodeSizes.append(dictOfNodeSizes[node])

## A network of all the interacting taxa

In [None]:
plt.figure(figsize=(15,15))
edge_labels = nx.get_edge_attributes(G,'label')

pos = nx.spring_layout(G, iterations=50, k=50) 
#pos = nx.spring_layout(G)
#pos = nx.random_layout(G)
#pos = nx.circular_layout(G)
#pos = nx.spectral_layout(G)
#pos = nx.shell_layout(G, scale=1)

nodeColors = nx.get_node_attributes(G,'color')

nx.draw_networkx_edge_labels(G,pos, edge_labels = edge_labels, font_size=10, font_color='blue')

#nx.draw_networkx_nodes(G, pos, node_color=nodeColors.values())

nx.draw_networkx(G, pos, with_labels=True, node_size = listOfNodeSizes, node_color='c', alpha= 1, arrows=True, 
                    linewidths=1, font_color="black", font_size=10, style = 'dashed')

plt.axis('off')
plt.tight_layout()
plt.show()

In [None]:
filename = "..\\docs\\"+taxon+country+".html"
with open(filename, "w") as file:
    file.write(" \
<!DOCTYPE html> \
<html> \
<head> \
<script src='../../../GitHub/cytoscape.js/dist/cytoscape.min.js'></script> \
<script src='https://unpkg.com/layout-base/layout-base.js'></script> \
<script src='https://unpkg.com/cose-base/cose-base.js'></script> \
<script src='../../../GitHub/cytoscape.js-cose-bilkent/cytoscape-cose-bilkent.js'></script> \
<style>#cy {width: 90%; height: 90%; position: absolute; top: 50px; left: 150px;}\
body {font-family: 'times'; font-size: 6px;}\
</style> \
</head> \
<body> \
<h1><em style='font-style: italic;'>"+taxon+"</em> in "+country+"</h1>")

### Write out the details of the species

In [None]:
with open(filename, "a") as file:
    file.write("<table><tr><th>Species</th><th>Occupancy</th></tr>")
    

In [None]:
species = G.nodes
with open(filename, "a") as file:
    for n in species:
        file.write("<tr><td><a target='_blank' href='https://www.gbif.org/species/"+str(G.nodes[n]['gbifkey'])+"'>"+n+"</a></td><td>"+str(dictOfNodeSizes[n])+"</td></tr>\n")
        #file.write("<a href=https://www.gbif.org/species/"+str(nx.get_node_attributes(G, 'gbifkey')[n])+">"+n+"</a>, "+str(dictOfNodeSizes[n])+"\n")

In [None]:
with open(filename, "a") as file:
    file.write("</table>")

In [None]:
with open(filename, "a") as file:
    file.write(" \
<div id='cy'></div> \
<script> \
var cy = cytoscape({ \
  container: document.getElementById('cy'), \n \
  elements: [ \
")

### Write nodes to file (for import into Gephi, but this has been replaced by the .dot file below)

In [None]:
#file = open(taxon+country+".html", "a")
#for n in species:
#    file.write("{ data: { id: '"+n+"', href: 'https://www.gbif.org/species/"+str(nx.get_node_attributes(G, 'gbifkey')[n])+"', occnum: "+str(dictOfNodeSizes[n])+" }, },\n")
#file.close()

### Write edges to file

In [None]:
#file = open(taxon+country+".html", "a")
#for edge in G.edges:
#    file.write("{data: {id: '"+edge[0]+edge[1]+"', source: '"+edge[0]+"', target: '"+edge[1]+"', label: '"+nx.get_edge_attributes(G, 'label')[edge]+"'}},\n")
#file.close()
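
The commented-out cells above build the Cytoscape.js elements by string concatenation, which breaks if a taxon name contains a quote character. As a hedged sketch of an alternative, the elements can be collected as Python dictionaries and serialised with `json.dumps`, which handles the escaping. The toy graph and placeholder names below are assumptions for illustration, and the `occnum` field follows the name used in the style block of this notebook.

```python
import json
import networkx as nx

# Toy graph standing in for G (placeholder names and keys)
toyG = nx.DiGraph()
toyG.add_node("Species A", gbifkey=111, occupancy=40)
toyG.add_node("Species B", gbifkey=222, occupancy=7)
toyG.add_edge("Species A", "Species B", label="eats")

elements = []

# One element per node, carrying the GBIF link and the occupancy
for name, attrs in toyG.nodes(data=True):
    elements.append({"data": {"id": name,
                              "href": "https://www.gbif.org/species/" + str(attrs["gbifkey"]),
                              "occnum": attrs["occupancy"]}})

# One element per edge, with source, target and interaction label
for source, target, label in toyG.edges(data="label"):
    elements.append({"data": {"id": source + target, "source": source,
                              "target": target, "label": label}})

# json.dumps escapes quotes and other special characters in the names
elementsJs = json.dumps(elements, indent=1)
print(elementsJs)
```

Note that `json.dumps(elements)` already includes the enclosing square brackets, so the `[` and `]` written by the surrounding template cells would need to be dropped if this approach were adopted.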
    

In [None]:
with open(filename, "a") as file:
    file.write("], \
style: [ \n\
        { \n\
            selector: 'node', \n\
            style: { \n\
                shape: 'circle', \n\
                'background-color': 'darkgreen', \n\
                label: 'data(id)', \n\
                'font-family': 'helvetica', \n\
                'font-style': 'italic', \n\
                'font-size': '12px', \n\
                'width': 'mapData(occnum, 0, 400, 3, 150)', \n\
                'height': 'mapData(occnum, 0, 400, 3, 150)' \n\
            } \n\
        },  \n\
        {  \n\
            selector: 'edge',  \n\
            style: {  \n\
                label: 'data(label)', \n\
                'font-family': 'helvetica', \n\
                'font-size': '12px', \n\
                'color': 'blue', \n\
                'curve-style': 'bezier', \n\
                'target-arrow-shape': 'triangle',  \n\
                'width': '1' \n\
                } \n\
            },  \n\
            {  \n\
              selector: ':selected',   \n\
              css: {  \n\
                'line-color': 'red',  \n\
                'background-color': 'red'  \n\
            }  \n\
        }], \n\
layout:  { \n\
            name: 'circle', padding: 10, animate: true, gravity: 30, animationDuration: 1000 \n\
     } \n\
} \n\
); \n\
cy.userZoomingEnabled( true ); \n\
</script> \n\
")

In [None]:
with open(filename, "a") as file:
    file.write("<h2>References</h2><ul>\n")

In [None]:
citations = cleanInteractDataTaxon['sourceCitation'].unique()
with open(filename, "a") as file:
    for ref in citations:
        file.write("<li>"+str(ref)+"</li>\n")
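
Citation strings often contain characters such as `&` or `<` that have a special meaning in HTML. The sketch below shows one way to guard against that with `html.escape` from the standard library. The citation text is a made-up placeholder, and skipping `None` entries is an assumption about how missing citations should be treated.

```python
import html

# Placeholder citations for illustration; the second entry stands for a missing value
citations = ["Smith & Jones <2019>. Example dataset.", None]

# Escape each citation before wrapping it in a list item, skipping missing entries
items = ["<li>" + html.escape(str(ref)) + "</li>" for ref in citations if ref is not None]

print("\n".join(items))
```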

In [None]:
with open(filename, "a") as file:
    file.write("</ul> \
        </body> \
        </html>")

## Output a CSV file for import into Gephi

In [None]:
# Print the target taxon of each edge as a quick check
for edge in G.edges:
    print(edge[1])

In [None]:
import csv

with open(taxon+"_nodes.csv", "w", newline="") as file:
    writer = csv.writer(file)
    # Gephi's node table expects an Id column; Label is the text that is displayed
    writer.writerow(["Id", "Label", "Occupancy"])
    for node, occupancy in G.nodes(data="occupancy"):
        writer.writerow([node, node, occupancy])

In [None]:
import csv

with open(taxon+"edges.csv", "w", newline="") as file:
    writer = csv.writer(file)
    # Gephi's edge table expects Source and Target columns
    writer.writerow(["Source", "Target", "Label"])
    for source, target, label in G.edges(data="label"):
        writer.writerow([source, target, label])

In [None]:
#with open(taxon+".csv", "w") as file:
write_dot(G, taxon+".dot")