
# Animal epidemics in Switzerland

The Federal Office for Agriculture (FOAG) collects data on animal diseases in Switzerland. This data is published as [Linked Data](https://en.wikipedia.org/wiki/Linked_data).

In this tutorial, we show **how to work with Linked Data**, using the data on animal epidemics as an example.  
We will look at how to query, process, and visualize it.

## Ideas

* Animal hierarchy
* Disease hierarchy
* Number of reports per commune (gde), over time (https://python-visualization.github.io/folium/plugins.html#folium.plugins.TimeSliderChoropleth)
* Serious outbreaks: find reports where the whole stock was killed (stock == killed)
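
For the idea of mapping reports per gde (commune) over time, the main preparation step is a nested style dictionary. The sketch below builds one with the standard library only. It assumes the `styledict` shape we understand from the folium plugin documentation (feature id, then timestamp string, then `color` and `opacity`); the commune ids, counts, and the helpers `to_epoch` and `shade` are hypothetical.

```python
from datetime import datetime, timezone

# Hypothetical monthly report counts per commune id (not real data).
counts = {
    "commune-1": {"2021-08-01": 2, "2021-09-01": 5},
    "commune-2": {"2021-08-01": 0, "2021-09-01": 1},
}


def to_epoch(day: str) -> str:
    """Convert an ISO date to epoch seconds (UTC), returned as a string."""
    dt = datetime.strptime(day, "%Y-%m-%d").replace(tzinfo=timezone.utc)
    return str(int(dt.timestamp()))


def shade(value: int, vmax: int) -> str:
    """Map a count to a hex colour: white for 0, pure red for vmax."""
    level = 255 - round(255 * value / vmax) if vmax else 255
    return f"#ff{level:02x}{level:02x}"


vmax = max(n for per_day in counts.values() for n in per_day.values())

# Assumed shape: {feature_id: {timestamp: {"color": ..., "opacity": ...}}}
styledict = {
    gde: {
        to_epoch(day): {"color": shade(n, vmax), "opacity": 0.7}
        for day, n in per_day.items()
    }
    for gde, per_day in counts.items()
}
print(styledict["commune-1"])
```

The keys of `styledict` would have to match the feature ids of the commune GeoJSON passed to the plugin.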

In [12]:
import json

import folium
import matplotlib as mpl
import matplotlib.pyplot as plt
import pandas as pd
import plotly.express as px
import plotly.graph_objects as go
from plotly.subplots import make_subplots

from graphly.api_client import SparqlClient

import networkx as nx

from bokeh.io import output_file, show, output_notebook
from bokeh.plotting import figure, from_networkx, ColumnDataSource
from bokeh.models import (Circle, CustomJS, EdgesAndLinkedNodes, HoverTool,
                          Legend, LegendItem, MultiLine, NodesAndLinkedEdges,
                          Plot, Range1d, TapTool)
from bokeh.palettes import Spectral4

In [13]:
sparql = SparqlClient("https://int.lindas.admin.ch/query")
geosparql = SparqlClient("https://ld.geo.admin.ch/query")

sparql.add_prefixes({
    "schema": "<http://schema.org/>",
    "cube": "<https://cube.link/>",
    "admin": "<https://schema.ld.admin.ch/>",
    "skos": "<http://www.w3.org/2004/02/skos/core#>",
    "disease": "<https://environment.ld.admin.ch/foen/animal-pest/>"   
})

geosparql.add_prefixes({
    "dct": "<http://purl.org/dc/terms/>",
    "geonames": "<http://www.geonames.org/ontology#>",
    "schema": "<http://schema.org/>",
    "geosparql": "<http://www.opengis.net/ont/geosparql#>",
})
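
The prefixes registered above let the queries below use short names such as `schema:name` instead of full IRIs. As a rough illustration of the idea, the sketch below expands a prefix map into SPARQL `PREFIX` declarations. The helper `with_prefixes` is hypothetical and written only for this example; it is not graphly's implementation, which may work differently.

```python
def with_prefixes(prefixes: dict[str, str], query: str) -> str:
    """Prepend SPARQL PREFIX declarations to a query (illustrative helper)."""
    header = "\n".join(f"PREFIX {name}: {iri}" for name, iri in prefixes.items())
    return f"{header}\n{query.strip()}"


prefixes = {
    "schema": "<http://schema.org/>",
    "skos": "<http://www.w3.org/2004/02/skos/core#>",
}
full_query = with_prefixes(prefixes, "SELECT * WHERE { ?s schema:name ?name } LIMIT 5")
print(full_query)
```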

## Animal hierarchy

In [14]:
query = """
SELECT DISTINCT ?specie ?group
WHERE {
  ?s disease:animal-specie ?specieIRI.
  ?specieIRI schema:name ?specie.
  
  ?specieIRI skos:broader/schema:name ?group.
  
  FILTER (LANG(?specie) = "de")
  FILTER (LANG(?group) = "de")
} 
ORDER BY ?group
"""

df = sparql.send_query(query)
df.head()

| | specie | group |
|---|---|---|
| 0 | Anderes Haustier | Andere Haustiere |
| 1 | Bienen | Bienen |
| 2 | Pferd | Equiden |
| 3 | Krebs | Fische |
| 4 | Fisch | Fische |
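
The result is a flat list of (specie, group) pairs. To read it as a hierarchy, the pairs can be collected per group. A minimal sketch, using only the five sample rows shown above rather than the full query result:

```python
import pandas as pd

# Sample rows copied from the query output above.
df = pd.DataFrame({
    "specie": ["Anderes Haustier", "Bienen", "Pferd", "Krebs", "Fisch"],
    "group": ["Andere Haustiere", "Bienen", "Equiden", "Fische", "Fische"],
})

# Collect the species belonging to each group, keeping the row order.
hierarchy = df.groupby("group")["specie"].apply(list).to_dict()
print(hierarchy["Fische"])
```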


## Disease hierarchy

In [15]:
query = """
SELECT DISTINCT ?epidemics ?group
WHERE {
  ?s disease:epidemics ?epidemicsIRI.
  ?epidemicsIRI schema:name ?epidemics.
  
  ?epidemicsIRI skos:broader/schema:name ?group.

  FILTER (LANG(?epidemics) = "de")
  FILTER (LANG(?group) = "de")
}
ORDER BY ?group
"""

df = sparql.send_query(query)
df.head()

| | epidemics | group |
|---|---|---|
| 0 | Bovine Virusdiarrhoe / Mucosal Disease | Auszurottende Seuchen |
| 1 | Brucellose der Rinder | Auszurottende Seuchen |
| 2 | Tollwut | Auszurottende Seuchen |
| 3 | Infektiöse Agalaktie | Auszurottende Seuchen |
| 4 | Bovine spongiforme Enzephalopathie | Auszurottende Seuchen |


## Can we link disease to animal type?

In [16]:
query = """
SELECT DISTINCT ?epidemics ?specie ?group
WHERE {
  <https://environment.ld.admin.ch/foen/animal-pest/observation/> cube:observation ?obs .
  ?obs disease:epidemics/schema:name ?epidemics;
       disease:animal-specie ?specieIRI.
       
  ?specieIRI schema:name ?specie.
  ?specieIRI skos:broader/schema:name ?group.
  
  FILTER (LANG(?epidemics) = "de")
  FILTER (LANG(?specie) = "de")
  FILTER (LANG(?group) = "de")
} 

ORDER BY ?specie
"""

df = sparql.send_query(query)
df.head()

| | epidemics | specie | group |
|---|---|---|---|
| 0 | Listeriose | Affe | Wildtier |
| 1 | Coxiellose | Affe | Wildtier |
| 2 | Toxoplasmose | Affe | Wildtier |
| 3 | Tularämie | Affe | Wildtier |
| 4 | Echinokokkose | Affe | Wildtier |
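
Before drawing a graph, the disease-to-animal links can also be inspected as a table. The sketch below cross-tabulates the pairs and counts the species per disease. The rows are hypothetical and only mimic the shape of the query result.

```python
import pandas as pd

# Hypothetical rows in the same shape as the query result (not real data).
df = pd.DataFrame({
    "epidemics": ["Coxiellose", "Coxiellose", "Listeriose", "Tollwut"],
    "specie": ["Rind", "Ziege", "Rind", "Fuchs"],
})

# One row per disease, one column per species; the cell counts observed pairs.
links = pd.crosstab(df["epidemics"], df["specie"])
# Number of distinct species each disease was reported in.
species_per_disease = links.gt(0).sum(axis=1)
print(species_per_disease.sort_values(ascending=False))
```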


In [17]:
g = nx.from_pandas_edgelist(df, source='epidemics', target='specie')
groups = {key: "epidemics" for key in df.epidemics} | {key: "specie" for key in df.specie}

nx.set_node_attributes(g, groups, name="group")

colormap = {"epidemics": "yellow", "specie": "green"}
colors = {k: colormap[v] for k, v in groups.items()}
nx.set_node_attributes(g, colors, name="color")

In [19]:
# Layouts: https://networkx.org/documentation/stable/reference/generated/networkx.drawing.layout.spring_layout.html
plot = Plot()
plot.title.text = 'Bokeh Plot'

graph_renderer = from_networkx(g, nx.circular_layout, scale=1, center=(0, 0))

# manipulating nodes
graph_renderer.node_renderer.glyph = Circle(size=15, fill_color="color")
graph_renderer.node_renderer.selection_glyph = Circle(size=15, fill_color="color")
graph_renderer.node_renderer.hover_glyph = Circle(size=15, fill_color=Spectral4[1])
# expose each node's group ("epidemics" or "specie") to the hover callback
graph_renderer.node_renderer.data_source.data['group'] = [groups[node] for node in graph_renderer.node_renderer.data_source.data["index"]]

# manipulating edges
graph_renderer.edge_renderer.glyph = MultiLine(line_color='#CCCCCC', line_alpha=.5, line_width=5)
graph_renderer.edge_renderer.selection_glyph = MultiLine(line_color=Spectral4[2], line_width=5)
graph_renderer.edge_renderer.hover_glyph = MultiLine(line_color=Spectral4[1], line_width=5)


graph_renderer.selection_policy = NodesAndLinkedEdges()
graph_renderer.inspection_policy = NodesAndLinkedEdges()

plot.renderers.append(graph_renderer)

code = '''
    if (cb_data.index.indices.length > 0) {
        const index = cb_data.index.indices[0];

        if (source.data.group[index] === "specie") {
            hover.tooltips = [["Animal", "@index"]];  
        }
        else {
            hover.tooltips = [["Disease", "@index"]];
        }                                     
    }
'''

hover = HoverTool(
    renderers=[graph_renderer.node_renderer]
)
hover.callback = CustomJS(
    args = dict(source = graph_renderer.node_renderer.data_source, hover = hover),
    code = code
)

plot.add_tools(hover, TapTool())

output_notebook()
show(plot)

### Diseases
All reports. Run in browser: https://s.zazuko.com/24fSKE

In [63]:
query = """
SELECT ?diagnosis ?gde ?specie ?stock ?sick ?infected ?killed ?deceased ?epidemics ?death_cause
WHERE {
  <https://environment.ld.admin.ch/foen/animal-pest/observation/> cube:observation ?obs .
  ?obs disease:epidemics/schema:name ?epidemics;
       disease:diagnosis-date ?diagnosis;
       disease:animals-stock ?stock;
       disease:animals-sick ?sick;
       disease:animals-infected ?infected;
       disease:animals-killed ?killed;
       disease:animals-deceased ?deceased;
       disease:internet-publication ?date;
       disease:death-cause/schema:name ?death_cause;
       disease:animal-specie/schema:name ?specie;
       schema:containedInPlace/schema:name ?gde .
  
  FILTER (LANG(?epidemics) = "de")
  FILTER (LANG(?death_cause) = "de")
  FILTER (LANG(?specie) = "de")
} 
ORDER BY DESC(?diagnosis) ?gde
"""
df = sparql.send_query(query)
df.head()

| | diagnosis | gde | specie | stock | sick | infected | killed | deceased | epidemics | death_cause |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2021-09-23 | Tafers | Huhn | 1 | 1 | 0 | 0 | 0 | Listeriose | Klinischer Fall |
| 1 | 2021-09-21 | Bern | Pferd | 1 | 1 | 0 | 0 | 0 | Salmonellose | Klinischer Fall |
| 2 | 2021-09-21 | Kerzers | Huhn | 1 | 1 | 0 | 0 | 0 | Salmonella-Infektion des Geflügels | Untersuchungsprogramm |
| 3 | 2021-09-21 | Sorens | Rind | 3 | 3 | 0 | 0 | 0 | Salmonellose | Klinischer Fall |
| 4 | 2021-09-20 | Grüningen | Rind | 38 | 1 | 0 | 0 | 0 | Salmonellose | Klinischer Fall |


## Reports

Number of reports per day, species, commune, and epidemic.

In [116]:
query = """
SELECT ?diagnosis ?gde ?specie ?epidemics
WHERE {
  <https://environment.ld.admin.ch/foen/animal-pest/observation/> cube:observation ?obs .
  ?obs disease:epidemics/schema:name ?epidemics;
       disease:diagnosis-date ?diagnosis;
       disease:internet-publication ?date;
       disease:animal-specie/schema:name ?specie;
       schema:containedInPlace/schema:name ?gde .
  
  FILTER (LANG(?epidemics) = "de")
  FILTER (LANG(?specie) = "de")
} 
ORDER BY DESC(?diagnosis) ?gde
"""
df = sparql.send_query(query)
df.head()

| | diagnosis | gde | specie | epidemics |
|---|---|---|---|---|
| 0 | 2021-09-23 | Tafers | Huhn | Listeriose |
| 1 | 2021-09-21 | Bern | Pferd | Salmonellose |
| 2 | 2021-09-21 | Kerzers | Huhn | Salmonella-Infektion des Geflügels |
| 3 | 2021-09-21 | Sorens | Rind | Salmonellose |
| 4 | 2021-09-20 | Grüningen | Rind | Salmonellose |


In [186]:
dff = df.copy()
# Truncate each diagnosis date to the first day of its month.
# The column keeps the name "day" because the plots below rely on it.
dff["day"] = pd.to_datetime(dff["diagnosis"]).dt.to_period("M").dt.to_timestamp()
# Count reports per month, species, and epidemic
dff = dff.groupby(by=["day", "specie", "epidemics"]).size().reset_index(name="reports")
dff = dff.sort_values(by=["day", "epidemics", "specie"], ascending=False)
# Months with the most reports
dff.sort_values(by=["reports"], ascending=False).reset_index(drop=True)[0:15]

| | day | specie | epidemics | reports |
|---|---|---|---|---|
| 0 | 2008-11-01 | Rind | Bovine Virusdiarrhoe / Mucosal Disease | 1929 |
| 1 | 2008-12-01 | Rind | Bovine Virusdiarrhoe / Mucosal Disease | 1246 |
| 2 | 2008-10-01 | Rind | Bovine Virusdiarrhoe / Mucosal Disease | 1106 |
| 3 | 2009-01-01 | Rind | Bovine Virusdiarrhoe / Mucosal Disease | 419 |
| 4 | 2009-05-01 | Bienen | Sauerbrut der Bienen | 284 |
| 5 | 2011-04-01 | Bienen | Sauerbrut der Bienen | 265 |
| 6 | 2010-06-01 | Bienen | Sauerbrut der Bienen | 263 |
| 7 | 2010-05-01 | Bienen | Sauerbrut der Bienen | 258 |
| 8 | 2009-02-01 | Rind | Bovine Virusdiarrhoe / Mucosal Disease | 216 |
| 9 | 2011-05-01 | Bienen | Sauerbrut der Bienen | 200 |


In [264]:
diseases = dff[["reports", "epidemics"]].groupby("epidemics").sum().reset_index().sort_values(by="reports", ascending=False).reset_index(drop=True)
ROWS = 8
COLS = 2

diseases_subset = diseases[0:ROWS*COLS]
fig = make_subplots(rows=ROWS, cols=COLS, subplot_titles=diseases_subset["epidemics"].tolist())
for obs in diseases_subset.itertuples():
    # Place subplots row by row (plotly rows and columns are 1-based)
    row = obs.Index // COLS + 1
    col = obs.Index % COLS + 1
    # Sum over species so that each month appears once, in chronological order
    d = dff[dff.epidemics == obs.epidemics].groupby("day", as_index=False)["reports"].sum()
    # go.Line and append_trace are deprecated; use go.Scatter(mode="lines") and add_trace
    fig.add_trace(go.Scatter(x=d["day"], y=d["reports"], mode="lines", name=obs.epidemics, line_color=px.colors.qualitative.Dark24[0]), row=row, col=col)

fig.update_layout(height=1200, width=1000, title={"text": "Biggest animal epidemics in Switzerland", "x": 0.5}, showlegend=False)
fig.show()
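
The subplot placement in the loop above maps a zero-based rank to a one-based (row, col) pair, filling the grid row by row. The same arithmetic can be written with `divmod`. The helper name `grid_position` is ours, for illustration only.

```python
COLS = 2


def grid_position(index: int, cols: int = COLS) -> tuple[int, int]:
    """Return the 1-based (row, col) of the index-th subplot, filled row by row."""
    row, col = divmod(index, cols)
    return row + 1, col + 1


positions = [grid_position(i) for i in range(5)]
print(positions)
```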

In [267]:
diseases = dff[["reports", "epidemics", "specie"]].groupby(["specie", "epidemics"]).sum().reset_index().sort_values(by="reports", ascending=False).reset_index(drop=True)
diseases.head()

| | specie | epidemics | reports |
|---|---|---|---|
| 0 | Rind | Bovine Virusdiarrhoe / Mucosal Disease | 10143 |
| 1 | Bienen | Sauerbrut der Bienen | 8633 |
| 2 | Bienen | Faulbrut der Bienen | 2498 |
| 3 | Rind | Coxiellose | 1899 |
| 4 | Ziege | Caprine Arthritis-Encephalitis | 1791 |


In [242]:
d = dff[dff.epidemics == "Bovine Virusdiarrhoe / Mucosal Disease"]
fig = px.line(d, x="day", y="reports")
fig.show()

In [206]:
d = dff[dff.epidemics == "Sauerbrut der Bienen"]
fig = px.line(d, x="day", y="reports")
fig.update_xaxes(rangeslider_visible=True)
fig.show()
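
One caveat for the line charts: months without any report are simply missing from `dff`, so a line connects the neighbouring months directly. If gaps should show up as zero, the series can be reindexed on a complete monthly range first. A minimal sketch on hypothetical counts:

```python
import pandas as pd

# Hypothetical monthly counts with a gap in February (not real data).
d = pd.DataFrame({
    "day": pd.to_datetime(["2021-01-01", "2021-03-01"]),
    "reports": [4, 2],
})

# Reindex on a complete month-start range so that missing months become 0.
full_range = pd.date_range(d["day"].min(), d["day"].max(), freq="MS")
filled = (
    d.set_index("day")["reports"]
     .reindex(full_range, fill_value=0)
     .rename_axis("day")
     .reset_index()
)
print(filled)
```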

Interesting case: rabies (Tollwut) no longer appears in the reports!

Also, broken down by species, all rabies cases came from a single animal type: Fuchs (fox).

In [201]:
d = dff[dff.epidemics == "Tollwut"]
fig = px.bar(d, x="day", y="reports")
fig.show()
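
Both rabies observations can be checked directly on the monthly table: the most recent month with a report, and the reports per species. The sketch below uses hypothetical counts in the shape of `dff`, not the real data.

```python
import pandas as pd

# Hypothetical monthly counts in the shape of `dff` (not real data).
dff = pd.DataFrame({
    "day": pd.to_datetime(["1995-03-01", "1996-07-01", "2021-09-01"]),
    "specie": ["Fuchs", "Fuchs", "Rind"],
    "epidemics": ["Tollwut", "Tollwut", "Salmonellose"],
    "reports": [3, 1, 5],
})

# Most recent month with at least one report, per epidemic.
last_seen = dff.groupby("epidemics")["day"].max()
# Reports per species for a single epidemic.
by_specie = dff[dff.epidemics == "Tollwut"].groupby("specie")["reports"].sum()
print(last_seen["Tollwut"], by_specie.to_dict())
```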

In [357]:
diseases = dff[["reports", "epidemics", "specie"]].groupby(["specie", "epidemics"]).sum().reset_index()
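# Total reports per epidemic, repeated on every row of that epidemic
diseases["reports_group"] = diseases.groupby("epidemics")["reports"].transform("sum")
# Number of species with at least one report of that epidemic
diseases["affected_species"] = diseases.groupby("epidemics")["reports"].transform("count")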
diseases = diseases[diseases.affected_species > 1]
diseases = diseases.sort_values(by=["reports_group", "reports"], ascending=False).reset_index(drop=True)
diseases[diseases.epidemics != "Salmonellose"][0:30]

| | specie | epidemics | reports | reports_group | affected_species |
|---|---|---|---|---|---|
| 0 | Rind | Bovine Virusdiarrhoe / Mucosal Disease | 10143 | 10144 | 2 |
| 1 | Anderes Haustier | Bovine Virusdiarrhoe / Mucosal Disease | 1 | 10144 | 2 |
| 35 | Rind | Coxiellose | 1899 | 2235 | 7 |
| 36 | Ziege | Coxiellose | 221 | 2235 | 7 |
| 37 | Schaf | Coxiellose | 109 | 2235 | 7 |
| 38 | Schwein | Coxiellose | 3 | 2235 | 7 |
| 39 | Affe | Coxiellose | 1 | 2235 | 7 |
| 40 | Anderes Haustier | Coxiellose | 1 | 2235 | 7 |
| 41 | Anderes Wildtier | Coxiellose | 1 | 2235 | 7 |
| 42 | Schaf | Chlamydienabort der Schafe und Ziegen | 827 | 1357 | 3 |
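
The `reports_group` and `affected_species` columns come from a group-wise `transform`, which returns one value per original row instead of one per group. A small sketch on four of the rows above (Coxiellose is reduced to two of its seven species here, so its totals differ from the full table):

```python
import pandas as pd

# Four rows copied from the output above (Coxiellose is only a subset).
diseases = pd.DataFrame({
    "specie": ["Rind", "Anderes Haustier", "Rind", "Ziege"],
    "epidemics": ["Bovine Virusdiarrhoe / Mucosal Disease"] * 2 + ["Coxiellose"] * 2,
    "reports": [10143, 1, 1899, 221],
})

per_epidemic = diseases.groupby("epidemics")["reports"]
# transform keeps the original index, so the result aligns row by row.
diseases["reports_group"] = per_epidemic.transform("sum")
diseases["affected_species"] = per_epidemic.transform("count")
print(diseases)
```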


## Disease co-occurrence: Chlamydienabort
https://www.blv.admin.ch/blv/fr/home/tiere/tierseuchen/uebersicht-seuchen/alle-tierseuchen/chlamydienabort-der-schafe-und-ziegen.html

In [359]:
# Candidates: Coxiellose, Campylobacteriose, Listeriose, Pseudotuberkulose der Schafe und Ziegen
# Interesting: Tollwut
# Particularly curious: Maedi-Visna

#epidemic = "Chlamydienabort der Schafe und Ziegen"
epidemic = "Maedi-Visna"
relevant_animals = diseases[(diseases.epidemics == epidemic) & (diseases.reports > 20)].specie
subset = dff[(dff.epidemics == epidemic) & (dff.specie.isin(relevant_animals))]

fig = px.line(subset, x="day", y="reports", color="specie", range_y=[0, int(max(subset.reports)*1.3)+1], range_x=[min(subset.day), max(subset.day)])
        
fig.update_xaxes(rangeslider_visible=True)
fig.show()

### Deadly epidemics

Which epidemics led to the death of the entire stock?

In [55]:
query = """
SELECT ?diagnosis ?gde ?specie ?stock_deceased ?epidemics ?death_cause
WHERE {
  <https://environment.ld.admin.ch/foen/animal-pest/observation/> cube:observation ?obs .
  ?obs disease:epidemics/schema:name ?epidemics;
       disease:diagnosis-date ?diagnosis;
       disease:animals-stock ?stock_deceased;
       disease:animals-deceased ?stock_deceased;
       disease:internet-publication ?date;
       disease:death-cause/schema:name ?death_cause;
       disease:animal-specie/schema:name ?specie;
       schema:containedInPlace/schema:name ?gde .
  
  #FILTER (?stock_deceased > 1)
  FILTER (LANG(?epidemics) = "de")
  FILTER (LANG(?death_cause) = "de")
  FILTER (LANG(?specie) = "de")
} 
ORDER BY DESC(?stock_deceased)
"""

df = sparql.send_query(query)
df.head()

| | diagnosis | gde | specie | stock_deceased | epidemics | death_cause |
|---|---|---|---|---|---|---|
| 0 | 2020-02-27 | Wileroltigen | Huhn | 24000 | Salmonella-Infektion des Geflügels | Bestandesüberwachung |
| 1 | 2020-12-01 | Montanaire | Huhn | 4000 | Salmonella-Infektion des Geflügels | Untersuchungsprogramm |
| 2 | 2002-09-10 | La Grande Béroche | Kaninchen | 300 | Virale hämorrhagische Krankheit der Kaninchen | Klinischer Fall |
| 3 | 1996-02-09 | Prangins | Kaninchen | 70 | Virale hämorrhagische Krankheit der Kaninchen | Klinischer Fall |
| 4 | 2002-04-03 | Bellinzona | Kaninchen | 67 | Virale hämorrhagische Krankheit der Kaninchen | Klinischer Fall |
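
The query above binds `disease:animals-stock` and `disease:animals-deceased` to the same variable, `?stock_deceased`, so only observations where both values are equal can match. On a DataFrame with separate columns, the same condition is a plain equality filter. The sketch below uses two rows from the "all reports" result earlier plus one hypothetical row where the whole stock died.

```python
import pandas as pd

# Two rows from the "all reports" output, plus one hypothetical row.
df = pd.DataFrame({
    "gde": ["Tafers", "Grüningen", "Example commune"],
    "stock": [1, 38, 12],
    "deceased": [0, 0, 12],
})

# Same condition as reusing ?stock_deceased for both properties in SPARQL.
wiped_out = df[df["stock"] == df["deceased"]]
print(wiped_out)
```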


A disease can be highly contagious. It may therefore be safer to kill the entire stock at an affected farm to prevent the disease from spreading.

Which epidemics forced farmers to kill all their stock?

In [10]:
query = """
SELECT ?diagnosis ?gde ?specie ?sick ?infected ?deceased ?stock_killed ?epidemics ?death_cause
WHERE {
  <https://environment.ld.admin.ch/foen/animal-pest/observation/> cube:observation ?obs .
  ?obs disease:epidemics/schema:name ?epidemics;
       disease:diagnosis-date ?diagnosis;
       disease:animals-stock ?stock_killed;
       disease:animals-sick ?sick;
       disease:animals-infected ?infected;
       disease:animals-killed ?stock_killed;
       disease:animals-deceased ?deceased;
       disease:internet-publication ?date;
       disease:death-cause/schema:name ?death_cause;
       disease:animal-specie/schema:name ?specie;
       schema:containedInPlace/schema:name ?gde .
  
  #FILTER (?stock_killed > 1)
  FILTER (LANG(?epidemics) = "de")
  FILTER (LANG(?death_cause) = "de")
  FILTER (LANG(?specie) = "de")
} 
ORDER BY DESC(?stock_killed)
"""

df = sparql.send_query(query)
df.head()

| | diagnosis | gde | specie | sick | infected | deceased | stock_killed | epidemics | death_cause |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 2014-07-23 | Val-de-Travers | Fisch | 900 | 50000 | 900 | 90000 | Infektiöse Hämatopoietische Nekrose | Ansteckungsverdacht |
| 1 | 2013-07-01 | Cham | Fisch | 15000 | 25000 | 0 | 40000 | Infektiöse Hämatopoietische Nekrose | Klinischer Fall |
| 2 | 2004-07-12 | Erstfeld | Fisch | 0 | 0 | 0 | 30000 | Infektiöse Hämatopoietische Nekrose | Klinischer Fall |
| 3 | 2021-08-13 | Cheseaux-Noréaz | Huhn | 0 | 20 | 0 | 18000 | Salmonella-Infektion des Geflügels | Untersuchungsprogramm |
| 4 | 2013-12-31 | Agiez | Huhn | 0 | 23 | 0 | 16000 | Salmonella-Infektion des Geflügels | Bestandesüberwachung |
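
To answer the question per epidemic rather than per report, the rows can be summarised by disease. A minimal sketch on the five rows shown above only, so the numbers do not reflect the full result:

```python
import pandas as pd

# Epidemic and stock_killed values of the five rows shown above.
df = pd.DataFrame({
    "epidemics": ["Infektiöse Hämatopoietische Nekrose"] * 3
                 + ["Salmonella-Infektion des Geflügels"] * 2,
    "stock_killed": [90000, 40000, 30000, 18000, 16000],
})

# Number of full-stock culls and total animals killed, per epidemic.
summary = (
    df.groupby("epidemics")["stock_killed"]
      .agg(reports="count", animals="sum")
      .sort_values("reports", ascending=False)
)
print(summary)
```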
