### Conservation status of macrophytes in Austria and Tirol
---
# 1. Build a collection of Macrophytes of Austria


In [None]:
import yaml
import os
from Bio import Entrez
import polars as pl

from utils import get_taxon_id
from utils import get_taxonlist_species

with open('../config.yml', 'r') as file:
    configs = yaml.safe_load(file)

I received a list of macrophytes in which some entries are families and others are genera or subgenera. The task is therefore to find all corresponding members in the red lists of Austria and Tirol.

The red lists can be downloaded here:
- Northern and eastern Tirol: https://www.uibk.ac.at/de/botany/aktuelles/rote-liste-und-checkliste-der-farn-und-blutenpflanzen-nord-und-osttirols/
- Austria as a whole: https://www.zobodat.at/publikation_volumes.php?id=69999

### Step 1
Here, we define the file paths and load the lists into memory.

In [19]:
RL_TIR_FILE = "../static/red_list_plants_tirol.csv"  # red list tirol
RL_AUT_FILE = "../static/red_list_plants_austria.csv"  # red list austria
MP_FILE = "../static/macrophytes.txt"  # list of macrophytes

ALLTAXA_OUTFILE = "../static/allmacrophytes_ncbi.txt"  # output file that should contain all macrophyte species that correspond to our list on NCBI
MPSPECIES_TIR_OUTFILE = "../static/macrophytes_species_tirol.txt"  # output file that lists all macrophytes in the red list in tirol
MPSPECIES_AUT_OUTFILE = "../static/macrophytes_species_austria.txt"  # output file that lists all macrophytes in the red list in austria

In [20]:
red_list_tir_df = pl.read_csv(RL_TIR_FILE)
red_list_aut_df = pl.read_csv(RL_AUT_FILE)

# read one taxon name per line; the context manager closes the file again
with open(MP_FILE, "r") as fl:
    macrophytes = [line.strip() for line in fl]

We can now inspect the list of macrophyte taxa to be used in the analysis:

In [4]:
macrophytes

['Alismataceae',
 'Hydrocharitaceae',
 'Nymphaeaceae',
 'Potamogetonaceae',
 'Lemnaceae',
 'Menyanthaceae',
 'Myriophyllum',
 'Ceratophyllum',
 'Callitriche',
 'Hippuris',
 'Ranunculus subgen. Batrachium',
 'Najas',
 'Utricularia',
 'Zannichellia',
 'Typha',
 'Sparganium']
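
Because the list mixes ranks, it can help to see which entries are families, genera, or subgenera before querying a database. The following is a minimal sketch based only on the form of the names; `classify_taxon_name` is a hypothetical helper that is not part of `utils`, and the heuristic (suffix `-aceae`, marker `subgen.`) is an assumption that fits the list above but is not a general taxonomic check.

```python
def classify_taxon_name(name: str) -> str:
    """Guess the rank of a name from its spelling (heuristic only)."""
    if "subgen." in name or "subg." in name:
        return "subgenus"
    if name.endswith("aceae"):
        return "family"
    return "genus"


# three entries taken from the list above
names = ["Alismataceae", "Myriophyllum", "Ranunculus subgen. Batrachium"]
for name in names:
    print(f"{name}: {classify_taxon_name(name)}")
```

Entries flagged as subgenus are the ones most likely to need special handling, since a plain name lookup may not resolve them.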

### Step 2
The NCBI taxonomy database can now be screened for all species and genera that belong to our selection of macrophyte taxa.

In [6]:
Entrez.api_key = configs["NCBI_API_KEY"]
Entrez.email = configs["NCBI_EMAIL"]
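
A missing or empty entry in `config.yml` would only surface later as a failed or throttled request. As a small safeguard, one could check the keys first. This is a sketch under the assumption that the config is a flat mapping with the two keys used above; `missing_config_keys` is a hypothetical helper.

```python
def missing_config_keys(configs, required):
    """Return the required keys that are absent or empty in the config mapping."""
    configs = configs or {}  # yaml.safe_load returns None for an empty file
    return [key for key in required if not configs.get(key)]


# made-up example values, not real credentials
example = {"NCBI_API_KEY": "", "NCBI_EMAIL": "someone@example.org"}
missing = missing_config_keys(example, ("NCBI_API_KEY", "NCBI_EMAIL"))
print(missing)
```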

This is done with two small helper functions that query the NCBI database for all species and genera belonging to a given taxon.

In [14]:
all_taxa = []
for taxon in macrophytes:
    taxa = get_taxonlist_species(get_taxon_id(taxon))
    all_taxa.extend(taxa)
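
The two helpers live in `utils` and are not shown in this notebook. For orientation, here is a rough sketch of how such lookups could be written with Biopython's `Entrez` module. The function names `subtree_query`, `lookup_taxon_id`, and `list_subtree_names` are hypothetical, the query syntax (`txid…[Subtree]`, `…[Rank]`) is assumed to be accepted by the taxonomy database, and the actual implementation in `utils` may differ.

```python
def subtree_query(taxon_id, ranks=("species", "genus")):
    """Build a search term for all taxa of the given ranks below one node."""
    rank_filter = " OR ".join(f"{rank}[Rank]" for rank in ranks)
    return f"txid{taxon_id}[Subtree] AND ({rank_filter})"


def lookup_taxon_id(name):
    """Resolve a name to an NCBI taxonomy ID (first hit only)."""
    from Bio import Entrez  # imported here so subtree_query works without Biopython

    handle = Entrez.esearch(db="taxonomy", term=name)
    result = Entrez.read(handle)
    handle.close()
    if not result["IdList"]:
        raise LookupError(f"No NCBI taxonomy entry found for {name!r}")
    return result["IdList"][0]


def list_subtree_names(taxon_id, batch_size=200):
    """Fetch the scientific names of all species and genera below a node."""
    from Bio import Entrez

    handle = Entrez.esearch(db="taxonomy", term=subtree_query(taxon_id), retmax=10000)
    ids = Entrez.read(handle)["IdList"]
    handle.close()

    names = []
    # fetch the records in batches to keep each request small
    for start in range(0, len(ids), batch_size):
        batch = ",".join(ids[start:start + batch_size])
        handle = Entrez.efetch(db="taxonomy", id=batch, retmode="xml")
        names.extend(record["ScientificName"] for record in Entrez.read(handle))
        handle.close()
    return names


# "12345" is an arbitrary placeholder ID, used only to show the query string
print(subtree_query("12345"))
```

Only `subtree_query` runs offline; the other two functions need network access and the `Entrez.email` setting from above.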


Before saving the data to a predefined file (to avoid rerunning the queries), we inspect the sorted result:

In [17]:
sorted(all_taxa)

['Adelonema',
 'Adelonema allenii',
 'Adelonema crinipes',
 'Adelonema erythropus',
 'Adelonema hammelii',
 'Adelonema panamense',
 'Adelonema peltatum',
 'Adelonema picturatum',
 'Adelonema sp. 1 SV-2018',
 'Adelonema sp. SEL 1992-0029',
 'Adelonema speariae',
 'Adelonema wallisii',
 'Adelonema wendlandii',
 'Aglaodorum',
 'Aglaodorum griffithii',
 'Aglaonema',
 'Aglaonema cochinchinense',
 'Aglaonema commutatum',
 'Aglaonema costatum',
 'Aglaonema crispum',
 'Aglaonema hybrid cultivar',
 'Aglaonema marantifolium',
 'Aglaonema modestum',
 'Aglaonema nitidum',
 'Aglaonema pictum',
 'Aglaonema rotundum',
 'Aglaonema simplex',
 'Aglaonema sp. Carlsen 3112',
 'Aglaonema sp. DZ-2018',
 'Aglaonema sp. Red Vein',
 'Aglaonema tenuipes',
 'Albidella',
 'Albidella acanthocarpa',
 'Albidella nymphaeifolia',
 'Albidella oligococca',
 'Alisma',
 'Alisma canaliculatum',
 'Alisma gramineum',
 'Alisma lanceolatum',
 'Alisma nanum',
 'Alisma plantago-aquatica',
 'Alisma rariflorum',
 'Alisma sp. C-002

The output already shows many members of Araceae (e.g. *Adelonema*, *Aglaonema*). This probably stems from the structure of the taxonomic tree on NCBI, perhaps in connection with Alismataceae or Lemnaceae; APG-based classifications, for instance, place the duckweeds (formerly Lemnaceae) within Araceae. We do not go into detail here, but the results will need some manual curation.
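
To find out which input taxon pulls in the unexpected names, one option is to record the results per input instead of pooling them in one list. The sketch below uses invented toy data that does not reflect the real NCBI response; `taxa_by_source` and `sources_of` are hypothetical helpers.

```python
def taxa_by_source(sources, fetch):
    """Map each input taxon to the list of names returned for it."""
    return {source: list(fetch(source)) for source in sources}


def sources_of(name, provenance):
    """Return the input taxa whose results contain the given name."""
    return [source for source, names in provenance.items() if name in names]


# invented stand-in for the database lookup
toy_results = {
    "Taxon A": ["Alisma", "Alisma gramineum"],
    "Taxon B": ["Lemna", "Lemna minor", "Adelonema"],
}
provenance = taxa_by_source(toy_results, toy_results.__getitem__)
print(sources_of("Adelonema", provenance))
```

In this notebook, `fetch` would be something like `lambda taxon: get_taxonlist_species(get_taxon_id(taxon))`, called with `macrophytes` as the sources.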

But first, we save the file.

In [None]:
# Uncomment to (re)write the output file:
# with open(ALLTAXA_OUTFILE, "w") as fl:
#     fl.writelines(f"{t}\n" for t in all_taxa)
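
Instead of commenting the write step in and out, the cache could be read when it exists and built otherwise. This is a minimal sketch; `load_or_build` is a hypothetical helper, and sorting and de-duplicating the names before writing is a choice made here, not something the cell above does.

```python
import tempfile
from pathlib import Path


def load_or_build(path, build):
    """Return the cached lines from `path`, or call `build()` and cache the result."""
    path = Path(path)
    if path.exists():
        return path.read_text(encoding="utf-8").splitlines()
    items = sorted(set(build()))
    path.write_text("".join(f"{item}\n" for item in items), encoding="utf-8")
    return items


# demonstration in a temporary directory with toy names
with tempfile.TemporaryDirectory() as tmp:
    cache = Path(tmp) / "taxa.txt"
    first = load_or_build(cache, lambda: ["Lemna minor", "Alisma", "Alisma"])
    second = load_or_build(cache, lambda: [])  # served from the file, build() is not called
    print(first == second)
```

With `ALLTAXA_OUTFILE` as the path and the NCBI loop as `build`, the queries would run only when the file is missing.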

### Step 3
Next, we go through each of those species and genera and select the entries in our red lists that correspond to them.

In [36]:
# for the red list of Tirol:
with open(MPSPECIES_TIR_OUTFILE, "w") as fl:
    for tx in red_list_tir_df["Taxon"]:
        if tx in all_taxa or any(tx.startswith(f"{t} ") for t in all_taxa):
            fl.write(f"{tx}\n")


In [37]:
# for the red list of Austria:
with open(MPSPECIES_AUT_OUTFILE, "w") as fl:
    for tx in red_list_aut_df["Taxon"]:
        try:
            if tx in all_taxa or any(tx.startswith(f"{t} ") for t in all_taxa):
                # skip entries that contain an arrow
                if "→" not in tx:
                    fl.write(f"{tx}\n")
        except AttributeError:
            # non-string values (e.g. empty cells) have no startswith(); skip them
            continue
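
The matching rule used in both cells is: a red-list entry is kept if it equals an NCBI name or starts with an NCBI name followed by a space (so that suffixes such as `s. str.` or `subsp. …` still match). A compact sketch of the same rule is shown below. It relies on `str.startswith` accepting a tuple of prefixes; `make_matcher` is a hypothetical helper, and applying the arrow filter to every list is a simplification made here.

```python
def make_matcher(taxa):
    """Build a predicate that tests red-list entries against a collection of names."""
    exact = set(taxa)
    # the trailing space prevents "Lemna" from matching e.g. "Lemnaceae"
    prefixes = tuple(f"{t} " for t in exact)

    def matches(name):
        if not isinstance(name, str) or "→" in name:
            return False
        return name in exact or name.startswith(prefixes)

    return matches


matches = make_matcher(["Alisma", "Alisma plantago-aquatica", "Lemna minor"])
candidates = [
    "Alisma plantago-aquatica s. str.",
    "Lemna minor",
    "Carex elata",
    None,
    "Alisma → Alisma plantago-aquatica",
]
print([c for c in candidates if matches(c)])
```

The set lookup and the single `startswith` call avoid building a new list of booleans for every red-list entry.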

The preselection of macrophyte species in Tirol is:

In [33]:
with open(MPSPECIES_TIR_OUTFILE, "r") as file:
    print(file.read())

Alisma lanceolatum
Alisma plantago-aquatica s. str.
Arum maculatum s. str.
Calla palustris
Callitriche cophocarpa
Callitriche hamulata
Callitriche obtusangula
Callitriche palustris s. str.
Callitriche platycarpa
Ceratophyllum demersum
Elodea canadensis
Elodea nuttallii
Groenlandia densa
Hippuris vulgaris
Hydrocharis morsus-ranae
Lemna minor
Lemna trisulca
Lysichiton americanus
Menyanthes trifoliata
Myriophyllum spicatum
Myriophyllum verticillatum
Najas marina s. str.
Najas minor
Nuphar lutea
Nuphar pumila
Nymphaea alba
Nymphaea candida
Nymphoides peltata
Potamogeton alpinus
Potamogeton × angustifolius
Potamogeton berchtoldii
Potamogeton crispus
Potamogeton gramineus
Potamogeton lucens
Potamogeton natans
Potamogeton × nitens
Potamogeton nodosus
Potamogeton perfoliatus
Potamogeton praelongus
Potamogeton pusillus s. str.
Potamogeton trichoides
Sagittaria latifolia
Sagittaria sagittifolia
Sparganium angustifolium
Sparganium emersum
Sparganium erectum subsp. erectum
Sparganium erectum subsp

For Austria, it is:

In [35]:
with open(MPSPECIES_AUT_OUTFILE, "r") as file:
    print(file.read())

Alisma gramineum
Alisma lanceolatum
Alisma plantago-aquatica s.str.
Arum cylindraceum
Arum italicum
Arum maculatum s.str.
Caldesia parnassifolia
Calla palustris
Callitriche cophocarpa
Callitriche hamulata
Callitriche obtusangula
Callitriche palustris s.str.
Callitriche platycarpa
Callitriche stagnalis
Ceratophyllum demersum
Ceratophyllum submersum
Cryptocoryne balansae
Cryptocoryne wendtii
Elodea canadensis
Elodea nuttallii
Groenlandia densa
Hippuris vulgaris
Hydrocharis morsus-ranae
Lemna gibba
Lemna minor
Lemna minuta
Lemna trisulca
Lemna turionifera
Menyanthes trifoliata
Myriophyllum alterniflorum
Myriophyllum aquaticum
Myriophyllum heterophyllum
Myriophyllum spicatum
Myriophyllum verticillatum
Najas flexilis
Najas marina agg.
Najas major
Najas marina s.str.
Najas minor
Nuphar advena
Nuphar lutea
Nuphar pumila
Nymphaea alba
Nymphaea candida
Nymphoides peltata
Potamogeton acutifolius
Potamogeton alpinus
Potamogeton berchtoldii
Potamogeton coloratus
Potamogeton compressus
Potamogeton 
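
The two outputs above spell the same qualifier differently (`s. str.` in the Tirol list, `s.str.` in the Austria list), so a direct comparison of the two lists would miss some shared taxa. A small normalization step could bridge this. The sketch below is one possible approach; `normalize_taxon` is a hypothetical helper and covers only this one spelling difference.

```python
import re


def normalize_taxon(name: str) -> str:
    """Collapse whitespace and unify the 's. str.' / 's.str.' spellings."""
    name = re.sub(r"\s+", " ", name.strip())
    return re.sub(r"\bs\.\s*str\.", "s. str.", name)


# a few entries taken from the two outputs above
tirol = {"Najas marina s. str.", "Najas minor"}
austria = {"Najas marina s.str.", "Najas minor", "Najas flexilis"}
shared = {normalize_taxon(n) for n in tirol} & {normalize_taxon(n) for n in austria}
print(sorted(shared))
```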

Again, we observe that some Araceae are included. Also, we are missing the *Ranunculus* subg. *Batrachium* taxa. Hence, we will do some manual curation of the selection.

### Step 4

For the final lists, some manual curation is performed: the Araceae are removed from the lists and all members of *Ranunculus* subg. *Batrachium* are added.

In [38]:
MPSPECIES_FIN_TIR_OUTFILE = "../static/macrophytes_species_final_tirol.txt"  # manually curated output file that lists all macrophytes in the red list in tirol
MPSPECIES_FIN_AUT_OUTFILE = "../static/macrophytes_species_final_austria.txt"  # manually curated output file that lists all macrophytes in the red list in austria
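
The curation here was done by hand. If it should be repeatable, the same two steps (drop unwanted genera, add missing taxa) could be scripted roughly as follows. This is a sketch, not the procedure that was actually used; `curate` is a hypothetical helper, and the excluded genera are the Araceae that appear in the preselections above.

```python
def curate(names, excluded_genera, additions):
    """Drop entries whose genus is excluded, add extra taxa, and return a sorted list."""
    kept = [n for n in names if n.split(" ", 1)[0] not in excluded_genera]
    return sorted(set(kept) | set(additions))


# a few entries from the Tirol preselection, plus two of the added Ranunculus taxa
preselection = ["Alisma lanceolatum", "Arum maculatum s. str.", "Calla palustris", "Lemna minor"]
additions = ["Ranunculus aquatilis", "Ranunculus circinatus"]
curated = curate(preselection, {"Arum", "Calla", "Lysichiton", "Cryptocoryne"}, additions)
print("\n".join(curated))
```

The list of additions still has to be compiled manually from the red lists, since the subgenus was not resolved by the automated lookup.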

Hence, here we present the taxa of the study:


- For Tirol, we will analyze:

In [53]:
with open(MPSPECIES_FIN_TIR_OUTFILE, "r") as file:
    print(file.read())

Alisma lanceolatum
Alisma plantago-aquatica s. str.
Callitriche cophocarpa
Callitriche hamulata
Callitriche obtusangula
Callitriche palustris s. str.
Callitriche platycarpa
Ceratophyllum demersum
Elodea canadensis
Elodea nuttallii
Groenlandia densa
Hippuris vulgaris
Hydrocharis morsus-ranae
Lemna minor
Lemna trisulca
Menyanthes trifoliata
Myriophyllum spicatum
Myriophyllum verticillatum
Najas marina s. str.
Najas minor
Nuphar lutea
Nuphar pumila
Nymphaea alba
Nymphaea candida
Nymphoides peltata
Potamogeton alpinus
Potamogeton berchtoldii
Potamogeton crispus
Potamogeton gramineus
Potamogeton lucens
Potamogeton natans
Potamogeton nodosus
Potamogeton perfoliatus
Potamogeton praelongus
Potamogeton pusillus s. str.
Potamogeton trichoides
Potamogeton × angustifolius
Potamogeton × nitens
Ranunculus aquatilis
Ranunculus circinatus
Ranunculus confervoides
Ranunculus penicillatus s. str.
Ranunculus rionii
Ranunculus sceleratus
Ranunculus trichophyllus
Sagittaria latifolia
Sagittaria sagittifolia

- For Austria it is:

In [54]:
with open(MPSPECIES_FIN_AUT_OUTFILE, "r") as file:
    print(file.read())

Alisma gramineum
Alisma lanceolatum
Alisma plantago-aquatica s.str.
Caldesia parnassifolia
Callitriche cophocarpa
Callitriche hamulata
Callitriche obtusangula
Callitriche palustris s.str.
Callitriche platycarpa
Callitriche stagnalis
Ceratophyllum demersum
Ceratophyllum submersum
Elodea canadensis
Elodea nuttallii
Groenlandia densa
Hippuris vulgaris
Hydrocharis morsus-ranae
Lemna gibba
Lemna minor
Lemna minuta
Lemna trisulca
Lemna turionifera
Menyanthes trifoliata
Myriophyllum alterniflorum
Myriophyllum aquaticum
Myriophyllum heterophyllum
Myriophyllum spicatum
Myriophyllum verticillatum
Najas flexilis
Najas major
Najas marina agg.
Najas marina s.str.
Najas minor
Nuphar advena
Nuphar lutea
Nuphar pumila
Nymphaea alba
Nymphaea candida
Nymphoides peltata
Potamogeton acutifolius
Potamogeton alpinus
Potamogeton berchtoldii
Potamogeton coloratus
Potamogeton compressus
Potamogeton crispus
Potamogeton friesii
Potamogeton gramineus
Potamogeton lucens
Potamogeton natans
Potamogeton nodosus
Potam

With this selection, we can continue with some statistics in the next notebook.