<h1>Table of Contents<span class="tocSkip"></span></h1>
<div class="toc"><ul class="toc-item"><li><span><a href="#Libraries" data-toc-modified-id="Libraries-1"><span class="toc-item-num">1&nbsp;&nbsp;</span>Libraries</a></span></li><li><span><a href="#Download-species-coordinates" data-toc-modified-id="Download-species-coordinates-2"><span class="toc-item-num">2&nbsp;&nbsp;</span>Download species coordinates</a></span><ul class="toc-item"><li><span><a href="#Species-list" data-toc-modified-id="Species-list-2.1"><span class="toc-item-num">2.1&nbsp;&nbsp;</span>Species list</a></span></li><li><span><a href="#Get-the-species-codes-to-the-GBIF-API" data-toc-modified-id="Get-the-species-codes-to-the-GBIF-API-2.2"><span class="toc-item-num">2.2&nbsp;&nbsp;</span>Get the species codes to the GBIF API</a></span></li><li><span><a href="#Get-coordinates" data-toc-modified-id="Get-coordinates-2.3"><span class="toc-item-num">2.3&nbsp;&nbsp;</span>Get coordinates</a></span></li><li><span><a href="#Clean-species-dataframe" data-toc-modified-id="Clean-species-dataframe-2.4"><span class="toc-item-num">2.4&nbsp;&nbsp;</span>Clean species dataframe</a></span></li></ul></li></ul></div>

# Libraries

In [None]:
import sys
sys.path.append('../')

In [None]:
import warnings
warnings.filterwarnings("ignore")

In [None]:
import pandas as pd
from pygbif import species
from pygbif import occurrences as occ
from geopy.geocoders import Nominatim
import src.get_species as gs

# Download species coordinates 

## Species list 

In [None]:
sp_list = ["Fucus serratus", "Ascophyllum nodosum", "Pelvetia canaliculata", "Bifurcaria bifurcata",
           "Ulva lactuca", "Fucus vesiculosus", "Fucus spiralis", "Codium tomentosum", "Sargassum muticum", "Laminaria hyperborea",
           "Palmaria palmata", "Alaria esculenta", "Laminaria digitata", "Himanthalia elongata", "Halidrys siliquosa",
           "Saccharina latissima", "Undaria pinnatifida", "Codium fragile", "Grateloupia turuturu"]

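Python silently concatenates adjacent string literals when a comma is missing, so a quick sanity check on the list can be worthwhile. The sketch below assumes every entry should be a two-word binomial; `find_malformed_names` is a hypothetical helper and not part of `src.get_species`.

```python
def find_malformed_names(names):
    """Return the entries that are not exactly two whitespace-separated words."""
    return [name for name in names if len(name.split()) != 2]


# Example: the second entry is two names fused by a missing comma.
example = ["Fucus serratus", "Halidrys siliquosa" "Saccharina latissima"]
malformed = find_malformed_names(example)
print(malformed)
```

An empty result means every entry looks like a `Genus species` pair.
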
## Get the species codes to the GBIF API 

In [None]:
sp, codes = gs.get_species_name_from_codes(sp_list)

In [None]:
codes

In [None]:
sp
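
The source of `src.get_species` is not shown here, so the following is only a sketch of how a name-to-key lookup of this kind might work. `resolve_keys` is a hypothetical name, and the lookup is passed in as a callable so the example runs offline. It assumes that a matched name comes back with a `usageKey` entry, as GBIF backbone responses normally do.

```python
def resolve_keys(names, lookup):
    """Map each name to the key returned by `lookup`, skipping unmatched names.

    `lookup` takes a scientific name and returns a dict; a matched name is
    assumed to carry a "usageKey" entry.
    """
    matched, keys = [], []
    for name in names:
        response = lookup(name)
        key = response.get("usageKey")
        if key is not None:
            matched.append(name)
            keys.append(key)
    return matched, keys


# Offline stand-in for a GBIF lookup; the keys are made up for illustration.
fake_backbone = {"Fucus serratus": {"usageKey": 101}, "Ulva lactuca": {"usageKey": 202}}
names, keys = resolve_keys(
    ["Fucus serratus", "Not a species", "Ulva lactuca"],
    lambda name: fake_backbone.get(name, {"matchType": "NONE"}),
)
print(names, keys)
```

With `pygbif` installed, the callable could be a thin wrapper around `species.name_backbone`.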

## Get coordinates

In [None]:
%%time
results = gs.get_coordinates(sp)
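
GBIF occurrence searches are paginated, so a download like this one usually means looping over pages. The sketch below shows that loop under stated assumptions: `fetch_coordinates` is a hypothetical helper (not the actual `gs.get_coordinates`), and `search` is any callable that accepts `taxonKey`, `hasCoordinate`, `limit` and `offset` and returns a dict with `results` and `endOfRecords`, which is the shape `pygbif.occurrences.search` returns.

```python
def fetch_coordinates(taxon_key, search, page_size=300):
    """Collect (lat, lon) pairs for one taxon by paging through `search`."""
    coords, offset = [], 0
    while True:
        page = search(taxonKey=taxon_key, hasCoordinate=True,
                      limit=page_size, offset=offset)
        for record in page["results"]:
            coords.append((record["decimalLatitude"], record["decimalLongitude"]))
        # Stop when the service reports the last page or returns nothing.
        if page.get("endOfRecords", True) or not page["results"]:
            break
        offset += page_size
    return coords


# Offline stand-in that serves five made-up records, two per page.
RECORDS = [{"decimalLatitude": 43.0 + i, "decimalLongitude": -8.0} for i in range(5)]


def fake_search(taxonKey, hasCoordinate, limit, offset):
    chunk = RECORDS[offset:offset + limit]
    return {"results": chunk, "endOfRecords": offset + limit >= len(RECORDS)}


coords = fetch_coordinates(101, fake_search, page_size=2)
print(len(coords))
```

Passing `occ.search` as `search` would run the same loop against the live API.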

In [None]:
results.to_csv("macroalgae_initial.csv", index = False)

In [None]:
results = pd.read_csv("macroalgae_initial.csv")

## Clean species dataframe 

Drop rows with NaN values and round latitude and longitude to 4 decimal places.

In [None]:
results = gs.clean(results)
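
As a rough illustration of this step, a cleaning function of this kind might look like the sketch below. It assumes the coordinate columns are named `lat` and `lon`, as they are later in this notebook; `clean_coordinates` is a hypothetical name and may differ from what `gs.clean` actually does.

```python
import pandas as pd


def clean_coordinates(df, decimals=4):
    """Drop rows with missing values and round the coordinate columns."""
    out = df.dropna().copy()
    out[["lat", "lon"]] = out[["lat", "lon"]].round(decimals)
    return out


# Made-up records: only the first row is complete.
raw = pd.DataFrame({
    "species": ["Ulva lactuca", "Fucus spiralis", None],
    "lat": [43.123456, None, 50.0],
    "lon": [-8.987654, -5.0, 1.0],
})
cleaned = clean_coordinates(raw)
print(cleaned)
```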

Clean species names

In [None]:
results = gs.divide_species(results, "species")

Laminaria flexicaulis ==> remove

Ulva fasciata ==> Ulva lactuca

Laminaria stenophylla ==> Laminaria digitata

Rhodymenia palmata ==> Palmaria palmata

Fucus platycarpus ==> Fucus spiralis

Alaria musaefolia ==> Alaria esculenta

Fucus canaliculatus ==> Fucus vesiculosus

Laminaria intermedia ==> Laminaria digitata

Palmaria mollis ==> remove

Fucus inflatus ==> Fucus vesiculosus

Alaria grandifolia ==> Alaria esculenta

Halidrys dioica ==> remove

Alaria dolichorhachis ==> Alaria esculenta

Halidrys Lyngbye ==> remove

Fucus rotundatus ==> remove

Ascophyllum mackaii ==> remove

Fucus nodosus ==> Ascophyllum nodosum

Fucus mytili ==> Fucus vesiculosus

BOLD:AAB0883 ==> remove

Some records use outdated taxonomy. The function below replaces the old names with the currently accepted ones, following the mapping above.

In [None]:
final = gs.clean_species(results, "species")
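
A mapping like the one listed above can be applied with a dictionary. The sketch below uses only a subset of that list and marks names to drop with `None`. `SYNONYMS` and `apply_synonyms` are hypothetical names, and the real `gs.clean_species` may be implemented differently.

```python
import pandas as pd

# Subset of the mapping listed above; None marks names to drop.
SYNONYMS = {
    "Ulva fasciata": "Ulva lactuca",
    "Rhodymenia palmata": "Palmaria palmata",
    "Fucus nodosus": "Ascophyllum nodosum",
    "Laminaria flexicaulis": None,
}


def apply_synonyms(df, column, synonyms=SYNONYMS):
    """Replace outdated names and drop rows whose name maps to None."""
    to_drop = [old for old, new in synonyms.items() if new is None]
    renames = {old: new for old, new in synonyms.items() if new is not None}
    kept = df[~df[column].isin(to_drop)].copy()
    kept[column] = kept[column].replace(renames)
    return kept


records = pd.DataFrame({"species": ["Ulva fasciata", "Laminaria flexicaulis",
                                    "Rhodymenia palmata", "Fucus spiralis"]})
updated = apply_synonyms(records, "species")
print(updated["species"].tolist())
```

Keeping the mapping in one dictionary makes it easy to review against the list and to extend when new synonyms turn up.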

Look up the community (locality) for each pair of coordinates.

In [None]:
final["new"] = final["lat"].map(str) + "," + final["lon"].map(str)
final["locality"] = final.new.apply(gs.get_community)
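
For readers without access to `src.get_species`, here is a sketch of what a reverse-geocoding helper of this kind could look like. It is an assumption, not the actual `gs.get_community`. The geocoder is passed in as `reverse`, a callable that takes a `"lat,lon"` string and returns an object whose `.raw["address"]` is a dict, which is what geopy's `Nominatim.reverse` returns. The address in the offline stand-in is made up.

```python
from types import SimpleNamespace


def get_community_sketch(latlon, reverse):
    """Return (locality, state, country) for a "lat,lon" string, or "unknown"."""
    try:
        location = reverse(latlon)
    except Exception:
        # Network or service errors are treated as "no answer".
        return "unknown"
    if location is None:
        return "unknown"
    address = location.raw.get("address", {})
    locality = address.get("city") or address.get("town") or address.get("village")
    if locality is None:
        return "unknown"
    return (locality, address.get("state"), address.get("country"))


# Offline stand-in for a geocoder; returns None for points it does not know.
def fake_reverse(latlon):
    if latlon == "43.3623,-8.4115":
        return SimpleNamespace(raw={"address": {
            "city": "A Coruña", "state": "Galicia", "country": "España"}})
    return None


found = get_community_sketch("43.3623,-8.4115", fake_reverse)
missing = get_community_sketch("0.0,0.0", fake_reverse)
print(found, missing)
```

With geopy, `reverse` could be `Nominatim(user_agent="...").reverse`. The public Nominatim service asks for at most one request per second, so large dataframes take a long time to geocode.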

Remove the rows that have no locality.

In [None]:
# Copy the filtered rows so the column assignment below works on an independent dataframe
final2 = final[final["locality"] != "unknown"].copy()

In [None]:
final2[['locality', 'state', "country"]] = pd.DataFrame(final2['locality'].tolist(), index=final2.index)

In [None]:
final2.to_csv("macroalgae_final.csv")