<a href="https://colab.research.google.com/github/MathewBiddle/ioos_by_the_numbers/blob/main/IOOS_BTN.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Creating the IOOS By The Numbers

[Website](https://ioos.noaa.gov/about/ioos-by-the-numbers/)

[Spreadsheet](https://docs.google.com/spreadsheets/d/1AUfXmc3OwxpVdeMNjZyTGWjyR4ku3kRD5eexNrMORnI/edit#gid=516871794)

In [1]:
import pandas as pd

## Collect HF Radar Installations

From http://hfrnet.ucsd.edu/sitediag/stationList.php

In [None]:
url = 'http://hfrnet.ucsd.edu/sitediag/stationList.php?output=CSV'

df_hfr = pd.read_csv(url)

hfr_installations = df_hfr['Station'].unique().size

## Collect NGDAC Glider Days

From https://gliders.ioos.us/erddap/info/index.html?page=1&itemsPerPage=1000

Cumulative from 2008 to present.

In [None]:
# One row per dataset on the glider ERDDAP, with its first and last timestamps
df_glider = pd.read_csv('https://gliders.ioos.us/erddap/tabledap/allDatasets.csvp?minTime%2CmaxTime')

# Drop datasets that are missing either timestamp
df_glider = df_glider.dropna(axis=0)

time_cols = ['minTime (UTC)', 'maxTime (UTC)']
df_glider[time_cols] = df_glider[time_cols].apply(pd.to_datetime)

# Whole days per dataset; partial days are truncated by .dt.days
df_glider['glider_days'] = (df_glider['maxTime (UTC)'] - df_glider['minTime (UTC)']).dt.days

glider_days = df_glider['glider_days'].sum()
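
To make the glider-day arithmetic easier to check offline, here is a minimal sketch of the same steps on a few made-up rows. The sample timestamps and the names `sample` and `sample_glider_days` are hypothetical; only the column names come from the cell above.

```python
import pandas as pd

# Hypothetical rows standing in for the allDatasets response.
sample = pd.DataFrame({
    "minTime (UTC)": ["2020-01-01T00:00:00Z", "2020-03-10T12:00:00Z", None],
    "maxTime (UTC)": ["2020-01-11T00:00:00Z", "2020-03-12T18:00:00Z", "2020-05-01T00:00:00Z"],
})

# Rows missing either timestamp cannot contribute a duration.
sample = sample.dropna(axis=0)

time_cols = ["minTime (UTC)", "maxTime (UTC)"]
sample[time_cols] = sample[time_cols].apply(pd.to_datetime)

# .dt.days keeps whole days only: 10 days + (2 days 6 hours -> 2 days).
sample["glider_days"] = (sample["maxTime (UTC)"] - sample["minTime (UTC)"]).dt.days
sample_glider_days = int(sample["glider_days"].sum())
print(sample_glider_days)
```

Because each dataset is truncated to whole days before summing, the total can be slightly lower than the sum of the exact durations.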

## National Platforms

### CO-OPS
* https://opendap.co-ops.nos.noaa.gov/stations/index.jsp
  * as xml: https://opendap.co-ops.nos.noaa.gov/stations/stationsXML.jsp
* https://tidesandcurrents.noaa.gov/cdata/StationList?type=Current+Data&filter=active

In [5]:
import re

import requests

xml = requests.get('https://opendap.co-ops.nos.noaa.gov/stations/stationsXML.jsp').text

# Count occurrences of the literal text "station name" in the raw XML
# (assumed to appear once per station element)
count = sum(1 for _ in re.finditer(r'\b%s\b' % re.escape("station name"), xml))
print("All stations:", count)

url = 'https://tidesandcurrents.noaa.gov/cdata/StationListFormat?type=Current+Data&filter=active&format=csv'

df_coops = pd.read_csv(url)
#print(df_coops[' Project'].unique())
ports = df_coops[df_coops[' Project'] != ' Great Lakes Real-Time Currents Monitoring'].shape[0]
print("Ports: %s" % ports)

All stations: 379
Ports: 66
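
As an alternative to matching text with a regular expression, the station count could be taken from a parsed XML tree. The sketch below uses the standard-library `xml.etree.ElementTree` on a small hypothetical document. The element layout (`<stations>` containing `<station name="...">`) is an assumption based on the `station name` pattern searched for above, and the real response may use namespaces or a different structure.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the stationsXML.jsp response (structure assumed).
sample_xml = """<stations>
  <station name="Alpha" ID="0000001"><metadata/></station>
  <station name="Bravo" ID="0000002"><metadata/></station>
</stations>"""

root = ET.fromstring(sample_xml)

# Count <station> elements that carry a name attribute.
station_count = sum(1 for el in root.iter("station") if "name" in el.attrib)
print("All stations:", station_count)
```

Parsing the tree avoids counting the phrase if it ever appears in free text or comments, at the cost of depending on the exact element names.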


### NDBC
https://www.ndbc.noaa.gov/wstat.shtml

Buoys: 106 (103 base-funded); CMAN: 45

In [6]:
import requests
from bs4 import BeautifulSoup
import re
import pprint

url = 'https://www.ndbc.noaa.gov/wstat.shtml'

html = requests.get(url).text

soup = BeautifulSoup(html, 'html.parser')

string_to_find = ['Total Base Funded Buoys:','Total Other Buoys:',
                  'Total Moored Buoys:','Total Base Funded Stations:',
                  'Total Stations:']

ndbc = dict()
for string in string_to_find:
  for tag in soup.find_all("td", string=string):
    ndbc[string] = tag.next_sibling.string

pprint.pprint(ndbc)

{'Total Base Funded Buoys:': '103',
 'Total Base Funded Stations:': '45',
 'Total Moored Buoys:': '106',
 'Total Other Buoys:': '3',
 'Total Stations:': '45'}
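
The scraped values are strings, so a small conversion step helps before they are used in arithmetic. The sketch below copies the values from the output above into a literal dictionary (named `ndbc_raw` here for illustration) and checks that the buoy totals are internally consistent.

```python
# Values copied from the printed output above; keys and counts are strings as scraped.
ndbc_raw = {
    'Total Base Funded Buoys:': '103',
    'Total Base Funded Stations:': '45',
    'Total Moored Buoys:': '106',
    'Total Other Buoys:': '3',
    'Total Stations:': '45',
}

# Strip the trailing colon from each label and convert each count to an integer.
ndbc_counts = {label.rstrip(':'): int(value) for label, value in ndbc_raw.items()}

# Base-funded plus other buoys should equal the moored buoy total.
buoys_consistent = (
    ndbc_counts['Total Base Funded Buoys'] + ndbc_counts['Total Other Buoys']
    == ndbc_counts['Total Moored Buoys']
)
print(ndbc_counts, buoys_consistent)
```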


### NERRS
https://nosc.noaa.gov/OSC/OSN/index.php

NERRS SWMP, across 29 NERRS. Source: internal access only (NOAA Observing System Council).

http://cdmo.baruch.sc.edu/webservices.cfm (requires IP address approval)

Still needed: the number of stations (120 at the last count).

In [8]:

import requests
from bs4 import BeautifulSoup
import re

url = 'https://coast.noaa.gov/nerrs/about/'

html = requests.get(url).text

soup = BeautifulSoup(html, 'html.parser')

string_to_find = ['The National Estuarine Research Reserve System is a network of ']

nerrs = None  # number of reserves; stays None if the meta tag is not found
for string in string_to_find:
  for tag in soup.find_all("meta", attrs={'content': re.compile(string)}, limit=1):
    # Keep the whitespace-separated tokens that are pure digits
    res = [int(i) for i in tag['content'].split() if i.isdigit()]
    nerrs = res[0]

print("NERRS reserves:", nerrs)


NERRS reserves: 29
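
The number could also be pulled out with a single regular expression anchored on the phrase being searched for, which avoids picking up an unrelated number elsewhere in the text. This is a minimal sketch on a hypothetical description string; the real wording of the page's meta tag may differ.

```python
import re

# Hypothetical meta description text; the real page content may be worded differently.
sample_content = (
    "The National Estuarine Research Reserve System is a network of "
    "29 coastal sites designated to protect and study estuarine systems."
)

# Capture the first run of digits that follows "network of".
match = re.search(r"network of\s+(\d+)", sample_content)
reserve_count = int(match.group(1)) if match else None
print("NERRS reserves:", reserve_count)
```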


### CBIBS
https://buoybay.noaa.gov/locations

[API docs](https://buoybay.noaa.gov/node/174)

Base URL: https://mw.buoybay.noaa.gov/api/v1

Testing Key: f159959c117f473477edbdf3245cc2a4831ac61f

Latest measurements:
https://mw.buoybay.noaa.gov/api/v1/json/station?key=f159959c117f473477edbdf3245cc2a4831ac61f

In [9]:
import json

import requests

base_url = 'https://mw.buoybay.noaa.gov/api/v1'
apikey = 'f159959c117f473477edbdf3245cc2a4831ac61f'
start = '2021-12-08T01:00:00z'
end = '2021-12-09T23:59:59z'
var = 'Position'

query_url = '{}/json/query?key={}&sd={}&ed={}&var={}'.format(base_url,apikey,start,end,var)
#query_url = '{}/json/station?key={}'.format(base_url, apikey)

# Use a separate name for the parsed response so the json module is not shadowed on a re-run
response = json.loads(requests.get(query_url).text)

cbibs = len(response['stations'])

print("CBIBS Stations:", cbibs)

CBIBS Stations: 10


### OAP
https://cdip.ucsd.edu/m/stn_table/

Includes overlap with the RAs and other programs

85

In [None]:
19+67

86

### CDIP
https://cdip.ucsd.edu/m/stn_table/

Includes overlap with the RAs

## Regional Platforms

https://github.com/ioos/ioos-asset-inventory/tree/main/2020

http://erddap.ioos.us/erddap/tabledap/2020_asset_inventory.html (raw data; the processed data is what is needed)

In [None]:
url = 'https://github.com/ioos/ioos-asset-inventory/raw/main/2020/processed_inventory.csv'
df_regional_platforms = pd.read_csv(url)

regional_platforms = df_regional_platforms['station_long_name'].unique().size
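
One detail worth checking when counting platforms this way is how missing names are handled. The sketch below uses a hypothetical four-row frame to show that `unique().size` counts a missing value as its own entry, while `nunique()` ignores it by default.

```python
import pandas as pd

# Hypothetical inventory rows: one duplicate name and one missing name.
sample = pd.DataFrame({"station_long_name": ["Buoy A", "Buoy B", "Buoy A", None]})

# unique() keeps the missing value as a distinct entry.
with_missing = sample["station_long_name"].unique().size

# nunique() drops missing values by default (dropna=True).
without_missing = sample["station_long_name"].nunique()

print(with_missing, without_missing)
```

If the processed inventory contains rows without a `station_long_name`, the two approaches would differ by one.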

## ATN Deployments

See Deployments at https://portal.atn.ioos.us/#.
It is not yet clear whether that page can be scraped or whether those values are available elsewhere.

In [None]:
# from bs4 import BeautifulSoup
# import requests

# headers = {'Accept-Encoding': 'identity'}

# soup = BeautifulSoup(requests.get('https://portal.atn.ioos.us', headers=headers).text, 'html.parser')

# soup

## MBON Projects
https://marinebon.org/

https://github.com/marinebon/www_marinebon2/tree/master/content/project

