# Stage 2: prepare station metadata in the DocumentDB

#### 🚀 Targets
1. Verify the DocumentDB connection (correct URI and read/write permission).
2. Populate the DocumentDB with station metadata.

#### ⚠️ Checklist
1. Make sure the DocumentDB cluster is running and that `DOCDB_ENDPOINT_URI` in [parameters.py](../sb_catalog/src/parameters.py) is set correctly.
2. This notebook must run on an EC2 instance in the same VPC and security group as the DocumentDB cluster.
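
Before running the cells below, it can help to confirm that the cluster endpoint is reachable from this instance at all. The following is a minimal sketch using only the standard library. The helper names `docdb_host_port` and `can_reach` are hypothetical and not part of `sb_catalog`. It assumes a single-host `mongodb://` URI and falls back to port 27017 when none is given. A successful TCP connection only shows that the VPC and security group allow traffic; it does not check credentials or TLS.

```python
import socket
from urllib.parse import urlsplit


def docdb_host_port(uri: str, default_port: int = 27017) -> tuple[str, int]:
    """Extract (host, port) from a single-host mongodb:// style URI."""
    parts = urlsplit(uri)
    if parts.hostname is None:
        raise ValueError("URI has no host component")
    # Fall back to the default port when the URI does not specify one
    return parts.hostname, parts.port or default_port


def can_reach(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures
        return False
```

In this notebook, one might call `can_reach(*docdb_host_port(DOCDB_ENDPOINT_URI))` after the import cell; a `False` result usually points to a networking issue rather than a problem with the database itself.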

In [1]:
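import glob
import sys

import pandas as pd
from tqdm import tqdm

# Make the sb_catalog package importable from this notebook's directory
sys.path.append("../sb_catalog")

from src.parameters import DOCDB_ENDPOINT_URI
from src.utils import SeisBenchDatabase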

## Populate the DocumentDB

Connect to the DocumentDB and write the station metadata.

In [2]:
db = SeisBenchDatabase(DOCDB_ENDPOINT_URI, "earthscope")

In [15]:
for netfile in tqdm(sorted(glob.glob("../networks/*.zip"))):
    # pandas infers zip compression from the file extension
    stations = pd.read_csv(netfile)
    # Reduce each channel list to unique two-character codes (first-seen order)
    stations["channels"] = stations["channels"].apply(
        lambda cha: ",".join(dict.fromkeys(c[:2] for c in cha.split(",")))
    )
    # The location code is the last dot-separated field of the station id
    stations["location_code"] = stations["id"].str.split(".").str[-1]
    db.write_stations(stations)

Some duplicate entries have been skipped in collection stations
Some duplicate entries have been skipped in collection stations
Some duplicate entries have been skipped in collection stations
Some duplicate entries have been skipped in collection stations
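
The per-row transformation in the cell above can be illustrated on its own. This is a small sketch, not the notebook's actual code: the helper names `band_codes` and `location_code` are hypothetical, the input channel string is made up for illustration, and the order-preserving de-duplication via `dict.fromkeys` is an assumption chosen to make the result deterministic.

```python
def band_codes(channels: str) -> str:
    """Collapse a comma-separated channel list to unique two-character prefixes."""
    prefixes = (c[:2] for c in channels.split(","))
    # dict.fromkeys drops duplicates while keeping first-seen order
    return ",".join(dict.fromkeys(prefixes))


def location_code(station_id: str) -> str:
    """Return the last dot-separated field of an id such as 'BK.AASB.00'."""
    return station_id.split(".")[-1]


# Hypothetical channel list, used only to show the reduction
print(band_codes("BHE,BHN,BHZ,HNE,HNN,HNZ,HHE"))  # BH,HN,HH
print(location_code("BK.AASB.00"))  # 00
```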


## Check the DocumentDB

Confirm that the DB has been populated with the station metadata and that we can read from it.

In [3]:
network = "BG,BK,BP,NC,PG,WR"
db.get_stations(None, network)

| | _id | id | network_code | station_code | location_code | channels | latitude | longitude | elevation | start_date | end_date |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 67f69a231ca772a0aa21d9a1 | BK.AASB.00 | BK | AASB | 00 | BH,HN,HH | 38.430260 | -121.109750 | 67.1 | 2021.222 | 3000.001 |
| 1 | 67f69a231ca772a0aa21d9a2 | BK.AASB.S0 | BK | AASB | S0 | HN | 38.430260 | -121.109750 | 67.1 | 2021.222 | 3000.001 |
| 2 | 67f69a231ca772a0aa21d9a3 | BK.ADAM.00 | BK | ADAM | 00 | BH,HN,HH | 38.751420 | -122.334140 | 627.3 | 2022.168 | 3000.001 |
| 3 | 67f69a231ca772a0aa21d9a4 | BK.ADAM.S0 | BK | ADAM | S0 | HN | 38.751420 | -122.334140 | 627.3 | 2022.168 | 3000.001 |
| 4 | 67f69a231ca772a0aa21d9a5 | BK.ALVW.00 | BK | ALVW | 00 | BH,HN,HH | 37.049060 | -120.471430 | 41.8 | 2023.096 | 3000.001 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 2032 | 67fb432e8575352fefb17873 | WR.STNI.10 | WR | STNI | 10 | HN | 38.119688 | -121.540149 | 3.1 | 1995.047 | 3000.001 |
| 2033 | 67fb432e8575352fefb17874 | WR.THER.02 | WR | THER | 02 | HN | 39.484103 | -121.688221 | 36.7 | 1981.007 | 3000.001 |
| 2034 | 67fb432e8575352fefb17875 | WR.THER.00 | WR | THER | 00 | HN | 39.484103 | -121.688221 | 36.7 | 1981.007 | 3000.001 |
| 2035 | 67fb432e8575352fefb17876 | WR.THER.01 | WR | THER | 01 | HN | 39.484103 | -121.688221 | 36.7 | 1981.007 | 3000.001 |
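
Because the printed table is truncated, it does not show whether every requested network actually came back. A possible follow-up check is sketched below. The helper `missing_networks` is hypothetical and not part of `sb_catalog`; it only assumes that the requested networks are given as a comma-separated string, as in the cell above.

```python
from typing import Iterable


def missing_networks(requested: str, found_codes: Iterable[str]) -> list[str]:
    """List the requested network codes that do not appear in `found_codes`."""
    wanted = [n.strip() for n in requested.split(",") if n.strip()]
    found = set(found_codes)
    # Keep the order in which the networks were requested
    return [n for n in wanted if n not in found]


# Illustrative values only, not taken from the database
print(missing_networks("BG,BK,WR", ["BK", "BK", "WR"]))  # ['BG']
```

Assuming the result of `db.get_stations` is a DataFrame with a `network_code` column, as the output above suggests, the check could be run as `missing_networks(network, db.get_stations(None, network)["network_code"])`; an empty list would mean every requested network has at least one station.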
