## DNB/Hub Ocean Hackathon

In [1]:
import odp.geospatial as odp
import warnings
import geopandas as gpd
import pandas as pd
import cmocean
pd.set_option("display.max_columns", None)
warnings.filterwarnings("ignore")

In [2]:
db = odp.Database()
db_plt = odp.PlotTools()
gd = odp.GridData()

### First we will work with tabular data

#### The following shows the available datasets that can be used as dataframes
#### Alternatively check out the data catalog in the [Ocean Data Explorer Catalog](https://app.oceandata.earth/catalog)

In [3]:
db.datasets

### Pulling data from a dataset.
#### This is the general query you will use; examples follow


```
db.query(
    ds_name,
    date_from=None,
    date_to=None,
    poly=None,
    filters=[],
    limit=1000000.0,
    data_columns=['*'],
)
```


Example:
```
df = db.query('Ocean Biodiversity Information System',
        date_from='2000-01-01',
        date_to='2020-02-01',
        poly='POLYGON ((51.0 3.0, 51.3 3.61, 51.3 3.0, 51.0 3.0))',
        limit=5)
```

#### Run a query for the "Emodnet HA aquaculture - marine Finfish" dataset listed above

In [4]:
df=db.query('Emodnet HA aquaculture - marine Finfish',
        limit=5)

In [5]:
df.head(3)

In [6]:
#### We can add a filter for just Norway
filter1 = db.filter_data("COUNTRY", "=", "Norway")
df=db.query('Emodnet HA aquaculture - marine Finfish',
            filters=[filter1])

In [7]:
df.head(3)

#### Let's plot (this may take a bit of time)

In [8]:
db_plt.plot_points(df)

### We can also just query for a specific region of the country by adding a polygon to our query

In [9]:
poly = "POLYGON ((5.0 59.0, 10 59, 10 64, 5 64, 5 59))"

In [10]:
filter1 = db.filter_data("COUNTRY", "=", "Norway")
df=db.query('Emodnet HA aquaculture - marine Finfish',
            filters=[filter1],
            poly=poly)

In [11]:
db_plt.plot_points(df)
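
The `poly` argument is a WKT string, and a WKT polygon ring has to end on the same coordinate it starts on. The sketch below builds such a string from a bounding box. `bbox_to_wkt` is a hypothetical helper, not part of `odp`, and it assumes coordinates are written in longitude-latitude order, as the Norway polygon in the cell above suggests.

```python
def bbox_to_wkt(min_lon, min_lat, max_lon, max_lat):
    """Return a closed rectangular WKT POLYGON for a lon/lat bounding box.

    Hypothetical helper for illustration; assumes "lon lat" coordinate order.
    """
    if min_lon >= max_lon or min_lat >= max_lat:
        raise ValueError("min values must be smaller than max values")
    corners = [
        (min_lon, min_lat),
        (max_lon, min_lat),
        (max_lon, max_lat),
        (min_lon, max_lat),
        (min_lon, min_lat),  # repeat the first corner to close the ring
    ]
    ring = ", ".join(f"{lon} {lat}" for lon, lat in corners)
    return f"POLYGON (({ring}))"


poly = bbox_to_wkt(5, 59, 10, 64)
print(poly)  # POLYGON ((5 59, 10 59, 10 64, 5 64, 5 59))
```

The resulting string can be passed as `poly=poly` in the same way as the hand-written polygon.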

### Let's see if any of them are in marine protected areas (MPAs)
#### Pull data for MPAs in Norway

In [12]:
filter1 = db.filter_data("country", "=", "Norway")
df_mpa = db.query("ProtectedSeas MPA Dataset",
                  filters=[filter1])
df_mpa.head(5)

#### Let's visualize one

In [13]:
poly = df_mpa.iloc[0]["geometry"]
print(df_mpa.iloc[0]["site_name"])
poly

In [14]:
df_mpa.shape

### Do a spatial join using geopandas functionality, documentation [here](https://geopandas.org/en/stable/docs/user_guide/mergingdata.html)

It looks like the dataset includes Norway's entire EEZ as a site, so let's filter that one out, along with "Norway Territorial Waters", before joining.


In [15]:
df_mpa=df_mpa[df_mpa.site_name != "Norway EEZ (0-200NM)"]
df_mpa=df_mpa[df_mpa.site_name != "Norway Territorial Waters"]
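
If more site names need to be excluded, the two filters can be written as a single step with pandas' `isin`. This is a minimal sketch on a synthetic table; `df_mpa_demo` and the "Example Reserve" names are made up for illustration, and only the `site_name` column is taken from the real dataset.

```python
import pandas as pd

# Synthetic stand-in for df_mpa; only the site_name column matters here.
df_mpa_demo = pd.DataFrame(
    {
        "site_name": [
            "Norway EEZ (0-200NM)",
            "Norway Territorial Waters",
            "Example Reserve A",  # made-up name
            "Example Reserve B",  # made-up name
        ]
    }
)

# Names of the very large sites we do not want to treat as MPAs.
exclude = ["Norway EEZ (0-200NM)", "Norway Territorial Waters"]

# Keep every row whose site_name is NOT in the exclusion list.
filtered = df_mpa_demo[~df_mpa_demo["site_name"].isin(exclude)]
print(filtered["site_name"].tolist())
```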

In [16]:
dff = df.sjoin(df_mpa, how="inner", predicate='intersects')

In [17]:
dff.head()

In [18]:
## Looks like 5 of the farms are in MPAs
df.shape, dff.shape
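
One caveat when reading the shapes: an inner spatial join returns one row per farm-MPA pair, so a farm that falls inside two overlapping MPAs appears twice. The sketch below shows this on synthetic geometries (the farm and area names are made up) and counts unique farms through the index, which `sjoin` carries over from the left frame.

```python
import geopandas as gpd
from shapely.geometry import Point, box

# Synthetic "farms": A lies in one area, B in two overlapping areas, C in none.
farms = gpd.GeoDataFrame(
    {"farm": ["A", "B", "C"]},
    geometry=[Point(1, 1), Point(2.5, 2.5), Point(9, 9)],
    crs="EPSG:4326",
)

# Synthetic "protected areas": two rectangles that overlap between 2 and 3.
areas = gpd.GeoDataFrame(
    {"site_name": ["Area 1", "Area 2"]},
    geometry=[box(0, 0, 3, 3), box(2, 2, 4, 4)],
    crs="EPSG:4326",
)

joined = farms.sjoin(areas, how="inner", predicate="intersects")

n_rows = len(joined)              # one row per farm-area pair
n_farms = joined.index.nunique()  # distinct farms with at least one match
print(n_rows, n_farms)
```

Applied to the notebook's data, `dff.index.nunique()` would give the number of distinct farms, which may be lower than `dff.shape[0]` if MPAs overlap.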

### Let's play around with some gridded data

#### The following shows the available gridded datasets.
#### Alternatively check out the data catalog in the [Ocean Data Explorer search bar](https://app.oceandata.earth/explorer)

In [19]:
gd.datasets

In [20]:
## Feel free to click through the coordinates and variables (little data icon), and the information about them (little paper icon)
ds= gd.open_dataset('global-analysis-forecast-bio-001-028-monthly')
ds

#### Using built-in [xarray functionality](https://docs.xarray.dev/en/stable/user-guide/indexing.html) we can easily slice to the time and place we are interested in.

In [21]:
ds_slice = ds.sel(
            longitude=slice(0,30),
            latitude=slice(50,70),
            time=slice('2021-01-01', '2022-12-31'))

ds_slice

#### Using built-in [xarray functionality](https://docs.xarray.dev/en/stable/user-guide/plotting.html) we can easily visualize the data
#### I am using the [cmocean](https://matplotlib.org/cmocean/) library for the colormap, but you can use one of your choice

In [22]:
monthly_means = ds_slice.isel(depth=0).groupby("time.month").mean()
fg = monthly_means.o2.plot(
    col="month",
    col_wrap=4,
    cmap=cmocean.cm.oxy,
)

In [23]:
monthly_means = ds_slice.isel(depth=0).groupby("time.month").mean()
fg = monthly_means.no3.plot(
    col="month",
    col_wrap=4,
    cmap=cmocean.cm.dense,
)
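
Note that `groupby("time.month").mean()` pools the same calendar month across all years in the slice, so each of the 12 panels is an average over 2021 and 2022 rather than a single month. A minimal sketch on a synthetic series (the values are made up so the averaging is easy to check by hand):

```python
import numpy as np
import pandas as pd
import xarray as xr

# 24 monthly time steps covering 2021 and 2022.
time = pd.date_range("2021-01-01", "2022-12-01", freq="MS")

# Synthetic values: the calendar month in 2021, and month + 10 in 2022.
values = np.array(
    [t.month + (10 if t.year == 2022 else 0) for t in time], dtype=float
)
da = xr.DataArray(values, coords={"time": time}, dims="time", name="demo")

# Group by calendar month and average across the two years.
monthly_means = da.groupby("time.month").mean()

print(monthly_means.sizes["month"])       # 12 groups, not 24
print(float(monthly_means.sel(month=1)))  # (1 + 11) / 2 = 6.0
```

To look at each of the 24 months individually instead, one option is to skip the `groupby` and plot with `col="time"`.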

#### You can select a specific depth and plot the data variables for that depth over time  

In [24]:
ds_time = ds_slice.isel(depth=0)
ds_time

In [25]:
d_time = ds_time.ph.sel(longitude=5, latitude=60, method="nearest")
d_time.plot()
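
With `method="nearest"`, xarray returns the grid cell closest to the requested coordinates, which is usually not exactly the point asked for. It can be worth checking which cell was picked, for example to see whether it fell on land, where ocean model output is typically missing. A minimal sketch on a synthetic grid (the coordinates and values are made up):

```python
import numpy as np
import xarray as xr

# Small synthetic grid that does not contain (lon=5, lat=60) exactly.
lon = np.array([4.6, 4.85, 5.1, 5.35])
lat = np.array([59.6, 59.85, 60.1, 60.35])
data = np.arange(16, dtype=float).reshape(4, 4)

da = xr.DataArray(
    data,
    coords={"latitude": lat, "longitude": lon},
    dims=("latitude", "longitude"),
)

# Ask for (5, 60); xarray picks the nearest coordinate along each axis.
point = da.sel(longitude=5, latitude=60, method="nearest")

picked_lon = float(point.longitude)
picked_lat = float(point.latitude)
print(picked_lon, picked_lat, float(point))  # 5.1 60.1 10.0
```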

## Challenge

#### Pick an area with many farms from the Emodnet HA aquaculture - marine Finfish dataset or pick other industry assets from other Emodnet Human Activity datasets
You can filter for a specific Owner/Company if you like 

Using the "global-ocean-biogeochemistry-hindcast-monthly mean" dataset, plot different variables in that location<br>
What are some trends?<br>
Can you do extra research on when the farm was established?<br>
What other organisms are in that area?<br>
Are there other industries operating in that same area?<br>
Are there seasonal differences?<br>
Create whichever visualizations you think are relevant