# Using *owimetadatabase-preprocessor* to retrieve the data from Owimetadatabase

If you have not yet installed the following packages in your Python environment, or if you want to update them (mostly *owimetadatabase-preprocessor*, as it is updated frequently; add *--upgrade* to the *pip* command for that), run the cell below. Remove *%%capture* if you run into problems during installation, so that the installer output becomes visible:

In [None]:
%%capture
%pip install python-dotenv==1.0.0
%pip install owimetadatabase_preprocessor

Load necessary modules:

In [1]:
import os

from owimetadatabase_preprocessor.locations.io import LocationsAPI 
from owimetadatabase_preprocessor.geometry.io import GeometryAPI
from owimetadatabase_preprocessor.soil.io import SoilAPI
from owimetadatabase_preprocessor.fatigue.io import FatigueAPI

Set up the necessary configuration and load the environment variables.

In [None]:
import pandas as pd

pd.set_option('display.max_columns', None)

In [None]:
from dotenv import load_dotenv

load_dotenv()

For authorization, the recommended way is to store your access token locally as an environment variable, e.g. in a *.env* file next to your code with an *OWIMETADB_TOKEN=<your-token-here>* entry. Alternatively, copying it directly into the **TOKEN** variable also works, but be careful when sharing or publishing the notebook and delete the token from it first.

To load it securely from your *.env* file (the name passed to *os.getenv* must match the variable name used in that file):

In [3]:
# TOKEN = os.getenv('OWIMETADB_TOKEN')
TOKEN = os.getenv('OWIMETA_STAGING_TOKEN') 
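
Note that *os.getenv* returns *None* when the variable is not set, so a typo in the name only shows up later as an authentication failure. A minimal sketch of an early check is given below; `read_token` is a hypothetical helper for illustration and is not part of the package.

```python
import os


def read_token(var_name, environ=None):
    """Return the token stored in an environment variable or fail early."""
    # `environ` defaults to the real process environment; passing a dict
    # makes the helper easy to try out without touching os.environ.
    environ = os.environ if environ is None else environ
    token = environ.get(var_name)
    if not token:
        raise RuntimeError(
            f"Environment variable {var_name!r} is not set; "
            "check your .env file and that load_dotenv() has run."
        )
    return token


# Example with an in-memory mapping instead of the real environment.
print(read_token("OWIMETADB_TOKEN", {"OWIMETADB_TOKEN": "abc123"}))
```

In the notebook itself you would call it with only the variable name, so that the real environment is used.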

Otherwise, just copy and paste the provided token into **TOKEN**. Do not forget to delete it from your code before sharing!

In [None]:
TOKEN = "<your-token-string-goes-here>"

**TOKEN** can be passed to the API classes to authenticate when requesting data from *owimetadatabase*. You can do this directly through the *token* argument or through *header* in the format *{"Authorization": f"Token {TOKEN}"}*. You can also specify the endpoint URL yourself if needed, but the most up-to-date one is already provided by default.
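
As a small illustration of the header format quoted above, the sketch below builds that dictionary from a token string. `build_auth_header` is a hypothetical helper name, not something the package provides.

```python
def build_auth_header(token):
    """Build the header dictionary in the 'Token <value>' format."""
    if not token:
        raise ValueError("An access token is required to build the header.")
    return {"Authorization": f"Token {token}"}


# The placeholder string stands in for a real token.
header = build_auth_header("<your-token-string-goes-here>")
print(header)
```

Passing the token directly is the shorter route; building the header by hand is mainly useful if you want to inspect or reuse it.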

**Please note that you will need to change the input arguments in this notebook according to what you have access to, so it will actually provide an output.**

Additionally, let us define some helper functions for later:

In [None]:
def show_attrs(class_object: object) -> None:
    """Print the names of the instance attributes stored on the object."""
    print(f"{class_object.__class__.__name__} class attributes: {class_object.__dict__.keys()}")

def show_props(class_object: object) -> None:
    """Print the names of the properties defined directly on the object's class."""
    props = [prop for prop in vars(type(class_object)).keys() if isinstance(getattr(type(class_object), prop), property)]
    print(f"{class_object.__class__.__name__} class properties: {props}")

def show_methods(class_object: object) -> None:
    """Print the names of the public callables available on the object."""
    methods = [method for method in dir(class_object) if callable(getattr(class_object, method)) and not method.startswith('_')]
    print(f"{class_object.__class__.__name__} class methods: {methods}")

### Locations API

We start with locations, which are handled by *LocationsAPI*.

In [4]:
api_loc = LocationsAPI(token=TOKEN)

To view all the projects you have access to:

In [5]:
data_projectsites = api_loc.get_projectsites()
data_projectsites["data"]

Unnamed: 0,id,created,modified,description,slug,uuid,active,visibility,additional_data,title,area,created_by,modified_by,visibility_groups
0,31,2020-09-25T15:21:55.736832Z,2021-02-15T09:09:29.344475Z,Nobelwind is the fourth project of the Belgian...,nobelwind,629cc9e8-0c5c-4603-88bd-132240019df5,False,usergroup,,Nobelwind,,1,1,[1]
1,33,2020-09-25T15:22:25.279854Z,2021-06-22T08:10:00.574499Z,Norther is the sixth project of the Belgian No...,norther,98c511ff-bf2b-4a38-94e3-b5572ce69d08,False,usergroup,,Norther,,1,2,[7]
2,32,2020-09-25T15:22:24.956739Z,2021-08-09T14:07:37.379336Z,Northwester 2 is the seventh project of the Be...,northwester-2,be4e1f23-7b2a-40ff-92ce-3a486e4ef7e9,False,usergroup,,Northwester 2,,1,1,[9]
3,35,2020-09-25T15:42:21.395372Z,2021-10-19T17:00:47.909161Z,Belwind is the second project of the Belgian N...,belwind,feb485d5-171a-4efb-9de1-b22b68df149b,False,usergroup,,Belwind,,1,2,[10]
4,30,2020-09-25T15:21:32.404395Z,2022-09-16T15:10:59.462792Z,Rentel is the fifth project of the Belgian Nor...,rentel,c875d63a-10a2-44c6-b2c3-1843824ff4b1,False,usergroup,,Rentel,,1,2,[14]
5,64,2022-12-07T14:55:31.610560Z,2022-12-07T14:55:31.610601Z,Measuring stations for Meetnet Vlaamse Banken,meetnet-vlaamse-banken,b75376e5-380a-4378-852b-942bdebb567b,True,usergroup,,Meetnet Vlaamse Banken,"{'type': 'MultiPolygon', 'coordinates': [[[[2....",1,1,[2]
6,69,2024-10-09T09:38:13.771419Z,2024-10-09T09:57:39.097270Z,Saint Nazaire is a 480MW offshore wind farm be...,saint-nazaire,b0a4ce31-d4bd-47f4-aa94-06750aee0f13,True,usergroup,,Saint-Nazaire,,39,39,[19]


Here and below, the data is returned as a dictionary: the "exists" key indicates whether the queried data exists, and the "data" key holds the data itself in a suitable format (dataframes).
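
One way to use this pattern is to check "exists" before touching "data". The sketch below does that with plain dictionaries standing in for API responses; `unwrap` is a hypothetical helper, and the content of the stand-in dictionaries is an assumption made for illustration only.

```python
def unwrap(result, what="data"):
    """Return result["data"] if result["exists"] is truthy, else raise."""
    if not result.get("exists", False):
        raise LookupError(f"No {what} found for this query.")
    return result["data"]


# Plain dictionaries stand in for API responses here.
found = {"exists": True, "data": [{"title": "Nobelwind"}]}
missing = {"exists": False, "data": None}

print(unwrap(found, "project sites"))
try:
    unwrap(missing, "asset locations")
except LookupError as err:
    print(err)
```

Failing early like this avoids working with an empty result further down in the notebook.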

In [None]:
list(data_projectsites.keys())

To get all the location information for the specified projectsite: 

In [None]:
locs = api_loc.get_assetlocations(projectsite="Nobelwind")

To make sure this data exists:

In [None]:
locs["exists"]

To view the first three rows of the locations dataframe:

In [None]:
locs["data"].head(3)

For example, if there is no data for the specified project:

In [None]:
locs_false = api_loc.get_assetlocations(projectsite="Somename")
locs_false["exists"]

Please note that if you have access to a lot of projects/assets, it is better to narrow down your query as much as possible, e.g. by specifying a projectsite name or even turbine name(s). Otherwise, the query might return a lot of data, or the database might time out and return no output at all. You might even need to use one of the more specific methods offered by the package; see the documentation for details.
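
If you need many turbines, one generic way to keep each query small is to split the list of names into batches and query them one batch at a time. The sketch below shows only the batching step; `chunked` is a hypothetical helper and the turbine names beyond those used in this notebook are illustrative.

```python
def chunked(items, size):
    """Yield consecutive slices of `items` with at most `size` elements."""
    if size < 1:
        raise ValueError("size must be a positive integer")
    for start in range(0, len(items), size):
        yield items[start:start + size]


# Illustrative turbine names.
turbines = ["BBG01", "BBG02", "BBG03", "BBG04", "BBG05"]
for batch in chunked(turbines, 2):
    # Each batch could be passed to a single, smaller query.
    print(batch)
```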

In [None]:
data_asset = api_loc.get_assetlocation_detail(projectsite="Nobelwind", assetlocation=["BBG01"])
data_asset["data"]

Alternatively, you can request several turbines at once, and they can belong to different projects. Since the requests are currently made sequentially, timeouts are less of a concern here than they can be for geometry queries.

In [None]:
data_asset = api_loc.get_assetlocations(assetlocations=["BBG01", "NRTA1"])
data_asset["data"]

You can also plot the locations of all the turbines you have access to, e.g. for a specific project or for a list of specific turbines:

In [None]:
api_loc.plot_assetlocations(projectsite="Nobelwind")

Please refer to the documentation for the details of each method and for further capabilities. The package may still be extended with more ways of querying specific data!

### Geometry API

This more extensive part of the package lets you gather and process geometrical data for each existing turbine in the database. It works in a similar manner to locations when retrieving "raw" database information. In addition, it offers methods that preprocess the data to derive important geometry information (height, etc.), which can be used, e.g., as input to FE models.

In [None]:
api_geo = GeometryAPI(token=TOKEN)

To load the turbine processor that calculates the information for the turbine(s) (note that this might take some time for multiple turbines and can even time out occasionally; please rerun the cell in that case):

In [None]:
turbines = ["BBG01", "BBG10"]
owts = api_geo.get_owt_geometry_processor(turbines)
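
Instead of rerunning the cell by hand after a timeout, the call can be wrapped in a small retry loop. The sketch below is a generic pattern and not part of the package; `retry` and `flaky_query` are hypothetical names, and the simulated query only stands in for a slow database request.

```python
import time


def retry(func, attempts=3, delay=1.0, exceptions=(Exception,)):
    """Call `func()` until it succeeds or `attempts` calls have failed."""
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except exceptions:
            if attempt == attempts:
                raise  # Out of attempts: surface the last error.
            time.sleep(delay * attempt)  # Wait a little longer each time.


# Stand-in for a slow query: fails twice, then succeeds.
calls = {"count": 0}


def flaky_query():
    calls["count"] += 1
    if calls["count"] < 3:
        raise TimeoutError("simulated timeout")
    return "geometry loaded"


print(retry(flaky_query, attempts=3, delay=0.0))
```

In the notebook this could wrap the call as `retry(lambda: api_geo.get_owt_geometry_processor(turbines))`. Which exception type a timeout raises is not stated here, so the sketch catches `Exception` by default.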

By running the next cell you can see the information it can provide/store/calculate (mostly in dataframes or dictionaries): 

In [None]:
list(owts.__dict__.keys())

You can already access the most basic information, such as water depth, in dictionary format:

In [None]:
owts.water_depth

However, the most important method is the one that processes the turbine information into dataframes. If you query the dataframes before running the processing, they will contain no information, e.g.:

In [None]:
owts.all_turbines

Hence, you would want to run the processing explicitly:

In [None]:
owts.process_structures()

After this, you can query all kinds of dataframes, e.g. general information on all turbines:

In [None]:
owts.all_turbines

Only tower geometry (tubular structures) for all turbines:

In [None]:
owts.tower

For a specific turbine (you can specify either its name or its index in the list of turbines provided before):

In [None]:
owts.select_owt("BBG01").tower

Or even all tubular sections for all subassemblies for all turbines (convenient to filter later according to your requirement):

In [None]:
owts.all_tubular_structures

Of course, you can also query other information like RNA:

In [None]:
owts.rna

Or lumped masses, etc.

In [None]:
owts.all_lumped_mass

### Fatigue API

The fatigue part is the most recent addition to the database and to this package, and it does not yet contain as many methods or as much information as the other parts. In line with the other submodules, it provides the main FatigueAPI class with (for now) a couple of *get_\** and plot methods. Most of the output is a list of custom data objects with multiple attributes that contain or operate on fatigue data.

In [None]:
api_fatigue = FatigueAPI(token=TOKEN)

For example, to get the information on all existing SN curves:

In [None]:
sncurves = api_fatigue.get_sncurves()
sncurve = sncurves[0]
print(f"Total number of accessible SN curves in the database currently is {len(sncurves)}.")

For convenience, you can check which attributes, properties and methods each object has:

In [None]:
show_attrs(sncurve)

In [None]:
show_props(sncurve)

In [None]:
show_methods(sncurve)

To get the number of cycles corresponding to the given stress ranges (in MPa), you can call the following method:

In [None]:
n = sncurve.n([10., 100.])
n
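
For intuition about what such a lookup computes, the sketch below evaluates a generic single-slope S-N relation, log10(N) = log10(a) - m * log10(S), where S is the stress range and N the number of cycles. This is a textbook form and not necessarily what the package implements (the stored curves may have several slopes); `cycles_to_failure` and the parameter values are assumptions chosen for illustration.

```python
import math


def cycles_to_failure(stress_range_mpa, log_a, m):
    """Cycles N for a single-slope S-N curve: log10(N) = log_a - m*log10(S)."""
    if stress_range_mpa <= 0:
        raise ValueError("stress range must be positive")
    return 10 ** (log_a - m * math.log10(stress_range_mpa))


# Illustrative parameters only (not taken from the database).
for stress in (10.0, 100.0):
    print(stress, cycles_to_failure(stress, log_a=12.0, m=3.0))
```

A larger stress range yields fewer cycles, which is the trend to expect from the values returned for the two stress ranges above.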

Moreover, you can plot the SN curve.

In [None]:
data, layout = sncurve.plotly()

To get fatigue details for a specific turbine or for one of its parts/subassemblies, e.g. for a turbine:

In [None]:
fatigue_details = api_fatigue.get_fatiguedetails(title__icontains="NW2F04")
fd = fatigue_details[0]
print(f"The amount of accessible fatigue details according to the specified query parameters is {len(fatigue_details)}.")

In [None]:
fd
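
The `title__icontains` keyword looks like a Django-style field lookup (field name, double underscore, lookup type), in which `icontains` denotes a case-insensitive substring match. Assuming the database applies that convention, the sketch below mimics the matching on a local list; `icontains` is a hypothetical helper and the titles are made up for illustration.

```python
def icontains(titles, fragment):
    """Return titles that contain `fragment`, ignoring case."""
    needle = fragment.casefold()
    return [title for title in titles if needle in title.casefold()]


# Hypothetical titles for illustration only.
titles = ["NW2F04_MP_weld_1", "nw2f04_mp_weld_2", "NW2A01_TP_weld_1"]
print(icontains(titles, "NW2F04"))
```

Under that assumption, a longer fragment narrows the match, which is why adding a subassembly type to the title can leave nothing to return.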

If you want to specify a subassembly type in the title, be careful, since the amount of accessible data is quite limited at the moment. Hence, the following command is expected to raise an error.

In [None]:
fatigue_details = api_fatigue.get_fatiguedetails(title__icontains="NW2F04_TP")

Another turbine might have more data available, in which case you could specify the subassembly as well.

In [None]:
fatigue_details = api_fatigue.get_fatiguedetails(title__icontains="NW2A01")
fd = fatigue_details[0]
print(f"The amount of accessible fatigue details according to the specified query parameters is {len(fatigue_details)}.")

Furthermore, you can query different properties and even corresponding SN curves:

In [None]:
fd.sncurves

In [None]:
fd.height

You can get the requested data in a slightly different format through the method below.

In [None]:
fatigue_sa = api_fatigue.get_fatiguesubassembly(turbine="NW2A01", subassembly="MP")
fdsa = fatigue_sa
print(f"The amount of accessible subassemblies according to the specified query parameters is {len(fatigue_sa)}.")

It operates on a per-subassembly basis, whereas the previous method returns a raw list of all data matching your query.

In [None]:
fdsa['MP']

It can even provide geometrical information on the selected subassembly.

In [None]:
fdsa['MP'].subassembly

In [None]:
fdsa['MP'].height

In [None]:
fdsa['MP'].fatiguedetails

In [None]:
figg = fdsa['MP'].plotly()

Finally, you can currently plot all the existing fatigue data for a turbine in a single overview.

In [None]:
fig = api_fatigue.fatiguedetails_serializedquickview(turbines="NW2A01")

### Final remarks

The package is currently a work in progress. In case of issues (code, docs), bugs or suggestions, contact the [authors](mailto:Arsen.Melnikov@vub.be) or [file an issue on GitHub](https://github.com/OWI-Lab/owimetadatabase-preprocessor/issues).

For more details on the functionality explained here, or on additional functionality, please [visit the documentation](https://owi-lab.github.io/owimetadatabase-preprocessor/index.html).