# HydroPandas Objects

In the HydroPandas Python package, the Obs and ObsCollection classes are designed to handle time series data related to hydrological observations.

The Obs class represents a single time series of measurements at a specific location, such as groundwater levels or precipitation amounts. It is a subclass of the pandas DataFrame, enriched with additional attributes and methods for the type of observation it holds. There are specialized subclasses of Obs for different measurement types, including:

- GroundwaterObs: for groundwater measurements
- WaterQualityObs: for (ground)water quality measurements
- WaterlvlObs: for surface water level measurements
- ModelObs: for observations from a MODFLOW model
- MeteoObs: for meteorological observations
- PrecipitationObs: for precipitation observations (subclass of MeteoObs)
- EvaporationObs: for evaporation observations (subclass of MeteoObs)

Each of these subclasses is essentially a pandas DataFrame with additional methods and attributes related to the type of measurement it holds. 
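
The subclassing idea described above can be illustrated with plain pandas. The sketch below is not the HydroPandas implementation: `MiniObs` and its attribute list are hypothetical names chosen for illustration. It assumes only pandas' documented `_metadata` and `_constructor` hooks for DataFrame subclasses.

```python
import numpy as np
import pandas as pd


class MiniObs(pd.DataFrame):
    """Hypothetical Obs-like class, for illustration only."""

    # pandas treats names listed in _metadata as instance attributes, not columns
    _metadata = ["name", "x", "y", "unit"]

    def __init__(self, *args, name="", x=np.nan, y=np.nan, unit="", **kwargs):
        super().__init__(*args, **kwargs)
        self.name = name
        self.x = x
        self.y = y
        self.unit = unit

    @property
    def _constructor(self):
        # lets pandas operations return a MiniObs instead of a plain DataFrame
        return MiniObs


measurements = pd.DataFrame(
    {"value": [1.2, 1.4, 1.3]},
    index=pd.date_range("2020-01-01", periods=3, freq="D"),
)
obs = MiniObs(measurements, name="well_1", x=10, y=20, unit="m")

print(isinstance(obs, pd.DataFrame))  # the object is still a DataFrame
print(obs.unit, obs["value"].max())  # metadata and measurements side by side
```

The measurements live in the DataFrame columns, while the metadata is stored as attributes on the object, which is the same split the `Obs` examples in this notebook rely on.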

The ObsCollection class represents a collection of Obs objects, such as multiple groundwater level time series within a certain area. It is also a subclass of the pandas DataFrame, where each row contains metadata (e.g., coordinates of the observation point) and the corresponding Obs object that holds the measurements. Both Obs and ObsCollection classes include methods for reading data from various sources, facilitating the management and analysis of hydrological time series data.

![Artesia](../_static/Artesia_logo.jpg)

## <a id=top></a>Notebook contents

1. [Obs](#Obs)
2. [ObsCollection](#ObsCollection)

In [None]:
import numpy as np
import pandas as pd

import hydropandas as hpd

hpd.util.get_color_logger("INFO")

## Obs<a id=GroundwaterObs></a>

Creating an `Obs` object is very similar to creating a `DataFrame`. Below we create three different Obs objects:
- an empty Obs
- an Obs with only metadata 
- an Obs with metadata and measurements

In [None]:
# create an empty Obs object
o1 = hpd.Obs(name="my empty obs")
display(o1)

In [None]:
# create an Obs object with only metadata
o2 = hpd.Obs(
    name="my_observation",
    x=10,
    y=20,
    location="somewhere",
    filename="unknown",
    source="imagination",
    unit="m",
)
display(o2)

In [None]:
# create an Obs object with both metadata and measurements
meas_df = pd.DataFrame(
    index=pd.date_range(start="2020-01-01", periods=10, freq="D"),
    data={"value": np.random.rand(10)},
)
o3 = hpd.Obs(
    meas_df,
    name="smw",
    x=1000,
    y=22220,
    location="somewhere else",
    source="advanced imagination",
    unit="m",
)
display(o3)

#### Obs metadata

Access observation metadata as attributes.

In [None]:
print(f"x coordinate of observation 1: {o1.x}")
print(f"x coordinate of observation 2: {o2.x}")
print(f"x coordinate of observation 3: {o3.x}")

In [None]:
print(f"source of observation 1 is : {o1.source}")
print(f"location of observation 2 is : {o2.location}")
print(f"name of observation 3 is : {o3.name}")

#### Obs measurements

Access the measurements as you would the columns of a DataFrame.

In [None]:
display(o3["value"])  # show measurements

In [None]:
perc85 = o3["value"].quantile(0.85)  # get percentile
print(f"the 85th percentile of my measurements is {perc85:.2f} {o3.unit}")

In [None]:
o3["value"].plot(
    figsize=(14, 3),
    label=o3.name,
    ylabel=o3.unit,
    marker="o",
    legend=True,
    title="my observations",
);  # plot measurements

#### Obs types

Different Obs types have different metadata. Groundwater observations have the extra properties `tube_nr`, `screen_top`, `screen_bottom`, `ground_level`, `tube_top` and `metadata_available`.

In [None]:
gw_obs = hpd.GroundwaterObs(
    o3,
    name="smw_pb1",
    tube_nr=1,
    screen_top=-5,
    screen_bottom=-6,
    unit="m NAP",
    ground_level=3,
    tube_top=2.95,
    metadata_available=True,
)  # create a GroundwaterObs object from the Obs object
display(gw_obs)

#### Reading/writing Obs

Observations can be saved as a pickle file for later use.

In [None]:
# save the object to a pickle file
gw_obs.to_pickle("my_gw_obs.pklz")

In [None]:
# read the object from a pickle file
gw_obs2 = hpd.read_pickle("my_gw_obs.pklz")

In [None]:
gw_obs2.equals(gw_obs)  # check if the two objects are equal
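
The same round-trip check can be tried on a plain DataFrame without leaving files behind. This is a minimal sketch using only pandas and the standard library. The file name `frame.pkl` is a hypothetical choice, and the example says nothing about how HydroPandas stores its own metadata in the pickle.

```python
import tempfile
from pathlib import Path

import pandas as pd

frame = pd.DataFrame(
    {"value": [0.1, 0.2, 0.3]},
    index=pd.date_range("2020-01-01", periods=3, freq="D"),
)

# write to a temporary directory so the example cleans up after itself
with tempfile.TemporaryDirectory() as tmp_dir:
    path = Path(tmp_dir) / "frame.pkl"  # hypothetical file name
    frame.to_pickle(path)
    restored = pd.read_pickle(path)

# equals() compares values and index of the two DataFrames
same = restored.equals(frame)
print(same)
```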

## ObsCollection<a id=ObsCollection></a>

An ObsCollection is a structured way to manage and analyze multiple time series of hydrological observations. It serves as a container for multiple Obs objects, which represent individual time series of measurements, such as groundwater levels, precipitation, or water quality.

Each row in an ObsCollection contains metadata (e.g., location, station name) and a corresponding Obs object holding the time series data. This structure allows for easy comparison, filtering, and statistical analysis across multiple observation sites.
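
The row-per-observation layout can be sketched with a plain DataFrame that keeps each time series in an object column. This is an illustration under stated assumptions, not the HydroPandas implementation: the helper `make_series`, the station names and the column `n_obs` are all hypothetical.

```python
import numpy as np
import pandas as pd


def make_series(n, seed):
    """Build a small daily time series with random values (illustration only)."""
    rng = np.random.default_rng(seed)
    index = pd.date_range("2020-01-01", periods=n, freq="D")
    return pd.DataFrame({"value": rng.random(n)}, index=index)


# one record per observation point: metadata plus the time series itself
records = [
    {"name": "well_a", "x": 10.0, "y": 20.0, "obs": make_series(5, seed=0)},
    {"name": "well_b", "x": 1000.0, "y": 22220.0, "obs": make_series(8, seed=1)},
]
collection = pd.DataFrame(records).set_index("name")

# derive a per-row statistic from the time series stored in the "obs" column
collection["n_obs"] = collection["obs"].apply(len)

print(collection[["x", "y", "n_obs"]])
```

Because the metadata sits in ordinary columns, filtering and sorting work as on any DataFrame, while the full time series stays reachable through the `obs` column.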

In [None]:
# create an empty ObsCollection
oc = hpd.ObsCollection()
print(oc)

In [None]:
# create an ObsCollection with a single Obs object
oc = hpd.ObsCollection(o3)
oc

In [None]:
# create an ObsCollection with multiple Obs objects
oc = hpd.ObsCollection([o1, o2, o3])
oc

#### ObsCollection metadata

Access the metadata using the standard DataFrame methods.

In [None]:
print(f"the x coordinate of observation 2 is: {oc.loc['my_observation', 'x']}")
print(f"the location of observation 3 is: {oc.loc['smw', 'location']}")

#### ObsCollection observations

Access the Obs objects in the collection.

In [None]:
o3_1 = oc.loc["smw", "obs"]  # using the loc method
o3_2 = oc.get_obs("smw")  # using the get_obs method with the name
o3_3 = oc.get_obs(
    location="somewhere else"
)  # using the get_obs method with the location (only works if the location is unique)
id(o3_1) == id(o3_2) == id(o3_3)  # check if the three objects are the same

#### Slice ObsCollection

Filter and slice an ObsCollection with standard DataFrame indexing.

In [None]:
oc.loc[oc["y"] > 10]  # Selection based on the y coordinate

In [None]:
oc.loc[oc["source"].str.contains("advanced")]  # Selection based on the source
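
Several conditions can be combined into one selection with boolean masks. The sketch below uses a plain DataFrame as a stand-in for the metadata of a collection. Its values loosely mirror the three observations created earlier and are assumptions, not output copied from HydroPandas.

```python
import pandas as pd

# stand-in metadata table; the first row has no coordinates
meta = pd.DataFrame(
    {
        "x": [float("nan"), 10.0, 1000.0],
        "y": [float("nan"), 20.0, 22220.0],
        "source": ["", "imagination", "advanced imagination"],
    },
    index=pd.Index(["my empty obs", "my_observation", "smw"], name="name"),
)

in_area = meta["y"] > 10  # comparisons with NaN evaluate to False
advanced = meta["source"].str.contains("advanced", na=False)  # treat missing text as no match

# combine the masks with & (and), | (or) or ~ (not)
selection = meta.loc[in_area & advanced]
print(selection.index.tolist())
```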

#### Read/write an ObsCollection

An ObsCollection can be written to an Excel file or a pickle file. Writing to and reading from an Excel file slightly alters the properties, just as writing a DataFrame to Excel and reading it back would. Writing to and reading from a pickle file does not change anything.

In [None]:
oc.to_excel("my_obs_collection.xlsx")  # write to excel
oc.to_pickle("my_obs_collection.pklz")  # write to pickle

In [None]:
# read excel file
oc2 = hpd.read_excel("my_obs_collection.xlsx")
oc2

In [None]:
# read pickle
oc2 = hpd.read_pickle("my_obs_collection.pklz")
oc2

#### Extensions

To enhance the functionality of an ObsCollection, HydroPandas provides several extensions that add specialized methods for visualization, spatial analysis, and data processing. Some key extensions include:

- Plot Extension (ObsCollection.plot): Built-in plotting capabilities for visualizing time series data. Users can generate time series plots for individual or multiple observations, histograms, and other graphical representations to analyze trends and patterns in hydrological data.
- Geo Extension (ObsCollection.geo): Spatial analysis by integrating with geopandas. It allows users to obtain the extent of an ObsCollection, convert to another coordinate reference system and find nearby geometries.
- Groundwater Obs (ObsCollection.gwobs): Analyse and process groundwater observations. Users can find the REGIS layer of each tube and set the tube number based on the screen depth.
- Statistics (ObsCollection.stats): Statistical analysis of the observations. Users can obtain the number of consecutive years with more than 10 observations or find seasonal minimum and maximum values.
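
The namespace style used by these extensions (`oc.stats`, `oc.geo`) resembles the accessor mechanism that pandas itself offers. The sketch below registers a hypothetical `mini_stats` accessor on a plain DataFrame with `pd.api.extensions.register_dataframe_accessor`. It illustrates the general pattern and does not show how HydroPandas registers its own extensions.

```python
import pandas as pd


@pd.api.extensions.register_dataframe_accessor("mini_stats")
class MiniStatsAccessor:
    """Hypothetical accessor, available as df.mini_stats on any DataFrame."""

    def __init__(self, pandas_obj):
        self._obj = pandas_obj

    def first_last_date(self):
        """Return the earliest and latest timestamp in the index."""
        index = self._obj.index
        return index.min(), index.max()


df = pd.DataFrame(
    {"value": [1.0, 2.0, 3.0]},
    index=pd.date_range("2020-01-01", periods=3, freq="D"),
)
first, last = df.mini_stats.first_last_date()
print(first.date(), last.date())
```

Grouping related methods behind an accessor keeps the main class small and makes it clear which functionality belongs to which extension.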




In [None]:
oc.stats.get_first_last_obs_date()  # get the first and last observation date using the stats extension

In [None]:
oc.geo.get_extent()  # get the extent of the observations using the geo extension