# Parsing Hydraulic Data to HYDWS
Several tools can be used to parse hydraulic data into HYDWS. Some of them make data manipulation easier, but the `RawHydraulicsParser` alone is enough to parse arbitrary hydraulic data comfortably.

In [76]:
import pandas as pd
import io
import os
from hydws.parser.rawparser import RawHydraulicsParser
from hydws.parser.rawparser import hydws_metadata_from_configs, calculate_section_trajectories

### Reading Hydraulic Data to Pandas
Begin by reading the hydraulic data into a pandas DataFrame as you normally would. Clean or adjust the data if necessary.

In [77]:
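# Read the CSV with the first column as index; timestamps use the
# day.month.year layout, and '--' and '#VALUE!' are treated as missing values.
hydraulics = pd.read_csv('data/hydraulics.csv',
    header=0,
    index_col=0,
    date_format='%d.%m.%y %H:%M:%S',
    na_values=['--', '#VALUE!'])

# Replace missing values with 0 and shift all timestamps by six hours.
hydraulics = hydraulics.fillna(0)
hydraulics.index = hydraulics.index + pd.DateOffset(hours=6)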
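
To make the `date_format` string concrete, the following minimal sketch parses one timestamp in that layout with the standard library and applies the same six-hour shift. The sample timestamp is made up for illustration and is not taken from `data/hydraulics.csv`.

```python
from datetime import datetime, timedelta

# Hypothetical timestamp in the day.month.two-digit-year layout
raw = "03.04.19 18:30:00"

# Same format string as passed to pd.read_csv above
parsed = datetime.strptime(raw, "%d.%m.%y %H:%M:%S")

# Same shift as pd.DateOffset(hours=6) applied to the index
shifted = parsed + timedelta(hours=6)

print(parsed.isoformat())
print(shifted.isoformat())
```

Note that the shift can move a sample to the next calendar day, as it does for this late-evening timestamp.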


### Borehole and Section Metadata

Next, either copy the JSON structure of the borehole and section metadata and fill it in by hand, or use the following helper functions to generate the metadata structure from CSV files.

In [78]:
# Usually you would provide the paths to CSV files in the "borehole_csv" and "sections_csv" variables. For demonstration purposes, we use io.BytesIO to simulate the CSV files.

borehole_csv = io.BytesIO(b"""
name,publicid,location,institution,bedrockaltitude,measureddepth,description,x,y,z
58-32,41980d0b-efb0-4f2d-9bac-23da43a25e87,FORGE Utah,FORGE,,2290,,-112.887001,38.500535,0
""")
sections_csv = io.BytesIO(b"""
name,publicid,borehole_name,topmeasureddepth,bottommeasureddepth,topclosed,bottomclosed,holediameter,casingdiameter,description
58-32/section_01,241f1d78-140d-469c-8106-42c09876a0ae,58-32,2245.766,2290.572,TRUE,TRUE,,,
58-32/section_02,a082e586-d299-4e96-a056-8469c9dfeb28,58-32,2116.226,2119.274,TRUE,TRUE,,,
58-32/section_03,2c7e8c33-5d11-40c6-ac9e-87d10ede752a,58-32,1994.611,1997.659,TRUE,TRUE,,,
""")

In [79]:
metadata = hydws_metadata_from_configs('58-32', borehole_csv, sections_csv)
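
The metadata printed further below has a recognizable shape: numeric fields are wrapped as `{'value': ...}`, `TRUE` becomes a boolean, and empty CSV cells are left out. The sketch below reproduces that shape for a single section row using only the standard library. It is an illustration, not the implementation of `hydws_metadata_from_configs`; the helper `row_to_section` and the `NUMERIC` and `BOOLEAN` sets are hypothetical.

```python
import csv
import io

sections_text = """name,publicid,borehole_name,topmeasureddepth,bottommeasureddepth,topclosed,bottomclosed,holediameter,casingdiameter,description
58-32/section_02,a082e586-d299-4e96-a056-8469c9dfeb28,58-32,2116.226,2119.274,TRUE,TRUE,,,
"""

# Assumed column classification, chosen to match the printed metadata
NUMERIC = {"topmeasureddepth", "bottommeasureddepth",
           "holediameter", "casingdiameter"}
BOOLEAN = {"topclosed", "bottomclosed"}


def row_to_section(row):
    """Hypothetical helper: convert one CSV row into a nested section dict."""
    section = {}
    for key, text in row.items():
        if text == "" or key == "borehole_name":
            continue  # skip empty cells and the column linking to the borehole
        if key in NUMERIC:
            section[key] = {"value": float(text)}
        elif key in BOOLEAN:
            section[key] = text.upper() == "TRUE"
        else:
            section[key] = text
    return section


rows = list(csv.DictReader(io.StringIO(sections_text)))
section = row_to_section(rows[0])
print(section)
```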

If you don't have the section top and bottom coordinates, you can calculate them with the following function by providing a "trajectory" of the borehole:

In [80]:
# Normally the trajectory would also be stored in a CSV file, as above. It can contain arbitrarily many points; for demonstration purposes we use a simple two-point trajectory.
trajectory_csv = io.BytesIO(b"""
,depth,x,y,z
0,0,-112.887001,38.500535,0
1,-3000,-112.887001,38.500535,-3000
""")

In [81]:
metadata = calculate_section_trajectories(metadata,
                                          trajectory_csv,
                                          [0, 0, 0],
                                          'EPSG:4326')
metadata

{'name': '58-32',
 'publicid': '41980d0b-efb0-4f2d-9bac-23da43a25e87',
 'location': 'FORGE Utah',
 'institution': 'FORGE',
 'measureddepth': {'value': 2290},
 'longitude': {'value': -112.887001},
 'latitude': {'value': 38.500535},
 'altitude': {'value': 0.0},
 'sections': [{'name': '58-32/section_01',
   'publicid': '241f1d78-140d-469c-8106-42c09876a0ae',
   'topmeasureddepth': {'value': 2245.766},
   'bottommeasureddepth': {'value': 2290.572},
   'topclosed': True,
   'bottomclosed': True,
   'toplongitude': {'value': -112.887001},
   'toplatitude': {'value': 38.500535},
   'topaltitude': {'value': 2245.766},
   'bottomlongitude': {'value': -112.887001},
   'bottomlatitude': {'value': 38.500535},
   'bottomaltitude': {'value': 2290.572}},
  {'name': '58-32/section_02',
   'publicid': 'a082e586-d299-4e96-a056-8469c9dfeb28',
   'topmeasureddepth': {'value': 2116.226},
   'bottommeasureddepth': {'value': 2119.274},
   'topclosed': True,
   'bottomclosed': True,
   'toplongitude': {'value

### Parse the Raw Data into HYDWS format

A simple config file maps the DataFrame columns to sections and field names. With it, you can parse the data into the HYDWS format.

In [82]:
# As before, you would usually provide the path to a config file. Its content is JSON, despite the variable name used here. For demonstration purposes, we use a temporary file to simulate it.

from tempfile import NamedTemporaryFile
config_csv = NamedTemporaryFile(mode='w+', delete=False)

config_csv.write("""
[
    {
        "columnNames": [
            "flow(m3/s)"
        ],
        "fieldName": "topflow",
        "assignTo": "sectionID",
        "section": "58-32/section_02"
    },
    {
        "columnNames": [
            "SPP(MPa)"
        ],
        "fieldName": "toppressure",
        "unitConversion": [
            "mul",
            1000000
        ],
        "assignTo": "sectionID",
        "section": "58-32/section_02"
    }
]
""")

config_csv.close()
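
To show how such a configuration could be read, here is a minimal sketch that applies one entry to a single row of data. It only mirrors the keys used above (`columnNames`, `fieldName`, `unitConversion`, `section`). The function `apply_entry`, the `OPERATIONS` table, the `{'value': ...}` wrapping of the sample, and the sample values are assumptions for illustration; `RawHydraulicsParser` may handle these keys differently.

```python
import json
import operator

config = json.loads("""
[
    {
        "columnNames": ["SPP(MPa)"],
        "fieldName": "toppressure",
        "unitConversion": ["mul", 1000000],
        "assignTo": "sectionID",
        "section": "58-32/section_02"
    }
]
""")

# Only the operation that appears in the config above is mapped here
OPERATIONS = {"mul": operator.mul}


def apply_entry(entry, row):
    """Hypothetical helper: map one row to (section name, field dict)."""
    value = row[entry["columnNames"][0]]
    if "unitConversion" in entry:
        name, operand = entry["unitConversion"]
        value = OPERATIONS[name](value, operand)
    return entry["section"], {entry["fieldName"]: {"value": value}}


# Made-up row with the column names used in the config
row = {"flow(m3/s)": 0.01, "SPP(MPa)": 25.0}
section_name, sample = apply_entry(config[0], row)
print(section_name, sample)
```

In this sketch, the `["mul", 1000000]` conversion turns a pressure of 25 MPa into 25 000 000 Pa before it is stored under `toppressure`.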

In [83]:
# read the timeseries according to configuration and metadata
parser = RawHydraulicsParser(config_path=config_csv.name,
                             boreholes_metadata=[metadata])

hydraulics = parser.parse(hydraulics, format='json')

os.unlink(config_csv.name)  # remove the temporary file again (only needed for this demo)

We converted the hydraulics to the HYDWS JSON format using the `RawHydraulicsParser` class. You can also get the result as a HYDWSParser, which is documented in the other notebook in this folder. The config file accepts additional parameters for more complex data parsing; these will be documented in the future.
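
The result is a list of borehole dictionaries, each carrying its sections and their `hydraulics` lists, as the output below shows. The following sketch looks up a section by name in a trimmed stand-in for that structure. The helper `find_section` is hypothetical, and the stand-in keeps only a few keys with empty sample lists.

```python
import json

# Trimmed stand-in for the parser output; only a few keys are kept
result = [{
    "name": "58-32",
    "sections": [
        {"name": "58-32/section_01", "hydraulics": []},
        {"name": "58-32/section_02", "hydraulics": []},
    ],
}]


def find_section(boreholes, section_name):
    """Hypothetical helper: return the section dict with the given name."""
    for borehole in boreholes:
        for section in borehole.get("sections", []):
            if section["name"] == section_name:
                return section
    return None


section = find_section(result, "58-32/section_02")
print(section["name"], len(section["hydraulics"]))

# This stand-in is plain Python data, so it serializes directly to JSON text
text = json.dumps(result, indent=2)
```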

In [84]:
hydraulics

[{'name': '58-32',
  'publicid': '41980d0b-efb0-4f2d-9bac-23da43a25e87',
  'location': 'FORGE Utah',
  'institution': 'FORGE',
  'measureddepth': {'value': 2290},
  'longitude': {'value': -112.887001},
  'latitude': {'value': 38.500535},
  'altitude': {'value': 0.0},
  'sections': [{'name': '58-32/section_01',
    'publicid': '241f1d78-140d-469c-8106-42c09876a0ae',
    'topmeasureddepth': {'value': 2245.766},
    'bottommeasureddepth': {'value': 2290.572},
    'topclosed': True,
    'bottomclosed': True,
    'toplongitude': {'value': -112.887001},
    'toplatitude': {'value': 38.500535},
    'topaltitude': {'value': 2245.766},
    'bottomlongitude': {'value': -112.887001},
    'bottomlatitude': {'value': 38.500535},
    'bottomaltitude': {'value': 2290.572},
    'hydraulics': []},
   {'name': '58-32/section_02',
    'publicid': 'a082e586-d299-4e96-a056-8469c9dfeb28',
    'topmeasureddepth': {'value': 2116.226},
    'bottommeasureddepth': {'value': 2119.274},
    'topclosed': True,
    