# 1 - How to get water levels for a few wells

Import the main package.

In [1]:
import dew_gwdata

Get a connection to SA Geodata.

In [2]:
db = dew_gwdata.sageodata()

In [3]:
db

<dew_gwdata._sageodata.SAGeodataConnection to gwquery@pirsapd07.pirsa.sa.gov.au:1521/DMEP.World>

Query for some specific wells by unit number and obs. number. Each of these is a separate well, but it doesn't matter if there are duplicates.

In [4]:
wells = db.find_wells("lkw040 lkw64 5928-201,602802321")

In [5]:
wells

['5928-201', 'LKW040', '6028-2321', 'LKW064']

This is an ``sa_gwdata.Wells`` object, which closely mimics a Python list. 
It contains ``sa_gwdata.Well`` objects - see [more information here](https://python-sa-gwdata.readthedocs.io/en/latest/python.html#sa_gwdata.Well).
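The query string above mixes lower-case obs. numbers, an unpadded obs. number, a hyphenated unit number and a long-form unit number. As a rough illustration of how such input could be tidied up, here is a minimal sketch. It is not the code used by ``find_wells``; the function names are hypothetical, and the formats (three letters plus a three-digit number for obs. numbers, a four-digit map sheet plus a five-digit number for long unit numbers) are assumptions inferred from the examples in this notebook.

```python
import re


def normalise_identifier(token):
    """Return a canonical identifier for one typed token (illustrative only)."""
    token = token.strip()
    # Assumed obs. number format: three letters, then a number padded to 3 digits.
    m = re.fullmatch(r"([A-Za-z]{3})(\d+)", token)
    if m:
        return f"{m.group(1).upper()}{int(m.group(2)):03d}"
    # Hyphenated unit number: map sheet, hyphen, drillhole sequence number.
    m = re.fullmatch(r"(\d{4})-(\d+)", token)
    if m:
        return f"{m.group(1)}-{int(m.group(2))}"
    # Assumed long unit number format: 4-digit map sheet + 5-digit padded number.
    m = re.fullmatch(r"(\d{4})(\d{5})", token)
    if m:
        return f"{m.group(1)}-{int(m.group(2))}"
    raise ValueError(f"unrecognised well identifier: {token!r}")


def normalise_identifiers(text):
    """Split on commas/whitespace, normalise each token, and drop duplicates."""
    result = []
    for token in re.split(r"[,\s]+", text):
        if not token:
            continue
        ident = normalise_identifier(token)
        if ident not in result:
            result.append(ident)
    return result


ids = normalise_identifiers("lkw040 lkw64 5928-201,602802321 LKW040")
print(ids)
```

This sketch keeps the identifiers in the order they were typed, whereas the ``Wells`` object shown above lists them in a different order.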

Now let's get a summary of details about these wells using one of ``dew_gwdata``'s predefined SA Geodata queries.

In [6]:
df = db.drillhole_details(wells)
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 4 entries, 0 to 3
Data columns (total 12 columns):
well_id        4 non-null object
dh_no          4 non-null int64
unit_long      4 non-null int64
unit_hyphen    4 non-null object
obs_no         2 non-null object
dh_name        2 non-null object
easting        4 non-null float64
northing       4 non-null float64
zone           4 non-null int64
latitude       4 non-null float64
longitude      4 non-null float64
aquifer        2 non-null object
dtypes: float64(4), int64(3), object(5)
memory usage: 344.0+ bytes


Let's only look at the columns which contain well identifiers.

In [7]:
df

| | well_id | dh_no | unit_long | unit_hyphen | obs_no | dh_name | easting | northing | zone | latitude | longitude | aquifer |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 5928-201 | 7205 | 592800201 | 5928-201 | | | 545780.36 | 6168904.38 | 53 | -34.620707 | 135.499398 | |
| 1 | LKW040 | 7310 | 592800306 | 5928-306 | LKW040 | CB TWS INVESTIGATION 3 | 544296.75 | 6167305.54 | 53 | -34.63519 | 135.483298 | Tbw |
| 2 | 6028-2321 | 198986 | 602802321 | 6028-2321 | | | 546676.68 | 6167528.47 | 53 | -34.633074 | 135.509251 | |
| 3 | LKW064 | 283736 | 592800459 | 5928-459 | LKW064 | COFFIN BAY 2 | 545648.71 | 6168123.45 | 53 | -34.627755 | 135.498004 | Qpcb |


Note the column **well_id**. This will contain the obs_no if it exists, otherwise it'll contain the unit number in hyphenated form.
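The rule for **well_id** can be reproduced in pandas with a single ``fillna`` call. This is only a sketch of the rule as described here, using the identifiers from the table above; how the column is actually built (most likely inside the SQL query) is not shown in this notebook.

```python
import pandas as pd

# Identifier values copied from the drillhole_details output above.
ids = pd.DataFrame(
    {
        "unit_hyphen": ["5928-201", "5928-306", "6028-2321", "5928-459"],
        "obs_no": [None, "LKW040", None, "LKW064"],
    }
)

# Use obs_no where it exists, otherwise fall back to the hyphenated unit number.
ids["well_id"] = ids["obs_no"].fillna(ids["unit_hyphen"])
print(ids)
```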

These columns should be present in the same order and format in all the predefined queries.
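If you rely on that convention in your own code, a small check can catch a query that does not follow it. The sketch below assumes "these columns" means the twelve columns listed by ``df.info()`` above; ``ID_COLUMNS`` and ``starts_with_id_columns`` are hypothetical names, not part of ``dew_gwdata``.

```python
import pandas as pd

# The twelve columns listed by df.info() above, in the same order.
ID_COLUMNS = [
    "well_id", "dh_no", "unit_long", "unit_hyphen", "obs_no", "dh_name",
    "easting", "northing", "zone", "latitude", "longitude", "aquifer",
]


def starts_with_id_columns(frame):
    """True if the frame's leading columns match ID_COLUMNS exactly."""
    return list(frame.columns[: len(ID_COLUMNS)]) == ID_COLUMNS


# Empty stand-in frames; a real check would use a query result instead.
good = pd.DataFrame(columns=ID_COLUMNS + ["obs_date", "swl"])
bad = pd.DataFrame(columns=["dh_no", "well_id"] + ID_COLUMNS[2:])
print(starts_with_id_columns(good), starts_with_id_columns(bad))
```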

Now let's get water levels!

You can query SA Geodata directly, or use a handy list of predefined queries, similar to Lloyd's Access database. These queries are not very well documented - you can see the SQL for them [here](http://envtelem04:3000/groundwater/dew_gwdata/src/branch/master/dew_gwdata/sageodata_queries). Each is basically a method on the ``db`` object.
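To give a feel for the "each query is a method" idea, here is a minimal sketch of one way named SQL queries could be exposed as methods. It is not how ``dew_gwdata`` is implemented: ``QUERIES``, ``QueryConnection`` and the ``wl`` table are hypothetical, and an in-memory SQLite database stands in for SA Geodata.

```python
import sqlite3

# Hypothetical stand-in for a folder of .sql files: query name -> SQL text.
QUERIES = {
    "water_levels": (
        "SELECT well_id, obs_date, swl FROM wl "
        "WHERE well_id IN ({wells}) ORDER BY obs_date"
    ),
}


class QueryConnection:
    """Expose each named query as a method taking a list of well ids."""

    def __init__(self, conn, queries):
        self._conn = conn
        self._queries = queries

    def __getattr__(self, name):
        # Only reached when normal attribute lookup fails.
        try:
            sql = self._queries[name]
        except KeyError:
            raise AttributeError(name) from None

        def run(wells):
            # One "?" placeholder per well id keeps the query parameterised.
            placeholders = ", ".join("?" for _ in wells)
            cursor = self._conn.execute(sql.format(wells=placeholders), list(wells))
            return cursor.fetchall()

        return run


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE wl (well_id TEXT, obs_date TEXT, swl REAL)")
conn.executemany(
    "INSERT INTO wl VALUES (?, ?, ?)",
    [
        ("LKW040", "1986-11-07", 3.06),
        ("LKW040", "1986-12-02", 3.16),
        ("5928-201", "1950-11-01", 2.44),
    ],
)

demo_db = QueryConnection(conn, QUERIES)
rows = demo_db.water_levels(["LKW040"])
print(rows)
```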

The query for water level data is called "water_levels". So:

In [8]:
swls = db.water_levels(wells)

The method ``db.water_levels(...)`` returns a DataFrame whose columns are defined by the [predefined query for it](http://envtelem04:3000/kinverarity/wsamdata/src/branch/master/wsamdata/sageodata/queries/water_levels.sql):

In [9]:
swls.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 322 entries, 0 to 321
Data columns (total 28 columns):
well_id            322 non-null object
dh_no              322 non-null int64
unit_long          322 non-null int64
unit_hyphen        322 non-null object
obs_no             320 non-null object
dh_name            320 non-null object
easting            322 non-null float64
northing           322 non-null float64
zone               322 non-null int64
latitude           322 non-null float64
longitude          322 non-null float64
aquifer            320 non-null object
obs_date           322 non-null datetime64[ns]
swl                321 non-null float64
dtw                322 non-null float64
rswl               321 non-null float64
pressure           0 non-null object
temperature        1 non-null float64
dry_ind            0 non-null object
anomalous_ind      322 non-null object
pumping_ind        322 non-null object
measured_during    322 non-null object
data_source        322 non-null object
...

In [11]:
swls[
    [
        "well_id",
        "obs_date",
        "swl",
        "dtw",
        "rswl",
        "dry_ind",
        "measured_during",
        "data_source",
    ]
].head()

| | well_id | obs_date | swl | dtw | rswl | dry_ind | measured_during | data_source |
|---|---|---|---|---|---|---|---|---|
| 0 | 5928-201 | 1950-11-01 | 2.44 | 2.44 | 1.87 | | U | DEWNR |
| 1 | LKW040 | 1985-04-02 | | 2.64 | | | D | DEWNR |
| 2 | LKW040 | 1986-11-07 | 3.06 | 3.37 | 1.28 | | M | DEWNR |
| 3 | LKW040 | 1986-12-02 | 3.16 | 3.47 | 1.18 | | M | DEWNR |
| 4 | LKW040 | 1987-01-02 | 3.18 | 3.49 | 1.16 | | M | DEWNR |
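The three level columns are related. In the LKW040 rows above, ``dtw - swl`` and ``swl + rswl`` both stay constant. One reading, which is an assumption and not something stated in this notebook, is that ``dtw`` is measured from a reference point on the well, ``swl`` from ground level, and ``rswl`` is an elevation. The sketch below checks the arithmetic on those three rows only.

```python
import pandas as pd

# The three LKW040 rows from the table above that have all three values.
sample = pd.DataFrame(
    {
        "obs_date": pd.to_datetime(["1986-11-07", "1986-12-02", "1987-01-02"]),
        "swl": [3.06, 3.16, 3.18],
        "dtw": [3.37, 3.47, 3.49],
        "rswl": [1.28, 1.18, 1.16],
    }
)

# Round to 2 decimals to hide floating-point noise in the subtraction/addition.
sample["dtw_minus_swl"] = (sample["dtw"] - sample["swl"]).round(2)
sample["swl_plus_rswl"] = (sample["swl"] + sample["rswl"]).round(2)
print(sample[["obs_date", "dtw_minus_swl", "swl_plus_rswl"]])
```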


How many water level measurements are there for each well? Here we count the unique observation dates per well:

In [13]:
swls.groupby("well_id").obs_date.nunique()

well_id
5928-201       1
6028-2321      1
LKW040       313
LKW064         6
Name: obs_date, dtype: int64
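Note that ``nunique()`` counts distinct dates, not rows. The counts above add up to 321 while ``swls`` has 322 rows, which suggests one date is repeated within a well. The sketch below uses made-up data (wells "A" and "B" are hypothetical) to show how ``size`` and ``nunique`` differ when a date is repeated.

```python
import pandas as pd

# Made-up observations: well "A" has two rows on the same date.
obs = pd.DataFrame(
    {
        "well_id": ["A", "A", "A", "B"],
        "obs_date": pd.to_datetime(
            ["2019-01-01", "2019-01-01", "2019-02-01", "2019-01-01"]
        ),
    }
)

# "size" counts rows per well; "nunique" counts distinct dates per well.
counts = obs.groupby("well_id")["obs_date"].agg(["size", "nunique"])
print(counts)
```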

How about the first and last?

In [14]:
swls.groupby("well_id").obs_date.agg(["min", "max"])

| well_id | min | max |
|---|---|---|
| 5928-201 | 1950-11-01 | 1950-11-01 00:00:00 |
| 6028-2321 | 2004-02-27 | 2004-02-27 00:00:00 |
| LKW040 | 1985-04-02 | 2019-07-11 00:00:00 |
| LKW064 | 2014-11-30 | 2019-04-03 12:30:01 |
