# Physical Geography practical 1: Analysis of flood data (and your first look at python)

Written by Simon M Mudd, last update 19-Jan-2024

## Overview

In this practical we will work with data from the National River Flow Archive (NRFA), which can be found at: https://nrfa.ceh.ac.uk/

The practical lets you perform a flood frequency analysis and then look at the results in plots.

It will also give you a basic overview of python. We will use this again later in the course.

## Your very first steps in python

You are reading this inside a python notebook.

It is made up of "cells". Some of these cells are text, which are just there for you to read. But others are code (that is, a little bit of software) that you can run.

The code is all written in python, which is a popular programming language.

If you want to run the code in the cells you can do that in a few ways.
1. You can click on the cell and then press shift+enter
2. You can click on the cell and then click the little symbol that looks like a "play" arrow.
3. You can go into the top menu and where it says "runtime" you can choose to run *all* the cells.

Let's practice running our first python code:

In [None]:
x = 2
y = "I am a string"
z = 2.3

Wait, it didn't do anything? Well, actually it did: it assigned the number `2` to the variable `x`. Then it defined the variable `y` as a series of characters, which in programming we call a string. And `z` is a decimal number. If you want to see the values of your variables you need to `print` them, like this:

In [None]:
print(x)
print(y)
print(z)

2
I am a string
2.3


You can combine these in print commands, but python assigns a `type` to each variable: if you give it a whole number it defines that as an integer (shortened to `int`), if the number has a decimal point it becomes a floating point number (shortened to `float`), and if it is a bunch of characters it will be a string (shortened to `str`).


In [None]:
print("I can't join numbers and strings with +, so this is how you print a number")
print("I am a number "+str(x)+" and "+y)
print(type(z))

I can't join numbers and strings with +, so this is how you print a number
I am a number 2 and I am a string
<class 'float'>


In [None]:
print( type(y))

<class 'str'>
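
The `str(x)` call above is needed because `+` cannot join a string and a number directly. As a small aside, here are two other common ways to print text and numbers together. The variables simply repeat the ones defined earlier so that this cell runs on its own.

```python
# Repeat the earlier variables so this cell is self-contained
x = 2
y = "I am a string"

# Option 1: give print several items separated by commas (it adds the spaces)
print("I am a number", x, "and", y)

# Option 2: an f-string, where anything inside {} is converted to text for you
line_fstring = f"I am a number {x} and {y}"
print(line_fstring)
```

Both lines print the same sentence as the `str(x)` version, so which one you use is mostly a matter of taste.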


### This isn't a programming course

**You have just learned how to run cells. For the rest of this practical I am just going to get you to run cells. You might have to change a number in a cell, but I will tell you when to do that.**

## Grab some data

In [12]:
#!/usr/bin/env python3

# -*- coding: utf-8 -*-

import urllib.request
import json
import pandas as pd

# The base URL to access the NRFA API
BASE_URL = "https://nrfaapps.ceh.ac.uk/nrfa/ws"

VALID_DATA_TYPES = [
    'gdf', 'ndf', 'gmf', 'nmf', 'cdr', 'cdr-d', 'cmr',
    'pot-stage', 'pot-flow', 'gauging-stage', 'gauging-flow',
    'amax-stage', 'amax-flow'
]

def catalogue():
    """Return a DataFrame with one row of metadata per NRFA station."""
    query = "station=*&format=json-object&fields=all"
    stations_info_url = "{BASE}/station-info?{QUERY}".format(
        BASE=BASE_URL, QUERY=query
    )

    # Send request and read response
    response = urllib.request.urlopen(stations_info_url).read()

    # Decode from JSON to Python dictionary
    response = json.loads(response)
    df = pd.DataFrame(response['data'])
    return df


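def _build_ts(response):
    """Convert a decoded time-series response into a two-column DataFrame."""
    variable = response['data-type']['id']
    # The data stream alternates date, value, date, value, ...
    dates = response['data-stream'][0::2]
    values = response['data-stream'][1::2]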
    df = pd.DataFrame.from_dict({'time': dates, variable: values})
    return df


def get_ts(id, data_type):
    """Download one time series (for example 'gdf' or 'amax-flow') for a station id."""
    query = "station=" + str(id) + "&data-type=" + data_type + "&format=json-object"
    stations_info_url = "{BASE}/time-series?{QUERY}".format(
        BASE=BASE_URL, QUERY=query
    )

    # Send request and read response
    response = urllib.request.urlopen(stations_info_url).read()

    # Decode from JSON to Python dictionary
    response = json.loads(response)

    df = _build_ts(response)
    return df
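
To see what `_build_ts` is doing with `[0::2]` and `[1::2]`, here is a minimal sketch on a hand-made response. The dictionary below is an assumption that mimics only the two keys the function reads (`data-type` and `data-stream`); a real NRFA response may carry other fields. It uses plain lists rather than pandas so the slicing is easy to see.

```python
# Hand-made stand-in for a decoded JSON response (illustrative, not real API output)
fake_response = {
    "data-type": {"id": "amax-flow"},
    "data-stream": [
        "1960-08-26T00:00:00", 165.417,
        "1960-11-02T00:45:00", 145.412,
        "1961-10-20T12:30:00", 113.073,
    ],
}

stream = fake_response["data-stream"]
dates = stream[0::2]   # items 0, 2, 4, ... are the dates
values = stream[1::2]  # items 1, 3, 5, ... are the flows

# Pair each date with its value again, one row per line
for date, value in zip(dates, values):
    print(date, value)
```

The slice `[0::2]` means "start at item 0 and take every second item", which is how one flat list becomes two columns.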

In [2]:


# Example use
md = catalogue()
#ts = get_ts(9001, "gdf")
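
Calling `catalogue()` or `get_ts()` boils down to requesting one web address. As a sketch of how that address is put together, the cell below rebuilds it with `urllib.parse.urlencode` from the standard library instead of a hand-written string. The helper name `build_url` is made up for this illustration, and nothing is downloaded.

```python
from urllib.parse import urlencode

BASE_URL = "https://nrfaapps.ceh.ac.uk/nrfa/ws"


def build_url(endpoint, params):
    """Join the base URL, an endpoint and its query parameters (hypothetical helper)."""
    # safe="*" keeps the wildcard in station=* as it is instead of encoding it
    return BASE_URL + "/" + endpoint + "?" + urlencode(params, safe="*")


# The address that lists every station
catalogue_url = build_url(
    "station-info",
    {"station": "*", "format": "json-object", "fields": "all"},
)

# The address for one time series at one station
ts_url = build_url(
    "time-series",
    {"station": 9001, "data-type": "gdf", "format": "json-object"},
)

print(catalogue_url)
print(ts_url)
```

You can paste either printed address into a browser to look at the raw JSON that the functions decode.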

In [4]:
print(md)

          id                                name  catchment-area  \
0       1001                     Wick at Tarroul           161.9   
1       2001              Helmsdale at Kilphedir           551.4   
2       2002                Brora at Bruachrobie           434.4   
3       3001                       Shin at Lairg           494.6   
4       3002                Carron at Sgodachail           241.1   
...      ...                                 ...             ...   
1595  206002          Jerretspass at Jerretspass           107.8   
1596  206006                Annalong at Recorder            13.8   
1597  236005  Colebrooke at Ballindarragh Bridge           313.6   
1598  236007        Sillees at Drumrainey Bridge           166.3   
1599  236051      Ballinamallard at Ballycassidy           159.4   

                                         grid-reference   easting  northing  \
0     {'ngr': 'ND2620254915', 'easting': 326202.0, '...  326202.0  954915.0   
1     {'ngr': 'NC99839181

In [13]:
# Example use
#md = catalogue()
ts = get_ts(9001, "amax-flow")
print(ts)

                   time  amax-flow
0   1960-08-26T00:00:00    165.417
1   1960-11-02T00:45:00    145.412
2   1961-10-20T12:30:00    113.073
3   1962-12-14T23:15:00    129.406
4   1964-08-19T01:45:00     79.600
..                  ...        ...
58  2017-11-21T20:45:00     89.862
59  2019-05-26T18:45:00     97.435
60  2019-11-12T18:30:00    113.312
61  2020-10-04T04:45:00    191.799
62  2021-11-01T00:45:00     68.205

[63 rows x 2 columns]
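
As a hedged preview of the flood frequency analysis mentioned in the overview, here is a minimal sketch of one common approach: rank the annual maximum flows from largest to smallest and estimate each return period with the Weibull plotting position, T = (n + 1) / rank. This method is an assumption for illustration, not necessarily the one used later in this practical. Only the five peaks visible at the top of the table above are used, so the numbers themselves mean very little.

```python
# First five annual maximum flows from the table above (illustrative subset only)
amax_flows = [165.417, 145.412, 113.073, 129.406, 79.600]

n = len(amax_flows)
ranked = sorted(amax_flows, reverse=True)  # rank 1 is the largest flood

# Weibull plotting position: return period T = (n + 1) / rank
return_periods = [(n + 1) / rank for rank in range(1, n + 1)]

for flow, period in zip(ranked, return_periods):
    print(f"{flow:8.3f} has a return period of about {period:.1f} years")
```

With the full record of 63 annual maxima the same steps would apply, just with a larger `n`.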


In [7]:
# Daily flow series (gdf) for the same station
ts = get_ts(9001, "gdf")
print(ts)

             time     gdf
0      1959-10-01   1.667
1      1959-10-02   1.667
2      1959-10-03   1.891
3      1959-10-04   1.512
4      1959-10-05   1.614
...           ...     ...
23005  2022-09-26   3.542
23006  2022-09-27   9.853
23007  2022-09-28  18.310
23008  2022-09-29  10.370
23009  2022-09-30   8.835

[23010 rows x 2 columns]
