## Met Office Interview Technical Question

Below is a script that calculates the predicted overnight minimum temperature $T_{min}$ using a historic method, the Craddock-Pritchard method:
\begin{align} 
T_{min} = 0.316T_{12} + 0.548T_{d12} - 1.24 + K, 
\end{align}
where 
$$
T_{12} = \text{Temperature at 12:00},
$$
$$
T_{d12} = \text{Dew Point Temperature at 12:00},
$$
and $K$ is read from a lookup table indexed by cloud cover (expressed in oktas, from 0 to 8, each okta representing one eighth of the sky covered) and wind speed (expressed in knots). 
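As a quick arithmetic check of the formula, the sketch below evaluates it for a known $K$. The function name `tmin_from_k` is an assumption made for illustration and is not part of the script; the inputs ($T_{12} = 18$, $T_{d12} = 10$, $K = 0$) match the first validation case in this notebook.

```python
def tmin_from_k(T12: float, Td12: float, K: float) -> float:
    """Evaluate Tmin = 0.316*T12 + 0.548*Td12 - 1.24 + K (hypothetical helper)."""
    return 0.316 * T12 + 0.548 * Td12 - 1.24 + K

# 0.316*18 + 0.548*10 - 1.24 + 0 = 5.688 + 5.48 - 1.24
print(round(tmin_from_k(18, 10, 0.0), 3))  # 9.928
```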

In [157]:
import pandas as pd
import numpy as np

Questions to ask
1. What if the input data changes shape? Does it still work?
2. What if I were to add another method? Would the code still generalise? 
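One possible way to approach the second question is a small registry that maps a method name to a function, so a new method can be added without touching the calling code. This is only a sketch: `METHODS`, `register`, `predict_tmin`, and the simplified `craddock_pritchard_with_k` (which takes $K$ directly rather than looking it up) are all hypothetical names, not part of the script.

```python
from typing import Callable, Dict

# Hypothetical registry mapping a method name to its implementation.
METHODS: Dict[str, Callable[..., float]] = {}

def register(name: str):
    """Decorator that stores a forecasting function under the given name."""
    def decorator(func):
        METHODS[name] = func
        return func
    return decorator

@register("craddock_pritchard")
def craddock_pritchard_with_k(T12: float, Td12: float, K: float) -> float:
    # Simplified for illustration: K is passed in instead of looked up.
    return 0.316 * T12 + 0.548 * Td12 - 1.24 + K

def predict_tmin(method: str, **inputs) -> float:
    """Dispatch to a registered method, failing clearly on unknown names."""
    if method not in METHODS:
        raise ValueError(f"Unknown method '{method}'. Available: {sorted(METHODS)}")
    return METHODS[method](**inputs)

print(round(predict_tmin("craddock_pritchard", T12=18, Td12=10, K=0.0), 3))  # 9.928
```

A second method would then only need its own decorated function; whether its inputs fit the same keyword interface is a separate design question.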

To use the implemented Craddock-Pritchard method, we must first provide the lookup table for $K$. We require this to be a `pd.DataFrame` where the columns and rows are indexed by tuples in a `(low, high)` format (where low is the lower bound of the interval and high is the upper bound of the interval).
 

In [158]:
#Specify the data
data = [
    [-2.2, -1.7, -0.6, 0],
    [-1.1, 0, 0.6, 1.1],
    [-0.6, 0, 0.6, 1.1],
    [1.1, 1.7, 2.8, "Unknown"]
] 

#Wind speed bins (rows) and cloud coverage bins (columns), each as (low, high)
index =  [(0,12),(13,25), (26,38), (39,51)]
columns = [(0,2), (2,4), (4,6), (6,8)]

dfmet = pd.DataFrame(data, index = index,
                     columns = columns) 

dfmet.index.name = "Wind Speed (knots)"
dfmet.columns.name = "Cloud Coverage (oktas)"
print(dfmet)

Cloud Coverage (oktas)  (0, 2)  (2, 4)  (4, 6)   (6, 8)
Wind Speed (knots)                                     
(0, 12)                   -2.2    -1.7    -0.6        0
(13, 25)                  -1.1     0.0     0.6      1.1
(26, 38)                  -0.6     0.0     0.6      1.1
(39, 51)                   1.1     1.7     2.8  Unknown
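
The tuple labels above could alternatively be expressed with pandas' own interval type. The sketch below is an alternative to the notebook's tuple format, not what the script uses: it builds a `pd.IntervalIndex` closed on both sides from the same cloud coverage bins and uses `IntervalIndex.contains` to find which bins hold a value. The variable names are illustrative assumptions.

```python
import pandas as pd

# Same cloud coverage bins as the lookup table, closed on both ends
# so that a value on a shared edge matches both neighbouring bins.
cloud_bins = pd.IntervalIndex.from_tuples(
    [(0, 2), (2, 4), (4, 6), (6, 8)], closed="both"
)

# Boolean mask of the bins containing each value.
inside = cloud_bins.contains(3.9)
on_edge = cloud_bins.contains(6)

print(inside.tolist())   # [False, True, False, False]
print(on_edge.tolist())  # [False, False, True, True]
```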


In [159]:

def _validate_tuples(df_index):
    """Check that every label is a tuple of length 2 in (low, high) order.

    Raises:
        ValueError: If any label fails to meet these requirements.
    """
    for elem in df_index:
        if not isinstance(elem, tuple) or len(elem) != 2 or elem[0] >= elem[1]:
            raise ValueError(f"{df_index.name} must be a list of tuples in (low, high) format.")

def K_lookup(df_numeric: pd.DataFrame, cloud_coverage: float, wind_speed: float):
    """Find K in the lookup table based on the cloud cover (oktas)
    and the wind speed (knots).

    Args:
        df_numeric : pd.DataFrame, shape (w,o)
             Lookup table with w wind speed rows and o cloud cover columns,
             both labelled by (low, high) tuples
        cloud_coverage : float
             Cloud cover expressed in oktas
        wind_speed : float
             Wind speed expressed in knots

    Returns:
        K : pd.DataFrame
            DataFrame containing the lookup values for the specified cloud
            cover and wind speed. May contain multiple values if an input
            lies on a bin boundary.

    Raises:
        ValueError: If the labels are not valid (low, high) tuples, or if an
            input lies outside the table range or in a gap between bins.
    """
    _validate_tuples(df_numeric.index)
    _validate_tuples(df_numeric.columns)

    #calculate the minimum and maximum cloud coverage and wind speed covered by the table
    min_cloud, max_cloud = np.min([elem[0] for elem in df_numeric.columns]), np.max([elem[1] for elem in df_numeric.columns])
    min_wind, max_wind = np.min([elem[0] for elem in df_numeric.index]), np.max([elem[1] for elem in df_numeric.index])
    
    if cloud_coverage<min_cloud or cloud_coverage>max_cloud:
        raise ValueError(f"cloud_coverage {cloud_coverage} oktas is outside the minimum or maximum range.")
    if wind_speed<min_wind or wind_speed>max_wind:
        raise ValueError(f"wind_speed {wind_speed} knots is outside the minimum or maximum range.")

    #Find the indices in the lookup table for specified wind speeds and cloud coverage
    cloud_coverage_indices = [low<=cloud_coverage<=high for low, high in df_numeric.columns]
    wind_indices = [low<=wind_speed<=high for low, high in df_numeric.index]
    

    if not any(wind_indices):
        raise ValueError(f"Specified wind speed of {wind_speed} knots lies outside of valid bin values.")
    if not any(cloud_coverage_indices):
        raise ValueError(f"Specified cloud coverage of {cloud_coverage} oktas lies outside of valid bin values")

    
    K = df_numeric.loc[wind_indices,cloud_coverage_indices]
    
    return K 


def craddock_pritchard(dfmet: pd.DataFrame, T12: float, Td12: float, cloud_coverage: float, wind_speed: float):
    """Calculate the overnight minimum temperature using the
    Craddock and Pritchard method

        Tmin = 0.316T12 + 0.548Td12 - 1.24 + K

    where Tmin is the predicted overnight minimum temperature, T12 is the temperature at 12:00, 
    Td12 is the dew point temperature at 12:00 and K is derived from a lookup table based on
    cloud coverage and wind speed. 

    Args:
        dfmet : pd.DataFrame, shape (w,o)
              Input dataframe with w wind speed rows and o cloud cover columns
        T12 : float
              Temperature at 12:00
        Td12 : float
              Dew Point Temperature at 12:00 
        cloud_coverage : float
              Cloud cover expressed in oktas
        wind_speed : float
              Wind speed expressed in knots

    Returns:
        Tmin : pd.DataFrame
              Predicted overnight minimum temperature for each matching bin.
              Contains NaN where the lookup value is non-numeric.

    Notes:
        1. No assumption that the cloud cover should be expressed as integer values
        2. No assumption made on the valid range of the bins 

    References
    ----------
    see 
        [1] The Forecasters' Reference Book, 1997, Meteorological Office College, Met Office
    """

    #coerce non-numeric entries (e.g. "Unknown") to NaN
    df_numeric = dfmet.apply(pd.to_numeric, errors="coerce") 

    K = K_lookup(df_numeric, cloud_coverage, wind_speed) 

    Tmin  = 0.316*T12 + 0.548*Td12 - 1.24 + K 
    
    return Tmin 
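
The bin matching inside `K_lookup` reduces to an inclusive comparison against each `(low, high)` tuple. The sketch below isolates that step in plain Python, using a hypothetical helper `matching_bins` (not part of the script), to show the two edge cases that follow from inclusive bounds: a value on a shared boundary matches two bins, and a value in the gap between bins matches none.

```python
def matching_bins(bins, value):
    """Return the (low, high) bins that contain value, bounds inclusive (hypothetical helper)."""
    return [(low, high) for low, high in bins if low <= value <= high]

cloud_bins = [(0, 2), (2, 4), (4, 6), (6, 8)]
wind_bins = [(0, 12), (13, 25), (26, 38), (39, 51)]

print(matching_bins(cloud_bins, 6))    # [(4, 6), (6, 8)]
print(matching_bins(wind_bins, 30))    # [(26, 38)]
print(matching_bins(wind_bins, 12.5))  # []
```

The empty result for 12.5 knots arises because the wind speed bins have integer edges with a gap between 12 and 13; in the script this case raises a `ValueError`.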



### Validation
We now perform validation of the code against provided data and results. 

In [160]:
#T12 = 18, Td12 = 10, Cloud coverage = 3, wind speed = 30
print(craddock_pritchard(dfmet, T12=18, Td12=10, cloud_coverage=3, wind_speed=30))

Cloud Coverage (oktas)  (2, 4)
Wind Speed (knots)            
(26, 38)                 9.928


In [161]:
#T12 = 22.4, Td12 = 10.9, Cloud coverage = 3.9, wind speed = 14.56
#Testing case where cloud coverage is non integer
print(craddock_pritchard(dfmet, T12=22.4, Td12=10.9, cloud_coverage=3.9, wind_speed=14.56))

Cloud Coverage (oktas)   (2, 4)
Wind Speed (knots)             
(13, 25)                11.8116


In [None]:
#T12 = 18.6, Td12 = 12.65, Cloud coverage = 6, wind speed = 3.4
#Testing case where cloud coverage lies on a bin boundary, so two values are returned
print(craddock_pritchard(dfmet, T12=18.6, Td12=12.65, cloud_coverage=6, wind_speed=3.4))

Cloud Coverage (oktas)   (4, 6)   (6, 8)
Wind Speed (knots)                      
(0, 12)                 10.9698  11.5698


In [None]:
#T12 = 26, Td12 = 8.5, Cloud coverage = 0.0, wind speed = 0 
#Testing case where both the cloud coverage and wind speed are at the edge of the bins
print(craddock_pritchard(dfmet, T12=26, Td12=8.5, cloud_coverage=0, wind_speed=0.0))

Cloud Coverage (oktas)  (0, 2)
Wind Speed (knots)            
(0, 12)                  9.434


In [None]:
#T12 = 18, Td12 = 9.4, Cloud coverage = 4.1, wind speed = 12.5
#Testing case where wind speed falls outside of a valid bin value in the data
print(craddock_pritchard(dfmet, T12=18, Td12=9.4, cloud_coverage=4.1, wind_speed=12.5))

ValueError: Specified wind speed of 12.5 knots lies outside of valid bin values.