# Lesson 4 - Compare with P.2108 Clutter Model

In this lesson you'll compare the performance of your clutter propagation model with the P.2108 clutter model.

### About P.2108
 * __The documentation__ for the P.2108 clutter model can be found at [Recommendation ITU-R P.2108-1](https://www.itu.int/dms_pubrec/itu-r/rec/p/R-REC-P.2108-1-202109-I!!PDF-E.pdf).
 * __The code repository__ for NTIA's reference implementation of P.2108 can be found in the [NTIA/p2108](https://github.com/NTIA/p2108) GitHub repository.

The P.2108 model estimates signal loss through clutter for frequencies between 300 MHz and 100 GHz. P.2108 is composed of three methods for predicting clutter loss depending on the situation:
1. Height Gain Terminal Correction Model, for 0.3 to 3 GHz
2. Terrestrial Statistical Model, for 2 to 67 GHz
3. Aeronautical Statistical Model, for 10 to 100 GHz

In this lesson we will use the __Terrestrial Statistical Model__ because the measurements were made with two ground-based terminals at a frequency of 3.5 GHz. A full description of the Terrestrial Statistical Model can be found in section 3.2 of Recommendation ITU-R P.2108-1 linked above.
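
The frequency part of that choice can be checked with a few lines of Python. This is a minimal sketch that only encodes the frequency ranges listed above; `methods_covering` is a hypothetical helper and is not part of the P2108 package.

```python
# Frequency ranges (GHz) as listed in this lesson for the three P.2108 methods.
P2108_METHODS = {
    "Height Gain Terminal Correction Model": (0.3, 3.0),
    "Terrestrial Statistical Model": (2.0, 67.0),
    "Aeronautical Statistical Model": (10.0, 100.0),
}

def methods_covering(f_ghz):
    """Return the names of the methods whose frequency range includes f_ghz."""
    return [name for name, (lo, hi) in P2108_METHODS.items() if lo <= f_ghz <= hi]

print(methods_covering(3.5))  # ['Terrestrial Statistical Model']
print(methods_covering(2.5))  # two methods overlap at this frequency
```

Frequency alone does not settle the choice where ranges overlap; the terminal geometry (ground-based or airborne) matters too.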

### Import the P.2108 code library
The [NTIA code repository for P.2108](https://github.com/NTIA/p2108) contains the U.S. Reference Implementation for all three P.2108 clutter loss prediction methods listed above. To use this software, install the Python package we have provided at the following path: **`course-materials/packages/p2108-1.0.0-py3-none-any.whl`**.

Execute the following cell to install the P.2108 package in your JupyterLab environment.

In [None]:
! pip install packages/p2108-1.0.0-py3-none-any.whl

### Import the necessary Python libraries

In [None]:
from ITS.ITU.PSeries import P2108
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

### How to use P.2108

We will use the __Terrestrial Statistical Model__ from P.2108. This model is a statistical clutter loss model for terrestrial propagation paths. It is valid for urban and suburban clutter environments. The predicted clutter loss is a correction factor for a single terminal within the clutter. The correction can be applied to both terminals if both are within the clutter (which we will not do since the transmitter is located above the clutter). A description of the model and the inputs and outputs can be found on the [NTIA/p2108 repository README](https://github.com/NTIA/p2108?tab=readme-ov-file#terrestrial-statistical-model).

#### P2108.TerrestrialStatisticalModel( `f__ghz` , `d__km` , `p` )

#### Parameters: 
__`f__ghz` : float__\
Frequency (in GHz) of the signal. Range: 2 <= `f__ghz` <= 67.

__`d__km` : float__\
Path distance (in kilometers) between the transmitter and receiver. Range: 0.25 <= `d__km`.

__`p` : float__\
Percentage of locations for which the predicted clutter loss is not exceeded. Range: 0 < `p` < 100.
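
If you want to catch out-of-range inputs before calling the model, a small check like the one below may help. It is a sketch that only restates the ranges documented above; `check_terrestrial_inputs` is a hypothetical helper, not a function from the P2108 package.

```python
def check_terrestrial_inputs(f__ghz, d__km, p):
    """Return a list of problems with the inputs (an empty list means all are in range)."""
    problems = []
    if not 2 <= f__ghz <= 67:
        problems.append("f__ghz must satisfy 2 <= f__ghz <= 67")
    if not d__km >= 0.25:
        problems.append("d__km must satisfy 0.25 <= d__km")
    if not 0 < p < 100:
        problems.append("p must satisfy 0 < p < 100")
    return problems

print(check_terrestrial_inputs(3.5, 0.95, 50))   # []
print(check_terrestrial_inputs(3.5, 0.10, 100))  # distance and percentage are both out of range
```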

## Examples of calling P.2108

### Predict the clutter loss for: 3.5 GHz, 0.95 km, and median (50th percentile) loss

In [None]:
clutter_loss__dB = P2108.TerrestrialStatisticalModel(3.5, 0.95, 50)
print("Predicted clutter loss = {:.1f} dB".format(clutter_loss__dB))

### Predict the clutter loss for: 3.5 GHz, 0.95 km, and 10th percentile loss

In [None]:
clutter_loss__dB = P2108.TerrestrialStatisticalModel(3.5, 0.95, 10)
print("Predicted clutter loss = {:.1f} dB".format(clutter_loss__dB))

### Predict the clutter loss for: 3.5 GHz, 0.95 km, and 90th percentile loss

In [None]:
clutter_loss__dB = P2108.TerrestrialStatisticalModel(3.5, 0.95, 90)
print("Predicted clutter loss = {:.1f} dB".format(clutter_loss__dB))

The _p_ input lets us explore the statistical bounds of the model. For the propagation of a 3.5 GHz signal, across 0.95 km, with one terminal within the clutter, P.2108 predicts that the median loss (_p_ = 50) will be 30.0 dB. P.2108 also predicts that 80% (from 10th to 90th percentile) of clutter losses observed will be between 24.4 and 35.5 dB.
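
To see where the percentile spread comes from, here is a rough pure-Python sketch of the Terrestrial Statistical Model as we read section 3.2 of Recommendation ITU-R P.2108-1. The coefficients are transcribed assumptions and should be checked against the Recommendation. The function name is hypothetical and this is not the NTIA reference implementation, so use `P2108.TerrestrialStatisticalModel` for real predictions.

```python
import math
from statistics import NormalDist

def terrestrial_clutter_loss_sketch(f__ghz, d__km, p):
    """Illustrative sketch only; coefficients are assumptions taken from our reading of P.2108-1."""
    # Assumed frequency-dependent loss term
    L_l = -2 * math.log10(10 ** (-5 * math.log10(f__ghz) - 12.5) + 10 ** (-16.5))
    # Assumed distance-dependent loss term
    L_s = 32.98 + 23.9 * math.log10(d__km) + 3 * math.log10(f__ghz)
    w_l = 10 ** (-0.2 * L_l)
    w_s = 10 ** (-0.2 * L_s)
    # Median loss combines the two terms
    median__db = -5 * math.log10(w_l + w_s)
    # Assumed location variability: a blend of 4 dB and 6 dB standard deviations
    sigma__db = math.sqrt((4 ** 2 * w_l + 6 ** 2 * w_s) / (w_l + w_s))
    # Shift the median by the normal quantile for p percent of locations
    return median__db + sigma__db * NormalDist().inv_cdf(p / 100)

for p in (10, 50, 90):
    print("p = {:2d}: {:.1f} dB".format(p, terrestrial_clutter_loss_sketch(3.5, 0.95, p)))
```

Under these assumptions, _p_ only shifts the median up or down by a multiple of the standard deviation, which is why the 10th and 90th percentiles sit roughly symmetrically around the median. Small differences from the reference implementation are possible.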

### Make P.2108 predictions using Martin Acres dataset path distances

1. Start by loading the Martin Acres measurement data. This is the same data that was introduced in lesson 3, except that measurements with paths shorter than 0.25 km have been removed (P.2108 does not support them).
2. Perform P.2108 predictions assuming median (50th percentile) losses.

In [None]:
## load the Martin Acres dataset
greater_0p25_df = pd.read_csv("data/MartinAcres_Lesson4.csv")
## do P.2108 predictions (ask for 50th percentile losses), create a new column with those predictions
greater_0p25_df["p2108__dB"] = greater_0p25_df.apply(lambda row: P2108.TerrestrialStatisticalModel(row.f__mhz/1000, row.d__km, 50), axis=1)

### Plot the P.2108 predictions with the Martin Acres measurement data

In [None]:
plt.rcParams["figure.figsize"] = (11,6)

## plot the measurement data
plt.scatter(greater_0p25_df["d__km"], greater_0p25_df["L_excess__db"], label='Measurement Data', s=12)
## plot the P.2108 predictions (sort rows by distance so each prediction stays paired with its own distance)
sorted_df = greater_0p25_df.sort_values("d__km")
plt.plot(sorted_df["d__km"], sorted_df["p2108__dB"], label='P.2108 Prediction', c='tab:orange', linewidth=3.0)

plt.xlabel('Path Distance (km)')
plt.ylabel('Clutter Loss (dB)')
plt.title('Path Distance vs Clutter Loss\nMartin Acres')

plt.legend(fontsize=14)
plt.gca().yaxis.grid(True)
plt.show()

### How did P.2108 do?
For this dataset, P.2108 appears to capture clutter loss accurately at shorter distances (less than 1 km). At 1.5 km, however, there is a cluster of measurements for which P.2108 overpredicts the clutter loss. If you remember back to lesson 3, this cluster of measurements is from the High TX location. For the High TX, the signal suffers less loss because it passes over most buildings and trees without obstruction.

### Compare your model (Lesson 3) to P.2108

In [None]:
## plot the measurement data
plt.scatter(greater_0p25_df["d__km"], greater_0p25_df["L_excess__db"], label='Measurement Data', s=12)
## plot the P.2108 predictions (sort rows by distance so each prediction stays paired with its own distance)
sorted_df = greater_0p25_df.sort_values("d__km")
plt.plot(sorted_df["d__km"], sorted_df["p2108__dB"], label='P.2108 Prediction', c='tab:orange', linewidth=3.0)
## plot your model
plt.scatter(greater_0p25_df["d__km"], greater_0p25_df["pred_loss"], label='Model Prediction (Lesson 3)', s=10, c='tab:green')

plt.xlabel('Path Distance (km)')
plt.ylabel('Clutter Loss (dB)')
plt.title('Comparing the Lesson 3 Model to P.2108\nMartin Acres')

plt.legend(fontsize=14)
plt.gca().yaxis.grid(True)
plt.show()

Admittedly, this comparison is a little funky. The model you made in lesson 3 doesn't follow a nice curve when plotted against Path Distance (the X axis). Recall that your model uses _3D Clutter Distance_, the distance that the signal travels before it exits the clutter. So comparing to P.2108 is a little tricky when the X axis is restricted to Path Distance.

Regardless of the wonky comparison, you can still see that __your model outperforms P.2108 when predicting clutter loss for transmitters at differing elevations__.
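
One way to back up a visual impression like this with a number is the root-mean-square error (RMSE) of each model against the measurements. The sketch below uses made-up values purely for illustration; `rmse` is a hypothetical helper and the lists are not taken from the Martin Acres dataset.

```python
import math

def rmse(predicted, measured):
    """Root-mean-square error between two equal-length sequences (dB)."""
    if len(predicted) != len(measured):
        raise ValueError("sequences must have the same length")
    squared = [(p - m) ** 2 for p, m in zip(predicted, measured)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical values for illustration only
measured = [20.0, 25.0, 30.0, 18.0]
model_a = [21.0, 24.0, 31.0, 19.0]
model_b = [26.0, 29.0, 33.0, 28.0]
print(rmse(model_a, measured))  # 1.0
print(rmse(model_b, measured))  # larger error than model_a
```

With the notebook's dataframe you could pass `greater_0p25_df["pred_loss"]` or `greater_0p25_df["p2108__dB"]` as `predicted` and `greater_0p25_df["L_excess__db"]` as `measured`. A lower RMSE means the predictions sit closer to the measurements.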

### Finally, look at the Cumulative Distribution Functions (CDF) of predictions for both models

Another way to understand these models is to look at their distribution of clutter loss predictions based on the Path Distance (or 3D Clutter Distance). To do this, plot the CDF of a) the measurement data, b) P.2108, and c) your clutter model. 
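
If the idea of an empirical CDF is new, the following sketch shows the construction on four made-up loss values: sort the values, then pair the i-th sorted value with the fraction i/N. The function name and the numbers are hypothetical.

```python
def empirical_cdf(values):
    """Return (sorted values, cumulative fraction) using the i/N convention."""
    xs = sorted(values)
    n = len(xs)
    ys = [i / n for i in range(n)]
    return xs, ys

xs, ys = empirical_cdf([31.0, 22.5, 27.0, 25.0])  # hypothetical losses in dB
print(xs)  # [22.5, 25.0, 27.0, 31.0]
print(ys)  # [0.0, 0.25, 0.5, 0.75]
```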

Start by defining the clutter model you made in lesson 3. 

In [None]:
## Define your model from Lesson 3
slope = 13.71
y_int = -9.8
model_std = 3.8 ## standard deviation of the model (lesson 3)
## define a clutter model method, takes the 3D Clutter Distance (in meters) as input
def clutter_model(clutter_distance__m):
    return slope * np.log10(clutter_distance__m) + y_int
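
As a quick sanity check on the model just defined, the sketch below evaluates it at a few round distances. It uses `math.log10` in place of `np.log10` so that it runs on its own; the slope and intercept are the ones from the cell above, and the distances are arbitrary.

```python
import math

slope = 13.71  # from the cell above
y_int = -9.8   # from the cell above

def clutter_model(clutter_distance__m):
    return slope * math.log10(clutter_distance__m) + y_int

for d in (10, 100, 1000):
    print("{:>5} m -> {:.2f} dB".format(d, clutter_model(d)))
#    10 m -> 3.91 dB
#   100 m -> 17.62 dB
#  1000 m -> 31.33 dB
```

Because the model is linear in log10 of distance, every tenfold increase in 3D Clutter Distance adds `slope` (13.71 dB) to the median predicted loss.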

Next, find the distribution of clutter loss predictions from your model. Ensure that the predicted sample has the same __3D Clutter Distance__ distribution as the measurement data. 

In [None]:
## find the distribution of 3D Clutter Distances in the Martin Acres dataset
clutter_distances_array = np.sort(greater_0p25_df["clutter_d__meter"])

## Predict the clutter loss using your lesson 3 clutter model
model_cdf_distri = []
for d in clutter_distances_array:
    model_cdf_distri.append(np.random.normal(clutter_model(d), model_std))

Next, find the distribution of clutter loss predictions from P.2108. Ensure that the predicted sample has the same __Path Distance__ distribution as the measurement data. 

In [None]:
## find the distribution of Path Distances in the Martin Acres dataset
distances_array = np.sort(greater_0p25_df["d__km"])

## Predict the clutter loss using P.2108
p2108_cdf_distri = []
for d in distances_array:
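    ## take the frequency from the first row by position (.iloc[0]) and draw a random percentile from 1 to 99
    p2108_cdf_distri.append(P2108.TerrestrialStatisticalModel(greater_0p25_df["f__mhz"].iloc[0]/1000, d, np.random.randint(1,100)))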
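
Drawing a random percentile for each path works because feeding uniformly distributed percentiles into a model's "loss not exceeded at `p` percent" output reproduces that model's loss distribution. The sketch below illustrates the idea with a stand-in normal distribution instead of P.2108; the mean, standard deviation, helper name, and seed are all hypothetical.

```python
import random
from statistics import NormalDist, mean

def sample_losses(quantile_fn, n, seed=0):
    """Draw n losses by passing random integer percentiles (1..99) to quantile_fn."""
    rng = random.Random(seed)  # seeded so repeated runs give the same sample
    return [quantile_fn(rng.randint(1, 99)) for _ in range(n)]

# Stand-in for a model's percentile output: a normal distribution (assumption)
stand_in = NormalDist(mu=30.0, sigma=4.0)
losses = sample_losses(lambda p: stand_in.inv_cdf(p / 100), 2000)
print(round(mean(losses), 1))  # close to the stand-in median of 30 dB
```

Seeding the generator is a design choice: it makes the resulting CDF plot reproducible from run to run, which the unseeded cells above are not.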

### Plot the CDFs

In [None]:
plt.rcParams["figure.figsize"] = (11,6)

meas_N = len(greater_0p25_df)
meas_x = np.sort(greater_0p25_df["L_excess__db"])
meas_y = np.arange(meas_N) / float(meas_N)
## plot the CDF
plt.plot(meas_x, meas_y, label='Measurement Data', linewidth=3.0)

p2108_x = np.sort(np.array(p2108_cdf_distri))
p2108_y = np.arange(meas_N) / float(meas_N)
## plot the CDF
plt.plot(p2108_x, p2108_y, label='P.2108', linewidth=3.0)

model_x = np.sort(np.array(model_cdf_distri))
model_y = np.arange(meas_N) / float(meas_N)
## plot the CDF
plt.plot(model_x, model_y, label='Model (Lesson 3)', linewidth=3.0)

plt.xlabel('Clutter Loss (dB)')
plt.ylabel('Probability')
plt.title('Clutter Loss CDF')

plt.gca().yaxis.grid(True)
plt.legend(fontsize=14)
plt.show()

The distribution of clutter losses predicted by your model (from lesson 3) is close to the actual distribution of clutter losses observed in the Martin Acres measurement data. P.2108 appears to generally overpredict the clutter loss by 3-5 dB. This is another good indicator that your model outperforms P.2108 in situations where the transmitter is above the clutter.
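
An offset like this can be read off the CDF plot by eye, but it can also be estimated by comparing matching percentiles of two samples. The sketch below does that with the standard library; `percentile_offsets` is a hypothetical helper and the two samples are made up for illustration.

```python
from statistics import quantiles

def percentile_offsets(sample_a, sample_b, percents=(10, 50, 90)):
    """Difference (a - b) between matching percentiles of two samples, in dB."""
    # quantiles(..., n=100) returns the 1st through 99th percentile cut points
    qa = quantiles(sample_a, n=100, method="inclusive")
    qb = quantiles(sample_b, n=100, method="inclusive")
    return {p: qa[p - 1] - qb[p - 1] for p in percents}

# Hypothetical samples for illustration only
measured = [float(v) for v in range(10, 31)]  # 10..30 dB
predicted = [v + 4.0 for v in measured]       # the same losses shifted up by 4 dB
print(percentile_offsets(predicted, measured))  # each offset is about 4 dB
```

In the notebook you could pass `p2108_cdf_distri` and `greater_0p25_df["L_excess__db"]` to get the horizontal gap between the two curves at each percentile.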

### One last thing

Before ending this lesson it's important to discuss why P.2108 doesn't perform well with this dataset. P.2108's Terrestrial Statistical Model assumes that all propagation paths are completely horizontal through clutter (a 0-degree RX elevation angle). It also assumes that propagation paths that are _near_ horizontal will suffer the same loss as a _completely_ horizontal path. This turns out not to be true. If we look at the Martin Acres measurements, the Low TX data has an average RX elevation angle of 1 degree and the High TX data has an average RX elevation angle of 4 degrees. Both are near zero and could be assumed to suffer the same clutter loss as a completely horizontal path. Yet the measured losses differ noticeably from the predictions, especially for the High TX data. In short, a 4-degree RX elevation angle is enough to disrupt P.2108's predictive power.
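
To get a feel for how small these angles are, the sketch below computes an RX elevation angle from a height difference and a path distance. It assumes flat terrain and heights measured from a common reference. The function name and the geometry are hypothetical and are not taken from the Martin Acres dataset.

```python
import math

def rx_elevation_angle__deg(tx_height__m, rx_height__m, path_distance__m):
    """Angle above horizontal from the receiver up to the transmitter (flat-earth approximation)."""
    return math.degrees(math.atan2(tx_height__m - rx_height__m, path_distance__m))

# Hypothetical geometry for illustration only
print(round(rx_elevation_angle__deg(107.0, 2.0, 1500.0), 1))  # 4.0
print(round(rx_elevation_angle__deg(2.0, 2.0, 1500.0), 1))    # 0.0
```

In this made-up case a transmitter about 105 m above the receiver at 1.5 km gives roughly a 4-degree angle, which shows how a path can look nearly horizontal and still clear much of the clutter.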

Well done, you've compared your statistical clutter model to the P.2108 clutter model. In the next lesson you'll see how well (or poorly) your clutter model performs with other datasets.

End of Lesson 4.