# Using pdbufr to inspect tropical cyclone data

In [64]:
import pdbufr
from ecmwf.opendata import Client

## Download tropical cyclone data from the ECMWF open data catalogue

Setting source to azure lets you retrieve older data.

**date** is the start date of the forecast.  
**step**: in this case the file for step 240 actually **has all the available steps inside**.

The next cell downloads yesterday's forecast and saves it in a file called **"tc_test.bufr"**. For older data, just give the date in the format YYYYMMDD (20221124, for example).

In [65]:
client = Client(source="azure")

client.retrieve(
    date=-1,
    time=0,
    stream="oper",
    type="tf",
    step=240,
    target="tc_test.bufr",
)

<ecmwf.opendata.client.Result at 0x176ee9850>
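
For an older forecast, only the **date** value in the request above changes. Here is a minimal sketch of building that request, assuming a made-up helper `tc_request` (it is not part of ecmwf-opendata) that formats a Python date as YYYYMMDD and reuses the other arguments from the cell above:

```python
from datetime import date


def tc_request(forecast_date, target="tc_test.bufr"):
    """Build keyword arguments for Client.retrieve() for one forecast date.

    Hypothetical helper for illustration; it only contains the arguments
    already used in the cell above.
    """
    return {
        "date": int(forecast_date.strftime("%Y%m%d")),  # e.g. 20221124
        "time": 0,
        "stream": "oper",
        "type": "tf",
        "step": 240,
        "target": target,
    }


request = tc_request(date(2022, 11, 24))
print(request["date"])

# To actually download (needs network access):
# from ecmwf.opendata import Client
# Client(source="azure").retrieve(**request)
```

Keeping the request in a dictionary makes it easy to loop over several dates and change only the target file name.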

# How to read BUFR data
I am not an expert, but this is what I've figured out today...

To check which storms are in the file, use the stormIdentifier key. In the data I used today (27 July 2023) there is only one tropical cyclone, 07W.

In [66]:
df = pdbufr.read_bufr("tc_test.bufr",
    columns=("stormIdentifier"))
df["stormIdentifier"].unique()

array(['07W'], dtype=object)

It looks like you can filter by this key if there is more than one cyclone, as in the next cell. Since we only have one storm right now, the filter doesn't change much.

In [70]:
df = pdbufr.read_bufr("tc_test.bufr",
    columns=("stormIdentifier", "latitude", "longitude",
             "pressureReducedToMeanSeaLevel"),
    filters={"stormIdentifier": "07W"})
df.head(24)

Unnamed: 0,stormIdentifier,latitude,longitude,pressureReducedToMeanSeaLevel
0,07W,18.8,121.1,96700.0
1,07W,19.2,121.0,96700.0
2,07W,19.6,120.6,96900.0
3,07W,20.2,120.3,96700.0
4,07W,20.7,119.8,96700.0
5,07W,21.4,119.3,95800.0
6,07W,21.9,118.8,95300.0
7,07W,23.0,118.4,95200.0
8,07W,24.2,118.1,95300.0
9,07W,25.4,117.4,97200.0


It also looks like, apart from stormIdentifier, latitude and longitude, you can only read one more column at a time. So you can't read pressure, wind etc. in the same call. Why this is so is beyond me.

In [44]:
df1 = pdbufr.read_bufr("tc_test.bufr",
    columns=("stormIdentifier", "latitude", "longitude",
             "pressureReducedToMeanSeaLevel" ))
df1.head(25)

Unnamed: 0,stormIdentifier,latitude,longitude,pressureReducedToMeanSeaLevel
0,07W,18.8,121.1,96700.0
1,07W,19.2,121.0,96700.0
2,07W,19.6,120.6,96900.0
3,07W,20.2,120.3,96700.0
4,07W,20.7,119.8,96700.0
5,07W,21.4,119.3,95800.0
6,07W,21.9,118.8,95300.0
7,07W,23.0,118.4,95200.0
8,07W,24.2,118.1,95300.0
9,07W,25.4,117.4,97200.0


To see which keys are available, you can read the 'flat' BUFR. This just gives you ALL the keys and values.

The first 13 are common to all the points. After that come the data from the analysis and the forecast. This is nowhere near simple.

In [83]:
df = pdbufr.read_bufr("tc_test.bufr", columns="data", flat=True)
df.T.head(13)

Unnamed: 0,0
subsetNumber,1
#1#centre,98
#1#subCentre,
#1#generatingApplication,1
#1#stormIdentifier,07W
#1#longStormName,DOKSURI
#1#techniqueForMakingUpInitialPerturbations,2
#1#ensembleMemberNumber,52
#1#ensembleForecastType,0
#1#year,2023


## This is the analysis:

In [95]:
df.T[14:64]

Unnamed: 0,0
#1#meteorologicalAttributeSignificance,1.0
#1#latitude,18.8
#1#longitude,121.4
#2#meteorologicalAttributeSignificance,5.0
#2#latitude,18.8
#2#longitude,121.1
#1#pressureReducedToMeanSeaLevel,96700.0
#3#meteorologicalAttributeSignificance,3.0
#3#latitude,19.1
#3#longitude,121.5


This is the data for step 6 (don't ask me how I deduced this...)

In [100]:
df.T[65:113]

Unnamed: 0,0
#1#timePeriod,6.0
#4#meteorologicalAttributeSignificance,1.0
#4#latitude,19.2
#4#longitude,121.0
#2#pressureReducedToMeanSeaLevel,96700.0
#5#meteorologicalAttributeSignificance,3.0
#5#latitude,19.5
#5#longitude,121.3
#2#windSpeedAt10M,29.3
#4#windSpeedThreshold,18.0


This is step 12. Basically, every 48 rows hold the data for one time step (timePeriod), with the timePeriod key at the beginning.

In [101]:
df.T[114:162]

Unnamed: 0,0
#2#timePeriod,12.0
#6#meteorologicalAttributeSignificance,1.0
#6#latitude,19.6
#6#longitude,120.6
#3#pressureReducedToMeanSeaLevel,96900.0
#7#meteorologicalAttributeSignificance,3.0
#7#latitude,20.0
#7#longitude,120.9
#3#windSpeedAt10M,28.8
#7#windSpeedThreshold,18.0


From the above you can see that the available parameters are:
- pressureReducedToMeanSeaLevel
- windSpeedAt10M
- effectiveRadiusWithRespectToWindSpeedsAboveThreshold, but this one is a bit complicated and doesn't seem to work correctly with pdbufr

On top of that, you can only read one of them at a time.

In [105]:
df1 = pdbufr.read_bufr("tc_test.bufr",
    columns=("stormIdentifier", "latitude", "longitude",
             "timePeriod" ))
df1.head(25)

Unnamed: 0,stormIdentifier,latitude,longitude,timePeriod
0,07W,19.1,121.5,6
1,07W,19.5,121.3,12
2,07W,20.0,120.9,18
3,07W,20.5,120.1,24
4,07W,20.4,120.4,30
5,07W,21.0,119.7,36
6,07W,21.7,119.3,42
7,07W,22.7,118.8,48
8,07W,24.1,118.4,54
9,07W,24.6,119.2,60


In [104]:
df2 = pdbufr.read_bufr("tc_test.bufr",
    columns=("stormIdentifier", "latitude", "longitude",
             "windSpeedAt10M"))
df2

Unnamed: 0,stormIdentifier,latitude,longitude,windSpeedAt10M
0,07W,19.1,121.5,28.3
1,07W,19.5,121.3,29.3
2,07W,20.0,120.9,28.8
3,07W,20.5,120.1,32.4
4,07W,20.4,120.4,33.4
5,07W,21.0,119.7,47.3
6,07W,21.7,119.3,49.4
7,07W,22.7,118.8,49.9
8,07W,24.1,118.4,46.8
9,07W,24.6,119.2,20.1


In [106]:
df3 = pdbufr.read_bufr("tc_test.bufr",
    columns=("stormIdentifier", "latitude", "longitude",
             "pressureReducedToMeanSeaLevel"))
df3

Unnamed: 0,stormIdentifier,latitude,longitude,pressureReducedToMeanSeaLevel
0,07W,18.8,121.1,96700.0
1,07W,19.2,121.0,96700.0
2,07W,19.6,120.6,96900.0
3,07W,20.2,120.3,96700.0
4,07W,20.7,119.8,96700.0
5,07W,21.4,119.3,95800.0
6,07W,21.9,118.8,95300.0
7,07W,23.0,118.4,95200.0
8,07W,24.2,118.1,95300.0
9,07W,25.4,117.4,97200.0


You can now merge these three dataframes for further work.
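
One way to do the merge is by row position rather than on latitude and longitude. The outputs above show that the pressure frame and the wind frame carry different positions, and that the timePeriod frame starts at step 6 with no row for the analysis. The sketch below uses small hand-typed frames with the first three rows of each output and assumes a single storm and a single track in file order. `merge_tracks` and the `step`, `windLatitude` and `windLongitude` column names are made up for this example, and the real frames may need splitting by storm or ensemble member first.

```python
import pandas as pd

# First rows of the three outputs above, typed in by hand.
df1 = pd.DataFrame({"timePeriod": [6, 12, 18]})
df2 = pd.DataFrame({
    "stormIdentifier": ["07W", "07W", "07W"],
    "latitude": [19.1, 19.5, 20.0],
    "longitude": [121.5, 121.3, 120.9],
    "windSpeedAt10M": [28.3, 29.3, 28.8],
})
df3 = pd.DataFrame({
    "stormIdentifier": ["07W", "07W", "07W"],
    "latitude": [18.8, 19.2, 19.6],
    "longitude": [121.1, 121.0, 120.6],
    "pressureReducedToMeanSeaLevel": [96700.0, 96700.0, 96900.0],
})


def merge_tracks(steps, wind, pressure):
    """Combine the three frames row by row (assumes one track, in order)."""
    n = min(len(wind), len(pressure))
    # Row 0 is the analysis (step 0); later rows take the timePeriod values.
    step_values = [0] + list(steps["timePeriod"])
    merged = pressure.iloc[:n].reset_index(drop=True).copy()
    merged.insert(1, "step", step_values[:n])
    # Keep the wind position in its own columns, since it differs from
    # the position stored in the pressure frame.
    wind = wind.iloc[:n].reset_index(drop=True)
    merged["windLatitude"] = wind["latitude"]
    merged["windLongitude"] = wind["longitude"]
    merged["windSpeedAt10M"] = wind["windSpeedAt10M"]
    return merged


track = merge_tracks(df1, df2, df3)
print(track)
```

Merging by position only works while all three frames list the same track in the same order, so it is worth comparing a few rows against the flat output before trusting the result.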