# GRIB1,	GRIB2,	NetCDF:	Which	do	I	choose?	

Your	choice	of	data	format	depends	on	the	date	range	that	you	need	and	what	you	
want	to	do	with	the	data.	

GRIB	is	a	World	Meteorological	Organization	(WMO)	international	standard	for	
exchanging	GRidded	BInary	data.	GRIB1	is	the	original	format	and	requires	a	
separate	parameter	table	to	unpack	the	data.		GRIB2	improves	upon	the	standard	
with	Hile	compression	and	the	inclusion	of	the	metadata/parameter	table	that	you	
need	to	unpack	the	data	in	each	Hile.		GRIB2	exploits	the	same	compression	software	
commonly	used	for	images	to	gain	a	roughly	50%	reduction	in	Hile	size	over	GRIB1.	

If	you	are	interested	in	2007.12.06	or	later,	then	choose	GRIB2	because	of	its	smaller	
size.		For	earlier	dates,	use	GRIB1.		If	you	are	using	software	that	reads	NetCDF	but	
not	GRIB,	then	convert	the	Hiles	to	NetCDF	*.nc	format.		Unless	you	need	to	use	
NetCDF	tools,	use	GRIB2	for	its	efHiciency.	

The	WRF	mesoscale	model	can	use	either	GRIB1	or	GRIB2.	Newer	versions	of	WRF	
Preprocessing	System	(WPS)	can	recognize	FNL	GRIB1	and	GRIB2;	WPS	will	even	
provide	the	appropriate	parameter	table	for	FNL	in	GRIB1	format	so	you	do	not	
need	to	do	so	manually.		If	you	are	using	the	older	MM5	mesoscale	model,	use	GRIB1.			

You	can	perform	many	data	processing	tasks	on	GRIB	Hiles	with	wgrib		(for	GRIB1)	
and	wgrib2	(for	GRIB2).		You	can	process	and	visualize	GRIB	Hiles	with	GRADS,	IDL,	
Panoply,	and	R	(using	the	rNOMADS	package).		You	can	use	the	extensive	NetCDF	
software	libraries	if	you	convert	the	Hiles	to	NetCDF.		However,	be	mindful	that	
NetCDF	Hiles	are	not	as	compact	as	GRIB2,	even	when	compressed.	

# Resources
- https://www.pgc.umn.edu/apps/convert/
- https://jswhit.github.io/pygrib/docs/
- http://pyaos.johnny-lin.com/
- https://nbviewer.jupyter.org/gist/jswhit/8635665

# Data
- https://drive.google.com/drive/folders/0B_wueX1dv4FsQ0t6Rm9fYTRUWGs?usp=sharing

In [70]:
import numpy as np
import pygrib

In [71]:
# open the grib file with pygrib

grbs = pygrib.open('data/_mars-atls04-95e2cf679cd58ee9b4db4dd119a05a8d-3vE789.grib')

In [72]:
# reset the iterator to the first position
# grbs.seek(0)
# => then do some looping

# extract the first message in the binary
grb = grbs.message(1)

# print the available keys in the message
print(grb.keys())
print(len(grb.keys()))

# print the values embedded in the message
print(grb.values)
print(len(grb.values))

# calculate latitude and longitude from grid
print(grb.latlons())

['parametersVersion', 'UseEcmfConventions', 'GRIBEX_boustrophedonic', 'hundred', 'globalDomain', 'GRIBEditionNumber', 'eps', 'offsetSection0', 'section0Length', 'totalLength', 'editionNumber', 'WMO', 'productionStatusOfProcessedData', 'section1Length', 'wrongPadding', 'table2Version', 'centre', 'centreDescription', 'generatingProcessIdentifier', 'gridDefinition', 'indicatorOfParameter', 'parameterName', 'parameterUnits', 'indicatorOfTypeOfLevel', 'pressureUnits', 'typeOfLevelECMF', 'typeOfLevel', 'level', 'yearOfCentury', 'month', 'day', 'hour', 'minute', 'second', 'unitOfTimeRange', 'P1', 'P2', 'timeRangeIndicator', 'numberIncludedInAverage', 'numberMissingFromAveragesOrAccumulations', 'centuryOfReferenceTimeOfData', 'subCentre', 'paramIdECMF', 'paramId', 'cfNameECMF', 'cfName', 'cfVarNameECMF', 'cfVarName', 'unitsECMF', 'units', 'nameECMF', 'name', 'decimalScaleFactor', 'setLocalDefinition', 'dataDate', 'year', 'dataTime', 'julianDay', 'stepUnits', 'stepType', 'stepRange', 'startStep

ValueError: unsupported grid sh

In [73]:
# extract data and get lat/lon values for the relevant subset
# degrees as specified in phd thesis

grb = grbs.message(2)
data, lats, lons = grb.data(lat1=5,lat2=40,lon1=62.5,lon2=97.5)
data.shape, lats.min(), lats.max(), lons.min(), lons.max()

ValueError: unsupported grid sh

In [74]:
grbs.message(1).values

array([  2.86775391e+02,   0.00000000e+00,  -3.92261982e+00, ...,
         2.30260826e-04,   1.06957364e-03,  -8.94591405e-04])

In [75]:
grbs.message(1).keys()

['parametersVersion',
 'UseEcmfConventions',
 'GRIBEX_boustrophedonic',
 'hundred',
 'globalDomain',
 'GRIBEditionNumber',
 'eps',
 'offsetSection0',
 'section0Length',
 'totalLength',
 'editionNumber',
 'WMO',
 'productionStatusOfProcessedData',
 'section1Length',
 'wrongPadding',
 'table2Version',
 'centre',
 'centreDescription',
 'generatingProcessIdentifier',
 'gridDefinition',
 'indicatorOfParameter',
 'parameterName',
 'parameterUnits',
 'indicatorOfTypeOfLevel',
 'pressureUnits',
 'typeOfLevelECMF',
 'typeOfLevel',
 'level',
 'yearOfCentury',
 'month',
 'day',
 'hour',
 'minute',
 'second',
 'unitOfTimeRange',
 'P1',
 'P2',
 'timeRangeIndicator',
 'numberIncludedInAverage',
 'numberMissingFromAveragesOrAccumulations',
 'centuryOfReferenceTimeOfData',
 'subCentre',
 'paramIdECMF',
 'paramId',
 'cfNameECMF',
 'cfName',
 'cfVarNameECMF',
 'cfVarName',
 'unitsECMF',
 'units',
 'nameECMF',
 'name',
 'decimalScaleFactor',
 'setLocalDefinition',
 'dataDate',
 'year',
 'dataTime',
 'jul

In [95]:
grbs.rewind()

temparature_values = []
for grb in grbs:
    if grb.parameterName == 'Temperature': 
        temparature_values.append(grb.values)
temparature_values = np.array(t2mens)
print(temparature_values.shape, temparature_values.min(), temparature_values.max(), temparature_values.mean())

humidity_values= []
for grb in grbs:
    if grb.parameterName == 'Relative Humidity': 
        humidity_values.append(grb.values)
humidity_values = np.array(t2mens)
print(humidity_values.shape, humidity_values.min(), humidity_values.max(), humidity_values.mean())

lats, lons = grb.latlons()  # get the lats and lons for the grid.
print('min/max lat and lon',lats.min(), lats.max(), lons.min(), lons.max())

(3896, 25760) -13.4843883514 290.987548828 0.0107722860085
(3896, 25760) -13.4843883514 290.987548828 0.0107722860085


ValueError: unsupported grid sh