# GHCND.py

GHCND.py is a set of Python tools to make it easier to work with station data from [Global Historical Climatology Network Daily (GHCND)](https://www.ncdc.noaa.gov/ghcn-daily-description).

The tools extract a variable (element) of interest from a Global Historical Climatology Network Daily (GHCND) Version 3 station file into a pandas DataFrame.

To run this you will need:
* A GHCN-D `.dly` file for your chosen station
* The `ghcnd-stations.txt` metadata file (see https://www1.ncdc.noaa.gov/pub/data/ghcn/daily/ghcnd-stations.txt)
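
Both inputs can also be fetched from Python. The following is a minimal sketch, not part of `ghcnd`: `station_file_url` and `fetch_inputs` are hypothetical helpers, and the `all/` subdirectory on the HTTPS server is an assumption based on the FTP path used in the example further down.

```python
from urllib.request import urlretrieve

# Base directory taken from the URLs in this README. The `all/` subdirectory
# mirrors the FTP layout and is assumed to exist on the HTTPS server too.
BASE_URL = "https://www1.ncdc.noaa.gov/pub/data/ghcn/daily"


def station_file_url(station_id):
    """Return the assumed URL of the '.dly' file for one station ID."""
    if len(station_id) != 11:
        raise ValueError("GHCN-D station IDs are 11 characters long")
    return f"{BASE_URL}/all/{station_id}.dly"


def fetch_inputs(station_id, dest_dir="."):
    """Download the station file and the metadata file (hypothetical helper)."""
    targets = {
        f"{station_id}.dly": station_file_url(station_id),
        "ghcnd-stations.txt": f"{BASE_URL}/ghcnd-stations.txt",
    }
    for filename, url in targets.items():
        urlretrieve(url, f"{dest_dir}/{filename}")
    return sorted(targets)
```

Calling `fetch_inputs("CHM00057516")` would place both files in the current directory, provided the server layout matches the assumption above.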

More information on the data can be found here:
https://www1.ncdc.noaa.gov/pub/data/ghcn/daily/readme.txt
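
To give a feel for the file format, here is a sketch of how a single `.dly` record can be split by hand. It follows the fixed-width layout described in readme.txt (an 11-character station ID, year, month, element, then 31 groups of a 5-character value plus three flags, with `-9999` marking a missing value). `parse_dly_line` is a hypothetical name and is not necessarily how `ghcnd.create_DataFrame` works internally.

```python
MISSING = -9999                      # sentinel for "no observation" in .dly files
TENTHS = {"PRCP", "TMAX", "TMIN"}    # elements stored in tenths of a unit


def parse_dly_line(line):
    """Split one fixed-width .dly record into a header and 31 daily entries."""
    record = {
        "station": line[0:11],
        "year": int(line[11:15]),
        "month": int(line[15:17]),
        "element": line[17:21],
        "days": [],
    }
    for day in range(31):
        start = 21 + 8 * day         # each day occupies 8 characters
        raw = int(line[start:start + 5])
        value = None if raw == MISSING else raw
        if value is not None and record["element"] in TENTHS:
            value = value / 10       # tenths of degC or mm -> degC or mm
        record["days"].append({
            "day": day + 1,
            "value": value,
            "mflag": line[start + 5:start + 6].strip(),
            "qflag": line[start + 6:start + 7].strip(),
            "sflag": line[start + 7:start + 8].strip(),
        })
    return record
```

Every record carries 31 daily slots regardless of the month, so days that do not exist (for example 30 February) appear as missing values.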


## Example

```python
import numpy as np
import pandas as pd
import ghcnd
```

```python
# 1. Find the station name and ID here:
#        https://www1.ncdc.noaa.gov/pub/data/ghcn/daily/ghcnd-stations.txt
# 2. Download the station file, for example:
#        wget ftp://ftp.ncdc.noaa.gov/pub/data/ghcn/daily/all/CHM00057516.dly

# Extract all data into a labelled pandas DataFrame
df = ghcnd.create_DataFrame('CHM00057516.dly')

# Filter the data for the desired variable, e.g. TMIN
var = 'TMIN'
df = df[df['element'] == var]

# Tidy up columns: name the value column after the variable
df = df.rename(index=str, columns={"value": var})
df = df.drop(['element'], axis=1)

# Add station metadata
df = ghcnd.add_metadata(df, 'ghcnd-stations.txt')
```

```
PRCP values have been divided by ten as specified by readme.txt
TMAX values have been divided by ten as specified by readme.txt
TMIN values have been divided by ten as specified by readme.txt
```
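
The metadata step relies on `ghcnd-stations.txt`, which is also a fixed-width file. As a rough sketch of what a lookup in that file involves, the code below reads the ID, latitude, longitude, elevation and name columns at the positions given in readme.txt. `parse_station_line` and `load_station` are hypothetical helpers and may differ from what `ghcnd.add_metadata` actually does.

```python
def parse_station_line(line):
    """Extract the main fields from one line of ghcnd-stations.txt."""
    return {
        "station": line[0:11],            # columns 1-11
        "lat": float(line[12:20]),        # columns 13-20
        "lon": float(line[21:30]),        # columns 22-30
        "elev": float(line[31:37]),       # columns 32-37
        "name": line[41:71].strip(),      # columns 42-71
    }


def load_station(path, station_id):
    """Return the metadata for one station, or None if the ID is not listed."""
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            if line.startswith(station_id):
                return parse_station_line(line)
    return None
```

With the metadata file in the working directory, `load_station('ghcnd-stations.txt', 'CHM00057516')` would return a dictionary with the same `lat`, `lon`, `elev` and `name` fields that appear as columns in the DataFrame.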


```python
df.head(n=10)
```

| Unnamed: 0 | station | year | month | day | TMIN | mflag | qflag | sflag | lat | lon | elev | name |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 620 | CHM00057516 | 1951 | 1 | 1 | 6.1 | | | s | 29.583 | 106.467 | 416.0 | CHONGQING |
| 621 | CHM00057516 | 1951 | 1 | 2 | 4.3 | | | s | 29.583 | 106.467 | 416.0 | CHONGQING |
| 622 | CHM00057516 | 1951 | 1 | 3 | 3.0 | | | s | 29.583 | 106.467 | 416.0 | CHONGQING |
| 623 | CHM00057516 | 1951 | 1 | 4 | 8.3 | | | s | 29.583 | 106.467 | 416.0 | CHONGQING |
| 624 | CHM00057516 | 1951 | 1 | 5 | 8.9 | | | s | 29.583 | 106.467 | 416.0 | CHONGQING |
| 625 | CHM00057516 | 1951 | 1 | 6 | 8.4 | | | s | 29.583 | 106.467 | 416.0 | CHONGQING |
| 626 | CHM00057516 | 1951 | 1 | 7 | 7.7 | | | s | 29.583 | 106.467 | 416.0 | CHONGQING |
| 627 | CHM00057516 | 1951 | 1 | 8 | 10.5 | | | s | 29.583 | 106.467 | 416.0 | CHONGQING |
| 628 | CHM00057516 | 1951 | 1 | 9 | 4.0 | | | s | 29.583 | 106.467 | 416.0 | CHONGQING |
| 629 | CHM00057516 | 1951 | 1 | 10 | 3.2 | | | s | 29.583 | 106.467 | 416.0 | CHONGQING |
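
Because the date is spread over separate `year`, `month` and `day` columns, a common next step is to combine them into a single datetime index. The sketch below uses a small hand-made frame with illustrative values rather than a real station file. Whether the DataFrame returned by `ghcnd` contains rows for non-existent calendar days is not documented here, so the `errors="coerce"` guard is a precaution.

```python
import pandas as pd

# Hand-made frame with the same date columns as the output above;
# the values are illustrative only.
df = pd.DataFrame({
    "year": [1951, 1951, 1951],
    "month": [1, 1, 2],
    "day": [1, 2, 30],            # 30 February does not exist
    "TMIN": [6.1, 4.3, None],
})

# pandas can assemble dates from year/month/day columns;
# errors="coerce" turns impossible dates into NaT instead of raising.
df["date"] = pd.to_datetime(df[["year", "month", "day"]], errors="coerce")

# Drop the impossible dates and index the series by date
daily = df.dropna(subset=["date"]).set_index("date")["TMIN"]
```

With a datetime index in place, pandas time-series tools such as `daily.resample(...)` become available.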


```python
# Save to file
name = '-'.join(np.unique(df['name'].values))    # in case there is more than one
stn = '-'.join(np.unique(df['station'].values))
df.to_csv(f'{name}_{stn}_{var}_GHCN-D.csv', index=False)
```