# NEA JSON parsing

The following code takes a JSON file from the NEA API and generates the corresponding `items` and `metadata` DataFrames. The different measures, i.e. `rainfall`, `air-temperature` and `relative-humidity`, have exactly the same structure. Each parsed DataFrame therefore carries a `Description` column, set through the `kind` argument, so that readings from different measures can be told apart.
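
For orientation, the payload shape that the parsing functions rely on can be sketched as below. The key names are inferred from the parsing code and the outputs in this notebook, and the single station is copied from the rainfall output, so treat this as an illustration rather than the full API schema.

```python
import json

# Illustrative payload: key names are inferred from the parsing code in this
# notebook, not taken from the official API schema.
sample = json.loads('''
{
  "metadata": {
    "stations": [
      {"id": "S77", "name": "Alexandra Road",
       "location": {"latitude": 1.2937, "longitude": 103.8125}}
    ],
    "reading_type": "TB1 Rainfall 5 Minute Total F",
    "reading_unit": "mm"
  },
  "items": [
    {"timestamp": "2022-05-24T22:10:00+08:00",
     "readings": [{"station_id": "S77", "value": 0}]}
  ]
}
''')

# Each reading carries only a station id and a value, so the measure has to
# be labelled separately when frames from several files are combined.
first_item = sample["items"][0]
print(first_item["timestamp"], first_item["readings"])
```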

In [1]:
import json
import pandas as pd

path = 'data/rainfall_2022-05-24T22-21-01.json'
with open(path,'r') as f:
    data=json.load(f)

In [2]:
def get_items(path: str, kind: str) -> pd.DataFrame:
    '''Parse items key from JSON
    Args:
        path: Path to file
        kind: Label written to the Description column, e.g. 'Rainfall'
    Returns:
        pandas DataFrame object of the following schema:
            timestamp
            station_id
            value
            Description
    '''
    with open(path, 'r') as f:
        data = json.load(f)

    # Only the first entry of `items` is parsed
    items = data['items'][0]
    readings = items['readings']

    df = pd.DataFrame({
        'timestamp': [items['timestamp']] * len(readings),
        # Fields are read by position: station id first, value second
        'station_id': [list(obj.values())[0] for obj in readings],
        'value': [list(obj.values())[1] for obj in readings],
        'Description': [kind] * len(readings)})

    return df

def get_metadata(path: str, kind: str) -> pd.DataFrame:
    '''Parse metadata key from JSON
    Args:
        path: Path to file
        kind: Label written to the Description column, e.g. 'Rainfall'
    Returns:
        pandas DataFrame object of the following schema:
            timestamp
            name
            latitude
            longitude
            station
            reading_type
            reading_unit
            Description
    '''
    with open(path, 'r') as f:
        data = json.load(f)

    metadata = data['metadata']
    stations = metadata['stations']
    n = len(stations)

    df = pd.DataFrame(
        {
            # The timestamp comes from the first item, not from the metadata
            'timestamp': [data['items'][0]['timestamp']] * n,
            'name': [station['name'] for station in stations],
            'latitude': [station['location']['latitude'] for station in stations],
            'longitude': [station['location']['longitude'] for station in stations],
            'station': [station['id'] for station in stations],
            # reading_type and reading_unit are file-level values repeated per station
            'reading_type': [metadata['reading_type']] * n,
            'reading_unit': [metadata['reading_unit']] * n,
            'Description': [kind] * n
        }
    )
    return df

In [3]:
rainfall_items = get_items('data/rainfall_2022-05-24T22-21-01.json','Rainfall')
rainfall_items

Unnamed: 0,timestamp,station_id,value,Description
0,2022-05-24T22:10:00+08:00,S77,0,Rainfall
1,2022-05-24T22:10:00+08:00,S109,0,Rainfall
2,2022-05-24T22:10:00+08:00,S90,0,Rainfall
3,2022-05-24T22:10:00+08:00,S114,0,Rainfall
4,2022-05-24T22:10:00+08:00,S50,0,Rainfall
...,...,...,...,...
61,2022-05-24T22:10:00+08:00,S69,0,Rainfall
62,2022-05-24T22:10:00+08:00,S08,0,Rainfall
63,2022-05-24T22:10:00+08:00,S116,0,Rainfall
64,2022-05-24T22:10:00+08:00,S104,0,Rainfall


In [4]:
rainfall_metadata = get_metadata('data/rainfall_2022-05-24T22-21-01.json','Rainfall')
rainfall_metadata

Unnamed: 0,timestamp,name,latitude,longitude,station,reading_type,reading_unit,Description
0,2022-05-24T22:10:00+08:00,Alexandra Road,1.29370,103.81250,S77,TB1 Rainfall 5 Minute Total F,mm,Rainfall
1,2022-05-24T22:10:00+08:00,Ang Mo Kio Avenue 5,1.37640,103.84920,S109,TB1 Rainfall 5 Minute Total F,mm,Rainfall
2,2022-05-24T22:10:00+08:00,Bukit Timah Road,1.31910,103.81910,S90,TB1 Rainfall 5 Minute Total F,mm,Rainfall
3,2022-05-24T22:10:00+08:00,Choa Chu Kang Avenue 4,1.38000,103.73000,S114,TB1 Rainfall 5 Minute Total F,mm,Rainfall
4,2022-05-24T22:10:00+08:00,Clementi Road,1.33370,103.77680,S50,TB1 Rainfall 5 Minute Total F,mm,Rainfall
...,...,...,...,...,...,...,...,...
61,2022-05-24T22:10:00+08:00,Upper Peirce Reservoir Park,1.37000,103.80500,S69,TB1 Rainfall 5 Minute Total F,mm,Rainfall
62,2022-05-24T22:10:00+08:00,Upper Thomson Road,1.37010,103.82710,S08,TB1 Rainfall 5 Minute Total F,mm,Rainfall
63,2022-05-24T22:10:00+08:00,West Coast Highway,1.28100,103.75400,S116,TB1 Rainfall 5 Minute Total F,mm,Rainfall
64,2022-05-24T22:10:00+08:00,Woodlands Avenue 9,1.44387,103.78538,S104,TB1 Rainfall 5 Minute Total F,mm,Rainfall
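
Since the items frame holds the readings and the metadata frame holds the station coordinates, one way to attach a location to each reading is a merge on the station id. This is a minimal sketch, not part of the original notebook: the two rows are copied from the rainfall outputs above, and the names `items`, `metadata` and `located` are placeholders.

```python
import pandas as pd

# Two rows copied from the rainfall outputs above, for illustration only.
items = pd.DataFrame({
    "timestamp": ["2022-05-24T22:10:00+08:00"] * 2,
    "station_id": ["S77", "S109"],
    "value": [0, 0],
    "Description": ["Rainfall"] * 2,
})
metadata = pd.DataFrame({
    "timestamp": ["2022-05-24T22:10:00+08:00"] * 2,
    "name": ["Alexandra Road", "Ang Mo Kio Avenue 5"],
    "latitude": [1.2937, 1.3764],
    "longitude": [103.8125, 103.8492],
    "station": ["S77", "S109"],
    "reading_type": ["TB1 Rainfall 5 Minute Total F"] * 2,
    "reading_unit": ["mm"] * 2,
    "Description": ["Rainfall"] * 2,
})

# The metadata frame calls the id column `station`, so rename it first.
# timestamp and Description are part of the key so that they are not
# duplicated with _x/_y suffixes in the result.
located = items.merge(
    metadata.rename(columns={"station": "station_id"}),
    on=["timestamp", "station_id", "Description"],
    how="left",
    validate="one_to_one",  # raises if a station appears twice on either side
)
print(located[["station_id", "name", "latitude", "longitude", "value"]])
```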


In [5]:
relative_humidity_items = get_items('data/relative-humidity_2022-05-24T22-21-01.json','Relative Humidity')
relative_humidity_items

Unnamed: 0,timestamp,station_id,value,Description
0,2022-05-24T22:15:00+08:00,S50,79.1,Relative Humidity
1,2022-05-24T22:15:00+08:00,S121,85.9,Relative Humidity
2,2022-05-24T22:15:00+08:00,S24,77.8,Relative Humidity
3,2022-05-24T22:15:00+08:00,S100,89.8,Relative Humidity


In [6]:
relative_humidity_metadata = get_metadata('data/relative-humidity_2022-05-24T22-21-01.json','Relative Humidity')
relative_humidity_metadata

Unnamed: 0,timestamp,name,latitude,longitude,station,reading_type,reading_unit,Description
0,2022-05-24T22:15:00+08:00,Clementi Road,1.3337,103.7768,S50,RH 1M F,percentage,Relative Humidity
1,2022-05-24T22:15:00+08:00,Old Choa Chu Kang Road,1.37288,103.72244,S121,RH 1M F,percentage,Relative Humidity
2,2022-05-24T22:15:00+08:00,Upper Changi Road North,1.3678,103.9826,S24,RH 1M F,percentage,Relative Humidity
3,2022-05-24T22:15:00+08:00,Woodlands Road,1.4172,103.74855,S100,RH 1M F,percentage,Relative Humidity


In [7]:
air_temperature_items = get_items('data/air-temperature_2022-05-24T22-21-01.json','Air Temperature')
air_temperature_items

Unnamed: 0,timestamp,station_id,value,Description
0,2022-05-24T22:15:00+08:00,S50,28.8,Air Temperature
1,2022-05-24T22:15:00+08:00,S121,27.8,Air Temperature
2,2022-05-24T22:15:00+08:00,S24,29.3,Air Temperature
3,2022-05-24T22:15:00+08:00,S100,27.5,Air Temperature


In [8]:
air_temperature_metadata = get_metadata('data/air-temperature_2022-05-24T22-21-01.json','Air Temperature')
air_temperature_metadata

Unnamed: 0,timestamp,name,latitude,longitude,station,reading_type,reading_unit,Description
0,2022-05-24T22:15:00+08:00,Clementi Road,1.3337,103.7768,S50,DBT 1M F,deg C,Air Temperature
1,2022-05-24T22:15:00+08:00,Old Choa Chu Kang Road,1.37288,103.72244,S121,DBT 1M F,deg C,Air Temperature
2,2022-05-24T22:15:00+08:00,Upper Changi Road North,1.3678,103.9826,S24,DBT 1M F,deg C,Air Temperature
3,2022-05-24T22:15:00+08:00,Woodlands Road,1.4172,103.74855,S100,DBT 1M F,deg C,Air Temperature
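
Because all three items frames share the same columns, they can be stacked into one table, with `Description` as the only thing that tells the measures apart. The sketch below assumes frames shaped like the `get_items` output; `sample_items` is a hypothetical helper and the values are copied from the outputs above.

```python
import pandas as pd

def sample_items(timestamp, station_ids, values, kind):
    """Build a small frame with the same columns that get_items returns."""
    return pd.DataFrame({
        "timestamp": timestamp,
        "station_id": station_ids,
        "value": values,
        "Description": kind,
    })

rainfall = sample_items("2022-05-24T22:10:00+08:00", ["S50"], [0], "Rainfall")
humidity = sample_items("2022-05-24T22:15:00+08:00", ["S50", "S121"],
                        [79.1, 85.9], "Relative Humidity")
temperature = sample_items("2022-05-24T22:15:00+08:00", ["S50", "S121"],
                           [28.8, 27.8], "Air Temperature")

# Stack the measures; without Description the rows would be indistinguishable.
readings = pd.concat([rainfall, humidity, temperature], ignore_index=True)

# One row per station, one column per measure. Stations that do not report
# a measure get NaN in that column.
wide = readings.pivot(index="station_id", columns="Description", values="value")
print(wide)
```

Note that the rainfall timestamp differs from the other two, so the pivot here deliberately ignores `timestamp`.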


# LTA Parsing

The following code extracts the longitude and latitude of every available taxi from a single LTA API call.

The taxi timestamp may differ from the NEA timestamps, so the two sources will need to be aligned before they are combined.
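
One possible way to handle the mismatch is to match each taxi snapshot to the most recent weather reading with `pd.merge_asof`. This is only a sketch under assumptions: the weather rows and their values are hypothetical, the 10-minute tolerance is arbitrary, and the notebook does not say which alignment strategy is actually intended.

```python
import pandas as pd

# Hypothetical weather readings in the timestamp format used by the NEA files.
weather = pd.DataFrame({
    "timestamp": ["2022-05-27T13:50:00+08:00", "2022-05-27T13:55:00+08:00"],
    "value": [0.0, 0.2],
})
# One taxi position in the shape that load_taxi_data returns.
taxis = pd.DataFrame({
    "timestamp": ["2022-05-27T13:59:30+08:00"],
    "longitude": [103.62462],
    "latitude": [1.3],
})

# Parse the ISO 8601 strings; utc=True gives both frames the same dtype.
for df in (weather, taxis):
    df["timestamp"] = pd.to_datetime(df["timestamp"], utc=True)

# For each taxi row, take the latest weather row at or before it,
# but only if it is at most 10 minutes old (assumed tolerance).
aligned = pd.merge_asof(
    taxis.sort_values("timestamp"),
    weather.sort_values("timestamp").rename(columns={"timestamp": "weather_timestamp"}),
    left_on="timestamp",
    right_on="weather_timestamp",
    direction="backward",
    tolerance=pd.Timedelta("10min"),
)
print(aligned)
```

Keeping the weather timestamp in its own column makes the size of the gap between the two sources visible after the merge.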

In [9]:
path = 'data/taxis_2022-05-27T14-00-03.json'

In [10]:
def load_taxi_data(path: str) -> pd.DataFrame:
    '''Parse taxi coordinates from JSON
    Args:
        path: Path to file

    Returns:
        pandas DataFrame object of the following schema:
            timestamp
            longitude
            latitude
    '''
    with open(path, 'r') as f:
        lta_data = json.load(f)

    # Only the first feature is parsed; it holds all taxi coordinates
    features = lta_data['features'][0]
    coordinates = features['geometry']['coordinates']

    # Each coordinate pair is ordered longitude first, latitude second
    longitude = [point[0] for point in coordinates]
    latitude = [point[1] for point in coordinates]
    timestamp = [features['properties']['timestamp']] * len(coordinates)
    df = pd.DataFrame({'timestamp': timestamp, 'longitude': longitude, 'latitude': latitude})
    return df

In [11]:
load_taxi_data(path)

Unnamed: 0,timestamp,longitude,latitude
0,2022-05-27T13:59:30+08:00,103.624620,1.300000
1,2022-05-27T13:59:30+08:00,103.658034,1.312330
2,2022-05-27T13:59:30+08:00,103.669129,1.325566
3,2022-05-27T13:59:30+08:00,103.679667,1.326507
4,2022-05-27T13:59:30+08:00,103.679980,1.314830
...,...,...,...
1017,2022-05-27T13:59:30+08:00,103.988921,1.357636
1018,2022-05-27T13:59:30+08:00,103.989799,1.358836
1019,2022-05-27T13:59:30+08:00,103.989860,1.360000
1020,2022-05-27T13:59:30+08:00,103.989890,1.360000
