# Merge data from AURN



## Read and merge data by longitude and latitude bounds

Users can supply longitude and latitude bounds to select the stations whose data will be downloaded.

### Load the obsaq package and define the bounds

In [2]:
import obsaq

# Bounding box in decimal degrees
lon_min = -9
lon_max = 1.8
lat_min = 49
lat_max = 61
# Order used below: [lon_min, lon_max, lat_min, lat_max]
bounds = [lon_min, lon_max, lat_min, lat_max]

### Get the station metadata

NOTE: This table contains the information for all stations. The next step selects the stations that fall within the bounds.

In [3]:
meta = obsaq.meta()
site_table = meta.get_metadata('aurn')

### Choose the stations by bounds

In [4]:
final_sites = meta.get_site(bounds=bounds)
final_sites.head(5)

Site is selected by bounds: [-9, 1.8, 49, 61]


Unnamed: 0,site_id,site_name,location_type,latitude,longitude,parameter,Parameter_name,start_date,end_date,ratified_to,zone,agglomeration,local_authority
0,ABD,Aberdeen,Urban Background,57.15736,-2.094278,O3,Ozone,2003-08-01,2021-09-20,2021-09-20,North East Scotland,,Aberdeen City
1,ABD,Aberdeen,Urban Background,57.15736,-2.094278,NO,Nitric oxide,1999-09-18,2021-09-20,2021-09-20,North East Scotland,,Aberdeen City
2,ABD,Aberdeen,Urban Background,57.15736,-2.094278,NO2,Nitrogen dioxide,1999-09-18,2021-09-20,2021-09-20,North East Scotland,,Aberdeen City
3,ABD,Aberdeen,Urban Background,57.15736,-2.094278,NOXasNO2,Nitrogen oxides as nitrogen dioxide,1999-09-18,2021-09-20,2021-09-20,North East Scotland,,Aberdeen City
4,ABD,Aberdeen,Urban Background,57.15736,-2.094278,SO2,Sulphur dioxide,2001-01-01,2007-09-30,2007-09-30,North East Scotland,,Aberdeen City
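Selecting by bounds can be pictured as a bounding-box test on each station's coordinates. The sketch below is not OBSAQ's implementation; it is a minimal standard-library illustration that assumes the same `[lon_min, lon_max, lat_min, lat_max]` ordering as `bounds` above. The helper `in_bounds` and the station `XX` are hypothetical; the other coordinates come from the outputs in this notebook.

```python
def in_bounds(lon, lat, bounds):
    """Return True if (lon, lat) lies inside [lon_min, lon_max, lat_min, lat_max]."""
    lon_min, lon_max, lat_min, lat_max = bounds
    return lon_min <= lon <= lon_max and lat_min <= lat <= lat_max


bounds = [-9, 1.8, 49, 61]

stations = [
    {"site_id": "ABD", "latitude": 57.15736, "longitude": -2.094278},
    {"site_id": "YK11", "latitude": 53.951889, "longitude": -1.075861},
    # Made-up station outside the box, for illustration only
    {"site_id": "XX", "latitude": 48.0, "longitude": 2.5},
]

# Keep only the stations whose coordinates fall inside the box
selected = [
    s["site_id"]
    for s in stations
    if in_bounds(s["longitude"], s["latitude"], bounds)
]
print(selected)
```

Whether OBSAQ treats stations exactly on the edge of the box as inside is not stated here; this sketch includes them.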


### Download the merged data for the selected stations

Download the final merged file for all selected stations using the "memory" download mode.

- `pollutant`: the pollutant or pollutants to download. You can request one pollutant, several pollutants, or all of them. Valid names are "PM2.5", "PM10", "O3", "NO", "NO2", "NOXasNO2" and "SO2".
- `start`: the start date of the data to download.
- `end`: the end date of the data to download.
- `year`: the year of the data to download. Defaults to 2010.
- `output_dir`: the directory in which to save the downloaded data.
- `download_mode`: "Stream" saves the final and intermediate files, while "memory" saves only the final file.
- `save_per_site`: whether to save a separate file for each station.
- `save_merged`: whether to save the merged file for all selected data.
- `add_site_id`: whether to include the site ID in the downloaded file.

Warnings are expected when processing data in different formats and can usually be ignored.
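If the warnings clutter the notebook output, they can be silenced for a single call with Python's standard `warnings` module. The sketch below uses a hypothetical stand-in function, `noisy_step`, in place of the download call, because the exact warning categories OBSAQ emits are not documented here.

```python
import warnings


def noisy_step():
    """Hypothetical stand-in for a call that warns about mixed data formats."""
    warnings.warn("mixed column formats", UserWarning)
    return "done"


# Suppress warnings only inside this block; the filters are restored afterwards
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    result = noisy_step()

print(result)
```

Using `simplefilter("ignore")` hides every warning raised inside the block. If you know which category is involved, passing `category=` narrows the filter so that unrelated warnings remain visible.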

In [5]:
merged_df = meta.download_sites(
    port="aurn",
    pollutant="PM2.5",
    start="2017-12-01",
    end="2018-11-30",
    output_dir="data/test_pm25_final",
    download_mode="memory",   
    save_per_site=False,    
    save_merged=True,
    add_site_id=True
)


     site_id        site_name     location_type   latitude  longitude  \
0        ABD         Aberdeen  Urban Background  57.157360  -2.094278   
1        ABD         Aberdeen  Urban Background  57.157360  -2.094278   
2        ABD         Aberdeen  Urban Background  57.157360  -2.094278   
3        ABD         Aberdeen  Urban Background  57.157360  -2.094278   
6        ABD         Aberdeen  Urban Background  57.157360  -2.094278   
...      ...              ...               ...        ...        ...   
3057    YK11  York Fishergate     Urban Traffic  53.951889  -1.075861   
3058    YK11  York Fishergate     Urban Traffic  53.951889  -1.075861   
3059    YK11  York Fishergate     Urban Traffic  53.951889  -1.075861   
3060    YK11  York Fishergate     Urban Traffic  53.951889  -1.075861   
3061    YK11  York Fishergate     Urban Traffic  53.951889  -1.075861   

     parameter                             Parameter_name  start_date  \
0           O3                                    

## Merge data by a longitude and latitude point

Users can supply a single longitude and latitude point to select the station data to download.

### Load the obsaq package and define the point

In [6]:
import obsaq

# Target point in decimal degrees, ordered as [longitude, latitude]
lon = 0.0
lat = 55.0
point = [lon, lat]

### Get the station metadata

NOTE: This table contains the information for all stations. The next step selects the station by point.

In [7]:
meta = obsaq.obsaq.meta()
site_table = meta.get_metadata('aurn')

site_table.head(5)

Unnamed: 0,site_id,site_name,location_type,latitude,longitude,parameter,Parameter_name,start_date,end_date,ratified_to,zone,agglomeration,local_authority
0,ABD,Aberdeen,Urban Background,57.15736,-2.094278,O3,Ozone,2003-08-01,2021-09-20,2021-09-20,North East Scotland,,Aberdeen City
1,ABD,Aberdeen,Urban Background,57.15736,-2.094278,NO,Nitric oxide,1999-09-18,2021-09-20,2021-09-20,North East Scotland,,Aberdeen City
2,ABD,Aberdeen,Urban Background,57.15736,-2.094278,NO2,Nitrogen dioxide,1999-09-18,2021-09-20,2021-09-20,North East Scotland,,Aberdeen City
3,ABD,Aberdeen,Urban Background,57.15736,-2.094278,NOXasNO2,Nitrogen oxides as nitrogen dioxide,1999-09-18,2021-09-20,2021-09-20,North East Scotland,,Aberdeen City
4,ABD,Aberdeen,Urban Background,57.15736,-2.094278,SO2,Sulphur dioxide,2001-01-01,2007-09-30,2007-09-30,North East Scotland,,Aberdeen City


### Choose the stations by point

In [8]:
final_sites = meta.get_site(point=point)
final_sites

Site is selected by point: [0.0, 55.0]


Unnamed: 0,site_id,site_name,location_type,latitude,longitude,parameter,Parameter_name,start_date,end_date,ratified_to,zone,agglomeration,local_authority
1267,HM,High Muffles,Rural Background,54.334497,-0.80882,O3,Ozone,1987-07-16,ongoing,2025-09-30,Yorkshire & Humberside,,Ryedale
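The output above shows a single station returned for the point. One common way to make such a choice is to pick the station with the smallest great-circle distance to the point. The sketch below illustrates that idea with the haversine formula. That OBSAQ uses this distance measure is an assumption, and `haversine_km` is a hypothetical helper; the coordinates come from the outputs in this notebook.

```python
import math


def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance in kilometres between two lon/lat points."""
    radius_km = 6371.0  # mean Earth radius
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(a))


point = [0.0, 55.0]  # [longitude, latitude]

stations = [
    {"site_id": "ABD", "latitude": 57.15736, "longitude": -2.094278},
    {"site_id": "HM", "latitude": 54.334497, "longitude": -0.80882},
    {"site_id": "YK11", "latitude": 53.951889, "longitude": -1.075861},
]

# Pick the station with the smallest distance to the point
nearest = min(
    stations,
    key=lambda s: haversine_km(point[0], point[1], s["longitude"], s["latitude"]),
)
print(nearest["site_id"])
```

Among these three stations the sketch picks High Muffles (`HM`), which matches the selection shown above.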


### Merge the selected station data
- Use `start` and `end` to define the time period.
- The other parameters are the same as in the bounds-based section above.

In [9]:
meta.download_sites(
    port="aurn",
    pollutant="PM2.5",
    start="2017-12-01",
    end="2018-11-30",
    output_dir="data/test_pm25_station",
    download_mode="memory",     
    save_per_site=False,
    save_merged=True,
    add_site_id=True
)

     site_id     site_name     location_type   latitude  longitude parameter  \
1267      HM  High Muffles  Rural Background  54.334497   -0.80882        O3   

     Parameter_name  start_date end_date ratified_to                    zone  \
1267          Ozone  1987-07-16  ongoing  2025-09-30  Yorkshire & Humberside   

     agglomeration local_authority  
1267           NaN         Ryedale  


## Merge data by site_id

### Load the obsaq package and check the site information

In [10]:
import obsaq

meta = obsaq.obsaq.meta()
site_table = meta.get_metadata('aurn')

site_table.head(5)

Unnamed: 0,site_id,site_name,location_type,latitude,longitude,parameter,Parameter_name,start_date,end_date,ratified_to,zone,agglomeration,local_authority
0,ABD,Aberdeen,Urban Background,57.15736,-2.094278,O3,Ozone,2003-08-01,2021-09-20,2021-09-20,North East Scotland,,Aberdeen City
1,ABD,Aberdeen,Urban Background,57.15736,-2.094278,NO,Nitric oxide,1999-09-18,2021-09-20,2021-09-20,North East Scotland,,Aberdeen City
2,ABD,Aberdeen,Urban Background,57.15736,-2.094278,NO2,Nitrogen dioxide,1999-09-18,2021-09-20,2021-09-20,North East Scotland,,Aberdeen City
3,ABD,Aberdeen,Urban Background,57.15736,-2.094278,NOXasNO2,Nitrogen oxides as nitrogen dioxide,1999-09-18,2021-09-20,2021-09-20,North East Scotland,,Aberdeen City
4,ABD,Aberdeen,Urban Background,57.15736,-2.094278,SO2,Sulphur dioxide,2001-01-01,2007-09-30,2007-09-30,North East Scotland,,Aberdeen City


### Choose the stations by site_id

In [11]:
final_sites = meta.get_site(site_id='ABD')
final_sites.drop_duplicates(subset='site_id')

Site is selected by site_id: ABD


Unnamed: 0,site_id,site_name,location_type,latitude,longitude,parameter,Parameter_name,start_date,end_date,ratified_to,zone,agglomeration,local_authority
0,ABD,Aberdeen,Urban Background,57.15736,-2.094278,O3,Ozone,2003-08-01,2021-09-20,2021-09-20,North East Scotland,,Aberdeen City
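Each metadata row carries a `start_date` and an `end_date` for one parameter at one site, so it can be useful to confirm that a record overlaps the requested period before downloading. The sketch below is a standard-library illustration, not part of OBSAQ. The helper `covers` is hypothetical, and treating `"ongoing"` as an open-ended record is an assumption based on the metadata shown in this notebook.

```python
from datetime import date


def covers(start_date, end_date, start, end):
    """True if the record [start_date, end_date] overlaps the period [start, end]."""
    rec_start = date.fromisoformat(start_date)
    # Assumption: "ongoing" means the record has no end yet
    rec_end = date.max if end_date == "ongoing" else date.fromisoformat(end_date)
    return rec_start <= date.fromisoformat(end) and rec_end >= date.fromisoformat(start)


# Rows copied from the metadata outputs in this notebook
rows = [
    {"site_id": "ABD", "parameter": "O3", "start_date": "2003-08-01", "end_date": "2021-09-20"},
    {"site_id": "ABD", "parameter": "SO2", "start_date": "2001-01-01", "end_date": "2007-09-30"},
    {"site_id": "HM", "parameter": "O3", "start_date": "1987-07-16", "end_date": "ongoing"},
]

available = [
    (r["site_id"], r["parameter"])
    for r in rows
    if covers(r["start_date"], r["end_date"], "2017-12-01", "2018-11-30")
]
print(available)
```

Here the SO2 record at Aberdeen is dropped because it ended in 2007, before the requested period begins.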


### Merge the selected station data
- Use `start` and `end` to define the time period.
- The other parameters are the same as in the bounds-based section above.

In [12]:
meta.download_sites(
    port="aurn",
    pollutant="PM2.5",
    start="2017-12-01",
    end="2018-11-30",
    output_dir="data/test_pm25_siteid",
    download_mode="memory",     
    save_per_site=False,
    save_merged=True,
    add_site_id=True
)

   site_id site_name     location_type  latitude  longitude parameter  \
0      ABD  Aberdeen  Urban Background  57.15736  -2.094278        O3   
1      ABD  Aberdeen  Urban Background  57.15736  -2.094278        NO   
2      ABD  Aberdeen  Urban Background  57.15736  -2.094278       NO2   
3      ABD  Aberdeen  Urban Background  57.15736  -2.094278  NOXasNO2   
6      ABD  Aberdeen  Urban Background  57.15736  -2.094278      PM10   
7      ABD  Aberdeen  Urban Background  57.15736  -2.094278      NV10   
8      ABD  Aberdeen  Urban Background  57.15736  -2.094278       V10   
9      ABD  Aberdeen  Urban Background  57.15736  -2.094278     PM2.5   
10     ABD  Aberdeen  Urban Background  57.15736  -2.094278     NV2.5   
11     ABD  Aberdeen  Urban Background  57.15736  -2.094278      V2.5   
12     ABD  Aberdeen  Urban Background  57.15736  -2.094278        wd   
13     ABD  Aberdeen  Urban Background  57.15736  -2.094278        ws   
14     ABD  Aberdeen  Urban Background  57.15736  -

Unnamed: 0,site_id,Date,time,PM<sub>2.5</sub> particulate matter (Hourly measured),status.7,unit.7
0,ABD,30-11-2017,24:00,2.2,R,ugm-3 (TEOM FDMS)
1,ABD,01-12-2017,01:00,2.1,R,ugm-3 (TEOM FDMS)
2,ABD,01-12-2017,02:00,3.2,R,ugm-3 (TEOM FDMS)
3,ABD,01-12-2017,03:00,4.1,R,ugm-3 (TEOM FDMS)
4,ABD,01-12-2017,04:00,2.4,R,ugm-3 (TEOM FDMS)
...,...,...,...,...,...,...
8755,ABD,30-11-2018,19:00,3.0,R,ugm-3 (TEOM FDMS)
8756,ABD,30-11-2018,20:00,5.6,R,ugm-3 (TEOM FDMS)
8757,ABD,30-11-2018,21:00,3.0,R,ugm-3 (TEOM FDMS)
8758,ABD,30-11-2018,22:00,0.3,R,ugm-3 (TEOM FDMS)
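The merged table stores `Date` as DD-MM-YYYY and `time` as an hour label, and the first row uses `24:00`. Python's `datetime.strptime` rejects an hour of 24, so combining the two columns needs a small workaround. The sketch below is one way to do it, assuming that `24:00` denotes midnight at the end of the given day. `parse_timestamp` is a hypothetical helper, not an OBSAQ function.

```python
from datetime import datetime, timedelta


def parse_timestamp(date_str, time_str):
    """Combine a DD-MM-YYYY date and an HH:MM label, allowing the label 24:00."""
    day = datetime.strptime(date_str, "%d-%m-%Y")
    hour, minute = (int(part) for part in time_str.split(":"))
    # Adding a timedelta rolls 24:00 over to 00:00 on the following day
    return day + timedelta(hours=hour, minutes=minute)


first = parse_timestamp("30-11-2017", "24:00")
second = parse_timestamp("01-12-2017", "01:00")
print(first)
print(second)
```

With this reading, the first two rows of the table are one hour apart, which is consistent with hourly measurements.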
