# Hangzhou Xiaoshan District Project
## Local Emission Inventory Preprocessing

---
*@author: Evan*\
*@date: 2023-07-19*

In [1]:
import xarray as xr
import numpy as np
import pandas as pd
import os

# suppress warning messages
import warnings
warnings.filterwarnings("ignore")

import sys
sys.path.append('../../src/')
from namelist import *
import findpoint as fp

Create the grid variable from the model's GRIDCRO2D file.

In [2]:
grid = xr.open_dataset(datadir+'GRIDCRO2D_2022234.nc')
lat = grid.LAT[0,0,:,:]
lon = grid.LON[0,0,:,:]

gridfile = xr.Dataset(
    data_vars = dict(
        ShapeVar = (['y','x'],np.zeros_like(lat),{'long_name':'not-used variable'})
    ),
    coords=dict(
        latitude = (['y','x'],lat.data),
        longitude = (['y','x'],lon.data)
    )
)
gridfile

Read the point-source data from the local inventory and fill missing values with 0.

In [6]:
ef = pd.read_excel(emispoint,skiprows=1)
# Fill missing pollutant values with 0
species = ['SO2','NOx','CO','PM10','PM25','VOCs','NH3','BC','OC']
ef[species] = ef[species].fillna(0.0)
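
To illustrate the fill step, here is a minimal sketch on a toy table. The column `name` and all values are hypothetical; the real columns come from the inventory Excel file. It only shows that filling a column subset leaves the other columns untouched.

```python
import numpy as np
import pandas as pd

# Toy inventory with hypothetical values (not taken from the real data).
toy = pd.DataFrame({
    "name": ["A", "B", "C"],
    "SO2": [1.5, np.nan, 0.2],
    "NOx": [np.nan, 3.0, np.nan],
})
toy_species = ["SO2", "NOx"]

# Fill only the pollutant columns; "name" is not modified.
toy[toy_species] = toy[toy_species].fillna(0.0)

# Count the missing values that remain in the pollutant columns.
n_missing = int(toy[toy_species].isna().sum().sum())
```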

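Store the per-category DataFrames in a nested dictionary, keyed by primary and then secondary source category, so they are easy to loop over.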

In [9]:
dfs = {}
grouped = ef.groupby(['一级源类', '二级源类'])
for group_name, group_data in grouped:
    if group_name[0] not in dfs:
        dfs[group_name[0]] = {}
    dfs[group_name[0]][group_name[1]] = pd.DataFrame(group_data)
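
The same nested-dictionary pattern can be sketched on toy data. The column names `primary` and `secondary` and all values below are hypothetical stand-ins for the two source-category columns; `setdefault` is used here as a compact alternative to the explicit membership check.

```python
import pandas as pd

# Toy point-source table with hypothetical categories and values.
toy = pd.DataFrame({
    "primary": ["industry", "industry", "dust", "industry"],
    "secondary": ["all", "all", "yard", "boiler"],
    "SO2": [1.0, 2.0, 0.5, 4.0],
})

# Outer key: primary category; inner key: secondary category.
nested = {}
for (primary, secondary), group in toy.groupby(["primary", "secondary"]):
    nested.setdefault(primary, {})[secondary] = group.reset_index(drop=True)

industry_keys = sorted(nested["industry"])
```

Each leaf is an ordinary DataFrame, so `nested["industry"]["all"]` can be used directly in later steps.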

Define the mapping between the local source categories and the MEIC source categories.

In [50]:
smp = pd.read_excel(secmap).groupby('SourceType').get_group('point')
smp

Unnamed: 0,MEIC,LocalPrimarySource,LocalSecondarySource,SourceType
0,Power,电力源,不分,point
1,Industry,工业源,不分,point
2,Industry,扬尘源,堆场扬尘,point
3,Transportation,存储运输源,加油站,point
4,Transportation,存储运输源,油气运输,point
5,Transportation,非道路移动源,飞机,point
12,Agriculture,农业源,畜禽养殖,point
23,Residential,非工业溶剂使用,汽车维修,point
24,Residential,废弃物处理,废水处理,point
25,Residential,废弃物处理,固废处理,point


Using this mapping, merge the local source categories into the five MEIC sectors.

In [59]:
smp_grouped = smp.groupby('MEIC')
sections = list(smp_grouped.groups.keys()) # ['Agriculture', 'Industry', 'Power', 'Residential', 'Transportation']
df_target = {}
for sec in sections:
    target_source_list = smp_grouped.get_group(sec)[['LocalPrimarySource','LocalSecondarySource']].reset_index(drop=True)
    df_temp = {}
    for n in range(len(target_source_list)):
        df_temp[n] = dfs[target_source_list.iloc[n,0]][target_source_list.iloc[n,1]]
    
    df_target[sec] = pd.concat(df_temp,axis=0).reset_index(drop=True)
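
As a rough illustration of this regrouping step, the sketch below concatenates toy leaf tables according to a small mapping table. All names and values (`nested`, `mapping`, `primary`, `secondary`) are hypothetical; only the lookup-then-concatenate pattern mirrors the cell above.

```python
import pandas as pd

# Toy nested dictionary: primary category -> secondary category -> DataFrame.
nested = {
    "industry": {"all": pd.DataFrame({"SO2": [1.0, 2.0]})},
    "dust": {"yard": pd.DataFrame({"SO2": [0.5]})},
    "power": {"all": pd.DataFrame({"SO2": [9.0]})},
}

# Toy mapping from local (primary, secondary) pairs to a target sector.
mapping = pd.DataFrame({
    "MEIC": ["Industry", "Industry", "Power"],
    "primary": ["industry", "dust", "power"],
    "secondary": ["all", "yard", "all"],
})

# For each sector, look up every mapped pair and stack the tables.
by_sector = {}
for sector, rows in mapping.groupby("MEIC"):
    frames = [nested[p][s] for p, s in zip(rows["primary"], rows["secondary"])]
    by_sector[sector] = pd.concat(frames, ignore_index=True)
```

A pair that appears in the mapping but not in the data raises a `KeyError` in this sketch, which makes a mismatch between the two tables easy to notice.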


Assign the inventory to grid points by longitude and latitude, then save each sector as a NetCDF file.

In [64]:
for sec in sections:
    temp = fp.assign_values_to_grid(df_target[sec],gridfile,'lon','lat',species)
    temp.to_netcdf(f'D:/Download/{sec}.nc')
    print(f'{sec} finished!')


Complete SO2
Complete NOx
Complete CO
Complete PM10
Complete PM25
Complete VOCs
Complete NH3
Complete BC
Complete OC
Agriculture finished!
Complete SO2
Complete NOx
Complete CO
Complete PM10
Complete PM25
Complete VOCs
Complete NH3
Complete BC
Complete OC
Industry finished!
Complete SO2
Complete NOx
Complete CO
Complete PM10
Complete PM25
Complete VOCs
Complete NH3
Complete BC
Complete OC
Power finished!
Complete SO2
Complete NOx
Complete CO
Complete PM10
Complete PM25
Complete VOCs
Complete NH3
Complete BC
Complete OC
Residential finished!
Complete SO2
Complete NOx
Complete CO
Complete PM10
Complete PM25
Complete VOCs
Complete NH3
Complete BC
Complete OC
Transportation finished!
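
The internals of `fp.assign_values_to_grid` are not shown in this notebook. One plausible way to do this kind of gridding, sketched below as an assumption rather than the module's actual method, is to add each point's emission to the nearest grid cell. The helper `nearest_cell_sum` and the toy coordinates are hypothetical, and distance is measured in plain degrees, which ignores map projection.

```python
import numpy as np

def nearest_cell_sum(lon2d, lat2d, pt_lon, pt_lat, values):
    """Add each point value to the nearest cell of a 2-D lon/lat grid."""
    out = np.zeros(lon2d.shape, dtype=float)
    for x, y, v in zip(pt_lon, pt_lat, values):
        # Squared distance in degrees from this point to every cell centre.
        dist2 = (lon2d - x) ** 2 + (lat2d - y) ** 2
        j, i = np.unravel_index(np.argmin(dist2), dist2.shape)
        out[j, i] += v
    return out

# Toy 2 x 3 grid and three hypothetical point sources.
lon2d, lat2d = np.meshgrid(np.array([120.0, 120.1, 120.2]),
                           np.array([30.0, 30.1]))
grid_so2 = nearest_cell_sum(
    lon2d, lat2d,
    pt_lon=[120.01, 120.02, 120.19],
    pt_lat=[30.01, 30.00, 30.09],
    values=[1.0, 2.0, 5.0],
)
```

Because every point lands in exactly one cell, the gridded total equals the sum of the point emissions, which is a useful check after gridding.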
