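# Some Examples

This document mainly collects code that I have written and used in practice.

## Computing basin-averaged meteorological time series

This example takes two days of Daymet gridded data and several CAMELS basins, and computes the basin-mean value of each daily forcing variable for those two days.

The CAMELS files are fairly large, so they are not uploaded to GitHub. Download the CAMELS shapefile manually from [here](https://ral.ucar.edu/sites/default/files/public/product-tool/camels-catchment-attributes-and-meteorology-for-large-sample-studies-dataset-downloads/basin_set_full_res.zip), create a large_files folder in this directory, and put the downloaded CAMELS shapefile in it.

Alternatively, upload the shapefile as a GEE asset and load it directly.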

In [1]:
import ee
import geemap

Map = geemap.Map(center=[40, -100], zoom=4)
Map

Map(center=[40, -100], controls=(WidgetControl(options=['position', 'transparent_bg'], widget=HBox(children=(T…

In [2]:
# Add Earth Engine dataset
daymet = ee.ImageCollection("NASA/ORNL/DAYMET_V4")

In [3]:
# local file
# camels_shp = 'large_files/HCDN_nhru_final_671.shp'
# camels = geemap.shp_to_ee(camels_shp)
# remote asset
camels = ee.FeatureCollection("users/wenyu_ouyang/Camels/HCDN_nhru_final_671")
camels

<ee.featurecollection.FeatureCollection at 0x2a3a35d3040>

In [4]:
# it may be better to use ee.Number instead of a plain (JS-style) number
year = ee.Number(2000)
month = ee.Number(1)
day = ee.Number(1)
start_date = ee.Date.fromYMD(year, month, day)
end_date = start_date.advance(2, "day")
end_date

<ee.ee_date.Date at 0x2a3a94c3640>

In [5]:
days_num = end_date.difference(start_date, "day")
# count day from zero, and ee.List.sequence is a closed interval
days = ee.List.sequence(ee.Number(0), days_num.add(-1))
# get Imagecollection and filter, choose two days for test
daymet_days = daymet.filter(ee.Filter.date(start_date, end_date))

# show maximumTemperature, just for test
maximumTemperature = daymet_days.select("tmax")
maximumTemperatureVis = {
    "min": -40.0,
    "max": 30.0,
    "palette": ["1621A2", "white", "cyan", "green", "yellow", "orange", "red"],
}
Map.setCenter(-110.21, 35.1, 4)
Map.addLayer(maximumTemperature, maximumTemperatureVis, "Maximum Temperature")

In [6]:
def nestedMappedReducer(featCol, imgCol):
    def mapReducerOverImgCol(feat):
        def imgReducer(img):
            vals = img.reduceRegion(
                reducer=ee.Reducer.mean(), geometry=feat.geometry(), scale=1000
            )
            return ee.Feature(None, vals).set(
                {
                    "system:time_start": img.get("system:time_start"),
                    "hru_id": feat.get("hru_id"),
                }
            )

        return imgCol.map(imgReducer)

    return featCol.map(mapReducerOverImgCol).flatten()
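
The nested map followed by `flatten()` can be hard to picture, so here is a small local analogue in plain Python. It is only a sketch of the shape of the result, not Earth Engine code: `nested_map_flatten`, the basin names and the day labels are all hypothetical stand-ins.

```python
from itertools import chain


def nested_map_flatten(features, images, reduce_fn):
    """Local analogue of featCol.map(lambda f: imgCol.map(...)).flatten()."""
    # Outer map over features, inner map over images: a list of lists.
    per_feature = [[reduce_fn(feat, img) for img in images] for feat in features]
    # flatten() turns the collection of collections into one flat collection.
    return list(chain.from_iterable(per_feature))


# Hypothetical stand-ins: two basins and two daily images.
basins = ["basin_a", "basin_b"]
days = ["2000-01-01", "2000-01-02"]
rows = nested_map_flatten(basins, days, lambda b, d: {"hru_id": b, "date": d})
print(len(rows))  # one row per (basin, day) pair
```

The exported table has the same layout: one row for every combination of basin and image.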

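Run the function and export the result to Google Drive. The task keeps running on the server after the local session is closed, which suits long computations.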

In [9]:
daymet_regions = nestedMappedReducer(camels, daymet_days)
# export to google drive
geemap.ee_export_vector_to_drive(
    daymet_regions,
    description="daymet_camels_mean_20000101-02new",
    folder="export",
    # geemap up to version 0.15 names this argument file_format; from 0.16 on it is fileFormat
    fileFormat="csv",
    selectors=[
        "hru_id",
        "system:time_start",
        "dayl",
        "prcp",
        "srad",
        "swe",
        "tmax",
        "tmin",
        "vp",
    ],
)

Exporting daymet_camels_mean_20000101-02new... Please check the Task Manager from the JavaScript Code Editor.


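GEE suggests that reduceRegions may be faster than reduceRegion. However, it is less convenient for setting time_start and hru_id on each feature; a map over the FeatureCollection would probably still be needed, so I have not tried it yet.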

```Javascript
results = daymet_days.map(function(img) {
    return img.reduceRegions({
        ...
    });
});
```

The code above is JavaScript. For converting JS code to Python, especially nested map and reduce calls, see: https://gis.stackexchange.com/questions/365121/how-to-nest-mapped-functions-with-the-earth-engine-python-api
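
As a rough, untested sketch, the JavaScript fragment could look like this in Python. The helper name `reduce_regions_per_image` is hypothetical and is not part of the original notebook. It relies on `reduceRegions` returning the input features augmented with the reducer output, so feature properties such as `hru_id` should carry over, and only the image time has to be set in an extra map.

```python
def reduce_regions_per_image(img_col, feat_col, reducer, scale):
    """Run reduceRegions once per image and tag each output feature with the image time."""

    def per_image(img):
        # reduceRegions returns the input features, each augmented with the
        # reducer output, so existing properties (e.g. hru_id) are kept.
        stats = img.reduceRegions(collection=feat_col, reducer=reducer, scale=scale)
        time_start = img.get("system:time_start")
        return stats.map(lambda feat: feat.set("system:time_start", time_start))

    # One FeatureCollection per image, flattened into a single collection.
    return img_col.map(per_image).flatten()
```

With the objects defined earlier, a call might be `reduce_regions_per_image(daymet_days, camels, ee.Reducer.mean(), 1000)`. Whether this is actually faster than the nested reduceRegion version has not been checked here.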

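## Averaging hourly meteorological data to daily scale

This example uses NLDAS data and aggregates hourly data over the shapefile regions to daily scale. CAMELS could be used as the shapefile again, but here I use one I generated myself (see Example 5 in AutoGIS/9.1-gallery-vector.ipynb for how it was produced). The resulting shapefile is already uploaded to my own GEE assets, so the local copy is not used.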

In [1]:
import ee
import geemap

Map = geemap.Map(center=[40, -100], zoom=4)
Map

Map(center=[40, -100], controls=(WidgetControl(options=['position', 'transparent_bg'], widget=HBox(children=(T…

In [2]:
# Add Earth Engine dataset
nldas = ee.ImageCollection("NASA/NLDAS/FORA0125_H002")

In [3]:
# remote asset
shpfiles = ee.FeatureCollection("users/wenyu_ouyang/site_nobs_DO")
shpfiles

<ee.featurecollection.FeatureCollection at 0x7f1a1e07fd30>

In [4]:
year = ee.Number(2000)
month = ee.Number(1)
day = ee.Number(1)
start_date = ee.Date.fromYMD(year, month, day)
end_date = start_date.advance(2, "day")
end_date

<ee.ee_date.Date at 0x7f1a1e07f7f0>

In [5]:
days_num = end_date.difference(start_date, "day")
# count day from zero, and ee.List.sequence is a closed interval
days = ee.List.sequence(ee.Number(0), days_num.add(-1))
# get Imagecollection and filter, choose two days for test
nldas_2d = nldas.filter(ee.Filter.date(start_date, end_date))

# show temperature, just for test
temperature = nldas_2d.select("temperature")
temperatureVis = {
    "min": -5.0,
    "max": 40.0,
    "palette": ["3d2bd8", "4e86da", "62c7d8", "91ed90", "e4f178", "ed6a4c"],
}
Map.addLayer(temperature, temperatureVis, "Temperature")
# see the map displayed above

In [6]:
# forcing variables to be averaged over each day
avg_forcings = nldas_2d.select(
    "temperature",
    "specific_humidity",
    "pressure",
    "wind_u",
    "wind_v",
    "longwave_radiation",
    "convective_fraction",
    "shortwave_radiation",
)
avg_forcings.limit(2)

<ee.imagecollection.ImageCollection at 0x7f1a88137e20>

In [7]:
# forcing variables to be summed over each day
sum_forcings = nldas_2d.select(
    "potential_energy", "potential_evaporation", "total_precipitation"
)
sum_forcings.limit(2)

<ee.imagecollection.ImageCollection at 0x7f1a8812dd60>

In [8]:
def daysMapImgsAvgReduce(dayCol, imgCol, start_day):
    def dayAvgReducerOverImgCol(oneDay):
        start = start_day.advance(ee.Number(oneDay), "day")
        end = start_day.advance(ee.Number(oneDay).add(ee.Number(1)), "day")
        return (
            imgCol.filter(ee.Filter.date(start, end))
            .reduce(ee.Reducer.mean())
            .set({"day_of_all_years": start_day.advance(oneDay, "day")})
        )

    return dayCol.map(dayAvgReducerOverImgCol)

In [9]:
avg_days = ee.ImageCollection(daysMapImgsAvgReduce(days, avg_forcings, start_date))

In [10]:
def daysMapImgsSumReduce(dayCol, imgCol, start_day):
    def daySumReducerOverImgCol(oneDay):
        start = start_day.advance(ee.Number(oneDay), "day")
        end = start_day.advance(ee.Number(oneDay).add(ee.Number(1)), "day")
        return (
            imgCol.filter(ee.Filter.date(start, end))
            .reduce(ee.Reducer.sum())
            .set({"day_of_all_years": start_day.advance(oneDay, "day")})
        )

    return dayCol.map(daySumReducerOverImgCol)

In [11]:
sum_days = ee.ImageCollection(daysMapImgsSumReduce(days, sum_forcings, start_date))
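
The two functions above differ only in the reducer: state variables such as temperature are averaged over each day, while accumulated variables such as precipitation are summed. The plain-Python sketch below mimics that logic locally with half-open daily windows, the same interval convention as `ee.Filter.date(start, end)`. The function `daily_reduce` and the constant hourly series are hypothetical, for illustration only.

```python
from datetime import datetime, timedelta
from statistics import fmean


def daily_reduce(hourly, start, n_days, reducer):
    """Reduce (timestamp, value) pairs over half-open daily windows [day, day + 1)."""
    out = []
    for i in range(n_days):
        lo = start + timedelta(days=i)
        hi = lo + timedelta(days=1)
        vals = [v for t, v in hourly if lo <= t < hi]
        out.append((lo.date().isoformat(), reducer(vals)))
    return out


start = datetime(2000, 1, 1)
# Hypothetical hourly series: 48 hours with a constant value of 0.5.
hourly = [(start + timedelta(hours=h), 0.5) for h in range(48)]
daily_mean = daily_reduce(hourly, start, 2, fmean)  # e.g. temperature
daily_sum = daily_reduce(hourly, start, 2, sum)     # e.g. precipitation
print(daily_mean)
print(daily_sum)
```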

In [12]:
# show avg temperature of all days, just for test
tmpr_avg = avg_days.select("temperature_mean").reduce(ee.Reducer.mean())
temperature_avg = tmpr_avg.select("temperature_mean_mean")
Map.addLayer(temperature_avg, temperatureVis, "Temperature_2d_avg")

Finally, compute the basin average:

In [13]:
def nestedMappedReducerNldas(featCol, imgCol, scaleNum):
    def mapReducerOverImgColNldas(feat):
        def imgReducerNldas(img):
            vals = img.reduceRegion(
                reducer=ee.Reducer.mean(), geometry=feat.geometry(), scale=scaleNum
            )
            return ee.Feature(None, vals).set(
                {
                    "time_start": img.get("day_of_all_years"),
                    "gage_id": feat.get("GAGE_ID"),
                }
            )

        return imgCol.map(imgReducerNldas)

    return featCol.map(mapReducerOverImgColNldas).flatten()

In [14]:
# 0.125 degree is approximately 13875 m
avg_day_regions = nestedMappedReducerNldas(shpfiles, avg_days, 13875)
# export to google drive
geemap.ee_export_vector_to_drive(
    avg_day_regions,
    description="nldas_do_avg_mean_20000101-02",
    folder="export",
    # file_format="csv",
    fileFormat="csv",
    selectors=[
        "gage_id",
        "time_start",
        "temperature_mean",
        "specific_humidity_mean",
        "pressure_mean",
        "wind_u_mean",
        "wind_v_mean",
        "longwave_radiation_mean",
        "convective_fraction_mean",
        "shortwave_radiation_mean",
    ],
)

Exporting nldas_do_avg_mean_20000101-02...
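
The `scale` value of 13875 appears to come from converting the 0.125 degree NLDAS grid to metres using about 111 km per degree. The sketch below shows that arithmetic; `degrees_to_metres` is a hypothetical helper, and the conversion factors are assumptions that only hold near the equator. With 111.32 km per degree, the same arithmetic reproduces the 27830 and 11132 scales used for ERA5 (0.25 degree) and ERA5-Land (0.1 degree) elsewhere in this document.

```python
def degrees_to_metres(resolution_deg, metres_per_degree=111_320):
    """Approximate size in metres of a grid cell given in degrees (at the equator)."""
    return round(resolution_deg * metres_per_degree)


# 111 km per degree reproduces the value used for NLDAS above.
nldas_scale = degrees_to_metres(0.125, metres_per_degree=111_000)
# 111.32 km per degree reproduces the ERA5 and ERA5-Land values.
era5_scale = degrees_to_metres(0.25)
era5_land_scale = degrees_to_metres(0.1)
print(nldas_scale, era5_scale, era5_land_scale)
```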


In [15]:
avg_day_regions_4sum = nestedMappedReducerNldas(shpfiles, sum_days, 13875)
# export to google drive
geemap.ee_export_vector_to_drive(
    avg_day_regions_4sum,
    description="nldas_do_sum_mean_20000101-02",
    folder="export",
    file_format="csv",
    selectors=[
        "gage_id",
        "time_start",
        "potential_energy_sum",
        "potential_evaporation_sum",
        "total_precipitation_sum",
    ],
)

Exporting nldas_do4sum_mean_20000101-02...


Below is a consolidated example that processes one full year, for practical use:

In [16]:
import ee
import geemap

year_num = 1980
year = ee.Number(year_num)
month = ee.Number(1)
day = ee.Number(1)
start_date = ee.Date.fromYMD(year, month, day)
end_date = start_date.advance(1, "year")
days_num = end_date.difference(start_date, "day")
days = ee.List.sequence(ee.Number(0), days_num.add(-1))

nldas = ee.ImageCollection("NASA/NLDAS/FORA0125_H002")
shpfiles = ee.FeatureCollection("users/wenyu_ouyang/site_nobs_DO")

nldas_days = nldas.filter(ee.Filter.date(start_date, end_date))
avg_forcings = nldas_days.select(
    "temperature",
    "specific_humidity",
    "pressure",
    "wind_u",
    "wind_v",
    "longwave_radiation",
    "convective_fraction",
    "shortwave_radiation",
)
sum_forcings = nldas_days.select(
    "potential_energy", "potential_evaporation", "total_precipitation"
)
avg_days = ee.ImageCollection(daysMapImgsAvgReduce(days, avg_forcings, start_date))
sum_days = ee.ImageCollection(daysMapImgsSumReduce(days, sum_forcings, start_date))
# 0.125 degree is approximately 13875 m
avg_day_regions = nestedMappedReducerNldas(shpfiles, avg_days, 13875)
# export to google drive
geemap.ee_export_vector_to_drive(
    ee_object=avg_day_regions,
    description="nldas_do_avg_mean_" + str(year_num),
    folder="NLDAS",
    file_format="csv",
    selectors=[
        "gage_id",
        "time_start",
        "temperature_mean",
        "specific_humidity_mean",
        "pressure_mean",
        "wind_u_mean",
        "wind_v_mean",
        "longwave_radiation_mean",
        "convective_fraction_mean",
        "shortwave_radiation_mean",
    ],
)
avg_day_regions_4sum = nestedMappedReducerNldas(shpfiles, sum_days, 13875)
# export to google drive
geemap.ee_export_vector_to_drive(
    ee_object=avg_day_regions_4sum,
    description="nldas_do_sum_mean_" + str(year_num),
    folder="NLDAS",
    file_format="csv",
    selectors=[
        "gage_id",
        "time_start",
        "potential_energy_sum",
        "potential_evaporation_sum",
        "total_precipitation_sum",
    ],
)

Exporting nldas_do_avg_mean_1980...
Exporting nldas_do_sum_mean_1980...


The full example above takes a long time to run on GEE. To stop it, cancel the task manually in GEE.

## Extracting data at given point coordinates

Main references:

- [GEE JavaScript Tutorials: Extracting Raster Values for Points](https://developers.google.com/earth-engine/tutorials/community/extract-raster-values-for-points)
- [giswqs: earthengine-py-notebooks](https://github.com/giswqs/earthengine-py-notebooks/blob/master/Image/extract_value_to_points.ipynb)
- [What is the difference between sample, sampleRegions, and stratifiedSample in Google Earth Engine?](https://gis.stackexchange.com/questions/304929/what-is-the-difference-between-sample-sampleregions-and-stratifiedsample-in-go)

### Reduce

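The goal of the code is to extract values from a raster dataset at points of interest. This is a very common task, for example extracting NDVI at a point to plot how it changes over time. The extracted data are stored in a FeatureCollection. Point coordinates often carry some error, so rather than using only the grid cell that contains the point, we usually buffer the point and run a reduce over the surrounding cells as well.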
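
To make the difference between reading a single cell and reducing over a neighbourhood concrete, here is a toy sketch in plain Python. It is not Earth Engine code: the 3x3 grid is made up, and a square window stands in for the circular buffer as a simplifying assumption.

```python
from statistics import fmean


def sample_at(grid, row, col):
    """Value of the single cell that contains the point."""
    return grid[row][col]


def window_mean(grid, row, col, radius):
    """Mean over all cells within `radius` cells of the point, clipped at the grid edge."""
    rows = range(max(0, row - radius), min(len(grid), row + radius + 1))
    cols = range(max(0, col - radius), min(len(grid[0]), col + radius + 1))
    return fmean(grid[r][c] for r in rows for c in cols)


# Hypothetical raster values.
grid = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
point_value = sample_at(grid, 0, 0)         # only the cell under the point
buffer_value = window_mean(grid, 0, 0, 1)   # also uses the neighbouring cells
print(point_value, buffer_value)
```

If the point is slightly misplaced, the single-cell value can change abruptly, while the buffered mean changes more gradually.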

In [10]:
import ee
import geemap

In [11]:
Map = geemap.Map(center=[40, -100], zoom=4)
Map

Map(center=[40, -100], controls=(WidgetControl(options=['position', 'transparent_bg'], widget=HBox(children=(T…

In [12]:
# Add Earth Engine dataset
# Input imagery is ERA5 daily dataset: https://developers.google.com/earth-engine/datasets/catalog/ECMWF_ERA5_DAILY?hl=en
year = 1980
# era5 = ee.ImageCollection("ECMWF/ERA5/DAILY").filterDate(str(year) + '-01-01', str(year+1)+'-01-01')
era5 = ee.ImageCollection("ECMWF/ERA5/DAILY").filterDate(
    str(year) + "-01-01", str(year) + "-01-03"
)

In [13]:
pts = ee.FeatureCollection("projects/ee-owen/assets/globalpaper_site3221")

In [14]:
def buffer_points(radius, bounds):
    def buffer_point(pt):
        pt = ee.Feature(pt)
        return pt.buffer(radius).bounds() if bounds else pt.buffer(radius)

    return buffer_point

In [15]:
# apply a 10 km (10000 m) radius buffer; the second argument False keeps the circular buffers instead of their bounding boxes
pts_era5 = pts.map(buffer_points(10000, False))
Map.addLayer(pts_era5, {}, "buffer");

In [16]:
def zonal_stats(ic, fc, params):
    # Initialize internal params dictionary.
    _params = {
        "reducer": ee.Reducer.mean(),
        "scale": None,
        "crs": None,
        "bands": None,
        "bandsRename": None,
        "imgProps": None,
        "imgPropsRename": None,
        "datetimeName": "datetime",
        "datetimeFormat": "YYYY-MM-dd HH:mm:ss",
    }

    # Replace initialized params with provided params.
    if params:
        for param in params:
            _params[param] = params[param] or _params[param]

    # Set default parameters based on an image representative.
    imgRep = ic.first()
    nonSystemImgProps = ee.Feature(None).copyProperties(imgRep).propertyNames()
    if not _params["bands"]:
        _params["bands"] = imgRep.bandNames()
    if not _params["bandsRename"]:
        _params["bandsRename"] = _params["bands"]
    if not _params["imgProps"]:
        _params["imgProps"] = nonSystemImgProps
    if not _params["imgPropsRename"]:
        _params["imgPropsRename"] = _params["imgProps"]

    def select_band(img):
        # Select bands (optionally rename), set a datetime & timestamp property.
        img = (
            ee.Image(img.select(_params["bands"], _params["bandsRename"]))
            .set(_params["datetimeName"], img.date().format(_params["datetimeFormat"]))
            .set("timestamp", img.get("system:time_start"))
        )
        # Define final image property dictionary to set in output features.
        propsFrom = ee.List(_params["imgProps"]).cat(
            ee.List([_params["datetimeName"], "timestamp"])
        )
        propsTo = ee.List(_params["imgPropsRename"]).cat(
            ee.List([_params["datetimeName"], "timestamp"])
        )
        imgProps = img.toDictionary(propsFrom).rename(propsFrom, propsTo)

        # Subset points that intersect the given image.
        fcSub = fc.filterBounds(img.geometry())
        # Reduce the image by regions.
        def reduce_img(f):
            return f.set(imgProps)

        return img.reduceRegions(
            collection=fcSub,
            reducer=_params["reducer"],
            scale=_params["scale"],
            crs=_params["crs"],
        ).map(reduce_img)

    # Map the reduceRegions function over the image collection.
    results = (
        ic.map(select_band).flatten().filter(ee.Filter.notNull(_params["bandsRename"]))
    )
    return results
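
The parameter handling at the top of `zonal_stats` uses the `value or default` idiom. The sketch below isolates that pattern in plain Python so its behaviour can be checked locally; `merge_params` is a hypothetical helper, not part of the original function.

```python
def merge_params(defaults, overrides):
    """Copy the defaults, then let truthy override values replace them."""
    merged = dict(defaults)
    for key, value in (overrides or {}).items():
        # `value or default` keeps the default when the override is None, 0, "" or [].
        merged[key] = value or merged[key]
    return merged


defaults = {"scale": None, "crs": None, "datetimeName": "datetime"}
merged = merge_params(defaults, {"scale": 27830, "datetimeName": None})
print(merged)
```

One consequence of this idiom is that a falsy override, such as `None` for `datetimeName` here, silently falls back to the default, and a key that is missing from the defaults raises a `KeyError`.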

In [17]:
era5_daily_bands = [
    "mean_2m_air_temperature",
    "minimum_2m_air_temperature",
    "maximum_2m_air_temperature",
    "dewpoint_2m_temperature",
    "total_precipitation",
    "surface_pressure",
    "mean_sea_level_pressure",
    "u_component_of_wind_10m",
    "v_component_of_wind_10m",
]
era5_daily_bands_rename = [
    "tmean",
    "tmin",
    "tmax",
    "tdew",
    "prcp",
    "sp",
    "mslp",
    "windu",
    "windv",
]

In [18]:
# Define parameters for the zonal_stats function.
params = {
    "reducer": ee.Reducer.mean(),
    "scale": 27830,
    "crs": "EPSG:4326",
    "bands": era5_daily_bands,
    "bandsRename": era5_daily_bands_rename,
    "datetimeName": "date",
    "datetimeFormat": "YYYY-MM-dd",
}

In [19]:
# Extract zonal statistics per point per image.
pts_era5_stats = zonal_stats(era5, pts_era5, params)
# print(pts_era5_stats.limit(1));

In [20]:
# export to google drive
geemap.ee_export_vector_to_drive(
    pts_era5_stats,
    description="era5_daily_reduce_globalpaper_site3221_" + str(year),
    folder="ERA5",
    # file_format="csv",
    fileFormat="csv",
    selectors=[
        "new_site_i",
        "date",
        "tmean",
        "tmin",
        "tmax",
        "tdew",
        "prcp",
        "sp",
        "mslp",
        "windu",
        "windv",
    ],
)

Exporting era5_daily_reduce_globalpaper_site3221_1980... Please check the Task Manager from the JavaScript Code Editor.


In [22]:
def reduce_from_era5(points_shape, rasters, scale):
    the_params = {
        "reducer": ee.Reducer.mean(),
        "scale": scale,
        "crs": "EPSG:4326",
        "bands": era5_daily_bands,
        "bandsRename": era5_daily_bands_rename,
        "datetimeName": "date",
        "datetimeFormat": "YYYY-MM-dd",
    }

    pts_era5_buffers = points_shape.map(buffer_points(scale, False))
    pts_era5_buffers_stats = zonal_stats(rasters, pts_era5_buffers, the_params)
    return pts_era5_buffers_stats

In [26]:
import numpy as np

for year in np.arange(1982, 1983):
    era5_year = ee.ImageCollection("ECMWF/ERA5/DAILY").filterDate(
        str(year) + "-01-01", str(year + 1) + "-01-01"
    )
    pts_era5_reduce = reduce_from_era5(pts, era5_year, 10000)
    # export to google drive
    geemap.ee_export_vector_to_drive(
        ee_object=pts_era5_reduce,
        description="era5_daily_reduce_globalpaper_site3221_" + str(year),
        folder="ERA5",
        file_format="csv",
        selectors=["new_site_i", "date"] + era5_daily_bands_rename,
    )

Exporting era5_daily_reduce_globalpaper_site3221_1982...


### Sample

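The results above come from reduceRegions. Sometimes we want the data at the point location itself; in that case sampleRegions can extract the values directly.

For an introduction to the sample family of functions, see: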

- [ee.Image.sampleRegions](https://developers.google.com/earth-engine/apidocs/ee-image-sampleregions)
- [What is the difference between sample, sampleRegions, and stratifiedSample in Google Earth Engine?](https://gis.stackexchange.com/questions/304929/what-is-the-difference-between-sample-sampleregions-and-stratifiedsample-in-go)

In [8]:
def sample_from_era5(year_num, points_shape, rasters, scale):
    # Overlay the points on the imagery to get sampled data
    era5_daily_bands = [
        "mean_2m_air_temperature",
        "minimum_2m_air_temperature",
        "maximum_2m_air_temperature",
        "dewpoint_2m_temperature",
        "total_precipitation",
        "surface_pressure",
        "mean_sea_level_pressure",
        "u_component_of_wind_10m",
        "v_component_of_wind_10m",
    ]
    era5_daily_bands_rename = [
        "tmean",
        "tmin",
        "tmax",
        "tdew",
        "prcp",
        "sp",
        "mslp",
        "windu",
        "windv",
    ]

    def sample_region(img):
        img = (
            ee.Image(img.select(era5_daily_bands, era5_daily_bands_rename))
            .set("date", img.date().format("YYYY-MM-dd"))
            .set("timestamp", img.get("system:time_start"))
        )
        # Define final image property dictionary to set in output features.
        propsFrom = ee.List(["date", "timestamp"])
        imgProps = img.toDictionary(propsFrom)

        def map_set_time(f):
            return f.set(imgProps)

        return img.sampleRegions(collection=points_shape, scale=scale).map(map_set_time)

    pts_raster_sample = rasters.map(sample_region).flatten()
    return pts_raster_sample

In [11]:
import numpy as np

for year in np.arange(1983, 2021):
    era5_year = ee.ImageCollection("ECMWF/ERA5/DAILY").filterDate(
        str(year) + "-01-01", str(year + 1) + "-01-01"
    )
    pts_era5_sample = sample_from_era5(year, pts, era5_year, 27830)
    # export to google drive
    geemap.ee_export_vector_to_drive(
        ee_object=pts_era5_sample,
        description="era5_daily_sample_globalpaper_site3221_" + str(year),
        folder="ERA5",
        file_format="csv",
        selectors=["new_site_i", "date"] + era5_daily_bands_rename,
    )

Exporting era5_daily_sample_globalpaper_site3221_1983...
Exporting era5_daily_sample_globalpaper_site3221_1984...
Exporting era5_daily_sample_globalpaper_site3221_1985...
Exporting era5_daily_sample_globalpaper_site3221_1986...
Exporting era5_daily_sample_globalpaper_site3221_1987...
Exporting era5_daily_sample_globalpaper_site3221_1988...
Exporting era5_daily_sample_globalpaper_site3221_1989...
Exporting era5_daily_sample_globalpaper_site3221_1990...
Exporting era5_daily_sample_globalpaper_site3221_1991...
Exporting era5_daily_sample_globalpaper_site3221_1992...
Exporting era5_daily_sample_globalpaper_site3221_1993...
Exporting era5_daily_sample_globalpaper_site3221_1994...
Exporting era5_daily_sample_globalpaper_site3221_1995...
Exporting era5_daily_sample_globalpaper_site3221_1996...
Exporting era5_daily_sample_globalpaper_site3221_1997...
Exporting era5_daily_sample_globalpaper_site3221_1998...
Exporting era5_daily_sample_globalpaper_site3221_1999...
Exporting era5_daily_sample_glo

### Sample for hourly data

This continues the ERA5 example above, but uses hourly ERA5-Land data to extract hourly meteorological data at a set of points.

In [8]:
year = 2021

In [9]:
chosen_bands = ["total_precipitation_hourly", "total_evaporation_hourly"]

In [9]:
era5_land = ee.ImageCollection("ECMWF/ERA5_LAND/HOURLY").filterDate(
    str(year) + "-01-01", str(year) + "-01-03"
)

In [85]:
def era5_land_sample_region(img):
    img_ = (
        img.select(chosen_bands)
        .set("datetime", img.date().format("YYYY-MM-dd HH:mm:ss"))
        .set("timestamp", img.get("system:time_start"))
    )
    # Define final image property dictionary to set in output features.
    propsFrom = ee.List(["datetime", "timestamp"])
    imgProps = img_.toDictionary(propsFrom)

    def map_set_time(f):
        return f.set(imgProps)

    return img_.sampleRegions(collection=pts, scale=11132).map(map_set_time)

In [86]:
pts_era5_land_sample = era5_land.map(era5_land_sample_region).flatten()

In [87]:
# export to google drive
geemap.ee_export_vector_to_drive(
    ee_object=pts_era5_land_sample,
    description="era5_land_sample_globalpaper_site3221_" + str(year),
    folder="ERA5",
    file_format="csv",
    selectors=["new_site_i", "datetime"] + chosen_bands,
)

Exporting era5_land_sample_globalpaper_site3221_2000...


### Reduce for hourly data

This again continues the ERA5 example, using hourly ERA5-Land data, and reduces hourly meteorological data over buffers around the points.

In [21]:
year = 2021
era5_land = ee.ImageCollection("ECMWF/ERA5_LAND/HOURLY").filterDate(
    str(year) + "-01-01", str(year) + "-01-03"
)

In [22]:
chosen_bands = [
    "dewpoint_temperature_2m",
    "temperature_2m",
    "skin_temperature",
    "u_component_of_wind_10m",
    "v_component_of_wind_10m",
    "surface_pressure",
    "surface_latent_heat_flux_hourly",
    "surface_net_solar_radiation_hourly",
    "surface_net_thermal_radiation_hourly",
    "surface_sensible_heat_flux_hourly",
    "surface_solar_radiation_downwards_hourly",
    "surface_thermal_radiation_downwards_hourly",
    "potential_evaporation_hourly",
    "total_evaporation_hourly",
    "total_precipitation_hourly",
]

In [23]:
# Define parameters for the zonal_stats function.
params_era5_land = {
    "reducer": ee.Reducer.mean(),
    "scale": 11132,
    "crs": "EPSG:4326",
    "bands": chosen_bands,
}

In [24]:
# pts_era5_land are buffers of points
pts_era5_land = pts.map(buffer_points(30000, False))
pts_era5_land_stats = zonal_stats(era5_land, pts_era5_land, params_era5_land);

In [25]:
# export to google drive
geemap.ee_export_vector_to_drive(
    pts_era5_land_stats,
    description="era5_land_reduce_globalpaper_site3221_" + str(year),
    folder="export",
    fileFormat="csv",
    selectors=["new_site_i", "datetime"] + chosen_bands,
)

Exporting era5_land_reduce_globalpaper_site3221_2021... Please check the Task Manager from the JavaScript Code Editor.


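## Extracting time series by regions

First compute hourly meteorological data over the point buffers, then aggregate them to daily scale.

This section uses the [eemont](https://github.com/davemlz/eemont) package.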

In [101]:
import ee, eemont, geemap
import pandas as pd
import numpy as np

In [88]:
f1 = ee.Feature(ee.Geometry.Point([-44.17708224,-22.53977252]).buffer(11132),{'ID':'A'})
f2 = ee.Feature(ee.Geometry.Point([-70.44791273,-33.37291235]).buffer(11132),{'ID':'B'})
fc = ee.FeatureCollection([f1,f2])

In [140]:
fc = ee.FeatureCollection("projects/ee-owen/assets/globalpaper_site3221")

In [141]:
def buffer_points(radius, bounds):
    def buffer_point(pt):
        pt = ee.Feature(pt)
        return pt.buffer(radius).bounds() if bounds else pt.buffer(radius)

    return buffer_point

In [142]:
# apply a 10 km (10000 m) radius buffer; the second argument False keeps the circular buffers instead of their bounding boxes
fc = fc.map(buffer_points(10000, False))

In [143]:
year = ee.Number(2000)
month = ee.Number(1)
day = ee.Number(1)
start_date = ee.Date.fromYMD(year, month, day)
end_date = start_date.advance(2, "day")

In [144]:
days_num = end_date.difference(start_date, "day")
# count day from zero, and ee.List.sequence is a closed interval
days = ee.List.sequence(ee.Number(0), days_num.add(-1))

In [145]:
era5_land = (ee.ImageCollection("ECMWF/ERA5_LAND/HOURLY")
   .filterBounds(fc)
   .filterDate(start_date,end_date)
   .index(['dewpoint_temperature_2m','temperature_2m']))

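This is similar to the earlier NLDAS example, which first averages the ImageCollection to daily values and then reduces it to a FeatureCollection. That approach becomes slow when there are many points, so here we try the other order: first compute an hourly FeatureCollection, then average it to daily values: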

In [378]:
ts = era5_land.getTimeSeriesByRegions(reducer = ee.Reducer.mean(),
                                      collection = fc,
                                     # note that the bands get an extra "_mean" suffix
                                      bands = ['dewpoint_temperature_2m','temperature_2m'],
                                      dateFormat = 'YYYYMMdd',
                                      scale = 11132)

In [379]:
ts.first().getInfo()

{'type': 'Feature',
 'geometry': {'type': 'Polygon',
  'coordinates': [[[-44.177082240000004, -22.43959714422811],
    [-44.20735557532208, -22.443581294479042],
    [-44.23522403895136, -22.4552172620936],
    [-44.25847266996974, -22.4735805866811],
    [-44.275251680324395, -22.497211870931057],
    [-44.284223442979375, -22.524232226593124],
    [-44.28466958840863, -22.552492159907178],
    [-44.2765494962648, -22.57974225514612],
    [-44.260505122131875, -22.603812201212396],
    [-44.2378112782019, -22.622783921541146],
    [-44.21027489065342, -22.635144911384057],
    [-44.18009101853645, -22.639909371174088],
    [-44.14966713835469, -22.636697261538195],
    [-44.12142999104214, -22.625764802166927],
    [-44.097630845043525, -22.607983908355195],
    [-44.0801651714276, -22.58477225541165],
    [-44.07042143775684, -22.55797970578207],
    [-44.06917115601642, -22.529740367950314],
    [-44.076508753337016, -22.502302279776302],
    [-44.09184564666628, -22.477848409941284

In [380]:
tsPandas = geemap.ee_to_pandas(ts)

In [381]:
tsPandas

Unnamed: 0,date,dewpoint_temperature_2m,temperature_2m,ID,reducer
0,20000101,294.015627,294.976186,A,mean
1,20000101,279.959294,288.851400,B,mean
2,20000101,293.987549,294.621945,A,mean
3,20000101,280.342099,288.354047,B,mean
4,20000101,293.787338,294.461676,A,mean
...,...,...,...,...,...
91,20000102,279.977142,292.932740,B,mean
92,20000102,293.189405,294.636613,A,mean
93,20000102,280.600600,291.527143,B,mean
94,20000102,293.135146,294.514719,A,mean


In [382]:
tsPandas[(tsPandas["ID"]=="A")][0:24]["dewpoint_temperature_2m"].mean()

293.8017294666263

In [383]:
chose = ts.filter(ee.Filter.eq('date', "20000101"))
chose_df =geemap.ee_to_pandas(chose)

In [384]:
chose_df

Unnamed: 0,date,dewpoint_temperature_2m,temperature_2m,ID,reducer
0,20000101,294.015627,294.976186,A,mean
1,20000101,279.959294,288.8514,B,mean
2,20000101,293.987549,294.621945,A,mean
3,20000101,280.342099,288.354047,B,mean
4,20000101,293.787338,294.461676,A,mean
5,20000101,280.517688,287.898757,B,mean
6,20000101,293.679768,294.005325,A,mean
7,20000101,280.719778,287.453577,B,mean
8,20000101,293.606595,293.88446,A,mean
9,20000101,280.597917,287.12162,B,mean


GEE is hard to debug, so let's first run a few short snippets and inspect what they return.

In [385]:
chose_reduce = chose.reduceColumns(**{
    "selectors": ['ID', 'dewpoint_temperature_2m', "temperature_2m"],
    # repeat(2) covers the two value columns; column 0 is the ID used for grouping and is not reduced
    "reducer": ee.Reducer.mean().repeat(2).group(**{
                      'groupField': 0,
                      'groupName': 'ID_'})}
)

In [386]:
chose_reduce.getInfo()

{'groups': [{'ID_': 'A', 'mean': [293.80172946662617, 295.43070458719734]},
  {'ID_': 'B', 'mean': [279.7598398645175, 289.5211567362275]}]}
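
For intuition, the grouped reducer behaves much like a per-ID mean over the two value columns. The plain-Python sketch below reproduces the same output structure locally; `grouped_mean` and the four input rows are hypothetical, and this is not how Earth Engine computes the result internally.

```python
from collections import defaultdict
from statistics import fmean


def grouped_mean(rows, group_key, value_keys):
    """Mean of each value column per group, in the same shape as the 'groups' list."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_key]].append([row[k] for k in value_keys])
    return [
        # zip(*vals) regroups the rows by column before averaging.
        {group_key: gid, "mean": [fmean(col) for col in zip(*vals)]}
        for gid, vals in sorted(groups.items())
    ]


# Hypothetical hourly rows for two sites.
rows = [
    {"ID": "A", "dewpoint_temperature_2m": 294.0, "temperature_2m": 295.0},
    {"ID": "B", "dewpoint_temperature_2m": 280.0, "temperature_2m": 288.0},
    {"ID": "A", "dewpoint_temperature_2m": 292.0, "temperature_2m": 297.0},
    {"ID": "B", "dewpoint_temperature_2m": 282.0, "temperature_2m": 290.0},
]
result = grouped_mean(rows, "ID", ["dewpoint_temperature_2m", "temperature_2m"])
print(result)
```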

In [387]:
# indexing an ee.List with [0] relies on eemont's container emulation
chose_reduce_dict_lst = chose_reduce.values()[0]
chose_reduce_dict_lst.getInfo()

[{'ID_': 'A', 'mean': [293.80172946662617, 295.43070458719734]},
 {'ID_': 'B', 'mean': [279.7598398645175, 289.5211567362275]}]

In [391]:
ee.Number(ee.List(ee.Dictionary(ee.List(chose_reduce_dict_lst)[0])["mean"])[0]).getInfo()

293.80172946662617

In [389]:
keys = ["site_id", "dewpoint_temperature_2m", "temperature_2m"]

In [392]:
data = ee.List(chose_reduce_dict_lst).map(
    lambda d: ee.Dictionary(
        {keys[0]: ee.Dictionary(d)["ID_"], 
         # Kelvin to Celsius
         keys[1]: ee.Number(ee.List(ee.Dictionary(d)["mean"])[0]) - 273.15, 
         keys[2]:ee.List(ee.Dictionary(d)["mean"])[1]}
    )
)

In [393]:
csv_feature = data.map(lambda f: ee.Feature(None, f).set({"date": "20000101"}))
csv_feat_col = ee.FeatureCollection(csv_feature)
csv_feat_col

<ee.featurecollection.FeatureCollection at 0x2a3b2012550>

In [394]:
csv_feature

<ee.ee_list.List at 0x2a3b201c580>

In [395]:
csv_feat_col_df =geemap.ee_to_pandas(csv_feat_col)

In [396]:
csv_feat_col_df

Unnamed: 0,date,dewpoint_temperature_2m,temperature_2m,site_id
0,20000101,20.651729,295.430705,A
1,20000101,6.60984,289.521157,B


It is now fairly clear how the function should be written. Below is the consolidated function that is actually run:

In [397]:
selectors = ["ID", 'dewpoint_temperature_2m', "temperature_2m"]
def nestedMap(dayCol, feaCol):
    def mapDayCol(oneDay):
        chosen = feaCol.filter(ee.Filter.eq('date', oneDay))
        chosen_reduce = chosen.reduceColumns(**{
            "selectors": selectors,
            "reducer": ee.Reducer.mean().repeat(2).group(
                **{'groupField': 0,
                   'groupName': "ID"})})
        chosen_reduce_dict_lst = chosen_reduce.values()[0]
        data = ee.List(chosen_reduce_dict_lst).map(
            lambda d: ee.Dictionary(
                {"site_id": ee.Dictionary(d)["ID"], 
                 # Kelvin to Celsius
                 selectors[1]: ee.Number(ee.List(ee.Dictionary(d)["mean"])[0]) - ee.Number(273.15), 
                 selectors[2]:ee.List(ee.Dictionary(d)["mean"])[1]}
            )
        )
        csv_feature_lst = data.map(
            lambda f: ee.Feature(None, f).set(
                {
                    "date": oneDay
                }
            )
        )
        csv_feat_col = ee.FeatureCollection(csv_feature_lst)
        return csv_feat_col

    # https://developers.google.com/earth-engine/apidocs/ee-featurecollection-merge
    # If many collections need to be merged, consider placing them all in a collection and using FeatureCollection.flatten() instead.
    return dayCol.map(mapDayCol).flatten()

Loop over the days:

In [398]:
dates = days.map(lambda t: start_date.advance(t, "day").format("YYYYMMdd"))

In [399]:
dates.getInfo()

['20000101', '20000102']
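
The server-side list of date labels can be cross-checked locally with the standard library. This is a minimal sketch assuming the same half-open range (start date included, end date excluded); `day_labels` is a hypothetical helper.

```python
from datetime import date, timedelta


def day_labels(start, end):
    """Return YYYYMMDD labels for each day in the half-open range [start, end)."""
    n_days = (end - start).days
    return [(start + timedelta(days=i)).strftime("%Y%m%d") for i in range(n_days)]


labels = day_labels(date(2000, 1, 1), date(2000, 1, 3))
print(labels)  # ['20000101', '20000102']
```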

In [400]:
reduce_to_daily = nestedMap(dates, ts)

In [401]:
reduce_to_daily

<ee.ee_list.List at 0x2a3b2070b20>

In [402]:
save_reduce_to_daily = ee.FeatureCollection(reduce_to_daily).flatten()

In [403]:
# export to google drive
geemap.ee_export_vector_to_drive(
    save_reduce_to_daily,
    description="era5_test_hourly_to_daily_mean_20000101-02",
    folder="export",
    fileFormat="csv",
    selectors=[
        "site_id",
        "date",
        "dewpoint_temperature_2m",
        "temperature_2m"
    ],
)

Exporting era5_2nd_hourly_to_daily_sites3221_avg_mean_20000101-02... Please check the Task Manager from the JavaScript Code Editor.


Take a quick look at the result:

In [404]:
reduce_to_daily_df =geemap.ee_to_pandas(ee.FeatureCollection(save_reduce_to_daily))

In [405]:
reduce_to_daily_df

Unnamed: 0,date,dewpoint_temperature_2m,temperature_2m,site_id
0,20000101,20.651729,295.430705,A
1,20000101,6.60984,289.521157,B
2,20000102,20.060945,295.201239,A
3,20000102,6.245491,289.568314,B
