## How to export thousands of image chips from Earth Engine in a few minutes

The source code of this notebook was adapted from the Medium post [Fast(er) Downloads](https://gorelick.medium.com/fast-er-downloads-a2abd512aa26) by Noel Gorelick. Credits to Noel.  

Due to a [limitation](https://docs.python.org/3/library/multiprocessing.html) of the [multiprocessing](https://docs.python.org/3/library/multiprocessing.html) package, the code in this notebook can only be run at the top level of a script or notebook. Therefore, it could not be implemented as a function in geemap.
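
One way to see where this restriction comes from: `multiprocessing` hands the worker function to child processes by pickling it, and `pickle` stores a function only as a reference to its module and name. The sketch below is illustrative only (the function names are hypothetical and not part of geemap). It checks which kinds of callables survive pickling.

```python
import pickle


def top_level_worker(x):
    """Defined at module top level, so it can be looked up by name."""
    return x * 2


def make_nested_worker():
    """Return a function defined inside another function."""
    def nested_worker(x):
        return x * 2
    return nested_worker


def can_pickle(func):
    """Return True if pickle can serialize the callable."""
    try:
        pickle.dumps(func)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


print(can_pickle(len))                   # built-in function, referenced by name
print(can_pickle(make_nested_worker()))  # nested function, cannot be referenced by name
```

A function such as `top_level_worker` can be pickled when it lives in an importable module, which is why the worker functions later in this notebook are written to separate `.py` files before being used with a process pool.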

### Install packages

Uncomment the following line to install the required packages.

In [None]:
# !pip install geemap retry joblib geopandas shapely



```
import os
import ee

# Route traffic through a local proxy (adjust or remove these lines for your network)
os.environ['HTTP_PROXY'] = "http://127.0.0.1:7890"
os.environ['HTTPS_PROXY'] = "http://127.0.0.1:7890"
os.environ['all_proxy'] = 'socks5://127.0.0.1:7890'
os.environ['CURL_CA_BUNDLE'] = ''

ee.Authenticate()
ee.Initialize(project='zhuxiaobo')
```

### Import libraries

In [1]:
import ee
import geemap
import logging
import multiprocessing
import os
import requests
import shutil
from retry import retry
from joblib import Parallel, delayed

import geopandas as gpd
from shapely.geometry import Point, Polygon
geemap.set_proxy(port=7890)

### Initialize GEE to use the high-volume endpoint

- [high-volume endpoint](https://developers.google.com/earth-engine/cloud/highvolume)

In [2]:
ee.Initialize(opt_url="https://earthengine-highvolume.googleapis.com")

### Create an interactive map

In [188]:
Map = geemap.Map(layout={'height': '600px'})

Map


### Define the Region of Interest (ROI)

You can use the drawing tools on the map to draw an ROI, then you can use `Map.user_roi` to retrieve the geometry. Alternatively, you can define the ROI as an ee.Geometry as shown below.

In [290]:
# ID = '20181007T020649_20181007T021611_T52SDD'
# ID = '20181007T020649_20181007T021611_T52SEF'
# ID = '20181007T020649_20181007T021611_T52SEE'
ID = '20181014T015649_20181014T015649_T52SGF'
ID = '20230604T013659_20230604T014516_T53SMT'
ID = '20230604T013659_20230604T014516_T53SNU'
ID = '20230619T013701_20230619T013709_T53SMU'
ID = '20230629T013701_20230629T014140_T53SMT'
ID = '20230702T014651_20230702T015725_T53SKT'
ID = '20230722T014651_20230722T015635_T53SKT'
ID = '20230717T014659_20230717T015534_T53SMV'
ID = '20230821T014701_20230821T015532_T53SNV'
ID = '20220927T013659_20220927T014529_T53SQU'
ID = '20220927T013659_20220927T014529_T53SPU'
ID = '20220805T012659_20220805T013242_T54SUJ'
ID = '20220817T011711_20220817T012443_T54TYN'
ID = '20220822T011659_20220822T011653_T54TYN'
ID = '20220822T011659_20220822T011653_T54TYM'
ID = '20220825T012659_20220825T013238_T54TWM'
ID = '20210719T013701_20210719T014459_T53SPB'
ID = '20210719T013701_20210719T014459_T53SNV'
ID = '20210714T013659_20210714T013655_T53SNV'
ID = '20210724T013659_20210724T013655_T53SNA'
ID = '20160904T161342_20160904T162740_T16PCC' #HOG
ID = '20200913T160911_20200913T162053_T16PCC' #HOG
ID = '20200918T160839_20200918T161640_T16PCC' #HOG

ID = '20180911T152629_20180911T152630_T18QYF' #HT
ID = '20180926T152641_20180926T152747_T18QYF' #HT
ID = '20220707T152651_20220707T152652_T18QYF' #HT
ID = '20220722T152649_20220722T152644_T18QYF' #HT
ID = '20220717T152651_20220717T152651_T18QYF' #HT

ID = '20230313T021531_20230313T022730_T50LKR' #Bali

ID = '20200123T031021_20200123T032120_T48MWU' #Java West
ID = '20200212T030831_20200212T031435_T48MWU' #Java West

JSROI = ee.Geometry.MultiPolygon(
        [[[[105.33686472767602, -6.046645062507107],
           [105.33686472767602, -6.080102467150699],
           [105.36536051625023, -6.080102467150699],
           [105.36536051625023, -6.046645062507107]]],
         [[[105.42166544789086, -5.987918906320247],
           [105.42166544789086, -6.028720600803315],
           [105.4405481993557, -6.028720600803315],
           [105.4405481993557, -5.987918906320247]]],
         [[[105.39196802967797, -5.980236237322712],
           [105.39196802967797, -6.005503275101339],
           [105.4185755431057, -6.005503275101339],
           [105.4185755431057, -5.980236237322712]]]])

### Set parameters

If you want the exported images to be georeferenced, change `format` to `GEO_TIFF`. Otherwise, you can use the `png` or `jpg` format.
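
As a rough illustration of how the `format` value relates to the file extension of each chip, here is a minimal sketch. The helper name `extension_for` is hypothetical, and the mapping assumes that `ZIPPED_GEO_TIFF` is delivered as a zip archive and `NPY` as a `.npy` file.

```python
# Hypothetical helper: map an export format to a file extension.
FORMAT_EXTENSIONS = {
    "png": "png",
    "jpg": "jpg",
    "GEO_TIFF": "tif",
    "ZIPPED_GEO_TIFF": "zip",
    "NPY": "npy",
}


def extension_for(fmt):
    """Return the file extension for a supported format, or raise ValueError."""
    try:
        return FORMAT_EXTENSIONS[fmt]
    except KeyError:
        raise ValueError(f"Unsupported format: {fmt!r}") from None


print(extension_for("GEO_TIFF"))
```

Failing early on an unknown format avoids starting hundreds of downloads that would all be saved with a wrong extension.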

### Define the image source


This example uses 10 m [Sentinel-2 imagery](https://developers.google.com/earth-engine/datasets/catalog/COPERNICUS_S2_SR#bands). The code below reads it from the `COPERNICUS/S2_HARMONIZED` collection.

In [232]:
JAPlist=[
('2018-07-13','2018-07-15'),# JAP 2 images
('2018-10-03','2018-10-05'),# JAP
('2018-09-01','2018-09-05'),#korea
('2018-10-05','2018-10-08'),#korea
]
Debrisindex = 3

In [195]:
    #   .copyProperties(image, image.propertyNames())
def mask_s2_clouds(image):
  """Masks clouds in a Sentinel-2 image using the QA band.

  Args:
      image (ee.Image): A Sentinel-2 image.

  Returns:
      ee.Image: A cloud-masked Sentinel-2 image.
  """
  qa = image.select('QA60')

  # Bits 10 and 11 are clouds and cirrus, respectively.
  cloud_bit_mask = 1 << 10
  cirrus_bit_mask = 1 << 11

  # Both flags should be set to zero, indicating clear conditions.
  mask = (
      qa.bitwiseAnd(cloud_bit_mask)
      .eq(0)
      .And(qa.bitwiseAnd(cirrus_bit_mask).eq(0))
  )
  # Also mask pixels that lie within 15 pixels of a cloud or cirrus pixel.
  mask = mask.Not().fastDistanceTransform().sqrt().gt(15)

  return image.select("B.*").updateMask(mask)
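
To make the bit test concrete, here is a minimal plain-Python sketch of the same check applied to a single integer `QA60` value. The helper name `is_clear` is hypothetical. Only the bit positions (10 for clouds, 11 for cirrus) come from the function above.

```python
CLOUD_BIT_MASK = 1 << 10   # 1024
CIRRUS_BIT_MASK = 1 << 11  # 2048


def is_clear(qa_value):
    """Return True when neither the cloud bit nor the cirrus bit is set."""
    return (qa_value & CLOUD_BIT_MASK) == 0 and (qa_value & CIRRUS_BIT_MASK) == 0


for qa in (0, 1024, 2048, 3072):
    print(qa, is_clear(qa))
```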


In [291]:
region = Map.user_roi

# Define the DeltaR function
def DeltaR(kernel=3):
    def process(image):
        # Compute the neighborhood minimum
        Rmin = image.reduceNeighborhood(
            reducer=ee.Reducer.min(),
            kernel=ee.Kernel.square(kernel)
        )
        # Subtract the neighborhood minimum from the image
        subtracted = image.subtract(Rmin).regexpRename('^','Diff_')
        # Copy the properties and return the result
        return image.addBands(subtracted).divide(10000).copyProperties(image, image.propertyNames())
    return process

# Define a mapping function that extracts properties and formats the date
def extract_properties(image):
    date_str = ee.Date(image.get('system:time_start')).format('YYYY-MM-dd')
    image = image.set({
        'Date': date_str,
        'TargetType': 'Debris',
        'ProductType': 'L1C_TOA',
        'Institution': 'Nanjing University',
        'Author': 'Lu_Group',
    })
    return image
    
dem = ee.Image("NASA/NASADEM_HGT/001").select('elevation')
oceanMask = dem.gt(0).fastDistanceTransform().sqrt()  # Distance to the nearest land pixel, in pixels
oceanbufferMask = oceanMask.gt(20)  # Keep pixels more than 20 pixels away from land

# .filterDate(*JAPlist[Debrisindex])
        # .filterBounds(JSROI)
        # .filter(ee.Filter.lt('CLOUDY_PIXEL_PERCENTAGE', 60))        

S2 = (
    ee.ImageCollection("COPERNICUS/S2_HARMONIZED")
        .filter(ee.Filter.eq('system:index', ID))
        .map(mask_s2_clouds)
        .map(DeltaR(7))
        .map(lambda img: img.addBands(img.normalizedDifference(['Diff_B8', 'Diff_B4']).rename('NDVI')))
)

image = S2.first()
        
Label = (image
        .select('B8').updateMask(oceanbufferMask)
        .updateMask(image.select('Diff_B10').lt(0.0004))
        .updateMask(image.select('Diff_B8').divide(2).gt(image.select('Diff_B12')))
        .updateMask(image.select('Diff_B8').gt(0.00).And(image.select('NDVI').gt(-0.05)))#.gt(0)
        )
        
# compactness: 0.4,
neighborhoodCount = Label.reduceNeighborhood(
      **{
      'reducer':ee.Reducer.count(),
      'kernel': ee.Kernel.square(1),
  }).gte(3)# export 10m
  
Label = Label.updateMask(neighborhoodCount).mask().byte()

Sample = (image
        .select('B8')
        .updateMask(oceanbufferMask)
        .updateMask(image.select('Diff_B10').lt(0.0004))
        .updateMask(image.select('Diff_B8').divide(2).gt(image.select('Diff_B12')))
        .updateMask(image.select('Diff_B8').gt(0.004).And(image.select('NDVI').gt(0.1)))
        .updateMask(neighborhoodCount)
        .gt(0)
        # .clip(region)
        )
        # .copyProperties(image, image.propertyNames())

params = {
    "count": 50,  # How many image chips to export
    "buffer": 128*10,  # The buffer distance (m) around each point
    "scale": 60,  # The scale (m) used for stratified sampling in getRequests()
    "seed": 1,  # A randomization seed to use for subsampling
    "dimensions": "256x256",  # The dimensions of each image chip
    "format": "png",  # The output image format: png, jpg, ZIPPED_GEO_TIFF, GEO_TIFF, or NPY
    "prefix": image.get('system:index').getInfo()+"_",  # The filename prefix
    "processes": 200,  # How many processes to use for parallel processing
    "R_dir": "E:/SDGChips/Chip/R",  # Output directory for the band (B*) GeoTIFF chips
    "DiffR_dir": "E:/SDGChips/Chip/DiffR",  # Output directory for the Diff_* GeoTIFF chips
    "TCI_dir": "E:/SDGChips/Pic/TCI",  # Output directory for the B4/B3/B2 PNG chips
    "FCI_dir": "E:/SDGChips/Pic/FCI",  # Output directory for the B4/B8/B2 PNG chips
    "Fea_dir": "E:/SDGChips/Feature",  # Output directory for the per-chip GeoJSON files
    "Label_dir": "E:/SDGChips/Chip/Label",  # Output directory for the label GeoTIFF chips
}

image = (
    ee.ImageCollection("COPERNICUS/S2_HARMONIZED")
        .filter(ee.Filter.eq('system:index', ID))
        .select("B.*")
        .map(extract_properties)
        .map(DeltaR(7))
        .map(lambda img: img.addBands(img.normalizedDifference(['Diff_B8', 'Diff_B4']).rename('NDVI')))
).first()

# Get the image properties
properties = ['system:index', 'ProductType', 'SPACECRAFT_NAME', 'MGRS_TILE','TargetType', 'Date','Institution','Author']
image_properties = image.getInfo()['properties']
filtered_properties = {key: image_properties[key] for key in properties if key in image_properties}
# Note: getInfo() makes a client-side request and can be slow. If needed, fetch only the required properties.
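
To make the `DeltaR` step concrete, here is a minimal pure-Python sketch of the same idea on a small 2-D grid: each value has the minimum of its square neighborhood subtracted from it. The function name `delta_r` is hypothetical, and clipping the window at the grid edge is an assumption made for simplicity. Earth Engine's own edge and mask handling may differ.

```python
def delta_r(grid, radius=1):
    """Subtract the minimum of the (2 * radius + 1) square window from each cell."""
    rows, cols = len(grid), len(grid[0])
    result = []
    for r in range(rows):
        row_out = []
        for c in range(cols):
            # Collect the neighborhood, clipped to the grid bounds.
            window = [
                grid[rr][cc]
                for rr in range(max(0, r - radius), min(rows, r + radius + 1))
                for cc in range(max(0, c - radius), min(cols, c + radius + 1))
            ]
            row_out.append(grid[r][c] - min(window))
        result.append(row_out)
    return result


# A bright target (5) on a uniform background (1).
grid = [
    [1, 1, 1],
    [1, 5, 1],
    [1, 1, 1],
]
for row in delta_r(grid):
    print(row)
```

The uniform background cancels out and only the locally bright cell keeps a positive difference, which is the property the `Diff_*` bands are thresholded on.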

### Add layers to map

In [294]:
vis_params = {
    "bands": ["B4", "B8", "B2"],
    "min": 0.02,
    "max": 0.15,
    "gamma": 1.5  # Adjust gamma to enhance the difference between the target and water background
}

ndvi_params = {
    "bands": ["NDVI"],
    "min": 0,
    "max":0.2,
    "palette": ["00FFFF", "0000FF"]
}

Map.addLayer(image,vis_params, "Image")
# Map.addLayer(S2.first(),vis_params, "CLD")
# Map.addLayer(image.select('NDVI'), ndvi_params, "NDVI")
Map.addLayer(JSROI, {}, "ROI", False)
Map.addLayer(Sample, {"min": 0, "max": 1}, "Sample", False)
Map.addLayer(Label,{"min": 0, "max": 1}, "Label", False)
# Map.centerObject(region, 12)
# Map.setCenter(-122.4415, 37.7555, 12)
Map


### Generate a list of work items

In this example, we generate up to `params["count"]` points (50 here) using stratified random sampling, which requires a `classBand`: the name of the band containing the classes to use for stratification. If unspecified, the first band of the input image is used. The original example added a new band with a constant value (e.g., 1) to the image for this purpose; here the single-band `Sample` mask plays that role. Fewer points may be returned if the masked image has too few valid pixels. The `getRequests()` function returns a list of dictionaries containing the point geometries.

In [25]:
def getRequests():
    # img = ee.Image(1).rename("Class").updateMask(Sample.mask())
    points = Sample.stratifiedSample(
        numPoints=params["count"],
        # region=region,
        region= image.geometry().intersection(JSROI,0.01),
        scale=params["scale"],
        seed=params["seed"],
        geometries=True,
    )
    Map.data = points
    return points.aggregate_array(".geo").getInfo()
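
The sketch below illustrates the idea of seeded stratified sampling in plain Python. It is not Earth Engine's algorithm, and the function name `stratified_sample` and the toy pixel list are hypothetical. It shows two properties that matter here: a fixed seed gives a repeatable selection, and a class with fewer valid pixels than requested yields fewer points.

```python
import random
from collections import defaultdict


def stratified_sample(pixels, num_points, seed):
    """Pick up to num_points locations per class from (class_value, location) pairs."""
    by_class = defaultdict(list)
    for class_value, location in pixels:
        by_class[class_value].append(location)

    rng = random.Random(seed)  # a fixed seed gives a repeatable selection
    sample = {}
    for class_value in sorted(by_class):
        locations = by_class[class_value]
        k = min(num_points, len(locations))  # a class may hold fewer pixels than requested
        sample[class_value] = rng.sample(locations, k)
    return sample


pixels = [(1, (x, 0)) for x in range(43)]  # 43 valid pixels in a single class
picked = stratified_sample(pixels, num_points=50, seed=1)
print(len(picked[1]))
```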

### Create a function for downloading image

The `getPNGResult()` function takes one of those points and generates two images centered on that location (bands B4/B3/B2 and B4/B8/B2), which are then downloaded as PNGs and saved to files. This function uses `image.getThumbURL()` to select the pixels. The `getTifResult()` function that follows uses `image.getDownloadURL()` instead, which is the call to use if you want the output in GeoTIFF or NumPy format ([source](https://gorelick.medium.com/fast-er-downloads-a2abd512aa26)).
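
A quick arithmetic check on the chip geometry used by the download functions: each point is buffered by `params["buffer"]` meters and the bounding box is rendered at `params["dimensions"]` pixels. The sketch below (the helper name `chip_pixel_size` is hypothetical) treats the bounding box as a square whose side is twice the buffer distance, which is an approximation that ignores projection effects.

```python
def chip_pixel_size(buffer_m, dimensions):
    """Return the approximate (x, y) ground size of one pixel in meters."""
    width_px, height_px = (int(v) for v in dimensions.split("x"))
    side_m = 2 * buffer_m  # the buffer extends on both sides of the point
    return side_m / width_px, side_m / height_px


print(chip_pixel_size(128 * 10, "256x256"))
```

With a 1280 m buffer and 256x256 pixels, each pixel covers roughly 10 m, which matches the native resolution of the 10 m Sentinel-2 bands.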

In [48]:
TCI_vis_params = {
    "bands": ["B4", "B3", "B2"],
    "min": 0.02,
    "max": 0.15,
    "gamma": 0.8  # Adjust gamma to enhance the difference between the target and water background
}
FCI_vis_params = {
    "bands": ["B4", "B8", "B2"],
    "min": 0.02,
    "max": 0.15,
    "gamma": 1.2  # Adjust gamma to enhance the difference between the target and water background
}
  
@retry(tries=10, delay=1, backoff=2)
def getPNGResult(index, point):
    centroid = point["coordinates"]
    point = ee.Geometry.Point(centroid)
    region = point.buffer(params["buffer"]).bounds()

    url_TCI = image.visualize(**TCI_vis_params).getThumbURL(
        {
            "region": region,
            "dimensions": params["dimensions"],
            "format": 'png',
        }
    )
    url_FCI = image.visualize(**FCI_vis_params).getThumbURL(
        {
            "region": region,
            "dimensions": params["dimensions"],
            "format": 'png',
        }
    )

    #png TCI
    r = requests.get(url_TCI, stream=True)
    if r.status_code != 200:
        r.raise_for_status()
    out_dir = os.path.abspath(params["TCI_dir"])
    basename = str(index).zfill(len(str(params["count"])))
    filename = f"{out_dir}/{params['prefix']}{basename}.png"
    with open(filename, "wb") as out_file:
        shutil.copyfileobj(r.raw, out_file)
        
    #png FCI
    r = requests.get(url_FCI, stream=True)
    if r.status_code != 200:
        r.raise_for_status()
    out_dir = os.path.abspath(params["FCI_dir"])
    basename = str(index).zfill(len(str(params["count"])))
    filename = f"{out_dir}/{params['prefix']}{basename}.png"
    with open(filename, "wb") as out_file:
        shutil.copyfileobj(r.raw, out_file)
                   
    print("PNGDone: ", basename)
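
The filename logic repeated in the download functions can be read as one small helper. This is a sketch with a hypothetical name (`chip_filename`), not a function the notebook defines. It zero-pads the index to the number of digits in `count` so that files sort in order.

```python
def chip_filename(out_dir, prefix, index, count, ext):
    """Build a chip path whose index is zero-padded to the width of count."""
    basename = str(index).zfill(len(str(count)))
    return f"{out_dir}/{prefix}{basename}.{ext}"


print(chip_filename("chips", "demo_", 7, 50, "png"))
```

The download functions write with `open()`, which does not create missing directories, so one option is to call `os.makedirs(out_dir, exist_ok=True)` for each output directory before starting the downloads.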

In [27]:
@retry(tries=10, delay=1, backoff=2)
def getTifResult(index, point):
    centroid = point["coordinates"]
    point = ee.Geometry.Point(centroid)
    region = point.buffer(params["buffer"]).bounds()

    url_tif = image.select('B.*').getDownloadURL(
        {
            "region": region,
            "dimensions": params["dimensions"],
            "crs":'EPSG:4326',
            "format": 'GEO_TIFF',
        }
    )

    url_diff = image.select('Diff.*').getDownloadURL(
        {
            "region": region,
            "dimensions": params["dimensions"],
            "crs":'EPSG:4326',
            "format": 'GEO_TIFF',
        }
    )
    url_Label = Label.getDownloadURL(
        {
            "region": region,
            "dimensions": params["dimensions"],
            "crs":'EPSG:4326',
            "format": 'GEO_TIFF',
        }
    )
       
    #tif R
    r = requests.get(url_tif, stream=True)
    if r.status_code != 200:
        r.raise_for_status()
    out_dir = os.path.abspath(params["R_dir"])
    basename = str(index).zfill(len(str(params["count"])))
    filename = f"{out_dir}/{params['prefix']}{basename}.tif"
    with open(filename, "wb") as out_file:
        shutil.copyfileobj(r.raw, out_file)
        
    #tif Diff
    r = requests.get(url_diff, stream=True)
    if r.status_code != 200:
        r.raise_for_status()
    out_dir = os.path.abspath(params["DiffR_dir"])
    basename = str(index).zfill(len(str(params["count"])))
    filename = f"{out_dir}/{params['prefix']}{basename}.tif"
    with open(filename, "wb") as out_file:
        shutil.copyfileobj(r.raw, out_file)
        
    #Label
    r = requests.get(url_Label, stream=True)
    if r.status_code != 200:
        r.raise_for_status()
    out_dir = os.path.abspath(params["Label_dir"])
    basename = str(index).zfill(len(str(params["count"])))
    filename = f"{out_dir}/{params['prefix']}{basename}.tif"
    with open(filename, "wb") as out_file:
        shutil.copyfileobj(r.raw, out_file)
    
    print("FifDone: ", basename)
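
The `@retry(tries=10, delay=1, backoff=2)` decorator on the download functions retries a failing call with exponentially growing waits. The following sketch (the helper `backoff_delays` is hypothetical and does not call the `retry` package) computes the waiting schedule under the assumption that the wait starts at `delay` seconds and is multiplied by `backoff` after each failed attempt.

```python
def backoff_delays(tries, delay, backoff):
    """Return the waits (in seconds) between attempts when every attempt fails."""
    delays = []
    current = delay
    for _ in range(tries - 1):  # there is no wait after the final attempt
        delays.append(current)
        current *= backoff
    return delays


delays = backoff_delays(tries=10, delay=1, backoff=2)
print(delays, sum(delays))
```

Under that assumption, a request that keeps failing occupies its worker for more than eight minutes before the error surfaces, which is worth keeping in mind when a run appears to stall.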

In [28]:
@retry(tries=10, delay=1, backoff=2)
def getShpResult(index, point):
    centroid = point["coordinates"]
    point = ee.Geometry.Point(centroid)
    region = point.buffer(params["buffer"]).bounds()

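    # Get the coordinates of the buffered bounding polygon
    region_coords = region.coordinates().get(0).getInfo()
    # Convert the coordinate list to a shapely Polygon
    region_geom = Polygon(region_coords)
    
    # Create a GeoDataFrame containing the point and region geometries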
    gdf = gpd.GeoDataFrame(
        [{
            'geometry': region_geom,
            'footprint': region_geom.wkt,
            'centroid': Point(centroid).wkt,
            **filtered_properties
        }],
        geometry='geometry',
        crs='EPSG:4326'
    )
    # gdf.set_geometry('geometry')
    # Save as GeoJSON
    shapefile_dir = params["Fea_dir"]
    os.makedirs(shapefile_dir, exist_ok=True)
    basename = str(index).zfill(len(str(params["count"])))
    shapefile_path = os.path.join(shapefile_dir, f"{params['prefix']}{basename}.json")
    gdf.to_file(shapefile_path, driver='GeoJSON', encoding='utf-8')
       
    print("ShpDone: ", basename)

In [None]:
'20181014T015649_20181014T015649_T52SGF'

In [302]:
import os
import glob
import geopandas as gpd
import pandas as pd  # Import pandas

# Set the target folder path
folder_path = r'E:\SDGChips\Feature'

# Set the output file path
output_json = os.path.join(folder_path, 'merged.json')

# Get the paths of all per-chip GeoJSON files, skipping a previously merged file
geojson_files = [
    f for f in glob.glob(os.path.join(folder_path, '*.json'))
    if os.path.abspath(f) != os.path.abspath(output_json)
]

# Initialize an empty list to store the GeoDataFrames
gdf_list = []

# Read each GeoJSON file and append it to the list
for geojson_file in geojson_files:
    gdf = gpd.read_file(geojson_file, encoding='utf-8')
    # Strip the extension and the trailing "_NN" chip index (assumes a two-digit index)
    basename = os.path.basename(geojson_file).split('.')[0][:-3]
    system_index = basename
    mgrs_tile = basename.split('_')[2][1:]
    date = basename[:4] + '-' + basename[4:6] + '-' + basename[6:8]

    gdf['system:index'] = system_index
    gdf['MGRS_TILE'] = mgrs_tile
    gdf['Date'] = date
    # Optional: unify the CRS
    # gdf = gdf.to_crs('EPSG:4326')
    gdf_list.append(gdf)
    # Overwrite the individual JSON file
    gdf.to_file(geojson_file, driver='GeoJSON', encoding='utf-8')

# Merge all GeoDataFrames with pd.concat
merged_gdf = pd.concat(gdf_list, ignore_index=True)

# Save the merged GeoDataFrame as GeoJSON
merged_gdf.to_file(output_json, driver='GeoJSON', encoding='utf-8')

print('All GeoJSON files have been merged and saved to:', output_json)

All GeoJSON files have been merged and saved to: E:\SDGChips\Feature\merged.json
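
The merge cell recovers the image ID, tile, and date from each filename by fixed-width slicing. As an alternative sketch (the helper name `parse_chip_name` is hypothetical), splitting on the last underscore avoids depending on the number of digits in the chip index. It assumes filenames of the form `<system:index>_<chip index>.json`, as produced by `getShpResult()`.

```python
import os


def parse_chip_name(path):
    """Split a chip filename into system:index, MGRS tile, date, and chip index."""
    stem = os.path.splitext(os.path.basename(path))[0]
    system_index, chip_index = stem.rsplit("_", 1)  # works for any index width
    sensing_start, _, tile = system_index.split("_")
    return {
        "system:index": system_index,
        "MGRS_TILE": tile[1:],  # drop the leading "T"
        "Date": f"{sensing_start[:4]}-{sensing_start[4:6]}-{sensing_start[6:8]}",
        "chip_index": int(chip_index),
    }


info = parse_chip_name("20200212T030831_20200212T031435_T48MWU_07.json")
print(info)
```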


### Download images

In [295]:
%%time

logging.basicConfig()
items = getRequests()

Parallel(n_jobs=300,backend='threading')(delayed(getPNGResult)(i, item) for i, item in enumerate(items))
Parallel(n_jobs=300,backend='threading')(delayed(getTifResult)(i, item) for i, item in enumerate(items))
Parallel(n_jobs=300,backend='threading')(delayed(getShpResult)(i, item) for i, item in enumerate(items))



PNGDone:  08
PNGDone:  03
PNGDone:  07
PNGDone:  02
PNGDone:  15
PNGDone:  42
PNGDone:  05
PNGDone:  29
PNGDone:  31
PNGDone:  25
PNGDone:  18
PNGDone:  35
PNGDone:  13
PNGDone:  06
PNGDone:  39
PNGDone:  11
PNGDone:  19
PNGDone:  34
PNGDone:  04
PNGDone:  28
PNGDone:  26
PNGDone:  17
PNGDone:  09
PNGDone:  14
PNGDone:  41
PNGDone:  37
PNGDone:  20
PNGDone:  21
PNGDone:  16
PNGDone:  01
PNGDone:  27
PNGDone:  30
PNGDone:  22
PNGDone:  38
PNGDone:  24
PNGDone:  32
PNGDone:  10
PNGDone:  36
PNGDone:  23
PNGDone:  33
PNGDone:  00
PNGDone:  12
PNGDone:  40




FifDone:  04
FifDone:  08
FifDone:  05
FifDone:  34
FifDone:  22
FifDone:  02
FifDone:  00
FifDone:  25
FifDone:  29
FifDone:  18
FifDone:  27
FifDone:  40
FifDone:  28
FifDone:  06
FifDone:  32
FifDone:  42
FifDone:  38
FifDone:  07
FifDone:  37
FifDone:  09
FifDone:  03
FifDone:  01
FifDone:  31
FifDone:  12
FifDone:  36
FifDone:  14
FifDone:  26
FifDone:  21
FifDone:  17
FifDone:  24
FifDone:  33
FifDone:  35
FifDone:  13
FifDone:  10
FifDone:  30
FifDone:  19
FifDone:  16
FifDone:  11
FifDone:  20
FifDone:  41
FifDone:  15
FifDone:  23
FifDone:  39




ShpDone:  01
ShpDone:  21




ShpDone:  08
ShpDone:  22
ShpDone:  24
ShpDone:  33
ShpDone:  06




ShpDone:  17
ShpDone:  16
ShpDone:  39
ShpDone:  15
ShpDone:  40
ShpDone:  31




ShpDone:  28
ShpDone:  20




ShpDone:  00
ShpDone:  13




ShpDone:  26




ShpDone:  19
ShpDone:  03
ShpDone:  34




ShpDone:  04




ShpDone:  07
ShpDone:  32
ShpDone:  09
ShpDone:  25




ShpDone:  29




ShpDone: ShpDone:  41
 10
ShpDone:  42




ShpDone:  05
ShpDone:  23
ShpDone:  27




ShpDone:  30
ShpDone:  14




ShpDone:  38
ShpDone:  36
ShpDone:  11




ShpDone:  37
ShpDone:  18
ShpDone:  35
ShpDone:  02
ShpDone:  12
CPU times: total: 15.4 s
Wall time: 46.5 s
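
The download cell uses joblib's threading backend, which suits this workload because each task mostly waits on network I/O. As an alternative sketch using only the standard library (this is not what the notebook runs, and `run_in_threads` and `fake_download` are hypothetical names), a `ThreadPoolExecutor` can dispatch the same `(index, item)` calls and record which indices failed.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def run_in_threads(func, items, max_workers=8):
    """Call func(index, item) for every item and collect results and errors."""
    results, errors = {}, {}
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        futures = {
            executor.submit(func, index, item): index
            for index, item in enumerate(items)
        }
        for future in as_completed(futures):
            index = futures[future]
            try:
                results[index] = future.result()
            except Exception as exc:  # keep going if one task fails
                errors[index] = exc
    return results, errors


def fake_download(index, item):
    """Stand-in for a download function; fails on one item to show error capture."""
    if item == "bad":
        raise ValueError("simulated failure")
    return f"{index}:{item}"


results, errors = run_in_threads(fake_download, ["a", "bad", "c"])
print(sorted(results.items()), sorted(errors))
```

Keeping the failed indices makes it possible to rerun only the chips that are missing instead of the whole batch.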



### Retrieve sample points

In [292]:
len(getRequests())

43

In [293]:
Map.addLayer(Map.data, {}, "Sample points")
Map


In [79]:
%%writefile SDGMP.py
import ee
import geemap
import logging
import multiprocessing
import os
import requests
import shutil
from retry import retry

# ee.Initialize(opt_url="https://earthengine-highvolume.googleapis.com")

@retry(tries=10, delay=1, backoff=2)
def getResult(kws):
    # Unpack the arguments
    index = kws['index']
    item = kws['item']
    region = kws['region']
    image = kws['image']
    params = kws['params']
    
    point = ee.Geometry.Point(item["coordinates"])
    region = point.buffer(params["buffer"]).bounds()

    if params["format"] in ["png", "jpg"]:
        url = image.getThumbURL(
            {
                "region": region,
                "dimensions": params["dimensions"],
                "format": params["format"],
            }
        )
    else:
        url = image.getDownloadURL(
            {
                "region": region,
                "dimensions": params["dimensions"],
                "format": params["format"],
            }
        )

    if params["format"] == "GEO_TIFF":
        ext = "tif"
    else:
        ext = params["format"]
    print("URL", url)

    r = requests.get(url, stream=True)
    if r.status_code != 200:
        r.raise_for_status()

    out_dir = os.path.abspath(params["out_dir"])
    basename = str(index).zfill(len(str(params["count"])))
    filename = f"{out_dir}/{params['prefix']}{basename}.{ext}"
    with open(filename, "wb") as out_file:
        shutil.copyfileobj(r.raw, out_file)
    print("Done: ", basename)

Overwriting SDGMP.py


In [8]:
%%writefile SDGPartial.py
import ee
import geemap
import logging
import multiprocessing
import os
import requests
import shutil
from retry import retry

ee.Initialize(opt_url="https://earthengine-highvolume.googleapis.com")

@retry(tries=10, delay=1, backoff=2)
def getResult(index, item,*,region, image, params):
    
    point = ee.Geometry.Point(item["coordinates"])
    region = point.buffer(params["buffer"]).bounds()

    if params["format"] in ["png", "jpg"]:
        url = image.getThumbURL(
            {
                "region": region,
                "dimensions": params["dimensions"],
                "format": params["format"],
            }
        )
    else:
        url = image.getDownloadURL(
            {
                "region": region,
                "dimensions": params["dimensions"],
                "format": params["format"],
            }
        )

    if params["format"] == "GEO_TIFF":
        ext = "tif"
    else:
        ext = params["format"]
    print("URL", url)

    r = requests.get(url, stream=True)
    if r.status_code != 200:
        r.raise_for_status()

    out_dir = os.path.abspath(params["out_dir"])
    basename = str(index).zfill(len(str(params["count"])))
    filename = f"{out_dir}/{params['prefix']}{basename}.{ext}"
    with open(filename, "wb") as out_file:
        shutil.copyfileobj(r.raw, out_file)
    print("Done: ", basename)

Overwriting SDGPartial.py


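Because Jupyter does not support running notebook-defined functions in worker processes, this step saves the function to a `.py` file so that it can be imported and called. Note that `getResult()` in these files reads `params["out_dir"]`, which the `params` dictionary defined earlier does not include, so add that key before running the cells below.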

In [None]:
%%time
# Import the function from the external file
import SDGMP
# from functools import partial
# Create a partial function that fixes region, image, and params
# kwargs = {'region': region, 'image': image, 'params': params}
# getResult_partial = partial(SDGMP.getResult, **kwargs)

logging.basicConfig()
items = getRequests()

# Prepare the argument dictionaries, one per point
kws_list = (
    {
        'index': index,
        'item': item,
        'region': region,
        'image': image,
        'params': params
    }
    for index, item in enumerate(items)
)

pool = multiprocessing.Pool(25)
pool.map(SDGMP.getResult, kws_list)

pool.close()



# from functools import partial

In [10]:
%%time
# Import the function from the external file
import SDGPartial
from functools import partial
# Create a partial function that fixes region, image, and params
kwargs = {'region': region, 'image': image, 'params': params}
getResult_partial = partial(SDGPartial.getResult, **kwargs)

logging.basicConfig()
items = getRequests()

pool = multiprocessing.Pool(25)
pool.starmap(getResult_partial, enumerate(items))

pool.close()



CPU times: total: 156 ms
Wall time: 1min 20s
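
The `partial` plus `starmap` pattern can be tried without Earth Engine or worker processes. In the sketch below, `get_result` is a hypothetical stand-in with the same signature style as `SDGPartial.getResult`, and `itertools.starmap` is used in place of `Pool.starmap` because it unpacks each `(index, item)` tuple into positional arguments in the same way.

```python
from functools import partial
from itertools import starmap


def get_result(index, item, *, region, image, params):
    """Stand-in worker: only index and item vary from call to call."""
    return f"{params['prefix']}{index}:{item['coordinates']}"


# Fix the keyword-only arguments once; each call then receives (index, item).
worker = partial(get_result, region=None, image=None, params={"prefix": "chip_"})

items = [{"coordinates": [105.3, -6.0]}, {"coordinates": [105.4, -6.1]}]
print(list(starmap(worker, enumerate(items))))
```

With a real process pool, everything bound by `partial` is pickled and sent to the workers, so the fixed arguments need to be picklable.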


In [None]:
%%time
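# Import the function from the external file
import SDGMP

logging.basicConfig()
# items = getRequests()
# Build one argument dictionary per point, as expected by SDGMP.getResult(kws)
items = [
    {'index': index, 'item': item, 'region': region, 'image': image, 'params': params}
    for index, item in enumerate(getRequests())
]

pool = multiprocessing.Pool(32)
pool.map(SDGMP.getResult, items)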

pool.close()



CPU times: total: 172 ms
Wall time: 34.3 s
