# Access and Download of Raw NEE and Environmental Predictor Data

Author: Aline Andrade do Nascimento

Affiliation: PhD Candidate in Applied Computing, National Institute for Space Research (INPE)

Contact: alinephysics@gmail.com | aline.andrade@inpe.br

Date: July 2025

This notebook provides code snippets and links to access and download raw environmental datasets commonly used as predictors in Net Ecosystem Exchange (NEE) modeling in the Amazon.
It does not include any data preprocessing or analysis. The main goal is to centralize the procedures for retrieving datasets from multiple sources such as Google Earth Engine (GEE), ECMWF (ERA5), NASA (CERES), and others.

# ☀️ ERA5 Hourly Data on Single Levels (1940–present)

**Dataset:** [ERA5 - Single Levels (Copernicus Climate Data Store)](https://cds.climate.copernicus.eu/datasets/reanalysis-era5-single-levels?tab=overview)  
**Provider:** ECMWF / Copernicus Climate Data Store (CDS)

**Description:**  
ERA5 provides hourly estimates of a wide range of climate variables at a single level (surface or near-surface), from 1940 to the present. This reanalysis product is generated by combining model data with observations using data assimilation techniques.

**Common variables used as predictors for NEE modeling include:**
- `2m_temperature`: Air temperature at 2 meters
- `total_precipitation`: Precipitation (rain + snow)
- `evaporation`: Total evaporation from surface
- `volumetric_soil_water_layer_1`: Soil moisture in upper layer

**Requirements:**
- A CDS account (free registration) is required to use the API.
- Data are downloaded using the CDS Python API (`cdsapi`).
- Files are typically saved in NetCDF format.
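
One way to keep the API key out of the notebook is to read it from environment variables. The sketch below assumes the variables `CDSAPI_URL` and `CDSAPI_KEY` are set (these are the names `cdsapi` itself looks for); the helper `cds_credentials` and the constant `DEFAULT_CDS_URL` are hypothetical names introduced only for this illustration.

```python
import os

# Default endpoint, same URL used elsewhere in this notebook
DEFAULT_CDS_URL = "https://cds.climate.copernicus.eu/api"


def cds_credentials(env=os.environ):
    """Return the CDS URL and key from an environment-like mapping."""
    key = env.get("CDSAPI_KEY")
    if not key:
        raise RuntimeError(
            "CDSAPI_KEY is not set; copy the key from your CDS profile page."
        )
    return {"url": env.get("CDSAPI_URL", DEFAULT_CDS_URL), "key": key}


# Usage (requires cdsapi and a valid key):
# import cdsapi
# client = cdsapi.Client(**cds_credentials())
```

Passing the mapping as an argument keeps the helper easy to check with a plain dictionary before any request is sent.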

In [1]:
pip install 'cdsapi>=0.7.2'



In [2]:
import cdsapi

cdsapi_key = 'YOUR-CDS-API-KEY'  # copy the key from your CDS profile page
client = cdsapi.Client(url='https://cds.climate.copernicus.eu/api', key=cdsapi_key)

2025-07-07 23:21:06,711 INFO [2024-09-26T00:00:00] Watch our [Forum](https://forum.ecmwf.int/) for Announcements, news and other discussed topics.


In [None]:
list_years = ['2002','2003','2004','2005','2006','2007','2008','2009','2010','2011']
list_months = ['01','02', '03',
            '04', '05', '06',
            '07', '08', '09',
            '10', '11', '12']

list_days = ['01', '02', '03',
            '04', '05', '06',
            '07', '08', '09',
            '10', '11', '12',
            '13', '14', '15',
            '16', '17', '18',
            '19', '20', '21',
            '22', '23', '24',
            '25', '26', '27',
            '28', '29', '30',
            '31']
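
The request lists can also be generated instead of typed by hand, which makes it harder to drop an entry. This is a minimal sketch, assuming the CDS skips day and month combinations that do not exist (for example February 30); the names `years`, `months`, `days` and `hours` are illustrative.

```python
# Years 2002..2011 as strings, as expected by the CDS request
years = [str(y) for y in range(2002, 2012)]

# Zero-padded months 01..12 and days 01..31
months = [f"{m:02d}" for m in range(1, 13)]
days = [f"{d:02d}" for d in range(1, 32)]

# Hourly time steps 00:00..23:00
hours = [f"{h:02d}:00" for h in range(24)]

print(len(years), len(months), len(days), len(hours))
```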


In [None]:
era5_atmosphere_initial = [
    '2m_dewpoint_temperature', '2m_temperature', 'evaporation',
    'near_ir_albedo_for_diffuse_radiation', 'uv_visible_albedo_for_diffuse_radiation',
    'precipitation_type', 'runoff', 'soil_temperature_level_1',
    'total_precipitation', 'total_column_water_vapour',
]


In [None]:
# Spatial Bounds – Brazilian Amazon Biome (ERA5 Regular Grid)
north = 5.090000000000001
west = -73.97999999999999
south = -16.66
east = -43.48

# The CDS 'area' parameter is ordered north/west/south/east
print(f"{north}/{west}/{south}/{east}")
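
If the requested area should line up with the ERA5 grid, the bounds can be expanded outward to the nearest grid line. The sketch below assumes a regular 0.25° grid aligned to multiples of 0.25°; the function `snap_bounds_outward` is a hypothetical helper, not part of `cdsapi`.

```python
import math


def snap_bounds_outward(north, west, south, east, step=0.25):
    """Expand a bounding box so each edge falls on a multiple of `step` degrees."""
    return (
        math.ceil(north / step) * step,   # push the northern edge up
        math.floor(west / step) * step,   # push the western edge left
        math.floor(south / step) * step,  # push the southern edge down
        math.ceil(east / step) * step,    # push the eastern edge right
    )


area = list(snap_bounds_outward(5.09, -73.98, -16.66, -43.48))
print(area)  # [5.25, -74.0, -16.75, -43.25]
```

Expanding outward rather than rounding to the nearest value keeps the whole biome inside the requested box.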

In [None]:
import concurrent.futures

def download_data(year,variable):

    # Update the path to your local directory
    filename = '/content/drive/MyDrive/bioma_amazonia_era5_single_one/era5_hourly_netcdf_grid_bioma_amazonia/grid-era5-hourly-data-single-level-' + year +  '-' + variable + '.nc'

    client.retrieve(
        'reanalysis-era5-single-levels',
        {
            'product_type': 'reanalysis',
            'variable': [variable],
            'year': year,
            'month': list_months,
            'day': list_days,
            'time': [
                '00:00', '01:00', '02:00',
                '03:00', '04:00', '05:00',
                '06:00', '07:00', '08:00',
                '09:00', '10:00', '11:00',
                '12:00', '13:00', '14:00',
                '15:00', '16:00', '17:00',
                '18:00', '19:00', '20:00',
                '21:00', '22:00', '23:00',
            ],
            'format': 'netcdf',
            "area": f"{north}/{west}/{south}/{east}"
        },
        filename)

    print(filename, 'saved!')

with concurrent.futures.ThreadPoolExecutor() as executor:
    futures = []
    for variable in era5_atmosphere_initial:
      futures.append(executor.submit(download_data, '2006', variable))

    # Await completion of all tasks
    concurrent.futures.wait(futures)
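
Note that `concurrent.futures.wait` only blocks until the tasks finish; an exception raised inside a worker stays stored in its future until `result()` is called, so a failed download can go unnoticed. The sketch below shows one way to collect failures per variable. `run_downloads` is a hypothetical wrapper, and `max_workers=4` is an arbitrary assumption, not a documented CDS limit.

```python
import concurrent.futures


def run_downloads(download_fn, year, variables, max_workers=4):
    """Run download_fn(year, variable) in threads; return (done, failed)."""
    done, failed = [], {}
    with concurrent.futures.ThreadPoolExecutor(max_workers=max_workers) as executor:
        # Map each future back to the variable it is downloading
        future_to_var = {
            executor.submit(download_fn, year, v): v for v in variables
        }
        for future in concurrent.futures.as_completed(future_to_var):
            variable = future_to_var[future]
            try:
                future.result()  # re-raises any exception from the worker
                done.append(variable)
            except Exception as exc:
                failed[variable] = exc
    return done, failed


# Usage with the notebook's own function and variable list:
# done, failed = run_downloads(download_data, '2006', era5_atmosphere_initial)
# for variable, error in failed.items():
#     print(variable, 'failed:', error)
```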

# 🌿 MODIS Products

**Description:**  
The Moderate Resolution Imaging Spectroradiometer (MODIS) provides a wide range of remote sensing data products relevant for ecosystem and climate studies. MODIS data include vegetation indices, land surface temperature, and reflectance, among others, which are commonly used as predictors in NEE modeling.

**Access:**  
MODIS data can be accessed through Google Earth Engine (GEE), NASA’s LP DAAC, or other data portals.



## Library Imports

In [5]:
!pip install ipython-autotime
%load_ext autotime

time: 257 µs (started: 2025-07-07 23:24:46 +00:00)


In [6]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive
time: 43.8 s (started: 2025-07-07 23:24:48 +00:00)


In [7]:
pip install pandas fiona shapely pyproj rtree rasterio geopandas


In [None]:
pip install rasterio

In [None]:
pip install geopandas

In [8]:
# Libraries to open 'netcdf' file
# import netCDF4 as netcdf
import xarray

# To pre-process dataframe and use in visualization
import numpy as np
import pandas as pd
import geopandas as gpd
import rasterio as rio
import shapely
from shapely.geometry import Point
import ast

time: 2.68 s (started: 2025-07-07 23:25:42 +00:00)


In [10]:
import os

def get_list_csv_files(path):
  directory = path
  files = []
  for filename in os.listdir(directory):
    f = os.path.join(directory, filename)

    if os.path.isfile(f) and f.endswith('.csv'):
      files.append(f)
  return files

time: 1.08 ms (started: 2025-07-07 23:25:53 +00:00)


In [11]:
import os

def get_list_shp_files(path):
  directory = path
  files = []
  for filename in os.listdir(directory):
    f = os.path.join(directory, filename)

    if os.path.isfile(f) and f.endswith('.shp'):
      files.append(f)
  return files

time: 951 µs (started: 2025-07-07 23:25:54 +00:00)
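
The two listing functions differ only in the file extension, so they could be folded into one. The following is a sketch using `pathlib`; `list_files_with_suffix` is a hypothetical name. It compares the final extension exactly, so sidecar files such as `grid.shp.xml` are not returned when asking for `.shp`.

```python
from pathlib import Path


def list_files_with_suffix(path, suffix):
    """Return sorted paths of regular files in `path` whose extension is `suffix`."""
    return sorted(
        str(p)
        for p in Path(path).iterdir()
        if p.is_file() and p.suffix.lower() == suffix.lower()
    )


# Usage:
# csv_files = list_files_with_suffix('/content/drive/MyDrive/bioma-amazonia', '.csv')
# shp_files = list_files_with_suffix('/content/drive/MyDrive/bioma-amazonia', '.shp')
```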


## Connecting Google Earth Engine (GEE) with Other Google Services

The Earth Engine Python API and command-line tools can be installed with pip, Python's package installer.

In [12]:
!pip install earthengine-api

time: 6.26 s (started: 2025-07-07 23:26:06 +00:00)


In [None]:
!earthengine authenticate

In [None]:
from google.colab import auth
auth.authenticate_user()

In [None]:
# Import the Earth Engine API and initialize it.
import ee
ee.Initialize()

In [None]:
import folium
token = '4/1AX4XfWif2ASmQd28DqQmuR9blSSDRQqVHJ8d3Jri9TIPA4h832W7H1dWeKY'
# Define the URL format used for Earth Engine generated map tiles.
EE_TILES = 'https://earthengine.googleapis.com/map/{mapid}/{{z}}/{{x}}/{{y}}?token={token}'

print('Folium version: ' + folium.__version__)

In [None]:
pip install geemap

In [None]:
import json
import geemap
import geemap.colormaps as cm
import altair as alt
import folium

In [None]:
#@title Mapdisplay: Display GEE objects using folium.
def Mapdisplay(center, dicc, Tiles="OpenStreetMap", zoom_start=10):
    '''
    :param center: Center of the map (latitude and longitude).
    :param dicc: Dictionary of Earth Engine geometries or tile dictionaries.
    :param Tiles: Base map tile set, e.g. Mapbox Bright, Mapbox Control Room, Stamen Terrain, Stamen Toner, stamenwatercolor, cartodbpositron.
    :param zoom_start: Initial zoom level for the map.
    :return: A folium.Map object.
    '''
    mapViz = folium.Map(location=center,tiles=Tiles, zoom_start=zoom_start)
    for k,v in dicc.items():
      if ee.image.Image in [type(x) for x in v.values()]:
        folium.TileLayer(
            tiles = v["tile_fetcher"].url_format,
            attr  = 'Google Earth Engine',
            overlay =True,
            name  = k
          ).add_to(mapViz)
      else:
        folium.GeoJson(
        data = v,
        name = k
          ).add_to(mapViz)
    mapViz.add_child(folium.LayerControl())
    return mapViz

## Data Extraction from GEE


In [None]:
import ee
ee.Initialize()

def get_asset_list(parent):
    parent_asset = ee.data.getAsset(parent)
    parent_id = parent_asset['name']
    parent_type = parent_asset['type']
    asset_list = []
    child_assets = ee.data.listAssets({'parent': parent_id})['assets']
    for child_asset in child_assets:
        child_id = child_asset['name']
        child_type = child_asset['type']
        if child_type in ['FOLDER','IMAGE_COLLECTION']:
            # Recursively call the function to get child assets
            asset_list.extend(get_asset_list(child_id))
        else:
            asset_list.append(child_id)
    return asset_list

all_assets = get_asset_list('projects/ee-alinephysics/assets/')

print('Found {} assets'.format(len(all_assets)))

In [None]:
all_assets

In [None]:
# import argparse
# import ee

# parser = argparse.ArgumentParser()
# parser.add_argument('--old_collection', help='old collection')
# parser.add_argument('--new_collection', help='new collection')
# parser.add_argument('--delete', help='delete old collection',
#     action=argparse.BooleanOptionalAction)

# # args = parser.parse_args()

# old_collection = old_collection
# new_collection = new_collection

# ee.Initialize()

# # Check if new collection exists
# try:
#     ee.ImageCollection(new_collection).getInfo()
# except:
#     print('Collection {} does not exist'.format(new_collection))
#     ee.data.createAsset({'type': ee.data.ASSET_TYPE_IMAGE_COLL}, new_collection)
#     print('Created a new empty collection {}.'.format(new_collection))


# assets = ee.data.listAssets({'parent': old_collection})['assets']


# for asset in assets:
#     old_name = asset['name']
#     new_name = old_name.replace(old_collection, new_collection)
#     print('Copying {} to {}'.format(old_name, new_name))
#     ee.data.copyAsset(old_name, new_name, True)
#     if args.delete:
#         print('Deleting <{}>'.format(old_name))
#         ee.data.deleteAsset(old_name)

# if args.delete:
#     print('Deleting Collection <{}>'.format(old_collection))
#     ee.data.deleteAsset(old_collection)

In [None]:
import ee
import os
import json  # needed for json.loads below
import geemap
import pandas as pd
import geopandas as gpd


def create_geodataframe_and_save_in_assets(df, path_name, asset_name):
  # create a tmp gdf
  gdf = gpd.GeoDataFrame(
      df,
      crs='EPSG:4326',
      geometry = gpd.points_from_xy(
          df['longitude'],
          df['latitude']
      )
  )

  # convert it into geo-json
  json_df = json.loads(gdf.to_json())

  # create a gee object with geemap
  ee_object = geemap.geojson_to_ee(json_df)

  # upload this object to earthengine
  asset = os.path.join(path_name, asset_name)

  #create and launch the task
  task_config = {
      'collection': ee_object,
      'description':asset_name,
      'assetId': asset
  }
  task = ee.batch.Export.table.toAsset(**task_config)
  task.start()
  return "Success"
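
For reference, the GeoJSON step can be pictured without any geospatial library. The sketch below builds roughly the kind of FeatureCollection that point data turn into; it is a simplification and not the exact output of `gdf.to_json()`. The function `points_to_geojson` and the sample coordinates are illustrative assumptions.

```python
import json


def points_to_geojson(rows):
    """Build a GeoJSON FeatureCollection from dicts with 'longitude' and 'latitude'."""
    features = []
    for row in rows:
        features.append({
            "type": "Feature",
            "geometry": {
                "type": "Point",
                # GeoJSON coordinate order is [longitude, latitude]
                "coordinates": [row["longitude"], row["latitude"]],
            },
            # Keep every column as a feature property
            "properties": dict(row),
        })
    return {"type": "FeatureCollection", "features": features}


# Illustrative point only; the coordinates are approximate
sample = [{"site": "km67", "longitude": -54.959, "latitude": -2.857}]
print(json.dumps(points_to_geojson(sample), indent=2))
```

The coordinate order matters: swapping latitude and longitude is a common reason for points landing in the wrong place after upload.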

## MODIS: Downloading Data from GEE to Assets and Google Drive

In [None]:
pip install geopandas


- **MOD13A2 (Version 061):**  
  16-day composite vegetation indices, including NDVI and EVI (12 bands)  
  Available from: February 18, 2000

- **MCD15A3H (Version 061):**  
  4-day composite Leaf Area Index (LAI) and Fraction of Photosynthetically Active Radiation (FPAR) data (6 bands)  
  Available from: July 4, 2002

In [None]:
# Import the module for shapefile manipulation
import geopandas as gpd

def get_km67_data_var(name_collection, variable):
  # Load the shapefile containing the polygon
  shapefile_path = '/content/drive/MyDrive/bioma-amazonia/km67-7655cell-grid-era5-one-single0-25deg.shp'
  gdf = gpd.read_file(shapefile_path)

  # Example: Extract NDVI time series from the MODIS collection
  collection = ee.ImageCollection(name_collection).select(variable)

  # Get the polygon from the shapefile
  polygon = ee.Geometry.Polygon(list(gdf['geometry'].iloc[0].exterior.coords))

  # Filter images within the polygon and date range
  filtered_collection = collection.filterBounds(polygon).filterDate('2002-01-01', '2012-01-01')

  # Run the query and obtain the time series
  time_series = filtered_collection.getRegion(polygon, scale=27830).getInfo()
  return time_series

def get_km67_data(name_collection):
  # Load the shapefile containing the polygon
  shapefile_path = '/content/drive/MyDrive/bioma-amazonia/km67-7655cell-grid-era5-one-single0-25deg.shp'
  gdf = gpd.read_file(shapefile_path)

  # Example: Extract time series from the image collection
  collection = ee.ImageCollection(name_collection)

  # Get the polygon from the shapefile
  polygon = ee.Geometry.Polygon(list(gdf['geometry'].iloc[0].exterior.coords))

  # Filter images within the polygon and date range
  filtered_collection = collection.filterBounds(polygon).filterDate('2002-01-01', '2012-01-01')

  # Run the query and obtain the time series
  time_series = filtered_collection.getRegion(polygon, scale=27830).getInfo()
  return time_series


In [None]:
time_serie_MODIS_MCD15A3H =  get_km67_data("MODIS/061/MCD15A3H")
time_serie_MODIS_MOD13A2 =  get_km67_data("MODIS/061/MOD13A2")

In [None]:
def create_dataframe_for_time_series_data(time_series,filename):
  df = pd.DataFrame(time_series[1:], columns=time_series[0])

  if '_' in df['id'][0]:
    df['id'] = df['id'].str.replace('_','')

  df['date'] = pd.to_datetime(df['id'], format='%Y%m%d')

  df.to_csv(filename)
  return df
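
The list returned by `getRegion(...).getInfo()` has a header row followed by one row per pixel and image, and includes a `time` column in milliseconds since the Unix epoch. As an alternative to parsing the image `id`, the date can be taken from that column. This is a library-free sketch; `region_rows_to_records` is a hypothetical helper and the sample row uses made-up values.

```python
from datetime import datetime, timezone


def region_rows_to_records(region):
    """Turn getRegion-style output (header row + data rows) into a list of dicts."""
    header, *rows = region
    records = []
    for row in rows:
        record = dict(zip(header, row))
        # 'time' is milliseconds since the Unix epoch (UTC)
        record["date"] = datetime.fromtimestamp(
            record["time"] / 1000, tz=timezone.utc
        ).date()
        records.append(record)
    return records


# Made-up sample in the same layout as the getRegion output
sample_region = [
    ["id", "longitude", "latitude", "time", "Lai"],
    ["2002_07_04", -54.875, -2.875, 1025740800000, 57],
]
print(region_rows_to_records(sample_region)[0]["date"])  # 2002-07-04
```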

In [None]:
df_MODIS_MCD15A3H = create_dataframe_for_time_series_data(time_serie_MODIS_MCD15A3H,'/content/drive/MyDrive/bioma-amazonia/cell_km67/time_series_MODIS_MCD15A3H.csv')
df_MODIS_MOD13A2 = create_dataframe_for_time_series_data(time_serie_MODIS_MOD13A2,'/content/drive/MyDrive/bioma-amazonia/cell_km67/time_series_MODIS_MOD13A2.csv')

In [None]:
t = pd.read_csv('/content/drive/MyDrive/bioma-amazonia/cell_km67/time_series_MODIS_MCD15A3H.csv')
t

# ☀️ CERES Radiation Data

**Dataset links:**  
- [CERES Data Portal](https://ceres.larc.nasa.gov/data/)  
- [CERES SYN1deg Edition 4A Selection Tool](https://ceres-tool.larc.nasa.gov/ord-tool/jsp/SYN1degEd42Selection.jsp)  
- [Data Quality Summary (CERES SYN1deg Ed4A)](https://ceres.larc.nasa.gov/documents/DQ_summaries/CERES_SYN1deg_Ed4A_DQS.pdf)

**Description:**  
The Clouds and the Earth's Radiant Energy System (CERES) dataset provides global measurements of radiative energy fluxes at the top of the atmosphere, surface, and within the atmosphere. It is widely used for climate studies and as an important predictor for ecosystem carbon flux modeling.

This dataset offers variables such as:
- Shortwave and longwave radiation fluxes  
- Surface radiation budget components

The data is available at a 1-degree spatial resolution and can be downloaded using the online selection tool or via FTP.



# ☔ MERGE Precipitation Data

**Data sources:**  
- [MERGE GPM Daily Data FTP](https://ftp.cptec.inpe.br/modelos/tempo/MERGE/GPM/DAILY/)  
- [MERGE GPM FTP Directory](https://ftp.cptec.inpe.br/modelos/tempo/MERGE/GPM/)  
- [MERGE GPM Data Description (Rozante, 2024)](https://ftp.cptec.inpe.br/modelos/tempo/MERGE/GPM/Rozante.2024.pdf)

**Description:**  
The MERGE dataset combines satellite-based precipitation estimates with ground observations to provide daily rainfall data. It is produced by CPTEC/INPE and is widely used for hydrological and ecological studies in Brazil, including Amazon basin applications.

This dataset is suitable as a predictor in NEE models, capturing rainfall variability at regional scales.




# 🌿 FluxCom NEE Estimates

**Data source:**  
[FluxCom FTP Server](ftp://ftp.bgc-jena.mpg.de)

**Description:**  
FluxCom provides machine learning-based estimates of Net Ecosystem Exchange (NEE) by combining eddy covariance flux tower data with remote sensing and meteorological predictors. The dataset covers multiple biomes globally and is widely used for benchmarking and model comparison studies in ecosystem carbon flux research.



In [None]:
from google.colab import drive
drive.mount('/content/drive')

In [None]:
import xarray as xr
import numpy as np
import pandas as pd

In [None]:
a = 'NEE.ANN.CRUNCEPv6.daily.1980.nc'

In [None]:
years = ['2002','2003','2004','2005','2006','2007','2008','2009','2010','2011','2012']

In [None]:
a[-7:-3]
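
The slice `[-7:-3]` picks the four-digit year out of file names that end in `.<year>.nc`, so it can be used to keep only the years of interest. The sketch below assumes every file follows that naming pattern; `filter_files_by_year` is a hypothetical helper.

```python
def filter_files_by_year(filenames, years):
    """Keep names like 'NEE.ANN.CRUNCEPv6.daily.1980.nc' whose year is in `years`."""
    wanted = set(years)
    return [
        name for name in filenames
        if name.endswith(".nc") and name[-7:-3] in wanted
    ]


example_files = [
    "NEE.ANN.CRUNCEPv6.daily.1980.nc",
    "NEE.ANN.CRUNCEPv6.daily.2002.nc",
    "NEE.ANN.CRUNCEPv6.daily.2012.nc",
]
study_years = [str(y) for y in range(2002, 2013)]  # '2002' .. '2012'
print(filter_files_by_year(example_files, study_years))
```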

FTP path for the monthly ERA5 ensemble: `/outgoing/FluxCom/CarbonFluxes/RS_METEO/ensemble/ERA5/monthly`

In [None]:
from ftplib import FTP
import os

# FTP server details
ftp_host = "ftp.bgc-jena.mpg.de"
ftp_path = "/outgoing/FluxCom/CarbonFluxes/RS_METEO/ensemble/ERA5/monthly"

# Path to save the files to Google Drive
local_dir = "/content/drive/My Drive/NEE_Files_daily_ERA5_RSMETEO_monthly"

# Create the directory on Google Drive if it doesn't exist
os.makedirs(local_dir, exist_ok=True)

# Connect to the FTP server
ftp = FTP(ftp_host)
ftp.login()  # Anonymous login

# Navigate to the specific directory
ftp.cwd(ftp_path)

# List files in the directory
files = ftp.nlst()

# Filter files that start with "NEE"
nee_files = [
    file for file in files
    if file.startswith("NEE")
]

# Download the files
for file in nee_files:
    local_file_path = os.path.join(local_dir, file)
    with open(local_file_path, "wb") as local_file:
        ftp.retrbinary(f"RETR {file}", local_file.write)
    print(f"File {file} successfully downloaded to {local_file_path}")

# Close the FTP connection
ftp.quit()


In [None]:
from ftplib import FTP
import os

# FTP server details
ftp_host = "ftp.bgc-jena.mpg.de"
ftp_path = "/outgoing/FluxCom/CarbonFluxes_v1_2017/RS+METEO/CRUNCEPv6/raw/daily"

# Path to save the files in Google Drive
local_dir = "/content/drive/My Drive/NEE_Files_daily_Cruncepv6_RSMETEO"

# Create the directory in Google Drive if it doesn't exist
os.makedirs(local_dir, exist_ok=True)

# Connect to the FTP server
ftp = FTP(ftp_host)
ftp.login()  # Anonymous login

# Navigate to the target directory
ftp.cwd(ftp_path)

# List files in the directory
files = ftp.nlst()

# Filter files that start with "NEE"
nee_files = [
    file for file in files
    if file.startswith("NEE")
]

# Download the files
for file in nee_files:
    local_file_path = os.path.join(local_dir, file)
    with open(local_file_path, "wb") as local_file:
        ftp.retrbinary(f"RETR {file}", local_file.write)
    print(f"File {file} successfully downloaded to {local_file_path}")

# Close the FTP connection
ftp.quit()


In [None]:
from ftplib import FTP
import os

# FTP server details
ftp_host = "ftp.bgc-jena.mpg.de"
ftp_path = "/outgoing/FluxCom/CarbonFluxes_v1_2017/RS+METEO/WFDEI/raw/daily"

# Path to save the files in Google Drive
local_dir = "/content/drive/My Drive/NEE_Files_daily_WFDEI_RSMETEO"

# Create the directory in Google Drive if it doesn't exist
os.makedirs(local_dir, exist_ok=True)

# Connect to the FTP server
ftp = FTP(ftp_host)
ftp.login()  # Anonymous login

# Navigate to the target directory
ftp.cwd(ftp_path)

# List files in the directory
files = ftp.nlst()

# Filter files that start with "NEE"
nee_files = [
    file for file in files
    if file.startswith("NEE")
]

# Download the files
for file in nee_files:
    local_file_path = os.path.join(local_dir, file)
    with open(local_file_path, "wb") as local_file:
        ftp.retrbinary(f"RETR {file}", local_file.write)
    print(f"File {file} successfully downloaded to {local_file_path}")

# Close the FTP connection
ftp.quit()


GPP FTP path: `/outgoing/FluxCom/CarbonFluxes_v1_2017/RS/ensemble/720_360/8daily`

## Updated Daily NEE (2023)

FTP path: `/outgoing/FluxCom/CarbonFluxes/RS_METEO/member/CRUNCEP_v8/daily`

In [None]:
from ftplib import FTP
import os

# FTP server details
ftp_host = "ftp.bgc-jena.mpg.de"
ftp_path = "/outgoing/FluxCom/CarbonFluxes/RS_METEO/member/CRUNCEP_v8/daily"

# Path to save the files in Google Drive
local_dir = "/content/drive/My Drive/NEE_Files_daily_Cruncep_v8_update2023"

# Create the directory in Google Drive if it doesn't exist
os.makedirs(local_dir, exist_ok=True)

# Connect to the FTP server
ftp = FTP(ftp_host)
ftp.login()  # Anonymous login

# Navigate to the target directory
ftp.cwd(ftp_path)

# List files in the directory
files = ftp.nlst()

# Filter files that start with "NEE"
nee_files = [
    file for file in files
    if file.startswith("NEE")
]

# Download the files
for file in nee_files:
    local_file_path = os.path.join(local_dir, file)
    with open(local_file_path, "wb") as local_file:
        ftp.retrbinary(f"RETR {file}", local_file.write)
    print(f"File {file} successfully downloaded to {local_file_path}")

# Close the FTP connection
ftp.quit()
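
The FTP cells above repeat the same steps with a different remote path and local folder, so they could share one function. The following is a sketch, not a tested replacement: `download_matching` is a hypothetical helper, and the `skip_existing` option assumes that any file already on disk was downloaded completely.

```python
import os


def download_matching(ftp, remote_dir, local_dir, prefix="NEE", skip_existing=True):
    """Download files in remote_dir whose names start with prefix; return local paths."""
    os.makedirs(local_dir, exist_ok=True)
    ftp.cwd(remote_dir)
    downloaded = []
    for name in sorted(ftp.nlst()):
        if not name.startswith(prefix):
            continue
        local_path = os.path.join(local_dir, name)
        if skip_existing and os.path.exists(local_path):
            continue  # assumed complete; delete the file to force a re-download
        with open(local_path, "wb") as handle:
            ftp.retrbinary(f"RETR {name}", handle.write)
        downloaded.append(local_path)
    return downloaded


# Usage (needs network access):
# from ftplib import FTP
# with FTP("ftp.bgc-jena.mpg.de") as ftp:
#     ftp.login()  # anonymous login
#     download_matching(
#         ftp,
#         "/outgoing/FluxCom/CarbonFluxes/RS_METEO/member/CRUNCEP_v8/daily",
#         "/content/drive/My Drive/NEE_Files_daily_Cruncep_v8_update2023",
#     )
```

Taking the connection as an argument lets one login serve several directories and allows the function to be exercised with a stand-in object before touching the server.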
