# ECOSTRESS H5 to GeoTIFF Converter

Script to convert ECOSTRESS HDF5 (.h5) files to GeoTIFF (.tif) format for soil moisture data analysis.

**Author**: [Your name]  
**Created**: 12/09/2025  
**Purpose**: Batch-convert ECOSTRESS files from HDF5 to GeoTIFF format

## 1. Import Required Libraries

In [1]:
# Import necessary libraries
import h5py
import numpy as np
import rasterio
from rasterio.transform import from_bounds
from rasterio.crs import CRS
import os
import glob
from datetime import datetime
import re
import warnings
warnings.filterwarnings('ignore')

print("Libraries imported successfully!")
print("h5py version:", h5py.version.version)
print("rasterio version:", rasterio.__version__)
print("numpy version:", np.__version__)

Libraries imported successfully!
h5py version: 3.14.0
rasterio version: 1.4.3
numpy version: 1.26.4


## 2. Define File Paths and Configuration

In [2]:
# Define paths and configuration
current_dir = os.getcwd()
print(f"Current working directory: {current_dir}")

# Input directory containing H5 files
input_dir = current_dir
output_dir = os.path.join(current_dir, "output_tiff_converted")

# Create output directory if it doesn't exist
os.makedirs(output_dir, exist_ok=True)
print(f"Output directory: {output_dir}")

# Find all H5 files in the current directory
h5_files = glob.glob(os.path.join(input_dir, "*.h5"))
print(f"\nFound {len(h5_files)} H5 files:")
for file in h5_files:
    print(f"  - {os.path.basename(file)}")

# Configuration parameters
FILL_VALUE = -9999.0  # Fill value for invalid pixels
SCALE_FACTOR = 0.0001  # Scale factor for soil moisture data
CRS_EPSG = 4326  # WGS84 coordinate system

Current working directory: /Users/ninhhaidang/Library/CloudStorage/GoogleDrive-ninhhailongg@gmail.com/My Drive/Cac_mon_hoc/Do_an_tot_nghiep/25-26_HKI_DATN_21021411_DangNH/Data/ECOSTRESS Gridded Downscaled Soil Moisture Instantaneous L3 Global 70 m V002
Output directory: /Users/ninhhaidang/Library/CloudStorage/GoogleDrive-ninhhailongg@gmail.com/My Drive/Cac_mon_hoc/Do_an_tot_nghiep/25-26_HKI_DATN_21021411_DangNH/Data/ECOSTRESS Gridded Downscaled Soil Moisture Instantaneous L3 Global 70 m V002/output_tiff_converted

Found 4 H5 files:
  - ECOv002_L3G_SM_27979_003_20230612T235750_0712_01.h5
  - ECOv002_L3G_SM_27857_003_20230605T031507_0712_01.h5
  - ECOv002_L3G_SM_27811_005_20230602T040448_0712_01.h5
  - ECOv002_L3G_SM_27796_006_20230601T045347_0712_01.h5
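
The file names listed above follow a regular pattern, so the acquisition time and scene identifiers can be read from the name alone. The sketch below assumes the fields are orbit, scene, acquisition timestamp, build and version, and that the timestamp is UTC. These labels and the helper `parse_ecostress_name` are illustrative assumptions, not something the notebook or the product documentation confirms here.

```python
import re
from datetime import datetime, timezone

# Assumed layout: ECOv002_L3G_SM_<orbit>_<scene>_<YYYYMMDDTHHMMSS>_<build>_<version>.h5
# The field names are illustrative labels chosen for this sketch.
ECOSTRESS_NAME = re.compile(
    r"^ECOv002_L3G_SM_(?P<orbit>\d{5})_(?P<scene>\d{3})_"
    r"(?P<stamp>\d{8}T\d{6})_(?P<build>\d{4})_(?P<version>\d{2})\.h5$"
)


def parse_ecostress_name(filename):
    """Return the filename fields as a dict, or None if the name does not match."""
    match = ECOSTRESS_NAME.match(filename)
    if match is None:
        return None
    fields = match.groupdict()
    # Treating the timestamp as UTC is an assumption.
    fields["acquired"] = datetime.strptime(fields["stamp"], "%Y%m%dT%H%M%S").replace(
        tzinfo=timezone.utc
    )
    return fields


info = parse_ecostress_name("ECOv002_L3G_SM_27979_003_20230612T235750_0712_01.h5")
print(info["orbit"], info["scene"], info["acquired"].date())
```

Keeping orbit and scene alongside the date makes it possible to build output names that stay unique even if two scenes share an acquisition day.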


## 3. Read HDF5 File Structure

In [3]:
# Explore structure of a sample H5 file
def explore_h5_structure(h5_file_path):
    """Explore the structure of an HDF5 file"""
    print(f"Exploring structure of: {os.path.basename(h5_file_path)}")
    print("=" * 60)
    
    with h5py.File(h5_file_path, 'r') as f:
        def print_structure(name, obj, level=0):
            indent = "  " * level
            if isinstance(obj, h5py.Group):
                print(f"{indent}{name}/ (Group)")
            elif isinstance(obj, h5py.Dataset):
                print(f"{indent}{name} (Dataset): shape={obj.shape}, dtype={obj.dtype}")
                # Print attributes
                for attr_name, attr_value in obj.attrs.items():
                    print(f"{indent}  @{attr_name}: {attr_value}")
        
        f.visititems(print_structure)
        
        # Print file-level attributes
        print(f"\nFile-level attributes:")
        for attr_name, attr_value in f.attrs.items():
            print(f"  @{attr_name}: {attr_value}")

# Explore the first H5 file
if h5_files:
    explore_h5_structure(h5_files[0])
else:
    print("No H5 files found!")

Exploring structure of: ECOv002_L3G_SM_27979_003_20230612T235750_0712_01.h5
HDFEOS/ (Group)
HDFEOS/ADDITIONAL/ (Group)
HDFEOS/ADDITIONAL/FILE_ATTRIBUTES/ (Group)
HDFEOS/ADDITIONAL/FILE_ATTRIBUTES/ProductMetadata/ (Group)
HDFEOS/ADDITIONAL/FILE_ATTRIBUTES/ProductMetadata/AncillaryNWP (Dataset): shape=(), dtype=object
HDFEOS/ADDITIONAL/FILE_ATTRIBUTES/ProductMetadata/BandSpecification (Dataset): shape=(6,), dtype=float32
HDFEOS/ADDITIONAL/FILE_ATTRIBUTES/ProductMetadata/NWPSource (Dataset): shape=(), dtype=object
HDFEOS/ADDITIONAL/FILE_ATTRIBUTES/ProductMetadata/NumberOfBands (Dataset): shape=(1,), dtype=uint8
HDFEOS/ADDITIONAL/FILE_ATTRIBUTES/ProductMetadata/OrbitCorrectionPerformed (Dataset): shape=(), dtype=object
HDFEOS/ADDITIONAL/FILE_ATTRIBUTES/ProductMetadata/QAPercentCloudCover (Dataset): shape=(1,), dtype=int32
HDFEOS/ADDITIONAL/FILE_ATTRIBUTES/ProductMetadata/QAPercentGoodQuality (Dataset): shape=(1,), dtype=int32
HDFEOS/ADDITIONAL/FILE_ATTRIBUTES/StandardMetadata/ (Group)
HDFE

## 4. Extract Soil Moisture Data

In [4]:
def extract_soil_moisture_data(h5_file_path):
    """Extract soil moisture data from ECOSTRESS HDF5 file"""
    data_dict = {}
    
    with h5py.File(h5_file_path, 'r') as f:
        # Extract soil moisture data
        # The exact path may vary based on the file structure
        # Common paths in ECOSTRESS files:
        try:
            # Try different possible paths for soil moisture data
            possible_sm_paths = [
                'SoilMoisture',
                'Soil_Moisture', 
                'SM',
                'Soil_Moisture_sm',
                'data/SoilMoisture'
            ]
            
            sm_data = None
            sm_path = None
            
            for path in possible_sm_paths:
                if path in f:
                    sm_data = f[path][:]
                    sm_path = path
                    print(f"Found soil moisture data at: {path}")
                    break
            
            if sm_data is None:
                # If the fixed paths don't match, fall back to any top-level dataset
                # with 'soil' or 'moisture' in its name. Note that f.items() lists only
                # the root group, so datasets nested inside groups are not searched.
                for name, obj in f.items():
                    if isinstance(obj, h5py.Dataset):
                        if 'soil' in name.lower() or 'moisture' in name.lower():
                            sm_data = obj[:]
                            sm_path = name
                            print(f"Found soil moisture data at: {name}")
                            break
            
            if sm_data is None:
                raise ValueError("Could not find soil moisture dataset")
                
            data_dict['soil_moisture'] = sm_data
            data_dict['sm_path'] = sm_path
            
            # Get data attributes if available
            sm_dataset = f[sm_path]
            data_dict['attributes'] = dict(sm_dataset.attrs)
            
            # Extract geospatial information
            # Look for latitude and longitude datasets
            for coord_name in ['lat', 'latitude', 'Latitude', 'lon', 'longitude', 'Longitude']:
                if coord_name in f:
                    if 'lat' in coord_name.lower():
                        data_dict['latitude'] = f[coord_name][:]
                    else:
                        data_dict['longitude'] = f[coord_name][:]
            
            # If coordinates are not found as separate datasets, check for geospatial metadata
            if 'latitude' not in data_dict or 'longitude' not in data_dict:
                # Look for geospatial metadata in attributes
                for attr_name, attr_value in f.attrs.items():
                    if 'north' in attr_name.lower() or 'south' in attr_name.lower():
                        if 'north' in attr_name.lower():
                            data_dict['north_bound'] = float(attr_value)
                        else:
                            data_dict['south_bound'] = float(attr_value)
                    elif 'east' in attr_name.lower() or 'west' in attr_name.lower():
                        if 'east' in attr_name.lower():
                            data_dict['east_bound'] = float(attr_value)
                        else:
                            data_dict['west_bound'] = float(attr_value)
            
            print(f"Data shape: {sm_data.shape}")
            print(f"Data type: {sm_data.dtype}")
            print(f"Data range: {np.nanmin(sm_data)} to {np.nanmax(sm_data)}")
            
        except Exception as e:
            print(f"Error extracting data: {e}")
            return None
    
    return data_dict

# Test extraction with the first file
if h5_files:
    sample_data = extract_soil_moisture_data(h5_files[0])
    if sample_data:
        print("\nExtraction successful!")
        print(f"Available keys: {list(sample_data.keys())}")
    else:
        print("Extraction failed!")
else:
    print("No H5 files available for testing!")

Error extracting data: Could not find soil moisture dataset
Extraction failed!
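
The extraction above checks a fixed list of root-level paths and then the items of the root group only, while the structure listing in section 3 shows the file's content sitting inside the `HDFEOS/` group. One possible way to locate the dataset is to walk the whole tree with `visititems` and keep 2-D datasets whose leaf name suggests soil moisture. This is a minimal sketch: the name heuristic, the helper names and the example paths are assumptions, and the real dataset path should be confirmed from the structure listing.

```python
def looks_like_soil_moisture(path, shape):
    """Heuristic: a 2-D dataset whose leaf name mentions soil moisture."""
    leaf = path.rsplit("/", 1)[-1].lower()
    name_ok = leaf == "sm" or "soil" in leaf or "moisture" in leaf
    return name_ok and shape is not None and len(shape) == 2


def find_soil_moisture_paths(h5_file_path):
    """Walk every group in the file and return the paths of matching datasets."""
    import h5py  # local import keeps the heuristic above usable on its own

    hits = []

    def visit(name, obj):
        if isinstance(obj, h5py.Dataset) and looks_like_soil_moisture(name, obj.shape):
            hits.append(name)
        # Returning None lets visititems continue through the whole tree.

    with h5py.File(h5_file_path, "r") as f:
        f.visititems(visit)
    return hits


# Hypothetical paths, used only to show what the heuristic accepts and rejects.
print(looks_like_soil_moisture("group_a/group_b/SM", (100, 200)))     # 2-D, leaf "SM"
print(looks_like_soil_moisture("group_a/meta/NumberOfBands", (1,)))   # 1-D metadata
```

Calling `find_soil_moisture_paths(h5_files[0])` would return every candidate path, which can then be read with `f[path][:]` once the right one has been confirmed by eye.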


## 5. Handle Geospatial Information

In [5]:
def create_geospatial_info(data_dict, h5_file_path):
    """Create geospatial information for the raster"""
    
    # Try to get geospatial info from the data or calculate from coordinates
    if 'latitude' in data_dict and 'longitude' in data_dict:
        # If we have coordinate arrays
        lat = data_dict['latitude']
        lon = data_dict['longitude']
        
        # Calculate bounds
        north = np.max(lat)
        south = np.min(lat)
        east = np.max(lon)
        west = np.min(lon)
        
    elif all(key in data_dict for key in ['north_bound', 'south_bound', 'east_bound', 'west_bound']):
        # If we have bounds from metadata
        north = data_dict['north_bound']
        south = data_dict['south_bound']
        east = data_dict['east_bound']
        west = data_dict['west_bound']
        
    else:
        # Default bounds if no geospatial info found (this is a fallback)
        print("Warning: No geospatial information found, using default global bounds")
        north, south, east, west = 90, -90, 180, -180
    
    # Get data dimensions
    sm_data = data_dict['soil_moisture']
    height, width = sm_data.shape[-2:]  # Get last two dimensions
    
    # Create affine transform
    transform = from_bounds(west, south, east, north, width, height)
    
    geospatial_info = {
        'transform': transform,
        'crs': CRS.from_epsg(CRS_EPSG),
        'height': height,
        'width': width,
        'bounds': {
            'north': north,
            'south': south,
            'east': east,
            'west': west
        }
    }
    
    print(f"Geospatial bounds: North={north:.6f}, South={south:.6f}, East={east:.6f}, West={west:.6f}")
    print(f"Raster dimensions: {height} x {width}")
    print(f"Transform: {transform}")
    
    return geospatial_info

# Test geospatial info creation
if 'sample_data' in globals() and sample_data:
    geo_info = create_geospatial_info(sample_data, h5_files[0])
    print("\nGeospatial info created successfully!")
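
`from_bounds` expects the outer edges of the raster. If the latitude and longitude arrays hold pixel-centre coordinates (an assumption, since the notebook does not say which they are), their minima and maxima fall half a pixel inside the true extent on every side. The sketch below, for a regular grid, widens centre extremes to edge bounds; `centers_to_edges` is a hypothetical helper and not part of rasterio.

```python
def centers_to_edges(west_c, south_c, east_c, north_c, width, height):
    """Expand pixel-centre extremes to outer-edge bounds for a regular grid."""
    if width < 2 or height < 2:
        raise ValueError("need at least two pixels per axis to infer the pixel size")
    # Distance between the first and last centre spans (n - 1) pixels.
    pixel_w = (east_c - west_c) / (width - 1)
    pixel_h = (north_c - south_c) / (height - 1)
    return (
        west_c - pixel_w / 2,
        south_c - pixel_h / 2,
        east_c + pixel_w / 2,
        north_c + pixel_h / 2,
    )


# Four columns with centres at 0.5, 1.5, 2.5, 3.5 and two rows with centres at 0.5, 1.5
west, south, east, north = centers_to_edges(0.5, 0.5, 3.5, 1.5, width=4, height=2)
print(west, south, east, north)
```

The four returned values can be passed to `from_bounds(west, south, east, north, width, height)` in place of the raw minima and maxima.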

## 6. Convert to GeoTIFF Format

In [6]:
def h5_to_geotiff(h5_file_path, output_dir, output_filename=None):
    """
    Convert ECOSTRESS H5 file to GeoTIFF format
    
    Parameters:
    h5_file_path: path to input H5 file
    output_dir: directory to save output TIFF file
    output_filename: custom output filename (optional)
    
    Returns:
    output_path: path to created TIFF file, or None if failed
    """
    
    try:
        print(f"\nProcessing: {os.path.basename(h5_file_path)}")
        print("-" * 50)
        
        # Extract data from H5 file
        data_dict = extract_soil_moisture_data(h5_file_path)
        if not data_dict:
            print("Failed to extract data from H5 file")
            return None
        
        # Create geospatial information
        geo_info = create_geospatial_info(data_dict, h5_file_path)
        
        # Get soil moisture data
        sm_data = data_dict['soil_moisture']
        
        # Handle data preprocessing
        # Convert to float32 so invalid pixels can be stored as NaN
        sm_data = sm_data.astype(np.float32)
        
        # Mask fill values first, while the array still holds the raw stored values
        if '_FillValue' in data_dict['attributes']:
            fill_val = data_dict['attributes']['_FillValue']
            sm_data[sm_data == fill_val] = np.nan
        
        # Apply scale factor if available in attributes (NaN pixels stay NaN)
        if 'scale_factor' in data_dict['attributes']:
            scale_factor = data_dict['attributes']['scale_factor']
            print(f"Applying scale factor: {scale_factor}")
            sm_data = sm_data * scale_factor
        elif SCALE_FACTOR != 1.0:
            print(f"Applying default scale factor: {SCALE_FACTOR}")
            sm_data = sm_data * SCALE_FACTOR
        
        # Set values outside the physical range to NaN (likely invalid)
        sm_data[sm_data < 0] = np.nan
        sm_data[sm_data > 1] = np.nan  # Soil moisture should be between 0 and 1
        
        # Generate output filename if not provided
        if output_filename is None:
            # Extract date from filename
            base_name = os.path.basename(h5_file_path)
            # Try to extract date (format: YYYYMMDD)
            date_match = re.search(r'(\d{8})', base_name)
            if date_match:
                date_str = date_match.group(1)
            else:
                # Fallback to timestamp
                date_str = datetime.now().strftime("%Y%m%d_%H%M%S")
            
            output_filename = f"ECOSTRESS_SM_{date_str}.tif"
        
        output_path = os.path.join(output_dir, output_filename)
        
        # Write GeoTIFF file
        with rasterio.open(
            output_path,
            'w',
            driver='GTiff',
            height=geo_info['height'],
            width=geo_info['width'],
            count=1,
            dtype=sm_data.dtype,
            crs=geo_info['crs'],
            transform=geo_info['transform'],
            compress='lzw',  # Use LZW compression
            nodata=np.nan
        ) as dst:
            # Ensure data is 2D
            if sm_data.ndim > 2:
                sm_data = sm_data.squeeze()
            
            dst.write(sm_data, 1)
            
            # Add metadata
            dst.update_tags(
                source_file=os.path.basename(h5_file_path),
                creation_date=datetime.now().strftime("%Y-%m-%d %H:%M:%S"),
                data_type="ECOSTRESS Soil Moisture",
                units="m³/m³",
                valid_range="0.0 to 1.0"
            )
        
        print(f"Successfully created: {output_filename}")
        print(f"Output path: {output_path}")
        
        # Verify the created file
        with rasterio.open(output_path) as src:
            print(f"Verification - Shape: {src.shape}, CRS: {src.crs}")
            print(f"Bounds: {src.bounds}")
            data_sample = src.read(1)
            valid_pixels = np.sum(~np.isnan(data_sample))
            total_pixels = data_sample.size
            print(f"Valid pixels: {valid_pixels}/{total_pixels} ({valid_pixels/total_pixels*100:.1f}%)")
        
        return output_path
        
    except Exception as e:
        print(f"Error converting {os.path.basename(h5_file_path)}: {e}")
        return None

# Test the conversion function with one file
if h5_files:
    print("Testing conversion with the first file...")
    test_output = h5_to_geotiff(h5_files[0], output_dir)
    if test_output:
        print(f"\nTest conversion successful!")
        print(f"Output file: {test_output}")
    else:
        print("Test conversion failed!")
else:
    print("No H5 files available for testing!")

Testing conversion with the first file...
\nProcessing: ECOv002_L3G_SM_27979_003_20230612T235750_0712_01.h5
--------------------------------------------------
Error extracting data: Could not find soil moisture dataset
Failed to extract data from H5 file
Test conversion failed!
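
The order of the preprocessing steps in `h5_to_geotiff` matters: a fill value has to be compared against the raw stored values, because after scaling it no longer equals the original number. The plain-Python sketch below works through the arithmetic one value at a time, reusing the notebook's `FILL_VALUE` and `SCALE_FACTOR`; `decode` and `decode_wrong_order` are hypothetical helpers written only for this illustration, and the same ordering applies to the array version.

```python
import math

FILL_VALUE = -9999.0   # same values as the notebook's configuration cell
SCALE_FACTOR = 0.0001


def decode(raw, fill=FILL_VALUE, scale=SCALE_FACTOR):
    """Mask the fill value first, then scale, then range-check."""
    if raw == fill:
        return math.nan
    value = raw * scale
    return value if 0.0 <= value <= 1.0 else math.nan


def decode_wrong_order(raw, fill=FILL_VALUE, scale=SCALE_FACTOR):
    """Scale first: the fill comparison can no longer match."""
    value = raw * scale
    if value == fill:  # the scaled fill (about -0.9999) never equals -9999.0
        return math.nan
    return value


print(decode(2500.0), decode(-9999.0), decode_wrong_order(-9999.0))
```

With the wrong order, the scaled fill value is only removed if a later range check happens to exclude it, so masking first keeps the two concerns separate.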


## 7. Batch Process Multiple Files

In [7]:
def batch_convert_h5_to_geotiff(input_dir, output_dir, file_pattern="*.h5"):
    """
    Batch convert all H5 files in a directory to GeoTIFF format
    
    Parameters:
    input_dir: directory containing H5 files
    output_dir: directory to save TIFF files
    file_pattern: pattern to match H5 files
    
    Returns:
    dict: conversion results with success/failure counts and file lists
    """
    
    # Find all H5 files
    h5_files = glob.glob(os.path.join(input_dir, file_pattern))
    
    if not h5_files:
        print(f"No files found matching pattern '{file_pattern}' in {input_dir}")
        return None
    
    print(f"Found {len(h5_files)} H5 files to convert")
    print("=" * 60)
    
    # Initialize results
    results = {
        'total_files': len(h5_files),
        'successful': [],
        'failed': [],
        'start_time': datetime.now()
    }
    
    # Process each file
    for i, h5_file in enumerate(h5_files, 1):
        print(f"\n[{i}/{len(h5_files)}] Processing: {os.path.basename(h5_file)}")
        
        try:
            output_path = h5_to_geotiff(h5_file, output_dir)
            
            if output_path and os.path.exists(output_path):
                results['successful'].append({
                    'input_file': h5_file,
                    'output_file': output_path,
                    'size_mb': os.path.getsize(output_path) / (1024*1024)
                })
                print(f"✓ Success: {os.path.basename(output_path)}")
            else:
                results['failed'].append({
                    'input_file': h5_file,
                    'error': 'Conversion returned None or file not created'
                })
                print(f"✗ Failed: {os.path.basename(h5_file)}")
                
        except Exception as e:
            results['failed'].append({
                'input_file': h5_file,
                'error': str(e)
            })
            print(f"✗ Error: {e}")
    
    # Calculate end time and duration
    results['end_time'] = datetime.now()
    results['duration'] = results['end_time'] - results['start_time']
    
    # Print summary
    print("\n" + "=" * 60)
    print("BATCH CONVERSION SUMMARY")
    print("=" * 60)
    print(f"Total files processed: {results['total_files']}")
    print(f"Successful conversions: {len(results['successful'])}")
    print(f"Failed conversions: {len(results['failed'])}")
    print(f"Success rate: {len(results['successful'])/results['total_files']*100:.1f}%")
    print(f"Processing time: {results['duration']}")
    
    if results['successful']:
        print(f"\nSuccessful conversions:")
        total_size = 0
        for item in results['successful']:
            size_mb = item['size_mb']
            total_size += size_mb
            print(f"  ✓ {os.path.basename(item['output_file'])} ({size_mb:.1f} MB)")
        print(f"  Total output size: {total_size:.1f} MB")
    
    if results['failed']:
        print(f"\nFailed conversions:")
        for item in results['failed']:
            print(f"  ✗ {os.path.basename(item['input_file'])}: {item['error']}")
    
    print(f"\nOutput directory: {output_dir}")
    
    return results

# Run batch conversion
print("Starting batch conversion of all H5 files...")
batch_results = batch_convert_h5_to_geotiff(input_dir, output_dir)

if batch_results:
    print("\nBatch processing completed!")
else:
    print("Batch processing failed to start!")

Starting batch conversion of all H5 files...
Found 4 H5 files to convert
\n[1/4] Processing: ECOv002_L3G_SM_27979_003_20230612T235750_0712_01.h5
\nProcessing: ECOv002_L3G_SM_27979_003_20230612T235750_0712_01.h5
--------------------------------------------------
Error extracting data: Could not find soil moisture dataset
Failed to extract data from H5 file
✗ Failed: ECOv002_L3G_SM_27979_003_20230612T235750_0712_01.h5
\n[2/4] Processing: ECOv002_L3G_SM_27857_003_20230605T031507_0712_01.h5
\nProcessing: ECOv002_L3G_SM_27857_003_20230605T031507_0712_01.h5
--------------------------------------------------
Error extracting data: Could not find soil moisture dataset
Failed to extract data from H5 file
✗ Failed: ECOv002_L3G_SM_27857_003_20230605T031507_0712_01.h5
\n[3/4] Processing: ECOv002_L3G_SM_27811_005_20230602T040448_0712_01.h5
\nProcessing: ECOv002_L3G_SM_27811_005_20230602T040448_0712_01.h5
--------------------------------------------------
Error extracting data: Could not find soil m

## 8. Verify Output Files

In [8]:
def verify_geotiff_files(output_dir, detailed=True):
    """
    Verify the created GeoTIFF files
    
    Parameters:
    output_dir: directory containing TIFF files
    detailed: whether to show detailed information for each file
    
    Returns:
    verification results
    """
    
    # Find all TIFF files in output directory
    tiff_files = glob.glob(os.path.join(output_dir, "*.tif"))
    
    if not tiff_files:
        print(f"No TIFF files found in {output_dir}")
        return None
    
    print(f"Verifying {len(tiff_files)} TIFF files...")
    print("=" * 60)
    
    verification_results = []
    
    for i, tiff_file in enumerate(tiff_files, 1):
        print(f"\n[{i}/{len(tiff_files)}] Verifying: {os.path.basename(tiff_file)}")
        
        try:
            with rasterio.open(tiff_file) as src:
                # Basic file information
                file_info = {
                    'filename': os.path.basename(tiff_file),
                    'filepath': tiff_file,
                    'size_mb': os.path.getsize(tiff_file) / (1024*1024),
                    'shape': src.shape,
                    'crs': str(src.crs),
                    'bounds': src.bounds,
                    'dtype': str(src.dtype),
                    'nodata': src.nodata,
                    'compression': src.compression.value if src.compression else None,
                }
                
                # Read data for statistics
                data = src.read(1)
                
                # Calculate statistics
                valid_mask = ~np.isnan(data)
                total_pixels = data.size
                valid_pixels = np.sum(valid_mask)
                
                if valid_pixels > 0:
                    valid_data = data[valid_mask]
                    stats = {
                        'total_pixels': total_pixels,
                        'valid_pixels': valid_pixels,
                        'valid_percentage': (valid_pixels / total_pixels) * 100,
                        'min_value': np.min(valid_data),
                        'max_value': np.max(valid_data),
                        'mean_value': np.mean(valid_data),
                        'std_value': np.std(valid_data)
                    }
                else:
                    stats = {
                        'total_pixels': total_pixels,
                        'valid_pixels': 0,
                        'valid_percentage': 0,
                        'min_value': np.nan,
                        'max_value': np.nan,
                        'mean_value': np.nan,
                        'std_value': np.nan
                    }
                
                file_info.update(stats)
                
                # Check for common issues
                issues = []
                if valid_pixels == 0:
                    issues.append("No valid data pixels")
                elif valid_pixels / total_pixels < 0.01:
                    issues.append("Very low valid pixel percentage (<1%)")
                
                if stats['min_value'] < 0:
                    issues.append("Contains negative values")
                if stats['max_value'] > 1:
                    issues.append("Contains values > 1 (unusual for soil moisture)")
                
                file_info['issues'] = issues
                file_info['status'] = 'OK' if not issues else 'Warning'
                
                verification_results.append(file_info)
                
                if detailed:
                    print(f"  Shape: {src.shape}")
                    print(f"  CRS: {src.crs}")
                    print(f"  Bounds: {src.bounds}")
                    print(f"  Valid pixels: {valid_pixels:,}/{total_pixels:,} ({stats['valid_percentage']:.1f}%)")
                    if valid_pixels > 0:
                        print(f"  Data range: {stats['min_value']:.6f} to {stats['max_value']:.6f}")
                        print(f"  Mean: {stats['mean_value']:.6f} ± {stats['std_value']:.6f}")
                    if issues:
                        print(f"  Issues: {', '.join(issues)}")
                    print(f"  Status: {file_info['status']}")
                else:
                    status_symbol = "✓" if file_info['status'] == 'OK' else "⚠"
                    print(f"  {status_symbol} {file_info['status']} - {valid_pixels:,} valid pixels ({stats['valid_percentage']:.1f}%)")
                
        except Exception as e:
            print(f"  ✗ Error reading file: {e}")
            verification_results.append({
                'filename': os.path.basename(tiff_file),
                'filepath': tiff_file,
                'status': 'Error',
                'error': str(e)
            })
    
    # Summary
    print("\n" + "=" * 60)
    print("VERIFICATION SUMMARY")
    print("=" * 60)
    
    ok_files = [r for r in verification_results if r.get('status') == 'OK']
    warning_files = [r for r in verification_results if r.get('status') == 'Warning']
    error_files = [r for r in verification_results if r.get('status') == 'Error']
    
    print(f"Total files: {len(verification_results)}")
    print(f"OK: {len(ok_files)}")
    print(f"Warnings: {len(warning_files)}")
    print(f"Errors: {len(error_files)}")
    
    if ok_files:
        total_size = sum([f.get('size_mb', 0) for f in ok_files])
        print(f"\nTotal size of valid files: {total_size:.1f} MB")
        
        # Data quality summary
        valid_percentages = [f.get('valid_percentage', 0) for f in ok_files if 'valid_percentage' in f]
        if valid_percentages:
            print(f"Average valid pixel percentage: {np.mean(valid_percentages):.1f}%")
    
    return verification_results

# Run verification
print("Verifying output TIFF files...")
verification_results = verify_geotiff_files(output_dir, detailed=True)

Verifying output TIFF files...
No TIFF files found in /Users/ninhhaidang/Library/CloudStorage/GoogleDrive-ninhhailongg@gmail.com/My Drive/Cac_mon_hoc/Do_an_tot_nghiep/25-26_HKI_DATN_21021411_DangNH/Data/ECOSTRESS Gridded Downscaled Soil Moisture Instantaneous L3 Global 70 m V002/output_tiff_converted


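## Conclusion and Usage

This notebook lays out a workflow for converting ECOSTRESS HDF5 (.h5) files to GeoTIFF (.tif) format.

### Main features:
1. **H5 structure discovery** - The script looks for the soil moisture dataset under a list of candidate paths, then among top-level datasets whose names contain "soil" or "moisture"
2. **Geospatial handling** - Extracts bounds from coordinate arrays or file attributes and builds the raster transform
3. **Batch conversion** - Processes every matching file in a directory in one run
4. **Quality checks** - Verifies each output file (shape, CRS, bounds, valid-pixel statistics)
5. **Metadata preservation** - Writes key metadata (source file, creation date, units, valid range) as GeoTIFF tags

### How to use:
1. Place the H5 files in the same directory as the notebook
2. Run the cells in order
3. Results are saved in the `output_tiff_converted/` directory

### Notes:
- The script sets invalid values to NaN and applies a scale factor
- Output files use LZW compression to save space
- The default coordinate system is WGS84 (EPSG:4326)
- In the run recorded above, extraction failed with "Could not find soil moisture dataset" and no GeoTIFF files were written, so the dataset search has to be adapted to the nested `HDFEOS/` layout shown in section 3 before the workflow produces output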