# 🔵 ⚪ 🔵 LiDAR to Smooth Ground Surface Using PDAL

## Overview

This workflow efficiently converts LiDAR point cloud data (LAZ/LAS) to a smooth ground surface raster (GeoTIFF) using PDAL (Point Data Abstraction Library). The pipeline is optimized for both speed and output quality.

## ⚙️ Key Processing Steps Explained

### 1. Ground Point Extraction (`filters.range`)
- **Function**: Isolates only ground-classified points (Class 2 in LAS/LAZ format)
- **Importance**: Creates a clean dataset containing only terrain surface points
- **Performance Impact**: Reduces data volume for subsequent processing steps

### 2. Outlier Removal (`filters.outlier`)
- **Function**: Identifies and removes statistical outliers
- **Parameters**:
  - `mean_k`: 12 - Number of nearest neighbors to analyze
  - `multiplier`: 2.0 - Statistical threshold for outlier identification  
- **Importance**: Eliminates noise and erroneous points that would create artifacts
- **Output Quality**: Creates smoother surfaces by removing spikes and holes

### 3. Data Thinning (`filters.decimation`)
- **Function**: Systematically reduces point density by keeping every nth point
- **Parameter**: `step`: 3 - Keeps every 3rd point (reduces data by ~67%)
- **Performance Impact**: Significantly improves processing speed
- **Trade-off**: Slight reduction in detail, but maintains overall terrain characteristics

### 4. Rasterization with Smoothing (`writers.raster`)
- **Function**: Converts point cloud to raster grid with interpolation
- **Parameters**:
  - `resolution`: 2.0 - Output cell size in units of the source data
  - `output_type`: "idw" - Inverse Distance Weighting interpolation
  - `radius`: 6.0 - Search radius for influencing points
  - `power`: 2.0 - Controls how quickly influence diminishes with distance
- **Importance**: Creates continuous surface with natural transitions between points
- **Output Quality**: IDW produces a smooth surface while respecting the actual elevation values

### 5. Multi-threading Optimization
- **Function**: Utilizes multiple CPU cores for parallel processing
- **Parameter**: `thread_count`: num_cores - Automatically uses all available cores
- **Performance Impact**: Near-linear speedup with number of cores

## ⏱️ Performance Optimization Techniques

1. **Strategic Filtering Order**:
   - Extract ground first to minimize data volume for subsequent steps
   - Apply computationally expensive operations (outlier removal) on reduced dataset

2. **Data Reduction**:
   - Point classification filtering removes non-ground data
   - Decimation reduces overall point count by 67%
   - Outlier removal eliminates noise that would slow triangulation

3. **Efficient Interpolation Method**:
   - IDW provides excellent speed/quality balance compared to triangulation
   - Configurable radius and power parameters for fine-tuning

4. **Parallel Processing**:
   - Automatic detection and utilization of all available CPU cores
   - Thread count matching to hardware capabilities

## 🛠️ Adjustable Parameters for Different Requirements

### For Higher Resolution Output
- Decrease `resolution` (e.g., 1.0)
- Decrease `step` in decimation (e.g., 2)
- Increase `radius` in IDW interpolation

### For Faster Processing
- Increase `resolution` (e.g., 5.0)
- Increase `step` in decimation (e.g., 5 or 8)
- Decrease `mean_k` in outlier filter

### For Smoother Output
- Increase `radius` in IDW
- Increase `power` parameter (e.g., 3.0)
- Use a larger `mean_k` value for outlier detection

## Alternative Approaches

For different requirements, consider these alternative filter combinations:

1. **Triangle-Based Approach**:
   - Uses `filters.delaunay` and `filters.faceraster`
   - Creates triangulated irregular network (TIN) before rasterization
   - Better for preserving sharp features but slower

2. **Moving Least Squares Approach** (if available):
   - Uses `filters.mls` before rasterization
   - Creates mathematically smooth surfaces
   - Computationally intensive but produces very smooth results

3. **Grid-Based Approach**:
   - Uses direct grid projection methods
   - Fastest option but potentially less smooth

In [None]:
"""
This script demonstrates how to create a ground surface raster from LiDAR data using PDAL.
It includes steps for filtering, decimation, outlier removal, and rasterization.
The script is designed to be run in a Python environment with PDAL installed.
It is assumed that the PDAL library and its dependencies are properly installed and configured.
"""
import pdal
import os
import json
import multiprocessing
import time

# Determine the number of cores available
num_cores = multiprocessing.cpu_count()

file_path = r"C:\Users\Public\Documents\FLO-2D PRO Documentation\Example Projects\Self Help Kit Gila\ElevationData\LiDAR"
input_file = os.path.join(file_path, "USGS_LPC_AZ_MaricopaPinal_2020_B20_w0401n3720.laz")
output_file = os.path.join(file_path, "ground_surface_smooth.tif")

# Enhanced pipeline for a smoother surface
pipeline_dict = {
  "pipeline":[
    input_file,
    # Extract ground points
    {
        "type": "filters.range",
        "limits": "Classification[2:2]"
    },
    # Moderate thinning - balance between speed and detail
    {
        "type": "filters.decimation",
        "step": 4  # Keep every 4th point
    },
    # Remove outliers for a smoother surface
    {
        "type": "filters.outlier",
        "method": "statistical",
        "mean_k": 8,
        "multiplier": 2.0
    },
    # Use delaunay triangulation for smoother interpolation
    {
        "type": "filters.delaunay"
    },
    # Create a smooth raster surface from the triangulation
    {
        "type": "filters.faceraster",
        "resolution": 1.0
    },
    # Write the raster output
    {
        "type": "writers.raster",
        "filename": output_file,
        "gdaldriver": "GTiff"
    }
  ],
  "thread_count": num_cores
}

# Create and execute the pipeline
start_time = time.time()
pipeline = pdal.Pipeline(json.dumps(pipeline_dict))
count = pipeline.execute()
elapsed_time = time.time() - start_time

print(f"Processed {count} points using {num_cores} threads")
print(f"Total processing time: {elapsed_time:.2f} seconds")

In [None]:
"""
This script demonstrates how to create a ground surface raster from multiple LiDAR files using PDAL.
It includes steps for filtering, decimation, outlier removal, and rasterization.
The script processes all .laz files in the specified directory.
"""
import pdal
import os
import json
import multiprocessing
import time
import glob

# Determine the number of cores available
num_cores = multiprocessing.cpu_count()

# Directory containing LAZ files
file_path = r"C:\Users\Public\Documents\FLO-2D PRO Documentation\Example Projects\Self Help Kit Gila\ElevationData\LiDAR"

# Get a list of all LAZ files in the directory
laz_files = glob.glob(os.path.join(file_path, "*.laz"))

print(f"Found {len(laz_files)} LAZ files to process.")

# Process each LAZ file
for input_file in laz_files:
    # Create output filename based on input filename
    base_name = os.path.basename(input_file)
    output_name = os.path.splitext(base_name)[0] + "_ground_surface.tif"
    output_file = os.path.join(file_path, output_name)
    
    print(f"\nProcessing file: {base_name}")
    print(f"Output will be saved as: {output_name}")
    
    # Enhanced pipeline for a smoother surface
    pipeline_dict = {
      "pipeline":[
        input_file,
        # Extract ground points
        {
            "type": "filters.range",
            "limits": "Classification[2:2]"
        },
        # Moderate thinning - balance between speed and detail
        {
            "type": "filters.decimation",
            "step": 4  # Keep every 4th point
        },
        # Remove outliers for a smoother surface
        {
            "type": "filters.outlier",
            "method": "statistical",
            "mean_k": 8,
            "multiplier": 2.0
        },
        # Use delaunay triangulation for smoother interpolation
        {
            "type": "filters.delaunay"
        },
        # Create a smooth raster surface from the triangulation
        {
            "type": "filters.faceraster",
            "resolution": 1.0
        },
        # Write the raster output
        {
            "type": "writers.raster",
            "filename": output_file,
            "gdaldriver": "GTiff"
        }
      ],
      "thread_count": num_cores
    }
    
    # Create and execute the pipeline
    start_time = time.time()
    try:
        pipeline = pdal.Pipeline(json.dumps(pipeline_dict))
        count = pipeline.execute()
        elapsed_time = time.time() - start_time
        
        print(f"Successfully processed {count} points using {num_cores} threads")
        print(f"Processing time: {elapsed_time:.2f} seconds")
    except Exception as e:
        print(f"Error processing {base_name}: {e}")

print("\nAll files processed!")

In [None]:
import pdal
import os
import glob
import json
import multiprocessing

# This is the most reliable approach - merge the point clouds BEFORE rasterization

# Find all LAZ files
file_path = r"C:\Users\Public\Documents\FLO-2D PRO Documentation\Example Projects\Self Help Kit Gila\ElevationData\LiDAR"
laz_files = glob.glob(os.path.join(file_path, "*.laz"))
output_merged = os.path.join(file_path, "merged_ground_surface.tif")
num_cores = multiprocessing.cpu_count()

# Create a PDAL pipeline that merges all point clouds first, then creates a single raster
pipeline_dict = {
  "pipeline": [
    # Use all LAZ files as input
    *laz_files,
    {
        "type": "filters.merge"  # Merge all point clouds into one
    },
    {
        "type": "filters.range",
        "limits": "Classification[2:2]"  # Extract ground points
    },
    {
        "type": "filters.outlier",
        "method": "statistical",
        "mean_k": 8,
        "multiplier": 2.0
    },
    # Create a single TIN across the entire dataset
    {
        "type": "filters.delaunay"
    },
    # Rasterize the entire TIN at once - eliminates edge effects
    {
        "type": "filters.faceraster",
        "resolution": 1.0
    },
    {
        "type": "writers.raster",
        "filename": output_merged,
        "gdaldriver": "GTiff",
        "gdalopts": "COMPRESS=LZW,BIGTIFF=YES"
    }
  ],
  "thread_count": num_cores
}

# Execute the pipeline
pipeline = pdal.Pipeline(json.dumps(pipeline_dict))
count = pipeline.execute()
print(f"Processed {count} points into a seamless raster")

In [41]:
"""
Scalable LiDAR processing workflow for hydrologic/hydraulic modeling:
1. Process LiDAR tiles with consistent parameters
2. Create consistent, lower-resolution surfaces
3. Generate a seamless mosaic suitable for visualization and modeling
"""
import pdal
import os
import glob
import json
import multiprocessing
import time
from osgeo import gdal
import subprocess

def process_lidar_scalable_no_buffer(lidar_dir, output_dir, target_resolution=10.0, target_epsg=2223):
    """
    Process multiple LiDAR files for hydraulic modeling with a focus on scalability
    Uses only standard PDAL filters that are widely available
    
    Parameters:
    - lidar_dir: Directory containing LAZ files
    - output_dir: Directory for output files
    - target_resolution: Resolution in feet (default 10ft for hydraulic modeling)
    - target_epsg: Target coordinate system (default EPSG:2223)
    """
    # Create output directory if it doesn't exist
    os.makedirs(output_dir, exist_ok=True)
    
    # Step 1: Find all LAZ files
    laz_files = glob.glob(os.path.join(lidar_dir, "*.laz"))
    print(f"Found {len(laz_files)} LAZ files to process")
    
    # Number of cores to use
    num_cores = multiprocessing.cpu_count()
    
    # Step 2: Process each LAZ file with extended boundaries
    processed_tiles = []
    
    for i, laz_file in enumerate(laz_files):
        print(f"\nProcessing file {i+1}/{len(laz_files)}: {os.path.basename(laz_file)}")
        
        # Create output filename
        base_name = os.path.splitext(os.path.basename(laz_file))[0]
        output_raster = os.path.join(output_dir, f"{base_name}_ground_{target_resolution}ft.tif")
        processed_tiles.append(output_raster)
        
        # Create PDAL pipeline without buffer (not available in your installation)
        pipeline_dict = {
          "pipeline": [
            laz_file,
            {
                "type": "filters.range",
                "limits": "Classification[2:2]"  # Extract ground points
            },
            {
                "type": "filters.decimation",
                "step": 4  # Reduce point count for large datasets
            },
            {
                "type": "filters.outlier",
                "method": "statistical",
                "mean_k": 8,
                "multiplier": 2.0
            },
            # Use delaunay triangulation for a smooth surface
            {
                "type": "filters.delaunay"
            },
            # Create raster at the target resolution (e.g., 10ft)
            {
                "type": "filters.faceraster",
                "resolution": target_resolution
            },
            # Write the raster output
            {
                "type": "writers.raster",
                "filename": output_raster,
                "gdaldriver": "GTiff",
                "gdalopts": "COMPRESS=LZW,BIGTIFF=YES"
            }
          ],
          "thread_count": num_cores
        }
        
        # Execute the pipeline
        try:
            start_time = time.time()
            pipeline = pdal.Pipeline(json.dumps(pipeline_dict))
            count = pipeline.execute()
            end_time = time.time()
            print(f"  Processed {count} points in {end_time - start_time:.2f} seconds")
        except Exception as e:
            print(f"  Error processing {base_name}: {e}")
            continue
    
    # Step 3: Create a seamless mosaic with enhanced blending
    print("\nCreating seamless mosaic with enhanced blending...")
    
    # Output for the merged result
    mosaic_output = os.path.join(output_dir, f"seamless_mosaic_{target_resolution}ft.tif")
    reprojected_output = os.path.join(output_dir, f"seamless_mosaic_{target_resolution}ft_epsg{target_epsg}.tif")
    
    # Create a VRT to merge the tiles
    vrt_path = os.path.join(output_dir, "temp_mosaic.vrt")
    
    # Use gdalbuildvrt command line with additional options
    gdalbuildvrt_cmd = [
        'gdalbuildvrt',
        '-resolution', 'highest',
        '-a_srs', 'EPSG:4326',  # Ensure correct source SRS
        '-r', 'average',
        vrt_path
    ] + processed_tiles
    
    subprocess.run(gdalbuildvrt_cmd)
    
    # Create mosaic with a very large blending distance to compensate for lack of buffering
    gdalwarp_cmd = [
        'gdalwarp',
        '-co', 'COMPRESS=LZW',
        '-co', 'BIGTIFF=YES',
        '-r', 'cubic',  # Cubic interpolation for smoother results
        '-wo', 'CUTLINE_BLEND_DIST=100',  # Very large blend distance (10x the resolution)
        '-wo', 'UNIFIED_SRC_NODATA=YES',
        '-dstnodata', '-9999',  # Explicit NoData value
        '-multi',  # Use multithreading
        vrt_path,
        mosaic_output
    ]
    
    subprocess.run(gdalwarp_cmd)
    
    # Additional step: Fill any remaining NoData gaps
    filled_mosaic = os.path.join(output_dir, f"filled_mosaic_{target_resolution}ft.tif")
    gdal_fillnodata_cmd = [
        'gdal_fillnodata.py',
        '-md', '10',  # Maximum search distance in pixels
        '-si', '0',   # No smoothing iterations
        mosaic_output,
        filled_mosaic
    ]
    
    subprocess.run(gdal_fillnodata_cmd)
    
    # Reproject to target CRS
    if target_epsg != 0:
        print(f"Reprojecting to EPSG:{target_epsg}...")
        gdalwarp_reproj_cmd = [
            'gdalwarp',
            '-co', 'COMPRESS=LZW',
            '-co', 'BIGTIFF=YES',
            '-r', 'cubic',
            '-t_srs', f'EPSG:{target_epsg}',
            '-multi',
            filled_mosaic,
            reprojected_output
        ]
        
        subprocess.run(gdalwarp_reproj_cmd)
    
    # Clean up
    if os.path.exists(vrt_path):
        os.remove(vrt_path)
    
    print("\nProcessing complete!")
    print(f"Seamless mosaic: {filled_mosaic}")
    if target_epsg != 0:
        print(f"Reprojected mosaic: {reprojected_output}")
    
    return filled_mosaic, reprojected_output if target_epsg != 0 else None

if __name__ == "__main__":
    # Set your directories and parameters
    lidar_directory = r"C:\Users\Public\Documents\FLO-2D PRO Documentation\Example Projects\Self Help Kit Gila\ElevationData\LiDAR"
    output_directory = r"C:\Users\Public\Documents\FLO-2D PRO Documentation\Example Projects\Self Help Kit Gila\ElevationData\LiDAR\Processed"
    
    # Set resolution to 10 feet (common for hydraulic modeling)
    resolution = 10.0
    
    # Process the data with the modified approach
    process_lidar_scalable_no_buffer(lidar_directory, output_directory, resolution, target_epsg=2223)

Found 6 LAZ files to process

Processing file 1/6: USGS_LPC_AZ_MaricopaPinal_2020_B20_w0401n3720.laz
  Processed 1978057 points in 25.04 seconds

Processing file 2/6: USGS_LPC_AZ_MaricopaPinal_2020_B20_w0401n3721.laz
  Processed 2036293 points in 30.29 seconds

Processing file 3/6: USGS_LPC_AZ_MaricopaPinal_2020_B20_w0402n3720.laz
  Processed 2252174 points in 32.03 seconds

Processing file 4/6: USGS_LPC_AZ_MaricopaPinal_2020_B20_w0402n3721.laz
  Processed 2492533 points in 33.84 seconds

Processing file 5/6: USGS_LPC_AZ_MaricopaPinal_2020_B20_w0403n3720.laz
  Processed 2079105 points in 30.12 seconds

Processing file 6/6: USGS_LPC_AZ_MaricopaPinal_2020_B20_w0403n3721.laz
  Processed 2159343 points in 30.34 seconds

Creating seamless mosaic with enhanced blending...


OSError: [WinError 193] %1 is not a valid Win32 application