# Find Missing Tiles from LAS Catalog

**Overview**: I plotted my LAScatalog in R and saw some empty tile spaces in the analysis area. Fortunately, only about 17 of the 1372 tiles appear to be missing. All of the file names are identical except for a tile number within the name. In this notebook, I write a script that searches through these file names and builds a list of the numbers that were skipped. For instance, if `usgs_lpc_co_sanluisjuanmiguel_2020_d20_13s_bc_3746.las` and `usgs_lpc_co_sanluisjuanmiguel_2020_d20_13s_bc_3748.las` both exist, the script appends `usgs_lpc_co_sanluisjuanmiguel_2020_d20_13s_bc_3747.las` to a list. I can then search for that file on my drive with a second script, or download it directly from the Colorado LiDAR website.
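To make the idea concrete before touching the real directory, here is a minimal sketch of the gap check on a small in-memory list. The function name `missing_in_small_gaps` and its `max_missing` parameter are hypothetical names chosen for illustration; the sample tile numbers are the ones from the example above plus two neighbours.

```python
def missing_in_small_gaps(numbers, max_missing=2):
    """Return the numbers skipped between consecutive entries.

    Gaps that skip more than `max_missing` values are ignored.
    """
    ordered = sorted(set(numbers))
    missing = []
    for current, following in zip(ordered, ordered[1:]):
        skipped = following - current - 1  # how many numbers sit between the pair
        if 1 <= skipped <= max_missing:
            missing.extend(range(current + 1, following))
    return missing


prefix = "usgs_lpc_co_sanluisjuanmiguel_2020_d20_13s_bc_"
present = [3745, 3746, 3748, 3749]  # 3747 is absent
names = [f"{prefix}{n}.las" for n in missing_in_small_gaps(present)]
print(names)
# ['usgs_lpc_co_sanluisjuanmiguel_2020_d20_13s_bc_3747.las']
```

Limiting the check to short gaps keeps long jumps in the numbering out of the result; whether a long jump is a real hole in the data is something the plot has to confirm.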

In [1]:
# Import libraries
import os
import re

In [2]:
# Define the directory, which is the LAScatalog
las_directory = r'F:/_BRYCE/LiDAR/Ouray_County/las_catalog'

# File prefix and suffix
file_prefix = "usgs_lpc_co_sanluisjuanmiguel_2020_d20_13s_bc_"
file_suffix = ".las"

# Regex pattern to extract the number from the filename
# (built from the prefix and suffix above; the trailing $ rejects names such as "*.las.bak")
pattern = re.escape(file_prefix) + r'(\d+)' + re.escape(file_suffix) + r'$'

# Create a list to store the tile numbers
tile_numbers = []

# Loop through the LAS catalog directory and extract tile numbers
for filename in os.listdir(las_directory):
    match = re.match(pattern, filename)
    if match:
        tile_numbers.append(int(match.group(1)))  # Extract the tile number and convert it to an integer so it can be sorted

# Sort the tile numbers
tile_numbers.sort()

# Identify missing numbers where only one or two numbers are skipped in sequence
# (larger gaps are ignored)
missing_files = []
for lower, upper in zip(tile_numbers, tile_numbers[1:]):
    gap = upper - lower
    if gap in (2, 3):  # gap of 2 = one tile missing, gap of 3 = two tiles missing
        for number in range(lower + 1, upper):
            missing_files.append(f"{file_prefix}{number}{file_suffix}")

# Output
print("Missing file names:", len(missing_files))
print(missing_files)

Missing file names: 20
['usgs_lpc_co_sanluisjuanmiguel_2020_d20_13s_bc_4520.las', 'usgs_lpc_co_sanluisjuanmiguel_2020_d20_13s_bc_4521.las', 'usgs_lpc_co_sanluisjuanmiguel_2020_d20_13s_bc_4832.las', 'usgs_lpc_co_sanluisjuanmiguel_2020_d20_13s_bc_5138.las', 'usgs_lpc_co_sanluisjuanmiguel_2020_d20_13s_bc_5243.las', 'usgs_lpc_co_sanluisjuanmiguel_2020_d20_13s_bc_5625.las', 'usgs_lpc_co_sanluisjuanmiguel_2020_d20_13s_bc_5937.las', 'usgs_lpc_co_sanluisjuanmiguel_2020_d20_13s_bc_6019.las', 'usgs_lpc_co_sanluisjuanmiguel_2020_d20_13s_bc_6036.las', 'usgs_lpc_co_sanluisjuanmiguel_2020_d20_13s_bc_6043.las', 'usgs_lpc_co_sanluisjuanmiguel_2020_d20_13s_bc_6210.las', 'usgs_lpc_co_sanluisjuanmiguel_2020_d20_13s_bc_6221.las', 'usgs_lpc_co_sanluisjuanmiguel_2020_d20_13s_bc_6418.las', 'usgs_lpc_co_sanluisjuanmiguel_2020_d20_13s_bc_6441.las', 'usgs_lpc_co_sanluisjuanmiguel_2020_d20_13s_bc_6533.las', 'usgs_lpc_co_sanluisjuanmiguel_2020_d20_13s_bc_6541.las', 'usgs_lpc_co_sanluisjuanmiguel_2020_d20_13s_bc_6

In [4]:
missing_files

['usgs_lpc_co_sanluisjuanmiguel_2020_d20_13s_bc_4520.las',
 'usgs_lpc_co_sanluisjuanmiguel_2020_d20_13s_bc_4521.las',
 'usgs_lpc_co_sanluisjuanmiguel_2020_d20_13s_bc_4832.las',
 'usgs_lpc_co_sanluisjuanmiguel_2020_d20_13s_bc_5138.las',
 'usgs_lpc_co_sanluisjuanmiguel_2020_d20_13s_bc_5243.las',
 'usgs_lpc_co_sanluisjuanmiguel_2020_d20_13s_bc_5625.las',
 'usgs_lpc_co_sanluisjuanmiguel_2020_d20_13s_bc_5937.las',
 'usgs_lpc_co_sanluisjuanmiguel_2020_d20_13s_bc_6019.las',
 'usgs_lpc_co_sanluisjuanmiguel_2020_d20_13s_bc_6036.las',
 'usgs_lpc_co_sanluisjuanmiguel_2020_d20_13s_bc_6043.las',
 'usgs_lpc_co_sanluisjuanmiguel_2020_d20_13s_bc_6210.las',
 'usgs_lpc_co_sanluisjuanmiguel_2020_d20_13s_bc_6221.las',
 'usgs_lpc_co_sanluisjuanmiguel_2020_d20_13s_bc_6418.las',
 'usgs_lpc_co_sanluisjuanmiguel_2020_d20_13s_bc_6441.las',
 'usgs_lpc_co_sanluisjuanmiguel_2020_d20_13s_bc_6533.las',
 'usgs_lpc_co_sanluisjuanmiguel_2020_d20_13s_bc_6541.las',
 'usgs_lpc_co_sanluisjuanmiguel_2020_d20_13s_bc_6820.las

The script identified 20 missing tiles, which seems right judging by the plot. Next, I search the fishnet folders for these file names and record the name of the fishnet folder that contains each missing file. If none turn up, I will have to go to the website and download the tiles individually, which shouldn't take long.

In [3]:
import os

# base directory containing the fishnet folders
base_dir = r'F:/_BRYCE/LiDAR/Ouray_County/fishnet_tiles'

# List of missing files (replace this with the output from the previous script)
# missing_files = [
#     "usgs_lpc_co_sanluisjuanmiguel_2020_d20_13s_bc_3747.las",
#     "usgs_lpc_co_sanluisjuanmiguel_2020_d20_13s_bc_3748.las"
# ]

# Dictionary to store the folder where each missing file is found
found_files = {}

# Loop through all fishnet folders (fishnet_1 to fishnet_23)
for folder_num in range(1, 24):
    folder_path = os.path.join(base_dir, f'fishnet_{folder_num}')

    # Skip folder numbers that don't exist on disk
    if not os.path.isdir(folder_path):
        continue

    # List the folder once, then check every missing file against that listing
    folder_contents = set(os.listdir(folder_path))
    for missing_file in missing_files:
        if missing_file in folder_contents:
            # Record the folder where the file was found
            found_files[missing_file] = f'fishnet_{folder_num}'

# Output the dictionary
print("Missing files found in the following folders:")
for file, folder in found_files.items():
    print(f"{file} found in {folder}")


Missing files found in the following folders:


Okay, none of the missing tiles exist in my fishnet folders. That means nothing went wrong when I was adding files to the LAS catalog; the tiles were missed when I lassoed files on the website.
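That conclusion assumes the search covered every folder that could hold a tile. As a cross-check, here is a minimal sketch that walks the whole base directory with `os.walk` instead of relying on the `fishnet_1` to `fishnet_23` naming, and that also reports the files it did not find. The helper names `index_las_files` and `locate_missing_files` are hypothetical and are not part of the cells above.

```python
import os


def index_las_files(base_dir):
    """Map each .las file name under base_dir to the first folder containing it."""
    index = {}
    for folder, _subfolders, files in os.walk(base_dir):
        for name in files:
            if name.lower().endswith(".las"):
                index.setdefault(name, folder)
    return index


def locate_missing_files(missing_files, base_dir):
    """Split missing_files into those found under base_dir and those not found."""
    index = index_las_files(base_dir)
    found = {name: index[name] for name in missing_files if name in index}
    not_found = [name for name in missing_files if name not in index]
    return found, not_found
```

Calling `locate_missing_files(missing_files, base_dir)` with the variables from the cells above returns both halves at once. If `found` is empty and `not_found` has the same length as `missing_files`, the tiles really are absent from the local download.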

I was able to search the website database and find all the tiles fairly quickly. It was also helpful to see the tiles displayed on the map, because I could see that the tiles I ended up selecting matched the pattern of the tiles that showed up missing when I plotted my LAScatalog in R.