# Working with DeepForest data

Hi all, welcome to another installment of working with DeepForest. Today we have some great data from Australian Eucalyptus forests. Let's walk through the steps to get some predictions. The first thing I did was look at the tile in QGIS to get a sense of the resolution (5 cm), habitat type, and image quality. Then I started with our standard boilerplate DeepForest prediction code from the 'Getting Started' page. Each code snippet below shows the entire code used to create the output. That reflects what I'm really doing during debugging: trying a set of parameters, viewing the output, and re-running.

In [8]:
# -----------------------------------------
# Step 1: Import libraries
# -----------------------------------------
import sys
from pathlib import Path

import numpy as np
import scipy
import matplotlib.pyplot as plt
import rasterio
from rasterio.plot import show

from deepforest import main
from deepforest.visualize import plot_results

# Try to enable file picker (desktop environments)
try:
    from tkinter import Tk, filedialog
    TK_AVAILABLE = True
except ImportError:
    TK_AVAILABLE = False

# -----------------------------------------
# Step 2: Print environment information
# -----------------------------------------
print("Python:", sys.version)
print("NumPy:", np.__version__)
print("SciPy:", scipy.__version__)

import deepforest
print("DeepForest:", deepforest.__version__)

# -----------------------------------------
# Step 3: Load DeepForest pre-trained model
# (DeepForest 2.0 loads weights automatically)
# -----------------------------------------
model = main.deepforest()
print("DeepForest model loaded.")

# -----------------------------------------
# Step 4: Select raster / orthomosaic image
# -----------------------------------------
def select_image():
    if not TK_AVAILABLE:
        raise RuntimeError(
            "Tkinter is not available. "
            "Please run this script in a desktop environment."
        )

    root = Tk()
    root.withdraw()  # Hide main window

    file_path = filedialog.askopenfilename(
        title="Select raster / orthomosaic image",
        filetypes=[
            ("TIFF files", "*.tif *.tiff"),
            ("All files", "*.*")
        ],
    )

    root.destroy()

    if not file_path:
        raise ValueError("No image selected")

    return Path(file_path)

# Get image path
image_path = select_image()
print("Using image:", image_path)

# -----------------------------------------
# Step 5: Open raster image
# -----------------------------------------
with rasterio.open(image_path) as src:
    image_array = src.read()  # (bands, height, width)
    print("Image shape (bands, height, width):", image_array.shape)

    # Show image
    show(src)

# -----------------------------------------
# Step 6: Run DeepForest prediction
# -----------------------------------------
predictions = model.predict_image(image_array)
print("Number of trees detected:", len(predictions))

# -----------------------------------------
# Step 7: Visualize predictions
# -----------------------------------------
plot_results(image_array, predictions)
plt.title("DeepForest Tree Detection Results")
plt.show()


Enter path to raster image (e.g., forest_tile.tif):  /Users/benweinstein/Downloads/Plot13Ortho.tif


Error opening image: '/Users/benweinstein/Downloads/Plot13Ortho.tif' does not exist in the file system, and is not recognized as a supported dataset name.


RasterioIOError: '/Users/benweinstein/Downloads/Plot13Ortho.tif' does not exist in the file system, and is not recognized as a supported dataset name.

Here we get some error messages saying that the input raster image has four bands. This is pretty common for data exported from tools that create orthomosaics, like Agisoft and Pix4D. Most programs have a toggle for turning off the 'alpha channel'. We can also use rasterio to open the image and select just the bands we want.
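
As a toy illustration of that band selection, here is a minimal sketch on a synthetic four-band array rather than a real orthomosaic. The array shape and the helper name `to_rgb_channels_last` are made up for the example. It assumes the first three bands are red, green, and blue and the fourth is alpha, which is worth checking for your own file.

```python
import numpy as np


def to_rgb_channels_last(bands_first):
    """Drop any bands after the third and move the band axis to the end.

    Assumes the input is laid out as (bands, height, width), which is how
    rasterio returns data from read(), with RGB in the first three bands.
    """
    rgb = bands_first[:3, :, :]          # keep red, green, blue; drop alpha
    return np.transpose(rgb, (1, 2, 0))  # (height, width, bands)


# Synthetic RGBA tile standing in for a real orthomosaic
rgba = np.zeros((4, 20, 30), dtype=np.uint8)
rgba[3] = 255  # fully opaque alpha band

rgb = to_rgb_channels_last(rgba)
print(rgba.shape, "->", rgb.shape)
```

The same two steps, slicing the first three bands and transposing, are what the next cell applies to the real raster.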

In [2]:
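import rasterio as rio
from deepforest import main, get_data

m = main.deepforest()
m.load_model("weecology/deepforest-tree")

# Keep the first three bands (drop the alpha channel) and make channels last
raster_path = get_data("2019_YELL_2_541000_4977000_image_crop.tif")
with rio.open(raster_path) as src:
    r = src.read()  # read pixel values as (bands, height, width)
r = r[:3, :, :]
r = r.transpose(1, 2, 0)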

# boxes = m.predict_tile(image=r, patch_size=700, patch_overlap=0.2, iou_threshold=0.5)


NameError: name 'main' is not defined

It starts to run, but wow, that's going to take too long on CPU for me to write this post. On GPU this might take only 2 minutes, but on CPU it's almost an hour according to the progress bar. Let's kill that and come back to the full prediction set when we are happy.
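
To get a feel for why a full tile is slow, it helps to count how many patches a sliding window produces. The sketch below is a back-of-the-envelope estimate, not DeepForest's own windowing code. It assumes a stride of `patch_size * (1 - patch_overlap)` plus one extra window to cover any leftover edge, and the 20,000 x 20,000 pixel tile size is an invented example. The helper name `count_patches` is hypothetical.

```python
import math


def count_patches(height, width, patch_size=700, patch_overlap=0.2):
    """Estimate the number of sliding-window patches over an image.

    Assumes the same stride in both directions, with a final window
    added whenever the stride does not land exactly on the edge.
    """
    stride = max(1, round(patch_size * (1 - patch_overlap)))

    def windows_along(length):
        if length <= patch_size:
            return 1
        return math.ceil((length - patch_size) / stride) + 1

    return windows_along(height) * windows_along(width)


# Hypothetical full tile compared with a 1000 x 1000 pixel crop
print("full tile:", count_patches(20_000, 20_000))
print("small crop:", count_patches(1_000, 1_000))
```

Under these assumptions the patch count grows with the area of the tile, which is why a small crop is so much quicker to iterate on.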

## Crop a small portion to work with

In [None]:
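import matplotlib.pyplot as plt
import rasterio as rio
from deepforest import main, get_data, utilities, visualize

m = main.deepforest()
m.load_model("weecology/deepforest-tree")

# Load portable sample raster
raster_path = get_data("2019_YELL_2_541000_4977000_image_crop.tif")
with rio.open(raster_path) as src:
    r = src.read()  # read pixel values as (bands, height, width)
r = r[:3, :, :]           # keep the first three bands, dropping alpha
r = r.transpose(1, 2, 0)  # channels last

# Grab a portion of the image just to test, near the middle.
# These indices suit a large orthomosaic; adjust them to fit within your raster.
r = r[12000:13000, 6000:7000, :]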
plt.imshow(r)
plt.show()

print(m.config)

boxes = m.predict_tile(
    image=r,
    patch_size=700,
    patch_overlap=0.2,
    iou_threshold=0.5
)

# Logical image name only (no filesystem dependency)
boxes["image_path"] = "Plot13Ortho_crop.tif"

gdf = utilities.image_to_geo_coordinates(
    boxes,
    flip_y_axis=True
)

plot = visualize.plot_results(results=boxes, image=r)
plt.imshow(plot)
plt.show()


This is a decent start for zero-shot prediction on imagery at a new resolution. Let's try a couple of things to see if we can make it any better without new annotations. We always say that DeepForest is best used as a backbone: an hour of new annotation on target imagery and gentle fine-tuning will produce better results than changing hyperparameters.
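
One of the hyperparameters in the `predict_tile` call above is `iou_threshold`, which governs how duplicate boxes from overlapping patches get suppressed. The sketch below shows the general idea of intersection-over-union and greedy non-maximum suppression in plain Python. It illustrates the technique and is not DeepForest's internal implementation; the function names, boxes, and scores are invented.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (xmin, ymin, xmax, ymax)."""
    overlap_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    overlap_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = overlap_w * overlap_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes, scores, iou_threshold=0.5):
    """Keep the highest-scoring box in each group of overlapping boxes.

    Returns the indices of the surviving boxes, highest score first.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with a kept box
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


# Invented example: the first two boxes are one crown seen from two patches
boxes = [(10, 10, 50, 50), (12, 12, 52, 52), (100, 100, 140, 140)]
scores = [0.9, 0.6, 0.8]
kept = greedy_nms(boxes, scores, iou_threshold=0.5)
print("kept indices:", kept)
```

A lower threshold removes more overlapping boxes, which helps with duplicates at patch seams but can also drop genuinely adjacent crowns.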

## Make geospatial predictions on the full tile

Now that I'm happy with a small crop, I want to make predictions on the entire image. This will take some time. We can do away with cropping the image, as well as flipping the y axis, since the coordinates are now in the geospatial projection of the tile.
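
The difference between image and geospatial coordinates comes down to the raster's affine transform. As a rough sketch, assuming a north-up raster with square pixels and no rotation, a pixel position maps to map coordinates as shown below. The origin, the helper name `pixel_to_geo`, and the example box are invented for illustration; the 0.05 m pixel size matches the 5 cm resolution mentioned earlier.

```python
def pixel_to_geo(col, row, origin_x, origin_y, pixel_size):
    """Map a pixel (col, row) to map coordinates for a north-up raster.

    origin_x, origin_y is the map position of the raster's top-left corner.
    x grows with the column index, while y shrinks as the row index grows,
    because image rows count downward from the top.
    """
    x = origin_x + col * pixel_size
    y = origin_y - row * pixel_size
    return x, y


# Invented top-left corner in a projected CRS (metres) and a 5 cm pixel
origin_x, origin_y, pixel_size = 500_000.0, 6_000_000.0, 0.05

# A predicted box in pixel coordinates: (xmin, ymin, xmax, ymax)
xmin, ymin, xmax, ymax = 100, 200, 180, 300

left, top = pixel_to_geo(xmin, ymin, origin_x, origin_y, pixel_size)
right, bottom = pixel_to_geo(xmax, ymax, origin_x, origin_y, pixel_size)
print("geo box (left, bottom, right, top):", (left, bottom, right, top))
```

Note that the smaller pixel row ends up as the larger map y value. That sign change is why overlays can look mirrored when image-space and map-space coordinates are mixed.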


In [None]:
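import rasterio as rio
from deepforest import main, get_data

m = main.deepforest()
m.load_model("weecology/deepforest-tree")

# Keep the first three bands (drop the alpha channel) and make channels last
raster_path = get_data("2019_YELL_2_541000_4977000_image_crop.tif")
with rio.open(raster_path) as src:
    r = src.read()  # read pixel values as (bands, height, width)
r = r[:3, :, :]
r = r.transpose(1, 2, 0)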

# Example prediction (optional)
# boxes = m.predict_tile(image=r, patch_size=700, patch_overlap=0.2, iou_threshold=0.5)
# boxes["image_path"] = "Plot13Ortho.tif"


In general I would look at this in QGIS; it's much easier to zoom. Just for the sake of showing how it's done, we can overlay the geospatial predictions on the large image.

In [None]:
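# View the GeoDataFrame overlaid on the image
fig, ax = plt.subplots(1, 1, figsize=(10, 10))

# Pass the open dataset, not the channels-last array, so show() uses the raster's georeferencing
with rio.open(raster_path) as src:
    show(src, ax=ax)
gdf.plot(ax=ax, color="red", alpha=0.5)
plt.show()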
