# Working with DeepForest data

Hi all, welcome to another installment of working with DeepForest. Today we have some great data from Australian Eucalyptus forests. Let's walk through the steps to get some predictions. The first thing I did was look at the tile in QGIS to get a sense of the resolution (5 cm), habitat type, and image quality. Then I started with our standard boilerplate DeepForest prediction code from the 'Getting Started' page. Each code snippet below shows the entire code used to create its output. That mirrors what I actually do when debugging: try a set of parameters, view the output, and re-run.

In [2]:
import cv2
import numpy as np
import rasterio as rio
from matplotlib import pyplot as plt
from rasterio.plot import show

from deepforest import main, utilities, visualize
from deepforest.visualize import plot_results

m = main.deepforest()

m.load_model("weecology/deepforest-tree")
try:
    image = m.predict_tile(
        path="/Users/benweinstein/Downloads/Plot13Ortho.tif",
        patch_size=500,
        patch_overlap=0,
    )
    plot_results(image)
except Exception as e:
    print(e)

Here we get an error message saying that the input raster has four bands. This is pretty common for data exported from tools that create orthomosaics, like Agisoft and Pix4D. Most programs have a toggle for turning off the 'alpha channel'. We can use rasterio to open the image and select just the bands we want.
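
To make the band handling concrete before touching the real file, here is a minimal sketch on a synthetic array. It assumes the raster is laid out as (bands, rows, cols), which is how rasterio returns it, and that the first three bands are R, G, B with alpha last. The helper name `to_rgb_channels_last` and the toy array are hypothetical, not part of DeepForest.

```python
import numpy as np


def to_rgb_channels_last(bands_first: np.ndarray) -> np.ndarray:
    """Drop bands beyond the first three and return a (rows, cols, 3) array."""
    if bands_first.ndim != 3 or bands_first.shape[0] < 3:
        raise ValueError("expected shape (bands, rows, cols) with at least 3 bands")
    rgb = bands_first[:3, :, :]          # keep R, G, B; discard alpha
    return np.transpose(rgb, (1, 2, 0))  # (bands, rows, cols) -> (rows, cols, bands)


# Synthetic stand-in for a 4-band orthomosaic (made-up data, not the real tile)
rgba = np.zeros((4, 20, 30), dtype=np.uint8)
rgba[3] = 255  # fully opaque alpha band
rgb = to_rgb_channels_last(rgba)
print(rgb.shape)
```

The same two steps, slicing the first three bands and transposing, are what the next cell does on the real orthomosaic.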

In [2]:
m = main.deepforest()
m.load_model("weecology/deepforest-tree")
# Keep the first 3 bands (drop the alpha channel) and make channels last
with rio.open("/Users/benweinstein/Downloads/Plot13Ortho.tif") as src:
    r = src.read()
r = r[:3, :, :]
r = r.transpose(1, 2, 0)

# boxes = m.predict_tile(image=r, patch_size=700, patch_overlap=0.2, iou_threshold=0.5)

It starts to run, but wow, that's going to take too long on CPU for me to write this post. On GPU this might only take 2 minutes, but on CPU the progress bar estimates almost an hour. Let's kill that and come back to the full prediction set when we are happy with the settings.
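
A rough way to see why the full tile is slow is to count how many windows the model has to process. The sketch below is only an approximation of sliding-window tiling, not DeepForest's exact windowing logic. The function name `estimate_windows` and the full-tile dimensions are hypothetical; the 1000 x 1000 crop matches the one used in the next section.

```python
import math


def estimate_windows(height: int, width: int, patch_size: int, patch_overlap: float) -> int:
    """Approximate the number of sliding windows needed to cover an image."""
    # Distance between the starts of neighbouring windows
    stride = max(1, round(patch_size * (1 - patch_overlap)))

    def along(length: int) -> int:
        if length <= patch_size:
            return 1
        return math.ceil((length - patch_size) / stride) + 1

    return along(height) * along(width)


# Hypothetical full-tile size in pixels, compared with a 1000 x 1000 crop
full = estimate_windows(25000, 14000, patch_size=700, patch_overlap=0.2)
crop = estimate_windows(1000, 1000, patch_size=700, patch_overlap=0.2)
print(full, crop)
```

Under these assumptions the full tile needs over a thousand windows while the crop needs only a handful, which is why iterating on a small crop is so much faster.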

## Crop a small portion to work with

In [None]:
m = main.deepforest()
m.load_model("weecology/deepforest-tree")
# Keep the first 3 bands (drop the alpha channel) and make channels last
with rio.open("/Users/benweinstein/Downloads/Plot13Ortho.tif") as src:
    r = src.read()
r = r[:3, :, :]
r = r.transpose(1, 2, 0)

# Grab a portion of image just to test, near the middle
r = r[12000:13000, 6000:7000, :]
plt.imshow(r)
plt.show()

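# Save the crop as a TIFF (OpenCV expects BGR order, so convert from RGB first)
bgr = cv2.cvtColor(np.ascontiguousarray(r), cv2.COLOR_RGB2BGR)
cv2.imwrite("/Users/benweinstein/Downloads/Plot13Ortho_crop.tif", bgr)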
print(m.config)
boxes = m.predict_tile(image=r, patch_size=700, patch_overlap=0.2, iou_threshold=0.5)
boxes["image_path"] = "Plot13Ortho_crop.tif"
gdf = utilities.image_to_geo_coordinates(
    boxes, root_dir="/Users/benweinstein/Downloads", flip_y_axis=True
)
gdf.to_file("/Users/benweinstein/Downloads/Plot13Ortho_crop.shp")

plot = visualize.plot_results(results=boxes, image=r)
plt.imshow(plot)
plt.show()

This is a decent start for zero-shot prediction on imagery at a new resolution. Let's try a couple of things to see if we can make it any better without new annotations. We always say that DeepForest is best used as a backbone: an hour of new annotation on target imagery and gentle fine-tuning will produce better results than changing hyperparameters.
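
One cheap thing to try without new annotations is changing the score cutoff on the predictions you already have. Here is a minimal sketch, assuming the predictions are a pandas DataFrame with a `score` column, as `predict_tile` returns. The boxes and scores below are made up for illustration, and `filter_by_score` is a hypothetical helper.

```python
import pandas as pd

# Made-up predictions in the column layout of a DeepForest result
boxes = pd.DataFrame(
    {
        "xmin": [10, 200, 410],
        "ymin": [15, 220, 430],
        "xmax": [60, 290, 450],
        "ymax": [70, 300, 480],
        "label": ["Tree", "Tree", "Tree"],
        "score": [0.82, 0.35, 0.12],
    }
)


def filter_by_score(predictions: pd.DataFrame, min_score: float) -> pd.DataFrame:
    """Keep only the boxes whose confidence is at least min_score."""
    return predictions[predictions["score"] >= min_score].reset_index(drop=True)


# See how many boxes survive at a few candidate thresholds
for threshold in (0.1, 0.3, 0.5):
    kept = filter_by_score(boxes, threshold)
    print(threshold, len(kept))
```

Filtering after the fact means you can compare thresholds without re-running the model, which matters when each run is slow on CPU.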

## Make geospatial predictions on the full tile

Now that I'm happy with a small crop, I want to make predictions on the entire image. This will take some time. We can do away with cropping the image, as well as flipping the y axis, since the coordinates are now in the geospatial projection of the tile.
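
The y-axis point is easier to see with the arithmetic written out. Image rows count downward from the top edge, while northing in a projected coordinate system increases upward, so the row offset is subtracted from the raster's upper-left corner. The sketch below assumes a north-up raster with square pixels and no rotation. The origin coordinates are hypothetical, the 5 cm resolution is the one noted for this tile, and `pixel_to_geo` is an illustrative helper, not a DeepForest function.

```python
def pixel_to_geo(col: float, row: float, origin_x: float, origin_y: float, resolution: float):
    """Convert a pixel position to projected coordinates for a north-up raster."""
    x = origin_x + col * resolution  # columns increase eastward
    y = origin_y - row * resolution  # rows increase downward, northing decreases
    return x, y


# Hypothetical upper-left corner in a projected CRS (meters); 5 cm pixels
origin_x, origin_y = 500000.0, 6000000.0
resolution = 0.05

top_left = pixel_to_geo(0, 0, origin_x, origin_y, resolution)
inside = pixel_to_geo(100, 200, origin_x, origin_y, resolution)
print(top_left, inside)
```

A crop written without georeferencing has no such origin, which is why the earlier cell needed `flip_y_axis=True` to display correctly in QGIS.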


In [None]:
m = main.deepforest()
m.load_model("weecology/deepforest-tree")
# Keep the first 3 bands (drop the alpha channel) and make channels last
with rio.open("/Users/benweinstein/Downloads/Plot13Ortho.tif") as src:
    r = src.read()
r = r[:3, :, :]
r = r.transpose(1, 2, 0)

boxes = m.predict_tile(image=r, patch_size=700, patch_overlap=0.2, iou_threshold=0.5)
boxes["image_path"] = "Plot13Ortho.tif"
# The full tile is georeferenced, so no y-axis flip is needed here
gdf = utilities.image_to_geo_coordinates(boxes, root_dir="/Users/benweinstein/Downloads")
gdf.to_file("/Users/benweinstein/Downloads/Plot13Ortho.shp")

In general I would look at this in QGIS; it's much easier to zoom. Just for the sake of showing how it's done, we can overlay the geospatial predictions on the large image.

In [None]:
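# View the GeoDataFrame overlaid on the image
fig, ax = plt.subplots(1, 1, figsize=(10, 10))

# rasterio's show() expects channels first, and needs the raster's affine
# transform to draw the image in the same coordinates as the predictions
with rio.open("/Users/benweinstein/Downloads/Plot13Ortho.tif") as src:
    show(src.read([1, 2, 3]), transform=src.transform, ax=ax)
gdf.plot(ax=ax, color="red", alpha=0.5)
plt.show()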