# Crop Classification

## Setup
<hr>

In [1]:
import os
import numpy as np
from osgeo import gdal
from dbfread import DBF
from pandas import DataFrame


## Load Data
<hr>

In [2]:
def raster_to_numpy_array(filename):
    """Read every band of a raster file into a NumPy array."""
    return np.array(gdal.Open(filename).ReadAsArray())


full_training_data = raster_to_numpy_array(
    os.path.join("data", "20130824_RE3_3A_Analytic_Champaign_north.tif")
)

training_data_labels = raster_to_numpy_array(
    os.path.join("data", "CDL_2013_Champaign_north.tif")
)


In [3]:
label_mapping = DataFrame(
    iter(
        DBF(os.path.join("data", "CDL_2013_clip_20170525181724_1012622514.tif.vat.dbf"))
    )
)


### Data Metadata

There should be 5 bands in the imagery data. Because the training data label raster shares the same projection as the imagery data and was resampled from 30m/pixel to 5m/pixel to match its resolution, the two rasters should have the same shape.

In [4]:
# Training data has 5 bands
print(full_training_data.shape[0])

# Training data and label have the same shape
print(full_training_data.shape[1:])
print(training_data_labels.shape)


5
(5959, 9425)
(5959, 9425)


The RapidEye product specification I found says the pixel depth is 16 bits, unsigned. Does that match the data?

In [5]:
print(np.amax(full_training_data, axis=(1, 2)))
print(np.amin(full_training_data, axis=(1, 2)))


[35709 35412 31680 24759 24381]
[0 0 0 0 0]


Not sure. The values fit in an unsigned 16-bit range (maximum 65535), but they seem pretty low. Perhaps there are simply no bright colors. It shouldn't be a problem, as these values will be normalized anyway.
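
As a rough sketch of the normalization mentioned above: the notebook does not say which scheme is used, so this assumes simple per-band min-max scaling to [0, 1]. The function name `normalize_bands` and the tiny `example` array are hypothetical stand-ins, not part of the original notebook. The sketch also prints the array's dtype and its maximum representable value, which is a more direct check of the pixel depth than looking at the observed maxima.

```python
import numpy as np


def normalize_bands(image):
    """Scale each band of a (bands, rows, cols) array to the range [0, 1].

    Assumption: per-band min-max scaling. Other schemes (for example
    dividing by 65535, or standardizing to zero mean) are equally plausible.
    """
    image = image.astype(np.float64)
    band_min = image.min(axis=(1, 2), keepdims=True)
    band_max = image.max(axis=(1, 2), keepdims=True)
    # Guard against division by zero for bands with a single constant value.
    band_range = np.where(band_max > band_min, band_max - band_min, 1.0)
    return (image - band_min) / band_range


# Hypothetical stand-in for the imagery: 2 bands of 2x2 uint16 pixels.
example = np.array(
    [[[0, 100], [200, 400]],
     [[0, 10], [20, 40]]],
    dtype=np.uint16,
)

# The dtype and its largest representable value show the stored pixel depth.
print(example.dtype, np.iinfo(example.dtype).max)

normalized = normalize_bands(example)
print(normalized.min(axis=(1, 2)), normalized.max(axis=(1, 2)))
```

On the real data the same call would be `normalize_bands(full_training_data)`, and `full_training_data.dtype` would show whether the raster is actually stored as `uint16`.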

What about the label data?

In [6]:
print(np.amax(training_data_labels))
print(np.amin(training_data_labels))

print(label_mapping["VALUE"].max())
print(label_mapping["VALUE"].min())

print(label_mapping[label_mapping["VALUE"] == 0])
print(label_mapping[label_mapping["CLASS_NAME"] == "Corn"])
print(label_mapping[label_mapping["CLASS_NAME"] == "Sorghum"])


254
1
254
0
   VALUE  CLASS_NAME  RED  GREEN  BLUE  OPACITY
0      0  Background  0.0    0.0   0.0      0.0
   VALUE CLASS_NAME  RED     GREEN  BLUE  OPACITY
1      1       Corn  1.0  0.827451   0.0      1.0
   VALUE CLASS_NAME  RED     GREEN      BLUE  OPACITY
4      4    Sorghum  1.0  0.619608  0.047059      1.0


This checks out. The label raster's minimum is 1, so Background (0) is never used, and probably many other classes in the mapping are not either. Corn (1) and Sorghum (4) are the important ones.
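
One way to test the guess about unused classes is to count how often each label value occurs. The sketch below uses `np.unique` with `return_counts=True` on a small made-up array. `example_labels` and `example_names` are hypothetical stand-ins for `training_data_labels` and the `VALUE` to `CLASS_NAME` mapping, and the counts are not real results.

```python
import numpy as np

# Hypothetical stand-ins: a tiny label raster and a value-to-name mapping.
example_labels = np.array([[1, 1, 5], [4, 1, 5]], dtype=np.uint8)
example_names = {0: "Background", 1: "Corn", 4: "Sorghum", 5: "Soybeans"}

# Count the pixels assigned to each label value present in the raster.
values, counts = np.unique(example_labels, return_counts=True)
used = dict(zip(values.tolist(), counts.tolist()))

# Values listed in the mapping that never occur in the raster.
unused = sorted(set(example_names) - set(used))

for value, count in used.items():
    print(value, example_names.get(value, "unknown"), count)
print("unused:", [example_names[v] for v in unused])
```

On the real data, `np.unique(training_data_labels, return_counts=True)` would give the per-class pixel counts, and comparing the returned values against `label_mapping["VALUE"]` would list the classes that never appear.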