# Generate Cross-Validation with Spatial Leave-One-SubGroup-Out (SLOSGO)

In [1]:
from MuseoToolBox import crossValidationTools
inRaster = '../data/map_lowres.tif'
inVector = '../data/train_withROI.gpkg'
levelField = 'Class'
subGroup = 'uniquefid'

## Select a sampling method
In crossValidationSelection, the samplingMethods class provides one function per sampling method.

Here we choose SLOOCV to generate a cross-validation that uses each pixel for validation according to a threshold distance.
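
To illustrate the idea behind a distance threshold, here is a minimal pure-Python sketch. It is not the MuseoToolBox implementation: the function name `spatial_split`, the coordinates and the threshold value are hypothetical. One sample is held out for validation, and every other sample closer than the threshold is dropped from training.

```python
import math

def spatial_split(coords, valid_idx, threshold):
    """Return (train, valid) index lists for one held-out sample.

    Samples closer than `threshold` to the validation sample are excluded
    from training to limit spatial autocorrelation between the two sets.
    """
    vx, vy = coords[valid_idx]
    train = [
        i for i, (x, y) in enumerate(coords)
        if i != valid_idx and math.hypot(x - vx, y - vy) >= threshold
    ]
    return train, [valid_idx]

# Hypothetical pixel coordinates (not taken from the example dataset)
coords = [(0, 0), (3, 4), (50, 50), (200, 0), (0, 150)]
train, valid = spatial_split(coords, valid_idx=0, threshold=100)
print(train, valid)
```

Only the two samples lying at least 100 pixels from the held-out one remain in the training set.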

First, compute the distance matrix:

In [2]:
distanceMatrix,distanceLabel = crossValidationTools.getDistanceMatrix(inRaster,inVector,subGroup)

Values from 'uniquefid' field will be extracted
Reading raster values...  [########################################]100%
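
As a rough picture of what a distance matrix paired with a label vector can look like, the sketch below builds a symmetric matrix of pairwise Euclidean distances in pure Python. The helper `pairwise_distances`, the coordinates and the labels are assumptions made for illustration; the actual layout returned by `getDistanceMatrix` may differ.

```python
import math

def pairwise_distances(coords):
    """Build a symmetric matrix of Euclidean distances between points."""
    n = len(coords)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = math.hypot(coords[i][0] - coords[j][0],
                           coords[i][1] - coords[j][1])
            # Distance is symmetric, so fill both halves at once
            matrix[i][j] = matrix[j][i] = d
    return matrix

# Hypothetical pixel coordinates and the subgroup id attached to each one
coords = [(0, 0), (3, 4), (6, 8)]
labels = [11, 12, 13]
matrix = pairwise_distances(coords)
for label, row in zip(labels, matrix):
    print(label, row)
```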


In [3]:
# distanceThresold is 100 pixels here
SLOSGO = crossValidationTools.SpatialLeaveOneSubGroupOut(
    inRaster, inVector, inField='Class', inGroup='uniquefid',
    distanceThresold=100, distanceMatrix=distanceMatrix,
    distanceLabel=distanceLabel, seed=12)

Values from 'Class' field will be extracted
Reading raster values...  [########################################]100%


Now the cross-validation is ready to compute. You have two choices:
### Generate the Cross-Validation for Scikit-Learn

In [4]:
CV = SLOSGO.split()
print('Number of iteration : '+str(CV.n_splits))
print('Number of samples for train / validation : ')
for train, valid in CV:
    print(train.shape[0], valid.shape[0])

Number of iteration : 2
Number of samples for train / validation : 
6677 3099
6677 3406


### Save the Cross-Validation to one pair of files per training/validation iteration

Because the cross-validation folds are generated on demand, you have to reinitialize the process. Make sure you have defined a seed so that you get exactly the same CV.

In [5]:
CV = SLOSGO.saveVectorFiles('../data/cv.sqlite')
for tr,vl in CV:
    print(tr,vl)

Values from 'Class' field will be extracted
Reading raster values...  [########################################]100%
../data/cv_train_0.sqlite ../data/cv_valid_0.sqlite
../data/cv_train_1.sqlite ../data/cv_valid_1.sqlite
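
Why reinitializing with a fixed seed matters can be seen in the sketch below, which uses only the standard `random` module. The generator `make_splits` is a hypothetical stand-in, not the MuseoToolBox code: it assumes the folds are drawn randomly and yielded on demand, so a consumed generator is empty, and only a re-created generator with the same seed reproduces the same folds.

```python
import random

def make_splits(n_samples, n_splits, seed=None):
    """Yield (train, valid) index lists; hypothetical stand-in for a CV generator."""
    rng = random.Random(seed)
    indices = list(range(n_samples))
    cut = n_samples // 2
    for _ in range(n_splits):
        rng.shuffle(indices)
        yield sorted(indices[:cut]), sorted(indices[cut:])

first = make_splits(10, 2, seed=12)
first_run = list(first)   # iterating consumes the generator
leftover = list(first)    # nothing is left on a second pass

# Re-create the generator with the same seed to get identical folds
second_run = list(make_splits(10, 2, seed=12))
print(first_run == second_run, leftover)
```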
