# Generate Cross-Validation with Spatial Leave-One-Out (SLOO)

In [1]:
from MuseoToolBox.vectorTools import crossValidationSelection
inRaster = '../data/map.tif'
inVector = '../data/train_withROI.gpkg'
levelField = 'Class'

## Select a sampling method
In `crossValidationSelection`, the `samplingMethods` class provides one function per sampling method.

Here we choose SLOOCV, which builds a cross-validation where each pixel is used for validation in turn, subject to a threshold distance.
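To make the idea concrete, here is a minimal NumPy sketch of spatial leave-one-out as it is usually described: each sample is held out in turn, and every sample lying within the threshold distance of it is dropped from the training set. This is not MuseoToolBox's implementation. The function name `sloo_splits`, the toy coordinates, and the choice to treat a distance exactly equal to the threshold as "too close" are all assumptions made for illustration.

```python
import numpy as np

def sloo_splits(coords, threshold, seed=None):
    """Yield (train_idx, valid_idx) pairs, one per sample (hypothetical helper)."""
    coords = np.asarray(coords, dtype=float)
    # Pairwise Euclidean distances between all samples, shape (n, n)
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # Visit the samples in a seeded random order so runs are reproducible
    order = np.random.RandomState(seed).permutation(len(coords))
    for i in order:
        # Keep only samples strictly farther than the threshold;
        # this also excludes the validation sample itself (distance 0)
        train_idx = np.flatnonzero(dist[i] > threshold)
        yield train_idx, np.array([i])

# Toy pixel coordinates (assumed), with a threshold of 100 pixels
coords = np.array([[0, 0], [0, 50], [0, 150], [200, 0], [300, 300]])
splits = list(sloo_splits(coords, threshold=100, seed=12))
for train_idx, valid_idx in splits:
    print(valid_idx, train_idx)
```

With real data the pairwise matrix grows quadratically with the number of samples, which is presumably why the notebook computes the distance matrix once and passes it to the sampling method.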

Compute the distance matrix:

In [2]:
distanceMatrix = crossValidationSelection.samplingMethods.getDistanceMatrixForDistanceCV(inRaster,inVector)

# distanceThresold is 100 pixels here
samplingMethod = crossValidationSelection.samplingMethods.SLOOCV(
    inRaster, inVector, distanceThresold=100,
    distanceMatrix=distanceMatrix, seed=12)

Reading raster values...  [########################################]100%


In [3]:
crossValidation = crossValidationSelection.sampleSelection(inVector,levelField,samplingMethod)

Reading raster values...  [########################################]100%


The `crossValidation` object is now ready to be computed. You have two choices:

### Generate the Cross-Validation for Scikit-Learn

In [4]:
CV = crossValidation.getCrossValidationForScikitLearn()
print('Number of iteration : ' + str(CV.maxIter))
print('Number of samples for train / validation : ')
# Each iteration yields a (train, validation) pair
for train, valid in CV:
    print(train.shape[0], valid.shape[0])

Number of iteration : 39
Number of samples for train / validation : 
31655 4
28755 4
33762 4
30404 4
33757 4
34101 4
26606 4
32516 4
30639 4
28086 4
28468 4
31553 4
26606 4
33020 4
32398 4
26606 4
29268 4
31625 4
29956 4
32569 4
32147 4
32407 4
28769 4
29961 4
31470 4
30017 4
31488 4
31208 4
29961 4
31950 4
28504 4
32522 4
31626 4
28321 4
28537 4
34071 4
32701 4
29417 4
29961 4
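For context, scikit-learn's model-selection helpers accept, as their `cv` argument, any iterable that yields `(train, test)` pairs of index arrays. The sketch below shows that mechanism on synthetic data with hand-built splits. The data, the classifier, and the split layout are assumptions for illustration. Whether the `CV` object above can be passed directly in the same way is not confirmed by this notebook.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Synthetic data (assumed): 40 samples, 3 features, 2 classes
rng = np.random.RandomState(12)
X = rng.rand(40, 3)
y = np.tile([0, 1], 20)

# Hand-built splits: each iteration holds out two samples for validation
all_idx = np.arange(len(y))
custom_splits = []
for i in range(5):
    valid_idx = np.array([2 * i, 2 * i + 1])
    train_idx = np.setdiff1d(all_idx, valid_idx)
    custom_splits.append((train_idx, valid_idx))

# cv accepts an iterable of (train, test) index arrays
scores = cross_val_score(DecisionTreeClassifier(random_state=0),
                         X, y, cv=custom_splits)
print(scores.shape)
```

One score is returned per split, so a cross-validation with 39 iterations would give 39 scores.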


### Save the Cross-Validation in as many files as there are training/validation iterations

Because cross-validations are generated on demand, you have to reinitialize the process. Make sure you have defined a seed so that you get exactly the same CV.
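The reason for both requirements can be shown with a plain Python generator. It is consumed after one pass, and rebuilding it reproduces the same splits only when the random state is seeded. This is a generic sketch using the standard library. `make_splits` is a hypothetical stand-in, not a MuseoToolBox function.

```python
import random

def make_splits(n_samples, n_iter, seed=None):
    """Lazily yield (train, valid) splits (hypothetical stand-in)."""
    rng = random.Random(seed)
    indices = list(range(n_samples))
    for _ in range(n_iter):
        valid = rng.choice(indices)
        train = [i for i in indices if i != valid]
        yield train, valid

cv = make_splits(10, 3, seed=12)
first_pass = [valid for _, valid in cv]
exhausted = list(cv)  # the generator is already consumed, so this is empty

# Re-create the generator with the same seed to get the same splits again
cv_again = make_splits(10, 3, seed=12)
second_pass = [valid for _, valid in cv_again]
print(first_pass == second_pass, exhausted)
```

Without a seed, the second pass would generally pick different validation samples, so the saved files would not match the splits used earlier.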

In [5]:
CV = crossValidation.saveVectorFiles('../data/cv.sqlite')
for tr,vl in CV:
    print(tr,vl)

Reading raster values...  [########################################]100%
data/cv_train_0.sqlite data/cv_valid_0.sqlite
data/cv_train_1.sqlite data/cv_valid_1.sqlite
data/cv_train_2.sqlite data/cv_valid_2.sqlite
data/cv_train_3.sqlite data/cv_valid_3.sqlite
data/cv_train_4.sqlite data/cv_valid_4.sqlite
data/cv_train_5.sqlite data/cv_valid_5.sqlite
data/cv_train_6.sqlite data/cv_valid_6.sqlite
data/cv_train_7.sqlite data/cv_valid_7.sqlite
data/cv_train_8.sqlite data/cv_valid_8.sqlite
data/cv_train_9.sqlite data/cv_valid_9.sqlite
data/cv_train_10.sqlite data/cv_valid_10.sqlite
data/cv_train_11.sqlite data/cv_valid_11.sqlite
data/cv_train_12.sqlite data/cv_valid_12.sqlite
data/cv_train_13.sqlite data/cv_valid_13.sqlite
data/cv_train_14.sqlite data/cv_valid_14.sqlite
data/cv_train_15.sqlite data/cv_valid_15.sqlite
data/cv_train_16.sqlite data/cv_valid_16.sqlite
data/cv_train_17.sqlite data/cv_valid_17.sqlite
data/cv_train_18.sqlite data/cv_valid_18.sqlite
data/cv_train_19.sqlite data/cv_val