# Generate Cross-Validation with farthest points

In [1]:
from MuseoToolBox.vectorTools import getDistanceMatrixForDistanceCV
from MuseoToolBox import crossValidationTools
import os 

inRaster = '../data/map_lowres.tif'
inVector = '../data/train_withROI.gpkg'
inField = 'Class'


## Select a sampling method
In crossValidationSelection, the samplingMethods class provides one function per sampling method.

Here we choose farthestCV, which builds a Cross-Validation by picking a random pixel for validation and then taking the X% of points farthest from that pixel. The remaining points are added to the validation set.
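
The selection rule described above can be sketched in plain Python. This is only an illustration of the idea, not MuseoToolBox's implementation: the function name `farthest_split`, its parameters, and the toy coordinates are all hypothetical, and it assumes Euclidean distance between point coordinates.

```python
import math
import random


def farthest_split(points, fraction=0.5, seed=12):
    """Pick a random reference point, then split indices into
    (farthest, rest) according to distance from that point.

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    ref = rng.randrange(len(points))
    # Euclidean distance from the reference point to every point
    dist = [math.dist(points[ref], p) for p in points]
    # Indices ordered from farthest to nearest
    order = sorted(range(len(points)), key=lambda i: dist[i], reverse=True)
    n_far = int(round(fraction * len(points)))
    farthest = sorted(order[:n_far])
    rest = sorted(order[n_far:])  # includes the reference point itself
    return ref, farthest, rest


# Two well-separated toy clusters of three points each
points = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10), (10, 11)]
ref, farthest, rest = farthest_split(points, fraction=0.5, seed=12)
print('reference point :', ref)
print('farthest points :', farthest)
print('remaining points:', rest)
```

With two clusters and `fraction=0.5`, the farthest half is always the cluster that does not contain the reference point, whichever point is drawn.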

### Compute distance matrix

In [2]:
distanceMatrix = getDistanceMatrixForDistanceCV(inRaster,inVector)

Values from 'False' field will be extracted
Reading raster values...  [###################.....................]49%

Choose a samplingMethod and add it to sampleSelection

In [3]:
#distanceThresold is 100 pixels here
LPSO = crossValidationTools.LeavePSideOut(inRaster, inVector, inField,
                                          distanceMatrix=distanceMatrix,
                                          n_splits=5, minTrain=0.5, seed=12)

Values from 'Class' field will be extracted
Reading raster values...  [###################.....................]49%

Now the Cross-Validation is ready to be computed. You have two choices:

### Generate the Cross-Validation for Scikit-Learn

In [4]:
CV = LPSO.split()
print('Number of iteration : '+str(CV.n_splits))
print('Number of samples for train / validation : ')
for tr,vl in CV:
    print(tr.shape[0],vl.shape[0])

Number of iteration : 5
Number of samples for train / validation : 
5 12642
5 12642
5 12642
5 12642
5 12642


### Save the Cross-Validation in as many files as there are training/validation iterations

Because Cross-Validation splits are generated on demand, you have to reinitialize the process before reusing them. **Make sure you have defined a seed** so that you get exactly the same CV.

In [5]:
CV = LPSO.saveVectorFiles('../data/cv.sqlite')
for tr,vl in CV:
    print(tr,vl)

Values from 'Class' field will be extracted
../data/cv_train_0.sqlite ../data/cv_valid_0.sqlite
../data/cv_train_1.sqlite ../data/cv_valid_1.sqlite
../data/cv_train_2.sqlite ../data/cv_valid_2.sqlite
../data/cv_train_3.sqlite ../data/cv_valid_3.sqlite
../data/cv_train_4.sqlite ../data/cv_valid_4.sqlite
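
The note above about on-demand generation and seeding can be illustrated with a small stand-alone sketch. It does not use MuseoToolBox: `random_splits` is a hypothetical generator that shuffles indices and cuts them in half, used here only to show why a consumed generator must be recreated and why a fixed seed is needed to get the same splits back.

```python
import random


def random_splits(n_samples, n_splits, seed=None):
    """Yield (train, valid) index lists lazily, one pair per split.

    Hypothetical generator for illustration only.
    """
    rng = random.Random(seed)
    half = n_samples // 2
    for _ in range(n_splits):
        idx = list(range(n_samples))
        rng.shuffle(idx)
        yield idx[:half], idx[half:]


# A generator is consumed by its first full iteration
gen = random_splits(10, 3, seed=12)
first = list(gen)
leftover = list(gen)  # empty: nothing left to yield

# Recreating it with the same seed reproduces the same splits
second = list(random_splits(10, 3, seed=12))

print('splits in first pass  :', len(first))
print('splits left afterwards:', len(leftover))
print('same splits with seed :', first == second)
```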
