<img src="images/banner3.png" width="100%" />

<font face="Calibri">
<br>
<font size="5"> <b>Sand clustering with Silhouette Analysis and KMeans notebook</b></font>

<br>
<font size="4"> <b> Nicolas Pucino; PhD Student @ Deakin University, Australia </b> <br>
<img style="padding:7px;" src="images/sandpiper_sand_retouched.png" width="170" align="right" /></font>

<font size="3">This notebook illustrates how to use Sandpyper to perform Silhouette Analysis and KMeans clustering on all previously extracted points. <br>

<b>This notebook covers the following concepts:</b>

- Silhouette Analysis.
- KMeans clustering.
</font>


</font>

In [1]:
import pandas as pd
import geopandas as gpd
import numpy as np

from sandpyper.outils import coords_to_points 
from sandpyper.labels import get_sil_location, get_opt_k, kmeans_sa

Loading the project-related lists

- loc codes
- crs dict string

In [2]:
# The location codes used throughout the analysis
loc_codes=["mar","leo"]

# The Coordinate Reference Systems used throughout this study
crs_dict_string= {
                 'mar': {'init': 'epsg:32754'},
                 'leo': {'init': 'epsg:32755'},
                 }

### Loading, merging and preparing the tables

The function __get_merged_table__ merges the RGB and z tables and formats the result so that it is ready for further analysis.

In [16]:
%%time

#Loading the tables

rgb_table_path=r"C:\my_packages\doc_data\profiles\rgb.csv"
z_table_path=r"C:\my_packages\doc_data\profiles\elevation.csv"

rgb_table=gpd.read_file(rgb_table_path)
z_table=gpd.read_file(z_table_path)

# As the distance (across-transect) comes from an interpolation, it has too many digits.
# Let's round the distance column of both tables to 2 decimal places and set its data type to "float".

rgb_table["distance"]=np.round(rgb_table.loc[:,"distance"].values.astype("float"),2)
z_table["distance"]=np.round(z_table.loc[:,"distance"].values.astype("float"),2)


Wall time: 16.9 s
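
As a small illustration of the rounding step above, the sketch below uses made-up distance values stored as text (an assumption about how an untyped table might look after loading). It shows that `np.round(..., 2)` rounds to two decimal places and removes the floating-point noise left by an interpolation.

```python
import numpy as np

# Made-up distances stored as text, including a value with interpolation noise.
raw = np.array(["0.30000000000000004", "0.1", "12.345678"])

# Cast to float first, then round to two decimal places.
distance = np.round(raw.astype("float"), 2)
print(distance)
```

Rounding both tables in the same way keeps their distance values comparable.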


Storing GeoDataFrames as CSV is handy, but __we lose the column data type information__.
This matters most for the __geometry column__, which we need to convert back into __Shapely Point objects__.
To do that, the function __coords_to_points__ can be applied to the 'coordinates' Series to rebuild the 'geometry' column. It can take quite a bit of time, so, if you have a lot of points, get ready!
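
To see why this conversion is needed, here is a minimal sketch of a CSV round trip with pandas only. It assumes the coordinates are stored as WKT-like text, which may differ from the format sandpyper actually uses. The helper `wkt_point_to_xy` is hypothetical and only stands in for the idea behind __coords_to_points__; it is not that function.

```python
import io
import pandas as pd

# A tiny table with a numeric column and a WKT-like coordinates column (made-up values).
original = pd.DataFrame({
    "distance": [0.0, 0.1],
    "coordinates": ["POINT (300000.5 5800000.25)", "POINT (300001.5 5800001.25)"],
})

# Round trip through CSV, reading every column back as text.
buffer = io.StringIO()
original.to_csv(buffer, index=False)
buffer.seek(0)
reloaded = pd.read_csv(buffer, dtype=str)

def wkt_point_to_xy(text):
    """Hypothetical helper: parse 'POINT (x y)' into an (x, y) tuple of floats."""
    inner = text[text.index("(") + 1 : text.index(")")]
    x, y = inner.split()
    return float(x), float(y)

# Restore the data types explicitly.
reloaded["distance"] = reloaded["distance"].astype(float)
reloaded["xy"] = reloaded["coordinates"].apply(wkt_point_to_xy)
print(reloaded.dtypes)
```

After the round trip, nothing is numeric or geometric until we convert it back ourselves.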

In [17]:
rgb_table['geometry']=rgb_table.coordinates.apply(coords_to_points)
z_table['geometry']=z_table.coordinates.apply(coords_to_points)

In [18]:
# Here, we merge the two tables (storing elevation and rgb information)

data_merged = pd.merge(z_table,rgb_table[["band1","band2","band3","point_id"]],on="point_id",validate="one_to_one")

# replace empty values with np.nan (the np.NaN alias was removed in NumPy 2.0)
data_merged=data_merged.replace("", np.nan)

# and convert the z column into floats.
data_merged['z']=data_merged.z.astype("float")
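
The `validate="one_to_one"` argument used in the merge above is worth a closer look. The sketch below uses two made-up tables to show that pandas raises a `MergeError` as soon as a `point_id` appears more than once on either side.

```python
import pandas as pd

# Made-up elevation and colour tables sharing the "point_id" key.
z = pd.DataFrame({"point_id": ["a", "b", "c"], "z": [1.2, 1.5, 1.9]})
rgb = pd.DataFrame({"point_id": ["a", "b", "c"], "band1": [200, 180, 90]})

# Unique keys on both sides: the merge succeeds.
merged = pd.merge(z, rgb, on="point_id", validate="one_to_one")

# A duplicated key breaks the one-to-one contract and raises MergeError.
rgb_dup = pd.concat([rgb, rgb.iloc[[0]]], ignore_index=True)
try:
    pd.merge(z, rgb_dup, on="point_id", validate="one_to_one")
    duplicate_detected = False
except pd.errors.MergeError:
    duplicate_detected = True

print(len(merged), duplicate_detected)
```

Note that `pd.merge` defaults to an inner join, so points missing from either table are dropped without warning.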

In [19]:
# Here, we add two features, slope and curvature, computed from the elevation series,
# in case we want to use them for KMeans clustering.
# Note that slope and curvature are wrong at the boundary between one transect and the next.
# However, we will clip those areas later, as they are in the water or in the backdune.

data_merged["slope"]=np.gradient(data_merged.z)
data_merged["curve"]=np.gradient(data_merged.slope)
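
The boundary effect mentioned in the comments above can be seen on a toy example. The sketch below compares a gradient over the whole column with a gradient computed within each transect. The column name `transect_id` is hypothetical, used only for illustration, and the grouped version is an alternative to clipping rather than what this notebook does.

```python
import numpy as np
import pandas as pd

# Two toy transects stored one after the other, as in a merged table.
# "transect_id" is a hypothetical column name used only for this sketch.
profiles = pd.DataFrame({
    "transect_id": [0, 0, 0, 1, 1, 1],
    "z": [0.0, 1.0, 2.0, 10.0, 10.0, 10.0],
})

# Gradient over the whole column: the jump between transects leaks into the slope.
profiles["slope_naive"] = np.gradient(profiles["z"].to_numpy())

# Gradient computed separately within each transect.
profiles["slope_grouped"] = (
    profiles.groupby("transect_id")["z"]
    .transform(lambda s: np.gradient(s.to_numpy()))
)
print(profiles)
```

In the naive column, the last point of the first transect and the first point of the second one pick up a slope that only reflects the jump between profiles.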

In [20]:
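# Our rasters store missing values as -32767.0. Thus, we replace them with np.nan.
# Note: slope and curve were computed above, before this replacement,
# so their values next to no-data cells are not meaningful.
data_merged["z"]=data_merged["z"].replace(-32767.0,np.nan)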


The __get_sil_location__ function iteratively performs KMeans clustering and Silhouette Analysis with an increasing number of clusters (k, specified in the `ks` parameter) for every survey, using the feature set specified in the `feature_set` parameter.
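
For readers new to the metric, here is a from-scratch sketch of the mean silhouette score written with NumPy only. It illustrates the definition and is not the code sandpyper runs. For each point, `a` is the mean distance to the other points in its own cluster, `b` is the mean distance to the nearest other cluster, and the score is `(b - a) / max(a, b)`. The function name and the toy data are assumptions made for this example.

```python
import numpy as np

def mean_silhouette(X, labels):
    """Mean silhouette coefficient from its definition (needs at least 2 clusters)."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    # Full pairwise Euclidean distance matrix: fine for a toy example,
    # too memory-hungry for very large point sets.
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    clusters = np.unique(labels)
    scores = np.zeros(len(X))
    for i in range(len(X)):
        same = labels == labels[i]
        if same.sum() == 1:
            continue  # singleton cluster: score 0 by convention
        # a: mean distance to the other members of the point's own cluster
        a = dist[i, same].sum() / (same.sum() - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(dist[i, labels == c].mean() for c in clusters if c != labels[i])
        denom = max(a, b)
        scores[i] = 0.0 if denom == 0 else (b - a) / denom
    return scores.mean()

# Two well-separated pairs of points.
X = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 0.0], [10.0, 1.0]])
good = mean_silhouette(X, [0, 0, 1, 1])  # labels follow the two pairs
bad = mean_silhouette(X, [0, 1, 0, 1])   # labels mix the two pairs
print(good, bad)
```

A labelling that follows the real groups scores close to 1, while one that mixes them scores below 0.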

__get_sil_location__ returns a dataframe with the average Silhouette score for each k in every survey, which we use to find a sub-optimal number of clusters with the __get_opt_k__ function.
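
To give an intuition for what choosing k from these scores can look like, the sketch below finds local peaks in a silhouette curve. This is only one possible heuristic and an assumption on our side; it is not necessarily how __get_opt_k__ works. The scores are the first six values printed for the mar, 2019-05-16 survey in the log further down, rounded to four decimals.

```python
import pandas as pd

# Mean silhouette scores for k = 2..7 (mar, 2019-05-16), rounded from the log.
scores = pd.DataFrame({
    "k": [2, 3, 4, 5, 6, 7],
    "silhouette_mean": [0.7315, 0.5143, 0.5077, 0.5274, 0.4713, 0.4799],
})

# A local peak is higher than both neighbours; the two end points never
# qualify because a comparison with the missing neighbour (NaN) is False.
s = scores["silhouette_mean"]
is_local_peak = (s > s.shift(1)) & (s > s.shift(-1))
candidate_ks = scores.loc[is_local_peak, "k"].tolist()
print(candidate_ks)
```

The global maximum sits at k=2, which would lump very different surfaces together. A later local peak offers a finer partition at a modest cost in score.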

Then, with the sub-optimal k, we finally run KMeans with the __kmeans_sa__ function on all the surveys to obtain clustered points, which we use to visually discriminate between sand and non-sand in a QGIS environment.
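
As a reminder of what KMeans does with a chosen k, here is a minimal NumPy sketch of Lloyd's algorithm. It is written from scratch for illustration and is not the implementation behind __kmeans_sa__. The function name, the seed and the toy points are assumptions.

```python
import numpy as np

def kmeans(X, k, n_iter=50, seed=10):
    """Minimal Lloyd's algorithm: returns (labels, centers)."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    # Start from k distinct points picked at random.
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest center.
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # Update step: each center moves to the mean of its points
        # (an empty cluster keeps its previous center).
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

# Two well-separated groups of made-up points.
X = np.array([[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]])
labels, centers = kmeans(X, k=2)
print(labels)
```

The cluster numbers themselves are arbitrary, which is why the clustered points still need to be inspected visually to decide which clusters are sand.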

In [21]:
%%time
# Iteratively run KMeans + Silhouette Analysis (SA)
# ks=(2,30) tests k from 2 to 29, as the log below shows.

feature_set=["band1","band2","band3","slope","curve"]
sil_df=get_sil_location(data_merged,
                        ks=(2,30),
                        feature_set=feature_set,
                        random_state=10)

Working on : mar, 2019-05-16.


For n_clusters = 2 The average silhouette_score is : 0.731461942979032
For n_clusters = 3 The average silhouette_score is : 0.5142792127162619




For n_clusters = 4 The average silhouette_score is : 0.5077025845611792




For n_clusters = 5 The average silhouette_score is : 0.5274412297665048




For n_clusters = 6 The average silhouette_score is : 0.4713461918256104




For n_clusters = 7 The average silhouette_score is : 0.4799284043786106




For n_clusters = 8 The average silhouette_score is : 0.45685828962169767




For n_clusters = 9 The average silhouette_score is : 0.3946815417502011




For n_clusters = 10 The average silhouette_score is : 0.40929105540440447




For n_clusters = 11 The average silhouette_score is : 0.41393059007055466




For n_clusters = 12 The average silhouette_score is : 0.3968217123368743




For n_clusters = 13 The average silhouette_score is : 0.3859211110635064




For n_clusters = 14 The average silhouette_score is : 0.390034641319038




For n_clusters = 15 The average silhouette_score is : 0.382400415540968




For n_clusters = 16 The average silhouette_score is : 0.3828346160005395




For n_clusters = 17 The average silhouette_score is : 0.3646356463998043




For n_clusters = 18 The average silhouette_score is : 0.36277694395472443




For n_clusters = 19 The average silhouette_score is : 0.36090028633089233




For n_clusters = 20 The average silhouette_score is : 0.3648769331995712




For n_clusters = 21 The average silhouette_score is : 0.3693061976115246




For n_clusters = 22 The average silhouette_score is : 0.3383253700145481




For n_clusters = 23 The average silhouette_score is : 0.3384421114836936




For n_clusters = 24 The average silhouette_score is : 0.371693752606444




For n_clusters = 25 The average silhouette_score is : 0.3337880334315523




For n_clusters = 26 The average silhouette_score is : 0.3340330850983966




For n_clusters = 27 The average silhouette_score is : 0.33516953913193454




For n_clusters = 28 The average silhouette_score is : 0.3541249042171639




For n_clusters = 29 The average silhouette_score is : 0.34990092757699076
Working on : mar, 2019-03-13.


For n_clusters = 2 The average silhouette_score is : 0.6791905394415744
For n_clusters = 3 The average silhouette_score is : 0.49841476403754303




For n_clusters = 4 The average silhouette_score is : 0.4752185801076474




For n_clusters = 5 The average silhouette_score is : 0.44013869935799826




For n_clusters = 6 The average silhouette_score is : 0.4540780426285227




For n_clusters = 7 The average silhouette_score is : 0.45318142071053097




For n_clusters = 8 The average silhouette_score is : 0.4598371539110976




For n_clusters = 9 The average silhouette_score is : 0.4234921969570807




For n_clusters = 10 The average silhouette_score is : 0.40884904290263246




For n_clusters = 11 The average silhouette_score is : 0.41423359341080174




For n_clusters = 12 The average silhouette_score is : 0.4241818030106797




For n_clusters = 13 The average silhouette_score is : 0.41224468929513913




For n_clusters = 14 The average silhouette_score is : 0.4008607687717615




For n_clusters = 15 The average silhouette_score is : 0.3960717995055602




For n_clusters = 16 The average silhouette_score is : 0.38013581920777223




For n_clusters = 17 The average silhouette_score is : 0.3503441491992906




For n_clusters = 18 The average silhouette_score is : 0.33974006949137775




For n_clusters = 19 The average silhouette_score is : 0.3422007773504867




For n_clusters = 20 The average silhouette_score is : 0.3497316392989021




For n_clusters = 21 The average silhouette_score is : 0.3208801042694714




For n_clusters = 22 The average silhouette_score is : 0.3232509415669243




For n_clusters = 23 The average silhouette_score is : 0.3295069353221652




For n_clusters = 24 The average silhouette_score is : 0.3213606624384845




For n_clusters = 25 The average silhouette_score is : 0.3237029845636845




For n_clusters = 26 The average silhouette_score is : 0.30643045447262396




For n_clusters = 27 The average silhouette_score is : 0.3060292914684607




For n_clusters = 28 The average silhouette_score is : 0.31829352885732415




For n_clusters = 29 The average silhouette_score is : 0.30061270554926245
Working on : mar, 2019-02-05.


For n_clusters = 2 The average silhouette_score is : 0.6940242955142734




For n_clusters = 3 The average silhouette_score is : 0.5239709989510006




For n_clusters = 4 The average silhouette_score is : 0.4430964270968288




For n_clusters = 5 The average silhouette_score is : 0.43385346460990065




For n_clusters = 6 The average silhouette_score is : 0.45325871513817173




For n_clusters = 7 The average silhouette_score is : 0.43917225692211503




For n_clusters = 8 The average silhouette_score is : 0.44545408972953565




For n_clusters = 9 The average silhouette_score is : 0.42124726304816856




For n_clusters = 10 The average silhouette_score is : 0.42770475606256836




For n_clusters = 11 The average silhouette_score is : 0.4024129399008645




For n_clusters = 12 The average silhouette_score is : 0.40180570404660854




For n_clusters = 13 The average silhouette_score is : 0.39326900608285564




For n_clusters = 14 The average silhouette_score is : 0.3899442008909326




For n_clusters = 15 The average silhouette_score is : 0.3859612694810786




For n_clusters = 16 The average silhouette_score is : 0.36769425327408184




For n_clusters = 17 The average silhouette_score is : 0.34840992649757324




For n_clusters = 18 The average silhouette_score is : 0.34400525179062325




For n_clusters = 19 The average silhouette_score is : 0.3401493861693228




For n_clusters = 20 The average silhouette_score is : 0.3177990275318842




For n_clusters = 21 The average silhouette_score is : 0.34969920882614086




For n_clusters = 22 The average silhouette_score is : 0.32273728060663115




For n_clusters = 23 The average silhouette_score is : 0.34163644843347896




For n_clusters = 24 The average silhouette_score is : 0.34868035357066146




For n_clusters = 25 The average silhouette_score is : 0.3187674773761478




For n_clusters = 26 The average silhouette_score is : 0.32029165561341727




For n_clusters = 27 The average silhouette_score is : 0.3313611310354304




For n_clusters = 28 The average silhouette_score is : 0.32747055611832643




For n_clusters = 29 The average silhouette_score is : 0.3136156566779985
Working on : mar, 2018-12-11.


For n_clusters = 2 The average silhouette_score is : 0.6108416588172602
For n_clusters = 3 The average silhouette_score is : 0.5419713029819637




For n_clusters = 4 The average silhouette_score is : 0.4860668658671436




For n_clusters = 5 The average silhouette_score is : 0.44555198634590354




For n_clusters = 6 The average silhouette_score is : 0.459956845932178




For n_clusters = 7 The average silhouette_score is : 0.424196359030616




For n_clusters = 8 The average silhouette_score is : 0.43108176107404805




For n_clusters = 9 The average silhouette_score is : 0.4109677091856917




For n_clusters = 10 The average silhouette_score is : 0.4149283446264138




For n_clusters = 11 The average silhouette_score is : 0.42607841172993627




For n_clusters = 12 The average silhouette_score is : 0.403296132522953




For n_clusters = 13 The average silhouette_score is : 0.38195766646236406




For n_clusters = 14 The average silhouette_score is : 0.3645941717680797




For n_clusters = 15 The average silhouette_score is : 0.3709348900182611




For n_clusters = 16 The average silhouette_score is : 0.35213353630042654




For n_clusters = 17 The average silhouette_score is : 0.34277146541321385




For n_clusters = 18 The average silhouette_score is : 0.3331461376161049




For n_clusters = 19 The average silhouette_score is : 0.3387444898180107




For n_clusters = 20 The average silhouette_score is : 0.3306079433933054




For n_clusters = 21 The average silhouette_score is : 0.33074500500985204




For n_clusters = 22 The average silhouette_score is : 0.31018579362867194




For n_clusters = 23 The average silhouette_score is : 0.3126788289652423




For n_clusters = 24 The average silhouette_score is : 0.2946579071417924




For n_clusters = 25 The average silhouette_score is : 0.29095128281280497




For n_clusters = 26 The average silhouette_score is : 0.2897990965951312




For n_clusters = 27 The average silhouette_score is : 0.2939705527732183




For n_clusters = 28 The average silhouette_score is : 0.2839873884291382




For n_clusters = 29 The average silhouette_score is : 0.2777330442901152
Working on : mar, 2018-11-13.


For n_clusters = 2 The average silhouette_score is : 0.6363713896309675




For n_clusters = 3 The average silhouette_score is : 0.5336398118383892




For n_clusters = 4 The average silhouette_score is : 0.5284741458188553




For n_clusters = 5 The average silhouette_score is : 0.4527327099754082




For n_clusters = 6 The average silhouette_score is : 0.4685280966328017




For n_clusters = 7 The average silhouette_score is : 0.4770223756379673




For n_clusters = 8 The average silhouette_score is : 0.4590963889067767




For n_clusters = 9 The average silhouette_score is : 0.3889784724939575




For n_clusters = 10 The average silhouette_score is : 0.4012659473069437




For n_clusters = 11 The average silhouette_score is : 0.36697145264037595




For n_clusters = 12 The average silhouette_score is : 0.3559179100907277




For n_clusters = 13 The average silhouette_score is : 0.35957013054920256




For n_clusters = 14 The average silhouette_score is : 0.3638494953872532




For n_clusters = 15 The average silhouette_score is : 0.3418276440960754




For n_clusters = 16 The average silhouette_score is : 0.34231143698102806




For n_clusters = 17 The average silhouette_score is : 0.3220722151181281




For n_clusters = 18 The average silhouette_score is : 0.34123111839209025




For n_clusters = 19 The average silhouette_score is : 0.33247883890596414




For n_clusters = 20 The average silhouette_score is : 0.33263762620373644




For n_clusters = 21 The average silhouette_score is : 0.31565659102062565




For n_clusters = 22 The average silhouette_score is : 0.321854435230215




For n_clusters = 23 The average silhouette_score is : 0.31262425505604535




For n_clusters = 24 The average silhouette_score is : 0.2978652010477722




For n_clusters = 25 The average silhouette_score is : 0.30734288666967396




For n_clusters = 26 The average silhouette_score is : 0.318669657524578




For n_clusters = 27 The average silhouette_score is : 0.32637640451743666




For n_clusters = 28 The average silhouette_score is : 0.2915833578926177




For n_clusters = 29 The average silhouette_score is : 0.2972889445804515
Working on : mar, 2018-09-25.


For n_clusters = 2 The average silhouette_score is : 0.6517256937555473




For n_clusters = 3 The average silhouette_score is : 0.5079537663245399




For n_clusters = 4 The average silhouette_score is : 0.467757704070288




For n_clusters = 5 The average silhouette_score is : 0.42238754591437117




For n_clusters = 6 The average silhouette_score is : 0.43680208152158984




For n_clusters = 7 The average silhouette_score is : 0.4460457095386317




For n_clusters = 8 The average silhouette_score is : 0.4208086511946568




For n_clusters = 9 The average silhouette_score is : 0.39604551595888077




For n_clusters = 10 The average silhouette_score is : 0.3782140430279348




For n_clusters = 11 The average silhouette_score is : 0.371624500075546




For n_clusters = 12 The average silhouette_score is : 0.364102888771415




For n_clusters = 13 The average silhouette_score is : 0.3589711050670307




For n_clusters = 14 The average silhouette_score is : 0.36560696324582365




For n_clusters = 15 The average silhouette_score is : 0.35382113646282626




For n_clusters = 16 The average silhouette_score is : 0.37061575759594406




For n_clusters = 17 The average silhouette_score is : 0.3436344599237542




For n_clusters = 18 The average silhouette_score is : 0.365870903095111




For n_clusters = 19 The average silhouette_score is : 0.3202207024647968




For n_clusters = 20 The average silhouette_score is : 0.3457127620822661




For n_clusters = 21 The average silhouette_score is : 0.338628631267582




For n_clusters = 22 The average silhouette_score is : 0.30696119996948107




For n_clusters = 23 The average silhouette_score is : 0.3219392479374696




For n_clusters = 24 The average silhouette_score is : 0.30182452162031215




For n_clusters = 25 The average silhouette_score is : 0.3139427176719299




For n_clusters = 26 The average silhouette_score is : 0.31205201771378




For n_clusters = 27 The average silhouette_score is : 0.3120801510833247




For n_clusters = 28 The average silhouette_score is : 0.31233098540770693




For n_clusters = 29 The average silhouette_score is : 0.30883158591112325
Working on : mar, 2018-07-27.


For n_clusters = 2 The average silhouette_score is : 0.5587531541250162




For n_clusters = 3 The average silhouette_score is : 0.4748287725000199




For n_clusters = 4 The average silhouette_score is : 0.40370889304215346




For n_clusters = 5 The average silhouette_score is : 0.3726803329031655




For n_clusters = 6 The average silhouette_score is : 0.3839181114173076




For n_clusters = 7 The average silhouette_score is : 0.34054897905039433




For n_clusters = 8 The average silhouette_score is : 0.35269448872318443




For n_clusters = 9 The average silhouette_score is : 0.32463787316550774




For n_clusters = 10 The average silhouette_score is : 0.31054631499334096




For n_clusters = 11 The average silhouette_score is : 0.32505587025177385




For n_clusters = 12 The average silhouette_score is : 0.2989340206742518




For n_clusters = 13 The average silhouette_score is : 0.3134752733098823




For n_clusters = 14 The average silhouette_score is : 0.32256132076496324




For n_clusters = 15 The average silhouette_score is : 0.3271791207797847




For n_clusters = 16 The average silhouette_score is : 0.331359528927845




For n_clusters = 17 The average silhouette_score is : 0.3270684015726018




For n_clusters = 18 The average silhouette_score is : 0.32566805504533614




For n_clusters = 19 The average silhouette_score is : 0.31995668830240714




For n_clusters = 20 The average silhouette_score is : 0.3111590140444867




For n_clusters = 21 The average silhouette_score is : 0.302243256038968




For n_clusters = 22 The average silhouette_score is : 0.29313042058062383




For n_clusters = 23 The average silhouette_score is : 0.3168243495906858




For n_clusters = 24 The average silhouette_score is : 0.3012960601175711




For n_clusters = 25 The average silhouette_score is : 0.3065296710015245




For n_clusters = 26 The average silhouette_score is : 0.31261364488247917




For n_clusters = 27 The average silhouette_score is : 0.3073885159440493




For n_clusters = 28 The average silhouette_score is : 0.2971823889978033




For n_clusters = 29 The average silhouette_score is : 0.30395585776964756
Working on : mar, 2018-06-21.


For n_clusters = 2 The average silhouette_score is : 0.6361167136866437




For n_clusters = 3 The average silhouette_score is : 0.4780306786492281




For n_clusters = 4 The average silhouette_score is : 0.46839187275265437




For n_clusters = 5 The average silhouette_score is : 0.40440930502398553




For n_clusters = 6 The average silhouette_score is : 0.41943165860711723




For n_clusters = 7 The average silhouette_score is : 0.4312599680141288




For n_clusters = 8 The average silhouette_score is : 0.41556007499687214




For n_clusters = 9 The average silhouette_score is : 0.3895232047904907




For n_clusters = 10 The average silhouette_score is : 0.38017535831603483




For n_clusters = 11 The average silhouette_score is : 0.3803461267879127




For n_clusters = 12 The average silhouette_score is : 0.3752902650395718




For n_clusters = 13 The average silhouette_score is : 0.3896031799023281




For n_clusters = 14 The average silhouette_score is : 0.3638512012325654




For n_clusters = 15 The average silhouette_score is : 0.36348066252859407




For n_clusters = 16 The average silhouette_score is : 0.3713269014778313




For n_clusters = 17 The average silhouette_score is : 0.37290957758176146




For n_clusters = 18 The average silhouette_score is : 0.3875134083481261




For n_clusters = 19 The average silhouette_score is : 0.35644707823736643




For n_clusters = 20 The average silhouette_score is : 0.3500920918399626




For n_clusters = 21 The average silhouette_score is : 0.35086933242343815




For n_clusters = 22 The average silhouette_score is : 0.37215020425847156




For n_clusters = 23 The average silhouette_score is : 0.35897673337451824




For n_clusters = 24 The average silhouette_score is : 0.34073816882336294




For n_clusters = 25 The average silhouette_score is : 0.35389634934657516




For n_clusters = 26 The average silhouette_score is : 0.34199710833225655




For n_clusters = 27 The average silhouette_score is : 0.3471947703800534




For n_clusters = 28 The average silhouette_score is : 0.3605106960748425




For n_clusters = 29 The average silhouette_score is : 0.3275087532624176
Working on : mar, 2018-06-01.


For n_clusters = 2 The average silhouette_score is : 0.5585637619456908




For n_clusters = 3 The average silhouette_score is : 0.41699987977003644




For n_clusters = 4 The average silhouette_score is : 0.3919522769723395




For n_clusters = 5 The average silhouette_score is : 0.3726608044839666




For n_clusters = 6 The average silhouette_score is : 0.3878234638879123




For n_clusters = 7 The average silhouette_score is : 0.3459648390983824




For n_clusters = 8 The average silhouette_score is : 0.3594515476879077




For n_clusters = 9 The average silhouette_score is : 0.35284813670049286




For n_clusters = 10 The average silhouette_score is : 0.36778046246961693




For n_clusters = 11 The average silhouette_score is : 0.3288487626100666




For n_clusters = 12 The average silhouette_score is : 0.3572589742123086




For n_clusters = 13 The average silhouette_score is : 0.33546974718856876




For n_clusters = 14 The average silhouette_score is : 0.3337076310429198




For n_clusters = 15 The average silhouette_score is : 0.31828935771640354




For n_clusters = 16 The average silhouette_score is : 0.3236216283237657




For n_clusters = 17 The average silhouette_score is : 0.2971417387683045




For n_clusters = 18 The average silhouette_score is : 0.2883734785645883




For n_clusters = 19 The average silhouette_score is : 0.3115588267407949




For n_clusters = 20 The average silhouette_score is : 0.315528040389246




For n_clusters = 21 The average silhouette_score is : 0.3119907228689671




For n_clusters = 22 The average silhouette_score is : 0.3107533475988858




For n_clusters = 23 The average silhouette_score is : 0.3038503191341936




For n_clusters = 24 The average silhouette_score is : 0.30791888640119314




For n_clusters = 25 The average silhouette_score is : 0.3126624307865237




For n_clusters = 26 The average silhouette_score is : 0.27959303142556535




For n_clusters = 27 The average silhouette_score is : 0.29727187173525954




For n_clusters = 28 The average silhouette_score is : 0.2927804961459732




For n_clusters = 29 The average silhouette_score is : 0.2855442413109075


Working on : leo, 2020-11-06.


For n_clusters = 2 The average silhouette_score is : 0.5281888866526274




For n_clusters = 3 The average silhouette_score is : 0.583647285326557




For n_clusters = 4 The average silhouette_score is : 0.5147874673792354




For n_clusters = 5 The average silhouette_score is : 0.43250457575344126




For n_clusters = 6 The average silhouette_score is : 0.44307039110523194




For n_clusters = 7 The average silhouette_score is : 0.3911035799116992




For n_clusters = 8 The average silhouette_score is : 0.3834770322129325




For n_clusters = 9 The average silhouette_score is : 0.37236817337246775




For n_clusters = 10 The average silhouette_score is : 0.35654127132247393




For n_clusters = 11 The average silhouette_score is : 0.335030641257081




For n_clusters = 12 The average silhouette_score is : 0.3294876738612691




For n_clusters = 13 The average silhouette_score is : 0.33096363509131427




For n_clusters = 14 The average silhouette_score is : 0.300346268597407




For n_clusters = 15 The average silhouette_score is : 0.3032479687167359




For n_clusters = 16 The average silhouette_score is : 0.309823559427849




For n_clusters = 17 The average silhouette_score is : 0.3038900751287074




For n_clusters = 18 The average silhouette_score is : 0.3085855847670872




For n_clusters = 19 The average silhouette_score is : 0.305928851952988




For n_clusters = 20 The average silhouette_score is : 0.30228540626644756




For n_clusters = 21 The average silhouette_score is : 0.2981397229617392




For n_clusters = 22 The average silhouette_score is : 0.30216461554763374




For n_clusters = 23 The average silhouette_score is : 0.2986735738604957




For n_clusters = 24 The average silhouette_score is : 0.2877421681457221




For n_clusters = 25 The average silhouette_score is : 0.30231234881722613




For n_clusters = 26 The average silhouette_score is : 0.2881464914954439




For n_clusters = 27 The average silhouette_score is : 0.2964305687456459




For n_clusters = 28 The average silhouette_score is : 0.3001275118998849




For n_clusters = 29 The average silhouette_score is : 0.29766518633413896
Working on : leo, 2020-09-21.


For n_clusters = 2 The average silhouette_score is : 0.6002337257999919




For n_clusters = 3 The average silhouette_score is : 0.5330325782425519




For n_clusters = 4 The average silhouette_score is : 0.4690166515240728




For n_clusters = 5 The average silhouette_score is : 0.44309237058807327




For n_clusters = 6 The average silhouette_score is : 0.41967906254737986




For n_clusters = 7 The average silhouette_score is : 0.40092269479332937




For n_clusters = 8 The average silhouette_score is : 0.3723502162587356




For n_clusters = 9 The average silhouette_score is : 0.35308494740087615




For n_clusters = 10 The average silhouette_score is : 0.3409258850625948




For n_clusters = 11 The average silhouette_score is : 0.3546310443684818




For n_clusters = 12 The average silhouette_score is : 0.3330961584609218




For n_clusters = 13 The average silhouette_score is : 0.3152016895600483




For n_clusters = 14 The average silhouette_score is : 0.315558479793675




For n_clusters = 15 The average silhouette_score is : 0.3147982782174297




For n_clusters = 16 The average silhouette_score is : 0.3040679789929803




For n_clusters = 17 The average silhouette_score is : 0.31180315568775235




For n_clusters = 18 The average silhouette_score is : 0.299573351208705




For n_clusters = 19 The average silhouette_score is : 0.3021301982567003




For n_clusters = 20 The average silhouette_score is : 0.2993547807868097




For n_clusters = 21 The average silhouette_score is : 0.28591189754765356




For n_clusters = 22 The average silhouette_score is : 0.2940424948400002




For n_clusters = 23 The average silhouette_score is : 0.2899186349038026




For n_clusters = 24 The average silhouette_score is : 0.2910216214958964




For n_clusters = 25 The average silhouette_score is : 0.2913301244672575




For n_clusters = 26 The average silhouette_score is : 0.29415781414695047




For n_clusters = 27 The average silhouette_score is : 0.27416041882644065




For n_clusters = 28 The average silhouette_score is : 0.2812486509964313




For n_clusters = 29 The average silhouette_score is : 0.26717619784378055
Working on : leo, 2020-08-10.


For n_clusters = 2 The average silhouette_score is : 0.6424615514718051




For n_clusters = 3 The average silhouette_score is : 0.5551294704809917




For n_clusters = 4 The average silhouette_score is : 0.4893363302661409




For n_clusters = 5 The average silhouette_score is : 0.46021068888406463




For n_clusters = 6 The average silhouette_score is : 0.45487074307773157




For n_clusters = 7 The average silhouette_score is : 0.41620399357682386




For n_clusters = 8 The average silhouette_score is : 0.39933022688138126




For n_clusters = 9 The average silhouette_score is : 0.3799870375344405




For n_clusters = 10 The average silhouette_score is : 0.35510693809459515




For n_clusters = 11 The average silhouette_score is : 0.36521928867017867




For n_clusters = 12 The average silhouette_score is : 0.34763040529496575




For n_clusters = 13 The average silhouette_score is : 0.3665904710115509




For n_clusters = 14 The average silhouette_score is : 0.3753425556535115




For n_clusters = 15 The average silhouette_score is : 0.36015231565890415




For n_clusters = 16 The average silhouette_score is : 0.3543931399629676




For n_clusters = 17 The average silhouette_score is : 0.3460847054264392




For n_clusters = 18 The average silhouette_score is : 0.3329816435221996




For n_clusters = 19 The average silhouette_score is : 0.3303880434770365




For n_clusters = 20 The average silhouette_score is : 0.34077378945141407




For n_clusters = 21 The average silhouette_score is : 0.3327730909664091




For n_clusters = 22 The average silhouette_score is : 0.33387882814394093




For n_clusters = 23 The average silhouette_score is : 0.3116816602778455




For n_clusters = 24 The average silhouette_score is : 0.30465029607568317




For n_clusters = 25 The average silhouette_score is : 0.34825245647886793




For n_clusters = 26 The average silhouette_score is : 0.333459120331639




For n_clusters = 27 The average silhouette_score is : 0.32031626724746987




For n_clusters = 28 The average silhouette_score is : 0.31420814173694456




For n_clusters = 29 The average silhouette_score is : 0.3170568188445575
Working on : leo, 2020-06-24.


For n_clusters = 2 The average silhouette_score is : 0.6423585339740709
For n_clusters = 3 The average silhouette_score is : 0.49164113788923774




For n_clusters = 4 The average silhouette_score is : 0.48426553563891617




For n_clusters = 5 The average silhouette_score is : 0.43141538223333625




For n_clusters = 6 The average silhouette_score is : 0.4131510251570278




For n_clusters = 7 The average silhouette_score is : 0.39320830939301743




For n_clusters = 8 The average silhouette_score is : 0.3628436397196079




For n_clusters = 9 The average silhouette_score is : 0.3435546384715059




For n_clusters = 10 The average silhouette_score is : 0.3714760636306107




For n_clusters = 11 The average silhouette_score is : 0.3543083018093621




For n_clusters = 12 The average silhouette_score is : 0.3531740674981181




For n_clusters = 13 The average silhouette_score is : 0.3451863021374112




For n_clusters = 14 The average silhouette_score is : 0.3487865131239066




For n_clusters = 15 The average silhouette_score is : 0.34397630403870755




For n_clusters = 16 The average silhouette_score is : 0.33278332029629637




For n_clusters = 17 The average silhouette_score is : 0.34207314549220774




For n_clusters = 18 The average silhouette_score is : 0.34105328664070333




For n_clusters = 19 The average silhouette_score is : 0.32973024407754076




For n_clusters = 20 The average silhouette_score is : 0.33625609750939633




For n_clusters = 21 The average silhouette_score is : 0.3220172705079443




For n_clusters = 22 The average silhouette_score is : 0.33165242127208455




For n_clusters = 23 The average silhouette_score is : 0.33356094511093876




For n_clusters = 24 The average silhouette_score is : 0.31908331122901606




For n_clusters = 25 The average silhouette_score is : 0.323710769001456




For n_clusters = 26 The average silhouette_score is : 0.31449432684245354




For n_clusters = 27 The average silhouette_score is : 0.3196336859864677




For n_clusters = 28 The average silhouette_score is : 0.31549818997468504




For n_clusters = 29 The average silhouette_score is : 0.30318599758600995
Working on : leo, 2020-05-11.


  0%|          | 0/28 [00:00<?, ?it/s]



For n_clusters = 2 The average silhouette_score is : 0.6277815770641756




For n_clusters = 3 The average silhouette_score is : 0.482682703307032




For n_clusters = 4 The average silhouette_score is : 0.45707840340009814




For n_clusters = 5 The average silhouette_score is : 0.415119017923859




For n_clusters = 6 The average silhouette_score is : 0.37278912517066054




For n_clusters = 7 The average silhouette_score is : 0.34524992004746774




For n_clusters = 8 The average silhouette_score is : 0.3281043226374901




For n_clusters = 9 The average silhouette_score is : 0.3367613396821807




For n_clusters = 10 The average silhouette_score is : 0.34405107657756523




For n_clusters = 11 The average silhouette_score is : 0.3297098697160012




For n_clusters = 12 The average silhouette_score is : 0.310911016857495




For n_clusters = 13 The average silhouette_score is : 0.31633321949192517




For n_clusters = 14 The average silhouette_score is : 0.31121047641521626




For n_clusters = 15 The average silhouette_score is : 0.32001584193160937




For n_clusters = 16 The average silhouette_score is : 0.3140766602279361




For n_clusters = 17 The average silhouette_score is : 0.30185535932906093




For n_clusters = 18 The average silhouette_score is : 0.2845093530788891




For n_clusters = 19 The average silhouette_score is : 0.30159498703725135




For n_clusters = 20 The average silhouette_score is : 0.3011391860934951




For n_clusters = 21 The average silhouette_score is : 0.29380015194439346




For n_clusters = 22 The average silhouette_score is : 0.27550369005305547




For n_clusters = 23 The average silhouette_score is : 0.2897157757320046




For n_clusters = 24 The average silhouette_score is : 0.2807542425332122




For n_clusters = 25 The average silhouette_score is : 0.2790624487953439




For n_clusters = 26 The average silhouette_score is : 0.27442507342260025




For n_clusters = 27 The average silhouette_score is : 0.272744766172005




For n_clusters = 28 The average silhouette_score is : 0.26447468035341487




For n_clusters = 29 The average silhouette_score is : 0.2777284236332617
Working on : leo, 2020-02-21.


  0%|          | 0/28 [00:00<?, ?it/s]



For n_clusters = 2 The average silhouette_score is : 0.6033660130334083
For n_clusters = 3 The average silhouette_score is : 0.5097792598234936




For n_clusters = 4 The average silhouette_score is : 0.48138322833974856




For n_clusters = 5 The average silhouette_score is : 0.430544613930021




For n_clusters = 6 The average silhouette_score is : 0.4063596892123566




For n_clusters = 7 The average silhouette_score is : 0.38273907073956026




For n_clusters = 8 The average silhouette_score is : 0.3560208788791157




For n_clusters = 9 The average silhouette_score is : 0.3351538758613634




For n_clusters = 10 The average silhouette_score is : 0.32261076315904025




For n_clusters = 11 The average silhouette_score is : 0.3264000623801154




For n_clusters = 12 The average silhouette_score is : 0.3041757930405521




For n_clusters = 13 The average silhouette_score is : 0.31012706780614085




For n_clusters = 14 The average silhouette_score is : 0.31233485675549233




For n_clusters = 15 The average silhouette_score is : 0.3151677990306041




For n_clusters = 16 The average silhouette_score is : 0.306284407577793




For n_clusters = 17 The average silhouette_score is : 0.2917435823338012




For n_clusters = 18 The average silhouette_score is : 0.31017987927938206




For n_clusters = 19 The average silhouette_score is : 0.32711332720673414




For n_clusters = 20 The average silhouette_score is : 0.3019801402615737




For n_clusters = 21 The average silhouette_score is : 0.2768797048668849




For n_clusters = 22 The average silhouette_score is : 0.269840650197827




For n_clusters = 23 The average silhouette_score is : 0.2740380168877466




For n_clusters = 24 The average silhouette_score is : 0.2810765623923193




For n_clusters = 25 The average silhouette_score is : 0.29353411969727417




For n_clusters = 26 The average silhouette_score is : 0.27601478641527505




For n_clusters = 27 The average silhouette_score is : 0.27143420713873156




For n_clusters = 28 The average silhouette_score is : 0.2997816693250246




For n_clusters = 29 The average silhouette_score is : 0.27956282792004944
Working on : leo, 2019-12-12.


  0%|          | 0/28 [00:00<?, ?it/s]



For n_clusters = 2 The average silhouette_score is : 0.5963672915386007
For n_clusters = 3 The average silhouette_score is : 0.5237539149622122




For n_clusters = 4 The average silhouette_score is : 0.472974759293193
For n_clusters = 5 The average silhouette_score is : 0.44702476329392515




For n_clusters = 6 The average silhouette_score is : 0.41614256700553




For n_clusters = 7 The average silhouette_score is : 0.3863879954007865




For n_clusters = 8 The average silhouette_score is : 0.36883538953654854




For n_clusters = 9 The average silhouette_score is : 0.3440673666592822




For n_clusters = 10 The average silhouette_score is : 0.35287436501825864




For n_clusters = 11 The average silhouette_score is : 0.35796158048511767




For n_clusters = 12 The average silhouette_score is : 0.3463030974259655




For n_clusters = 13 The average silhouette_score is : 0.33657522551136904




For n_clusters = 14 The average silhouette_score is : 0.3234971051440392




For n_clusters = 15 The average silhouette_score is : 0.3501619695163068




For n_clusters = 16 The average silhouette_score is : 0.3420995044355819




For n_clusters = 17 The average silhouette_score is : 0.32110424828922735




For n_clusters = 18 The average silhouette_score is : 0.3367305390632264




For n_clusters = 19 The average silhouette_score is : 0.3313949172711




For n_clusters = 20 The average silhouette_score is : 0.3041310183148518




For n_clusters = 21 The average silhouette_score is : 0.31561511930383085




For n_clusters = 22 The average silhouette_score is : 0.3207100305337192




For n_clusters = 23 The average silhouette_score is : 0.3155750687735596




For n_clusters = 24 The average silhouette_score is : 0.3117846946944759




For n_clusters = 25 The average silhouette_score is : 0.3198549429667813




For n_clusters = 26 The average silhouette_score is : 0.3164006342519225




For n_clusters = 27 The average silhouette_score is : 0.31875287230426685




For n_clusters = 28 The average silhouette_score is : 0.31175020942633686




For n_clusters = 29 The average silhouette_score is : 0.3133499814200095
Working on : leo, 2019-10-14.


  0%|          | 0/28 [00:00<?, ?it/s]



For n_clusters = 2 The average silhouette_score is : 0.6390901517920431
For n_clusters = 3 The average silhouette_score is : 0.4968633323400854




For n_clusters = 4 The average silhouette_score is : 0.4634311387002372




For n_clusters = 5 The average silhouette_score is : 0.43449805325966123




For n_clusters = 6 The average silhouette_score is : 0.3942281967815132




For n_clusters = 7 The average silhouette_score is : 0.36281804790344324




For n_clusters = 8 The average silhouette_score is : 0.34088214280637




For n_clusters = 9 The average silhouette_score is : 0.32150369645710486




For n_clusters = 10 The average silhouette_score is : 0.32173130169494485




For n_clusters = 11 The average silhouette_score is : 0.30295354145778247




For n_clusters = 12 The average silhouette_score is : 0.27957654340859056




For n_clusters = 13 The average silhouette_score is : 0.2997606824543993




For n_clusters = 14 The average silhouette_score is : 0.29134989888533264




For n_clusters = 15 The average silhouette_score is : 0.28977029124702297




For n_clusters = 16 The average silhouette_score is : 0.29975584566461455




For n_clusters = 17 The average silhouette_score is : 0.2842843547828882




For n_clusters = 18 The average silhouette_score is : 0.29243967395625986




For n_clusters = 19 The average silhouette_score is : 0.2846816646377789




For n_clusters = 20 The average silhouette_score is : 0.27371474476541807




For n_clusters = 21 The average silhouette_score is : 0.2627985885196309




For n_clusters = 22 The average silhouette_score is : 0.27920491252541463




For n_clusters = 23 The average silhouette_score is : 0.2754687830791936




For n_clusters = 24 The average silhouette_score is : 0.2848475541610871




For n_clusters = 25 The average silhouette_score is : 0.2731800939876091




For n_clusters = 26 The average silhouette_score is : 0.2700557491732087




For n_clusters = 27 The average silhouette_score is : 0.289710136073938




For n_clusters = 28 The average silhouette_score is : 0.263613957995337




For n_clusters = 29 The average silhouette_score is : 0.27711805922802824
Working on : leo, 2019-09-03.


  0%|          | 0/28 [00:00<?, ?it/s]



For n_clusters = 2 The average silhouette_score is : 0.5491710652814255




For n_clusters = 3 The average silhouette_score is : 0.5285102981933815




For n_clusters = 4 The average silhouette_score is : 0.5012598682847649




For n_clusters = 5 The average silhouette_score is : 0.4649852572133777




For n_clusters = 6 The average silhouette_score is : 0.4431165965007349




For n_clusters = 7 The average silhouette_score is : 0.41527486573861017




For n_clusters = 8 The average silhouette_score is : 0.4093504469056676




For n_clusters = 9 The average silhouette_score is : 0.3813101301843531




For n_clusters = 10 The average silhouette_score is : 0.37322406487135856




For n_clusters = 11 The average silhouette_score is : 0.3564787126869359




For n_clusters = 12 The average silhouette_score is : 0.35840688214599115




For n_clusters = 13 The average silhouette_score is : 0.33568000514078294




For n_clusters = 14 The average silhouette_score is : 0.3450676583399238




For n_clusters = 15 The average silhouette_score is : 0.327117063507465




For n_clusters = 16 The average silhouette_score is : 0.30405263161441615




For n_clusters = 17 The average silhouette_score is : 0.2944709811158382




For n_clusters = 18 The average silhouette_score is : 0.30050499655538304




For n_clusters = 19 The average silhouette_score is : 0.28707547500301484




For n_clusters = 20 The average silhouette_score is : 0.30517461267831303




For n_clusters = 21 The average silhouette_score is : 0.3044969305875857




For n_clusters = 22 The average silhouette_score is : 0.2851280037347084




For n_clusters = 23 The average silhouette_score is : 0.2864811871913009




For n_clusters = 24 The average silhouette_score is : 0.27994582503501414




For n_clusters = 25 The average silhouette_score is : 0.288594462298213




For n_clusters = 26 The average silhouette_score is : 0.2862990059246232




For n_clusters = 27 The average silhouette_score is : 0.2933326141652882




For n_clusters = 28 The average silhouette_score is : 0.29242467341127143




For n_clusters = 29 The average silhouette_score is : 0.2862911816358023
Working on : leo, 2019-07-31.


  0%|          | 0/28 [00:00<?, ?it/s]



For n_clusters = 2 The average silhouette_score is : 0.5387102911483821




For n_clusters = 3 The average silhouette_score is : 0.5067259443899318




For n_clusters = 4 The average silhouette_score is : 0.49692567472639376




For n_clusters = 5 The average silhouette_score is : 0.455180093098948




For n_clusters = 6 The average silhouette_score is : 0.42533482646695875




For n_clusters = 7 The average silhouette_score is : 0.41637322113739494




For n_clusters = 8 The average silhouette_score is : 0.3892584885309799




For n_clusters = 9 The average silhouette_score is : 0.3592078242086822




For n_clusters = 10 The average silhouette_score is : 0.3430401636334895




For n_clusters = 11 The average silhouette_score is : 0.3457624583882739




For n_clusters = 12 The average silhouette_score is : 0.32608027973135306




For n_clusters = 13 The average silhouette_score is : 0.3124354691490248




For n_clusters = 14 The average silhouette_score is : 0.29943356769533547




For n_clusters = 15 The average silhouette_score is : 0.31598413720128504




For n_clusters = 16 The average silhouette_score is : 0.3043481780688884




For n_clusters = 17 The average silhouette_score is : 0.306108849885236




For n_clusters = 18 The average silhouette_score is : 0.2950775031889356




For n_clusters = 19 The average silhouette_score is : 0.2954441003806107




For n_clusters = 20 The average silhouette_score is : 0.29309668235536723




For n_clusters = 21 The average silhouette_score is : 0.28165761288767055




For n_clusters = 22 The average silhouette_score is : 0.29029765223963666




For n_clusters = 23 The average silhouette_score is : 0.2747679459682494




For n_clusters = 24 The average silhouette_score is : 0.2790683226193257




For n_clusters = 25 The average silhouette_score is : 0.2886054337054611




For n_clusters = 26 The average silhouette_score is : 0.2987021620339955




For n_clusters = 27 The average silhouette_score is : 0.2763763730037475




For n_clusters = 28 The average silhouette_score is : 0.28820998319120744




For n_clusters = 29 The average silhouette_score is : 0.28578537754085864
Working on : leo, 2019-03-28.


  0%|          | 0/28 [00:00<?, ?it/s]



For n_clusters = 2 The average silhouette_score is : 0.5566238306101189




For n_clusters = 3 The average silhouette_score is : 0.5129397132754796




For n_clusters = 4 The average silhouette_score is : 0.5011183756754416




For n_clusters = 5 The average silhouette_score is : 0.4695147346931049




For n_clusters = 6 The average silhouette_score is : 0.4404541369749108




For n_clusters = 7 The average silhouette_score is : 0.42104989110808977




For n_clusters = 8 The average silhouette_score is : 0.3902017994049036




For n_clusters = 9 The average silhouette_score is : 0.373118227818357




For n_clusters = 10 The average silhouette_score is : 0.35175835982937576




For n_clusters = 11 The average silhouette_score is : 0.3570681523502617




For n_clusters = 12 The average silhouette_score is : 0.3348599724459093




For n_clusters = 13 The average silhouette_score is : 0.3254540512127179




For n_clusters = 14 The average silhouette_score is : 0.32813816227533615




For n_clusters = 15 The average silhouette_score is : 0.3090555098251298




For n_clusters = 16 The average silhouette_score is : 0.31288753068975095




For n_clusters = 17 The average silhouette_score is : 0.3168704079325131




For n_clusters = 18 The average silhouette_score is : 0.2888559836832245




For n_clusters = 19 The average silhouette_score is : 0.3052261288613432




For n_clusters = 20 The average silhouette_score is : 0.2933163214311773




For n_clusters = 21 The average silhouette_score is : 0.2833948509913895




For n_clusters = 22 The average silhouette_score is : 0.3027853579395268




For n_clusters = 23 The average silhouette_score is : 0.28170208708640726




For n_clusters = 24 The average silhouette_score is : 0.2829690693552958




For n_clusters = 25 The average silhouette_score is : 0.2980004746302383




For n_clusters = 26 The average silhouette_score is : 0.2805132029296034




For n_clusters = 27 The average silhouette_score is : 0.27546716247968395




For n_clusters = 28 The average silhouette_score is : 0.28325114938685036




For n_clusters = 29 The average silhouette_score is : 0.2863616501981182
Working on : leo, 2019-02-11.


  0%|          | 0/28 [00:00<?, ?it/s]



For n_clusters = 2 The average silhouette_score is : 0.5195536381114592




For n_clusters = 3 The average silhouette_score is : 0.5215055453360141




For n_clusters = 4 The average silhouette_score is : 0.4582780958111023




For n_clusters = 5 The average silhouette_score is : 0.4199476084621303




For n_clusters = 6 The average silhouette_score is : 0.40772601730005775




For n_clusters = 7 The average silhouette_score is : 0.3729176587214893




For n_clusters = 8 The average silhouette_score is : 0.3555787917102887




For n_clusters = 9 The average silhouette_score is : 0.33416389003652414




For n_clusters = 10 The average silhouette_score is : 0.31311906102205406




For n_clusters = 11 The average silhouette_score is : 0.3182990183091331




For n_clusters = 12 The average silhouette_score is : 0.3181840165006444




For n_clusters = 13 The average silhouette_score is : 0.30359359196208713




For n_clusters = 14 The average silhouette_score is : 0.28830940608721467




For n_clusters = 15 The average silhouette_score is : 0.2954170709592047




For n_clusters = 16 The average silhouette_score is : 0.2932966171112718




For n_clusters = 17 The average silhouette_score is : 0.29279308540671706




For n_clusters = 18 The average silhouette_score is : 0.2884013403295577




For n_clusters = 19 The average silhouette_score is : 0.28865476929908




For n_clusters = 20 The average silhouette_score is : 0.28455383917431637




For n_clusters = 21 The average silhouette_score is : 0.28090921802029806




For n_clusters = 22 The average silhouette_score is : 0.28247349297085433




For n_clusters = 23 The average silhouette_score is : 0.2792943729354272




For n_clusters = 24 The average silhouette_score is : 0.2819960391253674




For n_clusters = 25 The average silhouette_score is : 0.28065368344111197




For n_clusters = 26 The average silhouette_score is : 0.2782383792205279




For n_clusters = 27 The average silhouette_score is : 0.2681191117530501




For n_clusters = 28 The average silhouette_score is : 0.28255081370519686




For n_clusters = 29 The average silhouette_score is : 0.28121923747093674
Working on : leo, 2018-09-20.


  0%|          | 0/28 [00:00<?, ?it/s]



For n_clusters = 2 The average silhouette_score is : 0.5299524580111263




For n_clusters = 3 The average silhouette_score is : 0.490459280854898




For n_clusters = 4 The average silhouette_score is : 0.47944202116910706




For n_clusters = 5 The average silhouette_score is : 0.471284471952345




For n_clusters = 6 The average silhouette_score is : 0.4377472430041456




For n_clusters = 7 The average silhouette_score is : 0.4100322590823178




For n_clusters = 8 The average silhouette_score is : 0.38192795751184705




For n_clusters = 9 The average silhouette_score is : 0.3546917715412201




For n_clusters = 10 The average silhouette_score is : 0.3387521522024018




For n_clusters = 11 The average silhouette_score is : 0.314083100063878




For n_clusters = 12 The average silhouette_score is : 0.3213061119797772




For n_clusters = 13 The average silhouette_score is : 0.28316990407620357




For n_clusters = 14 The average silhouette_score is : 0.3020098058174265




For n_clusters = 15 The average silhouette_score is : 0.2941653845752277




For n_clusters = 16 The average silhouette_score is : 0.28880360843955755




For n_clusters = 17 The average silhouette_score is : 0.2954654816023097




For n_clusters = 18 The average silhouette_score is : 0.29556972093396766




For n_clusters = 19 The average silhouette_score is : 0.2967782767221509




For n_clusters = 20 The average silhouette_score is : 0.2993703725424944




For n_clusters = 21 The average silhouette_score is : 0.29004331637000524




For n_clusters = 22 The average silhouette_score is : 0.2980657807979899




For n_clusters = 23 The average silhouette_score is : 0.28613508502985463




For n_clusters = 24 The average silhouette_score is : 0.28177044081700187




For n_clusters = 25 The average silhouette_score is : 0.2816500183576445




For n_clusters = 26 The average silhouette_score is : 0.2848146861979962




For n_clusters = 27 The average silhouette_score is : 0.27730595213049886




For n_clusters = 28 The average silhouette_score is : 0.28103962449512837




For n_clusters = 29 The average silhouette_score is : 0.27756991078408816
Working on : leo, 2018-07-13.


  0%|          | 0/28 [00:00<?, ?it/s]



For n_clusters = 2 The average silhouette_score is : 0.5685790048466635




For n_clusters = 3 The average silhouette_score is : 0.5289077922574459




For n_clusters = 4 The average silhouette_score is : 0.49688043723715736




For n_clusters = 5 The average silhouette_score is : 0.4631675052295303




For n_clusters = 6 The average silhouette_score is : 0.425288303299642




For n_clusters = 7 The average silhouette_score is : 0.3985852073681967




For n_clusters = 8 The average silhouette_score is : 0.3770236563476342




For n_clusters = 9 The average silhouette_score is : 0.3587237050944354




For n_clusters = 10 The average silhouette_score is : 0.3650788303377572




For n_clusters = 11 The average silhouette_score is : 0.35249142905572894




For n_clusters = 12 The average silhouette_score is : 0.3616520744101613




For n_clusters = 13 The average silhouette_score is : 0.360198217740529




For n_clusters = 14 The average silhouette_score is : 0.3558372459143482




For n_clusters = 15 The average silhouette_score is : 0.339741389806201




For n_clusters = 16 The average silhouette_score is : 0.3340459057995359




For n_clusters = 17 The average silhouette_score is : 0.3471742216995145




For n_clusters = 18 The average silhouette_score is : 0.34817945126544986




For n_clusters = 19 The average silhouette_score is : 0.34532902337236043




For n_clusters = 20 The average silhouette_score is : 0.3335447467961286




For n_clusters = 21 The average silhouette_score is : 0.3311715433788587




For n_clusters = 22 The average silhouette_score is : 0.3333323438407203




For n_clusters = 23 The average silhouette_score is : 0.31726247506808053




For n_clusters = 24 The average silhouette_score is : 0.3131239897451784




For n_clusters = 25 The average silhouette_score is : 0.29716890521908673




For n_clusters = 26 The average silhouette_score is : 0.31661096432660857




For n_clusters = 27 The average silhouette_score is : 0.31571259687916314




For n_clusters = 28 The average silhouette_score is : 0.3053839630336943




For n_clusters = 29 The average silhouette_score is : 0.3130800249800227
Working on : leo, 2018-06-06.


  0%|          | 0/28 [00:00<?, ?it/s]



For n_clusters = 2 The average silhouette_score is : 0.4870232656381958




For n_clusters = 3 The average silhouette_score is : 0.5112746196203316




For n_clusters = 4 The average silhouette_score is : 0.4731528661957991




For n_clusters = 5 The average silhouette_score is : 0.44669481259545357




For n_clusters = 6 The average silhouette_score is : 0.4258512499653525




For n_clusters = 7 The average silhouette_score is : 0.4036426218862234




For n_clusters = 8 The average silhouette_score is : 0.3752803522259062




For n_clusters = 9 The average silhouette_score is : 0.36036446513898435




For n_clusters = 10 The average silhouette_score is : 0.331527139640298




For n_clusters = 11 The average silhouette_score is : 0.3389761557649432




For n_clusters = 12 The average silhouette_score is : 0.3215598847683479




For n_clusters = 13 The average silhouette_score is : 0.3116874984255199




For n_clusters = 14 The average silhouette_score is : 0.33387769913848997




For n_clusters = 15 The average silhouette_score is : 0.3257113444567351




For n_clusters = 16 The average silhouette_score is : 0.3392125717039824




For n_clusters = 17 The average silhouette_score is : 0.3026450888664127




For n_clusters = 18 The average silhouette_score is : 0.30367045167692286




For n_clusters = 19 The average silhouette_score is : 0.3180222584768933




For n_clusters = 20 The average silhouette_score is : 0.3121919788558074




For n_clusters = 21 The average silhouette_score is : 0.3116840945276956




For n_clusters = 22 The average silhouette_score is : 0.2996545308180057




For n_clusters = 23 The average silhouette_score is : 0.3030293183384783




For n_clusters = 24 The average silhouette_score is : 0.2939921050686294




For n_clusters = 25 The average silhouette_score is : 0.30026725905014534




For n_clusters = 26 The average silhouette_score is : 0.29908832083482434




For n_clusters = 27 The average silhouette_score is : 0.3007054556094827




For n_clusters = 28 The average silhouette_score is : 0.29219020455362954




For n_clusters = 29 The average silhouette_score is : 0.2970891262785888
Wall time: 4min 45s


In [22]:
# Find the sub-optimal k by searching for inflexion points where
# an additional cluster does not considerably degrade the overall clustering performance.

opt_k=get_opt_k(sil_df, sigma=0 )
opt_k

{'leo_2018-06-06': 10,
 'leo_2018-07-13': 9,
 'leo_2018-09-20': 11,
 'leo_2019-02-11': 10,
 'leo_2019-03-28': 10,
 'leo_2019-07-31': 10,
 'leo_2019-09-03': 11,
 'leo_2019-10-14': 9,
 'leo_2019-12-12': 9,
 'leo_2020-02-21': 10,
 'leo_2020-05-11': 8,
 'leo_2020-06-24': 9,
 'leo_2020-08-10': 10,
 'leo_2020-09-21': 10,
 'leo_2020-11-06': 5,
 'mar_2018-06-01': 5,
 'mar_2018-06-21': 5,
 'mar_2018-07-27': 5,
 'mar_2018-09-25': 5,
 'mar_2018-11-13': 5,
 'mar_2018-12-11': 5,
 'mar_2019-02-05': 5,
 'mar_2019-03-13': 5,
 'mar_2019-05-16': 4}
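
To give an intuition for what "searching inflexion points" means, here is a minimal sketch in plain Python. It is not the `get_opt_k` implementation: the function name `first_inflexion_k` and the silhouette scores are hypothetical, and the sketch simply returns the first k at which the discrete second difference of the score curve changes sign. It will not necessarily reproduce the values in the dictionary above.

```python
def first_inflexion_k(ks, scores):
    """Return the k at the first sign change of the discrete second difference.

    Hypothetical helper for illustration only; not part of sandpyper.
    """
    # Second difference at every interior point of the curve.
    d2 = [
        scores[i - 1] - 2 * scores[i] + scores[i + 1]
        for i in range(1, len(scores) - 1)
    ]
    # d2[j] belongs to ks[j + 1]; a sign change between d2[j - 1] and d2[j]
    # marks an inflexion of the silhouette curve.
    for j in range(1, len(d2)):
        if d2[j - 1] * d2[j] < 0:
            return ks[j + 1]
    return None  # no inflexion found


# Made-up average silhouette scores for k = 2..8.
ks = [2, 3, 4, 5, 6, 7, 8]
scores = [0.60, 0.50, 0.45, 0.42, 0.36, 0.33, 0.31]
print(first_inflexion_k(ks, scores))
```

On real curves the scores are noisy, so some smoothing before differentiating is usually needed; the `sigma` argument of `get_opt_k` presumably serves that purpose, with `sigma=0` meaning no smoothing.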

If we are not satisfied with the sub-optimal k returned by the algorithm, we can manually specify k for each survey by defining a dictionary keyed by `location_date`.

In [9]:
# Based on our observations on a dataset comprising 87 surveys, 10 clusters (k=10) is generally a good tradeoff.

opt_k={'leo_2018-06-06': 10,
 'leo_2018-07-13': 10,
 'leo_2018-09-20': 10,
 'leo_2019-02-11': 10,
 'leo_2019-03-28': 10,
 'leo_2019-07-31': 10,
 'mar_2018-06-01': 10,
 'mar_2018-06-21': 10,
 'mar_2018-07-27': 10,
 'mar_2018-09-25': 10,
 'mar_2018-11-13': 10,
 'mar_2018-12-11': 10,
 'mar_2019-02-05': 10,
 'mar_2019-03-13': 10,
 'mar_2019-05-16': 10}
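
Typing the dictionary by hand is error-prone when there are many surveys. As a sketch, the same kind of dictionary can be derived from the keys of the automatically computed one. The names `auto_k`, `fixed_k`, `overrides` and `manual_k` are hypothetical, `auto_k` holds only three of the keys shown above, and the override value is made up for illustration.

```python
# A small subset of the automatically computed dictionary, for illustration.
auto_k = {
    "leo_2018-06-06": 10,
    "leo_2018-07-13": 9,
    "mar_2019-05-16": 4,
}

# Use the same k for every survey...
fixed_k = 10
manual_k = {survey: fixed_k for survey in auto_k}

# ...then override individual surveys where a different k is preferred.
overrides = {"mar_2019-05-16": 5}  # hypothetical choice
manual_k.update(overrides)

print(manual_k)
```

Building the dictionary from the existing keys also guarantees that every survey in the silhouette results receives a k.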

In [23]:
# Clustering the dataset with optimal k

data_classified=kmeans_sa(data_merged,opt_k, feature_set=feature_set)

A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  data_merged.dropna(inplace=True)


  0%|          | 0/2 [00:00<?, ?it/s]

  0%|          | 0/9 [00:00<?, ?it/s]

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  data_in["label_k"]=clusterer.fit_predict(minmax_scaled_df)

  0%|          | 0/15 [00:00<?, ?it/s]



In [24]:
data_classified=pd.merge(data_classified[["point_id","label_k"]],data_merged, how="left", on="point_id", validate="one_to_one")
data_classified

Unnamed: 0,point_id,label_k,distance,z,tr_id,raw_date,coordinates,location,survey_date,x,y,geometry,band1,band2,band3,slope,curve
0,67148080l2690700eo10,2,1.0,1.096543,47,20180606,POINT (299874.2117248897 5773731.971115951),leo,2018-06-06,299874.2117248897,5773731.971115951,POINT (299874.212 5773731.971),123.0,126.0,121.0,-0.010084,-0.010396
1,67141080l2630750eo10,6,1.5,1.076090,47,20180606,POINT (299874.7086046137 5773732.026888166),leo,2018-06-06,299874.7086046137,5773732.026888166,POINT (299874.709 5773732.027),92.0,86.0,97.0,-0.033487,-0.006814
2,67143080l2670800eo20,6,2.0,1.029569,47,20180606,POINT (299875.2054843378 5773732.08266038),leo,2018-06-06,299875.20548433776,5773732.08266038,POINT (299875.205 5773732.083),90.0,84.0,94.0,-0.023713,0.016813
3,67146080l2610850eo20,8,2.5,1.028664,47,20180606,POINT (299875.7023640618 5773732.138432594),leo,2018-06-06,299875.7023640618,5773732.138432594,POINT (299875.702 5773732.138),98.0,94.0,101.0,0.000139,0.017553
4,67148080l2650800eo30,8,3.0,1.029846,47,20180606,POINT (299876.1992437858 5773732.194204807),leo,2018-06-06,299876.1992437858,5773732.194204807,POINT (299876.199 5773732.194),109.0,106.0,118.0,0.011393,-0.001586
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
56091,60101091m2575700ar25,2,25.5,1.972202,0,20190516,POINT (731471.4863936177 5705142.796794743),mar,2019-05-16,731471.4863936177,5705142.796794743,POINT (731471.486 5705142.797),211.0,205.0,183.0,-0.040461,-0.003882
56092,60106091m2506900ar20,2,26.0,1.942527,0,20190516,POINT (731471.9834328609 5705142.742462518),mar,2019-05-16,731471.9834328609,5705142.742462518,POINT (731471.983 5705142.742),210.0,206.0,182.0,-0.045471,-0.002531
56093,60100091m2546200ar25,2,26.5,1.881259,0,20190516,POINT (731472.4804721042 5705142.688130293),mar,2019-05-16,731472.4804721042,5705142.688130293,POINT (731472.480 5705142.688),201.0,195.0,169.0,-0.045523,0.003576
56094,60104091m2577400ar20,2,27.0,1.851482,0,20190516,POINT (731472.9775113474 5705142.633798068),mar,2019-05-16,731472.9775113474,5705142.633798068,POINT (731472.978 5705142.634),209.0,202.0,176.0,-0.038319,0.001305


### GOOD!

Save the __data_classified__ dataframe as a CSV file and head to the __SANDPYPER polygon correction notebook__.

In [25]:
data_classified.to_csv(r"C:\my_packages\doc_data\labels\data_classified.csv", index=False)

___