<p style="font-size:35px; text-align:center; font-weight:bold">CMASS run SOM clustering method</p>
<p style="font-size:17px; text-align:left">Dr. Ina Storch 06-11-2023 </p>
<p style="font-size:17px; text-align:left">Note: This notebook runs SOM on preprocessed data derived from the datacube of Lawley et al., 2021. </p>
<p style="font-size:17px; text-align:left">Reference: Lawley, C.J.M., McCafferty, A.E., Graham, G.E., Gadd, M.G., Huston, D.L., Kelley, K.D., Paradis, S., Peter, J.M., and Czarnota, K., 2021. Datasets to support prospectivity modelling for sediment-hosted Zn-Pb mineral systems; Geological Survey of Canada, Open File 8836, 1 .zip file. https://doi.org/10.4095/329203</p>

<p style="font-size:19px; text-align:left; font-weight:bold">1) Import libraries</p>

In [1]:
from src.nextsomcore.nextsomcore import NxtSomCore
import pickle

In [2]:
import src.argsSOM

args = src.argsSOM.Args()

<p style="font-size:19px; text-align:left; font-weight:bold">2) Specify parameters for SOM</p>

Input data can be either in .lrn file format or in .tiff file format. Choose one.

Create a "data" folder within the "methods/som/" folder. It should contain two subfolders, "input" and "output". To run this Jupyter notebook, copy your input data into the folder "methods/som/data/input".
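
The folder layout described above can also be created programmatically. The following is a minimal sketch, assuming the notebook is run from the repository root; `make_som_data_dirs` is a hypothetical helper name and is not part of the CMASS code.

```python
from pathlib import Path


def make_som_data_dirs(base="methods/som/data"):
    """Create <base>/input and <base>/output if missing and return both paths."""
    base = Path(base)
    input_dir = base / "input"
    output_dir = base / "output"
    for folder in (input_dir, output_dir):
        # parents=True creates intermediate folders; exist_ok=True makes reruns safe
        folder.mkdir(parents=True, exist_ok=True)
    return input_dir, output_dir
```

Calling `make_som_data_dirs()` once before copying the input data should be enough; running it again leaves existing folders and files untouched.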

NoData handling is not yet implemented.

In [14]:
# #------------- 
# #- Data Input .tiff files:
# #------------- 
# #- If input data is geotiff: list geotiff files, separated by "," ["name1.tiff","name2.tiff"]
# #input_list_text=["data/input/70Gravity_Bouguer._norm.tif","data/input/83Magnetic_LongWavelength_HGM._norm.tif","data/input/50Geology_Fault_Proximity._norm.tif"]
# input_list_text=["data/input/testdata/Magnetics.tif",
#                 "data/input/testdata/RockContact(bmgg_bvc).tif",
#                 "data/input/testdata/RockContact(gsh_bs).tif",
#                 "data/input/testdata/Unit(bmgg).tif",
#                 "data/input/testdata/Unit(bvc).tif",
#                 "data/input/testdata/Unit(gsb).tif",
#                 "data/input/testdata/Unit(gsh).tif"
#                 ]
# args.input_file= ",".join(input_list_text)
 
args.geotiff_input=None      # geotiff_input (None or args.input_file)

#------------- 
#- Or: Data Input .lrn file:
#------------- 
#args.input_file="data/input/SOM_grav_mag.lrn"
args.input_file="/methods/methods/som/data/input/SOM_grav_mag.lrn"

#-------------
#- Data Output
#-------------

#args.output_folder="data/output"         # Folder to save som dictionary and cluster dictionary
args.output_folder="/methods/methods/som/data/output"

args.output_file_somspace= args.output_folder+"/result_som.txt"   # DO NOT CHANGE! Text file that will contain calculated values: som_x som_y b_data1 b_data2 b_dataN umatrix cluster in somspace.
        

#-------------
#- Parameter
#-------------

args.som_x=10                # X dimension of generated SOM
args.som_y=10                # Y dimension of generated SOM
args.epochs=10               # Number of epochs to run

# Base parameters required for som calculation. 
# Additional optional parameters below:
args.outgeofile= args.output_folder+"/result_geo.txt"             # DO NOT CHANGE!
args.output_file_geospace=args.outgeofile   # Text file that will contain calculated values: {X Y Z} data1 data2 dataN som_x som_y cluster b_data1 b_data2 b_dataN in geospace.

args.kmeans="false"          # Run k-means clustering (true, false)
args.kmeans_init=5           # Number of initializations
args.kmeans_min=2            # Minimum number of k-means clusters
args.kmeans_max=25           # Maximum number of k-means clusters

args.neighborhood='gaussian'     # Shape of the neighborhood function. gaussian or bubble
args.std_coeff=0.5               # Coefficient in the Gaussian neighborhood function
args.maptype='toroid'            # Type of SOM (sheet, toroid)
args.initialcodebook=None        # File path of initial codebook, 2D numpy.array of float32.
args.radius0=0                   # Initial size of the neighborhood
args.radiusN=1                   # Final size of the neighborhood
args.radiuscooling='linear'      # Function that defines the decrease in the neighborhood size as the training proceeds (linear, exponential)
args.scalecooling='linear'       # Function that defines the decrease in the learning scale as the training proceeds (linear, exponential)
args.scale0=0.1                  # Initial learning rate
args.scaleN=0.01                 # Final learning rate
args.initialization='random'     # Type of SOM initialization (random, pca)
args.gridtype='rectangular'      # Type of SOM grid (hexagonal, rectangular)
#args.xmlfile="none"              # SOM inputs as an xml file

args.normalized="false"      # Whether the data has been normalized or not (true, false)
args.minN=0                  # Minimum value for normalization
args.maxN=1                  # Maximum value for normalization
args.label=None              # Whether data contains label column, true or false
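
The `minN` and `maxN` parameters above suggest a min-max rescaling of each input column. The sketch below shows that standard formula in plain Python. It illustrates the general technique only and is not the code NxtSomCore runs; `min_max_scale` is a hypothetical helper.

```python
def min_max_scale(values, new_min=0.0, new_max=1.0):
    """Linearly rescale a list of numbers to the range [new_min, new_max]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column has no spread; map every value to new_min
        return [new_min for _ in values]
    span = new_max - new_min
    return [new_min + (v - lo) * span / (hi - lo) for v in values]


# Example column rescaled to the default range [0, 1]
scaled = min_max_scale([2.0, 4.0, 6.0])
```

With `minN=0` and `maxN=1`, the smallest value of a column maps to 0 and the largest to 1, so layers with very different units contribute on a comparable scale.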


In [15]:
print(args.input_file)

/methods/methods/som/data/input/SOM_grav_mag.lrn
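
Before starting a run, it can help to confirm that the printed path points at an existing file of a supported type. The following is a minimal sketch, assuming only the two formats named in section 2 (.lrn and GeoTIFF) and assuming that several GeoTIFF paths are comma-joined as in the commented example; `check_input_files` is a hypothetical helper, not part of the CMASS code.

```python
from pathlib import Path

# Extensions assumed from the formats named in this notebook
SUPPORTED = {".lrn", ".tif", ".tiff"}


def check_input_files(input_file):
    """Split a comma-joined path string and return a list of problem messages."""
    problems = []
    for name in input_file.split(","):
        path = Path(name.strip())
        if path.suffix.lower() not in SUPPORTED:
            problems.append(f"unsupported extension: {path}")
        elif not path.is_file():
            problems.append(f"file not found: {path}")
    return problems
```

`check_input_files(args.input_file)` returns an empty list when every path looks usable, so it can be used as a guard before calling the SOM run.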


<p style="font-size:19px; text-align:left; font-weight:bold">3) Run SOM</p>

Run SOM with the parameters specified above and save the results. The NxtSomCore package does the actual work.

In [16]:
import src.do_nextsomcore_save_results as dnsr

dnsr.run_SOM(args)

Time for epoch 1: 0.5097
Time for epoch 2: 0.3899
Time for epoch 3: 0.4699
Time for epoch 4: 0.51
Time for epoch 5: 0.4401
Time for epoch 6: 0.4339
Time for epoch 7: 0.3359
Time for epoch 8: 0.3216
Time for epoch 9: 0.4385
Time for epoch 10: 0.4099

/methods/methods/som/data/output


AttributeError: 'Args' object has no attribute 'geotiff_input'
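
For readers unfamiliar with what a SOM training step does, the sketch below shows a textbook online update with a Gaussian neighborhood on a small grid, including the difference between the `sheet` and `toroid` map types. It is a toy illustration in plain Python under stated assumptions: the function names are hypothetical, the codebook is a plain dictionary, and NxtSomCore may well use a different (for example batch) update internally.

```python
import math


def grid_distance(a, b, som_x, som_y, maptype="toroid"):
    """Distance between two nodes given as (x, y) positions on the map grid."""
    dx = abs(a[0] - b[0])
    dy = abs(a[1] - b[1])
    if maptype == "toroid":
        # On a toroid the edges wrap around, so take the shorter way
        dx = min(dx, som_x - dx)
        dy = min(dy, som_y - dy)
    return math.hypot(dx, dy)


def best_matching_unit(codebook, sample):
    """Return the grid position whose weight vector is closest to the sample."""
    return min(codebook, key=lambda node: math.dist(codebook[node], sample))


def update(codebook, sample, som_x, som_y, radius, scale, maptype="toroid"):
    """One online update: pull nodes near the BMU toward the sample."""
    bmu = best_matching_unit(codebook, sample)
    for node, weights in codebook.items():
        d = grid_distance(node, bmu, som_x, som_y, maptype)
        # Gaussian neighborhood: influence decays with grid distance from the BMU
        h = math.exp(-(d * d) / (2.0 * radius * radius))
        codebook[node] = [w + scale * h * (s - w) for w, s in zip(weights, sample)]
    return bmu
```

On a 10 x 10 toroid, nodes (0, 0) and (9, 0) are direct neighbors (distance 1), whereas on a sheet they are 9 cells apart. This is why the `maptype` choice affects which nodes are updated together near the map edges.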

<p style="font-size:19px; text-align:left; font-weight:bold">4) Plot results</p>

In [None]:
#import functions.argsPlot
#
#argsP = functions.argsPlot.Args()
#
#argsP.outsomfile= "data/output/somspace.txt"   # som calculation somspace output text file
#argsP.som_x= 100                 # som x dimension
#argsP.som_y= 100                 # som y dimension
#argsP.input_file= "data/input/SOM_grav_mag.lrn"    # Input file(*.lrn)
#argsP.dir= "data/output"        # Input file(*.lrn) or directory where som.dictionary was saved to (/output/som.dictionary)
#argsP.grid_type= 'rectangular'  # grid type (square or hexa), (rectangular or hexagonal)
#argsP.redraw='true'             # whether to draw all plots, or only those required for clustering (true: draw all. false:draw only for clustering).
#argsP.outgeofile='data/output/geospace.txt'     # som geospace results txt file
#argsP.dataType=None             # Data type (scatter or grid)
#argsP.noDataValue='NA'          # noData value

In [12]:
%run src/plot_som_results.py

GeoSpace plots finished
SomSpace plots finshed
Boxplots finished
