# Whiskey clustering with Apache Ignite

This example examines 86 Scotch whiskies, each rated on a five-point scale (0 to 4)
for 12 flavor characteristics: body, sweetness, smoky, medicinal, tobacco, honey, spicy, winey, nutty, malty, fruity, and floral.
The data set comes from David Wishart's 2002 book on classifying Scotch whisky by
flavor and can be found [online](https://www.niss.org/sites/default/files/ScotchWhisky01.txt).

We'll use [Apache Commons CSV](https://commons.apache.org/proper/commons-csv/) to read our data,
the clustering classes from the [Apache Ignite](https://ignite.apache.org/)
[machine learning library](https://ignite.apache.org/features/machinelearning.html) to group the whiskies,
and XChart for plotting.

We'll add those libraries to the classpath and define some imports to simplify access to the classes we need.

In [None]:
%%classpath add mvn
org.apache.ignite ignite-core 2.15.0
org.apache.ignite ignite-ml 2.15.0
org.apache.commons commons-csv 1.10.0
org.knowm.xchart xchart 3.8.3

In [None]:
%import static org.apache.commons.csv.CSVFormat.RFC4180
%import org.knowm.xchart.*
%import org.apache.ignite.Ignition
%import org.apache.ignite.cache.affinity.rendezvous.RendezvousAffinityFunction
%import org.apache.ignite.configuration.CacheConfiguration
%import org.apache.ignite.configuration.IgniteConfiguration
%import org.apache.ignite.ml.clustering.kmeans.KMeansTrainer
%import org.apache.ignite.ml.dataset.feature.extractor.impl.DoubleArrayVectorizer
%import static org.apache.ignite.ml.dataset.feature.extractor.Vectorizer$LabelCoordinate.FIRST
%import org.apache.ignite.ml.math.distances.*
%import org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi
%import org.apache.ignite.spi.discovery.tcp.ipfinder.multicast.TcpDiscoveryMulticastIpFinder

We start by loading the data.

In [None]:
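// Parse the CSV; each record becomes a list of strings (row 0 is the header)
file = '../resources/whiskey.csv' as File
rows = file.withReader { r -> RFC4180.parse(r).records*.toList() }
// Columns 2 onward hold the numeric flavor scores
data = rows[1..-1].collect { it[2..-1]*.toDouble() } as double[][]
// Column 1 holds the distillery name
distilleries = rows[1..-1]*.get(1)
// Header names of the flavor columns
features = rows[0][2..-1]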
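
To make the slicing in the loading cell easier to follow, here is a minimal sketch that applies the same range and spread-dot expressions to a tiny in-memory table. The rows, distillery names, and scores are invented for illustration and are not taken from the data set; the layout (an id column, a name column, then numeric scores) is an assumption inferred from the indices the loader uses.

```groovy
// Hypothetical sample in the layout the loader appears to expect:
// column 0 = row id, column 1 = distillery name, columns 2.. = flavor scores.
def sampleRows = [
    ['RowID', 'Distillery', 'Body', 'Sweetness', 'Smoky'],   // header row
    ['1', 'Distillery A', '2', '3', '1'],                    // made-up values
    ['2', 'Distillery B', '4', '1', '0']                     // made-up values
]

// Skip the header (1..-1), drop the first two columns (2..-1),
// and convert every remaining cell from String to double.
def sampleData = sampleRows[1..-1].collect { it[2..-1]*.toDouble() } as double[][]

// Spread-dot calls get(1) on every data row to collect the names.
def sampleNames = sampleRows[1..-1]*.get(1)

// The feature names are the header cells from column 2 onward.
def sampleFeatures = sampleRows[0][2..-1]

// Each data row should have exactly one score per feature name.
assert sampleData.length == sampleNames.size()
assert sampleData.every { it.length == sampleFeatures.size() }
```

Running the same two assertions against `data`, `distilleries`, and `features` is a cheap way to catch a ragged or mis-parsed CSV before training.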