

An idiomatic Clojure machine learning library.

Main features:

  • Harmonized and idiomatic use of various classification, regression and unsupervised models
  • Supports creation of machine learning pipelines as-data
  • Includes easy-to-use, sophisticated cross-validations of pipelines
  • Includes the most important data transformations for preprocessing
  • Experiment tracking can be added by the user via a callback mechanism
  • Open architecture that allows plugging in any ML model, even from non-JVM languages, including deep learning
  • Based on well-established Clojure/Java data science libraries



 {scicloj/ {:mvn/version "0.3"}}}


(require '[ :as ml]
         '[ :as mm]
         '[ :as ds])

;; read train and test datasets
(def titanic-train
  (ds/dataset "" {:key-fn keyword :parser-fn :string}))

(def titanic-test
  (-> ""
      (ds/dataset {:key-fn keyword :parser-fn :string})
      (ds/add-column :Survived [""] :cycle)))

;; construct pipeline function including Logistic Regression model
(def pipe-fn
  (ml/pipeline
   (mm/select-columns [:Survived :Pclass])
   (mm/add-column :Survived (fn [ds] (map #(case % "1" "yes" "0" "no" nil "") (:Survived ds))))
   (mm/categorical->number [:Survived :Pclass])
   (mm/set-inference-target :Survived)
   {:metamorph/id :model}
   (mm/model {:model-type :smile.classification/logistic-regression})))

;;  execute pipeline with train data including model in mode :fit
(def trained-ctx
  (pipe-fn {:metamorph/data titanic-train
            :metamorph/mode :fit}))

;; execute pipeline in mode :transform with test data which will do a prediction 
(def test-ctx
  (pipe-fn
   (assoc trained-ctx
          :metamorph/data titanic-test
          :metamorph/mode :transform)))

;; extract prediction from pipeline function result
(-> test-ctx :metamorph/data
    (ds/column-values->categorical :Survived))
;; => #tech.v3.dataset.column<string>[418]
;;    :Survived
;;    [no, no, yes, no, no, no, no, yes, no, no, no, no, no, yes, no, yes, yes, no, no, no...]   
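
The fit/transform flow in the example above can be illustrated without any dependencies. The following is a conceptual sketch in plain Clojure, not the library's implementation: `sketch-pipeline` and `majority-model` are hypothetical names, and the "dataset" is assumed to be a plain vector of maps. It shows how a model step might store its fitted state in the context under an id during `:fit` and read it back during `:transform`.

```clojure
;; Conceptual sketch only; none of these helpers belong to the library.

(defn sketch-pipeline
  "Composes step functions (ctx -> ctx) from left to right."
  [& steps]
  (fn [ctx]
    (reduce (fn [acc step] (step acc)) ctx steps)))

(defn majority-model
  "Hypothetical model step. In :fit mode it stores the most frequent
  value of `target` under `id` in the context. In :transform mode it
  writes that stored value into `target` for every row."
  [id target]
  (fn [{:metamorph/keys [data mode] :as ctx}]
    (case mode
      :fit       (assoc ctx id (->> data
                                    (map target)
                                    frequencies
                                    (apply max-key val)
                                    key))
      :transform (assoc ctx :metamorph/data
                        (mapv #(assoc % target (get ctx id)) data)))))

(def sketch-pipe
  (sketch-pipeline (majority-model :model :survived)))

;; :fit stores the "trained model" in the context
(def fitted
  (sketch-pipe {:metamorph/data [{:survived "no"} {:survived "no"} {:survived "yes"}]
                :metamorph/mode :fit}))

;; :transform reuses the fitted context with new data
(def predicted
  (sketch-pipe (assoc fitted
                      :metamorph/data [{:survived ""} {:survived ""}]
                      :metamorph/mode :transform)))

(:model fitted)             ;; => "no"
(:metamorph/data predicted) ;; => [{:survived "no"} {:survived "no"}]
```

Because the fitted state travels inside the context map, the same pipeline function can be reused for training and prediction, which is why the example above passes `trained-ctx` back into the pipeline with new data and a new mode.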


For support, use Clojurians on Zulip or Clojurians Slack.


Full documentation is available as user guides.

API documentation:

Projects this library uses or is based on:

This library itself is a shim and contains no functions of its own. The code lives in the following repositories, and the functions are re-exported in a small number of namespaces for user convenience. It organises the existing code in three namespaces, as follows:


Functions are re-exported from:

  • scicloj.metamorph.core


All functions in this ns take a dataset as first argument. The functions are re-exported from:

  • tablecloth.api
  • tech.v3.dataset.modelling
  • tech.v3.dataset.column-filters


All functions in this ns take a metamorph context as first argument, so can directly be used in metamorph pipelines. The functions are re-exported from:

  • tablecloth.pipeline
  • tech.v3.dataset.metamorph
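
The relationship between dataset-first functions and pipeline-ready functions can be sketched in plain Clojure. This is an illustration under stated assumptions, not how the re-exported namespaces are implemented: `lift-step` and `keep-columns` are hypothetical helpers, and the dataset is assumed to be a plain vector of maps stored under `:metamorph/data`.

```clojure
;; Conceptual sketch only; these helpers are hypothetical.

(defn keep-columns
  "Dataset-first function: takes the data as its first argument."
  [rows cols]
  (mapv #(select-keys % cols) rows))

(defn lift-step
  "Turns a dataset-first function into a pipeline step (ctx -> ctx)
  that applies it to the data held under :metamorph/data."
  [f & args]
  (fn [ctx]
    (update ctx :metamorph/data #(apply f % args))))

(def keep-a (lift-step keep-columns [:a]))

(keep-a {:metamorph/data [{:a 1 :b 2} {:a 3 :b 4}]
         :metamorph/mode :fit})
;; => {:metamorph/data [{:a 1} {:a 3}], :metamorph/mode :fit}
```

The step leaves every other key in the context untouched, so it can be placed anywhere in a pipeline without disturbing state stored by other steps.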

If you are already familiar with any of the original namespaces, you can of course use them directly as well:

(require '[tablecloth.api :as tc])
(tc/add-column ...)

The library can easily be extended by plugins, which contribute models or other algorithms. The following plugins currently exist: