# Calling vald-client-clj APIs using Clojupyter

This note describes how to call vald-client-clj APIs from Clojupyter.

## Run the containers

First, run a Vald cluster or a Vald Agent NGT container.

- [Vald Get Started Document](https://vald.vdaas.org/docs/tutorial/get-started/)

In this example, a Vald Agent NGT container is used.
Create a Docker network so that the containers can reach each other by name.

    $ docker network create clojupyter-vald

To run a Vald Agent NGT container (replace `path-to-config-dir` with your configuration directory):

    $ docker run -v path-to-config-dir:/etc/server --network clojupyter-vald --rm --name vald-agent-ngt -it vdaas/vald-agent-ngt:nightly
    2020-06-29 06:12:59     [INFO]: maxprocs: Leaving GOMAXPROCS=3: CPU quota undefined
    2020-06-29 06:12:59     [INFO]: service agent ngt v0.0.0 starting...
    2020-06-29 06:12:59     [INFO]: daemon start
    2020-06-29 06:12:59     [INFO]: server pprof executing preStartFunc
    2020-06-29 06:12:59     [INFO]: server grpc executing preStartFunc
    2020-06-29 06:12:59     [INFO]: server prometheus executing preStartFunc
    2020-06-29 06:12:59     [INFO]: REST server pprof starting on 0.0.0.0:6060
    2020-06-29 06:12:59     [INFO]: gRPC server grpc starting on 0.0.0.0:8081
    2020-06-29 06:12:59     [INFO]: REST server readiness starting on 0.0.0.0:3001
    2020-06-29 06:12:59     [INFO]: REST server prometheus starting on 0.0.0.0:6061
    
Then start the clojupyter-vald-sample container on the same network.

    $ docker run -p 8888:8888 --network clojupyter-vald -it rinx/clojupyter-vald-sample

Open http://localhost:8888 in a browser to access the Jupyter Notebook.

## Require vald-client-clj library and create a client

To require the vald-client-clj library, run the following:

In [7]:
(require '[vald-client-clj.core :as vald])

nil

Then create an agent client.

In [2]:
(def client
  (vald/agent-client "vald-agent-ngt" 8081))

#'user/client

If you are using a Vald cluster, create a gateway client instead.

```clojure
(def client
  (vald/vald-client "domain of cluster" 8081))
```

## Call vald-client-clj APIs

You can call the vald-client-clj APIs as follows.


In [3]:
(def dimension 784)

#'user/dimension

In [4]:
;; insert
(-> client
    (vald/insert "meta" (take dimension (repeatedly rand))))

{}
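
The vector passed to `vald/insert` above is a sequence of `dimension` random numbers. As a minimal sketch, the same idea can be wrapped in a small helper. `random-vector` is a hypothetical name introduced here for illustration and is not part of vald-client-clj.

```clojure
;; Hypothetical helper (not part of vald-client-clj): build a vector of
;; n random doubles, each drawn uniformly from [0, 1) by `rand`.
(defn random-vector
  [n]
  (vec (repeatedly n rand)))

;; A sample with the same dimension as the examples in this note.
(def sample (random-vector 784))

;; Number of elements in the generated vector.
(count sample)
```

`(repeatedly n rand)` produces the same kind of sequence as `(take n (repeatedly rand))`, and `vec` realizes it eagerly.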

In [5]:
;; remove-id
(-> client
    (vald/remove-id "meta"))

{}

In [6]:
;; stream insert
(-> client
    (vald/stream-insert
     identity
     (->> (take 1000 (range))
         (map (fn [i]
                  {:id (str "meta" i)
                   :vector (take dimension (repeatedly rand))}))
         (vec)))
    (deref))

{:status :done, :count 1000}

Wait until index creation finishes.

After the index has been created, send search requests.
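
This note does not say how to detect that indexing has finished. One possible approach is to poll a readiness check until it succeeds or a timeout expires. The sketch below is an assumption, not part of vald-client-clj: `wait-until` is a hypothetical helper, and the predicate used here is a stand-in counter rather than a real Vald call.

```clojure
;; Hypothetical helper: call `ready?` every `interval-ms` milliseconds
;; until it returns a truthy value or `timeout-ms` has elapsed.
;; Returns true on success and false on timeout.
(defn wait-until
  [ready? timeout-ms interval-ms]
  (let [deadline (+ (System/currentTimeMillis) timeout-ms)]
    (loop []
      (cond
        (ready?) true
        (>= (System/currentTimeMillis) deadline) false
        :else (do (Thread/sleep (long interval-ms))
                  (recur))))))

;; Stand-in predicate for illustration only: becomes true on the third call.
(def calls (atom 0))
(def ready-result
  (wait-until #(>= (swap! calls inc) 3) 1000 10))
```

In a notebook, the predicate would be replaced by whatever check suits your setup. What that check should be depends on the Vald configuration and is left open here.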

In [7]:
;; get-object
(-> client
    (vald/get-object "meta1"))

{:id "meta1", :vector (0.28687677 0.072202615 0.08684743 0.09626826 0.6310945 0.6804732 0.15106194 0.37331024 0.085573636 0.1173129 0.95588225 0.9082217 0.7447617 0.6998726 0.8441438 0.94755673 0.7475735 0.06149543 0.91841465 0.7049656 0.87028813 0.65931916 0.7387954 0.43461055 0.11743653 0.92684937 0.8681505 0.9112257 0.018595515 0.6683359 0.8227864 0.42390677 0.015763257 0.85135514 0.24147205 0.6534125 0.40899736 0.1698208 0.24683152 0.22170429 0.8172554 0.6217165 0.040521897 0.27764648 0.006885614 0.87416404 0.7472405 0.21423443 0.2419729 0.3730108 0.40249008 0.50041306 0.018753275 0.9949402 0.7651629 0.65449464 0.16056788 0.53718054 0.32838637 0.5863779 0.77033496 0.49320596 0.50070816 0.8922186 0.4642769 0.37239563 0.43457842 0.19789435 0.6436462 0.2874355 0.39022237 0.31655365 0.30045274 0.6553193 0.8403775 0.81073374 0.9987028 0.63612413 0.74500304 0.5309769 0.22441696 0.3683074 0.5549527 0.8405705 0.25570244 0.15009153 0.5397332 0.5299818 0.068414666 0.46357346 0.97685516 0.874

In [8]:
;; search
(-> client
    (vald/search {:num 3} (vec (take dimension (repeatedly rand))))
    (clojure.pprint/pprint))

[{:id "meta46", :distance 10.721643}
 {:id "meta113", :distance 10.769736}
 {:id "meta696", :distance 10.816422}]


nil

In [9]:
;; stream search
(def results (atom []))
(-> client
    (vald/stream-search
     (fn [res]
         (swap! results conj res))
     {:num 3}
     (vec (take 3 (repeatedly #(vec (take dimension (repeatedly rand)))))))
    (deref))
(clojure.pprint/pprint
 (deref results))

[[{:id "meta66", :distance 11.000912}
  {:id "meta614", :distance 11.092979}
  {:id "meta361", :distance 11.095173}]
 [{:id "meta168", :distance 10.869421}
  {:id "meta390", :distance 10.945769}
  {:id "meta177", :distance 10.985801}]
 [{:id "meta770", :distance 10.8477125}
  {:id "meta6", :distance 10.858251}
  {:id "meta687", :distance 10.861681}]]


nil
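
The collected value is a vector with one entry per query, and each entry is a vector of `{:id ... :distance ...}` maps. As a rough sketch of how such a value can be post-processed with plain Clojure, the block below works on data copied from the output above. `sample-results`, `nearest-ids`, and `closest` are names introduced here for illustration, and the first expression assumes each inner vector is ordered by ascending distance, as it appears to be in that output.

```clojure
;; Sample data copied from the stream-search output above.
(def sample-results
  [[{:id "meta66", :distance 11.000912}
    {:id "meta614", :distance 11.092979}
    {:id "meta361", :distance 11.095173}]
   [{:id "meta168", :distance 10.869421}
    {:id "meta390", :distance 10.945769}
    {:id "meta177", :distance 10.985801}]
   [{:id "meta770", :distance 10.8477125}
    {:id "meta6", :distance 10.858251}
    {:id "meta687", :distance 10.861681}]])

;; ID of the first (nearest) hit for each query.
(def nearest-ids
  (mapv (comp :id first) sample-results))
;; => ["meta66" "meta168" "meta770"]

;; The single closest hit across all queries.
(def closest
  (apply min-key :distance (apply concat sample-results)))
;; => {:id "meta770", :distance 10.8477125}
```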

In [10]:
;; stream search-by-id
(def results2 (atom []))
(-> client
    (vald/stream-search-by-id
     (fn [res]
         (swap! results2 conj res))
     {:num 3}
     ["meta1" "meta2" "meta3"])
    (deref))
(clojure.pprint/pprint
 (deref results2))

[[{:id "meta850", :distance 10.874201}
  {:id "meta863", :distance 10.906281}
  {:id "meta422", :distance 10.935315}]
 [{:id "meta2", :distance 0.0}
  {:id "meta151", :distance 10.678402}
  {:id "meta562", :distance 10.739888}]
 [{:id "meta94", :distance 10.7507715}
  {:id "meta550", :distance 10.87126}
  {:id "meta317", :distance 10.892149}]]


nil

### Visualize

Let's visualize the results using [metasoarous/oz](https://github.com/metasoarous/oz).

In [11]:
(require '[clojupyter.misc.helper :as helper])
(helper/add-dependencies '[metasoarous/oz "1.6.0-alpha6"])
(require '[oz.notebook.clojupyter :as oz])

nil

In [12]:
(let [results
      (-> client
          (vald/search {:num 20} (vec (take dimension (repeatedly rand)))))]
    (oz/view!
     {:data {:values results}
      :mark "bar"
      :encoding
      {:x
       {:field "id"
        :type "nominal"}
       :y
       {:field "distance"
        :type "quantitative"}}}))

### Close

Close the connection when you have finished your work.

In [13]:
(vald/close client)

#object[io.grpc.internal.ManagedChannelImpl 0x17a0bb43 "ManagedChannelImpl{logId=1, target=vald-agent-ngt:8081}"]

## Using Fashion-MNIST dataset

This example uses the [Fashion-MNIST](https://github.com/zalandoresearch/fashion-mnist) dataset.

First, download the dataset from ann-benchmarks.com.

In [1]:
(require '[clojure.java.io :as io])
(with-open [in (io/input-stream "http://ann-benchmarks.com/fashion-mnist-784-euclidean.hdf5")
              out (io/output-stream "fashion-mnist.hdf5")]
    (io/copy in out))

nil

Next, add [jHDF](https://github.com/jamesmudd/jhdf), a pure Java implementation of an HDF5 loader, as a dependency.

In [2]:
(require '[clojupyter.misc.helper :as helper])
(helper/add-dependencies '[io.jhdf/jhdf "0.5.7"])

{[io.jhdf/jhdf "0.5.7"] #{[org.apache.commons/commons-lang3 "3.10" :scope "runtime"] [org.slf4j/slf4j-api "1.7.30" :scope "runtime"]}, [org.slf4j/slf4j-api "1.7.30" :scope "runtime"] nil, [org.apache.commons/commons-lang3 "3.10" :scope "runtime"] nil}

In [9]:
(import '[java.io File])
(import '[java.util UUID])
(import '[io.jhdf HdfFile])
(import '[io.jhdf.api Dataset])

io.jhdf.api.Dataset

Create an instance of the HDF5 file.

In [4]:
(def hdf5file
    (HdfFile. (File. "fashion-mnist.hdf5")))

#'user/hdf5file

Create an instance of the train dataset.

In [12]:
(def train-dataset
    (-> hdf5file
        (.getDatasetByPath "train")))

#'user/train-dataset

Create a client again and insert 1000 vectors.

In [8]:
(def client
  (vald/agent-client "vald-agent-ngt" 8081))

#'user/client

In [13]:
(let [dims (-> train-dataset
                  (.getDimensions)
                  (seq))
      vecs (-> train-dataset
               (.getData)
               (->> (cast (class (make-array Float/TYPE (first dims) (second dims)))))
               (seq)
               (->> (take 1000))
               (->> (map seq))
               (->> (mapv (fn [v]
                             {:id (-> (UUID/randomUUID) (.toString))
                              :vector v}))))]
    (-> client
        (vald/stream-insert identity vecs)
        (deref)))

{:status :done, :count 1000}
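
The cell above checks that the loaded data is a two-dimensional `float` array and then turns each row into a Clojure sequence. The sketch below isolates that step on a tiny array built in memory, so it runs without the HDF5 file. `data` is a stand-in for the value returned by `.getData`, and the 2 x 3 shape is an arbitrary choice for illustration.

```clojure
;; Stand-in for the value returned by `.getData`: a 2 x 3 float[][].
(def data
  (into-array [(float-array [1 2 3])
               (float-array [4 5 6])]))

;; `cast` returns its argument unchanged when it is an instance of the
;; given class and throws ClassCastException otherwise, so it acts as a
;; type check here. `make-array` is used only to obtain the float[][] class.
(def rows
  (->> data
       (cast (class (make-array Float/TYPE 2 3)))
       (seq)
       (mapv vec)))

;; Each row is now a Clojure vector of floats.
(count rows)          ;; number of rows
(count (first rows))  ;; number of columns
```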

Wait until index creation finishes.

After the index has been created, send search requests.

In [14]:
;; stream search
(let [results (atom [])
      dims (-> train-dataset
                  (.getDimensions)
                  (seq))
      vecs (-> train-dataset
               (.getData)
               (->> (cast (class (make-array Float/TYPE (first dims) (second dims)))))
               (seq)
               (->> (drop 1000))
               (->> (take 10))
               (->> (mapv seq)))]
    (-> client
        (vald/stream-search
         (fn [res]
             (swap! results conj res))
         {:num 3}
         vecs)
        (deref))
    (clojure.pprint/pprint
     (deref results)))

[[{:id "c300191d-6949-47cc-b3c2-0909dc3d5d35", :distance 601.4624}
  {:id "b36ede63-03c9-4cd1-89df-68967f3d06b0", :distance 833.77094}
  {:id "d2dbca8e-4b0f-4ad3-a6e9-a3acaef0603f", :distance 887.6548}]
 [{:id "dd030b3f-2874-42d9-bc6b-c19b9f5e563d", :distance 1253.3287}
  {:id "2249029f-6838-4e95-a7fe-c17d07c0a326", :distance 1806.3563}
  {:id "842fe170-efb9-4887-9ab7-278e4e2c81e7", :distance 1857.4353}]
 [{:id "05ce6dfd-8db7-42f8-8f77-8e6c5fbc9ffc", :distance 1211.038}
  {:id "de232c78-9467-4b3c-8866-30a05f1f4e26", :distance 1235.6666}
  {:id "f8491f7c-27eb-44ee-8863-c7ad8e47b2f5", :distance 1306.4796}]
 [{:id "0d08f31c-a0b8-4de3-a65a-e3bd643ade32", :distance 1250.3619}
  {:id "ed59cd98-15d6-4bbd-80e6-41739c965a00", :distance 1279.3639}
  {:id "f8491f7c-27eb-44ee-8863-c7ad8e47b2f5", :distance 1326.2885}]
 [{:id "50a9f810-e0b7-44b6-abdc-744076b3f620", :distance 764.7366}
  {:id "aae89688-5577-46f2-9020-9615fc533d43", :distance 840.42847}
  {:id "b816da8f-6360-4b85-bb4d-c609c41a6418", :

nil

In [15]:
(vald/close client)

#object[io.grpc.internal.ManagedChannelImpl 0x6dcdc911 "ManagedChannelImpl{logId=1, target=vald-agent-ngt:8081}"]