# Semantic search with Qdrant and Hugot: a tutorial

Hello! In this tutorial we show how to quickly and easily set up semantic search with Hugot and Qdrant. The goal is to showcase how easy it is to enhance your Go applications with semantic search using Hugot!

First, note that this is a Jupyter notebook with Go code. You can run this notebook yourself on your machine using [GoNB](https://github.com/janpfeifer/gonb). The steps to do this are:

1. Install [GoNB](https://github.com/janpfeifer/gonb) on your machine or in a Docker container. It's easy! You should just be able to do:

```
go install github.com/janpfeifer/gonb@latest && \
  go install golang.org/x/tools/cmd/goimports@latest && \
  go install golang.org/x/tools/gopls@latest
```

Then make sure `$GOBIN` (where `go install` puts binaries; this is `$(go env GOPATH)/bin` if `GOBIN` is unset) is in your `PATH` and run `gonb --install`. That's it.

2. Install JupyterLab. If you don't have Python, you can use the following bash commands to install Miniconda, which provides it:

```
mkdir -p ~/miniconda3 && \
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -O ~/miniconda3/miniconda.sh && \
bash ~/miniconda3/miniconda.sh -b -u -p ~/miniconda3 && \
rm -rf ~/miniconda3/miniconda.sh
```

and then:

```
~/miniconda3/bin/conda init
pip install jupyterlab
```

and finally run `jupyter lab --ip 0.0.0.0 --no-browser --allow-root --NotebookApp.token=''` from the directory containing this notebook to start JupyterLab without asking for a token. Navigating to localhost:8888 should now show the JupyterLab window with this notebook, and `Go (gonb)` should appear as a kernel to run this code. Note that if you are running this inside a Docker container, you need to publish port 8888 to the host.

In [1]:
import hf "github.com/bodaay/HuggingFaceModelDownloader/hfdownloader"

In [7]:
%%
hf.DownloadModel("lewtun/github-issues", false, false, true, ".", "main", 4, "", true)


Getting File Download Files List Tree from: https://huggingface.co/api/datasets/lewtun/github-issues/tree/main/
Checking file size matching: lewtun_github-issues/.gitattributes
Checking file size matching: lewtun_github-issues/README.md
Start Downloading: lewtun_github-issues/datasets-issues-with-comments.jsonl


Merging lewtun_github-issues/datasets-issues-with-comments.jsonl Chunks
Finished Downloading: lewtun_github-issues/datasets-issues-with-comments.jsonl
Checking SHA256 Hash for LFS file: lewtun_github-issues/datasets-issues-with-comments.jsonl
Hash Matched for LFS file: lewtun_github-issues/datasets-issues-with-comments.jsonl

In [1]:
func CheckError(err error) {
    if err != nil {
        panic(err)
    }
}

In [2]:
type Issue struct {
    IsPullRequest bool `json:"is_pull_request"`
    Title string `json:"title"`
    Body string `json:"body"`
    Comments []string `json:"comments"`
    HtmlUrl string `json:"html_url"`
}
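
The struct tags above tell `encoding/json` which JSON keys map to which fields, and keys without a matching field are simply ignored. As a minimal standalone sketch of that behaviour, the program below decodes a single made-up JSONL record. The `parseIssue` helper and the sample record are illustrative assumptions, not part of the notebook or the dataset; inside the notebook you would keep the `Issue` type from the cell above and drop the `package`/`main` scaffolding.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Issue mirrors the struct defined in the notebook cell above.
type Issue struct {
	IsPullRequest bool     `json:"is_pull_request"`
	Title         string   `json:"title"`
	Body          string   `json:"body"`
	Comments      []string `json:"comments"`
	HtmlUrl       string   `json:"html_url"`
}

// parseIssue decodes one JSONL line into an Issue.
// Hypothetical helper for illustration only.
func parseIssue(line []byte) (Issue, error) {
	var issue Issue
	err := json.Unmarshal(line, &issue)
	return issue, err
}

func main() {
	// Made-up sample record; the "number" key has no matching field and is ignored.
	line := []byte(`{"number": 1, "title": "Example issue", "body": "Some text", "comments": ["first", "second"], "html_url": "https://example.com/1", "is_pull_request": false}`)

	issue, err := parseIssue(line)
	if err != nil {
		panic(err)
	}
	fmt.Println(issue.Title, len(issue.Comments), issue.IsPullRequest)
}
```

Returning the error from the helper, rather than panicking inside it, leaves the caller free to skip a malformed line or abort.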

In [3]:
%%
f, err := os.Open("./lewtun_github-issues/datasets-issues-with-comments.jsonl")
CheckError(err)
defer f.Close()
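scanner := bufio.NewScanner(f)
// Lines holding issues with many comments may exceed bufio.Scanner's
// default 64 KiB token limit, so allow lines of up to 16 MiB.
scanner.Buffer(make([]byte, 0, 1024*1024), 16*1024*1024)
var counter int
for scanner.Scan() {
    var issue Issue
    // Stop on malformed JSON instead of silently printing an empty issue.
    CheckError(json.Unmarshal(scanner.Bytes(), &issue))

    for _, comment := range issue.Comments {
        fmt.Println(comment)
    }
    counter++
    // Only print the comments of the first six issues.
    if counter > 5 {
        break
    }
}
// Report read errors, such as a line that is still too long for the buffer.
CheckError(scanner.Err())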

There is a speed up in Windows machines:
- From `13m 52s` to `11m 10s`

In Linux machines, some workers crash with error message:
```
OSError: [Errno 12] Cannot allocate memory
```
There is also a speed up in Linux machines:
- From `7m 30s` to `5m 32s`
@lhoestq Let me make sure we never need it, and if not then I'll remove it entirely in a follow-up PR.
Thanks ;) it will be less confusing and easier to maintain to not keep unused hacky features
