> `Finder` is an `explorer` focused on **search**.
>
> :speedboat: It can help you select points using a **filter** based on search results.

-   <details open><summary>This page addresses **single components** of `hover`</summary>
    For illustration, we are using code snippets to pick out specific widgets so that the documentation can explain what they do.

    -   Please be aware that you won't need to get the widgets by code in an actual use case.
    -   Typical usage deals with [recipes](../../tutorial/t1-active-learning) where the individual parts have been tied together.

</details>

-   <details open><summary>Dependencies for {== local environments ==}</summary>
    When you run the code locally, you may need to install additional packages.

    To run the text embedding code on this page, you need:
```shell
    pip install spacy
    python -m spacy download en_core_web_md
```

    To render `bokeh` plots in Jupyter, you need:
```shell
    pip install jupyter_bokeh
```

    If you are using JupyterLab older than 3.0, use this instead ([reference](https://pypi.org/project/jupyter-bokeh/)):
```shell
    jupyter labextension install @jupyter-widgets/jupyterlab-manager
    jupyter labextension install @bokeh/jupyter_bokeh
```

</details>

## **More Angles -> Better Results**

`Explorer`s other than `annotator` specialize in surfacing additional insight that helps us understand the data. Placed side by side with `annotator`, they let us label more accurately, more confidently, and even faster.

## **Preparation**

As always, start with a ready-for-plot dataset:

```python
from hover.core.dataset import SupervisableTextDataset
import pandas as pd

raw_csv_path = "https://raw.githubusercontent.com/phurwicz/hover-gallery/main/0.5.0/20_newsgroups_raw.csv"
train_csv_path = "https://raw.githubusercontent.com/phurwicz/hover-gallery/main/0.5.0/20_newsgroups_train.csv"

# for a fast, low-memory demonstration, sample the data
df_raw = pd.read_csv(raw_csv_path).sample(400)
df_raw["SUBSET"] = "raw"
df_train = pd.read_csv(train_csv_path).sample(400)
df_train["SUBSET"] = "train"
df_dev = pd.read_csv(train_csv_path).sample(100)
df_dev["SUBSET"] = "dev"
df_test = pd.read_csv(train_csv_path).sample(100)
df_test["SUBSET"] = "test"

# build overall dataframe and ensure feature type
df = pd.concat([df_raw, df_train, df_dev, df_test])
df["text"] = df["text"].astype(str)

# this class stores the dataset throughout the labeling process
dataset = SupervisableTextDataset.from_pandas(df, feature_key="text", label_key="label")
```

<br>

```python
import spacy
import re
from functools import lru_cache

# use your preferred embedding for the task
nlp = spacy.load("en_core_web_md")

# raw data (str in this case) -> np.array
# cached so that repeated texts are only vectorized once
@lru_cache(maxsize=int(1e+4))
def vectorizer(text):
    # collapse any run of whitespace into a single space
    clean_text = re.sub(r"[\s]+", r" ", str(text))
    return nlp(clean_text, disable=nlp.pipe_names).vector

# any kwargs will be passed on to the corresponding reduction
# for umap: https://umap-learn.readthedocs.io/en/latest/parameters.html
# for ivis: https://bering-ivis.readthedocs.io/en/latest/api.html
reducer = dataset.compute_nd_embedding(vectorizer, "umap", dimension=2)
```

<br>

## **Filter Toggles**

When we use lasso or polygon select, we are describing a shape. Sometimes that shape is not accurate enough -- we need extra conditions to narrow down the data.

Just like `annotator`, `finder` has search widgets. But unlike `annotator`, `finder` has a **filter toggle** which can directly **intersect** *what we selected* with *what meets the search criteria*.

```python
from bokeh.io import show, output_notebook

output_notebook()

# normally you would skip notebook_url or use your Jupyter address
notebook_url = 'localhost:8888'

from hover.recipes.subroutine import standard_finder
from bokeh.layouts import row, column

finder = standard_finder(dataset)
# search widgets on the left, filter checkbox on the right
show(row(
    column(finder.search_pos, finder.search_neg),
    finder.search_filter_box,
), notebook_url=notebook_url)
```

<br>

Next to the search widgets is a checkbox. The filter stays active for as long as the checkbox is checked.
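To make the idea concrete, here is a minimal plain-Python sketch of what such a filter does conceptually. It is not `hover`'s implementation: the helper `apply_search_filter`, its parameters, and the assumption that both search boxes are treated as regular expressions are all hypothetical and only serve the illustration.

```python
import re

def apply_search_filter(texts, selected, pos_pattern="", neg_pattern="", filter_on=True):
    """Hypothetical helper (not part of hover): intersect a selection with search results."""
    if not filter_on:
        # checkbox unchecked: the selection passes through untouched
        return set(selected)
    kept = set()
    for idx in selected:
        text = texts[idx]
        # an empty pattern imposes no condition
        if pos_pattern and not re.search(pos_pattern, text):
            continue
        if neg_pattern and re.search(neg_pattern, text):
            continue
        kept.add(idx)
    return kept

texts = [
    "the rocket launch was delayed",
    "hockey playoffs start tonight",
    "rocket fuel prices are rising",
    "new graphics card released",
]
selected = {0, 1, 2}  # e.g. indices of points picked with a lasso

filtered = apply_search_filter(texts, selected, pos_pattern=r"rocket", neg_pattern=r"fuel")
unfiltered = apply_search_filter(texts, selected, pos_pattern=r"rocket", filter_on=False)
print(filtered)  # {0}
```

With the filter on, only the selected point that matches `rocket` and does not match `fuel` survives. With the filter off, the original selection is returned as-is.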

-   <details open><summary>How the filter interacts with selection options</summary>
    Selection options apply before filters.

    `hover` memorizes your pre-filter selections, so you can keep selecting without having to tweak the filter toggle.

    -   Example:
        -   suppose you have previously selected a set of points called `A`.
        -   then you toggle a filter `f`, giving you `A ∩ F`, where `F` is the set of points satisfying `f`.
        -   now, with selection option "union", you select a set of points called `B`.
        -   your current selection will be `(A ∪ B) ∩ F`, i.e. `(A ∩ F) ∪ (B ∩ F)`.
            -   similarly, you would get `(A ∩ B) ∩ F` for "intersection" and `(A ∖ B) ∩ F` for "difference".
        -   if you untoggle the filter now, your selection will be `A ∪ B`.

    -   In the later tutorials, we shall see multiple filters in action together.
        -   spoiler: `F = F1 ∩ F2 ∩ ...` and that's it!
</details>
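The set algebra above can be checked with plain Python sets. This is only a sketch of the described behavior, not `hover`'s code: the function `current_selection` and the example sets are made up for illustration.

```python
def current_selection(previous, new, option, filters=(), filter_on=True):
    """Hypothetical model: combine selections first, then intersect with every filter."""
    if option == "union":
        combined = previous | new
    elif option == "intersection":
        combined = previous & new
    elif option == "difference":
        combined = previous - new
    else:
        raise ValueError(f"unknown selection option: {option}")
    if filter_on:
        # multiple filters behave like one filter F = F1 ∩ F2 ∩ ...
        for matching in filters:
            combined = combined & matching
    return combined

A = {1, 2, 3, 4}   # earlier selection
B = {3, 4, 5, 6}   # new selection
F1 = {2, 3, 5, 7}  # points satisfying a first filter
F2 = {3, 5, 6}     # points satisfying a second filter

with_filter = current_selection(A, B, "union", filters=[F1])
without_filter = current_selection(A, B, "union", filters=[F1], filter_on=False)
two_filters = current_selection(A, B, "union", filters=[F1, F2])

# (A ∪ B) ∩ F equals (A ∩ F) ∪ (B ∩ F)
assert with_filter == (A & F1) | (B & F1)
# untoggling the filter restores A ∪ B because the pre-filter selection is kept
assert without_filter == A | B
```

Because the filter is applied after the selection option, the pre-filter selection never has to be recomputed when the toggle changes.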

## **Stronger Highlight for Search**

`finder` also colors data points based on search criteria, making them easier to find.
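As a rough mental model, the coloring can be thought of as a per-point mapping from "meets the search criteria" to a color. The sketch below is an assumption-laden illustration: the color names, the function `highlight_colors`, and the regex-based matching are hypothetical and may differ from what `finder` actually does.

```python
import re

# hypothetical colors; the palette used by hover may differ
HIGHLIGHT_COLOR = "coral"
BASE_COLOR = "gainsboro"

def highlight_colors(texts, pos_pattern="", neg_pattern=""):
    """Hypothetical helper: one color per text, highlighted when the search criteria are met."""
    if not pos_pattern and not neg_pattern:
        # no search terms: nothing to highlight
        return [BASE_COLOR] * len(texts)
    colors = []
    for text in texts:
        meets_pos = not pos_pattern or re.search(pos_pattern, text) is not None
        meets_neg = not neg_pattern or re.search(neg_pattern, text) is None
        colors.append(HIGHLIGHT_COLOR if meets_pos and meets_neg else BASE_COLOR)
    return colors

texts = ["space shuttle mission", "car engine repair", "space heater recall"]
colors = highlight_colors(texts, pos_pattern=r"space", neg_pattern=r"heater")
print(colors)  # ['coral', 'gainsboro', 'gainsboro']
```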

```python
show(column(
    row(finder.search_pos, finder.search_neg),
    finder.figure,
), notebook_url=notebook_url)
```

<br>