# [Hashformers](https://github.com/ruanchaves/hashformers)

Hashformers is a framework for hashtag segmentation with transformers. For more information, please check the [GitHub repository](https://github.com/ruanchaves/hashformers). 

# Installation

The steps below will install the hashformers framework on Google Colab. 

Make sure the Colab runtime is in GPU mode.

In [None]:
!nvidia-smi

Fri Feb  4 07:56:16 2022       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.32.03    Driver Version: 460.32.03    CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|   0  Tesla T4            Off  | 00000000:00:04.0 Off |                    0 |
| N/A   36C    P8     9W /  70W |      0MiB / 15109MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Proces

Here we install `mxnet-cu110`, which is compatible with Google Colab.
If you are installing in another environment, replace it with the mxnet package that matches your CUDA version.
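
As a rough illustration of that substitution, the sketch below builds a candidate package name from a CUDA version string. It assumes the `mxnet-cu<major><minor>` naming pattern seen in `mxnet-cu110`; the helper `mxnet_package_for` is hypothetical and is not part of hashformers or mxnet. Not every CUDA version has a published build, so check the package index before installing.

```python
def mxnet_package_for(cuda_version: str) -> str:
    """Build a candidate mxnet package name from a CUDA version such as "11.0".

    Assumption: packages follow the `mxnet-cu<major><minor>` pattern.
    This only formats a name; it does not verify that the package exists.
    """
    parts = cuda_version.strip().split(".")
    if len(parts) < 2 or not all(part.isdigit() for part in parts[:2]):
        raise ValueError(f"expected a version like '11.0', got {cuda_version!r}")
    major, minor = parts[0], parts[1]
    return f"mxnet-cu{major}{minor}"


print(mxnet_package_for("11.0"))  # mxnet-cu110
print(mxnet_package_for("10.2"))  # mxnet-cu102
```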

In [2]:
%%capture

!pip install mxnet-cu110 
!pip install hashformers

# Segmenting hashtags

Visit the [HuggingFace Model Hub](https://huggingface.co/models) and choose any GPT-2 model and any BERT model for the `WordSegmenter` class.

Pass the GPT-2 model as `segmenter_model_name_or_path` and the BERT model as `reranker_model_name_or_path`.

Here we choose `distilgpt2` and `distilbert-base-uncased`.

In [None]:
%%capture

from hashformers import TransformerWordSegmenter as WordSegmenter

ws = WordSegmenter(
    segmenter_model_name_or_path="distilgpt2",
    reranker_model_name_or_path="distilbert-base-uncased"
)

Now we can simply segment lists of hashtags with the default settings and look at the segmentations.

In [None]:
hashtag_list = [
    "#myoldphonesucks",
    "#latinosinthedeepsouth",
    "#weneedanationalpark"
]

segmentations = ws.segment(hashtag_list)

In [None]:
print(*segmentations, sep='\n')

my old phone sucks
latinos in the deep south
we need a national park


Remember that any pair of BERT and GPT-2 models will work. This means you can use **hashformers** to segment hashtags in any language, not just English.

In [None]:
%%capture

from hashformers import TransformerWordSegmenter as WordSegmenter

portuguese_ws = WordSegmenter(
    segmenter_model_name_or_path="pierreguillou/gpt2-small-portuguese",
    reranker_model_name_or_path="neuralmind/bert-base-portuguese-cased"
)

In [None]:
hashtag_list = [
    "#benficamemes",
    "#mouraria",
    "#CristianoRonaldo"
]

segmentations = portuguese_ws.segment(hashtag_list)

print(*segmentations, sep='\n')

ben ficam em es
m ouraria
Cristiano Ronaldo


# Advanced usage

## Speeding up

If you want to investigate the speed-accuracy trade-off, here are a few ways to speed up segmentation:

* Turn off the reranker model by passing `use_reranker=False` to the `ws.segment` method.

* Adjust the `segmenter_gpu_batch_size` (default: `1`) and the `reranker_gpu_batch_size` (default: `2000`) parameters in the `WordSegmenter` initialization.

* Decrease the beam search parameters `topk` (default: `20`) and `steps` (default: `13`) when calling the `ws.segment` method.
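
To give an intuition for why smaller `topk` and `steps` values run faster, here is a toy beam search over space insertions. It is a sketch, not the hashformers implementation: it assumes that `topk` acts as the beam width and `steps` as the maximum number of spaces inserted, and it replaces the GPT-2 scorer with a hypothetical vocabulary-based `toy_score` (lower is better). All function names here are made up for illustration.

```python
from typing import Callable, List, Set

# Toy vocabulary standing in for a language model (assumption for illustration).
VOCAB = {"my", "old", "phone", "sucks"}


def toy_score(segmentation: str) -> float:
    """Lower is better: unknown words cost 1 each, plus a small cost per word."""
    words = segmentation.split(" ")
    unknown = sum(1 for word in words if word not in VOCAB)
    return unknown + 0.01 * len(words)


def insert_one_space(text: str) -> Set[str]:
    """Every string obtained by adding one space between two adjacent letters."""
    return {
        text[:i] + " " + text[i:]
        for i in range(1, len(text))
        if text[i - 1] != " " and text[i] != " "
    }


def beam_search(hashtag: str, score: Callable[[str], float],
                topk: int = 20, steps: int = 13) -> List[str]:
    """Return every scored candidate, best (lowest score) first."""
    scores = {hashtag: score(hashtag)}
    beam = [hashtag]
    for _ in range(steps):
        candidates: Set[str] = set()
        for item in beam:
            candidates |= insert_one_space(item)
        if not candidates:
            break
        for candidate in candidates:
            scores[candidate] = score(candidate)
        # Keep only the `topk` best candidates for the next step.
        beam = sorted(candidates, key=lambda c: (scores[c], c))[:topk]
    return sorted(scores, key=lambda c: (scores[c], c))


print(beam_search("myoldphonesucks", toy_score, topk=2, steps=3)[0])
# my old phone sucks
print(beam_search("myoldphonesucks", toy_score, topk=2, steps=1)[0])
# myoldphonesucks
```

In this sketch, each step scores the one-space expansions of at most `topk` beam items, so the number of scoring calls shrinks with both parameters. The second call shows the cost: with too few steps, the three spaces this hashtag needs can never be inserted.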

In [None]:
%%capture

from hashformers import TransformerWordSegmenter as WordSegmenter

ws = WordSegmenter(
    segmenter_model_name_or_path="distilgpt2",
    reranker_model_name_or_path="distilbert-base-uncased",
    segmenter_gpu_batch_size=1,
    reranker_gpu_batch_size=2000
)

In [None]:
%%timeit

hashtag_list = [
    "#myoldphonesucks",
    "#latinosinthedeepsouth",
    "#weneedanationalpark"
]

segmentations = ws.segment(hashtag_list)

1 loop, best of 5: 8.84 s per loop


In [None]:
%%timeit

hashtag_list = [
    "#myoldphonesucks",
    "#latinosinthedeepsouth",
    "#weneedanationalpark"
]

segmentations = ws.segment(
    hashtag_list,
    topk=5,
    steps=5,
    use_reranker=False
)

1 loop, best of 5: 3.23 s per loop


## Getting the ranks

If you pass `return_ranks=True` to the `ws.segment` method, you will receive a dictionary with the ranks generated by the segmenter and the reranker, the dataframe used by the ensemble, and the final segmentations. Within each rank, a segmentation ranks higher if its score is **lower** than the scores of the other segmentations.
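
The "lower is better" convention can be sketched in plain Python. The rows below are copied from the segmenter rank printed later in this notebook and deliberately shuffled; the helper `top_1` is a hypothetical name, not a hashformers function.

```python
from typing import Dict, List, Tuple

# (characters, segmentation, score) rows from the segmenter rank, shuffled.
rows: List[Tuple[str, str, float]] = [
    ("latinosinthedeepsouth", "latinosin the deep south", 53.662689),
    ("latinosinthedeepsouth", "latinos in the deep south", 50.041458),
    ("latinosinthedeepsouth", "latino s in the deep south", 53.423897),
]


def top_1(rows: List[Tuple[str, str, float]]) -> Dict[str, str]:
    """Pick, for each hashtag, the segmentation with the lowest score."""
    best: Dict[str, Tuple[str, float]] = {}
    for characters, segmentation, score in rows:
        if characters not in best or score < best[characters][1]:
            best[characters] = (segmentation, score)
    return {characters: pair[0] for characters, pair in best.items()}


print(top_1(rows))
# {'latinosinthedeepsouth': 'latinos in the deep south'}
```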

Rank outputs are useful if you want to combine the segmenter rank and the reranker rank in ways that are more sophisticated than the basic ensembler that comes by default with **hashformers**.

For instance, you may want to take two or more ranks (also called "runs"), convert them to the TREC format, and combine them through a rank fusion technique with the [trectools library](https://github.com/joaopalotti/trectools).
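
As one possible illustration of rank fusion, the sketch below applies reciprocal rank fusion (RRF) to the top five segmenter and reranker candidates printed later in this notebook, then writes the fused result as whitespace-separated TREC run lines (`query Q0 document rank score tag`). It is plain Python and does not call trectools. The function names, the `k=60` constant, the `rrf` tag, and the replacement of spaces with underscores in the document field are all assumptions made for this example. Note that TREC runs treat higher scores as better, which the fused RRF score satisfies, whereas raw hashformers scores would have to be inverted first.

```python
from collections import defaultdict
from typing import Dict, List

# Candidates in rank order (best first), copied from this notebook's outputs.
segmenter_run = [
    "latinos in the deep south",
    "latino s in the deep south",
    "latinosin the deep south",
    "la tinos in the deep south",
    "latinos in the deepsouth",
]
reranker_run = [
    "latinos in the deep south",
    "latino s in the deep south",
    "latinos in the deepsouth",
    "latin os in the deep south",
    "la tinos in the deep south",
]


def reciprocal_rank_fusion(runs: List[List[str]], k: int = 60) -> Dict[str, float]:
    """Sum 1 / (k + position) over every run in which a candidate appears."""
    fused: Dict[str, float] = defaultdict(float)
    for run in runs:
        for position, candidate in enumerate(run, start=1):
            fused[candidate] += 1.0 / (k + position)
    return dict(fused)


def to_trec_lines(query_id: str, fused: Dict[str, float], tag: str = "rrf") -> List[str]:
    """Format fused scores as TREC run lines, highest fused score first."""
    ordered = sorted(fused.items(), key=lambda item: (-item[1], item[0]))
    return [
        f"{query_id} Q0 {candidate.replace(' ', '_')} {rank} {score:.6f} {tag}"
        for rank, (candidate, score) in enumerate(ordered, start=1)
    ]


fused = reciprocal_rank_fusion([segmenter_run, reranker_run])
lines = to_trec_lines("latinosinthedeepsouth", fused)
print(lines[0])
# latinosinthedeepsouth Q0 latinos_in_the_deep_south 1 0.032787 rrf
```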

In [None]:
hashtag_list = [
    "#myoldphonesucks",
    "#latinosinthedeepsouth",
    "#weneedanationalpark"
]

ranks = ws.segment(
    hashtag_list,
    use_reranker=True,
    return_ranks=True
)

In [None]:
# Segmenter rank
ranks.segmenter_rank

,characters,segmentation,score
0,latinosinthedeepsouth,latinos in the deep south,50.041458
1,latinosinthedeepsouth,latino s in the deep south,53.423897
2,latinosinthedeepsouth,latinosin the deep south,53.662689
3,latinosinthedeepsouth,la tinos in the deep south,54.122768
4,latinosinthedeepsouth,latinos in the deepsouth,54.437469
...,...,...,...
905,weneedanationalpark,weneed anatio nalpark,80.100243
906,weneedanationalpark,weneedanati onalpa rk,80.674561
907,weneedanationalpark,weneedanat ionalpa rk,81.096085
908,weneedanationalpark,weneedanat ionalpar k,82.248749


In [None]:
# Reranker rank
ranks.reranker_rank

,characters,segmentation,score
0,latinosinthedeepsouth,latinos in the deep south,18.863357
1,latinosinthedeepsouth,latino s in the deep south,36.419517
2,latinosinthedeepsouth,latinos in the deepsouth,37.305017
3,latinosinthedeepsouth,latin os in the deep south,38.368534
4,latinosinthedeepsouth,la tinos in the deep south,38.611647
...,...,...,...
905,weneedanationalpark,weneed a nati onalpark,84.555845
906,weneedanationalpark,w eneedanationalpar k,85.361568
907,weneedanationalpark,w eneedanationalp ark,86.047094
908,weneedanationalpark,w eneedanationa lpark,86.134639


## Evaluation 

The `evaluate_df` function evaluates the accuracy, F1, precision and recall of our segmentations. It uses exactly the same evaluation method as previous authors in the field of hashtag segmentation (Çelebi et al., [BOUN Hashtag Segmentor](https://tabilab.cmpe.boun.edu.tr/projects/hashtag_segmentation/)).

We have to pass a dataframe with a field for the gold segmentations (the `gold_field`) and a field for the candidate segmentations (the `segmentation_field`).

The relationship between gold and candidate segmentations does not have to be one-to-one. If we pass more than one candidate segmentation for a single hashtag, `evaluate_df` will measure the upper bound that can be achieved on our ranks (e.g. Acc@10, Recall@10).
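
To make the idea of an upper bound concrete, here is a minimal plain-Python sketch of Acc@k: a hashtag counts as correct if its gold segmentation appears anywhere in the top k candidates. This is not the implementation of `evaluate_df`; the function `accuracy_at_k` is hypothetical, and the candidate ordering for `weneedanationalpark` is made up so that the two values of k give different results.

```python
from typing import Dict, List

gold: Dict[str, str] = {
    "latinosinthedeepsouth": "latinos in the deep south",
    "weneedanationalpark": "we need a national park",
}

# Candidates in rank order (best first).
ranked: Dict[str, List[str]] = {
    "latinosinthedeepsouth": [
        "latinos in the deep south",
        "latino s in the deep south",
    ],
    # Made-up ordering for illustration only.
    "weneedanationalpark": [
        "we need anational park",
        "we need a national park",
    ],
}


def accuracy_at_k(gold: Dict[str, str], ranked: Dict[str, List[str]], k: int) -> float:
    """Percentage of hashtags whose gold segmentation is among the top k candidates."""
    hits = sum(
        1 for hashtag, answer in gold.items()
        if answer in ranked.get(hashtag, [])[:k]
    )
    return 100.0 * hits / len(gold)


print(accuracy_at_k(gold, ranked, k=1))  # 50.0
print(accuracy_at_k(gold, ranked, k=2))  # 100.0
```

Acc@1 measures what the segmenter actually returns, while Acc@k for larger k shows how much a better reranking of the same candidates could recover.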

### Minimal example

In [None]:
# Let's measure the actual performance of the segmenter: 
# we will evaluate only the top-1.
import pandas as pd
from hashformers.experiments.evaluation import evaluate_df

gold_segmentations = {
    "myoldphonesucks" : "my old phone sucks",
    "latinosinthedeepsouth": "latinos in the deep south",
    "weneedanationalpark": "we need a national park"
}

gold_df = pd.DataFrame(gold_segmentations.items(),
    columns=["characters", "gold"])

segmenter_top_1 = ranks.segmenter_rank.groupby('characters').head(1)
segmenter_top_1 = segmenter_top_1.astype(str).applymap(lambda x: x.lstrip("#").strip())

eval_df = pd.merge(gold_df, segmenter_top_1, on="characters")

eval_df

,characters,gold,segmentation,score
0,myoldphonesucks,my old phone sucks,my old phone sucks,34.331543
1,latinosinthedeepsouth,latinos in the deep south,latinos in the deep south,50.041458
2,weneedanationalpark,we need a national park,we need a national park,35.088081


In [None]:
evaluate_df(
    eval_df,
    gold_field="gold",
    segmentation_field="segmentation"
)

{'acc': 100.0, 'f1': 100.0, 'precision': 100.0, 'recall': 100.0}

### Benchmarking

Here we evaluate a `distilgpt2` model on 1000 hashtags.

We collect our hashtags from 10 word segmentation datasets by taking the first 100 hashtags from each dataset. 

You can read more about each dataset on [their cards at the Hugging Face Hub](https://huggingface.co/ruanchaves).

In [5]:
%%capture
!pip install datasets

In [6]:
from hashformers.experiments.evaluation import evaluate_df
import pandas as pd
from hashformers import TransformerWordSegmenter
from datasets import load_dataset

user = "ruanchaves"

dataset_names = [
    "boun",
    "stan_small",
    "stan_large",
    "dev_stanford",
    "test_stanford",
    "snap",
    "hashset_distant",
    "hashset_manual",
    "hashset_distant_sampled",
    "nru_hse"
]

dataset_names = [ f"{user}/{dataset}" for dataset in dataset_names ]

ws = TransformerWordSegmenter(
    segmenter_model_name_or_path="distilgpt2",
    reranker_model_name_or_path=None
)

def generate_experiments(datasets, splits, samples=100):
    for dataset_name in datasets:
        for split in splits:
            try:
                dataset = load_dataset(dataset_name, split=f"{split}[0:{samples}]")
                yield {
                    "dataset": dataset,
                    "split": split,
                    "name": dataset_name
                }
            except Exception:
                # Skip any split that is missing or fails to load for this dataset.
                continue

benchmark = []
for experiment in generate_experiments(dataset_names, ["train", "validation", "test"], samples=100):
    hashtags = experiment['dataset']['hashtag']
    annotations = experiment['dataset']['segmentation']
    segmentations = ws.segment(hashtags, use_reranker=False, return_ranks=False)

    eval_df = [{
        "gold": gold,
        "hashtags": hashtag,
        "segmentation": segmentation
    } for gold, hashtag, segmentation in zip(annotations, hashtags, segmentations)]
    eval_df = pd.DataFrame(eval_df)

    eval_results = evaluate_df(
        eval_df,
        gold_field="gold",
        segmentation_field="segmentation"
    )

    eval_results.update({
        "name": experiment["name"],
        "split": experiment["split"]
    })
    benchmark.append(eval_results)


In [7]:
benchmark_df = pd.DataFrame(benchmark)
benchmark_df["name"] = benchmark_df["name"].apply(lambda x: x[(len(user) + 1):])
benchmark_df = benchmark_df.set_index(["name", "split"])
benchmark_df = benchmark_df.round(3)
benchmark_df

name,split,f1,acc,recall,precision
boun,validation,94.577,93.0,92.766,96.46
boun,test,77.679,69.0,73.109,82.857
dev_stanford,validation,78.75,78.0,77.301,80.255
test_stanford,test,68.896,69.474,62.424,76.866
snap,train,84.296,76.0,81.557,87.225
hashset_distant,test,86.331,78.0,86.022,86.643
hashset_manual,test,58.332,49.0,53.586,64.0
hashset_distant_sampled,test,84.561,81.0,86.071,83.103
nru_hse,test,88.39,91.0,86.765,90.076


In [9]:
benchmark_df.agg(['mean', 'std']).round(3)

,f1,acc,recall,precision
mean,80.201,76.053,77.733,83.054
std,10.964,13.075,12.745,9.141
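
As a sanity check on the aggregate above, the `acc` column can be recomputed with the standard library. pandas' `std` uses the sample standard deviation (`ddof=1`) by default, which corresponds to `statistics.stdev` rather than `statistics.pstdev`. The values below are copied from the rounded benchmark table.

```python
import statistics

# `acc` column of the benchmark table above, in row order.
acc = [93.0, 69.0, 78.0, 69.474, 76.0, 78.0, 49.0, 81.0, 91.0]

print(round(statistics.mean(acc), 3))   # 76.053
print(round(statistics.stdev(acc), 3))  # 13.075
```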
