<a href="https://colab.research.google.com/github/yonikremer/grouped_sampling/blob/master/colab_demo.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Use grouped sampling:

In [1]:
!pip install -q transformers grouped_sampling torch beautifulsoup4 accelerate sentencepiece


In [2]:
import timeit

from grouped_sampling import GroupedSamplingPipeLine


def compare_generators(
        pipeline: GroupedSamplingPipeLine,
        prompt: str,
        num_tokens: int,
        ):
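    """Time the same pipeline twice on one prompt and print both results.

    The first run uses the group size the pipeline was created with; the
    second run sets the group size to 1024.
    """
    print("Your prompt:")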
    print(prompt)

    # Baseline run with the group size the pipeline was created with
    # (1 in this notebook).
    start_non_grouped = timeit.default_timer()
    non_grouped_ans: str = pipeline(
        prompt_s=prompt,
        max_new_tokens=num_tokens,
        return_full_text=False
    )["generated_text"]
    stop_non_grouped = timeit.default_timer()
    non_grouped_time = stop_non_grouped - start_non_grouped
    print(f"Text generated by Non grouped sampling"
          f" in {non_grouped_time} seconds:")
    print(non_grouped_ans)

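    # Reuse the already-loaded model: changing group_size on the same pipeline
    # avoids a second download, but it mutates the caller's object.
    pipeline.group_size = 1024
    grouped_generator = pipeline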
    start_grouped_generation = timeit.default_timer()
    grouped_ans: str = grouped_generator(
        prompt_s=prompt,
        max_new_tokens=num_tokens,
        return_full_text=False
    )["generated_text"]
    stop_grouped_generation = timeit.default_timer()
    grouped_time = stop_grouped_generation - start_grouped_generation
    print(f"Text generated by grouped sampling"
          f" in {grouped_time} seconds:")
    print(grouped_ans)


model_name = "facebook/opt-iml-1.3b"
prompt = "I had so much fun I the" #@param {type:"string"}
num_tokens = 100 #@param {type:"integer", min:1}
top_p = 1 #@param {type:"slider", min:0.0, max:1.0, step:0.05}
top_k = 1 #@param {type:"integer"}
temperature = 1 #@param {type:"number", min:0.000000001}

non_grouped_generator = GroupedSamplingPipeLine(
    model_name=model_name,
    group_size=1,
    temp=temperature, 
    top_k=top_k, 
    top_p=top_p,
    end_of_sentence_stop=False,
    load_in_8bit=False,
    )

compare_generators(non_grouped_generator, prompt, num_tokens)
del non_grouped_generator

Your prompt:
I had so much fun I the
Text generated by Non grouped sampling in 13.83421635999997 seconds:
 last few days.
I was able to get a lot done and I'm so glad that I did!
It's been a long time since I've had this much fun, but it feels good to be back in the swing of things again.
The weather has been great too - perfect for running outside.
We went on our first run yesterday morning and it felt like summertime out there.
Today we're going to go for another run before heading home for some rest and relaxation
Text generated by grouped sampling in 9.09141257799996 seconds:
 last few days.
I was able to get a lot done and I'm so glad that I did!
It's been a long time since I've had this much fun, but it feels good to be back in the swing of things again.
The weather has been great too - perfect for running outside.
We went on our first run yesterday morning and it felt like summertime out there.
Today we're going to go for another run before heading home for some rest and relaxat
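
To put the two timings above in perspective, the short sketch below turns them into a speedup ratio. The numbers are copied from this single run, so treat the result as an illustration rather than a benchmark: timings will vary with hardware, caching, and the sampling settings.

```python
# Wall-clock timings copied from the run above, in seconds.
non_grouped_time = 13.83421635999997
grouped_time = 9.09141257799996

# Ratio of the two runs and the fraction of time saved by the grouped run.
speedup = non_grouped_time / grouped_time
time_saved = 1 - grouped_time / non_grouped_time

print(f"Speedup: {speedup:.2f}x")      # Speedup: 1.52x
print(f"Time saved: {time_saved:.0%}")  # Time saved: 34%
```
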

# Change the hyperparameters and see what happens!
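
Before changing `temperature`, `top_k`, and `top_p`, it may help to see how these three settings usually reshape a next-token distribution. The sketch below is a generic, pure-Python illustration of temperature scaling followed by top-k and top-p (nucleus) filtering. The function `filter_probs` and the toy logits are hypothetical and are not taken from `grouped_sampling`, whose internal implementation may differ.

```python
import math


def filter_probs(logits, temperature=1.0, top_k=None, top_p=1.0):
    """Return {token_index: probability} after temperature, top-k and top-p.

    Hypothetical helper for illustration only; not the library's code.
    """
    # Temperature scaling, then a numerically stable softmax.
    scaled = [x / temperature for x in logits]
    biggest = max(scaled)
    exps = [math.exp(x - biggest) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Token indices sorted from most to least probable.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)

    # Top-k: keep only the k most probable tokens.
    if top_k is not None:
        order = order[:top_k]

    # Top-p: keep the smallest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break

    # Renormalize so the surviving probabilities sum to 1.
    kept_total = sum(probs[i] for i in kept)
    return {i: probs[i] / kept_total for i in kept}


toy_logits = [2.0, 1.0, 0.0]  # made-up scores for a three-token vocabulary

# top_k=1 (the value used in the form above) keeps only the most likely token,
# so sampling is deterministic.
print(filter_probs(toy_logits, top_k=1))  # {0: 1.0}

# A higher temperature flattens the distribution; a lower one sharpens it.
print(filter_probs(toy_logits, temperature=2.0))
print(filter_probs(toy_logits, temperature=0.5))

# top_p=0.9 drops the least likely token in this toy example.
print(filter_probs(toy_logits, top_p=0.9))
```

With `top_k = 1`, as in the form above, the `temperature` and `top_p` settings cannot change which token is picked, which is consistent with both runs producing nearly the same text. Raising `top_k` is the first change to try if you want the other two sliders to matter.
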