# Getting Started with Bloom


## Zero Shot and Few Shot Learners

Large language models can adapt to new tasks quickly, often without any fine-tuning.
For some tasks, a model performs well enough when shown just a few worked examples in its input (few-shot), or even none at all (zero-shot).
The text given to the model, including any examples, is called a `prompt`, and the practice of formatting that input to get better results is called `prompt engineering`.
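
To make the zero-shot versus few-shot distinction concrete, here is a minimal sketch of how such a prompt could be assembled. The helper name `build_few_shot_prompt`, the `Q:`/`A:` layout, and the sample questions are illustrative assumptions, not a format BLOOM requires.

```python
def build_few_shot_prompt(examples, query, instruction=""):
    """Format (question, answer) pairs plus a new query into one prompt string."""
    parts = [instruction] if instruction else []
    for question, answer in examples:
        parts.append(f"Q: {question}\nA: {answer}")
    # Leave the final answer empty so the model continues from here.
    parts.append(f"Q: {query}\nA:")
    return "\n\n".join(parts)


# Hypothetical worked examples shown to the model inside the prompt.
examples = [
    ("What is the capital of France?", "Paris"),
    ("What is the capital of Japan?", "Tokyo"),
]

few_shot_prompt = build_few_shot_prompt(examples, "What is the capital of Italy?")
zero_shot_prompt = build_few_shot_prompt([], "What is the capital of Italy?")
print(few_shot_prompt)
```

With an empty example list the same helper yields a zero-shot prompt, so the two styles differ only in how much context the model sees before the final question.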

## Downloading a Pre-Trained Tokenizer & Model

In [10]:
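from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the tokenizer that matches the BLOOM checkpoint.
tokenizer = AutoTokenizer.from_pretrained("bigscience/bloom")
# AutoModelForCausalLM includes the language-modeling head that generate() needs;
# plain AutoModel returns only the base transformer.
model = AutoModelForCausalLM.from_pretrained("bigscience/bloom")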

Downloading pytorch_model_00016-of-00072.bin:   0%|          | 0.00/4.59G [00:00<?, ?B/s]

ChunkedEncodingError: ("Connection broken: ConnectionResetError(104, 'Connection reset by peer')", ConnectionResetError(104, 'Connection reset by peer'))
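
A connection reset like the one above is usually a transient network failure, so one option is to repeat the call. The sketch below is a generic retry wrapper with exponential backoff. The name `with_retries` and its parameters are hypothetical, and it assumes that shards which finished downloading stay in the local Hugging Face cache so a repeated call does not have to fetch them again.

```python
import time


def with_retries(fn, attempts=5, base_delay=1.0, retry_on=(Exception,), sleep=time.sleep):
    """Call fn(); on a listed exception, wait and retry with exponential backoff."""
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            # Wait 1x, 2x, 4x, ... the base delay before the next attempt.
            sleep(base_delay * 2 ** (attempt - 1))


# Intended use in this notebook (not executed here because it needs network access):
# tokenizer = with_retries(lambda: AutoTokenizer.from_pretrained("bigscience/bloom"))
```

Passing a narrower `retry_on`, for example `(requests.exceptions.ChunkedEncodingError,)`, keeps unrelated errors from being retried.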

In [None]:
from transformers import pipeline
import torch
import time

s = time.time()
pipe = pipeline(model="bigscience/bloom", torch_dtype=torch.bfloat16)
print(f"Time to load model: {time.time()-s}")

In [None]:
from IPython.display import HTML as html_print

def cstr(s, color='black'):
    # Wrap s in a colored tag, turning newlines into <br> so they survive HTML rendering.
    return "<text style=color:{}>{}</text>".format(color, s.replace('\n', '<br>'))

def cstr_with_newlines(s, color='black'):
    # Identical to cstr; kept under its original name.
    return cstr(s, color)

Q: A juggler can juggle 16 balls. Half of the balls are golf balls, and half of the golf balls are blue. How many blue golf balls are there?
A: Let’s think step by step.
Let $B \overset{\text{ref}}{=}$ blue golf balls.
Since half of the balls are a few foreign yones distinctive of Chinese name cosmogonies but are China a unique case have variations been overlooked do exist Thesis method to cover all ages according which are almost but in the former case but again with

In [7]:
prompt = "One of the hottest areas of investing in recent years has been ESG: "
prompt += "the use of environmental, social, and governance criteria to evaluate possible investments."

result_length = 100
inputs = tokenizer(prompt, return_tensors="pt")

- `result_length` caps the total sequence length in tokens. Because it is passed to `generate` as `max_length`, it counts the prompt tokens as well as the generated ones.
- `inputs` contains the token IDs for the prompt (plus an attention mask), returned as PyTorch tensors. If we were using TensorFlow we’d pass `return_tensors="tf"`.
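
In `generate`, `max_length` is a cap on prompt tokens plus generated tokens, so the room left for the continuation shrinks as the prompt grows. The arithmetic can be sketched as follows. The helper `new_token_budget` and the prompt size of 30 tokens are made up for illustration.

```python
def new_token_budget(prompt_token_count, max_length):
    """Tokens left for the continuation when max_length counts prompt + output."""
    return max(0, max_length - prompt_token_count)


result_length = 100
# Hypothetical prompt size; in the notebook the real value is inputs["input_ids"].shape[1].
prompt_token_count = 30

print(new_token_budget(prompt_token_count, result_length))  # 70
```

Passing `max_new_tokens` instead of `max_length` sets the continuation length directly and avoids this dependence on the prompt size.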


## Running Inference: Strategies for Better Responses

In [8]:
# Greedy Search
print(
    tokenizer.decode(
        model.generate(inputs["input_ids"], max_length=result_length)[0]))

One of the hottest areas of investing in recent years has been ESG: the use of environmental, social, and governance criteria to evaluate possible investments. The ESG movement has been growing steadily since the financial crisis, and the number of companies that have adopted ESG criteria has increased by more than 50% in the last five years. The ESG movement is also growing in the United States, where the number of companies that have adopted ESG criteria has increased by more than 50% in the
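
The greedy output above falls into a loop, repeating the clause about companies adopting ESG criteria. One common remedy is a repetition penalty, which lowers the score of tokens that have already been generated. The toy sketch below shows the idea on a hand-written score table. The function `penalize_repeats` and the numbers are hypothetical, and the divide-or-multiply rule is an assumption about how such penalties are commonly applied, not a copy of the library's code.

```python
def penalize_repeats(logits, seen_tokens, penalty=1.2):
    """Return a copy of {token: logit} with already generated tokens made less likely."""
    adjusted = dict(logits)
    for token in set(seen_tokens):
        if token in adjusted:
            score = adjusted[token]
            # Dividing a positive logit or multiplying a negative one both lower it.
            adjusted[token] = score / penalty if score > 0 else score * penalty
    return adjusted


# Made-up next-token scores; "criteria" and "the" were already generated.
logits = {"criteria": 2.0, "funds": 1.8, "the": -0.5}
seen = ["criteria", "the"]

adjusted = penalize_repeats(logits, seen, penalty=1.5)
greedy_before = max(logits, key=logits.get)
greedy_after = max(adjusted, key=adjusted.get)
print(greedy_before, "->", greedy_after)  # criteria -> funds
```

A penalty of 1.0 leaves the scores unchanged, and larger values push greedy decoding away from tokens it has already produced.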


In [None]:
def local_inf(prompt, temperature=0.7, top_p=None, max_new_tokens=32, repetition_penalty=None, do_sample=False, num_return_sequences=1):
    response = pipe(prompt,
                    temperature=temperature,  # 0 to 1; only has an effect when do_sample=True
                    top_p=top_p,  # None, or 0-1
                    max_new_tokens=max_new_tokens,  # up to 2047 theoretically
                    return_full_text=False,  # False: return only the continuation, not the prompt
                    repetition_penalty=repetition_penalty,  # None, or a float; 1.0 means no penalty
                    do_sample=do_sample,  # True: use sampling, False: greedy decoding
                    num_return_sequences=num_return_sequences
                    )
    generated = response[0]['generated_text']
    # Return the colorized prompt + continuation for display, and the raw continuation.
    return html_print(cstr(prompt, color='#f1f1c7') + cstr(generated, color='#a1d8eb')), generated
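
To give a feel for what `do_sample`, `temperature`, and `top_p` control, here is a small pure-Python sketch over a made-up next-token distribution. The helper names and probabilities are hypothetical. It is a simplified picture of temperature scaling followed by nucleus (top-p) filtering, not the library's implementation.

```python
import math
import random


def apply_temperature(probs, temperature):
    """Rescale a {token: prob} distribution; temperatures below 1 sharpen it."""
    scaled = {t: math.exp(math.log(p) / temperature) for t, p in probs.items() if p > 0}
    total = sum(scaled.values())
    return {t: v / total for t, v in scaled.items()}


def top_p_filter(probs, top_p):
    """Keep the smallest set of most likely tokens whose total mass reaches top_p."""
    kept, mass = {}, 0.0
    for token, p in sorted(probs.items(), key=lambda kv: kv[1], reverse=True):
        kept[token] = p
        mass += p
        if mass >= top_p:
            break
    # Renormalize so the kept probabilities sum to 1 again.
    return {t: p / mass for t, p in kept.items()}


def pick_next(probs, do_sample, temperature=1.0, top_p=1.0, rng=random):
    """Greedy decoding takes the argmax; sampling draws from the filtered distribution."""
    if not do_sample:
        return max(probs, key=probs.get)
    filtered = top_p_filter(apply_temperature(probs, temperature), top_p)
    tokens = list(filtered)
    return rng.choices(tokens, weights=[filtered[t] for t in tokens])[0]


# Made-up probabilities for the token that follows "adopted ESG".
next_token_probs = {"criteria": 0.5, "funds": 0.3, "ratings": 0.15, "zebra": 0.05}

print(pick_next(next_token_probs, do_sample=False))  # greedy always picks "criteria"
print(sorted(top_p_filter(next_token_probs, top_p=0.9)))  # "zebra" falls outside the nucleus
print(pick_next(next_token_probs, do_sample=True, temperature=0.7, top_p=0.9))
```

Greedy decoding returns the same token every time, which is one reason it can loop, while sampling with a moderate `top_p` varies the output but still excludes very unlikely tokens.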


In [None]:
inp = """# Use OpenCV in Python"""
color_resp, resp = local_inf(inp, max_new_tokens=64)
color_resp
