Thursday, November 9, 2023

[InfiniText: Empowering Conversations & Content with Mistral-7B-Instruct-v0.1](https://huggingface.co/blog/Andyrasika/mistral-7b-empowering-conversation) (Published October 13, 2023)

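The cells below run top to bottom in a single pass: they load Mistral-7B-Instruct-v0.1 with 4-bit quantization, define a prompting helper, and generate two sample completions.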

In [1]:
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig, AutoProcessor
from transformers import GenerationConfig, pipeline
from langchain import HuggingFacePipeline
from langchain import PromptTemplate, LLMChain

In [2]:
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16
)

# model_id = "mistralai/Mistral-7B-Instruct-v0.1"
# model = AutoModelForCausalLM.from_pretrained(model_id, quantization_config=bnb_config, device_map="auto")
# tokenizer = AutoTokenizer.from_pretrained(model_id)

model_id = "mistralai/Mistral-7B-Instruct-v0.1"
model = AutoModelForCausalLM.from_pretrained(model_id, quantization_config=bnb_config)
tokenizer = AutoTokenizer.from_pretrained(model_id)

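As a rough sanity check on why the 4-bit configuration above helps, the weight memory can be estimated from the parameter count and the bits per weight. The sketch below is plain arithmetic, not a measurement: the 7.2 billion parameter count is an approximation assumed here, `weight_memory_gb` is a hypothetical helper, and the estimate ignores quantization constants, activations, and the KV cache.

```python
# Back-of-the-envelope weight memory for a ~7B-parameter model.
# ASSUMPTION: ~7.2e9 parameters; real usage is higher because quantization
# constants, activations, and the KV cache are not counted here.
APPROX_PARAMS = 7.2e9


def weight_memory_gb(num_params: float, bits_per_weight: int) -> float:
    """Return the approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


for label, bits in [("fp32", 32), ("bf16", 16), ("nf4", 4)]:
    print(f"{label}: ~{weight_memory_gb(APPROX_PARAMS, bits):.1f} GB")
```

Under these assumptions the 16-bit figure is in line with the roughly 15GB checkpoint copy noted in the next cell, while the 4-bit weights need about a quarter of that.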

In [None]:
# docker cp c9b676310ea0://home/rob/Data2/huggingface/transformers/models--mistralai--Mistral-7B-Instruct-v0.1 /home/rob/Data3/huggingface/transformers/
# Successfully copied 15GB to /home/rob/Data2/huggingface/transformers

In [3]:
import random
import textwrap

device = 'cuda' if torch.cuda.is_available() else 'cpu'

def wrap_text(text, width=90):  # wraps each line separately so existing newlines are preserved
  lines = text.split('\n')
  wrapped_lines = [textwrap.fill(line, width=width) for line in lines]
  wrapped_text = '\n'.join(wrapped_lines)
  return wrapped_text

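def multimodal_prompt(input_text: str, system_prompt: str = "", max_length: int = 512) -> None:
  """Generate a reply from the Mistral instruct model and print it.

  Wraps `input_text` in the [INST] ... [/INST] template, samples a
  completion with the module-level `model` and `tokenizer`, and prints the
  decoded text wrapped to 90 characters. Despite the function name, the
  prompt is text only.

  Args:
    input_text: The user instruction placed inside the [INST] tags.
    system_prompt: Currently unused.
    max_length: Maximum total length in tokens (prompt plus generated text).

  Returns:
    None. The wrapped response is printed rather than returned.
  """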
  prompt = f"""<s>[INST]{input_text}[/INST]"""
  encodeds = tokenizer(prompt, return_tensors="pt", add_special_tokens=False)
  model_inputs = encodeds.to(device)

  # Sample at a low temperature; max_length counts the prompt tokens as well
  output = model.generate(**model_inputs,
                          max_length=max_length,
                          use_cache=True,
                          bos_token_id=model.config.bos_token_id,
                          eos_token_id=model.config.eos_token_id,
                          pad_token_id=model.config.eos_token_id,
                          temperature=0.1,
                          do_sample=True)

  # Only one sequence is generated, so random.choice returns that single decoded string
  response = random.choice(tokenizer.batch_decode(output))

  # Wrap the response text to a width of 90 characters
  wrapped_response = wrap_text(response)
  print(wrapped_response)
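
The `wrap_text` helper above wraps each line separately instead of calling `textwrap.fill` on the whole string, which matters for multi-paragraph replies. The standalone sketch below restates the helper so it runs on its own and compares the two approaches; the sample string and the width of 20 are made up for illustration.

```python
import textwrap


def wrap_text(text, width=90):
    # Wrap each line on its own so blank lines between paragraphs survive.
    return '\n'.join(textwrap.fill(line, width=width) for line in text.split('\n'))


# Illustrative input: two paragraphs separated by a blank line.
sample = "first paragraph has several words\n\nsecond"

per_line = wrap_text(sample, width=20)      # paragraph break is kept
whole = textwrap.fill(sample, width=20)     # newlines are treated as plain whitespace

print(repr(per_line))
print(repr(whole))
```

Calling `textwrap.fill` on the full text folds the blank line into ordinary whitespace, so the paragraph structure of the model's reply would be lost.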


In [4]:
multimodal_prompt('Write a detailed analogy between mathematics and a lighthouse.',
         max_length=256)




<s> [INST]Write a detailed analogy between mathematics and a lighthouse.[/INST]
Mathematics can be compared to a lighthouse in many ways. Just as a lighthouse guides
mathematics provides a framework for understanding and navigating the complexities of the
world around us.

The light emitted by a lighthouse can be seen for miles, guiding ships even in the darkest
of conditions. Similarly, the principles of mathematics can be applied in a wide range of
situations, from simple calculations to complex scientific and technological problems.

Just as a lighthouse must be carefully maintained and updated to remain effective,
mathematics must also be constantly refined and expanded to keep pace with new discoveries
and technologies.

Furthermore, just as a lighthouse serves as a beacon of hope and guidance for sailors,
mathematics can provide a sense of order and predictability in an often chaotic world. By
using mathematical models and equations, we can make predictions and plan for the futur
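
The reply above stops mid-word, which is what one would expect when generation hits the `max_length=256` limit: in `generate`, `max_length` counts the prompt tokens as well as the newly generated ones. A minimal sketch of that budget arithmetic follows; `new_token_budget` is a hypothetical helper, and the prompt length of 20 tokens is a made-up figure rather than the real token count for this prompt.

```python
def new_token_budget(max_length: int, prompt_tokens: int) -> int:
    """Tokens left for the reply when max_length covers prompt plus reply."""
    return max(max_length - prompt_tokens, 0)


# ASSUMPTION: illustrative prompt length; in the notebook the real count
# would come from len(tokenizer(prompt)["input_ids"]).
print(new_token_budget(max_length=256, prompt_tokens=20))  # room left for the reply
print(new_token_budget(max_length=16, prompt_tokens=20))   # prompt alone exceeds the cap
```

Passing `max_new_tokens` to `generate` instead of `max_length` caps only the reply, which makes the limit independent of the prompt length.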

In [5]:
multimodal_prompt("""Alice: I don't know why, I'm struggling to maintain focus while studying. Any suggestion? \n\n Bob:""",max_length=128)


<s> [INST]Alice: I don't know why, I'm struggling to maintain focus while studying. Any
suggestion?

 Bob:[/INST] Bob: Have you tried breaking your study sessions into smaller, more focused
intervals? This technique is called the Pomodoro Technique and it can help you maintain
focus and increase productivity. You can also try taking short breaks between study
sessions to give your brain a break and recharge. Additionally, make sure you're studying
in a comfortable and distraction-free environment.</s>
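
Because the whole sequence is decoded, each printed result repeats the prompt along with the `<s>` and `</s>` markers. If only the answer is wanted, the decoded string can be post-processed. The sketch below assumes the decoded text follows the `[INST] ... [/INST]` template used in this notebook; `extract_reply` is a hypothetical helper, not part of transformers, and the sample string is a shortened stand-in for the real output.

```python
def extract_reply(decoded: str) -> str:
    """Return the text after the last [/INST] tag, without the </s> marker."""
    # rpartition keeps everything after the final closing tag; if the tag is
    # missing, the whole string is treated as the reply.
    _, _, reply = decoded.rpartition("[/INST]")
    return reply.replace("</s>", "").strip()


# ASSUMPTION: abbreviated example in the same shape as the output above.
decoded = "<s> [INST]Alice: Any suggestion?\n\n Bob:[/INST] Bob: Try short, focused sessions.</s>"
print(extract_reply(decoded))
```

Alternatively, `tokenizer.batch_decode(output, skip_special_tokens=True)` drops the `<s>` and `</s>` markers at decode time, though the echoed prompt text remains.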
