# Introduction 

This notebook replicates a simple instruction template for chatting with a fine-tuned model. The model can only respond to instructions and does a poor job of remembering context.

This is a Phi 1.5 model fine-tuned on the Alpaca dataset (https://huggingface.co/datasets/tatsu-lab/alpaca). The fine-tuning notebook is in the `assistant_sft` directory.

**NOTE: The notebook uses a customized streamer for text streaming.**

In [1]:
from transformers import (
    AutoModelForCausalLM, 
    AutoTokenizer,
    pipeline,
    logging,
    # TextStreamer
)

from streaming_utils import TextStreamer

In [2]:
model = AutoModelForCausalLM.from_pretrained(
    '../assistant_sft/outputs/phi_1_5_alpaca_qlora/best_model/', load_in_4bit=True
)
tokenizer = AutoTokenizer.from_pretrained(
    '../assistant_sft/outputs/phi_1_5_alpaca_qlora/best_model/'
)

The `load_in_4bit` and `load_in_8bit` arguments are deprecated and will be removed in the future versions. Please, pass a `BitsAndBytesConfig` object in `quantization_config` argument instead.
`low_cpu_mem_usage` was None, now set to True since model is quantized.
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
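
The first warning above asks for a `BitsAndBytesConfig` object passed through `quantization_config`. A minimal sketch of the same load in that form follows. The helper name `load_4bit` is hypothetical, and the imports are deferred into the function so that defining it does not require `transformers` to be importable. Only the 4-bit flag is carried over from the original call; every other quantization option is left at the library default.

```python
MODEL_DIR = '../assistant_sft/outputs/phi_1_5_alpaca_qlora/best_model/'

def load_4bit(model_dir=MODEL_DIR):
    """Load the fine-tuned model in 4-bit without the deprecated `load_in_4bit` kwarg."""
    # Imported lazily: these need transformers (and bitsandbytes at load time).
    from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

    # Same setting as load_in_4bit=True, expressed as a config object.
    quant_config = BitsAndBytesConfig(load_in_4bit=True)
    model = AutoModelForCausalLM.from_pretrained(
        model_dir, quantization_config=quant_config
    )
    tokenizer = AutoTokenizer.from_pretrained(model_dir)
    return model, tokenizer
```

Calling `model, tokenizer = load_4bit()` would stand in for cell 2.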


In [3]:
streamer = TextStreamer(
    tokenizer, 
    skip_prompt=True, 
    skip_special_tokens=True, 
    truncate_before_pattern=['###', 'Instruction:'],
    truncate=True
)
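
`streaming_utils.TextStreamer` is the notebook's own module and its source is not shown here. As a rough illustration of what truncating a stream before a pattern can involve, the sketch below buffers incoming text and stops emitting once any stop pattern appears, even when the pattern is split across two chunks. The class `PatternTruncator` and its methods are hypothetical and are not the author's implementation.

```python
class PatternTruncator:
    """Buffers streamed text chunks and stops emitting once a stop pattern appears."""

    def __init__(self, patterns):
        self.patterns = list(patterns)
        # Hold back enough characters to catch a pattern split across two chunks.
        self.holdback = max(len(p) for p in self.patterns) - 1
        self.buffer = ""
        self.stopped = False

    def feed(self, chunk):
        """Return the text that is safe to print after receiving `chunk`."""
        if self.stopped:
            return ""
        self.buffer += chunk
        hits = [i for i in (self.buffer.find(p) for p in self.patterns) if i != -1]
        if hits:
            # Emit everything before the earliest pattern, then stay silent.
            self.stopped = True
            safe, self.buffer = self.buffer[:min(hits)], ""
            return safe
        cut = max(len(self.buffer) - self.holdback, 0)
        safe, self.buffer = self.buffer[:cut], self.buffer[cut:]
        return safe

    def flush(self):
        """Return whatever is still held back once generation ends."""
        if self.stopped:
            return ""
        safe, self.buffer = self.buffer, ""
        return safe
```

The hold-back delays output by a few characters, which is the price of never printing the start of a `###` marker before it can be recognized.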

In [4]:
print(tokenizer.eos_token)

<|endoftext|>


In [5]:
logging.set_verbosity(logging.CRITICAL)

In [6]:
template = """### Instruction:
{prompt}

### Input:
{inputs}

### Response:
"""
eos_string = tokenizer.eos_token
history = None

In [7]:
print(template)

### Instruction:
{prompt}

### Input:
{inputs}

### Response:
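
To make the template concrete, the sketch below fills it for one question. `build_prompt` is a hypothetical helper, not part of the notebook. It reuses the template string from cell 6 and, like the chat loop, defaults the Input field to an empty string.

```python
TEMPLATE = """### Instruction:
{prompt}

### Input:
{inputs}

### Response:
"""

def build_prompt(question, inputs='', history=None):
    """Fill the Alpaca-style template, optionally prefixed by earlier turns."""
    current_turn = TEMPLATE.format(prompt=question, inputs=inputs)
    if history is None:
        return current_turn
    # Earlier turns go first, separated from the new turn by a newline.
    return history + '\n' + current_turn

print(build_prompt("What is 2+2?"))
```

The prompt ends right after `### Response:`, so the model's continuation is read as the answer.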



In [8]:
while True:
    question = input("Question: ")
    inputs = ''  # The template's Input field is left empty in this chat loop.

    # Prepend the running history (if any) so earlier turns stay in the prompt.
    current_turn = template.format(prompt=question, inputs=inputs)
    prompt = history + '\n' + current_turn if history is not None else current_turn

    # print(f"PROMPT: {prompt}")

    prompt_tokenized = tokenizer(prompt, return_tensors='pt').to('cuda')['input_ids']

    output_tokenized = model.generate(
        input_ids=prompt_tokenized,
        max_length=len(prompt_tokenized[0])+400,  # Allow up to 400 new tokens.
        temperature=0.7,
        top_k=40,
        top_p=0.1,
        do_sample=True,
        eos_token_id=tokenizer.eos_token_id,
        streamer=streamer
    )
    # Decode only the newly generated tokens, not the prompt.
    answer = tokenizer.decode(token_ids=output_tokenized[0][len(prompt_tokenized[0]):]).strip()

    # Cut the answer at the EOS string or at the start of a generated next turn.
    if eos_string in answer:
        answer = answer.split(eos_string)[0].strip()
    if '###' in answer:
        answer = answer.split('###')[0].strip()

    # Store the full exchange so the next prompt carries the conversation so far.
    history = '\n'.join([prompt, answer, eos_string])
    # print(f"ANSWER: {answer}\n")
    # print(f"HISTORY: {history}\n")
    print('#' * 50)
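
The post-processing and history steps in the loop can be pulled out into plain functions, which makes them easy to check without a model. The sketch below is one assumed way to restructure them: `clean_answer` mirrors the two `split` calls, and `build_history` keeps only the last `max_turns` exchanges so the prompt does not grow without bound. Both names and the `max_turns` cutoff are hypothetical and do not appear in the notebook.

```python
EOS = '<|endoftext|>'  # Value printed by tokenizer.eos_token in cell 4.
TURN = "### Instruction:\n{prompt}\n\n### Input:\n{inputs}\n\n### Response:\n"

def clean_answer(raw, eos_string=EOS):
    """Cut the decoded text at the EOS string or at the first '###' marker."""
    answer = raw.strip()
    for marker in (eos_string, '###'):
        if marker in answer:
            answer = answer.split(marker)[0].strip()
    return answer

def build_history(turns, max_turns=3, eos_string=EOS):
    """Render the most recent (question, answer) pairs as one history string."""
    recent = turns[-max_turns:]
    rendered = [
        '\n'.join([TURN.format(prompt=q, inputs=''), a, eos_string])
        for q, a in recent
    ]
    # None matches the loop's initial `history = None` when nothing is stored yet.
    return '\n'.join(rendered) if rendered else None
```

Keeping the turns as a list of pairs, rather than one growing string, is a design choice that makes trimming old exchanges straightforward.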

Question:  You are a helpful AI assistant.




I'm here to help! What can I do for you today?
##################################################


Question:  Who are you?


I am a human being from planet Earth. I am a member of the Homo sapiens species and I am a member of the Homo sapiens species. I am a member of the Homo sapiens species and I am a member of the Homo sapiens species. I am a member of the Homo sapiens species and I am a member of the Homo sapiens species. I am a member of the Homo sapiens species and I am a member of the Homo sapiens species. I am a member of the Homo sapiens species and I am a member of the Homo sapiens species. I am a member of the Homo sapiens species and I am a member of the Homo sapiens species. I am a member of the Homo sapiens species and I am a member of the Homo sapiens species. I am a member of the Homo sapiens species and I am a member of the Homo sapiens species. I am a member of the Homo sapiens species and I am a member of the Homo sapiens species. I am a member of the Homo sapiens species and I am a member of the Homo sapiens species. I am a member of the Homo sapiens species and I am a member of the Homo 

Question:  What is 2+2?


4
##################################################


Question:  What is digital pathology?


Digital pathology is the use of digital technology to analyze and interpret medical images. It involves the use of digital imaging systems, such as digital X-ray, CT, and MRI, to create digital images of medical specimens. These images can then be analyzed and interpreted by medical professionals, allowing for more accurate diagnoses and treatments.
##################################################


Question:  What are some models used in it?


Some models used in it include the Gaussian Naive Bayes, Decision Trees, Random Forests, Support Vector Machines, and Neural Networks.
##################################################


Question:  Used in where?


The word "where" can be used in a variety of contexts, such as in a sentence to indicate a location, in a question to ask where something is located, or in a statement to indicate where something is located.
##################################################


KeyboardInterrupt: Interrupted by user
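
The looping "Homo sapiens" answer in the transcript ran until the token budget was exhausted. One thing to try, offered as an untested suggestion rather than a fix verified on this model, is the `repetition_penalty` and `no_repeat_ngram_size` arguments that `generate` accepts. The two values below are illustrative assumptions. The other settings are copied from the loop, with `max_new_tokens` replacing the manual `max_length` arithmetic.

```python
# Keyword arguments for model.generate(); usage would be
# model.generate(input_ids=prompt_tokenized, streamer=streamer, **generation_kwargs).
generation_kwargs = dict(
    max_new_tokens=400,       # Same budget as max_length=len(prompt) + 400.
    do_sample=True,
    temperature=0.7,
    top_k=40,
    top_p=0.1,
    repetition_penalty=1.2,   # Assumed value; above 1.0 penalizes already-seen tokens.
    no_repeat_ngram_size=3,   # Assumed value; blocks any repeated 3-gram.
)
```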