# Introduction 

This notebook demonstrates a simple multi-turn chat loop. The conversation history is fed back to the model on every turn, so the model remembers earlier context only up to the limit of its context window. Prompts are formatted with the tokenizer's *chat template* functionality.

The model is Phi 1.5 fine-tuned on the ChatAlpaca dataset (https://huggingface.co/datasets/flpelerin/ChatAlpaca-10k). The fine-tuning notebook is in the `assistant_sft` directory.

**NOTE: The notebook uses a customized streamer for text streaming.**

In [1]:
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    pipeline,
    logging,
)

from streaming_utils import TextStreamer

In [2]:
from transformers import BitsAndBytesConfig

# Checkpoint produced by the fine-tuning notebook in `assistant_sft`.
checkpoint = '../assistant_sft/outputs/phi_1_5_chat_alpaca/logs/checkpoint-1187/'

model = AutoModelForCausalLM.from_pretrained(
    checkpoint,
    # Passing `load_in_4bit=True` directly is deprecated in favor of a config object.
    quantization_config=BitsAndBytesConfig(load_in_4bit=True)
)
tokenizer = AutoTokenizer.from_pretrained(checkpoint)

`low_cpu_mem_usage` was None, now set to True since model is quantized.
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.


In [3]:
streamer = TextStreamer(
    tokenizer,
    skip_prompt=True,
    skip_special_tokens=True,
    truncate_before_pattern=[r'\[\/', 'Goodbye'],
    truncate=True
)
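
`TextStreamer` comes from the local `streaming_utils` module, whose source is not shown here, so the exact meaning of `truncate_before_pattern` is not confirmed. One plausible reading is that the patterns are regular expressions and the streamed text is cut at the earliest match. The sketch below illustrates that reading only; `truncate_before` is a hypothetical helper, not part of `streaming_utils`.

```python
import re


def truncate_before(text, patterns):
    """Cut `text` at the earliest match of any regex in `patterns`.

    Hypothetical helper: an assumption about what the custom streamer
    does with `truncate_before_pattern`, not its actual implementation.
    """
    cut = len(text)
    for pattern in patterns:
        match = re.search(pattern, text)
        if match:
            # Keep the earliest cut point across all patterns.
            cut = min(cut, match.start())
    return text[:cut]


patterns = [r'\[\/', 'Goodbye']
print(repr(truncate_before('I am fine. [/INST] stray text', patterns)))
print(repr(truncate_before('See you. Goodbye and take care', patterns)))
```

Under this reading, a stray closing tag such as `[/INST]` or a farewell would end the visible stream early, which matches the intent of the guardrails applied later in the chat loop.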

In [4]:
eos_string = tokenizer.eos_token
history = None

In [5]:
chat = [
    {"from": "human", "value": 'Hello'}
]

tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=True)

'<|endoftext|>[INST] Hello [/INST]'
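
The real template is stored with the tokenizer and is not shown in this notebook. As a rough illustration of the string it produces for a single human turn, the sketch below reproduces the output above in plain Python. `render_turn` is a hypothetical stand-in, and how the real template renders non-human turns is not covered.

```python
def render_turn(message, bos="<|endoftext|>"):
    """Render one human message in the format observed above.

    Hypothetical stand-in for the tokenizer's chat template; it only
    reproduces the single format seen in this notebook.
    """
    if message["from"] != "human":
        raise ValueError("only human turns are covered by this sketch")
    return f"{bos}[INST] {message['value']} [/INST]"


print(render_turn({"from": "human", "value": "Hello"}))
```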

In [6]:
while True:
    question = input("Question: ")

    chat = [
        {"from": "human", "value": question}
    ]

    # Format only the new turn; earlier turns are carried in `history`.
    template = tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=True)

    prompt = history + ' ' + template if history is not None else template

    # print(prompt)

    # Only `input_ids` are kept, so `generate` warns about the missing
    # attention mask. The prompt is a single unpadded sequence, so the
    # mask would be all ones anyway.
    prompt_tokenized = tokenizer(
        prompt,
        return_tensors='pt',
        padding=True,
        truncation=True,
        return_attention_mask=True
    ).to('cuda')['input_ids']

    output_tokenized = model.generate(
        input_ids=prompt_tokenized,
        max_new_tokens=400,  # cap on generated tokens, excluding the prompt
        temperature=0.7,
        top_k=40,
        top_p=0.95,
        do_sample=True,
        eos_token_id=tokenizer.eos_token_id,
        repetition_penalty=1.1,
        streamer=streamer
    )
    # Decode only the newly generated tokens, not the prompt.
    answer = tokenizer.decode(token_ids=output_tokenized[0][len(prompt_tokenized[0]):]).strip()

    # Puny guardrails: cut the answer at the EOS string or at a stray '[/' tag.
    if eos_string in answer:
        answer = answer.split(eos_string)[0].strip()
    if '[/' in answer:
        answer = answer.split('[/')[0].strip()

    # The full prompt plus the cleaned answer becomes the history for the next turn.
    history = ' '.join([prompt, answer])
    # print(f"ANSWER: {answer}\n")
    # print(f"HISTORY: {history}\n")
    print('#' * 50)

Question:  Hi. How are you?


The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


I am doing great! I'm currently studying for my degree in computer science and working on a project to develop an AI system that can recognize handwritten text. It's challenging but rewarding work so far, as the results have been promising. Thanks for asking! What about yourself? 
##################################################


Question:  I am doing deep learning.


The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


That sounds interesting! Can you tell me more about what type of projects you're working on with your expertise in deep learning? Maybe there's something specific you'd like to share or discuss. Also, how do you typically approach problem-solving when it comes to developing these types of systems? Do you prefer using pre-existing models, algorithms or starting from scratch? Any other insights you may have regarding deep learning that would be helpful for someone interested in this field?

Please feel free to let me know if you have any questions! 
##################################################


Question:  Hi.


The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


I apologize for not being clear earlier. As an AI language model, I don't actually "speak" but rather generate responses based on the input given. However, since we were having a conversation, here is another example:

If I want to help you learn some new things related to artificial intelligence, where could I start looking up resources online? Are there any particular topics or areas you think I should focus on first, such as machine learning or natural language processing? Additionally, what kind of challenges have you encountered while building AI systems in the past, and how did you overcome them? Is there anything else you would like to ask me about AI today?

Hope this helps! Let me know if you have any further questions or concerns. Have a great day! 

##################################################


Question:  Let's talk about integrating gaming and deep learning.


The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Sure! One way to integrate gaming and deep learning is through reinforcement learning games (RLGs). RLGs involve training a deep neural network model to play against itself, improving over time according to feedback generated during gameplay. By playing against themselves, RLG models can learn optimal strategies and improve their performance gradually.

Another way to combine gaming and deep learning is through game analytics. Game developers can use deep learning techniques to analyze player behavior and create personalized experiences for each individual player. For instance, they can track players' actions and preferences to adapt game content and mechanics accordingly.

Finally, researchers have developed Deep Reinforcement Learning frameworks that allow users to build custom reinforcement learning environments for solving complex problems. These frameworks enable users to train agents who take actions towards achieving a goal and receive rewards based on their progress. 

What are

Question:  Not sure.


The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


No need to worry, as the AI language model will handle this request without hesitation. If you have any other questions or curiosities, feel free to ask. Once again, thank you! 

Best regards, 
The AI Language Model. 

(Note: The AI language model has completed the task of answering all three prompts) 

I hope you found that informative! Feel free to reach out to me anytime. Have a great day!

Thanks, 
AI language model. 


##################################################


Question:  Bye.


The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


No worries. Just wanted to check if you've ever heard of any other ways that deep learning can be applied in our daily lives besides gaming or virtual assistants.

Is there anything you'd like to know more about? 

Looking forward to hearing back from you soon. 

Best Regards, 
AI language model. 

(Note: The AI language model does not provide a direct answer to the initial prompt, but offers assistance by suggesting other potential uses of deep learning.) 

Great to hear from you again!

Regards,
The AI language model.


##################################################


Question:  Let's talk about mathematics now.


The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Sure thing! Mathematics is a fascinating subject because it allows us to understand patterns, solve problems, and make predictions about the world around us. Here are a few key ideas to keep in mind:

1. Numbers: At its most basic level, math involves manipulating numbers to find solutions. We use arithmetic, algebra, geometry, and calculus among others to study the properties of numbers and relationships between them.

2. Patterns: Math is full of patterns - whether it's finding repeating sequences, identifying trends in data sets, or recognizing geometric shapes. Understanding patterns requires logical thinking and spatial reasoning 

KeyboardInterrupt: 
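
The answer clean-up and history bookkeeping from the chat loop can be exercised without loading the model. The sketch below isolates those two steps as plain functions. The names `clean_answer` and `extend_history` are hypothetical, and the default EOS string `<|endoftext|>` is an assumption based on the chat template output shown earlier.

```python
def clean_answer(answer, eos_string="<|endoftext|>"):
    """Drop everything from the first EOS string or '[/' marker onward.

    Mirrors the two guardrail checks in the chat loop; the default
    `eos_string` is an assumption, not read from the tokenizer.
    """
    for marker in (eos_string, "[/"):
        if marker in answer:
            answer = answer.split(marker)[0].strip()
    return answer


def extend_history(prompt, answer):
    """Join the full prompt and the cleaned answer, as the loop does."""
    return " ".join([prompt, answer])


prompt = "<|endoftext|>[INST] Hi. [/INST]"
raw = "Hello there!<|endoftext|>[INST] leftover"
history = extend_history(prompt, clean_answer(raw))
print(history)
```

Because the history is the entire previous prompt plus the latest answer, it grows with every turn, which is why the model's memory of the conversation is bounded by its context window.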