<a href="https://colab.research.google.com/github/REELICIT/reqbrain_rep_package/blob/4392bd86c062b49677fca2a59f5b9dd677eeb848/inferencing_all_trained_models/data_science_experiment_for_RQs_2_3_4/untrained_zephyr_data_science_experiment.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Data Science Experiment

- Results of this notebook are used for ***RQ2***.

# Zephyr 7b beta Download & Setup

- This is the ***original Zephyr 7b beta, not trained on the requirements elicitation task***.
- The Hugging Face structure for each project is ```<user_name>/<model_name>```.
- Unlike the other notebooks, this notebook downloads the model from its original owner.
- Therefore, the variable ```model_name``` starts with ***HuggingFaceH4***, see code line x.
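
As a small illustration of the `<user_name>/<model_name>` convention, the identifier can be split into its owner and model parts. The helper name `split_repo_id` is hypothetical and not part of any Hugging Face library; this is only a sketch.

```python
def split_repo_id(repo_id: str) -> tuple[str, str]:
    """Split a '<user_name>/<model_name>' identifier into its two parts."""
    owner, sep, name = repo_id.partition("/")
    if not sep or not owner or not name:
        raise ValueError(f"expected '<user_name>/<model_name>', got {repo_id!r}")
    return owner, name


owner, name = split_repo_id("HuggingFaceH4/zephyr-7b-beta")
print(owner)  # HuggingFaceH4
print(name)   # zephyr-7b-beta
```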

In [1]:
# required libraries

# if a library is not installed, use !pip install <name of the library>
# if an import fails because a library is missing:
# Python prints the name of the missing library in the last lines of the error message.

import torch
import transformers
from torch import cuda, bfloat16

# Resources Checkup

- **Important Note:** A GPU with a minimum of 32 GB of memory is required to load the model!
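
A rough back-of-the-envelope check of the memory note above. The sketch assumes about 7.24 billion parameters (an approximate figure that the notebook does not state) and counts only the weights, ignoring activations and the CUDA context. The function name `weight_memory_gib` is hypothetical.

```python
# Assumption: roughly 7.24e9 parameters; only the weights are counted.
NUM_PARAMS = 7.24e9
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2}


def weight_memory_gib(num_params: float, dtype: str) -> float:
    """Approximate memory taken by the weights alone, in GiB."""
    return num_params * BYTES_PER_PARAM[dtype] / (1024 ** 3)


for dtype in BYTES_PER_PARAM:
    print(f"{dtype}: {weight_memory_gib(NUM_PARAMS, dtype):.1f} GiB")
```

Under these assumptions, the float32 weights alone come to about 27 GiB, which is in line with the 32 GB requirement if the model is loaded in full precision.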

In [2]:
# Check if CUDA (GPU support) is available
if torch.cuda.is_available():
    # Get the number of available GPUs
    num_gpus = torch.cuda.device_count()

    # Iterate over each GPU and print its name and memory information
    for i in range(num_gpus):
        gpu = torch.cuda.get_device_properties(i)
        print(f"GPU {i + 1} Name: {gpu.name}")
        print(f"GPU {i + 1} Total Memory: {gpu.total_memory / (1024 ** 3):.2f} GB")
else:
    print("CUDA is not available. A GPU with a minimum of 32 GB of memory is required to load the model!")

GPU 1 Name: Tesla V100-SXM2-32GB
GPU 1 Total Memory: 31.74 GB


In [3]:
# detecting GPU device to load model on it
device = f'cuda:{cuda.current_device()}' if cuda.is_available() else 'cpu'

# Downloading the Model

In [4]:
model_name = 'HuggingFaceH4/zephyr-7b-beta'

model = transformers.AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = transformers.AutoTokenizer.from_pretrained(model_name, use_fast = False)

tokenizer.pad_token = tokenizer.eos_token
tokenizer.padding_side = "right"

model.eval()
model.to(device)
print(f"Model loaded on {device}")

Model loaded on cuda:0


# Settings

In [5]:
# Token sequences that end generation: the EOS token and several spellings of a user-turn marker
stop_token_ids = [
    tokenizer.convert_tokens_to_ids(x)
    for x in [[tokenizer.eos_token], ["<", "|", "user"], ["user", ":"], ["|", "user", "|", ">"], ["User", ":"]]
]

stop_token_ids = [torch.LongTensor(x).to(device) for x in stop_token_ids]

In [6]:
from transformers import StoppingCriteria, StoppingCriteriaList


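class StopOnTokens(StoppingCriteria):
    """Stop generation once the sequence ends with any entry of stop_token_ids."""

    def __call__(self, input_ids: torch.LongTensor, scores: torch.FloatTensor, **kwargs) -> bool:
        for stop_ids in stop_token_ids:
            # Compare the last len(stop_ids) tokens of the first sequence in the batch
            if torch.eq(input_ids[0][-len(stop_ids):], stop_ids).all():
                return True
        return False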

stopping_criteria = StoppingCriteriaList([StopOnTokens()])
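
The suffix comparison that `StopOnTokens` performs can be illustrated on plain Python lists, without a model or GPU. This is a sketch of the same idea rather than the notebook's code: the function name `ends_with_any` and the token ids are made up for illustration, and the length guard is an addition.

```python
def ends_with_any(token_ids: list[int], stop_sequences: list[list[int]]) -> bool:
    """Return True if token_ids ends with any of the stop sequences."""
    for stop in stop_sequences:
        # Skip empty stops and stops longer than the text generated so far.
        if not stop or len(stop) > len(token_ids):
            continue
        if token_ids[-len(stop):] == stop:
            return True
    return False


# Hypothetical ids, chosen only for illustration.
stops = [[2], [7, 8, 9]]
print(ends_with_any([5, 6, 7, 8, 9], stops))  # True
print(ends_with_any([5, 6, 7, 8], stops))     # False
print(ends_with_any([9], stops))              # False
```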

In [7]:
pipe = transformers.pipeline(
    model = model,
    tokenizer = tokenizer,
    return_full_text = True, # Set it to True when combining with LangChain
    task = 'text-generation',
    device = device,
    stopping_criteria = stopping_criteria,  
    temperature = 0.2,
    top_p = 0.15,  
    top_k = 0,  
    max_new_tokens = 512,  
    repetition_penalty = 1.3
)


# Inference

In [8]:
instruction = "You are a professional requirements engineer who helps users brainstorm more software requirements."

## RQ2 (see Table V, column "Untrained Model Requirements (Zephyr-7b-beta)")

In [9]:
rq2_chatbot_tutor = '''I want to build an AI Chatbot assistant to help students learn better.

Write me 5 requirements for the chatbot.
Make sure to include requirements indicating non-mandatory preferred goal and future actions too.'''

result = pipe(f"<|system|>\n{instruction}\n<|user|>\n{rq2_chatbot_tutor}\n<|assistant|>\n")
print(result[0]['generated_text'].split("\n<|assistant|>\n")[-1])



1. Learning Goals: The chatbot should be able to identify the learning goals of individual students based on their academic background, previous performance in exams, and feedback from teachers. This will enable it to provide personalized recommendations for study materials, practice exercises, and quizzes that align with each student's specific needs (preferred goal). In the future, this feature could also allow the chatbot to adapt its teaching style and difficulty level according to how well the student is progressing towards these goals.
2. Interactive Tutoring: The chatbot must have interactive tutoring capabilities that can simulate real classroom scenarios such as answering questions, explaining concepts, providing examples, and giving hints when needed. It should use natural language processing techniques to understand the context of the conversation and respond appropriately (non-mandatory requirement). Additionally, the chatbot may incorporate gamification elements like point
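
The prompt string passed to `pipe` above follows a `<|system|>` / `<|user|>` / `<|assistant|>` layout, and the reply is recovered by splitting on the assistant tag. A minimal sketch of both steps follows. The helper names `build_prompt` and `extract_reply` are hypothetical, and the example output string is invented rather than produced by the model.

```python
ASSISTANT_TAG = "\n<|assistant|>\n"


def build_prompt(system: str, user: str) -> str:
    """Assemble the prompt in the same layout as the f-string used in the notebook."""
    return f"<|system|>\n{system}\n<|user|>\n{user}{ASSISTANT_TAG}"


def extract_reply(generated_text: str) -> str:
    """Keep only the text after the last assistant tag."""
    return generated_text.split(ASSISTANT_TAG)[-1]


prompt = build_prompt("You are a helpful assistant.", "Write me 5 requirements.")
# Invented stand-in for result[0]['generated_text'] when return_full_text=True
fake_output = prompt + "1. The chatbot should ..."
print(extract_reply(fake_output))
```

Because the pipeline is created with `return_full_text = True`, the generated text includes the prompt, which is why the split on the assistant tag is needed.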

### Seq. N, continue your own conversation!

In [10]:
# Put your continuation text between the triple quotes '''   '''

# llm_re_elicitor(chat_history, ''' ''')
# print(chat_history.memory.chat_memory.messages[-1].content)