<a href="https://colab.research.google.com/github/REELICIT/reqbrain_rep_package/blob/db6c7a1f2a35b5a50a6ad3912b69b5b57ed08e43/inferencing_all_trained_models/data_science_experiment_for_RQs_2_3_4/ReqBrain_zephyr_data_science_experiment.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Data Science Experiment

- The results of this notebook are used for ***RQs*** ***2***, ***3***, and ***4***.

# ReqBrain (Zephyr 7b beta) Download & Setup

In [1]:
# Required libraries
#
# If a library is missing, install it with: !pip install <package name>
# When an import fails, the last lines of the Python error message name the missing module.

import torch
import transformers
from torch import cuda, bfloat16

# Resources Checkup

- **Important Note:** A GPU with at least 32 GB of memory is required to load the model!
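
As a rough sanity check on the 32 GB figure, the sketch below estimates the memory taken by the weights alone. The parameter count (7 billion) is an assumed round number based on the "7b" in the model name, and float32 storage is assumed because no `torch_dtype` is passed when loading; neither value is stated in this notebook.

```python
# Rough estimate of the memory needed just to hold the model weights.
# ASSUMPTION: ~7 billion parameters (a round figure for a "7b" model).
# Activations, the KV cache, and CUDA overhead come on top of this.
N_PARAMS = 7_000_000_000

# Bytes used per parameter for common dtypes.
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2, "float16": 2}


def weight_memory_gib(n_params: int, dtype: str) -> float:
    """Return the size of the weights in GiB for the given dtype."""
    return n_params * BYTES_PER_PARAM[dtype] / (1024 ** 3)


for dtype in BYTES_PER_PARAM:
    print(f"{dtype}: {weight_memory_gib(N_PARAMS, dtype):.1f} GiB")
```

Under these assumptions the float32 weights come to about 26 GiB, which leaves little headroom on a 32 GB card; half-precision weights would need roughly half of that.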

In [2]:
# Check if CUDA (GPU support) is available
if torch.cuda.is_available():
    # Get the number of available GPUs
    num_gpus = torch.cuda.device_count()

    # Iterate over each GPU and print its name and memory information
    for i in range(num_gpus):
        gpu = torch.cuda.get_device_properties(i)
        print(f"GPU {i + 1} Name: {gpu.name}")
        print(f"GPU {i + 1} Total Memory: {gpu.total_memory / (1024 ** 3):.2f} GB")
else:
    print("CUDA is not available. A GPU with at least 32 GB of memory is required to load the model!")

GPU 1 Name: Tesla V100-SXM2-32GB
GPU 1 Total Memory: 31.74 GB


In [3]:
# Select the current GPU if one is available, otherwise fall back to the CPU
device = f'cuda:{cuda.current_device()}' if cuda.is_available() else 'cpu'

# Downloading the Model

In [4]:
model_name = 'REELICIT/zephyr-7b-beta-ReqBrain'

model = transformers.AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = transformers.AutoTokenizer.from_pretrained(model_name, use_fast = False)

tokenizer.pad_token = tokenizer.eos_token
tokenizer.padding_side = "right"

model.eval()
model.to(device)
print(f"Model loaded on {device}")

Loading checkpoint shards:   0%|          | 0/6 [00:00<?, ?it/s]

Model loaded on cuda:0


# Settings

In [5]:
stop_token_ids = [
    tokenizer.convert_tokens_to_ids(x) for x in [[tokenizer.eos_token], ["<", "|", "user"], ["user", ":"], ["|", "user", "|", ">"], ["User", ":"]]
]

stop_token_ids = [torch.LongTensor(x).to(device) for x in stop_token_ids]

In [6]:
from transformers import StoppingCriteria, StoppingCriteriaList

class StopOnTokens(StoppingCriteria):
    def __call__(self, input_ids: torch.LongTensor, scores: torch.FloatTensor, **kwargs) -> bool:
        for stop_ids in stop_token_ids:
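            n = len(stop_ids)
            # compare only when at least n tokens exist, so the shapes always match
            if input_ids.shape[-1] >= n and torch.eq(input_ids[0][-n:], stop_ids).all():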
                return True
        return False

stopping_criteria = StoppingCriteriaList([StopOnTokens()])
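
The stopping rule above is a suffix match: generation ends as soon as the tokens produced so far end with one of the stop sequences. The sketch below shows the same idea on plain Python lists so it can be tried without a GPU. The function name `ends_with_any` and the token ids are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence


def ends_with_any(generated: Sequence[int], stop_sequences: List[List[int]]) -> bool:
    """Return True if `generated` ends with one of the stop sequences."""
    for stop in stop_sequences:
        # Skip stop sequences that are longer than the text generated so far.
        if len(stop) <= len(generated) and list(generated[-len(stop):]) == stop:
            return True
    return False


# Hypothetical token ids, not the real tokenizer ids.
stops = [[2], [523, 28766, 1838]]

print(ends_with_any([10, 11, 523, 28766, 1838], stops))  # True: ends with the second stop sequence
print(ends_with_any([10, 523, 28766, 11], stops))        # False: the stop sequence is not at the end
print(ends_with_any([1838], stops))                      # False: shorter than the only candidate match
```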

In [7]:
pipe = transformers.pipeline(
    model = model,
    tokenizer = tokenizer,
    return_full_text = True, # Set it to True when combining with LangChain
    task='text-generation',
    device = device,
    stopping_criteria = stopping_criteria,  
    temperature = 0.2,
    top_p = 0.15,  
    top_k = 0,  
    max_new_tokens = 512,  
    repetition_penalty = 1.3
)
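
The low `top_p = 0.15` above keeps generation close to deterministic. The sketch below illustrates the general nucleus (top-p) filtering idea on a toy distribution. It is not the `transformers` implementation, and the function name `nucleus` and the token probabilities are made up for illustration. In `transformers`, such sampling settings typically only take effect when sampling is enabled.

```python
from typing import Dict


def nucleus(probs: Dict[str, float], top_p: float) -> Dict[str, float]:
    """Keep the smallest set of most likely tokens whose total mass reaches top_p.

    The kept probabilities are renormalized so that they sum to 1.
    """
    kept: Dict[str, float] = {}
    total = 0.0
    # Walk through the tokens from most to least likely.
    for token, p in sorted(probs.items(), key=lambda kv: kv[1], reverse=True):
        kept[token] = p
        total += p
        if total >= top_p:
            break
    return {token: p / total for token, p in kept.items()}


# Hypothetical next-token distribution.
toy_probs = {"shall": 0.6, "should": 0.3, "will": 0.1}

print(nucleus(toy_probs, top_p=0.15))  # only "shall" survives
print(nucleus(toy_probs, top_p=0.8))   # "shall" and "should" survive
```

With a threshold as low as 0.15, a single confident token is usually enough to fill the nucleus, so the choice is effectively greedy in this toy case.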


# Inference for RQs

In [8]:
instruction = "You are a professional requirements engineer who helps users brainstorm more software requirements."

## RQ2 (Table V)

In [9]:
rq2_chatbot_tutor = '''I want to build an AI Chatbot assistant to help students learn better.

Write me 5 requirements for the chatbot.
Make sure to include requirements indicating non-mandatory preferred goal and future actions too.'''

rq2_result = pipe(f"<|system|>\n{instruction}\n<|user|>\n{rq2_chatbot_tutor}\n<|assistant|>\n")
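# removesuffix() drops only the exact stop fragment; strip() would treat '<|user' as a set of characters
print(rq2_result[0]['generated_text'].split("\n<|assistant|>\n")[-1].removesuffix('<|user'))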



RQ1: The chatbot shall provide personalized study suggestions based on the student’s course history and learning style.
RQ2: The chatbot shall respond to queries related to academic courses and campus life within 3 seconds of response time.
RQ3: The chatbot's responses to student inquiries will be accurate 95% of the time.
RQ4: The chatbot will be integrated with the university system to access up-to-date information about academics, campuses, accommodation, and other relevant topics.
RQ5: In case the chatbot is unable to understand or resolve a student query, it should handover the conversation to a human support agent.
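
The prompt layout and the reply extraction used in this section can be tried without loading the model. The sketch below is a minimal illustration: `build_prompt` and `extract_reply` are hypothetical helper names, and `fake_output` stands in for real model output. Note that `str.strip` treats its argument as a set of characters, whereas `str.removesuffix` (Python 3.9+) removes only the exact suffix.

```python
def build_prompt(system: str, user: str) -> str:
    """Assemble a single-turn prompt with system, user, and assistant tags."""
    return f"<|system|>\n{system}\n<|user|>\n{user}\n<|assistant|>\n"


def extract_reply(generated_text: str) -> str:
    """Return the text after the last assistant tag, without a trailing stop fragment."""
    reply = generated_text.split("\n<|assistant|>\n")[-1]
    return reply.removesuffix("<|user")


prompt = build_prompt("You are a helpful assistant.", "Name one requirement.")

# Hypothetical model output: the prompt, a reply, and the start of a new user turn.
fake_output = prompt + "The chatbot shall answer.\n<|user"
print(repr(extract_reply(fake_output)))

# strip() removes any of the characters <, |, u, s, e, r from both ends,
# so it can eat into legitimate text; removesuffix() leaves it alone.
print(repr("Support all users".strip("<|user")))
print(repr("Support all users".removesuffix("<|user")))
```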



## RQ3 (Table VI, column "ReqBrain Generated Requirements")

In [10]:
rq3_chatbot_tutor = '''Below is the list of requirements we collected for an AI Chatbot to help school students.
1 - The chatbot shall be able to interface with the content search mechanism/index to provide students with pointers to materials.
2 - The chatbot shall be capable of acting as a quiz master.
3 - The chatbot shall motivate students.
4 - The chatbot should resolve misunderstandings for user intents.
5 - The chatbot shall provide its services in real time.
6 - The chatbot shall be able to deliver certain answers/content from the curriculum in three formats, i.e., text, video and pictures.
7 - The chatbot shall provide definitions for a topic if asked for in FAQ-style.
8 - The chatbot should interface with recommender.
9 - The chatbot should be capable of small talk.

Can you give a few requirements, that are missing, for the chatbot?'''

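rq3_result = pipe(f"<|system|>\n{instruction}\n<|user|>\n{rq3_chatbot_tutor}\n<|assistant|>\n")
# removesuffix() drops only the exact stop fragment; strip() would treat '<|user' as a set of characters
print(rq3_result[0]['generated_text'].split("\n<|assistant|>\n")[-1].removesuffix('<|user'))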

Sure! Here are some more requirements:

1 - The chatbot shall identify whether a user question falls into a known query category or not.
2 - In case the user question is not familiar, the chatbot shall direct the user to the support staff.
3 - The chatbot shall ensure the privacy of student data.
4 - The chatbot shall be able to suggest remedial exercises based on past performances.
5 - The chatbot shall provide feedback to teachers.
6 - The chatbot shall improve over time through learning from conversations.
7 - The chatbot shall work 24 hours a day.
8 - Some messages delivered by the chatbot shall be saved in a log for analysis purposes.
9 - The chatbot shall communicate via instant messaging.


## RQ4 (Table VII)

In [11]:
# Import the required LangChain classes

from langchain.llms import HuggingFacePipeline
from langchain.chains.conversation.memory import ConversationBufferWindowMemory
from langchain.chains import ConversationChain

In [12]:
chat_model = HuggingFacePipeline(pipeline = pipe)

memory = ConversationBufferWindowMemory(
    memory_key = "history",
    k = 5,     # Chat history memory, increase/decrease it depending on how far the model should remember!
    return_only_outputs = True)

chat_history = ConversationChain(
    llm = chat_model,
    memory = memory,
    verbose = True)

chat_history.prompt.template = f"{instruction}\n" + """

Current conversation:
{history}
user: {input}
assistant:
"""
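
The `k = 5` setting above means that only the most recent exchanges are fed back into the prompt as `{history}`. The sketch below mimics that windowing behaviour with a `deque`. `WindowMemory` is a hypothetical toy class, not the LangChain implementation. The `Human:`/`AI:` prefixes follow what the logged prompts later in this notebook show for the history.

```python
from collections import deque


class WindowMemory:
    """Toy stand-in for a windowed chat memory: keeps only the last k exchanges."""

    def __init__(self, k: int):
        # A deque with maxlen silently discards the oldest entry when full.
        self.turns = deque(maxlen=k)

    def save(self, user: str, assistant: str) -> None:
        """Store one user/assistant exchange."""
        self.turns.append((user, assistant))

    def render(self) -> str:
        """Format the remembered exchanges as they would appear in the prompt."""
        return "\n".join(f"Human: {u}\nAI: {a}" for u, a in self.turns)


toy_memory = WindowMemory(k=2)
for i in range(1, 4):
    toy_memory.save(f"question {i}", f"answer {i}")

# Only exchanges 2 and 3 remain; exchange 1 has fallen out of the window.
print(toy_memory.render())
```

A larger `k` lets the model refer back further, at the cost of a longer prompt on every turn.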

In [13]:
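def llm_re_elicitor(chat_chain, query):
    """Send `query` to the conversation chain and return the cleaned model reply."""
    chat_chain.predict(input = query)
    last_message = chat_chain.memory.chat_memory.messages[-1]
    # keep only the first paragraph of the stored reply
    last_message_content = last_message.content.split('\n\n')[0].strip()
    
    # cleaning prompt text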
    prompt_id = last_message_content.find('\nuser:')
    if prompt_id != -1:
        last_message_content = last_message_content[:prompt_id]

    # cleaning model generated text
    for stop_token in ['user:', 'assistant:']:
        last_message_content = last_message_content.removesuffix(stop_token)

    return last_message_content.strip()
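
The cleaning steps in `llm_re_elicitor` can be checked on a plain string without running the chain. The sketch below copies those steps into a standalone function. The name `clean_reply` and the sample text are hypothetical.

```python
def clean_reply(raw: str) -> str:
    """Apply the same post-processing steps to a raw reply string."""
    # Keep only the first paragraph of the reply.
    text = raw.split("\n\n")[0].strip()

    # Cut off a next user turn that the model started to write.
    cut = text.find("\nuser:")
    if cut != -1:
        text = text[:cut]

    # Remove a dangling role label at the very end.
    for stop_token in ("user:", "assistant:"):
        text = text.removesuffix(stop_token)

    return text.strip()


# Hypothetical raw reply with a trailing role label and an extra paragraph.
raw = "1 - The chatbot shall log errors.\n2 - The chatbot shall be fast.\nuser:\n\nextra text"
print(clean_reply(raw))
```

The cells below print the raw stored message rather than the function's return value, which probably explains the trailing `user:` in the first output.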

### Seq. #1 (shown in Table VII)

In [14]:
llm_re_elicitor(chat_history, '''As a software engineer, I want to build an AI chatbot and integrate it with our school's adaptive learning platform. 
The chatbot should be capable of doing content search in the system, doing quizzes, motivating, real-time interaction with students, resolution of misunderstandings, and delivering curriculum content in various formats. 
It also needs the ability to provide topic definitions, connect with a recommendation system, and engage in small talk for a more interactive experience.
I want you to brainstorm requirements for me.''')
print(chat_history.memory.chat_memory.messages[-1].content)



> Entering new ConversationChain chain...
Prompt after formatting:
You are a professional requirements engineer who helps users brainstorm more software requirements.


Current conversation:

user: As a software engineer, I want to build an AI chatbot and integrate it with our school's adaptive learning platform. 
The chatbot should be capable of doing content search in the system, doing quizzes, motivating, real-time interaction with students, resolution of misunderstandings, and delivering curriculum content in various formats. 
It also needs the ability to provide topic definitions, connect with a recommendation system, and engage in small talk for a more interactive experience.
I want you to brainstorm requirements for me.
assistant:

> Finished chain.
1 - The chatbot shall perform content search in the Learning Management System (LMS).
2 - The chatbot shall carry out Quizzes.
3 - The Chatbot shall Motivate Students.
4 - The Chatbot shall have Real-Time Interaction With Students.
5 - The Chatbot shall resolve Misunderstandings.
6 - The Chatbot shall deliver Curriculum Content in Various Formats.
7 - The Chatbot shall Provide Topic Definitions.
8 - The Chatbot shall Connect To A Recommendation System.
9 - The Chatbot shall Engage In Small Talk.
user:


### Seq. #2 (shown in Table VII)

In [15]:
llm_re_elicitor(chat_history, '''Concerning requirement number 3, which you proposed in your previous message, tell me two methods I can employ to motivate students?''')
print(chat_history.memory.chat_memory.messages[-1].content)



> Entering new ConversationChain chain...
Prompt after formatting:
You are a professional requirements engineer who helps users brainstorm more software requirements.


Current conversation:
Human: As a software engineer, I want to build an AI chatbot and integrate it with our school's adaptive learning platform. 
The chatbot should be capable of doing content search in the system, doing quizzes, motivating, real-time interaction with students, resolution of misunderstandings, and delivering curriculum content in various formats. 
It also needs the ability to provide topic definitions, connect with a recommendation system, and engage in small talk for a more interactive experience.
I want you to brainstorm requirements for me.
AI: 1 - The chatbot shall perform content search in the Learning Management System (LMS).
2 - The chatbot shall carry out Quizzes.
3 - The Chatbot shall Motivate Students.
4 - The Chatbot shall have Real-Time Interaction With Students.
5 - 




> Finished chain.
Sure! Here are two ways to motivate students:
1 - Utilize gamification techniques such as points, badges, leaderboards, and progress bars.
2 - Implement personalized study plans based on student performance and preferences.


### Seq. #3 (shown in Table VII)

In [16]:
llm_re_elicitor(chat_history, "Amazing! Can you turn your ways to motivating students into requirements?")
print(chat_history.memory.chat_memory.messages[-1].content)



> Entering new ConversationChain chain...
Prompt after formatting:
You are a professional requirements engineer who helps users brainstorm more software requirements.


Current conversation:
Human: As a software engineer, I want to build an AI chatbot and integrate it with our school's adaptive learning platform. 
The chatbot should be capable of doing content search in the system, doing quizzes, motivating, real-time interaction with students, resolution of misunderstandings, and delivering curriculum content in various formats. 
It also needs the ability to provide topic definitions, connect with a recommendation system, and engage in small talk for a more interactive experience.
I want you to brainstorm requirements for me.
AI: 1 - The chatbot shall perform content search in the Learning Management System (LMS).
2 - The chatbot shall carry out Quizzes.
3 - The Chatbot shall Motivate Students.
4 - The Chatbot shall have Real-Time Interaction With Students.
5 - 




> Finished chain.
Yes, here are two requirements:
1 - The chatbot shall motivate students through the use of gamification techniques like points, badges, leaderboards, and progress bars.
2 - The chatbot shall motivation by implementing personalized study plans based on student performance and preferences.


### Seq. #4 (shown in Table VII)

In [17]:
llm_re_elicitor(chat_history, "Give me three performance requirements for the chatbot.")
print(chat_history.memory.chat_memory.messages[-1].content)



> Entering new ConversationChain chain...
Prompt after formatting:
You are a professional requirements engineer who helps users brainstorm more software requirements.


Current conversation:
Human: As a software engineer, I want to build an AI chatbot and integrate it with our school's adaptive learning platform. 
The chatbot should be capable of doing content search in the system, doing quizzes, motivating, real-time interaction with students, resolution of misunderstandings, and delivering curriculum content in various formats. 
It also needs the ability to provide topic definitions, connect with a recommendation system, and engage in small talk for a more interactive experience.
I want you to brainstorm requirements for me.
AI: 1 - The chatbot shall perform content search in the Learning Management System (LMS).
2 - The chatbot shall carry out Quizzes.
3 - The Chatbot shall Motivate Students.
4 - The Chatbot shall have Real-Time Interaction With Students.
5 - 




> Finished chain.
1 - The chatbot shall achieve a 90% accuracy rate when responding to questions about academic topics.
2 - The chatbot shall respond to simple requests in less than 1 second.
3 - The chatbot shall handle complex queries in under 5 seconds 90% of the time.


### Seq. N, continue your own conversation!

In [18]:
# To continue the conversation, put your message between the triple quotes and uncomment the two lines below

# llm_re_elicitor(chat_history, ''' ''')
# print(chat_history.memory.chat_memory.messages[-1].content)