<a href="https://colab.research.google.com/github/REELICIT/reqbrain_rep_package/blob/d6f1edc5fb5f41bdb4a6a9105b37c378ab4c96aa/inferencing_all_trained_models/all_models_used_with_langchain/falcon-instruct_inferance.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# ReqBrain (Falcon 7b instruct) Download & Setup

In [1]:
# Required libraries

# If a package is missing, install it with: !pip install <package name>
# When an import fails, the last lines of Python's error message name the
# missing module.

import torch
import transformers
from torch import cuda, bfloat16

# Resources Checkup

- **Important Note:** A GPU with at least 32 GB of memory is required to load the model!
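
The 32 GB figure is consistent with a back-of-the-envelope estimate of the weight memory alone. The sketch below is not part of the original notebook. It assumes a nominal 7 billion parameters (the exact count is not stated here) and that the weights are held as 32-bit floats, which is what happens when `from_pretrained` is called without a `torch_dtype`. Activations and the CUDA context come on top of this.

```python
# Rough estimate of the memory needed just to hold the model weights.
# ASSUMPTION: ~7e9 parameters (nominal size of a "7b" model, not an exact count).

def weight_memory_gib(n_params: float, bytes_per_param: int) -> float:
    """Return the weight footprint in GiB (1 GiB = 1024**3 bytes)."""
    return n_params * bytes_per_param / (1024 ** 3)

N_PARAMS = 7e9  # hypothetical round number used for illustration

fp32_gib = weight_memory_gib(N_PARAMS, 4)  # float32: 4 bytes per parameter
bf16_gib = weight_memory_gib(N_PARAMS, 2)  # bfloat16: 2 bytes per parameter

print(f"float32 weights:  ~{fp32_gib:.1f} GiB")
print(f"bfloat16 weights: ~{bf16_gib:.1f} GiB")
```

Under these assumptions the float32 weights alone take up most of a 32 GB card. Loading in half precision would roughly halve the footprint, though whether the fine-tuned model behaves the same at that precision is not verified here.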

In [2]:
# Check if CUDA (GPU support) is available
if torch.cuda.is_available():
    # Get the number of available GPUs
    num_gpus = torch.cuda.device_count()

    # Iterate over each GPU and print its name and memory information
    for i in range(num_gpus):
        gpu = torch.cuda.get_device_properties(i)
        print(f"GPU {i + 1} Name: {gpu.name}")
        print(f"GPU {i + 1} Total Memory: {gpu.total_memory / (1024 ** 3):.2f} GB")
else:
    print("CUDA is not available. A GPU with at least 32 GB of memory is required to load the model!")

GPU 1 Name: Tesla V100-SXM2-32GB
GPU 1 Total Memory: 31.74 GB


In [3]:
# Select the current GPU if one is available, otherwise fall back to the CPU
device = f'cuda:{cuda.current_device()}' if cuda.is_available() else 'cpu'

# Downloading the Model

In [4]:
model_name = 'REELICIT/falcon-7b-instruct-ReqBrain'

model = transformers.AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = transformers.AutoTokenizer.from_pretrained(model_name, use_fast = False)

tokenizer.pad_token = tokenizer.eos_token
tokenizer.padding_side = "right"

model.eval()
model.to(device)
print(f"Model loaded on {device}")

Loading checkpoint shards:   0%|          | 0/6 [00:00<?, ?it/s]

Model loaded on cuda:0


# Settings

In [5]:
# Setting the list of stopping criteria for the model

stop_token_ids = [
    tokenizer.convert_tokens_to_ids(x) for x in [['Human', ':'], ['###', 'Human', ':'], ["***"], ["###"], ["###", "Assistant", ":"], [tokenizer.eos_token]]
]

stop_token_ids = [torch.LongTensor(x).to(device) for x in stop_token_ids]

In [6]:
from transformers import StoppingCriteria, StoppingCriteriaList


class StopOnTokens(StoppingCriteria):
    def __call__(self, input_ids: torch.LongTensor, scores: torch.FloatTensor, **kwargs) -> bool:
        for stop_ids in stop_token_ids:
            if torch.eq(input_ids[0][-len(stop_ids):], stop_ids).all():
                return True
        return False

stopping_criteria = StoppingCriteriaList([StopOnTokens()])
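
The check in `StopOnTokens` is a suffix comparison: generation halts as soon as the tail of the generated ids equals one of the stop sequences. The following is a minimal pure-Python sketch of that idea, using plain lists instead of tensors and made-up token ids. The function name and the length guard are additions for illustration and are not part of the notebook.

```python
from typing import List, Sequence

def ends_with_stop_sequence(generated_ids: Sequence[int],
                            stop_sequences: List[Sequence[int]]) -> bool:
    """Return True if generated_ids ends with any of the stop sequences."""
    for stop_ids in stop_sequences:
        n = len(stop_ids)
        # Skip empty stop sequences and ones longer than the text so far.
        if n == 0 or n > len(generated_ids):
            continue
        if list(generated_ids[-n:]) == list(stop_ids):
            return True
    return False

# Hypothetical token ids, chosen only for illustration.
stops = [[7, 8], [9]]
print(ends_with_stop_sequence([1, 2, 7, 8], stops))  # tail matches [7, 8]
print(ends_with_stop_sequence([1, 7, 8, 2], stops))  # match is not at the end
print(ends_with_stop_sequence([8], stops))           # too short for [7, 8]
```

Only the end of the sequence is inspected, so a stop marker that appears earlier in the prompt does not end generation by itself.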

In [7]:
pipe = transformers.pipeline(
    model = model,
    tokenizer = tokenizer,
    return_full_text = True, # Set it to True when combining with LangChain
    task = 'text-generation',
    device = device,
    stopping_criteria = stopping_criteria,  
    temperature = 0.1,
    top_p = 0.15,  
    top_k = 0,  
    max_new_tokens = 200,  # upper bound on the number of newly generated tokens
    repetition_penalty = 1.3
)

2024-01-19 14:21:28.920135: I tensorflow/core/util/port.cc:111] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2024-01-19 14:21:28.969836: E tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:9342] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-01-19 14:21:28.969871: E tensorflow/compiler/xla/stream_executor/cuda/cuda_fft.cc:609] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-01-19 14:21:28.969901: E tensorflow/compiler/xla/stream_executor/cuda/cuda_blas.cc:1518] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-01-19 14:21:28.983538: I tensorflow/core/platform/cpu_feature_g
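
To give a feel for what a small `top_p` such as 0.15 does when sampling is enabled, here is a simplified illustration of nucleus (top-p) filtering. It is a sketch over a made-up four-token distribution, not the `transformers` implementation, and the function name is hypothetical.

```python
from typing import Dict

def top_p_filter(probs: Dict[str, float], top_p: float) -> Dict[str, float]:
    """Keep the smallest set of most-likely tokens whose cumulative
    probability reaches top_p, then renormalise the survivors."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    kept = {}
    cumulative = 0.0
    for token, p in ranked:
        kept[token] = p
        cumulative += p
        if cumulative >= top_p:
            break
    total = sum(kept.values())
    return {token: p / total for token, p in kept.items()}

# Hypothetical next-token distribution.
probs = {"shall": 0.50, "must": 0.30, "should": 0.15, "may": 0.05}

narrow = top_p_filter(probs, top_p=0.15)  # only the top token survives
wide = top_p_filter(probs, top_p=0.90)    # three tokens survive

print(narrow)
print(wide)
```

With a threshold this low, the most likely token usually clears it alone, so the output is close to deterministic even though sampling parameters are set.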

# Inference

In [8]:
prompt_1 = "I want two usability requirements for bus ticket machines." 

result = pipe(f"### Human: {prompt_1}.### Assistant: ")
print(result[0]['generated_text'].split('### Assistant: ')[-1])

Setting `pad_token_id` to `eos_token_id`:11 for open-end generation.
The current implementation of Falcon calls `torch.scaled_dot_product_attention` directly, this will be deprecated in the future in favor of the `BetterTransformer` API. Please install the latest optimum library with `pip install -U optimum` and call `model.to_bettertransformer()` to benefit from `torch.scaled_dot_product_attention` and future performance optimizations.


 The first requirement, NP3-005-1, states that the system must provide a mechanism to store personal information such as name and address. This functionality should allow the individual to anonymously use the machine without revealing their identity if desired. The second requirement, NP3-005-2, outlines a provision allowing the insertion of credit or debit card numbers by numeric keystrokes rather than using symbols for each transaction. It must be possible to make changes or transactions quickly while still seated at the computer. Together, these requirements outline four distinct aspects pertaining to user interface and usage guidelines. These are just some examples of usability requirements; additional specifics can be found in ISO 29148 recommendations. Examples include setting up default preferences, utilizing existing data and providing options for future uses based on past interactions.
The given requirement is not specific enough to be considered as a concrete requirement rega

In [9]:
prompt_2 = "Write me three security requirements for an ATM software?"

result = pipe(f"### Human: {prompt_2}.### Assistant: ")
print(result[0]['generated_text'].split('### Assistant: ')[-1])

Setting `pad_token_id` to `eos_token_id`:11 for open-end generation.


 The Security Regulation Requirement RS-1 specifies that the system must implement secure passwords or biometric authentication methods to authorize access to account data. Other Requries include Secure Connection Protocol and SSL Certificate Validation, both of which aim to ensure secure communication between the client and the server. Additionally, the Use Access Control subclass Restriction RUSC2 restricts access to transaction details to authorized users only.###
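
The two cells above repeat the same steps: wrap the request in the `### Human: ... ### Assistant: ` format, then keep only the text after the last assistant marker. A small sketch of helpers that capture this pattern follows. The names `build_prompt` and `extract_answer` are hypothetical, and the pipeline call is replaced by a made-up string so the example runs without a GPU. `str.removesuffix` requires Python 3.9 or later.

```python
def build_prompt(user_message: str) -> str:
    """Wrap a user message in the '### Human / ### Assistant' format used above."""
    return f"### Human: {user_message}.### Assistant: "

def extract_answer(generated_text: str) -> str:
    """Return the text after the last assistant marker, minus a trailing '###'."""
    answer = generated_text.split("### Assistant: ")[-1]
    return answer.strip().removesuffix("###").strip()

prompt = build_prompt("Write me three security requirements for an ATM software?")

# Stand-in for the pipeline output with return_full_text=True (made-up reply).
fake_output = prompt + " Access shall be restricted to authorized users.###"

print(extract_answer(fake_output))
```

In the notebook, `fake_output` would be replaced by `pipe(prompt)[0]['generated_text']`.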


## Try it yourself

In [10]:
# Uncomment the lines below to use a custom query

# your_prompt = 'your prompt goes here!'
# result = pipe(f"### Human: {your_prompt}.### Assistant: ")
# print(result[0]['generated_text'])

# Model Setup with LangChain for Chat-based Requirements Elicitation

In [11]:
# Import the required LangChain classes

from langchain.llms import HuggingFacePipeline
from langchain.chains.conversation.memory import ConversationBufferWindowMemory
from langchain.chains import ConversationChain

In [12]:
chat_model = HuggingFacePipeline(pipeline = pipe)

memory = ConversationBufferWindowMemory(
    memory_key = "history",
    k = 5,     # number of most recent exchanges kept in the prompt; raise or lower it to control how far back the model remembers
    return_only_outputs = True)

chat_history = ConversationChain(
    llm = chat_model,
    memory = memory,
    verbose = True)

chat_history.prompt.template = """
You are a professional requirements engineer who helps users brainstorm more software requirements.

Current conversation:
{history}
### Human: {input}
### Assistant: """
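
The effect of the window size `k` can be pictured with a small stand-in for the memory object. This is an illustrative sketch built on `collections.deque`, not LangChain's implementation. The class name `WindowMemory` is hypothetical, and the `Human:` / `AI:` prefixes mirror the ones that appear in the formatted history in the chain output.

```python
from collections import deque

class WindowMemory:
    """Keep only the last k (human, ai) exchanges. Illustrative stand-in."""

    def __init__(self, k: int):
        # A deque with maxlen silently drops the oldest entry when full.
        self.turns = deque(maxlen=k)

    def save(self, human: str, ai: str) -> None:
        self.turns.append((human, ai))

    def history(self) -> str:
        return "\n".join(f"Human: {h}\nAI: {a}" for h, a in self.turns)

memory_sketch = WindowMemory(k=2)
for i in range(1, 4):
    memory_sketch.save(f"question {i}", f"answer {i}")

# Only exchanges 2 and 3 remain; exchange 1 has been dropped.
print(memory_sketch.history())
```

A larger window gives the model more context but also lengthens every prompt, which matters when the generation budget is capped by `max_new_tokens`.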

In [13]:
def llm_re_elicitor(chat_chain, query):
    chat_chain.predict(input = query)
    last_message = chat_chain.memory.chat_memory.messages[-1]
    last_message_content = last_message.content.split('\n\n')[0].strip()
    
    # cleaning prompt text
    prompt_id = last_message_content.find('\n### Human:')
    if prompt_id != -1:
        last_message_content = last_message_content[:prompt_id]

    # cleaning model generated text
    for stop_token in ['***', '###', '### Assistant:', '### Human:', 'Human:']:
        last_message_content = last_message_content.removesuffix(stop_token)

    return last_message_content.strip()
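
To see what the post-processing in `llm_re_elicitor` does to a raw reply, the same string operations can be run on a hand-written example without the chain. The function name `clean_reply` and the sample strings are made up for illustration. `str.removesuffix` requires Python 3.9 or later.

```python
STOP_MARKERS = ['***', '###', '### Assistant:', '### Human:', 'Human:']

def clean_reply(raw: str) -> str:
    """Pure-string version of the post-processing in llm_re_elicitor."""
    text = raw.split('\n\n')[0].strip()   # keep the first paragraph only
    cut = text.find('\n### Human:')       # drop a generated follow-up turn
    if cut != -1:
        text = text[:cut]
    for marker in STOP_MARKERS:           # strip a trailing stop marker, if any
        text = text.removesuffix(marker)
    return text.strip()

# Made-up raw replies.
raw_reply = (" The chatbot shall answer within two seconds."
             "\n### Human: thanks###\n\nExtra paragraph")
print(clean_reply(raw_reply))
print(clean_reply("Use TLS for all traffic.###"))
```

Because only the first paragraph is kept, any list that the model places after a blank line is dropped from the returned string.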

In [14]:
llm_re_elicitor(chat_history, '''As a software engineer, I want to build an AI chatbot and integrate it with our school's adaptive learning platform. 
The chatbot should be capable of doing content search in the system, doing quizzes, motivating, real-time interaction with students, resolution of misunderstandings, and delivering curriculum content in various formats. 
It also needs the ability to provide topic definitions, connect with a recommendation system, and engage in small talk for a more interactive experience.
I want you to brainstorm requirements for me.''')
print(chat_history.memory.chat_memory.messages[-1].content)

Setting `pad_token_id` to `eos_token_id`:11 for open-end generation.




> Entering new ConversationChain chain...
Prompt after formatting:
You are a professional requirements engineer who helps users brainstorm more software requirements.

Current conversation:

### Human: As a software engineer, I want to build an AI chatbot and integrate it with our school's adaptive learning platform. 
The chatbot should be capable of doing content search in the system, doing quizzes, motivating, real-time interaction with students, resolution of misunderstandings, and delivering curriculum content in various formats. 
It also needs the ability to provide topic definitions, connect with a recommendation system, and engage in small talk for a more interactive experience.
I want you to brainstorm requirements for me.
### Assistant: [0m

> Finished chain.

As a user, I would like to have a feature where the AI chatbot can generate personalized lesson plans based on each student's individual learning needs.

Requirements:

- The AI chatbot s

In [15]:
llm_re_elicitor(chat_history, '''Concerning requirement number 3, which you proposed in your previous message, tell me two methods I can employ to motivate students?''')
print(chat_history.memory.chat_memory.messages[-1].content)

Setting `pad_token_id` to `eos_token_id`:11 for open-end generation.




> Entering new ConversationChain chain...
Prompt after formatting:
You are a professional requirements engineer who helps users brainstorm more software requirements.

Current conversation:
Human: As a software engineer, I want to build an AI chatbot and integrate it with our school's adaptive learning platform. 
The chatbot should be capable of doing content search in the system, doing quizzes, motivating, real-time interaction with students, resolution of misunderstandings, and delivering curriculum content in various formats. 
It also needs the ability to provide topic definitions, connect with a recommendation system, and engage in small talk for a more interactive experience.
I want you to brainstorm requirements for me.
AI: 
As a user, I would like to have a feature where the AI chatbot can generate personalized lesson plans based on each student's individual learning needs.

Requirements:

- The AI chatbot shall be able to create individualized lesson pla

In [16]:
# Put your prompt between the ''' marks to continue the chat!

# llm_re_elicitor(chat_history, ''' ''')
# print(chat_history.memory.chat_memory.messages[-1].content)