# CyborgGPT

**Cyborg** is an LLM bot (`CyborgGPT`) that acts as a `secure` assistant for your meetings and recorded conversations, together with relevant documents. It lets you chat in the context of a Knowledge Base (**KB**) created from your data (`meetings/recorded conversations along with relevant documents`). It has no ability to send your data over the internet, which is why it is `secure/private/local`.

**NOTES:**
- This Colab notebook currently works `only` with a `GPU Runtime`. Support for a `CPU Runtime` will be added later.
- It is a prototype and will be improved over time.
- Keep the cells under [Upload Information (Audios and/or Documents)](https://colab.research.google.com/github/belal-bh/cyborg/blob/main/Cyborg.ipynb?authuser=1#scrollTo=Ro6CDR3TR7-d&line=1&uniqifier=1) expanded so that you can upload your documents or audio files.

# First Things First

## Install Packages

In [None]:
requirements = '''
# For Knowledge Base stuff
openai-whisper
langchain
chromadb
InstructorEmbedding
sentence-transformers

# GPT
faiss-cpu
huggingface_hub
transformers
protobuf
auto-gptq

llama-cpp-python
pdfminer.six
openpyxl

# Utility library
pathlib
urllib3
accelerate
bitsandbytes
click
'''

# Save the requirements to a file
with open('requirements.txt', 'w') as file:
    file.write(requirements)

# Run the pip install command
!pip install -r requirements.txt

# !pip install langchain==0.0.191 chromadb==0.3.22 llama-cpp-python==0.1.48 pdfminer.six==20221105 InstructorEmbedding sentence-transformers faiss-cpu huggingface_hub transformers protobuf==3.20.0 auto-gptq urllib3==1.26.6 accelerate bitsandbytes click openpyxl


Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting openai-whisper (from -r requirements.txt (line 3))
  Downloading openai-whisper-20230314.tar.gz (792 kB)
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
Collecting langchain (from -r requirements.txt (line 4))
  Downloading langchain-0.0.209-py3-none-any.whl (1.1 MB)
Collecting chromadb (from -r requirements.txt (line 5))
  Downloading chromadb-0.3.26-py3-none-any.whl (123 kB)
Collecti

## Import packages

In [None]:
import os
import shutil
from pathlib import Path
from datetime import datetime
from chromadb.config import Settings
from langchain.document_loaders import TextLoader, PDFMinerLoader, CSVLoader, UnstructuredExcelLoader
import click

import whisper
from langchain.embeddings import HuggingFaceInstructEmbeddings
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.vectorstores import Chroma


from langchain.chains import RetrievalQA

import torch
from auto_gptq import AutoGPTQForCausalLM
from langchain.llms import HuggingFacePipeline

from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor, as_completed
from langchain.docstore.document import Document

from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    GenerationConfig,
    LlamaForCausalLM,
    LlamaTokenizer,
    pipeline,
)

from google.colab import files

## Configuration Form

In [None]:
#@title ### Please Fill the Form

# BASE Directory
BASE_DIR = Path.cwd()

#@markdown **Audio files directory**:
#@markdown Give a valid `Path` by selecting or editing.
_AUDIO_DIR = 'audios' #@param ["audios"] {allow-input: true}
AUDIO_DIR = BASE_DIR.joinpath(_AUDIO_DIR)

#@markdown **Whisper transcribed output directory**:
#@markdown Give a valid `Path` by selecting or editing.
_TRANSCRIBE_DIR = 'transcriptions' #@param ["transcriptions"] {allow-input: true}
TRANSCRIBE_DIR = BASE_DIR.joinpath(_TRANSCRIBE_DIR)

#@markdown **Additional documents for knowledge base**:
#@markdown Give a valid `Path` by selecting or editing.
_DOCUMENTS_DIR = 'documents' #@param ["documents"] {allow-input: true}
DOCUMENTS_DIR = BASE_DIR.joinpath(_DOCUMENTS_DIR)
#@markdown ---

# Supported audio file extensions (TODO: More will be added later)
SUPPORTED_AUDIO_FILE_EXTENSIONS = ['.mp3', '.m4a', '.wav']


#@markdown **Whisper Model to be used**:
WHISPER_MODEL_NAME = 'base' #@param ["base", "tiny", "small", "medium", "large"] {allow-input: false}

#@markdown **Device type**:
#@markdown 'cuda' or 'cpu'
DEVICE_TYPE = 'cuda' #@param ["cuda", "cpu"] {allow-input: false}

# Map each supported document extension to its loader class
SUPPORTED_DOCUMENT_MAP = {
    ".txt": TextLoader,
    '.pdf': PDFMinerLoader,
    '.csv': CSVLoader,
    '.xls': UnstructuredExcelLoader,
    '.xlsx': UnstructuredExcelLoader
}


#@markdown **Instructor Model**:
#@markdown Choose the model you prefer, taking the `Runtime Type` into account.
#@markdown Use the default model if you are not sure about it.
_EMBEDDING_MODEL_NAME =  "hkunlp/instructor-large" #@param ["hkunlp/instructor-large"] {allow-input: false}
EMBEDDING_MODEL_NAME = _EMBEDDING_MODEL_NAME


#@markdown **Persisted Knowledge Base Directory**:
_KB_DIR =  "DB" #@param ["DB"] {allow-input: true}
KB_DIR = BASE_DIR.joinpath(_KB_DIR)

# KB Threads
KB_THREADS = os.cpu_count() or 8


# Chroma settings
CHROMA_SETTINGS = Settings(
    chroma_db_impl="duckdb+parquet",
    persist_directory=str(KB_DIR),
    anonymized_telemetry=False
)

#@markdown **LLM model id and Basename**:
#@markdown Choose the model you prefer, taking the `Runtime Type` into account.
#@markdown Use the default model if you are not sure about it.
LLM_MODEL_ID = "TheBloke/WizardLM-7B-uncensored-GPTQ"  #@param ["TheBloke/WizardLM-7B-uncensored-GPTQ"] {allow-input: false}
LLM_MODEL_BASENAME = "WizardLM-7B-uncensored-GPTQ-4bit-128g.compat.no-act-order.safetensors"  #@param ["WizardLM-7B-uncensored-GPTQ-4bit-128g.compat.no-act-order.safetensors", null] {allow-input: false}
#@markdown ---

#@markdown Set this to True if you want to see the sources of each result:
SHOW_SOURCES = False #@param {type:"boolean"}
#@markdown ---

CHAT_HISTORY_DIR = BASE_DIR.joinpath("chats")


## Configure Environment

In [None]:
AUDIO_DIR.mkdir(exist_ok=True)
TRANSCRIBE_DIR.mkdir(exist_ok=True)
DOCUMENTS_DIR.mkdir(exist_ok=True)
KB_DIR.mkdir(exist_ok=True)
CHAT_HISTORY_DIR.mkdir(exist_ok=True)

In [None]:
if torch.cuda.is_available():
    device = torch.cuda.get_device_name(0)
    print("GPU is available:", device)
else:
    if DEVICE_TYPE == "cuda":
        print("CPU is being used.")
        raise Exception("User selected the wrong DEVICE_TYPE or forgot to change the Runtime Type in Google Colaboratory.")


GPU is available: Tesla T4
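
The device check above can also be written as a small function that is easy to test without a GPU. This is only a sketch: the name `check_device` and its error messages are hypothetical, and in the notebook the second argument would come from `torch.cuda.is_available()`.

```python
def check_device(requested: str, cuda_available: bool) -> str:
    """Return the requested device, or raise if CUDA was requested but is missing."""
    if requested not in ("cuda", "cpu"):
        raise ValueError(f"Unsupported device type: {requested!r}")
    if requested == "cuda" and not cuda_available:
        raise RuntimeError(
            "DEVICE_TYPE is 'cuda' but no GPU was found; "
            "switch the Colab runtime type to GPU."
        )
    return requested


# In the notebook the second argument would be torch.cuda.is_available().
print(check_device("cpu", cuda_available=False))
```

Passing the availability flag in as a parameter keeps the check independent of `torch`, so it can be exercised on any machine.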


# Prepare Codebase

## Script Utils

In [None]:
def get_audio_files(audio_dir=AUDIO_DIR):
    audio_files = get_supported_file_paths(
        audio_dir, SUPPORTED_AUDIO_FILE_EXTENSIONS)
    return audio_files

def get_supported_file_paths(files_dir, extensions):
    file_paths = []
    path = Path(files_dir)
    for file in path.glob('**/*'):
        if file.is_file() and file.suffix.lower() in extensions:
            file_paths.append(file)
    return file_paths

def write_to_file(file_path, content):
    with open(file_path, 'w') as file:
        file.write(content)


def save_transcripsion(file_name, content):
    file_path = TRANSCRIBE_DIR.joinpath(file_name)
    write_to_file(file_path, content)

def save_chat_history(file_name, content):
    file_path = CHAT_HISTORY_DIR.joinpath(file_name)
    write_to_file(file_path, content)


def load_single_document(file_path: str) -> Document:
    # Loads a single document from a file path
    file_extension = os.path.splitext(file_path)[1]
    loader_class = SUPPORTED_DOCUMENT_MAP.get(file_extension)
    if loader_class:
        loader = loader_class(file_path)
    else:
        raise ValueError("Document type is undefined")
    return loader.load()[0]


def load_document_batch(filepaths):
    print("Loading document batch")
    # create a thread pool
    with ThreadPoolExecutor(len(filepaths)) as exe:
        # load files
        futures = [exe.submit(load_single_document, name)
                   for name in filepaths]
        # collect data
        data_list = [future.result() for future in futures]
        # return data and file paths
        return (data_list, filepaths)


def load_documents(source_dirs):
    print('source_dirs', source_dirs)
    # Loads all documents from the source documents directories
    paths = []
    for source_dir in source_dirs:
        all_files = os.listdir(source_dir)
        for file_path in all_files:
            file_extension = os.path.splitext(file_path)[1]
            source_file_path = os.path.join(source_dir, file_path)
            if file_extension in SUPPORTED_DOCUMENT_MAP.keys():
                paths.append(source_file_path)

    # Have at least one worker and at most KB_THREADS workers
    n_workers = min(KB_THREADS, max(len(paths), 1))
    # Keep the chunk size at least 1 so range() below never gets a zero step
    chunksize = max(1, round(len(paths) / n_workers))
    docs = []
    with ProcessPoolExecutor(n_workers) as executor:
        futures = []
        # split the load operations into chunks
        for i in range(0, len(paths), chunksize):
            # select a chunk of filenames
            filepaths = paths[i: (i + chunksize)]
            # submit the task
            future = executor.submit(load_document_batch, filepaths)
            futures.append(future)
        # process all results
        for future in as_completed(futures):
            # open the file and load the data
            contents, _ = future.result()
            docs.extend(contents)

    return docs


def is_iterable(obj):
    try:
        iter(obj)
        return True
    except TypeError:
        return False
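
`load_documents` splits the list of file paths into batches and hands each batch to a worker. The batching step on its own can be sketched as follows. This is a variant that uses ceiling division instead of `round`, so it never produces more batches than workers; the name `split_into_batches` is hypothetical.

```python
def split_into_batches(paths, n_workers):
    """Split paths into consecutive batches, at most one batch per worker."""
    if not paths:
        return []
    n_workers = max(1, min(n_workers, len(paths)))
    # Ceiling division: the batch size is rounded up so that the number
    # of batches never exceeds n_workers.
    batch_size = -(-len(paths) // n_workers)
    return [paths[i:i + batch_size] for i in range(0, len(paths), batch_size)]


print(split_into_batches(["a.pdf", "b.pdf", "c.txt", "d.csv", "e.txt"], 2))
```

An empty input returns an empty list, which avoids passing a zero step to `range()`.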


In [None]:
def upload_files(dir):
    # Prompt the user to select a file
    uploaded = files.upload()

    # Access the uploaded file
    for filename, content in uploaded.items():
        file_path = dir.joinpath(filename)
        with open(file_path, 'wb') as file:
            file.write(content)
            print(f"Saved at {file_path}")

def download_files(directory_paths):
    # Create zip archives for each directory
    for i, directory_path in enumerate(directory_paths):
        # make_archive appends '.zip' itself and returns the archive path
        archive_base = f'/content/directory{i+1}'
        zip_filename = shutil.make_archive(archive_base, 'zip', directory_path)

        # Download the zip file
        files.download(zip_filename)

        # Clean up the zip file
        # os.remove(zip_filename)
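
The archiving step in `download_files` can be tried outside Colab. The sketch below zips a temporary directory with `shutil.make_archive`, which appends the `.zip` extension itself and returns the path of the archive it created. The helper name `zip_directory` and the file names are made up for illustration.

```python
import shutil
import tempfile
from pathlib import Path


def zip_directory(directory: Path, archive_base: Path) -> Path:
    """Zip `directory` and return the path of the created archive."""
    # Pass the base name without a suffix; make_archive adds ".zip".
    created = shutil.make_archive(str(archive_base), "zip", root_dir=str(directory))
    return Path(created)


with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "chats"
    source.mkdir()
    (source / "chat.txt").write_text("User: hi\nCyborg: hello")
    archive = zip_directory(source, Path(tmp) / "chats_backup")
    print(archive.name)
```

Using the returned path avoids having to rebuild the archive file name by hand.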

## Script `generateKB.py`

### Load Whisper Model

In [None]:
model = whisper.load_model(WHISPER_MODEL_NAME)

100%|████████████████████████████████████████| 139M/139M [00:01<00:00, 124MiB/s]


### Define `generateKB`

In [None]:
def transcribe_audios(audio_files):
    # decode the audio
    # options = whisper.DecodingOptions(fp16=(DEVICE_TYPE == "cuda"))

    for audio_file in audio_files:
        # # load audio and pad/trim it to fit 30 seconds
        # audio = whisper.load_audio(audio_file)
        # audio = whisper.pad_or_trim(audio)

        # # make log-Mel spectrogram and move to the same device as the model
        # mel = whisper.log_mel_spectrogram(audio).to(model.device)

        # # detect the spoken language
        # _, probs = model.detect_language(mel)
        # print(f"Detected language: {max(probs, key=probs.get)}")

        # result = whisper.decode(model, mel, options)

        # # print(result.text)
        # if (is_iterable(result)):
        #     for r in result:
        #         print("r.text", r.text)
        # else:
        #     print("result", result.text)

        # # get the audio file name from path and save the transcription
        # file_name = audio_file.name.split('.')[0] + '.txt'
        # save_transcripsion(file_name, result.text)

        command = f"whisper '{audio_file}' --model {WHISPER_MODEL_NAME} --output_dir {str(TRANSCRIBE_DIR.relative_to(Path.cwd()))} --output_format txt --fp16={True if DEVICE_TYPE == 'cuda' else False}"
        print("command:", command)
        # os.system does not raise when the command fails, so check its exit status
        exit_status = os.system(command)
        if exit_status == 0:
            print(f"Transcribe of {audio_file} generated at: {TRANSCRIBE_DIR}")
        else:
            print(f"Transcribe of {audio_file} failed (exit status {exit_status})")
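
File names such as `convo about bs 23.mp3` contain spaces, so building the `whisper` call as an argument list avoids manual shell quoting. The sketch below only constructs and prints the command. The helper name `build_whisper_command` is hypothetical; the flags are the ones already used in this notebook.

```python
import shlex
from pathlib import Path


def build_whisper_command(audio_file, model_name, output_dir, use_fp16):
    """Build the whisper CLI call as an argument list (no shell quoting needed)."""
    return [
        "whisper", str(audio_file),
        "--model", model_name,
        "--output_dir", str(output_dir),
        "--output_format", "txt",
        "--fp16", str(use_fp16),
    ]


command = build_whisper_command(
    Path("audios/convo about bs 23.mp3"), "base", "transcriptions", True
)
# shlex.join shows the command as it would be typed in a shell.
print(shlex.join(command))
# To execute it, one option is subprocess.run(command, check=True),
# which raises CalledProcessError when whisper exits with an error.
```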


In [None]:
def generateKB():
    # audio_files = get_audio_files()
    # print("audio_files", audio_files)
    # transcribe_audios(audio_files)

    SOURCE_DIRECTORIES = [TRANSCRIBE_DIR, DOCUMENTS_DIR]

    # Load documents and split in chunks
    print(f"Loading documents from {SOURCE_DIRECTORIES} folders")
    documents = load_documents(SOURCE_DIRECTORIES)
    text_splitter = RecursiveCharacterTextSplitter(
        chunk_size=1000, chunk_overlap=200)
    texts = text_splitter.split_documents(documents)
    print(
        f"Loaded {len(documents)} documents from  {SOURCE_DIRECTORIES} folders")
    print(f"Split into {len(texts)} chunks of text")

    # Create embeddings
    embeddings = HuggingFaceInstructEmbeddings(
        model_name=EMBEDDING_MODEL_NAME,
        model_kwargs={"device": DEVICE_TYPE},
    )
    # Change the embedding type here if you are running into issues.
    # These are much smaller embeddings and will work for most applications.
    # If you use HuggingFaceEmbeddings, make sure to also use the same in the
    # other scripts.

    db = Chroma.from_documents(
        texts,
        embeddings,
        persist_directory=str(KB_DIR.absolute()),
        client_settings=CHROMA_SETTINGS,
    )
    db.persist()
    db = None
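
`generateKB` splits documents with `chunk_size=1000` and `chunk_overlap=200`. To illustrate what those two numbers mean, here is a simplified fixed-window splitter. It is not the algorithm used by `RecursiveCharacterTextSplitter`, which prefers to cut at separators such as paragraph and line breaks; the name `split_with_overlap` is hypothetical.

```python
def split_with_overlap(text, chunk_size=1000, chunk_overlap=200):
    """Cut text into fixed-size windows that share chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window reaches the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


sample = "".join(str(i % 10) for i in range(2500))
pieces = split_with_overlap(sample)
print([len(piece) for piece in pieces])
```

The overlap means the end of one chunk is repeated at the start of the next, so a sentence that straddles a boundary still appears whole in at least one chunk when it is shorter than the overlap.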

## Script `load_model.py`

In [None]:

def load_model(device_type, model_id, model_basename=None):
    """
    Select a model for text generation using the HuggingFace library.
    If you are running this for the first time, it will download a model for you.
    subsequent runs will use the model from the disk.

    Args:
        device_type (str): Type of device to use, e.g., "cuda" for GPU or "cpu" for CPU.
        model_id (str): Identifier of the model to load from HuggingFace's model hub.
        model_basename (str, optional): Basename of the model if using quantized models.
            Defaults to None.

    Returns:
        HuggingFacePipeline: A pipeline object for text generation using the loaded model.

    Raises:
        ValueError: If an unsupported model or device type is provided.
    """

    print(f"Loading Model: {model_id}, on: {device_type}")
    print("This action can take a few minutes!")

    if model_basename is not None:
        # The code supports all Hugging Face models whose names end with GPTQ and that have some variation
        # of .no-act.order or .safetensors in their HF repo.
        print("Using AutoGPTQForCausalLM for quantized models")

        if ".safetensors" in model_basename:
            # Remove the ".safetensors" ending if present
            model_basename = model_basename.replace(".safetensors", "")

        tokenizer = AutoTokenizer.from_pretrained(model_id, use_fast=True)
        print("Tokenizer loaded")

        model = AutoGPTQForCausalLM.from_quantized(
            model_id,
            model_basename=model_basename,
            use_safetensors=True,
            trust_remote_code=True,
            device="cuda:0" if device_type == "cuda" else device_type,
            use_triton=False,
            quantize_config=None,
        )
    elif device_type.lower() == "cuda":
        # The code supports all Hugging Face models whose names end with -HF or that have a .bin
        # file in their HF repo.
        print("Using AutoModelForCausalLM for full models")
        tokenizer = AutoTokenizer.from_pretrained(model_id)
        print("Tokenizer loaded")

        model = AutoModelForCausalLM.from_pretrained(
            model_id,
            device_map="auto",
            torch_dtype=torch.float16,
            low_cpu_mem_usage=True,
            trust_remote_code=True,
            # max_memory={0: "15GB"} # Uncomment this line if you encounter CUDA out of memory errors
        )
        model.tie_weights()
    else:
        print("Using LlamaTokenizer")
        tokenizer = LlamaTokenizer.from_pretrained(model_id)
        model = LlamaForCausalLM.from_pretrained(model_id)

    # Load configuration from the model to avoid warnings
    generation_config = GenerationConfig.from_pretrained(model_id)
    # see here for details:
    # https://huggingface.co/docs/transformers/
    # main_classes/text_generation#transformers.GenerationConfig.from_pretrained.returns

    # Create a pipeline for text generation
    pipe = pipeline(
        "text-generation",
        model=model,
        tokenizer=tokenizer,
        max_length=2048,
        temperature=0,
        top_p=0.95,
        repetition_penalty=1.15,
        generation_config=generation_config,
    )

    local_llm = HuggingFacePipeline(pipeline=pipe)
    print("Local LLM Loaded")

    return local_llm


## Script `get_retrieval_qa.py`

In [None]:
def get_retrieval_qa(device_type):
    embeddings = HuggingFaceInstructEmbeddings(
        model_name=EMBEDDING_MODEL_NAME, model_kwargs={"device": device_type})

    # uncomment the following line if you used HuggingFaceEmbeddings in the generateKB.py
    # embeddings = HuggingFaceEmbeddings(model_name=EMBEDDING_MODEL_NAME)

    # load the vectorstore
    db = Chroma(
        persist_directory=str(KB_DIR),
        embedding_function=embeddings,
        client_settings=CHROMA_SETTINGS,
    )
    retriever = db.as_retriever()

    llm = load_model(device_type, model_id=LLM_MODEL_ID,
                     model_basename=LLM_MODEL_BASENAME)

    qa = RetrievalQA.from_chain_type(
        llm=llm, chain_type="stuff", retriever=retriever, return_source_documents=True)

    return qa
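
With `chain_type="stuff"`, all retrieved chunks are placed into a single prompt and the LLM is called once. The sketch below shows that idea in plain Python. The function name `build_stuff_prompt` and the prompt wording are assumptions made for illustration and are not LangChain's actual template.

```python
def build_stuff_prompt(question, chunks):
    """Concatenate all retrieved chunks into one context block for a single LLM call."""
    context = "\n\n".join(chunks)
    return (
        "Use the following context to answer the question.\n\n"
        f"{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


retrieved = [
    "First retrieved chunk of text.",
    "Second retrieved chunk of text.",
]
print(build_stuff_prompt("What do the chunks say?", retrieved))
```

Because every chunk ends up in the same prompt, the number and size of retrieved chunks must fit within the model's context window together with the question and the answer.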

## Script `cyborgGPT.py`

In [None]:
def cyborgGPT(device_type, qa, show_sources):
    """
    This function implements the information retrieval task.


    1. Loads an embedding model, can be HuggingFaceInstructEmbeddings or HuggingFaceEmbeddings
    2. Loads the existing vectorstore that was created by generateKB.py
    3. Loads the local LLM using load_model function - You can now set different LLMs.
    4. Sets up the question-answering retrieval chain.
    5. Answers questions.
    """

    print(f"Running on: {device_type}")
    print(f"Display Source Documents set to: {show_sources}")
    # qa = get_retrieval_qa(device_type)

    chats = []

    # Interactive questions and answers
    print("----------------------------------Cyborg---------------------------")
    print("Starting QA Session with Cyborg...(Type 'exit' to end session)...")
    print("...................................................................")
    while True:
        query = input("\nEnter a query: ")
        if query == "exit":
            print("...................................................................")
            print("Cyborg: Ending Session, Good Luck!...")
            print("----------------------------------Cyborg---------------------------")
            # save the chat
            chat_txt = ""
            for chat in chats:
                # Append each exchange so the whole session is saved, not only the last turn
                chat_txt = f"{chat_txt}\n\nUser: {chat['query']} \nCyborg: {chat['response']}"

            # Get the current timestamp
            timestamp = datetime.now()
            # Convert the timestamp to a string
            timestamp_str = timestamp.strftime("%Y-%m-%d %H-%M-%S")
            save_chat_history(f"chat_{timestamp_str}.txt", chat_txt)

            # file_path = CHAT_HISTORY_DIR.joinpath(f"chat_{timestamp_str}.txt")
            # with open(file_path, 'w') as file:
            #     file.write(chat_txt)
            #     print(f"Chat saved at {file_path}")

            break
        # Get the answer from the chain
        res = qa(query)
        answer, docs = res["result"], res["source_documents"]

        # Print the result
        print("\n\n> Question:")
        print(query)
        print("\n> Answer:")
        print(answer)

        chats.append({"query": query, "response": answer})

        if show_sources:  # set this flag to False to hide the source documents
            # Print the relevant sources used for the answer
            print(
                "----------------------------------SOURCE DOCUMENTS---------------------------")
            for document in docs:
                print("\n> " + document.metadata["source"] + ":")
                print(document.page_content)
            print(
                "----------------------------------SOURCE DOCUMENTS---------------------------")
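
The chat-saving step at the end of a session can be isolated into two small helpers, which makes it easy to check that every exchange is written out and that an empty session does not fail. This is a sketch; the names `format_chat_history` and `chat_file_name` are hypothetical, while the timestamp format is the one used in `cyborgGPT`.

```python
from datetime import datetime


def format_chat_history(chats):
    """Join all query/response pairs into one text block, one exchange per paragraph."""
    return "\n\n".join(
        f"User: {chat['query']}\nCyborg: {chat['response']}" for chat in chats
    )


def chat_file_name(now):
    """Build the chat file name from a timestamp, using the notebook's format."""
    return f"chat_{now.strftime('%Y-%m-%d %H-%M-%S')}.txt"


history = [
    {"query": "First question", "response": "First answer"},
    {"query": "Second question", "response": "Second answer"},
]
print(chat_file_name(datetime.now()))
print(format_chat_history(history))
```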


# Upload Information (Audios and/or Documents)

## Upload Audios
You can upload multiple audio files. Supported formats: `'.mp3', '.m4a', '.wav'`.

In [None]:
# Prompt the user to select file
upload_files(AUDIO_DIR)

# Print the list of uploaded files
print("Audio files:")
print(*list(os.listdir(AUDIO_DIR)))

Saving convo about bs 23.mp3 to convo about bs 23.mp3
Saved at /content/audios/convo about bs 23.mp3
Audio files:
convo about bs 23.mp3


## Upload additional documents if needed

Relevant documents can be uploaded. Supported file types: `'.txt', '.pdf', '.csv', '.xls', '.xlsx'`.

In [None]:
# Prompt the user to select file
upload_files(DOCUMENTS_DIR)

# Print the list of uploaded files
print("Documents files:")
print(*list(os.listdir(DOCUMENTS_DIR)))

Saving bs23 website.pdf to bs23 website.pdf
Saving businesspostbd.pdf to businesspostbd.pdf
Saving tbsnews news.pdf to tbsnews news.pdf
Saving Top Rich People.csv to Top Rich People.csv
Saved at /content/documents/bs23 website.pdf
Saved at /content/documents/businesspostbd.pdf
Saved at /content/documents/tbsnews news.pdf
Saved at /content/documents/Top Rich People.csv
Documents files:
tbsnews news.pdf bs23 website.pdf Top Rich People.csv businesspostbd.pdf


# Generate Knowledge Base (KB)

This step creates a `KB` from your uploaded data. The first run takes longer because any required models are downloaded.

In [None]:
audio_files = get_audio_files()
print("audio_files", audio_files)
transcribe_audios(audio_files)

audio_files [PosixPath('/content/audios/convo about bs 23.mp3')]
command: whisper '/content/audios/convo about bs 23.mp3' --model base --output_dir transcriptions --output_format txt --fp16=True
Transcribe of /content/audios/convo about bs 23.mp3 generated at: /content/transcriptions


In [None]:
# generate Knowledge Base
generateKB()

Loading documents from [PosixPath('/content/transcriptions'), PosixPath('/content/documents')] folders
source_dirs [PosixPath('/content/transcriptions'), PosixPath('/content/documents')]
Loading document batch
Loading document batch
Loading document batch
Loaded 5 documents from  [PosixPath('/content/transcriptions'), PosixPath('/content/documents')] folders
Split into 34 chunks of text


Downloading (…)c7233/.gitattributes: 0.00B [00:00, ?B/s]

Downloading (…)_Pooling/config.json:   0%|          | 0.00/270 [00:00<?, ?B/s]

Downloading (…)/2_Dense/config.json:   0%|          | 0.00/116 [00:00<?, ?B/s]

Downloading pytorch_model.bin:   0%|          | 0.00/3.15M [00:00<?, ?B/s]

Downloading (…)9fb15c7233/README.md: 0.00B [00:00, ?B/s]

Downloading (…)b15c7233/config.json: 0.00B [00:00, ?B/s]

Downloading (…)ce_transformers.json:   0%|          | 0.00/122 [00:00<?, ?B/s]

Downloading pytorch_model.bin:   0%|          | 0.00/1.34G [00:00<?, ?B/s]

Downloading (…)nce_bert_config.json:   0%|          | 0.00/53.0 [00:00<?, ?B/s]

Downloading (…)cial_tokens_map.json: 0.00B [00:00, ?B/s]

Downloading spiece.model:   0%|          | 0.00/792k [00:00<?, ?B/s]

Downloading (…)c7233/tokenizer.json: 0.00B [00:00, ?B/s]

Downloading (…)okenizer_config.json: 0.00B [00:00, ?B/s]

Downloading (…)15c7233/modules.json:   0%|          | 0.00/461 [00:00<?, ?B/s]

load INSTRUCTOR_Transformer
max_seq_length  512


# Load CyborgGPT

In [None]:
cyborgQA = get_retrieval_qa(DEVICE_TYPE)

load INSTRUCTOR_Transformer
max_seq_length  512
Loading Model: TheBloke/WizardLM-7B-uncensored-GPTQ, on: cuda
This action can take a few minutes!
Using AutoGPTQForCausalLM for quantized models


Downloading (…)okenizer_config.json:   0%|          | 0.00/727 [00:00<?, ?B/s]

Downloading tokenizer.model:   0%|          | 0.00/500k [00:00<?, ?B/s]

Downloading (…)/main/tokenizer.json: 0.00B [00:00, ?B/s]

Downloading (…)in/added_tokens.json:   0%|          | 0.00/21.0 [00:00<?, ?B/s]

Downloading (…)cial_tokens_map.json:   0%|          | 0.00/96.0 [00:00<?, ?B/s]

Tokenizer loaded


Downloading (…)lve/main/config.json:   0%|          | 0.00/552 [00:00<?, ?B/s]

Downloading (…)quantize_config.json:   0%|          | 0.00/57.0 [00:00<?, ?B/s]

Downloading (…)ct-order.safetensors:   0%|          | 0.00/3.89G [00:00<?, ?B/s]



Downloading (…)neration_config.json:   0%|          | 0.00/137 [00:00<?, ?B/s]

Xformers is not installed correctly. If you want to use memory_efficient_attention to accelerate training use the following command to install Xformers
pip install xformers.
The model 'LlamaGPTQForCausalLM' is not supported for text-generation. Supported models are ['BartForCausalLM', 'BertLMHeadModel', 'BertGenerationDecoder', 'BigBirdForCausalLM', 'BigBirdPegasusForCausalLM', 'BioGptForCausalLM', 'BlenderbotForCausalLM', 'BlenderbotSmallForCausalLM', 'BloomForCausalLM', 'CamembertForCausalLM', 'CodeGenForCausalLM', 'CpmAntForCausalLM', 'CTRLLMHeadModel', 'Data2VecTextForCausalLM', 'ElectraForCausalLM', 'ErnieForCausalLM', 'GitForCausalLM', 'GPT2LMHeadModel', 'GPT2LMHeadModel', 'GPTBigCodeForCausalLM', 'GPTNeoForCausalLM', 'GPTNeoXForCausalLM', 'GPTNeoXJapaneseForCausalLM', 'GPTJForCausalLM', 'LlamaForCausalLM', 'MarianForCausalLM', 'MBartForCausalLM', 'MegaForCausalLM', 'MegatronBertForCausalLM', 'MvpForCausalLM', 'OpenLlamaForCausalLM', 'OpenAIGPTLMHeadModel', 'OPTForCausalLM', 'Peg

Local LLM Loaded


# Chat with CyborgGPT

The `KB` is now ready, and you can ask `CyborgGPT` relevant questions.

In [None]:
# Start Chat with Cyborg Bot
cyborgGPT(DEVICE_TYPE, cyborgQA, SHOW_SOURCES)

Running on: cuda
Display Source Documents set to: False
----------------------------------Cyborg---------------------------
Starting QA Session with Cyborg...(Type 'exit' to end session)...
...................................................................

Enter a query: tell me about bs23


> Question:
tell me about bs23

> Answer:
 BrainStation 23 Limited is a leading software development company based in Bangladesh. Founded in 2006 by Raisul Kabir, the company has since grown to employ over 700+ software engineers and operate in multiple countries worldwide. Our mission is to provide digital solutions that help businesses and organizations achieve their goals while empowering people to succeed. We strive to create a positive impact through innovative thinking and strategic partnerships. Our leadership values include creativity, collaboration, commitment, and customer satisfaction.

Enter a query: Who is the founder of BS23


> Question:
Who is the founder of BS23

> Answer:
 Raisu





> Question:
Can you provide me the members of BS23?

> Answer:
 Unfortunately, I do not have this information. Could you please clarify if you need help finding specific information related to BS23 or if you would like me to provide additional details about the company?

Enter a query: Head of HR of Brainstation 23


> Question:
Head of HR of Brainstation 23

> Answer:
 Hello! How can I assist you today?

Enter a query: Name of Head of HR of Brainstation 23?


> Question:
Name of Head of HR of Brainstation 23?

> Answer:
 The current head of HR of Brainstation 23 is Ms. Farah 
Yasmin.

Enter a query: exit
...................................................................
Cyborg: Ending Session, Good Luck!...
----------------------------------Cyborg---------------------------


In [None]:
# Start Chat with Cyborg Bot with SHOW_SOURCES to True
cyborgGPT(DEVICE_TYPE, cyborgQA, True)

# Download Generated Data

In [None]:
# Directories to download
# directory_paths = [TRANSCRIBE_DIR, KB_DIR, CHAT_HISTORY_DIR]
directory_paths = [CHAT_HISTORY_DIR, TRANSCRIBE_DIR]

In [None]:
download_files(directory_paths)
