# FastChat Demo
This notebook uses the [OpenAI REST API](https://platform.openai.com/docs/api-reference/introduction) to interact with LLMs hosted in a [FastChat](https://github.com/lm-sys/FastChat) deployment.
FastChat only supports chat completion and embeddings API endpoints.
For use on [jupyterhub.sdsu.edu](https://jupyterhub.sdsu.edu), select the image "Stack PRP". The Stack PRP image and FastChat both use version 0.28.1 of the OpenAI Python library.

The OpenAI REST API endpoint is available at [https://sdsu-rci-fastchat.nrp-nautilus.io/v1](https://sdsu-rci-fastchat.nrp-nautilus.io/v1).

Store your credentials in a file named `env.yaml`. Your instructor will share the API key with you. Your `env.yaml` file should follow the structure of the provided sample, `env-template.yaml`.
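The cells in this notebook read three settings from `env.yaml`: `fastchat.api_key`, `fastchat.base_url`, and `file_name`. As a minimal sketch, assuming those are the only settings needed, the loaded dictionary could be checked before any request is made. The helper name `missing_keys` and the placeholder values are hypothetical and are not part of the notebook or of FastChat.

```python
# Hypothetical helper: checks the dictionary returned by yaml.safe_load.
# The required keys are the ones this notebook reads; adjust if your template differs.
REQUIRED_KEYS = [("fastchat", "api_key"), ("fastchat", "base_url"), ("file_name",)]


def missing_keys(env):
    """Return dotted names of required settings that are absent or empty."""
    missing = []
    for path in REQUIRED_KEYS:
        node = env
        for key in path:
            if not isinstance(node, dict) or key not in node:
                node = None
                break
            node = node[key]
        if not node:
            missing.append(".".join(path))
    return missing


# Placeholder values for illustration only
sample_env = {
    "fastchat": {
        "api_key": "PLACEHOLDER",
        "base_url": "https://sdsu-rci-fastchat.nrp-nautilus.io/v1",
    },
    "file_name": "paper.txt",
}

print(missing_keys(sample_env))        # []
print(missing_keys({"fastchat": {}}))  # ['fastchat.api_key', 'fastchat.base_url', 'file_name']
```

Failing early on a missing key tends to give a clearer message than a `KeyError` raised halfway through the notebook.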

In [1]:
import yaml
import openai

## Import Environment Variables

In [3]:
with open('../env.yaml', 'r') as f:
    env = yaml.safe_load(f)

print(env["fastchat"]["base_url"])

https://sdsu-rci-fastchat-test.nrp-nautilus.io/v1


## Setup API Credentials

In [5]:
openai.api_key = env["fastchat"]["api_key"]
openai.api_base = env["fastchat"]["base_url"]

# Test config by printing available models
models = openai.Model.list()
print(models)

{
  "object": "list",
  "data": [
    {
      "id": "THUDM/CogVLM",
      "object": "model",
      "created": 1711214415,
      "owned_by": "fastchat",
      "root": "THUDM/CogVLM",
      "parent": null,
      "permission": [
        {
          "id": "modelperm-3VgQB7H2hXn5stePKtGZg4",
          "object": "model_permission",
          "created": 1711214415,
          "allow_create_engine": false,
          "allow_sampling": true,
          "allow_logprobs": true,
          "allow_search_indices": true,
          "allow_view": true,
          "allow_fine_tuning": false,
          "organization": "*",
          "group": null,
          "is_blocking": false
        }
      ]
    },
    {
      "id": "mistralai/Mixtral-8x7B-Instruct-v0.1",
      "object": "model",
      "created": 1711214415,
      "owned_by": "fastchat",
      "root": "mistralai/Mixtral-8x7B-Instruct-v0.1",
      "parent": null,
      "permission": [
        {
          "id": "modelperm-RbDZPwxX2vW9oPNb3NJtTq",
         

## Preprocess paper.txt

In [7]:
text_filename = env['file_name']
text_filename

'paper.txt'

In [10]:
transcript_raw = ""

with open(text_filename, 'r') as f:
    transcript_raw = f.read()

# Calculate and print info about raw file
rawCharCount = len(transcript_raw)
rawWordCount = len(transcript_raw.split())
rawLineCount = len(transcript_raw.split("\n"))

print(f"Raw transcript character count: {rawCharCount}")
print(f"Raw transcript word count: {rawWordCount}")
print(f"Raw transcript line count: {rawLineCount}")


Raw transcript character count: 5129
Raw transcript word count: 696
Raw transcript line count: 7


In [13]:
# Split the transcript into lines, then rejoin them with spaces so that
# words at line boundaries are not fused together
transcript_transform = transcript_raw.split("\n")
transcript_concat = " ".join(transcript_transform)

# Some models limit input length. Shorten the text here by changing the slice
# (the slice counts characters, not words). The full text is kept by default.
final_sentence = transcript_concat[:]
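If a limit expressed in words is preferred over a character slice, one possible approach is sketched below. The helper `truncate_words` is hypothetical and is not part of the notebook. It counts whitespace-separated words, which is only a rough proxy for the token count a model actually sees.

```python
def truncate_words(text, max_words=None):
    """Keep at most max_words whitespace-separated words; None keeps the text unchanged."""
    if max_words is None:
        return text
    words = text.split()
    # Rejoining with single spaces also collapses runs of whitespace
    return " ".join(words[:max_words])


example = "Attention mechanisms have   significantly advanced sequence modeling"
print(truncate_words(example, 3))  # Attention mechanisms have
print(truncate_words(example) == example)  # True
```

With this helper, `final_sentence = truncate_words(transcript_concat, 500)` would keep the first 500 words; the value 500 is only an illustration.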

## Ask the LLM to Perform the Analysis


In [7]:
# Select a model by its position in the listing above; any model id from that listing can be used instead
# In that listing, index 1 is "mistralai/Mixtral-8x7B-Instruct-v0.1"
model = models.data[1].id

initial_prompt = "You will be given the introduction to a scientific paper in Artificial Intelligence. \
From this introduction: Provide the top 3 items discussed and what the researcher is trying to accomplish."

prompt = final_sentence

# create a chat completion
completion = openai.ChatCompletion.create(
  model=model,
  messages=[
      {"role": "system", "content": initial_prompt},
      {"role": "user", "content": prompt}
  ]
)

# print the completion
print(completion.choices[0].message.content)

The researcher is trying to accomplish the following three items:

1. Introduce a new attention mechanism called the Gaussian Adaptive Attention Mechanism (GAAM) that improves the standard self-attention mechanism in Transformers.
2. Enhance the Transformer model's Attention mechanism to improve its ability to interpret and process sequential and spatial data.
3. Improve the performance of the Transformer model across various domains such as multimedia recommendation, image classification, and text classification by using GAAM.
4. Provide a more interpretable framework for Artificial Intelligence (AI) by introducing GAAM, which offers improved accuracy, robustness, and user experience across diverse real-world applications.
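The analysis cell above selects the model by list position, which silently changes if the deployment reorders or replaces its models. A minimal sketch of selecting by id instead is shown below. The helper `pick_model_id` is hypothetical, and the sample entries copy only the `id` field of the two models visible in the earlier listing.

```python
def pick_model_id(model_entries, wanted_id):
    """Return wanted_id if the deployment lists it, otherwise raise with the available ids."""
    available = [entry["id"] for entry in model_entries]
    if wanted_id not in available:
        raise ValueError(f"Model {wanted_id!r} not found. Available: {available}")
    return wanted_id


# Stand-in for models["data"]; only the "id" field is used
sample_entries = [
    {"id": "THUDM/CogVLM"},
    {"id": "mistralai/Mixtral-8x7B-Instruct-v0.1"},
]

print(pick_model_id(sample_entries, "mistralai/Mixtral-8x7B-Instruct-v0.1"))
```

In the notebook, this could be called as `model = pick_model_id(models["data"], "mistralai/Mixtral-8x7B-Instruct-v0.1")`, assuming the response object supports dictionary-style access as it does in version 0.28 of the library.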


In [17]:
# How to use the stream parameter
model = models.data[1].id

initial_prompt = "You will be given the introduction to a scientific paper in Artificial Intelligence. \
From this introduction: Provide the top 3 items discussed and what the researcher is trying to accomplish."

prompt = final_sentence

# create a chat completion
completion = openai.ChatCompletion.create(
  model=model,
  messages=[
      {"role": "system", "content": initial_prompt},
      {"role": "user", "content": prompt}
  ],
  stream=True
)

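# Print each chunk as it arrives; print() ends every chunk with a newline
for chunk in completion:
    try:
        print(chunk.choices[0].delta.content)
    except (KeyError, AttributeError):
        # Skip chunks that carry no text, such as the initial role-only chunk
        continue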



The
 researcher
 is trying
 to accomplish
 the following
:


1
. Int
roduce
 a significant
 enhancement
 to the
 Transformer
 model'
s Att
ention mechanism
 by introdu
cing the
 (Multi
-Head
) Gaussian
 Adapt
ive Att
ention Mechan
ism (
GAAM
).

2.
 Impro
ve the
 Transformer
 model'
s cap
ability to
 interpret and
 process sequ
ential and
 spatial data
.

3.
 Impro
ve model
 performance in
 various domains
 such as
 multimedia recommendation
, image
 classification,
 and text
 classification.

4
. Prov
ide a
 more interpre
table framework
 for Art
ificial
 Intelligence
 by address
ing the
 critical need
 for trans
parency
 and trust
worthiness
 in real
-world
 AI
 systems.

5
. Create
 a mechanism
 that lear
ns both
 the mean
 and variance
 of input
 features in
 a Multi
-Head
ed setting
, allowing
 for a
 focused approach
 to different
 skewn
ess aspects
 in data
 subsets.

6
. Gener
ate local
 attention outputs
 from each
 head and
 combine them
 to construct
 a compreh
ensive Global
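The output above breaks at every chunk because `print` adds a newline each time. As a minimal sketch, the text can be rendered continuously by printing with `end=""`. The helper `delta_text` is hypothetical, and `fake_stream` is a hand-written stand-in for the streamed response; it assumes each chunk can be read like a dictionary with a `choices[0].delta.content` field, as in the cell above.

```python
def delta_text(chunk):
    """Return the text carried by one streamed chunk, or "" if it has none."""
    choices = chunk.get("choices") or []
    if not choices:
        return ""
    delta = choices[0].get("delta") or {}
    return delta.get("content") or ""


# Hand-written stand-in for the chunks yielded when stream=True
fake_stream = [
    {"choices": [{"delta": {"role": "assistant"}}]},   # role-only chunk, no text
    {"choices": [{"delta": {"content": "The"}}]},
    {"choices": [{"delta": {"content": " researcher"}}]},
    {"choices": [{"delta": {}, "finish_reason": "stop"}]},  # final chunk, no text
]

for chunk in fake_stream:
    # end="" keeps the chunks on one line; flush=True shows them immediately
    print(delta_text(chunk), end="", flush=True)
print()
```

Returning an empty string for chunks without text avoids the `try`/`except` around every access.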

In [17]:
# How to use the stream parameter to collect chunks into sentence-like segments
model = models.data[1].id

initial_prompt = "You will be given the introduction to a scientific paper in Artificial Intelligence. \
From this introduction: Provide the top 3 items discussed and what the researcher is trying to accomplish."

prompt = final_sentence

# create a chat completion
completion = openai.ChatCompletion.create(
  model=model,
  messages=[
      {"role": "system", "content": initial_prompt},
      {"role": "user", "content": prompt}
  ],
  stream=True
)

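punctuation_marks = [",", "?", "!", ".", ":", "*"]
sentence = ""
# Buffer streamed characters and print the buffer whenever a punctuation mark arrives
for chunk in completion:
    try:
        for char in chunk.choices[0].delta.content:
            sentence += char
            if char in punctuation_marks:
                print(sentence)
                sentence = ""
    except (KeyError, AttributeError, TypeError):
        # Skip chunks that carry no text (missing or None content)
        continue

# Print any text left over after the last punctuation mark
if sentence:
    print(sentence)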

The top three items discussed in the introduction are:


1.
 Attention mechanisms have significantly advanced the field of sequence modeling,
 particularly in NLP and various branches of signal processing.

2.
 Attention mechanisms in Transformer models have inherent limitations in handling long-range dependencies due to quadratic complexity.

3.
 Ongoing research continues to address these challenges and seeks more efficient ways to model long sequences and capture global context dependencies.


The researcher is trying to accomplish the following:


1.
 Introduce a significant enhancement to the Transformer model's Attention mechanism:
 the (Multi-Head) Gaussian Adaptive Attention Mechanism (GAAM).

2.
 Improve the capability of the Transformer model to interpret and process sequential and spatial data using a more context-sensitive and interpretable approach.

3.
 Enhance the model performance in various domains,
 including multimedia recommendation,
 image classification,
 and text
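The buffering logic in the last cell can also be written as a generator that is independent of the API, which makes it easy to try out on plain strings. This is a minimal sketch; `iter_segments` is a hypothetical name, and the generator also yields whatever text remains after the last punctuation mark.

```python
PUNCTUATION = frozenset(",?!.:*")


def iter_segments(pieces, marks=PUNCTUATION):
    """Yield text up to and including each punctuation mark, then any remainder."""
    buffer = ""
    for piece in pieces:
        for char in piece:
            buffer += char
            if char in marks:
                yield buffer
                buffer = ""
    if buffer:
        # Text after the final punctuation mark
        yield buffer


# Strings standing in for the text of successive streamed chunks
for segment in iter_segments(["Hello, wor", "ld. Tail"]):
    print(repr(segment))
# 'Hello,'
# ' world.'
# ' Tail'
```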