<a href="https://colab.research.google.com/github/Chirag314/LLM/blob/main/LLM2.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [23]:
!pip install -Uqqq rich openai tiktoken wandb tenacity

In [24]:
import tiktoken
import wandb
from pprint import pprint
from wandb.integration.openai import autolog
import random

In [25]:
import os
from getpass import getpass
import openai
from pathlib import Path
from rich.markdown import Markdown
import pandas as pd
from tenacity import(
    retry,
    stop_after_attempt,
    wait_random_exponential,
)


In [26]:
if os.getenv("OPENAI_API_KEY") is None:
  if any(['COLAB' in x for x in os.environ.keys()]):
    print('Please enter password in the prompt at the top of your window!')
  os.environ["OPENAI_API_KEY"] = getpass("Paste your OpenAI key from: https://platform.openai.com/account/api-keys\n")
  openai.api_key = os.getenv("OPENAI_API_KEY", "")

In [27]:
# For logging into W&B
autolog({"project":"llmapps","job_type":"introduction"})

VBox(children=(Label(value='0.027 MB of 0.027 MB uploaded (0.000 MB deduped)\r'), FloatProgress(value=1.0, max…

0,1
usage/completion_tokens,▁▂█
usage/elapsed_time,▁▁█
usage/prompt_tokens,▁▁█
usage/total_tokens,▁▁█

0,1
usage/completion_tokens,358.0
usage/elapsed_time,4.6468
usage/prompt_tokens,218.0
usage/total_tokens,576.0


[34m[1mwandb[0m: Currently logged in as: [33mcdesai[0m. Use [1m`wandb login --relogin`[0m to force relogin


GENERATING SYNTHETIC SUPPORT QUESTIONS

In [28]:
# ADD A RETRY BEHAVIOUR IN CASE WE HIT THE API RATE LIMIT
@retry(wait=wait_random_exponential(min=1,max=60),stop=stop_after_attempt(6))
def completion_with_backoff(**kwargs):
  return openai.ChatCompletion.create(**kwargs)

In [29]:
model_name="gpt-3.5-turbo"

In [30]:
system_prompt="You are a helpful assistant"
user_prompt="Generate a support question from W & B user"

def generate_and_print(system_prompt,user_prompt,n=5):
  messages=[
      {"role":"system","content":system_prompt},
      {"role":"user","content":user_prompt},
  ]
  responses=completion_with_backoff(
      model=model_name,
      messages=messages,
      n=n

  )
  for response in responses.choices:
    generation =response.message.content
    display(Markdown(generation))

generate_and_print(system_prompt,user_prompt)

In [31]:
#test if examples.txt is present, download it if not
if not Path("examples.txt").exists():
  !wget https://raw.githubusercontent.com/wandb/edu/main/llm-apps-course/notebooks/examples.txt

In [32]:
delimiter='\t'
with open('examples.txt','r') as file:
  data=file.read()
  real_queries=data.split(delimiter)

pprint(f"We have {len(real_queries)} real queries:")
Markdown(f"Sample one :\n\"{random.choice(real_queries)}\"")

'We have 228 real queries:'


We can now use those real user questions to guide our model to produce synthetic questions like those

In [33]:
def generate_few_shot_prompts(queries,n=3):
  prompt="Generate a support question from an W&B user\n"+\
        "Below you will find a few examples of real user queries"
  for _ in range(n):
    prompt +=random.choice(queries) + "\n"
  prompt +="Let's start!"
  return prompt

generation_prompt=generate_few_shot_prompts(real_queries)
Markdown(generation_prompt)

Lets check this

In [34]:
generate_and_print(system_prompt,user_prompt=generation_prompt)

Add context and response

#Check if directory exists, if not create it and downlaod the samples
if not os.path.exists("../notebooks"):
  !git clone https://github.com/wandb/edu.git
  !cp -r edu/llm-apps-course/docs_sample/notebooks ../

In [51]:
# check if directory exists, if not, create it and download the files, e.g if running in colab
if not os.path.exists("../docs_sample/"):
  !git clone https://github.com/wandb/edu.git
  !cp -r edu/llm-apps-course/docs_sample ../

In [52]:
def find_md_files(directory):
  "Find all markdown files in a directory and return their contents"
  md_files=[]
  for file in Path(directory).rglob("*.md"):
    with open(file,'r',encoding='utf-8') as md_file:
      content=md_file.read()
    md_files.append((file.relative_to(directory),content))
    return md_files

documents=find_md_files('../docs_sample/')
len(documents)

1

In [53]:
#LEts check if documents are not too long for our context window
tokenizer=tiktoken.encoding_for_model(model_name)
tokens_per_document=[len(tokenizer.encode(document)) for _,document in documents]
pprint(tokens_per_document)

[956]


Some of them are too long , extract random chunk from a document



In [54]:
def extract_random_chunk(document,max_tokens=512):
  tokens=tokenizer.encode(document)
  if len(tokens)<=max_tokens:
    return document
  start=random.randint(0,len(tokens)-max_tokens)
  end=start+max_tokens
  return tokenizer.decode(tokens[start:end])

Now we will use that extracted chunk to create a question tat can be answered by the document. This way we can generate questions that our current documentation is capable of answering.

In [55]:
def generate_context_prompt(chunk):
  prompt = "Generate a support question from a W&B user\n"+\
  "The question should be answerable by provided fragment of W&B documentation.\n" +\
  "Below you will find a fragment of W&B documenatation:\n"+\
  chunk + "\n"+\
  "Let's start!"
  return prompt

chunk=extract_random_chunk(documents[0][1])
generation_prmopt=generate_context_prompt(chunk)


In [56]:
Markdown(generation_prmopt)

In [43]:
generate_and_print(system_prompt,generation_prompt,n=3)

LEVEL 5 PROMPT
COMPLEX DIRECTIVE THAT INCLUDE:
1. Description of high level prompt
2. A detailed bulleted list of sub-tasks.
3. An explicit statement asking LLM to explain its own output
4. A guidance on how LLM output will be evaluated
5. Few shot examples

In [80]:
# we will use GPT4 model
model_name='gpt-3.5-turbo'

In [57]:
#test if examples.txt is present, download it if not
if not Path("system_template.txt").exists():
  !wget https://raw.githubusercontent.com/wandb/edu/main/llm-apps-course/notebooks/system_template.txt

--2023-08-30 22:59:10--  https://raw.githubusercontent.com/wandb/edu/main/llm-apps-course/notebooks/system_template.txt
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.108.133, 185.199.109.133, 185.199.110.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.108.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 469 [text/plain]
Saving to: ‘system_template.txt’


2023-08-30 22:59:10 (22.1 MB/s) - ‘system_template.txt’ saved [469/469]



In [59]:
# read system_template.txt file
with open("system_template.txt","r") as file:
  system_prompt=file.read()

In [60]:
Markdown(system_prompt)

In [62]:
#test if examples.txt is present, download it if not
if not Path("prompt_template.txt").exists():
  !wget https://raw.githubusercontent.com/wandb/edu/main/llm-apps-course/notebooks/prompt_template.txt

--2023-08-30 23:01:51--  https://raw.githubusercontent.com/wandb/edu/main/llm-apps-course/notebooks/prompt_template.txt
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.108.133, 185.199.109.133, 185.199.110.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.108.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1054 (1.0K) [text/plain]
Saving to: ‘prompt_template.txt’


2023-08-30 23:01:51 (38.2 MB/s) - ‘prompt_template.txt’ saved [1054/1054]



In [63]:
#read prompt_template.txt into an f-string
with open("prompt_template.txt","r") as file:
  prompt_template=file.read()

In [64]:
Markdown(prompt_template)

In [65]:
def generate_context_prompt(chunk,n_questions=3):
  questions = '\n'.join(random.sample(real_queries,n_questions))
  user_prompt=prompt_template.format(QUESTIONS=questions,CHUNK=chunk)
  return user_prompt
user_prompt=generate_context_prompt(chunk)

In [66]:
Markdown(user_prompt)

In [81]:
def generate_questions(documents,n_questions=3,n_generations=5):
  questions=[]
  for _, document in documents:
    chunk=extract_random_chunk(document)
    user_prompt=generate_context_prompt(chunk,n_questions)
    messages=[
        {"role":"system","content":system_prompt},
        {"role":"user","content":user_prompt},
    ]
    response=completion_with_backoff(
        model=model_name,
        messages=messages,
        n=n_generations
    )
    questions.extend([response.choices[i].message.content for i in range(n_generations)])
  return questions




In [82]:
# function to parse model generation and extract context, question and answer

def parse_generation(generation):
  lines=generation.split("\n")
  context=[]
  question=[]
  answer=[  ]
  flag=None

  for line in lines:
    if "CONTEXT:" in line:
      flag='context'
      line=line.replace("CONTEXT:","").strip()
    elif "QUESTION:" in line:
      flag = "question"
      line = line.replace("QUESTION:", "").strip()
    elif "ANSWER:" in line:
      flag = "answer"
      line = line.replace("ANSWER:", "").strip()

    if flag=='context':
      context.append(line)
    elif flag=='question':
      question.append(line)
    elif flag=='answer':
      answer.append(line)
  context='\n'.join(context)
  question='\n'.join(question)
  answer='\n'.join(answer)

  return context,question,answer


In [83]:
generations=generate_questions([documents[0]],n_questions=3,n_generations=5)


In [84]:
parse_generation(generations[0])

('The user is working on a machine learning project and is using Weights & Biases (W&B) to track and visualize their experiments. They have multiple hyperparameters to explore using sweeps, but they want to avoid certain combinations of hyperparameters that may cause GPU memory issues. The user is wondering if there is a way to prune unwanted combinations of hyperparameters during a sweep in W&B.\n',
 'How can I prune unwanted combinations of hyperparameters during a sweep in Weights & Biases?\n',
 'Yes, you can prune unwanted combinations of hyperparameters during a sweep in Weights & Biases. While there is not a built-in feature similar to optuna\'s special exception to cancel a run, you can achieve a similar result by using tags in Weights & Biases. You can define a threshold for a specific metric, and if a combination of hyperparameters results in a value below the threshold, you can add a specific tag to that run. Then, you can filter out or ignore runs with that tag when analyzin

In [85]:
parsed_generations=[]
generations=generate_questions(documents,n_questions=3,n_generations=5)

for generation in generations:
  context, question,answer=parse_generation(generation)
  parsed_generations.append({"context":context,"question":question,"answer":answer})

  #convert to pandas dataframe and save locally
  df=pd.DataFrame(parsed_generations)
  df.to_csv('generated_examples.csv',index=False)

  #log df as a table to W&B for interactive exploration
  wandb.log({'generated_examples': wandb.Table(dataframe=df)})

  #log csv file as an artifact
  artifact=wandb.Artifact("generated_examples",type='dataset')
  artifact.add_file("generated_examples.csv")
  wandb.log_artifact(artifact)

In [86]:
wandb.finish()

0,1
usage/completion_tokens,▁▂▁█▆
usage/elapsed_time,▁▁▁█▆
usage/prompt_tokens,▁▂▂██
usage/total_tokens,▁▂▁█▆

0,1
usage/completion_tokens,1488.0
usage/elapsed_time,9.35255
usage/prompt_tokens,1036.0
usage/total_tokens,2524.0
