# Perplexity LLM API
- [Blog post introducing the API](https://blog.perplexity.ai/blog/introducing-pplx-api)
- [API docs](https://docs.perplexity.ai/docs)
- [Quickstart for chat completions](https://docs.perplexity.ai/reference/post_chat_completions)
- Available models (name, context length in tokens)
    - codellama-34b-instruct, 16384
    - llama-2-70b-chat, 4096
    - mistral-7b-instruct, 4096
    - pplx-7b-chat, 8192
    - pplx-70b-chat, 4096
    - pplx-7b-online, 4096
    - pplx-70b-online, 4096

In [140]:
import openai
import os
import pandas as pd
import numpy as np
from pathlib import Path

from langchain.embeddings import HuggingFaceEmbeddings
from langchain.vectorstores import FAISS
from langchain.text_splitter import RecursiveCharacterTextSplitter
from sentence_transformers import SentenceTransformer
from langchain.document_loaders import UnstructuredMarkdownLoader
from langchain.document_loaders import TextLoader

In [2]:
# TODO: the key exported in .bashrc is not visible from this kernel; investigate why
# PERPLEXITY_API_KEY = os.environ.get('PERPLEXITY_API_KEY')

In [37]:
PERPLEXITY_API_KEY=''
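
Rather than pasting the key into a cell, one option is to read it from the environment and fail loudly when it is missing. This is a minimal sketch: `load_api_key` is a hypothetical helper, not part of the OpenAI or Perplexity clients. A kernel only sees variables that were exported before the Jupyter server started, which might explain the `.bashrc` problem in the TODO above (an assumption, not verified here).

```python
import os


def load_api_key(name="PERPLEXITY_API_KEY", env=None):
    """Return the API key stored under `name`, or raise if it is unset or empty."""
    # `env` is injectable so the helper can be exercised without touching os.environ
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; export it before starting Jupyter")
    return key
```

Usage would be `PERPLEXITY_API_KEY = load_api_key()` in place of the hard-coded empty string.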

## Sample Code Structure from the Perplexity Docs
* The structure follows their example; the user prompt is my own.

In [4]:
messages = [
    {
        "role": "system",
        "content": (
            "You are an artificial intelligence assistant and you need to "
            "engage in a helpful, detailed, polite conversation with a user."
        ),
    },
    {
        "role": "user",
        "content": (
            "What are some simple tricks to improve my aim at darts?"
        ),
    },
]

# demo chat completion without streaming
response = openai.ChatCompletion.create(
    model="mistral-7b-instruct",
    messages=messages,
    api_base="https://api.perplexity.ai",
    api_key=PERPLEXITY_API_KEY,
)
print(response)

{
  "id": "4abe46d6-fbb2-4825-bde6-07df992c8df2",
  "model": "mistral-7b-instruct",
  "created": 7617692,
  "usage": {
    "prompt_tokens": 46,
    "completion_tokens": 89,
    "total_tokens": 135
  },
  "object": "chat.completion",
  "choices": [
    {
      "index": 0,
      "finish_reason": "stop",
      "message": {
        "role": "assistant",
        "content": "Hello! I'd be happy to help improve your aim at darts. Here are a few simple tricks that you may find helpful:\n\n1. First and foremost, it's important to hold the dart properly. Hold it with your dominant hand and make sure that the point of the dart is facing forward. Use your non-dominant hand to steady the dart as you release it.\n2. ..."
      },
      "delta": {
        "role": "assistant",
        "content": ""
      }
    }
  ]
}


## Streaming Example

In [5]:
# # demo chat completion with streaming
# response_stream = openai.ChatCompletion.create(
#     model="mistral-7b-instruct",
#     messages=messages,
#     api_base="https://api.perplexity.ai",
#     api_key=PERPLEXITY_API_KEY,
#     stream=True, #BE CAREFUL WITH THIS
# )
# for response in response_stream:
#     print(response)
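
Printing every chunk is noisy. A small helper can stitch the streamed pieces back into one string. This is a sketch under an assumption: each chunk carries its incremental text in `choices[0]['delta']['content']`, as in OpenAI-style streaming (the non-streaming response above also has a `delta` field). `collect_stream` and `fake_stream` are hypothetical names, and the chunks below are hand-written stand-ins, not real API output.

```python
def collect_stream(chunks):
    """Join the incremental `delta` text from an iterable of streamed chunks."""
    parts = []
    for chunk in chunks:
        # a chunk may have no delta, or a delta without any content
        delta = chunk["choices"][0].get("delta") or {}
        text = delta.get("content")
        if text:
            parts.append(text)
    return "".join(parts)


# Hand-written stand-in chunks; a real call would pass `response_stream` instead.
fake_stream = [
    {"choices": [{"delta": {"role": "assistant", "content": "Hold "}}]},
    {"choices": [{"delta": {"content": "the dart "}}]},
    {"choices": [{"delta": {"content": "steady."}}]},
    {"choices": [{"delta": {}}]},
]
full_text = collect_stream(fake_stream)
print(full_text)
```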

## Only print the response message

In [6]:
response['choices'][0]['message']['content']

"Hello! I'd be happy to help improve your aim at darts. Here are a few simple tricks that you may find helpful:\n\n1. First and foremost, it's important to hold the dart properly. Hold it with your dominant hand and make sure that the point of the dart is facing forward. Use your non-dominant hand to steady the dart as you release it.\n2. ..."

## Quick Math on Current Pricing

In [7]:
pricing = pd.read_csv('pricing_for_perplexity_api.csv')

In [8]:
pricing

Unnamed: 0,model_parameter_count,per1m_input_tokens,per1m_output_tokens
0,7B,$0.07,$0.28
1,13B,$0.14,$0.56
2,34B,$0.35,$1.40
3,70B,$0.70,$2.80


In [9]:
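# regex=False treats '$' as a literal character, not the end-of-string anchor
pricing['per1m_input_tokens'] = pricing['per1m_input_tokens'].str.replace('$', '', regex=False).astype(float)
pricing['per1m_output_tokens'] = pricing['per1m_output_tokens'].str.replace('$', '', regex=False).astype(float)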

In [10]:
pricing

Unnamed: 0,model_parameter_count,per1m_input_tokens,per1m_output_tokens
0,7B,0.07,0.28
1,13B,0.14,0.56
2,34B,0.35,1.4
3,70B,0.7,2.8


In [20]:
def cost_of_message(response, pricing=pricing):
    """Return the cost of a single response in USD, as a one-element Series."""
    # 'mistral-7b-instruct' -> '7B'; assumes the size is the second-to-last token of the name
    model_type = response['model'].split('-')[-2].upper()
    input_tokens = response['usage']['prompt_tokens']
    output_tokens = response['usage']['completion_tokens']

    # select the pricing row whose parameter count matches the model
    rates = pricing[pricing.model_parameter_count == model_type]
    input_rate = rates.per1m_input_tokens
    output_rate = rates.per1m_output_tokens

    # rates are quoted per one million tokens
    input_cost = input_tokens * input_rate / 1_000_000
    output_cost = output_tokens * output_rate / 1_000_000

    return input_cost + output_cost

In [21]:
cost_of_message(response=response,pricing=pricing)

0    0.000028
dtype: float64
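
The same arithmetic can be done without pandas, returning a plain float instead of a Series. This is a sketch: the rates are copied from the pricing table above, while `RATES`, `model_size`, and `cost_usd` are hypothetical names introduced here. The regex assumes the parameter count appears in the model name as digits followed by `b`.

```python
import re

# USD per 1M tokens as (input, output), copied from the pricing table above.
RATES = {
    "7B": (0.07, 0.28),
    "13B": (0.14, 0.56),
    "34B": (0.35, 1.40),
    "70B": (0.70, 2.80),
}


def model_size(model_name):
    """Extract a size tag such as '70B' from a name like 'llama-2-70b-chat'."""
    match = re.search(r"(\d+)b(?:-|$)", model_name.lower())
    if match is None:
        raise ValueError(f"no parameter count found in {model_name!r}")
    return match.group(1) + "B"


def cost_usd(response, rates=RATES):
    """Return the cost of one response in USD as a plain float."""
    input_rate, output_rate = rates[model_size(response["model"])]
    usage = response["usage"]
    total = (usage["prompt_tokens"] * input_rate
             + usage["completion_tokens"] * output_rate)
    return total / 1_000_000


# Token counts taken from the mistral-7b-instruct response shown earlier.
sample = {
    "model": "mistral-7b-instruct",
    "usage": {"prompt_tokens": 46, "completion_tokens": 89},
}
print(cost_usd(sample))
```

For the sample response this works out to (46 × 0.07 + 89 × 0.28) / 1,000,000 ≈ 0.000028 USD, matching the Series value above.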

In [61]:
system_prompt = {
    'role': 'system',
    'content': 'You are an artificial intelligence assistant and you need to engage in a helpful, detailed, polite conversation with a user.',
}

{'role': 'system',
 'content': 'You are an artificial intelligence assistant and you need to engage in a helpful, detailed, polite conversation with a user.'}

## Function to get answers

In [52]:
def ask_ppl(question,
            system_prompt=system_prompt,
            max_tokens=2048,
            model="mistral-7b-instruct",
            api_key=PERPLEXITY_API_KEY):
    """Ask the Perplexity API a question and return only the reply text."""
    user_prompt = {
        "role": "user",
        "content": question,
    }

    message = [system_prompt, user_prompt]

    response = openai.ChatCompletion.create(
        model=model,
        messages=message,
        max_tokens=max_tokens,
        api_base="https://api.perplexity.ai",
        api_key=api_key,  # use the argument rather than the global
    )

    return response['choices'][0]['message']['content']

In [41]:
question

'who in your estimation are the most influential computer scientists of the 21st century?'

## Test Message Formatting

In [42]:
user_prompt = {
    "role": "user",
    "content": (
       question
    ),
}

message= [
        system_prompt,
        user_prompt
    ]

message

[{'role': 'system',
  'content': 'You are an artificial intelligence assistant and you need to engage in a helpful, detailed, polite conversation with a user.'},
 {'role': 'user',
  'content': 'who in your estimation are the most influential computer scientists of the 21st century?'}]

## Test Function

In [43]:
max_tokens = 2048
model='mistral-7b-instruct'

In [46]:
models = [
    'codellama-34b-instruct',
    'llama-2-70b-chat',
    'mistral-7b-instruct',
    'pplx-7b-chat',
    'pplx-70b-chat',
]

In [62]:
model = models[1]

In [63]:
response = openai.ChatCompletion.create(
    model=model,
    messages=message,
    max_tokens=max_tokens,
    api_base="https://api.perplexity.ai",
    api_key=PERPLEXITY_API_KEY,
)

## Full Record

In [64]:
response

<OpenAIObject chat.completion id=ca2617b2-ffb2-48c0-a362-fa750efbd964 at 0x7f01502d5270> JSON: {
  "id": "ca2617b2-ffb2-48c0-a362-fa750efbd964",
  "model": "llama-2-70b-chat",
  "created": 7699435,
  "usage": {
    "prompt_tokens": 67,
    "completion_tokens": 769,
    "total_tokens": 836
  },
  "object": "chat.completion",
  "choices": [
    {
      "index": 0,
      "finish_reason": "stop",
      "message": {
        "role": "assistant",
        "content": "Thank you for asking! There have been many influential computer scientists in the 21st century who have made significant contributions to the field. Here are a few of the most notable ones in my estimation:\n\n1. Andrew Ng - Known for his work in AI, machine learning, and deep learning, Andrew Ng is a pioneer in the field of computer science. He is the co-founder of Coursera, an online learning platform, and has worked at Google, where he founded the Google Brain deep learning project.\n2. Yann LeCun - Yann LeCun is a computer sci

## Just chat response

In [65]:
response['choices'][0]['message']['content']

'Thank you for asking! There have been many influential computer scientists in the 21st century who have made significant contributions to the field. Here are a few of the most notable ones in my estimation:\n\n1. Andrew Ng - Known for his work in AI, machine learning, and deep learning, Andrew Ng is a pioneer in the field of computer science. He is the co-founder of Coursera, an online learning platform, and has worked at Google, where he founded the Google Brain deep learning project.\n2. Yann LeCun - Yann LeCun is a computer scientist and the director of AI Research at Facebook. He is also the Silver Professor of Computer Science at New York University, and a professor at the Courant Institute of Mathematical Sciences. He is one of the founding researchers of convolutional neural networks (CNNs), and was a founding member of the image-recognition startup, Numenta\n3. Geoffrey Hinton - Geoffrey Hinton is a computer scientist and cognitive psychologist who is considered one of the lea

## Compare Model Answers

In [66]:
models

['codellama-34b-instruct',
 'llama-2-70b-chat',
 'mistral-7b-instruct',
 'pplx-7b-chat',
 'pplx-70b-chat']

In [57]:
question = 'who in your estimation are the most influential computer scientists of the 21st century?'

In [58]:
model_answers = {}

for model in models:
    answer = ask_ppl(question=question, model=model)
    model_answers[model] = answer
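
If one model errors out (a rate limit or an unavailable model, say), the loop above stops and the remaining answers are never fetched. A sketch of a more forgiving loop follows; `collect_answers` is a hypothetical helper, and `ask` stands for any callable with the same keyword arguments as `ask_ppl`.

```python
def collect_answers(models, question, ask):
    """Call `ask(question=..., model=...)` once per model.

    Returns (answers, errors): two dicts keyed by model name, so a failure
    for one model is recorded without aborting the rest of the loop.
    """
    answers, errors = {}, {}
    for model in models:
        try:
            answers[model] = ask(question=question, model=model)
        except Exception as exc:  # broad on purpose: keep the loop going
            errors[model] = repr(exc)
    return answers, errors
```

With the notebook's names this would be called as `model_answers, failed = collect_answers(models, question, ask_ppl)`.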

In [60]:
model_answers.keys()

dict_keys(['codellama-34b-instruct', 'llama-2-70b-chat', 'mistral-7b-instruct', 'pplx-7b-chat', 'pplx-70b-chat'])

In [67]:
model_answers['pplx-70b-chat']

'1. Turing Award winners: The Turing Award is considered the highest distinction in computer science, and winners of this award have undoubtedly made significant contributions to the field. Some notable Turing Award winners in the 21st century include:\n\n   - Shafi Goldwasser and Silvio Micali (2012) - For their work on cryptography and secure distributed computation\n   - Leslie Valiant (2010) - For his work on computational complexity and learning\n   - Michael Stonebraker (2014) - For his work on database systems\n\n2. Authors of seminal papers: Some computer scientists have published papers that have had a significant impact on the field. A few examples include:\n\n   - Sergey Brin and Lawrence Page (1998) - For their paper on the PageRank algorithm, which revolutionized search engines\n   - Jon Kleinberg and David Easley (2010) - For their paper on network science, which has applications in various fields\n   - Yoshua Bengio, Efstratios Gavves, and Dominique Bechmann (2015) - For

# Use for RAG (Retrieval-Augmented Generation)

## Embed Articles from The Last Psychiatrist

In [69]:
# use tiny model for embeddings
model_id = "sentence-transformers/all-MiniLM-L6-v2"
embeddings = HuggingFaceEmbeddings(model_name=model_id)


In [72]:
# Load text data from a file using TextLoader
loader = TextLoader('data/who_can_know_how_much_randi_zu.txt')
documents = loader.load()

In [73]:
# format it for processing by embedding model
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=50)
text_chunks = text_splitter.split_documents(documents)
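
To make `chunk_size` and `chunk_overlap` concrete, here is a fixed-window illustration in plain Python. It is not how `RecursiveCharacterTextSplitter` works internally (that splitter tries to break on separators such as paragraphs and sentences first); `sliding_chunks` is a hypothetical stand-in that only shows how consecutive chunks share `chunk_overlap` characters.

```python
def sliding_chunks(text, chunk_size=1000, chunk_overlap=50):
    """Split `text` into windows of `chunk_size` chars that overlap by `chunk_overlap`."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        # stop once the current window already reaches the end of the text
        if start + chunk_size >= len(text):
            break
        start += step
    return chunks


demo = sliding_chunks("abcdefghijklmnopqrstuvwxy", chunk_size=10, chunk_overlap=2)
print(demo)
```

Each chunk after the first begins with the last two characters of the previous one, which is the role the 50-character overlap plays in the cell above.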

In [96]:
vectorstore_of_docs = FAISS.from_documents(text_chunks, embedding=embeddings)

## Function to prep docs for encoding

In [159]:
def load_text_files_for_encoding(file_names, directory):
    """Load each text file from `directory` and split the contents into chunks."""
    documents = []
    for text_file in file_names:
        # build the path from the argument, not from a global variable
        loader = TextLoader(str(Path(directory) / text_file))
        documents.extend(loader.load())

    text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=50)
    chunked_documents = text_splitter.split_documents(documents)

    return chunked_documents

## Encoding a bunch of documents

In [160]:
directory = Path('/home/nick/Documents/repos/llm_playground/data')

file_names = [file.name for file in directory.glob('*.txt')]

chunked_documents = load_text_files_for_encoding(file_names, directory)

In [164]:
vectorstore_of_docs = FAISS.from_documents(chunked_documents, embedding=embeddings)

## Test the embeddings

In [165]:
# the keyword is `k`; an unrecognised `top_k` keyword does not limit the result count
related_text = vectorstore_of_docs.similarity_search('narcissism', k=3)

In [166]:
related_text

[Document(page_content="who like her-- the people she has to settle for-- are... not great.Genetics took care of her body but the upbringing affected her vision: the childhood of never good enough filters her present reality, obscures it, she can't see what is plain to everyone else, e.g. she's beautiful. So the process is to uncover the reasons why her view of reality is distorted and help her realign with reality. Use insight to strengthen her damaged ego, or, if you want a ten step approach, block automatic thoughts. In short, to understand that she is good, that men do find her attractive, not just the brazen ones, not just jerks.IV.If you think of narcissism as grandiosity you miss the nuances, e.g. in her case the problem is narcissism without any grandiosity: she is so consumed with her identity (as not pretty) that she is not able to read, to empathize with, other people's feelings. She doesn't care to try because it conflicts with how she sees herself. Ergo: Giorgio Armani was
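
Conceptually, the search embeds the query and returns the chunks whose vectors lie closest to it. The brute-force sketch below shows that idea with cosine similarity on toy 2-D vectors. It is a stand-in, not the FAISS implementation: FAISS uses an index and may rank by a different metric (L2 distance is common), and `cosine` and `top_k_indices` are hypothetical names.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k_indices(query_vec, doc_vecs, k=3):
    """Return the indices of the k vectors most similar to `query_vec`, best first."""
    ranked = sorted(
        range(len(doc_vecs)),
        key=lambda i: cosine(query_vec, doc_vecs[i]),
        reverse=True,
    )
    return ranked[:k]


# Toy 2-D "embeddings"; real ones from all-MiniLM-L6-v2 have 384 dimensions.
doc_vecs = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [-1.0, 0.0]]
best = top_k_indices([1.0, 0.0], doc_vecs, k=2)
print(best)
```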

## Ask Question with Context

In [167]:
def ask_ppl(question,
            system_prompt=system_prompt,
            max_tokens=2048,
            model="mistral-7b-instruct",
            api_key=PERPLEXITY_API_KEY,
            context=False,
            top_k=5,
            vectorstore=None):
    """Ask the Perplexity API a question, optionally adding context retrieved from `vectorstore`."""

    if context:
        # retrieve the top_k most similar chunks and prepend them to the question
        related_text = vectorstore.similarity_search(question, k=top_k)
        # iterate over the results themselves: fewer than top_k may come back
        context_text = ' / '.join(doc.page_content for doc in related_text)
        files_used = ','.join(sorted({doc.metadata['source'] for doc in related_text}))
        context_string = f'{context_text} || from the files {files_used}'
        # update question string
        question =  f'''Use the following pieces of context to answer the question at the end. 
            Try to make the response brief and only answer the portion after Question:
            Do not refer to the text from this prompt outside of the context.
            Structure your answers as well composed english sentences and do not include '\n'
            
            {context_string}
            
            Question: {question}
            Answer: '''

    user_prompt = {
        "role": "user",
        "content": (
           question
        ),
    }
    
    message= [
            system_prompt,
            user_prompt
        ]
    
    response = openai.ChatCompletion.create(
        model=model,
        messages=message,
        max_tokens=max_tokens,
        api_base="https://api.perplexity.ai",
        api_key=api_key,  # use the argument rather than the global
    )

    return response['choices'][0]['message']['content']
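
The prompt assembly is easier to inspect when it lives in its own function, separate from retrieval and the API call. A sketch, assuming the retrieved chunk texts and their source file names are already plain strings; `build_rag_prompt` is a hypothetical helper and its wording is a condensed version of the template above, not a drop-in replacement.

```python
def build_rag_prompt(question, chunks, sources):
    """Assemble a context-augmented prompt from chunk texts and their source files."""
    context = " / ".join(chunks)
    # de-duplicate and sort so the file list is stable between runs
    files_used = ", ".join(sorted(set(sources)))
    return (
        "Use the following pieces of context to answer the question at the end.\n"
        "Keep the response brief and answer only the part after 'Question:'.\n\n"
        f"{context} || from the files {files_used}\n\n"
        f"Question: {question}\n"
        "Answer: "
    )


prompt = build_rag_prompt("Q?", ["a", "b"], ["y.txt", "x.txt", "y.txt"])
print(prompt)
```

Printing the prompt before sending it is a cheap way to check what the model actually sees.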

In [168]:
test_query = 'Why is Randi Zuckerberg well known?'

In [169]:
ask_ppl(test_query)

'Randi Zuckerberg is well known because she is the sister of Mark Zuckerberg, the founder and CEO of Facebook. She is also an entrepreneur and a philanthropist in her own right, and has been involved in several successful ventures and initiatives. Randi has been featured in numerous media outlets and has been recognized for her work and achievements in the tech industry.'

In [170]:
ask_ppl(question=test_query,
            system_prompt=system_prompt,
            max_tokens=2048, 
            model="mistral-7b-instruct",
            api_key=PERPLEXITY_API_KEY,
           context=True,
           top_k=5,
       vectorstore=vectorstore_of_docs)

"Randi Zuckerberg is well known because she is the CEO of Zuckerberg Media, which was invited to the World Economic Forum at Davos. She also wrote a book about social media and gives interviews. It's important to note that she has not been invited because of nepotism or because of her brother, but because of her own achievements and contributions. The media has recognized her expertise and interested in her opinions, though she believes that anyone who disagrees with her is a hater."

In [None]:
ask_ppl(question=test_query,
            system_prompt=system_prompt,
            max_tokens=2048, 
            model="llama-2-70b-chat",
            api_key=PERPLEXITY_API_KEY,
           context=True,
           top_k=5,
       vectorstore=vectorstore_of_docs)

In [173]:
test_query_2 = "what is narcissism, according to the Last Psychiatrist?"

In [174]:
ask_ppl(question=test_query_2,
            system_prompt=system_prompt,
            max_tokens=2048, 
            model="mistral-7b-instruct",
            api_key=PERPLEXITY_API_KEY,
           context=True,
           top_k=5,
       vectorstore=vectorstore_of_docs)



In [175]:
ask_ppl(question=test_query_2,
            system_prompt=system_prompt,
            max_tokens=2048, 
            model="llama-2-70b-chat",
            api_key=PERPLEXITY_API_KEY,
           context=True,
           top_k=5,
       vectorstore=vectorstore_of_docs)

"According to the Last Psychiatrist, narcissism is a pathology characterized by an inability to see things from other people's perspectives, a need to maintain a particular identity, and a tendency to reject others who do not mirror one's own desires. It is also associated with a lack of self-awareness and a tendency to blame external factors for one's own problems. The Last Psychiatrist suggests that narcissism is often treated in a way that reinforces the narcissist's existing beliefs and behaviors, rather than challenging them to change."