# Perplexity LLM API
- [blog post introducing it](https://blog.perplexity.ai/blog/introducing-pplx-api)
- [api docs](https://docs.perplexity.ai/docs)
- [quickstart for chat completions](https://docs.perplexity.ai/reference/post_chat_completions)
- available models (name, context length in tokens)
    - codellama-34b-instruct, 16384
    - llama-2-70b-chat, 4096
    - mistral-7b-instruct, 4096
    - pplx-7b-chat, 8192
    - pplx-70b-chat, 4096
    - pplx-7b-online, 4096
    - pplx-70b-online, 4096
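
A minimal sketch of how the list above could be used to sanity-check a request before sending it. It assumes the number next to each model is its context length in tokens, and that the prompt and the requested completion share that budget. `CONTEXT_WINDOW` and `fits_context` are names made up for this note, not part of any API.

```python
# Values copied from the model list above
# (assumed to be context lengths in tokens).
CONTEXT_WINDOW = {
    "codellama-34b-instruct": 16384,
    "llama-2-70b-chat": 4096,
    "mistral-7b-instruct": 4096,
    "pplx-7b-chat": 8192,
    "pplx-70b-chat": 4096,
    "pplx-7b-online": 4096,
    "pplx-70b-online": 4096,
}


def fits_context(model, prompt_tokens, max_tokens):
    """Return True if prompt plus requested completion fits the model's window.

    Assumption: the window covers prompt and completion tokens together.
    """
    return prompt_tokens + max_tokens <= CONTEXT_WINDOW[model]
```

For example, `fits_context("mistral-7b-instruct", 46, 2048)` is true, while a 3000-token prompt with `max_tokens=2048` would not fit a 4096-token model under this assumption.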

In [1]:
import openai
import os
import pandas as pd
import numpy as np

In [2]:
# TODO: the key is not being picked up from ~/.bashrc in this kernel; investigate why
# PERPLEXITY_API_KEY = os.environ.get('PERPLEXITY_API_KEY')

In [37]:
PERPLEXITY_API_KEY=''
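
A possible alternative to pasting the key into the notebook, sketched under assumptions. One likely reason for the TODO above is that `~/.bashrc` is only read by interactive bash shells, so a kernel started some other way never sees the variable. The helper below (`load_api_key` is a hypothetical name) reads the environment first and falls back to an interactive prompt.

```python
import os
from getpass import getpass


def load_api_key(env_var="PERPLEXITY_API_KEY", prompt=getpass):
    """Return the key from the environment, or ask for it if it is unset.

    `prompt` is injectable so the fallback can be replaced in tests;
    by default it is getpass, which hides the typed value.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        key = prompt(f"{env_var}: ").strip()
    return key
```

Usage would be `PERPLEXITY_API_KEY = load_api_key()`, which keeps the key out of the saved notebook.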

## Sample Code Structure Perplexity Provides
* The structure is theirs; I replaced the user prompt with my own

In [4]:
messages = [
    {
        "role": "system",
        "content": (
            "You are an artificial intelligence assistant and you need to "
            "engage in a helpful, detailed, polite conversation with a user."
        ),
    },
    {
        "role": "user",
        "content": (
            "What are some simple tricks to improve my aim at darts?"
        ),
    },
]

# demo chat completion without streaming
response = openai.ChatCompletion.create(
    model="mistral-7b-instruct",
    messages=messages,
    api_base="https://api.perplexity.ai",
    api_key=PERPLEXITY_API_KEY,
)
print(response)

{
  "id": "4abe46d6-fbb2-4825-bde6-07df992c8df2",
  "model": "mistral-7b-instruct",
  "created": 7617692,
  "usage": {
    "prompt_tokens": 46,
    "completion_tokens": 89,
    "total_tokens": 135
  },
  "object": "chat.completion",
  "choices": [
    {
      "index": 0,
      "finish_reason": "stop",
      "message": {
        "role": "assistant",
        "content": "Hello! I'd be happy to help improve your aim at darts. Here are a few simple tricks that you may find helpful:\n\n1. First and foremost, it's important to hold the dart properly. Hold it with your dominant hand and make sure that the point of the dart is facing forward. Use your non-dominant hand to steady the dart as you release it.\n2. ..."
      },
      "delta": {
        "role": "assistant",
        "content": ""
      }
    }
  ]
}


## Streaming Example

In [5]:
# # demo chat completion with streaming
# response_stream = openai.ChatCompletion.create(
#     model="mistral-7b-instruct",
#     messages=messages,
#     api_base="https://api.perplexity.ai",
#     api_key=PERPLEXITY_API_KEY,
#     stream=True, #BE CAREFUL WITH THIS
# )
# for response in response_stream:
#     print(response)
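
Rather than printing every chunk, the streamed pieces could be joined into one string. This is only a sketch: it assumes each streamed chunk is dict-like and carries the new text in `choices[0]['delta']['content']`, the same `delta` field that appears in the non-streaming response above. `collect_stream` is a made-up helper name.

```python
def collect_stream(chunks):
    """Concatenate the incremental text carried by each streamed chunk.

    Assumes every chunk looks like {'choices': [{'delta': {'content': ...}}]};
    chunks with a missing or empty delta contribute nothing.
    """
    parts = []
    for chunk in chunks:
        delta = chunk["choices"][0].get("delta") or {}
        parts.append(delta.get("content") or "")
    return "".join(parts)
```

With the cell above uncommented, this would be called as `collect_stream(response_stream)`.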

## Only print the response message

In [6]:
response['choices'][0]['message']['content']

"Hello! I'd be happy to help improve your aim at darts. Here are a few simple tricks that you may find helpful:\n\n1. First and foremost, it's important to hold the dart properly. Hold it with your dominant hand and make sure that the point of the dart is facing forward. Use your non-dominant hand to steady the dart as you release it.\n2. ..."

## Quick Math on Current Pricing

In [7]:
pricing = pd.read_csv('pricing_for_perplexity_api.csv')

In [8]:
pricing

Unnamed: 0,model_parameter_count,per1m_input_tokens,per1m_output_tokens
0,7B,$0.07,$0.28
1,13B,$0.14,$0.56
2,34B,$0.35,$1.40
3,70B,$0.70,$2.80


In [9]:
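# regex=False so '$' is treated as a literal character, not an end-of-string anchor
pricing['per1m_input_tokens'] = pricing['per1m_input_tokens'].str.replace('$', '', regex=False).astype(float)
pricing['per1m_output_tokens'] = pricing['per1m_output_tokens'].str.replace('$', '', regex=False).astype(float)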

In [10]:
pricing

Unnamed: 0,model_parameter_count,per1m_input_tokens,per1m_output_tokens
0,7B,0.07,0.28
1,13B,0.14,0.56
2,34B,0.35,1.4
3,70B,0.7,2.8


In [20]:
def cost_of_message(response, pricing=pricing):
    """Return the cost of a single response in USD (as a one-element Series)."""
    # Model size is the second-to-last dash-separated token,
    # e.g. 'mistral-7b-instruct' -> '7B', 'llama-2-70b-chat' -> '70B'
    model_type = response['model'].split('-')[-2].upper()
    input_tokens = response['usage']['prompt_tokens']
    output_tokens = response['usage']['completion_tokens']

    # Rates are quoted per 1M tokens; each lookup returns a one-row Series
    input_rate = pricing[pricing.model_parameter_count == model_type].per1m_input_tokens
    output_rate = pricing[pricing.model_parameter_count == model_type].per1m_output_tokens

    input_cost = input_tokens * input_rate / 1_000_000
    output_cost = output_tokens * output_rate / 1_000_000

    cost = input_cost + output_cost
    return cost

In [21]:
cost_of_message(response=response,pricing=pricing)

0    0.000028
dtype: float64
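
The same arithmetic without pandas, as a sketch that returns a plain float. The rates are copied from the pricing table above; `RATES` and `cost_usd` are names introduced here for illustration. It assumes, like `cost_of_message`, that the model size is the second-to-last dash-separated part of the model name.

```python
# USD per 1M tokens as (input, output), copied from the pricing table above.
RATES = {
    "7B": (0.07, 0.28),
    "13B": (0.14, 0.56),
    "34B": (0.35, 1.40),
    "70B": (0.70, 2.80),
}


def cost_usd(model, prompt_tokens, completion_tokens):
    """Return the cost of one response in USD as a float."""
    # 'mistral-7b-instruct' -> '7B', 'llama-2-70b-chat' -> '70B'
    size = model.split("-")[-2].upper()
    input_rate, output_rate = RATES[size]
    return (prompt_tokens * input_rate + completion_tokens * output_rate) / 1_000_000
```

For the darts response (46 prompt tokens, 89 completion tokens on a 7B model) this works out to (46 × 0.07 + 89 × 0.28) / 1,000,000 = 28.14 / 1,000,000 ≈ 0.000028 USD, matching the value above.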

In [61]:
system_prompt = {
    'role': 'system',
     'content': 'You are an artificial intelligence assistant and you need to engage in a helpful, detailed, polite conversation with a user.'
    }

{'role': 'system',
 'content': 'You are an artificial intelligence assistant and you need to engage in a helpful, detailed, polite conversation with a user.'}

## Function to get answers

In [52]:
def ask_ppl(question,
            system_prompt=system_prompt,
            max_tokens=2048,
            model="mistral-7b-instruct",
            api_key=PERPLEXITY_API_KEY):
    """Send one question to the given model and return only the reply text."""

    user_prompt = {
        "role": "user",
        "content": question,
    }

    message = [
        system_prompt,
        user_prompt,
    ]
    response = openai.ChatCompletion.create(
        model=model,
        messages=message,
        max_tokens=max_tokens,
        api_base="https://api.perplexity.ai",
        api_key=api_key,  # use the argument, not the global, so callers can override the key
    )

    return response['choices'][0]['message']['content']

In [41]:
question

'who in your estimation are the most influential computer scientists of the 21st century?'

## Test Message Formatting

In [42]:
user_prompt = {
    "role": "user",
    "content": (
       question
    ),
}

message= [
        system_prompt,
        user_prompt
    ]

message

[{'role': 'system',
  'content': 'You are an artificial intelligence assistant and you need to engage in a helpful, detailed, polite conversation with a user.'},
 {'role': 'user',
  'content': 'who in your estimation are the most influential computer scientists of the 21st century?'}]

## Test Function

In [43]:
max_tokens = 2048
model='mistral-7b-instruct'

In [46]:
models = [
    'codellama-34b-instruct',
    'llama-2-70b-chat',
    'mistral-7b-instruct',
    'pplx-7b-chat',
    'pplx-70b-chat',
]

In [62]:
model = models[1]

In [63]:
response = openai.ChatCompletion.create(
    model=model,
    messages=message,
    max_tokens=max_tokens,
    api_base="https://api.perplexity.ai",
    api_key=PERPLEXITY_API_KEY,
)

## Full Record

In [64]:
response

<OpenAIObject chat.completion id=ca2617b2-ffb2-48c0-a362-fa750efbd964 at 0x7f01502d5270> JSON: {
  "id": "ca2617b2-ffb2-48c0-a362-fa750efbd964",
  "model": "llama-2-70b-chat",
  "created": 7699435,
  "usage": {
    "prompt_tokens": 67,
    "completion_tokens": 769,
    "total_tokens": 836
  },
  "object": "chat.completion",
  "choices": [
    {
      "index": 0,
      "finish_reason": "stop",
      "message": {
        "role": "assistant",
        "content": "Thank you for asking! There have been many influential computer scientists in the 21st century who have made significant contributions to the field. Here are a few of the most notable ones in my estimation:\n\n1. Andrew Ng - Known for his work in AI, machine learning, and deep learning, Andrew Ng is a pioneer in the field of computer science. He is the co-founder of Coursera, an online learning platform, and has worked at Google, where he founded the Google Brain deep learning project.\n2. Yann LeCun - Yann LeCun is a computer sci

## Just chat response

In [65]:
response['choices'][0]['message']['content']

'Thank you for asking! There have been many influential computer scientists in the 21st century who have made significant contributions to the field. Here are a few of the most notable ones in my estimation:\n\n1. Andrew Ng - Known for his work in AI, machine learning, and deep learning, Andrew Ng is a pioneer in the field of computer science. He is the co-founder of Coursera, an online learning platform, and has worked at Google, where he founded the Google Brain deep learning project.\n2. Yann LeCun - Yann LeCun is a computer scientist and the director of AI Research at Facebook. He is also the Silver Professor of Computer Science at New York University, and a professor at the Courant Institute of Mathematical Sciences. He is one of the founding researchers of convolutional neural networks (CNNs), and was a founding member of the image-recognition startup, Numenta\n3. Geoffrey Hinton - Geoffrey Hinton is a computer scientist and cognitive psychologist who is considered one of the lea

## Compare Model Answers

In [66]:
models

['codellama-34b-instruct',
 'llama-2-70b-chat',
 'mistral-7b-instruct',
 'pplx-7b-chat',
 'pplx-70b-chat']

In [57]:
question = 'who in your estimation are the most influential computer scientists of the 21st century?'

In [58]:
model_answers = {}

for model in models:
    answer = ask_ppl(question=question, model=model)
    model_answers[model] = answer
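
The loop above stops at the first model that raises (rate limit, bad model name, and so on). A sketch of a more forgiving variant follows; `collect_answers` is a hypothetical helper, and `ask` is whatever callable does the request, here assumed to accept `question=` and `model=` keywords the way `ask_ppl` does.

```python
def collect_answers(models, question, ask):
    """Ask every model the same question, recording failures instead of stopping.

    Returns (answers, errors): two dicts keyed by model name.
    """
    answers, errors = {}, {}
    for model in models:
        try:
            answers[model] = ask(question=question, model=model)
        except Exception as exc:  # keep going so one bad model does not lose the rest
            errors[model] = repr(exc)
    return answers, errors
```

It would be called as `model_answers, failed = collect_answers(models, question, ask_ppl)`.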

In [60]:
model_answers.keys()

dict_keys(['codellama-34b-instruct', 'llama-2-70b-chat', 'mistral-7b-instruct', 'pplx-7b-chat', 'pplx-70b-chat'])

In [67]:
model_answers['pplx-70b-chat']

'1. Turing Award winners: The Turing Award is considered the highest distinction in computer science, and winners of this award have undoubtedly made significant contributions to the field. Some notable Turing Award winners in the 21st century include:\n\n   - Shafi Goldwasser and Silvio Micali (2012) - For their work on cryptography and secure distributed computation\n   - Leslie Valiant (2010) - For his work on computational complexity and learning\n   - Michael Stonebraker (2014) - For his work on database systems\n\n2. Authors of seminal papers: Some computer scientists have published papers that have had a significant impact on the field. A few examples include:\n\n   - Sergey Brin and Lawrence Page (1998) - For their paper on the PageRank algorithm, which revolutionized search engines\n   - Jon Kleinberg and David Easley (2010) - For their paper on network science, which has applications in various fields\n   - Yoshua Bengio, Efstratios Gavves, and Dominique Bechmann (2015) - For