<a href="https://colab.research.google.com/github/sudarshan-koirala/youtube-stuffs/blob/main/llamaindex/perplexity%2Bllamaindex.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Perplexity with LlamaIndex

## [YouTube Video Covering this Notebook](https://youtu.be/PHEZ6AHR57w?si=hUIkRIhV17YA0w9W)

Before getting started, install `llama-index`, along with `watermark` and `gradio`, which are used later in this notebook.

In [None]:
%%capture
!pip install llama-index watermark gradio

In [None]:
%load_ext watermark
%watermark -a "Sudarshan Koirala" -vmp llama_index,openai,gradio

Author: Sudarshan Koirala

Python implementation: CPython
Python version       : 3.10.12
IPython version      : 7.34.0

llama_index: 0.9.3.post1
openai     : 1.3.3
gradio     : 4.4.1

Compiler    : GCC 11.4.0
OS          : Linux
Release     : 5.15.120+
Machine     : x86_64
Processor   : x86_64
CPU cores   : 2
Architecture: 64bit



## Set Up the LLM

As of Nov 14, 2023, the `Perplexity` LLM class in LlamaIndex supports the following models:

| Model | Context Length | Model Type |
|-------|----------------|------------|
| codellama-34b-instruct | 16384 | Chat Completion |
| llama-2-13b-chat | 4096 | Chat Completion |
| llama-2-70b-chat | 4096 | Chat Completion |
| mistral-7b-instruct | 4096 [1] | Chat Completion |
| openhermes-2-mistral-7b | 4096 [1] | Chat Completion |
| openhermes-2.5-mistral-7b | 4096 [1] | Chat Completion |
| replit-code-v1.5-3b | 4096 | Text Completion |
| pplx-7b-chat-alpha | 4096 | Chat Completion |
| pplx-70b-chat-alpha | 4096 | Chat Completion |

[1] The context length of mistral-7b-instruct and openhermes-2-mistral-7b will be increased to 32k tokens (see the Perplexity roadmap).

The latest supported models are listed at https://docs.perplexity.ai/docs/model-cards \
Rate limits are documented at https://docs.perplexity.ai/docs/rate-limits

## Without LlamaIndex
- The Perplexity blog post [Introducing pplx-api](https://blog.perplexity.ai/blog/introducing-pplx-api) covers calling the API directly.
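
As a rough sketch of what a direct call can look like without LlamaIndex: pplx-api exposes an OpenAI-style chat completions endpoint, so a request can be built with the standard library alone. The helper names `build_chat_request` and `send_chat_request` are illustrative and not part of any library, and the response shape read in `send_chat_request` (`choices[0].message.content`) is an assumption based on the OpenAI-compatible format; check the Perplexity docs before relying on it.

```python
import json
import urllib.request

# Chat completions endpoint of pplx-api (OpenAI-compatible).
PPLX_CHAT_URL = "https://api.perplexity.ai/chat/completions"


def build_chat_request(api_key, messages, model="mistral-7b-instruct"):
    """Build (but do not send) a POST request for a chat completion."""
    payload = {"model": model, "messages": messages}
    return urllib.request.Request(
        PPLX_CHAT_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send_chat_request(request, timeout=30):
    """Send the request and return the reply text (assumed response shape)."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]


messages = [
    {"role": "system", "content": "Be precise and concise."},
    {"role": "user", "content": "Tell me 5 sentences about Sam Altman."},
]
request = build_chat_request("your_pplx_api_key", messages)
# print(send_chat_request(request))  # needs a valid API key and network access
```

Building the request separately from sending it keeps the payload easy to inspect before any network call is made.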

## With LlamaIndex

In [None]:
# Perplexity API key (replace the placeholder with your own key)
pplx_api_key = "your_pplx_api_key"

In [None]:
#Perplexity??

In [None]:
from llama_index.llms import Perplexity

llm = Perplexity(
    api_key=pplx_api_key, model="mistral-7b-instruct", temperature=0.5
)

In [None]:
from llama_index.llms.base import ChatMessage

messages_dict = [
    {"role": "system", "content": "Be precise and concise."},
    {"role": "user", "content": "Tell me 5 sentences about Sam Altman."},
]
messages = [ChatMessage(**msg) for msg in messages_dict]

## Chat

In [None]:
response = llm.chat(messages)
print(response)

assistant: 1. Sam Altman is an American entrepreneur and investor, known for co-founding Y Combinator, a startup accelerator.
2. He has also served as CEO of Y Combinator and as a general partner at Y Combinator Ventures.
3. Altman has invested in numerous successful startups, including Reddit, Airbnb, and Stripe.
4. He is also a prominent figure in the field of artificial intelligence and has founded several AI-related companies, including DeepMind, which was sold to Alphabet in 2014.
5. Altman has been recognized for his contributions to the tech industry and has received numerous awards, including the National Medal of Technology and the Presidential Rank Award.


## Async Chat

In [None]:
response = await llm.achat(messages)
print(response)

assistant: 1. Sam Altman is the co-founder and CEO of OpenAI, a non-profit organization focused on artificial intelligence.
2. He is also a co-founder of Y Combinator, a startup accelerator and venture fund.
3. Altman has been involved in the development of several successful startups, including Reddit and LoopNet.
4. He is a prominent figure in the tech industry and has been recognized for his contributions to the field.
5. Altman has also been involved in various philanthropic efforts, including supporting research into artificial intelligence and space exploration.


## Stream Chat

In [None]:
resp = llm.stream_chat(messages)
for r in resp:
    print(r.delta, end="")

1. Sam Altman is an American entrepreneur and investor.
2. He is the co-founder and CEO of Y Combinator, a startup accelerator.
3. Altman is also the co-founder of several other companies, including LoopNet and Max Levchin.
4. He has invested in numerous successful startups, including Reddit, Twitter, and Stripe.
5. Altman is known for his involvement in the tech industry and his support of entrepreneurship.
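
The loop above prints each chunk's `delta`, which holds only the newly generated text. If the full reply is also needed afterwards, the deltas can be collected while they are printed. This is a minimal sketch that uses a stand-in generator instead of a real API call; `Chunk`, `fake_stream_chat`, and `collect_stream` are hypothetical names that only mimic the `.delta` attribute used above.

```python
from dataclasses import dataclass
from typing import Iterator


@dataclass
class Chunk:
    """Stand-in for a streamed response chunk; only `.delta` is modelled."""
    delta: str


def fake_stream_chat(text: str, size: int = 8) -> Iterator[Chunk]:
    """Yield `text` in pieces of `size` characters, like a token stream."""
    for start in range(0, len(text), size):
        yield Chunk(delta=text[start:start + size])


def collect_stream(chunks: Iterator[Chunk]) -> str:
    """Print each delta as it arrives and return the assembled reply."""
    parts = []
    for chunk in chunks:
        print(chunk.delta, end="", flush=True)
        parts.append(chunk.delta)
    print()
    return "".join(parts)


full_text = collect_stream(fake_stream_chat("Sam Altman is an American entrepreneur."))
```

With the real client, passing `llm.stream_chat(messages)` to `collect_stream` should follow the same pattern, assuming each yielded item exposes `.delta` as in the cell above.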

## Async Stream Chat

In [None]:
resp = await llm.astream_chat(messages)
async for delta in resp:
    print(delta.delta, end="")

1. Sam Altman is the co-founder and CEO of OpenAI, a leading AI research and development company.
2. He is also the co-founder of Y Combinator, a startup incubator and venture capital firm.
3. Altman has been involved in the development of several other successful startups, including LoopNet, which was acquired by Autodesk for $800 million.
4. He is a strong advocate for the development of safe and beneficial artificial intelligence, and has written extensively on the subject.
5. In addition to his work in the tech industry, Altman is also involved in several other areas, including education and space exploration.

## Conclusion
- Synchronous chat calls (`chat`, `stream_chat`) block the program until a response is received.
- Asynchronous chat calls (`achat`, `astream_chat`) let the program continue with other tasks while waiting for the response, which improves responsiveness when several requests are in flight.
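
The benefit of the asynchronous calls shows up when several requests run at once. The sketch below uses a hypothetical `fake_achat` coroutine that simulates network latency with `asyncio.sleep` in place of `llm.achat`, so it runs without an API key; the helper names and the delay are illustrative only.

```python
import asyncio
import time


async def fake_achat(prompt: str, delay: float = 0.1) -> str:
    """Hypothetical stand-in for `llm.achat`: waits, then returns a reply."""
    await asyncio.sleep(delay)  # simulated network latency
    return f"reply to: {prompt}"


async def ask_all(prompts):
    """Run all requests concurrently; results keep the order of `prompts`."""
    return await asyncio.gather(*(fake_achat(p) for p in prompts))


def timed_run(prompts):
    """Return (replies, elapsed seconds) for one concurrent run."""
    start = time.perf_counter()
    replies = asyncio.run(ask_all(prompts))
    return replies, time.perf_counter() - start
```

Because the simulated requests overlap, the elapsed time should be close to the longest single delay rather than the sum of all delays. Inside a notebook an event loop is already running, so `await ask_all(prompts)` would be used there instead of `asyncio.run`.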

## Simple Gradio app

In [None]:
import gradio as gr

In [None]:
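def chat_with_pplx_model(user_input):
    """Send one user message to the Perplexity model and return the reply text."""
    messages_dict = [
        {"role": "system", "content": "Be precise and concise."},
        {"role": "user", "content": user_input},
    ]
    messages = [ChatMessage(**msg) for msg in messages_dict]

    llm = Perplexity(
        api_key=pplx_api_key, model="mistral-7b-instruct", temperature=0.5
    )
    response = llm.chat(messages)
    # Return only the reply text so the Gradio output omits the "assistant:" prefix.
    return response.message.content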

In [None]:
iface = gr.Interface(fn=chat_with_pplx_model, inputs="text", outputs="text")
iface.launch()

Setting queue=True in a Colab notebook requires sharing enabled. Setting `share=True` (you can turn this off by setting `share=False` in `launch()` explicitly).

Colab notebook detected. To show errors in colab notebook, set debug=True in launch()
Running on public URL: https://333c10e92db1fcbb51.gradio.live

This share link expires in 72 hours. For free permanent hosting and GPU upgrades, run `gradio deploy` from Terminal to deploy to Spaces (https://huggingface.co/spaces)


