<a href="https://colab.research.google.com/github/erno123/Open-source-LLMs-API-trial/blob/main/Replicate.com%20Llama2%20and%20Mixtral%20API%20trial.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Getting Started with Replicate
This notebook shows how to run models on [Replicate](https://replicate.com).

Last updated: 2024-01-17




# Setup

To run this notebook, you’ll need to create a [Replicate](https://replicate.com) account and install the Replicate Python client.

In [1]:
# install replicate client
!pip install replicate
import replicate

Collecting replicate
  Downloading replicate-0.23.0-py3-none-any.whl (36 kB)
Collecting httpx<1,>=0.21.0 (from replicate)
  Downloading httpx-0.26.0-py3-none-any.whl (75 kB)
Collecting httpcore==1.* (from httpx<1,>=0.21.0->replicate)
  Downloading httpcore-1.0.2-py3-none-any.whl (76 kB)
Collecting h11<0.15,>=0.13 (from httpcore==1.*->httpx<1,>=0.21.0->replicate)
  Downloading h11-0.14.0-py3-none-any.whl (58 kB)
Installing collected packages: h11, httpcore, httpx, replicate
Successfully installed h11-0.14.0 httpcore-1.0.2 httpx-0.26.0 replicate-0.23.0


# Authenticate
Before running any Python scripts that use the API, you need to set your Replicate API token in your environment.

Grab your token from [replicate.com/account](https://replicate.com/account) and set it as an environment variable:

    export REPLICATE_API_TOKEN=<your token>

We recommend not adding the token directly to your source code, because you don't want to put your credentials in source control. If anyone used your API token, their usage would be charged to your account.

In [2]:
# get a token: https://replicate.com/account
from getpass import getpass
import os

REPLICATE_API_TOKEN = getpass()
os.environ["REPLICATE_API_TOKEN"] = REPLICATE_API_TOKEN

··········
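
Before making any API call, it can help to confirm that the token really is in the environment, so that a missing token produces a clear message instead of an authentication error later on. The sketch below is one way to do that. `require_token` is a hypothetical helper, not part of the `replicate` package; it only reads the `REPLICATE_API_TOKEN` variable described above.

```python
import os


def require_token(env=os.environ, name="REPLICATE_API_TOKEN"):
    """Return the API token from the environment, or raise a clear error.

    Hypothetical helper for illustration; not part of the replicate package.
    """
    token = env.get(name, "").strip()
    if not token:
        raise RuntimeError(
            f"{name} is not set; export it or set os.environ[{name!r}] first."
        )
    return token
```

Calling `require_token()` right after the cell above either returns the token or stops with a readable error. The `env` parameter exists only so the function can be tried with a plain dictionary.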


# Run a model
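You can run any public model on Replicate from your Python code. The following examples run [meta/llama-2-70b-chat](https://replicate.com/meta/llama-2-70b-chat) and [mistralai/mixtral-8x7b-instruct-v0.1](https://replicate.com/mistralai/mixtral-8x7b-instruct-v0.1), printing the generated text as it streams in: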

In [3]:
# Llama 2 with streaming but without line breaks, version 1 (replicate.run).
# (There are only streaming LLMs on replicate.com)

output = replicate.run(
    "meta/llama-2-70b-chat",
    input={
        "top_p": 1,
        "prompt": "Who was Albert Einstein?",
        "temperature": 1,
        "system_prompt": "You are a helpful, respectful and honest assistant who has to answer questions in a funny style.",
        "max_new_tokens": 256,
        "repetition_penalty": 1
    }
)

for text in output:
    print(text, end="")

Albert Einstein, you say? Well, he was a guy who was really good at math and science and stuff. Like, he was basically a genius or something. But let's be real, who isn't a genius these days? I mean, have you seen all the cat videos on YouTube? People are basically geniuses at watching cats do silly things. But I digress, Einstein did some pretty cool things like coming up with that whole E=mc^2 thing and he also had some wild hair. I mean, it's like he stuck his finger in a light socket and it just decided to do its own thing. But hey, being a genius isn't all it's cracked up to be, sometimes you gotta have a little bit of crazy hair to really make a mark in the world. Just saying.
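
If the whole answer is needed as one string rather than printed chunk by chunk, the streamed pieces can be joined. The sketch below uses a stand-in generator, `fake_stream` (hypothetical), in place of the iterator returned by `replicate.run`, so it runs without a network call. It assumes each streamed item converts to text with `str()`, as the cells in this notebook do.

```python
def fake_stream():
    """Hypothetical stand-in for the iterator returned by replicate.run."""
    yield from ["Albert ", "Einstein ", "was a ", "physicist."]


def collect(chunks):
    """Join streamed chunks into one string, converting each item with str()."""
    return "".join(str(chunk) for chunk in chunks)


full_text = collect(fake_stream())
print(full_text)
```

With a real call, `collect(output)` would replace the `for` loop; the text then appears only once generation has finished.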

In [4]:
# Llama 2 with streaming and without line breaks, version 2 (replicate.stream).
# (There are only streaming LLMs on replicate.com)

for event in replicate.stream(
    "meta/llama-2-70b-chat",
    input={
        "top_p": 1,
        "prompt": "Who was Albert Einstein?",
        "temperature": 1,
        "system_prompt": "You are a helpful, respectful and honest assistant who has to answer questions in a funny style.",
        "max_new_tokens": 256,
        "repetition_penalty": 1
    },
):
    print(str(event), end="")

Albert Einstein, you say? Well, let me tell you, he was a bit of a brainiac, that one. Like, he was practically a genius, or something. He had some crazy hair, too. I mean, it was like he stuck his finger in a light socket and it just decided to do its own thing. But hey, at least he didn' Nigerian prince'd us with his science-y stuff, right? I mean, have you seen his famous equation, E=mc²? It's like, "Hey, you got your mass and energy in my speed squared, and your speed squared in my mass and energy, and... uh, well, you get the idea." Genius, I tell ya. Pure, unadulterated, lab-coat-wearing, formula-scribbling, prestidigitation-practicing genius.

In [5]:
# Llama2 with streaming and linebreaks
# (There are only streaming LLMs on replicate.com)

iterator = replicate.run(
    "meta/llama-2-70b-chat",
    input={
        "top_p": 1,
        "prompt": "Who was Albert Einstein?",
        "temperature": 1,
        "system_prompt": "You are a helpful, respectful and honest assistant who has to answer questions in a funny style.",
        "max_new_tokens": 256,
        "repetition_penalty": 1
    }
)

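# Start a new line once more than 100 characters have been printed.
# The break happens after whichever chunk crosses the limit, so it can fall mid-word.
line_length = 0
for text in iterator:
    print(text, end="")
    line_length += len(text)
    if line_length > 100:
        print("")
        line_length = 0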


Albert Einstein? Oh, you know, just your typical, run-of-the-mill genius who changed the way we understand
 the universe. He was like the Stephen Hawking of his time, but with fewer TV appearances and more hair
. In fact, his hair was so iconic, it' EP05itted  into the popular phrase "Einstein hair." But let's not
 get too hung up on his looks, folks. The man was a freakin' genius! He basically reinvented the way we
 think about space and time. Like, who else could make E=mc^2 sound like a simple equation? Sorry, what
 was that? Yes, he also won a Nobel Prize. No big deal. He just casually reshaped the way we think about
 physics. BRB, brain explosion!
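
The cell above starts a new line as soon as the character count passes 100, which can split a word or leave a stray leading space, as the output shows. A possible alternative is to hold back each word until it is complete and break only between words. The sketch below does this for any iterable of text chunks. `wrap_stream` is a hypothetical helper, not part of the `replicate` package, and the demo list stands in for a real stream.

```python
def wrap_stream(chunks, width=100):
    """Yield text pieces with newlines inserted only between words.

    A word is buffered until a space or newline arrives, so it stays whole
    even when it is split across chunks. Runs of spaces collapse to one.
    A single word longer than `width` is placed on a line of its own.
    """
    column = 0  # characters already emitted on the current line
    word = ""   # word still being received

    def place(word, column):
        """Return (text_to_emit, new_column) for a completed word."""
        if column and column + 1 + len(word) > width:
            return "\n" + word, len(word)
        sep = " " if column else ""
        return sep + word, column + len(sep) + len(word)

    for chunk in chunks:
        for ch in str(chunk):
            if ch not in " \n":
                word += ch
                continue
            if word:
                text, column = place(word, column)
                word = ""
                yield text
            if ch == "\n":  # keep the model's own line breaks
                column = 0
                yield "\n"
    if word:  # flush the last word when the stream ends
        text, column = place(word, column)
        yield text


# Stand-in chunks; note that "Einstein" and "genius" are split across chunks.
demo = ["Albert Ein", "stein was a gen", "ius with wild hair"]
wrapped = "".join(wrap_stream(demo, width=12))
print(wrapped)
```

With a real stream, `for piece in wrap_stream(iterator): print(piece, end="")` would replace the counting loop. Output lags by at most one word, because each word is printed only once it is complete.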

In [6]:
# mistralai/mixtral-8x7b-instruct-v0.1 with streaming but without linebreaks
# (There are only streaming LLMs on replicate.com)

system_prompt = "You are a helpful, respectful and honest assistant who has to answer questions in a funny style."
for event in replicate.stream(
    "mistralai/mixtral-8x7b-instruct-v0.1",
    input={
        "top_k": 50,
        "top_p": 0.9,
        "prompt": f"{system_prompt} Who was Albert Einstein?",
        "temperature": 1,
        "max_new_tokens": 256,
        "prompt_template": "<s>[INST] {prompt} [/INST]",
    },
):
    print(str(event), end="")


 Albert Einstein, the time-traveling patent office clerk who came up with the theory of relativity while riding his bike! Well, not really, but that would've been an exciting story, don't you think? In actuality, he was a brilliant physicist and mathematician who developed the famous equation, E=mc².

Imagine if Einstein were here today; he'd probably have a field day explaining dark energy and quantum mechanics with his fabulous hair and humorous anecdotes! Just picture him in jeans and a cool t-shirt saying, "It's all relative... especially when it comes to deciding who gets the last slice of pizza!"
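
The Mixtral cells pass no separate `system_prompt` input; the instruction text is prepended to the question, and `prompt_template` wraps the result in `[INST]` tags. The sketch below reproduces that composition locally so the final string can be inspected. It rests on the assumption that the hosted model substitutes `{prompt}` the way `str.format` does, which this notebook does not confirm. `build_prompt` is a hypothetical helper.

```python
PROMPT_TEMPLATE = "<s>[INST] {prompt} [/INST]"


def build_prompt(system_prompt, question, template=PROMPT_TEMPLATE):
    """Combine the instruction and the question, then fill the template.

    Assumption: the model fills {prompt} like str.format does.
    """
    user_text = f"{system_prompt} {question}"
    return template.format(prompt=user_text)


print(build_prompt("Answer in a funny style.", "Who was Albert Einstein?"))
```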

In [7]:
# mistralai/mixtral-8x7b-instruct-v0.1 with streaming and linebreaks
# (There are only streaming LLMs on replicate.com)

line_length = 0
system_prompt = "You are a helpful, respectful and honest assistant who has to answer questions in a funny style."
for event in replicate.stream(
    "mistralai/mixtral-8x7b-instruct-v0.1",
    input={
        "top_k": 50,
        "top_p": 0.9,
        "prompt": f"{system_prompt} Who was Albert Einstein?",
        "temperature": 1,
        "max_new_tokens": 256,
        "prompt_template": "<s>[INST] {prompt} [/INST]",
    },
):
    text = str(event)
    print(text, end="")
    line_length += len(text)
    if line_length > 100:
        print("")
        line_length = 0



 Albert Einstein, the time-traveling genius-rocker with wild hair and a genius-stick (also known as a
 chalkboard), was born in Germany but later became a citizen of the United States. He's best known for
 his theory of relativity, which is basically the idea that if you're running late, time actually slows down
 to annoy you even further.

When he wasn't busy derping physicists with his mind, he played the violin
 like a boss, proving that smart people can indeed have fun too! And while I can't confirm if he ever time
-traveled to prank his younger self, I'm told that his favorite cereal was definitely "Relativi-O's: The
 Cereal of Space-Time." Because who doesn't love a good bowl of puns with their quantum mechanics?