2 changes: 1 addition & 1 deletion .devcontainer/Dockerfile
@@ -3,7 +3,7 @@ FROM mcr.microsoft.com/vscode/devcontainers/python:0-${VARIANT}

USER vscode

-RUN curl -sSf https://rye-up.com/get | RYE_VERSION="0.15.2" RYE_INSTALL_OPTION="--yes" bash
+RUN curl -sSf https://rye-up.com/get | RYE_VERSION="0.24.0" RYE_INSTALL_OPTION="--yes" bash
ENV PATH=/home/vscode/.rye/shims:$PATH

RUN echo "[[ -d .venv ]] && source .venv/bin/activate" >> /home/vscode/.bashrc
6 changes: 3 additions & 3 deletions .github/workflows/ci.yml
@@ -2,10 +2,10 @@ name: CI
on:
  push:
    branches:
-      - main
+      - stainless
  pull_request:
    branches:
-      - main
+      - stainless

jobs:
  lint:
@@ -21,7 +21,7 @@ jobs:
          curl -sSf https://rye-up.com/get | bash
          echo "$HOME/.rye/shims" >> $GITHUB_PATH
        env:
-          RYE_VERSION: 0.15.2
+          RYE_VERSION: 0.24.0
          RYE_INSTALL_OPTION: "--yes"

      - name: Install dependencies
2 changes: 1 addition & 1 deletion .github/workflows/publish-pypi.yml
@@ -21,7 +21,7 @@ jobs:
          curl -sSf https://rye-up.com/get | bash
          echo "$HOME/.rye/shims" >> $GITHUB_PATH
        env:
-          RYE_VERSION: 0.15.2
+          RYE_VERSION: 0.24.0
          RYE_INSTALL_OPTION: "--yes"

      - name: Publish to PyPI
2 changes: 1 addition & 1 deletion .release-please-manifest.json
@@ -1,3 +1,3 @@
{
".": "0.1.0"
".": "0.2.0"
}
19 changes: 19 additions & 0 deletions CHANGELOG.md
@@ -1,5 +1,24 @@
# Changelog

## 0.2.0 (2024-02-21)

Full Changelog: [v0.1.0...v0.2.0](https://github.com/groq/groq-python/compare/v0.1.0...v0.2.0)

### Features

* Add initial Stainless SDK ([d5a8512](https://github.com/groq/groq-python/commit/d5a851262e04e625dde130367ed91d8f95683599))
* Add initial Stainless SDK ([316de2c](https://github.com/groq/groq-python/commit/316de2ccfeb76e36fe34bb8656ea90a8d42a7d00))
* create default branch ([7e00266](https://github.com/groq/groq-python/commit/7e00266e3c691d92d508e753e2c14c03297c09f9))
* update via SDK Studio ([#10](https://github.com/groq/groq-python/issues/10)) ([0c0d204](https://github.com/groq/groq-python/commit/0c0d20405a96167f060a03a2b8a58a49d9a1c7c8))
* update via SDK Studio ([#3](https://github.com/groq/groq-python/issues/3)) ([8d92c08](https://github.com/groq/groq-python/commit/8d92c086e320c2715e02bc79807ff872e84c0b0f))


### Chores

* go live ([#2](https://github.com/groq/groq-python/issues/2)) ([ba81c42](https://github.com/groq/groq-python/commit/ba81c42d6d0fd6d47819e0d58962235cb70ca4f1))
* go live ([#5](https://github.com/groq/groq-python/issues/5)) ([af9a838](https://github.com/groq/groq-python/commit/af9a838e240bb0f7385bc33fb18ce246427ca2f7))
* update branch ([#8](https://github.com/groq/groq-python/issues/8)) ([b9b55b4](https://github.com/groq/groq-python/commit/b9b55b41cb158efd155f9cda829808c877493afd))

## 0.1.0 (2024-02-10)

Full Changelog: [v0.0.1...v0.1.0](https://github.com/definitive-io/groqcloud-python/compare/v0.0.1...v0.1.0)
4 changes: 2 additions & 2 deletions README.md
@@ -261,9 +261,9 @@ completion = response.parse() # get the object that `chat.completions.create()`
print(completion.id)
```

-These methods return an [`APIResponse`](https://github.com/groq/groq-python/tree/main/src/groq/_response.py) object.
+These methods return an [`APIResponse`](https://github.com/groq/groq-python/tree/stainless/src/groq/_response.py) object.

-The async client returns an [`AsyncAPIResponse`](https://github.com/groq/groq-python/tree/main/src/groq/_response.py) with the same structure, the only difference being `await`able methods for reading the response content.
+The async client returns an [`AsyncAPIResponse`](https://github.com/groq/groq-python/tree/stainless/src/groq/_response.py) with the same structure, the only difference being `await`able methods for reading the response content.

#### `.with_streaming_response`

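The README hunk above refers to the `.with_raw_response` accessor, which returns an `APIResponse`, and to `.with_streaming_response`. Below is a minimal sketch of how these accessors are typically used on a Stainless-generated client; it is an illustration based on the README excerpt, not part of this diff, and the `x-request-id` header lookup and the `iter_lines()` reader are assumptions about the generated client surface.

```python
from groq import Groq

client = Groq()

# `.with_raw_response` wraps the call in an `APIResponse`: the HTTP headers
# stay accessible, and `.parse()` returns the usual completion object.
response = client.chat.completions.with_raw_response.create(
    messages=[{"role": "user", "content": "Explain the importance of low latency LLMs"}],
    model="mixtral-8x7b-32768",
)
print(response.headers.get("x-request-id"))  # header name is illustrative
completion = response.parse()
print(completion.id)

# `.with_streaming_response` defers reading the body until it is consumed,
# which avoids buffering a large response in memory.
with client.chat.completions.with_streaming_response.create(
    messages=[{"role": "user", "content": "Explain the importance of low latency LLMs"}],
    model="mixtral-8x7b-32768",
) as streamed:
    for line in streamed.iter_lines():
        print(line)
```

The async client exposes the same accessors, with `await`able reading methods, as the README lines above note.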
4 changes: 2 additions & 2 deletions bin/check-release-environment
@@ -6,9 +6,9 @@ if [ -z "${PYPI_TOKEN}" ]; then
errors+=("The GROQ_PYPI_TOKEN secret has not been set. Please set it in either this repository's secrets or your organization secrets.")
fi

-len=${#errors[@]}
+lenErrors=${#errors[@]}

-if [[ len -gt 0 ]]; then
+if [[ lenErrors -gt 0 ]]; then
echo -e "Found the following errors in the release environment:\n"

for error in "${errors[@]}"; do
Empty file modified: bin/check-test-server (100644 → 100755)
Empty file modified: bin/test (100644 → 100755)
45 changes: 45 additions & 0 deletions examples/chat_completion.py
@@ -0,0 +1,45 @@
from groq import Groq

client = Groq()

chat_completion = client.chat.completions.create(
    #
    # Required parameters
    #
    messages=[
        # Set an optional system message. This sets the behavior of the
        # assistant and can be used to provide specific instructions for
        # how it should behave throughout the conversation.
        {"role": "system", "content": "you are a helpful assistant."},
        # Set a user message for the assistant to respond to.
        {
            "role": "user",
            "content": "Explain the importance of low latency LLMs",
        },
    ],
    # The language model which will generate the completion.
    model="mixtral-8x7b-32768",
    #
    # Optional parameters
    #
    # Controls randomness: lowering results in less random completions.
    # As the temperature approaches zero, the model will become deterministic
    # and repetitive.
    temperature=0.5,
    # The maximum number of tokens to generate. Requests can use up to
    # 2048 tokens shared between prompt and completion.
    max_tokens=1024,
    # Controls diversity via nucleus sampling: 0.5 means half of all
    # likelihood-weighted options are considered.
    top_p=1,
    # A stop sequence is a predefined or user-specified text string that
    # signals an AI to stop generating content, ensuring its responses
    # remain focused and concise. Examples include punctuation marks and
    # markers like "[end]".
    stop=None,
    # If set, partial message deltas will be sent.
    stream=False,
)

# Print the completion returned by the LLM.
print(chat_completion.choices[0].message.content)
52 changes: 52 additions & 0 deletions examples/chat_completion_async.py
@@ -0,0 +1,52 @@
import asyncio

from groq import AsyncGroq


async def main():
    client = AsyncGroq()

    chat_completion = await client.chat.completions.create(
        #
        # Required parameters
        #
        messages=[
            # Set an optional system message. This sets the behavior of the
            # assistant and can be used to provide specific instructions for
            # how it should behave throughout the conversation.
            {"role": "system", "content": "you are a helpful assistant."},
            # Set a user message for the assistant to respond to.
            {
                "role": "user",
                "content": "Explain the importance of low latency LLMs",
            },
        ],
        # The language model which will generate the completion.
        model="mixtral-8x7b-32768",
        #
        # Optional parameters
        #
        # Controls randomness: lowering results in less random completions.
        # As the temperature approaches zero, the model will become
        # deterministic and repetitive.
        temperature=0.5,
        # The maximum number of tokens to generate. Requests can use up to
        # 2048 tokens shared between prompt and completion.
        max_tokens=1024,
        # Controls diversity via nucleus sampling: 0.5 means half of all
        # likelihood-weighted options are considered.
        top_p=1,
        # A stop sequence is a predefined or user-specified text string that
        # signals an AI to stop generating content, ensuring its responses
        # remain focused and concise. Examples include punctuation marks and
        # markers like "[end]".
        stop=None,
        # If set, partial message deltas will be sent.
        stream=False,
    )

    # Print the completion returned by the LLM.
    print(chat_completion.choices[0].message.content)


asyncio.run(main())
51 changes: 51 additions & 0 deletions examples/chat_completion_async_streaming.py
@@ -0,0 +1,51 @@
import asyncio

from groq import AsyncGroq


async def main():
    client = AsyncGroq()

    stream = await client.chat.completions.create(
        #
        # Required parameters
        #
        messages=[
            # Set an optional system message. This sets the behavior of the
            # assistant and can be used to provide specific instructions for
            # how it should behave throughout the conversation.
            {"role": "system", "content": "you are a helpful assistant."},
            # Set a user message for the assistant to respond to.
            {
                "role": "user",
                "content": "Explain the importance of low latency LLMs",
            },
        ],
        # The language model which will generate the completion.
        model="mixtral-8x7b-32768",
        #
        # Optional parameters
        #
        # Controls randomness: lowering results in less random completions.
        # As the temperature approaches zero, the model will become
        # deterministic and repetitive.
        temperature=0.5,
        # The maximum number of tokens to generate. Requests can use up to
        # 2048 tokens shared between prompt and completion.
        max_tokens=1024,
        # A stop sequence is a predefined or user-specified text string that
        # signals an AI to stop generating content, ensuring its responses
        # remain focused and concise. Examples include punctuation marks and
        # markers like "[end]".
        stop=None,
        # If set, partial message deltas will be sent rather than a single
        # complete message once the response has finished.
        stream=True,
    )

    # Print the incremental deltas returned by the LLM.
    async for chunk in stream:
        print(chunk.choices[0].delta.content, end="")


asyncio.run(main())
48 changes: 48 additions & 0 deletions examples/chat_completion_stop.py
@@ -0,0 +1,48 @@
from groq import Groq

client = Groq()

chat_completion = client.chat.completions.create(
    #
    # Required parameters
    #
    messages=[
        # Set an optional system message. This sets the behavior of the
        # assistant and can be used to provide specific instructions for
        # how it should behave throughout the conversation.
        {"role": "system", "content": "you are a helpful assistant."},
        # Set a user message for the assistant to respond to.
        {
            "role": "user",
            "content": 'Count to 10. Your response must begin with "1, ". example: 1, 2, 3, ...',
        },
    ],
    # The language model which will generate the completion.
    model="mixtral-8x7b-32768",
    #
    # Optional parameters
    #
    # Controls randomness: lowering results in less random completions.
    # As the temperature approaches zero, the model will become deterministic
    # and repetitive.
    temperature=0.5,
    # The maximum number of tokens to generate. Requests can use up to
    # 2048 tokens shared between prompt and completion.
    max_tokens=1024,
    # Controls diversity via nucleus sampling: 0.5 means half of all
    # likelihood-weighted options are considered.
    top_p=1,
    # A stop sequence is a predefined or user-specified text string that
    # signals an AI to stop generating content, ensuring its responses
    # remain focused and concise. Examples include punctuation marks and
    # markers like "[end]".
    # For this example, we will use ", 6" so that the LLM stops counting at 5.
    # If multiple stop values are needed, an array of strings may be passed,
    # e.g. stop=[", 6", ", six", ", Six"].
    stop=", 6",
    # If set, partial message deltas will be sent.
    stream=False,
)

# Print the completion returned by the LLM.
print(chat_completion.choices[0].message.content)
46 changes: 46 additions & 0 deletions examples/chat_completion_streaming.py
@@ -0,0 +1,46 @@
from groq import Groq

client = Groq()

stream = client.chat.completions.create(
    #
    # Required parameters
    #
    messages=[
        # Set an optional system message. This sets the behavior of the
        # assistant and can be used to provide specific instructions for
        # how it should behave throughout the conversation.
        {"role": "system", "content": "you are a helpful assistant."},
        # Set a user message for the assistant to respond to.
        {
            "role": "user",
            "content": "Explain the importance of low latency LLMs",
        },
    ],
    # The language model which will generate the completion.
    model="mixtral-8x7b-32768",
    #
    # Optional parameters
    #
    # Controls randomness: lowering results in less random completions.
    # As the temperature approaches zero, the model will become deterministic
    # and repetitive.
    temperature=0.5,
    # The maximum number of tokens to generate. Requests can use up to
    # 2048 tokens shared between prompt and completion.
    max_tokens=1024,
    # Controls diversity via nucleus sampling: 0.5 means half of all
    # likelihood-weighted options are considered.
    top_p=1,
    # A stop sequence is a predefined or user-specified text string that
    # signals an AI to stop generating content, ensuring its responses
    # remain focused and concise. Examples include punctuation marks and
    # markers like "[end]".
    stop=None,
    # If set, partial message deltas will be sent.
    stream=True,
)

# Print the incremental deltas returned by the LLM.
for chunk in stream:
    print(chunk.choices[0].delta.content, end="")
2 changes: 1 addition & 1 deletion pyproject.toml
@@ -1,6 +1,6 @@
[project]
name = "groq"
version = "0.1.0"
version = "0.2.0"
description = "The official Python library for the groq API"
readme = "README.md"
license = "Apache-2.0"