Add Azure, Cohere, Anthropic, Palm Support - using liteLLM #235
Conversation
|
Hey @ShreyaR @thekaranacharya @nefertitirogers any updates on this PR? Happy to help if there are any issues with this! cc: @ishaan-jaff |
|
I will add that this is related to the Anthropic coverage feature request, and the PR may need to be refactored to account for the new way llm_providers works. The PR delegates most of the work to another LLM generalization library, which is arguably better separation of concerns than implementing a whole suite of wrappers here. However, the entire llm_providers structure has changed, and this PR needs to be reworked against the new, more generic structure. |
|
Hey there, I'm Pablo from Tryolabs; my colleague Paz Cuturi has been working with some of the teams at Guardrails. I'm interested in implementing a use case combining LiteLLM + Guardrails to be able to switch LLMs while also providing guidance to the model. Is there anything I can do to help with this? What refactor would be needed? |
This is a straightforward change.
Can a repo admin please confirm they're ok with adding a dependency on a new library?
|
@ghbacct you can use our OpenAI proxy server to connect your model to guardrails (https://docs.litellm.ai/docs/simple_proxy). No need to wait for this PR to get merged. |
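A minimal sketch of that proxy route (not from this PR; the URL, port, and model name are assumptions that depend on how you start the proxy):

# Sketch only: point any OpenAI-compatible client at a locally running LiteLLM proxy.
# Assumes the proxy was started with something like `litellm --model gpt-3.5-turbo`
# and is listening on port 8000; adjust base_url to your setup.
from openai import OpenAI

client = OpenAI(
    base_url="http://0.0.0.0:8000",  # the LiteLLM proxy endpoint (assumption)
    api_key="anything",              # real provider keys are configured on the proxy side
)

resp = client.chat.completions.create(
    model="gpt-3.5-turbo",  # routed by the proxy to whichever model it was started with
    messages=[{"role": "user", "content": "Hello through the proxy"}],
)
print(resp.choices[0].message.content)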
|
Hi! I saw the liteLLM interfaces and they look great - we could definitely add LiteLLM support as its own code path (we'd change the switch here). We'd have litellm.completion as the llm_api argument. Adding it as its own code path lets us support all of the APIs that lite supports, while maintaining backwards compatibility and giving the library time to harden with liteLLM usage (find out what works, what doesn't, and provide bake time).
|
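A rough sketch of the call shape described in the comment above, i.e. passing litellm.completion as the llm_api argument (the model name, validator, key, and prompt are illustrative, not from this PR):

import os
import litellm
import guardrails as gd
from guardrails.validators import ToxicLanguage

os.environ["ANTHROPIC_API_KEY"] = "sk-ant-..."  # placeholder key for the example model

guard = gd.Guard.from_string(
    validators=[ToxicLanguage(validation_method="sentence", on_fail="fix")],
    description="Ensure no toxic language",
)

# litellm.completion is handed to the guard as the LLM callable; any model string
# that litellm understands (OpenAI, Azure, Anthropic, Cohere, ...) can be used.
response = guard(
    litellm.completion,
    model="claude-2",
    prompt="Write a short, friendly greeting.",
    max_tokens=128,
)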
|
Hello @ishaan-jaff @krrishdholakia, give it a spin today with the latest version: |
|
@thekaranacharya It seems it's not in the main branch; we can access the example from here. |
|
@Harryalways317 Hi, the link to the litellm example might have changed; can you provide it again? |
|
Here's a way to do it with instructor; could a similar approach be used? |
|
Yep, similar to instructor. If you need a detailed example, @w8jie you can refer to this gist https://gist.github.com/Harryalways317/3257bea378e960914026e0df2e2c4354 or the code below. I was on mobile so I didn't check the variable definitions; please check things like the base URL and model.
import json
import os
import time
from langfuse.callback import CallbackHandler
from langchain.callbacks import AsyncIteratorCallbackHandler
from langchain.callbacks.streaming_aiter_final_only import (
AsyncFinalIteratorCallbackHandler,
)
from fastapi import HTTPException
from loguru import logger
import litellm
import httpx
from typing import Any, Dict
import guardrails as gd
from guardrails.validators import ToxicLanguage, IsProfanityFree, PIIFilter, DetectSecrets
from starlette.responses import StreamingResponse
# Send litellm success/failure events to Langfuse (and failures to Sentry).
# litellm.input_callback = ["sentry"]
litellm.success_callback = ["langfuse"]
litellm.failure_callback = ["langfuse", "sentry"]
litellm.set_verbose = True
os.environ["OPENAI_API_KEY"] = "anything"
langfuse_handler = CallbackHandler(
    public_key=os.environ.get("LANGFUSE_PUBLIC_KEY"),
    secret_key=os.environ.get("LANGFUSE_SECRET_KEY"),
)

# Output guard: fix toxic language, profanity, and selected PII in responses.
# DetectSecrets(validation_method="sentence", on_fail="fix") could be added as well.
guard = gd.Guard.from_string(
    validators=[
        ToxicLanguage(validation_method="sentence", on_fail="fix"),
        IsProfanityFree(validation_method="sentence", on_fail="fix"),
        PIIFilter(
            pii_entities=["PERSON", "PHONE_NUMBER"],
            validation_method="sentence",
            on_fail="fix",
        ),
    ],
    description="Ensure no toxic language",
)
def completion(prompt,
               base_url="http://<your-openai-compatible-endpoint>/v1",
               model="HuggingFaceH4/zephyr-7b-beta",
               max_tokens=12000,
               temperature=0.66):
    # base_url/model/max_tokens/temperature were undefined in the original snippet;
    # the defaults here are placeholders, so set them to match your deployment.
    # messages is built here but the guard call below passes the raw prompt instead.
    messages = [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": f"{prompt}"},
    ]
    try:
        response = guard(
            litellm.completion,
            api_base=base_url,
            custom_llm_provider="openai",
            model=model,
            metadata={
                "generation_name": "instance1",  # langfuse generation name
            },
            prompt=prompt,
            # prompt_params={'query': messages},
            max_tokens=max_tokens,
            temperature=temperature,
        )
        logger.info(f"Prompt is {prompt} and response is {response}")
        return response.model_dump_json()
    except Exception as e:
        logger.info(f"Prompt is {prompt}")
        logger.error(f"An error occurred while requesting: {e}")
        return {
            "error": "An error occurred while processing your request.",
            "code": 500,
            "response": str(e),
        }
def generate_streaming_completion_llm(prompt: str, model: str = "HuggingFaceH4/zephyr-7b-beta",
                                      max_tokens: int = 12000, temperature: float = 0.66) -> StreamingResponse:
    """
    Generates a streaming completion from the LiteLLM API and returns a FastAPI StreamingResponse.
    Note: this streaming path calls litellm directly and does not run the guard.
    """
    messages = [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": f"{prompt}"},
    ]
    # Placeholder: replace with the base URL of your self-hosted, OpenAI-compatible endpoint.
    base_url = "http://<self-hosted-ip-or-base-url>/v1"
    logger.info("Streaming prompt")
    print("base_url", base_url)
    os.environ["OPENAI_API_KEY"] = "anything"
    os.environ["OPENAI_API_BASE"] = base_url

    async def stream_generator():
        try:
            response = await litellm.acompletion(
                api_base=base_url,
                # llm_api=litellm.completion,
                custom_llm_provider="openai",
                model=model,
                metadata={
                    "generation_name": "instance1",
                },
                # prompt=prompt,
                messages=messages,
                max_tokens=max_tokens,
                temperature=temperature,
                stream=True,
                # callbacks=(
                #     [AsyncFinalIteratorCallbackHandler(), langfuse_handler]
                # ),
            )
            print(f"response: {response}")
            async for chunk in response:
                data = chunk.model_dump()
                if "choices" in data and data["choices"]:
                    for choice in data["choices"]:
                        if "delta" in choice and choice["delta"]:
                            if "content" in choice["delta"]:
                                yield choice["delta"]["content"]
        except Exception as e:
            yield json.dumps({"error": "An error occurred while processing your request.", "details": str(e)})
            return

    return StreamingResponse(content=stream_generator(), media_type="text/event-stream") |
|
@w8jie @krrishdholakia @Harryalways317 we're putting up a new docs PR with a tighter LiteLLM integration at #635. We'll push up changes that you can test out! |
|
@w8jie @krrishdholakia @Harryalways317 updated docs on using Guardrails with any custom LLM: https://www.guardrailsai.com/docs/how_to_guides/llm_api_wrappers. We've added a tighter integration with LiteLLM; take it for a spin and let us know if there's feedback. Also adding here, in case there's interest: we're heavily consolidating our LLM calling APIs right now. There's an open discussion that talks about this; feel free to leave comments there if you have feedback. |
Hi @ShreyaR, I'm the maintainer of https://github.com/BerriAI/litellm
This PR addresses #217.
Would love to help increase model coverage for guardrails; let me know if you have feedback on this.
Here's a sample of how it's used:
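For reference, a minimal sketch of LiteLLM's unified completion interface that the PR builds on (the model names and keys below are placeholders; each provider's key goes in its usual environment variable):

import os
import litellm

# Placeholder keys; set only the ones for the providers you actually call.
os.environ["OPENAI_API_KEY"] = "sk-..."
os.environ["ANTHROPIC_API_KEY"] = "sk-ant-..."
os.environ["COHERE_API_KEY"] = "..."

messages = [{"role": "user", "content": "Say hello in one sentence."}]

# One call signature across providers; only the model string changes.
# Azure (e.g. "azure/<deployment-name>") and PaLM (e.g. "palm/chat-bison") follow the same pattern.
for model in ["gpt-3.5-turbo", "claude-2", "command-nightly"]:
    response = litellm.completion(model=model, messages=messages)
    print(model, "->", response.choices[0].message.content)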