Conversation

@ishaan-jaff

Hi @ShreyaR, I'm the maintainer of https://github.com/BerriAI/litellm.
This PR addresses #217

Would love to help increase model coverage for guardrails - let me know if you have feedback on this

Here's a sample of how it's used:

import os
from litellm import completion

## set ENV variables
# ENV variables can be set in .env file, too. Example in .env.example
os.environ["OPENAI_API_KEY"] = "openai key"
os.environ["COHERE_API_KEY"] = "cohere key"

messages = [{ "content": "Hello, how are you?","role": "user"}]

# openai call
response = completion(model="gpt-3.5-turbo", messages=messages)

# azure call
response = completion(model="gpt-3.5-deployment", messages=messages, azure=True)

# cohere call
response = completion("command-nightly", messages)

# anthropic call
response = completion(model="claude-instant-1", messages=messages)

@krrishdholakia

Hey @ShreyaR @thekaranacharya @nefertitirogers, any updates on this PR? Happy to help if there are any issues with this!

cc: @ishaan-jaff

@whatnick

whatnick commented Oct 9, 2023

I will add that this is related to the Anthropic coverage feature request, and the PR may need to be refactored to account for the new way in which llm_providers works. The PR delegates most of the functions to another LLM generalization library, which is perhaps better separation of concerns than implementing a whole suite of wrappers here.

However, the entire llm_providers structure has changed, and this PR needs to be reworked to provide a generic one.

@Ludecan

Ludecan commented Jan 4, 2024

Hey there, I'm Pablo from Tryolabs; my colleague Paz Cuturi has been working with some of the teams at Guardrails.

I'm actually interested in implementing a use case combining LiteLLM+Guardrails to be able to switch LLMs while also providing guidance to the model.

Anything I can do to help with this? What would be the refactor needed?

@ghbacct left a comment

This is a straightforward change.
Can a repo admin please confirm they're ok with adding a dependency on a new library?

@ishaan-jaff
Author

@ghbacct you can use our OpenAI proxy server to connect your model to guardrails: https://docs.litellm.ai/docs/simple_proxy

No need to wait for this PR to get merged
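
For reference, a rough sketch of that setup (assumptions: the proxy was started locally, e.g. with litellm --model gpt-3.5-turbo, it listens on http://0.0.0.0:8000, and the pre-1.0 openai Python SDK is in use; check the linked docs for the exact commands and port):

import openai

# Point the (pre-1.0) OpenAI Python client at the LiteLLM proxy instead of api.openai.com.
openai.api_key = "anything"  # the proxy does not need a real key; provider keys live on the proxy side
openai.api_base = "http://0.0.0.0:8000"  # assumed host/port, adjust to your proxy

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",  # the proxy maps this onto whatever model it was started with
    messages=[{"role": "user", "content": "Hello, how are you?"}],
)
print(response)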

@smohiuddin
Collaborator

Hi! I saw the LiteLLM interfaces and they look great - we could definitely add LiteLLM support as its own code path (we'd change the switch here). We'd have litellm.completion as the llm_api argument (a rough sketch follows the TODO list below). Adding it as its own code path lets us support all of the APIs that LiteLLM supports, while maintaining backwards compatibility and giving the library time to harden with LiteLLM usage (find out what works, what doesn't, and provide bake time).
Definitely open to assessing whether LiteLLM can become the primary LLM router for Guardrails, but I'd like to run this as a POC first.
TODOs for this PR based on this discussion:

  1. change the LLM provider switch
  2. update the Azure/Cohere/Anthropic docs to use a LiteLLM provider
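
Roughly, that code path could look like the following. This is only a sketch, not a finalized API: the guard setup is a placeholder, and the assumption is that extra keyword arguments are passed straight through to litellm.completion.

import os

import litellm
import guardrails as gd

os.environ["OPENAI_API_KEY"] = "openai key"

# Placeholder guard; real usage would attach validators.
guard = gd.Guard.from_string(validators=[], description="example guard")

# litellm.completion is handed in as the llm_api argument, so the same
# Guardrails call can sit in front of any provider LiteLLM supports.
response = guard(
    litellm.completion,
    model="gpt-3.5-turbo",
    prompt="Say hello politely.",
    max_tokens=64,
    temperature=0.0,
)
print(response)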

@thekaranacharya
Contributor

thekaranacharya commented Mar 5, 2024

Hello @ishaan-jaff @krrishdholakia,
We just added LiteLLM support as an additional path, so users can use any provider supported by LiteLLM! Check the example notebook here.

Give it a spin today with the latest version: v0.4.1. Closing this PR.
cc @smohiuddin
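
As a quick illustration of the additional path (a sketch only, not taken from the notebook; it assumes the matching provider key is set and that keyword arguments pass through to litellm.completion), switching providers comes down to changing the model string that LiteLLM routes on:

import os

import litellm
import guardrails as gd

os.environ["ANTHROPIC_API_KEY"] = "anthropic key"

guard = gd.Guard.from_string(validators=[], description="example guard")

# Only the model name changes; LiteLLM routes "claude-instant-1" to Anthropic.
response = guard(
    litellm.completion,
    model="claude-instant-1",
    prompt="Hello, how are you?",
    max_tokens=64,
)
print(response)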

@Harryalways317

Harryalways317 commented Mar 19, 2024

@thekaranacharya it seems it's not in the main branch.

We can access the example from here.

@wesngoh

wesngoh commented Apr 4, 2024

@Harryalways317 Hi, the link to the LiteLLM example might have changed; can you provide it again?

@krrishdholakia

Here's a way to do it with instructor; could a similar approach be used?
https://docs.litellm.ai/docs/tutorials/instructor

@Harryalways317

Harryalways317 commented Apr 4, 2024

Yep, similar to instructor, but if you need a detailed example, use this gist.

@w8jie you can refer to this gist: https://gist.github.com/Harryalways317/3257bea378e960914026e0df2e2c4354

Or the code shared below. I was on mobile, so I didn't check the variable definitions; please check them (e.g., base URL and model).

import json
import os
import time

from langfuse.callback import CallbackHandler
from langchain.callbacks import AsyncIteratorCallbackHandler
from langchain.callbacks.streaming_aiter_final_only import (
    AsyncFinalIteratorCallbackHandler,
)
from fastapi import HTTPException
from loguru import logger
import litellm
import httpx
from typing import Any, Dict

import guardrails as gd
from guardrails.validators import ToxicLanguage, IsProfanityFree, PIIFilter, DetectSecrets
from starlette.responses import StreamingResponse

# litellm.input_callback = ["sentry"]
litellm.success_callback = ["langfuse"]
litellm.failure_callback = ["langfuse", "sentry"]

litellm.set_verbose = True
os.environ["OPENAI_API_KEY"] = "anything"  # dummy key for the self-hosted OpenAI-compatible endpoint

# Placeholders for values the snippet uses but never defines; set them for your deployment.
base_url = "http://<self-hosted-ip-or-base-url>/v1"
model = "HuggingFaceH4/zephyr-7b-beta"
max_tokens = 12000
temperature = 0.66

langfuse_handler = CallbackHandler(
    public_key=os.environ.get("LANGFUSE_PUBLIC_KEY"),
    secret_key=os.environ.get("LANGFUSE_SECRET_KEY"),
)

guard = gd.Guard.from_string(
    # DetectSecrets(validation_method="sentence", on_fail="fix")
    validators=[
        ToxicLanguage(validation_method="sentence", on_fail="fix"),
        IsProfanityFree(validation_method="sentence", on_fail="fix"),
        PIIFilter(pii_entities=["PERSON", "PHONE_NUMBER"], validation_method="sentence", on_fail="fix"),
    ],
    description="Ensure no toxic language",
)


def completion(prompt):
    # Note: guard() below is called with prompt= directly, so messages is unused here.
    messages = [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": f"{prompt}"},
    ]

    try:
        response = guard(
            litellm.completion,
            api_base=base_url,
            custom_llm_provider="openai",
            model=model,
            metadata={
                "generation_name": f"instance1",  # langfuse generation name
            },
            prompt=prompt,
            # prompt_params={'query':messages},
            max_tokens=max_tokens,
            temperature=temperature,
        )
        logger.info(f'Prompt is {prompt} and response is {response}')
        return response.model_dump_json()
    except Exception as e:
        logger.info(f'Prompt is {prompt}')
        logger.error(f"An error occurred while requesting: {e}")
        response = str(e)
        return {"error": "An error occurred while processing your request.",'code':500,'response':response}
       
def generate_streaming_completion_llm(prompt: str,model: str = "HuggingFaceH4/zephyr-7b-beta", max_tokens: int = 12000,
                              temperature: float = 0.66) -> StreamingResponse:
    """
    Generates a streaming completion from the LiteLLM API and returns a FastAPI StreamingResponse.
    """

    messages = [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": f"{prompt}"}
    ]
    base_url = 'http://<self-hosted-ip-or-base-url>/v1'  # replace with your self-hosted endpoint
    model = 'HuggingFaceH4/zephyr-7b-beta'  # note: this overrides the model argument above

    logger.info('Streaming prompt')
    print('base_url', base_url)
    os.environ["OPENAI_API_KEY"] = "anything"
    os.environ['OPENAI_API_BASE'] = base_url

    async def stream_generator():
        try:
            response = await litellm.acompletion(
                api_base=base_url,
                # llm_api=litellm.completion,
                custom_llm_provider="openai",
                model=model,
                metadata={
                    "generation_name": f"instance1",
                },
                # prompt=prompt,
                messages=messages,
                max_tokens=max_tokens,
                temperature=temperature,
                stream=True,
                # callbacks=(
                #     [AsyncFinalIteratorCallbackHandler(), langfuse_handler]
                # ),
            )
            print(f"response: {response}")
            async for chunk in response:
                data = chunk.model_dump()
                if 'choices' in data and data['choices']:
                    for choice in data['choices']:
                        if 'delta' in choice and choice['delta']:
                            if 'content' in choice['delta']:
                                yield choice['delta']['content']
        except Exception as e:
            yield json.dumps({"error": "An error occurred while processing your request.", "details": str(e)})
        return
    return StreamingResponse(content=stream_generator(), media_type="text/event-stream")

@ShreyaR
Collaborator

ShreyaR commented Apr 4, 2024

@w8jie @krrishdholakia @Harryalways317 we're putting up a new docs PR with a tighter LiteLLM integration at #635. Will push up changes that you can test out!

@ShreyaR
Collaborator

ShreyaR commented Apr 5, 2024

@w8jie @krrishdholakia @Harryalways317 updated docs on using Guardrails with any custom LLM: https://www.guardrailsai.com/docs/how_to_guides/llm_api_wrappers

We've added a tighter integration with LiteLLM; take it for a spin and let us know if there's feedback.

Also adding here: in case there's interest, we're heavily consolidating our LLM calling APIs right now. There's an open discussion that talks about this; feel free to leave comments there if you have feedback.
