
AzureMLOnlineEndpoint not working, 404 error, but same url and api key works with standard http #9987

Closed
2 of 14 tasks
levalencia opened this issue Aug 30, 2023 · 5 comments
Labels
🤖:bug Related to a bug, vulnerability, unexpected error with an existing feature Ɑ: models Related to LLMs or chat model modules

Comments

@levalencia
Contributor

System Info

0.0.276

Who can help?

@hwchase17
@agola11

Information

  • The official example notebooks/scripts
  • My own modified scripts

Related Components

  • LLMs/Chat Models
  • Embedding Models
  • Prompts / Prompt Templates / Prompt Selectors
  • Output Parsers
  • Document Loaders
  • Vector Stores / Retrievers
  • Memory
  • Agents / Agent Executors
  • Tools / Toolkits
  • Chains
  • Callbacks/Tracing
  • Async

Reproduction

I have deployed Llama 2 to an Azure ML online endpoint. When I call it through LangChain I get a 404 error, but the same endpoint works with the standard Python urllib request library.

Below: askdocuments2 ---> no LangChain
askdocuments ---> LangChain

Same endpoint URL, same key.

import json
import urllib.error
import urllib.request

import requests
from langchain.chains import LLMChain
from langchain.llms.azureml_endpoint import AzureMLOnlineEndpoint, LlamaContentFormatter
from langchain.prompts import PromptTemplate


def askdocuments2(question):
    # Request data goes here
    # The example below assumes JSON formatting which may be updated
    # depending on the format your endpoint expects.
    # More information can be found here:
    # https://docs.microsoft.com/azure/machine-learning/how-to-deploy-advanced-entry-script
    formatter = LlamaContentFormatter()
    data = formatter.format_request_payload(prompt=question, model_kwargs={"temperature": 0.1, "max_tokens": 300})
    body = data

    url = 'https://llama-2-7b-test.westeurope.inference.ml.azure.com/score'
    # Replace this with the primary/secondary key or AMLToken for the endpoint
    api_key = ''
    if not api_key:
        raise Exception("A key should be provided to invoke the endpoint")

    # The azureml-model-deployment header will force the request to go to a specific deployment.
    # Remove this header to have the request observe the endpoint traffic rules
    headers = {'Content-Type': 'application/json', 'Authorization': ('Bearer ' + api_key), 'azureml-model-deployment': 'llama'}

    req = urllib.request.Request(url, body, headers)

    try:
        response = urllib.request.urlopen(req)

        result = response.read()
        decoded_data = json.loads(result.decode('utf-8'))
        text = decoded_data[0]["0"]
        return text
    except urllib.error.HTTPError as error:
        print("The request failed with status code: " + str(error.code))

        # Print the headers - they include the request ID and the timestamp, which are useful for debugging the failure
        print(error.info())
        print(error.read().decode("utf8", 'ignore'))


def askdocuments(question):
    try:
        content_formatter = LlamaContentFormatter()
        llm = AzureMLOnlineEndpoint(
            endpoint_api_key="",
            deployment_name="llama-2-7b-test",
            endpoint_url="https://llama-2-7b-test.westeurope.inference.ml.azure.com/score",
            model_kwargs={"temperature": 0.8, "max_tokens": 300},
            content_formatter=content_formatter
        )

        formatter_template = "Write a {word_count} word essay about {topic}."

        prompt = PromptTemplate(
            input_variables=["word_count", "topic"], template=formatter_template
        )

        chain = LLMChain(llm=llm, prompt=prompt)
        response = chain.run({"word_count": 100, "topic": "how to make friends"})
        return response
    except requests.exceptions.RequestException as e:
        # Handle any requests-related errors (e.g., network issues, invalid URL)
        raise ValueError(f"Error with the API request: {e}")

    except json.JSONDecodeError as e:
        # Handle any JSON decoding errors (e.g., invalid JSON format)
        raise ValueError(f"Error decoding API response as JSON: {e}")
    except Exception as e:
        # Handle any other errors
        raise ValueError(f"Error: {e}")

Expected behavior

According to the documentation I am doing everything correctly, so I am not sure why it returns a 404 error for a valid URL.

@dosubot dosubot bot added Ɑ: models Related to LLMs or chat model modules 🤖:bug Related to a bug, vulnerability, unexpected error with an existing feature labels Aug 30, 2023

dosubot bot commented Aug 30, 2023

🤖

Hello,

The 404 error you're experiencing when using the LangChain code with AzureMLOnlineEndpoint could be due to the way the request is being sent. In the standard Python request library, you might be sending a GET request by default, but in the LangChain code, a POST request is being sent.

In the AzureMLEndpointClient class, the call method uses urllib.request.Request to create a request object and urllib.request.urlopen to send the request. By default, urllib.request.Request sends a POST request if the data parameter is not None. In this case, body is being passed as the data parameter, so a POST request is being sent.

If the endpoint you're trying to reach does not support POST requests, this could be why you're receiving a 404 error. You might need to modify the call method to send a GET request instead, or ensure that your endpoint supports POST requests.
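
As a quick check of the default-method behavior described above, here is a minimal standard-library sketch (the URL is a placeholder):

import urllib.request

# urllib.request.Request defaults to GET when data is None and
# switches to POST as soon as a bytes body is supplied.
get_req = urllib.request.Request("https://example.invalid/score")
post_req = urllib.request.Request("https://example.invalid/score", data=b"{}")

print(get_req.get_method())   # GET
print(post_req.get_method())  # POST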

In the context shared, it's also important to note that the AzureMLOnlineEndpoint class uses a content_formatter parameter, which is a handler class for transforming the request and response of an AzureML endpoint to match with required schema. If the content_formatter is not correctly set up or if the request payload is not correctly formatted, this could also lead to a 404 error.

The LlamaContentFormatter class is used to format the request payload. It converts the messages into a specific format and encodes them into bytes. If the messages are not correctly converted into this format, this could lead to a 404 error. Make sure that the content_formatter is correctly set up, the messages are correctly converted into the required format, and that the request payload is correctly formatted.
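
For reference, the request body LlamaContentFormatter is expected to produce looks roughly like the sketch below. This is inferred from the mocked payload later in this thread and is an assumption rather than the canonical schema, which may differ between langchain versions:

import json

prompt = "Write a 100 word essay about how to make friends."
model_kwargs = {"temperature": 0.8, "max_tokens": 300}

# Approximate shape of the formatted payload (an assumption, not the
# documented schema): the prompt goes into inputs.input_string and the
# model kwargs into parameters, encoded as UTF-8 bytes.
body = json.dumps(
    {"inputs": {"input_string": [prompt]}, "parameters": model_kwargs}
).encode("utf-8")
print(body)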

I hope this helps! If you have any other questions, feel free to ask.

Best,
Dosu

@levalencia
Contributor Author

@dosu-bot the test only tests AzureMLChatOnlineEndpoint, not AzureMLOnlineEndpoint.


dosubot bot commented Aug 30, 2023

🤖

Hello,

To test AzureMLOnlineEndpoint in LangChain effectively, you can create a unit test that mocks the AzureMLEndpointClient's call method and the content formatter's methods. Here's an example of how you can do this using the unittest and unittest.mock libraries:

import unittest
from unittest.mock import MagicMock, patch
from langchain.llms.azureml_endpoint import AzureMLOnlineEndpoint, AzureMLEndpointClient, LlamaContentFormatter

class TestAzureMLEndpoint(unittest.TestCase):
    @patch.object(AzureMLEndpointClient, 'call')
    @patch.object(LlamaContentFormatter, 'format_request_payload')
    @patch.object(LlamaContentFormatter, 'format_response_payload')
    def test_call(self, mock_format_response_payload, mock_format_request_payload, mock_call):
        # Arrange
        mock_format_request_payload.return_value = b'{"inputs": {"input_string": ["prompt"]}, "parameters": {}}'
        mock_call.return_value = b'{"0": "generated text"}'
        mock_format_response_payload.return_value = "generated text"
        azure_ml_endpoint = AzureMLOnlineEndpoint(
            endpoint_url="https://test-endpoint.com/score",
            endpoint_api_key="test-api-key",
            content_formatter=LlamaContentFormatter(),
        )

        # Act
        result = azure_ml_endpoint._call("prompt")

        # Assert
        mock_format_request_payload.assert_called_once_with("prompt", {})
        mock_call.assert_called_once_with(b'{"inputs": {"input_string": ["prompt"]}, "parameters": {}}')
        mock_format_response_payload.assert_called_once_with(b'{"0": "generated text"}')
        self.assertEqual(result, "generated text")

if __name__ == '__main__':
    unittest.main()

In this test, we create an instance of AzureMLOnlineEndpoint and call its _call method with a prompt. We mock the AzureMLEndpointClient's call method and the LlamaContentFormatter's format_request_payload and format_response_payload methods to control their behavior and make assertions about how they are called, and we assert that the result of _call is as expected. This is in line with the context shared, where LlamaContentFormatter is used instead of GPT2ContentFormatter.

I hope this helps! Let me know if you have any other questions.


PJPRoche commented Sep 11, 2023

I have also experienced this issue. However, I don't think the problem is with the GET/POST request or with the content formatter.

The issue seems to be that the AzureMLChatOnlineEndpoint class is not instantiated with any reference to the deployment name of the Azure ML endpoint; see

class AzureMLChatOnlineEndpoint(SimpleChatModel):

The deployment_name is an (optional) parameter for AzureMLEndpointClient, which AzureMLChatOnlineEndpoint calls. The LangChain docs say deployment_name is "Not required", but I believe it is required when using AzureMLChatOnlineEndpoint. As @levalencia mentioned above (and showed in his askdocuments2 example), the call to the hosted Azure ML model only works when the deployment name is included in the headers, in his case 'azureml-model-deployment': 'llama'; i.e. the deployment name is required.

To fix this, I think adding a deployment_name parameter to AzureMLChatOnlineEndpoint would solve it, just as AzureMLEndpointClient already has one. The new deployment_name parameter passed to AzureMLChatOnlineEndpoint can then be forwarded to AzureMLEndpointClient when it is instantiated...

https://github.com/langchain-ai/langchain/blob/65e1606daa696e2190fcb410f190c6811f9f8dc3/libs/langchain/langchain/chat_models/azureml_endpoint.py#L108C23-L108C44

I have tested this in a debugger by adding a deployment_name value just after AzureMLEndpointClient is created, and it all works fine; the call only succeeds with the deployment name included. A sketch of the plumbing I have in mind is below.
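
For illustration, the proposed plumbing would look roughly like this. It is a simplified, hypothetical stand-in, not the actual langchain code; the class is reduced to the essentials and the timeout value is arbitrary:

import urllib.request


class AzureMLEndpointClient:
    """Simplified stand-in for langchain's client, showing how a
    deployment_name would turn into the extra routing header."""

    def __init__(self, endpoint_url: str, endpoint_api_key: str,
                 deployment_name: str = "") -> None:
        self.endpoint_url = endpoint_url
        self.endpoint_api_key = endpoint_api_key
        self.deployment_name = deployment_name

    def call(self, body: bytes) -> bytes:
        headers = {
            "Content-Type": "application/json",
            "Authorization": "Bearer " + self.endpoint_api_key,
        }
        # The proposed fix: forward the deployment name so the request is
        # routed to the intended deployment instead of relying on the
        # endpoint's default traffic rules.
        if self.deployment_name:
            headers["azureml-model-deployment"] = self.deployment_name
        request = urllib.request.Request(self.endpoint_url, body, headers)
        with urllib.request.urlopen(request, timeout=50) as response:
            return response.read()

AzureMLChatOnlineEndpoint would then accept a deployment_name field of its own and pass it through when it constructs this client.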

I'm happy to raise a PR if you agree this is the correct solution, @dosu-bot.

@levalencia
Contributor Author

When testing with the newest versions, it works now.
