[BUG] `mlflow models serve` fails with HTTP 500 instead of 400 on bad input #4897

mmaitre314 · 2021-10-13T22:20:19Z

Thank you for submitting an issue. Please refer to our issue policy for additional information about bug reports. For help with debugging your code, please refer to Stack Overflow.

Please fill in this bug report template to ensure a timely and thorough response.

Willingness to contribute

The MLflow Community encourages bug fix contributions. Would you or another member of your organization be willing to contribute a fix for this bug to the MLflow code base?

Yes. I can contribute a fix for this bug independently.
Yes. I would be willing to contribute a fix for this bug with guidance from the MLflow community.
No. I cannot contribute a bug fix at this time.

System information

Have I written custom code (as opposed to using a stock example script provided in MLflow): yes
OS Platform and Distribution (e.g., Linux Ubuntu 16.04): windows 10
MLflow installed from (source or binary): binary
MLflow version (run mlflow --version): 1.20.2
Python version: 3.8
npm version, if running the dev UI: N/A
Exact command to reproduce:
- Start server: mlflow models serve -m runs:/5f8aee52fcb442388368af4da658b398/model --no-conda
- Submit an inference request: curl -i -X POST -d "{\"data\":0.0199132142]}" -H "Content-Type: application/json" http://localhost:5000/invocations

Describe the problem

Describe the problem clearly here. Include descriptions of the expected behavior and the actual behavior.

Submitting an inference request with invalid content to the MLflow model server returns HTTP error 500 'Internal Server Error' instead of HTTP error 400 'Bad Request'. This prevents proper error handling on the client side and blocks REST API fuzzing.

Ex:

curl -i -X POST -d "{\"data\":0.0199132142]}" -H "Content-Type: application/json" http://localhost:5000/invocations
HTTP/1.1 500 INTERNAL SERVER ERROR
Content-Length: 901
Content-Type: application/json
Date: Wed, 13 Oct 2021 22:16:44 GMT
Server: mlflow

{"error_code": "MALFORMED_REQUEST", "message": "Failed to parse input from JSON. Ensure that input is a valid JSON formatted string.", "stack_trace": "Traceback (most recent call last):\n  File \"C:\\Source\\local_training_mlflow_project\\.venv\\lib\\site-packages\\mlflow\\pyfunc\\scoring_server\\__init__.py\", line 81, in infer_and_parse_json_input\n    decoded_input = json.loads(json_input)\n  File \"C:\\Users\\mmaitre\\Anaconda3\\lib\\json\\__init__.py\", line 357, in loads\n    return _default_decoder.decode(s)\n  File \"C:\\Users\\mmaitre\\Anaconda3\\lib\\json\\decoder.py\", line 337, in decode\n    obj, end = self.raw_decode(s, idx=_w(s, 0).end())\n  File \"C:\\Users\\mmaitre\\Anaconda3\\lib\\json\\decoder.py\", line 353, in raw_decode\n    obj, end = self.scan_once(s, idx)\njson.decoder.JSONDecodeError: Expecting ',' delimiter: line 1 column 21 (char 20)\n"}

Code to reproduce issue

Provide a reproducible test case that is the bare minimum necessary to generate the problem.

curl -i -X POST -d "{\"data\":0.0199132142]}" -H "Content-Type: application/json" http://localhost:5000/invocations
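
The parse failure itself can be reproduced without a running server. A minimal sketch, assuming only the Python standard library; the payload is the request body from the curl command above, and no MLflow code is involved:

```python
import json

# Request body from the curl command: the stray "]" makes it invalid JSON.
payload = '{"data":0.0199132142]}'

try:
    json.loads(payload)
    parse_error = None
except json.JSONDecodeError as exc:
    # Keep the exception so its position can be inspected below.
    parse_error = exc

# The decoder reports the position of the offending character.
print(parse_error)
```

The reported position matches the one in the stack trace of the response, which suggests the request is rejected purely because of its syntax, that is, a client-side error.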

Other info / logs

Include any logs or source code that would be helpful to diagnose the problem. If including tracebacks, please include the full traceback. Large logs and files should be attached.

What component(s), interfaces, languages, and integrations does this bug affect?

Components

Interface

area/uiux: Front-end, user experience, plotting, JavaScript, JavaScript dev server
area/docker: Docker use across MLflow's components, such as MLflow Projects and MLflow Models
area/sqlalchemy: Use of SQLAlchemy in the Tracking Service or Model Registry
area/windows: Windows support

Language

language/r: R APIs and clients
language/java: Java APIs and clients
language/new: Proposals for new client languages

Integrations

integrations/azure: Azure and Azure ML integrations
integrations/sagemaker: SageMaker integrations
integrations/databricks: Databricks integrations


abatomunkuev · 2021-11-01T11:43:29Z

Hello @mmaitre314! I would like to work on this issue.

As for the root cause, it seems that we are passing an incorrect value for the error_code argument of the function _handle_serving_error.

mlflow/mlflow/pyfunc/scoring_server/__init__.py

Lines 80 to 89 in b945297

    
    try:
        decoded_input = json.loads(json_input)
    except json.decoder.JSONDecodeError:
        _handle_serving_error(
            error_message=(
                "Failed to parse input from JSON. Ensure that input is a valid JSON"
                " formatted string."
            ),
            error_code=MALFORMED_REQUEST,
        )

The function _handle_serving_error then passes the value MALFORMED_REQUEST to the MlflowException constructor.

mlflow/mlflow/pyfunc/scoring_server/__init__.py

Line 203 in b945297

    
           e = MlflowException(message=error_message, error_code=error_code, stack_trace=traceback_str)

The MlflowException constructor tries to resolve the value MALFORMED_REQUEST, which is not defined there. The lookup probably raises, and the except branch then stores ErrorCode.Name(INTERNAL_ERROR) in error_code.

mlflow/mlflow/exceptions.py

Lines 49 to 52 in b945297

    
    try:
        self.error_code = ErrorCode.Name(error_code)
    except (ValueError, TypeError):
        self.error_code = ErrorCode.Name(INTERNAL_ERROR)

The problem is that MALFORMED_REQUEST has not been defined in mlflow/exceptions.py

mlflow/mlflow/exceptions.py

Lines 3 to 28 in b945297

    
    from mlflow.protos.databricks_pb2 import (
        INTERNAL_ERROR,
        TEMPORARILY_UNAVAILABLE,
        ENDPOINT_NOT_FOUND,
        PERMISSION_DENIED,
        REQUEST_LIMIT_EXCEEDED,
        BAD_REQUEST,
        INVALID_PARAMETER_VALUE,
        RESOURCE_DOES_NOT_EXIST,
        INVALID_STATE,
        RESOURCE_ALREADY_EXISTS,
        ErrorCode,
    )

    ERROR_CODE_TO_HTTP_STATUS = {
        ErrorCode.Name(INTERNAL_ERROR): 500,
        ErrorCode.Name(INVALID_STATE): 500,
        ErrorCode.Name(TEMPORARILY_UNAVAILABLE): 503,
        ErrorCode.Name(REQUEST_LIMIT_EXCEEDED): 429,
        ErrorCode.Name(ENDPOINT_NOT_FOUND): 404,
        ErrorCode.Name(RESOURCE_DOES_NOT_EXIST): 404,
        ErrorCode.Name(PERMISSION_DENIED): 403,
        ErrorCode.Name(BAD_REQUEST): 400,
        ErrorCode.Name(RESOURCE_ALREADY_EXISTS): 400,
        ErrorCode.Name(INVALID_PARAMETER_VALUE): 400,
    }
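
The response in the original report still carries "error_code": "MALFORMED_REQUEST", so another possible reading is that the code resolves fine but simply has no entry in the table above. A minimal sketch of that reading, assuming the status lookup falls back to 500 for unmapped codes; the `http_status_for` helper and the string-keyed table are hypothetical stand-ins, not MLflow's code:

```python
# Hypothetical stand-in for the table above, keyed by error-code name.
ERROR_CODE_TO_HTTP_STATUS = {
    "INTERNAL_ERROR": 500,
    "INVALID_STATE": 500,
    "TEMPORARILY_UNAVAILABLE": 503,
    "REQUEST_LIMIT_EXCEEDED": 429,
    "ENDPOINT_NOT_FOUND": 404,
    "RESOURCE_DOES_NOT_EXIST": 404,
    "PERMISSION_DENIED": 403,
    "BAD_REQUEST": 400,
    "RESOURCE_ALREADY_EXISTS": 400,
    "INVALID_PARAMETER_VALUE": 400,
}


def http_status_for(error_code: str) -> int:
    """Assumed lookup: codes missing from the table default to 500."""
    return ERROR_CODE_TO_HTTP_STATUS.get(error_code, 500)


# MALFORMED_REQUEST has no entry, so the assumed default applies.
print(http_status_for("MALFORMED_REQUEST"))
# BAD_REQUEST is mapped explicitly.
print(http_status_for("BAD_REQUEST"))
```

Under that assumption, either using a code that is already mapped to 400 or adding a mapping for the unmapped code would change the status returned to the client.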

There are 2 ways to fix this issue:

1. Instead of passing MALFORMED_REQUEST, we need to pass BAD_REQUEST. I would prefer this option.
2. Define MALFORMED_REQUEST in mlflow/exceptions.py.
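
The behavior a client would expect after either fix can be sketched as follows. This is an illustration only: `parse_request_body` is a hypothetical handler, not the scoring server's function, and it assumes the first option (reporting BAD_REQUEST, which the table above maps to 400):

```python
import json

# Status that the table above associates with BAD_REQUEST.
BAD_REQUEST_STATUS = 400


def parse_request_body(body: str):
    """Hypothetical handler: return (status, payload) for a raw request body."""
    try:
        return 200, json.loads(body)
    except json.JSONDecodeError as exc:
        # Malformed input is the client's fault, so report it as a 4xx error.
        error = {
            "error_code": "BAD_REQUEST",
            "message": f"Failed to parse input from JSON: {exc}",
        }
        return BAD_REQUEST_STATUS, error


status, payload = parse_request_body('{"data":0.0199132142]}')
print(status, payload["error_code"])
```

A well-formed body such as '{"data": [0.0199132142]}' goes through the same function and comes back with status 200 and the decoded object.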

dbczumar · 2021-11-03T17:27:43Z

Hi @abatomunkuev ! Thank you for the detailed root cause analysis and willingness to contribute. We'd be very excited about your contribution for a fix; I agree that solution #1 is better. Please feel free to file a pull request, and let me know if you have any questions!

abatomunkuev · 2021-11-04T04:25:19Z

Hello @dbczumar! I am currently having some issues reproducing the error.

It seems to me that to start a server, I may need an ML model.
mlflow models serve -m runs:/5f8aee52fcb442388368af4da658b398/model --no-conda

Could you please guide me through how to properly serve the model, from model building to serving? I am trying to run this script: python sklearn_elasticnet_wine/train.py. However, I got the following dependency error:

(mlflow-dev-env) MacBook-Pro-Andrei:examples bork$ python sklearn_elasticnet_wine/train.py
Traceback (most recent call last):
  File "sklearn_elasticnet_wine/train.py", line 9, in <module>
    import pandas as pd
ImportError: No module named pandas
(mlflow-dev-env) MacBook-Pro-Andrei:examples bork$

I have gone through the contribution guidelines, created a conda environment, and installed the dependencies.

dbczumar · 2021-11-04T04:50:09Z

Hi @abatomunkuev , are you sure that MLflow and Pandas are installed in your conda environment?

abatomunkuev · 2021-11-04T04:56:21Z

@dbczumar

(mlflow-dev-env) MacBook-Pro-Andrei:mlflow bork$ pip install -r dev-requirements.txt
DEPRECATION: Configuring installation scheme with distutils config files is deprecated and will no longer work in the near future. If you are using a Homebrew or Linuxbrew Python, please see discussion at https://github.com/Homebrew/homebrew-core/issues/76621
Requirement already satisfied: sphinx==3.5.4 in /usr/local/lib/python3.7/site-packages (from -r dev-requirements.txt (line 4)) (3.5.4)
Requirement already satisfied: sphinx-autobuild in /usr/local/lib/python3.7/site-packages (from -r dev-requirements.txt (line 5)) (2021.3.14)
Requirement already satisfied: sphinx-click in /usr/local/lib/python3.7/site-packages (from -r dev-requirements.txt (line 6)) (3.0.1)
Requirement already satisfied: scikit-learn in /usr/local/lib/python3.7/site-packages (from -r dev-requirements.txt (line 7)) (1.0.1)
Requirement already satisfied: scipy in /usr/local/lib/python3.7/site-packages (from -r dev-requirements.txt (line 8)) (1.7.1)
Requirement already satisfied: kubernetes in /usr/local/lib/python3.7/site-packages (from -r dev-requirements.txt (line 9)) (19.15.0)
Requirement already satisfied: docutils<0.17,>=0.12 in /usr/local/lib/python3.7/site-packages (from sphinx==3.5.4->-r dev-requirements.txt (line 4)) (0.16)
Requirement already satisfied: sphinxcontrib-qthelp in /usr/local/lib/python3.7/site-packages (from sphinx==3.5.4->-r dev-requirements.txt (line 4)) (1.0.3)
Requirement already satisfied: snowballstemmer>=1.1 in /usr/local/lib/python3.7/site-packages (from sphinx==3.5.4->-r dev-requirements.txt (line 4)) (2.1.0)
Requirement already satisfied: babel>=1.3 in /usr/local/lib/python3.7/site-packages (from sphinx==3.5.4->-r dev-requirements.txt (line 4)) (2.9.1)
Requirement already satisfied: imagesize in /usr/local/lib/python3.7/site-packages (from sphinx==3.5.4->-r dev-requirements.txt (line 4)) (1.2.0)
Requirement already satisfied: packaging in /usr/local/lib/python3.7/site-packages (from sphinx==3.5.4->-r dev-requirements.txt (line 4)) (21.2)
Requirement already satisfied: requests>=2.5.0 in /usr/local/lib/python3.7/site-packages (from sphinx==3.5.4->-r dev-requirements.txt (line 4)) (2.26.0)
Requirement already satisfied: setuptools in /usr/local/lib/python3.7/site-packages (from sphinx==3.5.4->-r dev-requirements.txt (line 4)) (42.0.2)
Requirement already satisfied: Jinja2>=2.3 in /usr/local/lib/python3.7/site-packages (from sphinx==3.5.4->-r dev-requirements.txt (line 4)) (3.0.2)
Requirement already satisfied: sphinxcontrib-serializinghtml in /usr/local/lib/python3.7/site-packages (from sphinx==3.5.4->-r dev-requirements.txt (line 4)) (1.1.5)
Requirement already satisfied: sphinxcontrib-applehelp in /usr/local/lib/python3.7/site-packages (from sphinx==3.5.4->-r dev-requirements.txt (line 4)) (1.0.2)
Requirement already satisfied: sphinxcontrib-devhelp in /usr/local/lib/python3.7/site-packages (from sphinx==3.5.4->-r dev-requirements.txt (line 4)) (1.0.2)
Requirement already satisfied: Pygments>=2.0 in /usr/local/lib/python3.7/site-packages (from sphinx==3.5.4->-r dev-requirements.txt (line 4)) (2.10.0)
Requirement already satisfied: sphinxcontrib-htmlhelp in /usr/local/lib/python3.7/site-packages (from sphinx==3.5.4->-r dev-requirements.txt (line 4)) (2.0.0)
Requirement already satisfied: alabaster<0.8,>=0.7 in /usr/local/lib/python3.7/site-packages (from sphinx==3.5.4->-r dev-requirements.txt (line 4)) (0.7.12)
Requirement already satisfied: sphinxcontrib-jsmath in /usr/local/lib/python3.7/site-packages (from sphinx==3.5.4->-r dev-requirements.txt (line 4)) (1.0.1)
Requirement already satisfied: livereload in /usr/local/lib/python3.7/site-packages (from sphinx-autobuild->-r dev-requirements.txt (line 5)) (2.6.3)
Requirement already satisfied: colorama in /usr/local/lib/python3.7/site-packages (from sphinx-autobuild->-r dev-requirements.txt (line 5)) (0.4.4)
Requirement already satisfied: click>=7.0 in /usr/local/lib/python3.7/site-packages (from sphinx-click->-r dev-requirements.txt (line 6)) (8.0.3)
Requirement already satisfied: joblib>=0.11 in /usr/local/lib/python3.7/site-packages (from scikit-learn->-r dev-requirements.txt (line 7)) (1.1.0)
Requirement already satisfied: threadpoolctl>=2.0.0 in /usr/local/lib/python3.7/site-packages (from scikit-learn->-r dev-requirements.txt (line 7)) (3.0.0)
Requirement already satisfied: numpy>=1.14.6 in /usr/local/lib/python3.7/site-packages (from scikit-learn->-r dev-requirements.txt (line 7)) (1.19.5)
Requirement already satisfied: websocket-client!=0.40.0,!=0.41.*,!=0.42.*,>=0.32.0 in /usr/local/lib/python3.7/site-packages (from kubernetes->-r dev-requirements.txt (line 9)) (1.2.1)
Requirement already satisfied: certifi>=14.05.14 in /usr/local/lib/python3.7/site-packages (from kubernetes->-r dev-requirements.txt (line 9)) (2021.10.8)
Requirement already satisfied: python-dateutil>=2.5.3 in /usr/local/lib/python3.7/site-packages (from kubernetes->-r dev-requirements.txt (line 9)) (2.8.2)
Requirement already satisfied: requests-oauthlib in /usr/local/lib/python3.7/site-packages (from kubernetes->-r dev-requirements.txt (line 9)) (1.3.0)
Requirement already satisfied: pyyaml>=5.4.1 in /usr/local/lib/python3.7/site-packages (from kubernetes->-r dev-requirements.txt (line 9)) (6.0)
Requirement already satisfied: google-auth>=1.0.1 in /usr/local/lib/python3.7/site-packages (from kubernetes->-r dev-requirements.txt (line 9)) (2.3.3)
Requirement already satisfied: urllib3>=1.24.2 in /usr/local/lib/python3.7/site-packages (from kubernetes->-r dev-requirements.txt (line 9)) (1.26.7)
Requirement already satisfied: six>=1.9.0 in /usr/local/lib/python3.7/site-packages (from kubernetes->-r dev-requirements.txt (line 9)) (1.15.0)
Requirement already satisfied: pytz>=2015.7 in /usr/local/lib/python3.7/site-packages (from babel>=1.3->sphinx==3.5.4->-r dev-requirements.txt (line 4)) (2021.3)
Requirement already satisfied: importlib-metadata in /usr/local/lib/python3.7/site-packages (from click>=7.0->sphinx-click->-r dev-requirements.txt (line 6)) (4.8.1)
Requirement already satisfied: rsa<5,>=3.1.4 in /usr/local/lib/python3.7/site-packages (from google-auth>=1.0.1->kubernetes->-r dev-requirements.txt (line 9)) (4.7.2)
Requirement already satisfied: cachetools<5.0,>=2.0.0 in /usr/local/lib/python3.7/site-packages (from google-auth>=1.0.1->kubernetes->-r dev-requirements.txt (line 9)) (4.2.4)
Requirement already satisfied: pyasn1-modules>=0.2.1 in /usr/local/lib/python3.7/site-packages (from google-auth>=1.0.1->kubernetes->-r dev-requirements.txt (line 9)) (0.2.8)
Requirement already satisfied: MarkupSafe>=2.0 in /usr/local/lib/python3.7/site-packages (from Jinja2>=2.3->sphinx==3.5.4->-r dev-requirements.txt (line 4)) (2.0.1)
Requirement already satisfied: idna<4,>=2.5 in /usr/local/lib/python3.7/site-packages (from requests>=2.5.0->sphinx==3.5.4->-r dev-requirements.txt (line 4)) (3.3)
Requirement already satisfied: charset-normalizer~=2.0.0 in /usr/local/lib/python3.7/site-packages (from requests>=2.5.0->sphinx==3.5.4->-r dev-requirements.txt (line 4)) (2.0.7)
Requirement already satisfied: tornado in /usr/local/lib/python3.7/site-packages (from livereload->sphinx-autobuild->-r dev-requirements.txt (line 5)) (6.1)
Requirement already satisfied: pyparsing<3,>=2.0.2 in /usr/local/lib/python3.7/site-packages (from packaging->sphinx==3.5.4->-r dev-requirements.txt (line 4)) (2.4.7)
Requirement already satisfied: oauthlib>=3.0.0 in /usr/local/lib/python3.7/site-packages (from requests-oauthlib->kubernetes->-r dev-requirements.txt (line 9)) (3.1.1)
Requirement already satisfied: pyasn1<0.5.0,>=0.4.6 in /usr/local/lib/python3.7/site-packages (from pyasn1-modules>=0.2.1->google-auth>=1.0.1->kubernetes->-r dev-requirements.txt (line 9)) (0.4.8)
Requirement already satisfied: zipp>=0.5 in /usr/local/lib/python3.7/site-packages (from importlib-metadata->click>=7.0->sphinx-click->-r dev-requirements.txt (line 6)) (3.6.0)
Requirement already satisfied: typing-extensions>=3.6.4 in /usr/local/lib/python3.7/site-packages (from importlib-metadata->click>=7.0->sphinx-click->-r dev-requirements.txt (line 6)) (3.7.4.3)
(mlflow-dev-env) MacBook-Pro-Andrei:mlflow bork$ pip install pandas
DEPRECATION: Configuring installation scheme with distutils config files is deprecated and will no longer work in the near future. If you are using a Homebrew or Linuxbrew Python, please see discussion at https://github.com/Homebrew/homebrew-core/issues/76621
Requirement already satisfied: pandas in /usr/local/lib/python3.7/site-packages (1.3.4)
Requirement already satisfied: numpy>=1.17.3 in /usr/local/lib/python3.7/site-packages (from pandas) (1.19.5)
Requirement already satisfied: pytz>=2017.3 in /usr/local/lib/python3.7/site-packages (from pandas) (2021.3)
Requirement already satisfied: python-dateutil>=2.7.3 in /usr/local/lib/python3.7/site-packages (from pandas) (2.8.2)
Requirement already satisfied: six>=1.5 in /usr/local/lib/python3.7/site-packages (from python-dateutil>=2.7.3->pandas) (1.15.0)
(mlflow-dev-env) MacBook-Pro-Andrei:mlflow bork$ cd examples/
(mlflow-dev-env) MacBook-Pro-Andrei:examples bork$ python sklearn_elasticnet_wine/train.py
Traceback (most recent call last):
  File "sklearn_elasticnet_wine/train.py", line 9, in <module>
    import pandas as pd
ImportError: No module named pandas
(mlflow-dev-env) MacBook-Pro-Andrei:examples bork$

dbczumar · 2021-11-04T05:05:27Z

@abatomunkuev if you run “which python” and “which pip”, do the resulting locations reside within the expected conda environment?
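
The same check can be made from inside Python. A minimal sketch, assuming it is run with the same `python` command that launches train.py; `describe_environment` is a hypothetical helper written for this illustration:

```python
import importlib.util
import sys


def describe_environment(module_name: str = "pandas") -> dict:
    """Report which interpreter is running and where it would load a module from."""
    # find_spec locates a top-level module without importing it;
    # it returns None when the module cannot be found.
    spec = importlib.util.find_spec(module_name)
    return {
        "interpreter": sys.executable,
        "prefix": sys.prefix,
        "module_location": spec.origin if spec is not None else None,
    }


info = describe_environment()
for key, value in info.items():
    print(f"{key}: {value}")
```

In the log above, pip reports packages under /usr/local/lib/python3.7/site-packages. If the interpreter path printed here lies elsewhere, pip and python belong to different installations. Running `python -m pip install pandas` installs into whichever interpreter `python` resolves to.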

abatomunkuev · 2021-11-04T05:36:11Z

@dbczumar I got it working. I had to run the Python script from within the conda environment and install the dependencies into that environment as well.

abatomunkuev · 2021-11-04T12:21:45Z

@dbczumar I have created a pull request. However, when I ran the tests with the following command

pytest tests/pyfunc --large

some tests failed because I changed the code in scoring_server/__init__.py. I am a little confused. Should I change the existing tests in test_scoring_server.py? If so, could you help me find which test function needs to be updated to reflect the changes?

Thank you.

dbczumar · 2022-08-09T18:42:03Z

Closing now that #5003 has been merged. Thanks @abatomunkuev !

mmaitre314 added the bug Something isn't working label Oct 13, 2021

github-actions bot added the area/scoring MLflow Model server, model deployment tools, Spark UDFs label Oct 13, 2021

abatomunkuev mentioned this issue Nov 4, 2021

BUG: fixed model serve fail with HTTP 400 on Bad Request. #5003

Merged

27 tasks

dbczumar closed this as completed Aug 9, 2022
