
Add support for returning classifier scores in transformers output #8512

Merged: 3 commits into mlflow:master from classifier-standardization on May 24, 2023

Conversation

BenWilson2 (Member)

Related Issues/PRs

#xxx

What changes are proposed in this pull request?

Add support for returning score attributes for label classification in text-based LLM classification pipelines for transformers

How is this patch tested?

  • Existing unit/integration tests
  • New unit/integration tests
  • Manual tests (describe details, including test results, below)

Validation that serving works correctly for ZeroShotClassificationPipelines and TextClassificationPipelines. Existing unit tests and integration tests have been updated to conform to the updated return types.
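For reference, a minimal sketch of how such a serving check can be exercised locally against a served text classification pipeline (the endpoint, port, input text, and payload format are assumptions for illustration; this is not the test code from this PR):

import json

import requests

# Hypothetical local endpoint started with `mlflow models serve -m <model_uri>`.
payload = {"inputs": ["My dog loves to eat spaghetti"]}
response = requests.post(
    "http://127.0.0.1:5000/invocations",
    data=json.dumps(payload),
    headers={"Content-Type": "application/json"},
)
# With this change, the response contains label and score fields for each
# prediction rather than bare label strings.
print(response.json())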

Does this PR change the documentation?

  • No. You can skip the rest of this section.
  • Yes. Make sure the changed pages / sections render correctly in the documentation preview.

Release Notes

Is this a user-facing change?

  • No. You can skip the rest of this section.
  • Yes. Give a description of this change to be included in the release notes for MLflow users.

Added return scores for text-based classification pipelines in transformers

What component(s), interfaces, languages, and integrations does this PR affect?

Components

  • area/artifacts: Artifact stores and artifact logging
  • area/build: Build and test infrastructure for MLflow
  • area/docs: MLflow documentation pages
  • area/examples: Example code
  • area/model-registry: Model Registry service, APIs, and the fluent client calls for Model Registry
  • area/models: MLmodel format, model serialization/deserialization, flavors
  • area/recipes: Recipes, Recipe APIs, Recipe configs, Recipe Templates
  • area/projects: MLproject format, project running backends
  • area/scoring: MLflow Model server, model deployment tools, Spark UDFs
  • area/server-infra: MLflow Tracking server backend
  • area/tracking: Tracking Service, tracking client APIs, autologging

Interface

  • area/uiux: Front-end, user experience, plotting, JavaScript, JavaScript dev server
  • area/docker: Docker use across MLflow's components, such as MLflow Projects and MLflow Models
  • area/sqlalchemy: Use of SQLAlchemy in the Tracking Service or Model Registry
  • area/windows: Windows support

Language

  • language/r: R APIs and clients
  • language/java: Java APIs and clients
  • language/new: Proposals for new client languages

Integrations

  • integrations/azure: Azure and Azure ML integrations
  • integrations/sagemaker: SageMaker integrations
  • integrations/databricks: Databricks integrations

How should the PR be classified in the release notes? Choose one:

  • rn/breaking-change - The PR will be mentioned in the "Breaking Changes" section
  • rn/none - No description will be included. The PR will be mentioned only by the PR number in the "Small Bugfixes and Documentation Updates" section
  • rn/feature - A new user-facing feature worth mentioning in the release notes
  • rn/bug-fix - A user-facing bug fix worth mentioning in the release notes
  • rn/documentation - A user-facing documentation change worth mentioning in the release notes

Signed-off-by: Ben Wilson <benjamin.wilson@databricks.com>
{0: "POSITIVE"},
{0: "POSITIVE"},
]
assert len(values.to_dict()) == 2
BenWilson2 (Member Author):

all of these validations are being moved to structural relationships since scores are somewhat non-deterministic depending on the model used.
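As an illustration of what such a structural check might look like (assuming `values` is the pandas DataFrame returned by the updated pipeline wrapper; this is not the actual test code):

import pandas as pd

# Check the shape of the result rather than exact, model-dependent values.
assert isinstance(values, pd.DataFrame)
assert "score" in values.columns
assert values["score"].between(0.0, 1.0).all()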

github-actions bot added the area/models (MLmodel format, model serialization/deserialization, flavors) and rn/feature (Mention under Features in Changelogs) labels on May 23, 2023
mlflow-automation (Collaborator) commented May 23, 2023

Documentation preview for 4bca13a will be available here when this CircleCI job completes successfully.


for entry in data:
    for label, score in zip(entry["labels"], entry["scores"]):
        flattened_data.append(
            {"sequence": entry["sequence"], "labels": label, "score": score}
Collaborator:

Suggested change:
- {"sequence": entry["sequence"], "labels": label, "score": score}
+ {"sequence": entry["sequence"], "label": label, "score": score}

BenWilson2 (Member Author):

I used these keys to match exactly the return dict key naming used by the transformers pipelines. I think it might be a little confusing to users if calling the pipeline directly gives a response that says "labels" but using it in serving says "label". This was just to keep it consistent with the transformers package implementation :)
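For context, this is roughly what the raw transformers zero-shot pipeline returns (the model name and scores below are illustrative, not taken from this PR):

from transformers import pipeline

classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")
result = classifier("My dog loves to eat spaghetti", candidate_labels=["happy", "sad"])
# The pipeline itself uses the plural keys "labels" and "scores", e.g.:
# {"sequence": "My dog loves to eat spaghetti",
#  "labels": ["happy", "sad"],
#  "scores": [0.99, 0.01]}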

outputs=Schema(
    [
        ColSpec("string", name="sequence"),
        ColSpec("string", name="labels"),
Collaborator:

Suggested change:
- ColSpec("string", name="labels"),
+ ColSpec("string", name="label"),

BenWilson2 (Member Author):

same as above (purely for consistency with the transformers API)

WeichenXu123 (Collaborator) left a comment:

LGTM after addressing comments

Comment on lines 1829 to 1837
{'sequence': {0: 'My dog loves to eat spaghetti',
              1: 'My dog loves to eat spaghetti',
              2: 'My dog hates going to the vet',
              3: 'My dog hates going to the vet'},
 'label': {0: 'happy', 1: 'sad', 2: 'sad', 3: 'happy'},
 'score': {0: 0.9896970987319946,
           1: 0.010302911512553692,
           2: 0.957074761390686,
           3: 0.042925238609313965}}
harupy (Member) commented May 24, 2023:

Can we replace this with what the DataFrame looks like, since this function returns a DataFrame? It's difficult to guess what this function returns from the converted dictionary.

BenWilson2 (Member Author):

great point. Changing to the .to_string() output format
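For illustration, a small sketch of how the same example data renders with .to_string() (the values are the ones from the excerpt above, rounded; this is not the exact docstring text added in the PR):

import pandas as pd

df = pd.DataFrame(
    {
        "sequence": ["My dog loves to eat spaghetti"] * 2
        + ["My dog hates going to the vet"] * 2,
        "label": ["happy", "sad", "sad", "happy"],
        "score": [0.9897, 0.0103, 0.9571, 0.0429],
    }
)
print(df.to_string())
# Output (approximate alignment):
#                         sequence  label   score
# 0  My dog loves to eat spaghetti  happy  0.9897
# 1  My dog loves to eat spaghetti    sad  0.0103
# 2  My dog hates going to the vet    sad  0.9571
# 3  My dog hates going to the vet  happy  0.0429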

Comment on lines 1664 to 1665
# interim_output = self._parse_lists_of_dict_to_list_of_str(raw_output, output_key)
# output = self._parse_list_output_for_multiple_candidate_pipelines(interim_output)
Member:

Suggested change (delete the commented-out lines):
- # interim_output = self._parse_lists_of_dict_to_list_of_str(raw_output, output_key)
- # output = self._parse_list_output_for_multiple_candidate_pipelines(interim_output)

BenWilson2 (Member Author):

TY for the catch :)

Comment on lines 2545 to 2531
"outputs": '[{"type": "string"}]',
"outputs": '[{"name": "sequence", "type": "string"}, {"name": "labels", '
'"type": "string"}, {"name": "score", "type": "double"}]',
harupy (Member) commented May 24, 2023:

Can we use dict instead of string here?

BenWilson2 (Member Author):

This is to match the output of ModelSignature.to_dict(). The alternative might be to convert the constructed ModelSignature instance directly to a dict instead of the JSON-encoded representation, but that would probably make this test far more complicated to follow.
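For context on why strings appear here, a minimal sketch of how ModelSignature.to_dict() serializes each schema (the column names are the ones discussed in this PR; the test content itself is not reproduced):

from mlflow.models.signature import ModelSignature
from mlflow.types.schema import ColSpec, Schema

signature = ModelSignature(
    inputs=Schema([ColSpec("string")]),
    outputs=Schema(
        [
            ColSpec("string", name="sequence"),
            ColSpec("string", name="labels"),
            ColSpec("double", name="score"),
        ]
    ),
)
# to_dict() JSON-encodes each schema as a string rather than a nested dict, e.g.:
# {"inputs": '[{"type": "string"}]',
#  "outputs": '[{"name": "sequence", "type": "string"}, ...]'}
print(signature.to_dict())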

@@ -2425,12 +2425,12 @@
  Pipeline Type                  Input Type        Output Type
  Instructional Text Generation  str or List[str]  str or List[str]
  Conversational                 str or List[str]  str or List[str]
  Summarization                  str or List[str]  str or List[str]
- Text Classification            str or List[str]  str or List[str]
+ Text Classification            str or List[str]  pd.DataFrame
Member:

Is it possible to include the dataframe schema like this?

pd.DataFrame (dtypes: {foo: int, bar: float})

BenWilson2 (Member Author):

great idea :)

@@ -1800,6 +1807,52 @@ def _coerce_exploded_dict_to_single_dict(self, data):
        else:
            return data

    def _parse_zero_shot_text_classifier_output_to_df(self, data):
Member:

Suggested change:
- def _parse_zero_shot_text_classifier_output_to_df(self, data):
+ def _{ flatten | expand }_zero_shot_text_classifier_output(self, data):

Nit: I'd use flatten or expand here :)

BenWilson2 (Member Author):

agreed. flatten is more appropriate. Changed!
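A minimal sketch of what such a flattening helper might look like (assuming the raw zero-shot output shape shown earlier; this is not the exact implementation merged in this PR):

import pandas as pd


def flatten_zero_shot_text_classifier_output(data):
    # Accept a single result dict or a list of result dicts from the
    # transformers zero-shot pipeline, each shaped like:
    # {"sequence": str, "labels": [str, ...], "scores": [float, ...]}
    if isinstance(data, dict):
        data = [data]
    flattened_data = []
    for entry in data:
        for label, score in zip(entry["labels"], entry["scores"]):
            flattened_data.append(
                {"sequence": entry["sequence"], "labels": label, "score": score}
            )
    return pd.DataFrame(flattened_data)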

harupy (Member) left a comment:

Left some comments, looks good to me once they are addressed :)

Signed-off-by: Ben Wilson <benjamin.wilson@databricks.com>
Signed-off-by: Ben Wilson <benjamin.wilson@databricks.com>
@BenWilson2 BenWilson2 enabled auto-merge (squash) May 24, 2023 19:25
@BenWilson2 BenWilson2 merged commit 4dcf9e3 into mlflow:master May 24, 2023
35 checks passed
@BenWilson2 BenWilson2 deleted the classifier-standardization branch May 24, 2023 20:13