NIFI-10953: Implement GCP Vision AI processors #6762

Conversation

KalmanJantner
Contributor

@KalmanJantner KalmanJantner commented Dec 6, 2022

Summary

NIFI-10953

Tracking

Please complete the following tracking steps prior to pull request creation.

Issue Tracking

Pull Request Tracking

  • Pull Request title starts with Apache NiFi Jira issue number, such as NIFI-00000
  • Pull Request commit message starts with Apache NiFi Jira issue number, such as NIFI-00000

Pull Request Formatting

  • Pull Request based on current revision of the main branch
  • Pull Request refers to a feature branch with one commit containing changes

Verification

Please indicate the verification steps performed prior to pull request creation.

Build

  • Build completed using mvn clean install -P contrib-check
    • JDK 8
    • JDK 11
    • JDK 17

Licensing

  • New dependencies are compatible with the Apache License 2.0 according to the License Policy
  • New dependencies are documented in applicable LICENSE and NOTICE files

Documentation

  • Documentation formatting appears as expected in rendered files

}
}

abstract protected Message fromJson(String json) throws IOException;
Contributor

No need for this method to be declared here. In fact, its implementations might be better off inlined into the startOperation implementations.

Contributor Author

Thanks, fixed.

Comment on lines 40 to 41
OperationFuture asyncResponse = startOperation(session, flowFile);
String operationName = ((OperationSnapshot) asyncResponse.getInitialFuture().get()).getName();
Contributor

Suggested change
- OperationFuture asyncResponse = startOperation(session, flowFile);
- String operationName = ((OperationSnapshot) asyncResponse.getInitialFuture().get()).getName();
+ OperationFuture<?, ?> asyncResponse = startOperation(session, flowFile);
+ String operationName = asyncResponse.getName();

REL_FAILURE
)));

private ImageAnnotatorClient vision;
Contributor

vision should be closed in an @OnStopped method.
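
A minimal sketch of that lifecycle, written with stand-in types so it compiles without NiFi or the Vision SDK. The `OnStopped` annotation and `FakeVisionClient` below are illustrative stand-ins (assumptions), not the real `org.apache.nifi.annotation.lifecycle.OnStopped` or `ImageAnnotatorClient`:

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// Stand-in for NiFi's @OnStopped lifecycle annotation (assumption, for illustration only).
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface OnStopped {
}

// Stand-in for the Vision client; it only records whether close() was called.
class FakeVisionClient implements AutoCloseable {
    private boolean closed;

    @Override
    public void close() {
        closed = true;
    }

    boolean isClosed() {
        return closed;
    }
}

class VisionProcessorSketch {
    private FakeVisionClient vision;

    // In the real processor this would be the method that builds the client when scheduled.
    void onScheduled() {
        vision = new FakeVisionClient();
    }

    // Invoked by the framework after the processor stops; releases the client if one exists.
    @OnStopped
    public void onStopped() {
        if (vision != null) {
            vision.close();
            vision = null;
        }
    }

    FakeVisionClient getVisionClient() {
        return vision;
    }

    public static void main(String[] args) {
        VisionProcessorSketch processor = new VisionProcessorSketch();
        processor.onScheduled();
        FakeVisionClient client = processor.getVisionClient();
        processor.onStopped();
        System.out.println("client closed: " + client.isClosed());
    }
}
```

The null check makes the stop hook safe to call even if client creation never succeeded.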

Contributor Author

Thank you, fixed.


@Override
public List<PropertyDescriptor> getSupportedPropertyDescriptors() {
return Collections.unmodifiableList(Arrays.asList(
Contributor

This collection should be extracted to a PROPERTIES constant.
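
Roughly, the extraction could look like the sketch below. Plain strings stand in for NiFi `PropertyDescriptor` instances and the two property names are hypothetical, so this only illustrates the shape of the change:

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class PropertiesConstantSketch {
    // Hypothetical property names; in the processor these would be PropertyDescriptor constants.
    static final String GCP_CREDENTIALS_SERVICE = "GCP Credentials Provider Service";
    static final String JSON_PAYLOAD = "JSON Payload";

    // Built once at class-load time instead of on every getSupportedPropertyDescriptors() call.
    private static final List<String> PROPERTIES = Collections.unmodifiableList(Arrays.asList(
            GCP_CREDENTIALS_SERVICE,
            JSON_PAYLOAD
    ));

    public List<String> getSupportedPropertyDescriptors() {
        return PROPERTIES;
    }

    public static void main(String[] args) {
        System.out.println(new PropertiesConstantSketch().getSupportedPropertyDescriptors());
    }
}
```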

ImageAnnotatorSettings.Builder builder = ImageAnnotatorSettings.newBuilder().setCredentialsProvider(credentialsProvider);
vision = ImageAnnotatorClient.create(builder.build());
} catch (Exception e) {
getLogger().error("Failed to create vision client.", e);
Contributor

The framework should know that the processor is incapable of doing its job when this happens.
We should throw a ProcessException.
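
A sketch of the idea under stated assumptions: the `ProcessException` class here is a local stand-in for NiFi's `org.apache.nifi.processor.exception.ProcessException`, and `ClientFactory` is a hypothetical hook that replaces the real `ImageAnnotatorClient.create(...)` call so the example runs on its own:

```java
// Local stand-in for NiFi's ProcessException, which is also an unchecked exception.
class ProcessException extends RuntimeException {
    ProcessException(String message, Throwable cause) {
        super(message, cause);
    }
}

// Hypothetical hook standing in for the code that builds the Vision client.
interface ClientFactory {
    AutoCloseable create() throws Exception;
}

class ClientSetupSketch {
    private AutoCloseable vision;

    void onScheduled(ClientFactory factory) {
        try {
            vision = factory.create();
        } catch (Exception e) {
            // Rethrow instead of only logging, so the caller learns that setup failed.
            throw new ProcessException("Failed to create vision client.", e);
        }
    }

    AutoCloseable getVisionClient() {
        return vision;
    }

    public static void main(String[] args) {
        ClientSetupSketch sketch = new ClientSetupSketch();
        try {
            sketch.onScheduled(() -> {
                throw new IllegalStateException("no credentials");
            });
        } catch (ProcessException e) {
            System.out.println(e.getMessage() + " Cause: " + e.getCause().getMessage());
        }
    }
}
```

Keeping the original exception as the cause preserves the underlying failure for whoever reads the log or bulletin.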

Contributor Author

Thx, fixed.


abstract protected Message fromJson(String json) throws IOException;

abstract protected OperationFuture startOperation(ProcessSession session, FlowFile flowFile) throws InvalidProtocolBufferException;
Contributor

There are multiple issues here. Each one is minor on its own, but overall an improvement is warranted in my opinion.

First the issues:

  1. The startOperation implementations contain duplicated logic. (Not much, but still.)
  2. AbstractGcpVisionProcessor has a readFlowFile method that it doesn't use and that overall should not be the concern of that class. We could move the method into AbstractStartGcpVisionOperation, since only its subclasses use it, but then we would basically be using the parent as a utility repository.

My recommended improvement is to remove readFlowFile entirely and implement startOperation in AbstractStartGcpVisionOperation itself, declaring abstract methods only for the logic that absolutely requires it. Here's what I came up with:

Suggested change
- abstract protected OperationFuture startOperation(ProcessSession session, FlowFile flowFile) throws InvalidProtocolBufferException;
+ public abstract class AbstractStartGcpVisionOperation<B extends com.google.protobuf.GeneratedMessageV3.Builder<B>> extends AbstractGcpVisionProcessor {
+ ...
+     protected OperationFuture<?, ?> startOperation(ProcessSession session, FlowFile flowFile) throws InvalidProtocolBufferException {
+         B builder = newBuilder();
+         try (InputStream inputStream = session.read(flowFile)) {
+             JsonFormat.parser().ignoringUnknownFields().merge(new InputStreamReader(inputStream), builder);
+         } catch (final IOException e) {
+             throw new ProcessException("Read FlowFile Failed", e);
+         }
+         return startOperation(builder);
+     }
+     abstract B newBuilder();
+     abstract OperationFuture<?, ?> startOperation(B builder);

The two implementations would be fairly straightforward:

public class StartGcpVisionAnnotateFilesOperation extends AbstractStartGcpVisionOperation<AsyncBatchAnnotateFilesRequest.Builder> {

    @Override
    AsyncBatchAnnotateFilesRequest.Builder newBuilder() {
        return AsyncBatchAnnotateFilesRequest.newBuilder();
    }

    @Override
    OperationFuture<?, ?> startOperation(AsyncBatchAnnotateFilesRequest.Builder builder) {
        return getVisionClient().asyncBatchAnnotateFilesAsync(builder.build());
    }
}

public class StartGcpVisionAnnotateImagesOperation extends AbstractStartGcpVisionOperation<AsyncBatchAnnotateImagesRequest.Builder> {

    @Override
    AsyncBatchAnnotateImagesRequest.Builder newBuilder() {
        return AsyncBatchAnnotateImagesRequest.newBuilder();
    }

    @Override
    OperationFuture<?, ?> startOperation(AsyncBatchAnnotateImagesRequest.Builder builder) {
        return getVisionClient().asyncBatchAnnotateImagesAsync(builder.build());
    }
}

Contributor Author

Thank you, fixed based on your suggestion.

@@ -0,0 +1,105 @@
<!DOCTYPE html>
Contributor

There are multiple suggestions I'd like to make; here's what the document would look like based on those:

<!DOCTYPE html>
<html lang="en" xmlns="http://www.w3.org/1999/html">
<!--
      Licensed to the Apache Software Foundation (ASF) under one or more
      contributor license agreements.  See the NOTICE file distributed with
      this work for additional information regarding copyright ownership.
      The ASF licenses this file to You under the Apache License, Version 2.0
      (the "License"); you may not use this file except in compliance with
      the License.  You may obtain a copy of the License at
          http://www.apache.org/licenses/LICENSE-2.0
      Unless required by applicable law or agreed to in writing, software
      distributed under the License is distributed on an "AS IS" BASIS,
      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
      See the License for the specific language governing permissions and
      limitations under the License.
    -->

<head>
    <meta charset="utf-8"/>
    <title>Google Cloud Vision - Start Annotate Files Operation</title>
    <link rel="stylesheet" href="../../../../../css/component-usage.css" type="text/css"/>
</head>
<body>

<h1>Google Cloud Vision - Start Annotate Files Operation</h1>
<p>
    Prerequisites
<ul>
    <li>Make sure Vision API is enabled and the account you are using has the right to use it</li>
    <li>Make sure the input file(s) are available in a GCS bucket</li>
</ul>
</p>
<h3>Usage</h3>
<p>
    StartGcpVisionAnnotateFilesOperation is designed to trigger file annotation operations. This processor should be used together with the GetGcpVisionAnnotateFilesOperationStatus processor.
    Outgoing FlowFiles contain the raw response to the request returned by the Vision server. The response is in JSON format and contains the result and additional metadata as described in the Google Vision API Reference documents.
</p>

<h3>Payload</h3>
<p>
    The JSON Payload is a request in JSON format as documented in the <a href="https://cloud.google.com/vision/docs/reference/rest/v1/files/asyncBatchAnnotate" target="_blank">Google Vision REST API reference document</a>.
    The payload can be fed to the processor via the <code>JSON Payload</code> property or as FlowFile content. The property takes precedence over FlowFile content.
    Please make sure to delete the default value of the property if you want to use the FlowFile content as the payload.
    A JSON payload template example:
</p>
<code>
    <pre>
{
    "requests": [
        {
            "inputConfig": {
                "gcsSource": {
                    "uri": "gs://${gcs.bucket}/${filename}"
                },
                "mimeType": "application/pdf"
            },
            "features": [{
                    "type": "DOCUMENT_TEXT_DETECTION",
                    "maxResults": 4
                }],
            "outputConfig": {
                "gcsDestination": {
                    "uri": "gs://${gcs.bucket}/${filename}/"
                },
                "batchSize": 2
            }
        }]
}
    </pre>
</code>
<h3>Feature types</h3>
<ul>
    <li>TEXT_DETECTION: Optical character recognition (OCR) for an image; text recognition and conversion to machine-coded text. Identifies and extracts UTF-8 text in an image.</li>
    <li>DOCUMENT_TEXT_DETECTION: Optical character recognition (OCR) for a file (PDF/TIFF) or dense text image; dense text recognition and conversion to machine-coded text.</li>
</ul>
You can find more details at <a href="https://cloud.google.com/vision/docs/features-list" target="_blank">Google Vision Feature List</a>

<h3>Example: How to set up a simple Annotate Files Flow</h3>
<p>
    Prerequisites
</p>
<p>
    <ul>
    <li>Input files should be available in a GCS bucket</li>
    <li>This bucket must not contain anything else but the input files</li>
    </ul>
</p>
<p>Create the following flow</p>
<img src="vision-annotate-files.png" style="height: 50%; width: 50%"/>
<p>
Keep the default value of the JSON Payload property in StartGcpVisionAnnotateFilesOperation
</p>
<p>
Execution steps:
    <ul>
        <li>ListGCSBucket processor will return a list of files in the bucket at the first run.</li>
        <li>ListGCSBucket will return only new items at subsequent runs.</li>
        <li>StartGcpVisionAnnotateFilesOperation processor will trigger GCP Vision file annotation jobs based on the JSON payload.</li>
        <li>StartGcpVisionAnnotateFilesOperation processor will populate the <code>operationKey</code> flow file attribute.</li>
        <li>GetGcpVisionAnnotateFilesOperationStatus processor will periodically query status of the job.</li>
    </ul>
</p>
</body>
</html>
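
The precedence rule described in the Payload section can be sketched as follows. This is only an illustration of the documented behaviour: `resolvePayload` is a hypothetical helper, and treating a blank property value as "not set" is an assumption, not something taken from the PR:

```java
class PayloadResolutionSketch {
    // Returns the property value when it is set, otherwise falls back to the FlowFile content.
    static String resolvePayload(String jsonPayloadProperty, String flowFileContent) {
        if (jsonPayloadProperty != null && !jsonPayloadProperty.trim().isEmpty()) {
            return jsonPayloadProperty;
        }
        return flowFileContent;
    }

    public static void main(String[] args) {
        String fromFlowFile = "{\"requests\": []}";
        // Property cleared: the FlowFile content is used.
        System.out.println(resolvePayload("", fromFlowFile));
        // Property still set (for example to its default): the FlowFile content is ignored.
        System.out.println(resolvePayload("{\"requests\": [{}]}", fromFlowFile));
    }
}
```

This is why the documentation asks users to delete the default property value before relying on FlowFile content.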

Contributor Author

I've added your suggestions to the PR. Thank you for the remarks.

@@ -0,0 +1,107 @@
<!DOCTYPE html>
Contributor

There are multiple suggestions I'd like to make; here's what the document would look like based on those:

<!DOCTYPE html>
<html lang="en" xmlns="http://www.w3.org/1999/html">
<!--
      Licensed to the Apache Software Foundation (ASF) under one or more
      contributor license agreements.  See the NOTICE file distributed with
      this work for additional information regarding copyright ownership.
      The ASF licenses this file to You under the Apache License, Version 2.0
      (the "License"); you may not use this file except in compliance with
      the License.  You may obtain a copy of the License at
          http://www.apache.org/licenses/LICENSE-2.0
      Unless required by applicable law or agreed to in writing, software
      distributed under the License is distributed on an "AS IS" BASIS,
      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
      See the License for the specific language governing permissions and
      limitations under the License.
    -->

<head>
    <meta charset="utf-8"/>
    <title>Google Cloud Vision - Start Annotate Images Operation</title>
    <link rel="stylesheet" href="../../../../../css/component-usage.css" type="text/css"/>
</head>
<body>

<h1>Google Cloud Vision - Start Annotate Images Operation</h1>
<p>
    Prerequisites
<ul>
	<li>Make sure Vision API is enabled and the account you are using has the right to use it</li>
	<li>Make sure the input image(s) are available in a GCS bucket</li>
</ul>
</p>
<h3>Usage</h3>
<p>
StartGcpVisionAnnotateImagesOperation is designed to trigger image annotation operations. This processor should be used together with the GetGcpVisionAnnotateImagesOperationStatus processor.
Outgoing FlowFiles contain the raw response to the request returned by the Vision server. The response is in JSON format and contains the result and additional metadata as described in the Google Vision API Reference documents.
</p>
<h3>Payload</h3>
<p>
	The JSON Payload is a request in JSON format as documented in the <a href="https://cloud.google.com/vision/docs/reference/rest/v1/images/asyncBatchAnnotate" target="_blank">Google Vision REST API reference document</a>.
	The payload can be fed to the processor via the <code>JSON Payload</code> property or as FlowFile content. The property takes precedence over FlowFile content.
	Please make sure to delete the default value of the property if you want to use the FlowFile content as the payload.
    A JSON payload template example:
</p>

<code>
    <pre>
{
	"requests": [{
		"image": {
			"source": {
				"imageUri": "gs://${gcs.bucket}/${filename}"
			}
		},
		"features": [{
			"type": "DOCUMENT_TEXT_DETECTION",
			"maxResults": 4
		}]
	}],
	"outputConfig": {
		"gcsDestination": {
			"uri": "gs://${gcs.bucket}/${filename}/"
		},
		"batchSize": 2
	}
}
    </pre>
</code>
<h3>Feature types</h3>
<ul>
	<li>TEXT_DETECTION: Optical character recognition (OCR) for an image; text recognition and conversion to machine-coded text. Identifies and extracts UTF-8 text in an image.</li>
	<li>DOCUMENT_TEXT_DETECTION: Optical character recognition (OCR) for a file (PDF/TIFF) or dense text image; dense text recognition and conversion to machine-coded text.</li>
	<li>LANDMARK_DETECTION: Provides the name of the landmark, a confidence score and a bounding box in the image for the landmark.</li>
	<li>LOGO_DETECTION: Provides a textual description of the entity identified, a confidence score, and a bounding polygon for the logo in the file.</li>
	<li>LABEL_DETECTION: Provides generalized labels for an image.</li>
	<li>etc.</li>
</ul>
You can find more details at <a href="https://cloud.google.com/vision/docs/features-list" target="_blank">Google Vision Feature List</a>
<h3>Example: How to set up a simple Annotate Image Flow</h3>
<p>
	Prerequisites
</p>
<p>
<ul>
	<li>Input image files should be available in a GCS bucket</li>
    <li>This bucket must not contain anything else but the input image files</li>
</ul>
</p>
<p>Create the following flow</p>
<img src="vision-annotate-images.png" style="height: 50%; width: 50%"/>
<p>
	Keep the default value of the JSON Payload property in StartGcpVisionAnnotateImagesOperation
</p>
<p>
	Execution steps:
<ul>
	<li>ListGCSBucket processor will return a list of files in the bucket at the first run.</li>
	<li>ListGCSBucket will return only new items at subsequent runs.</li>
	<li>StartGcpVisionAnnotateImagesOperation processor will trigger GCP Vision image annotation jobs based on the JSON payload.</li>
	<li>StartGcpVisionAnnotateImagesOperation processor will populate the <code>operationKey</code> flow file attribute.</li>
	<li>GetGcpVisionAnnotateImagesOperationStatus processor will periodically query status of the job.</li>
</ul>
</p>
</body>
</html>

Contributor Author

I've added your suggestions to the PR. Thank you for the remarks.

KalmanJantner and others added 8 commits January 18, 2023 17:24
…java/org/apache/nifi/processors/gcp/vision/StartGcpVisionAnnotateFilesOperation.java

Co-authored-by: tpalfy <53442425+tpalfy@users.noreply.github.com>
…java/org/apache/nifi/processors/gcp/vision/StartGcpVisionAnnotateImagesOperation.java

Co-authored-by: tpalfy <53442425+tpalfy@users.noreply.github.com>
…java/org/apache/nifi/processors/gcp/vision/GetGcpVisionAnnotateFilesOperationStatus.java

Co-authored-by: tpalfy <53442425+tpalfy@users.noreply.github.com>
…java/org/apache/nifi/processors/gcp/vision/GetGcpVisionAnnotateImagesOperationStatus.java

Co-authored-by: tpalfy <53442425+tpalfy@users.noreply.github.com>
…java/org/apache/nifi/processors/gcp/vision/AbstractGetGcpVisionAnnotateOperationStatus.java

Co-authored-by: tpalfy <53442425+tpalfy@users.noreply.github.com>
…resources/docs/org.apache.nifi.processors.gcp.vision.GetGcpVisionAnnotateFilesOperationStatus/additionalDetails.html

Co-authored-by: tpalfy <53442425+tpalfy@users.noreply.github.com>
…resources/docs/org.apache.nifi.processors.gcp.vision.GetGcpVisionAnnotateImagesOperationStatus/additionalDetails.html

Co-authored-by: tpalfy <53442425+tpalfy@users.noreply.github.com>
@asfgit asfgit closed this in 67925b1 Jan 19, 2023
@tpalfy
Contributor

tpalfy commented Jan 19, 2023

LGTM
Thank you for your work, @KalmanJantner!
Merged into main.
