NIFI-12616: Added some Use Case docs for Python processors and update… #8253

Closed
wants to merge 4 commits into from
2 changes: 2 additions & 0 deletions .github/workflows/ci-workflow.yml
@@ -170,6 +170,7 @@ jobs:
${{ env.MAVEN_VERIFY_COMMAND }}
${{ env.MAVEN_BUILD_PROFILES }}
-P report-code-coverage
-P python-unit-tests
${{ env.MAVEN_PROJECTS }}
- name: Codecov
uses: codecov/codecov-action@v3
@@ -238,6 +239,7 @@ jobs:
${{ env.MAVEN_COMMAND }}
${{ env.MAVEN_VERIFY_COMMAND }}
${{ env.MAVEN_BUILD_PROFILES }}
-P python-unit-tests
${{ env.MAVEN_PROJECTS }}
- name: Upload Test Reports
uses: actions/upload-artifact@v3
1 change: 1 addition & 0 deletions .gitignore
@@ -18,3 +18,4 @@ nb-configuration.xml
.vscode/
.java-version
/nifi-nar-bundles/nifi-py4j-bundle/nifi-python-extension-api/src/main/python/dist/
__pycache__
@@ -0,0 +1,67 @@
/*
* Licensed to the Apache Software Foundation (ASF) under one or more
* contributor license agreements. See the NOTICE file distributed with
* this work for additional information regarding copyright ownership.
* The ASF licenses this file to You under the Apache License, Version 2.0
* (the "License"); you may not use this file except in compliance with
* the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

package org.apache.nifi.c2.protocol.component.api;

import io.swagger.v3.oas.annotations.media.Schema;

import java.io.Serializable;
import java.util.List;

public class MultiProcessorUseCase implements Serializable {
private String description;
private String notes;
private List<String> keywords;
private List<ProcessorConfiguration> configurations;

@Schema(description="A description of the use case")
public String getDescription() {
return description;
}

public void setDescription(final String description) {
this.description = description;
}

@Schema(description="Any pertinent notes about the use case")
public String getNotes() {
return notes;
}

public void setNotes(final String notes) {
this.notes = notes;
}

@Schema(description="Keywords that pertain to the use case")
public List<String> getKeywords() {
return keywords;
}

public void setKeywords(final List<String> keywords) {
this.keywords = keywords;
}

@Schema(description="A description of how to configure the Processor to perform the task described in the use case")
public List<ProcessorConfiguration> getConfigurations() {
return configurations;
}

public void setConfigurations(final List<ProcessorConfiguration> configurations) {
this.configurations = configurations;
}

}
@@ -0,0 +1,46 @@
/*
* Licensed to the Apache Software Foundation (ASF) under one or more
* contributor license agreements. See the NOTICE file distributed with
* this work for additional information regarding copyright ownership.
* The ASF licenses this file to You under the Apache License, Version 2.0
* (the "License"); you may not use this file except in compliance with
* the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

package org.apache.nifi.c2.protocol.component.api;


import io.swagger.v3.oas.annotations.media.Schema;

import java.io.Serializable;

public class ProcessorConfiguration implements Serializable {
private String processorClassName;
private String configuration;

@Schema(description="The fully qualified classname of the Processor that should be used to accomplish the use case")
public String getProcessorClassName() {
return processorClassName;
}

public void setProcessorClassName(final String processorClassName) {
this.processorClassName = processorClassName;
}

@Schema(description="A description of how the Processor should be configured in order to accomplish the use case")
public String getConfiguration() {
return configuration;
}

public void setConfiguration(final String configuration) {
this.configuration = configuration;
}
}
@@ -51,7 +51,11 @@ public class ProcessorDefinition extends ConfigurableExtensionDefinition {
private List<Attribute> readsAttributes;
private List<Attribute> writesAttributes;

- @Schema(description = "Any input requirements this processor has.")
+ private List<UseCase> useCases;
+ private List<MultiProcessorUseCase> multiProcessorUseCases;
+
+
+ @Schema(description="Any input requirements this processor has.")
public InputRequirement.Requirement getInputRequirement() {
return inputRequirement;
}
@@ -225,4 +229,22 @@ public List<Attribute> getWritesAttributes() {
public void setWritesAttributes(List<Attribute> writesAttributes) {
this.writesAttributes = writesAttributes;
}

@Schema(description="A list of use cases that have been documented for this Processor")
public List<UseCase> getUseCases() {
return useCases;
}

public void setUseCases(final List<UseCase> useCases) {
this.useCases = useCases;
}

@Schema(description="A list of use cases that have been documented that involve this Processor in conjunction with other Processors")
public List<MultiProcessorUseCase> getMultiProcessorUseCases() {
return multiProcessorUseCases;
}

public void setMultiProcessorUseCases(final List<MultiProcessorUseCase> multiProcessorUseCases) {
this.multiProcessorUseCases = multiProcessorUseCases;
}
}
@@ -0,0 +1,77 @@
/*
* Licensed to the Apache Software Foundation (ASF) under one or more
* contributor license agreements. See the NOTICE file distributed with
* this work for additional information regarding copyright ownership.
* The ASF licenses this file to You under the Apache License, Version 2.0
* (the "License"); you may not use this file except in compliance with
* the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

package org.apache.nifi.c2.protocol.component.api;

import io.swagger.v3.oas.annotations.media.Schema;
import org.apache.nifi.annotation.behavior.InputRequirement;

import java.io.Serializable;
import java.util.List;

public class UseCase implements Serializable {
private String description;
private String notes;
private List<String> keywords;
private String configuration;
private InputRequirement.Requirement inputRequirement;

@Schema(description="A description of the use case")
public String getDescription() {
return description;
}

public void setDescription(final String description) {
this.description = description;
}

@Schema(description="Any pertinent notes about the use case")
public String getNotes() {
return notes;
}

public void setNotes(final String notes) {
this.notes = notes;
}

@Schema(description="Keywords that pertain to the use case")
public List<String> getKeywords() {
return keywords;
}

public void setKeywords(final List<String> keywords) {
this.keywords = keywords;
}

@Schema(description="A description of how to configure the Processor to perform the task described in the use case")
public String getConfiguration() {
return configuration;
}

public void setConfiguration(final String configuration) {
this.configuration = configuration;
}

@Schema(description="Specifies whether an incoming FlowFile is expected for this use case")
public InputRequirement.Requirement getInputRequirement() {
return inputRequirement;
}

public void setInputRequirement(final InputRequirement.Requirement inputRequirement) {
this.inputRequirement = inputRequirement;
}
}
82 changes: 82 additions & 0 deletions nifi-docs/src/main/asciidoc/python-developer-guide.adoc
@@ -424,6 +424,88 @@ that there are no longer any invocations of the `transform` method running when



[[documenting_use_cases]]
== Documenting Use Cases

No matter how powerful a piece of software is, it has no value unless people are able to use it. To that end, documentation of Processors is
very important. While a description of the Processor should be provided in the `ProcessorDetails` class and each PropertyDescriptor is expected to have a description,
it is usually helpful to also call out specific use cases that can be performed by the Processor. This is particularly important for Processors that perform
more generalized transformations on objects, where a single Processor may be capable of performing multiple tasks, based on its configuration.

[[use_case_decorator]]
=== The `@use_case` Decorator

The `@use_case` decorator, defined in the `nifiapi.documentation` module, can facilitate this. The decorator takes four arguments:

- `description`: A simple description of the use case, one sentence or at most two. Generally, this should not include extraneous details
such as caveats; those can be provided using the `notes` argument. The description is required.
- `notes`: Most of the time, one or two sentences are sufficient to describe a use case, and those sentences belong in
the `description`. When the description alone is not sufficient, `notes` may be used to
explain further, for example by listing caveats. This is optional.
- `keywords`: A list of keywords that can be associated with the use case. This is optional.
- `configuration`: A description of how to configure the Processor for this particular use case. This may include explicit values to set for some properties,
and may include instructions on how to choose the appropriate value for other properties. The configuration is required.

A single Processor may have multiple `@use_case` decorators.
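
As a rough illustration of how stacked decorators can attach several use cases to one class, the sketch below uses a local stand-in named `use_case` rather than the real decorator from `nifiapi.documentation`, so that it runs without NiFi installed. The stand-in, the `_use_cases` attribute, the `ExampleProcessor` class, and its 'Mode' property are assumptions made for this example only; the actual module may store this information differently.

```python
# Hypothetical stand-in for nifiapi.documentation.use_case.
# It is NOT the real implementation; it only records the arguments it is given.
def use_case(description, configuration, notes=None, keywords=None):
    # description and configuration are required; notes and keywords are optional.
    if not description or not configuration:
        raise ValueError("description and configuration are required")

    def decorator(cls):
        # Copy the class's own list (if any) so a parent's list is never mutated.
        existing = list(cls.__dict__.get("_use_cases", []))
        # Decorators are applied bottom-up, so insert at the front to keep source order.
        existing.insert(0, {
            "description": description,
            "notes": notes,
            "keywords": keywords or [],
            "configuration": configuration,
        })
        cls._use_cases = existing
        return cls

    return decorator


@use_case(description="Convert the FlowFile content to upper case",
          keywords=["text", "uppercase"],
          configuration="Set the 'Mode' property to 'Upper'")
@use_case(description="Convert the FlowFile content to lower case",
          notes="Characters that have no lower-case form are left unchanged.",
          configuration="Set the 'Mode' property to 'Lower'")
class ExampleProcessor:
    pass


for case in ExampleProcessor._use_cases:
    print(case["description"], "->", case["configuration"])
```

Each decorator contributes one entry, which is why a Processor with several distinct configurations can document each of them separately.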


[[multi_processor_use_case_decorator]]
=== The `@multi_processor_use_case` Decorator

When designing and creating Processors, it is important to keep in mind the idea of loose coupling. One Processor should not be dependent on another Processor
in order to perform its task. That being said, it is often advantageous to build Processors that are designed to work well together. For example, a Processor that
is able to perform a listing of files in a directory can provide an important capability in and of itself. Similarly, a Processor that is able to ingest the contents
of a specific file and make that file's contents the contents of a FlowFile is also an important capability in and of itself. But far more powerful than either of these
individual capabilities is the notion of being able to compose a flow that lists all files in a directory and then ingests each of those files as a FlowFile. This is
done by using a combination of the two. As such, it is important that the two Processors be able to work together in such a way that the output of the first is
easily understood as the input of the second.

In this case, it makes sense to document this composition of Processors as a use case so that users can understand how to compose such a pipeline. This is accomplished
by using the `@multi_processor_use_case` decorator. This decorator is very similar to the <<use_case_decorator>> but instead of a `configuration` element, it has a
`configurations` element, which is a `list` of `ProcessorConfiguration` objects, where each `ProcessorConfiguration` object has both a `processor_type`, which is the
name of the Processor, and a `configuration` that explains how to configure that particular Processor. The `configuration` element typically also explains how to connect
outbound Relationships.

For example, we might use these decorators as follows:
----
@use_case(description="Retrieve the contents of a given file on disk and create a FlowFile from it without modifying the file",
          keywords=["file", "filesystem"],
          configuration="""
              Set the 'Filename' property to the fully qualified path of the file to ingest
              Set the 'Completion Strategy' to 'None'
              """)
@use_case(description="Retrieve the contents of a given file on disk and create a FlowFile from it, deleting the local file upon success",
          keywords=["file", "filesystem"],
          configuration="""
              Set the 'Filename' property to the fully qualified path of the file to ingest
              Set the 'Completion Strategy' to 'Delete'
              """)
@multi_processor_use_case(
    description="Ingest all files from a landing directory on the filesystem and delete them after ingesting them.",
    keywords=["file", "filesystem", "landing directory"],
    configurations=[
        ProcessorConfiguration(
            processor_type="org.apache.nifi.processors.standard.ListFile",
            configuration="""
                Set 'Input Directory' to the directory that files should be ingested from
                Set 'Input Directory Location' to 'Local'
                """
        ),
        ProcessorConfiguration(
            processor_type="FetchFile",
            configuration="""
                Set the 'Filename' property to `${absolute.path}/${filename}`
                Set the 'Completion Strategy' to 'Delete'
                """
        )
    ])
class FetchFile(FlowFileTransform):
----

Note that in this case, we are able to specifically tell the user that the Filename property of FetchFile should be set to the value `${absolute.path}/${filename}`
because we know that the ListFile Processor will produce these attributes for us.
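
To make that hand-off concrete, the sketch below substitutes a set of FlowFile attributes into the `Filename` value. The `resolve` helper is a deliberately simplified stand-in for NiFi's Expression Language that only handles plain `${name}` references, and the attribute values are made up for illustration.

```python
import re


def resolve(expression, attributes):
    """Replace each plain ${name} reference with the matching attribute value."""
    def lookup(match):
        # In this sketch, an unknown attribute resolves to an empty string.
        return attributes.get(match.group(1), "")

    return re.sub(r"\$\{([^}]+)\}", lookup, expression)


# Hypothetical attributes of the kind a listing Processor might write.
attributes = {"absolute.path": "/data/landing", "filename": "orders.csv"}

print(resolve("${absolute.path}/${filename}", attributes))
```

Because the first Processor writes the attributes that the second one reads, the documented configuration can name them directly instead of asking the user to work out the path.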


[[requirements]]
== Requirements
