Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

NIFI-3402: Added etag support to InvokeHTTP #2150

Closed
wants to merge 3 commits into from
Closed

NIFI-3402: Added etag support to InvokeHTTP #2150

wants to merge 3 commits into from

Conversation

m-hogue
Copy link
Member

@m-hogue m-hogue commented Sep 12, 2017

Thank you for submitting a contribution to Apache NiFi.

In order to streamline the review of the contribution we ask you
to ensure the following steps have been taken:

For all changes:

  • Is there a JIRA ticket associated with this PR? Is it referenced
    in the commit message?

  • Does your PR title start with NIFI-XXXX where XXXX is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character.

  • Has your PR been rebased against the latest commit within the target branch (typically master)?

  • Is your initial contribution a single, squashed commit?

For code changes:

  • Have you ensured that the full suite of tests is executed via mvn -Pcontrib-check clean install at the root nifi folder?
  • Have you written or updated unit tests to verify your changes?
  • If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under ASF 2.0?
  • If applicable, have you updated the LICENSE file, including the main LICENSE file under nifi-assembly?
  • If applicable, have you updated the NOTICE file, including the main NOTICE file found under nifi-assembly?
  • If adding new Properties, have you added .displayName in addition to .name (programmatic access) for each of the new properties?

For documentation related changes:

  • Have you ensured that format looks appropriate for the output in which it is rendered?

Note:

Please ensure that once the PR is submitted, you check travis-ci for build issues and submit an update to your PR as soon as possible.

@m-hogue
Copy link
Member Author

m-hogue commented Sep 12, 2017

Since there's not a clean way to unit test this, the way i tested was to change unit tests and observe that there were etag cache hits until max-age=1 seconds had passed, when a network request would be issued again. For example:

2751 [pool-2-thread-1] DEBUG org.apache.nifi.processors.standard.InvokeHTTP - InvokeHTTP[id=6a72be7c-560d-4bdf-b58e-a59aaaec7478] OkHttp ETag cache metrics :: Request Count: 99 | Network Count: 2 | Hit Count: 97

Copy link
Contributor

@MikeThomsen MikeThomsen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just some minor fixes to follow some of the established best practices for configurability. Other than that, looks like a solid commit to me.

* The maximum size, in bytes, that the ETag cache should grow if it is enabled.
* The size here is chosen arbitrarily, as the documentation doesn't suggest a recommended value.
*/
private static final int MAX_ETAG_CACHE_SIZE_BYTES = 1024 * 1024 * 10; // 10MiB
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Making this configurable with a property descriptor would be a good idea. I think two property descriptors, one for the numeric portion and the other for the unit would be very helpful to users.

@@ -669,6 +694,14 @@ public void onTrigger(ProcessContext context, ProcessSession session) throws Pro
final int maxAttributeSize = context.getProperty(PROP_PUT_ATTRIBUTE_MAX_LENGTH).asInteger();
final ComponentLog logger = getLogger();

// log ETag cache metrics
final boolean eTagEnabled = context.getProperty(PROP_USE_ETAG).asBoolean();
if(eTagEnabled) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think && logger.isDebugEnabled() should be added to this line to prevent the extra work being done since this is only for debugging.

public static final PropertyDescriptor PROP_USE_ETAG = new PropertyDescriptor.Builder()
.name("use-etag")
.displayName("Use HTTP ETag")
.required(true)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs a description("") call to set a description statement on the property.

@m-hogue
Copy link
Member Author

m-hogue commented Oct 25, 2017

Thanks for the review, @MikeThomsen! I agree & I'll make all of these changes.

@m-hogue
Copy link
Member Author

m-hogue commented Oct 25, 2017

@MikeThomsen : I've added your suggestions. Please let me know if you'd like any more changes.

@MikeThomsen
Copy link
Contributor

Looks good to me. The only change I'd recommend is do a squashed commit so your patch is just one commit.

Not a core committer, but FWIW +1

* @return the directory in which the ETag cache should be written
*/
private static File getETagCacheDir() {
return Files.createTempDir();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this directory be deleted on @OnUnscheduled annotation call? Or is this done automatically?

Copy link
Member Author

@m-hogue m-hogue Nov 6, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As it's currently written, this should be done automatically via OS-controlled temporary folder cleanup [1]. I elected to do this to 1) avoid manual cleanup and 2) avoid introducing something error prone. As this is only a cache, its durability isn't important.

[1] https://google.github.io/guava/releases/19.0/api/docs/com/google/common/io/Files.html#createTempDir()

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's a good way to go. It should be platform agnostic and work as a sane default.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair enough. I just thought that OS-controlled temporary folder was being cleaned at server reboot by default and a NiFi server is probably not supposed to be rebooted very often.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pvillard31 Since there is a configured size limit on the field, it should stick to a reasonable size since there is a sane default of 10MB. If a user is foolish enough to set that to 10GB, not our problem.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@MikeThomsen - true but one temporary folder is created each time the processor is started. We could assume that the processor is going to be stopped/started a lot. But I agree, this is a edge case. I'm fine with the current implementation. If I really want to be meticulous I could suggest to add this information in the property description :)

@MikeThomsen
Copy link
Contributor

@pvillard31 @m-hogue Think we can close the loop on this one?

@pvillard31
Copy link
Contributor

Yes. +1, merging to master, thanks!

@asfgit asfgit closed this in 6471f66 Mar 13, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants