NIFI-631 Create ListFile and FetchFile processors #112

Closed
jskora wants to merge 1 commit into apache:master from jskora:NIFI-631

Conversation

@jskora
Contributor

@jskora jskora commented Oct 30, 2015

No description provided.

@jskora jskora closed this Oct 30, 2015
@jskora jskora reopened this Oct 30, 2015
@jskora jskora changed the title from "New ListFile processor." to "NIFI-631 Create ListFile and FetchFile processors" Oct 30, 2015
@trkurc
Contributor

trkurc commented Oct 30, 2015

Was this closed or submitted erroneously?

@jskora
Contributor Author

jskora commented Oct 30, 2015

Sorry about that. I thought I had to delete it and resubmit it to fix the title, but realized I could change the title and reopen it.

@mpetronic

Joe, thanks for getting this processor going. I need it. :) I've pulled this in and am giving it a try. I have some additional thoughts on functionality.

  1. Should it have a "Recurse sub-directories <yes|no>" option? The reason I mention this is that, in my setup, I have to scan files from an NFS share, which is not very fast, especially if you recurse many levels of subdirectories that you don't really need to look at. That's a special case, I know, but it is a valid use case, and we could eliminate some latency by not requiring a full recursive scan every time.
  2. Should it have an option to specify a seeded last-modified time? Say there is a directory full of files from days or weeks ago, but you only want to start pulling them in from, say, one day ago or some specific date/time, and not pick up all the previous files.
  3. If there are empty directories in the path you are scanning, they get listed in the "filename" attribute, just like an actual file would be. I think it would be nice to have another attribute that indicates whether the leaf node is a file or a directory, as that could more easily be used by downstream processors to decide how to act on that value.
  4. Should it expose each file's actual last-modified timestamp in the FlowFile Attribute Map Content?
  5. How would you reset the last-modified state if you wanted to rerun the processor from a point back in time, like when testing? I guess item 2 could do that.

I guess for all other types of filtering, like wildcards and such, the right 'NiFi' thing to do is to use a downstream "UpdateAttribute" processor to massage the list. Correct? Maybe this also applies to item 2 above, then?

Maybe the following should/would be part of the code review process, but I will note it here just in case. I'm new to this OSS process, but since I see this as a pull request, I assumed it was ready to go, yet some things seem to be missing:

  1. There is no description of the processor.
  2. The 'path' attribute description of "The path on the system from which to pull or push files" is misleading, IMO. Maybe use "The path on the system where this processor will scan files and directories to build the file list." instead.
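
For illustration only, suggestions 1 through 4 in the first numbered list above could be sketched with plain `java.nio`, independent of the NiFi processor API. This is not the code in this pull request. The class name `ListingSketch`, the parameters `recurse` and `minModifiedMillis`, and the attribute keys `entry.type` and `last.modified.millis` are all hypothetical names chosen for this sketch.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Stream;

// Hypothetical sketch, not the ListFile processor from this pull request.
class ListingSketch {

    /**
     * Lists entries under root as attribute maps.
     *
     * @param recurse           false limits the walk to direct children (suggestion 1)
     * @param minModifiedMillis entries modified before this epoch time are skipped (suggestion 2)
     */
    static List<Map<String, String>> list(Path root, boolean recurse, long minModifiedMillis)
            throws IOException {
        // Depth 1 visits only the root and its direct children.
        int maxDepth = recurse ? Integer.MAX_VALUE : 1;
        List<Map<String, String>> results = new ArrayList<>();
        try (Stream<Path> paths = Files.walk(root, maxDepth)) {
            for (Path p : (Iterable<Path>) paths::iterator) {
                if (p.equals(root)) {
                    continue; // the scan root itself is not a listing entry
                }
                BasicFileAttributes attrs = Files.readAttributes(p, BasicFileAttributes.class);
                long modified = attrs.lastModifiedTime().toMillis();
                if (modified < minModifiedMillis) {
                    // Skips this entry only; an old directory's children are still visited.
                    continue;
                }
                Map<String, String> entry = new LinkedHashMap<>();
                entry.put("filename", p.getFileName().toString());
                entry.put("path", p.getParent().toString());
                // Hypothetical attribute keys for suggestions 3 and 4.
                entry.put("entry.type", attrs.isDirectory() ? "directory" : "file");
                entry.put("last.modified.millis", Long.toString(modified));
                results.add(entry);
            }
        }
        return results;
    }

    public static void main(String[] args) throws IOException {
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        // Non-recursive listing with no time cutoff.
        for (Map<String, String> entry : list(root, false, 0L)) {
            System.out.println(entry);
        }
    }
}
```

In this sketch, a rerun from an earlier point in time (suggestion 5) would amount to calling `list` again with a smaller `minModifiedMillis`. How the real processor stores and resets that state is not covered here.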

@jskora jskora closed this Oct 31, 2015
JPercivall pushed a commit to JPercivall/nifi that referenced this pull request Apr 23, 2018
MINIFI-431 - Cleaning up L&N for changes to jersey dependencies.
MINIFI-432 - Updating copyrights to 2018.

Signed-off-by: Pierre Villard <pierre.villard.fr@gmail.com>

This closes apache#112.
iadamcsik pushed a commit to iadamcsik/nifi that referenced this pull request Oct 22, 2025
…ponents for a PG sync (apache#110)

  - Ensure VCI is set back to original value if an exception is encountered during pg sync
  - Updated StandardScheduledStateListener to use VCI from the passed in FD PG
- CDPDFX-7395: Resolving bug in bulk actions as applied to nested groups (apache#111)
- Fixes after 1.22.0 rebase
- CDPDFX-7445: Adding requestId to FlowChangeEvent (apache#112)
- CDPDFX-7517: Fixing NPE when publishing flow change events for bulk actions, publishing events on initially stopping components (apache#113)

3 participants