NIFI-2156: Add ListDatabaseTables processor #642

mattyb149 · 2016-07-13T16:06:02Z

No description provided.

JPercivall · 2016-07-17T22:41:42Z

...tandard-processors/src/main/java/org/apache/nifi/processors/standard/ListDatabaseTables.java

+        @WritesAttribute(attribute = "db.table.remarks", description = "Contains the name of a database table from the connection"),
+        @WritesAttribute(attribute = "db.table.count", description = "Contains the number of rows in the table")
+})
+@Stateful(scopes = {Scope.LOCAL}, description = "After performing a listing of tables, the timestamp of the query is stored. "


Shouldn't this be "cluster"? That way, when the primary node changes, the new primary will keep the same listing of tables.

Wasn't sure about that but makes sense to me :) will change to cluster.

JPercivall · 2016-07-18T14:36:42Z

Alright, after giving it some time and a cup of coffee I realize how off I was at first, lol. This processor reaches out to the DB asking for the tables. Then for each table that isn't already stored in state it creates a flowfile. If the processor is configured to give the count, it needs to send a SQL query asking for it. If that query fails it will remove the flowfile it created and continue onto the next table. If successful, the FQN of the table will then be added to state (after queuing it to transfer).

That realization makes my comment about data loss void (I was afraid the table would get stored in state after unsuccessfully getting the count).
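For illustration, the flow described above could be sketched roughly as follows. This is a minimal stand-alone sketch, not the processor's actual code: `ListingFlowSketch`, `listNewTables`, and the `countQuery` callback are hypothetical names, a plain `Map` stands in for NiFi state, and the returned list stands in for the flowfiles that would be transferred.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.ToLongFunction;

// Hypothetical sketch of the listing flow; names are illustrative only.
final class ListingFlowSketch {

    /**
     * Returns the tables that would be emitted in this run and records them in state.
     * A table whose count query throws is skipped and NOT recorded,
     * so it is picked up again on the next run.
     */
    static List<String> listNewTables(List<String> tablesFromDb,
                                      Map<String, String> state,
                                      boolean includeCount,
                                      ToLongFunction<String> countQuery,
                                      long now) {
        List<String> emitted = new ArrayList<>();
        for (String fqn : tablesFromDb) {
            if (state.containsKey(fqn)) {
                continue; // already listed in an earlier run
            }
            if (includeCount) {
                try {
                    countQuery.applyAsLong(fqn); // stands in for the SQL count query
                } catch (RuntimeException e) {
                    continue; // count failed: drop this table, leave state untouched
                }
            }
            emitted.add(fqn);
            state.put(fqn, String.valueOf(now)); // only successful tables reach state
        }
        return emitted;
    }
}
```

Because the state entry is written only after the count succeeds, a failed count cannot leave a table marked as listed.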

One new comment: would a user want to set an expiration for tables in state? That way they could get updates on the count of a table every X seconds/minutes. In its current form the processor will list a table once but never again. You're already storing the timestamp as the value, so it should be an easy addition.

mattyb149 · 2016-07-18T17:47:49Z

Yes, at one point I had a "Refresh Interval" property but I think that was in another branch, will restore it. Also, currently any change to properties will reset the state (since the tables fetched may have changed), I'm thinking of taking that part out. The Refresh Interval would cause all tables to be re-fetched, and/or the user could always manually clear state. What do you think?

JPercivall · 2016-07-18T22:26:19Z

...tandard-processors/src/main/java/org/apache/nifi/processors/standard/ListDatabaseTables.java

+            }
+            if (lastRefreshed > 0 && refreshInterval > 0 && currentTime >= (lastRefreshed + refreshInterval)) {
+                stateManager.clear(Scope.CLUSTER);
+                stateMapProperties.clear();


Why clear all the properties at once instead of having an expiration time that applies to each table? That way, if you have it set to 5 minutes, the processor will end up reporting the count of the table every 5 minutes (while it still exists). You already store the timestamp under the FQN.

Good point, will add.
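A per-table check along these lines might look like the sketch below. It assumes, as discussed above, that state maps each table's FQN to the timestamp (in milliseconds, stored as a string) of its last listing. `PerTableRefreshSketch` and `shouldRefresh` are hypothetical names, not taken from the PR.

```java
import java.util.Map;

// Hypothetical sketch of a per-table refresh decision; names are illustrative only.
final class PerTableRefreshSketch {

    /** Decides whether a single table should be listed again in this run. */
    static boolean shouldRefresh(Map<String, String> state,
                                 String fqn,
                                 long refreshIntervalMillis,
                                 long now) {
        String lastListed = state.get(fqn);
        if (lastListed == null) {
            return true; // never listed before
        }
        if (refreshIntervalMillis <= 0) {
            return false; // no refresh interval configured: list each table once
        }
        // Each table expires on its own schedule, so nothing else in state is cleared.
        return now >= Long.parseLong(lastListed) + refreshIntervalMillis;
    }
}
```

With a 5-minute interval, a table listed at time `t` becomes eligible again at `t + 300000` ms, independently of when other tables were listed.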

JPercivall · 2016-07-19T14:17:58Z

...tandard-processors/src/main/java/org/apache/nifi/processors/standard/ListDatabaseTables.java

+                        refreshTable = false;
+                    }
+                } catch (final NumberFormatException nfe) {
+                    getLogger().error("Failed to retrieve observed last table fetches from the State Manager. Will not perform "


This exception really shouldn't happen since we are setting the long ourselves, but if it does, the processor will fail entirely until the user clears state. In addition to returning and yielding, it should probably clear the offending state entry (and say in the error message that this is happening and what the ramifications are). This will at least give the processor a chance to continue working if it ever reaches this state.
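One way the suggested recovery could look is sketched below, under the assumption that state is a mutable copy of the FQN-to-timestamp map. `StateTimestampSketch` and `readLastListed` are hypothetical names, and `System.err` stands in for the processor's logger. A real processor would still need to persist the cleaned map and yield.

```java
import java.util.Map;
import java.util.OptionalLong;

// Hypothetical sketch of recovering from a corrupt state entry; names are illustrative only.
final class StateTimestampSketch {

    /**
     * Reads the stored timestamp for a table. If the value cannot be parsed,
     * the offending entry is removed so later runs are not blocked by it.
     */
    static OptionalLong readLastListed(Map<String, String> state, String fqn) {
        String raw = state.get(fqn);
        if (raw == null) {
            return OptionalLong.empty(); // table has never been listed
        }
        try {
            return OptionalLong.of(Long.parseLong(raw));
        } catch (NumberFormatException nfe) {
            state.remove(fqn); // clear only the bad entry, keep the rest of state
            System.err.println("Invalid timestamp '" + raw + "' stored for table " + fqn
                    + "; removing the entry, so the table will be listed again");
            return OptionalLong.empty();
        }
    }
}
```

The trade-off is that the affected table is listed again once, which seems preferable to failing until state is cleared by hand.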

JPercivall · 2016-07-19T16:17:43Z

...tandard-processors/src/main/java/org/apache/nifi/processors/standard/ListDatabaseTables.java

-            // Update the last time the processor finished successfully
-            stateManager.replace(stateMap, stateMapProperties, Scope.CLUSTER);
+            // Update the timestamps for listed tables
+            stateManager.setState(stateMapProperties, Scope.CLUSTER);


I would not change this to setState. It will overwrite anything that is in state. If someone does end up running this on every node of a cluster rather than on the primary node only, it will blow the existing state away without warning.

Instead just check if the prior map version was -1 and do a set instead of replace.
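The suggestion could be sketched as below. This is not NiFi's API: `VersionedStore` is a hypothetical in-memory stand-in for a versioned state store, and the sketch assumes that a replace cannot succeed when nothing has been stored yet (version -1), which is why the first write needs a plain set.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of "set on first write, replace afterwards"; not NiFi's API.
final class FirstTimeStateUpdateSketch {

    /** In-memory stand-in for a versioned state store; version is -1 until first write. */
    static final class VersionedStore {
        private long version = -1;
        private Map<String, String> values = new HashMap<>();

        long version() { return version; }

        Map<String, String> values() { return new HashMap<>(values); }

        /** Unconditional overwrite. */
        void set(Map<String, String> newValues) {
            values = new HashMap<>(newValues);
            version++;
        }

        /** Compare-and-set: succeeds only if state exists and the version is unchanged. */
        boolean replace(long expectedVersion, Map<String, String> newValues) {
            if (version == -1 || expectedVersion != version) {
                return false; // nothing stored yet, or someone else updated state first
            }
            set(newValues);
            return true;
        }
    }

    /** Uses set only for the very first write; otherwise keeps the safer replace. */
    static boolean update(VersionedStore store, long versionSeen, Map<String, String> newValues) {
        if (versionSeen == -1) {
            store.set(newValues);
            return true;
        }
        return store.replace(versionSeen, newValues);
    }
}
```

After the first write, a caller holding a stale version gets `false` back instead of silently overwriting another node's update.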

JPercivall · 2016-07-19T16:59:44Z

+1

Visually verified code and any comments were addressed. Ran a contrib check build and verified functionality in a standalone cluster hitting a MySQL DB. Thanks @mattyb149, I will squash and merge

JPercivall reviewed Jul 17, 2016

mattyb149 force-pushed the NIFI-2156 branch from b33817d to 44efe58 · July 18, 2016 18:16

JPercivall reviewed Jul 18, 2016

mattyb149 force-pushed the NIFI-2156 branch from 44efe58 to c2d0f73 · July 19, 2016 01:22

JPercivall reviewed Jul 19, 2016

mattyb149 added 4 commits July 19, 2016 11:55

NIFI-2156: Add ListDatabaseTables processor

421138b

NIFI-2156: Addressed review comments/discussion

d2764f5

NIFI-2156: Changed ListDatabaseTables for per-table state management

9f92311

NIFI-2156: Fixed logic bug in state update

33f8328

mattyb149 force-pushed the NIFI-2156 branch from c2d0f73 to 33f8328 · July 19, 2016 15:55

JPercivall reviewed Jul 19, 2016

NIFI-2156: Added logic for first-time state update

509a75b

asfgit closed this in f1ba240 Jul 19, 2016
