Flows do not persist pod restart #70

juldrixx · 2022-03-24T11:54:38Z

From nifikop created by andrew-musoke: Orange-OpenSource/nifikop#201

Type of question

Are you asking about community best practices, how to implement a specific feature, or about general context and help around nifikop?
General help with Nifikop.

Question

What did you do?
I deployed NiFi with 2 pods via NiFiKop. After creating a flow in the UI, I also exported the process groups to a nifi-registry. The cluster ran for days. I then deleted the cluster pods to test resilience. This is the CR I used:

apiVersion: nifi.orange.com/v1alpha1
kind: NifiCluster
metadata:
  name: simplenifi
  namespace: dataops
spec:
  service:
    headlessEnabled: true
  zkAddress: "zookeeper.dataops.svc.cluster.local.:2181"
  zkPath: "/simplenifi"
  clusterImage: "apache/nifi:1.12.1"
  oneNifiNodePerNode: false
  nodeConfigGroups:
    default_group:
      isNode: true
      imagePullPolicy: IfNotPresent
      storageConfigs:
        - mountPath: "/opt/nifi/nifi-current/logs"
          name: logs
          pvcSpec:
            accessModes:
              - ReadWriteOnce
            storageClassName: "gp2"
            resources:
              requests:
                storage: 10Gi
      serviceAccountName: "default"
      resourcesRequirements:
        limits:
          cpu: "0.5"
          memory: 2Gi
        requests:
          cpu: "0.5"
          memory: 2Gi
  clientType: "basic"
  nodes:
    - id: 1
      nodeConfigGroup: "default_group"
    - id: 2
      nodeConfigGroup: "default_group"
  propagateLabels: true
  nifiClusterTaskSpec:
    retryDurationMinutes: 10
  listenersConfig:
    internalListeners:
      - type: "http"
        name: "http"
        containerPort: 8080
      - type: "cluster"
        name: "cluster"
        containerPort: 6007
      - type: "s2s"
        name: "s2s"
        containerPort: 10000

What did you expect to see?
I expected the cluster to run properly and survive restarts since PVs are created. I expected to see the pipelines continue running after the pods started up.

What did you see instead? Under which circumstances?
When the pods came back up and were healthy, the UI had no flows or process groups. The registry configuration had also disappeared. I have to manually re-register the nifi-registry, re-import the process groups, add the secrets and restart the pipelines.

Why would this happen when Nifi has persistent volumes?
How can this behaviour be stopped?
How can I persist the flows, or at least automate re-importing and restarting the pipelines from nifi-registry?

Environment

nifikop version:
v0.7.5-release
Kubernetes version information:

Client Version: version.Info{Major:"1", Minor:"22", GitVersion:"v1.22.4", GitCommit:"b695d79d4f967c403a96986f1750a35eb75e75f1", GitTreeState:"clean", BuildDate:"2021-11-17T15:48:33Z", GoVersion:"go1.16.10", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"20+", GitVersion:"v1.20.11-eks-f17b81", GitCommit:"f17b810c9e5a82200d28b6210b458497ddfcf31b", GitTreeState:"clean", BuildDate:"2021-10-15T21:46:21Z", GoVersion:"go1.15.15", Compiler:"gc", Platform:"linux/amd64"}

NiFi version:

apache/nifi:1.12.1


juldrixx · 2022-03-24T11:54:39Z

I got this response from one of the alternate communication channels. But I cannot make sense of it. Could this be an issue?

It sounds like the flow.xml.gz is perhaps not saved on a persistent volume? The ideal behavior would be to have several different persistent volumes:

One for content repo

One for flowfile repo

One for provenance repo

One for logs

One for the conf/ directory and any additional configuration resources (this could easily be combined with the logs/ volume)
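As a rough sketch of what that suggestion could look like in the CR from the question, the storageConfigs list might be extended with one entry per directory. Only the logs entry comes from the original CR. Every other mount path and volume name below is an assumption about where the NiFi image and nifikop place these directories (in particular, that flow.xml.gz lives under /opt/nifi/data), so check them against the nifi.properties of your own deployment before relying on them.

```yaml
# Sketch only: fragment of spec.nodeConfigGroups from the NifiCluster CR above.
# Only the "logs" entry is taken from the original CR; the remaining mount
# paths and names are assumptions and must be verified for your image.
nodeConfigGroups:
  default_group:
    storageConfigs:
      - mountPath: "/opt/nifi/nifi-current/logs"
        name: logs
        # YAML anchor so the same PVC spec can be reused for the other volumes
        pvcSpec: &default_pvc
          accessModes:
            - ReadWriteOnce
          storageClassName: "gp2"
          resources:
            requests:
              storage: 10Gi
      - mountPath: "/opt/nifi/data"                   # assumed location of flow.xml.gz
        name: data
        pvcSpec: *default_pvc
      - mountPath: "/opt/nifi/flowfile_repository"    # assumed path
        name: flowfile-repository
        pvcSpec: *default_pvc
      - mountPath: "/opt/nifi/content_repository"     # assumed path
        name: content-repository
        pvcSpec: *default_pvc
      - mountPath: "/opt/nifi/provenance_repository"  # assumed path
        name: provenance-repository
        pvcSpec: *default_pvc
      - mountPath: "/opt/nifi/nifi-current/conf"      # assumed path
        name: conf
        pvcSpec: *default_pvc
```

Note that, per the later comments in this thread, persisting the file alone did not help on the nifikop version in question, because the pod startup script removed flow.xml.gz regardless.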

juldrixx · 2022-03-24T11:54:39Z

I recently opened a PR that adds an option to NifiClusterSpec which, when specified, prevents the flow.xml.gz file from being removed on pod startup.

In the current implementation, even though the flow.xml.gz file is persisted, it is removed every time the pod starts. https://github.com/Orange-OpenSource/nifikop/blob/master/pkg/resources/nifi/pod.go#L418

juldrixx · 2022-03-24T11:54:41Z

You should deploy a NiFiDataflow so that NiFiKOp re-deploys the versioned dataflow from NiFi Registry.

https://orange-opensource.github.io/nifikop/docs/5_references/5_nifi_dataflow

I could be wrong, but I suppose you could also make sure the flow.xml.gz is persisted on a persistent volume. That isn't necessary if you deploy a NiFiDataflow, though, since nifikop will just put the flow back once the pod comes up.
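A minimal sketch of that approach, assuming the field names documented on the reference page linked above: a NifiRegistryClient pointing at the registry, and a NifiDataflow that references it. The resource names, the registry URI, and the bucket, flow, and version values are hypothetical placeholders, not values from this issue. The field names are recalled from the nifikop v1alpha1 CRDs and should be verified against the reference for your nifikop version.

```yaml
# Sketch only: all names, IDs and the URI are placeholders to replace.
apiVersion: nifi.orange.com/v1alpha1
kind: NifiRegistryClient
metadata:
  name: registry-client          # hypothetical name
  namespace: dataops
spec:
  clusterRef:
    name: simplenifi             # the NifiCluster from the question
    namespace: dataops
  description: "Registry holding the versioned process groups"
  uri: "http://nifi-registry.dataops.svc.cluster.local:18080"  # hypothetical URI
---
apiVersion: nifi.orange.com/v1alpha1
kind: NifiDataflow
metadata:
  name: my-dataflow              # hypothetical name
  namespace: dataops
spec:
  bucketId: "<bucket-uuid>"      # placeholder: bucket ID in NiFi Registry
  flowId: "<flow-uuid>"          # placeholder: versioned flow ID
  flowVersion: 1                 # placeholder: version to deploy
  syncMode: always               # assumed value: keep re-applying the flow
  updateStrategy: drain          # assumed value: drain flowfiles before updating
  clusterRef:
    name: simplenifi
    namespace: dataops
  registryClientRef:
    name: registry-client
    namespace: dataops
```

With resources like these in place, the operator rather than the operator's user is responsible for putting the versioned flow back after a pod restart, which addresses the "automate the re-importing" part of the question.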

mh013370 · 2022-07-15T10:34:40Z

For production clusters that you've configured nifikop to deploy flows to, this isn't really a problem. However, I do think this would be a useful feature for the following reason:

If you use a single cluster deployment as a place to create flows and version control them, then you wouldn't be configuring flows to be deployed to it. Since nifikop wipes the flow.xml.gz on each pod restart, you have to manually re-import all of the flows you are working on to be deployed to other clusters.

I personally feel that the PR previously mentioned, raised by @genehynson, would be a useful feature and should be re-opened in this repo.

genehynson · 2022-07-15T17:09:27Z

After upgrading to NiFi 1.16 we are no longer running into this issue. I believe this is because NiFi migrated to a new file, flow.json.gz, which is not deleted by the NiFi pod startup script provided by nifikop.

Also with NiFi 1.16 we've been able to do clean, rolling upgrades by creating a PodDisruptionBudget and only allowing 1 NiFi node to be updated by k8s at a time. NiFi 1.16 introduced a new "flow negotiation" system that allows for each node in the NiFi cluster to have slightly different versions of the flow.json.gz file (like different processor versions, for example).

So even if nifikop does start deleting the flow.json.gz file I think we'll be fine because when a NiFi pod rolls it will get the contents for the flow.json.gz from the primary NiFi node that has not rolled yet (or has already rolled).

That being said, the PR mentioned is only useful if you're running a single NiFi node or an older version of NiFi.
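For reference, the PodDisruptionBudget described in this comment could look roughly like the following. This is a sketch, not the manifest actually used: the resource name is hypothetical, and the nifi_cr label is an assumption about how nifikop labels its pods, so check the labels with kubectl get pods --show-labels first.

```yaml
# Sketch only: allow at most one NiFi pod to be voluntarily disrupted at a time.
apiVersion: policy/v1            # use policy/v1beta1 on Kubernetes older than 1.21
kind: PodDisruptionBudget
metadata:
  name: simplenifi-pdb           # hypothetical name
  namespace: dataops
spec:
  maxUnavailable: 1
  selector:
    matchLabels:
      nifi_cr: simplenifi        # assumed pod label; verify on your pods
```

A PodDisruptionBudget only limits voluntary disruptions that go through the eviction API, such as node drains. It does not stop someone from deleting several pods directly, as was done in the original test.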

mh013370 · 2022-07-18T08:53:40Z

Good to know! Thanks for the follow up. I do think that NiFi is writing both the flow.xml.gz and the flow.json.gz temporarily as they transition to the json variant. But it's good to know that with 1.16+ and the changes around flow negotiation that it's a minor issue.

Maybe we can resolve this issue then?

genehynson · 2022-07-18T14:58:11Z

> I do think that NiFi is writing both the flow.xml.gz and the flow.json.gz temporarily as they transition to the json variant

Correct, but it only uses one of them: whichever you have defined in nifi.flow.configuration.file (flow.xml.gz is the default). To get the benefits of the new flow negotiation, you have to switch to the flow.json.gz file.

That being said, I'm also fine with resolving this issue.

erdrix · 2022-08-19T21:02:29Z

The flow.xml.gz is no longer removed at pod restart!

erdrix closed this as completed Aug 19, 2022
