Make repository-sizes dynamic based on pvc sizes #354

Closed
5 tasks done
sbernauer opened this issue Oct 3, 2022 · 8 comments

Comments

@sbernauer
Member

sbernauer commented Oct 3, 2022

While improving some demos, I noticed that my NiFi instance suddenly stopped working.

The reason is that the provenance repo ran out of space. Its PVC only has 5Gi, but the operator hard-codes nifi.provenance.repository.max.storage.size=10 GB
IMHO something like the following would make sense:

nifi.provenance.repository.max.storage.size=<set to pvc size. If it causes problems maybe pvc-size - 500MB or so>
nifi.provenance.repository.max.storage.time=<unlimited>

DoD

  • The following repositories don't configure any time-based retention, only size-based retention derived from the actual PVC size

    • flow archive
    • content archive
    • provenance
  • Won't: Additionally, users can override the time-based retention or the percentage-based retention (out of scope for this ticket)
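
The size-based part of the proposal could be sketched roughly as follows. This is only an illustration in Python, not the operator's actual code: `parse_quantity` and `provenance_size_property` are hypothetical helpers, only plain integer quantities such as `5Gi` are handled, and the 500 MiB default buffer is taken from the "pvc-size - 500MB or so" idea above. Emitting the value in bytes with a `B` unit assumes NiFi's data size parser accepts that unit. The time-based setting is left out because the `<unlimited>` placeholder is not resolved here.

```python
import re

# Suffixes of Kubernetes resource quantities handled by this sketch.
_SUFFIXES = {
    "": 1,
    "k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12,
    "Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40,
}


def parse_quantity(quantity):
    """Parse a simple integer quantity such as '5Gi' into bytes."""
    match = re.fullmatch(r"(\d+)([A-Za-z]*)", quantity.strip())
    if match is None or match.group(2) not in _SUFFIXES:
        raise ValueError(f"unsupported quantity: {quantity!r}")
    return int(match.group(1)) * _SUFFIXES[match.group(2)]


def provenance_size_property(pvc_capacity, buffer_bytes=500 * 2**20):
    """Render the size limit as PVC capacity minus a buffer (hypothetical helper)."""
    usable = parse_quantity(pvc_capacity) - buffer_bytes
    if usable <= 0:
        raise ValueError(f"PVC capacity {pvc_capacity!r} is not larger than the buffer")
    return f"nifi.provenance.repository.max.storage.size={usable} B"


print(provenance_size_property("5Gi"))
# prints: nifi.provenance.repository.max.storage.size=4844421120 B
```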

Details

This is the current config

    nodes:
      config:
        resources:
          cpu:
            max: "4"
            min: 500m
          memory:
            limit: 6Gi
          storage:
            contentRepo:
              capacity: 10Gi
            databaseRepo:
              capacity: 5Gi
            flowfileRepo:
              capacity: 5Gi
            provenanceRepo:
              capacity: 5Gi
            stateRepo:
              capacity: 5Gi

These are the PVC usages

kubectl df-pv | grep -P "(NAME|nifi)"
 PV NAME                                   PVC NAME                                   NAMESPACE  NODE NAME           POD NAME                      VOLUME MOUNT NAME      SIZE   USED   AVAILABLE  %USED  IUSED  IFREE   %IUSED 
 pvc-8641ca89-ed87-4c1d-b954-6ba7a16651b0  content-repository-nifi-node-default-0     default    default-5n5cbfoy4o  nifi-node-default-0           content-repository     9Gi    5Gi    4Gi        52.00  5588   649772  0.85   
 pvc-df2633f6-efac-45d0-9517-a6a4cd85011d  flowfile-repository-nifi-node-default-0    default    default-5n5cbfoy4o  nifi-node-default-0           flowfile-repository    4Gi    12Mi   4Gi        0.25   15     327665  0.00   
 pvc-21329c7f-87f6-4ace-a621-6bb05e6f21cf  provenance-repository-nifi-node-default-0  default    default-5n5cbfoy4o  nifi-node-default-0           provenance-repository  4Gi    4Gi    272Mi      94.51  916    326764  0.28   
 pvc-2d585950-23dd-4f72-bd09-30266b3182e8  state-repository-nifi-node-default-0       default    default-5n5cbfoy4o  nifi-node-default-0           state-repository       4Gi    156Ki  4Gi        0.00   45     327635  0.01   
 pvc-a9f14349-b88f-4450-b41b-0ab902a14881  database-repository-nifi-node-default-0    default    default-5n5cbfoy4o  nifi-node-default-0           database-repository    4Gi    168Ki  4Gi        0.00   15     327665  0.00

Config

cat nifi.properties 
nifi.administrative.yield.duration=30 sec
nifi.authorizer.configuration.file=/stackable/nifi/conf/authorizers.xml
nifi.cluster.flow.election.max.candidates=
nifi.cluster.flow.election.max.wait.time=1 mins
nifi.cluster.is.node=true
nifi.cluster.node.address=nifi-node-default-0.nifi-node-default.default.svc.cluster.local
nifi.cluster.node.protocol.port=9088
nifi.cluster.protocol.is.secure=true
nifi.components.status.repository.buffer.size=1440
nifi.components.status.repository.implementation=org.apache.nifi.controller.status.history.VolatileComponentStatusRepository
nifi.components.status.snapshot.frequency=1 min
nifi.content.claim.max.appendable.size=1 MB
nifi.content.repository.always.sync=false
nifi.content.repository.archive.enabled=true
nifi.content.repository.archive.max.retention.period=7 days
nifi.content.repository.archive.max.usage.percentage=50%
nifi.content.repository.directory.default=/stackable/data/content
nifi.content.repository.implementation=org.apache.nifi.controller.repository.FileSystemRepository
nifi.content.viewer.url=../nifi-content-viewer/
nifi.database.directory=/stackable/data/database
nifi.documentation.working.directory=./work/docs/components
nifi.flow.configuration.archive.dir=/stackable/nifi/conf/archive/
nifi.flow.configuration.archive.enabled=true
nifi.flow.configuration.archive.max.count=
nifi.flow.configuration.archive.max.storage=500 MB
nifi.flow.configuration.archive.max.time=30 days
nifi.flow.configuration.file=/stackable/data/database/flow.xml.gz
nifi.flowcontroller.autoResumeState=true
nifi.flowcontroller.graceful.shutdown.period=10 sec
nifi.flowfile.repository.always.sync=false
nifi.flowfile.repository.checkpoint.interval=20 secs
nifi.flowfile.repository.directory=/stackable/data/flowfile
nifi.flowfile.repository.implementation=org.apache.nifi.controller.repository.WriteAheadFlowFileRepository
nifi.flowfile.repository.retain.orphaned.flowfiles=true
nifi.flowfile.repository.wal.implementation=org.apache.nifi.wali.SequentialAccessWriteAheadLog
nifi.flowservice.writedelay.interval=500 ms
nifi.h2.url.append=;LOCK_TIMEOUT=25000;WRITE_DELAY=0;AUTO_SERVER=FALSE
nifi.login.identity.provider.configuration.file=/stackable/nifi/conf/login-identity-providers.xml
nifi.nar.library.autoload.directory=./extensions
nifi.nar.library.directory=./lib
nifi.nar.working.directory=./work/nar/
nifi.provenance.repository.always.sync=false
nifi.provenance.repository.buffer.size=100000
nifi.provenance.repository.compress.on.rollover=true
nifi.provenance.repository.concurrent.merge.threads=2
nifi.provenance.repository.directory.default=/stackable/data/provenance
nifi.provenance.repository.implementation=org.apache.nifi.provenance.WriteAheadProvenanceRepository
nifi.provenance.repository.index.shard.size=500 MB
nifi.provenance.repository.index.threads=2
nifi.provenance.repository.indexed.attributes=
nifi.provenance.repository.indexed.fields=EventType, FlowFileUUID, Filename, ProcessorID, Relationship
nifi.provenance.repository.max.attribute.length=65536
nifi.provenance.repository.max.storage.size=10 GB
nifi.provenance.repository.max.storage.time=30 days
nifi.provenance.repository.query.threads=2
nifi.provenance.repository.rollover.size=100 MB
nifi.provenance.repository.rollover.time=10 mins
nifi.queue.swap.threshold=20000
nifi.security.allow.anonymous.authentication=false
nifi.security.keystore=/stackable/keystore/keystore.p12
nifi.security.keystorePasswd=secret
nifi.security.keystoreType=PKCS12
nifi.security.truststore=/stackable/keystore/truststore.p12
nifi.security.truststorePasswd=secret
nifi.security.truststoreType=PKCS12
nifi.security.user.authorizer=authorizer
nifi.security.user.login.identity.provider=login-identity-provider
nifi.sensitive.props.algorithm=NIFI_ARGON2_AES_GCM_256
nifi.sensitive.props.key=jfk6AnWozMfkQlJ
nifi.sensitive.props.key.protected=
nifi.state.management.configuration.file=./conf/state-management.xml
nifi.state.management.embedded.zookeeper.start=false
nifi.state.management.provider.cluster=zk-provider
nifi.state.management.provider.local=local-provider
nifi.status.repository.questdb.persist.component.days=3
nifi.status.repository.questdb.persist.location=./status_repository
nifi.status.repository.questdb.persist.node.days=14
nifi.swap.manager.implementation=org.apache.nifi.controller.FileSystemSwapManager
nifi.templates.directory=./conf/templates
nifi.ui.autorefresh.interval=30 sec
nifi.ui.banner.text=
nifi.web.https.host=nifi-node-default-0.nifi-node-default.default.svc.cluster.local
nifi.web.https.network.interface.default=
nifi.web.https.port=8443
nifi.web.jetty.threads=200
nifi.web.jetty.working.directory=./work/jetty
nifi.web.max.header.size=16 KB
nifi.web.proxy.context.path=
nifi.web.proxy.host=85.215.194.27:30671,default-5n5cbfoy4o:30671,85.215.233.125:30671,default-6cljbrq2at:30671,85.215.194.120:30671,default-k4qezqd2uq:30671,85.215.160.5:30671,default-u6rswpvdas:30671,nifi.default.svc.cluster.local
nifi.zookeeper.connect.string=zookeeper-server-default-0.zookeeper-server-default.default.svc.cluster.local:2282
nifi.zookeeper.root.node=/znode-4fb22e61-13bc-46cc-8b24-bbdd865742dc
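
A mismatch like the one above (a 10 GB limit on a 5Gi volume) could be detected with a small check along these lines. This is a sketch under assumptions: `parse_data_size` and `oversized_limits` are hypothetical names, the mapping from property key to PVC capacity is supplied by hand, and NiFi data sizes are assumed to use 1024-based units.

```python
import re

# Assumption: NiFi data size units are 1024-based.
_UNITS = {"B": 1, "KB": 2**10, "MB": 2**20, "GB": 2**30, "TB": 2**40}


def parse_data_size(value):
    """Parse a value such as '10 GB' into bytes."""
    match = re.fullmatch(r"(\d+)\s*(B|KB|MB|GB|TB)", value.strip())
    if match is None:
        raise ValueError(f"unsupported data size: {value!r}")
    return int(match.group(1)) * _UNITS[match.group(2)]


def oversized_limits(properties_text, capacities):
    """Return (key, limit, capacity) for every limit larger than its volume.

    `capacities` maps a property key to the capacity in bytes of the PVC
    backing that repository.
    """
    findings = []
    for line in properties_text.splitlines():
        key, sep, value = line.partition("=")
        if sep and key in capacities:
            limit = parse_data_size(value)
            if limit > capacities[key]:
                findings.append((key, limit, capacities[key]))
    return findings


sample = (
    "nifi.provenance.repository.max.storage.size=10 GB\n"
    "nifi.provenance.repository.max.storage.time=30 days\n"
)
capacities = {"nifi.provenance.repository.max.storage.size": 5 * 2**30}
for key, limit, capacity in oversized_limits(sample, capacities):
    print(f"{key}: limit {limit} B exceeds PVC capacity {capacity} B")
```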
@nightkr
Member

nightkr commented Nov 1, 2022

I agree that focusing on size retention makes more sense for us since PVCs are isolated based on that anyway. I think for now it'd probably be ok to let time retention be configured through config overrides if it is actually desired by the user?

@nightkr nightkr moved this from Refinement: Waiting for to Refinement: In Progress in Stackable Engineering Nov 1, 2022
@nightkr nightkr self-assigned this Nov 1, 2022
@nightkr
Member

nightkr commented Nov 1, 2022

Cross-referenced which repository types would need to be configured this way, but I think otherwise the original ticket did a pretty good job.

@nightkr nightkr moved this from Refinement: In Progress to Refinement Acceptance: Waiting for in Stackable Engineering Nov 1, 2022
@lfrancke
Member

lfrancke commented Nov 1, 2022

I'm fine with only having size-based retention, but the description is still a bit too unclear for me.
This will require a CRD change, right?
I think what you're suggesting is that the PVC size is split up between all repositories automatically. Or should the user select a percentage of the size?

e.g. flow: 50%, archive: 30%, provenance: 20% ?

@lfrancke lfrancke self-assigned this Nov 1, 2022
@lfrancke lfrancke moved this from Refinement Acceptance: Waiting for to Refinement Acceptance: In Progress in Stackable Engineering Nov 1, 2022
@lfrancke lfrancke moved this from Refinement Acceptance: In Progress to Ready for Development in Stackable Engineering Nov 1, 2022
@lfrancke lfrancke moved this from Ready for Development to Refinement Acceptance: In Progress in Stackable Engineering Nov 1, 2022
@sbernauer
Member Author

Currently this is a bug as nifi.provenance.repository.max.storage.size=10 GB is hard-coded. The demo used a pvc size of 5Gi for provenance and bad things happened.
Each repository gets its own PVC.
No CRD change is needed IMHO. We only remove the hard-coded nifi.provenance.repository.max.storage.size=10 GB setting and put in the actual PVC size.

@sbernauer
Member Author

The

e.g. flow: 50%, archive: 30%, provenance: 20% ?

part is decided by the user, who needs to specify the PVC size for every repository individually.

@sbernauer sbernauer moved this from Refinement Acceptance: In Progress to Ready for Development in Stackable Engineering Nov 2, 2022
@lfrancke
Member

lfrancke commented Nov 2, 2022

My comments were based on a misunderstanding of the proposal.
I thought that the CRD snippet above is the proposed new one and didn't realize that this already exists.

Time based retention is out of scope for now.
Please make sure to leave a safety buffer of at least 100MB (which means we also need to validate that a storage PVC is at least 101MB in size :)
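
The buffer and the minimum-size validation could look roughly like this. It is a sketch, not the implementation that was merged: `usable_repository_bytes` is a hypothetical name, and "100MB" / "101MB" are read as decimal megabytes, which is an assumption.

```python
# Assumption: "100MB" and "101MB" are read as decimal megabytes.
SAFETY_BUFFER_BYTES = 100 * 10**6
MIN_PVC_BYTES = 101 * 10**6


def usable_repository_bytes(pvc_bytes):
    """Return the size limit for a repository: PVC capacity minus the safety buffer.

    Rejects volumes that are too small to leave any room after the buffer.
    """
    if pvc_bytes < MIN_PVC_BYTES:
        raise ValueError(
            f"storage PVC must be at least {MIN_PVC_BYTES} bytes, got {pvc_bytes}"
        )
    return pvc_bytes - SAFETY_BUFFER_BYTES


print(usable_repository_bytes(5 * 2**30))  # a 5Gi provenance PVC
# prints: 5268709120
```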

@nightkr nightkr moved this from Ready for Development to Development: In Progress in Stackable Engineering Nov 3, 2022
@nightkr nightkr moved this from Development: In Progress to Development: Waiting for Review in Stackable Engineering Nov 7, 2022
@maltesander maltesander moved this from Development: Waiting for Review to Development: In Review in Stackable Engineering Nov 8, 2022
@bors bors bot closed this as completed in cb5c5b4 Nov 8, 2022
@nightkr nightkr moved this from Development: In Review to Acceptance: Waiting for in Stackable Engineering Nov 8, 2022
@lfrancke lfrancke moved this from Acceptance: Waiting for to Acceptance: In Progress in Stackable Engineering Nov 11, 2022
@lfrancke lfrancke self-assigned this Nov 11, 2022
@lfrancke
Member

Almost none of the checkboxes are ticked, neither in the PR nor here. Can you make sure that everything is done?

@sbernauer
Member Author

Checked implementation and the boxes

@lfrancke lfrancke moved this from Acceptance: In Progress to Done in Stackable Engineering Nov 11, 2022