WIP: Journal - Heketi should keep track of transactions #661
Can one of the admins verify this patch?
@heketi/dev @heketi/maintainers @lpabon @humblec @obnoxxx @jarrpa @raghavendra-talur @ramkrsna Test: for now, the journal is written to /tmp/journal, which can be moved under /var/lib/heketi or anywhere else. The transaction state is determined from the count of these entries. Still to do: revert the DB entries and delete the stale LVs based on the input recorded along with START and END.
b78947e to f31c5e8
Journal support is a simple concept: heketi keeps track of each transaction and can either revert to a consistent state or continue from the point where the transaction stopped. In a volume create transaction, brick entries are added to the devices and to the volume entry after the bricks (LVs on the devices) are created. If the heketi process is forcefully terminated before all the bricks for the volume are created, DB parsing fails because the volume entries are incomplete. Someone has to clean up or resume the transaction. That is now the journal's responsibility.
Pair-Programmed-With: Raghavendra Talur <rtalur@redhat.com>
Signed-off-by: Mohamed Ashiq Liyazudeen <mliyazud@redhat.com>
Signed-off-by: Raghavendra Talur <rtalur@redhat.com>
f31c5e8 to 5b0469c
Interesting. I'll definitely look at this tomorrow.
Hi guys, I'm a little bit confused, because Heketi already does this through the use of defer functions. Take a look at the defer functions here. Do you mind explaining what creating a journal provides over defer functions? Ah, I get it: if Heketi crashes and needs to come back, this can help. I believe you may want to describe in a document how the Heketi journal would be replayed to get back to a consistent state before code is written.
@lpabon Hi, let me explain why I want this and where it will make a difference. We were thinking of writing the brick ID to the journal on the creation of each brick and building a structure from these entries. We would then call removeBrickFromDB and also destroy the bricks, which deletes the DB entries and LVs. An example journal for Heketi going down during VolumeCreate roughly looks like below. From the context available in the journal file, we create a brickEntry and a VolumeEntry to call the DB. Then we clean the journal. Hurray! Good to go.
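The recovery step described in this comment can be sketched in Go. This is only an illustration under stated assumptions: the line format (`START <txn>`, `BRICK <txn> <brickID>`, `END <txn>`) and the function name `pendingBricks` are hypothetical and do not come from the patch. The sketch finds transactions that have a START but no END and collects the brick IDs that would need cleanup.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// pendingBricks scans journal text and returns, per transaction ID, the
// brick IDs recorded for transactions that have a START line but no END line.
// The line format used here is an assumption made for this sketch.
func pendingBricks(journal string) map[string][]string {
	open := map[string][]string{}
	sc := bufio.NewScanner(strings.NewReader(journal))
	for sc.Scan() {
		f := strings.Fields(sc.Text())
		if len(f) < 2 {
			continue // skip blank or malformed lines
		}
		switch f[0] {
		case "START":
			open[f[1]] = []string{}
		case "BRICK":
			// Only record bricks for a transaction that is still open.
			if _, ok := open[f[1]]; ok && len(f) == 3 {
				open[f[1]] = append(open[f[1]], f[2])
			}
		case "END":
			// A completed transaction needs no recovery.
			delete(open, f[1])
		}
	}
	return open
}

func main() {
	journal := "START txn1\nBRICK txn1 brick-a\nEND txn1\n" +
		"START txn2\nBRICK txn2 brick-b\nBRICK txn2 brick-c\n"
	for txn, bricks := range pendingBricks(journal) {
		fmt.Println(txn, bricks)
	}
}
```

Each brick ID returned this way would then be handed to the cleanup path mentioned above (removing the DB entry and destroying the LV), after which the journal can be cleared.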
@heketi/dev @heketi/maintainers @heketi/admin See the above comment and share your feedback.
@MohamedAshiqrh I think this is really, really good and necessary. I would highly suggest saving the information (journal) in the db instead of a file. The issue is that a file will not be available when Heketi crashes if it was running as a container in Kubernetes. By saving the data in the DB, the journal can be replayed and the transaction cleaned up or continued. I would recommend creating a Journal structure with an array of steps, which can be saved as an entry. A new bucket can be created in the db for journals to save these.
@MohamedAshiqrh If you do not mind, please write up a design (markdown) document with the steps to save, restore, and determine how to repair. Also, document how to test.
@lpabon I thought of sticking to a file because the DB is in an incorrect state; a file may also help show exactly what went wrong in case the db gets corrupted. We mount the gluster volume on /var/lib/heketi; if we place the file there, we can persist it in the container world too. I also thought that if heketi goes stateless, the journal will not need many changes. I will definitely write the design doc.
Generally, good thinking! @MohamedAshiqrh I don't quite understand yet, though, how storing the heketi db in a kube secret (PR #671) has any effect on this change. Firstly, heketi runs in non-Kubernetes deployments as well. Secondly, afaict, that is just a different place to store the heketi db. It should not affect the decision whether to store the journal inside the db or outside... I need to look deeper, but generally, I tend to agree with @lpabon that the journal should be put into the DB. This is not a contradiction, since the journal would essentially be used to keep the DB in a consistent / roll-back-able state.
@obnoxxx About #671 (@lpabon, correct me if I'm wrong): the secret is mounted on /backup, and the working directory of heketi (/var/lib/heketi) is an empty directory (an empty dir from host to container, which is actually a bind mount of a folder that, AFAIK, is located in /var/lib/docker/container/id/something* on the host). This means the journal file will land in heketi's working directory, which is not persisted. The db is backed up to the secret again, but the db is not used from the secret itself; in other words, the secret is not a mount point where we can place the contents of heketi's working directory and have them persisted. The secret can hold only the DB. Hope I make some sense.