This repository has been archived by the owner on Jul 6, 2023. It is now read-only.

WIP: Journal - Heketi should keep track of transactions #661

Closed
wants to merge 1 commit into from

Conversation

MohamedAshiqrh
Member

Journal support is a simple concept: heketi keeps track of each
transaction and can either revert to a consistent state or continue
from the point where the transaction was stopped.

Take a volume create transaction as an example. Brick entries are
added to the devices and under the volume entries after the bricks
(LVs on the devices) are created. If the heketi process is forcefully
terminated before all the bricks for the volume are created, DB
parsing fails because the volume entries are incomplete.

Someone has to clean up or resume the transaction. That is now the
journal's responsibility.

Pair-Programmed-With: Raghavendra Talur rtalur@redhat.com

Signed-off-by: Mohamed Ashiq Liyazudeen mliyazud@redhat.com
Signed-off-by: Raghavendra Talur rtalur@redhat.com

@centos-ci
Collaborator

Can one of the admins verify this patch?

@MohamedAshiqrh
Member Author

MohamedAshiqrh commented Jan 31, 2017

@heketi/dev @heketi/maintainers
Hi,

@lpabon @humblec @obnoxxx @jarrpa @raghavendra-talur @ramkrsna
Please take a look at this and share your ideas.

Test:
Run a volume create command and forcefully kill heketi before the volume create finishes.
For now the journal throws a critical error. This will be replaced by journal handling functionality that reverts or resumes the transaction.

For now, the journal is written to /tmp/journal; it can be moved under /var/lib/heketi or anywhere else.
There are just two labels:
START
END

The transaction state is determined from the counts of these labels.

Still to do: revert the DB entries and delete the stale LVs, based on the input recorded along with START and END.
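
A minimal sketch of the counting idea described above, not the code from this PR. It assumes the journal is plain text with one record per line and the label as the first word; the function name `journalComplete` is hypothetical.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// journalComplete reports whether the journal text contains as many END
// lines as START lines. The line layout (label as first word) is an
// assumption made for this sketch.
func journalComplete(journal string) bool {
	starts, ends := 0, 0
	scanner := bufio.NewScanner(strings.NewReader(journal))
	for scanner.Scan() {
		fields := strings.Fields(scanner.Text())
		if len(fields) == 0 {
			continue // skip blank lines
		}
		switch fields[0] {
		case "START":
			starts++
		case "END":
			ends++
		}
	}
	return starts == ends
}

func main() {
	// A volume create that was killed after the first brick was created.
	interrupted := "START VolumeCreate\nSTART BrickCreate\nEND BrickCreate\n"
	fmt.Println(journalComplete(interrupted))
}
```

A mismatch between the two counts only says that something was left unfinished. Deciding what to revert needs the extra input recorded next to each label.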

@lpabon
Contributor

lpabon commented Feb 1, 2017

Interesting. I'll definitely look at this tomorrow.

@lpabon lpabon self-requested a review February 1, 2017 05:25
@lpabon
Contributor

lpabon commented Feb 2, 2017

Hi guys, I'm a little bit confused, because Heketi already does this through the use of defer functions. Take a look at the defer functions here. Do you mind explaining what a journal provides over defer functions?

Ah, I get it: if Heketi crashes and needs to come back, this can help. Before any code is written, I believe you may want to describe in a document how the Heketi journal would be replayed to get back to a consistent state.
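
To illustrate the distinction, here is a minimal sketch of the defer-based rollback pattern. It is not Heketi's actual code: `createBricks`, the `store` map, and the `failAt` parameter are all hypothetical. A deferred function runs when the enclosing function returns or panics, but it never runs if the process is killed, which is the gap a journal is meant to cover.

```go
package main

import (
	"errors"
	"fmt"
)

// createBricks pretends to create n bricks in store and fails at index
// failAt (pass a negative failAt for success). On failure, the deferred
// function removes every brick created so far.
func createBricks(n, failAt int, store map[int]bool) (err error) {
	var created []int
	defer func() {
		// Runs on every return path, but not if the process is killed.
		if err != nil {
			for _, id := range created {
				delete(store, id)
			}
		}
	}()
	for i := 0; i < n; i++ {
		if i == failAt {
			return errors.New("brick create failed")
		}
		store[i] = true
		created = append(created, i)
	}
	return nil
}

func main() {
	store := map[int]bool{}
	err := createBricks(3, 2, store)
	fmt.Println(err, len(store)) // the two bricks created before the failure were rolled back
}
```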

@MohamedAshiqrh
Member Author

MohamedAshiqrh commented Feb 3, 2017

@lpabon Hi,

Let me explain why I want this and where it will make a difference.
If Heketi goes down while a volume create is in progress, we are left with a DB that is not in a clean state and with stale LVs. IMO this is the issue we hit most today, so the solution below is based on this path. Let us know if it would conflict with other paths (node add/delete, device add/delete and cluster add/delete).

Our idea is to write the brick ID to the journal when each brick is created and to build a structure from these entries. We then call removeBrickFromDB and also destroy the bricks, which deletes the DB entries and the LVs.

A journal from Heketi going down during VolumeCreate looks roughly like this:
START VolumeCreate VolumeName size .....
START BrickCreate brickid 3247198744519sdf83
END BrickCreate
START BrickCreate brickid sdahfwiu233
END BrickCreate

From the context available in the journal file, we then create a brickEntry and a VolumeEntry to use against the DB.
Then we call removeBrickFromDB and DestroyBrick.

Then we clean the journal.
Now all the stale brick entries and LVs are deleted, which reverts us to a good state.

Hurray! Good To Go.
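
The recovery pass over the example journal above can be sketched as follows. This is a hedged illustration only: `bricksToRevert` is a hypothetical helper, and the `START <op> brickid <id>` layout is taken from the rough example rather than from a fixed format. The actual cleanup would then go through removeBrickFromDB and DestroyBrick, which are not modelled here.

```go
package main

import (
	"fmt"
	"strings"
)

// bricksToRevert scans the journal and returns the brick IDs recorded
// under an operation that started but never ended. It returns nil when
// every START has a matching END.
func bricksToRevert(journal string) []string {
	var open []string   // stack of operations that started but have not ended
	var bricks []string // brick IDs seen since the last finished VolumeCreate
	for _, line := range strings.Split(journal, "\n") {
		f := strings.Fields(line)
		if len(f) < 2 {
			continue
		}
		switch f[0] {
		case "START":
			open = append(open, f[1])
			if f[1] == "BrickCreate" && len(f) >= 4 && f[2] == "brickid" {
				bricks = append(bricks, f[3])
			}
		case "END":
			if n := len(open); n > 0 && open[n-1] == f[1] {
				open = open[:n-1]
			}
			if f[1] == "VolumeCreate" {
				bricks = nil // the volume finished, nothing to revert
			}
		}
	}
	if len(open) == 0 {
		return nil
	}
	return bricks
}

func main() {
	journal := `START VolumeCreate VolumeName size
START BrickCreate brickid 3247198744519sdf83
END BrickCreate
START BrickCreate brickid sdahfwiu233
END BrickCreate`
	for _, id := range bricksToRevert(journal) {
		fmt.Println("revert brick", id)
	}
}
```

Both bricks are reported even though each BrickCreate ended, because the enclosing VolumeCreate never did.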

@MohamedAshiqrh
Member Author

@heketi/dev @heketi/maintainers @heketi/admin Please see the above comment and share your feedback.

@humblec @obnoxxx @jarrpa @raghavendra-talur @ramkrsna

@lpabon
Contributor

lpabon commented Feb 5, 2017

@MohamedAshiqrh I think this is really, really good and necessary. I would highly suggest saving the information (the journal) in the DB instead of a file. The issue is that if Heketi runs as a container in Kubernetes, a file will not be available after Heketi crashes. If the data is saved in the DB, the journal can be replayed and then either cleaned up or continued.

I would recommend creating a Journal structure with an array of steps, which can be saved as a single entry. A new bucket can be created in the DB to hold these journal entries.
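
One possible shape for such a structure, as a sketch only. The type and field names (`JournalEntry`, `JournalStep`, `Done`) are assumptions, and JSON is used purely for illustration. The encoded bytes would be the value stored under the journal's ID in the new bucket; the DB call itself is left out.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// JournalStep is one recorded action. Field names are illustrative.
type JournalStep struct {
	Op      string `json:"op"`
	BrickID string `json:"brick_id,omitempty"`
	Done    bool   `json:"done"`
}

// JournalEntry groups the steps of one transaction so that it can be
// saved as a single value in a journal bucket.
type JournalEntry struct {
	ID    string        `json:"id"`
	Steps []JournalStep `json:"steps"`
}

// Encode serializes the entry for storage.
func (j *JournalEntry) Encode() ([]byte, error) {
	return json.Marshal(j)
}

// DecodeJournalEntry restores an entry from its stored bytes.
func DecodeJournalEntry(data []byte) (*JournalEntry, error) {
	var j JournalEntry
	if err := json.Unmarshal(data, &j); err != nil {
		return nil, err
	}
	return &j, nil
}

// Pending returns the steps that were not marked done, i.e. the ones a
// replay would have to revert or resume.
func (j *JournalEntry) Pending() []JournalStep {
	var out []JournalStep
	for _, s := range j.Steps {
		if !s.Done {
			out = append(out, s)
		}
	}
	return out
}

func main() {
	entry := &JournalEntry{ID: "volume-create-1", Steps: []JournalStep{
		{Op: "BrickCreate", BrickID: "brick-a", Done: true},
		{Op: "BrickCreate", BrickID: "brick-b", Done: false},
	}}
	data, err := entry.Encode()
	if err != nil {
		panic(err)
	}
	restored, err := DecodeJournalEntry(data)
	if err != nil {
		panic(err)
	}
	fmt.Println(len(restored.Pending()), restored.Pending()[0].BrickID)
}
```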

@lpabon
Contributor

lpabon commented Feb 5, 2017

@MohamedAshiqrh If you do not mind, please write up a design (markdown) document with the steps to save, restore, and determine how to repair. Also, document how to test.

@MohamedAshiqrh
Member Author

@lpabon I thought of sticking to a file because the DB is what ends up in an incorrect state; a separate file may also help show exactly what went wrong if the DB gets corrupted. We mount the gluster volume on /var/lib/heketi, so if we place the file there it persists in the container world too. I also thought that if heketi goes stateless, the journal would not need many changes.

I will definitely write the design doc.

@MohamedAshiqrh
Member Author

According to PR #671, the volume is no longer required, so I agree with @lpabon's point about having a separate bucket for the journal. I will proceed that way. 👍 Good job on #671, @lpabon.

@obnoxxx
Contributor

obnoxxx commented Feb 20, 2017

Generally, good thinking! @MohamedAshiqrh I don't quite understand yet though, how storing the heketi db in a kube secret (PR #671) has any effect on this change. Firstly, heketi runs in non-kubernetes deployments as well. Secondly afaict, that is just a different place to store the heketi db. It should not affect the decision whether to store the journal inside the db or outside...

I need to look deeper, but generally, I tend to agree with @lpabon that the journal should be put into the DB. This is not a contradiction, since the journal would essentially be used to keep the DB in a consistent, roll-back-able state.

@MohamedAshiqrh
Member Author

@obnoxxx About #671 (@lpabon, correct me if I'm wrong): the secret is mounted on /backup, and heketi's working directory (/var/lib/heketi) is an empty directory (an empty dir from host to container, which is actually a bind mount of a folder that AFAIK is located in /var/lib/docker/container/id/something* on the host). This means the journal file would land in heketi's working directory, which is not persisted. The DB is backed up to the secret, but it is not used from the secret itself. In other words, the secret is not a mount point where the contents of heketi's working directory can be placed and persisted; the secret can hold only the DB. I hope this makes sense.

@lpabon lpabon closed this Jun 13, 2017