temporary metadata file exists #700

Closed
mwaeckerlin opened this Issue May 17, 2018 · 32 comments

mwaeckerlin commented May 17, 2018

After a crash of the master, restarting it failed with "stale lockfile exists, consider running mfsmetarestore -a to fix problems with your datadir". After running sudo mfsmetarestore -a, restarting complains: "temporary metadata file exists, metadata directory is in dirty state".

What now?

There are two metadata loggers up and running.

mwaeckerlin commented May 17, 2018

Found in /var/log/syslog:

May 17 18:48:59 universum systemd[1]: Starting LizardFS master server daemon...
May 17 18:48:59 universum mfsmaster: set gid to 120
May 17 18:48:59 universum mfsmaster: set uid to 114
May 17 18:48:59 universum mfsmaster: changed working directory to: /var/lib/mfs
May 17 18:48:59 universum mfsmaster: lockfile /var/lib/lizardfs/.mfsmaster.lock created and locked
May 17 18:48:59 universum mfsmaster: sessions have been loaded
May 17 18:48:59 universum mfsmaster: initialized sessions from file /var/lib/lizardfs/sessions.mfs
May 17 18:48:59 universum mfsmaster: initialized exports from file /etc/mfs/mfsexports.cfg
May 17 18:48:59 universum mfsmaster: initialized topology from file /etc/mfs/mfstopology.cfg
May 17 18:48:59 universum mfsmaster: initialized goal definitions from file /etc/mfs/mfsgoals.cfg
May 17 18:48:59 universum systemd[1]: lizardfs-master.service: Control process exited, code=exited status=2
May 17 18:48:59 universum mfsmaster: stale lockfile exists, consider running `mfsmetarestore -a' to fix problems with your datadir.
May 17 18:48:59 universum systemd[1]: Failed to start LizardFS master server daemon.
May 17 18:48:59 universum systemd[1]: lizardfs-master.service: Unit entered failed state.
May 17 18:48:59 universum systemd[1]: lizardfs-master.service: Failed with result 'exit-code'.

Run:

sudo rm /var/lib/lizardfs/.mfsmaster.lock

Still get:

May 17 18:51:08 universum systemd[1]: Starting LizardFS master server daemon...
May 17 18:51:08 universum mfsmaster: set gid to 120
May 17 18:51:08 universum mfsmaster: set uid to 114
May 17 18:51:08 universum mfsmaster: changed working directory to: /var/lib/mfs
May 17 18:51:08 universum mfsmaster: lockfile /var/lib/lizardfs/.mfsmaster.lock created and locked
May 17 18:51:08 universum mfsmaster: sessions have been loaded
May 17 18:51:08 universum mfsmaster: initialized sessions from file /var/lib/lizardfs/sessions.mfs
May 17 18:51:08 universum mfsmaster: initialized exports from file /etc/mfs/mfsexports.cfg
May 17 18:51:08 universum mfsmaster: initialized topology from file /etc/mfs/mfstopology.cfg
May 17 18:51:08 universum mfsmaster: initialized goal definitions from file /etc/mfs/mfsgoals.cfg
May 17 18:51:08 universum mfsmaster: stale lockfile exists, consider running `mfsmetarestore -a' to fix problems with your datadir.
May 17 18:51:08 universum systemd[1]: lizardfs-master.service: Control process exited, code=exited status=2
May 17 18:51:08 universum systemd[1]: Failed to start LizardFS master server daemon.
May 17 18:51:08 universum systemd[1]: lizardfs-master.service: Unit entered failed state.
May 17 18:51:08 universum systemd[1]: lizardfs-master.service: Failed with result 'exit-code'.

Running again:

sudo mfsmetarestore -a

Again:

May 17 18:57:17 universum systemd[1]: Starting LizardFS master server daemon...
May 17 18:57:17 universum mfsmaster: set gid to 120
May 17 18:57:17 universum mfsmaster: set uid to 114
May 17 18:57:17 universum mfsmaster: changed working directory to: /var/lib/mfs
May 17 18:57:17 universum mfsmaster: lockfile /var/lib/lizardfs/.mfsmaster.lock created and locked
May 17 18:57:17 universum mfsmaster: sessions have been loaded
May 17 18:57:17 universum mfsmaster: initialized sessions from file /var/lib/lizardfs/sessions.mfs
May 17 18:57:17 universum mfsmaster: initialized exports from file /etc/mfs/mfsexports.cfg
May 17 18:57:17 universum mfsmaster: initialized topology from file /etc/mfs/mfstopology.cfg
May 17 18:57:17 universum mfsmaster: initialized goal definitions from file /etc/mfs/mfsgoals.cfg
May 17 18:57:17 universum mfsmaster: temporary metadata file exists, metadata directory is in dirty state
May 17 18:57:17 universum systemd[1]: lizardfs-master.service: Control process exited, code=exited status=2
May 17 18:57:17 universum systemd[1]: Failed to start LizardFS master server daemon.
May 17 18:57:17 universum systemd[1]: lizardfs-master.service: Unit entered failed state.
May 17 18:57:17 universum systemd[1]: lizardfs-master.service: Failed with result 'exit-code'.

…?

Need help!

guestisp commented May 17, 2018

Ensure that the master process isn't restarted automatically by systemd before deleting the lockfile.
Stop it and keep it stopped, then clear the lockfile, run mfsmetarestore and start the master again.
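
In practice, a minimal sequence along those lines (the unit name lizardfs-master and the data directory /var/lib/lizardfs are taken from the logs above) would be:

# stop the master and keep systemd from restarting it in the background
sudo systemctl stop lizardfs-master

# remove the stale lock left over from the crash
sudo rm /var/lib/lizardfs/.mfsmaster.lock

# rebuild metadata.mfs from the last saved metadata plus the changelogs
sudo mfsmetarestore -a

# start the master again
sudo systemctl start lizardfs-master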

guestisp commented May 17, 2018

Don't you have any shadow running? Because if you are restoring from dump files, any changes done after the dump are lost.

mwaeckerlin commented May 17, 2018

@guestisp

Ensure that the master process isn't restarted automatically by systemd before deleting the lockfile.
Stop it and keep it stopped, then clear the lockfile, run mfsmetarestore and start the master again.

That's exactly what I did, several times, see above. With no success.

Don't you have any shadow running?

Two metaloggers, no shadow.

mwaeckerlin commented May 17, 2018

How can I fix this:

mfsmaster: temporary metadata file exists, metadata directory is in dirty state

Need Help!?!?!?

mwaeckerlin commented May 17, 2018

Can I start restoration using the metaloggers?

mwaeckerlin commented May 17, 2018

The messages toggle somehow between:

temporary metadata file exists, metadata directory is in dirty state

and

stale lockfile exists, consider running `mfsmetarestore -a' to fix problems with your datadir.

…?

mwaeckerlin commented May 17, 2018

sudo mfsmetarestore -a does not show any information, nor did I find a log, but it returns with exit status 0. Is that enough? Can I get information on what it did?

Can I simply remove the files in the «metadata directory» after running sudo mfsmetarestore -a? If yes, which directory is this and which files can be removed? Can I somehow see whether the files have been applied?

What next?
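
For reference, a rough way to see what sudo mfsmetarestore -a produced, assuming the data directory /var/lib/lizardfs from the logs above, is to check its exit status and the timestamps in that directory:

# confirm the restore exits cleanly
sudo mfsmetarestore -a; echo "exit status: $?"

# after a successful restore, a freshly written metadata.mfs should be among
# the newest files next to metadata.mfs.back and the changelog.*.mfs files
ls -lt /var/lib/lizardfs/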

guestisp commented May 17, 2018

Two metaloggers, no shadow.

A recipe for disaster and data loss. Metaloggers are not synced in realtime; they are something like a scheduled "mysqldump" for MySQL. Everything that changed between the last dump and the crash is lost.
IIRC, the default metalogger configuration gathers metadata from the master only once every 24 hours!

mfsmaster: temporary metadata file exists, metadata directory is in dirty state

This is triggered by the existence of metadata.mfs.tmp. Did you remove it?
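
A quick check and cleanup along those lines, assuming the data directory /var/lib/lizardfs from the logs above, would be:

# the temporary file left over from the interrupted metadata dump
ls -l /var/lib/lizardfs/metadata.mfs.tmp

# with the master stopped and a fresh metadata.mfs produced by
# mfsmetarestore -a, the stale temporary file can be removed
sudo rm /var/lib/lizardfs/metadata.mfs.tmp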

mwaeckerlin commented May 17, 2018

@guestisp

This is triggered by the existence of metadata.mfs.tmp. Did you remove it?

Now yes, and now the server started. Thank you.

Now let's analyze the damage…

guestisp commented May 17, 2018

Please repost the "Info" table like you did in the other issue. I'm curious to see if some chunks are lost (as I suspect).

mwaeckerlin commented May 17, 2018

$ lizardfs-admin info universum 9421
LizardFS v3.12.0
Memory usage:   15GiB
Total space:    89TiB
Available space:        27TiB
Trash space:    219TiB
Trash files:    16333386
Reserved space: 0B
Reserved files: 0
FS objects:     50022424
Directories:    1700476
Files:  47088074
Chunks: 8319024
Chunk copies:   16637889
Regular copies (deprecated):    16637889

mwaeckerlin commented May 17, 2018

BTW, @guestisp, why is the Trash space so exorbitantly high? What does it mean and how do I clean it up?

BTW: Mounting metadata shows empty trash…

guestisp commented May 17, 2018

BTW, @guestisp, why is the Trash space so exorbitantly high? What does it mean and how do I clean it up?

I have no idea.

guestisp commented May 17, 2018

Did you lose anything?

mwaeckerlin commented May 17, 2018

Up to now, it does not seem so, AFAIK, but I am still analyzing…

guestisp commented May 17, 2018

Also look at the file contents; something could have been restored to an outdated version.

mwaeckerlin commented May 17, 2018

Hmm:

May 17 21:24:11 universum mfsmaster[8361]: chunk 000000000000f132 has not enough valid parts (1) consider repairing it manually

?!?
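
One way to follow up on such a chunk, sketched here with a hypothetical mount point and file path and assuming the standard LizardFS client tools, is to inspect the affected file before deciding between repairing it and restoring it from backup:

# show where the chunks of a suspect file are stored and how many copies exist
lizardfs fileinfo /mnt/lizardfs/path/to/file

# filerepair marks the best remaining copies as valid and fills truly missing
# data with zeros, so treat it as a last resort after checking backups
lizardfs filerepair /mnt/lizardfs/path/to/file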

mwaeckerlin commented May 17, 2018

It seems that there are only undergoal files and no defective ones:

$ lizardfs-admin list-defective-files universum 9421 | wc -l
1001
$ lizardfs-admin list-defective-files universum 9421 | grep -v undergoal | wc -l
0

What other tests should I run?

mwaeckerlin commented May 17, 2018

$ lizardfs-admin chunks-health universum 9421
Chunks availability state:
        Goal    Safe    Unsafe  Lost
        workers 8318938 -       76

Chunks replication state:
        Goal    0       1       2       3       4       5       6       7       8       9       10+
        workers 8318938 -       76      -       -       -       -       -       -       -       -

Chunks deletion state:
        Goal    0       1       2       3       4       5       6       7       8       9       10+
        workers 8319011 3       -       -       -       -       -       -       -       -       -

4Dolio commented May 18, 2018

Metaloggers do sync in real time. They record changelog files between each full sync. You should not lose anything.

mwaeckerlin commented May 18, 2018

@4Dolio, do I need to do something to restore from the metaloggers, or does LizardFS automatically detect its state and start the necessary steps?

So is it enough to have two metaloggers, or do I need a shadow master?

guestisp commented May 18, 2018

So is it enough to have two metaloggers, or do I need a shadow master?

With a shadow, you don't have to restore anything; just reload the shadow process after changing the personality in the config file, and everything will be up and running in almost zero time.
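
A sketch of that promotion, assuming the shadow's config lives at /etc/mfs/mfsmaster.cfg like the other config files in the logs above:

# on the shadow, switch the personality from shadow to master
sudo sed -i 's/^PERSONALITY *=.*/PERSONALITY = master/' /etc/mfs/mfsmaster.cfg

# reload the running process so it takes over as master
sudo mfsmaster reload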

With metaloggers you have to restore from disk; on a huge cluster, this could take hours.
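
Roughly, restoring from a metalogger means copying its metadata dump and changelogs onto the machine that will become the master and rebuilding metadata.mfs from them. A sketch, assuming the usual metalogger file names (metadata_ml.mfs.back, changelog_ml.*.mfs), the data directory /var/lib/lizardfs from the logs above, and a placeholder host name metalogger1:

# copy the metalogger's files into the (stopped) master's data directory
scp metalogger1:/var/lib/lizardfs/metadata_ml.mfs.back /var/lib/lizardfs/
scp "metalogger1:/var/lib/lizardfs/changelog_ml.*.mfs" /var/lib/lizardfs/

# rebuild metadata.mfs from the metalogger's dump plus its changelogs
mfsmetarestore -m /var/lib/lizardfs/metadata_ml.mfs.back \
               -o /var/lib/lizardfs/metadata.mfs \
               /var/lib/lizardfs/changelog_ml.*.mfs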

mwaeckerlin commented May 18, 2018

Since my server is up again, I can now close this ticket. Please feel free to add more information regarding metaloggers, restoring and suggested setups.

Thank you, @guestisp for saving my ass and @4Dolio for your clarification.

guestisp commented May 18, 2018

@mwaeckerlin I strongly suggest you add at least one shadow server.

mwaeckerlin commented May 18, 2018

@guestisp, I only have servers with much less memory and CPU resources.

I suppose a shadow master consumes much less memory and CPU resources than the active master, right?

mwaeckerlin commented May 18, 2018

BTW: Perhaps the root cause of the problem is that my trash is not cleaned up correctly? → #702

guestisp commented May 18, 2018

I suppose a shadow master consumes much less memory and CPU resources than the active master, right?

A shadow consumes exactly the same as the master, because it's a 1:1 copy.

mwaeckerlin commented May 18, 2018

@guestisp

A shadow consumes exactly the same as the master, because it's a 1:1 copy.

Even though it is passive and not actually serving?
In theory, it would not need to keep the metadata in memory…

guestisp commented May 18, 2018

I think you should read the official docs at least once.
Yes, a shadow is an exact copy and, as such, it keeps everything in RAM, ready to be promoted to master.
Without keeping everything in RAM, there would be no difference between a shadow and a metalogger.

4Dolio commented May 18, 2018

The mfsmetarestore command considers the newest metadata bin and the two most recent changelogs when you use the -a (automatic) option. It revalidates the bin, replays the older changelog (already in the bin), then replays the newest changelog to catch up after the bin. You get a new clean metadata bin with zero lost revisions.
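
The explicit (non-automatic) form of the same operation makes those steps visible. A sketch, assuming the default file names in the master's data directory /var/lib/lizardfs:

# equivalent of -a spelled out: take the newest saved metadata and replay the
# changelogs on top of it, writing a fresh metadata.mfs
mfsmetarestore -m /var/lib/lizardfs/metadata.mfs.back \
               -o /var/lib/lizardfs/metadata.mfs \
               /var/lib/lizardfs/changelog.*.mfs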

4Dolio commented May 18, 2018

Shadows are most useful when using the ha-managed personality. In that mode a shadow can be sent a command to promote itself, instantly becoming a master. But shadows need equal RAM, and you need to move the master's floating IP address. And you should quick-stop the old master first; quick stop skips the metadata bin write to disk when it exits, so the shadow does not need to wait for the master to write metadata to disk before the rest of the promotion process can proceed. I am not sure if you can do a running promotion without the ha-managed personality.
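
A rough sketch of that HA-managed promotion, as far as I understand the lizardfs-admin interface (the host names old-master and new-master are placeholders; 9421 is the admin port used elsewhere in this thread):

# quick-stop the old master so it skips the final metadata dump on exit
lizardfs-admin stop-master-without-saving-metadata old-master 9421

# tell the shadow to promote itself to master
lizardfs-admin promote-shadow new-master 9421

# then move the floating IP address over to the promoted master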
