Incorrect epoch_id after restoring snapshot entries #136

@iaoing

Description


Issue:

If NOVA is remounted after a snapshot has been created, the in-memory epoch id ends up equal to the last snapshot's epoch id. Creating a new snapshot then fails, because that epoch id already exists in the radix tree.
The system's epoch id should instead be one greater than the last snapshot's epoch id.

Reproduction:

mount -t NOVA -o init,dax /dev/pmem0 /mnt/pmem0
# create a snapshot
echo 1 > /proc/fs/NOVA/pmem0/create_snapshot
umount /mnt/pmem0
# remount NOVA
mount -t NOVA -o dax /dev/pmem0 /mnt/pmem0
# try to create a new snapshot after the recovery
echo 1 > /proc/fs/NOVA/pmem0/create_snapshot 
# check the error message
dmesg
# dmesg shows "nova: nova_insert_snapshot_info ERROR -17" (-EEXIST) because the radix tree insertion fails.

Reason:

During recovery, nova_sb_info->s_epoch_id is raised to nova_snapshot_info_entry->epoch_id whenever the entry's value is larger, as Line 870 shows.

if (epoch_id > sbi->s_epoch_id)
    sbi->s_epoch_id = epoch_id;

Once recovery is done, nova_sb_info->s_epoch_id is overwritten with nova_super_block->s_epoch_id, which is supposed to hold the correct epoch.

sbi->s_epoch_id = le64_to_cpu(super->s_epoch_id);

However, nova_super_block->s_epoch_id is itself stale because of how nova_create_snapshot() updates it. As the code below shows, Line 926 reads the current epoch id into a local variable and then increments nova_sb_info->s_epoch_id by one. Line 966 then stores the local, pre-increment value in nova_super_block. The persisted epoch id is therefore one smaller than the correct one, i.e. nova_sb_info->s_epoch_id.

/* Increase the epoch id, but use the old value as snapshot id */
epoch_id = sbi->s_epoch_id++;

sbi->nova_sb->s_wtime = cpu_to_le32(get_seconds());
sbi->nova_sb->s_epoch_id = cpu_to_le64(epoch_id);
nova_update_super_crc(sb);
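
The sequence described above can be traced with a small user-space model. This is only a sketch under stated assumptions: the struct, its field names, the fixed-size array standing in for the radix tree, and the arbitrary starting epoch are all hypothetical and are not NOVA code. Only the three quoted statements (the `>` comparison in recovery, the reload from the super block, and the post-increment in snapshot creation) are taken from the report.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_SNAPSHOTS 16

/* Hypothetical user-space model; names are illustrative, not NOVA's. */
struct epoch_model {
    uint64_t pm_epoch_id;                 /* models nova_super_block->s_epoch_id */
    uint64_t pm_snapshots[MAX_SNAPSHOTS]; /* models persisted snapshot entries   */
    size_t   nr_snapshots;
    uint64_t sbi_epoch_id;                /* models nova_sb_info->s_epoch_id     */
};

/* Models "mount -o init"; first_epoch is an arbitrary starting value. */
static void model_init(struct epoch_model *m, uint64_t first_epoch)
{
    assert(m != NULL);
    m->pm_epoch_id = first_epoch;
    m->sbi_epoch_id = first_epoch;
    m->nr_snapshots = 0;
}

/* Linear search standing in for the radix tree lookup. */
static bool snapshot_exists(const struct epoch_model *m, uint64_t epoch_id)
{
    for (size_t i = 0; i < m->nr_snapshots; i++)
        if (m->pm_snapshots[i] == epoch_id)
            return true;
    return false;
}

/* Models the quoted part of nova_create_snapshot(). */
static int model_create_snapshot(struct epoch_model *m)
{
    uint64_t epoch_id = m->sbi_epoch_id++;  /* old value becomes the snapshot id */

    if (snapshot_exists(m, epoch_id))
        return -EEXIST;                     /* stands in for the failed insertion */
    if (m->nr_snapshots == MAX_SNAPSHOTS)
        return -ENOSPC;

    m->pm_snapshots[m->nr_snapshots++] = epoch_id;
    m->pm_epoch_id = epoch_id;              /* persists the pre-increment value */
    return 0;
}

/* Models umount followed by mount without "init". */
static void model_remount(struct epoch_model *m)
{
    m->sbi_epoch_id = 0;
    for (size_t i = 0; i < m->nr_snapshots; i++)
        if (m->pm_snapshots[i] > m->sbi_epoch_id)
            m->sbi_epoch_id = m->pm_snapshots[i];
    m->sbi_epoch_id = m->pm_epoch_id;       /* reload from the modeled super block */
}
```

In this model, calling `model_init(&m, 5)`, `model_create_snapshot(&m)`, `model_remount(&m)` and then `model_create_snapshot(&m)` again makes the last call return `-EEXIST`, because the remount leaves `sbi_epoch_id` at 5, which is already used by the first snapshot.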

Potential Correction:

// Lines 869-870, nova_restore_snapshot_entry(), linux-nova/fs/nova/snapshot.c
// Ensure sbi's epoch id is always one greater than the epoch id of any restored snapshot.
if (epoch_id >= sbi->s_epoch_id)
    sbi->s_epoch_id = epoch_id + 1;

// Line 966, nova_create_snapshot(), linux-nova/fs/nova/snapshot.c
// Persist an epoch id one greater than that of the newly created snapshot.
// The increment goes inside cpu_to_le64() so the addition is done in CPU byte order.
sbi->nova_sb->s_wtime = cpu_to_le32(get_seconds());
sbi->nova_sb->s_epoch_id = cpu_to_le64(epoch_id + 1);
nova_update_super_crc(sb);
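
To sanity-check the proposed correction, here is a minimal user-space sketch with both changes applied. As before, the struct, its fields, the array used in place of the radix tree, and the starting epoch are hypothetical stand-ins, not NOVA code. The sketch adds 1 before storing the persisted value, which in kernel code corresponds to doing the addition on the CPU-order value before the little-endian conversion.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define FIXED_MAX_SNAPSHOTS 16

/* Hypothetical user-space model; names are illustrative, not NOVA's. */
struct fixed_model {
    uint64_t pm_epoch_id;                       /* models nova_super_block->s_epoch_id */
    uint64_t pm_snapshots[FIXED_MAX_SNAPSHOTS]; /* models persisted snapshot entries   */
    size_t   nr_snapshots;
    uint64_t sbi_epoch_id;                      /* models nova_sb_info->s_epoch_id     */
};

/* Models "mount -o init"; first_epoch is an arbitrary starting value. */
static void fixed_init(struct fixed_model *m, uint64_t first_epoch)
{
    assert(m != NULL);
    m->pm_epoch_id = first_epoch;
    m->sbi_epoch_id = first_epoch;
    m->nr_snapshots = 0;
}

/* Linear search standing in for the radix tree lookup. */
static bool fixed_snapshot_exists(const struct fixed_model *m, uint64_t epoch_id)
{
    for (size_t i = 0; i < m->nr_snapshots; i++)
        if (m->pm_snapshots[i] == epoch_id)
            return true;
    return false;
}

/* Snapshot creation with the proposed change: persist epoch_id + 1. */
static int fixed_create_snapshot(struct fixed_model *m)
{
    uint64_t epoch_id = m->sbi_epoch_id++;  /* old value becomes the snapshot id */

    if (fixed_snapshot_exists(m, epoch_id))
        return -EEXIST;
    if (m->nr_snapshots == FIXED_MAX_SNAPSHOTS)
        return -ENOSPC;

    m->pm_snapshots[m->nr_snapshots++] = epoch_id;
    m->pm_epoch_id = epoch_id + 1;          /* persisted value now matches sbi_epoch_id */
    return 0;
}

/* Remount with the proposed change: ">=" and "epoch_id + 1" during recovery. */
static void fixed_remount(struct fixed_model *m)
{
    m->sbi_epoch_id = 0;
    for (size_t i = 0; i < m->nr_snapshots; i++)
        if (m->pm_snapshots[i] >= m->sbi_epoch_id)
            m->sbi_epoch_id = m->pm_snapshots[i] + 1;
    m->sbi_epoch_id = m->pm_epoch_id;       /* reload from the modeled super block */
}
```

With a starting epoch of 5, creating a snapshot, remounting, and creating a second snapshot succeeds in this model: the remount restores `sbi_epoch_id` to 6, so the second snapshot takes id 6 and no collision occurs.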
