
Write Back: Will the cache device reflush uncommitted data to core device after a system crash? #253

Closed · beef9999 opened this issue Dec 26, 2019 · 11 comments · Fixed by Open-CAS/ocf#342
Labels: bug (Something isn't working)

@beef9999 commented Dec 26, 2019

Is there any possibility that we might lose data?

@beef9999 beef9999 changed the title Write Back: Will the cache device reflush data to core device after a system crash? Write Back: Will the cache device reflush uncommitted data to core device after a system crash? Jan 1, 2020
@mmichal10 (Contributor)

Thank you for your question.

It depends on your configuration. First of all, if your current config is not saved in /etc/opencas/opencas.conf, you will have to load the cache manually (use casadm -S -d <dev_path> --load). After your cache instance is loaded successfully, flushing will be triggered according to the cleaning policy you set; by default it should start after 120 seconds. Of course you can also start flushing manually with the casadm -F -i <cache_id> command, but if the amount of dirty data is large, it will take a long time.

If anything is unclear, please feel free to ask.
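For reference, persisting the mapping avoids the manual load step after a reboot. A minimal /etc/opencas/opencas.conf might look like the sketch below (the device paths are placeholders, and the exact column layout may differ between releases; check the template shipped with your Open CAS version):

```
[caches]
## Cache ID     Cache device                    Cache mode
1               /dev/disk/by-id/nvme-EXAMPLE    WB

[cores]
## Cache ID     Core ID     Core device
1               1           /dev/disk/by-id/ata-EXAMPLE
```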

@mmichal10 mmichal10 self-assigned this Jan 2, 2020
@beef9999 (Author) commented Jan 7, 2020

Hi @mmichal10,

I tried flushing manually (-F), but still encountered a small data loss afterwards.

Here is my test case:

metaFile = open(O_DIRECT)  // meta file is on another independent SSD
dataFile = open(O_DIRECT)  // data file is on a HDD core device, accelerated by a SSD cache device
for (int i = 1; ; i++) {
    write(dataFile, 4k_block_empty_data)
    fsync(dataFile)
    
    write(metaFile, 4k_block_count_i_string)  // record the count of total writes
    fsync(metaFile)
}

Run this program, then echo b > /proc/sysrq-trigger to emulate a power-failure reboot.

Restart the cache and load dirty data by casadm -S -d /dev/dfa -c wb -l. Note the cleaning policy was acp.

Flush the dirty data (casadm -F -i 1), then re-mount the core device.

The count recorded in metaFile, multiplied by 4k, should equal the size of dataFile.

However, dataFile is in fact slightly smaller (by less than 1MB), which is why I assume data loss.

@mmichal10 (Contributor)

Could you please tell us which version of open-cas-linux you are using? Is it 19.9, the current master, or an older release?

@beef9999 (Author) commented Jan 7, 2020

@mmichal10
version 19.9

@mmichal10 mmichal10 added the bug Something isn't working label Jan 7, 2020
@mmichal10 (Contributor)

I wrote the following code based on your pseudocode:

    #define _GNU_SOURCE

    #include <stdio.h>
    #include <sys/types.h>
    #include <sys/stat.h>
    #include <fcntl.h>
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>

    #define PAGESIZE 4096

    int main()
    {
        int i;
        int ret;
        void *len_buf;
        int metaFile, dataFile;
        int dmesg;

        posix_memalign(&len_buf, PAGESIZE, PAGESIZE);
        memset(len_buf, 0, PAGESIZE);

        metaFile = open("/root/meta_file", O_CREAT|O_TRUNC|O_WRONLY|O_DIRECT, S_IRWXU);
        if (metaFile < 0) {
            printf("Could not open MetaFile!\n");
            return -1;
        }

        dataFile = open("/mnt/cas/test_file", O_CREAT|O_TRUNC|O_WRONLY|O_DIRECT, S_IRWXU);
        if (dataFile < 0) {
            close(metaFile);
            printf("Could not open DataFile!\n");
            return -1;
        }

        dmesg = open("/dev/kmsg", O_WRONLY);
        if (dmesg < 0) {
            close(metaFile);
            close(dataFile);
            printf("Could not open kmsg!\n");
            return -1;
        }

        for (i = 1; ; i++) {
            sprintf(len_buf, "%d\n", i);
            write(dmesg, len_buf, strlen(len_buf));

            ret = write(dataFile, len_buf, PAGESIZE);
            if (ret != PAGESIZE)
                return -1;
            ret = fsync(dataFile);
            if (ret)
                return -1;

            ret = write(metaFile, len_buf, PAGESIZE);
            if (ret != PAGESIZE)
                return -1;
            ret = fsync(metaFile);
            if (ret)
                return -1;
        }

        return 0;
    }

I am initializing the CAS instance like below:
casadm -S -d /dev/nvme0n1p1 --force -c wb
casadm -A -d /dev/sda1 -i 1
mkfs.ext3 /dev/cas1-1
casadm -X -n cleaning -p acp -i 1
mount /dev/cas1-1 /mnt/cas

However, I am unable to reproduce your issue. Could you please take a closer look at my code and script? Are there any differences between your setup and mine?

@beef9999 (Author) commented Jan 9, 2020

@mmichal10 I can reproduce this issue even with your code.

#define _GNU_SOURCE

#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define PAGESIZE 4096

int main() {
	int i;
	int ret;
	void *len_buf;
	int metaFile, dataFile;
	int dmesg;

	posix_memalign(&len_buf, PAGESIZE, PAGESIZE);
	memset(len_buf, 0, PAGESIZE);

	metaFile = open("/data22/meta", O_CREAT | O_TRUNC | O_WRONLY | O_DIRECT, S_IRWXU);
	if (metaFile < 0) {
		printf("Could not open MetaFile!\n");
		return -1;
	}

	dataFile = open("/mnt/data", O_CREAT | O_TRUNC | O_WRONLY | O_DIRECT, S_IRWXU);
	if (dataFile < 0) {
		close(metaFile);
		printf("Could not open DataFile!\n");
		return -1;
	}

	for (i = 1;; i++) {
		sprintf(len_buf, "%d\n", i);

		ret = write(dataFile, len_buf, PAGESIZE);
		if (ret != PAGESIZE)
			return -1;
		ret = fsync(dataFile);
		if (ret)
			return -1;

		// Write count at offset 0
		ret = pwrite(metaFile, len_buf, PAGESIZE, 0);	
		if (ret != PAGESIZE)
			return -1;
		ret = fsync(metaFile);
		if (ret)
			return -1;
	}

	return 0;
}

My test environment:
OS: CentOS 7
SSD: NVMe
Test script:

casadm -S -f -d /dev/nvme1n1 -c wb -x 64
casadm -X -n cleaning -p acp -i 1
mkfs.ext4 /dev/sdk1
casadm -A -i 1 -d /dev/sdk
mount /dev/cas1-1p1 /mnt/

echo b > /proc/sysrq-trigger     # run the code for a while, then reboot

casadm -S -d /dev/nvme1n1 -l -c wb -x 64
casadm -F -i 1
mount /dev/cas1-1p1 /mnt/

echo $(( `cat /data22/meta` * 4096 - `ls -l /mnt/data | awk '{print $5}'` ))  # size gap

I also tried v19.3 and am still able to reproduce.

@mmichal10 (Contributor) commented Jan 9, 2020

@beef9999
What is the difference between the expected and actual size of the file? Is it constant?
Have you tried to reproduce this issue without a filesystem? Does your SSD support Power Loss Protection?

@beef9999 (Author) commented Jan 9, 2020

@mmichal10

  • It's a random number.
  • No; using a filesystem makes it easier to verify my scenario.
  • Yes, it does, according to our IT guys.

@mmichal10 (Contributor)

@beef9999
Thank you. I have a stable reproduction and am investigating this issue.

@mmichal10 (Contributor)

@beef9999
We are implementing a fix; I will notify you when it is ready.

@arutk (Contributor) commented Jan 23, 2020

Fix in OCF: Open-CAS/ocf#342
Issue introduced in OCF 19.3: Open-CAS/ocf#55
We will follow up with tests shortly after fixing this.

vzhereb9 pushed a commit to vzhereb9/open-cas-linux that referenced this issue Jun 14, 2021