
mkfs.ext4 infinitely wait when "Writing superblocks and filesystem accounting information" #4

Closed
zhitaoli-6 opened this issue Jun 21, 2018 · 12 comments


@zhitaoli-6
Contributor

Hi,

I am running FEMU in white-box mode. Creating a pblk target on top of the Open-Channel SSD emulated by FEMU works fine.
However, it waits indefinitely when I try to create an ext4 filesystem on top of pblk.

The situation is shown in the screenshot below; it just waits there indefinitely.
[screenshot: mkfs.ext4 hanging at "Writing superblocks and filesystem accounting information"]

Guest Configuration:
VM: Ubuntu18.04-Server
Kernel version: 4.16.0+

In fact, I also tried the 14.04-server image you provide, and it hangs there as well in my case. I don't know what is wrong. Could anyone give me some ideas?

The VM start script is as follows:

sudo x86_64-softmmu/qemu-system-x86_64 \
    -name "FEMU-whitebox-SSD" \
    -enable-kvm \
    -cpu host \
    -smp 4 \
    -m 4G \
    -device virtio-scsi-pci,id=scsi0 \
    -device scsi-hd,drive=hd0 \
    -drive file=$OSIMGF,if=none,aio=native,cache=none,format=qcow2,id=hd0 \
    -drive file=$NVMEIMGF0,if=none,aio=threads,format=raw,id=id0 \
    -device nvme,drive=id0,serial=serial0,id=nvme0,namespaces=1,lver=1,lmetasize=16,ll2pmode=0,nlbaf=5,lba_index=3,mdts=10,lnum_ch=16,lnum_lun=1,lnum_pln=1,lsec_size=4096,lsecs_per_pg=4,lpgs_per_blk=512,ldebug=0,femu_mode=0 \
    -net user,hostfwd=tcp::8080-:22 \
    -net nic,model=virtio \
    -nographic \
    -qmp unix:./qmp-sock,server,nowait
    #-object iothread,id=iothread0 \
    #-display none \
    #-nographic \
    #-monitor stdio \
    #-s -S \
    #
@zhitaoli-6 zhitaoli-6 changed the title from 'mkfs.ext4 infinite wait when "Writing superblocks and filesystem accounting information"' to 'mkfs.ext4 infinitely wait when "Writing superblocks and filesystem accounting information"' Jun 21, 2018
@huaicheng
Contributor

huaicheng commented Jun 23, 2018

lnum_pln=1

FEMU currently only emulates OCSSDs with two planes. Changing this value is not recommended; please stick to lnum_pln=2. It's better to only change "lnum_ch" and "lnum_lun". A future version of FEMU will remove this restriction.

Make sure the size of $NVMEIMGF0 is a power of 2, and whenever you change the emulated device size, remove the original $NVMEIMGF0 file; the run script will create a new one for you (see the sketch below).
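
For example, a minimal sketch of that cleanup step, reusing the qemu-img invocation already present in the run script ($NVMEIMGF0 and the 1G size are simply the values used in this thread):

    # remove the stale backing file so a fresh one gets created
    rm -f "$NVMEIMGF0"
    # recreate it with a power-of-2 size (1 GiB here) before starting the VM,
    # or simply rerun the run script, which creates the file automatically
    ./qemu-img create -f raw "$NVMEIMGF0" 1G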

If it still doesn't work, could you try the default configuration and list your modified configurations here, so I can do a further analysis?

@zhitaoli-6
Contributor Author

zhitaoli-6 commented Jun 23, 2018

Thanks for your reply.
I have tried the default configuration many times, and it always fails in the same way in my case.
I am curious why it fails for me while it works normally for you and for HongweiQin (the GitHub user who opened the third issue on FEMU).
Guest OS: u14s
Kernel version: 4.16.0
The start script is as follows:

sudo x86_64-softmmu/qemu-system-x86_64 \
    -name "FEMU-whitebox-SSD" \
    -enable-kvm \
    -cpu host \
    -smp 4 \
    -m 4G \
    -device virtio-scsi-pci,id=scsi0 \
    -device scsi-hd,drive=hd0 \
    -drive file=$OSIMGF,if=none,aio=native,cache=none,format=qcow2,id=hd0 \
    -drive file=$NVMEIMGF,if=none,aio=threads,format=raw,id=id0 \
    -device nvme,drive=id0,serial=serial0,id=nvme0,namespaces=1,lver=1,lmetasize=16,ll2pmode=0,nlbaf=5,lba_index=3,mdts=10,lnum_ch=2,lnum_lun=8,lnum_pln=2,lsec_size=4096,lsecs_per_pg=4,lpgs_per_blk=512,ldebug=0,femu_mode=0 \
    -net user,hostfwd=tcp::8080-:22 \
    -net nic,model=virtio \
    -nographic \
    -qmp unix:./qmp-sock,server,nowait
    #-object iothread,id=iothread0 \
    #-display none \
    #-nographic \
    #-monitor stdio \
    #-s -S \
    #

@huaicheng
Contributor

Could you provide the following information about your setup:

  • Your host memory size and the virtual NVMe disk size (make sure the virtual NVMe disk size is less than the amount of free memory on your host)
  • Inside your guest OS, does dmesg show any errors when you run mkfs.ext4?
  • Can you do simple read/write operations on the pblk target you created, e.g. echo 'aaa' > /dev/myssd; cat /dev/myssd in your case? If they don't work, what error is reported? (A sketch of these checks follows the list.)
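
A minimal sketch of those checks, run inside the guest (/dev/myssd is the pblk target name used in this thread):

    # look for new kernel messages around the mkfs.ext4 attempt
    dmesg | tail -n 20
    # simple write/read smoke test against the pblk target
    echo 'aaa' | sudo tee /dev/myssd
    sudo head -c 16 /dev/myssd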

@zhitaoli-6
Contributor Author

First:

OS: Ubuntu18.04
Memory: 16GB
Virtual NVMe disk size: 1GB

Second:

Only three new lines are shown by dmesg after creating the pblk target and creating ext4 on top of pblk:

[  117.472340] pblk init: L2P CRC: 48f6b687
[  117.494514] pblk(mydevice): luns:4, lines:4, secs:65536, buf entries:256
[  272.935110] random: crng init done

Third:

It waits there indefinitely on echo aaa > /dev/myssd, and no new lines are shown by dmesg.

@huaicheng
Contributor

huaicheng commented Jun 23, 2018

OK, I tried an Ubuntu 18.04 guest and indeed, the mkfs.ext4 process gets stuck.

My previous test with Ubuntu 14.04 was successful. Could you go back to Ubuntu 14.04 first?

@zhitaoli-6
Contributor Author

My host OS is Ubuntu 18.04, and the guest OS is the Ubuntu 14.04 image that you sent me some days ago. Maybe the host OS leads to this issue? What is your host OS?

@huaicheng
Contributor

huaicheng commented Jun 23, 2018

DISTRIB_DESCRIPTION="Ubuntu 14.04.5 LTS"

Before you switch your host OS, I suggest you try the following first; if it doesn't work, then switch the host OS to 14.04 as well.

  • Remove the $NVMEIMGF file on your host
  • Restart the FEMU VM

Could you try this?

@zhitaoli-6
Contributor Author

I have tried removing the NVMe image file, and it still doesn't work in my case.

@zhitaoli-6
Contributor Author

I have tried Ubuntu 14.04 server both as the host OS and as the guest OS. In my case mkfs.ext4 still waits there indefinitely, while it succeeds for you, which confuses me a lot. Could anyone give some ideas?

More detailed configuration and environment information follows.

  1. Host OS
root@ubuntu:/home/lizhitao# lsb_release -a
No LSB modules are available.
Distributor ID:	Ubuntu
Description:	Ubuntu 14.04.5 LTS
Release:	14.04
Codename:	trusty
  2. Host CPU info
root@ubuntu:/home/lizhitao# lscpu
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                4
On-line CPU(s) list:   0-3
Thread(s) per core:    2
Core(s) per socket:    2
Socket(s):             1
NUMA node(s):          1
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 69
Stepping:              1
CPU MHz:               1138.125
BogoMIPS:              4788.96
Virtualization:        VT-x
L1d cache:             32K
L1i cache:             32K
L2 cache:              256K
L3 cache:              4096K
NUMA node0 CPU(s):     0-3
  3. Host Memory Info
root@ubuntu:/home/lizhitao# cat /proc/meminfo 
MemTotal:       16350656 kB
MemFree:         6484132 kB
MemAvailable:   15732240 kB
Buffers:           35276 kB
Cached:          9325452 kB
SwapCached:            0 kB
Active:          4706428 kB
Inactive:        4780840 kB
Active(anon):      70056 kB
Inactive(anon):    63528 kB
Active(file):    4636372 kB
Inactive(file):  4717312 kB
Unevictable:           0 kB
Mlocked:               0 kB
SwapTotal:      16693244 kB
SwapFree:       16693176 kB
Dirty:                 0 kB
Writeback:             0 kB
AnonPages:        126540 kB
Mapped:            64556 kB
Shmem:              7044 kB
Slab:             276216 kB
SReclaimable:     230700 kB
SUnreclaim:        45516 kB
KernelStack:        3232 kB
PageTables:        15824 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:    24868572 kB
Committed_AS:     777372 kB
VmallocTotal:   34359738367 kB
VmallocUsed:           0 kB
VmallocChunk:          0 kB
HardwareCorrupted:     0 kB
AnonHugePages:     49152 kB
CmaTotal:              0 kB
CmaFree:               0 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
DirectMap4k:       85640 kB
DirectMap2M:     1929216 kB
DirectMap1G:    15728640 kB
  4. VM OS: ubuntu server 14.04 supported by FEMU
  5. VM start script (default):
#!/bin/bash
# Huaicheng Li <huaicheng@cs.uchicago.edu>
# Run VM with lightnvm support: FEMU as a whitebox SSD (OpenChannel-SSD)

# image directory
IMGDIR=/root/repos/images
# virtual machine disk image
OSIMGF=$IMGDIR/u14s.qcow2
# virtual NVMe disk image
NVMEIMGF=$IMGDIR/vssd1.raw
# virtual NVMe disk size: 1GB
NVMEIMGSZ=1G

# every time we create a new SSD image file
#sudo rm -rf $IMGDIR/vssd1.raw

if [[ ! -e "$OSIMGF" ]]; then
	echo ""
	echo "VM disk image couldn't be found ..."
	echo "Please prepare a usable VM image and place it as $OSIMGF"
	echo "Once VM disk image is ready, please rerun this script again"
	echo ""
	exit
fi

# Please match the image file size with the emulated SSD size in vssd1.conf file
[[ ! -e $NVMEIMGF ]] && ./qemu-img create -f raw $NVMEIMGF $NVMEIMGSZ

#-object iothread,id=iothread0 \
#-device virtio-blk-pci,iothread=iothread0,drive=id0 \
    #-nographic \
    #-device nvme,drive=id0,serial=serial0,id=nvme0 \
    #-kernel /home/huaicheng/git/linux/arch/x86_64/boot/bzImage \
    #-append "root=/dev/vda1 console=ttyS0,115200n8 console=tty0" \
    #-virtfs local,path=/home/huaicheng/share/,security_model=passthrough,mount_tag=host_share \

    #must come before all other qemu options!!!!!!
    #-trace events=/tmp/events \
    #-object memory-backend-file,id=mem1,size=8G,mem-path=/dev/hugepages2M \
    #-device pc-dimm,id=dimm1,memdev=mem1 \

    # VOC related options
    #lbbtable=/media/bbtable.qemu,


sudo x86_64-softmmu/qemu-system-x86_64 \
    -name "FEMU-whitebox-SSD" \
    -enable-kvm \
    -cpu host \
    -smp 4 \
    -m 4G \
    -device virtio-scsi-pci,id=scsi0 \
    -device scsi-hd,drive=hd0 \
    -drive file=$OSIMGF,if=none,aio=native,cache=none,format=qcow2,id=hd0 \
    -drive file=$NVMEIMGF,if=none,aio=threads,format=raw,id=id0 \
    -device nvme,drive=id0,serial=serial0,id=nvme0,namespaces=1,lver=1,lmetasize=16,ll2pmode=0,nlbaf=5,lba_index=3,mdts=10,lnum_ch=2,lnum_lun=8,lnum_pln=2,lsec_size=4096,lsecs_per_pg=4,lpgs_per_blk=512,ldebug=0,femu_mode=0 \
    -net user,hostfwd=tcp::8080-:22 \
    -net nic,model=virtio \
    -nographic \
    -qmp unix:./qmp-sock,server,nowait
    #-object iothread,id=iothread0 \
    #-display none \
    #-nographic \
    #-monitor stdio \
    #-s -S \
    #

#sleep 10

#./pin.sh
#sshsim "~/tsc.sh"
#sshsim "echo 0 | sudo tee /proc/sys/kernel/timer_migration"
#sshsim "echo 0 | sudo tee /sys/kernel/debug/tracing/tracing_on"

echo "VM is up, enjoy it :)"

wait

@huaicheng
Contributor

Based on your description, I have no clue which part went wrong.
Can you add a verbose printf() to femu_oc_rw(), recompile FEMU, and redo the above process? Then observe whether the printf() is triggered during mkfs.ext4 (mkfs.ext4 essentially issues some writes to the device). That way, by watching the read/write traffic, we can at least tell whether the problem comes from FEMU (the emulated hardware). A sketch of the kind of tracing meant is below.
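
For illustration only, a self-contained sketch of the read/write counting such a printf() could do; the real hook would live inside femu_oc_rw() in FEMU's source, and the names used here (trace_oc_rw, is_write, n_pages) are stand-ins rather than FEMU identifiers:

    #include <stdint.h>
    #include <stdio.h>

    static uint64_t dbg_reads, dbg_writes;

    /* print one line per request so read/write traffic shows up on the console */
    static void trace_oc_rw(int is_write, unsigned n_pages)
    {
        uint64_t seq = is_write ? ++dbg_writes : ++dbg_reads;
        printf("femu_oc_rw: %s #%llu, %u pages\n",
               is_write ? "WRITE" : "READ",
               (unsigned long long)seq, n_pages);
    }

    int main(void)
    {
        /* simulated traffic: during mkfs.ext4, WRITE lines like these should
         * appear if the requests actually reach the emulated device */
        trace_oc_rw(0, 8);   /* a read  */
        trace_oc_rw(1, 16);  /* a write */
        return 0;
    }

If mkfs.ext4 produces no WRITE lines at all, the hang is likely above FEMU (pblk or the block layer); if WRITE lines appear but never complete, the emulated device side is the suspect.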

@zhitaoli-6
Contributor Author

Up to now, in my case, mkfs.ext4 works on Guest kernel 4.15.0 and fails on kernel 4.16.0.

In addition, when creating the pblk target with "nvme lnvm create -d nvme0n1 --lun-begin=0 --lun-end=3 -n mydevice -t pblk", femu_oc_rw prints the following information:

[screenshot: femu_oc_rw output printed during pblk target creation]

I add "printf" in femu_oc_rw.

            if (state != FEMU_OC_SEC_WRITTEN) {
                printf("femu_oc_rw: the page state is %d, not WRITTEN, the ppa is:", state);
                print_ppa(ln, psl[i]);
                bitmap_set(&cqe->res64, i, n_pages - i);
                req->status = 0x42ff;

                /* Copy what has been read from the OOB area */
                if (meta)
                    nvme_addr_write(n, meta, (void *)msl,
                            n_pages * ln->params.sos);
                err = 0x42ff;
                goto fail_free_msl;
            }

It turns out that pblk in the Linux kernel tries to read some pages that have never been written. I'm not sure whether that is reasonable; it happens on both Linux kernel 4.15.0 and 4.16.0.

On Linux kernel 4.15.0, some writes follow and they succeed. On Linux kernel 4.16.0, no write information is printed by femu_oc_rw at all. It seems that femu_oc_rw never receives a write request.

@huaicheng
Contributor

Up to now, in my case, mkfs.ext4 works on Guest kernel 4.15.0 and fails on kernel 4.16.0.

Thanks for the update. Besides using different kernel versions, did you change anything else? It's good to know that 4.15.0 works for you. I guess 4.16.0 is moving towards OC2.0 (I need to double-check).

It turns out that pblk in the Linux kernel tries to read some pages that have never been written. I'm not sure whether that is reasonable; it happens on both Linux kernel 4.15.0 and 4.16.0.

Yes, that's OK. To my understanding, lightnvm reads some metadata, the mapping table, etc. from the drive during the target initialization phase.

nicktehrany pushed a commit to nicktehrany/FEMU that referenced this issue Feb 16, 2023
now, 4-chnl ZNS is much faster than 1-chnl