Commits on Sep 27, 2014
  1. nilfs2-kmod-centos6: 0.6.1 release

    Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
    konis committed Sep 27, 2014
  2. nilfs2: fix data loss with mmap()

    This bug leads to reproducible silent data loss, despite the use of
    msync(), sync() and a clean unmount of the file system.  It is easily
    reproducible with the following script:
    
      ----------------[BEGIN SCRIPT]--------------------
      mkfs.nilfs2 -f /dev/sdb
      mount /dev/sdb /mnt
    
      dd if=/dev/zero bs=1M count=30 of=/mnt/testfile
    
      umount /mnt
      mount /dev/sdb /mnt
      CHECKSUM_BEFORE="$(md5sum /mnt/testfile)"
    
      /root/mmaptest/mmaptest /mnt/testfile 30 10 5
    
      sync
      CHECKSUM_AFTER="$(md5sum /mnt/testfile)"
      umount /mnt
      mount /dev/sdb /mnt
      CHECKSUM_AFTER_REMOUNT="$(md5sum /mnt/testfile)"
      umount /mnt
    
      echo "BEFORE MMAP:\t$CHECKSUM_BEFORE"
      echo "AFTER MMAP:\t$CHECKSUM_AFTER"
      echo "AFTER REMOUNT:\t$CHECKSUM_AFTER_REMOUNT"
      ----------------[END SCRIPT]--------------------
    
    The mmaptest tool looks something like this (very simplified, with
    error checking removed):
    
      ----------------[BEGIN mmaptest]--------------------
      data = mmap(NULL, file_size - file_offset, PROT_READ | PROT_WRITE,
                  MAP_SHARED, fd, file_offset);
    
      for (i = 0; i < write_count; ++i) {
            memcpy(data + i * 4096, buf, sizeof(buf));
            msync(data, file_size - file_offset, MS_SYNC);
      }
      ----------------[END mmaptest]--------------------
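    A fuller, compilable variant of the simplified tool above might look like
    the following sketch.  It is not the original mmaptest: the function
    names, the fill byte and the scratch-file self-check are assumptions made
    for illustration, and memset() stands in for the memcpy() from a
    prepared buffer.  The self-check only exercises the write path; showing
    the data loss itself still requires the remount cycle from the script on
    an affected nilfs2 volume.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Write `write_count` pages filled with `fill` through a shared mapping that
 * starts at `file_offset`, calling msync() after every page.
 * Returns 0 on success, -1 on any error. */
int mmap_write_pages(const char *path, size_t file_size, size_t file_offset,
                     size_t write_count, unsigned char fill)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t map_len, i;
    unsigned char *data;
    int fd, ret = -1;

    assert(file_offset % page == 0);  /* mmap() needs a page-aligned offset */
    if (file_offset >= file_size ||
        write_count > (file_size - file_offset) / page)
        return -1;
    map_len = file_size - file_offset;

    fd = open(path, O_RDWR);
    if (fd < 0)
        return -1;
    data = mmap(NULL, map_len, PROT_READ | PROT_WRITE, MAP_SHARED, fd,
                (off_t)file_offset);
    if (data == MAP_FAILED)
        goto out_close;

    for (i = 0; i < write_count; ++i) {
        memset(data + i * page, fill, page);
        if (msync(data, map_len, MS_SYNC) != 0)
            goto out_unmap;
    }
    ret = 0;
out_unmap:
    munmap(data, map_len);
out_close:
    close(fd);
    return ret;
}

/* Self-check on a scratch file (hypothetical helper): returns 0 if bytes
 * written through the mapping can be read back with pread(). */
int mmap_roundtrip_check(void)
{
    char path[] = "/tmp/mmaptest_XXXXXX";
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    unsigned char byte = 0;
    int fd = mkstemp(path);
    int ok;

    if (fd < 0)
        return -1;
    ok = ftruncate(fd, (off_t)(8 * page)) == 0 &&
         mmap_write_pages(path, 8 * page, page, 2, 0xAB) == 0 &&
         pread(fd, &byte, 1, (off_t)page) == 1 && byte == 0xAB &&
         pread(fd, &byte, 1, 0) == 1 && byte == 0x00;
    close(fd);
    unlink(path);
    return ok ? 0 : -1;
}
```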
    
    The output of the script looks something like this:
    
      BEFORE MMAP:    281ed1d5ae50e8419f9b978aab16de83  /mnt/testfile
      AFTER MMAP:     6604a1c31f10780331a6850371b3a313  /mnt/testfile
      AFTER REMOUNT:  281ed1d5ae50e8419f9b978aab16de83  /mnt/testfile
    
    So it is clear that the changes made through mmap() do not survive a
    remount.  This can be reproduced 100% of the time.  The problem was
    introduced in commit 136e8770cd5d ("nilfs2: fix issue of
    nilfs_set_page_dirty() for page at EOF boundary").
    
    If the page was read with mpage_readpage() or mpage_readpages() for
    example, then it has no buffers attached to it.  In that case
    page_has_buffers(page) in nilfs_set_page_dirty() will be false.
    Therefore nilfs_set_file_dirty() is never called and the pages are never
    collected and never written to disk.
    
    This patch fixes the problem by also calling nilfs_set_file_dirty() if the
    page has no buffers attached to it.
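    The control flow described above can be modelled in user space roughly as
    follows.  This is a simplified sketch, not the kernel code: struct
    page_model, set_page_dirty_model() and the 4 KiB page size are
    assumptions, and the fixed branch is assumed to report every block of
    the page as dirty.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_PAGE_SHIFT 12   /* assumption: 4 KiB pages */

/* Hypothetical stand-in for the parts of struct page that matter here. */
struct page_model {
    bool has_buffers;        /* are buffer heads attached to the page? */
    unsigned clean_mapped;   /* attached buffers that would be newly dirtied */
};

/* Returns the number of blocks that would be passed to the modelled
 * nilfs_set_file_dirty().  A result of 0 means the file is never put on
 * the dirty list, so its pages are never collected or written. */
unsigned set_page_dirty_model(const struct page_model *pg, unsigned blkbits,
                              bool with_fix)
{
    assert(blkbits <= MODEL_PAGE_SHIFT);

    if (pg->has_buffers)
        return pg->clean_mapped;   /* path that already worked */

    if (with_fix)                  /* new: page without buffers */
        return 1u << (MODEL_PAGE_SHIFT - blkbits);

    return 0;                      /* old behaviour: silently skipped */
}
```

    With 1 KiB blocks (blkbits = 10), a page without buffers yields 0 dirty
    blocks before the fix and 4 after it in this model.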
    
    [akpm@linux-foundation.org: s/PAGE_SHIFT/PAGE_CACHE_SHIFT/]
    Signed-off-by: Andreas Rohner <andreas.rohner@gmx.net>
    Tested-by: Andreas Rohner <andreas.rohner@gmx.net>
    Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
    zeitgeist87 committed with konis Sep 25, 2014
Commits on Apr 28, 2014
  1. nilfs2-kmod: use mnt_want_write instead of mnt_want_write_file

    This replaces mnt_{want,drop}_write_file() mistakenly backported in
    nilfs_ioctl_set_suinfo() function with mnt_{want,drop}_write().
    
    Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
    konis committed Apr 20, 2014
Commits on Apr 19, 2014
  1. nilfs2-kmod: 0.6.0 release

    Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
    konis committed Apr 19, 2014
Commits on Apr 6, 2014
  1. nilfs2: update project's web site in nilfs2.txt

    Project's web site was moved to nilfs.sourceforge.net from
    www.nilfs.org.  This updates the site information in
    Documentation/filesystems/nilfs2.txt with the new location.
    
    Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
    konis committed Apr 3, 2014
  2. nilfs2: verify metadata sizes read from disk

    Add code to check sizes of on-disk data of metadata files such as inode
    size, segment usage size, DAT entry size, and checkpoint size.  Although
    these sizes are read from disk, the current implementation doesn't check
    them.
    
    If these sizes are not sane on disk, it can cause out-of-range access to
    metadata or memory access overrun on metadata block buffers due to
    overflow in sundry calculations.
    
    Both lower limit and upper limit of metadata sizes are verified to
    prevent these issues.
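    A two-sided check of this kind can be sketched as below.  The helper
    names are hypothetical, and using the block size as the upper limit is
    an assumption made for the example; the actual limits are defined by the
    patch itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Accept an entry size read from disk only if it is at least as large as
 * the smallest structure the code understands and no larger than a block. */
bool entry_size_is_sane(size_t on_disk_size, size_t min_size,
                        size_t block_size)
{
    assert(min_size > 0 && min_size <= block_size);
    return on_disk_size >= min_size && on_disk_size <= block_size;
}

/* Derived value that is only computed after validation.  Returns 0 for an
 * insane size instead of dividing by zero or producing a bogus count. */
size_t entries_per_block(size_t on_disk_size, size_t min_size,
                         size_t block_size)
{
    if (!entry_size_is_sane(on_disk_size, min_size, block_size))
        return 0;
    return block_size / on_disk_size;
}
```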
    
    Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
    Cc: Andreas Rohner <andreas.rohner@gmx.net>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
    konis committed Apr 3, 2014
  3. nilfs2: add FITRIM ioctl support for nilfs2

    Add support for the FITRIM ioctl, which enables user space tools to
    issue TRIM/DISCARD requests to the underlying device.  Every clean
    segment within the specified range will be discarded.
    
    Signed-off-by: Andreas Rohner <andreas.rohner@gmx.net>
    Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
    zeitgeist87 committed with konis Apr 3, 2014
  4. nilfs2-kmod: add glue code to support FITRIM ioctl

    Declare FITRIM ioctl and add fstrim_range struct if the kernel doesn't
    have those declarations.
    
    Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
    konis committed Apr 6, 2014
  5. nilfs2: add nilfs_sufile_trim_fs to trim clean segs

    Add nilfs_sufile_trim_fs(), which takes an fstrim_range structure and
    calls blkdev_issue_discard for every clean segment in the specified
    range.  The range is truncated to file system block boundaries.
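    One plausible reading of "truncated to file system block boundaries" is
    that only blocks lying entirely inside the byte range are considered.
    The sketch below illustrates that arithmetic; struct block_range and
    trim_to_blocks() are hypothetical names and the rounding direction is an
    assumption, not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

struct block_range {
    uint64_t first;   /* first whole block inside the byte range */
    uint64_t count;   /* number of whole blocks, 0 if none fit */
};

/* Shrink the byte range [start, start + len) to whole blocks: the start is
 * rounded up and the end is rounded down. */
struct block_range trim_to_blocks(uint64_t start, uint64_t len,
                                  uint32_t block_size)
{
    struct block_range r = { 0, 0 };
    uint64_t end, first, last;

    assert(block_size != 0);

    /* Clamp instead of overflowing when len is "everything". */
    end = len > UINT64_MAX - start ? UINT64_MAX : start + len;
    first = start / block_size + (start % block_size != 0);
    last = end / block_size;              /* exclusive upper bound */

    if (last > first) {
        r.first = first;
        r.count = last - first;
    }
    return r;
}
```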
    
    Signed-off-by: Andreas Rohner <andreas.rohner@gmx.net>
    Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    [support old kernels which had deprecated barrier bio]
    [add KM_USER0 argument to k{map,unmap}_atomic() to fix build]
    Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
    zeitgeist87 committed with konis Apr 3, 2014
  6. nilfs2: implementation of NILFS_IOCTL_SET_SUINFO ioctl

    With this ioctl the segment usage entries in the SUFILE can be updated
    from userspace.
    
    This is useful, because it allows the userspace GC to modify and update
    segment usage entries for specific segments, which enables it to avoid
    unnecessary write operations.
    
    If a segment needs to be cleaned, but there is no or very little
    reclaimable space in it, the cleaning operation basically degrades to a
    useless moving operation.  In the end the only thing that changes is the
    location of the data and a timestamp in the segment usage information.
    With this ioctl the GC can skip the cleaning and update the segment
    usage entries directly instead.
    
    This is basically a shortcut to cleaning the segment.  It is still
    necessary to read the segment summary information, but the writing of
    the live blocks can be skipped if it's not worth it.
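    The decision a userspace GC could make with this ioctl might be sketched
    as follows.  The function name and the threshold parameter are
    hypothetical; the message does not specify how the cleaner decides that
    copying is "not worth it".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Decide whether copying the live blocks of a segment is worthwhile.
 * If fewer than `min_reclaimable` blocks would be freed, the GC can skip
 * the copy and only refresh the segment usage entry instead. */
bool gc_should_skip_copy(uint32_t live_blocks, uint32_t blocks_per_segment,
                         uint32_t min_reclaimable)
{
    assert(live_blocks <= blocks_per_segment);
    return blocks_per_segment - live_blocks < min_reclaimable;
}
```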
    
    [konishi.ryusuke@lab.ntt.co.jp: add description of NILFS_IOCTL_SET_SUINFO ioctl]
    Signed-off-by: Andreas Rohner <andreas.rohner@gmx.net>
    Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
    zeitgeist87 committed with konis Apr 3, 2014
  7. nilfs2: add nilfs_sufile_set_suinfo to update segment usage

    Introduce nilfs_sufile_set_suinfo(), which expects an array of
    nilfs_suinfo_update structures and updates the segment usage information
    accordingly.
    
    This is basically a helper function for the newly introduced
    NILFS_IOCTL_SET_SUINFO ioctl.
    
    [konishi.ryusuke@lab.ntt.co.jp: use put_bh() instead of brelse() because we know bh != NULL]
    Signed-off-by: Andreas Rohner <andreas.rohner@gmx.net>
    Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    [add KM_USER0 argument to k{map,unmap}_atomic() to fix build]
    Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
    zeitgeist87 committed with konis Apr 3, 2014
  8. nilfs2: add struct nilfs_suinfo_update and flags

    Add the nilfs_suinfo_update structure, which contains the information
    needed to update one segment usage entry.  The flags specify which
    fields need to be updated.
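    The flag-driven update can be illustrated with a small user-space model.
    The struct layouts, field names and SU_UPDATE_* constants below are
    simplified stand-ins chosen for the example, not the definitions added
    by the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag bits: one per field that may be updated. */
#define SU_UPDATE_LASTMOD  0x1u
#define SU_UPDATE_NBLOCKS  0x2u
#define SU_UPDATE_FLAGS    0x4u
#define SU_UPDATE_ALL      (SU_UPDATE_LASTMOD | SU_UPDATE_NBLOCKS | SU_UPDATE_FLAGS)

struct seg_usage {
    uint64_t lastmod;
    uint32_t nblocks;
    uint32_t flags;
};

struct su_update {
    uint64_t segnum;          /* which segment the update targets */
    uint32_t update_mask;     /* which fields of `values` are meaningful */
    struct seg_usage values;
};

/* Return a copy of `cur` with only the selected fields replaced. */
struct seg_usage apply_su_update(struct seg_usage cur,
                                 const struct su_update *up)
{
    assert((up->update_mask & ~SU_UPDATE_ALL) == 0);  /* reject unknown bits */

    if (up->update_mask & SU_UPDATE_LASTMOD)
        cur.lastmod = up->values.lastmod;
    if (up->update_mask & SU_UPDATE_NBLOCKS)
        cur.nblocks = up->values.nblocks;
    if (up->update_mask & SU_UPDATE_FLAGS)
        cur.flags = up->values.flags;
    return cur;
}
```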
    
    Signed-off-by: Andreas Rohner <andreas.rohner@gmx.net>
    Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
    zeitgeist87 committed with konis Apr 3, 2014
Commits on Feb 28, 2014
  1. nilfs2: add comments for ioctls

    Add comments for ioctls in fs/nilfs2/ioctl.c file and describe NILFS2
    specific ioctls in Documentation/filesystems/nilfs2.txt.
    
    Signed-off-by: Vyacheslav Dubeyko <slava@dubeyko.com>
    Reviewed-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
    Cc: Wenliang Fan <fanwlexca@gmail.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
    dubeyko committed with konis Jan 23, 2014
Commits on Jan 18, 2014
  1. nilfs2-kmod: 0.5.1 release

    Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
    konis committed Jan 18, 2014
Commits on Jan 16, 2014
  1. nilfs2: fix segctor bug that causes file system corruption

    There is a bug in the function nilfs_segctor_collect, which results in
    active data being written to a segment that is marked as clean.  This
    segment may then be selected for a later segment construction, whereby
    the old data is overwritten.
    
    The problem shows itself with the following kernel log message:
    
      nilfs_sufile_do_cancel_free: segment 6533 must be clean
    
    Usually a few hours later the file system gets corrupted:
    
      NILFS: bad btree node (blocknr=8748107): level = 0, flags = 0x0, nchildren = 0
      NILFS error (device sdc1): nilfs_bmap_last_key: broken bmap (inode number=114660)
    
    The issue can be reproduced with a file system that is nearly full and
    with the cleaner running, while some IO intensive task is running,
    although it is quite hard to reproduce.
    
    This is what happens:
    
     1. The cleaner starts the segment construction
     2. nilfs_segctor_collect is called
     3. sc_stage is on NILFS_ST_SUFILE and segments are freed
     4. sc_stage is on NILFS_ST_DAT current segment is full
     5. nilfs_segctor_extend_segments is called, which
        allocates a new segment
     6. The new segment is one of the segments freed in step 3
     7. nilfs_sufile_cancel_freev is called and produces an error message
     8. Loop around and the collection starts again
     9. sc_stage is on NILFS_ST_SUFILE and segments are freed
        including the newly allocated segment, which will contain active
        data and can be allocated at a later time
    10. A few hours later another segment construction allocates the
        segment and causes file system corruption
    
    This can be prevented by simply reordering the statements.  If
    nilfs_sufile_cancel_freev is called before nilfs_segctor_extend_segments
    the freed segments are marked as dirty and cannot be allocated any more.
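    The effect of the reordering can be shown with a toy model of the
    segment states.  Everything here is a hypothetical simplification: the
    helper names, the eight-segment array and the lowest-clean-first
    allocation policy are assumptions, not the kernel implementation.

```c
#include <assert.h>
#include <stdbool.h>

#define NSEGS 8
enum seg_state { SEG_DIRTY, SEG_CLEAN };

struct retry_result {
    int warnings;      /* modelled "segment N must be clean" messages */
    int new_segment;   /* segment handed out by the extend step */
};

/* Hand out the lowest-numbered clean segment and mark it dirty. */
static int alloc_clean_segment(enum seg_state *segs)
{
    for (int i = 0; i < NSEGS; i++) {
        if (segs[i] == SEG_CLEAN) {
            segs[i] = SEG_DIRTY;
            return i;
        }
    }
    return -1;
}

/* Undo the freeing of segments; complain if one is no longer clean. */
static int cancel_free(enum seg_state *segs, const int *freed, int nfreed)
{
    int warnings = 0;

    for (int i = 0; i < nfreed; i++) {
        assert(freed[i] >= 0 && freed[i] < NSEGS);
        if (segs[freed[i]] != SEG_CLEAN)
            warnings++;
        segs[freed[i]] = SEG_DIRTY;
    }
    return warnings;
}

/* Model steps 3 to 7: segments 2 and 5 were just freed, segment 6 is an
 * unrelated clean segment, then the collection has to be retried. */
struct retry_result retry_collection(bool cancel_before_extend)
{
    enum seg_state segs[NSEGS];
    const int freed[] = { 2, 5 };
    struct retry_result r;

    for (int i = 0; i < NSEGS; i++)
        segs[i] = SEG_DIRTY;
    segs[6] = SEG_CLEAN;
    segs[2] = segs[5] = SEG_CLEAN;        /* step 3: segments are freed */

    if (cancel_before_extend) {           /* fixed order */
        r.warnings = cancel_free(segs, freed, 2);
        r.new_segment = alloc_clean_segment(segs);
    } else {                              /* buggy order */
        r.new_segment = alloc_clean_segment(segs);
        r.warnings = cancel_free(segs, freed, 2);
    }
    return r;
}
```

    In this model the buggy order hands out freed segment 2 and produces one
    warning, while the fixed order produces no warning and allocates the
    unrelated segment 6.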
    
    Signed-off-by: Andreas Rohner <andreas.rohner@gmx.net>
    Reviewed-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
    Tested-by: Andreas Rohner <andreas.rohner@gmx.net>
    Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
    zeitgeist87 committed with konis Jan 15, 2014
Commits on Oct 24, 2013
  1. nilfs2: fix issue with race condition of competition between segments…

    … for dirty blocks
    
    Many NILFS2 users have reported strange file system corruption
    (for example):
    
       NILFS: bad btree node (blocknr=185027): level = 0, flags = 0x0, nchildren = 768
       NILFS error (device sda4): nilfs_bmap_last_key: broken bmap (inode number=11540)
    
    But such error messages are the consequence of a file system issue that
    takes place much earlier.  Fortunately, Jerome Poulin
    <jeromepoulin@gmail.com> and Anton Eliasson <devel@antoneliasson.se>
    recently reported another issue.  These reports describe a crash of the
    segctor thread:
    
      BUG: unable to handle kernel paging request at 0000000000004c83
      IP: nilfs_end_page_io+0x12/0xd0 [nilfs2]
    
      Call Trace:
       nilfs_segctor_do_construct+0xf25/0x1b20 [nilfs2]
       nilfs_segctor_construct+0x17b/0x290 [nilfs2]
       nilfs_segctor_thread+0x122/0x3b0 [nilfs2]
       kthread+0xc0/0xd0
       ret_from_fork+0x7c/0xb0
    
    These two issues have the same cause.  This cause can trigger a third
    issue too: the segctor thread hangs while consuming 100% CPU.
    
    REPRODUCING PATH:
    
    One possible way of reproducing the issue was described by
    Jerome Poulin <jeromepoulin@gmail.com>:
    
    1. init S to get to single user mode.
    2. sysrq+E to make sure only my shell is running
    3. start network-manager to get my wifi connection up
    4. login as root and launch "screen"
    5. cd /boot/log/nilfs which is a ext3 mount point and can log when NILFS dies.
    6. lscp | xz -9e > lscp.txt.xz
    7. mount my snapshot using mount -o cp=3360839,ro /dev/vgUbuntu/root /mnt/nilfs
    8. start a screen to dump /proc/kmsg to text file since rsyslog is killed
    9. start a screen and launch strace -f -o find-cat.log -t find
       /mnt/nilfs -type f -exec cat {} > /dev/null \;
    10. start a screen and launch strace -f -o apt-get.log -t apt-get update
    11. launch the last command again as it did not crash the first time
    12. apt-get crashes
    13. ps aux > ps-aux-crashed.log
    14. sysrq+W
    15. sysrq+E  wait for everything to terminate
    16. sysrq+SUSB
    
    A simplified way of reproducing the issue is to start a kernel
    compilation task and "apt-get update" in parallel.
    
    REPRODUCIBILITY:
    
    The issue does not reproduce reliably [60% - 80%].  It is very important
    to have a proper environment for reproducing the issue.  The critical
    conditions for successful reproduction are:
    
    (1) There should be a big file modified by means of mmap().
    
    (2) From time to time during processing, this file should have more
        dirty blocks than fit in several segments (for example, two or
        three).
    
    (3) There should be intensive background file modification activity
        in another thread.
    
    INVESTIGATION:
    
    First of all, it is possible to see that the reason for the crash is an
    invalid page address:
    
      NILFS [nilfs_segctor_complete_write]:2100 bh->b_count 0, bh->b_blocknr 13895680, bh->b_size 13897727, bh->b_page 0000000000001a82
      NILFS [nilfs_segctor_complete_write]:2101 segbuf->sb_segnum 6783
    
    Moreover, the value of b_page (0x1a82) is 6786.  This value looks like a
    segment number.  And the b_blocknr and b_size values look like block
    numbers.  So, the buffer_head pointer points to an improper address.
    
    Detailed investigation of the issue revealed the following picture:
    
      [-----------------------------SEGMENT 6783-------------------------------]
      NILFS [nilfs_segctor_do_construct]:2310 nilfs_segctor_begin_construction
      NILFS [nilfs_segctor_do_construct]:2321 nilfs_segctor_collect
      NILFS [nilfs_segctor_do_construct]:2336 nilfs_segctor_assign
      NILFS [nilfs_segctor_do_construct]:2367 nilfs_segctor_update_segusage
      NILFS [nilfs_segctor_do_construct]:2371 nilfs_segctor_prepare_write
      NILFS [nilfs_segctor_do_construct]:2376 nilfs_add_checksums_on_logs
      NILFS [nilfs_segctor_do_construct]:2381 nilfs_segctor_write
      NILFS [nilfs_segbuf_submit_bio]:464 bio->bi_sector 111149024, segbuf->sb_segnum 6783
    
      [-----------------------------SEGMENT 6784-------------------------------]
      NILFS [nilfs_segctor_do_construct]:2310 nilfs_segctor_begin_construction
      NILFS [nilfs_segctor_do_construct]:2321 nilfs_segctor_collect
      NILFS [nilfs_lookup_dirty_data_buffers]:782 bh->b_count 1, bh->b_page ffffea000709b000, page->index 0, i_ino 1033103, i_size 25165824
      NILFS [nilfs_lookup_dirty_data_buffers]:783 bh->b_assoc_buffers.next ffff8802174a6798, bh->b_assoc_buffers.prev ffff880221cffee8
      NILFS [nilfs_segctor_do_construct]:2336 nilfs_segctor_assign
      NILFS [nilfs_segctor_do_construct]:2367 nilfs_segctor_update_segusage
      NILFS [nilfs_segctor_do_construct]:2371 nilfs_segctor_prepare_write
      NILFS [nilfs_segctor_do_construct]:2376 nilfs_add_checksums_on_logs
      NILFS [nilfs_segctor_do_construct]:2381 nilfs_segctor_write
      NILFS [nilfs_segbuf_submit_bh]:575 bh->b_count 1, bh->b_page ffffea000709b000, page->index 0, i_ino 1033103, i_size 25165824
      NILFS [nilfs_segbuf_submit_bh]:576 segbuf->sb_segnum 6784
      NILFS [nilfs_segbuf_submit_bh]:577 bh->b_assoc_buffers.next ffff880218a0d5f8, bh->b_assoc_buffers.prev ffff880218bcdf50
      NILFS [nilfs_segbuf_submit_bio]:464 bio->bi_sector 111150080, segbuf->sb_segnum 6784, segbuf->sb_nbio 0
      [----------] ditto
      NILFS [nilfs_segbuf_submit_bio]:464 bio->bi_sector 111164416, segbuf->sb_segnum 6784, segbuf->sb_nbio 15
    
      [-----------------------------SEGMENT 6785-------------------------------]
      NILFS [nilfs_segctor_do_construct]:2310 nilfs_segctor_begin_construction
      NILFS [nilfs_segctor_do_construct]:2321 nilfs_segctor_collect
      NILFS [nilfs_lookup_dirty_data_buffers]:782 bh->b_count 2, bh->b_page ffffea000709b000, page->index 0, i_ino 1033103, i_size 25165824
      NILFS [nilfs_lookup_dirty_data_buffers]:783 bh->b_assoc_buffers.next ffff880219277e80, bh->b_assoc_buffers.prev ffff880221cffc88
      NILFS [nilfs_segctor_do_construct]:2367 nilfs_segctor_update_segusage
      NILFS [nilfs_segctor_do_construct]:2371 nilfs_segctor_prepare_write
      NILFS [nilfs_segctor_do_construct]:2376 nilfs_add_checksums_on_logs
      NILFS [nilfs_segctor_do_construct]:2381 nilfs_segctor_write
      NILFS [nilfs_segbuf_submit_bh]:575 bh->b_count 2, bh->b_page ffffea000709b000, page->index 0, i_ino 1033103, i_size 25165824
      NILFS [nilfs_segbuf_submit_bh]:576 segbuf->sb_segnum 6785
      NILFS [nilfs_segbuf_submit_bh]:577 bh->b_assoc_buffers.next ffff880218a0d5f8, bh->b_assoc_buffers.prev ffff880222cc7ee8
      NILFS [nilfs_segbuf_submit_bio]:464 bio->bi_sector 111165440, segbuf->sb_segnum 6785, segbuf->sb_nbio 0
      [----------] ditto
      NILFS [nilfs_segbuf_submit_bio]:464 bio->bi_sector 111177728, segbuf->sb_segnum 6785, segbuf->sb_nbio 12
    
      NILFS [nilfs_segctor_do_construct]:2399 nilfs_segctor_wait
      NILFS [nilfs_segbuf_wait]:676 segbuf->sb_segnum 6783
      NILFS [nilfs_segbuf_wait]:676 segbuf->sb_segnum 6784
      NILFS [nilfs_segbuf_wait]:676 segbuf->sb_segnum 6785
    
      NILFS [nilfs_segctor_complete_write]:2100 bh->b_count 0, bh->b_blocknr 13895680, bh->b_size 13897727, bh->b_page 0000000000001a82
    
      BUG: unable to handle kernel paging request at 0000000000001a82
      IP: [<ffffffffa024d0f2>] nilfs_end_page_io+0x12/0xd0 [nilfs2]
    
    Usually, for every segment we collect dirty files in a list.  Then, dirty
    blocks are gathered for every dirty file, prepared for write and
    submitted by means of a nilfs_segbuf_submit_bh() call.  Finally, the
    complete write phase takes place after nilfs_end_bio_write() is called
    on the block layer.  In the final phase, buffers/pages are marked as not
    dirty and processed files are removed from the list of dirty files.
    
    It is possible to see that we had three prepare_write and submit_bio
    phases before the segbuf_wait and complete_write phases.  Moreover,
    segments compete with each other for dirty blocks because on every
    iteration of segment processing dirty buffer_heads are added to several
    lists of payload_buffers:
    
      [SEGMENT 6784]: bh->b_assoc_buffers.next ffff880218a0d5f8, bh->b_assoc_buffers.prev ffff880218bcdf50
      [SEGMENT 6785]: bh->b_assoc_buffers.next ffff880218a0d5f8, bh->b_assoc_buffers.prev ffff880222cc7ee8
    
    The next pointer is the same but the prev pointer has changed.  It means
    that the buffer_head has its next pointer from one list but its prev
    pointer from another.  Such modification can be made several times.
    Finally, it can result in various issues: (1) segctor hanging,
    (2) segctor crashing, (3) file system metadata corruption.
    
    FIX:
    This patch adds:
    
    (1) setting of BH_Async_Write flag in nilfs_segctor_prepare_write()
        for every processed dirty block;
    
    (2) checking of BH_Async_Write flag in
        nilfs_lookup_dirty_data_buffers() and
        nilfs_lookup_dirty_node_buffers();
    
    (3) clearing of BH_Async_Write flag in nilfs_segctor_complete_write(),
        nilfs_abort_logs(), nilfs_forget_buffer(), nilfs_clear_dirty_page().
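    The role of the flag can be sketched with a user-space model in which
    two segments collect dirty buffers before any write has completed.
    struct buf_model and both functions are hypothetical; the model merges
    the lookup-side check (2) and the prepare-side set (1) into one loop and
    leaves out the clearing step (3).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct buf_model {
    bool dirty;
    bool async_write;   /* stand-in for the BH_Async_Write flag */
    int queued_by;      /* how many segments queued this buffer */
};

/* One collection pass on behalf of a single segment. */
static void collect_for_segment(struct buf_model *bufs, size_t n,
                                bool use_flag)
{
    assert(bufs != NULL);

    for (size_t i = 0; i < n; i++) {
        if (!bufs[i].dirty)
            continue;
        if (use_flag && bufs[i].async_write)
            continue;                    /* already owned by a segment */
        bufs[i].queued_by++;
        if (use_flag)
            bufs[i].async_write = true;  /* claim until write completes */
    }
}

/* Count buffers that end up on the payload lists of two segments. */
size_t double_queued_buffers(bool use_flag)
{
    struct buf_model bufs[] = {
        { true, false, 0 }, { true, false, 0 },
        { false, false, 0 }, { true, false, 0 },
    };
    size_t n = sizeof(bufs) / sizeof(bufs[0]);
    size_t doubled = 0;

    collect_for_segment(bufs, n, use_flag);   /* first segment */
    collect_for_segment(bufs, n, use_flag);   /* next segment, no completion yet */

    for (size_t i = 0; i < n; i++)
        if (bufs[i].queued_by > 1)
            doubled++;
    return doubled;
}
```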
    
    Reported-by: Jerome Poulin <jeromepoulin@gmail.com>
    Reported-by: Anton Eliasson <devel@antoneliasson.se>
    Cc: Paul Fertser <fercerpav@gmail.com>
    Cc: ARAI Shun-ichi <hermes@ceres.dti.ne.jp>
    Cc: Piotr Szymaniak <szarpaj@grubelek.pl>
    Cc: Juan Barry Manuel Canham <Linux@riotingpacifist.net>
    Cc: Zahid Chowdhury <zahid.chowdhury@starsolutions.com>
    Cc: Elmer Zhang <freeboy6716@gmail.com>
    Cc: Kenneth Langga <klangga@gmail.com>
    Signed-off-by: Vyacheslav Dubeyko <slava@dubeyko.com>
    Acked-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
    dubeyko committed with konis Sep 30, 2013
Commits on Aug 31, 2013
  1. nilfs2-kmod: 0.5.0 release

    Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
    konis committed Aug 31, 2013
  2. nilfs2: use atomic64_t type for inodes_count and blocks_count fields …

    …in nilfs_root struct
    
    The cp_inodes_count and cp_blocks_count fields are represented as __le64
    in the on-disk structure (struct nilfs_checkpoint).  But the analogous
    fields in the in-core structure (struct nilfs_root) are represented by
    the atomic_t type.
    
    This patch replaces atomic_t with atomic64_t in the representation of
    the inodes_count and blocks_count fields in struct nilfs_root.
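    The width mismatch can be made concrete with a short sketch.  It uses an
    unsigned 32-bit integer so that the truncation is well defined in C;
    atomic_t itself is a signed counter, and both helper names are
    hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

static_assert(sizeof(uint64_t) > sizeof(uint32_t),
              "the on-disk counter is wider than a 32-bit in-core counter");

/* What a 32-bit in-core counter would hold after loading a 64-bit
 * on-disk count: only the low 32 bits survive. */
uint32_t count_as_32bit(uint64_t on_disk_count)
{
    return (uint32_t)on_disk_count;
}

/* True if the on-disk count is representable in 32 bits without loss. */
bool count_survives_32bit(uint64_t on_disk_count)
{
    return count_as_32bit(on_disk_count) == on_disk_count;
}
```

    A count of 2^32 + 5 read from disk would be seen as 5 by the narrower
    counter, while a 64-bit counter keeps the full value.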
    
    Signed-off-by: Vyacheslav Dubeyko <slava@dubeyko.com>
    Acked-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
    Acked-by: Joern Engel <joern@logfs.org>
    Cc: Clemens Eisserer <linuxhippy@gmail.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
    dubeyko committed with konis Jul 3, 2013
  3. nilfs2-kmod: add glue code to support atomic64_t type for earlier ker…

    …nels
    
    This adds aliases of atomic64_t type and atomic64_xxx() operations for
    kernels which don't have atomic64 type support.
    
    Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
    konis committed Aug 31, 2013
Commits on Aug 30, 2013
  1. nilfs2: implement calculation of free inodes count

    Currently, NILFS2 returns 0 as free inodes count (f_ffree) and current
    used inodes count as total file nodes in file system (f_files):
    
    df -i
    Filesystem      Inodes  IUsed   IFree IUse% Mounted on
    /dev/loop0           2      2       0  100% /mnt/nilfs2
    
    This patch implements a real calculation of the free inodes count.
    First, the total number of file nodes in the file system is calculated
    as (desc_blocks_count * groups_per_desc_block * entries_per_group).
    Then, the free inodes count is calculated as the difference between the
    total file nodes and the used inodes count.  As a result, we have the
    following output for NILFS2:
    
    df -i
    Filesystem       Inodes   IUsed    IFree IUse% Mounted on
    /dev/loop0      4194304 2114701  2079603   51% /mnt/nilfs2
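    The arithmetic behind these numbers can be sketched as below.  The
    struct and function names are hypothetical, and the three factors used
    in the usage note are made-up values chosen only so that their product
    matches the total shown by df.

```c
#include <assert.h>
#include <stdint.h>

struct inode_stats {
    uint64_t total;   /* reported as f_files */
    uint64_t free;    /* reported as f_ffree */
};

/* total = desc_blocks_count * groups_per_desc_block * entries_per_group
 * free  = total - used */
struct inode_stats count_inodes(uint64_t desc_blocks_count,
                                uint64_t groups_per_desc_block,
                                uint64_t entries_per_group,
                                uint64_t used)
{
    struct inode_stats st;

    st.total = desc_blocks_count * groups_per_desc_block * entries_per_group;
    assert(used <= st.total);
    st.free = st.total - used;
    return st;
}
```

    For example, count_inodes(1, 128, 32768, 2114701) gives a total of
    4194304 and 2079603 free inodes, matching the IFree column above.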
    
    Reported-by: Clemens Eisserer <linuxhippy@gmail.com>
    Signed-off-by: Vyacheslav Dubeyko <slava@dubeyko.com>
    Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
    Tested-by: Vyacheslav Dubeyko <slava@dubeyko.com>
    Cc: Joern Engel <joern@logfs.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
    dubeyko committed with konis Jul 3, 2013
  2. nilfs2: fix issue with counting number of bio requests for BIO_EOPNOT…

    …SUPP error detection
    
    Fix the issue with improper counting of in-flight bio requests in the
    BIO_EOPNOTSUPP error detection case.
    
    sb_nbio must be incremented exactly as many times as the complete()
    function was called (or will be called) because nilfs_segbuf_wait()
    will call wait_for_completion() the number of times set in sb_nbio:
    
      do {
          wait_for_completion(&segbuf->sb_bio_event);
      } while (--segbuf->sb_nbio > 0);
    
    Two functions complete() and wait_for_completion() must be called the
    same number of times for the same sb_bio_event.  Otherwise,
    wait_for_completion() will hang or leak.
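    The balance requirement can be checked with a single-threaded model in
    which a wait that would block is reported instead of sleeping.  The
    model types and functions are hypothetical, and returning early when
    sb_nbio is zero is an assumption added so the do/while loop is not
    entered without any submitted bio.

```c
#include <assert.h>
#include <limits.h>

struct completion_model {
    unsigned done;   /* completions signalled but not yet consumed */
};

static void complete_model(struct completion_model *c)
{
    c->done++;
}

/* Returns 1 if a completion was consumed, 0 if a real wait would block. */
static int try_wait_model(struct completion_model *c)
{
    if (c->done == 0)
        return 0;
    c->done--;
    return 1;
}

/* Returns 0 when signals and waits balance, -1 when the waiter would hang,
 * and a positive number of unconsumed completions otherwise. */
int wait_balance(unsigned completions, unsigned sb_nbio)
{
    struct completion_model c = { 0 };

    assert(completions <= INT_MAX);
    for (unsigned i = 0; i < completions; i++)
        complete_model(&c);

    if (sb_nbio == 0)
        return (int)c.done;

    do {
        if (!try_wait_model(&c))
            return -1;
    } while (--sb_nbio > 0);

    return (int)c.done;
}
```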
    
    Signed-off-by: Vyacheslav Dubeyko <slava@dubeyko.com>
    Cc: Dan Carpenter <dan.carpenter@oracle.com>
    Acked-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
    Tested-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
    dubeyko committed with konis Aug 22, 2013
  3. nilfs2: remove double bio_put() in nilfs_end_bio_write() for BIO_EOPN…

    …OTSUPP error
    
    Remove the double call of bio_put() in nilfs_end_bio_write() for the
    case of BIO_EOPNOTSUPP error detection.  The issue was found by Dan
    Carpenter, who also suggested the first version of the fix.
    
    Signed-off-by: Vyacheslav Dubeyko <slava@dubeyko.com>
    Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
    Acked-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
    Tested-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
    dubeyko committed with konis Aug 22, 2013
Commits on Jun 5, 2013
  1. nilfs2-kmod: 0.4.5 release

    Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
    konis committed Jun 5, 2013
  2. nilfs2-kmod: enable new sb freeze mechanism

    Since RHEL 6.4 introduced a new fs_flag FS_HAS_NEW_FREEZE to keep
    compatibility of out-of-tree modules, we need to add FS_HAS_NEW_FREEZE
    flag to enable new sb freeze mechanism.
    
    Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
    konis committed Jun 4, 2013
  3. nilfs2-kmod: use new sb freeze mechanism

    Kernels in RHEL 6.4 and its clones have backported new file system
    freeze mechanism with routines
    sb_start_pagefault()/sb_end_pagefault(),
    sb_start_intwrite()/sb_end_intwrite(), and
    sb_start_write()/sb_end_write().
    
    These are used to provide proper freeze protection.
    
    This applies them to nilfs_page_mkwrite(), nilfs_transaction_begin(),
    nilfs_transaction_commit(), and nilfs_transaction_abort().
    
    Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
    konis committed Jun 4, 2013
Commits on Jun 4, 2013
  1. nilfs2-kmod: use block_page_mkwrite_return

    Kernels in RHEL 6.1 have backported block_page_mkwrite_return() and
    split the main part of block_page_mkwrite() into a sub routine
    __block_page_mkwrite().
    
    This converts nilfs_page_mkwrite() so that it uses
    block_page_mkwrite_return() and __block_page_mkwrite().
    
    Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
    konis committed Jun 4, 2013
Commits on May 30, 2013
  1. nilfs2: fix issue of nilfs_set_page_dirty for page at EOF boundary

    DESCRIPTION:
    There are use cases in which a NILFS2 file system (formatted with a
    block size smaller than 4 KB) can be remounted in RO mode because it
    encounters the "broken bmap" issue.
    
    The issue was reported by Anthony Doggett <Anthony2486@interfaces.org.uk>:
     "The machine I've been trialling nilfs on is running Debian Testing,
      Linux version 3.2.0-4-686-pae (debian-kernel@lists.debian.org) (gcc
      version 4.6.3 (Debian 4.6.3-14) ) #1 SMP Debian 3.2.35-2), but I've
      also reproduced it (identically) with Debian Unstable amd64 and Debian
      Experimental (using the 3.8-trunk kernel).  The problematic partitions
      were formatted with "mkfs.nilfs2 -b 1024 -B 8192"."
    
    SYMPTOMS:
    (1) System log contains error messages like:
    
        [63102.496756] nilfs_direct_assign: invalid pointer: 0
        [63102.496786] NILFS error (device dm-17): nilfs_bmap_assign: broken bmap (inode number=28)
        [63102.496798]
        [63102.524403] Remounting filesystem read-only
    
    (2) The NILFS2 file system is remounted in RO mode.
    
    REPRODUCING PATH:
    (1) Create volume group with name "unencrypted" by means of vgcreate utility.
    (2) Run script (prepared by Anthony Doggett <Anthony2486@interfaces.org.uk>):
    
    ----------------[BEGIN SCRIPT]--------------------
    
    VG=unencrypted
    lvcreate --size 2G --name ntest $VG
    mkfs.nilfs2 -b 1024 -B 8192 /dev/mapper/$VG-ntest
    mkdir /var/tmp/n
    mkdir /var/tmp/n/ntest
    mount /dev/mapper/$VG-ntest /var/tmp/n/ntest
    mkdir /var/tmp/n/ntest/thedir
    cd /var/tmp/n/ntest/thedir
    sleep 2
    date
    darcs init
    sleep 2
    dmesg|tail -n 5
    date
    darcs whatsnew || true
    date
    sleep 2
    dmesg|tail -n 5
    ----------------[END SCRIPT]--------------------
    
    REPRODUCIBILITY: 100%
    
    INVESTIGATION:
    As was discovered, the issue takes place during segment construction
    after executing the following sequence of user-space operations:
    
      open("_darcs/index", O_RDWR|O_CREAT|O_NOCTTY, 0666) = 7
      fstat(7, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
      ftruncate(7, 60)
    
    The error message "NILFS error (device dm-17): nilfs_bmap_assign: broken
    bmap (inode number=28)" appears because the driver tries to get the
    block number for the third block of the file, at logical offset 3072
    bytes.  As the output above shows, the file is only 60 bytes long, so a
    single block (1 KB in size) is enough for the whole file.  The driver
    operates on several blocks instead of one because the
    nilfs_segctor_scan_file() method finds several dirty buffers for this
    file.
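
    The block arithmetic behind this observation can be sketched in user
    space as follows.  This is only an illustration, not driver code: the
    helper names blocks_for_size() and block_index() are hypothetical, and
    the 1 KB block size comes from the "mkfs.nilfs2 -b 1024" option used
    in the script above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: blocks needed to hold file_size bytes (rounds up). */
static size_t blocks_for_size(size_t file_size, size_t block_size)
{
    assert(block_size > 0);
    return (file_size + block_size - 1) / block_size;
}

/* Hypothetical helper: zero-based index of the block holding a byte offset. */
static size_t block_index(size_t offset, size_t block_size)
{
    assert(block_size > 0);
    return offset / block_size;
}
```

    With a 60-byte file and 1024-byte blocks, blocks_for_size() yields one
    block, while a byte offset of 3072 maps to a block index beyond that
    single block, which is why no valid block pointer exists for it.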
    
    The root cause of this issue is in nilfs_set_page_dirty function which
    is called just before writing to an mmapped page.
    
    When nilfs_page_mkwrite function handles a page at EOF boundary, it
    fills hole blocks only inside EOF through __block_page_mkwrite().
    
    The __block_page_mkwrite() function calls set_page_dirty() after filling
    hole blocks, thus nilfs_set_page_dirty function (=
    a_ops->set_page_dirty) is called.  However, the current implementation
    of nilfs_set_page_dirty() wrongly marks all buffers dirty even for page
    at EOF boundary.
    
    As a result, buffers outside EOF are inconsistently marked dirty and
    queued for write even though they are not mapped with nilfs_get_block
    function.
    
    FIX:
    This modifies nilfs_set_page_dirty() not to mark hole blocks dirty.
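
    The difference between the old and the fixed behaviour can be modelled
    in user space roughly as below.  This is a simplified sketch, not the
    actual patch: struct model_buffer, struct model_page and both functions
    are hypothetical stand-ins for the kernel's buffer heads and for
    nilfs_set_page_dirty(), and a 4 KB page with 1 KB blocks is assumed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define BUFFERS_PER_PAGE 4  /* assumed: 4 KB page, 1 KB blocks */

/* Simplified stand-in for a buffer head (not the kernel structure). */
struct model_buffer {
    bool mapped;  /* a block has been assigned to this buffer */
    bool dirty;
};

struct model_page {
    struct model_buffer bufs[BUFFERS_PER_PAGE];
};

/* Old behaviour as described above: every buffer is marked dirty. */
static size_t set_page_dirty_all(struct model_page *page)
{
    size_t i, ndirty = 0;

    for (i = 0; i < BUFFERS_PER_PAGE; i++) {
        if (!page->bufs[i].dirty) {
            page->bufs[i].dirty = true;
            ndirty++;
        }
    }
    return ndirty;
}

/* Fixed behaviour: unmapped (hole) buffers are skipped. */
static size_t set_page_dirty_mapped_only(struct model_page *page)
{
    size_t i, ndirty = 0;

    for (i = 0; i < BUFFERS_PER_PAGE; i++) {
        if (!page->bufs[i].mapped || page->bufs[i].dirty)
            continue;
        page->bufs[i].dirty = true;
        ndirty++;
    }
    return ndirty;
}
```

    For a page where only the first buffer is mapped (the 60-byte file
    case), the old variant dirties all four buffers, whereas the fixed
    variant dirties only the mapped one.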
    
    Thanks to Vyacheslav Dubeyko for his effort on analysis and proposals
    for this issue.
    
    Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
    Reported-by: Anthony Doggett <Anthony2486@interfaces.org.uk>
    Reported-by: Vyacheslav Dubeyko <slava@dubeyko.com>
    Cc: Vyacheslav Dubeyko <slava@dubeyko.com>
    Tested-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
    konis committed May 24, 2013
Commits on May 13, 2013
  1. nilfs2: fix issue with flush kernel thread after remount in RO mode b…

    …ecause of driver's internal error or metadata corruption
    
    The NILFS2 driver remounts itself in RO mode in the case of discovering
    metadata corruption (for example, discovering a broken bmap).  But
    usually, this takes place when there have been file system operations
    before remounting in RO mode.
    
    As a result, the NILFS2 driver can be in RO mode while dirty pages
    remain in modified inodes' address spaces.  The flush kernel thread then
    tries endlessly to flush these dirty pages in RO mode.  The visible
    side effects are: (1) the flush kernel thread occupies 50% - 99% of CPU
    time; (2) the system can't be shut down without a manual power switch
    off.
    
    SYMPTOMS:
    (1) System log contains error message: "Remounting filesystem read-only".
    (2) The flush kernel thread occupies 50% - 99% of CPU time.
    (3) The system can't be shut down without a manual power switch off.
    
    REPRODUCTION PATH:
    (1) Create volume group with name "unencrypted" by means of vgcreate utility.
    (2) Run script (prepared by Anthony Doggett <Anthony2486@interfaces.org.uk>):
    
      ----------------[BEGIN SCRIPT]--------------------
      #!/bin/bash
    
      VG=unencrypted
      #apt-get install nilfs-tools darcs
      lvcreate --size 2G --name ntest $VG
      mkfs.nilfs2 -b 1024 -B 8192 /dev/mapper/$VG-ntest
      mkdir /var/tmp/n
      mkdir /var/tmp/n/ntest
      mount /dev/mapper/$VG-ntest /var/tmp/n/ntest
      mkdir /var/tmp/n/ntest/thedir
      cd /var/tmp/n/ntest/thedir
      sleep 2
      date
      darcs init
      sleep 2
      dmesg|tail -n 5
      date
      darcs whatsnew || true
      date
      sleep 2
      dmesg|tail -n 5
      ----------------[END SCRIPT]--------------------
    
    (3) Try to shut down the system.
    
    REPRODUCIBILITY: 100%
    
    FIX:
    
    This patch checks the mount state of the NILFS2 driver in the
    nilfs_writepage(), nilfs_writepages() and nilfs_mdt_write_page()
    methods.  If the RO mount state is detected, all dirty pages are
    simply discarded and warning messages are written to the system log.
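
    The idea can be modelled in user space roughly as follows.  This is a
    minimal sketch under stated assumptions, not the patch itself: struct
    ro_model_sb, struct ro_model_page and model_writepage() are
    hypothetical names, and "writing" a page is reduced to clearing its
    dirty flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-ins for the superblock and a page-cache page. */
struct ro_model_sb {
    bool read_only;
};

struct ro_model_page {
    bool dirty;
};

enum model_write_result {
    MODEL_PAGE_WRITTEN,
    MODEL_PAGE_DISCARDED
};

/* Model of a writepage method that checks the mount state first. */
static enum model_write_result
model_writepage(const struct ro_model_sb *sb, struct ro_model_page *page)
{
    assert(sb && page);

    if (sb->read_only) {
        /* RO mount: drop the dirty state so the flusher stops retrying. */
        page->dirty = false;
        fprintf(stderr, "warning: discarding dirty page on RO mount\n");
        return MODEL_PAGE_DISCARDED;
    }

    /* Normal path, modelled as a successful write of the page. */
    page->dirty = false;
    return MODEL_PAGE_WRITTEN;
}
```

    In both branches the page ends up clean, so in this model a flusher
    that keeps scanning for dirty pages has nothing left to retry once the
    file system is read-only.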
    
    [akpm@linux-foundation.org: fix printk warning]
    Signed-off-by: Vyacheslav Dubeyko <slava@dubeyko.com>
    Acked-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
    Cc: Anthony Doggett <Anthony2486@interfaces.org.uk>
    Cc: ARAI Shun-ichi <hermes@ceres.dti.ne.jp>
    Cc: Piotr Szymaniak <szarpaj@grubelek.pl>
    Cc: Zahid Chowdhury <zahid.chowdhury@starsolutions.com>
    Cc: Elmer Zhang <freeboy6716@gmail.com>
    Cc: Wu Fengguang <fengguang.wu@intel.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
    
    [applied fix "nilfs2: fix using of PageLocked() in nilfs_clear_dirty_page()"]
    [applied fix "nilfs2: remove unneeded test in nilfs_writepage()"]
    dubeyko committed with konis Apr 30, 2013
Commits on Mar 5, 2013
  1. nilfs2: fix very long mount time issue

    There exists a situation when GC can work alone in the background,
    without any other filesystem activity, for a significant time.
    
    The nilfs_clean_segments() method calls nilfs_segctor_construct(), which
    updates superblocks when the NILFS_SC_SUPER_ROOT and
    THE_NILFS_DISCONTINUED flags are set.  But when GC is working alone,
    nilfs_clean_segments() is called with the THE_NILFS_DISCONTINUED flag
    unset.  As a result, the superblocks are not updated during all this
    time, and in the case of SPOR the superblocks keep very old values of
    the last super root placement.
    
    SYMPTOMS:
    
    Trying to mount a NILFS2 volume after SPOR in such an environment
    results in a very long mount time (it can reach several hours in some
    cases).
    
    REPRODUCING PATH:
    
    1. Use an external USB HDD, disable automount and don't generate any
       additional filesystem activity on the NILFS2 volume.
    
    2. Generate a temporary file of about 100 - 500 GB in size (for example,
       dd if=/dev/zero of=<file_name> bs=1073741824 count=200).  The size of
       the file determines how long GC works.
    
    3. Delete the file.
    
    4. Start GC manually with the command "nilfs-clean -p 0".  When GC is
       started this way, the superblocks are updated only once, at the end.
       So, to simulate SPOR, wait some time (15 - 40 minutes) and simply
       switch off the USB HDD manually.
    
    5. Switch on the USB HDD again and try to mount the NILFS2 volume.  As a
       result, the NILFS2 volume will take a very long time to mount.
    
    REPRODUCIBILITY: 100%
    
    FIX:
    
    This patch adds a check of whether the superblocks need to be updated
    and, if so, sets the THE_NILFS_DISCONTINUED flag before the
    nilfs_clean_segments() call.
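
    The control flow of the fix can be sketched in user space as below.
    This is an illustrative model only: struct model_nilfs and both
    functions are hypothetical, and the time-based "need update" criterion
    (an interval since the last superblock write) is an assumption made for
    the example, since the criterion is not spelled out in this message.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the per-filesystem state. */
struct model_nilfs {
    long last_sb_write;       /* time of last superblock write, seconds */
    long sb_update_interval;  /* assumed update interval, seconds */
    bool discontinued;        /* models THE_NILFS_DISCONTINUED */
};

/* Assumed criterion: the superblocks are older than the interval. */
static bool model_sb_need_update(const struct model_nilfs *nilfs, long now)
{
    return now - nilfs->last_sb_write > nilfs->sb_update_interval;
}

/* Step added before cleaning segments: raise the flag when needed. */
static void model_prepare_clean_segments(struct model_nilfs *nilfs, long now)
{
    assert(nilfs);
    if (model_sb_need_update(nilfs, now))
        nilfs->discontinued = true;
}
```

    With the flag raised before segment cleaning, the later segment
    construction sees both flags and can update the superblocks even when
    GC is the only activity on the volume.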
    
    Reported-by: Sergey Alexandrov <splavgm@gmail.com>
    Signed-off-by: Vyacheslav Dubeyko <slava@dubeyko.com>
    Tested-by: Vyacheslav Dubeyko <slava@dubeyko.com>
    Acked-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
    Tested-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
    dubeyko committed with konis Feb 4, 2013
Commits on Dec 18, 2012
  1. nilfs2-kmod: 0.4.4 release

    Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
    konis committed Dec 18, 2012
  2. nilfs2: fix deprecated barrier warning during discard mount

    On CentOS 6.1 and later, nilfs2 kmod gets the following warning
    during garbage collection if discard mount option is specified:
    
      WARNING: at block/blk-core.c:1379 __make_request+0x535/0x580()
      ...
      block: BARRIER is deprecated, use FLUSH/FUA instead
      ...
    
    This will kill the warning.
    
    Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
    Ryusuke Konishi committed with konis Dec 18, 2012
Commits on Aug 14, 2012
  1. nilfs2: remove references to long gone super operations

    ->delete_inode(), ->write_super_lockfs(), ->unlockfs() are gone so remove
    references to them in the NILFS2 code.  Noticed while cleaning up the
    fsfreeze mess.
    
    Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp>
    Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
    Fernando Luis Vazquez Cao committed with konis Jul 30, 2012
  2. nilfs2: add omitted comments for different structures in driver imple…

    …mentation
    
    Add omitted comments for different structures in driver implementation.
    
    Signed-off-by: Vyacheslav Dubeyko <slava@dubeyko.com>
    Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
    dubeyko committed with konis Jul 30, 2012
  3. nilfs2: add omitted comments for structures in nilfs2_fs.h

    Add omitted comments for structures in nilfs2_fs.h.
    
    Signed-off-by: Vyacheslav Dubeyko <slava@dubeyko.com>
    Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
    dubeyko committed with konis Jul 30, 2012
  4. nilfs2: add omitted comment for ns_mount_state field of the_nilfs str…

    …ucture
    
    Add omitted comment for ns_mount_state field of the_nilfs structure.
    
    Signed-off-by: Vyacheslav Dubeyko <slava@dubeyko.com>
    Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
    dubeyko committed with konis Jul 30, 2012