Commits on Dec 14, 2009
  1. Linux 2.6.31.8

    gregkh committed Dec 14, 2009
  2. ext4: Fix potential fiemap deadlock (mmap_sem vs. i_data_sem)

    (cherry picked from commit fab3a54)
    
    Fix the following potential circular locking dependency between
    mm->mmap_sem and ei->i_data_sem:
    
        =======================================================
        [ INFO: possible circular locking dependency detected ]
        2.6.32-04115-gec044c5 #37
        -------------------------------------------------------
        ureadahead/1855 is trying to acquire lock:
         (&mm->mmap_sem){++++++}, at: [<ffffffff81107224>] might_fault+0x5c/0xac
    
        but task is already holding lock:
         (&ei->i_data_sem){++++..}, at: [<ffffffff811be1fd>] ext4_fiemap+0x11b/0x159
    
        which lock already depends on the new lock.
    
        the existing dependency chain (in reverse order) is:
    
        -> #1 (&ei->i_data_sem){++++..}:
               [<ffffffff81099bfa>] __lock_acquire+0xb67/0xd0f
               [<ffffffff81099e7e>] lock_acquire+0xdc/0x102
               [<ffffffff81516633>] down_read+0x51/0x84
               [<ffffffff811a2414>] ext4_get_blocks+0x50/0x2a5
               [<ffffffff811a3453>] ext4_get_block+0xab/0xef
               [<ffffffff81154f39>] do_mpage_readpage+0x198/0x48d
               [<ffffffff81155360>] mpage_readpages+0xd0/0x114
               [<ffffffff811a104b>] ext4_readpages+0x1d/0x1f
               [<ffffffff810f8644>] __do_page_cache_readahead+0x12f/0x1bc
               [<ffffffff810f86f2>] ra_submit+0x21/0x25
               [<ffffffff810f0cfd>] filemap_fault+0x19f/0x32c
               [<ffffffff81107b97>] __do_fault+0x55/0x3a2
               [<ffffffff81109db0>] handle_mm_fault+0x327/0x734
               [<ffffffff8151aaa9>] do_page_fault+0x292/0x2aa
               [<ffffffff81518205>] page_fault+0x25/0x30
               [<ffffffff812a34d8>] clear_user+0x38/0x3c
               [<ffffffff81167e16>] padzero+0x20/0x31
               [<ffffffff81168b47>] load_elf_binary+0x8bc/0x17ed
               [<ffffffff81130e95>] search_binary_handler+0xc2/0x259
               [<ffffffff81166d64>] load_script+0x1b8/0x1cc
               [<ffffffff81130e95>] search_binary_handler+0xc2/0x259
               [<ffffffff8113255f>] do_execve+0x1ce/0x2cf
               [<ffffffff81027494>] sys_execve+0x43/0x5a
               [<ffffffff8102918a>] stub_execve+0x6a/0xc0
    
        -> #0 (&mm->mmap_sem){++++++}:
               [<ffffffff81099aa4>] __lock_acquire+0xa11/0xd0f
               [<ffffffff81099e7e>] lock_acquire+0xdc/0x102
               [<ffffffff81107251>] might_fault+0x89/0xac
               [<ffffffff81139382>] fiemap_fill_next_extent+0x95/0xda
               [<ffffffff811bcb43>] ext4_ext_fiemap_cb+0x138/0x157
               [<ffffffff811be069>] ext4_ext_walk_space+0x178/0x1f1
               [<ffffffff811be21e>] ext4_fiemap+0x13c/0x159
               [<ffffffff811390e6>] do_vfs_ioctl+0x348/0x4d6
               [<ffffffff811392ca>] sys_ioctl+0x56/0x79
               [<ffffffff81028cb2>] system_call_fastpath+0x16/0x1b
    
        other info that might help us debug this:
    
        1 lock held by ureadahead/1855:
         #0:  (&ei->i_data_sem){++++..}, at: [<ffffffff811be1fd>] ext4_fiemap+0x11b/0x159
    
        stack backtrace:
        Pid: 1855, comm: ureadahead Not tainted 2.6.32-04115-gec044c5 #37
        Call Trace:
         [<ffffffff81098c70>] print_circular_bug+0xa8/0xb7
         [<ffffffff81099aa4>] __lock_acquire+0xa11/0xd0f
         [<ffffffff8102f229>] ? sched_clock+0x9/0xd
         [<ffffffff81099e7e>] lock_acquire+0xdc/0x102
         [<ffffffff81107224>] ? might_fault+0x5c/0xac
         [<ffffffff81107251>] might_fault+0x89/0xac
         [<ffffffff81107224>] ? might_fault+0x5c/0xac
         [<ffffffff81124b44>] ? __kmalloc+0x13b/0x18c
         [<ffffffff81139382>] fiemap_fill_next_extent+0x95/0xda
         [<ffffffff811bcb43>] ext4_ext_fiemap_cb+0x138/0x157
         [<ffffffff811bca0b>] ? ext4_ext_fiemap_cb+0x0/0x157
         [<ffffffff811be069>] ext4_ext_walk_space+0x178/0x1f1
         [<ffffffff811be21e>] ext4_fiemap+0x13c/0x159
         [<ffffffff81107224>] ? might_fault+0x5c/0xac
         [<ffffffff811390e6>] do_vfs_ioctl+0x348/0x4d6
         [<ffffffff8129f6d0>] ? __up_read+0x8d/0x95
         [<ffffffff81517fb5>] ? retint_swapgs+0x13/0x1b
         [<ffffffff811392ca>] sys_ioctl+0x56/0x79
         [<ffffffff81028cb2>] system_call_fastpath+0x16/0x1b
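    
    A minimal user-space sketch of one common way to break such a cycle,
    assuming the approach is to stop holding i_data_sem while the fiemap
    callback copies data to user memory (the copy may fault and take
    mmap_sem).  The names fm_inode, fm_extent, fm_fill_cb and
    fm_walk_one_extent are hypothetical, and the semaphore is modelled as
    a plain flag; this is an illustration, not the kernel code.

```c
#include <string.h>

/* Hypothetical stand-ins: a flag models i_data_sem, a struct models one extent. */
struct fm_extent { unsigned long start, len; };
struct fm_inode  { int data_sem_held; struct fm_extent ext; };

/* Records whether the lock was held when the "may fault" copy ran. */
static int fm_lock_held_during_copy = -1;

/* Models the fill callback: the copy to the caller's buffer may fault. */
static void fm_fill_cb(const struct fm_inode *ino, const struct fm_extent *src,
                       struct fm_extent *user_dst)
{
    fm_lock_held_during_copy = ino->data_sem_held;
    memcpy(user_dst, src, sizeof(*src));
}

/* Look the extent up under the lock, drop the lock, then call the callback. */
static void fm_walk_one_extent(struct fm_inode *ino, struct fm_extent *user_dst)
{
    struct fm_extent snapshot;

    ino->data_sem_held = 1;     /* models down_read(i_data_sem) */
    snapshot = ino->ext;        /* extent lookup happens under the lock */
    ino->data_sem_held = 0;     /* models up_read(i_data_sem) */

    fm_fill_cb(ino, &snapshot, user_dst);   /* i_data_sem is not held here */
}
```

    With this ordering, a page fault taken during the copy never nests
    mmap_sem inside i_data_sem.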
    
    Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    tytso committed with gregkh Dec 10, 2009
  3. signal: Fix alternate signal stack check

    commit 2a855dd upstream.
    
    All architectures in the kernel increment/decrement the stack pointer
    before storing values on the stack.
    
    On architectures which have the stack grow down sas_ss_sp == sp is not
    on the alternate signal stack while sas_ss_sp + sas_ss_size == sp is
    on the alternate signal stack.
    
    On architectures which have the stack grow up sas_ss_sp == sp is on
    the alternate signal stack while sas_ss_sp + sas_ss_size == sp is not
    on the alternate signal stack.
    
    The current implementation fails for architectures which have the
    stack grow down on the corner case where sas_ss_sp == sp.  This was
    reported as Debian bug #544905 on AMD64.
    Simplified test case: http://download.breakpoint.cc/tc-sig-stack.c
    
    The test case creates the following stack scenario:
       0xn0300	stack top
       0xn0200	alt stack pointer top (when switching to alt stack)
       0xn01ff	alt stack end
       0xn0100	alt stack start == stack pointer
    
    If the signal is sent the stack pointer is pointing to the base
    address of the alt stack and the kernel erroneously decides that it
    has already switched to the alternate stack because of the current
    check for "sp - sas_ss_sp < sas_ss_size"
    
    On parisc (stack grows up) the scenario would be:
       0xn0200	stack pointer
       0xn01ff	alt stack end
       0xn0100	alt stack start = alt stack pointer base
       		    	  	  (when switching to alt stack)
       0xn0000	stack base
    
    This is handled correctly by the current implementation.
    
    [ tglx: Modified for archs which have the stack grow up (parisc) which
      	would fail with the correct implementation for stack grows
      	down. Added a check for sp >= current->sas_ss_sp which is
      	strictly not necessary but makes the code symmetric for both
      	variants ]
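    
    The boundary cases described above can be sketched in plain C.  This
    is a stand-alone illustration, not the kernel function: the kernel
    reads the alternate stack base and size from current, whereas the
    hypothetical helpers below take them as parameters.

```c
/* Old check, as quoted in the message: "sp - sas_ss_sp < sas_ss_size". */
static int on_sig_stack_old(unsigned long sp, unsigned long ss_sp,
                            unsigned long ss_size)
{
    return sp - ss_sp < ss_size;
}

/* Stack grows down: the base is excluded, base + size is included. */
static int on_sig_stack_grows_down(unsigned long sp, unsigned long ss_sp,
                                   unsigned long ss_size)
{
    return sp > ss_sp && sp - ss_sp <= ss_size;
}

/* Stack grows up (parisc): the base is included, base + size is excluded. */
static int on_sig_stack_grows_up(unsigned long sp, unsigned long ss_sp,
                                 unsigned long ss_size)
{
    return sp >= ss_sp && sp - ss_sp < ss_size;
}
```

    With a base of 0x100 and a size of 0x100, the old check reports
    sp == 0x100 as being on the alternate stack, while the grows-down
    variant does not.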
    
    Signed-off-by: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
    Cc: Oleg Nesterov <oleg@redhat.com>
    Cc: Roland McGrath <roland@redhat.com>
    Cc: Kyle McMartin <kyle@mcmartin.ca>
    LKML-Reference: <20091025143758.GA6653@Chamillionaire.breakpoint.cc>
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    Sebastian Andrzej Siewior committed with gregkh Oct 25, 2009
  4. SCSI: scsi_lib_dma: fix bug with dma maps on nested scsi objects

    commit d139b9b upstream.
    
    Some of our virtual SCSI hosts don't have a proper bus parent at the
    top, which can be a problem for doing DMA on them.
    
    This patch makes the host device cache a pointer to the physical bus
    device and provides an extra API for setting it (the normal API picks
    it up from the parent).  This patch also modifies the qla2xxx and lpfc
    vport logic to use the new DMA host setting API.
    
    Acked-By: James Smart  <james.smart@emulex.com>
    Signed-off-by: James Bottomley <James.Bottomley@suse.de>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    James Bottomley committed with gregkh Nov 5, 2009
  5. SCSI: osd_protocol.h: Add missing #include

    commit 0899638 upstream.
    
    include/scsi/osd_protocol.h uses ALIGN() without an #include
    <linux/kernel.h>, leading to:
    | include/scsi/osd_protocol.h:362: error: implicit declaration of function 'ALIGN'
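    
    For context, ALIGN() rounds a value up to a multiple of a
    power-of-two alignment.  A stand-alone sketch of that behaviour
    follows; TOY_ALIGN and padded_len are hypothetical names, chosen so
    as not to suggest this is the exact kernel definition or the code in
    osd_protocol.h.

```c
#include <stddef.h>

/* Round x up to the next multiple of a; a must be a power of two. */
#define TOY_ALIGN(x, a) (((x) + ((size_t)(a) - 1)) & ~((size_t)(a) - 1))

/* Hypothetical helper: length of a field once padded to 8 bytes. */
static size_t padded_len(size_t len)
{
    return TOY_ALIGN(len, 8);
}
```

    Without a declaration of the macro in scope, a C89 compiler treats
    the use as a call to an undeclared function, which is the error shown
    above.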
    
    Signed-off-by: Martin Michlmayr <tbm@cyrius.com>
    Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
    Signed-off-by: James Bottomley <James.Bottomley@suse.de>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    tbm committed with gregkh Nov 16, 2009
  6. SCSI: megaraid_sas: fix 64 bit sense pointer truncation

    commit 7b2519a upstream.
    
    The current sense pointer is cast to a u32 pointer, which can truncate
    on 64 bits.  Fix by using unsigned long instead.
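    
    A small sketch of why a 32-bit store loses part of a 64-bit address.
    It uses uint64_t to model the address so that it behaves the same on
    any host; the function names are hypothetical and this is not the
    driver code.

```c
#include <stdint.h>

/* Models the buggy store: the address passes through a 32-bit type. */
static uint64_t store_via_u32(uint64_t addr)
{
    uint32_t slot = (uint32_t)addr;   /* upper 32 bits are discarded */
    return slot;
}

/* Models the fixed store: the slot is as wide as the address. */
static uint64_t store_via_full_width(uint64_t addr)
{
    uint64_t slot = addr;
    return slot;
}
```

    On 64-bit Linux, unsigned long is 64 bits wide, so it can hold the
    whole pointer.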
    
    Signed-off-by: Bo Yang <bo.yang@lsi.com>
    Signed-off-by: James Bottomley <James.Bottomley@suse.de>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    Yang, Bo committed with gregkh Oct 6, 2009
  7. ext4: Fix insufficient checks in EXT4_IOC_MOVE_EXT

    (cherry picked from commit 4a58579)
    
    This patch fixes three problems in the handling of the
    EXT4_IOC_MOVE_EXT ioctl:
    
    1.  The current EXT4_IOC_MOVE_EXT code only checks read access on the
    original and donor files, which allows illegal write access to the
    donor file, since the donor file is overwritten by the original
    file's data.  To fix this problem, change the access mode checks of
    the original (r->r/w) and donor (r->w) files.
    
    2.  Disallow the use of donor files that have the setuid or setgid bit set.
    
    3.  Call mnt_want_write() before and mnt_drop_write() after calling
    ext4_move_extents(), to get write access to the mount.
    
    Signed-off-by: Akira Fujita <a-fujita@rs.jp.nec.com>
    Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    Akira Fujita committed with gregkh Dec 7, 2009
  8. ext4: Wait for proper transaction commit on fsync

    (cherry picked from commit b436b9b)
    
    We cannot rely on buffer dirty bits during fsync because pdflush can come
    before fsync is called and clear dirty bits without forcing a transaction
    commit. What we do is that we track which transaction has last changed
    the inode and which transaction last changed allocation and force it to
    disk on fsync.
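    
    A minimal sketch of the bookkeeping described above.  The struct and
    field names are illustrative, and choosing the allocation transaction
    for a data-only sync is an assumption of this sketch rather than
    something the message states.

```c
/* Hypothetical per-inode bookkeeping; the field names are illustrative. */
struct fs_inode_info {
    unsigned int sync_tid;      /* transaction that last changed the inode */
    unsigned int datasync_tid;  /* transaction that last changed allocation */
};

/* Pick the transaction that a sync request has to force to disk. */
static unsigned int fs_tid_to_commit(const struct fs_inode_info *ei,
                                     int datasync)
{
    return datasync ? ei->datasync_tid : ei->sync_tid;
}
```

    Because the decision depends on recorded transaction IDs, it no
    longer matters whether pdflush has already cleared the dirty bits.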
    
    Signed-off-by: Jan Kara <jack@suse.cz>
    Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    jankara committed with gregkh Dec 10, 2009
  9. ext4: fix incorrect block reservation on quota transfer.

    (cherry picked from commit 194074a)
    
    Inside a ->setattr() call both ATTR_UID and ATTR_GID may be valid.
    This means that we may end up transferring all quotas, so
    we have to reserve QUOTA_DEL_BLOCKS for all quotas, as we do in
    the case of QUOTA_INIT_BLOCKS.
    
    Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
    Reviewed-by: Mingming Cao <cmm@us.ibm.com>
    Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    dmonakhov committed with gregkh Dec 9, 2009
  10. ext4: quota macros cleanup

    (cherry picked from commit 5aca07e)
    
    Currently all quota block reservation macros contain a hard-coded "2",
    aka the MAXQUOTAS value. This is no good because in some places it is
    not obvious what this digit represents. Let's introduce a
    new macro with a self-descriptive name.
    
    Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
    Acked-by: Mingming Cao <cmm@us.ibm.com>
    Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    dmonakhov committed with gregkh Dec 9, 2009
  11. ext4: ext4_get_reserved_space() must return bytes instead of blocks

    (cherry picked from commit 8aa6790)
    
    Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
    Reviewed-by: Eric Sandeen <sandeen@redhat.com>
    Acked-by: Mingming Cao <cmm@us.ibm.com>
    Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    dmonakhov committed with gregkh Dec 9, 2009
  12. ext4: remove blocks from inode prealloc list on failure

    (cherry picked from commit b844167)
    
    This fixes a leak of blocks in an inode prealloc list if device failures
    cause ext4_mb_mark_diskspace_used() to fail.
    
    Signed-off-by: Curt Wohlgemuth <curtw@google.com>
    Acked-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
    Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    Curt Wohlgemuth committed with gregkh Dec 9, 2009
  13. ext4: wait for log to commit when umounting

    (cherry picked from commit d4edac3)
    
    There is a potential race when a transaction is committing right when
    the file system is being unmounted.  This could result in a race
    because EXT4_SB(sb)->s_group_info could be freed in ext4_put_super
    before the commit code calls a callback so the mballoc code can
    release freed blocks in the transaction, resulting in a panic trying
    to access the freed s_group_info.
    
    The fix is to wait for the transaction to finish committing before we
    shut down the multiblock allocator.
    
    Signed-off-by: Josef Bacik <josef@redhat.com>
    Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    Josef Bacik committed with gregkh Dec 9, 2009
  14. ext4: Avoid data / filesystem corruption when write fails to copy data

    (cherry picked from commit b9a4207)
    
    When ext4_write_begin fails after allocating some blocks or
    generic_perform_write fails to copy data to write, we truncate blocks
    already instantiated beyond i_size.  Although these blocks were never
    inside i_size, we have to truncate the pagecache of these blocks so
    that corresponding buffers get unmapped.  Otherwise subsequent
    __block_prepare_write (called because we are retrying the write) will
    find the buffers mapped, not call ->get_block, and thus the page will
    be backed by already freed blocks leading to filesystem and data
    corruption.
    
    Signed-off-by: Jan Kara <jack@suse.cz>
    Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    jankara committed with gregkh Dec 9, 2009
  15. ext4: Return the PTR_ERR of the correct pointer in setup_new_group_blocks()
    
    (cherry picked from commit c09eef3)
    
    Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
    Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    RoelKluin committed with gregkh Dec 7, 2009
  16. jbd2: Add ENOMEM checking in and for jbd2_journal_write_metadata_buffer()
    
    (cherry picked from commit e6ec116)
    
    OOM happens.
    
    Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    tytso committed with gregkh Dec 1, 2009
  17. ext4: move_extent_per_page() cleanup

    (cherry picked from commit ac48b0a)
    
    Integrate duplicate lines (acquire/release semaphore and invalidate
    extent cache in move_extent_per_page()) into mext_replace_branches(),
    to reduce source and object code size.
    
    Signed-off-by: Akira Fujita <a-fujita@rs.jp.nec.com>
    Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    Akira Fujita committed with gregkh Nov 24, 2009
  18. ext4: initialize moved_len before calling ext4_move_extents()

    (cherry picked from commit 446aaa6)
    
    The move_extent.moved_len is used to pass back the number of exchanged
    blocks count to user space.  Currently the caller must clear this
    field; but we spend more code space checking for this requirement than
    simply zeroing the field ourselves, so let's just make life easier for
    everyone all around.
    
    Signed-off-by: Kazuya Mio <k-mio@sx.jp.nec.com>
    Signed-off-by: Akira Fujita <a-fujita@rs.jp.nec.com>
    Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    Kazuya Mio committed with gregkh Nov 24, 2009
  19. ext4: Fix double-free of blocks with EXT4_IOC_MOVE_EXT

    (cherry picked from commit 94d7c16)
    
    At the beginning of ext4_move_extents(), we call
    ext4_discard_preallocations() to discard inode PAs of orig and donor
    inodes.  But in the following case, blocks can be double freed, so
    move ext4_discard_preallocations() to the end of ext4_move_extents().
    
    1. Discard inode PAs of orig and donor inodes with
       ext4_discard_preallocations() in ext4_move_extents().
    
       orig : [ DATA1 ]
       donor: [ DATA2 ]
    
    2. While data blocks are being exchanged between the orig and donor
       inodes, new inode PAs are created for orig by another process's
       block allocation (since there are semaphore gaps in
       ext4_move_extents()), and the new inode PAs are partially used (2-1).
    
       2-1 Create new inode PAs to orig inode
       orig : [ DATA1 | used PA1 | free PA1 ]
       donor: [ DATA2 ]
    
    3. The donor inode, which now has the old orig inode's blocks, is
       deleted after EXT4_IOC_MOVE_EXT finishes (3-1, 3-2).  So the block
       bitmap bits that correspond to the old orig inode's blocks are freed.
    
       3-1 After EXT4_IOC_MOVE_EXT finished
       orig : [ DATA2 |  free PA1 ]
       donor: [ DATA1 |  used PA1 ]
    
       3-2 Delete donor inode
       orig : [ DATA2 |  free PA1 ]
       donor: [ FREE SPACE(DATA1) | FREE SPACE(used PA1) ]
    
    4. The double free of blocks occurs when close() is called on the
       orig inode, because ext4_discard_preallocations() for the orig inode
       frees used PA1 and free PA1, though used PA1 was already freed in 3.
    
       4-1 Double free of blocks occurs
       orig : [ DATA2 |  FREE SPACE(free PA1) ]
       donor: [ FREE SPACE(DATA1) | DOUBLE FREE(used PA1) ]
    
    Signed-off-by: Akira Fujita <a-fujita@rs.jp.nec.com>
    Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    Akira Fujita committed with gregkh Nov 24, 2009
  20. ext4: make "norecovery" an alias for "noload"

    (cherry picked from commit e3bb52a)
    
    Users on the linux-ext4 list recently complained about differences
    across filesystems w.r.t. how to mount without a journal replay.
    
    In the discussion it was noted that xfs's "norecovery" option is
    perhaps more descriptively accurate than "noload," so let's make
    that an alias for ext4.
    
    Also show this status in /proc/mounts.
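    
    An alias of this kind is typically expressed by mapping two option
    strings to the same token.  The sketch below is a stand-alone
    illustration with hypothetical names (mo_tokens, mo_parse_option); it
    is not the ext4 option table.

```c
#include <string.h>

enum mo_opt { MO_OPT_UNKNOWN = 0, MO_OPT_NOLOAD };

/* Two spellings map to one token, so the rest of the code sees one option. */
static const struct { const char *name; enum mo_opt token; } mo_tokens[] = {
    { "noload",     MO_OPT_NOLOAD },
    { "norecovery", MO_OPT_NOLOAD },
};

/* Return the token for a mount option string, or MO_OPT_UNKNOWN. */
static enum mo_opt mo_parse_option(const char *s)
{
    size_t i;

    for (i = 0; i < sizeof(mo_tokens) / sizeof(mo_tokens[0]); i++)
        if (strcmp(s, mo_tokens[i].name) == 0)
            return mo_tokens[i].token;
    return MO_OPT_UNKNOWN;
}
```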
    
    Signed-off-by: Eric Sandeen <sandeen@redhat.com>
    Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    Eric Sandeen committed with gregkh Nov 19, 2009
  21. ext4: make trim/discard optional (and off by default)

    (cherry picked from commit 5328e63)
    
    It is anticipated that when sb_issue_discard starts doing
    real work on trim-capable devices, we may see issues.  Make
    this mount-time optional, and default it to off until we know
    that things are working out OK.
    
    Signed-off-by: Eric Sandeen <sandeen@redhat.com>
    Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    Eric Sandeen committed with gregkh Nov 19, 2009
  22. ext4: fix error handling in ext4_ind_get_blocks()

    (cherry picked from commit 2bba702)
    
    When an error happened in ext4_splice_branch we failed to notice that
    in ext4_ind_get_blocks and mapped the buffer anyway. Fix the problem
    by checking for error properly.
    
    Signed-off-by: Jan Kara <jack@suse.cz>
    Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    jankara committed with gregkh Nov 23, 2009
  23. ext4: avoid issuing unnecessary barriers

    (cherry picked from commit 6b17d90)
    
    We don't need to issue an I/O barrier on an error or if we force commit
    because we are doing data journaling.
    
    Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
    Cc: Jan Kara <jack@suse.cz>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    tytso committed with gregkh Nov 23, 2009
  24. ext4: fix block validity checks so they work correctly with meta_bg

    (cherry picked from commit 1032988)
    
    The block validity checks used by ext4_data_block_valid() weren't
    correctly written to check file systems with the meta_bg feature.  Fix
    this.
    
    Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    tytso committed with gregkh Nov 15, 2009
  25. ext4: fix uninit block bitmap initialization when s_meta_first_bg is non-zero
    
    (cherry picked from commit 8dadb19)
    
    The number of old-style block group descriptor blocks is
    s_meta_first_bg when the meta_bg feature flag is set.
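    
    The rule stated above can be written as a small helper.  This is a
    sketch with hypothetical names; the fallback for file systems without
    meta_bg (all descriptor blocks are old-style) is an assumption of the
    sketch, not part of the message.

```c
/* Number of old-style (non-meta_bg) group descriptor blocks. */
static unsigned long mb_base_gdb_blocks(int has_meta_bg,
                                        unsigned long first_meta_bg,
                                        unsigned long total_gdb_blocks)
{
    if (has_meta_bg)
        return first_meta_bg;     /* only the first ones use the old layout */
    return total_gdb_blocks;      /* assumption: no meta_bg, all old-style */
}
```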
    
    Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    tytso committed with gregkh Nov 23, 2009
  26. ext4: don't update the superblock in ext4_statfs()

    (cherry picked from commit 3f8fb94)
    
    commit a71ce8c updated ext4_statfs()
    to update the on-disk superblock counters, but modified this buffer
    directly without any journaling of the change.  This is one of the
    accesses that was causing the crc errors in journal replay as seen in
    kernel.org bugzilla #14354.
    
    Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    tytso committed with gregkh Nov 23, 2009
  27. ext4: journal all modifications in ext4_xattr_set_handle

    (cherry picked from commit 86ebfd0)
    
    ext4_xattr_set_handle() was zeroing out an inode outside
    of journaling constraints; this is one of the accesses that
    was causing the crc errors in journal replay as seen in
    kernel.org bugzilla #14354.
    
    Reviewed-by: Andreas Dilger <adilger@sun.com>
    Signed-off-by: Eric Sandeen <sandeen@redhat.com>
    Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    Eric Sandeen committed with gregkh Nov 15, 2009
  28. ext4: fix i_flags access in ext4_da_writepages_trans_blocks()

    (cherry picked from commit 30c6e07)
    
    We need to be testing the i_flags field in the ext4 specific portion
    of the inode, instead of the (confusingly aliased) i_flags field in
    the generic struct inode.
    
    Signed-off-by: Julia Lawall <julia@diku.dk>
    Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    JuliaLawall committed with gregkh Nov 15, 2009
  29. ext4: make sure directory and symlink blocks are revoked

    (cherry picked from commit 5068969)
    
    When an inode gets unlinked, the functions ext4_clear_blocks() and
    ext4_remove_blocks() call ext4_forget() for all the buffer heads
    corresponding to the deleted inode's data blocks.  If the inode is a
    directory or a symlink, the is_metadata parameter must be non-zero so
    ext4_forget() will revoke them via jbd2_journal_revoke().  Otherwise,
    if these blocks are reused for a data file, and the system crashes
    before a journal checkpoint, the journal replay could end up
    corrupting these data blocks.
    
    Thanks to Curt Wohlgemuth for pointing out potential problems in this
    area.
    
    Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    tytso committed with gregkh Nov 23, 2009
  30. ext4: plug a buffer_head leak in an error path of ext4_iget()

    (cherry picked from commit 567f3e9)
    
    One of the invalid error paths in ext4_iget() forgot to brelse() the
    inode buffer head.  Fix it by adding a brelse() in the common error
    return path, which also simplifies the function.
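    
    The common-error-path pattern can be sketched as follows.  The
    refcounted ig_buf stands in for a buffer head, and the failure points
    selected by fail_at are hypothetical; this is an illustration, not
    the body of ext4_iget().

```c
/* Hypothetical buffer with a reference count standing in for a buffer head. */
struct ig_buf { int refs; };

static void ig_brelse(struct ig_buf *bh)
{
    if (bh)
        bh->refs--;
}

/*
 * fail_at selects which (hypothetical) validation step fails; 0 means none.
 * Every failure jumps to the same label, so the release cannot be forgotten.
 */
static int ig_iget(struct ig_buf *bh, int fail_at)
{
    int ret = 0;

    bh->refs++;                 /* models reading the inode buffer */

    if (fail_at == 1) { ret = -1; goto bad_inode; }
    if (fail_at == 2) { ret = -2; goto bad_inode; }
    if (fail_at == 3) { ret = -3; goto bad_inode; }

    ig_brelse(bh);              /* the success path releases as well */
    return 0;

bad_inode:
    ig_brelse(bh);              /* single release for all error paths */
    return ret;
}
```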
    
    Thanks to Andi Kleen <ak@linux.intel.com> for reporting the problem.
    
    Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    tytso committed with gregkh Nov 14, 2009
  31. ext4: fix possible recursive locking warning in EXT4_IOC_MOVE_EXT

    (cherry picked from commit 49bd22b)
    
    If CONFIG_PROVE_LOCKING is enabled, the double_down_write_data_sem()
    will trigger a false-positive warning of a recursive lock.  Since we
    take i_data_sem for the two inodes ordered by their inode numbers,
    this isn't a problem.  Use of down_write_nested() will notify the lock
    dependency checker machinery that there is no problem here.
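    
    A stand-alone sketch of ordered double locking.  The lock is modelled
    as a flag plus a sequence number so the acquisition order can be
    observed; taking the lower inode number first is an assumption of
    this sketch, and all names are hypothetical.

```c
/* Hypothetical inode; the lock is a flag, lock_seq records when it was taken. */
struct ord_inode { unsigned long ino; int locked; int lock_seq; };

static int ord_seq;   /* global counter recording acquisition order */

static void ord_lock(struct ord_inode *i)
{
    i->locked = 1;
    i->lock_seq = ++ord_seq;
}

/* Always take the lock of the inode with the lower number first. */
static void ord_double_lock(struct ord_inode *a, struct ord_inode *b)
{
    struct ord_inode *first = a, *second = b;

    if (a->ino > b->ino) {
        first = b;
        second = a;
    }
    ord_lock(first);    /* in the kernel this would be down_write() */
    ord_lock(second);   /* and this down_write_nested(), per the message */
}
```

    Because every caller uses the same order, two tasks locking the same
    pair cannot deadlock on each other; the nested annotation only tells
    the lock checker that the second acquisition is intentional.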
    
    This problem was reported by Brian Rogers:
    
    	http://marc.info/?l=linux-ext4&m=125115356928011&w=1
    
    Reported-by: Brian Rogers <brian@xyzw.org>
    Signed-off-by: Akira Fujita <a-fujita@rs.jp.nec.com>
    Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    Akira Fujita committed with gregkh Nov 23, 2009
  32. ext4: fix lock order problem in ext4_move_extents()

    (cherry picked from commit fc04cb4)
    
    ext4_move_extents() checks the logical block contiguousness
    of the original file with ext4_ext_find_extent() and mext_next_extent().
    Therefore the extent which the ext4_ext_path structure indicates
    must not be changed between these functions.
    
    But in the current implementation, there is no i_data_sem protection
    between ext4_ext_find_extent() and mext_next_extent().  So the extent
    which the ext4_ext_path structure indicates may be overwritten by
    delalloc.  As a result, ext4_move_extents() will exchange the wrong
    blocks between the original and donor files.  I changed the place where
    i_data_sem is acquired and released to solve this problem.
    
    Moreover, I changed move_extent_per_page() to start the transaction
    first, and then acquire i_data_sem.  Without this change, there is a
    possibility of deadlock between mmap() and ext4_move_extents():
    
    * NOTE: "A", "B" and "C" mean different processes
    
    A-1: ext4_move_extents() acquires i_data_sem of two inodes.
    
    B:   do_page_fault() starts the transaction (T),
         and then tries to acquire i_data_sem.
         But process "A" is already holding it, so it is kept waiting.
    
    C:   While "A" and "B" are running, kjournald2 tries to commit
         transaction (T), but it is still being updated, so kjournald2
         waits for it.
    
    A-2: Calls ext4_journal_start() while holding i_data_sem,
         but transaction (T) is locked.
    
    Signed-off-by: Akira Fujita <a-fujita@rs.jp.nec.com>
    Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    Akira Fujita committed with gregkh Nov 23, 2009
  33. ext4: fix the returned block count if EXT4_IOC_MOVE_EXT fails

    (cherry picked from commit f868a48)
    
    If the EXT4_IOC_MOVE_EXT ioctl fails, the number of blocks that were
    exchanged before the failure should be returned to the userspace
    caller.  Unfortunately, currently if the block size is not the same as
    the page size, the block count that is returned is the
    page-aligned block count instead of the actual block count.  This
    commit addresses this bug.
    
    Signed-off-by: Akira Fujita <a-fujita@rs.jp.nec.com>
    Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    Akira Fujita committed with gregkh Nov 23, 2009
  34. ext4: avoid divide by zero when trying to mount a corrupted file system

    (cherry picked from commit 503358a)
    
    If s_log_groups_per_flex is greater than 31, then groups_per_flex
    will overflow and cause a divide by zero error.  This can cause a kernel
    BUG if such a file system is mounted.
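    
    A sketch of validating the on-disk value before shifting.  The
    function name and the accepted range of 1..31 are assumptions of this
    sketch; it only illustrates rejecting values that would make the
    shift overflow.

```c
/*
 * Returns the number of groups per flex group, or 0 if the on-disk log
 * value is out of range and flex group handling should be skipped.
 * The accepted range 1..31 is an assumption of this sketch.
 */
static unsigned int fx_groups_per_flex(unsigned int log_groups_per_flex)
{
    if (log_groups_per_flex < 1 || log_groups_per_flex > 31)
        return 0;
    return 1U << log_groups_per_flex;
}
```

    A caller that treats 0 as "do not use flex groups" never divides by
    the result, so a corrupted superblock can no longer trigger the
    division by zero.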
    
    Thanks to Nageswara R Sastry for analyzing the failure and providing
    an initial patch.
    
    http://bugzilla.kernel.org/show_bug.cgi?id=14287
    
    Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    tytso committed with gregkh Nov 23, 2009
  35. ext4: fix potential buffer head leak when add_dirent_to_buf() returns ENOSPC
    
    (cherry picked from commit 2de770a)
    
    Previously add_dirent_to_buf() did not free its passed-in buffer head
    in the case of ENOSPC, since in some cases the caller still needed it.
    However, this led to potential buffer head leaks since not all callers
    dealt with this correctly.  Fix this by simplifying the freeing
    convention; now add_dirent_to_buf() *never* frees the passed-in buffer
    head, and leaves that to the responsibility of its caller.  This makes
    things cleaner and easier to prove that the code is neither leaking
    buffer heads nor calling brelse() one time too many.
    
    Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
    Cc: Curt Wohlgemuth <curtw@google.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    tytso committed with gregkh Nov 23, 2009