Commits on Apr 19, 2012
  1. @morrone

    LU-1320 llite: fix a race between readpage and releasepage

    Jinshan Xiong committed with morrone
    This is a race between page stealing and readpage. If a just-read
    page is stolen, readpage finds that the page is not uptodate, treats
    this as a failure, and returns -EIO to the reading application.
    Signed-off-by: Jinshan Xiong <>
    Change-Id: Ib16d12d3bc3cc8c0545aa27f0836e4fd89c3a809
    NOTE: This is patchset 5 from
  2. @nedbass

    LU-1299 clio: handle signal correctly for cl_lock

    Jinshan Xiong committed with nedbass
    Two issues are fixed in this patch:
    * for fault handling, signal is not allowed;
    * if a process is interrupted while waiting for a cl_lock, the lock
      shouldn't be erred out.
    Signed-off-by: Jinshan Xiong <>
    Change-Id: Iffce8be356723781b8f33ec9bdc2cf73e9e07138
  3. @nedbass

    LU-1282 lprocfs: disable interrupts in lprocfs_stats_lock()

    nedbass committed
    Deadlock will occur if we try to take stats->ls_lock under an
    interrupt while already holding the lock.  Use the irqsave version
    of the spinlock to avoid this.
    Signed-off-by: Ned Bass <>
    Change-Id: I9f978f93ff0da446b1632f25ac849143218e64a3
  4. @morrone

    LU-1092 ptlrpc: take export refcount during connect

    Lai Siyao committed with morrone
    During (re)connect, a refcount on the export should be taken;
    otherwise disconnect may be called on this export, put its last
    refcount, and make access to this export
    Signed-off-by: Lai Siyao <>
    Change-Id: Iaf27e842ed516b8968c90bfce396609e39f52c85
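The idea of holding a reference across (re)connect can be sketched in user-space C as follows. This is a minimal illustration under stated assumptions, not the ptlrpc code: `sketch_export`, `sketch_export_get()` and `sketch_export_put()` are hypothetical names, and a plain integer stands in for what would be an atomic refcount in the kernel.

```c
#include <assert.h>

/* Hypothetical stand-in for an export; not the Lustre obd_export. */
struct sketch_export {
        int refcount;   /* a real implementation would use an atomic */
        int destroyed;  /* set once the last reference is dropped */
};

/* Take an extra reference; the caller must already hold a valid one. */
static void sketch_export_get(struct sketch_export *exp)
{
        assert(exp->refcount > 0);
        exp->refcount++;
}

/* Drop a reference; the export is torn down when the count hits zero. */
static void sketch_export_put(struct sketch_export *exp)
{
        assert(exp->refcount > 0);
        if (--exp->refcount == 0)
                exp->destroyed = 1;
}
```

In this sketch the connect path would call `sketch_export_get()` before touching the export and `sketch_export_put()` when it is done, so a concurrent disconnect cannot drop the last reference while connect is still using the export.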
Commits on Apr 17, 2012
  1. @morrone

    LU-1245 lprocfs: use correct cpu number

    Bobi Jam committed with morrone
    Take care of correct cpu number in lprocfs_stats_collector().
    Signed-off-by: Bobi Jam <>
    Change-Id: Ifb149f64ee6d5b67a029331c0d0452fc29533c6b
Commits on Apr 16, 2012
  1. @morrone
Commits on Apr 13, 2012
  1. @morrone

    LU-1282 lprocfs: Add a module param to disable percpu stats

    Bobi Jam committed with morrone
    Add an obdclass module option to choose to use a single lprocfs stats
    structure rather than percpu data.
    Signed-off-by: Bobi Jam <>
    Change-Id: I45d5a05029197e629d4f7d161a5e4e5d01a93bf5
Commits on Apr 10, 2012
  1. @morrone

    LU-81 deadlock of changelog adding vs. changelog cancelling

    Niu Yawei committed with morrone
    This is a workaround for the deadlock of changelog adding vs.
    changelog cancelling. Changelog adding always starts a transaction
    before acquiring the catlog lock (lgh_lock), whereas changelog
    cancelling starts a transaction after taking the catlog lock.
    We start the transaction earlier to avoid this deadlock.
    Signed-off-by: Niu Yawei <>
    Change-Id: I9647b9a559f68a27dc0d4b4885857d3cf73b5b8e
    Tested-by: Hudson
    Tested-by: Maloo <>
    Reviewed-by: Alex Zhuravlev <>
    Reviewed-by: Oleg Drokin <>
Commits on Apr 7, 2012
  1. @morrone

    LU-1282 misc: Use present cpu numbers to save memory.

    Bobi Jam committed with morrone
    lprocfs stats data should be allocated by the number of present cpus
    instead of by the number of possible cpus, which wastes a lot of memory.
    The OSS minimum thread number is also better decided by present cpu
    Signed-off-by: Bobi Jam <>
    Change-Id: Id1690f185f4f83fae75be7eddb756e413cbc4fba
Commits on Apr 6, 2012
  1. @morrone

    LU-1234 dcache: don't drop invalid dentry arbitrarily

    Lai Siyao committed with morrone
    This is a backport of part of LU-506 dcache scalability support:
    * remove the super hack d_rehash_cond(), and treat DCACHE_LUSTRE_INVALID
      similarly to DCACHE_DISCONNECTED, so the dentry doesn't need to
      be dropped and rehashed frequently.
    * .lookup(LOOKUP_CREATE) calls d_add() dentry directly, and .create
      only needs to d_instantiate() this dentry.
    * other cleanups.
    Signed-off-by: Lai Siyao <>
    Change-Id: Ie169bc7e763e6891084999041aac9f62c8dee9f0
Commits on Apr 5, 2012
  1. @nedbass

    LU-1280 ldiskfs: remove LASSERTF from ext3_ext_new_extent_cb()

    Yu Jian committed with nedbass
    The LASSERTF() in ext3_ext_new_extent_cb() was added for
    debugging, to confirm that the race really happened, but
    was never removed from the original patch in .
    Signed-off-by: Yu Jian <>
    Change-Id: I978b8ab88cc4413c7ac00db838f7578f8011b192
Commits on Mar 26, 2012
  1. @morrone

    LU-931 mdd: store lu_fid instead of pointer in md_capainfo

    Hongchao Zhang committed with morrone
    In md_capainfo, mc_fid contains at most 5 pointers to lu_fid.
    If the corresponding lu_fid is freed, the pointer is not updated
    and ends up pointing to freed memory.
    Signed-off-by: Hongchao Zhang <>
    Change-Id: I00088cbfeb145ceac0477467a8b2436f6cf1e530
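The difference between storing a pointer and storing the value can be sketched as follows. This is only an illustration under assumptions: `sketch_fid`, `sketch_capainfo` and `sketch_capainfo_set_fid()` are hypothetical names, and the field layout is not taken from the Lustre headers. The slot count of 5 comes from the commit message.

```c
#include <stdint.h>

#define SKETCH_FID_MAX 5        /* "at most 5" per the commit message */

/* Hypothetical stand-in for a fid; field names are illustrative only. */
struct sketch_fid {
        uint64_t seq;
        uint32_t oid;
        uint32_t ver;
};

/* Holds copies of the fids, so it does not depend on the lifetime of
 * whatever object the caller copied them from. */
struct sketch_capainfo {
        struct sketch_fid fid[SKETCH_FID_MAX];
};

/* Copy *fid into slot idx; returns 0 on success, -1 if idx is out of range. */
static int sketch_capainfo_set_fid(struct sketch_capainfo *ci, int idx,
                                   const struct sketch_fid *fid)
{
        if (idx < 0 || idx >= SKETCH_FID_MAX)
                return -1;
        ci->fid[idx] = *fid;    /* struct assignment copies by value */
        return 0;
}
```

Because the structure owns a copy, freeing the source fid afterwards leaves the stored value intact, at the cost of a few extra bytes per slot.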
  2. @morrone

    LU-1217 clio: debug patch

    Jinshan Xiong committed with morrone
    debug patch to print lock state.
    Signed-off-by: Jinshan Xiong <>
    Change-Id: Ia6ff85a0fdaffb6d03216ad0d69953fb339417a9
Commits on Mar 16, 2012
  1. @morrone

    LU-1206 mdt: Fix error handling in mdt_mfd_open

    Oleg Drokin committed with morrone
    In mdt_mfd_open if the mo_open() call failed or we could not allocate
    mfd, we also need to undo write/exec reference count in order to
    not mess up subsequent exec/write accesses.
    Signed-off-by: Prakash Surya <>
    Signed-off-by: Oleg Drokin <>
    Change-Id: I3bd98bd68368b48f2afaa7bb450d3a9947c992ac
Commits on Mar 13, 2012
  1. @nedbass @morrone

    Increase the lnet chkconfig shutdown number

    nedbass committed with morrone
    Increase the lnet shutdown script number so it will run after the
    netfs on RHEL6.  Mount lustre with the _netdev option so that netfs
    will handle unmounting it before lnet stops.
    LLNL-bug-id: bz1396
    Signed-off-by: Christopher J. Morrone <>
Commits on Mar 3, 2012
  1. @morrone

    LU-1095 debug: Improve recovery console messages

    morrone committed
    Quiet and/or improve a few recovery messages.
    A sysadmin will not understand this:
      2012-03-02 16:27:19 Lustre: 5211:0:(ldlm_lib.c:2072:
      target_queue_recovery_request()) Next recovery transno: 410629539,
      current: 410629539, replaying
    Messages like this are too verbose for the console:
      2012-03-02 16:27:59 LustreError: 5286:0:
      lc3-OST0004: disconnect stale client
    and can be left to this simpler message:
      2012-03-02 16:27:59 Lustre: lc3-OST0005: disconnecting 0 stale
    Signed-off-by: Christopher J. Morrone <>
    Change-Id: I457602c3440ba10475e4ddca7c4e58ef8669922c
Commits on Feb 28, 2012
  1. @behlendorf @morrone

    LU-1095 debug: Report remaining recovery time consistently

    behlendorf committed with morrone
    Consistency is good, always report the remaining recovery time
    in the mm:ss format.  This patch gets the last 3 remaining
    instances where it is simply reported as a total number of seconds.
    Signed-off-by: Christopher J. Morrone <>
    Change-Id: If5599d8c24b1cd862ab89670553fcd24672cadbc
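Converting a seconds count to mm:ss can be sketched like this. `format_remaining()` is a hypothetical helper written for illustration, not a function from the Lustre tree, and the exact format string used by the patch is an assumption.

```c
#include <stdio.h>

/* Hypothetical helper: write secs as minutes:seconds into buf, with the
 * seconds field zero-padded to two digits. Returns snprintf()'s result. */
static int format_remaining(char *buf, size_t len, unsigned int secs)
{
        return snprintf(buf, len, "%u:%02u", secs / 60, secs % 60);
}
```

With this helper, 125 seconds is rendered as `2:05`, which is easier to read on a console than a raw seconds count.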
  2. @morrone

    LU-1095 debug: Improve messages for fake requests

    morrone committed
    Update the console filter to correctly handle fake requests and
    squelch the lov_update_create_set() message for the
     LustreError: 7872:0:(lov_request.c:693:lov_update_create_set()) error
     creating fid 0x104c5e0b sub-object on OST idx 53/2: rc = -107
    Signed-off-by: Christopher J. Morrone <>
    Change-Id: I5f37f585566b053d515665fcddbcc8a3e653d89a
  3. @behlendorf @morrone

    LU-1095 debug: CWARN to CDEBUG for mds_notify() event

    behlendorf committed with morrone
    Both of these warnings represent correct behavior the administrator
    does not need to know about, or more importantly do anything about.
    As such I am moving both of these warnings to CDEBUG(D_CONFIG).
      Lustre: 8099:0:(mds_lov.c:1167:mds_notify()) MDS lc1-MDT0000:
      add target lc1-OST0023_UUID
      Lustre: lc1-MDT0000: in recovery, not resetting orphans on lc1-OST0007_UUID
    Signed-off-by: Christopher J. Morrone <>
  4. @behlendorf @morrone

    LU-1095 debug: Quiet/cleanup various common console messages

    behlendorf committed with morrone
    Turn off several common messages we always observe when restarting
    servers, assuming they are for the expected return code.  Simply
    move others which are not helpful to an administrator to the
    internal debug kernel log.  Less noise means you're more likely
    to spot the important error messages.
    Signed-off-by: Christopher J. Morrone <>
    Change-Id: I523eacfe3c72480d62c915537427aa6c99428647
  5. @behlendorf @morrone

    LU-1095 debug: Common client/server message standardization

    behlendorf committed with morrone
    Enhance and standardize several common messages.  In particular,
    when a peer is involved ensure the peer's nid is in the message, and
    on the server include the obd name in the message.
    Signed-off-by: Christopher J. Morrone <>
    Change-Id: Iaea477e7dab240866a10c1863886d21d674e293d
  6. @behlendorf @morrone

    LU-1095 debug: Standardize, suppress mount/umount messages

    behlendorf committed with morrone
    Standardize mount/umount console message to include profile name,
    and optionally suppress them with the 'quiet' mount option.  We
    have been using private namespaces for testing and mounting then
    umounting the FS as needed for each job.  In this context these
    messages end up causing a lot of syslog noise.
    Signed-off-by: Christopher J. Morrone <>
    Change-Id: I7514f6016c337a358e5e31146644810dff292d02
  7. @morrone

    LU-1095 debug: Send common recovery/startup messages to D_HA

    morrone committed
    These messages are always present at recovery and/or boot time,
    and are not understandable by a sysadmin.
    Signed-off-by: Christopher J. Morrone <>
    Change-Id: I907b0ac49541b20699914dc4f8c5e0db3fb6bec9
  8. @morrone

    LU-1084 ptlrpc: Change CWARNs to CDEBUGs

    morrone committed
    These messages should not appear on the console.  A sysadmin
    will have no idea what to make of most of them.
    Signed-off-by: Christopher J. Morrone <>
    Change-Id: I58bbc1eca9f5082d08cee6c5a95793c0f64ef370
    Tested-by: Hudson
    Reviewed-by: Andreas Dilger <>
    Tested-by: Maloo <>
    Reviewed-by: Oleg Drokin <>
  9. @morrone

    LU-1098 debug: lower debug message level

    Bobi Jam committed with morrone
    File info read and unlink race is normal, we'd lower the debug message
    level since a lot of unnecessary unmasked messages will be generated
    if mdt_object_find() cannot find those deleted objects.
    Signed-off-by: Bobi Jam <>
    Change-Id: I7630e6a1456ffb435c8e67cc626bf38547b840d0
    Tested-by: Hudson
    Reviewed-by: Christopher J. Morrone <>
    Reviewed-by: Andreas Dilger <>
    Tested-by: Maloo <>
    Reviewed-by: Oleg Drokin <>
  10. @morrone

    LU-1128 ldlm: return -1 for server pools shrinker

    Niu Yawei committed with morrone
    The ldlm server pool shrinker is just for decreasing the SLV,
    and it doesn't reclaim any memory directly, so it should be
    sufficient to call the shrinker once when the server is under
    memory pressure. Returning -1 from shrinker can inform kernel
    to stop calling the shrinker repeatedly.
    Signed-off-by: Niu Yawei <>
    Change-Id: I17f51ac84eb0b8c70b2cee9ac7eeca34647c1990
Commits on Feb 27, 2012
  1. @liangzhen @morrone

    LU-143 fid hash improvements

    liangzhen committed with morrone
    Current hash function of fid is not good enough for lu_site and
    We have to use two totally different hash functions to hash fid into
    hash table with millions of entries.
    Change-Id: I6261e63a406118a93d578210c31e67fc7f9e389c
    Signed-off-by: Liang Zhen <>
    Tested-by: Hudson
    Tested-by: Maloo <>
    Reviewed-by: Fan Yong <>
    Reviewed-by: Oleg Drokin <>
  2. @morrone

    LU-459 debug: quiet overly verbose debug messages

    Andreas Dilger committed with morrone
    Some debugging messages are being printed to the console, but
    do not provide any particular value.  Turn these into kernel
    debug messages.
    Signed-off-by: Andreas Dilger <>
    Change-Id: Id8b0624b281ce67501d0d81cd0e89cc020cd669a
    Tested-by: Maloo <>
    Reviewed-by: Niu Yawei <>
    Reviewed-by: Oleg Drokin <>
  3. @morrone

    LU-1095 mgs: move message to debug log

    morrone committed
    There is no good reason for a sysadmin to see this message
    on the console.  Most of the time this will be a fluke
    due to the vagaries of lnet networks (server decides
    client is disconnected, but client doesn't know that yet,
    messages arriving out of order, etc.).
    Signed-off-by: Christopher J. Morrone <>
    Change-Id: I0c18734f82a9c89a5e940ce4e2c602614e89ce26
  4. @morrone

    LU-106 procfs: many proc entries are not accessed safely

    Lai Siyao committed with morrone
    Some in memory data may be released/uninitialized at the time
    of proc entry creation/removal, this patch includes the following
    * initialize data before proc entry creation
    * free data after proc entry removal
    * free proc entries in obd_precleanup() because
      obd_uuid/nid/nid_stats_hash are released in class_cleanup().
    * free proc entries after obd_zombie_barrier() because obd_export
      holds one refcount of nid_stat.
    * check osd->od_mount before accessing osd proc entries because the
      osd proc entries are created before mount.
    Signed-off-by: Lai Siyao <>
    Change-Id: I03cb977e1be0747032a70f6a39fec804f81d70cc
    Tested-by: Hudson
    Reviewed-by: Johann Lombardi <>
    Tested-by: Maloo <>
    Reviewed-by: Oleg Drokin <>
  5. @morrone

    LU-1044 ptlrpc: Fixed a swab race for ptlrpc

    Jinshan Xiong committed with morrone
    ---- debug use only ----
    Do hpreq check before the request is added into exp_hp_rpcs queue.
    Signed-off-by: Jinshan Xiong <>
    Change-Id: I87ef0252795ce8e25f6ba6e0643e52383a24bb6b
  6. @morrone

    ORNL-3 mntopt: consider low-layer options for MDT ACL flags

    Fan Yong committed with morrone
    Currently, MDT layer enables ACL support by default without checking
    whether low-layer (ldiskfs) enables ACL support or not, then causes
    unnecessary data traffic on network and through MDS stack for ACL.
    So MDT should communicate with low-layer before setting ACL flags.
    Signed-off-by: Fan Yong <>
    Change-Id: I804f10bf486745ddd3b23b89e959dfd585589ac0
    Tested-by: Hudson
    Tested-by: Maloo <>
    Reviewed-by: Oleg Drokin <>
  7. @morrone

    LU-874 osc: prioritize writeback pages

    Jinshan Xiong committed with morrone
    When a lock is being canceled, we should prioritize those covering
    pages which have already been submitted by page writeback daemon;
    otherwise, this client may be evicted because there is no active IO
    for that lock for a long time.
    Signed-off-by: Jinshan Xiong <>
    Change-Id: If14eff6361f55d2b2eeb2db7146789dda4c32060
  8. @morrone

    LU-874 ldlm: Fix ldlm_bl_* thread creation

    morrone committed
    Always create a new ldlm_bl_ thread when all threads
    are busy, not just after returning from sleep.
    Change-Id: I2fa99a0f09a42e1333589fc7bc2a6eebef4924b6
    Signed-off-by: Christopher J. Morrone <>
  9. @morrone

    LU-874 ptlrpc: handle in-flight hqreq correctly

    Jinshan Xiong committed with morrone
    If there are in-flight requests pending, we shouldn't timeout the
    covering dlm locks; neither should we remove the requests from export
    exp_hp_rpcs list until the requests are handled.
    In this patch, the following things are improved:
    1. leave IO rpcs in export's hp list until they are handled;
    2. using interval tree to find rpc overlapped locks;
    3. refresh the lock again after IO rpcs are finished to leave a time
       window for clients to cancel covering dlm locks;
    4. cleanup the code.
    Signed-off-by: Jinshan Xiong <>
    Change-Id: I33e2d113d7929a56389741c06dffb5efb6bf28a3