Reimplement rwlocks for Linux lock profiling/analysis.
It turns out that the previous rwlock implementation worked well but
did not integrate properly with the upstream kernel lock profiling/
analysis tools.  This is a major problem since it would be awfully
nice to be able to use the automatic lock checker and profiler.

The problem is that the upstream lock tools use the pre-processor
to create a lock class for each uniquely named lock.  Since the
rwsem was embedded in a wrapper structure the name was always the
same.  The effect was that we only ended up with one lock class for
the entire SPL which caused the lock dependency checker to flag
nearly everything as a possible deadlock.
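
As a rough userspace illustration of that effect (a toy model, not the
kernel's lockdep code), the sketch below uses a macro that creates one
static key per expansion site and records the stringified argument as the
name.  All toy_* names are hypothetical.  Initializing locks directly gives
each call site its own class, while going through a shared wrapper function
expands the macro only once, so every wrapped lock lands in the same class.

```c
#include <assert.h>
#include <string.h>

/* Toy stand-in for a lock-class key; hypothetical, not a kernel type. */
struct toy_class_key { char unused; };

struct toy_lock {
    const struct toy_class_key *key;  /* identifies the lock class */
    const char *name;                 /* name a checker would report */
};

static void toy_init_with_key(struct toy_lock *l, const char *name,
                              const struct toy_class_key *key)
{
    l->key = key;
    l->name = name;
}

/*
 * Macro form: every place this macro is expanded gets its own static
 * key and records the literal text of its argument as the lock name.
 */
#define toy_init(l)                                 \
    do {                                            \
        static const struct toy_class_key key_;     \
        toy_init_with_key((l), #l, &key_);          \
    } while (0)

/* Wrapper form: the lock is embedded in another structure. */
struct toy_wrapper { struct toy_lock sem; };

/*
 * The macro is expanded exactly once, here, so every wrapper shares
 * one key and one name ("&w->sem") no matter who the caller is.
 */
static void toy_wrapper_init(struct toy_wrapper *w)
{
    toy_init(&w->sem);
}
```

Calling toy_init() on two separate locks yields two distinct keys, whereas
two calls to toy_wrapper_init() yield the same key and the same name, which
is the single-class situation described above.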

The solution was to directly map a krwlock to a Linux rwsem using
a typedef, thereby eliminating the wrapper structure.  This was not
done initially because the rwsem implementation is specific to the arch.
It is not possible to fully implement the Solaris krwlock API using only
the provided rwsem API.  It can only be done by directly accessing some of
the internal data members of the rwsem structure.

For example, the Linux API provides different functions for dropping
a reader vs writer lock, whereas the Solaris API uses the same function
and the caller does not pass in what type of lock it is.  This means that
to properly drop the lock we need to determine if the lock is currently a
reader or writer lock, and then call the proper Linux API function.
Unfortunately, there is no provided API for this so we must extract this
information directly from the arch specific lock implementation.  This is
all doable, and it is what I did, but it does complicate things considerably.
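
A minimal single-threaded sketch of that dispatch follows.  It assumes the
counter convention used by the generic spinlock rwsem (0 means unlocked, a
positive value is the number of active readers, -1 is one active writer).
The toy_* names are hypothetical, nothing here blocks or is safe for
concurrent use, and the real rw_exit() calls SBUG() where this sketch
returns TOY_RW_NONE.

```c
#include <assert.h>

/* Toy lock state: 0 = unlocked, >0 = reader count, -1 = one writer. */
struct toy_rwsem { int activity; };

enum toy_rw { TOY_RW_NONE = 0, TOY_RW_WRITER = 1, TOY_RW_READER = 2 };

/* Number of active readers, or 0 if none. */
static int toy_read_held(const struct toy_rwsem *s)
{
    return s->activity > 0 ? s->activity : 0;
}

/* Non-zero if a writer holds the lock. */
static int toy_write_held(const struct toy_rwsem *s)
{
    return s->activity < 0;
}

/* Linux-style API: separate acquire and release calls per lock type. */
static void toy_down_read(struct toy_rwsem *s)
{
    assert(s->activity >= 0);
    s->activity++;
}

static void toy_down_write(struct toy_rwsem *s)
{
    assert(s->activity == 0);
    s->activity = -1;
}

static void toy_up_read(struct toy_rwsem *s)  { s->activity--; }
static void toy_up_write(struct toy_rwsem *s) { s->activity = 0; }

/*
 * Solaris-style exit: the caller does not say which kind of lock it
 * holds, so inspect the internal state and call the matching release.
 * Returns the kind of lock that was dropped, for illustration only.
 */
static enum toy_rw toy_rw_exit(struct toy_rwsem *s)
{
    if (toy_read_held(s)) {
        toy_up_read(s);
        return TOY_RW_READER;
    }
    if (toy_write_held(s)) {
        toy_up_write(s);
        return TOY_RW_WRITER;
    }
    return TOY_RW_NONE;  /* lock was not held at all */
}
```

The key point is that toy_rw_exit() has to read the lock's internal
counter, which is exactly the kind of state the public rwsem API does not
expose.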

The good news is that, in addition to the profiling benefits of this
change, we may see performance improvements due to slightly reduced
overhead when creating rwlocks and manipulating them.

The only function I was forced to sacrifice was rw_owner() because this
information is simply not stored anywhere in the rwsem.  Luckily this
appears not to be a commonly used function on Solaris, and it is my
understanding it is mainly used for debugging anyway.

In addition to the core rwlock changes, extensive updates were made to
the rwlock regression tests.  Each class of test was extended to provide
more API coverage and to be more rigorous in checking for misbehavior.

This is a pretty significant change, and with that in mind I have been
careful to validate it on several platforms before committing.  The full
SPLAT regression test suite was run numerous times on all of the following
platforms.  This includes various kernels ranging from 2.6.16 to 2.6.29.

- SLES10   (ppc64)
- SLES11   (x86_64)
- CHAOS4.2 (x86_64)
- RHEL5.3  (x86_64)
- RHEL6    (x86_64)
- FC11     (x86_64)
behlendorf committed Sep 18, 2009
1 parent 73358d5 commit e811949
Showing 9 changed files with 708 additions and 936 deletions.
16 changes: 15 additions & 1 deletion config/spl-build.m4
@@ -68,6 +68,7 @@ AC_DEFUN([SPL_AC_CONFIG_KERNEL], [
SPL_AC_4ARGS_VFS_RENAME
SPL_AC_CRED_STRUCT
SPL_AC_GROUPS_SEARCH
SPL_AC_PUT_TASK_STRUCT
])

AC_DEFUN([SPL_AC_MODULE_SYMVERS], [
@@ -1263,7 +1264,7 @@ AC_DEFUN([SPL_AC_CRED_STRUCT], [
])

dnl #
dnl # Custom SPL patch may export this symbol
dnl # Custom SPL patch may export this symbol.
dnl #
AC_DEFUN([SPL_AC_GROUPS_SEARCH], [
SPL_CHECK_SYMBOL_EXPORT(
@@ -1273,3 +1274,16 @@ AC_DEFUN([SPL_AC_GROUPS_SEARCH], [
[groups_search() is available])],
[])
])

dnl #
dnl # 2.6.x API change,
dnl # __put_task_struct() was exported in RHEL5 but unavailable elsewhere.
dnl #
AC_DEFUN([SPL_AC_PUT_TASK_STRUCT], [
SPL_CHECK_SYMBOL_EXPORT(
[__put_task_struct],
[],
[AC_DEFINE(HAVE_PUT_TASK_STRUCT, 1,
[__put_task_struct() is available])],
[])
])
82 changes: 82 additions & 0 deletions configure
@@ -21916,6 +21916,47 @@ _ACEOF
fi



echo "$as_me:$LINENO: checking whether symbol __put_task_struct is exported" >&5
echo $ECHO_N "checking whether symbol __put_task_struct is exported... $ECHO_C" >&6
grep -q -E '[[:space:]]__put_task_struct[[:space:]]' \
$LINUX_OBJ/Module*.symvers 2>/dev/null
rc=$?
if test $rc -ne 0; then
export=0
for file in ; do
grep -q -E "EXPORT_SYMBOL.*(__put_task_struct)" \
"$LINUX_OBJ/$file" 2>/dev/null
rc=$?
if test $rc -eq 0; then
export=1
break;
fi
done
if test $export -eq 0; then
echo "$as_me:$LINENO: result: no" >&5
echo "${ECHO_T}no" >&6

else
echo "$as_me:$LINENO: result: yes" >&5
echo "${ECHO_T}yes" >&6

cat >>confdefs.h <<\_ACEOF
#define HAVE_PUT_TASK_STRUCT 1
_ACEOF

fi
else
echo "$as_me:$LINENO: result: yes" >&5
echo "${ECHO_T}yes" >&6

cat >>confdefs.h <<\_ACEOF
#define HAVE_PUT_TASK_STRUCT 1
_ACEOF

fi


;;
user) ;;
all)
@@ -24884,6 +24925,47 @@ _ACEOF



echo "$as_me:$LINENO: checking whether symbol __put_task_struct is exported" >&5
echo $ECHO_N "checking whether symbol __put_task_struct is exported... $ECHO_C" >&6
grep -q -E '[[:space:]]__put_task_struct[[:space:]]' \
$LINUX_OBJ/Module*.symvers 2>/dev/null
rc=$?
if test $rc -ne 0; then
export=0
for file in ; do
grep -q -E "EXPORT_SYMBOL.*(__put_task_struct)" \
"$LINUX_OBJ/$file" 2>/dev/null
rc=$?
if test $rc -eq 0; then
export=1
break;
fi
done
if test $export -eq 0; then
echo "$as_me:$LINENO: result: no" >&5
echo "${ECHO_T}no" >&6

else
echo "$as_me:$LINENO: result: yes" >&5
echo "${ECHO_T}yes" >&6

cat >>confdefs.h <<\_ACEOF
#define HAVE_PUT_TASK_STRUCT 1
_ACEOF

fi
else
echo "$as_me:$LINENO: result: yes" >&5
echo "${ECHO_T}yes" >&6

cat >>confdefs.h <<\_ACEOF
#define HAVE_PUT_TASK_STRUCT 1
_ACEOF

fi



;;
*)
echo "$as_me:$LINENO: result: Error!" >&5
121 changes: 71 additions & 50 deletions include/sys/rwlock.h
@@ -1,7 +1,7 @@
/*
* This file is part of the SPL: Solaris Porting Layer.
*
* Copyright (c) 2008 Lawrence Livermore National Security, LLC.
* Copyright (c) 2009 Lawrence Livermore National Security, LLC.
* Produced at Lawrence Livermore National Laboratory
* Written by:
* Brian Behlendorf <behlendorf1@llnl.gov>,
@@ -30,68 +30,89 @@
#include <linux/module.h>
#include <linux/slab.h>
#include <linux/rwsem.h>
#include <asm/current.h>
#include <sys/types.h>
#include <sys/kmem.h>

#ifdef __cplusplus
extern "C" {
#endif

typedef enum {
RW_DRIVER = 2, /* driver (DDI) rwlock */
RW_DEFAULT = 4 /* kernel default rwlock */
RW_DRIVER = 2,
RW_DEFAULT = 4
} krw_type_t;

typedef enum {
RW_WRITER,
RW_READER
RW_NONE = 0,
RW_WRITER = 1,
RW_READER = 2
} krw_t;

typedef struct rw_semaphore krwlock_t;

#define RW_MAGIC 0x3423645a
#define RW_POISON 0xa6
#define rw_init(rwlp, name, type, arg) init_rwsem(rwlp)
#define rw_destroy(rwlp) ((void)0)
#define rw_downgrade(rwlp) downgrade_write(rwlp)
#define RW_LOCK_HELD(rwlp) rwsem_is_locked(rwlp)
/*
* the rw-semaphore definition
* - if activity/count is 0 then there are no active readers or writers
* - if activity/count is +ve then that is the number of active readers
* - if activity/count is -1 then there is one active writer
*/
#if defined(CONFIG_RWSEM_GENERIC_SPINLOCK)
# define RW_COUNT(rwlp) ((rwlp)->activity)
# define RW_READ_HELD(rwlp) ((RW_COUNT(rwlp) > 0) ? RW_COUNT(rwlp) : 0)
# define RW_WRITE_HELD(rwlp) ((RW_COUNT(rwlp) < 0))
# define rw_exit_locked(rwlp) __up_read_locked(rwlp)
# define rw_tryenter_locked(rwlp) __down_write_trylock_locked(rwlp)
void __up_read_locked(struct rw_semaphore *);
int __down_write_trylock_locked(struct rw_semaphore *);
#else
# define RW_COUNT(rwlp) ((rwlp)->count & RWSEM_ACTIVE_MASK)
# define RW_READ_HELD(rwlp) ((RW_COUNT(rwlp) > 0) ? RW_COUNT(rwlp) : 0)
# define RW_WRITE_HELD(rwlp) ((RW_COUNT(rwlp) < 0))
# define rw_exit_locked(rwlp) up_read(rwlp)
# define rw_tryenter_locked(rwlp) down_write_trylock(rwlp)
#endif

typedef struct {
int32_t rw_magic;
int32_t rw_name_size;
char *rw_name;
struct rw_semaphore rw_sem;
struct task_struct *rw_owner; /* holder of the write lock */
} krwlock_t;
#define rw_tryenter(rwlp, rw) \
({ \
int _rc_ = 0; \
switch (rw) { \
case RW_READER: _rc_ = down_read_trylock(rwlp); break; \
case RW_WRITER: _rc_ = down_write_trylock(rwlp); break; \
default: SBUG(); \
} \
_rc_; \
})

extern void __rw_init(krwlock_t *rwlp, char *name, krw_type_t type, void *arg);
extern void __rw_destroy(krwlock_t *rwlp);
extern int __rw_tryenter(krwlock_t *rwlp, krw_t rw);
extern void __rw_enter(krwlock_t *rwlp, krw_t rw);
extern void __rw_exit(krwlock_t *rwlp);
extern void __rw_downgrade(krwlock_t *rwlp);
extern int __rw_tryupgrade(krwlock_t *rwlp);
extern kthread_t *__rw_owner(krwlock_t *rwlp);
extern int __rw_read_held(krwlock_t *rwlp);
extern int __rw_write_held(krwlock_t *rwlp);
extern int __rw_lock_held(krwlock_t *rwlp);
#define rw_enter(rwlp, rw) \
({ \
switch (rw) { \
case RW_READER: down_read(rwlp); break; \
case RW_WRITER: down_write(rwlp); break; \
default: SBUG(); \
} \
})

#define rw_init(rwlp, name, type, arg) \
({ \
if ((name) == NULL) \
__rw_init(rwlp, #rwlp, type, arg); \
else \
__rw_init(rwlp, name, type, arg); \
#define rw_exit(rwlp) \
({ \
if (RW_READ_HELD(rwlp)) \
up_read(rwlp); \
else if (RW_WRITE_HELD(rwlp)) \
up_write(rwlp); \
else \
SBUG(); \
})
#define rw_destroy(rwlp) __rw_destroy(rwlp)
#define rw_tryenter(rwlp, rw) __rw_tryenter(rwlp, rw)
#define rw_enter(rwlp, rw) __rw_enter(rwlp, rw)
#define rw_exit(rwlp) __rw_exit(rwlp)
#define rw_downgrade(rwlp) __rw_downgrade(rwlp)
#define rw_tryupgrade(rwlp) __rw_tryupgrade(rwlp)
#define rw_owner(rwlp) __rw_owner(rwlp)
#define RW_READ_HELD(rwlp) __rw_read_held(rwlp)
#define RW_WRITE_HELD(rwlp) __rw_write_held(rwlp)
#define RW_LOCK_HELD(rwlp) __rw_lock_held(rwlp)

#ifdef __cplusplus
}
#endif
#define rw_tryupgrade(rwlp) \
({ \
unsigned long flags; \
int _rc_ = 0; \
spin_lock_irqsave(&(rwlp)->wait_lock, flags); \
if (list_empty(&(rwlp)->wait_list) && (RW_READ_HELD(rwlp) == 1)) { \
rw_exit_locked(rwlp); \
_rc_ = rw_tryenter_locked(rwlp); \
ASSERT(_rc_); \
} \
spin_unlock_irqrestore(&(rwlp)->wait_lock, flags); \
_rc_; \
})

#endif /* _SPL_RWLOCK_H */
16 changes: 16 additions & 0 deletions module/spl/spl-generic.c
@@ -253,6 +253,22 @@ ddi_copyout(const void *from, void *to, size_t len, int flags)
}
EXPORT_SYMBOL(ddi_copyout);

#ifndef HAVE_PUT_TASK_STRUCT
/*
* This is only a stub function which should never be used. The SPL should
* never be putting away the last reference on a task structure so this will
* not be called. However, we still need to define it so the module does not
* have undefined symbol at load time. That all said if this impossible
* thing does somehow happen SBUG() immediately so we know about it.
*/
void
__put_task_struct(struct task_struct *t)
{
SBUG();
}
EXPORT_SYMBOL(__put_task_struct);
#endif /* HAVE_PUT_TASK_STRUCT */

struct new_utsname *__utsname(void)
{
#ifdef HAVE_INIT_UTSNAME