DocAPIOverview

Jared Yanovich edited this page Jun 30, 2015 · 3 revisions

SLASH2 API Overview

Terminology

  • lnd - Lustre networking device. LNDs are abstraction layers that let underlying protocols, such as TCP or Cray portals, handle lower-level communication details.

Viewing Documentation

Manual documentation is written in the nroff mdoc markup language. It can be formatted with the following command, whose output can be piped into a pager such as less(1):

$ nroff -mandoc foo.1

Make Infrastructure

To build all components, use make build. This command will automatically probe system compatibility, calculate target dependencies, and build all executables. Since target dependencies are auto-generated, one should never have to worry about cleaning old object files manually.

Other make Targets

To create a recursive code database for use with tagname lookups in vim (e.g. to go to functions/symbols by name), use the cscope target:

$ make cs

For emacs:

$ make etags

It is also advised to check code against lint(1) often:

$ make lint

Most targets are recursive. See more targets in $ROOTDIR/mk/main.mk.

Customized Local Settings

Program deployments will necessarily require some changes to configuration and/or other local repository files. Avoid committing changes to example configuration files or files called local.* unless you know what you're doing; otherwise, they will likely cause merge conflicts in all other developers' checkouts (as well as in any other checkouts you may have).

Project Hierarchy

lnet-lite

Low-level networking library; used by PSCRPC. For high-level network communication functions, see the PSCRPC APIs instead.

mk - make(1)/build infrastructure

Note: individual projects may contain additional build rules and settings (such as include file directories pertinent to their various sources) in their $APP_ROOT/mk directory.

pfl

This is a utility library defining common data structures and routines shared by file systems.

  • pfl - core include directory
    • pfl/types.h - core data types

    • pfl/hashtbl.h - hash table API

      • automatically registered for export/control
    • pfl/dynarray.h - dynamically resizable arrays using realloc(3)

    • pfl/list.h - lightweight linked list implementation; NOT thread-safe (these routines must be invoked within mutually exclusive sections of code, e.g. protected by spinlocks)

    • pfl/listcache.h - heavier duty linked list; thread-safe

      • automatically registered for export/control
    • pfl/lockedlist.h - lightweight thread-safe linked list API

    • pfl/queue.h - BSD sys/queue.h, additional list/queue structures such as singly linked lists

    • pfl/stree - simple tree where each node may have a variable number of branches/children

    • pfl/tree - BSD sys/tree.h, contains two types of trees:

      http://www.openbsd.org/cgi-bin/man.cgi?query=tree

    • pfl/vbitmap.h - arbitrarily-sized bitmaps (byte strings)

    • pfl/rpc.h - remote procedure call (RPC) library built upon the Lustre ptlrpc networking library

    • pfl/rsx.h - high-level interface for simple RPC message communication; the other files are split/shared between client and server activity

    • pfl/acsvc.h - file system access control mechanism, for performing file system operations with another user's privileges

    • pfl/alloc.h - validity-checked allocation routines supporting page-alignment and mlock(2)

    • pfl/atomic.h - a data type which provides a variety of mathematical operations that are atomic by nature, i.e. mutually exclusive; ideal e.g. for thread-shared data structures containing members which need to be updated without the overhead of locking the whole structure

    • pfl/cdefs.h - miscellaneous C definitions

    • pfl/crc.h - cyclic redundancy checks for attempting to verify data integrity

    • pfl/ctlcli.h - client interface for daemon control

    • pfl/ctlsvr.h - server guts for daemon control

    • pfl/fmt.h - string formatting APIs, such as human-readable sizes (2.5M)

    • pfl/fmtstr.h - custom format strings, e.g. %M for minutes, %H for hours, etc.

    • pfl/init.c - PFL initialization, key data structures and threads, etc.

    • pfl/iostats.h - routines for gathering statistics about any form of I/O

      • automatically registered for export/control
    • pfl/journal.h - operation journaling routines

    • pfl/lock.h - simple spinlock implementation for mutually exclusive code sections

      • spinlock() - acquire a spinlock, blocking until it is released if already held by another thread
      • freelock() - release a held spinlock
      • trylock() - attempt to grab a spinlock, returning false if another thread already holds it
      • reqlock() - require a spinlock for a section, ideal for recursive or highly nested program structure
      • tryreqlock() - attempt to require a spinlock
      • ureqlock() - (possibly) release a required lock, if not already held before corresponding reqlock()
    • pfl/log.h - fine-grained logging API

      In PFL programs, there are a number of subsystems a program may register. Each subsystem has an associated log level which determines which kinds of messages (by severity) may be reported, and each thread has its own set of values for these. These levels are all controllable via the control interface.

      • psclog_trace() flow report
      • psclog_info() informational/diagnostic message
      • psclog_dbg() debugging messages
      • psclog_notice() condition alert
      • psclog_warn() non-critical error, append errno
      • psclog_warnx() non-critical error, not system-related
      • psclog_error() serious error, append errno to message
      • psclog_errorx() serious error, not system-related
      • psc_fatal() fatal error, end program execution, with errno
      • psc_fatalx() fatal error, end program execution, not sys-related
    • pfl/meter.h - API for progress meters

    • pfl/mkdirs.h - "mkdir -p" in a function

    • pfl/multiwait.h - pthread_cond/psc_waitq-like API for waiting on any of a number of conditions to occur

    • pfl/pool.h - object memory management

      • automatically registered for export/control
    • pfl/printhex.h - simple data printer in hexadecimal for debugging

    • pfl/prsig.h - signal(2) behavior dumper

    • pfl/random.h - pseudo-random number generator based on /dev/urandom

    • pfl/rlimit.h - resource limit (getrlimit(2)) API

    • pfl/setprocesstitle.h - API for setting the ps(1) process name string

    • pfl/subsys.h - routines relating to the "subsystem" facility which may be used to logically divide large program structure into modules

    • pfl/thread.h - layer above pthread which gives you many things (basically, all thread-safe code in PFL deals with pscthreads and not pthreads directly)

    • pfl/timerthr.h - API for dedicating threads to perform events periodically

    • pfl/usklndthr.h - LNET userland socket LND module thread spawner

    • pfl/waitq.h - wait on an event, from the context of a thread

  • slash2 - root of SLASH2-specific code
    • include - shared include files
      • fid - global file ID definitions
      • fidcache - managing a collection of in-core files
      • inode - one in-core file
      • slashrpc - SLASH2 RPC definitions
      • slconfig - lex-based configuration parser definitions
    • mk - SLASH2-specific build rules/definitions/customizations/etc.
    • mount_slash - fuse mounter for SLASH2, connects to slashd and sliod
      • bflush.c - dirty bmap write handling logic
      • dircache.c - cache for open(2) directory handles for efficiency
      • io.c - core file system I/O operation logic
      • main.c - core file system operation handlers
    • msctl - command-line mount_slash controller
    • share - code shared amongst several SLASH2-related programs
      • fidc_common.c - handles for file structures and the in-memory cache for them
      • ctlcli_common - control interface definitions for clients (e.g. msctl)
      • ctlsvr_common - control interface definitions for daemons (e.g. mount_slash)
      • lconf - lex definitions for slcfg configuration parser
      • rpc_common.c - RPC routines
      • yconf - yacc rules/codes for slcfg configuration parser
    • slashd - MDS server
      • rmm.c - inter-MDS and I/O daemon communication
      • rmc.c - handling of RPC requests from clients
      • journal - SLASH2-specific journalling routines
      • mds - RPC messages for metadata exchange
    • slmctl - command-line slashd controller
    • sliod - I/O server
      • slab - slab buffers for memory-resident file portions
    • slictl - command-line sliod controller
  • tools - miscellaneous development tools
    • libdep.pl - invoked by "make depend" for tracking libfoo.a and -lfoo on executables
    • mkdep - a portable file dependency generator used by "make depend"
    • notempty - simple utility for a workaround in the make infrastructure
    • unwrapcc - a tool for finding the actual intended final $CC target when deep cc(1) wrappers are in use
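
The reqlock()/ureqlock() pairing listed under pfl/lock.h above is easiest to see in a small model. The following is a minimal sketch only, assuming a toy lock built on C11 atomics; `struct demo_lock` and every `demo_*` and `counter_*` name are hypothetical stand-ins and not the PFL types or signatures. It shows the idea that a "required" lock is acquired only if the caller does not already hold it, and is released only by the call that actually acquired it.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a spinlock; NOT the real PFL lock type. */
struct demo_lock {
	atomic_uintptr_t owner;		/* 0 when free, else the holder's identity */
};

/* The address of a thread-local object is unique per live thread. */
static _Thread_local char demo_self_tag;

static uintptr_t
demo_self(void)
{
	return ((uintptr_t)&demo_self_tag);
}

/* Spin until the lock is free, then take ownership. */
void
demo_spinlock(struct demo_lock *l)
{
	uintptr_t expected = 0;

	while (!atomic_compare_exchange_weak(&l->owner, &expected, demo_self()))
		expected = 0;		/* a failed CAS stored the current holder here */
}

/* Release a lock held by the calling thread. */
void
demo_freelock(struct demo_lock *l)
{
	assert(atomic_load(&l->owner) == demo_self());
	atomic_store(&l->owner, 0);
}

/* Acquire unless already held; return whether it was already held. */
bool
demo_reqlock(struct demo_lock *l)
{
	if (atomic_load(&l->owner) == demo_self())
		return (true);
	demo_spinlock(l);
	return (false);
}

/* Release only if the matching demo_reqlock() did the acquiring. */
void
demo_ureqlock(struct demo_lock *l, bool waslocked)
{
	if (!waslocked)
		demo_freelock(l);
}

static struct demo_lock counter_lock;	/* zero-initialized, i.e. free */
int counter;

void
counter_add(int n)
{
	bool locked = demo_reqlock(&counter_lock);

	counter += n;
	demo_ureqlock(&counter_lock, locked);
}

void
counter_add_twice(int n)
{
	bool locked = demo_reqlock(&counter_lock);

	counter_add(n);		/* nested call: lock already held, no deadlock */
	counter_add(n);
	demo_ureqlock(&counter_lock, locked);
}
```

With this shape, counter_add() is safe to call both on its own and from a caller that already holds the lock, which is why the pattern suits recursive or deeply nested code.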

Writing Makefiles

This section explains how to leverage the make infrastructure for building custom programs.

First, list all source files your program consists of in SRCS:

PROG=		foo
SRCS+=		foo.c
SRCS+=		bar.c

No PFL library is provided; instead, list each PFL source file your program needs manually through the SRCS variable:

SRCS+=		${PFL_BASE}/pfl/crc.c

There are a number of convenience variables for large libraries:

SRCS+=		${LNET_SOCKLND_SRCS}	# socket networking device
SRCS+=		${LNET_CFS_SRCS}	# miscellaneous routines
SRCS+=		${LNET_LIB_SRCS}	# message/mem routines
SRCS+=		${LNET_PTLLND_SRCS}	# portals networking device
SRCS+=		${PSCRPC_SRCS}		# PSCRPC library

Next, specify any additional environment variables:

  • INCLUDES - a list of -I<path> passed to gcc and a few other places
  • DEFINES - a list of -D<name>[=<value>] directives
  • LDFLAGS - linker flags such as -l<lib> or -L<path>. MODULES is recommended over use of this variable.
  • MODULES - list of dependencies this program uses

Notes:

  • Only set CFLAGS directly when you want to pass flags solely to gcc. For example, if you add CFLAGS+=-DFOO, then FOO won't be propagated to other targets such as make lint.

  • Avoid hardcoding paths; use the path variables wherever possible, e.g.:

INCLUDES+=	-I${PFL_BASE}/tests
INCLUDES+=	-I${KERNEL_BASE}/include

## Writing PFL programs

### PFL initialization

Every PFL application must invoke `pfl_init()` before it can access PFL
APIs.
This routine:

* creates the subsystem facility to allow fine-grained logging
  capabilities
* initializes the thread subsystem
* initializes the NUMA local memory access subsystem
* parses the environment as described in `pflenv(7)`

### PFL thread registration

Every thread in a PFL application which invokes PFL APIs should invoke
`pscthr_init()` to initialize some metathread structures.
This information can be accessed later with `pscthr_get()`, including
thread-local storage via the `pscthr_private` field.
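
The general pattern of registering a thread once and later looking up its private data can be sketched as follows. This is an illustrative model only: it uses C11 `_Thread_local` storage, and `struct demo_thread`, `demo_thr_init()`, `demo_thr_get()`, `demo_thr_destroy()`, and `struct worker_state` are hypothetical names that do not reflect the real `pscthr_init()` or `pscthr_get()` signatures. The `private_data` member plays the role that the text assigns to `pscthr_private`.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for per-thread metadata; NOT the PFL structure. */
struct demo_thread {
	char	 name[32];
	void	*private_data;		/* plays the role of pscthr_private */
};

/* One pointer per thread; NULL until the thread registers itself. */
static _Thread_local struct demo_thread *demo_cur_thr;

/* Register the calling thread and allocate its zeroed private area. */
struct demo_thread *
demo_thr_init(const char *name, size_t privsiz)
{
	struct demo_thread *thr;

	thr = calloc(1, sizeof(*thr));
	if (thr == NULL)
		return (NULL);
	snprintf(thr->name, sizeof(thr->name), "%s", name);
	if (privsiz > 0) {
		thr->private_data = calloc(1, privsiz);
		if (thr->private_data == NULL) {
			free(thr);
			return (NULL);
		}
	}
	demo_cur_thr = thr;
	return (thr);
}

/* Look up the calling thread's metadata. */
struct demo_thread *
demo_thr_get(void)
{
	return (demo_cur_thr);
}

/* Unregister the calling thread and free its metadata. */
void
demo_thr_destroy(void)
{
	if (demo_cur_thr != NULL) {
		free(demo_cur_thr->private_data);
		free(demo_cur_thr);
		demo_cur_thr = NULL;
	}
}

/* Application-specific per-thread state (illustrative). */
struct worker_state {
	unsigned long nreqs;
};

/* Any code running on a registered thread can reach its own state. */
void
worker_handle_request(void)
{
	struct demo_thread *thr = demo_thr_get();
	struct worker_state *ws;

	assert(thr != NULL);		/* the thread must have registered first */
	ws = thr->private_data;
	ws->nreqs++;
}
```

Because each thread sees only its own `private_data`, per-thread counters such as `nreqs` need no locking.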

### PSCRPC initialization

Our modified Lustre networking stack requests that the application
spawn its threads.
PFL provides an implementation of the thread-spawning routines but requires
that your application provide two routines to aid it in spawning:

* `psc_usklndthr_getname()`
* `psc_usklndthr_gettype()`

See example implementations of these in `mount_slash`.

### PSCRPC per-connection data storage

PSCRPC maintains a `psc_export` structure associated with each request.
It holds the peer information and contains a member:

```c
void *rq->rq_export->exp_private
```

This can be used to store an application-specific (or task-specific) structure to associate data with the peer.
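
As a rough illustration of this pattern, the sketch below attaches lazily allocated per-peer state to the export reached through a request. The `demo_export` and `demo_request` structures are hypothetical stand-ins that mirror only the `rq_export` and `exp_private` members named above, and `struct peer_state`, `peer_state_get()`, `handle_request()`, and `peer_state_free()` are invented for the example; none of this is the real PSCRPC definition.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins mirroring only the members named in the text. */
struct demo_export {
	void			*exp_private;
};

struct demo_request {
	struct demo_export	*rq_export;
};

/* Application-specific per-peer state (illustrative). */
struct peer_state {
	unsigned long		 nrequests;
};

/* Return the peer's state, allocating it on the first request seen. */
struct peer_state *
peer_state_get(struct demo_request *rq)
{
	struct demo_export *exp;

	assert(rq != NULL && rq->rq_export != NULL);
	exp = rq->rq_export;
	if (exp->exp_private == NULL)
		exp->exp_private = calloc(1, sizeof(struct peer_state));
	return (exp->exp_private);
}

/* A request handler that counts requests per peer. */
int
handle_request(struct demo_request *rq)
{
	struct peer_state *ps = peer_state_get(rq);

	if (ps == NULL)
		return (-1);		/* allocation failed */
	ps->nrequests++;
	return (0);
}

/* Release the per-peer state, e.g. when the connection goes away. */
void
peer_state_free(struct demo_export *exp)
{
	free(exp->exp_private);
	exp->exp_private = NULL;
}
```

In a multithreaded service, several requests from the same peer may be handled concurrently, so the lazy allocation and the counter update in this sketch would need to be protected by a lock.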