Merge branch 'develop' for v0.2.1
JCapul committed May 24, 2019
2 parents 6dd5b89 + 418f7f1 commit 86e8c51
Showing 7 changed files with 63 additions and 37 deletions.
6 changes: 6 additions & 0 deletions CHANGELOG.md
@@ -1,5 +1,11 @@
# Changelog

## v0.2.1

- Fix a memory corruption issue by passing a copy of pathnames from C to Go [2f9e3db]
- Update README.md [f893f1e]
- Add file metadata caching in client [0a098e0]

## v0.2.0

- Add Spack package for pdwfs [b6cb1f2]
18 changes: 10 additions & 8 deletions README.md
@@ -2,9 +2,9 @@

[![Build Status](https://travis-ci.org/cea-hpc/pdwfs.png?branch=master)](https://travis-ci.org/cea-hpc/pdwfs)

pdwfs (we like to pronounce it "*padawan-f-s*", see below) is a preload library implementing a distributed in-memory filesystem in user space suitable for intercepting *bulk* I/O workloads typical of HPC simulations. It is using [Redis](https://redis.io) as the backend memory store.
pdwfs (we like to pronounce it "*padawan-f-s*", see [below](#padawan-project)) is a preload library implementing a distributed in-memory filesystem in user space suitable for intercepting *bulk* I/O workloads typical of HPC simulations. It is using [Redis](https://redis.io) as the backend memory store.

pdwfs objective is to provide a lightweight infrastructure to execute HPC simulation workflows without writing/reading any intermediate data to/from a (parallel) filesystem. This type of approach is known as *in transit* or *loosely-coupled in situ*, see the two next sections for further details.
pdwfs objective is to provide a lightweight infrastructure to execute HPC simulation workflows without writing/reading any intermediate data to/from a (parallel) filesystem, but rather staging it in memory. This type of approach is known as *in transit* or *loosely-coupled in situ* and is further explained in a [section](#in-situ-and-in-transit-hpc-workflows) below.

pdwfs is written in [Go](https://golang.org) and C and runs on Linux systems only (we provide a Dockerfile for testing and development on other systems).

@@ -29,7 +29,7 @@ A binary distribution is available for Linux system and x86_64 architecture in t
The following steps will install pdwfs and make it available in your PATH.

```bash
$ wget -O pdwfs.tar.gz https://github.com/cea-hpc/pdwfs/releases/download/v0.2.0/pdwfs-v0.2.0-linux-amd64.tar.gz
$ wget -O pdwfs.tar.gz https://github.com/cea-hpc/pdwfs/releases/download/v0.2.1/pdwfs-v0.2.1-linux-amd64.tar.gz
$ mkdir /usr/local/pdwfs
$ tar xf pdwfs.tar.gz --strip-component=1 -C /usr/local/pdwfs
$ export PATH="/usr/local/pdwfs/bin:$PATH"
@@ -47,16 +47,18 @@ $ make PREFIX=/usr/local/pdwfs install

### Using Spack

pdwfs and its dependencies (even Go) can be installed with the package manager [Spack](https://spack.io).
pdwfs and its dependencies (Redis and Go) can be installed with the package manager [Spack](https://spack.io).

A Spack package for pdwfs is not yet available in Spack upstream repository, but is available [here](https://github.com/cea-hpc/pdwfs/releases/download/v0.2.0/pdwfs-spack.py).
NOTE: at the time of this writing, the latest Spack release (v0.12.1) does not include the Redis package, which is only available in the develop branch. Still, the Redis package Python file can easily be copied into your Spack installation.

A Spack package for pdwfs is not yet available in Spack upstream repository, but is available [here](https://github.com/cea-hpc/pdwfs/releases/download/v0.2.1/pdwfs-spack.py).

To add pdwfs package to your Spack installation, you can proceed as follows:

```bash
$ export DIR=$SPACK_ROOT/var/spack/repos/builtin/pdwfs
$ mkdir $DIR
$ wget -O $DIR/package.py https://github.com/cea-hpc/pdwfs/releases/download/v0.2.0/pdwfs-spack.py
$ wget -O $DIR/package.py https://github.com/cea-hpc/pdwfs/releases/download/v0.2.1/pdwfs-spack.py
$ spack spec pdwfs # check everything is OK
```

@@ -197,7 +199,7 @@ Test cases have been successfully run so far with the following codes and tools:
- [ParaView](https://www.paraview.org/in-situ/) VTK file reader.


## Examples
## Docker-packaged examples
We provide a set of Dockerfiles to test on a laptop the codes and tools described in the Validation section.

- **Example 1**: HydroC + ParaView + FFmpeg workflow
@@ -230,7 +232,7 @@ The foundational work for this project was an initial version of pdwfs entirely

- *PaDaWAn: a Python Infrastructure for Loosely-Coupled In Situ Workflows*, J. Capul, S. Morais, J-B. Lekien, ISAV@SC (2018).

## In situ / in transit HPC workflows
## In situ and in transit HPC workflows
Within the HPC community, in situ data processing is attracting considerable interest as a potential enabler for future exascale-era simulations.

The original in situ approach, also called tightly-coupled in situ, consists in executing data processing routines within the same address space as the simulation and sharing the resources with it. It requires the simulation to use a dedicated API and to link against a library embedding a processing runtime. Notable in situ frameworks are ParaView [Catalyst](https://www.paraview.org/in-situ/), VisIt [LibSim](https://wci.llnl.gov/simulation/computer-codes/visit). [SENSEI](http://sensei-insitu.org) provides a common API that can map to various in situ processing backends.
31 changes: 15 additions & 16 deletions src/c/pdwfs.c
@@ -219,7 +219,7 @@ int open(const char *pathname, int flags, ...) {
return libc_open(pathname, flags, mode);
}

GoString filename = {pathname, strlen(pathname)};
GoString filename = {strdup(pathname), strlen(pathname)};

int fd = get_new_fd(fd_register);

@@ -510,7 +510,7 @@ int access(const char *pathname, int mode) {
if PATH_NOT_MANAGED(pathname) {
return libc_access(pathname, mode);
}
GoString filename = {pathname, strlen(pathname)};
GoString filename = {strdup(pathname), strlen(pathname)};
return Access(filename, mode);
}

@@ -520,7 +520,7 @@ int unlink(const char *pathname) {
if PATH_NOT_MANAGED(pathname) {
return libc_unlink(pathname);
}
GoString filename = {pathname, strlen(pathname)};
GoString filename = {strdup(pathname), strlen(pathname)};
int ret = Unlink(filename);
if (ret < 0) {
errno = GetErrno();
@@ -534,7 +534,7 @@ int __xstat(int vers, const char *pathname, struct stat *buf) {
if PATH_NOT_MANAGED(pathname) {
return libc__xstat(vers, pathname, buf);
}
GoString filename = {pathname, strlen(pathname)};
GoString filename = {strdup(pathname), strlen(pathname)};
return Stat(filename, buf);
}

@@ -544,7 +544,7 @@ int __xstat64(int vers, const char *pathname, struct stat64 *buf) {
if PATH_NOT_MANAGED(pathname) {
return libc__xstat64(vers, pathname, buf);
}
GoString filename = {pathname, strlen(pathname)};
GoString filename = {strdup(pathname), strlen(pathname)};
return Stat64(filename, buf);
}

@@ -554,7 +554,7 @@ int __lxstat(int vers, const char *pathname, struct stat *buf) {
if PATH_NOT_MANAGED(pathname) {
return libc__lxstat(vers, pathname, buf);
}
GoString filename = {pathname, strlen(pathname)};
GoString filename = {strdup(pathname), strlen(pathname)};
return Lstat(filename, buf);
}

@@ -564,7 +564,7 @@ int __lxstat64(int vers, const char *pathname, struct stat64 *buf) {
if PATH_NOT_MANAGED(pathname) {
return libc__lxstat64(vers, pathname, buf);
}
GoString filename = {pathname, strlen(pathname)};
GoString filename = {strdup(pathname), strlen(pathname)};
return Lstat64(filename, buf);
}

@@ -592,7 +592,7 @@ int statfs(const char *path, struct statfs *buf) {
if PATH_NOT_MANAGED(path) {
return libc_statfs(path, buf);
}
GoString filename = {path, strlen(path)};
GoString filename = {strdup(path), strlen(path)};
return Statfs(filename, buf);
}

@@ -602,7 +602,7 @@ int statfs64(const char *path, struct statfs64 *buf) {
if PATH_NOT_MANAGED(path) {
return libc_statfs64(path, buf);
}
GoString filename = {path, strlen(path)};
GoString filename = {strdup(path), strlen(path)};
return Statfs64(filename, buf);
}

@@ -641,8 +641,8 @@ FILE* fopen(const char *path, const char *mode) {
}

FILE *stream = get_new_stream(fd_register);
GoString gopath = {path, strlen(path)};
GoString gomode = {mode, strlen(mode)};
GoString gopath = {strdup(path), strlen(path)};
GoString gomode = {strdup(mode), strlen(mode)};
int ret = Fopen(gopath, gomode, fileno(stream));
if (ret < 0) {
remove_fd(fd_register, fileno(stream));
@@ -1007,7 +1007,6 @@ int openat(int dirfd, const char *pathname, int flags, ...) {
}

TRACE("intercepting openat(dirfd=%d, pathname=%s, flags=%d, mode=%d) (PASS THROUGH)\n", dirfd, pathname, flags, mode)
// NOTE: see the comment of func (fs *PdwFS) registerFile in pdwfs.go before implementing openat interception
return libc_openat(dirfd, pathname, flags, mode);
}

@@ -1017,7 +1016,7 @@ int mkdir(const char *pathname, mode_t mode) {
if PATH_NOT_MANAGED(pathname) {
return libc_mkdir(pathname, mode);
}
GoString gopath = {pathname, strlen(pathname)};
GoString gopath = {strdup(pathname), strlen(pathname)};
return Mkdir(gopath, mode);
}

@@ -1032,7 +1031,7 @@ int rmdir(const char *pathname) {
if PATH_NOT_MANAGED(pathname) {
return libc_rmdir(pathname);
}
GoString gopath = {pathname, strlen(pathname)};
GoString gopath = {strdup(pathname), strlen(pathname)};
return Rmdir(gopath);
}

@@ -1079,7 +1078,7 @@ int statvfs(const char *pathname, struct statvfs *buf) {
if PATH_NOT_MANAGED(pathname) {
return libc_statvfs(pathname, buf);
}
GoString gopath = {pathname, strlen(pathname)};
GoString gopath = {strdup(pathname), strlen(pathname)};
return Statvfs(gopath, buf);
}

@@ -1089,7 +1088,7 @@ int statvfs64(const char *pathname, struct statvfs64 *buf) {
if PATH_NOT_MANAGED(pathname) {
return libc_statvfs64(pathname, buf);
}
GoString gopath = {pathname, strlen(pathname)};
GoString gopath = {strdup(pathname), strlen(pathname)};
return Statvfs64(gopath, buf);
}
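
The `strdup` changes in this file all follow one pattern: the Go side receives its own copy of the path instead of a view onto the caller's buffer. The sketch below is only an analogy in pure Go, not pdwfs code: a `[]byte` plays the role of the caller-owned `char *`, and `pathTable`, `registerAlias` and `registerCopy` are hypothetical names. It shows why retaining an alias is fragile once the caller reuses its buffer. The diff does not show where, or whether, the duplicated strings are freed, so that part is left out.

```go
package main

import "fmt"

// pathTable stands in for any Go-side state that outlives a single call
// (hypothetical; not a pdwfs type).
type pathTable struct {
	paths [][]byte
}

// registerAlias keeps the caller's buffer itself: no copy is made.
func (t *pathTable) registerAlias(p []byte) {
	t.paths = append(t.paths, p)
}

// registerCopy keeps a private copy, playing the role strdup plays in C.
func (t *pathTable) registerCopy(p []byte) {
	t.paths = append(t.paths, append([]byte(nil), p...))
}

func main() {
	buf := []byte("/pdwfs/a.dat") // caller-owned buffer

	var aliased, copied pathTable
	aliased.registerAlias(buf)
	copied.registerCopy(buf)

	// The caller reuses its buffer for another path of the same length.
	copy(buf, "/pdwfs/b.dat")

	fmt.Printf("aliased: %s\n", aliased.paths[0]) // follows the caller's buffer
	fmt.Printf("copied:  %s\n", copied.paths[0])  // still the original path
}
```

In C the failure mode is harsher than in this analogy, since the caller may free the buffer entirely rather than overwrite it.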

30 changes: 19 additions & 11 deletions src/go/redisfs/inodes.go
@@ -31,6 +31,8 @@ type Inode struct {
path string
keyPrefix string
mtx *sync.RWMutex
isDir *bool
mode *os.FileMode
}

//NewInode returns a new Inode object
@@ -70,21 +72,27 @@ func (i *Inode) delMeta() {

//IsDir returns true if inode is a directory
func (i *Inode) IsDir() bool {
//FIXME: cache the query
client := i.redisRing.GetClient(i.keyPrefix)
res, err := client.Exists(i.keyPrefix + ":children")
Check(err)
return res
if i.isDir == nil {
client := i.redisRing.GetClient(i.keyPrefix)
res, err := client.Exists(i.keyPrefix + ":children")
Check(err)
i.isDir = &res
}
return *i.isDir
}

//Mode returns the inode access mode
func (i *Inode) Mode() os.FileMode {
client := i.redisRing.GetClient(i.keyPrefix)
val, err := client.Get(i.keyPrefix + ":mode")
Check(err)
res, err := strconv.ParseInt(string(val), 10, 64)
Check(err)
return os.FileMode(res)
if i.mode == nil {
client := i.redisRing.GetClient(i.keyPrefix)
val, err := client.Get(i.keyPrefix + ":mode")
Check(err)
res, err := strconv.ParseInt(string(val), 10, 64)
Check(err)
m := os.FileMode(res)
i.mode = &m
}
return *i.mode
}
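
The caching added to `IsDir` and `Mode` uses a nil pointer as the "not queried yet" state. A minimal standalone sketch of that pattern follows; `metaStore` is a hypothetical in-memory stand-in for the Redis client, and its `lookups` counter exists only to make the saved round trips visible.

```go
package main

import "fmt"

// metaStore is a hypothetical stand-in for the Redis client used by Inode.
type metaStore struct {
	dirs    map[string]bool
	lookups int // counts simulated round trips to the backend
}

func (s *metaStore) exists(key string) bool {
	s.lookups++
	return s.dirs[key]
}

// inode caches the answer in a *bool: nil means "not queried yet".
type inode struct {
	store *metaStore
	key   string
	isDir *bool
}

// IsDir queries the store on the first call only, then serves the cached value.
func (i *inode) IsDir() bool {
	if i.isDir == nil {
		res := i.store.exists(i.key + ":children")
		i.isDir = &res
	}
	return *i.isDir
}

func main() {
	s := &metaStore{dirs: map[string]bool{"/data:children": true}}
	n := &inode{store: s, key: "/data"}

	fmt.Println(n.IsDir(), n.IsDir(), s.lookups)

	// The cached answer is not refreshed if the store changes afterwards.
	delete(s.dirs, "/data:children")
	fmt.Println(n.IsDir(), s.lookups)
}
```

A pointer distinguishes "unknown" from a cached `false`, which a plain `bool` field cannot do. The trade-off, visible in the second print, is that a cached value can go stale if another client changes the metadata.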

//Path returns the Path of the file
11 changes: 11 additions & 0 deletions src/go/redisfs/redis.go
@@ -138,6 +138,17 @@ func (c *RedisClient) Get(key string) ([]byte, error) {
return b, err
}

// Strlen command
func (c *RedisClient) Strlen(key string) (int, error) {
conn := c.pool.Get()
defer conn.Close()
l, err := redis.Int(conn.Do("STRLEN", key))
if err == redis.ErrNil {
return l, ErrRedisKeyNotFound
}
return l, err
}

// GetInto implements GET redis command with an input destination buffer to read bytes into.
// The actual number of bytes read is returned.
// If dst is too small, it will panic.
4 changes: 2 additions & 2 deletions src/go/redisfs/store.go
@@ -235,9 +235,9 @@ func (s DataStore) GetSize(name string) int64 {
return 0
}
key := key(name, ilast)
lastStripe, err := s.redisRing.GetClient(key).Get(key)
l, err := s.redisRing.GetClient(key).Strlen(key)
Check(err)
return ilast*s.stripeSize + int64(len(lastStripe))
return ilast*s.stripeSize + int64(l)
}

// helper to obtain the last stripe ID and length based on the total size and stripe size
Empty file modified tools/create_tarballs.sh
100644 → 100755
Empty file.
