Skip to content
/ stream Public

Golang file system abstraction tailored for AWS S3, enabling seamless streaming of binary objects along with their corresponding metadata.

License

Notifications You must be signed in to change notification settings

fogfish/stream

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

62 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

stream

Golang file system abstraction tailored for AWS S3


The library provides a Golang file system abstraction tailored for AWS S3, enabling seamless streaming of binary objects along with their corresponding metadata.

Inspiration

The streaming is convenient paradigm for handling large binary objects like images, videos, and more. Applications can effectively manage data consumption by leveraging io.Reader and io.Writer for seamless abstraction. This library employs the AWS Golang SDK v2 under the hood to facilitate access to AWS S3 through streams.

On the other hand, a file system is a method used by computers to organize and store data on storage devices. It provides a structured way to access and manage binary objects. File systems handle tasks such as creating, reading, writing, and deleting binary objects, as well as managing permissions and metadata associated with each file or directory.

The library implements Golang File System and enhances it by adding support for writable files and type-safe metadata. The file system api is following:

type FileSystem interface {
  Create(path string) (File, error)
  Open(path string) (fs.File, error)
  Stat(path string) (fs.FileInfo, error)
  ReadDir(path string) ([]fs.DirEntry, error)
  Glob(pattern string) ([]string, error)
}

Notably, the interface supports reading and writing metadata associated with AWS objects using fs.FileInfo.

Getting started

The library requires Go 1.18 or later due to usage of generics.

The latest version of the library is available at its main branch. All development, including new features and bug fixes, take place on the main branch using forking and pull requests as described in contribution guidelines. The stable version is available via Golang modules.

Use go get to retrieve the library and add it as dependency to your application.

go get -u github.com/fogfish/stream

Quick Start

Check out the examples. They cover all fundamental use cases with runnable code snippets. Below is a simplest "Hello World"-like application for reading the object from AWS S3.

import (
  "io"
  "os"

  "github.com/fogfish/stream"
)

// mount s3 bucket as file system
s3fs, err := stream.NewFS(/* name of S3 bucket */)
if err != nil {
  return err
}

// open stream `io.Reader` to an object on S3
fd, err := s3fs.Open("/the/example/key.txt")
if err != nil {
  return err
}

// stream data using io.Reader interface
buf, err := io.ReadAll(fd)
if err != nil {
  return err
}

// close stream
err = fd.Close()
if err != nil {
  return err
}

See and try examples. Its cover all basic use-cases of the library.

Mounting S3

The library serves as a user-side implementation of Golang's file system abstractions defined by io/fs. It implements fs.FS, fs.StatFS, fs.ReadDirFS and fs.GlobFS. Additionally, it offers extensions for file writing: stream.CreateFS, stream.RemoveFS and stream.CopyFS.

To create a file system instance, utilize stream.NewFS or stream.New. The file system is configurable using options pattern.

s3fs, err := stream.NewFS(
  /* name of S3 bucket */,
  stream.WithIOTimeout(5 * time.Second),
  stream.WithListingLimit(25),
)

Reading objects

To open the file for reading use Open function giving the absolute path starting with /, the returned file descriptor is a composite of io.Reader, io.Closer and stream.Stat. Utilize Golang's convenient streaming methods to consume S3 object seamlessly.

r, err := s3fs.Open("/the/example/key")
if err != nil {
  return err
}
defer r.Close() 

// utilize Golang's convenient streaming methods
io.ReadAll(r)

Writing objects

To open the file for writing use Create function giving the absolute path starting with /, the returned file descriptor is a composite of io.Writer, io.Closer and stream.Stat. Utilize Golang's convenient streaming methods to update S3 object seamlessly. Once all bytes are written, it's crucial to close the stream. Failure to do so would cause data loss. The object is considered successfully created on S3 only if all Write operations and subsequent Close actions are successful.

w, err := s3fs.Create("/the/example/key", nil)
if err != nil {
  return err
}

// utilize Golang's convenient streaming methods
io.WriteString(w, "Hello World!\n")

// close stream and handle error to prevent data loss. 
err = w.Close()
if err != nil {
  return err
}

Walking through objects

The file system implements interfaces fs.ReadDirFS and fs.GlobFS for traversal through objects. The classical file system organize data hierarchically into directories as opposed to the flat storage structure of general purpose AWS S3 (the directory bucket is not supported yet). The flat structure implies a limitations into the implementation

  1. it assumes a directory if the path ends with / (e.g. /the/example/key points to the object, /the/example/key/ points to the directory).
  2. it return path relative to pattern for all found object.
err := fs.WalkDir(s3fs, dir, func(path string, d fs.DirEntry, err error) error {
  if err != nil {
    return err
  }

  if d.IsDir() {
    return nil
  }

  // do something with file
  // path is absolute path to the file but entry is relative path
  // path == dir + d.Name()

  return nil
})

Supported File System Operations

For added convenience, the file system is enhanced with stream.RemoveFS and stream.CopyFS, enabling the removal of S3 objects and the copying of objects across buckets, respectively.

Objects metadata

fs.FileInfo is a primary container for S3 objects metadata. The file system provides access to metadata either from open streams (file descriptors) or for any key.

fi, err := s3fs.Stat("/the/example/key")
if err != nil {
  return err
}

Type-safe objects metadata

AWS S3 support object metadata as a set of name-value pairs and allows to define the metadata at the time you upload the object and read it late. This library support both system and user-controlled metadata attributes.

What sets this library apart is its encouragement for developers to utilize the Golang type system in defining object metadata. Rather than working with loosely typed name-value pairs, metadata is structured as Golang structs, promoting correctness and maintainability. This approach is facilitated through generic programming style within the library.

A Golang struct type serves as the metadata container, where each public field is transformed into name-value pairs before being written to S3. Example below defines the container build with two user controlled attributes Author and Chapter and two system attributes ContentType and ContentLanguage.

type Note struct {
  Author          string
  Chapter         string
  ContentType     string
  ContentLanguage string
}

The file system interface has been expanded to handle user-defined metadata in a type-safe manner. Firstly, stream.New() create type annotated client. Secondly, the Create() function accepts a pointer to the metadata container, which is then written alongside the data. Lastly, the fs.FileInfo container retains an instance of associated metadata, which is accessible through either a Sys() call or the StatSys() helper.

// create client and define metadata type
s3fs, err := stream.New[Note](/* name of S3 bucket */)

// create stream and annotate it with metadata
fd, err := s3fs.Create("/the/example/key",
  &Note{/* defined metadata values */},
)

// fs.FileInfo carries previously written metadata, use Sys() function to access.
fi, err := s3fs.Stat("/the/example/key")

note := s3fs.StatSys(fi)

AWS S3 defined collection of well-known system attributes. This library supports only subset of those: Cache-Control, Content-Encoding, Content-Language, Content-Type, Expires, ETag, Last-Modified and Storage-Class. Open Pull Request or raise an issue if subset needs to be enhanced.

The library define type stream.SystemMetadata that incorporates all supported attributes. You might annotate your own types.

type Note struct {
  stream.SystemMetadata
  Author          string
  Chapter         string
}

Presigned Urls

Usage of io.Reader and io.Writer interfaces is sufficient for majority cloud applications. However, there are instances where delegating read/write responsibilities to a mobile client becomes necessary. For example, directly uploading images or video files from a mobile client to an S3 bucket is both scalable and considerably more efficient than routing through a backend system. The library accommodates this scenario with a special case for streaming binary objects using pre-signed URLs. The file system return pre-signed URL for the stream within the metadata. It only requires definition of attribute PreSignedUrl of string type.

type PreSignedUrl struct {
	PreSignedUrl string
}

Use fs.FileInfo container and metadata api depicted above to obtain pre-signed URLs.

// Mount the S3 bucket with metadata containing the `PreSignedUrl` attribute
s3fs, err := stream.New[stream.PreSignedUrl](/* name of S3 bucket */)

// Open file for read or write
fd, err := s3fs.Create("/the/example/key", nil)
if err != nil {
  return err
}
defer fd.Close()

// read files metadata
fi, err := fd.Stat()
if err != nil {
return err
}

if meta := s3fs.StatSys(fi); meta != nil {
  // Use meta.PreSignedUrl
}

Note: Utilizing a pre-signed URL necessitates passing all headers that were provided to the Create function.

fd, err := s3fs.Create("/the/example/key",
  &Note{
    Author:          "fogfish",
    ContentType:     "text/plain",
    ContentLanguage: "en",
  }
)
curl -XPUT https://pre-signed-url-goes-here \
  -H 'Content-Type: text/plain' \
  -H 'Content-Language: en' \
  -H 'X-Amz-Meta-Author: fogfish' \
  -d 'some content'

Error handling

The library consistently returns fs.PathError, except in cases where the object is not found, in which fs.ErrNotExist is returned. Additionally, it refrains from wrapping stream I/O errors.

How To Contribute

The library is MIT licensed and accepts contributions via GitHub pull requests:

  1. Fork it
  2. Create your feature branch (git checkout -b my-new-feature)
  3. Commit your changes (git commit -am 'Added some feature')
  4. Push to the branch (git push origin my-new-feature)
  5. Create new Pull Request

The build and testing process requires Go version 1.13 or later.

build and test library.

git clone https://github.com/fogfish/stream
cd stream
go test

commit message

The commit message helps us to write a good release note, speed-up review process. The message should address two question what changed and why. The project follows the template defined by chapter Contributing to a Project of Git book.

bugs

If you experience any issues with the library, please let us know via GitHub issues. We appreciate detailed and accurate reports that help us to identity and replicate the issue.

License

See LICENSE

About

Golang file system abstraction tailored for AWS S3, enabling seamless streaming of binary objects along with their corresponding metadata.

Topics

Resources

License

Stars

Watchers

Forks

Packages

No packages published

Languages