sftppush is a mini pipeline: file write-close event > decompress > S3 archive.

Initially, it was intended to replace low-compute serverless functions that simply push files from the Sftp server into an S3 bucket location. Compared to mounting an Sftp server's file system directly onto S3 via FUSE, this solution seems to be a better fit for production use cases.
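The following is a minimal sketch of the "decompress > S3 archive" step, not the actual implementation: it streams a gzip-compressed source file into an S3 bucket using the AWS SDK for Go. The helper name, bucket, profile, and file path are illustrative only.

```go
// Sketch only: decompress a gzip file and stream it to S3.
package main

import (
	"compress/gzip"
	"log"
	"os"
	"path/filepath"
	"strings"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/s3/s3manager"
)

// archiveFile is a hypothetical helper: it decompresses srcPath and uploads
// the result to the given bucket, keyed by the file name without ".gz".
func archiveFile(srcPath, bucket, profile string) error {
	f, err := os.Open(srcPath)
	if err != nil {
		return err
	}
	defer f.Close()

	gz, err := gzip.NewReader(f)
	if err != nil {
		return err
	}
	defer gz.Close()

	sess := session.Must(session.NewSessionWithOptions(session.Options{
		Profile:           profile,
		SharedConfigState: session.SharedConfigEnable,
	}))

	key := strings.TrimSuffix(filepath.Base(srcPath), ".gz")
	_, err = s3manager.NewUploader(sess).Upload(&s3manager.UploadInput{
		Bucket: aws.String(bucket),
		Key:    aws.String(key),
		Body:   gz, // stream the decompressed bytes straight to S3
	})
	return err
}

func main() {
	if err := archiveFile("/device1/data/sample.csv.gz", "olmax-test-sftppush-126912", "default"); err != nil {
		log.Fatal(err)
	}
}
```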
- AWS CLI (https://docs.aws.amazon.com/cli/latest/userguide/cli-chap-install.html)
- Go 1.15.x
Most likely you want to run this project inside an Sftp server that receives a constant stream of data files.

The sftppush project is intended to run in a Linux (Ubuntu/Debian) VM. It captures WRITE_CLOSE events for files on the file system, based on one or multiple source directories.

The `watch --source` flag can read a single directory as well as a configuration file containing multiple directories. In the case of multiple target directories, a separate Go watch process is spawned for each target directory.
NOTE: ONLY TESTED ON UBUNTU AND DEBIAN (this project relies on the UNIX CLOSE_WRITE event)
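For illustration, here is a minimal sketch of a per-directory watch loop, one goroutine per source directory. It uses the upstream fsnotify API with `fsnotify.Write` as a stand-in, since upstream fsnotify does not expose CLOSE_WRITE (see the fsnotify note further down); the directory names are placeholders taken from the examples in this README.

```go
// Illustrative sketch only: spawn one watcher goroutine per source directory.
package main

import (
	"log"
	"sync"

	"github.com/fsnotify/fsnotify"
)

func watchDir(dir string, wg *sync.WaitGroup) {
	defer wg.Done()

	w, err := fsnotify.NewWatcher()
	if err != nil {
		log.Printf("watcher for %s: %v", dir, err)
		return
	}
	defer w.Close()

	if err := w.Add(dir); err != nil {
		log.Printf("add %s: %v", dir, err)
		return
	}

	for {
		select {
		case ev, ok := <-w.Events:
			if !ok {
				return
			}
			// sftppush reacts to CLOSE_WRITE; fsnotify.Write is only a stand-in here.
			if ev.Op&fsnotify.Write == fsnotify.Write {
				log.Printf("file event in %s: %s", dir, ev.Name)
			}
		case err, ok := <-w.Errors:
			if !ok {
				return
			}
			log.Printf("watch error in %s: %v", dir, err)
		}
	}
}

func main() {
	dirs := []string{"/device1/data", "/device2/data"} // e.g. watch.source.paths
	var wg sync.WaitGroup
	for _, d := range dirs {
		wg.Add(1)
		go watchDir(d, &wg)
	}
	wg.Wait()
}
```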
```bash
$ git clone https://github.com/olmax99/sftppush.git
$ cd sftppush
$ make build
$ ./bin/sftppush-0.1.0-linux_amd64 help
```
This will create a new binary in `./bin/sftppush-0.2.2-linux_amd64`.
Recommended: Create a `config.yaml` in the project root and set the `--config` or `-c` flag.
All source directories for fsnotify are determined by: <defaults.userpath> + <watch.source.name> + <watch.source.paths>
./config.yaml

```yaml
defaults:
  userpath:   # Set by default, can be overwritten here or with environment variable
  s3target: olmax-test-sftppush-126912
  awsprofile: ***
  awsregion: ***
  # log:
  #   level: info
  #   location: "syslog" || <abs/path/to/logfile>
  #   format: json

watch:
  source:
    - name: sftpuser1
      paths:
        - /path/to/source/directory1
        - /path/to/source/directory2
    # - name: sftpuser2
    #   paths:
    #     - path/to/source/directory1
    #   s3target: olmax-test-sftppush-126912
```
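As a sketch of the composition rule above, the watched directories for `sftpuser1` would be derived roughly as below. Plain path joining is an assumption for illustration, not the project's exact logic; the `userpath` value is a placeholder.

```go
// Sketch: derive watched directories from defaults.userpath + watch.source.name + watch.source.paths.
package main

import (
	"fmt"
	"path/filepath"
)

func main() {
	userpath := "/home/" // defaults.userpath (placeholder)
	name := "sftpuser1"  // watch.source.name
	paths := []string{   // watch.source.paths
		"/path/to/source/directory1",
		"/path/to/source/directory2",
	}

	for _, p := range paths {
		// e.g. /home/sftpuser1/path/to/source/directory1
		fmt.Println(filepath.Join(userpath, name, p))
	}
}
```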
By default (without `log:`), sftppush will try to use `~/.sftppush/sftppush.log`.

- If the directory does not exist, it will use `Stderr`.
- Optionally, `syslog` can be used, but it requires `rsyslog` to be active.
- The log level is `debug` by default, which produces overhead.
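A minimal sketch of this fallback behaviour follows. It assumes (not confirmed by the repository) that the destination is picked in this order: the configured `syslog` target, then `~/.sftppush/sftppush.log`, and otherwise Stderr.

```go
// Sketch of the assumed log destination fallback, not the actual implementation.
package main

import (
	"io"
	"log"
	"log/syslog"
	"os"
	"path/filepath"
)

func logWriter(location string) io.Writer {
	if location == "syslog" {
		// requires a running (r)syslog daemon
		w, err := syslog.New(syslog.LOG_INFO|syslog.LOG_DAEMON, "sftppush")
		if err == nil {
			return w
		}
		return os.Stderr
	}

	home, err := os.UserHomeDir()
	if err != nil {
		return os.Stderr
	}
	logFile := filepath.Join(home, ".sftppush", "sftppush.log")
	f, err := os.OpenFile(logFile, os.O_APPEND|os.O_CREATE|os.O_WRONLY, 0644)
	if err != nil {
		// e.g. ~/.sftppush does not exist -> fall back to Stderr
		return os.Stderr
	}
	return f
}

func main() {
	log.SetOutput(logWriter(""))
	log.Println("sftppush started")
}
```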
If a config file is created, there is no need to set the `--source` flag. Flags will override config file values.
Running it should be as simple as:
```bash
$ ./bin/sftppush-0.2.0-linux_amd64 --config config.yaml watch
```
```bash
# EXAMPLE 1: Run without config with two sources
$ SFTPPUSH_DEFAULTS_S3TARGET=*** SFTPPUSH_DEFAULTS_AWSPROFILE=*** \
  ./bin/sftppush-0.2.0-linux_amd64 watch \
  --source="name=sftpuser1,paths=/device1/data /device2/data" \
  --source="name=sftpuser2,paths=/device1/data /device2/data"

# EXAMPLE 2: Run with a custom user directory - needs a trailing '/'
$ SFTPPUSH_DEFAULTS_USERPATH="/home/my_test_dir/" ./bin/sftppush-0.2.0-linux_amd64 -c config.yaml watch
```
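The `SFTPPUSH_DEFAULTS_*` environment variables used above suggest a prefix-based mapping onto the config keys. The sketch below shows how such a mapping could be wired up with `spf13/viper`; using viper here is an assumption for illustration, not something this README confirms.

```go
// Sketch: map SFTPPUSH_DEFAULTS_S3TARGET onto the config key defaults.s3target (assumes viper).
package main

import (
	"fmt"
	"strings"

	"github.com/spf13/viper"
)

func main() {
	v := viper.New()
	v.SetConfigFile("config.yaml")
	_ = v.ReadInConfig() // optional: env vars alone are enough for this demo

	v.SetEnvPrefix("sftppush")                         // SFTPPUSH_...
	v.SetEnvKeyReplacer(strings.NewReplacer(".", "_")) // defaults.s3target -> DEFAULTS_S3TARGET
	v.AutomaticEnv()

	// SFTPPUSH_DEFAULTS_S3TARGET=... overrides defaults.s3target from config.yaml
	fmt.Println("s3target:", v.GetString("defaults.s3target"))
}
```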
Some tests require the OS file system. You can choose to run the tests inside a Docker container.
```bash
$ [DOCKER=1] make test
```
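As an example of why a real file system is needed, here is a hypothetical test (not taken from the repository) that writes and reads a file in a temporary directory on the host or container file system:

```go
// Hypothetical file-system-dependent test, shown only for illustration.
package main

import (
	"io/ioutil"
	"path/filepath"
	"testing"
)

func TestWriteAndReadFile(t *testing.T) {
	dir := t.TempDir() // real directory on the host (or container) file system

	src := filepath.Join(dir, "sample.csv")
	if err := ioutil.WriteFile(src, []byte("a,b,c\n1,2,3\n"), 0644); err != nil {
		t.Fatal(err)
	}

	got, err := ioutil.ReadFile(src)
	if err != nil {
		t.Fatal(err)
	}
	if string(got) != "a,b,c\n1,2,3\n" {
		t.Fatalf("unexpected content: %q", got)
	}
}
```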
- WRITE_CLOSE events are not cross-platform and are currently only readily accessible on Linux file systems. This restriction has led the fsnotify maintainers to keep the respective PR in a pending state.
- Note that the sftppush project relies heavily on this feature.
- One way of dealing with this is to fork the original fsnotify and accept the changes made in the respective PR, e.g. via a `go.mod` replace directive as sketched below.
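A sketch of how such a fork could be wired in; the fork path and versions below are placeholders, not real modules:

```
module github.com/olmax99/sftppush

go 1.15

require github.com/fsnotify/fsnotify v1.4.9

// Point the fsnotify import at a fork that carries the CLOSE_WRITE changes
// (placeholder path and version; a local checkout also works, e.g. => ../fsnotify).
replace github.com/fsnotify/fsnotify => github.com/your-fork/fsnotify v1.4.9-closewrite
```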
```bash
$ aws s3api --profile *** list-objects --bucket *** --query 'Contents[?contains(Key,``)].{Key: Key, Size: Size}' --output table | wc -l
```

```bash
# install golangci-lint
curl -sSfL https://raw.githubusercontent.com/golangci/golangci-lint/master/install.sh | sh -s -- -b $(go env GOPATH)/bin v1.31.0
```