CLI Syntax
The following is the complete syntax of the CLI arguments for 3.5.3. Note that you can also generate this text simply by running:
java -jar ecs-sync-3.5.3.jar --help
Full 3.5.3 CLI syntax:
EcsSync v3.5.3
usage: java -jar ecs-sync.jar -source <source-uri> [-filters
<filter1>[,<filter2>,...]] -target <target-uri> [options]
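For example, a minimal end-to-end invocation following this synopsis (host, credentials, and bucket name below are placeholders) might look like:
java -jar ecs-sync-3.5.3.jar -source file:///data/to/migrate -target s3:https://my-key:my-secret@s3.example.com/my-bucket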
Common options:
--bandwidth-limit <bandwidth-limit> Specify the max speed in bytes/s
to throttle the traffic
bandwidth. Default is 0 (no
throttle). Note that if verify
is enabled, the target storage
will have reads and writes using
the same throttle limit, so make
sure your bandwidth is
sufficient to support full
duplex traffic
--buffer-size <buffer-size> Sets the buffer size (in bytes)
to use when streaming data from
the source to the target
(supported plugins only).
Defaults to 128K
--db-connect-string <db-connect-string> Enables the MySQL database
engine and specifies the JDBC
connect string to connect to the
database (i.e.
"jdbc:mysql://localhost:3306/ecs
_sync?user=foo&password=bar"). A
database will make repeat runs
and incrementals more efficient.
With this database type, you can
use the mysql client to
interrogate the details of all
objects in the sync. Note that
in the UI, this option is the
default and is automatically
populated by the server (you
don't need a value here)
--db-enc-password <db-enc-password> Specifies the encrypted password
for the MySQL database
--db-enhanced-details-enabled Specifies whether the DB should
include enhanced details, like
source/target MD5 checksum,
retention durations, etc. Note
this will cause the DB to
consume more storage and may add
some latency to each copy
operation
--db-file <db-file> Enables the Sqlite database
engine and specifies the file to
hold the status database. A
database will make repeat runs
and incrementals more efficient.
With this database type, you can
use the sqlite3 client to
interrogate the details of all
objects in the sync
--db-table <db-table> Specifies the DB table name to
use. When using MySQL or the UI,
be sure to provide a unique
table name or risk corrupting a
previously used table. Default
table is "objects" except in the
UI, where a unique name is
generated for each job. In the
UI, you should specify a table
name to ensure the table
persists after the job is
archived
--delete-source Supported source plugins will
delete each source object once
it is successfully synced (does
not include directories). Use
this option with care! Be sure
log levels are appropriate to
capture transferred (source
deleted) objects
--filters <filter-names> The comma-delimited list of
filters to apply to objects as
they are synced. Note that
filters are applied in the order
specified (via CLI, XML or
UI). Specify the activation names
of the filters [returned from
Filter.getActivationName()].
Examples:
id-logging
gladinet-mapping,strip-acls
Each filter may have additional
custom parameters you may
specify separately
--force-sync Force the write of each object,
regardless of its state in the
target storage
--help Displays this help content
--ignore-invalid-acls If syncing ACL information when
syncing objects, ignore any
invalid entries (i.e.
permissions or identities that
don't exist in the target
system)
--log-level <log-level> Sets the verbosity of logging
(silent|quiet|verbose|debug).
Default is quiet
--no-estimation By default, the source plugin
will query the source storage to
crawl and estimate the total
amount of data to be
transferred. Use this option to
disable estimation (i.e. for
performance improvement)
--no-monitor-performance                  Performance monitoring for reads
                                          and writes is enabled by default
                                          on any plugin that supports it;
                                          the data is available via the
                                          REST service during a sync. Use
                                          this option to disable monitoring
--no-rest-server Disables the REST server
--no-sync-data Object data is synced by default
--no-sync-metadata Metadata is synced by default
--non-recursive Hierarchical storage will sync
recursively by default
--perf-report-seconds <seconds> Report upload and download rates
for the source and target
plugins every <x> seconds to
INFO logging. Default is off
(0)
--remember-failed Tracks all failed objects and
displays a summary of failures
when finished
--rest-endpoint <rest-endpoint>           Specifies the host and port to
use for the REST endpoint.
Optional; defaults to
localhost:9200
--rest-only Enables REST-only control. This
will start the REST server and
remain alive until manually
terminated. Excludes all other
options except --rest-endpoint
--retry-attempts <retry-attempts> Specifies how many times each
object should be retried after
an error. Default is 2 retries
(total of 3 attempts)
--source <source-uri> The URI for the source storage.
Examples:
atmos:http://uid:secret@host:por
t
'- Uses Atmos as the source;
could also be https.
file:///tmp/atmos/
'- Reads from a directory
archive:///tmp/atmos/backup.tar.
gz
'- Reads from an archive file
s3:http://key:secret@host:port
'- Reads from an S3 bucket
Other plugins may be available.
See their documentation for URI
formats
--source-list <source-list> The list of source objects to
sync. Unless sourceListRawValues
is enabled, this should be in
CSV format, with one object per
line, where the absolute
identifier (full path or key) is
the first value in each line.
This entire line is available to
each plugin as a raw string
--source-list-file <source-list-file> Path to a file that supplies the
list of source objects to sync.
Unless sourceListRawValues is
enabled, this file should be in
CSV format, with one object per
line, where the absolute
identifier (full path or key) is
the first value in each line.
This entire line is available to
each plugin as a raw string
--source-list-raw-values Whether to treat the lines in
the sourceList or sourceListFile
as raw object identifier values
(do not do any CSV parsing and
do not remove comments, escapes,
or trim white space). Default is
false
--sync-acl Sync ACL information when
syncing objects (in supported
plugins)
--sync-retention-expiration Sync retention/expiration
information when syncing objects
(in supported plugins). The
target plugin will *attempt* to
replicate retention/expiration
for each object. Works only on
plugins that support
retention/expiration. If the
target is an Atmos cloud, the
target policy must enable
retention/expiration immediately
for this to work
--target <target-uri> The URI for the target storage.
Examples:
atmos:http://uid:secret@host:por
t
'- Uses Atmos as the target;
could also be https.
file:///tmp/atmos/
'- Writes to a directory
archive:///tmp/atmos/backup.tar.
gz
'- Writes to an archive file
s3:http://key:secret@host:port
'- Writes to an S3 bucket
Other plugins may be available.
See their documentation for URI
formats
--thread-count <thread-count> Specifies the number of objects
to sync simultaneously. Default
is 16
--throughput-limit <throughput-limit> Specify the max TPS throughput
limit in objects/s. Default is 0
(no throttle)
--timing-window <timing-window> Sets the window for timing
statistics. Every {timingWindow}
objects that are synced, timing
statistics are logged and reset.
Default is 10,000 objects
--timings-enabled Enables operation timings on all
plug-ins that support it
--use-metadata-checksum-for-verification When available, use the checksum
in the metadata of the object
(e.g. S3 ETag) during
verification, instead of reading
back the object data. This may
improve efficiency by avoiding a
full read of the object data to
verify source and target.
However, you must fully trust
the checksum provided by both
source and target storage
--verify After a successful object
transfer, the object will be
read back from the target system
and its MD5 checksum will be
compared with that of the source
object (generated during
transfer). This only compares
object data (metadata is not
compared) and does not include
directories
--verify-only Similar to --verify except that
the object transfer is skipped
and only read operations are
performed (no data is written)
--version Displays package version
--xml-config <xml-config> Specifies an XML configuration
file. In this mode, the XML file
contains all of the
configuration for the sync job.
In this mode, most other CLI
arguments are ignored.
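To illustrate a few of these options together, a sketch of a job that tracks status in a Sqlite database, verifies each object after transfer, and runs 32 threads (all paths and credentials below are placeholders) could be:
java -jar ecs-sync-3.5.3.jar -source file:///mnt/source -target s3:https://my-key:my-secret@s3.example.com/my-bucket --db-file /var/tmp/sync-status.db --thread-count 32 --verify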
Available plugins are listed below along with any custom options they may have
Archive File (archive:)
The archive plugin reads/writes data from/to an archive file (tar, zip,
etc.) It is triggered by an archive URL:
archive:[<scheme>://]<path>, e.g. archive:file:///home/user/myfiles.tar
or archive:http://company.com/bundles/project.tar.gz or archive:cwd_file.zip
The contents of the archive are the objects. To preserve object metadata on the
target filesystem, or to read back preserved metadata, use
--store-metadata.
NOTE: Storage options must be prefixed by source- or target-, depending on
which role they assume
--delete-check-script <delete-check-script> When --delete-source is used,
add this option to execute an
external script to check
whether a file should be
deleted. If the process
exits with return code zero,
the file is safe to delete.
--delete-older-than <delete-age> When --delete-source is used,
add this option to only
delete files that have been
modified more than
<delete-age> milliseconds
ago.
--excluded-paths <regex-pattern> A list of regular expressions
to search against the full
file path. If the path
matches, the file will be
skipped. Since this is a
regular expression, take care
to escape special characters.
For example, to exclude all
.snapshot directories, the
pattern would be
.*/\.snapshot. Specify
multiple entries by repeating
the CLI option or using
multiple lines in the UI
form.
--follow-links Instead of preserving
symbolic links, follow them
and sync the actual files.
--include-base-dir By default, the base
directory is not included as
part of the sync (only its
children are). Enable this to
sync the base directory.
--modified-since <yyyy-MM-ddThh:mm:ssZ> Only look at files that have
been modified since the
specified date/time.
Date/time should be provided
in ISO-8601 UTC format (i.e.
2015-01-01T04:30:00Z).
--no-relative-link-targets By default, any symbolic link
targets that point to an
absolute path within the
primary source directory will
be changed to a (more
portable) relative path.
Turn this option off to keep
the target path as-is.
--store-metadata When used as a target, stores
source metadata in a json
file, since filesystems have
no concept of user metadata.
--use-absolute-path Uses the absolute path to the
file when storing it instead
of the relative path from the
source dir.
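As a sketch (file names are placeholders), restoring a local tarball into a directory while writing preserved metadata on the target might look like:
java -jar ecs-sync-3.5.3.jar -source archive:file:///backups/myfiles.tar.gz -target file:///restore/myfiles --target-store-metadata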
Filesystem (file:)
The filesystem plugin reads/writes data from/to a file or directory. It is
triggered by the URI:
file://<path>, e.g. file:///home/user/myfiles
If the URL refers to a file, only that file will be synced. If a directory is
specified, the contents of the directory will be synced. Unless the
--non-recursive flag is set, the subdirectories will also be recursively
synced. To preserve object metadata on the target filesystem, or to read
back preserved metadata, use --store-metadata.
NOTE: Storage options must be prefixed by source- or target-, depending on
which role they assume
--delete-check-script <delete-check-script> When --delete-source is used,
add this option to execute an
external script to check
whether a file should be
deleted. If the process
exits with return code zero,
the file is safe to delete.
--delete-older-than <delete-age> When --delete-source is used,
add this option to only
delete files that have been
modified more than
<delete-age> milliseconds
ago.
--excluded-paths <regex-pattern> A list of regular expressions
to search against the full
file path. If the path
matches, the file will be
skipped. Since this is a
regular expression, take care
to escape special characters.
For example, to exclude all
.snapshot directories, the
pattern would be
.*/\.snapshot. Specify
multiple entries by repeating
the CLI option or using
multiple lines in the UI
form.
--follow-links Instead of preserving
symbolic links, follow them
and sync the actual files.
--include-base-dir By default, the base
directory is not included as
part of the sync (only its
children are). Enable this to
sync the base directory.
--modified-since <yyyy-MM-ddThh:mm:ssZ> Only look at files that have
been modified since the
specified date/time.
Date/time should be provided
in ISO-8601 UTC format (i.e.
2015-01-01T04:30:00Z).
--no-relative-link-targets By default, any symbolic link
targets that point to an
absolute path within the
primary source directory will
be changed to a (more
portable) relative path.
Turn this option off to keep
the target path as-is.
--store-metadata When used as a target, stores
source metadata in a json
file, since filesystems have
no concept of user metadata.
--use-absolute-path Uses the absolute path to the
file when storing it instead
of the relative path from the
source dir.
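For instance, a directory-to-directory sync that excludes .snapshot directories on the source (note the source- prefix on the storage option; paths are placeholders) might be:
java -jar ecs-sync-3.5.3.jar -source file:///mnt/nfs1 -target file:///mnt/nfs2 --source-excluded-paths '.*/\.snapshot'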
S3 (s3:)
Represents storage in an Amazon S3 bucket. This plugin is triggered by the
pattern:
s3:[http[s]://]access_key:secret_key@[host[:port]]/bucket[/root-prefix]
Scheme, host and port are all optional. If omitted,
https://s3.amazonaws.com:443 is assumed. sessionToken (optional) is
required only when using STS session credentials. profile (optional) enables
the profile credentials provider. useDefaultCredentialsProvider (optional) enables the
AWS default credentials provider chain. keyPrefix (optional) is the prefix
under which to start enumerating or writing keys within the bucket, e.g.
dir1/. If omitted, the root of the bucket is assumed. Note that MPUs now
use a shared thread pool per plugin instance, the size of which matches the
threadCount setting in the main options - so mpuThreadCount here has no
effect.
NOTE: Storage options must be prefixed by source- or target-, depending on
which role they assume
--base64-tls-certificate <base64-tls-certificate> Base64 Encoded TLS
Certificate of the S3
Host to be trusted
--create-bucket By default, the target
bucket must exist. This
option will create it
if it does not
--disable-v-hosts Specifies whether
virtual hosted buckets
will be disabled (and
path-style buckets will
be used)
--excluded-keys <regex-pattern> A list of regular
expressions to search
against the full object
key. If the key
matches, the object
will not be included in
the enumeration. Since
this is a regular
expression, take care
to escape special
characters. For
example, to exclude all
.md5 checksums, the
pattern would be
.*\.md5. Specify
multiple entries by
repeating the CLI
option or XML element,
or using multiple lines
in the UI form
--include-versions Transfer all versions
of every object. NOTE:
this will overwrite all
versions of each source
key in the target
system if any exist!
--legacy-signatures Specifies whether the
client will use v2
auth. Necessary for ECS
< 3.0
--mpu-part-size-mb <size-in-MB> Sets the part size to
use when multipart
upload is required
(objects over 5GB).
Default is 128MB,
minimum is 5MB
--mpu-resume-enabled Enables multi-part
upload (MPU) to be
resumed from existing
uploaded parts.
--mpu-threshold-mb <size-in-MB> Sets the size threshold
(in MB) when an upload
shall become a
multipart upload
--no-preserve-directories By default, directories
are stored in S3 as
empty objects to
preserve empty dirs and
metadata from the
source. Turn this off
to avoid copying
directories. Note that
if this is turned off,
verification may fail
for all directory
objects
--no-url-decode-keys In bucket list
operations, the
encoding-type=url
parameter is always
sent (to request safe
encoding of keys). By
default, object keys in
bucket listings are
then URL-decoded.
Disable this for
S3-compatible systems
that do not support the
encoding-type
parameter, so that keys
are pulled verbatim out
of the XML. Note:
disabling this on
systems that *do*
support the parameter
may produce corrupted
key names in the bucket
list.
--profile <profile> The profile name to use
when providing
credentials in a
configuration file (via
useDefaultCredentialsPr
ovider)
--region <region> Overrides the AWS
region that would be
inferred from the
endpoint
--session-token <session-token> The session token to
use with temporary
credentials
--socket-timeout-ms <timeout-ms> Sets the socket timeout
in milliseconds
(default is 0ms)
--sse-s3-enabled Specifies whether
Server-Side Encryption
(SSE-S3) is enabled
when writing to the
target
--store-source-object-copy-markers Enable this to store
source object copy
markers (mtime and
ETag) in target user
metadata. This will
mark the object such
that a subsequent copy
job can more accurately
recognize if the source
object has already been
copied and skip it
--use-default-credentials-provider Use S3 default
credentials provider.
See
https://docs.aws.amazon
.com/AWSJavaSDK/latest/
javadoc/com/amazonaws/a
uth/DefaultAWSCredentia
lsProviderChain.html
--write-test-object By default, storage
plugins will avoid
writing any non-user
data to the target.
However, that is the
only way to verify
write access in AWS.
Enable this option to
write (and then delete)
a test object to the
target bucket during
initialization, to
verify write access
before starting the
job. The test object
name will start with
.ecs-sync-write-access-
check-
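A hypothetical filesystem-to-S3 copy that creates the target bucket and raises the MPU part size (credentials and names are placeholders; the default AWS endpoint is assumed) might be:
java -jar ecs-sync-3.5.3.jar -source file:///data/export -target s3:https://my-key:my-secret@s3.amazonaws.com/my-bucket --target-create-bucket --target-mpu-part-size-mb 256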
ECS S3 (ecs-s3:)
Reads and writes content from/to an ECS S3 bucket. This plugin is triggered
by the pattern:
ecs-s3:http[s]://access_key:secret_key@hosts/bucket[/key-prefix] where hosts =
host[,host][,..] or vdc-name(host,..)[,vdc-name(host,..)][,..] or
load-balancer[:port]
Scheme, host and port are all required. key-prefix (optional) is the prefix
under which to start enumerating or writing within the bucket, e.g. dir1/.
If omitted, the root of the bucket will be enumerated or written to. Note
that MPUs now use a shared thread pool per plugin instance, the size of
which matches the threadCount setting in the main options - so
mpuThreadCount here has no effect.
NOTE: Storage options must be prefixed by source- or target-, depending on
which role they assume
--create-bucket By default, the target
bucket must exist. This
option will create it
if it does not
--default-retention-mode <default-retention-mode> Specify the default
Object Lock Retention
Mode (Governance or
Compliance) to be used
if unavailable from
source. Note: if
Compliance mode is
selected, a protected
object can't be
overwritten or deleted
by any user, including
root user.
--enable-v-hosts Specifies whether
virtual hosted buckets
will be used (default
is path-style buckets)
--geo-pinning-enabled Enables geo-pinning.
This will use a
standard algorithm to
select a consistent VDC
for each object key or
bucket name
--include-versions Enable to transfer all
versions of every
object. NOTE: this will
overwrite all versions
of each source key in
the target system if
any exist!
--mpu-enabled Enables multi-part
upload (MPU). Large
files will be split
into multiple streams
and (if possible) sent
in parallel
--mpu-part-size-mb <size-in-MB> Sets the part size to
use when multipart
upload is required
(objects over 5GB).
Default is 128MB,
minimum is 4MB
--mpu-resume-enabled Enables multi-part
upload (MPU) to be
resumed from existing
uploaded parts.
--mpu-threshold-mb <size-in-MB> Sets the size threshold
(in MB) when an upload
shall become a
multipart upload
--no-apache-client-enabled Disabling this will use
the native Java HTTP
protocol handler, which
can be faster in some
situations, but is
buggy
--no-preserve-directories By default, directories
are stored in S3 as
empty objects to
preserve empty dirs and
metadata from the
source. Turn this off
to avoid copying
directories. Note that
if this is turned off,
verification may fail
for all directory
objects
--no-reset-invalid-content-type By default, any invalid
content-type is reset
to the default
(application/octet-stre
am). Turn this off to
fail these objects (ECS
does not allow invalid
content-types)
--no-smart-client The smart-client is
enabled by default. Use
this option to turn it
off when using a load
balancer or fixed set
of nodes
--remote-copy If enabled, a
remote-copy command is
issued instead of
streaming the data. Can
only be used when the
source and target are
the same system
--retention-type <retention-type> Specify Retention type
to be used: S3 Object
Lock, or ECS Classic
Retention. Note: the
retention type
ObjectLock requires IAM
users. The retention
type Classic is not
supported when source
bucket enables S3
Object Lock.
--session-token <session-token> The STS Session Token,
if using temporary
credentials to access
ECS
--socket-connect-timeout-ms <timeout-ms> Sets the connection
timeout in milliseconds
(default is 15000ms)
--socket-read-timeout-ms <timeout-ms> Sets the read timeout
in milliseconds
(default is 0ms)
--store-source-object-copy-markers Enable this to store
source object copy
markers (mtime and
ETag) in target user
metadata. This will
mark the object such
that a subsequent copy
job can more accurately
recognize if the source
object has already been
copied and skip it
--url-encode-keys Enables URL-encoding of
object keys in bucket
listings. Use this if a
source bucket has
illegal XML characters
in key names
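For example (host and credentials are placeholders), targeting ECS through a load balancer, with the smart client disabled as recommended above and MPU enabled:
java -jar ecs-sync-3.5.3.jar -source file:///data -target ecs-s3:https://my-user:my-secret@ecs-lb.example.com:9021/my-bucket --target-no-smart-client --target-mpu-enabled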
Atmos (atmos:)
The Atmos plugin is triggered by the URI pattern:
atmos:http[s]://uid:secret@host[,host..][:port][/namespace-path]
Note that the uid should be the 'full token ID' including the subtenant ID and
the uid concatenated by a slash
If you want to software load balance across multiple hosts, you can provide a
comma-delimited list of hostnames or IPs in the host part of the URI.
NOTE: Storage options must be prefixed by source- or target-, depending on
which role they assume
--access-type <access-type> The access method to locate objects
(objectspace or namespace)
--include-top-folder (only applies to namespace) By
default, only the children of the
specified namespace folder will be
synced. Enable this to include the
top folder when syncing. Useful
when there is metadata on that
folder (i.e. GeoDrive)
--no-encode-utf8 By default, metadata and header
values are URL-encoded in UTF-8.
Use this option to disable encoding
and send raw metadata and headers
--preserve-object-id Supported in ECS 3.0+ when used as
a target where another AtmosStorage
is the source (both must use
objectspace). When enabled, a new
ECS feature will be used to
preserve the legacy object ID,
keeping all object IDs the same
between the source and target
--remove-tags-on-delete When deleting from a source
subtenant, specifies whether to
delete listable-tags prior to
deleting the object. This is done
to reduce the tag index size and
improve write performance under the
same tags
--replace-metadata Atmos does not have a call to
replace metadata; only to set or
remove it. By default, set is used,
which means removed metadata will
not be reflected when updating
objects. Use this flag if your sync
operation might remove metadata
from an existing object
--retention-enabled Specifies that retention is enabled
in the target. Changes the write
behavior to work with wschecksum
and retention
--ws-checksum-type <ws-checksum-type> If specified, the atmos wschecksum
feature will be applied to writes.
Valid algorithms are sha1, or md5.
Disabled by default
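A hypothetical namespace-based Atmos source (note the uid is the subtenant ID and the uid joined by a slash; all values are placeholders):
java -jar ecs-sync-3.5.3.jar -source 'atmos:https://subtenant-id/my-uid:my-shared-secret@atmos1.example.com,atmos2.example.com/my/namespace/' -target file:///mnt/export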
Azure Blob (azure-blob:) -- Source Only
Reads content from Azure Blob Storage. This plugin is triggered by the
pattern:
DefaultEndpointsProtocol=https;AccountName=[accountName];AccountKey=[accountKey];EndpointSuffix=core.windows.net
Note that this plugin can only be used as a source.
When syncing from Azure Blob storage, run the sync with MPU disabled on the target.
NOTE: Storage options must be prefixed by source- or target-, depending on
which role they assume
--include-snap-shots Enable to transfer all snapshots of every blob
object
CAS (cas:)
The CAS plugin is triggered by the URI pattern:
cas:[hpp:]//host[:port][,host[:port]...]?name=<name>,secret=<secret>
or cas:[hpp:]//host[:port][,host[:port]...]?<pea_file>
Note that <name> should be of the format <subtenant_id>:<uid> when connecting
to an Atmos system. This is passed to the CAS SDK as the connection string
(you can use primary=, secondary=, etc. in the server hints). To facilitate
CAS migrations, sync from a CasStorage source to a CasStorage target. Note
that by default, verification of a CasStorage object will also verify all
blobs.
NOTE: Storage options must be prefixed by source- or target-, depending on
which role they assume
--application-name <application-name> This is the application name
given to the pool during
initial connection.
--application-version <application-version> This is the application
version given to the pool
during initial connection.
--delete-reason <audit-string> When deleting source clips,
this is the audit string.
--large-blob-count-enabled Enable this option for clips
with more than 100 blobs. It
will reduce the memory
footprint.
--no-drain-blobs-on-error May provide more stability
when errors occur while
writing a blob to the target.
Disable this for clips with
very large blobs.
--privileged-delete When deleting source clips,
use privileged delete.
--query-end-time <yyyy-MM-ddThh:mm:ssZ> When used as a source with
CAS query (no clip list is
provided), specifies the end
time of the query (only clips
created before this time will
be synced). If no end time is
provided, all clips created
after the specified start
time are synced. Note the end
time must not be in the
future, according to the CAS
server clock. Date/time
should be provided in
ISO-8601 UTC format (i.e.
2015-01-01T04:30:00Z)
--query-start-time <yyyy-MM-ddThh:mm:ssZ> When used as a source with
CAS query (no clip list is
provided), specifies the
start time of the query (only
clips created after this time
will be synced). If no start
time is provided, all clips
created before the specified
end time are synced. Note the
start time must not be in the
future, according to the CAS
server clock. Date/time
should be provided in
ISO-8601 UTC format (i.e.
2015-01-01T04:30:00Z)
--synchronize-clip-close EXPERIMENTAL - option to
serialize all selected calls
to the CAS SDK
--synchronize-clip-open EXPERIMENTAL - option to
serialize all selected calls
to the CAS SDK
--synchronize-clip-write EXPERIMENTAL - option to
serialize all selected calls
to the CAS SDK
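A sketch of a CAS-to-CAS migration authenticated with PEA files on both sides (hosts and file paths are placeholders):
java -jar ecs-sync-3.5.3.jar -source 'cas:hpp://centera.example.com?/etc/pea/source.pea' -target 'cas:hpp://ecs.example.com:3218?/etc/pea/target.pea'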
NFS (nfs:)
The nfs plugin reads/writes data from/to an nfs file or directory. It is
triggered by the URI:
nfs://server/<mount_root_path>, e.g. nfs://myserver/home/user/myfiles.
If subPath refers to a file, only that file will be synced. If a
directory is specified, the contents of the directory will be synced.
Unless the --non-recursive flag is set, the subdirectories will also be
recursively synced. To preserve object metadata on the target filesystem,
or to read back preserved metadata, use --store-metadata.
NOTE: Storage options must be prefixed by source- or target-, depending on
which role they assume
--delete-older-than <delete-age> When --delete-source is used, add
this option to only delete files
that have been modified more than
<delete-age> milliseconds ago
--excluded-paths <regex-pattern> A list of regular expressions to
search against the full file
path. If the path matches, the
file will be skipped. Since this
is a regular expression, take
care to escape special
characters. For example, to
exclude all .snapshot
directories, the pattern would be
.*/\.snapshot. Specify multiple
entries by repeating the CLI
option or using multiple lines in
the UI form
--follow-links Instead of preserving symbolic
links, follow them and sync the
actual files
--modified-since <yyyy-MM-ddThh:mm:ssZ> Only look at files that have been
modified since the specified
date/time. Date/time should be
provided in ISO-8601 UTC format
(i.e. 2015-01-01T04:30:00Z)
--store-metadata When used as a target, stores
source metadata in a json file,
since NFS filesystems have no
concept of user metadata
--sub-path <sub-path> Path to the primary file or
directory from the mount root.
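An illustrative NFS-to-ECS copy (server, export path, and credentials are placeholders):
java -jar ecs-sync-3.5.3.jar -source nfs://myserver/export/home -target ecs-s3:https://my-user:my-secret@ecs.example.com:9021/my-bucket --source-sub-path user/myfiles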
Simulated Storage for Testing (test:)
This plugin will generate random data when used as a source, or act as
/dev/null when used as a target
NOTE: Storage options must be prefixed by source- or target-, depending on
which role they assume
--chance-of-children <chance-of-children> When used as a source, the
percent chance that an object
is a directory vs a data
object. Default is 30
--max-child-count <max-child-count> When used as a source, the
maximum child count for a
directory (actual child count
is random). Default is 8
--max-depth <max-depth> When used as a source, the
maximum directory depth for
children. Default is 5
--max-metadata <max-metadata> When used as a source, the
maximum number of metadata tags
to generate (actual number is
random). Default is 5
--max-size <max-size> When used as a source, the
maximum size of objects (actual
size is random). Default is
1048576
--min-size <min-size> When used as a source, the
minimum size of objects (actual
size is random). Default is 0
--no-discard-data By default, all data generated
or read will be discarded. Turn
this off to store the object
data and index in memory
--no-read-data When used as a target, all data
is streamed from source by
default. Turn this off to avoid
reading data from the source
--object-count <object-count> When used as a source, the
exact number of root objects to
generate. Default is 100
--object-owner <object-owner> When used as a source,
specifies the owner of every
object (in the ACL)
--valid-groups <valid-groups> When used as a source,
specifies valid groups for
which to generate random grants
in the ACL
--valid-permissions <valid-permissions> When used as a source,
specifies valid permissions to
use when generating random
grants
--valid-users <valid-users> When used as a source,
specifies valid users for which
to generate random grants in
the ACL
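For example, a quick throughput test that generates 10,000 random root objects up to 1 MB each and discards everything written to the target (all values are arbitrary; the bare test: URI is assumed here):
java -jar ecs-sync-3.5.3.jar -source test: -target test: --source-object-count 10000 --source-max-size 1048576 --no-estimation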
ACL Mapper (acl-mapping)
The ACL Mapper will map ACLs from the source system to the target using a
provided mapping file. The mapping file should be ordered by priority and
will short-circuit (the first mapping found for the source key will be
chosen for the target). Note that if a mapping is not specified for a
user/group/permission, that value will remain unchanged in the ACL of the
object. You can optionally remove grants by leaving the target value empty
and you can add grants to all objects using the --acl-add-grants option.
If you wish to migrate ACLs with your data, you will always need this plugin
unless the users, groups and permissions in both systems match exactly.
Note: If you simply want to take the default ACL of the target system,
there is no need for this filter; just don't sync ACLs (this is the default
behavior)
--acl-add-grants <acl-add-grants> Adds a list of grants to
all objects synced to the
target system. Syntax is
like so (repeats are
allowed):
group.<target_group>=<targe
t_perm>
user.<target_user>=<target_
perm>
You can specify multiple
entries by repeating the
CLI option or using
multiple lines in the UI
form
--acl-append-domain <acl-append-domain> Appends a directory
realm/domain to each user
that is mapped. Useful when
mapping POSIX users to LDAP
identities
--acl-map-file <acl-map-file> Path to a file that
contains the mapping of
identities and permissions
from source to target. Each
entry is on a separate
line and specifies a
group/user/permission
source and target name[s]
like so:
group.<source_group>=<targe
t_group>
user.<source_user>=<target_
user>
permission.<source_perm>=<t
arget_perm>[,<target_perm>.
.]
You can also pare down
permissions that are
redundant in the target
system by using permission
groups. I.e.:
permission1.WRITE=READ_WRIT
E
permission1.READ=READ
will pare down separate
READ and WRITE permissions
into one READ_WRITE/READ
(note the ordering by
priority). Groups are
processed before straight
mappings. Leave the target
value blank to flag an
identity/permission that
should be removed (perhaps
it does not exist in the
target system)
--acl-map-instructions <acl-map-instructions> The mapping of identities
and permissions from source
to target. Each entry is on
a separate line and
specifies a
group/user/permission
source and target name[s]
like so:
group.<source_group>=<targe
t_group>
user.<source_user>=<target_
user>
permission.<source_perm>=<t
arget_perm>[,<target_perm>.
.]
You can also pare down
permissions that are
redundant in the target
system by using permission
groups. I.e.:
permission1.WRITE=READ_WRIT
E
permission1.READ=READ
will pare down separate
READ and WRITE permissions
into one READ_WRITE/READ
(note the ordering by
priority). Groups are
processed before straight
mappings. Leave the target
value blank to flag an
identity/permission that
should be removed (perhaps
it does not exist in the
target system)
--acl-strip-domain Strips the directory
realm/domain from each user
that is mapped. Useful when
mapping LDAP identities to
POSIX users
--acl-strip-groups Drops all groups from each
object's ACL. Use with
--acl-add-grants to add
specific group grants
instead
--acl-strip-users Drops all users from each
object's ACL. Use with
--acl-add-grants to add
specific user grants
instead
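As a sketch, a mapping file (say, acl-map.txt; all identities below are placeholders) might contain:
user.jsmith=john.smith@example.com
group.engineering=eng-users
permission.FULL_CONTROL=FULL_CONTROL
and would be applied with --sync-acl plus the filter:
java -jar ecs-sync-3.5.3.jar -source file:///mnt/src -target ecs-s3:https://my-user:my-secret@ecs.example.com:9021/my-bucket --sync-acl -filters acl-mapping --acl-map-file /path/to/acl-map.txt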
ID Logging Filter (id-logging)
Logs the input and output Object IDs to a file. These IDs are specific to
the source and target plugins
--id-log-file <path-to-file> The path to the file to log IDs to
Metadata Filter (metadata)
Allows adding regular and listable (Atmos only) metadata to each object.
--add-listable-metadata <name=value> Adds listable metadata to every
object. You can specify multiple
name/value pairs by repeating
the CLI option or using multiple
lines in the UI form.
--add-metadata <name=value> Adds regular metadata to every
object. You can specify multiple
name/value pairs by repeating
the CLI option or using multiple
lines in the UI form.
--change-metadata-keys <oldName=newName> Changes metadata keys on every
object. You can specify multiple
old/new key names by repeating
the CLI option or using multiple
lines in the UI form.
--remove-all-user-metadata Removes *all* user metadata from
every object.
--remove-metadata <name,name,...> Removes metadata from every
object. You can specify multiple
names by repeating the CLI
option or using multiple lines
in the UI form.
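For instance, tagging every object and dropping a legacy key (names and values are placeholders):
java -jar ecs-sync-3.5.3.jar -source file:///data -target ecs-s3:https://my-user:my-secret@ecs.example.com:9021/my-bucket -filters metadata --add-metadata migrated-by=ecs-sync --remove-metadata legacy-flag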
Override Mimetype (override-mimetype)
This plugin allows you to override the default mimetype of objects getting
transferred. It is useful for instances where the mimetype of an object
cannot be inferred from its extension or is nonstandard (not in Java's
mime.types file). You can also use the force option to override the
mimetype of all objects
--force-mimetype If specified, the mimetype will be
overwritten regardless of its prior value
--override-mimetype <mimetype> Specifies the mimetype to use when an
object has no default mimetype
Path Mapping Filter (path-mapping)
Maps object paths between source and target. The mapping can be specified
by a CSV source list file, or by a user metadata value stored on the source
object, or by providing a regular-expression-based replacement. Note that
the mapping for each object will output the new *relative path* of the
object in the target. This is the path relative to the configured target
storage location. For example, suppose you are using filesystem plugins
where the configured source location is "/mnt/nfs1" and the configured
target location is "/mnt/nfs2". If you want to map the "/mnt/nfs1/foo" file
in the source to "/mnt/nfs2/bar" in the target, you would have an entry in
the source list file (in CSV format) like so: "/mnt/nfs1/foo","bar" (be
sure to quote each value in case they contain commas!). This will change
the relative path of the object from "foo" to "bar", which will get written
under the target storage as "/mnt/nfs2/bar". If you were using metadata to
map the objects, then the metadataName you specify should contain the
target relative path as its value (just "bar" in the example above). If you
are using regular expressions, note that the pattern is applied to the
relative path, not the source identifier. So in the example above, your
pattern could be "foo" and the replacement could be "bar", which would
replace any occurrence of "foo" with "bar" in the relative path of each
object, but the pattern will not apply to the full source location (it will
not see "/mnt/nfs1/"). The important thing to remember is that the mapping
applies to the relative path, not the full identifier. PLEASE NOTE: when
mapping identifiers, it will not be possible to verify object names between
source and target because they will change. Be sure that your application
and data consumers are aware that this mapping has occurred.
--map-source <map-source> Identifies where
the mapping
information for
each object is
stored. Can be
pulled from the
source list file
as the 2nd CSV
column, or from a
user metadata
value stored with
each object, or
it can be a
regular
expression
replacement
--metadata-name <metadata-name> The name of the
metadata on each
object that holds
its target
relative path.
Use with
mapSource:
Metadata
--no-store-previous-path-as-metadata                       By default, the
                                                           original pathname
                                                           is stored as
                                                           metadata on the
                                                           target object
                                                           under the key
                                                           "x-emc-previous-name".
                                                           Use this option to
                                                           disable that
--reg-ex-pattern <reg-ex-pattern> The regular
expression
pattern to use
against the
relative path.
Use with
mapSource: RegEx
and in
combination with
regExReplacementS
tring. This
should follow the
standard Java
regex format
(https://docs.ora
cle.com/javase/8/
docs/api/java/uti
l/regex/Pattern.h
tml). Be sure to
test the pattern
and the
replacement
thoroughly before
using them in a
real migration
--reg-ex-replacement-string <reg-ex-replacement-string> The regex
replacement
string to use
that will
generate the
target relative
path based on the
search pattern in
regExPattern. Use
with mapSource:
RegEx. This
should follow the
standard Java
regex replacement
conventions
(https://docs.ora
cle.com/javase/8/
docs/api/java/uti
l/regex/Matcher.h
tml#appendReplace
ment-java.lang.St
ringBuffer-java.l
ang.String-). Be
sure to test the
pattern and the
replacement
thoroughly before
using them in a
real migration
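Continuing the /mnt/nfs1 example above, a regex-based remap of every occurrence of "foo" to "bar" in the relative path might look like the following (the exact value expected by --map-source is assumed here to be RegEx, matching the "Use with mapSource: RegEx" notes above):
java -jar ecs-sync-3.5.3.jar -source file:///mnt/nfs1 -target file:///mnt/nfs2 -filters path-mapping --map-source RegEx --reg-ex-pattern foo --reg-ex-replacement-string bar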
Path Sharding Filter (path-sharding)
Shards the relative path of an object based on the MD5 of the existing path
(i.e. "a1/fe/my-identifier"). Useful when migrating a flat list of many
identifiers to a filesystem to prevent overloading directories
--shard-count <shard-count> The number of shard directories (a value of 2
means two levels of subdirectories, each
containing a set of hex characters from the
MD5). I.e. a path "my-identifier" with
shardCount of 2 and shardSize of 2 would
change to "28/24/my-identifier"
--shard-size <shard-size> The number of hexadecimal characters in each
shard directory (a value of 2 would mean each
subdirectory name has 2 hex characters). I.e.
a path "my-identifier" with shardCount of 2
and shardSize of 2 would change to
"28/24/my-identifier"
--upper-case By default, shard directories will be in
lower-case. Enable this to make them
upper-case
Shell Command Filter (shell-command)
Executes a shell command for each object transferred. The command will be
given one or two arguments, depending on when it is executed: the source
identifier is always the first argument. The target identifier is the
second argument only if the command is executed after sending the object to
the target (otherwise it will be null). By default, the command will
execute before sending the object to the target storage, but that can be
changed by setting the executeAfterSending option.
--execute-after-sending Specifies whether the shell command
should be executed after sending the
object to the target storage. By
default it is executed before sending
the object
--no-fail-on-non-zero-exit By default, any non-zero exit status
from the command will cause the object
to fail the sync. Disable this option
to allow a non-zero status to be marked
a success. Note: if executeAfterSending
is false (default) and the command
returns a non-zero exit status, the
object will not be sent to the target
--retry-on-fail By default, if failOnNonZeroExit is
true, a failure of the script is
flagged as NOT retryable. If you want
script failures to be retried, set this to
true
--shell-command <path-to-command> The shell command to execute
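A hypothetical post-transfer hook that receives the source and target identifiers as arguments (the script path is a placeholder):
java -jar ecs-sync-3.5.3.jar -source file:///data -target ecs-s3:https://my-user:my-secret@ecs.example.com:9021/my-bucket -filters shell-command --shell-command /usr/local/bin/notify.sh --execute-after-sending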
CAS Single Blob Extractor (cas-extractor)
Extracts a single blob from each CAS clip (must use CAS source). If more
than one blob is found in a clip, an error is thrown and the clip will not
be migrated. Please specify what should be used as the relative path/name
of each object: the clip ID (CA of the clip), the value of a tag attribute,
or provided in the source list file (in CSV format:
{clip-id},{relative-path}). Tag attributes will be migrated as user
metadata, but names are limited to the US-ASCII charset - choose an
appropriate behavior for migrating invalid attribute names. NOTE: When
changing protocols, applications must be updated to integrate with the new
protocol and database references may need updating to use the new object
identifiers.
--attribute-name-behavior <attribute-name-behavior> Indicate how to
handle attribute name
characters outside
US-ASCII charset.
When bad characters
are encountered, you
can fail the clip
(don't migrate clip),
skip moving the bad
attribute name as
user metadata (still
migrating clip) or
replace bad attribute
name characters with
'-' (still migrating
clip). If character
replacement is
necessary, original
attribute names will
be saved in the
"x-emc-invalid-meta-n
ames" field as a
comma-delimited list
--no-missing-blobs-are-empty-files By default, if a clip
does not have a blob
and meets all other
criteria, it will be
treated as an empty
file. Disable this to
fail the clip in that
case
--path-attribute <path-attribute-name> The name of the tag
attribute that holds
the path. Use with
pathSource: Attribute
--path-source <path-source> Identifies where the
path information for
the object is stored.
Can be pulled from
the source list file
as the 2nd CSV
column, or from an
attribute value
somewhere in the
clip, or it can just
be the clip ID (CA).
Default is the clip
ID
CUA Extraction Filter (cua-extractor)
Extracts CUA files directly from CAS clips (must use CAS source). NOTE:
this filter requires a specifically formatted CSV file as the source list
file. For NFS, the format is:
[source-id],[relative-path-name],NFS,[uid],[gid],[mode],[mtime],[ctime],[atime],[symlink-target]
For CIFS, the format is:
[source-id],[relative-path-name],[cifs-ecs-encoding],[original-name],[file-
attributes],[security-descriptor]
--no-file-metadata-required by default, file metadata must be extracted
from the origin filesystem and provided in
the source file list. this is the only way to
get the attributes/security info
DX Extraction Filter (dx-extractor)
Extracts DX file data directly from the backing storage system. NOTE: this
filter requires a specifically formatted CSV file as the source list file.
The format is:
[source-id],[relative-path-name],[cifs-ecs-encoding],[original-name],[file-
attributes],[security-descriptor]
--no-file-metadata-required by default, file metadata must be extracted
from the origin filesystem and provided in
the source file list. this is the only way to
get the attributes/security info
CIFS-ECS Ingest Filter (cifs-ecs-ingester)
Ingests CIFS attribute and security descriptor metadata so it is compatible
with CIFS-ECS. NOTE: typically, this filter requires a specifically
formatted CSV file as the source list file. The format is:
[source-id],[relative-path-name],[cifs-ecs-encoding],[original-name],[file-
attributes],[security-descriptor]
--no-file-metadata-required by default, file metadata must be extracted
from the source CIFS share and provided in
the source file list. this is the only way to
get the CIFS security descriptor and extended
attributes. you can disable this if you are
ingesting from a GeoDrive Atmos subtenant
Decryption Filter (decrypt)
Decrypts object data using the Atmos Java SDK encryption standard
(https://community.emc.com/docs/DOC-34465). This method uses envelope
encryption where each object has its own symmetric key that is itself
encrypted using the master asymmetric key. As such, there are additional
metadata fields added to the object that are required for decrypting
--decrypt-keystore <keystore-file> required. the .jks keystore
file that holds the
decryption keys. which key to
use is actually stored in the
object metadata
--decrypt-keystore-pass <keystore-password> the keystore password
--decrypt-update-mtime by default, the modification
time (mtime) of an object
does not change when
decrypted. set this flag to
update the mtime. useful for
in-place decryption when
objects would not otherwise
be overwritten due to
matching timestamps
--fail-if-not-encrypted by default, if an object is
not encrypted, it will be
passed through the filter
chain untouched. set this
flag to fail the object if it
is not encrypted
Encryption Filter (encrypt)
Encrypts object data using the Atmos Java SDK encryption standard
(https://community.emc.com/docs/DOC-34465). This method uses envelope
encryption where each object has its own symmetric key that is itself
encrypted using the master asymmetric key. As such, there are additional
metadata fields added to the object that are required for decrypting. Note
that currently, metadata is not encrypted
--encrypt-force-strong 256-bit cipher strength is
always used if available.
this option will stop
operations if strong ciphers
are not available
--encrypt-key-alias <encrypt-key-alias> the alias of the master
encryption key within the
keystore
--encrypt-keystore <keystore-file> the .jks keystore file that
holds the master encryption
key
--encrypt-keystore-pass <keystore-password> the keystore password
--encrypt-update-mtime by default, the modification
time (mtime) of an object
does not change when
encrypted. set this flag to
update the mtime. useful for
in-place encryption when
objects would not otherwise
be overwritten due to
matching timestamps
--fail-if-encrypted by default, if an object is
already encrypted using this
method, it will be passed
through the filter chain
untouched. set this flag to
fail the object if it is
already encrypted
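For example (keystore path, alias, and password are placeholders), encrypting data as it is written to the target:
java -jar ecs-sync-3.5.3.jar -source file:///data -target ecs-s3:https://my-user:my-secret@ecs.example.com:9021/my-bucket -filters encrypt --encrypt-keystore /etc/ecs-sync/keys.jks --encrypt-keystore-pass changeit --encrypt-key-alias master-key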
Local Cache (local-cache)
Writes each object to a local cache directory before writing to the target.
Useful for applying external transformations or for transforming objects
in-place (source/target are the same)
NOTE: this filter will remove any extended properties from storage plugins
(i.e. versions, CAS tags, etc.). Do not use this plugin if you are using
those features
--local-cache-root <cache-directory> specifies the root directory in
which to cache files
Preserve ACLs (preserve-acl)
This plugin will preserve source ACL information as user metadata on each
object
Preserve File Attributes (preserve-file-attributes)
This plugin will read and preserve POSIX file attributes as metadata on the
object
Restore Preserved ACLs (restore-acl)
This plugin will read preserved ACLs from user metadata and restore them to
each object
Restore File Attributes (restore-file-attributes)
This plugin will restore POSIX file attributes that were previously
preserved in metadata on the object
--no-fail-on-parse-error by default, if an error occurs parsing the
attribute metadata, this will fail the object.
disable this to only show a warning in that case