VirusAlex/NetCopy

NetCopy

Fast multi-stream file transfer between two trusted hosts on a LAN. Built for the "copy a few hundred GB across the room" case where NFS is overkill, scp is single-stream, and rsync does not give you a browsable UI or live progress.

NetCopy runs as a single self-contained JAR (or Docker container) on each host. Both peers serve and receive at the same time over an authenticated control plane (HTTP + WebSocket) and a parallel data plane (HTTP/1.1 with Range requests, or a custom TCP framing protocol). Transfers resume after kill/network drop via per-file sidecar bitmaps and a JSON job store.

NetCopy is not:

  • a sync tool — it never deletes anything on either side, by design and enforced by ArchUnit tests,
  • a public-internet tool — it assumes a trusted LAN; the only auth is a shared bearer token,
  • a backup tool — there is no scheduling, snapshotting, or retention.

Linux is the only platform under CI and the only one we ship release images for. The pure-Java parts run on macOS and Windows too, but a few platform quirks aren't tested on every commit — see Known issues.

Quick start

Two ways to run NetCopy on each host: a Docker container (recommended — no JDK install) or a plain JAR.

Docker (no JDK required)

docker run -d --name netcopy \
    -e PUID=$(id -u) -e PGID=$(id -g) \
    -p 7777:7777 -p 7778:7778 \
    -v /srv/share:/share:ro \
    -v /srv/incoming:/incoming \
    -v netcopy-state:/var/lib/netcopy \
    ghcr.io/virusalex/netcopy:latest
docker logs netcopy | head            # grab the auto-generated token

The image always serves the host paths mounted at /share (read-only, exposed to the peer) and /incoming (the destination for incoming files). /var/lib/netcopy holds resume state and should be a named volume or a host bind-mount so it survives container restarts.

To pin a specific version, use ghcr.io/virusalex/netcopy:<version> (or :main for the rolling pre-release built from the latest main commit).

File ownership. PUID / PGID make NetCopy write files owned by your real user — the entrypoint chowns /var/lib/netcopy, then drops to that uid via gosu before launching the JVM. Defaults are 10001:10001 for backward compatibility with images before v0.2.7. If you'd previously been running without PUID / PGID and have receive-root files owned by 10001, take ownership once with sudo chown -R "$(id -u):$(id -g)" /srv/incoming.

Plain JAR (needs JDK 25)

Install JDK 25 from your distro's package manager or Adoptium Temurin, then drop the jar somewhere convenient and start it on host A:

java -jar netcopy.jar \
    --bind 0.0.0.0 --port 7777 --tcp-port 7778 \
    --shared-root /srv/share \
    --receive-root /srv/incoming

On startup NetCopy prints the auth token to stdout inside a banner, e.g.

=== NetCopy auth token (use this on the peer to connect) ===
8oFqA7n2_kZLpW6cV8kx-mVxYkpJ4u2QH6n0uZ4u2dE
===========================================================

Start the daemon on host B with the same token, or with a different one: each host only needs to know the other's token when connecting to it, so the two tokens do not have to match:

java -jar netcopy.jar \
    --bind 0.0.0.0 --port 7777 --tcp-port 7778 \
    --shared-root /home/me/Videos \
    --receive-root /home/me/Downloads \
    --token 8oFqA7n2_kZLpW6cV8kx-mVxYkpJ4u2QH6n0uZ4u2dE

Using the UI

Open http://<host>:7777/ from a browser on either side. Paste the peer's URL and the peer's token, browse the peer's --shared-root, select files or folders, pick a --receive-root on the local side, and start the transfer. Progress is live over WebSocket; killing the process and restarting with the same --state-dir resumes from the last verified chunk.

CLI flags

All flags are optional, but you almost certainly want at least one --shared-root (so the peer can pull from you) and at least one --receive-root (so you have somewhere to write).

| Flag | Default | Description |
|---|---|---|
| --bind | 0.0.0.0 | HTTP/WS bind address. |
| --port | 7777 | HTTP port. Serves UI, REST, and WebSocket on this port. |
| --tcp-port | 7778 | Custom TCP blob server port. Set to 0 to disable and use HTTP-only data plane. |
| --shared-root | (none) | Directory exposed to the peer as a read-only outbound source. Repeatable. |
| --receive-root | (none) | Directory the peer is allowed to write into. Repeatable. |
| --state-dir | $XDG_STATE_HOME/netcopy | Where jobs/<id>.json resume files are stored. |
| --token | random | Shared bearer token. If absent, a 256-bit URL-safe random token is generated and printed to stdout on startup. |
| --file-parallelism | 4 | Maximum number of files transferred in parallel within one job. |
| --chunks-per-file | 8 | Maximum number of chunks transferred in parallel per file. |
| --chunk-size | 8MB | Chunk size for files below --large-threshold. Accepts KB/MB/GB suffixes. |
| --chunk-size-large | 32MB | Chunk size for files above --large-threshold. |
| --large-threshold | 1GB | Files larger than this use --chunk-size-large; smaller files use --chunk-size. |
| --follow-symlinks | false | If true, symlinks under --shared-root are dereferenced; otherwise they are transferred as symlinks. |
| --help, -h | | Show help. |
| --version, -V | | Print version. |

Example for a fat 10 GbE link between two well-tuned hosts:

java -jar netcopy.jar \
    --shared-root /tank/media --receive-root /tank/incoming \
    --file-parallelism 8 --chunks-per-file 16 \
    --chunk-size-large 64MB --large-threshold 256MB
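
As a rough illustration of how the three chunk-tuning flags interact, here is a minimal sketch of the size selection described in the table. It is not the project's planner: the class and method names are hypothetical, the MB/GB suffixes are assumed to be binary (1 MB = 1024 * 1024 bytes), and a file exactly at the threshold is assumed to count as "small".

```java
// Hypothetical sketch of chunk-size selection; not NetCopy's actual planner code.
final class ChunkPlanSketch {
    static final long MB = 1024L * 1024L; // assumption: binary units
    static final long GB = 1024L * MB;

    /** Picks the chunk size for one file from the three flag values (all in bytes). */
    static long chunkSizeFor(long fileSize, long chunkSize, long chunkSizeLarge, long largeThreshold) {
        // Assumption: only files strictly larger than the threshold use the large size.
        return fileSize > largeThreshold ? chunkSizeLarge : chunkSize;
    }

    /** Number of chunks needed to cover fileSize bytes (ceiling division). */
    static long chunkCount(long fileSize, long chunkSize) {
        return (fileSize + chunkSize - 1) / chunkSize;
    }

    public static void main(String[] args) {
        long size = 5 * GB;
        long cs = chunkSizeFor(size, 8 * MB, 32 * MB, 1 * GB); // the documented defaults
        // prints "32 MB chunks x 160"
        System.out.println(cs / MB + " MB chunks x " + chunkCount(size, cs));
    }
}
```

With the 10 GbE example above (--chunk-size-large 64MB, --large-threshold 256MB), the same 5 GB file would be split into 80 chunks under these assumptions.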

Architecture

NetCopy splits cleanly into a control plane and a data plane.

+------------------ host A -------------------+        +------------------ host B -------------------+
|  Browser UI  (Alpine.js + WebSocket)        |        |  Browser UI                                 |
|        |                                    |        |        |                                    |
|  Javalin: REST + /ws/progress  (port 7777)  |<------>|  Javalin: REST + /ws/progress  (port 7777)  |
|        |                                    |  ctrl  |        |                                    |
|  ManifestPlanner / ManifestRegistry         |        |  ManifestPlanner / ManifestRegistry         |
|        |                                    |        |        |                                    |
|  TransferEngine + ChunkWorker pool          |        |  TransferEngine + ChunkWorker pool          |
|     |          |                            |  data  |     |          |                            |
|  HttpPuller  TcpPuller  (port 7778 server)  |<------>|  HttpPuller  TcpPuller  (port 7778 server)  |
|     |          |                            |        |     |          |                            |
|  SidecarStore (data.partial + chunks.bitmap + chunks.hashes + meta.json)                           |
|  JsonJobStore (<state-dir>/jobs/<id>.json)                                                         |
+---------------------------------------------+        +---------------------------------------------+

Control plane (HTTP + WebSocket via Javalin, port 7777):

| Endpoint | Auth | Purpose |
|---|---|---|
| GET /api/health | no | Liveness probe (open). |
| GET /api/peer/info | yes | Peer self-description: hostname, version, TCP blob port, root counts. |
| GET /api/browse?root=&path= | yes | List a directory under a --shared-root. |
| GET /api/browse-local?root=&path= | yes | Same shape, rooted under a --receive-root (UI uses it for the target panel). |
| POST /api/browse/stats | yes | Recursive file count + byte total per path; powers the selection-stats footer. |
| POST /api/manifest | yes | Plan a transfer. Returns the full manifest (entries, sizes, mtimes, chunk plans, manifestId). |
| POST /api/manifest/register | yes | Re-register a previously-issued manifest (used by the puller after a source-side restart). |
| GET /api/blob/{manifestId}/{fileId} | yes | HTTP data-plane: file bytes (with Range support, X-Chunk-Hash response header). |
| GET /api/hash/{manifestId}/{fileId} | yes | Lazy XXH3-128 of a manifest entry; returns 202 while computing. |
| POST /api/transfers | yes | Start a job (target host pulls from a remote source). |
| GET /api/transfers | yes | List status snapshots (newest first). |
| GET /api/transfers/{id} | yes | Single status snapshot, including per-file table and per-chunk metrics. |
| POST /api/transfers/{id}/{pause,resume,cancel} | yes | Lifecycle controls. |
| DELETE /api/transfers/{id} | yes | Dismiss a terminal-state job from the persistent store. |
| POST /api/relay/push | yes | "Push from here to peer": proxies POST /api/transfers to the peer using its token. |
| GET /api/metrics | yes | Host metrics (CPU/RAM/disk/GC, top threads) + per-server serve metrics. |
| WS /ws/progress | yes | Live ProgressEvent stream (subscribe per transfer or wildcard). |
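
For scripting against the control plane, the following sketch builds an authenticated request for GET /api/peer/info using the JDK's built-in HTTP client types. The X-NetCopy-Token header name is the one described under Security; the class name, method name, base URL, token value, and the 5-second timeout are placeholders chosen for illustration, not part of NetCopy.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.time.Duration;

// Hypothetical helper; only the endpoint path and header name come from this README.
final class PeerInfoRequestSketch {
    /** Builds (but does not send) an authenticated GET /api/peer/info request. */
    static HttpRequest peerInfo(String baseUrl, String token) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/api/peer/info"))
                .header("X-NetCopy-Token", token)   // shared bearer token
                .timeout(Duration.ofSeconds(5))     // assumption: arbitrary client-side timeout
                .GET()
                .build();
    }

    public static void main(String[] args) {
        HttpRequest req = peerInfo("http://192.168.1.20:7777", "example-token");
        System.out.println(req.method() + " " + req.uri());
    }
}
```

To actually send it, pass the request to `HttpClient.newHttpClient().send(req, HttpResponse.BodyHandlers.ofString())`; the shape of the response body is not specified here.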

Data plane (two interchangeable protocols):

  • GET /api/blob/{manifestId}/{fileId} with HTTP Range headers, served by Javalin via FileChannel.transferTo.
  • A custom binary TCP protocol on port 7778: framed [len:u32][type:u8][payload] with HELLO / REQUEST / DATA_HEAD / DATA / DATA_END / DATA_END_V2 (xxh3 trailer, single-pass; v0.3.0+) / ERR / BYE. Designed to reuse one connection across many pullChunk calls and avoid HTTP parsing overhead at the price of a more interesting wire format.

The protocol is selected per job at start time. See docs/protocol-comparison.md for design notes.
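
To make the [len:u32][type:u8][payload] framing concrete, here is a minimal encode/decode sketch. It rests on several assumptions that this README does not confirm: len is big-endian and counts only the payload bytes, and the HELLO type code (0x01) is invented for the example. The authoritative layout is in tasks/contracts/data-formats.md.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical codec for illustration; not NetCopy's actual wire implementation.
final class FrameSketch {
    static final byte TYPE_HELLO = 0x01; // assumption: made-up type code

    record Frame(byte type, byte[] payload) {}

    /** Encodes one frame as [len:u32][type:u8][payload]. */
    static byte[] encode(byte type, byte[] payload) {
        ByteBuffer buf = ByteBuffer.allocate(4 + 1 + payload.length); // big-endian by default
        buf.putInt(payload.length); // assumption: len covers the payload only
        buf.put(type);
        buf.put(payload);
        return buf.array();
    }

    /** Decodes exactly one frame and rejects a length that does not match the buffer. */
    static Frame decode(byte[] wire) {
        ByteBuffer buf = ByteBuffer.wrap(wire);
        long len = Integer.toUnsignedLong(buf.getInt());
        byte type = buf.get();
        if (len != buf.remaining()) {
            throw new IllegalArgumentException("frame length " + len + " != remaining " + buf.remaining());
        }
        byte[] payload = new byte[(int) len];
        buf.get(payload);
        return new Frame(type, payload);
    }

    public static void main(String[] args) {
        byte[] wire = encode(TYPE_HELLO, "example-token".getBytes(StandardCharsets.UTF_8));
        Frame f = decode(wire);
        System.out.println(wire.length + " bytes on the wire, type=" + f.type());
    }
}
```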

State and resume:

  • Each in-progress target file owns a sidecar directory <file>.netcopy/ containing four files:
    • data.partial — sparse, pre-allocated to the final size, written via positional FileChannel writes;
    • meta.json — immutable per-file descriptor (relPath, size, sourceMtime, chunk plan, schemaVersion);
    • chunks.bitmap — one bit per chunk, set after the chunk's bytes are fsynced and its xxh3-128 chunk-level hash matches what the source advertised on the wire;
    • chunks.hashes — fixed-size array of XXH3-128 digests (16 bytes per chunk), positionally written as each chunk completes. Used by the selective re-verify path on full-file hash mismatch so resume re-pulls only the corrupted chunks instead of the whole file.
  • Hashing has two layers:
    • Per-chunk verification (and the on-the-wire X-Chunk-Hash / DATA_END_V2) is XXH3-128 — fast, ~10 GB/s on x86, allocates a small per-chunk buffer.
    • Full-file finalize is SHA-256 in 256 KiB strides. Streaming XXH3-128 in this codebase buffers all bytes into a ByteArrayOutputStream that overflows the array-size limit on multi-GiB files — SHA-256 streams cleanly via MessageDigest.update. The resulting digest lives in the JSON's hashHex field for v0.x wire-format stability (the field name will change in a future major bump).
  • After all chunks are verified, FileFinalizer rehashes the whole file and atomic-renames data.partial to the final target path.
  • A job's overall state lives in <state-dir>/jobs/<id>.json (one JSON per job). On startup, ResumeManager loads any RUNNING/PAUSED jobs and reattaches to their sidecars; chunks already marked done in the bitmap are skipped.
  • The peer's manifest is captured in the job at planning time, so a transfer is reproducible even if the peer reboots between runs. If the source file's size or mtime changed since planning, the job fails with source_changed rather than silently writing garbage.
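
The resume behaviour above boils down to "one bit per chunk, skip what is already set". The sketch below shows that idea with java.util.BitSet. It is an illustration only: the class name is hypothetical, and the byte layout produced by BitSet.toByteArray is not claimed to match the real chunks.bitmap file.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;
import java.util.Objects;

// Hypothetical in-memory model of a per-file chunk bitmap; not NetCopy's SidecarStore.
final class ChunkBitmapSketch {
    private final BitSet done;
    private final int chunkCount;

    ChunkBitmapSketch(int chunkCount) {
        this(chunkCount, new BitSet(chunkCount));
    }

    private ChunkBitmapSketch(int chunkCount, BitSet done) {
        this.chunkCount = chunkCount;
        this.done = done;
    }

    /** Call only after the chunk's bytes are durable and its hash has been verified. */
    void markVerified(int chunk) {
        Objects.checkIndex(chunk, chunkCount);
        done.set(chunk);
    }

    /** Chunks a resumed job still has to pull, in ascending order. */
    List<Integer> pending() {
        List<Integer> out = new ArrayList<>();
        for (int i = done.nextClearBit(0); i < chunkCount; i = done.nextClearBit(i + 1)) {
            out.add(i);
        }
        return out;
    }

    boolean complete() {
        return done.cardinality() == chunkCount;
    }

    /** Serialises the bits; assumption: this layout is for the sketch only. */
    byte[] toBytes() {
        return done.toByteArray();
    }

    static ChunkBitmapSketch fromBytes(int chunkCount, byte[] bytes) {
        return new ChunkBitmapSketch(chunkCount, BitSet.valueOf(bytes));
    }
}
```

After marking chunks 0 and 2 of a four-chunk file, `pending()` returns `[1, 3]`, and a bitmap restored via `fromBytes(4, toBytes())` reports the same list, which is the property a restart relies on.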

Security

NetCopy is a "trusted LAN, two friendly admins" tool. Its security model is small and explicit.

  • Shared bearer token. Every /api/* call (except GET /api/health) and every WebSocket connection must present the token in X-NetCopy-Token (or ?token=... for browsers that cannot set custom headers on WS). On the TCP data plane, the very first frame must be a valid HELLO carrying the token; otherwise the connection is closed with ERR_UNAUTHORIZED. Enforcement lives in a single TokenGate class.
  • Sandboxed roots. The peer can only read paths under one of your --shared-roots and can only write under one of your --receive-roots. All path resolution goes through PathResolver, which canonicalises the user-supplied path, joins it with the root, and re-checks containment; any escape attempt (../, absolute paths, symlink traversal when symlinks are off) raises SecurityException and aborts the request.
  • No-delete invariant. NetCopy is forbidden, statically, from deleting, shrinking, or overwriting user data outside a small whitelist (SafeFileOps, FileFinalizer, JsonJobStore, FileSidecarStore — each reviewed manually). An ArchUnit test in src/test/java rejects any production code outside that whitelist that calls Files.delete*, File.delete, Files.move, FileChannel.truncate, or RandomAccessFile.setLength, or that touches the StandardCopyOption.REPLACE_EXISTING / StandardOpenOption.TRUNCATE_EXISTING fields. Final-file overwrite happens only via SafeFileOps.atomicRename with ConflictPolicy.OVERWRITE, which itself triggers from an explicit UI action and is double-gated server-side (acknowledgeOverwrite: true required in the REST request). Sidecar directories under <file>.netcopy/ are deleted only by FileFinalizer immediately after a successful atomic rename; cancelled / failed transfers leave them in place for the next resume or for the user to clean up.
  • No TLS. The token travels in cleartext. NetCopy assumes the LAN is trusted. If you need to cross an untrusted network, tunnel it (WireGuard, SSH port-forward, stunnel) — do not expose port 7777 or 7778 to the internet.
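
The "join with the root, then re-check containment" step from the sandboxed-roots bullet can be sketched as follows. This is not the project's PathResolver: the class and method names are hypothetical, and the check is purely lexical (Path.normalize), so it does not detect symlinks that point outside the root. A real implementation would also need to resolve symlinks, for example with Path.toRealPath.

```java
import java.nio.file.Path;

// Hypothetical lexical containment check; NetCopy's PathResolver may work differently.
final class RootContainmentSketch {
    /** Resolves userPath under root, rejecting anything that lands outside it. */
    static Path resolveUnder(Path root, String userPath) {
        Path base = root.toAbsolutePath().normalize();
        // resolve() returns userPath itself when it is absolute, so the check below catches that too.
        Path candidate = base.resolve(userPath).normalize();
        if (!candidate.startsWith(base)) {
            throw new SecurityException("path escapes root: " + userPath);
        }
        return candidate;
    }

    public static void main(String[] args) {
        Path root = Path.of("/srv/share");
        System.out.println(resolveUnder(root, "movies/a.mkv"));
        try {
            resolveUnder(root, "../etc/passwd");
        } catch (SecurityException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```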

HTTP vs TCP protocols

Both protocols carry exactly the same bytes (file contents, in chunks, with XXH3-128 verification). They differ in framing and multiplexing:

  • HTTP: one GET /api/blob/{manifestId}/{fileId} per chunk, with a Range: bytes=offset-end header. Connection reuse via keep-alive. Server uses FileChannel.transferTo to splice file pages straight to the socket. Pro: trivial to debug with curl, plays nice with proxies. Con: HTTP parsing overhead on every chunk; one TCP connection per concurrent chunk.
  • TCP: one long-lived connection per peer, multiplexed by reqId. Custom framing (see tasks/contracts/data-formats.md for the wire layout). Pro: fewer round-trips, less parsing, payload hash sent inline in DATA_HEAD. Con: needs its own port (--tcp-port); not curl-debuggable.

See docs/protocol-comparison.md for a quantitative comparison (throughput, CPU%, latency to first byte, behaviour under packet loss).
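
On the HTTP data plane each chunk maps to one Range header. The sketch below computes that header for a given chunk index; HTTP byte ranges are inclusive at both ends, so the last byte is offset + length - 1. The class and method names are hypothetical, and fixed-size chunks with a shorter final chunk are an assumption about the chunk plan.

```java
// Hypothetical helper for illustration; not taken from NetCopy's HttpPuller.
final class RangeHeaderSketch {
    /** Returns the Range header value for one chunk, e.g. "bytes=0-8388607". */
    static String rangeFor(long chunkIndex, long chunkSize, long fileSize) {
        long offset = chunkIndex * chunkSize;
        if (chunkIndex < 0 || offset >= fileSize) {
            throw new IllegalArgumentException("chunk " + chunkIndex + " is outside the file");
        }
        // The final chunk may be shorter than chunkSize; the end position is inclusive.
        long end = Math.min(offset + chunkSize, fileSize) - 1;
        return "bytes=" + offset + "-" + end;
    }

    public static void main(String[] args) {
        long mib = 1024L * 1024L;
        // A 20 MiB file with 8 MiB chunks has three chunks; the last one is 4 MiB.
        for (int i = 0; i < 3; i++) {
            System.out.println("Range: " + rangeFor(i, 8 * mib, 20 * mib));
        }
    }
}
```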

Building from source

Requires JDK 25 and Maven (the wrapper mvnw is committed).

./mvnw -B verify

This runs unit tests and ArchUnit rules, then produces a shaded self-contained JAR at target/netcopy.jar. Integration tests (two-node in-JVM scenarios) live behind the integration profile:

./mvnw -B -Pintegration verify

Downloads

Two distribution channels, both populated by the same release workflow:

Container images on ghcr.io/virusalex/netcopy:

| Tag | When updated |
|---|---|
| latest | Highest tagged stable release (vX.Y.Z); excludes 0.x pre-1.0 line. |
| <version> (e.g. 1.0.0) | Pinned to that tag, never overwritten. |
| main | Rolling; overwritten by every push to main. Pre-release. |

Plain jars on GitHub Releases:

  • Stable tag releases attach netcopy-<version>.jar.
  • The rolling Main snapshot release attaches the latest main build's jar (also overwritten on every push). Useful if you want a JAR but don't want to wait for a tag.

There are no Maven Central artifacts; NetCopy is a daemon, not a library.

Known issues

  • Windows + Java NIO Selector loopback. On some Windows hosts (notably with certain antivirus suites or VPN clients running in user mode), Java's NIO Selector self-pipe wakeup over loopback can hang or be silently dropped, which manifests as the TCP blob server failing to accept connections from 127.0.0.1. Workaround: run on a real LAN address rather than 127.0.0.1, temporarily disable the AV/VPN, or set --tcp-port 0 to fall back to the HTTP data plane. NetCopy's primary supported platform is Linux.
  • source_changed after long pauses. A transfer that is paused for hours while the source file is actively being edited will fail safe with source_changed rather than produce a corrupted target. Re-plan the manifest from the UI to pick up the new content.
  • Sidecar leftovers. A cancelled transfer leaves <file>.netcopy/ directories under the receive root. By the no-delete invariant, NetCopy will not clean them up automatically. They are safe to delete manually.
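
If you want to see what a cancelled transfer left behind before cleaning up by hand, a small listing tool is enough. The sketch below walks a receive root and prints directories whose name ends in .netcopy. It only lists and never deletes; the class name is hypothetical, and it assumes every such directory is a NetCopy sidecar, which may not hold if you have unrelated directories with that suffix.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Stream;

// Hypothetical standalone helper; it is not shipped with NetCopy.
final class SidecarLeftoversSketch {
    /** Returns all directories under receiveRoot whose name ends with ".netcopy". */
    static List<Path> findSidecars(Path receiveRoot) throws IOException {
        try (Stream<Path> walk = Files.walk(receiveRoot)) {
            return walk.filter(Files::isDirectory)
                    .filter(p -> p.getFileName() != null
                            && p.getFileName().toString().endsWith(".netcopy"))
                    .sorted()
                    .toList();
        }
    }

    public static void main(String[] args) throws IOException {
        // Usage: java SidecarLeftoversSketch.java /srv/incoming
        for (Path p : findSidecars(Path.of(args[0]))) {
            System.out.println(p);
        }
    }
}
```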

More: see docs/troubleshooting.md.

Contributing

Frozen API surface lives in tasks/contracts/ (interfaces.md for Java signatures, data-formats.md for JSON schemas and the TCP wire format). Changes to those contracts cascade across the codebase, so run them past the maintainer before adjusting.

Forbidden in production code, forever: Files.delete*, File.delete, and Files.move(..., REPLACE_EXISTING) outside the small whitelist described under Security (SafeFileOps, FileFinalizer, JsonJobStore, FileSidecarStore). ArchUnit will reject the build.

PRs welcome — open one and CI will tell you if you broke something.
