GoSync is a peer-to-peer file synchronization application written in Go. It allows multiple nodes on the same local network to automatically synchronize files in a specified folder. GoSync detects file changes and propagates them to other peers in real time, so that every node holds the most up-to-date files.
- Introduction
- Features
- Getting Started
- Application Architecture
- Detailed Components Explanation
- Configuration
- Contributing
- License
GoSync is designed to keep a specified folder in sync across multiple machines on the same local network. It detects file changes using filesystem notifications and synchronizes those changes to other connected peers using gRPC streaming. The application uses mDNS (Multicast DNS) for peer discovery, allowing nodes to find each other without manual configuration.
- Automatic Peer Discovery: Uses mDNS to discover other GoSync instances on the local network.
- Real-Time File Synchronization: Monitors file changes and synchronizes them across peers in real time.
- Efficient Data Transfer: Uses chunking and hashing to minimize data transfer by only sending changed portions of files.
- Robust Connection Management: Maintains stable connections with peers and handles reconnections seamlessly.
- Cross-Platform Compatibility: Written in Go, making it easy to run on various operating systems.
- Go Programming Language: Go version 1.16 or higher.
- Git: For cloning the repository.
- Protobuf Compiler: For generating gRPC code (only if modifying .proto files).
- Clone the Repository
  git clone https://github.com/TypeTerrors/go_sync.git
  cd go_sync
- Install Dependencies
  Ensure all Go dependencies are installed:
  go mod tidy
- Compile Protobuf Files (If Necessary)
  If you modify the .proto files, you need to regenerate the Go code:
  protoc --go_out=./proto --go-grpc_out=./proto proto/filesync.proto
- Build the Application
  go build -o gosync ./cmd/server/main.go
- Run the Application
  ./gosync --sync-folder=<path_to_sync_folder> [options]
  Example:
  ./gosync --sync-folder=./sync_folder --chunk-size=64 --sync-interval=1m --port=50051
  Available Flags:
  - --sync-folder: (Required) Path to the folder to synchronize.
  - --chunk-size: (Optional) Chunk size in kilobytes (default: 64).
  - --sync-interval: (Optional) Synchronization interval (default: 1 minute).
  - --port: (Optional) Port number for the gRPC server (default: 50051).
GoSync is structured into several components, each responsible for a specific aspect of the application's functionality:
- mDNS Service: Handles peer discovery using Multicast DNS.
- Connection Service: Manages connections to peers and message dispatching.
- File Service: Monitors filesystem changes and handles file events.
- Metadata Service: Manages file metadata, including chunk hashes and synchronization status.
- gRPC Service: Provides the RPC interface for communication between peers.
- Discovers other GoSync instances on the local network.
- Registers the local instance as a service that others can discover.
- Maintains a list of peers and notifies the Connection Service of new peers.
- Manages gRPC connections to peers.
- Dispatches messages to peers based on the message type.
- Maintains channels for different types of messages (file sync, health check, metadata exchange, etc.).
- Watches the synchronization folder for file changes using filesystem notifications.
- Handles file creation, modification, and deletion events.
- Initiates synchronization processes when files change.
- Calculates and stores metadata for files and their chunks.
- Uses strong and weak hashes to detect changes in file chunks.
- Determines which chunks need to be synchronized with peers.
- Implements the gRPC server interface.
- Handles incoming RPC calls from peers.
- Manages streaming RPCs for efficient data transfer.
- Peer Discovery: Instances discover each other using mDNS.
- Connection Establishment: A gRPC connection is established between peers.
- File Monitoring: The File Service detects changes in the sync folder.
- Metadata Calculation: The Metadata Service calculates hashes for changed files/chunks.
- Message Dispatching: The Connection Service sends messages to peers about file changes.
- Data Synchronization: Peers exchange file data using gRPC streaming methods.
- Metadata Exchange: Peers compare metadata to determine which chunks need synchronization.
- Discovers other GoSync instances on the local network without manual configuration.
- Registers the local GoSync instance as a discoverable service.
- Service Registration: The local instance registers itself using zeroconf, advertising its IP and port.
- Service Discovery: Continuously browses for services matching the GoSync service type (_myapp_filesync._tcp).
- Peer Filtering:
- Skips its own service instance.
- Checks if discovered services are on the same subnet.
- Validates the service using TXT records.
- Peer Management: Notifies the Connection Service to add or remove peers based on discovery results.
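The self-check and subnet check from the filtering step above can be sketched with the standard library alone. This is an illustration rather than GoSync's actual code: the function name acceptPeer and the CIDR-string input are assumptions, and TXT record validation is left out.

```go
package main

import (
	"fmt"
	"net"
)

// acceptPeer reports whether a discovered address should be kept as a peer.
// localCIDR is the local interface address in CIDR form, e.g. "192.168.1.10/24".
// (Hypothetical helper; the real mDNS service may structure this differently.)
func acceptPeer(localCIDR, peerAddr string) (bool, error) {
	localIP, subnet, err := net.ParseCIDR(localCIDR)
	if err != nil {
		return false, fmt.Errorf("parse local address: %w", err)
	}
	peerIP := net.ParseIP(peerAddr)
	if peerIP == nil {
		return false, fmt.Errorf("invalid peer address %q", peerAddr)
	}
	if peerIP.Equal(localIP) {
		return false, nil // this is our own service instance
	}
	// Keep the peer only if it falls inside the local subnet.
	return subnet.Contains(peerIP), nil
}

func main() {
	for _, addr := range []string{"192.168.1.10", "192.168.1.42", "10.0.0.7"} {
		ok, err := acceptPeer("192.168.1.10/24", addr)
		fmt.Println(addr, ok, err)
	}
}
```

Only addresses that differ from the local one and share its subnet are accepted; everything else is skipped before the Connection Service is notified.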
- Start(): Starts the mDNS service discovery.
- Ping(): Periodically sends health check messages to peers.
- LocalIp(): Returns the local IP address.
- SetConn(): Sets the connection interface for communication.
- Manages gRPC connections and communication with peers.
- Dispatches messages to peers and handles incoming messages.
- Peer Management:
- Maintains a map of peers (peers).
- Adds or removes peers based on mDNS discoveries.
- Connection Handling:
- Establishes gRPC connections with peers.
- Monitors connection states and handles reconnections.
- Message Dispatching:
- Uses channels to send different types of messages (file sync, health checks, etc.).
- Implements message senders and receivers for each gRPC stream.
- Synchronization Logic:
- Receives file lists and determines missing files.
- Handles file chunk transfers and metadata synchronization.
- Conn: Main structure managing peers and message dispatching.
- Peer: Represents a peer connection, including channels and streams.

- AddPeer(): Adds a new peer and starts managing it.
- RemovePeer(): Removes a peer and cleans up resources.
- Start(): Starts the dispatch of messages to peers.
- SendMessage(): Enqueues a message to be sent to all peers.
- managePeer(): Handles the connection and communication with a single peer.
- dispatchMessages(): Distributes messages from the central send channel to all peers.
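The fan-out from a central send channel to per-peer queues might look roughly like the sketch below. The dispatcher type, its fields, and the use of plain strings as messages are assumptions made for illustration; the real Conn and Peer structures carry gRPC streams and typed messages.

```go
package main

import (
	"fmt"
	"sync"
)

// dispatcher is a hypothetical, simplified stand-in for the Conn structure.
type dispatcher struct {
	mu    sync.RWMutex
	peers map[string]chan string // peer address -> outbound queue
	send  chan string            // central send channel
}

func newDispatcher() *dispatcher {
	return &dispatcher{peers: make(map[string]chan string), send: make(chan string, 16)}
}

// addPeer registers a peer and returns the queue its sender would read from.
func (d *dispatcher) addPeer(addr string) <-chan string {
	ch := make(chan string, 16)
	d.mu.Lock()
	d.peers[addr] = ch
	d.mu.Unlock()
	return ch
}

// removePeer unregisters a peer and closes its queue.
func (d *dispatcher) removePeer(addr string) {
	d.mu.Lock()
	if ch, ok := d.peers[addr]; ok {
		close(ch)
		delete(d.peers, addr)
	}
	d.mu.Unlock()
}

// dispatch copies every message from the central channel to every peer queue
// and returns once the central channel is closed and drained.
func (d *dispatcher) dispatch() {
	for msg := range d.send {
		d.mu.RLock()
		for _, ch := range d.peers {
			select {
			case ch <- msg:
			default: // queue full: drop instead of blocking the other peers
			}
		}
		d.mu.RUnlock()
	}
}

func main() {
	d := newDispatcher()
	a, b := d.addPeer("192.168.1.20:50051"), d.addPeer("192.168.1.21:50051")
	d.send <- "file changed: notes.txt"
	close(d.send)
	d.dispatch()
	fmt.Println(<-a)
	fmt.Println(<-b)
}
```

Dropping a message when a peer's queue is full is one possible policy chosen here to keep a slow peer from stalling the rest; a real implementation may prefer to block, retry, or disconnect.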
- Monitors the synchronization folder for file changes.
- Handles file creation, modification, and deletion events.
- Initiates synchronization processes when files change.
- Filesystem Monitoring:
- Uses fsnotify to watch for file system events.
- Detects file creation, modification, and deletion.
- Event Handling:
- Debounces events to prevent rapid, repeated processing.
- Handles debounced events by updating metadata and notifying peers.
- File Synchronization:
- Marks files as "in progress" during synchronization to prevent conflicts.
- Updates metadata for new or modified files.
- Notifies the Connection Service to send updates to peers.
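Debouncing, as described in the event handling step above, can be sketched with one timer per path that is restarted on every event. The debouncer type and its method names are hypothetical, and the delay value is arbitrary; GoSync's own implementation may differ.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// debouncer collapses bursts of events for the same path into one callback.
// (Hypothetical type for illustration.)
type debouncer struct {
	mu     sync.Mutex
	delay  time.Duration
	timers map[string]*time.Timer
	handle func(path string)
}

func newDebouncer(delay time.Duration, handle func(string)) *debouncer {
	return &debouncer{delay: delay, timers: make(map[string]*time.Timer), handle: handle}
}

// trigger (re)starts the quiet-period timer for path. The handler runs only
// after no further events for that path arrive within the delay.
func (d *debouncer) trigger(path string) {
	d.mu.Lock()
	defer d.mu.Unlock()
	if old, ok := d.timers[path]; ok {
		old.Stop()
	}
	var t *time.Timer
	t = time.AfterFunc(d.delay, func() {
		d.mu.Lock()
		if d.timers[path] != t { // a newer event replaced this timer
			d.mu.Unlock()
			return
		}
		delete(d.timers, path)
		d.mu.Unlock()
		d.handle(path)
	})
	d.timers[path] = t
}

func main() {
	done := make(chan string, 1)
	d := newDebouncer(50*time.Millisecond, func(p string) { done <- p })
	for i := 0; i < 5; i++ {
		d.trigger("notes.txt") // e.g. a burst of write events from an editor
	}
	fmt.Println("handled once:", <-done)
}
```

In a real watcher, trigger would be called from the loop that reads fsnotify events, and the handler would update metadata and notify peers.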
- Start(): Starts watching the directory for changes.
- Scan(): Periodically checks if each peer has the same list of files.
- HandleFileCreation(), HandleFileModification(), HandleFileDeletion(): Handles respective file events.
- markFileAsInProgress(), markFileAsComplete(): Manages file synchronization state.
- BuildLocalFileList(): Builds a list of local files for comparison with peers.
- CompareFileLists(): Compares the local file list with that of a peer to identify missing files.
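A file list comparison of the kind CompareFileLists() performs could be sketched as a set difference. The function below is a hypothetical simplification that works on plain file names; which side counts as "missing" is an assumption (here, files the peer has that are absent locally).

```go
package main

import (
	"fmt"
	"sort"
)

// missingFiles returns the names present in remote but absent from local,
// sorted for stable output. (Hypothetical helper, not the GoSync signature.)
func missingFiles(local, remote []string) []string {
	have := make(map[string]struct{}, len(local))
	for _, name := range local {
		have[name] = struct{}{}
	}
	var missing []string
	for _, name := range remote {
		if _, ok := have[name]; !ok {
			missing = append(missing, name)
		}
	}
	sort.Strings(missing)
	return missing
}

func main() {
	local := []string{"a.txt", "b.txt"}
	remote := []string{"b.txt", "d.txt", "c.txt"}
	fmt.Println(missingFiles(local, remote))
}
```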
- Manages file metadata, including chunk hashes.
- Determines which chunks need to be synchronized.
- Stores metadata in memory and persists it using BadgerDB.
- Metadata Calculation:
- Splits files into chunks based on the configured chunk size.
- Calculates strong and weak hashes for each chunk.
- Uses a rolling checksum for efficient weak hash calculation.
- Change Detection:
- Compares current chunk hashes with previous ones to detect changes.
- Identifies new, modified, or deleted chunks.
- Metadata Storage:
- Stores metadata in an in-memory map for quick access.
- Persists metadata to BadgerDB for durability.
- Synchronization Assistance:
- Provides methods to check if all chunks of a file have been received.
- Helps the File Service and Connection Service in determining synchronization actions.
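The README does not say which rolling checksum is used for the weak hash. As an illustration of the general idea, the sketch below uses an rsync-style two-part sum (an assumption, not necessarily GoSync's algorithm), where sliding the window by one byte updates the checksum in constant time instead of rehashing the whole window.

```go
package main

import "fmt"

// rolling is an rsync-style weak checksum over a fixed-size window.
// a is the sum of the bytes and b is the position-weighted sum, both mod 2^16.
type rolling struct {
	a, b uint16
	n    int // window length
}

// newRolling computes the checksum of window from scratch.
func newRolling(window []byte) rolling {
	r := rolling{n: len(window)}
	for i, x := range window {
		r.a += uint16(x)
		r.b += uint16(len(window)-i) * uint16(x)
	}
	return r
}

// roll slides the window one byte forward: out leaves, in enters.
func (r *rolling) roll(out, in byte) {
	r.a += uint16(in) - uint16(out)
	r.b += r.a - uint16(r.n)*uint16(out)
}

// sum packs both halves into a single 32-bit value.
func (r rolling) sum() uint32 {
	return uint32(r.a) | uint32(r.b)<<16
}

func main() {
	data := []byte("GoSync rolling checksum demo")
	const window = 8
	r := newRolling(data[:window])
	for i := 1; i+window <= len(data); i++ {
		r.roll(data[i-1], data[i+window-1])
		fresh := newRolling(data[i : i+window])
		fmt.Printf("offset %2d rolled=%08x fresh=%08x\n", i, r.sum(), fresh.sum())
	}
}
```

The rolled value and the freshly computed value agree at every offset, which is what makes the weak hash cheap enough to evaluate at every byte position before confirming a match with the strong hash.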
- Meta: Main structure holding file metadata.
- FileMetaData: Holds metadata for a specific file.
- Hash: Represents the strong and weak hashes of a chunk.

- CreateFileMetaData(): Creates or updates metadata for a file.
- SaveMetaData(): Saves metadata to both memory and database.
- GetMetaData(): Retrieves metadata for a specific chunk.
- DeleteMetaData(): Deletes metadata for a chunk.
- hashChunk(): Calculates the strong hash for a chunk.
- Provides the RPC interface for peer-to-peer communication.
- Implements methods for file synchronization, health checks, metadata exchange, and more.
- SyncFile (rpc SyncFile(stream FileSyncRequest) returns (stream FileSyncResponse);)
  - Purpose: Handles file synchronization requests and responses.
  - Usage: Transfers file chunks, deletion notices, and truncation commands between peers.
  - Messages:
    - FileChunk: Contains file chunk data, offset, and metadata.
    - FileDelete: Instructs a peer to delete a file or specific chunk.
    - FileTruncate: Instructs a peer to truncate a file to a specific size.
- HealthCheck (rpc HealthCheck(stream Ping) returns (stream Pong);)
  - Purpose: Performs health checks between peers to ensure connectivity.
  - Usage: Exchanges ping and pong messages to monitor peer availability.
- ExchangeMetadata (rpc ExchangeMetadata(stream MetadataRequest) returns (stream MetadataResponse);)
  - Purpose: Exchanges file metadata between peers.
  - Usage: Allows peers to compare metadata to determine which chunks need synchronization.
- RequestChunks (rpc RequestChunks(stream ChunkRequest) returns (stream ChunkResponse);)
  - Purpose: Requests specific chunks from a peer.
  - Usage: Used when a peer needs to synchronize specific chunks of a file.
- GetMissingFiles (rpc GetMissingFiles(stream FileList) returns (stream FileChunk);)
  - Purpose: Synchronizes missing files between peers.
  - Usage: Peers send their file lists to each other, and missing files are transferred accordingly.
- FileSyncRequest: Oneof message containing FileChunk, FileDelete, or FileTruncate.
- FileChunk: Contains file name, chunk data, offset, total chunks, and total size.
- FileDelete: Contains file name and offset (if deleting a specific chunk).
- MetadataRequest/Response: Contains file name and a list of chunk metadata.
- ChunkRequest/Response: Contains file name, offsets, and chunk data.
- FileList: Contains a list of FileEntry structures, each representing a file.
The application uses a Config struct defined in ./conf/conf.go to hold runtime configurations:
type Config struct {
    SyncFolder   string
    ChunkSize    int64
    SyncInterval time.Duration
    Port         string
}

var AppConfig Config

Configuration Parameters:
- SyncFolder: The folder to synchronize across peers.
- ChunkSize: The size of file chunks in bytes.
- SyncInterval: The interval at which the application checks for synchronization.
- Port: The port on which the gRPC server listens.
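As a rough sketch of how the documented flags could populate this struct, the example below uses the standard flag package. The parseConfig function is hypothetical, and the kilobyte-to-byte conversion is an assumption that reconciles the --chunk-size flag (kilobytes) with the ChunkSize field (bytes).

```go
package main

import (
	"errors"
	"flag"
	"fmt"
	"time"
)

// Config mirrors the struct described above.
type Config struct {
	SyncFolder   string
	ChunkSize    int64
	SyncInterval time.Duration
	Port         string
}

// parseConfig builds a Config from command-line arguments.
// (Hypothetical helper; GoSync's own flag handling may differ.)
func parseConfig(args []string) (Config, error) {
	fs := flag.NewFlagSet("gosync", flag.ContinueOnError)
	folder := fs.String("sync-folder", "", "path to the folder to synchronize (required)")
	chunkKB := fs.Int64("chunk-size", 64, "chunk size in kilobytes")
	interval := fs.Duration("sync-interval", time.Minute, "synchronization interval")
	port := fs.String("port", "50051", "port for the gRPC server")
	if err := fs.Parse(args); err != nil {
		return Config{}, err
	}
	if *folder == "" {
		return Config{}, errors.New("--sync-folder is required")
	}
	return Config{
		SyncFolder:   *folder,
		ChunkSize:    *chunkKB * 1024, // assumption: flag is in KB, field is in bytes
		SyncInterval: *interval,
		Port:         *port,
	}, nil
}

func main() {
	cfg, err := parseConfig([]string{"--sync-folder=./sync_folder", "--chunk-size=64", "--sync-interval=1m"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%+v\n", cfg)
}
```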
Setting Configurations:
Configurations are set using command-line flags when starting the application. For example:
./gosync --sync-folder=./sync_folder --chunk-size=64 --sync-interval=1m --port=50051

Contributions are welcome! If you'd like to contribute to GoSync, please follow these steps:
- Fork the Repository: Create a fork of the repository on GitHub.
- Create a Feature Branch: Work on your feature or bugfix in a separate branch.
- Commit Your Changes: Make clear and concise commit messages.
- Create a Pull Request: Submit your pull request for review.
This project is licensed under the MIT License - see the LICENSE file for details.
Note: This application is intended for use on local networks and may not be secure for use over the internet. Always ensure you understand the security implications before deploying applications that synchronize files across networks.
- Go Programming Language: For providing an excellent platform for building network applications.
- gRPC: For facilitating efficient communication between peers.
- fsnotify: For providing filesystem notifications.
- zeroconf: For enabling mDNS service discovery.
- Discovery: Uses Multicast DNS to broadcast and discover services.
- Registration: Registers the local GoSync instance with a unique service name that includes the local IP.
- Filtering: Ensures that only GoSync services are discovered by checking TXT records.
- Peer Updates: Notifies the Connection Service to add or remove peers based on discovery events.
- Peer Management: Adds peers discovered by the mDNS Service and manages their connections.
- Message Channels: Uses separate channels for different message types to organize communication.
- gRPC Streams: Establishes persistent gRPC streams for continuous communication.
- Error Handling: Monitors connection states and handles reconnections when necessary.
- Event Debouncing: Debounces file system events to avoid processing rapid, successive events.
- Synchronization States: Marks files as "in progress" or "complete" to manage synchronization flow.
- Open File Management: Keeps track of open file handles to manage resources efficiently.
- Conflict Handling: Checks if a file is already being synchronized to prevent conflicts.
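The "in progress" and "complete" states can be pictured as a mutex-guarded set of paths. The type and method names below are hypothetical analogues of markFileAsInProgress() and markFileAsComplete(), whose real signatures are not shown in this README.

```go
package main

import (
	"fmt"
	"sync"
)

// syncState tracks which files are currently being synchronized.
// (Hypothetical type for illustration.)
type syncState struct {
	mu         sync.Mutex
	inProgress map[string]struct{}
}

func newSyncState() *syncState {
	return &syncState{inProgress: make(map[string]struct{})}
}

// tryBegin marks path as in progress. It returns false if the file is
// already being synchronized, so the caller can skip the conflicting work.
func (s *syncState) tryBegin(path string) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	if _, busy := s.inProgress[path]; busy {
		return false
	}
	s.inProgress[path] = struct{}{}
	return true
}

// finish marks path as complete, allowing later synchronizations.
func (s *syncState) finish(path string) {
	s.mu.Lock()
	delete(s.inProgress, path)
	s.mu.Unlock()
}

func main() {
	s := newSyncState()
	fmt.Println(s.tryBegin("notes.txt")) // first caller wins
	fmt.Println(s.tryBegin("notes.txt")) // second caller is turned away
	s.finish("notes.txt")
	fmt.Println(s.tryBegin("notes.txt")) // free again after completion
}
```

Combining the check and the mark in a single locked operation avoids the window in which two goroutines both see the file as idle and start synchronizing it.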
- Chunk Hashing: Uses XXH3 hashing for strong hashes and a rolling checksum for weak hashes.
- Metadata Comparison: Compares previous and current metadata to detect changes.
- Data Structures: Maintains in-memory maps and persists data using BadgerDB.
- Chunk Management: Identifies new, modified, or deleted chunks for synchronization.
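Comparing previous and current chunk hashes might be sketched as follows. To stay within the standard library, FNV-1a stands in for XXH3 here; that substitution, the function names, and the offset-keyed map are all assumptions made for the example.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sort"
)

// hashChunks splits data into fixed-size chunks and returns a strong hash
// per chunk, keyed by byte offset. FNV-1a is a stand-in for XXH3.
func hashChunks(data []byte, chunkSize int) map[int64]uint64 {
	hashes := make(map[int64]uint64)
	if chunkSize <= 0 {
		return hashes
	}
	for off := 0; off < len(data); off += chunkSize {
		end := off + chunkSize
		if end > len(data) {
			end = len(data)
		}
		h := fnv.New64a()
		h.Write(data[off:end]) // hash.Hash writes never return an error
		hashes[int64(off)] = h.Sum64()
	}
	return hashes
}

// diffChunks returns the offsets that are new or modified in curr, and the
// offsets that existed in prev but are gone from curr. Both are sorted.
func diffChunks(prev, curr map[int64]uint64) (changed, deleted []int64) {
	for off, h := range curr {
		if old, ok := prev[off]; !ok || old != h {
			changed = append(changed, off)
		}
	}
	for off := range prev {
		if _, ok := curr[off]; !ok {
			deleted = append(deleted, off)
		}
	}
	sort.Slice(changed, func(i, j int) bool { return changed[i] < changed[j] })
	sort.Slice(deleted, func(i, j int) bool { return deleted[i] < deleted[j] })
	return changed, deleted
}

func main() {
	prev := hashChunks([]byte("aaaabbbbccccdddd"), 4)
	curr := hashChunks([]byte("aaaaXbbbcccc"), 4)
	changed, deleted := diffChunks(prev, curr)
	fmt.Println("changed offsets:", changed)
	fmt.Println("deleted offsets:", deleted)
}
```

Only the changed offsets need to be sent to peers, while deleted offsets past the new end of the file correspond to a truncation.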
- Streaming RPCs: Utilizes streaming RPCs for efficient and continuous data transfer.
- Message Handling: Processes different types of messages based on the request type.
- Error Handling: Handles errors gracefully and attempts reconnections when necessary.
- Data Transfer: Manages the sending and receiving of file chunks and metadata.
- Initial Scan: On startup, GoSync scans the synchronization folder and builds metadata for existing files.
- Peer Discovery: The mDNS Service discovers peers and establishes connections.
- File Monitoring: The File Service watches for file events and updates metadata accordingly.
- Metadata Exchange: Peers exchange metadata to determine differences in files and chunks.
- Chunk Comparison: The Metadata Service identifies which chunks need to be sent or requested.
- Data Transfer: Chunks are sent between peers using the appropriate gRPC methods.
- File Assembly: Received chunks are written to files, and metadata is updated.
- Completion: Once all chunks are received, files are marked as complete.
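The file assembly and completion steps above can be illustrated by writing each received chunk at its offset and counting distinct offsets. The assembler type is hypothetical, and completion tracking by offset count is an assumption about how "all chunks received" might be decided.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// assembler writes received chunks into a file and tracks completion.
// (Hypothetical type for illustration.)
type assembler struct {
	f        *os.File
	total    int
	received map[int64]bool
}

func newAssembler(path string, totalChunks int) (*assembler, error) {
	f, err := os.OpenFile(path, os.O_CREATE|os.O_RDWR, 0o644)
	if err != nil {
		return nil, err
	}
	return &assembler{f: f, total: totalChunks, received: make(map[int64]bool)}, nil
}

// writeChunk stores data at offset and reports whether every chunk has
// arrived. Chunks may arrive in any order; duplicates are counted once.
func (a *assembler) writeChunk(offset int64, data []byte) (bool, error) {
	if _, err := a.f.WriteAt(data, offset); err != nil {
		return false, err
	}
	a.received[offset] = true
	return len(a.received) == a.total, nil
}

func (a *assembler) close() error { return a.f.Close() }

// demo assembles three out-of-order chunks in a temporary directory and
// returns the resulting file content.
func demo() (string, error) {
	dir, err := os.MkdirTemp("", "gosync-demo")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)
	path := filepath.Join(dir, "out.txt")
	a, err := newAssembler(path, 3)
	if err != nil {
		return "", err
	}
	defer a.close()
	parts := map[int64]string{0: "hell", 4: "o, w", 8: "orld"}
	order := []int64{8, 0, 4}
	for i, off := range order {
		done, err := a.writeChunk(off, []byte(parts[off]))
		if err != nil {
			return "", err
		}
		if done != (i == len(order)-1) {
			return "", fmt.Errorf("unexpected completion state after chunk %d", i)
		}
	}
	content, err := os.ReadFile(path)
	return string(content), err
}

func main() {
	content, err := demo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(content)
}
```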
To synchronize files between multiple machines:
- Ensure All Machines Are on the Same Network: GoSync uses mDNS for discovery, which works on local networks.
- Start GoSync on Each Machine: Use the same --sync-folder path on each machine (the folder must exist).
- Wait for Discovery: The mDNS Service will discover peers automatically.
- Monitor Logs: Check the logs to ensure peers are discovered and files are synchronizing.
- GoSync uses github.com/charmbracelet/log for logging.
- Logs include information about:
- Peer discovery and connection states.
- File events and synchronization status.
- Metadata calculations and comparisons.
- To adjust log levels or formats, modify the logging configuration in the code.