ReGrid File Storage
ReGrid is a distributed file storage system built on top of RethinkDB, inspired by GridFS from MongoDB. With ReGrid, a large 4 GB file can be broken up into chunks and stored on RethinkDB servers. Later, the file can be retrieved by streaming its chunks back to the client. The figure below shows ReGrid storing a large video file in chunks across a three-node cluster.
(Note: Please ask permission before using figures in presentations, videos, or other works. Thanks.)
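The chunking idea can be illustrated with a minimal, in-memory sketch. This is not ReGrid's implementation: the class name ChunkingSketch, its methods, and the chunk size passed in are all hypothetical, and no database is involved. It only shows a stream being cut into fixed-size pieces and streamed back out in order.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical illustration only; not part of the ReGrid API.
public static class ChunkingSketch
{
    // Reads the source stream in fixed-size pieces; the last chunk may be shorter.
    public static List<byte[]> Split(Stream source, int chunkSize)
    {
        if (chunkSize <= 0) throw new ArgumentOutOfRangeException(nameof(chunkSize));

        var chunks = new List<byte[]>();
        var buffer = new byte[chunkSize];
        while (true)
        {
            // Stream.Read may return fewer bytes than requested, so keep
            // reading until the buffer is full or the stream ends.
            int filled = 0;
            while (filled < chunkSize)
            {
                int read = source.Read(buffer, filled, chunkSize - filled);
                if (read == 0) break;
                filled += read;
            }
            if (filled == 0) break;

            var chunk = new byte[filled];
            Array.Copy(buffer, chunk, filled);
            chunks.Add(chunk);

            if (filled < chunkSize) break; // short chunk means end of stream
        }
        return chunks;
    }

    // Writes the chunks to the destination in order, rebuilding the original bytes.
    public static void Reassemble(IEnumerable<byte[]> chunks, Stream destination)
    {
        foreach (var chunk in chunks)
            destination.Write(chunk, 0, chunk.Length);
    }
}
```

Only one chunk-sized buffer is held while splitting, which is why a chunked store can accept files far larger than available memory. In a real cluster each chunk would be written to the database rather than collected in a list.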
- Physical refers to the physical topology, location, and layout of data.
- Logical refers to the logical location of data: a high-level, user's view of how files are organized, regardless of the physical layout of data.
A Bucket is a logical set of files organized together. File read/download and write/upload operations are performed using a Bucket.
- A Bucket requires a RethinkDB database.
- A RethinkDB database can be partitioned into several Buckets.
- Multiple Buckets in the same RethinkDB database are differentiated by a Bucket's name.
- The default name for a Bucket is fs.
The figure below illustrates the logical separation of buckets within a single MyFiles database:

In Figure 2 above, there are three logical file Bucket stores in the MyFiles RethinkDB database. It is important to note that video.mp4 from the fs bucket is not the same file as video.mp4 from the dev bucket. Buckets can be used to organize files in any way developers see fit.
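The separation between buckets can be pictured with a small in-memory stand-in. This sketch assumes nothing about how ReGrid lays out its tables: InMemoryDatabase and its Put and TryGet methods are hypothetical names, and revisioning is left out. It only shows that a file is identified by its bucket name together with its path.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical illustration only; not part of the ReGrid API.
public sealed class InMemoryDatabase
{
    // The key is (bucket name, path), so the same path in two buckets
    // refers to two independent entries.
    private readonly Dictionary<(string Bucket, string Path), byte[]> files =
        new Dictionary<(string Bucket, string Path), byte[]>();

    // "fs" mirrors the default bucket name. Revisions are omitted here:
    // a second Put to the same bucket and path simply overwrites.
    public void Put(string path, byte[] content, string bucket = "fs")
        => files[(bucket, path)] = content;

    public bool TryGet(string path, out byte[] content, string bucket = "fs")
        => files.TryGetValue((bucket, path), out content);
}
```

Storing /video.mp4 once with the default bucket and once with bucket: "dev" yields two separate entries, and a lookup in a third bucket finds nothing.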
To create a Bucket named dev in the MyFiles database:
var bucket = new Bucket(conn, "MyFiles", bucketName: "dev" );
bucket.Mount(); // required before use...
The dev Bucket must be mounted before use. Mount ensures that the tables and indexes the Bucket needs exist.
When a File is uploaded to a Bucket, a path is specified in the destination Bucket. Multiple uploads to the same path cause the file to be revisioned. Figure 3 below shows the

Negative Revision Numbers
- -1: The most recent revision.
- -2: The second most recent revision.
- -3: The third most recent revision.
- etc...
Positive Revision Numbers
- 0: The original stored file.
- 1: The first revision.
- 2: The second revision.
- etc...
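The two numbering schemes can be read as two ways of indexing the same oldest-first list of stored versions. The sketch below is an assumption about the arithmetic only, not ReGrid's code: RevisionSketch, ToIndex, and Select are hypothetical names, and throwing on an out-of-range revision is a choice made for this example.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical illustration only; not part of the ReGrid API.
public static class RevisionSketch
{
    // Maps a revision number onto an index into a list ordered oldest-first.
    // 0, 1, 2, ... count forward from the original stored file;
    // -1, -2, -3, ... count backward from the most recent upload.
    public static int ToIndex(int revision, int storedCount)
    {
        int index = revision >= 0 ? revision : storedCount + revision;
        if (index < 0 || index >= storedCount)
            throw new ArgumentOutOfRangeException(nameof(revision), "No such revision.");
        return index;
    }

    // Picks the requested revision out of an oldest-first list.
    public static T Select<T>(IReadOnlyList<T> oldestFirst, int revision)
        => oldestFirst[ToIndex(revision, oldestFirst.Count)];
}
```

With three uploads to the same path, revision 0 and revision -3 both select the original file, while revision 2 and revision -1 both select the latest one.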
The following code uploads a file to a Bucket:
var fileId = bucket.Upload("/video.mp4", videoBytes);
The returned fileId is the file reference for that specific revision. Bucket also has many other methods that work with IO streams and async operations.
//Downloads to a byte[]
var bytes = bucket.DownloadAsBytesByName("/video.mp4");
//Download revision:0 to a file stream on the client
using (var localFileStream = File.Open("C:\\video_original.mp4", FileMode.Create))
{
    // The using block closes the stream even if the download throws.
    bucket.DownloadToStreamByName("/video.mp4", localFileStream, revision: 0);
}
Use DownloadAsBytes with caution: it returns a byte[], which has a maximum size of int.MaxValue. For relatively large files, use DownloadToStream, which has no maximum size limit beyond the client host's OS file limitations.
ReGrid supports starting downloads at an offset by seeking into part of a large file.
CODE
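As a rough illustration of why chunked storage makes seeking cheap, the sketch below maps a byte offset to the chunk that contains it. It does not show ReGrid's seek API: SeekSketch and Locate are hypothetical names, and a fixed chunk size for every chunk except the last is an assumption.

```csharp
using System;

// Hypothetical illustration only; not part of the ReGrid API.
public static class SeekSketch
{
    // Returns the zero-based chunk that contains the byte offset, and how
    // many bytes to skip inside that chunk before the requested data starts.
    public static (long ChunkIndex, int SkipBytes) Locate(long offset, int chunkSize)
    {
        if (offset < 0) throw new ArgumentOutOfRangeException(nameof(offset));
        if (chunkSize <= 0) throw new ArgumentOutOfRangeException(nameof(chunkSize));

        return (offset / chunkSize, (int)(offset % chunkSize));
    }
}
```

Under these assumptions, a download that starts at an offset only needs to fetch chunks from ChunkIndex onward and can ignore everything before it.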
### List files in /foo folder, using localhost as default
rg ls /foo
### List root files in cluster with a well-known IP 192.168.0.4
rg 192.168.0.4 ls /
rg 192.168.0.4:2802 ls /
rg 192.168.0.4 ls /folder
### Get file metadata on /path/video.mp4 and any past revisions
rg 192.168.0.4 info /path/video.mp4
### Copy video.mp4 from the local computer and send it
### to the cluster at 192.168.0.4 but wait until there
### is a pooled connection of at least 5 servers to
### increase fan-out write-performance.
rg 192.168.0.4 -pool 5 put ./video.mp4 /video_uploads/video.mp4
### Create a file video.mp4 on the local computer and
### receive it from the cluster at 192.168.0.4 but wait until there
### is a pooled connection of at least 5 servers to
### increase fan-in read-performance.
rg 192.168.0.4 -pool 5 get /video_uploads/video.mp4 ./video.mp4
### Perform a sha256 integrity check on video.mp4
rg 192.168.0.4 fsck /path/video.mp4
### Reclaim disk space by cleaning up orphaned file chunks or soft-deleted files.
rg 192.168.0.4 cleanup