Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
RDoc-328 Initial RavenFS documentation
Arkadiusz Palinski committed Feb 9, 2015
1 parent 48b4803, commit 6602cc4
Showing 15 changed files with 233 additions and 1 deletion.
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -5,4 +5,5 @@ | |
/server Server | ||
/studio Studio | ||
/samples Samples | ||
/glossary Glossary | ||
/glossary Glossary | ||
/file-system File system |
6 changes: 6 additions & 0 deletions
Documentation/3.0/Raven.Documentation.Pages/file-system/.docslist
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
what-is-raven-fs.markdown What is RavenFS? | ||
files.markdown Files | ||
indexing.markdown Indexing | ||
/client-api Client API | ||
/synchronization Synchronization | ||
/server-side Server side |
64 changes: 64 additions & 0 deletions
Documentation/3.0/Raven.Documentation.Pages/file-system/files.markdown
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,64 @@ | ||
#Files | ||
|
||
RavenFS can store data by using one of the following storage engines: Esent or Voron. You choose the engine when creating a new file system. | ||
|
||
##What is a file? | ||
|
||
A file in the file system consists of: | ||
|
||
* name (full path), | ||
* total size, | ||
* uploaded size, | ||
* metadata - collection of properties associated with a file, | ||
* sequence of bytes that make up file content. | ||
|
||
##Pages | ||
|
||
Internally, each file is divided into multiple pages. A page is a sequence of bytes with a maximum size of 64KB and a unique identifier: a pair of hashes calculated on the page's content. | ||
The concept of pages implies a few facts: | ||
|
||
* stored pages are unique, | ||
* file content is an ordered list of page references, | ||
* each page might be a part of multiple files, | ||
* pages are immutable - once they are written to storage, they cannot be modified (but they can be removed if there is no file referencing this page), | ||
* occupied disk space is reduced by reusing pages if files share the same information (or even the same file has repeated data patterns). | ||
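The page properties listed above can be illustrated with a minimal in-memory sketch. It is not the RavenFS implementation: the text does not say which pair of hashes is used, so SHA-256 and MD5 stand in as an assumption, and `PageStore`, `page_key`, `put_file` and `read_file` are hypothetical names.

```python
import hashlib

PAGE_SIZE = 64 * 1024  # maximum page size stated in the text


def page_key(data):
    # Assumption: the real hash pair is not specified; SHA-256 and MD5 stand in.
    return hashlib.sha256(data).hexdigest(), hashlib.md5(data).hexdigest()


class PageStore:
    """Toy store: unique, immutable pages that can be shared between files."""

    def __init__(self):
        self.pages = {}  # page key -> page bytes
        self.files = {}  # file name -> ordered list of page keys

    def put_file(self, name, content):
        refs = []
        for offset in range(0, len(content), PAGE_SIZE):
            page = content[offset:offset + PAGE_SIZE]
            key = page_key(page)
            # An already stored page is reused, never rewritten.
            self.pages.setdefault(key, page)
            refs.append(key)
        self.files[name] = refs

    def read_file(self, name):
        # File content is the concatenation of its referenced pages, in order.
        return b"".join(self.pages[key] for key in self.files[name])


store = PageStore()
block = b"x" * PAGE_SIZE
store.put_file("/a.bin", block * 3)        # three identical pages
store.put_file("/b.bin", block + b"tail")  # shares its first page with /a.bin
print(len(store.pages))                    # number of distinct pages stored
```

In this sketch two files holding four page references in total occupy only two stored pages, which is the space saving the last bullet describes.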
|
||
##Directories | ||
|
||
In RavenFS, directories are just a virtual concept. The directory tree is built from the names of existing files. A file name must be a full path, e.g. `/docs/pics/wall.jpg`. | ||
The directory part of a file name is indexed together with the file metadata, which allows you to browse files by directory: you just need to query the appropriate index entry field. | ||
Note that moving a file between directories is actually implemented as a rename operation. | ||
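A minimal sketch of the idea that directories exist only as prefixes of file names, and that a move is therefore a rename. The helper names `list_directories` and `move` are hypothetical and are not part of any RavenFS API.

```python
import posixpath


def list_directories(file_names):
    """Derive the directories implied by a collection of full-path file names."""
    directories = {"/"}
    for name in file_names:
        parent = posixpath.dirname(name)
        # Walk up the path until an already known directory is reached.
        while parent not in directories:
            directories.add(parent)
            parent = posixpath.dirname(parent)
    return sorted(directories)


def move(file_names, old_name, new_directory):
    """Moving a file is a rename: only the path stored in its name changes."""
    new_name = posixpath.join(new_directory, posixpath.basename(old_name))
    return [new_name if name == old_name else name for name in file_names]


files = ["/docs/pics/wall.jpg", "/docs/readme.txt"]
print(list_directories(files))

moved = move(files, "/docs/pics/wall.jpg", "/archive")
print(list_directories(moved))
```

After the move, `/docs/pics` is no longer listed because no file name refers to it any more; nothing had to be created or deleted for that to happen.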
|
||
##Default metadata | ||
|
||
Each file has an associated collection of properties called metadata. A user can attach any information about a file by adding another metadata record. | ||
Some properties are defined by RavenFS itself because it needs them internally. This is the metadata of a sample file: | ||
|
||
{CODE-BLOCK:json} | ||
{ | ||
"ETag": "00000000-0000-0100-0000-000000000002", | ||
"Content-MD5": "0d7a08e7f58bfe020c59d739911ee519", | ||
"RavenFS-Size": 23552, | ||
"Raven-Creation-Date": "2015-02-09T12:20:06.7257923+00:00", | ||
"Raven-Last-Modified": "2015-02-09T12:20:06.7669533+00:00", | ||
"Raven-Synchronization-Version": 1, | ||
"Raven-Synchronization-Source": "c6230a52-d1d7-4ea0-9942-6312431f32a1", | ||
"Raven-Synchronization-History": [] | ||
} | ||
{CODE-BLOCK/} | ||
|
||
|
||
|
||
* `ETag` is an internal file identifier, updated every time a file is modified. A file is considered modified when new content is uploaded, its name or metadata is changed, or any of those changes is synchronized from a remote file system, | ||
* `Content-MD5` is a hash of the file content, calculated on the fly during an upload by using the MD5 algorithm, | ||
* `RavenFS-Size` is the total size of a file, | ||
* `Raven-Creation-Date`, `Raven-Last-Modified` are the dates of creation and last modification, | ||
* `Raven-Synchronization-Version` is a number describing a file version in a file system, | ||
* `Raven-Synchronization-Source` is a unique identifier of the origin file server (where the last file modification was made), | ||
* `Raven-Synchronization-History` is a list of previous {`Raven-Synchronization-Version`, `Raven-Synchronization-Source`} pairs, updated every time a file is synchronized between servers. | ||
|
||
{INFO: Updating synchronization history} | ||
`Raven-Synchronization-Version`, `Raven-Synchronization-Source` and `Raven-Synchronization-History` are always updated together. | ||
The existing `Raven-Synchronization-Version` and `Raven-Synchronization-Source` values are added to the history array (`Raven-Synchronization-History`) | ||
and are then replaced with new values. As their names suggest, all of these properties are used for synchronization purposes (conflict handling). | ||
{INFO/} |
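The update rule from the note above can be sketched as follows. This is an illustration only: the function name `record_modification` and the shape of a history entry (the `Version` and `ServerId` keys) are assumptions, and the text does not say how the new version number is chosen, so the caller passes it in.

```python
import copy


def record_modification(metadata, new_version, new_source):
    """Return updated metadata; the three synchronization properties change together."""
    updated = copy.deepcopy(metadata)
    # The current version/source pair moves into the history...
    updated["Raven-Synchronization-History"].append({
        "Version": metadata["Raven-Synchronization-Version"],  # hypothetical key
        "ServerId": metadata["Raven-Synchronization-Source"],  # hypothetical key
    })
    # ...and is then replaced with the new values.
    updated["Raven-Synchronization-Version"] = new_version
    updated["Raven-Synchronization-Source"] = new_source
    return updated


before = {
    "Raven-Synchronization-Version": 1,
    "Raven-Synchronization-Source": "c6230a52-d1d7-4ea0-9942-6312431f32a1",
    "Raven-Synchronization-History": [],
}
# "hypothetical-server-b" is a made-up server identifier.
after = record_modification(before, 2, "hypothetical-server-b")
print(after["Raven-Synchronization-History"])
```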
Binary file added
BIN
+13.7 KB
Documentation/3.0/Raven.Documentation.Pages/file-system/images/indexing_studio.png
Binary file added
BIN
+44.5 KB
Documentation/3.0/Raven.Documentation.Pages/file-system/images/studio_view.png
35 changes: 35 additions & 0 deletions
Documentation/3.0/Raven.Documentation.Pages/file-system/indexing.markdown
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,35 @@ | ||
#Indexing | ||
|
||
The file system allows you to search files by using [Lucene query syntax](http://lucene.apache.org/core/old_versioned_docs/versions/3_0_0/queryparsersyntax.html). You can look for a file by using: | ||
|
||
* name, | ||
* size, | ||
* directory, | ||
* date of modification, | ||
* any user defined metadata. | ||
|
||
The more files and corresponding metadata you add, the more search terms you can use to build your search query. You can find all available search fields by using [Client API](TODO arek). The built-in search fields are explained below. | ||
|
||
Assume that we have a file `documents/pictures/wallpaper.jpg`. The default search terms would then have the following values: | ||
|
||
* `__key` - the full name of the file: `documents/pictures/wallpaper.jpg`, | ||
* `__fileName` - the last part of the file path: `wallpaper.jpg`, | ||
* `__rfileName` - the *reversed* version of `__fileName` (to support suffix searches through queries that end with a wildcard): `gpj.repapllaw`, | ||
* `__directory` - the full directory path: `/documents/pictures`, | ||
* `__rdirectory` - the *reversed* directory path (to support suffix searches through queries that end with a wildcard): `serutcip/stnemucod/`, | ||
* `__directoryName` - the list of directories associated with the file: `/documents/pictures`, `/documents`, `/`, | ||
* `__rdirectoryName` - the list of *reversed* paths of directories associated with the file (to support suffix searches through queries that end with a wildcard): `serutcip/stnemucod/`, `stnemucod/`, `/`, | ||
* `__level` - the nesting level: `3`, | ||
* `__modified` - the date of file indexing (the date index format is *yyyy-MM-dd_HH-mm-ss*), | ||
* `__size` - the file length (in bytes) stored as a string (D20 format), | ||
* `__size_numeric` - the file length (in bytes) stored as a numeric field, which allows searching by range. | ||
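As an illustration of how the name-based and size-based fields in the list relate to a file, here is a rough sketch that derives them from a file name and a size. It only mirrors the example values shown above and is not the actual indexing code; `index_fields` is a hypothetical helper, `__level` and `__modified` are left out because the text does not define how they are computed, and D20 is taken to mean a decimal integer zero-padded to 20 digits.

```python
def index_fields(name, size):
    """Build the name- and size-derived search fields for one file."""
    parts = [part for part in name.split("/") if part]
    file_name = parts[-1]
    directory = "/" + "/".join(parts[:-1])
    # The directory itself, then each of its ancestors up to the root.
    ancestors = ["/" + "/".join(parts[:i]) for i in range(len(parts) - 1, -1, -1)]
    return {
        "__key": name,
        "__fileName": file_name,
        "__rfileName": file_name[::-1],
        "__directory": directory,
        "__rdirectory": directory[::-1],
        "__directoryName": ancestors,
        "__rdirectoryName": [path[::-1] for path in ancestors],
        "__size": f"{size:020d}",  # zero-padded to 20 digits
        "__size_numeric": size,
    }


fields = index_fields("documents/pictures/wallpaper.jpg", 23552)
for field, value in fields.items():
    print(field, value)
```

Zero-padding the string form of the size keeps lexicographic order consistent with numeric order, which is presumably why a fixed-width format is used for `__size`.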
|
||
A sample query to find all files under the `/documents` directory (or nested below it) whose name ends with `.jpg` and whose size is greater than or equal to 1MB: | ||
|
||
`__directoryName:/documents AND __rfileName:gpj.* AND __size_numeric:[1048576 TO *]` | ||
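The query above can be assembled programmatically. The sketch below only builds the query string; it does not call any RavenFS API, the helper names `suffix_term` and `build_query` are hypothetical, and no escaping of Lucene special characters is attempted.

```python
def suffix_term(field, suffix):
    """A suffix search becomes a prefix search on the reversed field."""
    return f"{field}:{suffix[::-1]}*"


def build_query(directory, extension, min_size):
    """Combine the directory, file name suffix and size range with AND."""
    return " AND ".join([
        f"__directoryName:{directory}",
        suffix_term("__rfileName", extension),
        f"__size_numeric:[{min_size} TO *]",
    ])


query = build_query("/documents", ".jpg", 1024 * 1024)
print(query)
```

Reversing `.jpg` gives `gpj.`, so the trailing wildcard on `__rfileName` matches every file name that ends with `.jpg`.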
|
||
The easiest way to search for files from code is to use the [Client API](../client-api/indexTODO arek) methods. | ||
|
||
Searching is also supported by the studio, where you will find useful predefined search filters: | ||
|
||
![Figure 1: Search filters](images/indexing_studio.png) |
Empty file.
54 changes: 54 additions & 0 deletions
Documentation/3.0/Raven.Documentation.Pages/file-system/what-is-raven-fs.markdown
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,54 @@ | ||
#What is RavenFS? | ||
|
||
The Raven File System (RavenFS) is a distributed virtual file system integrated with RavenDB to provide first-class support for binary data. | ||
Since RavenDB 3.0 it is the recommended way to store your binary files instead of the deprecated attachment mechanism. | ||
|
||
It was designed from the start to handle very large files (multiple GBs) efficiently at both the API and the storage layer by minimizing the amount of duplicated data between files. | ||
It has built-in file indexing support that allows you to search files by their associated metadata (such as the size of a file, its modification date or custom properties defined by the user). | ||
|
||
RavenFS is a replicated and highly available system. It provides an optimized file synchronization mechanism which ensures that only the differences between versions of a file are transferred | ||
over the network to synchronize it between configured nodes. This lets you update very large files and replicate only the changes. Everything is transparent to the user: you just need | ||
to specify the destination nodes. | ||
|
||
##Basic concepts | ||
|
||
###File | ||
|
||
An essential item that you will work with is a file. Besides binary data that makes up a file's content, each one has associated metadata. There are two kinds of metadata: | ||
|
||
* the first one is provided by the system and internally used by it (for instance: `ETag`), | ||
* the second one is defined by a user and can contain any information under a custom key. | ||
|
||
As already mentioned, metadata is available for searching. You will find more details about how files are stored internally in the [Files](files) article. | ||
|
||
###Configuration | ||
|
||
A configuration is an item for keeping non-binary data as a collection of key/value properties stored under a unique name. Note that configurations can be | ||
completely unrelated to your files but they can hold additional information that matters for your application. They are also used internally by RavenFS to store | ||
some configuration settings (e.g. `Raven/Synchronization/Destinations` keeps the addresses of synchronization destination nodes). | ||
|
||
###Indexing | ||
|
||
Files are indexed by default, which allows you to execute queries against the metadata of stored files. Under the hood, as in RavenDB, | ||
the Lucene search engine is used. This allows you to search efficiently by file name, size and metadata. | ||
|
||
###Synchronization | ||
|
||
Synchronization between RavenFS nodes works out of the box. The only thing you need to do is provide a list of destination file systems. | ||
When one of the following events happens, RavenFS automatically starts to synchronize the affected file: | ||
|
||
* new file uploaded, | ||
* file content changed, | ||
* file metadata changed, | ||
* file renamed, | ||
* file deleted. | ||
|
||
The synchronization task also runs periodically to handle failures and restart scenarios. Each of the above operations is associated with a different kind of | ||
synchronization work, which is determined by the server in order to minimize the amount of data transferred across the network. For example, if you just change | ||
a file name then there is no need to send its content; the destination nodes only need to know the new file name. To get more details about implemented synchronization solutions click [here](). | ||
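The idea that each event maps to its own kind of synchronization work can be sketched as a simple lookup. Everything here is hypothetical naming: `FileEvent`, `SYNC_PAYLOAD` and `payload_for` do not come from RavenFS. Only the rename case (send just the new name) and the content change case (send only differences) are stated in the text; the remaining entries are assumptions made for illustration.

```python
from enum import Enum


class FileEvent(Enum):
    UPLOADED = "new file uploaded"
    CONTENT_CHANGED = "file content changed"
    METADATA_CHANGED = "file metadata changed"
    RENAMED = "file renamed"
    DELETED = "file deleted"


# What has to travel to a destination node for each event.
SYNC_PAYLOAD = {
    FileEvent.UPLOADED: "file content and metadata",  # assumption
    FileEvent.CONTENT_CHANGED: "content differences",  # stated in the text
    FileEvent.METADATA_CHANGED: "metadata",            # assumption
    FileEvent.RENAMED: "new file name",                # stated in the text
    FileEvent.DELETED: "delete notification",          # assumption
}


def payload_for(event):
    """Pick the smallest payload that lets a destination catch up."""
    return SYNC_PAYLOAD[event]


print(payload_for(FileEvent.RENAMED))
```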
|
||
##Management studio | ||
|
||
You can easily manage your files by using the studio, an HTML5 application. Databases as well as file systems are handled by the same application, which is accessible under the RavenDB server URL. | ||
|
||
![Figure 1. Studio. File system](images/studio_view.png) |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,17 @@ | ||
@using Raven.Documentation.Web.Helpers | ||
@model Raven.Documentation.Web.Models.PageModel | ||
@{ | ||
ViewBag.Title = "File System"; | ||
} | ||
<div id="article"> | ||
<div class="row"> | ||
<div class="col-md-3"> | ||
@Html.GenerateTableOfContents(Url, Model.TableOfContents, null) | ||
</div> | ||
<div class="col-md-9"> | ||
TODO | ||
</div> | ||
</div> | ||
</div> | ||
|
||
|