RDoc-328 Initial RavenFS documentation
Arkadiusz Palinski committed Feb 9, 2015
1 parent 48b4803 commit 6602cc4
Showing 15 changed files with 233 additions and 1 deletion.
3 changes: 2 additions & 1 deletion Documentation/3.0/Raven.Documentation.Pages/.docslist
@@ -5,4 +5,5 @@
/server Server
/studio Studio
/samples Samples
/glossary Glossary
/glossary Glossary
/file-system File system
@@ -351,6 +351,10 @@
<None Include="client-api\setting-up-connection-string.dotnet.markdown" />
<None Include="client-api\setting-up-default-database.dotnet.markdown" />
<None Include="client-api\what-is-a-document-store.dotnet.markdown" />
<None Include="file-system\.docslist" />
<None Include="file-system\server-side\.docslist" />
<None Include="file-system\server-side\files-in-ravenfs.markdown" />
<None Include="file-system\what-is-raven-fs.markdown" />
<None Include="glossary\.docslist" />
<None Include="glossary\admin-statistics.dotnet.markdown" />
<None Include="glossary\attachment-information.dotnet.markdown" />
@@ -600,9 +604,12 @@
<None Include="transformers\what-are-transformers.markdown" />
</ItemGroup>
<ItemGroup>
<Folder Include="file-system\client-api\" />
<Folder Include="file-system\synchronization\" />
<Folder Include="Properties\" />
</ItemGroup>
<ItemGroup>
<Content Include="file-system\images\studio_view.png" />
<Content Include="indexes\images\side-by-side-1.png" />
<Content Include="indexes\images\side-by-side-2.png" />
<Content Include="indexes\images\side-by-side-3.png" />
@@ -0,0 +1,6 @@
what-is-raven-fs.markdown What is RavenFS?
files.markdown Files
indexing.markdown Indexing
/client-api Client API
/synchronization Synchronization
/server-side Server side
@@ -0,0 +1,64 @@
#Files

RavenFS can store data using one of the following storage engines: Esent or Voron. You choose the engine when creating a new file system.

##What is a file?

A file in the file system consists of:

* name (full path),
* total size,
* uploaded size,
* metadata - collection of properties associated with a file,
* sequence of bytes that make up file content.

##Pages

Internally, each file is divided into multiple pages. A page is a sequence of bytes with a maximum size of 64KB and a unique identifier: a pair of hashes calculated from the page's content.
The concept of pages implies a few facts:

* stored pages are unique,
* file content is an ordered list of page references,
* each page might be a part of multiple files,
* pages are immutable - once they are written to storage, they cannot be modified (but they can be removed if there is no file referencing this page),
* occupied disk space is reduced by reusing pages if files share the same information (or even the same file has repeated data patterns).
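
The page properties listed above can be illustrated with a minimal sketch. It is not the RavenFS storage code: `PageStore`, `put_file` and `read_file` are hypothetical names, the store lives in memory, and the choice of SHA-256 plus BLAKE2b as the "pair of hashes" is an assumption made only for the example.

```python
import hashlib

PAGE_SIZE = 64 * 1024  # maximum page size stated in the text


class PageStore:
    """Hypothetical in-memory store of unique, immutable pages."""

    def __init__(self):
        self.pages = {}  # page id -> page bytes (each stored once)
        self.files = {}  # file name -> ordered list of page ids

    @staticmethod
    def page_id(data):
        # Assumption: any pair of content hashes can act as the identifier.
        return (hashlib.sha256(data).hexdigest(),
                hashlib.blake2b(data).hexdigest())

    def put_file(self, name, content):
        refs = []
        for offset in range(0, len(content), PAGE_SIZE):
            page = content[offset:offset + PAGE_SIZE]
            pid = self.page_id(page)
            # An existing page is reused and never overwritten.
            self.pages.setdefault(pid, page)
            refs.append(pid)
        self.files[name] = refs

    def read_file(self, name):
        # File content is the concatenation of its referenced pages.
        return b"".join(self.pages[pid] for pid in self.files[name])
```

In this sketch, two files that share a 64KB block reference the same stored page, and a file with a repeated block references that page twice, so the stored data is smaller than the sum of the file sizes.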

##Directories

In RavenFS, directories are just a virtual concept. The directory tree is built from the names of existing files. A file name must be a full path, e.g. `/docs/pics/wall.jpg`.
The directory part of a file name is indexed together with the file metadata, which allows you to browse files by directory: you just need to query the appropriate index entry field.
Note that moving a file between directories is actually implemented as a rename operation.
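
As a rough illustration of directories being derived purely from file names, the sketch below lists the content of a virtual directory and "moves" a file by renaming it. The helpers `list_directory` and `move` are hypothetical and work on a plain list of names; RavenFS itself answers such questions through its index.

```python
import posixpath


def list_directory(file_names, directory):
    """Return (subdirectories, files) found directly under `directory`."""
    prefix = directory.rstrip("/") + "/"
    subdirs, files = set(), []
    for name in file_names:
        if not name.startswith(prefix):
            continue
        rest = name[len(prefix):]
        head, sep, _ = rest.partition("/")
        if sep:
            # The file lives deeper, so `head` is a virtual subdirectory.
            subdirs.add(prefix + head)
        else:
            files.append(name)
    return sorted(subdirs), sorted(files)


def move(file_names, old_name, new_directory):
    """Moving a file is only a rename: its full path changes."""
    new_name = posixpath.join(new_directory, posixpath.basename(old_name))
    return [new_name if n == old_name else n for n in file_names]
```

A directory exists in this model only as long as at least one file name contains it, which is why no separate "create directory" step appears.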

##Default metadata

Each file has an associated collection of properties called metadata. A user can attach any information about a file by adding another metadata record.
Some properties are defined by RavenFS itself because they are necessary for its internal work. This is the metadata of a sample file:

{CODE-BLOCK:json}
{
    "ETag": "00000000-0000-0100-0000-000000000002",
    "Content-MD5": "0d7a08e7f58bfe020c59d739911ee519",
    "RavenFS-Size": 23552,
    "Raven-Creation-Date": "2015-02-09T12:20:06.7257923+00:00",
    "Raven-Last-Modified": "2015-02-09T12:20:06.7669533+00:00",
    "Raven-Synchronization-Version": 1,
    "Raven-Synchronization-Source": "c6230a52-d1d7-4ea0-9942-6312431f32a1",
    "Raven-Synchronization-History": []
}
{CODE-BLOCK/}



* `ETag` is an internal file identifier, updated every time a file is modified. A file is considered modified when new content is uploaded, when its name or metadata is changed, or when any of those changes has been synchronized from a remote file system,
* `Content-MD5` is a hash of the file content, calculated on the fly during an upload by using the MD5 algorithm,
* `RavenFS-Size` is the total size of a file,
* `Raven-Creation-Date`, `Raven-Last-Modified` - dates of creation and last modification,
* `Raven-Synchronization-Version` is a number describing the file version in a file system,
* `Raven-Synchronization-Source` is a unique identifier of the origin file server (where the last file modification was made),
* `Raven-Synchronization-History` is a list of previous {`Raven-Synchronization-Version`, `Raven-Synchronization-Source`} pairs, updated every time a file is synchronized between servers.

{INFO: Updating synchronization history}
`Raven-Synchronization-Version`, `Raven-Synchronization-Source` and `Raven-Synchronization-History` are always updated together.
The existing `Raven-Synchronization-Version` and `Raven-Synchronization-Source` values are appended to the history array (`Raven-Synchronization-History`)
and then replaced with new values. As their names suggest, all of these properties are used for synchronization purposes (conflict handling).
{INFO/}
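
The combined update of the three synchronization properties might be sketched as follows. This is an illustration only: `record_modification` is a hypothetical helper, history entries are stored as simple `(version, source)` tuples, and incrementing the version by one is an assumption, since the text does not say how a new version number is chosen.

```python
import copy

VERSION = "Raven-Synchronization-Version"
SOURCE = "Raven-Synchronization-Source"
HISTORY = "Raven-Synchronization-History"


def record_modification(metadata, source_id):
    """Return a copy of `metadata` with all three properties updated together."""
    updated = copy.deepcopy(metadata)
    history = updated.setdefault(HISTORY, [])
    old_version = updated.get(VERSION)
    if old_version is not None:
        # The current pair moves into the history before it is replaced.
        history.append((old_version, updated.get(SOURCE)))
    # Assumption: the next version is the previous one plus one.
    updated[VERSION] = (old_version or 0) + 1
    updated[SOURCE] = source_id
    return updated
```

Keeping the previous pairs lets a server inspect where a file has been modified before, which is the kind of information conflict handling relies on.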
@@ -0,0 +1,35 @@
#Indexing

The file system allows you to search files by using [Lucene query syntax](http://lucene.apache.org/core/old_versioned_docs/versions/3_0_0/queryparsersyntax.html). You can look for a file by using:

* name,
* size,
* directory,
* date of modification,
* any user defined metadata.

The more files and corresponding metadata you add, the more search terms you can use to build your search query. You can find all available search fields by using the [Client API](TODO arek). The built-in search fields are explained below.

Let's assume that we have a file `documents/pictures/wallpaper.jpg`. The default search terms would then have the following values:

* `__key` - the full name of the file: `documents/pictures/wallpaper.jpg`,
* `__fileName` - the last part of the file path: `wallpaper.jpg`,
* `__rfileName` - the *reversed* version of `__fileName`, which lets an "ends with" search be written as a query that ends with a wildcard: `gpj.repapllaw`,
* `__directory` - the full directory path: `/documents/pictures`,
* `__rdirectory` - the *reversed* directory path (for the same "ends with" searches): `serutcip/stnemucod/`,
* `__directoryName` - the list of directories associated with the file: `/documents/pictures`, `/documents`, `/`,
* `__rdirectoryName` - the list of *reversed* paths of directories associated with the file (for the same "ends with" searches): `serutcip/stnemucod/`, `stnemucod/`, `/`,
* `__level` - the nesting level: `3`,
* `__modified` - the date of file indexing (the date index format is *yyyy-MM-dd_HH-mm-ss*),
* `__size` - the file length (in bytes) stored as a string (format D20 is used),
* `__size_numeric` - the file length (in bytes) stored as a numeric field, which allows range searches.
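
To make the field values above concrete, here is a small sketch that derives the name-based fields from a file name. `search_fields` is a hypothetical helper written for this example, not the RavenFS indexing code. `__level` and `__modified` are left out because the text does not define how they are computed, and the D20 format is approximated by zero-padding the size to 20 digits.

```python
def search_fields(name, size):
    """Derive the name- and size-based search fields for one file."""
    full = "/" + name.lstrip("/")
    directory, _, file_name = full.rpartition("/")
    directory = directory or "/"

    # Collect the directory itself and every ancestor up to the root.
    directory_names = []
    current = directory
    while True:
        directory_names.append(current)
        if current == "/":
            break
        current = current.rpartition("/")[0] or "/"

    return {
        "__key": name,
        "__fileName": file_name,
        "__rfileName": file_name[::-1],
        "__directory": directory,
        "__rdirectory": directory[::-1],
        "__directoryName": directory_names,
        "__rdirectoryName": [d[::-1] for d in directory_names],
        # Zero-padded to 20 digits, mirroring the D20 format named above.
        "__size": format(size, "020d"),
        "__size_numeric": size,
    }
```

Calling `search_fields("documents/pictures/wallpaper.jpg", 23552)` yields the `__rfileName`, `__directoryName` and `__rdirectoryName` values shown in the list.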

A sample query that finds all files under the `/documents` directory (or any nested directory) whose name ends with `.jpg` and whose size is greater than or equal to 1MB:

`__directoryName:/documents AND __rfileName:gpj.* AND __size_numeric:[1048576 TO *]`
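
A query of this shape can be assembled from its three inputs. The sketch below is only string formatting: `files_under_query` is a hypothetical helper, and it performs no escaping of Lucene special characters, so it assumes plain directory names and extensions.

```python
def files_under_query(directory, extension, min_size):
    """Build a query for files under `directory` with a given extension and minimum size."""
    # Reversing the extension turns "ends with .jpg" into a prefix match on __rfileName.
    reversed_extension = extension[::-1]
    return "__directoryName:{0} AND __rfileName:{1}* AND __size_numeric:[{2} TO *]".format(
        directory, reversed_extension, min_size
    )


query = files_under_query("/documents", ".jpg", 1024 * 1024)
```

Here `query` holds the same text as the sample query, with 1MB expressed as 1048576 bytes.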

The easiest way to search for files from code is to use the [Client API](../client-api/indexTODO arek) methods.

Searching is also supported by the studio, where you will find useful predefined search filters:

![Figure 1: Search filters](images/indexing_studio.png)
Empty file.
@@ -0,0 +1,54 @@
#What is RavenFS?

The Raven File System (RavenFS) is a distributed virtual file system integrated with RavenDB to provide first-class support for binary data.
Since RavenDB 3.0 it is the recommended way to store your binary files instead of the deprecated attachment mechanism.

It was designed from the start to handle very large files (multiple GBs) efficiently at both the API and storage layers by minimizing the amount of duplicated data between files.
It has built-in file indexing support that allows you to search files by their associated metadata (such as the size of a file, a modification date or custom properties defined by a user).

RavenFS is a replicated and highly available system. It provides an optimized file synchronization mechanism which ensures that only the differences between versions of a file are transferred
over the network to synchronize it between configured nodes. This lets you update very large files and replicate only the changes. Everything is transparent to the user; you just need
to specify the destination nodes.

##Basic concepts

###File

An essential item that you will work with is a file. Besides binary data that makes up a file's content, each one has associated metadata. There are two kinds of metadata:

* the first one is provided by the system and internally used by it (for instance: `ETag`),
* the second one is defined by a user and can contain any information under a custom key.

As already mentioned, metadata is available for searching. You will find more details about how files are stored internally in the [Files](files) article.

###Configuration

A configuration is an item for keeping non-binary data as a collection of key/value properties stored under a unique name. Note that configurations can be
completely unrelated to your files, but they can hold additional information that matters for your application. They are also used internally by RavenFS to store
some configuration settings (e.g. `Raven/Synchronization/Destinations` keeps the addresses of synchronization destination nodes).

###Indexing

Files are indexed by default, which allows you to execute queries against the metadata of stored files. Under the hood, just like in RavenDB,
the Lucene search engine is used. This allows you to search efficiently by file name, size and metadata.

###Synchronization

Synchronization between RavenFS nodes works out of the box. The only thing you need to do is provide a list of destination file systems.
When one of the following events happens, RavenFS automatically starts to synchronize the affected file:

* new file uploaded,
* file content changed,
* file metadata changed,
* file renamed,
* file deleted.

The synchronization task also runs periodically to handle failures and restart scenarios. Each of the above operations is associated with a different kind of
synchronization work, which is determined by the server in order to minimize the amount of data transferred across the network. For example, if you just change
a file name, there is no need to send its content; the destination nodes only need to be told the new file name. To get more details about the implemented synchronization solutions click [here]().

##Management studio

You can easily manage your files by using the HTML5 studio application. Databases as well as file systems are handled by the same application, accessible under the RavenDB server URL.

![Figure 1. Studio. File system](images/studio_view.png)
4 changes: 4 additions & 0 deletions Raven.Documentation.Parser/Data/Category.cs
@@ -29,6 +29,10 @@ public enum Category
[Description("Getting started")]
Start,

[Prefix("file-system")]
[Description("File System")]
FileSystem,

// legacy categories

[Prefix("intro")]
9 changes: 9 additions & 0 deletions Raven.Documentation.Web/Controllers/DocsController.cs
@@ -218,6 +218,15 @@ public virtual ActionResult Samples(string version, string language)
return View(MVC.Docs.Views.Samples, new PageModel(toc));
}

public virtual ActionResult FileSystem(string version, string language)
{
var toc = DocumentSession
.Query<TableOfContents>()
.First(x => x.Category == Category.FileSystem && x.Version == CurrentVersion);

return View(MVC.Docs.Views.FileSystem, new PageModel(toc));
}

public virtual ActionResult Articles(string version, string language, string key)
{
ViewBag.Key = null;
32 changes: 32 additions & 0 deletions Raven.Documentation.Web/DocsController.generated.cs
@@ -110,6 +110,12 @@ public virtual System.Web.Mvc.ActionResult Samples()
}
[NonAction]
[GeneratedCode("T4MVC", "2.0"), DebuggerNonUserCode]
public virtual System.Web.Mvc.ActionResult FileSystem()
{
return new T4MVC_System_Web_Mvc_ActionResult(Area, Name, ActionNames.FileSystem);
}
[NonAction]
[GeneratedCode("T4MVC", "2.0"), DebuggerNonUserCode]
public virtual System.Web.Mvc.ActionResult Articles()
{
return new T4MVC_System_Web_Mvc_ActionResult(Area, Name, ActionNames.Articles);
@@ -139,6 +145,7 @@ public class ActionNamesClass
public readonly string Server = "Server";
public readonly string Glossary = "Glossary";
public readonly string Samples = "Samples";
public readonly string FileSystem = "FileSystem";
public readonly string Articles = "Articles";
}

@@ -154,6 +161,7 @@ public class ActionNameConstants
public const string Server = "Server";
public const string Glossary = "Glossary";
public const string Samples = "Samples";
public const string FileSystem = "FileSystem";
public const string Articles = "Articles";
}

@@ -243,6 +251,15 @@ public class ActionParamsClass_Samples
public readonly string version = "version";
public readonly string language = "language";
}
static readonly ActionParamsClass_FileSystem s_params_FileSystem = new ActionParamsClass_FileSystem();
[GeneratedCode("T4MVC", "2.0"), DebuggerNonUserCode]
public ActionParamsClass_FileSystem FileSystemParams { get { return s_params_FileSystem; } }
[GeneratedCode("T4MVC", "2.0"), DebuggerNonUserCode]
public class ActionParamsClass_FileSystem
{
public readonly string version = "version";
public readonly string language = "language";
}
static readonly ActionParamsClass_Articles s_params_Articles = new ActionParamsClass_Articles();
[GeneratedCode("T4MVC", "2.0"), DebuggerNonUserCode]
public ActionParamsClass_Articles ArticlesParams { get { return s_params_Articles; } }
@@ -265,6 +282,7 @@ public class _ViewNamesClass
{
public readonly string Article = "Article";
public readonly string Client = "Client";
public readonly string FileSystem = "FileSystem";
public readonly string Glossary = "Glossary";
public readonly string Indexes = "Indexes";
public readonly string NotDocumented = "NotDocumented";
@@ -280,6 +298,7 @@ public class _ViewNamesClass
}
public readonly string Article = "~/Views/Docs/Article.cshtml";
public readonly string Client = "~/Views/Docs/Client.cshtml";
public readonly string FileSystem = "~/Views/Docs/FileSystem.cshtml";
public readonly string Glossary = "~/Views/Docs/Glossary.cshtml";
public readonly string Indexes = "~/Views/Docs/Indexes.cshtml";
public readonly string NotDocumented = "~/Views/Docs/NotDocumented.cshtml";
@@ -421,6 +440,19 @@ public override System.Web.Mvc.ActionResult Samples(string version, string langu
return callInfo;
}

[NonAction]
partial void FileSystemOverride(T4MVC_System_Web_Mvc_ActionResult callInfo, string version, string language);

[NonAction]
public override System.Web.Mvc.ActionResult FileSystem(string version, string language)
{
var callInfo = new T4MVC_System_Web_Mvc_ActionResult(Area, Name, ActionNames.FileSystem);
ModelUnbinderHelpers.AddRouteValues(callInfo.RouteValueDictionary, "version", version);
ModelUnbinderHelpers.AddRouteValues(callInfo.RouteValueDictionary, "language", language);
FileSystemOverride(callInfo, version, language);
return callInfo;
}

[NonAction]
partial void ArticlesOverride(T4MVC_System_Web_Mvc_ActionResult callInfo, string version, string language, string key);

2 changes: 2 additions & 0 deletions Raven.Documentation.Web/Helpers/HtmlHelperExtensions.cs
@@ -91,6 +91,8 @@ private static MvcHtmlString GenerateNavigationFor30(HtmlHelper htmlHelper, Lang
builder.AppendLine("</ul>");
builder.AppendLine("</li>");

builder.AppendLine(string.Format("<li>{0}</li>", htmlHelper.ActionLink("File System", MVC.Docs.ActionNames.FileSystem, MVC.Docs.Name, new { language = language, version = "3.0", key = "file-system/what-is-ravenfs" }, null)));

builder.AppendLine("</ul>");

return new MvcHtmlString(builder.ToString());
1 change: 1 addition & 0 deletions Raven.Documentation.Web/Raven.Documentation.Web.csproj
@@ -263,6 +263,7 @@
<Content Include="Views\Docs\Start.cshtml" />
<Content Include="Views\Docs\Studio.cshtml" />
<Content Include="Views\Docs\Samples.cshtml" />
<Content Include="Views\Docs\FileSystem.cshtml" />
</ItemGroup>
<ItemGroup>
<Folder Include="App_Data\" />
17 changes: 17 additions & 0 deletions Raven.Documentation.Web/Views/Docs/FileSystem.cshtml
@@ -0,0 +1,17 @@
@using Raven.Documentation.Web.Helpers
@model Raven.Documentation.Web.Models.PageModel
@{
ViewBag.Title = "File System";
}
<div id="article">
<div class="row">
<div class="col-md-3">
@Html.GenerateTableOfContents(Url, Model.TableOfContents, null)
</div>
<div class="col-md-9">
TODO
</div>
</div>
</div>

