LINQ provider to run native queries on a Lucene.Net index

LINQ to Lucene.Net

Lucene.Net.Linq is a .NET library that enables LINQ queries to run natively on a Lucene.Net index.

  • Automatically converts PONOs (plain old .NET objects) to Documents and back
  • Add, delete and update documents in an atomic transaction
  • Unit of Work pattern automatically tracks and flushes updated documents
  • Update/replace documents with [Field(Key=true)] to prevent duplicates
  • Term queries
  • Prefix queries
  • Range queries and numeric range queries
  • Complex boolean queries
  • Native pagination using Skip and Take
  • Supports storing and querying NumericField
  • Automatically converts complex types for storing, querying and sorting
  • Custom boost functions using IQueryable.Boost() extension method
  • Sort by standard string, NumericField or any type that implements IComparable
  • Sort by item.Score() extension method to sort by relevance
  • Specify custom format for DateTime stored as strings
  • Register cache-warming queries to be executed when IndexSearcher is being reloaded

Available on NuGet Gallery

To install the Lucene.Net.Linq package, run the following command in the Package Manager Console:

PM> Install-Package Lucene.Net.Linq

Examples

  1. Using fluent syntax to configure mappings
  2. Using attributes to configure mappings
  3. Specifying document keys

Note on Performance

Initial versions of the library include a query filter when your entities specify a document key or key field in their mappings. The intention of this filter is to ensure that multiple entity types can be stored in a single index without unexpected errors.

It has been pointed out that this query filter adds significant overhead to query performance and goes against the best practice of using a different index for each type of document being stored.

To maintain backwards compatibility, the feature is left enabled by default, but it can now be disabled with the following setting:

luceneDataProvider.Settings.EnableMultipleEntities = false;

Future versions of this library may change the default behavior.

Integration with OData

Lucene.Net.Linq supports both WCF Data Services and WebApi OData. By default, these libraries enable a feature known as Null Propagation, which adds null-safety checks to LINQ expressions to prevent a NullReferenceException from being thrown when operating on a property that may be null.

A simple expression like:

from doc in Documents where doc.Name.StartsWith("Sample") select doc;

is translated into:

from doc in Documents where (doc != null && doc.Name != null
    && doc.Name.StartsWith("Sample")) select doc;

Null Propagation is designed to work with LINQ to Objects but is not required for LINQ providers such as Lucene.Net.Linq. Lucene.Net.Linq does its best to remove these null-safety checks when translating a LINQ expression tree into a Lucene Query, but for best performance it is recommended to turn the feature off, as in this example:

public class PackagesODataController : ODataController
{
    [EnableQuery(HandleNullPropagation = HandleNullPropagationOption.False)]
    public IQueryable<Package> Get()
    {
        return provider.AsQueryable<Package>();
    }
}

Upcoming features / ideas / bugs / known issues

See Issues on the GitHub project page.

Unsupported Characters in Indexed Properties

Some characters, such as \, :, ? and *, are not handled correctly by Lucene.Net.Linq, even when using a KeywordAnalyzer or equivalent, because they have special meaning to Lucene's query parser.

This means that if you index a DOS-style path such as c:\dos and later try to retrieve documents using the same term, the lookup will not work properly.

These characters are perfectly fine for fields that will be analyzed by a tokenizer that would remove them, but exact matching on the entire value is not possible.

If exact matching is required, these characters should be replaced with suitable substitutes that are not reserved by Lucene.
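
One possible way to do the substitution is sketched below. It is not part of Lucene.Net.Linq: the class name ReservedCharacterEncoder, the choice of _ as an escape character, and the hexadecimal encoding are all assumptions made for illustration. The set of replaced characters covers only the four named above plus the escape character itself, so extend it if your data contains other characters that cause trouble.

```csharp
using System;
using System.Globalization;
using System.Text;

// Hypothetical helper, not part of Lucene.Net.Linq.
public static class ReservedCharacterEncoder
{
    // Assumption: only these four characters, plus the escape character
    // itself, need replacing. Extend this set for your own data.
    private const string Reserved = "\\:?*";
    private const char Escape = '_';

    // Replaces each reserved character with '_' followed by its four-digit
    // hexadecimal UTF-16 code, for example ':' becomes "_003A".
    public static string Encode(string value)
    {
        if (value == null) throw new ArgumentNullException(nameof(value));

        var result = new StringBuilder(value.Length);
        foreach (char c in value)
        {
            if (c == Escape || Reserved.IndexOf(c) >= 0)
                result.Append(Escape)
                      .Append(((int)c).ToString("X4", CultureInfo.InvariantCulture));
            else
                result.Append(c);
        }
        return result.ToString();
    }

    // Reverses Encode. Input that was not produced by Encode may throw a
    // FormatException when the characters after '_' are not hexadecimal.
    public static string Decode(string value)
    {
        if (value == null) throw new ArgumentNullException(nameof(value));

        var result = new StringBuilder(value.Length);
        for (int i = 0; i < value.Length; i++)
        {
            if (value[i] == Escape && i + 4 < value.Length)
            {
                string hex = value.Substring(i + 1, 4);
                result.Append((char)int.Parse(hex, NumberStyles.HexNumber,
                                              CultureInfo.InvariantCulture));
                i += 4; // skip the four hex digits just consumed
            }
            else
            {
                result.Append(value[i]);
            }
        }
        return result.ToString();
    }
}
```

With this sketch, the same Encode call would be applied to a value before it is indexed and to the search term before it is used in a query, so both sides compare the substituted form. Decode recovers the original value when reading documents back. How these calls are wired into your mappings depends on your application.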