Numerous POD updates and corrections thanks to Tim Bunce
clintongormley committed Sep 22, 2012
1 parent 8038d31 commit f67ea3d
Showing 6 changed files with 215 additions and 126 deletions.
2 changes: 1 addition & 1 deletion lib/Elastic/Manual.pod
@@ -74,7 +74,7 @@ Fine-tuning how the attributes in your Elastic::Doc classes are indexed.

Making attributes in your Elastic::Doc classes unique.

=head2 L<Elastic::Manual::Reindexing>
=head2 L<Elastic::Manual::Reindex>

How to reindex your data when you make changes to your attributes.

98 changes: 88 additions & 10 deletions lib/Elastic/Manual/Attributes.pod
@@ -528,8 +528,40 @@ but uses more disk space, and the field needs to be setup correctly before use:

=head1 NUMERIC FIELDS

The following keyword applies only to fields of L</type> C<integer>,
C<long>, C<float>, C<double>, C<short> or C<byte>.
Numeric fields can have any of the following types:

=over

=item byte

Range: -128 to 127

=item short

Range: -32,768 to 32,767

=item integer

Range: -2,147,483,648 to 2,147,483,647

=item long

Range: -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807

=item float

Single-precision 32-bit IEEE 754 floating point

=item double

Double-precision 64-bit IEEE 754 floating point

=back

(See L<http://docs.oracle.com/javase/specs/jls/se7/html/jls-4.html#jls-4.2.3>
for more on single or double value ranges.)
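
The integer ranges listed above follow from the bit width of each type. As a minimal sketch (the C<%bits> table and the C<int_range> helper are hypothetical and not part of L<Elastic::Model>), the bounds of a signed two's-complement integer can be derived as follows:

```perl
use strict;
use warnings;
use Math::BigInt;    # core module; avoids floating-point rounding at 64 bits

# Bit widths of the integer types (illustrative table, not an Elastic::Model API)
my %bits = ( byte => 8, short => 16, integer => 32, long => 64 );

# Return (min, max) for a signed two's-complement integer of the given type
sub int_range {
    my ($type) = @_;
    my $half = Math::BigInt->new(2)->bpow( $bits{$type} - 1 );
    return ( $half->copy->bneg, $half->copy->bsub(1) );
}

for my $type (qw(byte short integer long)) {
    my ( $min, $max ) = int_range($type);
    print "$type: $min to $max\n";
}
```

L<Math::BigInt> is used here because plain Perl numbers may lose precision near the limits of a 64-bit C<long>.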

The following keyword applies only to numeric fields.

You can also use L</index>, L</include_in_all>, L</boost>, L</multi>,
L</index_name>, L</unique_key>, L</store> and L</null_value>.
@@ -548,7 +580,8 @@ memory used.
=head1 DATE FIELDS

Dates in ElasticSearch are stored internally as C<long> values containing
B<milli>seconds since the epoch.
B<milli>seconds since the epoch. The maximum date range is 292269055 BC to
292278994 AD.
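
The upper bound of that range can be checked with some rough arithmetic. The sketch below assumes an average Gregorian year of 365.2425 days and an epoch of 1970; the C<years_after_epoch> helper is purely illustrative. The lower (BC) bound depends on how the calendar treats pre-Gregorian dates, so it is not reproduced here.

```perl
use strict;
use warnings;

# Average length of a Gregorian year in milliseconds (365.2425 days)
my $ms_per_year = 365.2425 * 24 * 60 * 60 * 1000;

# Largest value of a signed 64-bit long, as a floating-point approximation
my $max_long_ms = 2**63;

# Whole years that fit into the positive half of a long holding milliseconds
sub years_after_epoch {
    return int( $max_long_ms / $ms_per_year );
}

my $last_year = 1970 + years_after_epoch();
print "Latest representable year: about $last_year AD\n";
```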

The following keywords apply only to fields of L</"type"> C<date>.
You can also use L</index>, L</include_in_all>, L</boost>, L</multi>,
@@ -740,6 +773,12 @@ if an unknown field is included).

=head2 path

The C<path> keyword controls how the attribute names of an object are flattened
into field names in ElasticSearch. It can be set to C<full> (the default) or
C<just_name>.

For instance, given the following example:

package MyApp::Types;

use MooseX::Types -declare 'FullName';
@@ -754,6 +793,43 @@ if an unknown field is included).
];


package MyApp::Couple;

use Moose;
use MyApp::Types qw(FullName);

has 'husband' => (
is => 'ro',
isa => FullName,
);

has 'wife' => (
is => 'ro',
isa => FullName,
);

A C<MyApp::Couple> object may look like this:

{
husband => { first => 'John', last => 'Smith' },
wife => { first => 'Mary', last => 'Smith' }
}

By default, this data would be flattened and stored as follows:

FIELD NAME | VALUES
------------------|---------
husband.first | john
husband.last | smith
wife.first | mary
wife.last | smith

These field names, or "paths", can also have the C<type> name prepended,
so C<husband.first> could also be referred to as C<couple.husband.first>.

The C<path> keyword can be used to control the construction of this path. For
instance, if C<MyApp::Couple> was defined as:

package MyApp::Couple;

use Moose;
@@ -768,15 +844,17 @@ if an unknown field is included).
has 'wife' => (
is => 'ro',
isa => FullName,
path => 'full'
path => 'just_name'
);

The C<path> keyword accepts the values C<full> and C<just_name>. By default,
nested attributes can be referenced by just their name, or by their path,
using dot-notation, eg C<wife.first>, or C<couple.wife.first>.
then the values would be indexed as:

FIELD NAME | VALUES
------------------|---------------
first | john, mary
last | smith, smith

The C<path> setting, which defaults to C<full> (eg C<wife.first>) can be
set to C<just_name>, in which case eg the name C<husband.first> won't be defined.
... also accessible as C<couple.first> and C<couple.last>.
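
The two flattening behaviours can be imitated in plain Perl. This is only a sketch of the idea, not how L<Elastic::Model> or ElasticSearch implements it: C<flatten_fields> is a hypothetical helper, it applies one C<path> mode to the whole structure rather than per attribute, and it lowercases values merely to mimic the analyzed output shown in the tables.

```perl
use strict;
use warnings;

# Hypothetical helper: flatten nested hashes into { field_name => [values] }.
# $path_mode is 'full' (default) or 'just_name'.
sub flatten_fields {
    my ( $data, $path_mode, $prefix ) = @_;
    $path_mode ||= 'full';
    my %fields;
    for my $key ( sort keys %$data ) {
        my $value = $data->{$key};
        my $name  = defined $prefix ? "$prefix.$key" : $key;
        if ( ref $value eq 'HASH' ) {
            # 'just_name' drops the parent name from the child field names
            my $child_prefix = $path_mode eq 'just_name' ? undef : $name;
            my $nested = flatten_fields( $value, $path_mode, $child_prefix );
            push @{ $fields{$_} }, @{ $nested->{$_} } for keys %$nested;
        }
        else {
            push @{ $fields{$name} }, lc $value;
        }
    }
    return \%fields;
}

my $couple = {
    husband => { first => 'John', last => 'Smith' },
    wife    => { first => 'Mary', last => 'Smith' },
};

for my $mode (qw(full just_name)) {
    my $fields = flatten_fields( $couple, $mode );
    print "path => '$mode'\n";
    print "  $_: @{ $fields->{$_} }\n" for sort keys %$fields;
}
```

With C<just_name>, values from C<husband> and C<wife> end up in the same C<first> and C<last> fields, which is why those fields hold two values each.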

The C<path> keyword can also be combined with the L</"index_name"> keyword.

@@ -848,7 +926,7 @@ but refers to the top-most document, rather than the direct parent.

You can have attributes in one class that refer to another L<Elastic::Doc>
class. For instance, a C<MyApp::Post> object could have, as an attribute,
the C<MyApp::User> object to whome the post belongs.
the C<MyApp::User> object to whom the post belongs.

You may want to store just the L<Elastic::Model::UID> of the C<user> object,
or you may want to include the user's name and email address, so that you
23 changes: 14 additions & 9 deletions lib/Elastic/Manual/Attributes/Unique.pod
@@ -6,18 +6,23 @@ package Elastic::Manual::Attributes::Unique;

=head1 INTRODUCTION

The only unique key available in ElasticSearch is the document ID. Typically,
if you want a document to be unique, you use the unique value as the ID.
The only unique constraint available in ElasticSearch is the document ID.
Typically, if you want a document to be unique, you use the unique value as
the ID.

However, sometimes you don't want to do this. For instance, you may want to
use the email address as a unique identifier for your user accounts, but
use the email address as a unique constraint for your user accounts, but
you also want to be able to link to a user account without exposing their
email address, and let the user change their email address without having
to update the ID of their user account wherever it is used.

In this case, we want the ID of the user document to be auto-generated, but
we also want the value of the C<email> attribute to be unique.

=head1 STORING UNIQUE KEYS

L<Elastic::Model> uses L<ElasticSearchX::UniqueKey> to enable unique key
tracking. Your unique keys are tracked in a special index which defaults to
L<Elastic::Model> uses L<ElasticSearchX::UniqueKey> to enable unique constraints.
Your unique attributes are tracked in a special index which defaults to
C<"unique_key">, but which can be specified in your Model class:

package MyApp;
@@ -30,10 +35,10 @@ C<"unique_key">, but which can be specified in your Model class:

The index will be created automatically.

=head1 MAKING AN ATTRIBUTE UNIQUE
=head1 APPLYING UNIQUE CONSTRAINTS

Any attribute whose value is a string (including numeric attributes) can
be made unique:
have a unique constraint applied:

has 'email' => (
is => 'rw',
@@ -52,8 +57,8 @@ C<myapp_email>.

=head1 COMPOUND KEYS

It is easy to make a compound key unique. For instance, to combine the
attributes C<account_type> and C<account_name> you could do:
It is easy to apply a unique constraint to a compound key. For instance, to
combine the attributes C<account_type> and C<account_name> you could do:

has 'account_type' => (
is => 'rw',
6 changes: 3 additions & 3 deletions lib/Elastic/Manual/Scaling.pod
@@ -136,7 +136,7 @@ for these domains, you have two options:

=head3 Manually specify domain names

Either specify the extra domains in you namespace declaration:
You can manually specify the extra domains in your namespace declaration:

has_namespace 'myapp' => {
user => 'MyApp::User',
@@ -185,8 +185,8 @@ for a presentation discussing the strategies described below.
=head2 Overallocation - the "Kagillion shards" solution

The first scaling response to I<"our new business-started-on-a-shoestring will
be HUGE!!!"> is: I<"Lets create an index with 1,000 shards and run it on a
ZX Spectrum!">
be HUGE!!!"> is: I<"Let's create an index with 10,000 shards and run it on an
Amazon EC2 micro instance!">

Unfortunately, this approach doesn't work. Each shard consumes resources:
memory, filehandles, CPU. Your EC2 micro instance won't handle 10,000 shards!
2 changes: 1 addition & 1 deletion lib/Elastic/Manual/Searching.pod
@@ -219,7 +219,7 @@ shouldn't make your scroll timeouts longer than they need to be. The
default is 1 minute, but you may be able to reduce that considerably depending
on your use case.

Of coure, sometimes consistency won't matter - it may be perfectly reasonable to
Of course, sometimes consistency won't matter - it may be perfectly reasonable to
show duplicates in keyword searches, but less reasonable to have duplicate or
missing items in a list.
