Specify a DSL for metadata searches #268

Closed
wants to merge 26 commits into from
Commits
c0615ca
Added a metadataQuery property
christeredvartsen Feb 19, 2014
6af31f4
Store an optional metadata query in the query object
christeredvartsen Feb 19, 2014
73db785
Added a method that prepares and validates the metadata query before …
christeredvartsen Feb 20, 2014
142bd3e
Added a field for normalized metadata that will be used for searching
christeredvartsen Feb 20, 2014
e2871a2
Added some tests for the metadata query. Skips the tests for the Doct…
christeredvartsen Feb 20, 2014
70908be
Updated ChangeLog
christeredvartsen Feb 21, 2014
bc46b8d
Added docs about the new rules regarding metadata keys
christeredvartsen Feb 21, 2014
869a76b
Validate metadata keys. Tests have also been added
christeredvartsen Feb 21, 2014
c966f61
Added some unversioned files. WIP
christeredvartsen Mar 4, 2014
836c4a7
Fixed values in the data provider
christeredvartsen Mar 7, 2014
3cbe04c
Added a setter and getter for the metadata query parser
christeredvartsen Mar 7, 2014
6f8658a
First (seemingly) working copy of the metadata query parser. Not yet …
christeredvartsen Mar 7, 2014
0c45d6a
Changed namespace
christeredvartsen Mar 7, 2014
c904dc1
Moved the test case to the correct dir
christeredvartsen Mar 12, 2014
b5fc7a2
Added a method for fetching an instance of a MongoRegex that is used …
christeredvartsen Mar 12, 2014
381400e
Added some more queries, and enable testing of Doctrine as well (not …
christeredvartsen Mar 13, 2014
79c7c94
Added a couple of indexes and set the NOCASE collation in SQLite
christeredvartsen Mar 13, 2014
e969cf3
Removed the not in operator
christeredvartsen Mar 13, 2014
1496e35
Updated the Doctrine metadata query parser to actually work with the …
christeredvartsen Mar 13, 2014
53f8322
Added support for exists
christeredvartsen Mar 13, 2014
3633c95
Initial commit of the docs for the metadata query
christeredvartsen Mar 15, 2014
b9830bc
Added a method for specifying a metadata query parameter, and expande…
christeredvartsen Mar 15, 2014
9600480
Added the GenerateNormalizedMetadata command
christeredvartsen Mar 15, 2014
1aab43b
Added docs regarding the metadata normalizing
christeredvartsen Mar 15, 2014
9ae0c07
Copied the test data from the Behat suite to the integration suite. F…
christeredvartsen Mar 15, 2014
e1f429b
Added comment
christeredvartsen Mar 17, 2014
1 change: 1 addition & 0 deletions ChangeLog.markdown
Expand Up @@ -7,6 +7,7 @@ __N/A__

* #276: Support checking if the `accessToken` matches the URI "as is" (Peter Rudolfsen)
* #269: Return metadata on write requests against the metadata resource
* #268: Added "search by metadata" to the images resource
* #260: Generate short URLs on demand
* #253: Store the original checksum of added images

21 changes: 21 additions & 0 deletions docs/installation/cli.rst
Expand Up @@ -9,6 +9,27 @@ The binary can be found in one of two places, depending on the :doc:`installatio
:local:
:depth: 1

.. _cli-generate-normalized-metadata:

Generate normalized metadata (MongoDB) - ``generate-normalized-metadata``
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

This command can be used to generate normalized metadata in MongoDB in case you have added metadata prior to upgrading to Imbo-1.2.0 **and** use the MongoDB database adapter. If you use the Doctrine adapter you don't need to run this command. Running the command multiple times is safe, but should not be needed.

The command supports three options, all of which are optional:

* ``--server`` The server to connect to, defaults to ``mongodb://localhost:27017``.
* ``--database`` The database the data exists in, defaults to ``imbo``.
* ``--collection`` The collection of the data, defaults to ``images``.

Example:

.. code-block:: console

./bin/imbo generate-normalized-metadata --server mongodb://somehost --database my-imbo-database

The command requires the `ext-mongo <http://pecl.php.net/package/mongo>`_ extension in order to convert the data. You will be asked to confirm the conversion before it starts. Please make a backup of the data before running this command.

.. _cli-generate-private-key:

Generate a private key - ``generate-private-key``
23 changes: 20 additions & 3 deletions docs/installation/upgrading.rst
Expand Up @@ -18,10 +18,22 @@ Below are the changes you need to be aware of when upgrading to Imbo-1.2.0.
:local:
:depth: 2

Response to metadata write operations
+++++++++++++++++++++++++++++++++++++
Metadata keys
+++++++++++++

Versions prior to 1.2.0 contained the image identifier in the response to ``HTTP POST/PUT/DELETE`` against the :ref:`metadata resource <metadata-resource>`. Starting from Imbo-1.2.0 the response to these requests will contain the metadata attached to the image instead. Read more about the different responses in the :ref:`metadata resource <metadata-resource>` section.
Prior to Imbo-1.2.0 metadata keys could not contain ``::`` if you used the :ref:`Doctrine database adapter <doctrine-database-adapter>`. As of Imbo-1.2.0 this restriction applies regardless of the adapter you are using. Two other rules have also been added:

* Keys cannot contain ``.`` (``foo.bar`` for instance). This is a limitation in MongoDB, and to make it easier for users of Imbo to port data between back-ends Imbo denies this for all adapters.
* Keys cannot start with ``$`` (``$foo`` for instance). This is because of the DSL used by the :ref:`metadata queries <metadata-query>`, added in Imbo-1.2.0.

If you are using the MongoDB adapter, and have keys that contain ``::`` you are encouraged to change these into something else. Likewise, if you are using the Doctrine adapter, and have keys that start with ``$`` or contain a ``.`` you should change these as well for metadata search compatibility.
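
The three key rules above can be checked on the client side before metadata is sent to Imbo. The following is a minimal Python sketch, not Imbo's own validation code: the function names ``metadata_key_errors`` and ``invalid_keys`` are hypothetical, and the sketch assumes that only top-level keys need to be checked.

```python
def metadata_key_errors(key):
    """Return the reasons a metadata key breaks the rules (empty list if valid)."""
    errors = []
    if "::" in key:
        errors.append("contains '::'")
    if "." in key:
        errors.append("contains '.'")
    if key.startswith("$"):
        errors.append("starts with '$'")
    return errors


def invalid_keys(metadata):
    """Map every offending top-level key in a metadata dict to its problems."""
    problems = {}
    for key in metadata:
        errors = metadata_key_errors(key)
        if errors:
            problems[key] = errors
    return problems
```

Running ``invalid_keys`` over existing metadata before upgrading is one way to find the keys that should be renamed.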

Metadata queries
++++++++++++++++

Imbo-1.2.0 introduces a new metadata query feature that lets you search for images by querying the metadata attached to them. Read more about the feature in the :ref:`metadata-query` section.

If you have added metadata to images prior to upgrading to Imbo-1.2.0 **and** use the :ref:`MongoDB database adapter <mongodb-database-adapter>` you will need to update some metadata in the collection used by Imbo. The :doc:`command line script <cli>` that ships with Imbo can be used to convert the data for you, more specifically the :ref:`generate-normalized-metadata <cli-generate-normalized-metadata>` command. If you use the :ref:`Doctrine database adapter <doctrine-database-adapter>` you do not need to worry about this.

Original checksum
+++++++++++++++++
Expand Up @@ -59,3 +71,8 @@ Short image URLs
++++++++++++++++

In versions prior to Imbo-1.2.0 short image URLs were created automatically whenever a user agent requested the image resource (with or without transformations), and sent in the response as the ``X-Imbo-ShortUrl`` header. This is no longer done automatically. Refer to the :ref:`shorturls-resource` section for more information on how to generate short URLs from this version on.

Response to metadata write operations
+++++++++++++++++++++++++++++++++++++

Versions prior to 1.2.0 contained the image identifier in the response to ``HTTP POST/PUT/DELETE`` against the :ref:`metadata resource <metadata-resource>`. Starting from Imbo-1.2.0 the response to these requests will contain the metadata attached to the image instead. Read more about the different responses in the :ref:`metadata resource <metadata-resource>` section.
1 change: 1 addition & 0 deletions docs/spelling_wordlist.txt
Expand Up @@ -36,3 +36,4 @@ Imagick
Behat
Lighttpd
ETag
Nøgne
213 changes: 211 additions & 2 deletions docs/usage/api.rst
Expand Up @@ -10,7 +10,7 @@ In this section you will find information on the different resources Imbo's REST

.. contents:: Available resources
:local:
:depth: 1
:depth: 3

.. _index-resource:

Expand Up @@ -317,6 +317,9 @@ The images resource can also be used to gather information on which images a use
``originalChecksums[]``
An array of the original image checksums to filter the results by.

``q``
Perform a metadata query.

.. code-block:: bash

curl "http://imbo/users/<user>/images.json?limit=1&metadata=1"
Expand Up @@ -374,8 +377,214 @@ The ``images`` list contains image objects. Each object has the following fields

* 200 OK
* 304 Not modified
* 400 Invalid metadata query
* 404 Public key not found

.. _metadata-query:

Metadata query
``````````````

When searching for images you might want to do this by querying the metadata attached to the images. Imbo's metadata DSL is quite similar to MongoDB's, but it only supports a subset of the features supported by MongoDB (and other DBMSs). The query is a JSON-encoded object including ``key => value`` matches and/or a combination of the supported operators, sent to Imbo in the ``q`` query parameter. This section lists all operators and includes a number of examples showing you how to find images using the metadata query.

.. note:: The same query **might** return slightly different results depending on the backend you use for the metadata.
.. warning:: When a malformed metadata query is specified Imbo will respond with ``400 Invalid metadata query``.
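
Because the query is a JSON-encoded object sent in the ``q`` parameter, it has to be URL-encoded before it is added to the request URI. The following Python sketch shows one way a client could build such a URI. The helper name ``images_query_url`` is hypothetical, the host ``http://imbo`` is taken from the ``curl`` example for the images resource, and the user name is a placeholder.

```python
import json
from urllib.parse import urlencode


def images_query_url(base_url, user, query, **params):
    """Build an images resource URI with the metadata query in ``q``."""
    # Compact separators keep the encoded query short; ensure_ascii=False
    # lets urlencode percent-encode non-ASCII characters as UTF-8.
    q = json.dumps(query, separators=(",", ":"), ensure_ascii=False)
    all_params = dict(params, q=q)
    return f"{base_url}/users/{user}/images.json?{urlencode(all_params)}"


# Placeholder user and query, for illustration only
url = images_query_url("http://imbo", "someuser", {"age": {"$gt": 35}}, limit=1)
```

Any HTTP client can then issue a ``GET`` request against the resulting URI. A ``400`` response signals that Imbo considered the query malformed.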

.. contents:: Supported operators and query types
:local:
:depth: 1

Key/value match
_______________

The simplest form of a metadata query is a ``key => value`` match. If the query contains more than one key/value match, the expressions are AND-ed together.

.. code-block:: json

{"key":"value","otherkey":"othervalue"}

The above search would result in images that have the metadata key ``key`` set to ``value`` **and** ``otherkey`` set to ``othervalue``.


Greater than - ``$gt``
______________________

This operator can be used to check for values greater than the value specified.

.. code-block:: json

{"age":{"$gt":35}}

Greater than or equal - ``$gte``
________________________________

Check for values greater than or equal to the value specified.

.. code-block:: json

{"age":{"$gte":35}}

Less than - ``$lt``
___________________

This operator can be used to check for values less than the value specified.

.. code-block:: json

{"age":{"$lt":35}}

Less than or equal - ``$lte``
_____________________________

Check for values less than or equal to the value specified.

.. code-block:: json

{"age":{"$lte":35}}

Not equal - ``$ne``
___________________

Matches values that are not equal to the value specified.

.. code-block:: json

{"name":{"$ne":"christer"}}

In - ``$in``
____________

Look for values that appear in the specified set.

.. code-block:: json

{"styles":{"$in":["IPA","Imperial Stout","Lambic"]}}

Wild card - ``$wildcard``
_________________________

Perform a wild card search. The ``*`` is used to match zero or more characters, and ``_`` is used to match a single character.

.. code-block:: json

{"style":{"$wildcard":"_PA"}}

would match "IPA" and "APA", but neither "DIPA" nor "PA".

.. code-block:: json

{"style":{"$wildcard":"*_PA"}}

would match "IPA", "APA" and "DIPA" but not "PA".

.. code-block:: json

{"style":{"$wildcard":"*PA"}}

would match "IPA", "APA", "DIPA" and "PA".

Wild cards can appear multiple times in the expression.
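
The wild card rules can be illustrated by translating an expression into a regular expression. This is a Python sketch of the matching rules only, not the code Imbo runs against its backends. The function names are hypothetical, and the sketch assumes case-sensitive matching, which may differ from what a given backend does.

```python
import re


def wildcard_to_regex(pattern):
    """Translate a ``$wildcard`` expression into a compiled regular expression."""
    parts = []
    for char in pattern:
        if char == "*":
            parts.append(".*")  # zero or more characters
        elif char == "_":
            parts.append(".")  # exactly one character
        else:
            parts.append(re.escape(char))  # everything else is literal
    return re.compile("".join(parts), re.DOTALL)


def wildcard_match(pattern, value):
    """Return True if the whole value matches the wild card expression."""
    return wildcard_to_regex(pattern).fullmatch(value) is not None
```

With these helpers ``wildcard_match("_PA", "IPA")`` is true and ``wildcard_match("_PA", "DIPA")`` is false, in line with the examples above.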


Exists - ``$exists``
____________________

Returns images where a specific metadata **key** exists.

.. code-block:: json

{"author":{"$exists":true}}

would return images that have a metadata key called ``author``.

.. code-block:: json

{"author":{"$exists":false}}

would return images that do not have a metadata key called ``author``.

And - ``$and``
______________

This operator can be used to AND expressions together. This is the default behavior when several clauses are specified in the query.

.. code-block:: json

{"key":"value","otherkey":"othervalue"}

is the same as:

.. code-block:: json

{"$and":[{"key":"value"},{"otherkey":"othervalue"}]}
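
The equivalence between the implicit and the explicit form can be expressed as a small rewrite. The Python sketch below uses a hypothetical helper, ``to_explicit_and``, to turn a query with several top-level clauses into its ``$and`` form. It only illustrates the equivalence and is not part of Imbo.

```python
def to_explicit_and(query):
    """Rewrite a multi-clause query into the equivalent explicit ``$and`` form."""
    if len(query) <= 1:
        # A single clause needs no wrapping
        return query
    return {"$and": [{key: value} for key, value in query.items()]}
```

Applied to ``{"key": "value", "otherkey": "othervalue"}`` the helper returns the explicit ``$and`` query shown above.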

Or - ``$or``
____________

This operator can be used to OR expressions together.

.. code-block:: json

{"$or":[{"key":"value"},{"otherkey":"othervalue"}]}

would fetch images that have a key named ``key`` with the value ``value`` and/or a key named ``otherkey`` with the value ``othervalue``.

Using several operators in one query
____________________________________

All of the above operators can be combined into one query. Consider a collection of images of beers, each tagged with the name of the brewery, the name of the beer, the style of the beer and the ABV. Suppose we want every image of a beer that is not called Wit and that either comes from Nøgne Ø (regardless of style and ABV), or comes from one of two other breweries, belongs to a given set of styles and has an ABV of at least 5.5. The query could look like this (formatted for easier reading):

.. code-block:: json

    {
        "name": {
            "$ne": "Wit"
        },
        "$or": [
            {
                "brewery": "Nøgne Ø"
            },
            {
                "$and": [
                    {
                        "abv": {
                            "$gte": 5.5
                        }
                    },
                    {
                        "style": {
                            "$in": [
                                "IPA",
                                "Imperial Stout"
                            ]
                        }
                    },
                    {
                        "brewery": {
                            "$in": [
                                "HaandBryggeriet",
                                "Ægir"
                            ]
                        }
                    }
                ]
            }
        ]
    }

Keep in mind that large complex queries against large image collections can take a while to finish, and might cause performance issues on the Imbo server(s).
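
To make the semantics of a combined query concrete, the following Python sketch evaluates a query against metadata held in memory. It is not how Imbo executes queries, since Imbo translates them for the database backend. The function names and the sample records are invented for illustration. Letting ``$ne`` match when the key is missing is an assumption that follows MongoDB's behavior.

```python
import re

_MISSING = object()  # sentinel: the key is not present in the metadata

_COMPARE = {
    "$gt": lambda a, b: a > b,
    "$gte": lambda a, b: a >= b,
    "$lt": lambda a, b: a < b,
    "$lte": lambda a, b: a <= b,
}


def _wildcard(pattern, value):
    """``*`` matches zero or more characters, ``_`` exactly one."""
    regex = "".join(
        ".*" if c == "*" else "." if c == "_" else re.escape(c) for c in pattern
    )
    return isinstance(value, str) and re.fullmatch(regex, value, re.DOTALL) is not None


def match_field(value, condition):
    """Check one metadata value against a plain value or an operator object."""
    if not isinstance(condition, dict):
        return value is not _MISSING and value == condition
    for op, operand in condition.items():
        if op == "$exists":
            ok = (value is not _MISSING) == operand
        elif op == "$ne":
            ok = value is _MISSING or value != operand  # assumption, see lead-in
        elif value is _MISSING:
            ok = False
        elif op == "$in":
            ok = value in operand
        elif op == "$wildcard":
            ok = _wildcard(operand, value)
        elif op in _COMPARE:
            try:
                ok = _COMPARE[op](value, operand)
            except TypeError:  # e.g. comparing a string with a number
                ok = False
        else:
            raise ValueError(f"unsupported operator: {op}")
        if not ok:
            return False
    return True


def matches(metadata, query):
    """Top-level clauses are AND-ed; ``$and`` and ``$or`` take lists of sub-queries."""
    for key, condition in query.items():
        if key == "$and":
            ok = all(matches(metadata, sub) for sub in condition)
        elif key == "$or":
            ok = any(matches(metadata, sub) for sub in condition)
        else:
            ok = match_field(metadata.get(key, _MISSING), condition)
        if not ok:
            return False
    return True


QUERY = {
    "name": {"$ne": "Wit"},
    "$or": [
        {"brewery": "Nøgne Ø"},
        {"$and": [
            {"abv": {"$gte": 5.5}},
            {"style": {"$in": ["IPA", "Imperial Stout"]}},
            {"brewery": {"$in": ["HaandBryggeriet", "Ægir"]}},
        ]},
    ],
}

# Invented sample metadata, one dict per image
IMAGES = [
    {"name": "Stout A", "brewery": "Nøgne Ø", "style": "Imperial Stout", "abv": 9.0},
    {"name": "Wit", "brewery": "Nøgne Ø", "style": "Witbier", "abv": 4.5},
    {"name": "IPA B", "brewery": "Ægir", "style": "IPA", "abv": 6.5},
    {"name": "IPA C", "brewery": "Ægir", "style": "IPA", "abv": 4.7},
    {"name": "Porter D", "brewery": "HaandBryggeriet", "style": "Porter", "abv": 6.5},
]

hits = [image["name"] for image in IMAGES if matches(image, QUERY)]
```

With this sample data ``hits`` contains ``Stout A`` (any beer from Nøgne Ø not called Wit) and ``IPA B`` (right brewery, style and ABV). The other three records fail on the name, the ABV or the style.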

.. _image-resource:

Image resource - ``/users/<user>/images/<image>``
Expand Up @@ -802,7 +1011,7 @@ The value of the ``ETag`` header is simply the MD5 sum of the content in the res
Last-Modified
+++++++++++++

Imbo also includes a ``Last-Modified`` response header for resources that has a know last modification date, and these resources are:
Imbo also includes a ``Last-Modified`` response header for resources that have a known last modification date, and these resources are:

* :ref:`user-resource`: The date of when the user last added or deleted an image, or manipulated the metadata of an image. If the user doesn't have any images yet, the value of this date will be the current timestamp.
* :ref:`images-resource`: The date of when the user last modified an image in the collection (either the image itself, or metadata attached to the image).