Merge branch 'master' into 1358-datastore-ui
amercader committed Dec 5, 2013
2 parents e7bc7ad + 82cdbb8 commit b3ad6bd
Showing 17 changed files with 240 additions and 116 deletions.
4 changes: 3 additions & 1 deletion CHANGELOG.rst
@@ -12,7 +12,9 @@ v2.2

API changes and deprecations:


* The Solr schema file is now always named ``schema.xml`` regardless of the
CKAN version. Old schema files have been kept for backwards compatibility
but users are encouraged to point to the new unified one.
* The `ckan.api_url` has been completely removed and it can no longer be used
* The edit() and after_update() methods of IPackageController plugins are now
called when updating a resource using the web frontend or the
3 changes: 1 addition & 2 deletions bin/travis-run-tests
@@ -2,8 +2,7 @@

# Configure Solr
echo "NO_START=0\nJETTY_HOST=127.0.0.1\nJETTY_PORT=8983\nJAVA_HOME=$JAVA_HOME" | sudo tee /etc/default/jetty
# FIXME the solr schema cannot be hardcoded as it is dependent on the ckan version
sudo cp ckan/config/solr/schema-2.0.xml /etc/solr/conf/schema.xml
sudo cp ckan/config/solr/schema.xml /etc/solr/conf/schema.xml
sudo service jetty restart

# Run mocha front-end tests
29 changes: 0 additions & 29 deletions ckan/config/solr/CHANGELOG.txt

This file was deleted.

36 changes: 9 additions & 27 deletions ckan/config/solr/README.txt
@@ -1,30 +1,12 @@
CKAN SOLR schemas
=================
CKAN Solr schema
================

This folder contains the latest and previous versions of the SOLR XML
schema files used by CKAN. These can be use on the SOLR server to
override the default SOLR schema. Please note that not all schemas are
backwards compatible with old CKAN versions. Check the CHANGELOG.txt file
in this same folder to check which version of the schema should you use
depending on the CKAN version you are using.
This folder contains the Solr schema file used by CKAN (schema.xml).

Developers, when pushing changes to the SOLR schema:
Starting from 2.2 this is the only file that should be used by users and
modified by devs. The rest of files (schema-{version}.xml) are kept for
backwards compatibility purposes and should not be used, as they might be
removed in future versions.

* Note that updates on the schema are only release based, i.e. all changes
in the schema between releases will be part of the same new version of
the schema.

* Name the new version of the file using the following convention::

schema-<version>.xml

* Update the `version` attribute of the `schema` tag in the new file::

<schema name="ckan" version="<version>">

* Update the SUPPORTED_SCHEMA_VERSIONS list in `ckan/lib/search/__init__.py`
Consider if the changes introduced are or are not compatible with
previous schema versions.

* Update the CHANGELOG.txt file with the new version, the CKAN version
required and changes made to the schema.
When upgrading CKAN, always check the CHANGELOG on each release to see if
you need to update the schema file and reindex your datasets.
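
As a rough illustration of the README guidance above (not part of this commit), a deployment script could flag configurations that still point at one of the deprecated versioned files. The helper name `is_deprecated_schema` and the regular expression are assumptions made for this sketch; the pattern only mirrors the `schema-{version}.xml` naming the README describes.

```python
import os
import re

# Hypothetical helper, not part of CKAN. The pattern mirrors the
# schema-{version}.xml naming described in the README (e.g. schema-2.0.xml).
_VERSIONED_SCHEMA = re.compile(r"^schema-\d+(\.\d+)*\.xml$")


def is_deprecated_schema(path):
    """Return True if `path` names a versioned (deprecated) schema file."""
    return bool(_VERSIONED_SCHEMA.match(os.path.basename(path)))
```

Because Solr's `schema.xml` is often a symlink into the CKAN source tree, a caller would probably want to pass `os.path.realpath(...)` of the Solr-side path so that the link target, not the link name, is checked.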
8 changes: 8 additions & 0 deletions ckan/config/solr/schema-1.2.xml
@@ -1,4 +1,12 @@
<?xml version="1.0" encoding="UTF-8" ?>
<!--
THIS FILE IS DEPRECATED
Starting from CKAN 2.2 the Solr schema file that should be used is
`schema.xml`.
This file is maintained for backwards compatibility purposes but might
be removed in future vesions.
-->
<!--
Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
8 changes: 8 additions & 0 deletions ckan/config/solr/schema-1.3.xml
@@ -1,4 +1,12 @@
<?xml version="1.0" encoding="UTF-8" ?>
<!--
THIS FILE IS DEPRECATED
Starting from CKAN 2.2 the Solr schema file that should be used is
`schema.xml`.
This file is maintained for backwards compatibility purposes but might
be removed in future vesions.
-->
<!--
Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
8 changes: 8 additions & 0 deletions ckan/config/solr/schema-1.4.xml
@@ -1,4 +1,12 @@
<?xml version="1.0" encoding="UTF-8" ?>
<!--
THIS FILE IS DEPRECATED
Starting from CKAN 2.2 the Solr schema file that should be used is
`schema.xml`.
This file is maintained for backwards compatibility purposes but might
be removed in future vesions.
-->
<!--
Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
8 changes: 8 additions & 0 deletions ckan/config/solr/schema-2.0.xml
@@ -1,4 +1,12 @@
<?xml version="1.0" encoding="UTF-8" ?>
<!--
THIS FILE IS DEPRECATED
Starting from CKAN 2.2 the Solr schema file that should be used is
`schema.xml`.
This file is maintained for backwards compatibility purposes but might
be removed in future vesions.
-->
<!--
Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
165 changes: 165 additions & 0 deletions ckan/config/solr/schema.xml
@@ -0,0 +1,165 @@
<?xml version="1.0" encoding="UTF-8" ?>
<!--
Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
this work for additional information regarding copyright ownership.
The ASF licenses this file to You under the Apache License, Version 2.0
(the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->

<schema name="ckan" version="2.0">

<types>
<fieldType name="string" class="solr.StrField" sortMissingLast="true" omitNorms="true"/>
<fieldType name="boolean" class="solr.BoolField" sortMissingLast="true" omitNorms="true"/>
<fieldtype name="binary" class="solr.BinaryField"/>
<fieldType name="int" class="solr.TrieIntField" precisionStep="0" omitNorms="true" positionIncrementGap="0"/>
<fieldType name="float" class="solr.TrieFloatField" precisionStep="0" omitNorms="true" positionIncrementGap="0"/>
<fieldType name="long" class="solr.TrieLongField" precisionStep="0" omitNorms="true" positionIncrementGap="0"/>
<fieldType name="double" class="solr.TrieDoubleField" precisionStep="0" omitNorms="true" positionIncrementGap="0"/>
<fieldType name="tint" class="solr.TrieIntField" precisionStep="8" omitNorms="true" positionIncrementGap="0"/>
<fieldType name="tfloat" class="solr.TrieFloatField" precisionStep="8" omitNorms="true" positionIncrementGap="0"/>
<fieldType name="tlong" class="solr.TrieLongField" precisionStep="8" omitNorms="true" positionIncrementGap="0"/>
<fieldType name="tdouble" class="solr.TrieDoubleField" precisionStep="8" omitNorms="true" positionIncrementGap="0"/>
<fieldType name="date" class="solr.TrieDateField" omitNorms="true" precisionStep="0" positionIncrementGap="0"/>
<fieldType name="tdate" class="solr.TrieDateField" omitNorms="true" precisionStep="6" positionIncrementGap="0"/>

<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
<analyzer type="index">
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.SnowballPorterFilterFactory" language="English" protected="protwords.txt"/>
<filter class="solr.ASCIIFoldingFilterFactory"/>
</analyzer>
<analyzer type="query">
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
<filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.SnowballPorterFilterFactory" language="English" protected="protwords.txt"/>
<filter class="solr.ASCIIFoldingFilterFactory"/>
</analyzer>
</fieldType>


<!-- A general unstemmed text field - good if one does not know the language of the field -->
<fieldType name="textgen" class="solr.TextField" positionIncrementGap="100">
<analyzer type="index">
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="0"/>
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
<analyzer type="query">
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
<filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="0"/>
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
</fieldType>
</types>


<fields>
<field name="index_id" type="string" indexed="true" stored="true" required="true" />
<field name="id" type="string" indexed="true" stored="true" required="true" />
<field name="site_id" type="string" indexed="true" stored="true" required="true" />
<field name="title" type="text" indexed="true" stored="true" />
<field name="entity_type" type="string" indexed="true" stored="true" omitNorms="true" />
<field name="dataset_type" type="string" indexed="true" stored="true" />
<field name="state" type="string" indexed="true" stored="true" omitNorms="true" />
<field name="name" type="string" indexed="true" stored="true" omitNorms="true" />
<field name="revision_id" type="string" indexed="true" stored="true" omitNorms="true" />
<field name="version" type="string" indexed="true" stored="true" />
<field name="url" type="string" indexed="true" stored="true" omitNorms="true" />
<field name="ckan_url" type="string" indexed="true" stored="true" omitNorms="true" />
<field name="download_url" type="string" indexed="true" stored="true" omitNorms="true" />
<field name="notes" type="text" indexed="true" stored="true"/>
<field name="author" type="textgen" indexed="true" stored="true" />
<field name="author_email" type="textgen" indexed="true" stored="true" />
<field name="maintainer" type="textgen" indexed="true" stored="true" />
<field name="maintainer_email" type="textgen" indexed="true" stored="true" />
<field name="license" type="string" indexed="true" stored="true" />
<field name="license_id" type="string" indexed="true" stored="true" />
<field name="ratings_count" type="int" indexed="true" stored="false" />
<field name="ratings_average" type="float" indexed="true" stored="false" />
<field name="tags" type="string" indexed="true" stored="true" multiValued="true"/>
<field name="groups" type="string" indexed="true" stored="true" multiValued="true"/>

<field name="capacity" type="string" indexed="true" stored="true" multiValued="false"/>

<field name="res_description" type="textgen" indexed="true" stored="true" multiValued="true"/>
<field name="res_format" type="string" indexed="true" stored="true" multiValued="true"/>
<field name="res_url" type="string" indexed="true" stored="true" multiValued="true"/>

<!-- catchall field, containing all other searchable text fields (implemented
via copyField further on in this schema -->
<field name="text" type="text" indexed="true" stored="false" multiValued="true"/>
<field name="urls" type="text" indexed="true" stored="false" multiValued="true"/>

<field name="depends_on" type="text" indexed="true" stored="false" multiValued="true"/>
<field name="dependency_of" type="text" indexed="true" stored="false" multiValued="true"/>
<field name="derives_from" type="text" indexed="true" stored="false" multiValued="true"/>
<field name="has_derivation" type="text" indexed="true" stored="false" multiValued="true"/>
<field name="links_to" type="text" indexed="true" stored="false" multiValued="true"/>
<field name="linked_from" type="text" indexed="true" stored="false" multiValued="true"/>
<field name="child_of" type="text" indexed="true" stored="false" multiValued="true"/>
<field name="parent_of" type="text" indexed="true" stored="false" multiValued="true"/>
<field name="views_total" type="int" indexed="true" stored="false"/>
<field name="views_recent" type="int" indexed="true" stored="false"/>
<field name="resources_accessed_total" type="int" indexed="true" stored="false"/>
<field name="resources_accessed_recent" type="int" indexed="true" stored="false"/>

<field name="metadata_created" type="date" indexed="true" stored="true" multiValued="false"/>
<field name="metadata_modified" type="date" indexed="true" stored="true" multiValued="false"/>

<field name="indexed_ts" type="date" indexed="true" stored="true" default="NOW" multiValued="false"/>

<!-- Copy the title field into titleString, and treat as a string
(rather than text type). This allows us to sort on the titleString -->
<field name="title_string" type="string" indexed="true" stored="false" />

<field name="data_dict" type="string" indexed="false" stored="true" />
<field name="validated_data_dict" type="string" indexed="false" stored="true" />

<field name="_version_" type="string" indexed="true" stored="true"/>

<dynamicField name="*_date" type="date" indexed="true" stored="true" multiValued="false"/>

<dynamicField name="extras_*" type="text" indexed="true" stored="true" multiValued="false"/>
<dynamicField name="vocab_*" type="string" indexed="true" stored="true" multiValued="true"/>
<dynamicField name="*" type="string" indexed="true" stored="false"/>
</fields>

<uniqueKey>index_id</uniqueKey>
<defaultSearchField>text</defaultSearchField>
<solrQueryParser defaultOperator="AND"/>

<copyField source="url" dest="urls"/>
<copyField source="ckan_url" dest="urls"/>
<copyField source="download_url" dest="urls"/>
<copyField source="res_url" dest="urls"/>
<copyField source="extras_*" dest="text"/>
<copyField source="vocab_*" dest="text"/>
<copyField source="urls" dest="text"/>
<copyField source="name" dest="text"/>
<copyField source="title" dest="text"/>
<copyField source="text" dest="text"/>
<copyField source="license" dest="text"/>
<copyField source="notes" dest="text"/>
<copyField source="tags" dest="text"/>
<copyField source="groups" dest="text"/>
<copyField source="res_description" dest="text"/>
<copyField source="maintainer" dest="text"/>
<copyField source="author" dest="text"/>

</schema>
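
To make the structure of the new unified schema easier to inspect, the following minimal sketch (not part of this commit) reads the `name` and `version` attributes of the `<schema>` element and lists which fields are copied into the catch-all `text` field. The function name `summarize_schema` and the abbreviated `SAMPLE` document are assumptions for illustration; only the element and attribute names come from the schema above.

```python
import xml.etree.ElementTree as ET


def summarize_schema(xml_text):
    """Return (name, version, sources copied into 'text') for a Solr schema."""
    root = ET.fromstring(xml_text)
    # Collect every copyField whose destination is the catch-all field.
    copied = [
        cf.get("source")
        for cf in root.iter("copyField")
        if cf.get("dest") == "text"
    ]
    return root.get("name"), root.get("version"), copied


# Abbreviated, hypothetical stand-in for ckan/config/solr/schema.xml.
SAMPLE = """<schema name="ckan" version="2.0">
  <fields>
    <field name="title" type="text" indexed="true" stored="true" />
    <field name="text" type="text" indexed="true" stored="false" multiValued="true"/>
  </fields>
  <copyField source="url" dest="urls"/>
  <copyField source="title" dest="text"/>
  <copyField source="notes" dest="text"/>
</schema>"""

name, version, copied = summarize_schema(SAMPLE)
```

Run against the real file, the same function could be fed the file's contents to confirm which schema a given Solr core is actually using.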
2 changes: 1 addition & 1 deletion ckan/plugins/interfaces.py
@@ -770,7 +770,7 @@ def read_template(self):
If the user requests the dataset in a format other than HTML
(CKAN supports returning datasets in RDF or N3 format by appending .rdf
or .n3 to the dataset read URL, see :doc:`linked-data-and-rdf`) then
or .n3 to the dataset read URL, see :doc:`/linked-data-and-rdf`) then
CKAN will try to render
a template file with the same path as returned by this function,
but a different filename extension, e.g. ``'package/read.rdf'``.
7 changes: 1 addition & 6 deletions ckan/tests/lib/test_solr_schema_version.py
@@ -1,5 +1,4 @@
import os
from pylons import config
from ckan.tests import TestController

class TestSolrSchemaVersionCheck(TestController):
@@ -11,11 +10,7 @@ def setup_class(cls):

def _get_current_schema(self):

from ckan.lib.search import SUPPORTED_SCHEMA_VERSIONS

current_version = sorted(SUPPORTED_SCHEMA_VERSIONS).pop()

current_schema = os.path.join(self.root_dir,'..','..','config','solr','schema-%s.xml' % current_version)
current_schema = os.path.join(self.root_dir,'..','..','config','solr','schema.xml')

return current_schema

8 changes: 4 additions & 4 deletions doc/appendices/solr-multicore.rst
@@ -9,7 +9,7 @@ same machine. Each configuration is called a Solr *core*. Having multiple cores
is useful when you want different applications or different versions of CKAN to
share the same Solr instance, each application can have its own Solr core so
each can use a different ``schema.xml`` file. This is necessary, for example,
if you want to CKAN instances to share the same |solr| server and those two
if you want two CKAN instances to share the same |solr| server and those two
instances are running different versions of CKAN that require differemt
``schema.xml`` files, or if the two instances have different |solr| schema
customizations.
@@ -57,7 +57,7 @@ core for it, see :ref:`creating another solr core`.
CKAN's ``schema.xml`` file::

sudo mv /etc/solr/ckan_default/conf/schema.xml /etc/solr/ckan_default/conf/schema.xml.bak
sudo ln -s /usr/lib/ckan/default/src/ckan/ckan/config/solr/schema-2.0.xml /etc/solr/ckan_default/conf/schema.xml
sudo ln -s /usr/lib/ckan/default/src/ckan/ckan/config/solr/schema.xml /etc/solr/ckan_default/conf/schema.xml

#. Edit ``/etc/solr/ckan_default/conf/solrconfig.xml`` and change the
``<dataDir>`` tag to this::
@@ -116,7 +116,7 @@ In this example we'll assume that:
#. You've installed a second instance of CKAN in a second virtual environment
at /usr/lib/ckan/|ckan|, and now want to setup a second Solr core for it.

You can ofcourse follow these instructions again to setup further Solr cores.
You can of course follow these instructions again to setup further Solr cores.

#. Add the core to ``/usr/share/solr/solr.xml``. This file should now list
two cores. For example:
@@ -154,7 +154,7 @@ You can ofcourse follow these instructions again to setup further Solr cores.
.. parsed-literal::
sudo rm /etc/solr/|core|/conf/schema.xml
sudo ln -s /usr/lib/ckan/|ckan|/src/ckan/ckan/config/solr/schema-2.0.xml /etc/solr/|core|/conf/schema.xml
sudo ln -s /usr/lib/ckan/|ckan|/src/ckan/ckan/config/solr/schema.xml /etc/solr/|core|/conf/schema.xml
#. Create the /usr/share/solr/|core| directory and put a symlink to the
``conf`` directory in it:
2 changes: 2 additions & 0 deletions doc/contents.rst
@@ -5,6 +5,8 @@ Full table of contents
.. toctree::

index
user-guide
sysadmin-guide
installing
upgrading
getting-started
