forked from apache/nutch
Index Metadatas
mbauhardt edited this page Sep 13, 2010 · 6 revisions
To index the metadata from the URL metadata file you have uploaded (see Admin Url Upload), you have to configure the plugin.xml of the index-metadata plugin.
Edit the extension implementation with the id MetadataIndexingFilter and add the parameter raw-fields with the value foo to index all values stored under the key foo untokenized. If you use fields instead of raw-fields, all values are indexed as tokenized fields.
<extension id="org.apache.nutch.indexer.metadata.index"
           name="Nutch Metadata Indexing Filter"
           point="org.apache.nutch.indexer.IndexingFilter">
  <implementation id="MetadataIndexingFilter"
                  class="org.apache.nutch.indexer.metadata.MetadataIndexingFilter">
    <parameter name="raw-fields" value="foo"/>
  </implementation>
</extension>
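For comparison, a tokenized variant of the same extension might look like the sketch below. It assumes the parameter is named exactly fields, as the description above implies, and it reuses the example key foo; only the parameter name differs from the raw-fields configuration.

```xml
<extension id="org.apache.nutch.indexer.metadata.index"
           name="Nutch Metadata Indexing Filter"
           point="org.apache.nutch.indexer.IndexingFilter">
  <implementation id="MetadataIndexingFilter"
                  class="org.apache.nutch.indexer.metadata.MetadataIndexingFilter">
    <!-- Assumption: "fields" (instead of "raw-fields") indexes the values of key foo as tokenized fields -->
    <parameter name="fields" value="foo"/>
  </implementation>
</extension>
```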
To run a query such as “http foo:0.1” or “http foo:0.9”, you also have to configure the MetadataQueryFilter. Edit the plugin.xml:
<extension id="org.apache.nutch.indexer.metadata.query"
           name="Nutch Metadata Query Filter"
           point="org.apache.nutch.searcher.QueryFilter">
  <implementation id="MetadataQueryFilter"
                  class="org.apache.nutch.indexer.metadata.MetadataQueryFilter">
    <parameter name="raw-fields" value="foo"/>
  </implementation>
</extension>
If you use raw-fields, a RawFieldQueryFilter is used. If you use fields instead of raw-fields, a FieldQueryFilter is used.
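If the key foo was indexed as a tokenized field, the query side would presumably need the matching setting. The fragment below is a sketch under that assumption: it uses the parameter name fields as described above, so that the FieldQueryFilter handles queries on foo. Whether the indexing and query settings must always match is not stated on this page.

```xml
<extension id="org.apache.nutch.indexer.metadata.query"
           name="Nutch Metadata Query Filter"
           point="org.apache.nutch.searcher.QueryFilter">
  <implementation id="MetadataQueryFilter"
                  class="org.apache.nutch.indexer.metadata.MetadataQueryFilter">
    <!-- Assumption: "fields" selects the FieldQueryFilter for the tokenized key foo -->
    <parameter name="fields" value="foo"/>
  </implementation>
</extension>
```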