mbauhardt edited this page Sep 13, 2010 · 6 revisions

If you want to index the metadata from the URL metadata file that you have uploaded (see Admin Url Upload), you have to configure ADMIN-GUI-INSTALLATION/plugins/index-metadata/plugin.xml of the index-metadata plugin.

Edit the implementation with the id MetadataIndexingFilter and add the raw-fields parameter with the value foo to index all values for the key foo untokenized. If you use fields instead of raw-fields, all values are indexed as tokenized fields.


   <extension id="org.apache.nutch.indexer.metadata.index"
              name="Nutch Metadata Indexing Filter"
              point="org.apache.nutch.indexer.IndexingFilter">
      <implementation id="MetadataIndexingFilter"
                      class="org.apache.nutch.indexer.metadata.MetadataIndexingFilter">
        <parameter name="raw-fields" value="foo"/>
      </implementation> 
   </extension>
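
For comparison, a tokenized variant of the same implementation might look like the sketch below. It assumes that the parameter is named fields, as described above, and that nothing else in the extension has to change; foo is only the example key.

      <!-- Sketch: "fields" instead of "raw-fields", so the values for foo are tokenized -->
      <implementation id="MetadataIndexingFilter"
                      class="org.apache.nutch.indexer.metadata.MetadataIndexingFilter">
        <parameter name="fields" value="foo"/>
      </implementation>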

To run a query such as “http foo:0.1” or “http foo:0.9”, you also have to configure the MetadataQueryFilter. Edit the plugin.xml as follows:


   <extension id="org.apache.nutch.indexer.metadata.query"
              name="Nutch Metadata Query Filter"
              point="org.apache.nutch.searcher.QueryFilter">
      <implementation id="MetadataQueryFilter"
                      class="org.apache.nutch.indexer.metadata.MetadataQueryFilter">
        <parameter name="raw-fields" value="foo"/>
      </implementation> 
   </extension>

If you use raw-fields, a RawFieldQueryFilter is used. If you use fields instead of raw-fields, a FieldQueryFilter is used.
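
As a sketch of the second case, the query filter could be configured with fields so that a FieldQueryFilter handles foo. This assumes the parameter name fields mentioned above; it is probably sensible to use the same parameter type here as in the MetadataIndexingFilter, so that queries match the way the values were indexed.

      <!-- Sketch: "fields" instead of "raw-fields", so a FieldQueryFilter is used for foo -->
      <implementation id="MetadataQueryFilter"
                      class="org.apache.nutch.indexer.metadata.MetadataQueryFilter">
        <parameter name="fields" value="foo"/>
      </implementation>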
