Using Lucene Search Text Queries
The Geoportal Server uses a sophisticated search engine that provides many search options, ranking options, fast performance, and extensibility. The search engine is based on the open source search engine Apache Lucene. To make the most of the geoportal's search page, the following sections describe how to use Lucene search syntax for text searches.
Terms
A query is broken up into terms and operators. There are two types of terms: Single Terms and Phrases. A Single Term is a single word such as air or quality. A Phrase is a group of words surrounded by double quotes such as "air quality". Multiple terms can be combined with Boolean operators to form a more complex query. Search text examples (the hit counts are illustrative):
- air results in 35 hits (items contain the word air)
- quality results in 123 hits (items contain the word quality)
- air quality (without quotes) results in 148 hits (items contain the words air or quality or both)
- air AND quality results in 10 hits (results contain both words air and quality)
- "air quality" (with quotes) results in 7 hits (items contain the words air and quality directly after each other)
- title:air results in 5 hits (items contain the word air in the title)
- title:quality results in 14 hits (items contain the word quality in the title)
- +title:air +title:quality or title:"air quality" results in 2 hits (items contain both words air and quality in the title; the phrase form also requires the words to be directly after each other)
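To make the difference between the three forms concrete, here is a minimal sketch over a tiny in-memory collection. It is not how Lucene or the Geoportal Server evaluates queries; the documents and the helper names (`has_term`, `has_phrase`) are hypothetical and exist only for illustration.

```python
# Hypothetical mini-collection: document id -> text.
docs = {
    1: "air quality report",
    2: "quality of indoor air",
    3: "water quality",
    4: "air traffic",
}


def tokens(text):
    """Split text into lowercase words (a stand-in for real analysis)."""
    return text.lower().split()


def has_term(text, term):
    """True if the single term occurs anywhere in the text."""
    return term in tokens(text)


def has_phrase(text, phrase):
    """True if the words of the phrase occur directly after each other."""
    words, toks = tokens(phrase), tokens(text)
    n = len(words)
    return any(toks[i:i + n] == words for i in range(len(toks) - n + 1))


# air quality  (no operator: either word is enough)
either = [d for d, t in docs.items() if has_term(t, "air") or has_term(t, "quality")]
# air AND quality  (both words required, in any position)
both = [d for d, t in docs.items() if has_term(t, "air") and has_term(t, "quality")]
# "air quality"  (both words, adjacent and in order)
phrase = [d for d, t in docs.items() if has_phrase(t, "air quality")]

print(either, both, phrase)
```

Each form is stricter than the one before it, which is why the hit counts in the list above shrink from the unquoted query to the quoted phrase.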
Special Characters
The Geoportal Server supports escaping special characters that are part of the query syntax. The current list of special characters is + - && || ! ( ) { } [ ] ^ " ~ * ? : \ To escape one of these characters, put a \ before it. For example, to search for items that contain the scale hint 1:250k, use the query: 1\:250k.
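If query text comes from user input, the escaping can be automated. The following is a minimal sketch, assuming that every listed character can be escaped individually (so && and || are handled by escaping each & and |); the function name `escape_lucene` is hypothetical and not part of the Geoportal Server.

```python
# Characters listed above as part of the query syntax.
# Assumption of this sketch: "&&" and "||" are covered by escaping
# each "&" and "|" on its own.
LUCENE_SPECIAL_CHARS = set('+-&|!(){}[]^"~*?:\\')


def escape_lucene(text: str) -> str:
    """Prefix every special character in `text` with a backslash."""
    return "".join("\\" + ch if ch in LUCENE_SPECIAL_CHARS else ch for ch in text)


print(escape_lucene("1:250k"))
print(escape_lucene("{550E8400-E29B}"))
```

Text without special characters passes through unchanged, so the helper can be applied to any term before it is placed in a query.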
Fields
Lucene supports fielded data. When performing a search you can either specify a field or use the default field. The field names and the default field are implementation-specific. You can search any field by typing the field name followed by a colon and then the term you are looking for. Targeting a specific field in the query can be more accurate than searching with terms alone. Keep in mind that some fields are case sensitive. Remember that special characters that are part of the text to search must either be escaped with a backslash (\) or enclosed in double quotation marks (""). Examples:
- title:"The Right Way" AND text:"don't go this way"
- uuid:"{550E8400-E29B-41D4-A716-446655440000}"
- uuid:\{550E8400\-E29B\-41D4\-A716\-446655440000\}
- resource.url:"http://server.arcgisonline.com/ArcGIS/rest/services/ESRI_StreetMap_World_2D/MapServer"
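The quoted form in the examples above can be produced with a small helper. This is a sketch under the assumption that, inside a quoted phrase, only the backslash and the double quote still need escaping; the name `field_phrase` is hypothetical.

```python
def field_phrase(field: str, text: str) -> str:
    """Build field:"text", escaping backslashes and double quotes in the phrase."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'{field}:"{escaped}"'


query = " AND ".join([
    field_phrase("title", "The Right Way"),
    field_phrase("text", "don't go this way"),
])
print(query)
print(field_phrase("uuid", "{550E8400-E29B-41D4-A716-446655440000}"))
```

Quoting is usually easier to read than escaping each character, which is why the UUID and URL examples above are shown in quoted form.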
Wildcard Searches
The Geoportal Server supports single and multiple character wildcard searches within single terms (not within phrase queries). Caution: You cannot use a * or ? symbol as the first character of a search.
- To perform a single character wildcard search, use the "?" symbol. The single character wildcard search looks for terms that match the search term with the "?" replaced by any one character. For example, to search for text or test you can use the search: te?t
- To perform a multiple character wildcard search, use the "*" symbol. Multiple character wildcard searches look for 0 or more characters. For example, to search for test, tests or tester, you can use the search: test* . You can also use wildcards in the middle of a term: te*t
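The matching rules for ? and * can be tried out locally. The sketch below uses Python's standard `fnmatch` module, whose ? and * have the same meaning as described above; it only illustrates the pattern semantics and is not how Lucene executes wildcard queries. The term list and the function name `wildcard_matches` are made up for the example.

```python
from fnmatch import fnmatchcase

# Hypothetical set of indexed terms.
terms = ["text", "test", "tests", "tester", "tenant", "toast"]


def wildcard_matches(pattern, candidates):
    """Return the candidates matching a pattern built from ? and *.

    Note: fnmatch also gives "[" and "]" a special meaning, so this
    sketch is only valid for patterns made of letters, ? and *.
    """
    return [term for term in candidates if fnmatchcase(term, pattern)]


print(wildcard_matches("te?t", terms))   # exactly one character between "te" and "t"
print(wildcard_matches("test*", terms))  # zero or more characters after "test"
print(wildcard_matches("te*t", terms))   # zero or more characters in the middle
```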
Fuzzy Searches
The Geoportal Server supports fuzzy searches based on the Levenshtein distance (edit distance) algorithm. To do a fuzzy search, use the tilde symbol (~) at the end of a Single Term. For example, to search for a term similar in spelling to air, use the fuzzy search: air~. This search finds items containing not only air but also similarly spelled terms such as aid. You can also specify the required similarity as a value between 0 and 1; the closer the value is to 1, the more similar a term must be to match. For example: air~0.8. If the parameter is not given, the default of 0.5 is used.
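The following sketch shows how an edit distance can be turned into a similarity between 0 and 1. The normalization used here (one minus the distance divided by the length of the shorter term) is an assumption made for illustration; the exact formula used by the Lucene version inside a given geoportal may differ. The function names are hypothetical.

```python
def levenshtein(a: str, b: str) -> int:
    """Number of single-character insertions, deletions or substitutions
    needed to turn `a` into `b` (classic dynamic programming)."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Assumed normalization: 1 - distance / length of the shorter term."""
    shortest = min(len(a), len(b))
    if shortest == 0:
        return 0.0
    return 1.0 - levenshtein(a, b) / shortest


def fuzzy_matches(term, candidates, min_similarity=0.5):
    """Candidates whose similarity to `term` exceeds the threshold."""
    return [c for c in candidates if similarity(term, c) > min_similarity]


candidates = ["air", "aid", "airs", "water"]
print(fuzzy_matches("air", candidates))        # like air~
print(fuzzy_matches("air", candidates, 0.8))   # like air~0.8
```

Under this assumed formula, aid is one edit away from air and clears the default threshold of 0.5, but not the stricter 0.8.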
Proximity Searches
The Geoportal Server supports finding words that are within a specific distance of each other. To do a proximity search, use the tilde symbol (~) at the end of a Phrase. For example, to search for air and quality within 10 words of each other in a document, use the search: "air quality"~10
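As a rough illustration of what "within a specific distance" means, the sketch below checks whether two words occur at most a given number of positions apart. It is a simplification: Lucene's own distance accounting (for example, how reversed word order is counted) is more involved and is not reproduced here. The function name `within_distance` and the sample sentence are hypothetical.

```python
def within_distance(text, first, second, max_distance):
    """True if some occurrence of `first` and some occurrence of `second`
    are at most `max_distance` word positions apart (in either order)."""
    tokens = text.lower().split()
    first_positions = [i for i, t in enumerate(tokens) if t == first]
    second_positions = [i for i, t in enumerate(tokens) if t == second]
    return any(abs(i - j) <= max_distance
               for i in first_positions
               for j in second_positions)


sentence = "air pollution lowers the overall quality of life"
print(within_distance(sentence, "air", "quality", 10))  # roughly "air quality"~10
print(within_distance(sentence, "air", "quality", 3))   # roughly "air quality"~3
```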
Range Searches
The Geoportal Server supports range queries for envelope and timestamp. This allows the user to match documents whose field(s) values are between the lower and upper bound specified by the Range Query. Range Queries can be inclusive or exclusive of the upper and lower bounds.
- Envelope examples:
  - envelope:[-80,-70 TO +30,+70] This search returns documents that intersect a spatial envelope with a southwest bounding coordinate of longitude -80, latitude -70 (80° W, 70° S) and a northeast bounding coordinate of longitude 30, latitude 70 (30° E, 70° N).
  - envelope:{-80,-70 TO +30,+70} This search returns documents that fall entirely within a spatial envelope with a southwest bounding coordinate of longitude -80, latitude -70 (80° W, 70° S) and a northeast bounding coordinate of longitude 30, latitude 70 (30° E, 70° N).
- Timestamp examples:
  - dateModified:[2009-10-11]
  - dateModified:[2006]
  - dateModified:2009-12
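Range clauses with a lower and an upper bound can be assembled with a small helper. This sketch only builds the query text in the "lower TO upper" form, with square brackets or curly braces around it; the function name `range_query` is hypothetical, and how each bracket style is interpreted is up to the server, as described above.

```python
def range_query(field, lower, upper, inclusive=True):
    """Build field:[lower TO upper] or field:{lower TO upper}."""
    opening, closing = ("[", "]") if inclusive else ("{", "}")
    return f"{field}:{opening}{lower} TO {upper}{closing}"


# Envelope bounds are given as "x,y" pairs: southwest corner, then northeast corner.
print(range_query("envelope", "-80,-70", "+30,+70"))
print(range_query("envelope", "-80,-70", "+30,+70", inclusive=False))
```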
Boosting a Term
The Geoportal Server ranks matching documents by relevance, based on the terms found. Boosting lets you influence that ranking. To boost a term, add the caret symbol (^) and a boost factor (a number) at the end of the term you are searching for. The higher the boost factor, the more weight the term carries. For example, if you are searching for air quality and you want the term air to count for more, type: air^4 quality. This makes documents containing the term air appear more relevant. You can also boost Phrase Terms, as in the example: "air quality"^4 "water quality". By default, the boost factor is 1. Although the boost factor must be positive, it can be less than 1 (e.g. 0.2).
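When queries are generated by a program, the boost suffix can be added with a helper that also enforces the "must be positive" rule. This is a minimal sketch; the function name `boost` is hypothetical.

```python
def boost(term, factor=1.0):
    """Append ^factor to a term or quoted phrase; the factor must be positive."""
    if factor <= 0:
        raise ValueError("boost factor must be positive")
    # The "g" format drops a trailing ".0", so 4.0 and 4 both render as "4".
    return f"{term}^{factor:g}"


print(boost("air", 4) + " quality")
print(boost('"air quality"', 4) + ' "water quality"')
print(boost("air", 0.2))
```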
Boolean Operators
Boolean operators allow terms to be combined through logic operators. The Geoportal Server supports AND, +, OR, NOT and - as Boolean operators. Note: Boolean operators must be ALL CAPS.
- The OR operator is the default conjunction operator. This means that if there is no Boolean operator between two terms, the OR operator is used. The OR operator links two terms and finds a matching document if either of the terms exists in a document. This is equivalent to a union using sets. The symbol || can be used in place of the word OR.
- The AND operator matches documents where both terms exist anywhere in the text of a single document. This is equivalent to an intersection using sets. The symbol && can be used in place of the word AND.
- The + or required operator requires that the term after the + symbol exist somewhere in a field of a single document.
- The NOT operator excludes documents that contain the term after NOT. This is equivalent to a difference using sets. The symbol ! can be used in place of the word NOT.
- The - or prohibit operator excludes documents that contain the term after the - symbol.
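The set analogies in the list above can be written out directly with Python sets. The sketch assumes a made-up index that maps each term to the ids of the documents containing it; it illustrates the logic only, not Lucene's scoring or execution.

```python
# Hypothetical inverted index: term -> ids of documents containing it.
index = {
    "air": {1, 2, 3},
    "quality": {2, 3, 4},
    "water": {4, 5},
}

air_or_quality = index["air"] | index["quality"]    # air OR quality  -> union
air_and_quality = index["air"] & index["quality"]   # air AND quality -> intersection
air_not_quality = index["air"] - index["quality"]   # air NOT quality -> difference

print(sorted(air_or_quality))
print(sorted(air_and_quality))
print(sorted(air_not_quality))
```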
Grouping
The Geoportal Server supports using parentheses to group clauses to form sub queries. This can be very useful if you want to control the Boolean logic of a query. For example: (air OR water) AND quality will find documents containing the words air and quality or the words water and quality.
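The effect of the parentheses can be seen by evaluating two different groupings of the same three terms with sets. The index below is made up for illustration, and the second grouping is shown only for comparison; it is not a statement about how Lucene parses a query written without parentheses.

```python
# Hypothetical inverted index: term -> ids of documents containing it.
index = {
    "air": {1, 2, 3},
    "quality": {2, 3, 4},
    "water": {4, 5},
}

# (air OR water) AND quality
grouped_first = (index["air"] | index["water"]) & index["quality"]

# air OR (water AND quality)
grouped_second = index["air"] | (index["water"] & index["quality"])

print(sorted(grouped_first))
print(sorted(grouped_second))
```

Document 1 contains air but not quality, so it only appears in the second result. Writing the parentheses explicitly removes any doubt about which grouping is intended.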
Field Grouping
The Geoportal Server supports using parentheses to group multiple clauses to a single field. For example: title:(air OR water) finds items that contain the words air or water in the title.
For more information on how to use Lucene search syntax for powerful searching in your geoportal, please see the Lucene website.