Search Result Format

isubiker edited this page Jan 4, 2012 · 5 revisions

When search results are returned, various items can be included with each result via the include parameter. Valid items to include for each search result are:

  • content - The original document for the result
  • collections - The collections that the document is in
  • properties - The properties on the document
  • permissions - The permissions on the document
  • quality - The quality set for the document
  • snippet - A search result snippet
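
As a minimal sketch, a client could check requested items against the list above before sending a query. The names VALID_INCLUDES and check_includes are hypothetical helpers, not part of the API, and how the include parameter is encoded in a request is not covered on this page, so this only validates the item names:

```python
# The six item names documented in the list above.
VALID_INCLUDES = {
    "content", "collections", "properties",
    "permissions", "quality", "snippet",
}

def check_includes(requested):
    """Return the requested items unchanged, or raise if any name is unknown.

    Hypothetical client-side helper; the server may validate differently.
    """
    unknown = sorted(set(requested) - VALID_INCLUDES)
    if unknown:
        raise ValueError("unknown include item(s): " + ", ".join(unknown))
    return list(requested)

print(check_includes(["content", "snippet"]))
```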

JSON Format

All search results for JSON documents will return an object with two keys: meta and results. The meta key holds an object describing which range of results was returned and the total number of results available. The results key holds an array of results. Each item in the array is an object describing one result, as shown:

{
    "meta": {
        "start":1,
        "end":10,
        "total":202
    },
    "results": [{
        "uri":"/256gxvrmkyvxhxzl.json",
        "content":{
            "id":"256gxvrmkyvxhxzl",
            "list":"org.kernel.vger.linux-kernel",
            "type":"development",
            …
        },
        "collections": ["messages", "linux"],
        "properties":{
            "state": "sent"
        },
        "permissions":{
            "public": ["read"],
            "admin": ["read", "update"]
        },
        "quality":0,
        "snippet":"<span class='match'>... impossible to make it 100% correct across the whole kernel (for example the <span class='hit'>compound_head</span> is safe for THP but it's still unsafe for hugetlbfs while the page...</span>"
    },{
    …
    }]
}
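
A response in this shape could be read roughly as follows. This is only a sketch: the sample body is a trimmed, hypothetical copy of the example above, summarize and next_start are illustrative names, and the paging logic assumes start and end are 1-based inclusive positions, which the example suggests but this page does not state.

```python
import json

# Hypothetical, trimmed response body in the documented shape.
sample = """
{
    "meta": {"start": 1, "end": 10, "total": 202},
    "results": [{
        "uri": "/256gxvrmkyvxhxzl.json",
        "collections": ["messages", "linux"],
        "quality": 0
    }]
}
"""

def summarize(body):
    """Return (start, end, total, uris) from a JSON search response."""
    response = json.loads(body)
    meta = response["meta"]
    uris = [result["uri"] for result in response["results"]]
    return meta["start"], meta["end"], meta["total"], uris

def next_start(meta):
    """Return the start position of the next page, or None on the last page.

    Assumes "end" is the inclusive position of the last returned result.
    """
    return meta["end"] + 1 if meta["end"] < meta["total"] else None

print(summarize(sample))
print(next_start(json.loads(sample)["meta"]))
```

Items such as collections or snippet appear only when requested through the include parameter, so code that reads them should use result.get(...) rather than indexing directly.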

XML Format

All search results for XML documents will return a <response> element with two children: <meta> and <results>. The <meta> element contains child elements describing which range of results was returned and the total number of results available. The <results> element contains a sequence of <result> elements. Each <result> element describes one result, as shown:

<response xmlns="http://marklogic.com/corona">
    <meta>
        <start>1</start>
        <end>1</end>
        <total>14</total>
    </meta>
    <results>
        <result>
            <uri>/256gxvrmkyvxhxzl.xml</uri>
            <content>
                <message>
                    <id>256gxvrmkyvxhxzl</id>
                    <list>org.kernel.vger.linux-kernel</list>
                    <type>development</type>
                    …
                </message>
            </content>
            <collections>
                <collection>messages</collection>
                <collection>linux</collection>
            </collections>
            <properties>
                <state>sent</state>
            </properties>
            <permissions>
                <public><permission>read</permission></public>
                <admin><permission>read</permission><permission>update</permission></admin>
            </permissions>
            <quality>0</quality>
            <snippet><span class='match'>... impossible to make it 100% correct across the whole kernel (for example the &lt;span class='hit'&gt;compound_head&lt;/span&gt; is safe for THP but it's still unsafe for hugetlbfs while the page...</span></snippet>
        </result>
    </results>
</response>
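
The XML form can be read with a namespace-aware parser. The sketch below uses Python's standard xml.etree.ElementTree with the http://marklogic.com/corona namespace from the example; the sample document is a trimmed, hypothetical copy of the response above, and parse_results is an illustrative name.

```python
import xml.etree.ElementTree as ET

# Namespace taken from the <response> element in the example above.
NS = {"c": "http://marklogic.com/corona"}

# Hypothetical, trimmed response in the documented shape.
sample_xml = """<response xmlns="http://marklogic.com/corona">
    <meta><start>1</start><end>1</end><total>14</total></meta>
    <results>
        <result>
            <uri>/256gxvrmkyvxhxzl.xml</uri>
            <collections>
                <collection>messages</collection>
                <collection>linux</collection>
            </collections>
            <quality>0</quality>
        </result>
    </results>
</response>"""

def parse_results(xml_text):
    """Return (total, results) where each result has a uri and collections."""
    root = ET.fromstring(xml_text)
    total = int(root.findtext("c:meta/c:total", namespaces=NS))
    results = []
    for result in root.findall("c:results/c:result", NS):
        results.append({
            "uri": result.findtext("c:uri", namespaces=NS),
            # Empty list when collections were not requested via include.
            "collections": [
                c.text for c in result.findall("c:collections/c:collection", NS)
            ],
        })
    return total, results

print(parse_results(sample_xml))
```

Because the default namespace on <response> applies to every descendant, each path step needs the prefix; an unprefixed lookup such as root.find("meta") would return None.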

Discussion

The result should probably include the relevance "score". Otherwise quality and weight settings will be darn hard to adjust blindly.
