In [1]:
import mdf_toolbox

# Globus Search Utilities
The MDF Toolbox provides a few utilities to make integrating with Globus Search easier.

## format_gmeta
`format_gmeta()` takes a dictionary of data and returns it wrapped as a `GMetaEntry`, the Globus Search GMeta format for a single record. You must provide `acl` (the Access Control List; use `["public"]` for public data) and `identifier` (a unique ID for this entry, or an existing ID to overwrite that entry).

To make a `GIngest` (the final form of a Globus Search ingest), pass a list of `GMetaEntry` objects to `format_gmeta()`.

In [2]:
my_data = {
    "foo": "bar",
    "baz": [1, 2, 3, 4]
}
gmeta_entry = mdf_toolbox.format_gmeta(my_data,
                                       acl=["public"],
                                       identifier="abc123")

In [3]:
gmeta_entry

{'@datatype': 'GMetaEntry',
 '@version': '2016-11-09',
 'content': {'baz': [1, 2, 3, 4], 'foo': 'bar'},
 'subject': 'abc123',
 'visible_to': ['public']}

In [4]:
list_of_gmeta_entry = [gmeta_entry]
g_ingest = mdf_toolbox.format_gmeta(list_of_gmeta_entry)

In [5]:
g_ingest

{'@datatype': 'GIngest',
 '@version': '2016-11-09',
 'ingest_data': {'@datatype': 'GMetaList',
  '@version': '2016-11-09',
  'gmeta': [{'@datatype': 'GMetaEntry',
    '@version': '2016-11-09',
    'content': {'baz': [1, 2, 3, 4], 'foo': 'bar'},
    'subject': 'abc123',
    'visible_to': ['public']}]},
 'ingest_type': 'GMetaList'}
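
To make the two shapes above concrete, here is a minimal hand-rolled sketch that builds the same dictionaries without the Toolbox. The helper names `make_gmeta_entry` and `make_gingest` are hypothetical, and the field layout and version string are simply copied from the outputs shown above. This is not the `format_gmeta()` implementation, only an illustration of the structure it returns.

```python
GMETA_VERSION = "2016-11-09"  # version string copied from the outputs above


def make_gmeta_entry(content, acl, identifier):
    """Hypothetical helper: wrap one record in a GMetaEntry-shaped dict."""
    return {
        "@datatype": "GMetaEntry",
        "@version": GMETA_VERSION,
        "subject": identifier,   # unique ID for the entry
        "visible_to": acl,       # access control list, e.g. ["public"]
        "content": content,      # the user's data, unchanged
    }


def make_gingest(entries):
    """Hypothetical helper: wrap GMetaEntry dicts in a GIngest-shaped dict."""
    return {
        "@datatype": "GIngest",
        "@version": GMETA_VERSION,
        "ingest_type": "GMetaList",
        "ingest_data": {
            "@datatype": "GMetaList",
            "@version": GMETA_VERSION,
            "gmeta": list(entries),
        },
    }


entry = make_gmeta_entry({"foo": "bar", "baz": [1, 2, 3, 4]},
                         acl=["public"],
                         identifier="abc123")
ingest = make_gingest([entry])
```

In practice `format_gmeta()` is still the better choice, since it tracks whatever the Toolbox considers the correct format; the sketch is only meant to show which keys end up where.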

In [6]:
# To submit the ingest, call ingest() on an authenticated globus_sdk.SearchClient instance:
# search_client.ingest(index, g_ingest)

## gmeta_pop
`gmeta_pop()` takes the results from a Globus Search query and unwraps them from the GMeta format. You can pass in a `GlobusHTTPResponse` from the `SearchClient`, a JSON-dumped string, or a dictionary.

In [7]:
sample_search_result = { 
    '@datatype': 'GSearchResult',
    '@version': '2016-11-09',
    'count': 11, 
    'gmeta': [{
        '@datatype': 'GMetaResult',
        '@version': '2016-11-09',
        'content': [{
            "foo": "bar",
            "baz": [1, 2, 3, 4, 5]
        }, {
            "food": "bard",
            "bazd": ["d"]
        }],
        'subject': "http://example.com/abc123",
    }],
    'offset': 0,
    'total': 22
}

In [8]:
mdf_toolbox.gmeta_pop(sample_search_result)

[{'baz': [1, 2, 3, 4, 5], 'foo': 'bar'}, {'bazd': ['d'], 'food': 'bard'}]

If you also want the metadata associated with your query (the total number of query matches), pass `info=True` to get a tuple of `(results, metadata)`.

In [9]:
mdf_toolbox.gmeta_pop(sample_search_result, info=True)

([{'baz': [1, 2, 3, 4, 5], 'foo': 'bar'}, {'bazd': ['d'], 'food': 'bard'}],
 {'total_query_matches': 22})
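
The unwrapping shown above can be approximated in a few lines of plain Python. The function name `unwrap_gmeta` is hypothetical, and the sketch assumes the result has the `GSearchResult` shape of the sample above; it handles only a dictionary or a JSON string, not a `GlobusHTTPResponse`, and it is not the Toolbox's own implementation.

```python
import json


def unwrap_gmeta(result, info=False):
    """Hypothetical sketch: flatten the 'content' lists of a GSearchResult-shaped dict."""
    # Accept a JSON-dumped string as well as a dictionary.
    if isinstance(result, str):
        result = json.loads(result)

    # Each GMetaResult carries a list of content records; collect them all.
    records = []
    for gmeta_result in result.get("gmeta", []):
        records.extend(gmeta_result.get("content", []))

    if info:
        # Mirror the (results, metadata) tuple shown above.
        return records, {"total_query_matches": result.get("total")}
    return records


sample = {
    "@datatype": "GSearchResult",
    "@version": "2016-11-09",
    "count": 11,
    "gmeta": [{
        "@datatype": "GMetaResult",
        "@version": "2016-11-09",
        "content": [
            {"foo": "bar", "baz": [1, 2, 3, 4, 5]},
            {"food": "bard", "bazd": ["d"]},
        ],
        "subject": "http://example.com/abc123",
    }],
    "offset": 0,
    "total": 22,
}

records = unwrap_gmeta(sample)
records_and_info = unwrap_gmeta(json.dumps(sample), info=True)
```

Note that the unwrapped records no longer carry their `subject`, so keep the original result around if you need to map a record back to its entry.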

## translate_index
Globus Search expects queries to reference an index by its UUID rather than by its name. `translate_index()` takes an index name and returns the matching UUID; if no match is found, it returns the input unchanged.

In [10]:
mdf_toolbox.translate_index("mdf")

'1a57bbe5-5272-477f-9d31-343b8258b7a5'
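
The "return the input unchanged" fallback is the part worth remembering, because it means an unknown name or an already-translated UUID passes straight through. A minimal sketch of that behavior, assuming a simple lookup table, is below. The names `KNOWN_INDEXES` and `lookup_index` are hypothetical, the table holds only the one UUID shown above, and how `translate_index()` actually resolves names is not covered here.

```python
# Hypothetical lookup table; the single entry is the UUID from the output above.
KNOWN_INDEXES = {
    "mdf": "1a57bbe5-5272-477f-9d31-343b8258b7a5",
}


def lookup_index(name):
    """Hypothetical sketch: return the UUID for a known name, else the input itself."""
    return KNOWN_INDEXES.get(name, name)


known = lookup_index("mdf")
unknown = lookup_index("not-a-real-index")
already_uuid = lookup_index("1a57bbe5-5272-477f-9d31-343b8258b7a5")
```

Because of the pass-through, a caller cannot tell from the return value alone whether a name was translated, so comparing the output to the input is a cheap way to detect an unrecognized name.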