## Global Search

Global Search lets you search for vaults, files, folders, and datasets by name, tags, user, date, and other customizable metadata. It is available on SolveBio Mesh. For more information, see the [SolveBio docs](https://docs.solvebio.com/search/).

The same search functionality that the web application offers is also available through the SolveBio Python and R clients.

### Importing the SolveBio library and logging in

In [1]:
# Importing SolveBio library
from solvebio import login
from solvebio import Filter
from solvebio import GlobalSearch

In [2]:
# Log in to SolveBio
login()

True

### Global Search Examples

#### Performing Global Search

`GlobalSearch` performs a search based on the parameters you provide:

- `query` (optional): An optional query string (advanced search).
- `filters` (optional): A `Filter` object or a list of `Filter` objects.
- `entities` (optional): List of entity pairs to filter on, each in the form (entity type, entity).
- `ordering` (optional): List of fields to order the results by.
- `limit` (optional): Maximum number of query results to return.
- `page_size` (optional): Number of results to fetch per query page.
- `result_class` (optional): Class of object returned by query.
- `debug` (optional): Sends debug information to the API.
- `raw_results` (optional): Whether to use raw API response or to cast logical objects to Vault and Object instances.
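
Because every parameter is optional, it can be convenient to collect only the ones you actually need before starting a search. The following is a minimal sketch in plain Python; `build_search_kwargs` is a hypothetical helper, not part of the SolveBio client, and it covers only a few of the parameters listed above.

```python
def build_search_kwargs(query=None, entities=None, limit=None, ordering=None):
    """Hypothetical helper: keep only the search parameters that were set."""
    candidates = {
        "query": query,
        "entities": entities,
        "limit": limit,
        "ordering": ordering,
    }
    # Drop anything left at None so the search falls back to its defaults.
    return {name: value for name, value in candidates.items() if value is not None}


kwargs = build_search_kwargs(query="test", limit=200)
print(kwargs)
```

The resulting dictionary could then be unpacked into the search call, for example `GlobalSearch(**kwargs)`.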

As the list above shows, all parameters are optional. Calling `GlobalSearch()` without any parameters is therefore equivalent to running Global Search on SolveBio Mesh without any filters: it returns all objects.

In [3]:
# No filters applied
search_results = GlobalSearch()
print('Returned {} objects.'.format(len(search_results)))

Returned 1449 objects.


Each result object has the following attributes:

In [4]:
print(search_results)


|                    Fields | Data                                             |
|---------------------------+--------------------------------------------------|
|                   _errors | {  "samples": "'Dataset query failed: datase ... |
|                       _id | dataset-1426183806524170528                      |
|                created_at | 2021-01-12T17:02:38.336007+00:00                 |
|                 full_path | solvebio:public:/ClinVar/5.2.0-20210110/Variants-|
|                        id | 1426183806524170528                              |
|                indexed_at | 2021-11-25T17:22:55.233690+00:00                 |
|                      name | Variants-GRCH37-1                                |
|                    parent | 5.2.0-20210110                                   |
|                 parent_id | 1426114255474932968                              |
|                      path | solvebio:public:/ClinVar/5.2.0-20210110/Variants-|
| postproc_template_version

You may use the `limit` parameter to limit the number of returned objects:

In [5]:
# No filters applied with limit parameter
search_results = GlobalSearch(limit=200)
print('Returned {} objects.'.format(len(search_results)))

Returned 200 objects.


In [6]:
# By default, each result is either a Vault instance or an Object instance
type(search_results[0])

solvebio.resource.object.Object
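
Since results are by default either Vault or Object instances, one simple way to summarize a result set is to count the instances of each class. The sketch below uses two empty stand-in classes so that it runs without a SolveBio login; `count_by_class` is a hypothetical helper, and with real results you would pass the search results in place of `stand_in_results`.

```python
from collections import Counter


class Vault:
    """Stand-in for the SolveBio Vault class (illustration only)."""


class Object:
    """Stand-in for the SolveBio Object class (illustration only)."""


def count_by_class(results):
    """Hypothetical helper: count results by the name of their class."""
    return Counter(type(result).__name__ for result in results)


stand_in_results = [Object(), Vault(), Object()]
counts = count_by_class(stand_in_results)
print(dict(counts))
```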

#### Advanced search query

You can perform an advanced search, as you would on SolveBio Mesh, by using the `query` argument:

In [7]:
# Advanced search
advanced_query_results = GlobalSearch(query="test")
print('Returned {} objects.'.format(len(advanced_query_results)))

Returned 16 objects.


In [8]:
# Advanced search, passing the query string as a positional argument
advanced_query_results = GlobalSearch("fuji")
print('Returned {} objects.'.format(len(advanced_query_results)))

Returned 1408 objects.


#### Global Beacon Search

For datasets that have the Global Beacon enabled, you can perform an entity search and see those datasets in the results:

In [9]:
# Entity search example
GlobalSearch(entities=[["gene", "BRCA2"]])


|                    Fields | Data                                             |
|---------------------------+--------------------------------------------------|
|                   _errors | {  "samples": "Dataset query failed: dataset ... |
|                       _id | dataset-1658666726768179211                      |
|                created_at | 2021-11-29T11:24:42.093240+00:00                 |
|                 full_path | solvebio:public:/beacon-test-dataset             |
|                        id | 1658666726768179211                              |
|                indexed_at | 2022-01-13T09:59:14.378879+00:00                 |
|                      name | beacon-test-dataset                              |
|                    parent |                                                  |
|                 parent_id |                                                  |
|                      path | solvebio:public:/beacon-test-dataset             |
| postproc_template_version

In [10]:
# Entity search example
GlobalSearch(entities=[["variant", "GRCH38-7-140753336-140753336-T"]])

Query returned 0 results.
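
The `entities` argument in the examples above is a list of `[entity type, entity]` pairs. If your entities are stored in a dictionary, a small helper can produce that list. This is a sketch only; `to_entity_pairs` is a hypothetical name and not part of the SolveBio client.

```python
def to_entity_pairs(entities):
    """Hypothetical helper: turn {"gene": "BRCA2"} into [["gene", "BRCA2"]].

    Each value may be a single string or a list of strings.
    """
    pairs = []
    for entity_type, values in entities.items():
        if isinstance(values, str):
            values = [values]
        for value in values:
            pairs.append([entity_type, value])
    return pairs


pairs = to_entity_pairs({
    "gene": "BRCA2",
    "variant": ["GRCH38-7-140753336-140753336-T"],
})
print(pairs)
```

A list built this way has the same shape as the `entities` values used in the cells above.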

You may combine multiple parameters to narrow down the search results:

In [11]:
# Multiple search parameters
GlobalSearch(entities=[["gene","BRCA2"]], query="test")


|                    Fields | Data                                             |
|---------------------------+--------------------------------------------------|
|                   _errors | {  "samples": "Dataset query failed: dataset ... |
|                       _id | dataset-1658666726768179211                      |
|                created_at | 2021-11-29T11:24:42.093240+00:00                 |
|                 full_path | solvebio:public:/beacon-test-dataset             |
|                        id | 1658666726768179211                              |
|                indexed_at | 2022-01-13T09:59:14.378879+00:00                 |
|                      name | beacon-test-dataset                              |
|                    parent |                                                  |
|                 parent_id |                                                  |
|                      path | solvebio:public:/beacon-test-dataset             |
| postproc_template_version

#### Getting the Global Search subjects

You can also retrieve the list of subjects for a search:

In [12]:
# Get list of subjects for the entity search
search = GlobalSearch(entities=[["gene","BRCA2"]])
search.subjects()

[{'access': True,
  'dataset_id': '1589830521744205858',
  'dataset_path': 'solvebio:public:/HGNC/3.3.0-2019-07-22/HGNC-1',
  'subject': 'U43746'}]

In [13]:
# Subjects count
search.subjects_count()

1
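
The subjects come back as a list of dictionaries, so ordinary Python is enough to reorganize them. The sketch below groups subject identifiers by dataset path, using the single entry from the output above as sample data; `subjects_by_dataset` is a hypothetical helper.

```python
from collections import defaultdict

# Sample data copied from the subjects() output shown above.
subjects = [
    {
        "access": True,
        "dataset_id": "1589830521744205858",
        "dataset_path": "solvebio:public:/HGNC/3.3.0-2019-07-22/HGNC-1",
        "subject": "U43746",
    },
]


def subjects_by_dataset(subject_entries):
    """Hypothetical helper: map each dataset path to its subject identifiers."""
    grouped = defaultdict(list)
    for entry in subject_entries:
        grouped[entry["dataset_path"]].append(entry["subject"])
    return dict(grouped)


grouped = subjects_by_dataset(subjects)
print(grouped)
```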

#### Applying filters for Global Search

As with [filtering fields in a dataset](https://docs.solvebio.com/datasets/querying/#filters) (see the table and examples there for how to use "filter actions"), you can apply the same filtering mechanism to Global Search:

In [14]:
# Global Search object
search = GlobalSearch()

# Equals match (in list)
vaults = search.filter(type__in=["vault"])
print('Found {} vaults.'.format(len(vaults)))

# Equals match (in list)
folders = search.filter(type__in=["folder"])
print('Found {} folders.'.format(len(folders)))

# Date range
objects = search.filter(created_at__range=["2021-11-28","2021-12-28"])
print('Found {} objects.'.format(len(objects)))

Found 4 vaults.
Found 90 folders.
Found 5 objects.
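
The date range in the cell above is a pair of ISO-formatted date strings. Rather than typing them by hand, you could compute them with the standard library. This is a minimal sketch; `date_range` is a hypothetical helper, and the 30-day window is chosen only to reproduce the dates used above.

```python
from datetime import date, timedelta


def date_range(end, days):
    """Hypothetical helper: return [start, end] as ISO date strings."""
    start = end - timedelta(days=days)
    return [start.isoformat(), end.isoformat()]


created_range = date_range(date(2021, 12, 28), days=30)
print(created_range)
```

The resulting list could then be passed as the value of `created_at__range` in `search.filter(...)`.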


You may also combine filters to create more complex searches. The logic described in the [docs for combining filters for dataset querying](https://docs.solvebio.com/datasets/querying/#combining-filters) applies here as well:

In [15]:
# Search for all datasets that were created by the user Nikola
f = Filter(type="dataset") & Filter(user="Nikola")
results = GlobalSearch(filters=f)
print('Found {} objects.'.format(len(results)))

Found 4 objects.


#### Chaining search requests

The following examples show how to chain multiple method calls to perform successive search requests:

In [16]:
s = GlobalSearch()

# Entity search
print("Results:")
for result in s.entity(gene="BRCA2"):
    print("\t" + result.id)

# Get subjects with BRCA2 in public vault
print("Subjects:")
for subject in s.filter(vault="public").entity(gene="BRCA2").subjects():
    print("\t" + subject["subject"])

# Get subjects count with BRCA2 in public vault
subjects_count = s.filter(vault="public").entity(gene="BRCA2").subjects_count()
print("{} subjects found.".format(subjects_count))

# Get all vaults with BRCA2 datasets
print("Facets:")
facets = s.entity(gene="BRCA2").facets("vault")
print(facets)

Results:
	1658666726768179211
	1453602241738607801
	1589830521744205858
Subjects:
	U43746
1 subjects found.
Facets:
{'vault': [['public', 3]]}
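
The facets output above maps a field name to a list of `[value, count]` pairs. If you prefer to look up counts by value, you can flatten one field into a dictionary. The sketch below uses the output shown above as sample data; `facet_counts` is a hypothetical helper, and it assumes that every facet entry is a two-item pair as in that output.

```python
# Sample data copied from the facets output shown above.
facets = {"vault": [["public", 3]]}


def facet_counts(facet_data, field):
    """Hypothetical helper: turn [[value, count], ...] into {value: count}."""
    return {value: count for value, count in facet_data.get(field, [])}


vault_counts = facet_counts(facets, "vault")
print(vault_counts)
```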
