# Exploring The Dimensions Search Language (DSL) - Deep Dive

This tutorial provides a detailed walkthrough of the most important features of the [Dimensions Search Language](https://docs.dimensions.ai/dsl/). 

This tutorial is based on the [Query Syntax](https://docs.dimensions.ai/dsl/language.html) section of the official documentation, so it can be used as an interactive version of that section: you can try out the various DSL queries presented there.

## What is the Dimensions Search Language?

The DSL aims to capture the type of interaction with Dimensions data
that users are accustomed to performing graphically via the [web
application](https://app.dimensions.ai/), and enable web app developers, power users, and others to
carry out such interactions by writing query statements in a syntax
loosely inspired by SQL but particularly suited to our specific domain
and data organization.

**Note:** this notebook uses the Python programming language, but the **DSL queries themselves are not Python-specific** and can be reused with any other API client. 



## Prerequisites

This notebook assumes you have installed the [Dimcli](https://pypi.org/project/dimcli/) library and are familiar with the *Getting Started* tutorial.


In [1]:
!pip install dimcli --quiet 

import dimcli
from dimcli.utils import *
import json
import sys
import pandas as pd
#

print("==\nLogging in..")
# https://digital-science.github.io/dimcli/getting-started.html#authentication
ENDPOINT = "https://app.dimensions.ai"
if 'google.colab' in sys.modules:
  import getpass
  KEY = getpass.getpass(prompt='API Key: ')  
  dimcli.login(key=KEY, endpoint=ENDPOINT)
else:
  KEY = ""
  dimcli.login(key=KEY, endpoint=ENDPOINT)
dsl = dimcli.Dsl()

==
Logging in..
Dimcli - Dimensions API Client (v0.8.2.2)
Connected to: https://app.dimensions.ai - DSL v1.28
Method: dsl.ini file



## Sections Index 

1. Basic query structure
2. Full-text searching
3. Field searching
4. Searching for researchers
5. Returning results 
6. Aggregations

## 1. Basic query structure

DSL queries consist of two required components: a `search` phrase that
indicates the scientific records to be searched, and one or
more `return` phrases which specify the contents and structure of the
desired results.

The simplest valid DSL query is of the form `search <source> return <result>`:

In [2]:
%%dsldf 
search grants return  grants limit 5

Returned Grants: 5 (total = 5623964)
Time: 0.80s


Unnamed: 0,project_num,end_date,active_year,original_title,start_year,id,funding_org_name,title_language,language,title,start_date,funders
0,890218,2023-11-30,"[2021, 2022, 2023]",Functional analysis of ribosome heterogeneity ...,2021,grant.9064785,European Commission,en,en,Functional analysis of ribosome heterogeneity ...,2021-12-01,"[{'id': 'grid.270680.b', 'country_name': 'Belg..."
1,2018-HRSI-1548,,[2021],APPROACH to Enriching the Real World Evidence ...,2021,grant.8690978,New Brunswick Health Research Foundation,en,en,APPROACH to Enriching the Real World Evidence ...,2021-11-30,"[{'id': 'grid.484521.e', 'country_name': 'Cana..."
2,948141,2026-10-31,"[2021, 2022, 2023, 2024, 2025, 2026]",Simulating ultracold correlated quantum matter...,2021,grant.9414093,European Research Council,en,en,Simulating ultracold correlated quantum matter...,2021-11-01,"[{'id': 'grid.452896.4', 'country_name': 'Belg..."
3,887019,2023-10-31,"[2021, 2022, 2023]",The role of microbial Oxylipins in the MIcrobe...,2021,grant.8964187,European Commission,en,en,The role of microbial Oxylipins in the MIcrobe...,2021-11-01,"[{'id': 'grid.270680.b', 'country_name': 'Belg..."
4,AH/V001841/1,2023-04-02,"[2021, 2022, 2023]",Playgrounds: A Material Cultural Study of Post...,2021,grant.9401634,Arts and Humanities Research Council,en,en,Playgrounds: A Material Cultural Study of Post...,2021-10-03,"[{'id': 'grid.426413.6', 'country_name': 'Unit..."


### `search source`

A query must begin with the word `search` followed by a `source` name, i.e. the name of a type of scientific `record`, such as `grants` or `publications`.

**What are the sources available?** See the [data sources](https://docs.dimensions.ai/dsl/data-sources.html) section of the documentation. 

Alternatively, we can use the 'schema' API ([describe](https://docs.dimensions.ai/dsl/data-sources.html#metadata-api)) to return this information programmatically:

In [3]:
dsl.query("describe schema")

<dimcli.DslDataset object #4479630496. Dict keys: 'sources', 'entities'>

A more useful query might also make use of the optional `for` and
`where` phrases to limit the set of records returned.

In [4]:
%%dsldf 
search grants  for "lung cancer" 
    where active_year=2000 
return  grants  limit 5

Returned Grants: 5 (total = 1734)
Time: 0.59s


Unnamed: 0,title_language,funders,original_title,funding_org_name,project_num,id,title,start_date,end_date,language,active_year,start_year
0,en,"[{'id': 'grid.279885.9', 'acronym': 'NHLBI', '...",ROLE OF CD44 ISOFORMS IN ENDOTHELIAL CELL DAMAGE,National Heart Lung and Blood Institute,F32HL010455,grant.2386513,ROLE OF CD44 ISOFORMS IN ENDOTHELIAL CELL DAMAGE,2000-12-31,2002-01-01,en,"[2000, 2001, 2002]",2000
1,en,"[{'id': 'grid.279885.9', 'acronym': 'NHLBI', '...","ESTROGEN, ANGIOGENESIS AND ENDOTHELIAL PROGENI...",National Heart Lung and Blood Institute,R01HL063695,grant.2537116,"ESTROGEN, ANGIOGENESIS AND ENDOTHELIAL PROGENI...",2000-12-18,2004-11-30,en,"[2000, 2001, 2002, 2003, 2004]",2000
2,en,"[{'id': 'grid.279885.9', 'acronym': 'NHLBI', '...",GENETIC ANALYSIS OF EPHRIN-EPH SIGNALING IN AN...,National Heart Lung and Blood Institute,R01HL066221,grant.2537801,GENETIC ANALYSIS OF EPHRIN-EPH SIGNALING IN AN...,2000-12-18,2007-11-30,en,"[2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007]",2000
3,en,"[{'id': 'grid.279885.9', 'acronym': 'NHLBI', '...",Synthetic Heparan Sulfate: Probing Biosynthesi...,National Heart Lung and Blood Institute,R01HL062244,grant.2536777,Synthetic Heparan Sulfate: Probing Biosynthesi...,2000-12-15,2017-12-31,en,"[2000, 2001, 2002, 2003, 2004, 2005, 2006, 200...",2000
4,en,"[{'id': 'grid.419213.c', 'acronym': 'RWJF', 'c...",SmokeLess States Program - Implementation,Robert Wood Johnson Foundation,41067,grant.8616620,SmokeLess States Program - Implementation,2000-12-01,2001-02-28,en,"[2000, 2001]",2000
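
The phrases introduced so far always appear in the same order: `search`, then the optional `for` and `where`, then `return`. As a rough sketch, a query string can be assembled in plain Python before it is passed to `dsl.query`. The `build_query` helper below is hypothetical (it is not part of dimcli), and it assumes the search term needs no escaping and the filter is already a valid DSL expression.

```python
def build_query(source, term=None, where=None, limit=None):
    """Assemble a DSL query string (hypothetical helper, not part of dimcli).

    Assumes `term` contains no characters that need escaping and that
    `where` is already a valid DSL filter expression.
    """
    parts = [f"search {source}"]
    if term is not None:
        parts.append(f'for "{term}"')
    if where is not None:
        parts.append(f"where {where}")
    # This sketch always returns the searched source itself
    parts.append(f"return {source}")
    if limit is not None:
        parts.append(f"limit {limit}")
    return " ".join(parts)


query = build_query("grants", term="lung cancer", where="active_year=2000", limit=5)
print(query)
```

The resulting string can be run with `dsl.query(query)`, using the `dsl` object created in the login cell.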


### `return` result (source or facet)

The most basic `return` phrase consists of the keyword `return` followed
by the name of a `record` or `facet` to be returned. 

This must be the
name of the `source` used in the `search` phrase, or the name of a
`facet` of that source.

In [5]:
%%dsldf
search grants for "laryngectomy" 
return grants limit 5

Returned Grants: 5 (total = 117)
Time: 0.54s


Unnamed: 0,title_language,funders,original_title,funding_org_name,project_num,id,title,start_date,end_date,language,active_year,start_year
0,ja,"[{'id': 'grid.54432.34', 'acronym': 'JSPS', 'c...",喉頭全摘出者の家族の術後生活への移行を促進する外来での生活支援プログラムの開発,Japan Society for the Promotion of Science,20K10777,grant.9201764,Development of an outpatient life support prog...,2020-04-01,2024-03-31,ja,"[2020, 2021, 2022, 2023, 2024]",2020
1,ja,"[{'id': 'grid.54432.34', 'acronym': 'JSPS', 'c...",表情解析ソフトウェアを用いた自己音声再建による新規代用音声の開発,Japan Society for the Promotion of Science,20K18439,grant.9209426,Development of new substitute voice by self-vo...,2020-04-01,2022-03-31,ja,"[2020, 2021, 2022]",2020
2,en,"[{'id': 'grid.453041.7', 'acronym': 'DCS', 'co...",Prophylactic pectoralis major flap to compensa...,Dutch Cancer Society,12483,grant.9387466,Prophylactic pectoralis major flap to compensa...,2020-02-01,2024-02-01,nl,"[2020, 2021, 2022, 2023, 2024]",2020
3,en,"[{'id': 'grid.421091.f', 'acronym': 'EPSRC', '...",UKRI CDT in SLT- Continuous End-to-End Streami...,Engineering and Physical Sciences Research Cou...,2268211,grant.8674095,UKRI CDT in SLT- Continuous End-to-End Streami...,2019-09-29,2023-09-28,en,"[2019, 2020, 2021, 2022, 2023]",2019
4,en,"[{'id': 'grid.214431.1', 'acronym': 'NIDCD', '...",Wearable silent speech technology to enhance i...,National Institute on Deafness and Other Commu...,R01DC016621,grant.8554260,Wearable silent speech technology to enhance i...,2019-08-15,2024-07-31,en,"[2019, 2020, 2021, 2022, 2023, 2024]",2019


For example, let's see which *facets* are available for the *grants* source:

In [6]:
fields = dsl.query("describe schema")['sources']['grants']['fields']
[x for x in fields if fields[x]['is_facet']]

['funder_countries',
 'research_org_state_codes',
 'funding_currency',
 'funding_org_city',
 'funders',
 'active_year',
 'language',
 'research_org_countries',
 'research_org_cities',
 'category_sdg',
 'category_for',
 'category_uoa',
 'funding_org_name',
 'research_orgs',
 'start_year',
 'category_icrp_cso',
 'category_hrcs_hc',
 'language_title',
 'category_bra',
 'researchers',
 'category_hra',
 'funding_org_acronym',
 'category_hrcs_rac',
 'category_icrp_ct',
 'category_rcdc']
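
The same lookup can be wrapped in a small reusable function. This is a minimal sketch: `facet_names` is a hypothetical helper, and `sample_schema` is a tiny hand-written stand-in that only mirrors the `sources` / `fields` / `is_facet` nesting used in the cell above, so the block runs without an API connection.

```python
def facet_names(schema, source):
    """Return the sorted facet field names of `source` from a schema mapping."""
    fields = schema["sources"][source]["fields"]
    return sorted(name for name, info in fields.items() if info.get("is_facet"))


# Hand-written illustration data, NOT the real `describe schema` response
sample_schema = {
    "sources": {
        "grants": {
            "fields": {
                "title": {"is_facet": False},
                "funders": {"is_facet": True},
                "active_year": {"is_facet": True},
            }
        }
    }
}

print(facet_names(sample_schema, "grants"))
```

With a live connection, the result of `dsl.query("describe schema")` would take the place of `sample_schema`. Any name in the returned list can then follow `return` instead of the source name, for example `return funders`.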

## 2. Full-text Searching

Full-text search, or keyword search, finds all instances of a term
(keyword) in a document or group of documents. 

Full-text search works
by using search indexes, each of which can target a specific section of a
document, e.g. its *abstract*, *authors*, *full text*, etc.

In [7]:
%%dsldf 
search publications 
    in full_data for "moon landing" 
return publications limit 5

Returned Publications: 5 (total = 184216)
Time: 1.29s


Unnamed: 0,author_affiliations,pages,id,year,type,title,volume,issue,journal.id,journal.title
0,[[{'raw_affiliation': ['Centro de Investigacio...,105-123,pub.1134005335,2021,article,Participatory Mapping and PGIS: Secerning Fact...,10,3.0,jour.1147177,International Journal of E-Planning Research
1,[[{'raw_affiliation': ['Institute of Chemistry...,101596,pub.1132933847,2021,article,Critical review of chirality indicators of ext...,92,,jour.1138083,New Astronomy Reviews
2,[[{'raw_affiliation': ['Departamento de Matemá...,424-437,pub.1133496951,2021,article,Modeling of Enhanced Micro-Energy Harvesting o...,168,,jour.1048491,Renewable Energy
3,[[{'raw_affiliation': ['School of Chemical Eng...,118238,pub.1133999332,2021,article,Continuous Microfluidic Solvent Extraction of ...,260,,jour.1380465,Separation and Purification Technology
4,[[{'raw_affiliation': ['German Aerospace Cente...,87-97,pub.1133275673,2021,article,Green bipropellant development – A study on th...,226,,jour.1135200,Combustion and Flame


### 2.1 `in [search index]`

This optional phrase consists of the particle `in` followed by a term indicating a `search index`, specifying for example whether the search
is limited to full text, title and abstract only, or title only. 

In [8]:
%%dsldf 
search grants 
    in title_abstract_only for "something" 
return grants limit 5

Returned Grants: 5 (total = 10290)
Time: 0.55s


Unnamed: 0,project_num,end_date,active_year,original_title,start_year,id,funding_org_name,title_language,language,title,start_date,funders
0,890630,2023-08-31,"[2021, 2022, 2023]",Deciphering fundamental constraints on pathoge...,2021,grant.9064570,European Commission,en,en,Deciphering fundamental constraints on pathoge...,2021-09-01,"[{'id': 'grid.270680.b', 'country_name': 'Belg..."
1,949572,2026-01-31,"[2021, 2022, 2023, 2024, 2025, 2026]",Statistical Host Identification As a Test of D...,2021,grant.9385331,European Research Council,en,en,Statistical Host Identification As a Test of D...,2021-02-01,"[{'id': 'grid.452896.4', 'country_name': 'Belg..."
2,NE/V005847/1,2025-01-03,"[2021, 2022, 2023, 2024, 2025]",Sustainable Plastic Attitudes to benefit Commu...,2021,grant.9414560,Natural Environment Research Council,en,en,Sustainable Plastic Attitudes to benefit Commu...,2021-01-04,"[{'id': 'grid.8682.4', 'country_name': 'United..."
3,865624,2025-12-31,"[2021, 2022, 2023, 2024, 2025]",Overcoming stellar activity in radial velocity...,2021,grant.8964099,European Research Council,en,en,Overcoming stellar activity in radial velocity...,2021-01-01,"[{'id': 'grid.452896.4', 'country_name': 'Belg..."
4,895379,2021-12-31,[2021],Cortical-to-Subcortical Information Transfer U...,2021,grant.9065191,European Commission,en,en,Cortical-to-Subcortical Information Transfer U...,2021-01-01,"[{'id': 'grid.270680.b', 'country_name': 'Belg..."


For example, let's see which *search fields* are available for the *grants* source:

In [9]:
dsl.query("describe schema")['sources']['grants']['search_fields']

['concepts', 'investigators', 'full_data', 'title_only', 'title_abstract_only']

In [10]:
%%dsldf 
search grants 
    in full_data for "graphene AND computer AND iron" 
return grants limit 5

Returned Grants: 5 (total = 11)
Time: 0.56s


Unnamed: 0,project_num,end_date,active_year,original_title,start_year,id,funding_org_name,title_language,language,title,start_date,funders
0,19-43-04129,2021-12-31,"[2019, 2020, 2021]",Weyl and Dirac semimetals and beyond - predict...,2019,grant.8413990,Russian Science Foundation,en,en,Weyl and Dirac semimetals and beyond - predict...,2019-01-01,"[{'id': 'grid.454869.2', 'country_name': 'Russ..."
1,18-02-20097,2018-12-31,[2018],Проект организации 18-ой Международной конфере...,2018,grant.8731867,Russian Foundation for Basic Research,ru,ru,Project of the organization of the 18th Intern...,2018-01-01,"[{'id': 'grid.452899.b', 'country_name': 'Russ..."
2,4491/E-370/S/2016,2016-12-31,[2016],Dotacja podmiotowa na utrzymanie potencjału ba...,2016,grant.7397800,Ministry of Science and Higher Education,pl,pl,Subject subsidy for maintaining the research p...,2016-02-22,"[{'id': 'grid.425823.a', 'country_name': 'Pola..."
3,2015/17/B/ST8/01422,2018-01-25,"[2016, 2017, 2018]",Katalizatory heterogeniczne na bazie nanostruk...,2016,grant.7401748,National Science Center,pl,pl,Heterogeneous catalysts based on carbon nanost...,2016-01-26,"[{'id': 'grid.436846.b', 'country_name': 'Pola..."
4,4491/E-370/S/2015,2015-12-31,[2015],Dotacja podmiotowa na utrzymanie potencjału ba...,2015,grant.7397795,Ministry of Science and Higher Education,pl,pl,Subject subsidy for maintaining the research p...,2015-02-19,"[{'id': 'grid.425823.a', 'country_name': 'Pola..."
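
To see how the choice of index changes a search, the same term can be run against several indexes in turn. The sketch below only builds the query strings. The index names come from the `search_fields` list above, while the loop and the `limit 1` are illustrative choices rather than anything prescribed by the documentation.

```python
# Index names taken from the `search_fields` output above
indexes = ["title_only", "title_abstract_only", "full_data"]
term = "graphene AND computer AND iron"

# One query per index; `limit 1` keeps the responses small
queries = {
    index: f'search grants in {index} for "{term}" return grants limit 1'
    for index in indexes
}

for index, q in queries.items():
    print(q)
```

Each string can be passed to `dsl.query(...)`, and the totals reported for each index can then be compared.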


Special search indexes for person names allow full-text
searches on publication `authors` or grant `investigators`. Please see the
*Researchers Search* section below for more information
on how searches work in this case.

In [11]:
%dsldf search publications in authors for "\"Jennifer A Doudna\"" return publications limit 5

Returned Publications: 5 (total = 344)
Time: 1.04s


Unnamed: 0,id,title,year,author_affiliations,type,journal.id,journal.title,volume,issue,pages
0,pub.1133635406,Rapid detection of SARS-CoV-2 with Cas13.,2020,"[[{'raw_affiliation': [], 'first_name': 'Shree...",preprint,jour.1369542,medRxiv,,,
1,pub.1133534923,Controlling and enhancing CRISPR systems,2020,[[{'raw_affiliation': ['Department of Molecula...,article,jour.1327431,Nature Chemical Biology,17.0,1.0,10-19
2,pub.1133449309,IGI-LuNER: single-well multiplexed RT-qPCR tes...,2020,"[[{'raw_affiliation': [], 'first_name': 'Eliza...",preprint,jour.1369542,medRxiv,,,
3,pub.1133370119,Corrigendum to “Engineering of monosized lipid...,2020,[[{'raw_affiliation': ['Department of Chemical...,article,jour.1034525,Acta Biomaterialia,,,
4,pub.1133106825,Amplification-free detection of SARS-CoV-2 wit...,2020,[[{'raw_affiliation': ['J. David Gladstone Ins...,article,jour.1019114,Cell,,,


### 2.2 `for "search term"`

This optional phrase consists of the keyword `for` followed by a
`search term` `string`, enclosed in double quotes (`"`).

Strings in double quotes can contain nested quotes escaped by a
backslash `\`. This will ensure that the string in nested double quotes
is searched for as if it was a single phrase, not multiple words.

An example of a phrase: `"\"Machine Learning\""` : results must contain
`Machine Learning` as a phrase.

In [12]:
%dsldf search publications for "\"Machine Learning\"" return publications limit 5

Returned Publications: 5 (total = 1332770)
Time: 1.01s


Unnamed: 0,title,issue,type,id,author_affiliations,volume,pages,year,journal.id,journal.title
0,K-nearest neighbor and naïve Bayes based diagn...,6.0,article,pub.1130334381,[[{'raw_affiliation': ['Universiti Teknikal Ma...,9,2650-2657,2021,jour.1144063,Bulletin of Electrical Engineering and Informa...
1,Evaluation of Intersection Properties Using MA...,6.0,article,pub.1129037745,"[[{'raw_affiliation': [], 'first_name': 'Görke...",32,,2021,jour.1226975,Teknik Dergi
2,Priority-based low voltage DC microgrid system...,,article,pub.1132780778,[[{'raw_affiliation': ['Department of Electric...,7,43-51,2021,jour.1150945,Energy Reports
3,A Closed-Form Solution to Estimate Spatially V...,99.0,article,pub.1133333464,[[{'raw_affiliation': ['DIFFER-Dutch Institute...,PP,1-1,2021,jour.1293397,IEEE Control Systems Letters
4,Design and Implementation of a Wind Farm Contr...,99.0,article,pub.1133394955,[[{'raw_affiliation': ['Knowledge Exchange Fel...,PP,1-1,2021,jour.1293397,IEEE Control Systems Letters


Example of multiple keywords: `"Machine Learning"` : this searches for
keywords independently.

In [13]:
%dsldf search publications for "Machine Learning" return publications limit 5

Returned Publications: 5 (total = 2712835)
Time: 0.87s


Unnamed: 0,author_affiliations,pages,id,year,type,title,volume,issue,journal.id,journal.title
0,[[{'raw_affiliation': ['Universiti Teknikal Ma...,2650-2657,pub.1130334381,2021,article,K-nearest neighbor and naïve Bayes based diagn...,9,6.0,jour.1144063,Bulletin of Electrical Engineering and Informa...
1,[[{'raw_affiliation': ['International Universi...,100034,pub.1133596998,2021,article,Blockchain in international e-government proce...,3,,jour.1367388,Research in Globalization
2,"[[{'raw_affiliation': [], 'first_name': 'Görke...",,pub.1129037745,2021,article,Evaluation of Intersection Properties Using MA...,32,6.0,jour.1226975,Teknik Dergi
3,[[{'raw_affiliation': ['Department of Electric...,43-51,pub.1132780778,2021,article,Priority-based low voltage DC microgrid system...,7,,jour.1150945,Energy Reports
4,[[{'raw_affiliation': ['Engineering Research C...,70-80,pub.1132837792,2021,article,A novel optimum arrangement for a hybrid renew...,7,,jour.1150945,Energy Reports


Note: Special characters, such as any of `^ " : ~ \ [ ] { } ( ) ! | & +` must be escaped by a backslash `\`. Also, please note escaping rules in
[Python](http://python-reference.readthedocs.io/en/latest/docs/str/escapes.html) (or other languages). For example, when writing a query with escaped quotes, such as `search publications for "\"phrase 1\" AND \"phrase 2\""`, in Python, it is necessary to escape the backslashes as well, so it
would look like: `'search publications for "\\"phrase 1\\" AND \\"phrase 2\\""'`. 

See the [official docs](https://docs.dimensions.ai/dsl/language.html#for-search-term) for more details.
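
The Python escaping rule can be checked locally without calling the API. The short sketch below writes the same query once with doubled backslashes and once as a raw string, which is often easier to read.

```python
# Regular string literal: every backslash meant for the DSL must be doubled
escaped = 'search publications for "\\"phrase 1\\" AND \\"phrase 2\\""'

# Raw string literal: backslashes are kept exactly as typed
raw = r'search publications for "\"phrase 1\" AND \"phrase 2\""'

print(escaped)
print(escaped == raw)
```

Both literals produce the same text, so either can be passed to `dsl.query`. Queries typed directly after the `%dsldf` magic, as in the cells above, are written with single backslashes.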

### 2.3 Boolean Operators

A search term can consist of multiple keywords or phrases connected using
Boolean logic operators, e.g. `AND`, `OR` and `NOT`.

In [14]:
%dsldf search publications for "(dose AND concentration)" return publications limit 5

Returned Publications: 5 (total = 5559328)
Time: 1.34s


Unnamed: 0,author_affiliations,pages,id,year,type,title,volume,journal.id,journal.title
0,[[{'raw_affiliation': ['Department of Pharmace...,100069,pub.1134000145,2021,article,On the relationship between blend state and di...,3,jour.1365491,International Journal of Pharmaceutics X
1,[[{'raw_affiliation': ['Department of Clinical...,100017,pub.1133580772,2021,article,Malaria parasites and circadian rhythm: New in...,2,jour.1388648,Current Research in Microbial Sciences
2,"[[{'raw_affiliation': ['Chemistry Department, ...",100031,pub.1134186230,2021,article,Nanocellulose: A mini-review on types and use ...,2,jour.1390798,Carbohydrate Polymer Technologies and Applicat...
3,[[{'raw_affiliation': ['Section of Molecular H...,100010,pub.1131161031,2021,article,PMAP-36 reduces the innate immune response ind...,2,jour.1388648,Current Research in Microbial Sciences
4,[[{'raw_affiliation': ['Department of Biotechn...,100018,pub.1133594439,2021,article,Bioprospecting of cowdung microflora for susta...,2,jour.1388648,Current Research in Microbial Sciences


When specifying Boolean operators with keywords such as `AND`, `OR` and
`NOT`, the keywords must appear in all uppercase. 

The available operators are shown in the table below.

| Boolean Operator | Alternative Symbol | Description                                                                                                                                                                 |
|------------------|--------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| `AND`            | `&&`               | Requires both terms on either side of the Boolean operator to be present for a match.                                                                                       |
| `NOT`            | `!`                | Requires that the following term not be present.                                                                                                                            |
| `OR`             | `||`               | Requires that either term (or both terms) be present for a match.                                                                                                           |
|                  | `+`                | Requires that the following term be present.                                                                                                                                |
|                  | `-`                | Prohibits the following term (that is, matches on fields or documents that do not include that term). The `-` operator is functionally similar to the Boolean operator `!`. |

In [15]:
%dsldf search publications for "(dose OR concentration) AND (-malaria +africa)" return publications limit 5

Returned Publications: 5 (total = 1468851)
Time: 1.07s


Unnamed: 0,author_affiliations,pages,id,year,type,title,volume,journal.id,journal.title,issue
0,[[{'raw_affiliation': ['Institute of Animal Pr...,100014,pub.1132941183,2021,article,A quadruple protection procedure for resuming ...,2.0,jour.1388648,Current Research in Microbial Sciences,
1,[[{'raw_affiliation': ['Joint Global Change Re...,100023,pub.1133100081,2021,article,Quantifying the reductions in mortality from a...,2.0,,,
2,[[{'raw_affiliation': ['Niger Delta University...,1-15,pub.1133401690,2021,article,Performance Evaluation of Adopting the Electro...,6.0,jour.1355738,International Journal of Applied Research on P...,2.0
3,,,pub.1107030350,2021,book,The Wiley Handbook of Home Education,,,,
4,[[{'raw_affiliation': ['National-Regional Join...,387-398,pub.1134184334,2021,article,Root-associated (rhizosphere and endosphere) m...,104.0,jour.1297326,Journal of Environmental Sciences,


Combining keywords and Boolean operators makes it possible to construct rather sophisticated queries. For example, here's a real-world query used to extract publications related to COVID-19. 

In [16]:
q_inner = """ "2019-nCoV" OR "COVID-19" OR "SARS-CoV-2" OR "HCoV-2019" OR "hcov" OR "NCOVID-19" OR  
    "severe acute respiratory syndrome coronavirus 2" OR "severe acute respiratory syndrome corona virus 2" 
    OR (("coronavirus"  OR "corona virus") AND (Wuhan OR China OR novel)) """

# tip: dsl_escape is a dimcli utility function for escaping special characters 
q_outer = f"""search publications in full_data for "{dsl_escape(q_inner)}" return publications"""
print(q_outer)

dsl.query(q_outer)

search publications in full_data for " \"2019-nCoV\" OR \"COVID-19\" OR \"SARS-CoV-2\" OR \"HCoV-2019\" OR \"hcov\" OR \"NCOVID-19\" OR  
    \"severe acute respiratory syndrome coronavirus 2\" OR \"severe acute respiratory syndrome corona virus 2\" 
    OR ((\"coronavirus\"  OR \"corona virus\") AND (Wuhan OR China OR novel)) " return publications
Returned Publications: 20 (total = 326360)
Time: 2.04s


<dimcli.DslDataset object #4793105568. Records: 20/326360>
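
Long Boolean expressions like the one above can also be generated from lists of terms, which keeps them easier to maintain. The sketch below is one possible approach under stated assumptions: `quote`, `any_of`, `all_of` and `escape_quotes` are hypothetical helpers, and `escape_quotes` is a deliberately simplified stand-in that escapes only double quotes. It is not dimcli's `dsl_escape`.

```python
def quote(term):
    """Wrap a term in double quotes so it is searched as a phrase."""
    return f'"{term}"'


def any_of(terms):
    """Join quoted terms with OR inside parentheses."""
    return "(" + " OR ".join(quote(t) for t in terms) + ")"


def all_of(groups):
    """Join already-built sub-expressions with AND."""
    return " AND ".join(groups)


def escape_quotes(expr):
    """Simplified stand-in for an escaping utility: escapes double quotes only."""
    return expr.replace('"', '\\"')


inner = all_of([
    any_of(["coronavirus", "corona virus"]),
    any_of(["Wuhan", "China", "novel"]),
])
query = f'search publications in full_data for "{escape_quotes(inner)}" return publications'
print(query)
```

In a real notebook, `dsl_escape` from `dimcli.utils` would be the natural choice for the escaping step, as in the cell above.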

### 2.4 Wildcard Searches

The DSL supports single and multiple character wildcard searches within
single terms. Wildcard characters can be applied to single terms, but
not to search phrases.

In [17]:
%dsldf search publications in title_only for "ital? malaria" return publications limit 5

Returned Publications: 5 (total = 148)
Time: 0.96s


Unnamed: 0,id,title,year,author_affiliations,type,journal.id,journal.title,volume,issue,pages
0,pub.1133261890,Artemisinin resistant surveillance in African ...,2020,[[{'raw_affiliation': ['Department of Infectio...,article,jour.1112262,Journal of Travel Medicine,,,
1,pub.1132438137,Does Living in Previously Exposed Malaria or W...,2020,,article,jour.1278986,Biointerface Research in Applied Chemistry,11.0,2.0,9744-9748
2,pub.1130290794,A Cluster of Cryptic Plasmodium falciparum Mal...,2020,[[{'raw_affiliation': ['Operative Unit of Infe...,article,jour.1023805,Vector-Borne and Zoonotic Diseases,20.0,12.0,927-931
3,pub.1128245696,Non-imported malaria in Italy: paradigmatic ap...,2020,[[{'raw_affiliation': ['Dipartimento Malattie ...,article,jour.1024954,BMC Public Health,20.0,1.0,857
4,pub.1124231018,"Seasons in Italy: Northern European travelers,...",2020,[[{'raw_affiliation': ['Carnegie Mellon Univer...,article,jour.1141817,Journal of Tourism and Cultural Change,,,1-20


In [18]:
%dsldf search publications in title_only for "it* malaria" return publications limit 5

Returned Publications: 5 (total = 1593)
Time: 0.79s


Unnamed: 0,author_affiliations,id,year,type,title,volume,journal.id,journal.title,pages
0,[[{'raw_affiliation': ['Department of Chemistr...,pub.1133719264,2020,article,S-Adenosyl-L-homocysteine hydrolase: Its Inhib...,21.0,jour.1029063,Mini-Reviews in Medicinal Chemistry,
1,"[[{'raw_affiliation': [], 'first_name': 'P. A....",pub.1133730474,2020,article,Knowledge of Malaria and Utilization of Its Pr...,,jour.1050160,International Journal of TROPICAL DISEASE & He...,34-46
2,[[{'raw_affiliation': ['Department of Parasito...,pub.1133560867,2020,article,Case Report: The First Case of Genotypically C...,,jour.1017021,American Journal of Tropical Medicine and Hygiene,
3,"[[{'raw_affiliation': [], 'first_name': 'Zacch...",pub.1133512831,2020,article,Bio-Fabrication of ZnO-CuO Nanoporous Composit...,,jour.1050059,Journal of Pharmaceutical Research International,31-39
4,[[{'raw_affiliation': ['Department of Infectio...,pub.1133261890,2020,article,Artemisinin resistant surveillance in African ...,,jour.1112262,Journal of Travel Medicine,


| Wildcard Search Type                                             | Special Character | Example                                                                                                                                                                                                                         |
|------------------------------------------------------------------|-------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| Single character - matches a single character                    | `?`               | The search string `te?t` would match both `test` and `text`.                                                                                                                                                                    |
| Multiple characters - matches zero or more sequential characters | `*`               | The wildcard search: `tes*` would match `test`, `testing`, and `tester`. You can also use wildcard characters in the middle of a term. For example: `te*t` would match `test` and `text`. `*est` would match `pest` and `test`. |
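
The two wildcard characters in the table behave much like shell-style patterns, so their effect on single terms can be tried out locally. The sketch below uses Python's standard `fnmatch` module purely as an analogy. It does not reproduce how Dimensions tokenizes or matches text, and the word list is made up for the illustration.

```python
from fnmatch import fnmatchcase

# Made-up vocabulary for the illustration
words = ["test", "text", "testing", "tester", "pest", "toast"]

# `?` matches exactly one character, `*` matches zero or more characters
results = {
    pattern: [w for w in words if fnmatchcase(w, pattern)]
    for pattern in ["te?t", "tes*", "te*t", "*est"]
}

for pattern, matches in results.items():
    print(pattern, "->", matches)
```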

### 2.5 Proximity Searches

A proximity search looks for terms that are within a specific distance
from one another.

To perform a proximity search, add the tilde character `~` and a numeric
value to the end of a search phrase. For example, to search for
`formal` and `model` within 10 words of each other in a document, use
the search:

In [19]:
%dsldf search publications for "\"formal model\"~10" return publications limit 5

Returned Publications: 5 (total = 504755)
Time: 2.02s


Unnamed: 0,title,issue,type,id,author_affiliations,volume,pages,year,journal.id,journal.title
0,Formal Analysis of Database Trigger Systems Us...,4.0,article,pub.1133001578,[[{'raw_affiliation': ['Hanoi University of Mi...,9.0,1-16,2021,jour.1148037,International Journal of Software Innovation
1,Performance Evaluation of Adopting the Electro...,2.0,article,pub.1133401690,[[{'raw_affiliation': ['Niger Delta University...,6.0,1-15,2021,jour.1355738,International Journal of Applied Research on P...
2,The Wiley Handbook of Home Education,,book,pub.1107030350,,,,2021,,
3,Uniform parsing for hyperedge replacement gram...,,article,pub.1132947815,[[{'raw_affiliation': ['Department of Computin...,118.0,1-27,2021,jour.1124986,Journal of Computer and System Sciences
4,Revisiting positive affect and reward influenc...,,article,pub.1134185830,[[{'raw_affiliation': ['Department of Psycholo...,39.0,27-33,2021,jour.1050888,Current Opinion in Behavioral Sciences


In [20]:
%dsldf search publications for "\"digital humanities\"~5  +ontology" return publications limit 5

Returned Publications: 5 (total = 10394)
Time: 1.18s


Unnamed: 0,title,issue,type,id,author_affiliations,volume,pages,year,journal.id,journal.title
0,Participatory Mapping and PGIS: Secerning Fact...,3.0,article,pub.1134005335,[[{'raw_affiliation': ['Centro de Investigacio...,10,105-123,2021,jour.1147177,International Journal of E-Planning Research
1,A neural Entity Coreference Resolution review,,article,pub.1133368057,"[[{'raw_affiliation': ['School of Informatics,...",168,114466,2021,jour.1128045,Expert Systems with Applications
2,Generating knowledge graphs by employing Natur...,,article,pub.1132205206,[[{'raw_affiliation': ['Department of Mathemat...,116,253-264,2021,jour.1125399,Future Generation Computer Systems
3,Visualizing stemming techniques on online news...,1.0,article,pub.1131619978,[[{'raw_affiliation': ['Universiti Teknologi M...,10,365-373,2021,jour.1144063,Bulletin of Electrical Engineering and Informa...
4,Ontology learning: Grand tour and challenges,,article,pub.1133579238,"[[{'raw_affiliation': ['NLP, Machine Learning ...",39,100339,2021,jour.1147019,Computer Science Review


The distance referred to here is the number of term movements needed to match the specified phrase.  
In the `"formal model"~10` example above, if `formal` and `model` were 10 positions apart in a
field, but `model` appeared before `formal`, more than 10 term movements
would be required to bring the terms together and place `formal` to
the left of `model` with a space in between.
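
When proximity queries are generated from Python, the nested quotes and the `~` suffix are easy to get wrong. A minimal sketch of a helper that builds the search term follows. The `proximity` function is hypothetical and assumes the phrase itself contains no characters that need escaping.

```python
def proximity(phrase, distance):
    """Build a proximity search term: an escaped quoted phrase followed by ~distance."""
    if distance < 0:
        raise ValueError("distance must be non-negative")
    # The backslash-escaped quotes mark `phrase` as a nested phrase
    return f'\\"{phrase}\\"~{distance}'


term = proximity("formal model", 10)
query = f'search publications for "{term}" return publications limit 5'
print(query)
```

The printed query is the same text used in the first proximity cell above, so it can be passed to `dsl.query(query)` unchanged.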

## 3. Field Searching

Field searching allows you to use a specific `field` of a `source` as a
query filter. For example, this can be a
[literal](supported-types.ipynb) field such as the *type* of a
publication, its *date*, *mesh terms*, etc. Or it can be an
[entity](data-entities.ipynb) field, such as the *journal title* for a
publication, the *country name* of its author affiliations, etc.

**What are the fields available for each source?** See the [data sources](https://docs.dimensions.ai/dsl/data-sources.html) section of the documentation. 

Alternatively, we can use the 'schema' API ([describe](https://docs.dimensions.ai/dsl/data-sources.html#metadata-api)) to return this information programmatically: 

In [21]:
%dsldocs publications  

Unnamed: 0,sources,field,type,description,is_filter,is_entity,is_facet
0,publications,abstract,string,The publication abstract.,False,False,False
1,publications,acknowledgements,string,The acknowledgements section text as found in ...,False,False,False
2,publications,altmetric,float,Altmetric attention score.,True,False,False
3,publications,altmetric_id,integer,AltMetric Publication ID,True,False,False
4,publications,authors,json,Ordered list of authors names and their affili...,True,False,False
5,publications,book_doi,string,The DOI of the book a chapter belongs to (note...,True,False,False
6,publications,book_series_title,string,"The title of the book series book, belong to.",False,False,False
7,publications,book_title,string,The title of the book a chapter belongs to (no...,False,False,False
8,publications,category_bra,categories,`Broad Research Areas <https://dimensions.fres...,True,True,True
9,publications,category_for,categories,`ANZSRC Fields of Research classification <htt...,True,True,True
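
Once the schema is available as rows, it is easy to pick out the fields that can be used in a `where` clause. The sketch below is a plain-Python illustration only: the rows are copied by hand from the output above (keeping three of the columns), and `filter_fields` is a hypothetical helper, not part of Dimcli.

```python
# Rows copied by hand from the schema output above (subset of columns).
SCHEMA_ROWS = [
    {"field": "abstract", "type": "string", "is_filter": False},
    {"field": "altmetric", "type": "float", "is_filter": True},
    {"field": "authors", "type": "json", "is_filter": True},
    {"field": "book_title", "type": "string", "is_filter": False},
    {"field": "category_for", "type": "categories", "is_filter": True},
]


def filter_fields(rows):
    """Return the names of the fields flagged as usable in a filter."""
    return [row["field"] for row in rows if row["is_filter"]]


print(filter_fields(SCHEMA_ROWS))
```

Fields with `is_filter` set to `False`, such as `abstract`, are searched through the full-text indices rather than through `where`.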


### 3.1 `where`

This optional phrase consists of the keyword `where` followed by a
`filters` phrase consisting of DSL filter expressions, as described
below.

In [22]:
%dsldf search publications where type = "book" return publications limit 5

Returned Publications: 5 (total = 445353)
Time: 0.52s


Unnamed: 0,id,year,type,title
0,pub.1132813748,2021,book,Advances in Hospitality and Leisure
1,pub.1120755114,2021,book,Wege zum Werk als Sinngeschehen
2,pub.1132180633,2021,book,De animae quantitate
3,pub.1096537870,2021,book,Handbook of Autism
4,pub.1098573300,2021,book,Metalloproteins and Metalloenzymes


If a `for` phrase is also used in a filtered query, the
system will first apply the filters, and then search the resulting
restricted set of documents for the `search term`.

In [23]:
%dsldf search publications for "malaria" where type = "book" return publications limit 5

Returned Publications: 5 (total = 18783)
Time: 0.55s


Unnamed: 0,volume,id,title,year,type
0,217.0,pub.1132680288,"Viruses and Human Cancer, From Basic Science t...",2021,book
1,146.0,pub.1131964912,"Intelligent Computing and Networking, Proceedi...",2021,book
2,1255.0,pub.1132865116,Proceedings of International Conference on Fro...,2021,book
3,,pub.1131634515,Trauma Induced Coagulopathy,2021,book
4,,pub.1132931048,"Back Pain in the Young Child and Adolescent, A...",2021,book


### 3.2 `in`

For convenience, the DSL also supports shorthand notation for filters
where a particular field should be restricted to a specified range or
list of values (although the same logic may be expressed using complex
filters as shown below).

Syntax: a **range filter** consists of the `field` name, the keyword `in`, and a
range of values enclosed in square brackets (`[]`), where the range
consists of a `low` value, colon `:`, and a `high` value.

In [24]:
%%dsldf 
search grants 
    for "malaria" 
    where start_year in [ 2010 : 2015 ] 
return grants limit 5

Returned Grants: 5 (total = 3157)
Time: 0.67s


Unnamed: 0,project_num,end_date,active_year,original_title,start_year,id,funding_org_name,title_language,language,title,start_date,funders
0,R21AI120981,2017-11-30,"[2015, 2016, 2017]",Bloodborne tropical pathogen detection using m...,2015,grant.4729738,National Institute of Allergy and Infectious D...,en,en,Bloodborne tropical pathogen detection using m...,2015-12-28,"[{'id': 'grid.419681.3', 'country_name': 'Unit..."
1,R21AI120973,2019-02-28,"[2015, 2016, 2017, 2018, 2019]",Field-deployable Assay for Differential Diagno...,2015,grant.4729736,National Institute of Allergy and Infectious D...,en,en,Field-deployable Assay for Differential Diagno...,2015-12-24,"[{'id': 'grid.419681.3', 'country_name': 'Unit..."
2,R21AI109439,2018-11-30,"[2015, 2016, 2017, 2018]",T cell driven antigen discovery for vaccine ca...,2015,grant.4729699,National Institute of Allergy and Infectious D...,en,en,T cell driven antigen discovery for vaccine ca...,2015-12-21,"[{'id': 'grid.419681.3', 'country_name': 'Unit..."
3,91488,2018-12-18,"[2015, 2016, 2017, 2018]",Senior Fellowship for Dr. Eduardo Samo Gudo: E...,2015,grant.4854433,Volkswagen Foundation,en,en,Senior Fellowship for Dr. Eduardo Samo Gudo: E...,2015-12-18,"[{'id': 'grid.452969.5', 'country_name': 'Germ..."
4,,2019-09-30,"[2015, 2016, 2017, 2018, 2019]","Biology, Ecology & Management of Emerging Dise...",2015,grant.8821176,National Institute of Food and Agriculture,en,en,"Biology, Ecology & Management of Emerging Dise...",2015-12-10,"[{'id': 'grid.482914.2', 'country_name': 'Unit..."


Syntax: a **list filter** consists of the `field` name, the keyword `in`, and a list
of one or more `value`s enclosed in square brackets (`[]`), where
values are separated by commas (`,`):

In [25]:
%%dsldf 
search grants 
    for "malaria" 
    where research_org_name in [ "UC Berkeley", "UC Davis", "UCLA"  ] 
return grants limit 5

Returned Grants: 0
Time: 0.70s
Field 'research_org_name' is deprecated in favor of research_orgs. Please refer to https://docs.dimensions.ai/dsl/releasenotes.html for more details
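
When queries are assembled in Python, both forms of the `in` filter can be generated from ordinary values. The following is a minimal sketch, assuming the filter text is later embedded in a full query string; `range_filter` and `list_filter` are hypothetical helper names, not Dimcli functions.

```python
import json


def range_filter(field, low, high):
    """Build a range filter such as: start_year in [ 2010 : 2015 ]."""
    return f"{field} in [ {low} : {high} ]"


def list_filter(field, values):
    """Build a list filter such as: field in [ "a", "b" ]."""
    # json.dumps wraps strings in double quotes and backslash-escapes any
    # embedded quotes; integers are left bare.
    quoted = ", ".join(json.dumps(v, ensure_ascii=False) for v in values)
    return f"{field} in [ {quoted} ]"


query = (
    'search grants for "malaria" '
    f"where {range_filter('start_year', 2010, 2015)} "
    "return grants limit 5"
)
print(query)
print(list_filter("research_org_name", ["UC Berkeley", "UC Davis", "UCLA"]))
```

The first printed line reproduces the range query used earlier in this section; the second reproduces the list filter from the cell above.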


### 3.3 `count` - filter function

The filter function `count` is supported on some fields in
[publications](publications.ipynb) (e.g. `researchers` and
`research_orgs`).

Use of this filter is shown in the example below:

In [26]:
%%dsldf 
search publications 
    for "malaria" 
    where count(research_orgs) > 5 
return research_orgs limit 5

Returned Research_orgs: 5
Time: 0.80s


Unnamed: 0,id,count,country_name,name,latitude,city_name,linkout,longitude,state_name,types,acronym
0,grid.4991.5,1920,United Kingdom,University of Oxford,51.753437,Oxford,[http://www.ox.ac.uk/],-1.25401,Oxfordshire,[Education],
1,grid.8991.9,1643,United Kingdom,London School of Hygiene & Tropical Medicine,51.5209,London,[http://www.lshtm.ac.uk/],-0.1307,Camden,[Education],LSHTM
2,grid.38142.3c,1374,United States,Harvard University,42.377052,Cambridge,[http://www.harvard.edu/],-71.11665,Massachusetts,[Education],
3,grid.21107.35,900,United States,Johns Hopkins University,39.328888,Baltimore,[https://www.jhu.edu/],-76.62028,Maryland,[Education],JHU
4,grid.7445.2,895,United Kingdom,Imperial College London,51.4986,London,[http://www.imperial.ac.uk/],-0.175478,Westminster,[Education],


Publications about malaria with more than 50 researchers:

In [27]:
%%dsldf 
search publications 
    for "malaria" 
    where count(researchers) > 50 
return publications limit 5

Returned Publications: 5 (total = 212)
Time: 0.80s


Unnamed: 0,title,type,id,author_affiliations,year,journal.id,journal.title,issue,volume,pages
0,Detection of SARS-CoV-2 N-antigen in blood dur...,article,pub.1133366554,"[[{'raw_affiliation': ['Université de Paris, I...",2020,jour.1114775,Clinical Microbiology and Infection,,,
1,Multiomics Characterization of Preterm Birth i...,article,pub.1133588553,[[{'raw_affiliation': ['Department of Pediatri...,2020,jour.1321825,JAMA Network Open,12.0,3.0,e2029655
2,Ensembl 2021,article,pub.1131509875,[[{'raw_affiliation': ['European Molecular Bio...,2020,jour.1018982,Nucleic Acids Research,,,gkaa942-
3,Detection of neutralising antibodies to SARS-C...,article,pub.1131972881,"[[{'raw_affiliation': ['Department of Zoology,...",2020,jour.1021032,Eurosurveillance,42.0,25.0,2000685
4,Five insights from the Global Burden of Diseas...,article,pub.1131794493,"[[{'raw_affiliation': [], 'first_name': 'GBD 2...",2020,jour.1077219,The Lancet,10258.0,396.0,1135-1159


Number of publications with more than one researcher, broken down by funder:

In [28]:
%%dsldf 
search publications
where count(researchers) > 1
return funders limit 5

Returned Funders: 5
Time: 1.97s


Unnamed: 0,id,count,country_name,acronym,longitude,types,city_name,linkout,name,latitude,state_name
0,grid.419696.5,1979524,China,NSFC,116.33983,[Government],Beijing,[http://www.nsfc.gov.cn/publish/portal1/],National Natural Science Foundation of China,40.005177,
1,grid.270680.b,725673,Belgium,EC,4.36367,[Government],Brussels,[http://ec.europa.eu/index_en.htm],European Commission,50.85165,
2,grid.424020.0,622277,China,MOST,116.316284,[Government],Beijing,[http://www.most.gov.cn/eng/],Ministry of Science and Technology of the Peop...,39.827835,
3,grid.54432.34,570374,Japan,JSPS,139.74039,[Nonprofit],Tokyo,[http://www.jsps.go.jp/],Japan Society for the Promotion of Science,35.68716,
4,grid.48336.3a,559674,United States,NCI,-77.10119,[Government],Rockville,[http://www.cancer.gov/],National Cancer Institute,39.004326,Maryland


International collaborations: number of publications with more than one author and affiliations located in more than one country.

In [29]:
%%dsldf 
search publications
where count(researchers) > 1
and count(research_org_countries) > 1
return funders limit 5

Returned Funders: 5
Time: 1.01s


Unnamed: 0,id,count,country_name,name,latitude,city_name,linkout,longitude,acronym,types
0,grid.419696.5,481660,China,National Natural Science Foundation of China,40.005177,Beijing,[http://www.nsfc.gov.cn/publish/portal1/],116.33983,NSFC,[Government]
1,grid.270680.b,371757,Belgium,European Commission,50.85165,Brussels,[http://ec.europa.eu/index_en.htm],4.36367,EC,[Government]
2,grid.424150.6,164661,Germany,German Research Foundation,50.69934,Bonn,[http://www.dfg.de/en/],7.147797,DFG,[Facility]
3,grid.424020.0,156442,China,Ministry of Science and Technology of the Peop...,39.827835,Beijing,[http://www.most.gov.cn/eng/],116.316284,MOST,[Government]
4,grid.54432.34,145633,Japan,Japan Society for the Promotion of Science,35.68716,Tokyo,[http://www.jsps.go.jp/],139.74039,JSPS,[Nonprofit]


Domestic collaborations: number of publications with more than one author and all affiliations located in exactly one country.

In [30]:
%%dsldf 
search publications
where count(researchers) > 1
and count(research_org_countries) = 1
return funders limit 5

Returned Funders: 5
Time: 2.70s


Unnamed: 0,id,count,country_name,name,latitude,city_name,linkout,longitude,acronym,types,state_name
0,grid.419696.5,1455454,China,National Natural Science Foundation of China,40.005177,Beijing,[http://www.nsfc.gov.cn/publish/portal1/],116.33983,NSFC,[Government],
1,grid.424020.0,454801,China,Ministry of Science and Technology of the Peop...,39.827835,Beijing,[http://www.most.gov.cn/eng/],116.316284,MOST,[Government],
2,grid.48336.3a,409234,United States,National Cancer Institute,39.004326,Rockville,[http://www.cancer.gov/],-77.10119,NCI,[Government],Maryland
3,grid.54432.34,390452,Japan,Japan Society for the Promotion of Science,35.68716,Tokyo,[http://www.jsps.go.jp/],139.74039,JSPS,[Nonprofit],
4,grid.270680.b,336357,Belgium,European Commission,50.85165,Brussels,[http://ec.europa.eu/index_en.htm],4.36367,EC,[Government],
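
The logic behind the last two queries can be mirrored locally to make the distinction explicit. This is only a sketch: the input records are hypothetical, and it assumes that `count` is taken over distinct values, which the text above does not state.

```python
def collaboration_type(researcher_ids, countries):
    """Classify one publication the way the two queries above do.

    researcher_ids: researcher IDs attached to the publication
    countries: country names of the affiliated research organizations
    """
    if len(set(researcher_ids)) <= 1:
        return None  # fails count(researchers) > 1
    n_countries = len(set(countries))
    if n_countries > 1:
        return "international"  # count(research_org_countries) > 1
    if n_countries == 1:
        return "domestic"  # count(research_org_countries) = 1
    return None  # no country information available


# Hypothetical records, for illustration only.
print(collaboration_type(["ur.1", "ur.2"], ["Japan", "Germany"]))
print(collaboration_type(["ur.1", "ur.2"], ["Japan", "Japan"]))
print(collaboration_type(["ur.1"], ["Japan"]))
```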


### 3.4 Filter Operators

A simple filter expression consists of a `field` name, an in-/equality
operator `op`, and the desired field `value`. 

The `value` must be a
`string` enclosed in double quotes (`"`) or an integer (e.g. `1234`).

The available operators are:

| `op`           | meaning                                                                                  |
|----------------|------------------------------------------------------------------------------------------|
| `=`            | *is* (or *contains* if the given `field` is multi-value)                                 |
| `!=`           | *is not*                                                                                 |
| `>`            | *is greater than*                                                                        |
| `<`            | *is less than*                                                                           |
| `>=`           | *is greater than or equal to*                                                            |
| `<=`           | *is less than or equal to*                                                               |
| `~`            | *partially matches* (see section 3.5 below)                                              |
| `is empty`     | *is empty* (see section 3.6 below)                                                       |
| `is not empty` | *is not empty* (see section 3.6 below)                                                   |

A couple of examples:

In [31]:
%dsldf search datasets where year > 2010 and year < 2012 return datasets limit 5

Returned Datasets: 5 (total = 155827)
Time: 0.61s


Unnamed: 0,id,keywords,year,title,authors,journal.id,journal.title
0,dataset.8857124,[Composable Chemiosmotic Energy Transduction R...,2011,Electron Transport Chain (ETC).,"[{'name': 'Ivan Chang'}, {'name': 'Margit Heis...",jour.1037553,PLoS ONE
1,dataset.8861504,"[tandem mass spectrometry, bile duct obstructi...",2011,Relative abundance of taurine- (A) and glycine...,"[{'name': 'Jocelyn Trottier'}, {'name': 'Andrz...",jour.1037553,PLoS ONE
2,dataset.8857589,"[Superantigen Toxicity Clarifying, Mechanism]",2011,A cartoon of what might be happening at the im...,[{'name': 'John D. Fraser'}],jour.1032549,PLOS Biology
3,dataset.8861444,"[tandem mass spectrometry, bile duct obstructi...",2011,"Relative abundance of primary (A & D), seconda...","[{'name': 'Jocelyn Trottier'}, {'name': 'Andrz...",jour.1037553,PLoS ONE
4,dataset.8857670,"[Superantigen Toxicity Clarifying, Mechanism]",2011,The model structures of four trimer complexes ...,[{'name': 'John D. Fraser'}],jour.1032549,PLOS Biology


In [32]:
%dsldf search patents where assignees != "grid.410484.d" return patents limit 5

Returned Patents: 5 (total = 54549156)
Time: 0.84s


Unnamed: 0,publication_date,id,title,year,times_cited,inventor_names,assignee_names,filing_status,assignees,granted_year
0,1958-11-13,DE-1043324-B,A method for the isomerization of 4substituier...,1954,0,"[BAIN JOSEPH PAUL, KLEIN EUGENE ALBERT, HUNT H...","[Glidden Co, Akzo Nobel Paints LLC]",Application,,
1,2011-03-17,WO-2011032015-A1,CONCURRENT WIRELESS TRANSMITTER MAPPING AND MO...,2010,0,"[GARIN LIONEL JACQUES, DO JU-YONG, ZHANG GENGS...","[GARIN LIONEL JACQUES, QUALCOMM INC, DO JU-YON...",Application,"[{'id': 'grid.430388.4', 'country_name': 'Unit...",
2,2012-05-08,US-8173093-B2,Iron silicide sputtering target and method for...,2005,2,"[ODA KUNIHIRO, SUZUKI RYO]",[JX Nippon Mining and Metals Corp],Grant,"[{'id': 'grid.497092.1', 'country_name': 'Japa...",2012.0
3,2004-03-01,CA-100082-S,STATUE,2002,0,,"[Henri Studio Inc, HENRI STUDIO INC]",Grant,,2004.0
4,1971-01-14,DE-1935222-A1,Device for automatic speiseresteabraeumung in ...,1969,0,[P CONRADI BENNO],[KUEPPERSBUSCH],Application,,
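
The value rules above (strings in double quotes, integers bare, `~` only with strings) are easy to get wrong when filters are built from variables. The sketch below encodes them in a hypothetical helper, `simple_filter`; it covers the binary operators from the table and leaves out `is empty` / `is not empty`, which take no value.

```python
OPERATORS = {"=", "!=", ">", "<", ">=", "<=", "~"}


def simple_filter(field, op, value):
    """Format `field op value`, quoting strings and leaving integers bare."""
    if op not in OPERATORS:
        raise ValueError(f"unsupported operator: {op!r}")
    if isinstance(value, bool) or not isinstance(value, (int, str)):
        raise TypeError("value must be a string or an integer")
    if op == "~" and not isinstance(value, str):
        raise TypeError("the ~ operator needs a string value")
    if isinstance(value, str):
        # Escape backslashes first, then nested double quotes.
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        value = f'"{escaped}"'
    return f"{field} {op} {value}"


filters = " and ".join([
    simple_filter("year", ">", 2010),
    simple_filter("year", "<", 2012),
])
print(f"search datasets where {filters} return datasets limit 5")
print(simple_filter("assignees", "!=", "grid.410484.d"))
```

The two printed lines correspond to the filters used in the two example cells above.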


### 3.5 Partial string matching with `~`

The `~` operator indicates that the given `field` need only partially,
instead of exactly, match the given `string` (the `value` used with this
operator must be a `string`, not an integer).

For example, the filter `where research_orgs.name~"Saarland Uni"` would
match both the organization named "Saarland University" and the one
named "Universitätsklinikum des Saarlandes", and any other organization
whose name includes the terms "Saarland" and "Uni" (the order is
unimportant). 

In [33]:
%%dsldf 
search patents 
    where assignee_names ~ "IBM" 
return assignees limit 5

Returned Assignees: 5
Time: 2.42s


Unnamed: 0,id,count,country_name,name,latitude,city_name,linkout,longitude,state_name,types
0,grid.410484.d,72795,United States,IBM (United States),41.10854,Armonk,[http://www.ibm.com/],-73.72047,New York,[Company]
1,grid.14648.3f,5362,United Kingdom,IBM (United Kingdom),51.026752,Winchester,[https://www.ibm.com/in-en],-1.39726,Hampshire,[Company]
2,grid.424815.e,3377,Germany,IBM (Germany),48.673832,Böblingen,[http://www.ibm.com/de/de/],9.034824,,[Company]
3,grid.424192.8,1499,France,IBM (France),48.843975,Paris,[https://www.ibm.com/fr-fr/],2.39628,,[Company]
4,grid.292504.8,1348,Canada,IBM (Canada),43.819103,Markham,[http://www.ibm.com/ca/en/],-79.33393,Ontario,[Company]
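
To build intuition for why `~"Saarland Uni"` matches both organization names mentioned above, here is a rough local approximation. It is not how Dimensions implements the operator: the rule used (every query term must be the start of some word in the name, in any order, ignoring case) is an assumption chosen only because it reproduces the two matches named in the text.

```python
import re


def partially_matches(name, query):
    """Rough local approximation of `~` (an assumption, see lead-in)."""
    words = re.findall(r"\w+", name.lower())
    terms = re.findall(r"\w+", query.lower())
    # Every query term must start at least one word of the name.
    return all(any(w.startswith(t) for w in words) for t in terms)


for name in [
    "Saarland University",
    "Universitätsklinikum des Saarlandes",
    "University of Oxford",
]:
    print(name, "->", partially_matches(name, "Saarland Uni"))
```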


### 3.6 Emptiness filters `is empty`

To select records in which a specific field has a value, or records in
which that field is empty, use filters such as
`where research_orgs is not empty` or `where issn is empty`.

In [34]:
%%dsldf
search publications 
    for "iron graphene" 
    where researchers is empty 
    and research_orgs is not empty 
return publications[id+title+researchers+research_orgs+type] limit 5

Returned Publications: 5 (total = 3989)
Time: 1.42s


Unnamed: 0,title,research_orgs,type,id
0,Algal-based polysaccharides as polymer electro...,"[{'id': 'grid.441908.0', 'country_name': 'Peru...",article,pub.1133689836
1,Antimicrobial edible films in food packaging: ...,"[{'id': 'grid.411890.5', 'country_name': 'Indi...",article,pub.1133731113
2,The application of nanoparticles in cancer imm...,"[{'id': 'grid.412523.3', 'country_name': 'Chin...",article,pub.1134129893
3,Influence of drying and calcination temperatur...,"[{'id': 'grid.163032.5', 'country_name': 'Chin...",article,pub.1133997670
4,Research progress and mechanism of nanomateria...,"[{'id': 'grid.410726.6', 'country_name': 'Chin...",article,pub.1134184045


## 4. Searching for Researchers

The DSL offers different mechanisms for searching for researchers (e.g.
publication authors, grant investigators), each of them presenting
specific advantages.

### 4.1 Exact name searches

Special full-text indices allow you to look up a researcher's name and
surname **exactly as they appear in the source documents** they derive from.

This approach has a broad scope, as it searches the full
collection of Dimensions documents irrespective of whether a
researcher was successfully disambiguated (and hence given a Dimensions
ID). On the other hand, this approach will only match names as they
appear in the source document, so different spellings or initials are
not necessarily returned via a single query. 

```
search in [authors|investigators|inventors]
```

It is possible to look up publication authors using a specific
`search index` called `authors`. 

This method expects case-insensitive
phrases, in the format `"<first name> <last name>"` or in reverse order. Note
that quotes nested inside a double-quoted string must always be
escaped with a backslash (`\`).

In [35]:
%dsldf search publications in authors for "\"Charles Peirce\"" return publications limit 5

Returned Publications: 5 (total = 145)
Time: 0.58s


Unnamed: 0,author_affiliations,pages,id,year,type,title
0,"[[{'raw_affiliation': [], 'first_name': 'Charl...",37-49,pub.1132626305,2020,chapter,How to Make our Ideas Clear
1,"[[{'raw_affiliation': [], 'first_name': 'Charl...",304-330,pub.1123488526,2019,chapter,10. [Six Papers on Existential Graphs]
2,"[[{'raw_affiliation': [], 'first_name': 'Charl...",331-347,pub.1123488527,2019,chapter,"11. On Existential Graphs, F4"
3,"[[{'raw_affiliation': [], 'first_name': 'Charl...",262-267,pub.1123488522,2019,chapter,6. Positive Logical Graphs (PLG)
4,"[[{'raw_affiliation': [], 'first_name': 'Charl...",477-488,pub.1123488535,2019,chapter,19. [A System of Existential Graphs]


Initials can also be used instead of the first name. These are examples of
valid researcher search phrases:

-   `\"Peirce, Charles S.\"`
-   `\"Charles S. Peirce\"`
-   `\"CS Peirce\"`
-   `\"Peirce CS\"`
-   `\"C S Peirce\"`
-   `\"Peirce C S\"`
-   `\"C Peirce\"`
-   `\"Peirce C\"`
-   `\"Charles Peirce\"`
-   `\"Peirce Charles\"`

**Warning**: In order to produce valid results, an author or investigator search
query must contain **at least two components** (e.g., name and
surname, either in full or as initials).

Investigator search is similar to *authors* search, except that it searches `grants` and
`clinical trials` using a separate search index, `investigators`, and
`patents` using the index `inventors`.

In [36]:
%%dsldf 
search clinical_trials in investigators for "\"John Smith\"" 
return clinical_trials limit 5

Returned Clinical_trials: 3 (total = 3)
Time: 0.84s


Unnamed: 0,id,title,investigator_details,active_years
0,NCT00689533,VEPTR Implantation to Treat Children With Earl...,"[[John M Flynn, MD, Principal Investigator, Ch...","[2008, 2009, 2010, 2011, 2012, 2013, 2014, 201..."
1,NCT01241149,Prospective Evaluation of Symptom Resolution i...,"[[Ellie Mentler, MD, Principal Investigator, U...",
2,NCT04072380,"A Phase 2, Double-blind, Placebo-controlled, P...","[[Rohith G. Patel, MD, Principal Investigator,...","[2019, 2020]"


In [37]:
%%dsldf 
search grants in investigators for "\"Satoko Shimazaki\"" 
return grants limit 5

Returned Grants: 4 (total = 4)
Time: 0.55s


Unnamed: 0,title,start_year,end_date,start_date,id,funding_org_name,language,active_year,original_title,title_language,project_num,funders
0,"Kabuki Actors, Print Technology, and the Theat...",2021,2022-08-31,2021-09-01,grant.7925589,National Endowment for the Humanities,en,"[2021, 2022]","Kabuki Actors, Print Technology, and the Theat...",en,FEL-263245-19,"[{'id': 'grid.422239.c', 'country_name': 'Unit..."
1,Genealogy research on female saints in the Pal...,2018,2021-03-31,2018-04-01,grant.7527261,Japan Society for the Promotion of Science,ja,"[2018, 2019, 2020, 2021]",古・中英語期における女性聖人伝の系譜研究：Aelfricのテクストと言語を中心に,ja,18K00431,"[{'id': 'grid.54432.34', 'country_name': 'Japa..."
2,Images of Women in the Old English Lives of Sa...,2015,2018-03-31,2015-04-01,grant.5858713,Japan Society for the Promotion of Science,en,"[2015, 2016, 2017, 2018]",Images of Women in the Old English Lives of Sa...,en,15K02313,"[{'id': 'grid.54432.34', 'country_name': 'Japa..."
3,Reception and Transfromation of the Images of ...,2012,2015-03-31,2012-04-01,grant.6086985,Japan Society for the Promotion of Science,en,"[2012, 2013, 2014, 2015]",Reception and Transfromation of the Images of ...,en,24520310,"[{'id': 'grid.54432.34', 'country_name': 'Japa..."


In [38]:
%%dsldf 
search patents in inventors for "\"John Smith\"" 
return patents limit 5

Returned Patents: 5 (total = 636)
Time: 0.69s


Unnamed: 0,title,assignee_names,id,publication_date,filing_status,inventor_names,year,times_cited,granted_year
0,Improvements in or relating to Electric Arc La...,[SMITH JOHN],GB-189625600-A,1897-09-25,Application,[SMITH JOHN],1896.0,0,
1,Improvements in or relating to Steam-heated Co...,[SMITH JOHN],GB-190921910-A,1910-07-28,Application,[SMITH JOHN],1909.0,0,
2,An Improved Method of Packing Glass Bottles.,[SMITH JOHN],GB-189814759-A,1899-07-05,Application,[SMITH JOHN],1898.0,0,
3,"Improvements in or connected with ""Otter Board...",[SMITH JOHN],GB-190114799-A,1902-05-29,Application,[SMITH JOHN],1901.0,0,
4,IMPROVEMENTS IN STOVE AND FIRE GRATES,"[SMITH, JOHN]",CA-4544-A,1875-04-01,Grant,[SMITH JOHN],,0,1875.0
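
The nested, escaped quotes are the part of these queries that is easiest to get wrong when a name comes from a variable. The sketch below builds an exact-name query for any of the four sources used above. `exact_name_query` and `NAME_INDEX` are hypothetical names; the source-to-index mapping follows the text above, and applying the two-component check to inventors as well is an assumption.

```python
# Search index to use for each source, as described in the text above.
NAME_INDEX = {
    "publications": "authors",
    "grants": "investigators",
    "clinical_trials": "investigators",
    "patents": "inventors",
}


def exact_name_query(source, name, limit=5):
    """Build an exact-name query with the name wrapped in escaped quotes."""
    # "Peirce, Charles S." counts as three components, "Peirce" as one.
    if len(name.replace(",", " ").split()) < 2:
        raise ValueError("name needs at least two components")
    index = NAME_INDEX[source]
    escaped = name.replace('"', '\\"')
    # The outer quotes delimit the search string; the inner, escaped
    # quotes turn the name into a phrase.
    return (
        f'search {source} in {index} for "\\"{escaped}\\"" '
        f"return {source} limit {limit}"
    )


print(exact_name_query("publications", "Charles Peirce"))
```

The printed query is the same as the first example of this section; with Dimcli, such a string can then be passed to `dsl.query(...)`.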


### 4.2 Fuzzy Searches

This type of search is similar to *full-text
search*, with the difference that it
allows searching by only a part of a name, e.g. only the 'last name' of
a person, by using the `where` clause. 

**Note** At this moment, this type of search is only available for
`publications`. Other sources will add this option in the future.

For example:

In [71]:
%%dsldf 
search publications where authors = "Hawking" 
return publications[id+doi+title+authors] limit 5

Returned Publications: 5 (total = 1979)
Time: 2.48s


Unnamed: 0,title,authors,id,doi
0,A search for the dimuon decay of the Standard ...,"[{'raw_affiliation': ['CPPM, Aix-Marseille Uni...",pub.1133087947,10.1016/j.physletb.2020.135980
1,Measurement of the jet mass in high transverse...,"[{'raw_affiliation': ['CPPM, Aix-Marseille Uni...",pub.1133086120,10.1016/j.physletb.2020.135991
2,Observation and Measurement of Forward Proton ...,"[{'raw_affiliation': ['CPPM, Aix-Marseille Uni...",pub.1134007432,10.1103/physrevlett.125.261801
3,Search for Heavy Resonances Decaying into a Ph...,"[{'raw_affiliation': ['CPPM, Aix-Marseille Uni...",pub.1133603040,10.1103/physrevlett.125.251802
4,The Influence of Glacial Cover on Riverine Sil...,[{'raw_affiliation': ['Bristol Glaciology Cent...,pub.1132789280,10.1029/2020gb006611


Generally speaking, using a `where` clause to search authors is less
precise than using the relevant exact-search syntax. 

On the other hand, using a
`where` clause can be handy if one wants to **combine an author search
with another full-text search index**.

For example:

In [40]:
%%dsldf 
search publications 
    in title_abstract_only for "dna replication" 
    where authors = "smith"  
return publications limit 5

Returned Publications: 5 (total = 1552)
Time: 0.79s


Unnamed: 0,title,type,id,author_affiliations,year,journal.id,journal.title,issue,volume,pages
0,Epigenome-wide association study of diet quali...,article,pub.1131341540,[[{'raw_affiliation': ['Hubert Department of G...,2020,jour.1087544,International Journal of Epidemiology,,,
1,High Risk α-HPV E6 Impairs Translesion Synthes...,article,pub.1134030384,"[[{'raw_affiliation': ['Division of Biology, K...",2020,jour.1043163,Cancers,1.0,13.0,28.0
2,Haloferax volcanii—a model archaeon for studyi...,article,pub.1133027047,"[[{'raw_affiliation': [""School of Life Science...",2020,jour.1046605,Open Biology,12.0,10.0,200293.0
3,Rapid poxvirus engineering using CRISPR/Cas9 a...,article,pub.1132270279,[[{'raw_affiliation': ['John Curtin School of ...,2020,jour.1300829,Communications Biology,1.0,3.0,643.0
4,Association between Breastfeeding and DNA Meth...,article,pub.1132201653,[[{'raw_affiliation': ['Postgraduate Programme...,2020,jour.1042723,Nutrients,11.0,12.0,3309.0


### 4.3 Using the disambiguated Researchers database

The Dimensions [Researchers](https://docs.dimensions.ai/dsl/datasource-researchers.html) source is a database of
researcher information algorithmically extracted and disambiguated from
all of the other content sources (publications, grants, clinical trials,
etc.).

By using the `researchers` source it is possible to match an
'aggregated' person object linking together multiple publication
authors, grant investigators, etc., irrespective of the form their
names take in the original source documents.

However, this database does not contain all the author and investigator information
available in Dimensions: think of authors of older publications,
authors with very common names that are difficult to disambiguate, or
very new authors who have only one or a few publications. In such cases,
using a full-text authors search might be more
appropriate.

Examples:

In [41]:
%%dsldf 
search researchers for "\"Satoko Shimazaki\"" 
return researchers[basics+obsolete] 

Returned Researchers: 4 (total = 4)
Time: 0.60s


Unnamed: 0,research_orgs,last_name,first_name,obsolete,id
0,"[{'id': 'grid.266190.a', 'country_name': 'Unit...",Shimazaki,Satoko,0,ur.015527473602.63
1,"[{'id': 'grid.19006.3e', 'country_name': 'Unit...",Shimazaki,Satoko,0,ur.014307627665.09
2,,Shimazaki,Satoko,0,ur.07751146721.59
3,,Shimazaki,Satoko,1,ur.010537333602.30


NOTE: pay attention to the `obsolete` field, which indicates the status of the researcher ID: 0 means that the researcher ID is still **active**, 1 means that the researcher ID is **no longer valid**. This is due to the ongoing process of refinement of Dimensions researchers. 

Hence the query above is best written like this:

In [42]:
%%dsldf 
search researchers where obsolete=0 for "\"Satoko Shimazaki\"" 
return researchers[basics+obsolete] 

Returned Researchers: 3 (total = 3)
Time: 1.10s


Unnamed: 0,research_orgs,last_name,first_name,obsolete,id
0,"[{'id': 'grid.266190.a', 'country_name': 'Unit...",Shimazaki,Satoko,0,ur.015527473602.63
1,"[{'id': 'grid.19006.3e', 'country_name': 'Unit...",Shimazaki,Satoko,0,ur.014307627665.09
2,,Shimazaki,Satoko,0,ur.07751146721.59


With `Researchers`, one can use other fields as well:

In [43]:
%%dsldf 
search researchers 
    where obsolete=0 and last_name="Shimazaki" 
return researchers[basics] limit 5

Returned Researchers: 5 (total = 460)
Time: 0.58s


Unnamed: 0,last_name,first_name,id,research_orgs
0,Shimazaki,Ken-Ichiro,ur.07735103434.64,
1,Shimazaki,Taishi,ur.014464167544.15,"[{'id': 'grid.39158.36', 'country_name': 'Japa..."
2,Shimazaki,Katsunori,ur.012676375026.21,
3,Shimazaki,Kohei,ur.012106470617.55,
4,Shimazaki,Michio,ur.012132522165.33,


## 5. Returning results

After the `search` phrase, a query must contain one or more `return`
phrases, specifying the content and format of the information that
should be returned.



### 5.1 Returning Multiple Sources

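A single `return` phrase returns one result only. To get several results from the same search (for example funders, research organizations and years), add one `return` phrase per result: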

In [44]:
%%dsldf 
search publications 
return funders limit 5 
return research_orgs limit 5 
return year

Returned Research_orgs: 5
Returned Year: 20
Returned Funders: 5
Time: 3.07s


Unnamed: 0,id,count,country_name,acronym,longitude,types,city_name,linkout,name,latitude,state_name
0,grid.26999.3d,427017,Japan,UT,139.76222,[Education],Tokyo,[http://www.u-tokyo.ac.jp/en/],University of Tokyo,35.713333,
1,grid.38142.3c,401733,United States,,-71.11665,[Education],Cambridge,[http://www.harvard.edu/],Harvard University,42.377052,Massachusetts
2,grid.17063.33,305886,Canada,,-79.395,[Education],Toronto,[http://www.utoronto.ca/],University of Toronto,43.661667,Ontario
3,grid.214458.e,272655,United States,UM,-83.73822,[Education],Ann Arbor,[https://www.umich.edu/],University of Michigan,42.278305,Michigan
4,grid.258799.8,269874,Japan,,135.77979,[Education],Kyoto,[http://www.kyoto-u.ac.jp/en],Kyoto University,35.026157,



### 5.2 Returning Specific Fields

For control over which information from each given `record` will be
returned, a `source` or `entity` name in the `results` phrase can be
optionally followed by a specification of `fields` and `fieldsets` to be
included in the JSON results for each retrieved record.

The fields specification may be an arbitrary list of `field` names
enclosed in brackets (`[`, `]`), with field names separated by a plus
sign (`+`). A minus sign (`-`) can be used to exclude a `field` or a
`fieldset` from the result. Field names thus listed within brackets must
be "known" to the DSL, and therefore only a subset of fields may be used
in this syntax (see note below).

In [45]:
%%dsldf 
search grants 
return grants[grant_number + title + language] limit 5

Returned Grants: 5 (total = 5623964)
Time: 0.52s


Unnamed: 0,grant_number,title,language
0,890218,Functional analysis of ribosome heterogeneity ...,en
1,2018-HRSI-1548,APPROACH to Enriching the Real World Evidence ...,en
2,948141,Simulating ultracold correlated quantum matter...,en
3,887019,The role of microbial Oxylipins in the MIcrobe...,en
4,AH/V001841/1,Playgrounds: A Material Cultural Study of Post...,en


In [46]:
%%dsldf 
search clinical_trials 
return clinical_trials [id+ title + acronym + phase] limit 5

Returned Clinical_trials: 5 (total = 609412)
Time: 0.52s


Unnamed: 0,id,phase,title,acronym
0,KCT0002490,,Effect of Vitamin D Supplementation on the Phy...,
1,KCT0002491,,The median effective dose (ED50) of intravenou...,
2,KCT0002492,,Efficacy and Safety of Chrysanthemum indicum e...,WONP_CIEE
3,KCT0002493,,The effects of thermal softening of single-lum...,
4,KCT0002494,,Efficacy of combined of conventional rehabilit...,
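
In scripts, the `return` part of a query is often assembled from a list of field names. The following sketch shows one way to do that, covering both a field list in brackets and several `return` phrases in one query. `return_phrase` and `build_query` are hypothetical helpers, and exclusions with `-` are deliberately left out.

```python
def return_phrase(name, fields=(), limit=None):
    """Build one `return` phrase, e.g. return grants[title+language] limit 5."""
    phrase = f"return {name}"
    if fields:
        phrase += "[" + "+".join(fields) + "]"
    if limit is not None:
        phrase += f" limit {limit}"
    return phrase


def build_query(search, *returns):
    """Join a search phrase with one or more return phrases."""
    if not returns:
        raise ValueError("a query needs at least one return phrase")
    return " ".join([search, *returns])


print(build_query(
    "search grants",
    return_phrase("grants", ["grant_number", "title", "language"], limit=5),
))
print(build_query(
    "search publications",
    return_phrase("funders", limit=5),
    return_phrase("research_orgs", limit=5),
    return_phrase("year"),
))
```

Listing the fields explicitly keeps the query text stable even if the contents of a predefined fieldset change.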


**Shortcuts: `fieldsets`**

The fields specification may be the name of a pre-defined `fieldset`
(e.g. `extras`, `basics`). These are shortcuts that can be handy when testing out new queries, for example. 

NOTE: In general, when writing code used in integrations or long-standing extraction scripts, it is **best to return specific fields rather than a predefined set**. This also has the advantage of making queries faster by avoiding the extraction of unnecessary data.
    

In [47]:
%%dsldf 
search grants 
return grants [basics] limit 5 

Returned Grants: 5 (total = 5623964)
Time: 0.76s
Field 'project_num' is deprecated in favor of grant_number. Please refer to https://docs.dimensions.ai/dsl/releasenotes.html for more details
Field 'title_language' is deprecated in favor of language_title. Please refer to https://docs.dimensions.ai/dsl/releasenotes.html for more details


Unnamed: 0,funding_org_name,original_title,start_year,funders,project_num,title,start_date,language,active_year,title_language,id,end_date
0,European Commission,Functional analysis of ribosome heterogeneity ...,2021,"[{'id': 'grid.270680.b', 'country_name': 'Belg...",890218,Functional analysis of ribosome heterogeneity ...,2021-12-01,en,"[2021, 2022, 2023]",en,grant.9064785,2023-11-30
1,New Brunswick Health Research Foundation,APPROACH to Enriching the Real World Evidence ...,2021,"[{'id': 'grid.484521.e', 'country_name': 'Cana...",2018-HRSI-1548,APPROACH to Enriching the Real World Evidence ...,2021-11-30,en,[2021],en,grant.8690978,
2,European Research Council,Simulating ultracold correlated quantum matter...,2021,"[{'id': 'grid.452896.4', 'country_name': 'Belg...",948141,Simulating ultracold correlated quantum matter...,2021-11-01,en,"[2021, 2022, 2023, 2024, 2025, 2026]",en,grant.9414093,2026-10-31
3,European Commission,The role of microbial Oxylipins in the MIcrobe...,2021,"[{'id': 'grid.270680.b', 'country_name': 'Belg...",887019,The role of microbial Oxylipins in the MIcrobe...,2021-11-01,en,"[2021, 2022, 2023]",en,grant.8964187,2023-10-31
4,Arts and Humanities Research Council,Playgrounds: A Material Cultural Study of Post...,2021,"[{'id': 'grid.426413.6', 'country_name': 'Unit...",AH/V001841/1,Playgrounds: A Material Cultural Study of Post...,2021-10-03,en,"[2021, 2022, 2023]",en,grant.9401634,2023-04-02


In [48]:
%%dsldf 
search publications 
return publications [basics+times_cited] limit 5 

Returned Publications: 5 (total = 115232311)
Time: 1.52s
Field 'author_affiliations' is deprecated in favor of authors. Please refer to https://docs.dimensions.ai/dsl/releasenotes.html for more details


Unnamed: 0,id,year,title,pages,times_cited,type,author_affiliations,journal.id,journal.title
0,pub.1134028727,2021,8 Gender Violence in the Hospitality Industry:...,184-198,0,chapter,,,
1,pub.1126934999,2021,Mosaic Defects of AlN Buffer Layers in GaN/AlN...,1-1,0,article,"[[{'raw_affiliation': [], 'first_name': 'Tuğçe...",jour.1152664,Journal of Polytechnic
2,pub.1134028720,2021,1 The Perils of Sex Work in Montreal: Seeking ...,16-41,0,chapter,,,
3,pub.1134154735,2021,Gövde Buruşmalarının Elastik Olmayan Yanal Bur...,1-1,0,article,"[[{'raw_affiliation': [], 'first_name': 'Mehme...",jour.1152664,Journal of Polytechnic
4,pub.1131456225,2021,Mağaza İmajının İkon Marka Algısına Yansıması,,0,article,"[[{'raw_affiliation': [], 'first_name': 'Yelda...",jour.1359387,Türkiye İletişim Araştırmaları Dergisi


The fields specification may also be `all`, to indicate that all fields
available for the given `source` should be returned.

In [None]:
%%dsldf
search publications 
return publications [all] limit 5 

### 5.3 Returning Facets

In addition to returning source records matching a query, it is possible
to *facet* on the [entity](data-entities.ipynb) fields related to a
particular source and return only those entity values as an aggregated
view of the related source data. This operation is similar to a
*group by* or a *pivot table*.

**Warning** Faceting can return a maximum of 1000 results. This is to ensure
adequate performance with all queries. Furthermore, although the `limit`
operator is allowed, the `skip` operator cannot be used.
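To make the *group by* analogy concrete, here is a small sketch of what faceting does conceptually, written in plain Python. The records and the `facet_counts` helper are invented for illustration; the real computation happens on the Dimensions servers and also returns entity metadata (name, country and so on).

```python
from collections import Counter

# Toy publication records (invented, not real Dimensions data)
publications = [
    {"id": "pub.1", "research_orgs": ["org.A", "org.B"]},
    {"id": "pub.2", "research_orgs": ["org.A", "org.B"]},
    {"id": "pub.3", "research_orgs": ["org.C", "org.A"]},
    {"id": "pub.4", "research_orgs": []},
]


def facet_counts(records, field, limit=20):
    """Count how many records mention each value of a multi-value field."""
    counts = Counter()
    for record in records:
        # set() so that a record is counted once per distinct value
        for value in set(record.get(field, [])):
            counts[value] += 1
    # Highest counts first, truncated like `limit` in a facet query
    return counts.most_common(limit)


top_orgs = facet_counts(publications, "research_orgs", limit=2)
print(top_orgs)
```

Note that the result describes organizations, not publications: a record with several organizations contributes to several rows, and a record with none contributes to no row.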

In [50]:
%%dsldf 
search publications 
    for "coronavirus" 
return research_orgs limit 5

Returned Research_orgs: 5
Time: 0.66s


Unnamed: 0,id,count,country_name,name,latitude,city_name,linkout,longitude,state_name,types,acronym
0,grid.38142.3c,2862,United States,Harvard University,42.377052,Cambridge,[http://www.harvard.edu/],-71.11665,Massachusetts,[Education],
1,grid.21107.35,1793,United States,Johns Hopkins University,39.328888,Baltimore,[https://www.jhu.edu/],-76.62028,Maryland,[Education],JHU
2,grid.4991.5,1765,United Kingdom,University of Oxford,51.753437,Oxford,[http://www.ox.ac.uk/],-1.25401,Oxfordshire,[Education],
3,grid.17063.33,1687,Canada,University of Toronto,43.661667,Toronto,[http://www.utoronto.ca/],-79.395,Ontario,[Education],
4,grid.33199.31,1591,China,Huazhong University of Science and Technology,30.508183,Wuhan,[http://english.hust.edu.cn/],114.41474,,[Education],HUST


In [51]:
%%dsldf 
search publications 
    for "coronavirus" 
return research_org_countries limit 5
return year limit 5
return category_for limit 5

Returned Research_org_countries: 5
Returned Category_for: 5
Returned Year: 5
Time: 0.57s


Unnamed: 0,id,count,name
0,US,61172,United States
1,CN,25265,China
2,GB,20635,United Kingdom
3,IT,12273,Italy
4,DE,11149,Germany


For control over the organization and headers of the JSON query results,
the `return` keyword in a return phrase may be followed by the keyword
`in` and then a `group` name for this group of results, where the group
name is enclosed in double quotes (`"`).

Also, one can define `aliases` that replace the default JSON field names with other ones provided by the user. 

See the [official documentation](https://docs.dimensions.ai/dsl/language.html#aliases) for more details about this feature. 

In [52]:
%%dsl
search publications 
return in "facets" funders 
return in "facets" research_orgs

Returned Facets: 2
Time: 2.80s


<dimcli.DslDataset object #4795823104. Records: 2/115232310>

### 5.4 What the query statistics refer to - sources VS facets

When performing a DSL search, a `_stats` object is returned, which contains useful information such as the total number of records available for a search. 

In [53]:
%%dsldf 
search publications
  where year in [2013:2018] and research_orgs="grid.258806.1"
return publications limit 5

Returned Publications: 5 (total = 5077)
Time: 0.55s


Unnamed: 0,id,title,year,author_affiliations,pages,type,volume,issue,journal.id,journal.title
0,pub.1113308928,A Hybrid DCT-CLAHE Approach for Brightness Enh...,2018,[[{'raw_affiliation': ['Department of Electric...,123-127,proceeding,,,,
1,pub.1110958161,Optimized coordinated control of LFC and SMES ...,2018,[[{'raw_affiliation': ['Department of Electric...,39,article,3.0,1.0,jour.1157179,Protection and Control of Modern Power Systems
2,pub.1110012351,The Role of Lanthanum in a Nickel Oxide‐Based ...,2018,[[{'raw_affiliation': ['Graduate School of Lif...,518-526,article,12.0,2.0,jour.1297486,ChemSusChem
3,pub.1110932965,Electrostatic Discharge Threshold on Coverglas...,2018,[[{'raw_affiliation': ['Department of Mechanic...,1445-1452,article,47.0,2.0,jour.1031080,IEEE Transactions on Plasma Science
4,pub.1110925389,Nuclear Ab Initio Calculations with the Unitar...,2018,[[{'raw_affiliation': ['Center for Nuclear Stu...,,proceeding,,,,




It is important to note though that the **total number always refers to the main source, never the facets** one is searching for. 

For example, in this query we return `researchers` linked to publications: 

In [54]:
%%dsldf 
search publications
  where year in [2013:2018] and research_orgs="grid.258806.1"
return researchers limit 5

Returned Researchers: 5
Time: 0.88s


Unnamed: 0,id,count,last_name,first_name,research_orgs,orcid_id
0,ur.01055753603.27,140,Hayase,Shuzi Shuzi,"[grid.14003.36, grid.419082.6, grid.266298.1, ...",
1,ur.01144540527.52,103,Ma,Ting Li,"[grid.258806.1, grid.177174.3, grid.411485.d, ...",[0000-0002-3310-459X]
2,ur.011212042763.67,102,Hikita,Masayuki,"[grid.27476.30, grid.462727.2, grid.258806.1]",
3,ur.016357156077.09,99,Lu,Huimin,"[grid.411497.e, grid.454850.8, grid.41156.37, ...",[0000-0001-9794-3221]
4,ur.07644453127.11,96,Kozako,M Kozako M,"[grid.471634.3, grid.482504.f, grid.258806.1, ...",


NOTE: a facet query returns at most 1000 results (for performance reasons), so when more than 1000 facet values exist, their total number cannot be determined. 

### 5.5 Paginating Results

At the end of a `return` phrase, the user can specify the maximum number
of results to be returned and the number of top records to skip over
before returning the first result record. This makes it possible to retrieve
large result sets page by page (i.e. "paging" results), as described below.

This is done using the keyword `limit` followed by the maximum number of
results to return, optionally followed by the keyword `skip` and the
number of results to skip (the offset).

In [55]:
%%dsldf 
search publications return publications limit 10

Returned Publications: 10 (total = 115232311)
Time: 1.18s


Unnamed: 0,title,type,id,pages,year,author_affiliations,journal.id,journal.title
0,8 Gender Violence in the Hospitality Industry:...,chapter,pub.1134028727,184-198,2021,,,
1,Mosaic Defects of AlN Buffer Layers in GaN/AlN...,article,pub.1126934999,1-1,2021,"[[{'raw_affiliation': [], 'first_name': 'Tuğçe...",jour.1152664,Journal of Polytechnic
2,1 The Perils of Sex Work in Montreal: Seeking ...,chapter,pub.1134028720,16-41,2021,,,
3,Gövde Buruşmalarının Elastik Olmayan Yanal Bur...,article,pub.1134154735,1-1,2021,"[[{'raw_affiliation': [], 'first_name': 'Mehme...",jour.1152664,Journal of Polytechnic
4,Mağaza İmajının İkon Marka Algısına Yansıması,article,pub.1131456225,,2021,"[[{'raw_affiliation': [], 'first_name': 'Yelda...",jour.1359387,Türkiye İletişim Araştırmaları Dergisi
5,Introduction: Accounting for Violence,chapter,pub.1134028719,1-15,2021,,,
6,"6 The Murder of Lori Dupont: Violence, Harassm...",chapter,pub.1134028725,133-159,2021,,,
7,5 Slow Violence and Hidden Injuries: The Work ...,chapter,pub.1134028724,112-132,2021,,,
8,"4 Billy Gohl: Labour, Violence, and Myth in th...",chapter,pub.1134028723,88-111,2021,,,
9,2 The Rules of Discipline: Workers and the Cul...,chapter,pub.1134028721,42-61,2021,,,


If paging information is not provided, the default values
`limit 20 skip 0` are used, so `return publications` is equivalent to `return publications limit 20 skip 0`.

Combining `limit` and `skip` across multiple queries enables paging or
batching of results; e.g. to retrieve 30 grant records divided into 3
pages of 10 records each, the following three queries could be used:

```
return grants limit 10           => get 1st 10 records for page 1 (skip 0, by default)
return grants limit 10 skip 10   => get next 10 for page 2; skip the 10 we already have
return grants limit 10 skip 20   => get another 10 for page 3, for a total of 30
```
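The same paging pattern can be generated programmatically. The sketch below only builds the query strings; `paged_queries` is a hypothetical helper name, and the `total` argument is assumed to be known in advance (for instance from the total reported by a first query).

```python
def paged_queries(base_query, total, page_size=10):
    """Yield one DSL query string per page, using `limit` and `skip`."""
    for skip in range(0, total, page_size):
        # The last page may be shorter than page_size
        limit = min(page_size, total - skip)
        yield f"{base_query} limit {limit} skip {skip}"


pages = list(paged_queries("search grants return grants", total=30, page_size=10))
for q in pages:
    print(q)
```

Each string can be sent with `dsl.query(q)`. Dimcli also provides a `query_iterative` method that automates this kind of loop; see its documentation for the exact options.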

### 5.6 Sorting Results

A sort order for the results in a given `return` phrase can be specified
with the keyword `sort by` followed by the name of 
* a `field` (when a `source` is being requested) 
* an `indicator` (aggregation) (when one or more facets are being requested). 

By default, the result set of full text
queries (`search ... for "full text query"`) is sorted by "relevance".
The sort direction can be set with the `asc` or `desc` keywords;
descending order is the default.

In [56]:
%%dsldf 
search grants 
    for "nanomaterials"
return grants sort by title desc limit 5 

Returned Grants: 5 (total = 18755)
Time: 0.57s


Unnamed: 0,title_language,funders,original_title,funding_org_name,project_num,id,title,start_date,language,active_year,start_year,end_date
0,en,"[{'id': 'grid.424150.6', 'acronym': 'DFG', 'co...",Transmissionselektronenmikroskop,German Research Foundation,280331443,grant.4841519,Transmissionselektronenmikroskop,2015-01-01,en,[2015],2015,
1,en,"[{'id': 'grid.424150.6', 'acronym': 'DFG', 'co...",Transmissionselektronenmikroskop,German Research Foundation,220923099,grant.4823271,Transmissionselektronenmikroskop,2012-01-01,de,[2012],2012,
2,en,"[{'id': 'grid.425119.a', 'acronym': 'BELSPO', ...",Snowcontrol.,Belgian Federal Science Policy Office,3E120109,grant.6774902,Snowcontrol.,2011-06-16,en,"[2011, 2012, 2013, 2014, 2015]",2011,2015-06-13
3,pl,"[{'id': 'grid.452947.9', 'acronym': 'FNP', 'co...",Stypendium Naukowe START,Foundation for Polish Science,START 81.2014,grant.9182975,START Scholarship,2014-06-01,pl,"[2014, 2015]",2014,2015-06-01
4,pl,"[{'id': 'grid.452947.9', 'acronym': 'FNP', 'co...",Stypendium Naukowe START,Foundation for Polish Science,START 79.2015,grant.9182996,START Scholarship,2015-06-01,pl,"[2015, 2016]",2015,2016-06-01


In [57]:
%%dsldf  
search grants  
    for "nanomaterials"
return grants  sort by relevance desc limit 5

Returned Grants: 5 (total = 18755)
Time: 0.57s


Unnamed: 0,title_language,funders,original_title,funding_org_name,project_num,id,title,start_date,end_date,language,active_year,start_year
0,en,"[{'id': 'grid.437854.9', 'acronym': 'SFI', 'co...",Optically-active chiral nanomaterials,Science Foundation Ireland,11/W.1/I2065,grant.3984032,Optically-active chiral nanomaterials,2012-06-01,2013-05-31,en,"[2012, 2013]",2012
1,en,"[{'id': 'grid.22919.31', 'acronym': 'FCT', 'co...",NOVEL LANTHANIDE LUMINESCENT SYSTEMS: FROM SUP...,Foundation for Science and Technology,35378,grant.3526883,NOVEL LANTHANIDE LUMINESCENT SYSTEMS: FROM SUP...,2000-09-01,2003-12-31,en,"[2000, 2001, 2002, 2003]",2000
2,en,"[{'id': 'grid.22919.31', 'acronym': 'FCT', 'co...",Transport properties and electrochemical appli...,Foundation for Science and Technology,39381,grant.3531153,Transport properties and electrochemical appli...,2003-03-01,2006-08-31,en,"[2003, 2004, 2005, 2006]",2003
3,en,"[{'id': 'grid.452912.9', 'acronym': 'NSERC', '...",catalytic nanomaterials,Natural Sciences and Engineering Research Council,583037,grant.5527054,catalytic nanomaterials,2015-04-01,2016-03-31,en,"[2015, 2016]",2015
4,en,"[{'id': 'grid.425339.a', 'acronym': 'ISF', 'co...",Novel biocomposite nanomaterials,Israel Science Foundation,25813,grant.4849153,Novel biocomposite nanomaterials,2012-01-01,2015-12-31,en,"[2012, 2013, 2014, 2015]",2012


Sorting by the number of citations per publication:

In [58]:
%%dsldf  
search publications
return publications  [doi + times_cited] 
    sort by times_cited limit 5

Returned Publications: 5 (total = 115232310)
Time: 1.04s


Unnamed: 0,times_cited,doi
0,233883,
1,198871,10.1038/227680a0
2,183355,10.1016/0003-2697(76)90527-3
3,96100,10.1006/meth.2001.1262
4,89202,10.1103/physrevlett.77.3865


Sorting by recent citations per publication.
Note: recent citations are the citations accrued in the last two-year period. A single value is stored per document, and the year window rolls over in July.

In [59]:
%%dsldf 
search publications
return publications [doi + recent_citations]
    sort by recent_citations limit 5

Returned Publications: 5 (total = 115232311)
Time: 1.31s


Unnamed: 0,recent_citations,doi
0,25499,10.1109/cvpr.2016.90
1,25295,10.1006/meth.2001.1262
2,22533,10.3322/caac.21492
3,19781,10.1103/physrevlett.77.3865
4,18649,10.1191/1478088706qp063oa


When a facet is being returned, the `indicator` used in the
`sort` phrase must either be `count` (the default, such that
`sort by count` is unnecessary), or one of the indicators specified in
the `aggregate` phrase, i.e. one whose values are being computed in the
faceting operation. 
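The rule above can be captured in a small query builder. This is only a sketch: `facet_phrase` is a hypothetical helper (not a Dimcli function) that refuses a `sort by` indicator unless it is `count` or one of the indicators listed in `aggregate`.

```python
def facet_phrase(facet, aggregate=(), sort_by="count", limit=None):
    """Build a facet `return` phrase, validating the sort indicator."""
    allowed = {"count", *aggregate}
    if sort_by not in allowed:
        raise ValueError(
            f"cannot sort by {sort_by!r}; choose one of {sorted(allowed)}"
        )
    phrase = f"return {facet}"
    if aggregate:
        phrase += " aggregate " + ", ".join(aggregate)
    # `count` is the default sort, so it does not need to be spelled out
    if sort_by != "count":
        phrase += f" sort by {sort_by}"
    if limit is not None:
        phrase += f" limit {limit}"
    return phrase


phrase = facet_phrase(
    "research_orgs",
    aggregate=["altmetric_median", "rcr_avg"],
    sort_by="rcr_avg",
    limit=5,
)
print(phrase)
```

Catching the mistake locally gives a clearer error than waiting for the API to reject the query.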


In [60]:
%%dsldf 
search publications 
    for "nanomaterials"
return research_orgs 
    aggregate altmetric_median, rcr_avg sort by rcr_avg limit 5 

Returned Research_orgs: 5
Time: 2.62s


Unnamed: 0,id,count,rcr_avg,altmetric_median,country_name,name,latitude,city_name,linkout,longitude,types,acronym
0,grid.11444.34,1,210.100006,350.0,China,Shanghai Institute of Hypertension,31.211678,Shanghai,[http://www.china-sih.com/],121.467255,[Facility],
1,grid.11485.39,1,210.100006,350.0,United Kingdom,Cancer Research UK,51.531322,London,[http://www.cancerresearchuk.org/],-0.106269,[Nonprofit],CRUK
2,grid.11642.30,1,210.100006,350.0,Reunion,University of La Réunion,-20.901735,Saint-Denis,[http://www.univ-reunion.fr/university-of-reun...,55.48455,[Education],
3,grid.20931.39,1,210.100006,350.0,United Kingdom,Royal Veterinary College,51.5368,London,[http://www.rvc.ac.uk/],-0.134,[Education],RVC
4,grid.226688.0,1,210.100006,350.0,Singapore,Temasek Life Sciences Laboratory,1.294417,Singapore,[http://www.tll.org.sg/],103.777,[Facility],TLL


### 5.7 Unnesting results

Multi-value entity and JSON fields, such as `researchers`, `authors`, `research_orgs` or any of the `category_*` fields, may be unnested into top-level objects. 

This makes further processing of these objects easier, e.g. counting them. 

Unnesting turns all the returned multi-value data into top-level keys, such as `researchers.id`, `researchers.first_name` and `researchers.last_name`, while copying the other, non-unnested fields (such as the `id` or `title` of the publication) into each resulting row. Each original publication is therefore repeated once per researcher or category it has, so the number of returned rows will likely exceed the query `limit`: the limit applies to the source objects, not to the unnested ones. If multiple fields are unnested, the cartesian product of all unnested fields is returned.
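The row multiplication is easier to see on a toy record. The sketch below mimics the unnesting logic in plain Python using `itertools.product`. It is an illustrative model only: the record is invented, the values are kept as plain strings rather than being flattened into `researchers.id`-style keys, and the handling of empty fields (one row with `None`) is an assumption.

```python
from itertools import product


def unnest(record, fields):
    """Expand the multi-value `fields` of one record into flat rows."""
    scalars = {k: v for k, v in record.items() if k not in fields}
    # Assumption: an empty or missing list still yields one row, holding None
    value_lists = [record.get(f) or [None] for f in fields]
    rows = []
    for combo in product(*value_lists):
        row = dict(scalars)          # copy the non-unnested fields
        row.update(zip(fields, combo))
        rows.append(row)
    return rows


pub = {
    "id": "pub.1",
    "title": "Example",
    "researchers": ["ur.1", "ur.2"],
    "category_for": ["13 Education", "17 Psychology", "1701 Psychology"],
}
rows = unnest(pub, ["researchers", "category_for"])
print(len(rows))  # 2 researchers x 3 categories
```

A single publication with 2 researchers and 3 categories becomes 6 rows, which is why a query with `limit 5` can return dozens of objects.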




In [81]:
%%dsldf

search publications for "Japan AND Buddhism"
    where researchers is not empty
return publications[id+year+title+unnest(researchers)] limit 10

Returned objects: 12 (total publications= 39444)
Time: 1.60s


Unnamed: 0,title,year,id,researchers.id,researchers.first_name,researchers.last_name,researchers.research_orgs,researchers.orcid_id
0,Tibetan CSL learners’ L2 Motivational Self Sys...,2021,pub.1133492334,ur.014452013576.50,Lubei,Zhang,[grid.263901.f],
1,"Connecting heritage, vulnerabilities and capac...",2021,pub.1133369377,ur.010763326731.79,Ksenia,Chmutina,"[grid.4563.4, grid.6571.5]",
2,Robots are friends as well as foes: Ambivalent...,2021,pub.1132061246,ur.014753034505.67,Jianning,Dang,"[grid.27860.3b, grid.412498.2, grid.20513.35]",[0000-0002-8174-0136]
3,Robots are friends as well as foes: Ambivalent...,2021,pub.1132061246,ur.01036243756.64,Ling,Liu,"[grid.418213.d, grid.16753.36, grid.11135.37, ...",[0000-0002-4898-3013]
4,Informal institutions and comparative advantag...,2021,pub.1131260846,ur.011016512337.06,Pao-Li,Chang,"[grid.214458.e, grid.412634.6]",[0000-0001-5485-7374]
5,Russian soft power from USSR to Putin’s Russia,2020,pub.1133271508,ur.011543771702.89,Elena V,Bykova,[grid.15447.33],
6,Neoliberal capitalism and BRICS on screen,2020,pub.1133271504,ur.015265002403.21,Iiris,Ruoho,[grid.502801.e],
7,BRICS de-Americanizing the Internet?,2020,pub.1133271503,ur.015745210637.94,Daya Kishan,Thussu,"[grid.12896.34, grid.23231.31, grid.8096.7, gr...",
8,Contending soft powers,2020,pub.1133271506,ur.010352300011.45,Herman,Wasserman,,
9,Contending soft powers,2020,pub.1133271506,ur.010227271267.80,Musawenkosi,Ndlovu,[grid.442325.6],[0000-0002-5901-6766]


In [78]:
%%dsldf

search publications for "Japan AND Buddhism"
return publications[id+year+title+unnest(category_for)] limit 5

Returned objects: 10 (total publications= 120925)
Time: 0.90s


Unnamed: 0,year,title,id,category_for.id,category_for.name
0,2021,Tibetan CSL learners’ L2 Motivational Self Sys...,pub.1133492334,3268,1303 Specialist Studies In Education
1,2021,Tibetan CSL learners’ L2 Motivational Self Sys...,pub.1133492334,2213,13 Education
2,2021,A creative destruction approach to replication...,pub.1133082954,2217,17 Psychology and Cognitive Sciences
3,2021,A creative destruction approach to replication...,pub.1133082954,3468,1701 Psychology
4,2021,The shaping of anticipation: The networked dev...,pub.1134273988,3448,1608 Sociology
5,2021,The shaping of anticipation: The networked dev...,pub.1134273988,2216,16 Studies in Human Society
6,2021,Exploring the gratitude model of body apprecia...,pub.1134199615,2217,17 Psychology and Cognitive Sciences
7,2021,Exploring the gratitude model of body apprecia...,pub.1134199615,3468,1701 Psychology
8,2021,Learning in retirement: Developing resilience ...,pub.1131496119,2217,17 Psychology and Cognitive Sciences
9,2021,Learning in retirement: Developing resilience ...,pub.1131496119,3468,1701 Psychology


You can `unnest` as many fields as you want. However, the number of results will grow quickly!

In [82]:
%%dsldf

search publications for "Japan AND Buddhism"
return publications[id+year+title+unnest(category_for)+unnest(researchers)+unnest(research_orgs)] limit 5

Returned objects: 42 (total publications= 120925)
Time: 0.99s


Unnamed: 0,year,title,id,category_for.id,category_for.name,research_orgs.id,research_orgs.name,research_orgs.acronym,research_orgs.linkout,research_orgs.country_name,research_orgs.state_name,research_orgs.city_name,research_orgs.latitude,research_orgs.longitude,research_orgs.types,researchers.id,researchers.first_name,researchers.last_name,researchers.research_orgs
0,2021,Tibetan CSL learners’ L2 Motivational Self Sys...,pub.1133492334,3268,1303 Specialist Studies In Education,grid.263901.f,Southwest Jiaotong University,SWJTU,[http://www.swjtu.edu.cn/],China,Sichuan,Chengdu,30.769444,103.98472,[Education],ur.014452013576.50,Lubei,Zhang,[grid.263901.f]
1,2021,Tibetan CSL learners’ L2 Motivational Self Sys...,pub.1133492334,2213,13 Education,grid.263901.f,Southwest Jiaotong University,SWJTU,[http://www.swjtu.edu.cn/],China,Sichuan,Chengdu,30.769444,103.98472,[Education],ur.014452013576.50,Lubei,Zhang,[grid.263901.f]
2,2021,A creative destruction approach to replication...,pub.1133082954,2217,17 Psychology and Cognitive Sciences,grid.273335.3,"University at Buffalo, State University of New...",UB,[https://www.buffalo.edu/],United States,New York,Buffalo,43.001106,-78.78897,[Education],,,,
3,2021,A creative destruction approach to replication...,pub.1133082954,2217,17 Psychology and Cognitive Sciences,grid.262273.0,"Queens College, City University of New York",,[http://www.qc.cuny.edu/Pages/home.aspx],United States,New York,New York,40.73575,-73.817856,[Education],,,,
4,2021,A creative destruction approach to replication...,pub.1133082954,2217,17 Psychology and Cognitive Sciences,grid.148374.d,Massey University,,[http://www.massey.ac.nz/],New Zealand,,Palmerston North,-40.38564,175.61806,[Education],,,,
5,2021,A creative destruction approach to replication...,pub.1133082954,2217,17 Psychology and Cognitive Sciences,grid.17088.36,Michigan State University,MSU,[https://msu.edu/],United States,Michigan,East Lansing,42.723,-84.481,[Education],,,,
6,2021,A creative destruction approach to replication...,pub.1133082954,2217,17 Psychology and Cognitive Sciences,grid.27755.32,University of Virginia,UVA,[http://www.virginia.edu/],United States,Virginia,Charlottesville,38.035,-78.505,[Education],,,,
7,2021,A creative destruction approach to replication...,pub.1133082954,2217,17 Psychology and Cognitive Sciences,grid.257949.4,Ithaca College,,[http://www.ithaca.edu/],United States,New York,Ithaca,42.42311,-76.49521,[Education],,,,
8,2021,A creative destruction approach to replication...,pub.1133082954,2217,17 Psychology and Cognitive Sciences,grid.10049.3c,University of Limerick,UL,[http://www.ul.ie/],Ireland,,Limerick,52.674,-8.571,[Education],,,,
9,2021,A creative destruction approach to replication...,pub.1133082954,2217,17 Psychology and Cognitive Sciences,grid.469459.3,INSEAD,,[http://www.insead.edu],Singapore,,Singapore,1.299995,103.7866,[Education],,,,


## 6. Aggregations

In a `return` phrase requesting one or more `facet` results, aggregation
operations to perform during faceting can be specified after the facet
name(s) by using the keyword `aggregate` followed by a comma-separated
list of one or more `indicator` names corresponding to the `source`
being searched.
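Conceptually, an aggregation computes one summary value per facet bucket. The following sketch reproduces that idea locally with the standard library. The records and the field names `org`, `rcr` and `altmetric` are invented for illustration; only the indicator names `rcr_avg` and `altmetric_median` come from the DSL.

```python
from collections import defaultdict
from statistics import mean, median

# Toy publication records (invented, not real Dimensions data)
publications = [
    {"org": "org.A", "rcr": 2.0, "altmetric": 5},
    {"org": "org.A", "rcr": 1.0, "altmetric": 1},
    {"org": "org.A", "rcr": 3.0, "altmetric": 9},
    {"org": "org.B", "rcr": 0.5, "altmetric": 4},
]

# Step 1: bucket the source records by facet value
groups = defaultdict(list)
for pub in publications:
    groups[pub["org"]].append(pub)

# Step 2: compute `count` plus the requested indicators for each bucket
facets = [
    {
        "id": org,
        "count": len(pubs),
        "rcr_avg": mean(p["rcr"] for p in pubs),
        "altmetric_median": median(p["altmetric"] for p in pubs),
    }
    for org, pubs in groups.items()
]

# Step 3: sort by count, descending (the default for facets)
facets.sort(key=lambda f: f["count"], reverse=True)
print(facets)
```

Buckets with very few records can produce extreme averages, so it is worth checking `count` alongside any aggregated indicator.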

In [61]:
%%dsldf
search publications 
    where year > 2010 
return research_orgs  
    aggregate rcr_avg, altmetric_median limit 5

Returned Research_orgs: 5
Time: 13.43s


Unnamed: 0,id,count,rcr_avg,altmetric_median,country_name,name,latitude,city_name,linkout,longitude,state_name,types,acronym
0,grid.38142.3c,198744,2.193411,5.0,United States,Harvard University,42.377052,Cambridge,[http://www.harvard.edu/],-71.11665,Massachusetts,[Education],
1,grid.17063.33,150759,1.732056,4.0,Canada,University of Toronto,43.661667,Toronto,[http://www.utoronto.ca/],-79.395,Ontario,[Education],
2,grid.26999.3d,149047,1.205105,2.0,Japan,University of Tokyo,35.713333,Tokyo,[http://www.u-tokyo.ac.jp/en/],139.76222,,[Education],UT
3,grid.11899.38,144382,1.077967,2.0,Brazil,University of São Paulo,-23.563051,São Paulo,[http://www5.usp.br/en/],-46.730103,,[Education],USP
4,grid.83440.3b,131707,1.962088,4.0,United Kingdom,University College London,51.52447,London,[http://www.ucl.ac.uk/],-0.133982,,[Education],UCL


**What are the metrics/aggregations available?** See the data sources documentation for information about available [indicators](https://docs.dimensions.ai/dsl/datasource-publications.html#publications-indicators).  

Alternatively, we can use the 'schema' API ([describe](https://docs.dimensions.ai/dsl/data-sources.html#metadata-api)) to return this information programmatically:

In [62]:
schema = dsl.query("describe schema")
# for each source, print the name and description of every available metric
for source_name, source in schema['sources'].items():
    print("SOURCE:", source_name)
    for metric in source['metrics'].values():
        print("--", metric['name'], " => ", metric['description'])

SOURCE: publications
-- count  =>  Total count
-- altmetric_median  =>  Median Altmetric attention score
-- altmetric_avg  =>  Altmetric attention score mean
-- citations_total  =>  Aggregated number of citations
-- citations_avg  =>  Arithmetic mean of citations
-- citations_median  =>  Median of citations
-- recent_citations_total  =>  For a given article, in a given year, the number of citations accrued in the last two year period. Single value stored per document, year window rolls over in July.
-- rcr_avg  =>  Arithmetic mean of `relative_citation_ratio` field.
-- fcr_gavg  =>  Geometric mean of `field_citation_ratio` field (note: This field cannot be used for sorting results).
SOURCE: grants
-- count  =>  Total count
-- funding  =>  Total funding amount, in USD.
SOURCE: patents
-- count  =>  Total count
SOURCE: clinical_trials
-- count  =>  Total count
SOURCE: policy_documents
-- count  =>  Total count
SOURCE: researchers
-- count  =>  Total count
SOURCE: organizations
-- count  

**NOTE** In addition to any specified aggregations, `count` is always computed
and reported when facet results are requested.

In [63]:
%%dsldf
search grants 
    for "5g network" 
return funders 
    aggregate count, funding sort by funding limit 5 

Returned Funders: 5
Time: 0.57s


Unnamed: 0,id,count,funding,country_name,acronym,longitude,types,city_name,linkout,name,latitude,state_name
0,grid.270680.b,202,969574372.0,Belgium,EC,4.36367,[Government],Brussels,[http://ec.europa.eu/index_en.htm],European Commission,50.85165,
1,grid.457785.c,141,70847172.0,United States,NSF CISE,-77.111,[Government],Arlington,[http://www.nsf.gov/dir/index.jsp?org=CISE],Directorate for Computer & Information Science...,38.88058,Virginia
2,grid.421091.f,68,53295321.0,United Kingdom,EPSRC,-1.784602,[Government],Swindon,[https://www.epsrc.ac.uk/],Engineering and Physical Sciences Research Cou...,51.567093,England
3,grid.55047.33,8,50521157.0,Poland,NCRD,21.00763,[Government],Warsaw,[http://www.ncbr.gov.pl/en/],National Centre for Research and Development,52.227455,
4,grid.453115.7,40,35972874.0,China,ITC,114.16658,[Government],Hong Kong,[http://www.itc.gov.hk/en/about/org.htm],Innovation and Technology Commission,22.28264,


Aggregated total number of citations

In [64]:
%%dsldf
search publications
    for "ontologies"
return funders 
    aggregate citations_total 
    sort by citations_total  limit 5

Returned Funders: 5
Time: 0.91s


Unnamed: 0,id,count,citations_total,country_name,name,latitude,city_name,linkout,longitude,acronym,state_name,types
0,grid.280785.0,13704,895803.0,United States,National Institute of General Medical Sciences,38.997833,Bethesda,[http://www.nigms.nih.gov/Pages/default.aspx],-77.09938,NIGMS,Maryland,[Facility]
1,grid.48336.3a,13440,892092.0,United States,National Cancer Institute,39.004326,Rockville,[http://www.cancer.gov/],-77.10119,NCI,Maryland,[Government]
2,grid.270680.b,20684,650861.0,Belgium,European Commission,50.85165,Brussels,[http://ec.europa.eu/index_en.htm],4.36367,EC,,[Government]
3,grid.280128.1,5040,648335.0,United States,National Human Genome Research Institute,38.996967,Bethesda,[https://www.genome.gov/],-77.09693,NHGRI,Maryland,[Facility]
4,grid.419696.5,36530,495653.0,China,National Natural Science Foundation of China,40.005177,Beijing,[http://www.nsfc.gov.cn/publish/portal1/],116.33983,NSFC,,[Government]


Arithmetic mean number of citations

In [65]:
%%dsldf
search publications
return funders 
    aggregate citations_avg 
    sort by citations_avg limit 5

Returned Funders: 5
Time: 1.96s


Unnamed: 0,id,count,citations_avg,country_name,state_name,longitude,types,city_name,linkout,name,latitude,acronym
0,grid.478308.0,190,264.768421,United States,District of Columbia,-77.03973,[Nonprofit],Washington D.C.,[http://www.stewart-trust.org/],Alexander & Margaret Stewart Trust,38.90116,
1,grid.417710.4,154,215.993506,United States,Maryland,-77.20376,[Company],Rockville,[http://www.hgsi.com],Human Genome Sciences (United States),39.09665,
2,grid.453780.d,145,193.931034,United States,District of Columbia,-77.03952,[Nonprofit],Washington D.C.,[http://www.abc2.org/],Accelerate Brain Cancer Cure,38.90672,
3,grid.484432.d,1,188.0,United Kingdom,,-0.123164,[Nonprofit],London,[https://www.macmillan.org.uk/],Macmillan Cancer Support,51.488003,Macmillan Cancer Support
4,grid.478789.d,576,171.751736,United States,Nevada,-115.29985,[Other],Las Vegas,[http://www.dwreynolds.org/],Donald W. Reynolds Foundation,36.19046,


Geometric mean of FCR


In [66]:
%%dsldf
search publications
return funders 
    aggregate fcr_gavg limit 5

Returned Funders: 5
Time: 3.41s


Unnamed: 0,id,fcr_gavg,count,country_name,name,latitude,city_name,linkout,longitude,acronym,types,state_name
0,grid.419696.5,2.364073,2162769,China,National Natural Science Foundation of China,40.005177,Beijing,[http://www.nsfc.gov.cn/publish/portal1/],116.33983,NSFC,[Government],
1,grid.270680.b,3.289544,763648,Belgium,European Commission,50.85165,Brussels,[http://ec.europa.eu/index_en.htm],4.36367,EC,[Government],
2,grid.424020.0,2.580812,668039,China,Ministry of Science and Technology of the Peop...,39.827835,Beijing,[http://www.most.gov.cn/eng/],116.316284,MOST,[Government],
3,grid.54432.34,2.279734,619365,Japan,Japan Society for the Promotion of Science,35.68716,Tokyo,[http://www.jsps.go.jp/],139.74039,JSPS,[Nonprofit],
4,grid.48336.3a,4.91386,588545,United States,National Cancer Institute,39.004326,Rockville,[http://www.cancer.gov/],-77.10119,NCI,[Government],Maryland


Median Altmetric Attention Score

In [67]:
%%dsldf 
search publications
return funders aggregate altmetric_median 
    sort by altmetric_median limit 5 

Returned Funders: 5
Time: 6.64s


Unnamed: 0,id,count,altmetric_median,country_name,name,latitude,city_name,linkout,longitude,acronym,types,state_name
0,grid.470711.4,2,117.0,United Kingdom,Chest Heart and Stroke Scotland,55.946075,Edinburgh,[http://www.chss.org.uk/],-3.219597,CHSS,[Nonprofit],
1,grid.443873.f,5,98.0,United States,LUNGevity Foundation,41.878674,Chicago,[http://www.lungevity.org/],-87.62648,LUNG,[Nonprofit],Illinois
2,grid.473856.b,2,66.0,United States,Administration for Children and Families,38.88594,Washington D.C.,[https://www.acf.hhs.gov/],-77.01637,ACF,[Government],District of Columbia
3,grid.481336.b,1,44.0,United Kingdom,Target Ovarian Cancer,51.529922,London,[http://www.targetovariancancer.org.uk/],-0.101692,Target Ovarian Cancer,[Nonprofit],
4,grid.419979.b,2,43.5,United States,Einstein Healthcare Network,40.036827,Philadelphia,[http://www.einstein.edu/],-75.14314,AEHN,[Healthcare],Pennsylvania


### 6.1 Complex aggregations

The `return` phrase may be followed by a function expression, to return additional calculations, such as per year funding or citations statistics. These functions may take their own arguments, and are calculated using the source data as specified in the `search part` of the query.

At the time of writing, two functions are available: `citations_per_year` (for publications) and `funding_per_year` (for grants).

#### Publications `citations_per_year`

Publication citations is the number of times that publications have been cited by other publications in the database. This function returns the number of citations received in each year.

In [73]:
%%dsldf

search publications for "brexit"
return citations_per_year(2010, 2020)

Returned Citations_per_year: 11
Time: 3.91s


Unnamed: 0,citations_per_year
2010,6.0
2011,11.0
2012,10.0
2013,14.0
2014,25.0
2015,107.0
2016,737.0
2017,5382.0
2018,16401.0
2019,32265.0


#### Grants `funding_per_year`

Returns grant funding per year in the given currency, from the specified start year to the specified end year (inclusive).

Supported currencies are: CAD,USD,JPY,GBP,CHF,CNY,EUR,NZD,AUD
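As a hedged sketch, the helper below builds a `funding_per_year` query and checks the currency against the list above before anything is sent to the API. The name `funding_per_year_query` is hypothetical, and the set of currencies is copied from this tutorial, so it may not match later DSL versions.

```python
# Currency codes as listed in this tutorial (may change in later DSL versions)
SUPPORTED_CURRENCIES = {"CAD", "USD", "JPY", "GBP", "CHF", "CNY", "EUR", "NZD", "AUD"}


def funding_per_year_query(search_part, start_year, end_year, currency="USD"):
    """Append a `funding_per_year` return phrase to a grants search."""
    if currency not in SUPPORTED_CURRENCIES:
        raise ValueError(f"unsupported currency: {currency!r}")
    if start_year > end_year:
        raise ValueError("start_year must not be after end_year")
    return (
        f"{search_part} "
        f'return funding_per_year({start_year}, {end_year}, "{currency}")'
    )


query = funding_per_year_query('search grants for "brexit"', 2010, 2020)
print(query)

# Both endpoints are included, so 2010..2020 covers 11 years
n_years = len(range(2010, 2020 + 1))
print(n_years)
```

The inclusive range explains why a query for 2010 to 2020 reports 11 returned values.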

In [74]:
%%dsldf

search grants for "brexit"
return funding_per_year(2010, 2020, "USD")


Returned Funding_per_year: 11
Time: 0.60s


Unnamed: 0,funding_per_year
2010,0.0
2011,0.0
2012,0.0
2013,4412.0
2014,10020.0
2015,342762.0
2016,823304.0
2017,5973847.0
2018,15563940.0
2019,34619271.0
