<a href="https://colab.research.google.com/github/digital-science/dimensions-api-lab/blob/master/1-getting-started/1-Using-the-Dimcli-library-to-query-the-API.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open Dimensions API Lab In Google Colab"/></a>

# Dimcli: installation and getting started
This notebook shows how to use [DimCli](https://github.com/lambdamusic/dimcli), an open source Python library with commands that make it easier to interact with the Dimensions API from Python notebooks.

> This guide assumes that you already have a working Python 3 environment and [pip](https://pypi.org/project/pip/), the Python package manager, installed. For more background, see this [guide to installing Python](https://realpython.com/installing-python/).

## Installation

You can install DimCli as follows from a Jupyter notebook:

In [0]:
!pip install dimcli -U

Collecting dimcli
  Downloading https://files.pythonhosted.org/packages/71/6c/af7eb3124f8d30e8ec4835d3e52cc0d42e40263dd7455b8095a89f44a266/dimcli-0.6.1-py2.py3-none-any.whl (115kB)
Ins

Then each time you want to use it within a notebook you can load it like this:

In [0]:
import dimcli

## Authentication 

There are [different ways](https://github.com/lambdamusic/dimcli#the-credentials-file) to authenticate with the Dimensions API using DimCli. The easiest is to pass your credentials explicitly, like this:

In [0]:
dimcli.login(username="dimensions-username@me.com", password="my-secret-password")

DimCli v0.6.1 - Succesfully connected to <https://app.dimensions.ai> (method: manual login)


This method is handy if you want to log in quickly and cannot save a credentials file. However, it is not ideal if you want to protect your credentials, especially within a shared environment.
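One way to keep the password out of the notebook itself is to read it from environment variables before calling `dimcli.login`. The sketch below is only an illustration: the `credentials_from_env` helper and the variable names `DIMENSIONS_USERNAME` and `DIMENSIONS_PASSWORD` are hypothetical and are not defined by DimCli.

```python
import os


def credentials_from_env(env=None):
    """Return (username, password) read from environment variables.

    DIMENSIONS_USERNAME and DIMENSIONS_PASSWORD are hypothetical names
    chosen for this example; DimCli does not define them.
    """
    env = os.environ if env is None else env
    try:
        return env["DIMENSIONS_USERNAME"], env["DIMENSIONS_PASSWORD"]
    except KeyError as missing:
        # Fail with a readable message instead of a bare KeyError
        raise RuntimeError(f"Environment variable {missing} is not set") from None


# Usage (requires dimcli and both variables to be set):
# username, password = credentials_from_env()
# dimcli.login(username=username, password=password)
```

The credentials still reach `dimcli.login` as explicit arguments, so this only avoids writing them into the notebook file.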

### More secure method: storing a private credentials file

DimCli allows you to store your access credentials (e.g. email and password) in a file on your computer, so that you don't have to type them in each time.

> Tip: if you have access to a terminal prompt, you can also set up the credentials file by typing `dimcli --init` (see also [the docs](https://github.com/digital-science/dimcli#creating-a-credentials-file-using-the-helper-script-recommended)).

Your API credentials need to be stored in a file called `dsl.ini` in the current working directory (e.g. where your notebooks are located). The file contents should follow **exactly** this structure:

```
[instance.live]
url=https://app.dimensions.ai
login=user@mail.com
password=yourpasswordhere
```

* Make sure you don't change the `instance.live` directive (unless you know what you're doing).
* Update the `login` and `password` fields with your own credentials.
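If you would rather generate the file from Python than edit it by hand, the standard library's `configparser` can write the same structure. This is a minimal sketch: `write_credentials_file` is a hypothetical helper, not part of DimCli, and the values in the usage comment are the placeholders from the example above.

```python
import configparser
from pathlib import Path


def write_credentials_file(path, login, password, url="https://app.dimensions.ai"):
    """Write a dsl.ini-style file with a single [instance.live] section."""
    # interpolation=None lets passwords contain '%' without raising an error
    config = configparser.ConfigParser(interpolation=None)
    config["instance.live"] = {"url": url, "login": login, "password": password}
    with open(path, "w") as handle:
        # space_around_delimiters=False writes `key=value`, as in the layout above
        config.write(handle, space_around_delimiters=False)
    return Path(path)


# Example: create dsl.ini in the current working directory
# write_credentials_file("dsl.ini", "user@mail.com", "yourpasswordhere")
```

Since the file holds a plain-text password, it is worth keeping it out of version control and shared folders.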

If the file and credentials are entered correctly, then all you have to do is: 

In [0]:
dimcli.login()

DimCli v0.4.7 - Succesfully connected to <https://app.dimensions.ai> (method: dsl.ini file)


## Querying 

### Simple Querying

In [0]:
dsl = dimcli.Dsl()
data = dsl.query("""search publications for "black holes" return publications""")

Returned Publications: 20 (total = 1406075)


> PS you can turn off the *Returned publications...* feedback message by passing `verbose=False` to the query. 

The raw JSON data is accessible via the `json` property of the resulting object.

In [0]:
data.json.keys()

dict_keys(['_stats', 'publications'])

The main JSON keys of the returned data are also accessible as properties:

In [0]:
len(data.publications)

20

The `count_batch` and `count_total` properties provide quick shortcuts to find out how many records were returned and how many are available in total:

In [0]:
print("We got", data.count_batch, "results out of", data.count_total)

We got 20 results out of 1406075
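To make the shape of the response more concrete, the sketch below computes the same two counts from a hand-made dictionary that mimics the two top-level keys shown earlier. The record fields (`id`, `title`), the `total_count` entry under `_stats`, and the `summarize` helper are assumptions made for illustration; the values are invented and are not output from the API.

```python
# Hand-made stand-in for `data.json`; all values are invented for illustration
sample_json = {
    "_stats": {"total_count": 3},  # assumed layout of the stats block
    "publications": [
        {"id": "pub.0001", "title": "First example record"},
        {"id": "pub.0002", "title": "Second example record"},
    ],
}


def summarize(result_json, source="publications"):
    """Return (records in this batch, total records reported)."""
    records = result_json.get(source, [])
    total = result_json.get("_stats", {}).get("total_count")
    return len(records), total


batch, total = summarize(sample_json)
print("We got", batch, "results out of", total)
# prints: We got 2 results out of 3
```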


If the query returns an error, the `errors` and `errors_string` properties can be handy too:

In [0]:
# ps errors are printed out by default 
data = dsl.query("""search publications for "black holes" return spaceships""")

Returned Errors: 1
Semantic Error
Semantic errors found:
	Facet 'spaceships' is not present in source 'publications'. Available facets are: FOR,FOR_first,HRCS_HC,HRCS_RAC,RCDC,category_bra,category_for,category_hra,category_hrcs_hc,category_hrcs_rac,category_rcdc,funder_countries,funders,journal,mesh_terms,open_access_categories,publisher,research_org_cities,research_org_countries,research_org_state_codes,research_orgs,researchers,type,year


In [0]:
print(data.errors_string)

Semantic ErrorSemantic errors found:
	Facet 'spaceships' is not present in source 'publications'. Available facets are: FOR,FOR_first,HRCS_HC,HRCS_RAC,RCDC,category_bra,category_for,category_hra,category_hrcs_hc,category_hrcs_rac,category_rcdc,funder_countries,funders,journal,mesh_terms,open_access_categories,publisher,research_org_cities,research_org_countries,research_org_state_codes,research_orgs,researchers,type,year


### Iterative querying

Dimcli includes a utility for looping over a query that produces more than 1000 results (the max number of records a single query can return). 

A loop query is generated in the background using the [limit/skip syntax](https://docs.dimensions.ai/dsl/language.html#paginating-results) in order to extract all possible results. 

A few things to note: 

* Queries are spaced **one second** apart, in order to comply with the API limit of 30 queries per minute.
* The results are collated into a single `dimcli.Result` object (same as with normal querying) that can be accessed via the methods illustrated above.
* You can use `verbose=False` to turn off the notifications.
* You can pass `limit=500` (or any other number <= 1000) to specify how many records to extract per iteration; the default is 1000 (the maximum).
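The limit/skip loop described above can be sketched roughly as follows. This is not DimCli's actual implementation: `query_all`, `run_query` and `fake_backend` are hypothetical names, and the backend is a stand-in that simulates an API so the example runs without credentials.

```python
import time


def query_all(run_query, base_query, limit=1000, pause=1.0):
    """Collect every record by re-running base_query with limit/skip.

    run_query is a hypothetical callable: it takes a query string and
    returns (records, total_count) for that page.
    """
    records, skip = [], 0
    while True:
        page, total = run_query(f"{base_query} limit {limit} skip {skip}")
        records.extend(page)
        skip += limit
        if not page or skip >= total:
            break
        time.sleep(pause)  # space out requests to respect the rate limit
    return records


def fake_backend(total=2500):
    """Simulate an API holding `total` records (illustration only)."""
    def run_query(query):
        words = query.split()
        limit = int(words[words.index("limit") + 1])
        skip = int(words[words.index("skip") + 1])
        return list(range(skip, min(skip + limit, total))), total
    return run_query


results = query_all(fake_backend(), "search publications return publications", pause=0)
print(len(results))  # 2500
```

With 2500 simulated records and the default page size, the loop issues three queries (skip 0, 1000 and 2000) and stops once `skip` reaches the reported total.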


In [0]:
data = dsl.query_iterative("""search publications for "black holes" where year=1990 and times_cited > 10 return publications""")

1000 / 3077
2000 / 3077
3000 / 3077
3077 / 3077


In [0]:
len(data.publications)

3077

## What's next

DimCli contains a few [magic commands](https://ipython.readthedocs.io/en/stable/interactive/magics.html) which make it much easier to interrogate the API. See the other notebooks in this collection for more information.