---
**Execute this tutorial in Binder** [![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/MolSSI/zenopy/5a28d6babe6d197bfd3725f0b207065789fc0a9a?urlpath=lab%2Ftree%2Fdocs%2Fnotebooks%2Fquick-start.ipynb)

---

This guide demonstrates how you can use ``zenopy`` to search through Zenodo's public records and
retrieve individual records from those searches. 
For more tutorials, see the [documentation](https://molssi.github.io/zenopy).

### **Create a config file**

---
**Note**

This section is not required for searching Zenodo's public records.
However, Zenodo
[recommends](https://developers.zenodo.org/#authentication)
that all API requests be authenticated and sent over HTTPS.

---

When you use ``zenopy`` for the first time, you need to give it access to your Zenodo (Sandbox) account. The first step is to store your authentication token(s) in a config file. You can create a new text file (say, **.zenodorc**) in the current directory and store your tokens in it as shown below

In [None]:
%%writefile ./.zenodorc
[ZENODO]
token = 

[SANDBOX]
token = 

---
**Note**

If you do not know how to create an authentication token, refer to the [documentation](https://molssi.github.io/zenopy/howtos/client/cli_token.html).

---

The token stored in the config file will be used by Zenodo servers to authenticate your API requests.
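
To make the file format concrete, here is a minimal sketch that parses a config with the same two sections using Python's standard ``configparser`` module and builds a bearer-token header, one of the authentication schemes accepted by Zenodo's REST API. The helper names ``load_tokens`` and ``auth_header`` are hypothetical and are not part of ``zenopy``, whose internal handling of the file may differ. The token value is a placeholder.

```python
import configparser

# Placeholder config text with the same layout as the .zenodorc file above.
SAMPLE_CONFIG = """
[ZENODO]
token = my-zenodo-token

[SANDBOX]
token =
"""


def load_tokens(config_text: str) -> dict[str, str]:
    """Return the token stored in each section ("" when it is missing)."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    return {
        section: parser.get(section, "token", fallback="").strip()
        for section in ("ZENODO", "SANDBOX")
    }


def auth_header(token: str) -> dict[str, str]:
    """Build an HTTP bearer-token header; an empty token yields no header."""
    return {"Authorization": f"Bearer {token}"} if token else {}


tokens = load_tokens(SAMPLE_CONFIG)
print(tokens)
print(auth_header(tokens["ZENODO"]))
```

An empty ``token`` entry is read as an empty string, so a file with unfilled sections still parses without errors.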

### **Initialize the client**

The next step is to import the ``zenopy`` package

In [1]:
import zenopy

The recommended way to create a client is to call ``zenopy``'s client constructor and
tell it where the config file is

In [None]:
cli = zenopy.Zenodo(config_file_path="./.zenodorc", use_sandbox=False)

However, because we plan to use ``zenopy`` only for searching through Zenodo's
public (and published) records in this tutorial, we can create the client object
without an authentication token by passing an empty string.

In [2]:
cli = zenopy.Zenodo(token="", use_sandbox=False)



---
**Note**

Do not forget to set the ``use_sandbox`` argument to ``False``.
Otherwise, you will be searching through Zenodo Sandbox's servers.

---

Great. Now, the ``zenopy`` client is connected to the Zenodo servers
and is ready to be used to search through their records.
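
Zenodo and Zenodo Sandbox are separate services hosted at ``zenodo.org`` and ``sandbox.zenodo.org``, respectively. The sketch below illustrates the kind of choice the ``use_sandbox`` flag presumably controls. The function ``api_base_url`` is hypothetical and only mirrors the public host names; it is not ``zenopy``'s actual implementation.

```python
def api_base_url(use_sandbox: bool) -> str:
    """Return the REST API root for Zenodo or for its Sandbox."""
    host = "sandbox.zenodo.org" if use_sandbox else "zenodo.org"
    return f"https://{host}/api"


print(api_base_url(use_sandbox=False))  # https://zenodo.org/api
print(api_base_url(use_sandbox=True))   # https://sandbox.zenodo.org/api
```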

### **Search through Zenodo's public records**

The client object allows us to create an instance of the ``_Records`` class

In [3]:
rec_obj = cli.init_records()
rec_obj

<zenopy.records._Records at 0x7f9ca868ad10>

Now, you can call **rec_obj**'s ``list_records()`` method to search
through Zenodo's public records.

In [22]:
rec_list = rec_obj.list_records()
rec_list

The value of 'content_type' argument is None.
ZenoPy will adopt JSON encoding.

The value of 'status' argument is None.
ZenoPy will search for 'published' record.

The value of 'sort' argument is None.
ZenoPy will sort the search results according to 'bestmatch' sort option.



[<zenopy.record.Record at 0x7f9c917ee3b0>,
 <zenopy.record.Record at 0x7f9c917ee0b0>,
 <zenopy.record.Record at 0x7f9c917edba0>,
 <zenopy.record.Record at 0x7f9c917ee050>,
 <zenopy.record.Record at 0x7f9c917edfc0>,
 <zenopy.record.Record at 0x7f9c917edd50>,
 <zenopy.record.Record at 0x7f9c917ed8a0>,
 <zenopy.record.Record at 0x7f9c917ec370>,
 <zenopy.record.Record at 0x7f9c917edff0>,
 <zenopy.record.Record at 0x7f9c917edcf0>]

By default, ``list_records()`` returns a list of 10 public records.
You can change the number of records fetched by passing the ``size`` argument.

Before moving forward, let's create a utility function that prints the document
indices in the record list and their corresponding titles. This will save us
a little bit of time later when we want to inspect the results of other search
queries.

In [14]:
def print_record_titles(record_list: list[zenopy.record.Record]) -> None:
    """Print each record's position in the list next to its title."""
    if record_list is None:
        raise RuntimeError("The 'record_list' argument cannot be None.")
    # enumerate() yields (position, record) pairs
    for idx, record in enumerate(record_list):
        print(idx, record.title)

Nice. Now, let's try our utility function on the record list we just
obtained

In [23]:
print_record_titles(record_list=rec_list)

0 Data from: Pollen specialization is associated with later phenology in Osmia mason bees  (Hymenoptera: Megachilidae)
1 MillionConcepts/dustgoggles: v0.6.0
2 ТАКРОРИЙ ЭКИНЛАР ЭКИШДА ҲАМКОР ЭКИНЛАРНИНГ АҲАМИЯТИ
3 "Islimiy kompozitsiya tuzish"
4 Towards collective, evidence-based investments in open infrastructure
5 LEGOS-A
6 APĂRAREA DREPTULUI DE PROPRIETATE INDUSTRIALĂ PRIN MIJLOACE DE DREPT PENAL
7 Dataset to submitted manuscript "Trans-cis isomerization kinetics of cyanine dyes reports on the folding states of exogeneous RNA G-quadruplexes in live cells"
8 Estudo da UFSCar analisa células de um tipo agressivo de câncer de mama
9 globalbioticinteractions/globalbioticinteractions: v0.24.7


As you can see, an unfiltered search returns an assortment of records that may not be
relevant to our research. Let's narrow our search down and focus on the public
records that are available in the
[MolSSI Zenodo Community](https://zenodo.org/communities/molssi/?page=1&size=20).

In [8]:
molssi_rec_list = rec_obj.list_records(communities="molssi")

The value of 'content_type' argument is None.
ZenoPy will adopt JSON encoding.

The value of 'status' argument is None.
ZenoPy will search for 'published' record.

The value of 'sort' argument is None.
ZenoPy will sort the search results according to 'bestmatch' sort option.



This list is a sample of the scientific data made available by
members of the computational molecular sciences community. Let's inspect
the list using our ``print_record_titles()`` utility function.

In [16]:
print_record_titles(molssi_rec_list)

0 SEAMM: Simulation Environment for Atomistic and Molecular Modeling
1 Plug-in for SEAMM for building fluid systems with PACKMOL.
2 MolSSI Guidelines on APOD Cyclic Parallelization Strategy
3 Plug-in for SEAMM to allow custom python scripts in flowcharts
4 DES5M
5 DESS66 and DESS66x8
6 DES370K
7 DES15K
8 Plug-in for SEAMM to create structures from SMILES
9 MolSSI Formatting Guidelines for Machine Learning Products
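
The ``size`` and ``communities`` arguments used above have the same names as query parameters of Zenodo's REST records endpoint. As a rough sketch of what such a search looks like at the HTTP level, the snippet below assembles a query URL with the standard library. The helper ``build_search_url`` is hypothetical, and the assumption that ``zenopy`` forwards its arguments this way has not been checked against its source code.

```python
from urllib.parse import urlencode

ZENODO_RECORDS_URL = "https://zenodo.org/api/records"


def build_search_url(**params) -> str:
    """Assemble a records-search URL, skipping parameters set to None."""
    query = {key: value for key, value in params.items() if value is not None}
    if not query:
        return ZENODO_RECORDS_URL
    return f"{ZENODO_RECORDS_URL}?{urlencode(query)}"


# Five records from the MolSSI community; 'sort' is left unset.
print(build_search_url(communities="molssi", size=5, sort=None))
# https://zenodo.org/api/records?communities=molssi&size=5
```

No request is sent here; the snippet only shows how the search options combine into a single URL.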


### **Retrieving individual records**

Sometimes, we may be interested in retrieving individual
records from the results of a previous search and inspecting their
metadata more closely. To do so, we need the unique record ID,
which can be read from each record's ``_id`` attribute

In [18]:
des5m_id = molssi_rec_list[4]._id
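
Indexing with a hard-coded position such as ``4`` depends on the order of the search results, which can change as records are added or when a different sort option is used. A lookup by title is one way to avoid that dependency. The sketch below relies only on the ``title`` and ``_id`` attributes shown in this tutorial; the helper ``find_by_title`` is hypothetical, and the stand-in objects with placeholder IDs are there so the snippet runs without a network connection.

```python
from types import SimpleNamespace


def find_by_title(records, title):
    """Return the first record whose title equals `title`, or None."""
    return next((rec for rec in records if rec.title == title), None)


# Stand-ins for zenopy record objects; the IDs are placeholders.
demo_records = [
    SimpleNamespace(title="DES370K", _id=1),
    SimpleNamespace(title="DES5M", _id=2),
]

match = find_by_title(demo_records, "DES5M")
print(match._id)  # 2
```

With real search results, the same call would read ``find_by_title(molssi_rec_list, "DES5M")``.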

Now, we can fetch the record object corresponding to this ID
using the ``retrieve_record()`` method as shown below

In [25]:
rec_des5m = rec_obj.retrieve_record(id_=des5m_id)

We can now access all of the metadata via the record's ``data`` attribute

In [26]:
rec_des5m.data

{'conceptdoi': '10.5281/zenodo.5706001',
 'conceptrecid': '5706001',
 'created': '2021-11-16T18:22:07.847558+00:00',
 'doi': '10.5281/zenodo.5706002',
 'files': [{'bucket': '58116936-8b26-47fa-b9b5-8b3960779662',
   'checksum': 'md5:50d11239119f787c8137a78973c932e6',
   'key': 'DESS5M.zip',
   'links': {'self': 'https://zenodo.org/api/files/58116936-8b26-47fa-b9b5-8b3960779662/DESS5M.zip'},
   'size': 3825454761,
   'type': 'zip'}],
 'id': 5706002,
 'links': {'badge': 'https://zenodo.org/badge/doi/10.5281/zenodo.5706002.svg',
  'bucket': 'https://zenodo.org/api/files/58116936-8b26-47fa-b9b5-8b3960779662',
  'conceptbadge': 'https://zenodo.org/badge/doi/10.5281/zenodo.5706001.svg',
  'conceptdoi': 'https://doi.org/10.5281/zenodo.5706001',
  'doi': 'https://doi.org/10.5281/zenodo.5706002',
  'html': 'https://zenodo.org/record/5706002',
  'latest': 'https://zenodo.org/api/records/5706002',
  'latest_html': 'https://zenodo.org/record/5706002',
  'self': 'https://zenodo.org/api/records/5706

Because the returned information is a Python dictionary built from Zenodo's
JSON response, individual fields are easy to access. For example, let's print
the list of the authors behind the **DES5M** dataset

In [33]:
rec_des5m.metadata["creators"]

[{'affiliation': 'D. E. Shaw Research', 'name': 'Donchev, Alexander G'},
 {'affiliation': 'D. E. Shaw Research', 'name': 'Taube, Andrew G'},
 {'affiliation': 'D. E. Shaw Research', 'name': 'Decolvenaere, Elizabeth'},
 {'affiliation': 'D. E. Shaw Research', 'name': 'Hargus, Cory'},
 {'affiliation': 'D. E. Shaw Research', 'name': 'McGibbon, Robert T'},
 {'affiliation': 'D. E. Shaw Research', 'name': 'Law, Ka-Hei'},
 {'affiliation': 'D. E. Shaw Research', 'name': 'Gregersen, Brent A'},
 {'affiliation': 'D. E. Shaw Research', 'name': 'Li Je-Leun'},
 {'affiliation': 'D. E. Shaw Research', 'name': 'Palmo, Kim'},
 {'affiliation': 'D. E. Shaw Research', 'name': 'Siva. Karthik'},
 {'affiliation': 'D. E. Shaw Research', 'name': 'Bergdorf, Michael'},
 {'affiliation': 'D. E. Shaw Research', 'name': 'Klepeis. John L'},
 {'affiliation': 'D. E. Shaw Research', 'name': 'Shaw, David E'}]
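
Since these structures are ordinary lists and dictionaries, they can be post-processed with plain Python. The following sketch works on a short excerpt copied from the outputs above, so it runs without ``zenopy`` or network access. The helper names ``creator_names`` and ``total_size_gib`` are hypothetical.

```python
# Excerpts copied from the 'creators' and 'files' outputs shown above.
creators = [
    {"affiliation": "D. E. Shaw Research", "name": "Donchev, Alexander G"},
    {"affiliation": "D. E. Shaw Research", "name": "Taube, Andrew G"},
]
files = [{"key": "DESS5M.zip", "size": 3825454761, "type": "zip"}]


def creator_names(creator_list: list[dict]) -> list[str]:
    """Collect the 'name' field of every creator entry."""
    return [creator["name"] for creator in creator_list]


def total_size_gib(file_list: list[dict]) -> float:
    """Sum the file sizes (in bytes) and convert the total to GiB."""
    return sum(entry["size"] for entry in file_list) / 1024**3


print(creator_names(creators))
print(f"{total_size_gib(files):.2f} GiB")  # 3.56 GiB
```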