# Part 1 - Introduction

In [1]:
from mdf_forge.forge import Forge  # This is the only required import for Forge.

## Authentication
Authentication is handled automatically. Just follow the prompt once and let Forge take care of the rest.


In [2]:
# You can set up Forge with no arguments. Forge will automatically authenticate and connect to MDF.
mdf = Forge()

## Basic Queries

### Basic full text search
Using the `search()` method, you can perform a basic text search of the data in MDF.
You will get back a list of matching entries (up to 10,000).

Let's say we want to find data on aluminum. We can just search for "Al" like so:

In [3]:
res = mdf.search("Al")
res[0]

{'files': [{'data_type': 'ASCII text, with very long lines, with no line terminators',
   'filename': 'nist_xps_41530.json',
   'globus': 'globus://e38ee745-6d04-11e5-ba46-22000b92c6ec/MDF/mdf_connect/prod/data/nist_xps_db_v1/nist_xps_41530.json',
   'length': 1248,
   'mime_type': 'text/plain',
   'sha512': '69912ca91261bba53dc0df956338baebf81a3f9d1281f4e9108200c3b8473f073ffdff7437a55c8ac3d08d40074a68a5509bbeb1a391426f838427398f3963dd',
   'url': 'https://e38ee745-6d04-11e5-ba46-22000b92c6ec.e.globus.org/MDF/mdf_connect/prod/data/nist_xps_db_v1/nist_xps_41530.json'}],
 'material': {'composition': 'Al', 'elements': ['Al']},
 'mdf': {'ingest_date': '2018-03-28T15:04:09.928376Z',
  'mdf_id': '5abbaeff34a2263dfa3de0e1',
  'parent_id': '5abbaee934a2263dfa3dbf78',
  'resource_type': 'record',
  'scroll_id': 8553,
  'source_name': 'nist_xps_db_v1'},
 'nist_xps_db_v1': {'binding_energy_ev': '72.5',
  'energy_uncertainty_ev': '',
  'notes': 'Al(111).',
  'temperature_k': '300'}}
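
Each entry in the result list is a plain Python dictionary, so its fields can be read with ordinary indexing. The sketch below works on a trimmed copy of the record printed above rather than a live query, and `summarize_record` is a hypothetical helper for illustration, not part of Forge.

```python
# Trimmed copy of the record printed above (hash, Globus, and URL fields omitted).
record = {
    "files": [
        {
            "filename": "nist_xps_41530.json",
            "length": 1248,
            "mime_type": "text/plain",
        }
    ],
    "material": {"composition": "Al", "elements": ["Al"]},
    "mdf": {
        "mdf_id": "5abbaeff34a2263dfa3de0e1",
        "resource_type": "record",
        "source_name": "nist_xps_db_v1",
    },
}


def summarize_record(rec):
    """Hypothetical helper: pull a few commonly used fields out of one entry."""
    return {
        "source_name": rec["mdf"]["source_name"],
        # Not every entry necessarily has these blocks, so fall back gracefully.
        "composition": rec.get("material", {}).get("composition"),
        "filenames": [f["filename"] for f in rec.get("files", [])],
    }


summary = summarize_record(record)
print(summary)
```

With a live connection, the same helper could be applied to each element of `res`.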

### Advanced-mode searches
You can also query more precisely by passing `advanced=True`. The basic form is `key.subkey:value`. The full documentation for the query syntax can be found here: http://globus-search-docs.s3-website-us-east-1.amazonaws.com/stable/api/search.html#_query_syntax

In this example, we search for "Al" inside the `material.elements` key.

We're also going to limit the number of results to 10.

In [4]:
res = mdf.search("material.elements:Al", advanced=True, limit=10)
res[0]

{'crystal_structure': {'number_of_atoms': 4,
  'space_group_number': 225,
  'volume': 66.01028844534864},
 'files': [{'data_type': 'ASCII text',
   'filename': '15628.cif',
   'globus': 'globus://e38ee745-6d04-11e5-ba46-22000b92c6ec/MDF/mdf_connect/prod/data/amcs_v1/15628.cif',
   'length': 4073,
   'mime_type': 'text/plain',
   'sha512': 'cccd06663cc04a45b2fda17680e78d509e37d6d69a5bd1bf37695ab90e45ae602bd7a44dc365a3b2a2fa1708510a3803e94533c0c6624b25214dfdacb20aee1d',
   'url': 'https://e38ee745-6d04-11e5-ba46-22000b92c6ec.e.globus.org/MDF/mdf_connect/prod/data/amcs_v1/15628.cif'}],
 'material': {'composition': 'Al4', 'elements': ['Al']},
 'mdf': {'ingest_date': '2018-03-26T22:23:18.021883Z',
  'mdf_id': '5ab972da34a2262cce3694b9',
  'parent_id': '5ab972d634a2262cce368d5c',
  'resource_type': 'record',
  'scroll_id': 1885,
  'source_name': 'amcs_v1'}}
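
Judging from the output above, the dotted field name in the query mirrors the nesting of the returned dictionaries: `material.elements` lines up with `res[0]['material']['elements']`. A minimal sketch of that correspondence, using a trimmed copy of the record above; `get_by_path` is a hypothetical helper and not part of Forge.

```python
def get_by_path(rec, dotted_path):
    """Hypothetical helper: follow a 'key.subkey' path through nested dicts."""
    current = rec
    for key in dotted_path.split("."):
        current = current[key]  # raises KeyError if the path does not exist
    return current


# Trimmed copy of the record printed above.
record = {
    "crystal_structure": {"number_of_atoms": 4, "space_group_number": 225},
    "material": {"composition": "Al4", "elements": ["Al"]},
    "mdf": {"resource_type": "record", "source_name": "amcs_v1"},
}

elements = get_by_path(record, "material.elements")
print("Al" in elements)
```

Checking a returned record this way is a quick way to confirm that a query matched on the field you intended.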

If you want to search on a value with special characters, such as a colon or space, you must wrap the value in double quotes. Otherwise, you may get unexpected results.
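
One way to avoid forgetting the quotes is to build the query string with a small helper. This is a sketch only: `field_query` is a hypothetical function, not part of Forge, and it refuses values that themselves contain a double quote because the escaping rule for that case is not covered here.

```python
def field_query(field, value):
    """Hypothetical helper: build a `field:value` advanced-mode query string."""
    if '"' in value:
        # Escaping embedded quotes is outside the scope of this sketch.
        raise ValueError("values containing double quotes are not handled")
    if value.isalnum():
        # Plain alphanumeric values need no quoting.
        return f"{field}:{value}"
    # Anything with spaces, colons, hyphens, etc. is wrapped in double quotes.
    return f'{field}:"{value}"'


title_query = field_query(
    "dc.titles.title",
    "High-throughput Ab-initio Dilute Solute Diffusion Database",
)
element_query = field_query("material.elements", "Al")
print(title_query)
print(element_query)
```

Either string could then be passed to `mdf.search(..., advanced=True)`.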

In [5]:
res = mdf.search('dc.titles.title:"High-throughput Ab-initio Dilute Solute Diffusion Database"', advanced=True)
res[0]

{'dc': {'contributors': [{'affiliations': ['University of Wisconsin-Madison'],
    'contributorName': 'Morgan, Dane',
    'contributorType': 'ContactPerson',
    'familyName': 'Morgan',
    'givenName': 'Dane'}],
  'creators': [{'affiliations': ['University of Wisconsin-Madison'],
    'creatorName': 'Morgan, Dane',
    'familyName': 'Morgan',
    'givenName': 'Dane'},
   {'affiliations': ['University of Wisconsin-Madison'],
    'creatorName': 'Mayeshiba, Tam',
    'familyName': 'Mayeshiba',
    'givenName': 'Tam'},
   {'affiliations': ['University of Wisconsin-Madison'],
    'creatorName': 'Henry, Wu',
    'familyName': 'Henry',
    'givenName': 'Wu'}],
  'dates': [{'date': '2017-08-07T16:07:32.938812Z', 'dateType': 'Collected'}],
  'descriptions': [{'description': 'We demonstrate automated generation of diffusion databases from high-throughput density functional theory (DFT) calculations. A total of more than 230 dilute solute diffusion systems in Mg, Al, Cu, Ni, Pd, and Pt host latti