# Databento Python client tutorial

**Welcome to the official Databento Python client library tutorial!**

This tutorial covers the following:
- Using the historical client to request metadata
- Using the historical client to request time series market data
- Working with the `Bento` data I/O helper object

**Tips:**
- We can call `help()` on any class or method to see its docstring


## Historical data client

Once we have installed the Python client library, we can import it and initialize a historical client for requests.
We'll use this `client` throughout the remainder of the tutorial.

To initialize a client, you'll need to provide a valid API access key. You can find your keys on the 'Access Keys' page of the user portal after logging in to your account at https://databento.com.

In [1]:
import databento as db

In [2]:
client = db.Historical(key="YOUR_ACCESS_KEY")
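
Hard-coding a key in a notebook makes it easy to leak. As a minimal sketch, the key could instead be read from an environment variable. The variable name `DATABENTO_API_KEY` and the helper `load_api_key` are assumptions made for this illustration; they are not part of the client library.

```python
import os


def load_api_key(env_var: str = "DATABENTO_API_KEY") -> str:
    """Return the API key stored in `env_var` (a hypothetical variable name)."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable to your access key")
    return key


# Usage with the import from the cells above (requires the variable to be set):
# client = db.Historical(key=load_api_key())
```

Raising early when the variable is missing or empty means the problem shows up at client setup, before any request is made.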

## Requesting metadata

Before we make any requests for actual data, we can explore various metadata to discover what is available.

The metadata endpoints below list the available datasets, the schemas offered for a dataset, and the fields of a given schema and encoding.

In [3]:
client.metadata.list_datasets()

['GLBX.MDP3', 'XNAS.ITCH']

In [4]:
client.metadata.list_schemas(dataset="GLBX.MDP3")

['mbo',
 'mbp-1',
 'mbp-10',
 'tbbo',
 'trades',
 'ohlcv-1s',
 'ohlcv-1m',
 'ohlcv-1h',
 'ohlcv-1d',
 'definition',
 'statistics',
 'status']

In [5]:
client.metadata.list_fields(dataset="GLBX.MDP3", schema="trades", encoding="csv")

{'GLBX.MDP3': {'csv': {'trades': {'ts_recv': 'int',
    'ts_event': 'int',
    'ts_in_delta': 'int',
    'pub_id': 'int',
    'product_id': 'int',
    'action': 'string',
    'side': 'string',
    'flags': 'int',
    'price': 'int',
    'size': 'int',
    'sequence': 'int'}}}}
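
The response is nested by dataset, then encoding, then schema. As a minimal sketch of navigating it, the block below uses a literal copy of the output above instead of a live request; the helper `field_names` is hypothetical.

```python
# Literal copy of the `list_fields` output shown above.
fields = {
    "GLBX.MDP3": {
        "csv": {
            "trades": {
                "ts_recv": "int",
                "ts_event": "int",
                "ts_in_delta": "int",
                "pub_id": "int",
                "product_id": "int",
                "action": "string",
                "side": "string",
                "flags": "int",
                "price": "int",
                "size": "int",
                "sequence": "int",
            }
        }
    }
}


def field_names(response: dict, dataset: str, encoding: str, schema: str) -> list:
    """Return the field names for one dataset/encoding/schema combination."""
    return list(response[dataset][encoding][schema])


names = field_names(fields, "GLBX.MDP3", "csv", "trades")

# Pick out the fields that are not integers.
string_fields = [
    name
    for name, kind in fields["GLBX.MDP3"]["csv"]["trades"].items()
    if kind == "string"
]

print(names)
print(string_fields)  # ['action', 'side']
```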

## Requesting time series data

Now we will request some historical time series data, which we will use throughout the remainder of the tutorial.

Here we request all E-mini S&P 500 futures contracts active between 2020-12-27 and 2020-12-30 using `smart` symbology:

In [6]:
data = client.timeseries.stream(
    dataset="GLBX.MDP3",
    symbols="ES.FUT",
    stype_in="smart",
    schema="mbo",
    start="2020-12-27",
    end="2020-12-30",
    encoding="dbz",
    compression="zstd",
    limit=1000,  # <-- request limited to 1000 records
)

## Working with the Bento helper object

Every time series response comes with a metadata header, exposed through the `Bento` object (`data` above), which includes:
- The original query parameters
- Symbology mappings
- Instrument 'mini-definitions'

### Metadata properties

In [7]:
data.dataset

'GLBX.MDP3'

In [8]:
data.schema

<Schema.MBO: 'mbo'>

In [9]:
data.symbols

['ES.FUT']

In [10]:
data.stype_in

<SType.SMART: 'smart'>

In [11]:
data.stype_out

<SType.PRODUCT_ID: 'product_id'>

In [12]:
data.start

Timestamp('2020-12-27 00:00:00+0000', tz='UTC')

In [13]:
data.end

Timestamp('2020-12-30 00:00:00+0000', tz='UTC')

In [14]:
data.encoding

<Encoding.DBZ: 'dbz'>

In [15]:
data.compression

<Compression.ZSTD: 'zstd'>

In [16]:
data.shape

(1000, 14)

In [17]:
data.dtype

dtype([('nwords', 'u1'), ('type', 'u1'), ('pub_id', '<u2'), ('product_id', '<u4'), ('ts_event', '<u8'), ('order_id', '<u8'), ('price', '<i8'), ('size', '<u4'), ('flags', 'i1'), ('chan_id', 'u1'), ('side', 'S1'), ('action', 'S1'), ('ts_recv', '<u8'), ('ts_in_delta', '<i4'), ('sequence', '<u4')])

In [18]:
data.struct_size

56
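
The field widths in `data.dtype` add up to the `struct_size` of 56 bytes, which suggests each record is a packed little-endian structure. As a sketch, the same layout can be expressed with the standard-library `struct` module. The format codes are a translation of the NumPy codes above, and the sample record values are made up for illustration.

```python
import struct

# Field layout copied from the `data.dtype` output above: (name, struct code).
MBO_FIELDS = [
    ("nwords", "B"),       # u1
    ("type", "B"),         # u1
    ("pub_id", "H"),       # <u2
    ("product_id", "I"),   # <u4
    ("ts_event", "Q"),     # <u8
    ("order_id", "Q"),     # <u8
    ("price", "q"),        # <i8
    ("size", "I"),         # <u4
    ("flags", "b"),        # i1
    ("chan_id", "B"),      # u1
    ("side", "c"),         # S1
    ("action", "c"),       # S1
    ("ts_recv", "Q"),      # <u8
    ("ts_in_delta", "i"),  # <i4
    ("sequence", "I"),     # <u4
]

# "<" selects little-endian byte order with no padding between fields.
MBO_STRUCT = struct.Struct("<" + "".join(code for _, code in MBO_FIELDS))


def unpack_record(buf: bytes) -> dict:
    """Decode one packed record into a dict keyed by field name."""
    values = MBO_STRUCT.unpack(buf)
    return dict(zip((name for name, _ in MBO_FIELDS), values))


print(MBO_STRUCT.size)  # 56

# Round-trip a made-up record to check the layout.
sample = MBO_STRUCT.pack(0, 0, 1, 5482, 0, 1, 0, 1, 0, 0, b"B", b"A", 0, 0, 1)
record = unpack_record(sample)
print(record["product_id"], record["side"], record["action"])
```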

### Symbology resolution

The metadata contains all the information that a `symbology.resolve` request would have provided:

In [19]:
data.symbology

{'result': {'ESH1': [{'t0': '2020-12-27', 't1': '2020-12-30', 's': '5482'}],
  'ESH1-ESH2': [{'t0': '2020-12-27', 't1': '2020-12-30', 's': '21885'}],
  'ESH1-ESM1': [{'t0': '2020-12-27', 't1': '2020-12-30', 's': '19651'}],
  'ESH1-ESU1': [{'t0': '2020-12-27', 't1': '2020-12-30', 's': '4223'}],
  'ESH1-ESZ1': [{'t0': '2020-12-27', 't1': '2020-12-30', 's': '20076'}],
  'ESH2': [{'t0': '2020-12-27', 't1': '2020-12-30', 's': '5782'}],
  'ESM1': [{'t0': '2020-12-27', 't1': '2020-12-30', 's': '3853'}],
  'ESM1-ESH2': [{'t0': '2020-12-27', 't1': '2020-12-30', 's': '22223'}],
  'ESM1-ESU1': [{'t0': '2020-12-27', 't1': '2020-12-30', 's': '4673'}],
  'ESM1-ESZ1': [{'t0': '2020-12-27', 't1': '2020-12-30', 's': '22279'}],
  'ESU1': [{'t0': '2020-12-27', 't1': '2020-12-30', 's': '1030'}],
  'ESU1-ESH2': [{'t0': '2020-12-27', 't1': '2020-12-30', 's': '16280'}],
  'ESU1-ESZ1': [{'t0': '2020-12-27', 't1': '2020-12-30', 's': '20117'}],
  'ESZ1': [{'t0': '2020-12-27', 't1': '2020-12-30', 's': '8858'}],


### Symbology mappings

The symbology metadata includes the mappings from each resolved 'in' symbol to its value in the 'out' symbol type (here `product_id`), together with the date range over which the mapping applies.

In [20]:
data.mappings

{'ESH1': [{'t0': '2020-12-27', 't1': '2020-12-30', 's': '5482'}],
 'ESH1-ESH2': [{'t0': '2020-12-27', 't1': '2020-12-30', 's': '21885'}],
 'ESH1-ESM1': [{'t0': '2020-12-27', 't1': '2020-12-30', 's': '19651'}],
 'ESH1-ESU1': [{'t0': '2020-12-27', 't1': '2020-12-30', 's': '4223'}],
 'ESH1-ESZ1': [{'t0': '2020-12-27', 't1': '2020-12-30', 's': '20076'}],
 'ESH2': [{'t0': '2020-12-27', 't1': '2020-12-30', 's': '5782'}],
 'ESM1': [{'t0': '2020-12-27', 't1': '2020-12-30', 's': '3853'}],
 'ESM1-ESH2': [{'t0': '2020-12-27', 't1': '2020-12-30', 's': '22223'}],
 'ESM1-ESU1': [{'t0': '2020-12-27', 't1': '2020-12-30', 's': '4673'}],
 'ESM1-ESZ1': [{'t0': '2020-12-27', 't1': '2020-12-30', 's': '22279'}],
 'ESU1': [{'t0': '2020-12-27', 't1': '2020-12-30', 's': '1030'}],
 'ESU1-ESH2': [{'t0': '2020-12-27', 't1': '2020-12-30', 's': '16280'}],
 'ESU1-ESZ1': [{'t0': '2020-12-27', 't1': '2020-12-30', 's': '20117'}],
 'ESZ1': [{'t0': '2020-12-27', 't1': '2020-12-30', 's': '8858'}],
 'ESZ1-ESH2': [{'t0': '2
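
Since the records themselves carry only the numeric `product_id`, a natural next step is to invert these mappings. The sketch below uses a few entries copied from the output above. The helper `invert_mappings` is hypothetical, and the values under `'s'` are assumed to be the `product_id` of each symbol as a string, consistent with `stype_out`.

```python
# A few entries copied from the `data.mappings` output above.
mappings = {
    "ESH1": [{"t0": "2020-12-27", "t1": "2020-12-30", "s": "5482"}],
    "ESH2": [{"t0": "2020-12-27", "t1": "2020-12-30", "s": "5782"}],
    "ESM1": [{"t0": "2020-12-27", "t1": "2020-12-30", "s": "3853"}],
}


def invert_mappings(mappings: dict) -> dict:
    """Map each product_id (int) to a list of (symbol, t0, t1) intervals."""
    inverted = {}
    for symbol, intervals in mappings.items():
        for interval in intervals:
            product_id = int(interval["s"])  # assumed to be the product_id
            inverted.setdefault(product_id, []).append(
                (symbol, interval["t0"], interval["t1"])
            )
    return inverted


by_product_id = invert_mappings(mappings)
print(by_product_id[5482])  # [('ESH1', '2020-12-27', '2020-12-30')]
```

Keeping the `t0` and `t1` dates alongside each symbol matters for longer requests, where an identifier may map to different symbols over time.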

### Instrument definitions

The metadata also contains 'mini-definitions', which are a subset of the full `definition` schema (also available through the `timeseries` endpoint).

In [21]:
data.definitions

{'ESH1': [{'symbol': 'ESH1',
   'date': 20201227,
   'asset': 'ES',
   'exchange': 'XCME',
   'security_type': 'FUT',
   'min_price_increment': 25000000000,
   'display_factor': 10000000,
   'activation': 1576852200000000000,
   'expiration': 1616160600000000000,
   'currency': 'USD',
   'ts_event': 1609088715926494829}],
 'ESH2': [{'symbol': 'ESH2',
   'date': 20201227,
   'asset': 'ES',
   'exchange': 'XCME',
   'security_type': 'FUT',
   'min_price_increment': 25000000000,
   'display_factor': 10000000,
   'activation': 1608301800000000000,
   'expiration': 1647610200000000000,
   'currency': 'USD',
   'ts_event': 1609088715926494829}],
 'ESM1': [{'symbol': 'ESM1',
   'date': 20201227,
   'asset': 'ES',
   'exchange': 'XCME',
   'security_type': 'FUT',
   'min_price_increment': 25000000000,
   'display_factor': 10000000,
   'activation': 1584711000000000000,
   'expiration': 1624023000000000000,
   'currency': 'USD',
   'ts_event': 1609088715926494829}],
 'ESZ1': [{'symbol': 'ESZ1',

In [22]:
data.instrument('ESH1')

[{'symbol': 'ESH1',
  'date': 20201227,
  'asset': 'ES',
  'exchange': 'XCME',
  'security_type': 'FUT',
  'min_price_increment': 25000000000,
  'display_factor': 10000000,
  'activation': 1576852200000000000,
  'expiration': 1616160600000000000,
  'currency': 'USD',
  'ts_event': 1609088715926494829}]
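
The integer fields in a definition are not directly human-readable. As a sketch, the timestamps can be decoded by treating them as nanoseconds since the UNIX epoch, and a displayed tick size can be derived from `min_price_increment` and `display_factor`. Both interpretations are assumptions that this tutorial does not state. In particular, the code assumes both price fields are fixed-point integers scaled by 1e-9. The helper names are hypothetical.

```python
from datetime import datetime, timezone
from decimal import Decimal

# Values copied from the 'ESH1' definition above.
definition = {
    "min_price_increment": 25000000000,
    "display_factor": 10000000,
    "activation": 1576852200000000000,
    "expiration": 1616160600000000000,
}

# Assumed fixed-point scale; not stated in the tutorial.
FIXED_POINT_SCALE = Decimal(10) ** -9


def ns_to_datetime(ts_ns: int) -> datetime:
    """Convert nanoseconds since the epoch to a UTC datetime (whole seconds only)."""
    return datetime.fromtimestamp(ts_ns // 1_000_000_000, tz=timezone.utc)


def display_tick_size(defn: dict) -> Decimal:
    """Tick size in display units, under the fixed-point assumption above."""
    increment = defn["min_price_increment"] * FIXED_POINT_SCALE
    factor = defn["display_factor"] * FIXED_POINT_SCALE
    return (increment * factor).normalize()


print(ns_to_datetime(definition["activation"]))  # 2019-12-20 14:30:00+00:00
print(ns_to_datetime(definition["expiration"]))  # 2021-03-19 13:30:00+00:00
print(display_tick_size(definition))             # 0.25
```

`Decimal` is used instead of `float` so that the scaled values stay exact.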