Add usage documentation
danilopeixoto committed Dec 20, 2021
1 parent 9e85e8a commit 7db6d05
Showing 5 changed files with 437 additions and 21 deletions.
222 changes: 214 additions & 8 deletions README.md
@@ -26,15 +26,15 @@ Feature store and data catalog for machine learning.

Install package:

```console
pip install metastore
```

### Development

Install package:

```console
pip install -e .[development]
```

@@ -44,43 +44,249 @@ pip install -e .[development]
Format source code:

```console
autopep8 --recursive --in-place setup.py metastore/ tests/
```

Lint source code:

```console
pylint setup.py metastore/ tests/
```

Test package:

```console
pytest
```

Report test coverage:

```console
pytest --cov --cov-fail-under 80
```

> **Note** The `--cov-fail-under 80` flag makes the run fail if total code coverage drops below 80%.

Generate documentation:

```console
sphinx-apidoc -f -e -T -d 2 -o docs/metastore/api-reference/ metastore/
```

Build documentation (optional):

```console
cd docs/
sphinx-build -b html metastore/ build/
```

## Usage

### Create project definition

```yaml
# metastore.yaml

project:
  name: 'customer_transactions'
  display_name: 'Customer transactions'
  description: 'Customer transactions feature store.'
  author: 'Metastore Developers'
  tags:
    - 'customer'
    - 'transaction'
  version: '1.0.0'
  credential_store:
    type: 'local'
    path: '/path/to/.env'
  metadata_store:
    type: 'file'
    path: 's3://path/to/metadata.db'
    s3_endpoint:
      type: 'secret'
      name: 'S3_ENDPOINT'
    s3_access_key:
      type: 'secret'
      name: 'S3_ACCESS_KEY'
    s3_secret_key:
      type: 'secret'
      name: 'S3_SECRET_KEY'
  feature_store:
    offline_store:
      type: 'file'
      path: 's3://path/to/features/'
      s3_endpoint:
        type: 'secret'
        name: 'S3_ENDPOINT'
      s3_access_key:
        type: 'secret'
        name: 'S3_ACCESS_KEY'
      s3_secret_key:
        type: 'secret'
        name: 'S3_SECRET_KEY'
    online_store:
      type: 'redis'
      hostname:
        type: 'secret'
        name: 'REDIS_HOSTNAME'
      port:
        type: 'secret'
        name: 'REDIS_PORT'
      database:
        type: 'secret'
        name: 'REDIS_DATABASE'
      password:
        type: 'secret'
        name: 'REDIS_PASSWORD'
  data_sources:
    - name: 'customer_transactions'
      type: 'postgresql'
      table: 'public.customer_transaction'
      hostname:
        type: 'secret'
        name: 'POSTGRESQL_HOSTNAME'
      port:
        type: 'secret'
        name: 'POSTGRESQL_PORT'
      database:
        type: 'secret'
        name: 'POSTGRESQL_DATABASE'
      username:
        type: 'secret'
        name: 'POSTGRESQL_USERNAME'
      password:
        type: 'secret'
        name: 'POSTGRESQL_PASSWORD'
```
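
The `type: 'secret'` entries in the project definition name keys that are presumably looked up in the credential store (here, a local `.env` file). The sketch below illustrates that idea only; it is not Metastore's implementation, and the helpers `parse_env` and `resolve_secrets` as well as the example values are hypothetical.

```python
# Hypothetical sketch: resolve {'type': 'secret', 'name': ...} references
# against KEY=VALUE pairs. Not part of the Metastore API.

def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and '#' comments."""
    secrets = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith('#') or '=' not in line:
            continue
        key, _, value = line.partition('=')
        secrets[key.strip()] = value.strip()
    return secrets


def resolve_secrets(node, secrets):
    """Recursively replace secret references with their stored values."""
    if isinstance(node, dict):
        if node.get('type') == 'secret' and set(node) == {'type', 'name'}:
            return secrets[node['name']]  # raises KeyError if the secret is missing
        return {key: resolve_secrets(value, secrets) for key, value in node.items()}
    if isinstance(node, list):
        return [resolve_secrets(item, secrets) for item in node]
    return node


# Example values only; a real .env file would hold the actual credentials.
env_text = """
# online store
REDIS_HOSTNAME=localhost
REDIS_PORT=6379
"""

online_store = {
    'type': 'redis',
    'hostname': {'type': 'secret', 'name': 'REDIS_HOSTNAME'},
    'port': {'type': 'secret', 'name': 'REDIS_PORT'},
}

resolved = resolve_secrets(online_store, parse_env(env_text))
print(resolved)
```

Values parsed this way are strings, so a real loader would still need to convert fields such as the port to the type the store expects.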

### Create feature definitions

```python
# feature_definitions.py

from datetime import timedelta

from metastore import (
    FeatureStore,
    FeatureGroup,
    Feature,
    ValueType
)


feature_store = FeatureStore(repository='/path/to/repository/')

customer_transactions_feature_group = FeatureGroup(
    name='customer_transactions',
    record_identifiers=['customer_id'],
    event_time_feature='timestamp',
    features=[
        Feature(name='customer_id', value_type=ValueType.INTEGER),
        Feature(name='timestamp', value_type=ValueType.STRING),
        Feature(name='daily_transactions', value_type=ValueType.FLOAT),
        Feature(name='total_transactions', value_type=ValueType.FLOAT)
    ]
)

feature_store.apply(customer_transactions_feature_group)
```

### Ingest features

```python
# ingest_features.py

from metastore import FeatureStore


feature_store = FeatureStore(repository='/path/to/repository/')

dataframe = feature_store.read_dataframe(
    'customer_transactions',
    index_column='customer_id',
    partitions=100
)

feature_store.ingest('customer_transactions', dataframe)
```

### Materialize features

```python
# materialize_features.py

from datetime import datetime, timedelta

from metastore import FeatureStore


feature_store = FeatureStore(repository='/path/to/repository/')

feature_store.materialize(
    'customer_transactions',
    end_date=datetime.utcnow(),
    expires_in=timedelta(days=1)
)
```

### Retrieve historical features

```python
# retrieve_historical_features.py

from datetime import datetime

import pandas as pd
from metastore import FeatureStore


feature_store = FeatureStore(repository='/path/to/repository/')

record_identifiers = pd.DataFrame({
    'customer_id': [1],
    'timestamp': [datetime.utcnow()]
})

dataframe = feature_store.get_historical_features(
    record_identifiers=record_identifiers,
    features=[
        'customer_transactions:daily_transactions',
        'customer_transactions:total_transactions'
    ]
).compute()

metadata = dataframe.attrs['metastore']
print(metadata)
```
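
Historical retrieval is commonly implemented as a point-in-time join: for each record identifier and timestamp, take the latest feature values recorded at or before that timestamp. Whether Metastore does exactly this is an assumption; the sketch below only illustrates the general technique with `pandas.merge_asof` on made-up data.

```python
import pandas as pd

# Made-up feature history for illustration (not produced by Metastore).
features = pd.DataFrame({
    'customer_id': [1, 1, 2],
    'timestamp': pd.to_datetime(['2021-12-01', '2021-12-10', '2021-12-05']),
    'daily_transactions': [3.0, 5.0, 1.0],
})

# Rows to look up: a record identifier plus the time of interest.
record_identifiers = pd.DataFrame({
    'customer_id': [1, 2],
    'timestamp': pd.to_datetime(['2021-12-05', '2021-12-20']),
})

# merge_asof requires both frames to be sorted by the "on" key.
joined = pd.merge_asof(
    record_identifiers.sort_values('timestamp'),
    features.sort_values('timestamp'),
    on='timestamp',
    by='customer_id',
    direction='backward'  # latest feature row at or before each timestamp
)

print(joined)
```

Using `direction='backward'` keeps values recorded after the requested timestamp out of the result, which is the property that prevents label leakage when building training sets.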

### Retrieve online features

```python
# retrieve_online_features.py

import pandas as pd
from metastore import FeatureStore


feature_store = FeatureStore(repository='/path/to/repository/')

record_identifiers = pd.DataFrame({
    'customer_id': [1]
})

dataframe = feature_store.get_online_features(
    record_identifiers=record_identifiers,
    features=[
        'customer_transactions:daily_transactions',
        'customer_transactions:total_transactions'
    ]
).compute()

metadata = dataframe.attrs['metastore']
print(metadata)
```
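
The examples above read metadata from `dataframe.attrs['metastore']`. `DataFrame.attrs` is a plain dictionary that pandas attaches to a frame and documents as experimental. The sketch below shows the mechanism in isolation; the metadata keys used here are hypothetical, since the fields Metastore actually stores are not listed in this README.

```python
import pandas as pd

dataframe = pd.DataFrame({
    'customer_id': [1],
    'daily_transactions': [3.0]
})

# Hypothetical metadata; the real keys written by Metastore may differ.
dataframe.attrs['metastore'] = {'feature_group': 'customer_transactions'}

# Use .get() so frames without attached metadata do not raise KeyError.
metadata = dataframe.attrs.get('metastore', {})
print(metadata)
```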

## Documentation

Please refer to the official [Metastore Documentation](https://metastore.readthedocs.io).
12 changes: 8 additions & 4 deletions docs/metastore/_static/styles/custom.css
@@ -10,14 +10,18 @@
}

 pre {
-  border: 1px solid #e6e6e6 !important;
   padding: 10px 20px !important;
+  background-color: #f8fafc !important;
+  border: none !important;
+  box-shadow: none !important;
+  border-radius: 0.2rem !important;
 }

 code {
-  padding: 2px 4px;
-  background-color: #fff3f8;
-  border-radius: 2px;
+  padding: 2px 4px !important;
+  background-color: #f8fafc !important;
+  border: none !important;
+  border-radius: 0.2rem !important;
 }

div.deprecated {
14 changes: 7 additions & 7 deletions docs/metastore/contributing.md
@@ -8,7 +8,7 @@

Install package:

```console
pip install -e .[development]
```

@@ -22,25 +22,25 @@ Set up a virtual environment for development.

Format source code:

```console
autopep8 --recursive --in-place setup.py metastore/ tests/
```

Lint source code:

```console
pylint setup.py metastore/ tests/
```

Test package:

```console
pytest
```

Report test coverage:

```console
pytest --cov --cov-fail-under 80
```

@@ -50,13 +50,13 @@ Set the `--cov-fail-under` flag to 80% to validate the code coverage metric.

Generate documentation:

```console
sphinx-apidoc -f -e -T -d 2 -o docs/metastore/api-reference/ metastore/
```

Build documentation (optional):

```console
cd docs/
sphinx-build -b html metastore/ build/
```
