GitBook: [master] 2 pages modified
woop authored and gitbook-bot committed Apr 8, 2021
1 parent 6d7678f commit f20ccfa
Showing 2 changed files with 128 additions and 0 deletions.
76 changes: 76 additions & 0 deletions docs/how-to-guides/create-a-feature-repository.md
@@ -1,2 +1,78 @@
# Create a feature repository

We believe that the best way to keep track of your feature definitions is to manage them as code. To define features, you describe your features and data sources in pure Python. The Feast CLI then reads the Python files containing these definitions, parses them, and helps you create and manage the infrastructure required to serve the features in production.

## What is a Feature Repository?

A Feature Repository is nothing more than a collection of Python files containing feature declarations, plus a config file with some Feast settings. Feast users typically store these files in a git repository, hence the name. Note, however, that Feast makes no hard assumptions about the structure of your source control repository and doesn't even require you to use git.

## Creating a Feature Repository

The easiest way to get started is to use the `feast init` command:

```bash
$ mkdir my_feature_repo && cd my_feature_repo
$ feast init
Generated feature_store.yaml and example features in example.py
Now try running `feast apply` to apply, or `feast materialize` to sync data to the online store
```

As you can see, all this does is create a Python file with feature definitions, some sample data, and a Feast configuration file for local development:

```bash
$ tree
.
├── data
│   └── driver_stats.parquet
├── example.py
└── feature_store.yaml

1 directory, 3 files
```

## What's Inside a Feature Repository

Feast configuration is stored in a file named `feature_store.yaml`. Python feature definition files can have any name, as long as it is a valid Python module name \(so no dashes\). A repository can also contain multiple definition files.
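
As a quick illustration of the naming rule, the sketch below checks whether a file name could be imported as a Python module. The helper `is_valid_module_filename` is hypothetical and not part of Feast; it relies only on the standard library.

```python
import keyword
from pathlib import Path


def is_valid_module_filename(filename: str) -> bool:
    """Return True if `filename` is a .py file whose stem is a valid module name.

    Hypothetical helper for illustration; Feast does not expose this function.
    """
    path = Path(filename)
    stem = path.stem
    # A module name must be a Python identifier and must not be a reserved keyword.
    return path.suffix == ".py" and stem.isidentifier() and not keyword.iskeyword(stem)


print(is_valid_module_filename("driver_features.py"))  # True
print(is_valid_module_filename("driver-features.py"))  # False: dashes are not allowed
```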

If you take a look at `feature_store.yaml` you'll see something like this:

{% code title="feature\_store.yaml" %}
```yaml
project: robust_tortoise
metadata_store: data/metadata.db
provider: local
online_store:
  local:
    path: data/online_store.db
```
{% endcode %}

Here `project` is a unique identifier for the Feature Repository, generated by `feast init`. The configuration file also uses the "local" provider, which is most useful for development because all data is stored and served locally on your computer. Because we're using the Local provider, both the metadata store and the online feature store are just files on your local file system.

Now, if you open `example.py` you'll see some example Feature View and Data Source definitions. The file is too large to quote here, but you should see something like this:

```python
from feast import Entity, Feature, FeatureView, ValueType
from feast.data_source import FileSource

...

driver_hourly_stats = FileSource(
    ...
)

driver = Entity(...)

driver_hourly_stats_view = FeatureView(
    name="driver_hourly_stats",
    entities=["driver_id"],
    ...
)
```

To declare Feature Views and other objects in a Feast Feature Repository, write Python code that instantiates the objects and sets their parameters, and assign each object to a top-level module variable.

The Feast CLI will process all Python files in the Feature Repository as modules and find all top-level variables. You don't need to name Python files or variables in any particular way; just make sure there is a separate variable for each Feast object.
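
The discovery step described above can be sketched with the standard library alone. This is an illustration of the general technique (import a file as a module, then inspect its top-level variables), not Feast's actual implementation. The `FeatureView` class here is a stand-in, and `toy_feast` and `find_top_level_objects` are made-up names for this example.

```python
import importlib.util
import sys
import tempfile
import types
from pathlib import Path


class FeatureView:
    """Hypothetical stand-in for a Feast object; NOT the real feast.FeatureView."""

    def __init__(self, name):
        self.name = name


# Expose the stand-in class under a made-up module name so that a definitions
# file can import it, much like real definition files import from `feast`.
toy_lib = types.ModuleType("toy_feast")
toy_lib.FeatureView = FeatureView
sys.modules["toy_feast"] = toy_lib


def find_top_level_objects(py_file, kinds):
    """Import `py_file` as a module and return its top-level instances of `kinds`."""
    spec = importlib.util.spec_from_file_location(py_file.stem, str(py_file))
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)  # runs the file's top-level code
    return {
        name: value for name, value in vars(module).items() if isinstance(value, kinds)
    }


SAMPLE_DEFINITIONS = '''
from toy_feast import FeatureView

driver_hourly_stats_view = FeatureView(name="driver_hourly_stats")
helper_value = 42  # ignored: not an instance of a recognised class
'''

with tempfile.TemporaryDirectory() as repo_dir:
    definitions_file = Path(repo_dir) / "example.py"
    definitions_file.write_text(SAMPLE_DEFINITIONS)
    found = find_top_level_objects(definitions_file, (FeatureView,))

for variable_name, obj in found.items():
    print(variable_name, "->", obj.name)  # driver_hourly_stats_view -> driver_hourly_stats
```

Because discovery is based on what each variable holds rather than on its name, the file and variable names are free to vary, which matches the behaviour described above.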



52 changes: 52 additions & 0 deletions docs/how-to-guides/deploy-a-feature-store.md
@@ -1,2 +1,54 @@
# Deploy a feature store

After creating a Feature Repository, we use the Feast CLI to create all the infrastructure required to serve the features we defined there.

{% hint style="info" %}
Here we'll be using the example repository we created in the previous guide, [Create a feature repository](create-a-feature-repository.md). You can re-create it by running `feast init` in a new directory.
{% endhint %}

## Deploying

To have Feast create all the infrastructure, run `feast apply` in the Feature Repository directory. The process should be straightforward:

```
$ feast apply
Processing example.py as example
Done!
```

Depending on whether the Feature Repository is configured to use the Local provider or one of the cloud providers such as GCP or AWS, this may take anywhere from a couple of seconds to a minute.

## What happens during `feast apply`

#### 1. Scan the Feature Repository

Feast will scan Python files in your Feature Repository, and find all Feast object definitions, such as Feature Views, Entities, and Data Sources.

#### 2. Update metadata

If all definitions look valid, Feast will sync the metadata about Feast objects to the Metadata Store. The Metadata Store is a small database that holds most of the same information you have in the Feature Repository, plus some state, in a more structured form. It is needed mainly because the production feature serving infrastructure cannot access the Python files in the Feature Repository at run time, but it can efficiently and securely read the feature definitions from the Metadata Store.
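
To make the idea concrete, here is a minimal sketch of syncing definitions into a small database and reading them back without touching any Python source file. It is an illustration only: the use of SQLite, the table layout, and the `sync_metadata` and `read_definition` helpers are all assumptions made for this example, not a description of how Feast stores its metadata.

```python
import json
import sqlite3


def sync_metadata(conn, project, definitions):
    """Store each definition as one row, replacing any earlier version of it.

    Hypothetical helper; the schema below is invented for this sketch.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS feature_views ("
        "project TEXT, name TEXT, spec TEXT, PRIMARY KEY (project, name))"
    )
    for name, spec in definitions.items():
        conn.execute(
            "INSERT OR REPLACE INTO feature_views (project, name, spec) VALUES (?, ?, ?)",
            (project, name, json.dumps(spec, sort_keys=True)),
        )
    conn.commit()


def read_definition(conn, project, name):
    """Read one definition back; returns None if it was never synced."""
    row = conn.execute(
        "SELECT spec FROM feature_views WHERE project = ? AND name = ?",
        (project, name),
    ).fetchone()
    return json.loads(row[0]) if row else None


# In-memory database for the example; a persistent store would use a file path.
conn = sqlite3.connect(":memory:")
sync_metadata(conn, "robust_tortoise", {"driver_hourly_stats": {"entities": ["driver_id"]}})
print(read_definition(conn, "robust_tortoise", "driver_hourly_stats"))
```

The serving side in this sketch only needs the database connection, which mirrors the reason given above for keeping metadata outside the Python files.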

#### 3. Create cloud infrastructure

At this step, the Feast CLI creates all the infrastructure needed for feature serving and materialization to work. Exactly what gets created depends on the provider configured in `feature_store.yaml` in the Feature Repository.

For example, with the Local provider, this is as simple as creating a file on your local filesystem to act as a key-value store from which feature data is served. The Local provider is best suited to local testing; no real production serving happens there.

A more interesting configuration is when Feast is configured to use the GCP provider, with Cloud Datastore storing the feature data. When you run `feast apply`, Feast will make sure you have valid credentials and create some metadata objects in Datastore for each Feature View.

Similarly, when using AWS, Feast will make sure that resources like DynamoDB tables are created for every Feature View.
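
The provider-dependent behaviour described above can be summarised as a simple dispatch. The sketch below only returns descriptions and creates nothing. The function name `plan_online_store_resources` is hypothetical, and the provider strings `"gcp"` and `"aws"` are assumed for illustration (only `local` appears in the configuration shown in this guide).

```python
def plan_online_store_resources(provider, feature_view_names):
    """Describe what would be created for a provider; performs no side effects.

    Hypothetical helper: it restates the behaviour described in this guide
    and is not part of Feast.
    """
    if provider == "local":
        # A single local file acts as the key-value store for all Feature Views.
        return ["local key-value store file"]
    if provider == "gcp":  # assumed provider name
        return [
            f"Datastore metadata objects for feature view '{name}'"
            for name in feature_view_names
        ]
    if provider == "aws":  # assumed provider name
        return [f"DynamoDB table for feature view '{name}'" for name in feature_view_names]
    raise ValueError(f"unknown provider: {provider!r}")


for line in plan_online_store_resources("aws", ["driver_hourly_stats"]):
    print(line)  # DynamoDB table for feature view 'driver_hourly_stats'
```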

{% hint style="warning" %}
Since `feast apply` \(when configured to use a non-Local provider\) will create cloud infrastructure in your AWS or GCP account, it may incur some costs on your cloud bill. We aim to keep the cost of Feast cloud resources low when they are not serving features, preferring "serverless" cloud services that bill per request. Even so, please refer to the documentation for your specific Provider to make sure there are no surprises.
{% endhint %}

## Cleaning up

If you no longer need the infrastructure, you can run `feast destroy` to clean up. **Note that this will irrevocably delete all data in the online store, so use it with care.**





