Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
GitBook: [master] 59 pages and 46 assets modified
1 parent
01481eb
commit 67bd17f
Showing 41 changed files with 113 additions and 238 deletions.
File renamed without changes (repeated for 23 files)
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,52 +1,2 @@ | ||
# Architecture | ||
|
||
![](../.gitbook/assets/image%20%286%29%20%283%29%20%283%29%20%283%29%20%283%29%20%283%29%20%283%29%20%283%29%20%283%29.png) | ||
|
||
## Sequence description | ||
|
||
1. **Log Raw Events:** Production backend applications are configured to emit internal state changes as events to a stream. | ||
2. **Create Stream Features:** Stream processing systems like Flink, Spark, and Beam are used to transform and refine events and to produce features that are logged back to the stream. | ||
3. **Log Streaming Features:** Both raw and refined events are logged into a data lake or batch storage location. | ||
4. **Create Batch Features:** ELT/ETL systems like Spark and SQL are used to transform data in the batch store. | ||
5. **Define and Ingest Features:** The Feast user defines [feature tables](feature-tables.md) based on the features available in batch and streaming sources and publishes these definitions to Feast Core. | ||
6. **Poll Feature Definitions:** The Feast Job Service polls for new or changed feature definitions. | ||
7. **Start Ingestion Jobs:** Every new feature table definition results in a new ingestion job being provisioned \(see limitations\). | ||
8. **Batch Ingestion:** Batch ingestion jobs are short-lived jobs that load data from batch sources into either an offline or online store \(see limitations\). | ||
9. **Stream Ingestion:** Streaming ingestion jobs are long-lived jobs that load data from stream sources into online stores. A stream source and batch source on a feature table must have the same features/fields. | ||
10. **Model Training:** A model training pipeline is launched. It uses the Feast Python SDK to retrieve a training dataset and trains a model. | ||
11. **Get Historical Features:** Feast exports a point-in-time correct training dataset based on the list of features and entity DataFrame provided by the model training pipeline. | ||
12. **Deploy Model:** The trained model binary \(together with its list of features\) is deployed into a model serving system. | ||
13. **Get Prediction:** A backend system makes a request for a prediction from the model serving service. | ||
14. **Retrieve Online Features:** The model serving service makes a request to the Feast Online Serving service for online features using a Feast SDK. | ||
15. **Return Prediction:** The model serving service makes a prediction using the returned features and returns the outcome. | ||
|
||
{% hint style="warning" %} | ||
Limitations | ||
|
||
* Feast 0.8 has no offline store. Batch retrieval reads directly from the source. We plan to implement an optional offline store in Feast 0.9. | ||
* Only Redis is supported for online storage. | ||
* Batch ingestion jobs must be triggered from your own scheduler like Airflow. Streaming ingestion jobs are automatically launched by the Feast Job Service. | ||
{% endhint %} | ||
|
||
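Step 11 of the sequence above relies on a point-in-time correct join. The following is a minimal standard-library sketch of that idea, not Feast's implementation: the function name `point_in_time_join` and the tuple layouts are assumptions made for illustration. For every entity row it picks the latest feature value whose event timestamp is at or before the entity timestamp, so no future data leaks into the training set.

```python
from bisect import bisect_right
from datetime import datetime


def point_in_time_join(entity_rows, feature_rows):
    """Join each (entity_key, entity_timestamp) row with the latest feature
    value observed at or before that timestamp (hypothetical helper)."""
    # Group feature rows by entity key, ordered by event timestamp.
    history_by_key = {}
    for key, event_ts, value in sorted(feature_rows, key=lambda r: (r[0], r[1])):
        history_by_key.setdefault(key, []).append((event_ts, value))

    joined = []
    for key, entity_ts in entity_rows:
        history = history_by_key.get(key, [])
        timestamps = [ts for ts, _ in history]
        # Number of feature events with event_ts <= entity_ts.
        idx = bisect_right(timestamps, entity_ts)
        value = history[idx - 1][1] if idx > 0 else None
        joined.append((key, entity_ts, value))
    return joined


# Illustrative data: total_trips_24h for driver D011234 at two points in time.
feature_rows = [
    ("D011234", datetime(2021, 1, 1, 8), 3),
    ("D011234", datetime(2021, 1, 1, 12), 11),
]
entity_rows = [
    ("D011234", datetime(2021, 1, 1, 7)),   # before any feature event
    ("D011234", datetime(2021, 1, 1, 9)),   # after the first event only
    ("D011234", datetime(2021, 1, 1, 13)),  # after both events
]
training_rows = point_in_time_join(entity_rows, feature_rows)
```

An entity timestamp that precedes every feature event yields `None` in this sketch; how missing values are represented in a real training dataset is outside what the text above specifies.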
## Components: | ||
|
||
A complete Feast deployment contains the following components: | ||
|
||
* **Feast Core:** Acts as the central registry for feature and entity definitions in Feast. | ||
* **Feast Job Service:** Manages data processing jobs that load data from sources into stores, and jobs that export training datasets. | ||
* **Feast Serving:** Provides low-latency access to feature values in an online store. | ||
* **Feast Python SDK CLI:** The primary user-facing SDK. Used to: | ||
* Manage feature definitions with Feast Core. | ||
* Launch jobs through the Feast Job Service. | ||
* Retrieve training datasets. | ||
* Retrieve online features. | ||
* **Online Store:** A database that stores only the latest feature values for each entity. It is populated either by batch ingestion jobs \(when the user has no streaming source\) or by a streaming ingestion job that reads from a stream source. Feast Online Serving looks up feature values from the online store. | ||
* **Offline Store:** The offline store persists batch data that has been ingested into Feast. This data is used for producing training datasets. | ||
* **Feast Spark SDK:** A Spark specific Feast SDK. Allows teams to use Spark for loading features into an online store and for building training datasets over offline sources. | ||
|
||
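To make the "latest feature values only" behaviour of the online store concrete, here is a small in-memory sketch. The class `LatestValueStore` and its methods are hypothetical and are not part of any Feast API; the sketch also assumes that "latest" is decided by event timestamp.

```python
from datetime import datetime


class LatestValueStore:
    """Toy stand-in for an online store: one row per entity key (hypothetical)."""

    def __init__(self):
        self._rows = {}  # entity key -> (event_timestamp, feature dict)

    def ingest(self, entity_key, event_timestamp, features):
        # Keep the incoming row only if it is at least as recent as the stored one.
        current = self._rows.get(entity_key)
        if current is None or event_timestamp >= current[0]:
            self._rows[entity_key] = (event_timestamp, dict(features))

    def lookup(self, entity_key):
        # Return the latest known feature values, or None for an unknown key.
        row = self._rows.get(entity_key)
        return None if row is None else row[1]


store = LatestValueStore()
store.ingest("D011234", datetime(2021, 1, 1, 12), {"total_trips_24h": 11})
store.ingest("D011234", datetime(2021, 1, 1, 8), {"total_trips_24h": 3})  # late arrival, ignored
latest = store.lookup("D011234")
```

Because older rows never overwrite newer ones, a batch job and a streaming job could in principle both write to such a store without reordering problems.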
Please see the [configuration reference](../reference/configuration-reference.md#overview) for more details on configuring these components. | ||
|
||
{% hint style="info" %} | ||
Java and Go Clients are also available for online feature retrieval. See [API Reference](../reference/api/). | ||
{% endhint %} | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,64 +1,2 @@ | ||
# Entities | ||
|
||
## Overview | ||
|
||
An entity is any domain object that can be modeled and about which information can be stored. Entities are usually recognizable concepts, either concrete or abstract, such as persons, places, things, or events. | ||
|
||
Examples of entities in the context of ride-hailing and food delivery: `customer`, `order`, `driver`, `restaurant`, `dish`, `area`. | ||
|
||
Entities are important in the context of feature stores since features are always properties of a specific entity. For example, we could have a feature `total_trips_24h` for driver `D011234` with a feature value of `11`. | ||
|
||
Feast uses entities in the following way: | ||
|
||
* Entities serve as the keys used to look up features for producing training datasets and online feature values. | ||
* Entities serve as a natural grouping of features in a feature table. A feature table must belong to an entity \(which could be a composite entity\). | ||
|
||
## Structure of an Entity | ||
|
||
When creating an entity specification, consider the following fields: | ||
|
||
* **Name**: Name of the entity | ||
* **Description**: Description of the entity | ||
* **Value Type**: Value type of the entity. Feast will attempt to coerce entity columns in your data sources into this type. | ||
* **Labels**: Labels are maps that allow users to attach their own metadata to entities | ||
|
||
A valid entity specification is shown below: | ||
|
||
```python | ||
from feast import Entity, ValueType | ||
|
||
customer = Entity( | ||
    name="customer_id", | ||
    description="Customer id for ride customer", | ||
    value_type=ValueType.INT64, | ||
    labels={}, | ||
) | ||
``` | ||
|
||
## Working with an Entity | ||
|
||
### Creating an Entity: | ||
|
||
```python | ||
# Create a customer entity; `client` is assumed to be a configured feast.Client | ||
customer_entity = Entity( | ||
    name="customer_id", description="ID of car customer", value_type=ValueType.INT64 | ||
) | ||
client.apply(customer_entity) | ||
``` | ||
|
||
### Updating an Entity: | ||
|
||
```python | ||
# Update a customer entity | ||
customer_entity = client.get_entity("customer_id") | ||
customer_entity.description = "ID of bike customer" | ||
client.apply(customer_entity) | ||
``` | ||
|
||
Permitted changes include: | ||
|
||
* The entity's description and labels | ||
|
||
The following changes are not permitted: | ||
|
||
* Project | ||
* Name of an entity | ||
* Type | ||
|
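The update rules above can be pictured as a check performed before a changed entity is accepted. The sketch below is illustrative only: `EntitySpec`, `IMMUTABLE_FIELDS` and `validate_update` are hypothetical names, not Feast classes, and the actual validation in Feast Core may differ.

```python
from dataclasses import dataclass, field, replace


@dataclass
class EntitySpec:
    """Hypothetical, simplified entity specification."""
    project: str
    name: str
    value_type: str
    description: str = ""
    labels: dict = field(default_factory=dict)


# Fields listed above as not permitted to change.
IMMUTABLE_FIELDS = ("project", "name", "value_type")


def validate_update(existing, updated):
    """Raise ValueError if the update touches an immutable field."""
    changed = [
        name for name in IMMUTABLE_FIELDS
        if getattr(existing, name) != getattr(updated, name)
    ]
    if changed:
        raise ValueError(
            "cannot change immutable entity fields: " + ", ".join(changed)
        )


existing = EntitySpec("default", "customer_id", "INT64", "ID of car customer")

# Permitted: only the description changes.
validate_update(existing, replace(existing, description="ID of bike customer"))

# Not permitted: the value type changes.
try:
    validate_update(existing, replace(existing, value_type="STRING"))
    rejected = False
except ValueError:
    rejected = True
```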
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,2 @@ | ||
# Feature Views | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,90 +1,2 @@ | ||
# Sources | ||
|
||
### Overview | ||
|
||
Sources are descriptions of external feature data and are registered to Feast as part of [feature tables](feature-tables.md). Once registered, Feast can ingest feature data from these sources into stores. | ||
|
||
Currently, Feast supports the following source types: | ||
|
||
#### Batch Source | ||
|
||
* File \(as in Spark\): Parquet \(only\). | ||
* BigQuery | ||
|
||
#### Stream Source | ||
|
||
* Kafka | ||
* Kinesis | ||
|
||
The following encodings are supported on streams: | ||
|
||
* Avro | ||
* Protobuf | ||
|
||
### Structure of a Source | ||
|
||
For both batch and stream sources, the following configurations are necessary: | ||
|
||
* **Event timestamp column**: Name of the column containing the timestamp at which the event occurred. Used during the point-in-time join of feature values to [entity timestamps](glossary.md#entity-timestamp). | ||
* **Created timestamp column**: Name of the column containing the timestamp at which the data was created. Used to deduplicate data when multiple copies of the same [entity key](glossary.md#entity-key) are ingested. | ||
|
||
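As a rough illustration of how the created timestamp can drive deduplication, the sketch below keeps, for each entity key and event timestamp, the row with the most recent created timestamp. The grouping key, the column name `driver_id` and the function `deduplicate` are assumptions for this example; the text above does not spell out Feast's exact rule.

```python
from datetime import datetime


def deduplicate(rows, entity_column="driver_id"):
    """Keep one row per (entity key, event timestamp): the most recently created."""
    latest = {}
    for row in rows:
        key = (row[entity_column], row["event_timestamp"])
        kept = latest.get(key)
        if kept is None or row["created_timestamp"] > kept["created_timestamp"]:
            latest[key] = row
    return list(latest.values())


rows = [
    {   # first copy of the event
        "driver_id": "D011234",
        "event_timestamp": datetime(2021, 1, 1, 12),
        "created_timestamp": datetime(2021, 1, 1, 12, 5),
        "total_trips_24h": 10,
    },
    {   # corrected copy of the same event, created later
        "driver_id": "D011234",
        "event_timestamp": datetime(2021, 1, 1, 12),
        "created_timestamp": datetime(2021, 1, 1, 13, 0),
        "total_trips_24h": 11,
    },
]
unique_rows = deduplicate(rows)
```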
Example data source specifications: | ||
|
||
{% tabs %} | ||
{% tab title="batch\_sources.py" %} | ||
```python | ||
from feast import FileSource | ||
from feast.data_format import ParquetFormat | ||
|
||
batch_file_source = FileSource( | ||
    file_format=ParquetFormat(), | ||
    file_url="file:///feast/customer.parquet", | ||
    event_timestamp_column="event_timestamp", | ||
    created_timestamp_column="created_timestamp", | ||
) | ||
``` | ||
{% endtab %} | ||
|
||
{% tab title="stream\_sources.py" %} | ||
```python | ||
from feast import KafkaSource | ||
from feast.data_format import ProtoFormat | ||
|
||
stream_kafka_source = KafkaSource( | ||
    bootstrap_servers="localhost:9094", | ||
    message_format=ProtoFormat(class_path="class.path"), | ||
    topic="driver_trips", | ||
    event_timestamp_column="event_timestamp", | ||
    created_timestamp_column="created_timestamp", | ||
) | ||
``` | ||
{% endtab %} | ||
{% endtabs %} | ||
|
||
The [Feast Python API documentation](https://api.docs.feast.dev/python/) provides more information about options to specify for the above sources. | ||
|
||
### Working with a Source | ||
|
||
#### Creating a Source | ||
|
||
Sources are defined as part of [feature tables](feature-tables.md): | ||
|
||
```python | ||
from feast import BigQuerySource, KinesisSource | ||
from feast.data_format import ProtoFormat | ||
|
||
batch_bigquery_source = BigQuerySource( | ||
    table_ref="gcp_project:bq_dataset.bq_table", | ||
    event_timestamp_column="event_timestamp", | ||
    created_timestamp_column="created_timestamp", | ||
) | ||
|
||
# A Kinesis stream is addressed by region and stream name, not by bootstrap servers | ||
stream_kinesis_source = KinesisSource( | ||
    record_format=ProtoFormat(class_path="class.path"), | ||
    region="us-east-1", | ||
    stream_name="driver_trips", | ||
    event_timestamp_column="event_timestamp", | ||
    created_timestamp_column="created_timestamp", | ||
) | ||
``` | ||
|
||
Feast ensures that the source complies with the schema of the feature table. These specified data sources can then be included inside a feature table specification and registered to Feast Core. | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,83 @@ | ||
--- | ||
description: Versioning policies and status of Feast components | ||
--- | ||
|
||
# Versioning Policy | ||
|
||
### Versioning policy and branch workflow | ||
|
||
Feast uses [semantic versioning](https://semver.org/). | ||
|
||
Contributors are encouraged to understand the branch workflow described below, since it determines where to branch from when making a change \(and thus the merge base for a pull request\). | ||
|
||
* Major and minor releases are cut from the `master` branch. | ||
* Each major and minor release has a long-lived maintenance branch, e.g., `v0.3-branch`. This is called a "release branch". | ||
* Pre-release release candidates are tagged from the release branch, e.g., `v0.3.0-rc.1`. | ||
* Stable patch version releases are tagged from the release candidates, e.g., `v0.3.0`. | ||
|
||
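The tag and branch naming shown in the list above can be checked mechanically. The sketch below is an assumption-laden illustration: the regular expression only covers the two tag shapes given as examples (`v0.3.0` and `v0.3.0-rc.1`), and `parse_tag` is a hypothetical helper, not part of Feast's release tooling.

```python
import re

# Matches tags such as v0.3.0 and v0.3.0-rc.1 (the shapes used in the examples).
TAG_PATTERN = re.compile(r"^v(\d+)\.(\d+)\.(\d+)(?:-rc\.(\d+))?$")


def parse_tag(tag):
    """Split a release tag into its parts and derive its release branch name."""
    match = TAG_PATTERN.match(tag)
    if match is None:
        raise ValueError(f"not a recognised release tag: {tag!r}")
    major, minor, patch, rc = match.groups()
    return {
        "major": int(major),
        "minor": int(minor),
        "patch": int(patch),
        # None means a stable release rather than a release candidate.
        "release_candidate": int(rc) if rc is not None else None,
        # Each major.minor release line has one long-lived branch.
        "release_branch": f"v{major}.{minor}-branch",
    }


candidate = parse_tag("v0.3.0-rc.1")
stable = parse_tag("v0.3.0")
```

Both tags resolve to the same release branch, which reflects the rule that release candidates and stable patch releases are cut from one maintenance branch per minor version.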
A release branch should be substantially _feature complete_ with respect to the intended release. Code that is committed to `master` may be merged or cherry-picked on to a release branch, but code that is directly committed to a release branch should be solely applicable to that release \(and should not be committed back to master\). | ||
|
||
In general, unless you're committing code that only applies to a particular release stream \(for example, temporary hot-fixes, back-ported security fixes, or image hashes\), you should base changes from `master` and then merge or cherry-pick to the release branch. | ||
|
||
### Feast Component Matrix | ||
|
||
The following table shows the **status** \(stable, beta, or alpha\) of Feast components. | ||
|
||
Application status indicators for Feast: | ||
|
||
* **Stable** means that the component has reached a sufficient level of stability and adoption that the Feast community has deemed the component stable. Please see the stability criteria below. | ||
* **Beta** means that the component is working towards a version 1.0 release. Beta does not mean a component is unstable; it simply means the component has not yet met the full stability criteria. | ||
* **Alpha** means that the component is in the early phases of development and/or integration into Feast. | ||
|
||
| Application | Status | Version | Notes | | ||
| :--- | :--- | :--- | :--- | | ||
| [Feast Serving](https://github.com/feast-dev/feast-java) | Beta | [v0.25.2](https://github.com/feast-dev/feast-java/releases/tag/v0.25.2) | APIs are considered stable and will not have breaking changes within 3 minor versions. | | ||
| [Feast Core](https://github.com/feast-dev/feast-java) | Beta | [v0.25.2](https://github.com/feast-dev/feast-java/releases/tag/v0.25.2) | At risk of deprecation | | ||
| [Feast Java Client](https://github.com/feast-dev/feast-java) | Beta | [v0.25.2](https://github.com/feast-dev/feast-java/releases/tag/v0.25.2) | | | ||
| [Feast Python SDK](https://github.com/feast-dev/feast) | Beta | [v0.9.4](https://github.com/feast-dev/feast/releases/tag/v0.9.4) | | | ||
| [Feast Go Client](https://github.com/feast-dev/feast) | Beta | [v0.9.4](https://github.com/feast-dev/feast/releases/tag/v0.9.4) | | | ||
| [Feast Spark Python SDK](https://github.com/feast-dev/feast-spark) | Alpha | [v0.1.2](https://github.com/feast-dev/feast-spark/releases/tag/v0.1.2) | | | ||
| [Feast Spark Launchers](https://github.com/feast-dev/feast-spark) | Alpha | [v0.1.2](https://github.com/feast-dev/feast-spark/releases/tag/v0.1.2) | | | ||
| [Feast Job Service](https://github.com/feast-dev/feast-spark) | Alpha | [v0.1.2](https://github.com/feast-dev/feast-spark/releases/tag/v0.1.2) | At risk of deprecation | | ||
| [Feast Helm Chart](https://github.com/feast-dev/feast-helm-charts) | Beta | [v0.100.4](https://github.com/feast-dev/feast-helm-charts/releases/tag/v0.100.4) | | | ||
|
||
Criteria for reaching _**stable**_ status: | ||
|
||
* Contributors from at least two organizations | ||
* Complete end-to-end test suite | ||
* Scalability and load testing if applicable | ||
* Automated release process \(docker images, PyPI packages, etc.\) | ||
* API reference documentation | ||
* No deprecative changes | ||
* Must include logging and monitoring | ||
|
||
Criteria for reaching **beta** status: | ||
|
||
* Contributors from at least two organizations | ||
* End-to-end test suite | ||
* API reference documentation | ||
* Deprecative changes must span multiple minor versions and allow for an upgrade path. | ||
|
||
### Levels of support <a id="levels-of-support"></a> | ||
|
||
Feast components have various levels of support based on the component status. | ||
|
||
| Application status | Level of support | | ||
| :--- | :--- | | ||
| Stable | The Feast community offers best-effort support for stable applications. Stable components will be offered long-term support. | | ||
| Beta | The Feast community offers best-effort support for beta applications. Beta applications will be supported for at least 2 more minor releases. | | ||
| Alpha | The response differs per application in alpha status, depending on the size of the community for that application and the current level of active development of the application. | | ||
|
||
### Support from the Feast community <a id="support-from-the-kubeflow-community"></a> | ||
|
||
Feast has an active and helpful community of users and contributors. | ||
|
||
The Feast community offers support on a best-effort basis for stable and beta applications. Best-effort support means that there’s no formal agreement or commitment to solve a problem but the community appreciates the importance of addressing the problem as soon as possible. The community commits to helping you diagnose and address the problem if all the following are true: | ||
|
||
* The cause falls within the technical framework that Feast controls. For example, the Feast community may not be able to help if the problem is caused by a specific network configuration within your organization. | ||
* Community members can reproduce the problem. | ||
* The reporter of the problem can help with further diagnosis and troubleshooting. | ||
|
||
Please see the [Community](../community.md) page for channels through which support can be requested. | ||
|