Commit

GitBook: [master] 59 pages and 46 assets modified
woop authored and gitbook-bot committed Apr 5, 2021
1 parent 01481eb commit 67bd17f
Showing 41 changed files with 113 additions and 238 deletions.
2 changes: 1 addition & 1 deletion docs/README.md
Original file line number Diff line number Diff line change
@@ -4,7 +4,7 @@

Feast \(**Fea**ture **St**ore\) is an operational data system for managing and serving machine learning features to models in production.

![](.gitbook/assets/feast-architecture-diagrams%20%281%29%20%281%29%20%281%29%20%282%29%20%283%29%20%284%29%20%283%29%20%281%29%20%281%29%20%281%29%20%285%29.svg)
![](.gitbook/assets/feast-architecture-diagrams%20%281%29%20%281%29%20%281%29%20%282%29%20%283%29%20%284%29%20%283%29%20%281%29%20%281%29%20%281%29%20%281%29%20%285%29.svg)

## Problems Feast Solves

29 changes: 11 additions & 18 deletions docs/SUMMARY.md
@@ -6,32 +6,24 @@
* [Changelog](https://github.com/feast-dev/feast/blob/master/CHANGELOG.md)
* [Community](community.md)

## Tutorials

## How-to Guides

---

* [Create a feature repository](create-a-feature-repository.md)
* [Deploy a feature store](deploy-a-feature-store.md)
* [Load data into the online store](load-data-into-the-online-store.md)
* [Build a training dataset](build-a-training-dataset.md)
* [Read features from the online store](read-features-from-the-online-store.md)
* [Create a feature repository](how-to-guides/create-a-feature-repository.md)
* [Deploy a feature store](how-to-guides/deploy-a-feature-store.md)
* [Load data into the online store](how-to-guides/load-data-into-the-online-store.md)
* [Build a training dataset](how-to-guides/build-a-training-dataset.md)
* [Read features from the online store](how-to-guides/read-features-from-the-online-store.md)

## Concepts

---

* [Architecture](architecture.md)
* [Feature Views](feature-views.md)
* [Entities](entities.md)
* [Sources](sources.md)
* [Architecture](concepts/architecture.md)
* [Feature Views](concepts/feature-views.md)
* [Entities](concepts/entities.md)
* [Sources](concepts/sources.md)

## Reference

---

* [Repository Config](repository-config.md)
* [Repository Config](reference/repository-config.md)
* [Python API reference](http://rtd.feast.dev/)

## Feast on Kubernetes
@@ -88,5 +80,6 @@

* [Contribution Process](contributing/contributing.md)
* [Development Guide](contributing/development-guide.md)
* [Versioning Policy](contributing/versioning-policy.md)
* [Release Process](contributing/release-process.md)

50 changes: 0 additions & 50 deletions docs/concepts/architecture.md
@@ -1,52 +1,2 @@
# Architecture

![](../.gitbook/assets/image%20%286%29%20%283%29%20%283%29%20%283%29%20%283%29%20%283%29%20%283%29%20%283%29%20%283%29.png)

## Sequence description

1. **Log Raw Events:** Production backend applications are configured to emit internal state changes as events to a stream.
2. **Create Stream Features:** Stream processing systems like Flink, Spark, and Beam are used to transform and refine events and to produce features that are logged back to the stream.
3. **Log Streaming Features:** Both raw and refined events are logged into a data lake or batch storage location.
4. **Create Batch Features:** ELT/ETL systems like Spark and SQL are used to transform data in the batch store.
5. **Define and Ingest Features:** The Feast user defines [feature tables](feature-tables.md) based on the features available in batch and streaming sources and publishes these definitions to Feast Core.
6. **Poll Feature Definitions:** The Feast Job Service polls for new or changed feature definitions.
7. **Start Ingestion Jobs:** Every new feature table definition results in a new ingestion job being provisioned \(see limitations\).
8. **Batch Ingestion:** Batch ingestion jobs are short-lived jobs that load data from batch sources into either an offline or online store \(see limitations\).
9. **Stream Ingestion:** Streaming ingestion jobs are long-lived jobs that load data from stream sources into online stores. A stream source and batch source on a feature table must have the same features/fields.
10. **Model Training:** A model training pipeline is launched. It uses the Feast Python SDK to retrieve a training dataset and trains a model.
11. **Get Historical Features:** Feast exports a point-in-time correct training dataset based on the list of features and entity DataFrame provided by the model training pipeline.
12. **Deploy Model:** The trained model binary \(and list of features\) are deployed into a model serving system.
13. **Get Prediction:** A backend system makes a request for a prediction from the model serving service.
14. **Retrieve Online Features:** The model serving service makes a request to the Feast Online Serving service for online features using a Feast SDK.
15. **Return Prediction:** The model serving service makes a prediction using the returned features and returns the outcome.

{% hint style="warning" %}
Limitations

* Feast 0.8 has no offline store. Batch retrieval is direct from source. We plan to implement an optional offline store in Feast 0.9.
* Only Redis is supported for online storage.
* Batch ingestion jobs must be triggered from your own scheduler like Airflow. Streaming ingestion jobs are automatically launched by the Feast Job Service.
{% endhint %}

## Components:

A complete Feast deployment contains the following components:

* **Feast Core:** Acts as the central registry for feature and entity definitions in Feast.
* **Feast Job Service:** Manages data processing jobs that load data from sources into stores, and jobs that export training datasets.
* **Feast Serving:** Provides low-latency access to feature values in an online store.
* **Feast Python SDK CLI:** The primary user-facing SDK. Used to:
  * Manage feature definitions with Feast Core.
  * Launch jobs through the Feast Job Service.
  * Retrieve training datasets.
  * Retrieve online features.
* **Online Store:** The online store is a database that stores only the latest feature values for each entity. It can be populated either by batch ingestion jobs \(when the user has no streaming source\) or by a streaming ingestion job from a streaming source. Feast Online Serving looks up feature values from the online store.
* **Offline Store:** The offline store persists batch data that has been ingested into Feast. This data is used for producing training datasets.
* **Feast Spark SDK:** A Spark-specific Feast SDK. Allows teams to use Spark for loading features into an online store and for building training datasets over offline sources.
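
The "latest value only" behaviour of the online store described above can be illustrated with a small in-memory sketch. This is not Feast code: `LatestValueStore` and its methods are hypothetical names used only for illustration, and a real deployment would use Redis behind Feast Serving. The entity key and feature name are borrowed from the Entities page.

```python
from datetime import datetime
from typing import Any, Dict, Optional, Tuple


class LatestValueStore:
    """Toy stand-in for an online store (hypothetical, not a Feast class)."""

    def __init__(self) -> None:
        # Maps (entity_key, feature_name) -> (event_timestamp, value).
        self._rows: Dict[Tuple[str, str], Tuple[datetime, Any]] = {}

    def write(self, entity_key: str, feature: str, event_ts: datetime, value: Any) -> None:
        key = (entity_key, feature)
        current = self._rows.get(key)
        # Keep the write only if it is at least as recent as the stored value.
        if current is None or event_ts >= current[0]:
            self._rows[key] = (event_ts, value)

    def read(self, entity_key: str, feature: str) -> Optional[Any]:
        row = self._rows.get((entity_key, feature))
        return None if row is None else row[1]


store = LatestValueStore()
store.write("D011234", "total_trips_24h", datetime(2021, 4, 5, 10), 11)
# An older event arriving late does not overwrite the newer value.
store.write("D011234", "total_trips_24h", datetime(2021, 4, 5, 9), 7)
latest = store.read("D011234", "total_trips_24h")
```

Because only one row is kept per entity key and feature, lookups stay cheap, but historical values are not available from this store; training datasets are built from batch data instead.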

Please see the [configuration reference](../reference/configuration-reference.md#overview) for more details on configuring these components.

{% hint style="info" %}
Java and Go Clients are also available for online feature retrieval. See [API Reference](../reference/api/).
{% endhint %}

62 changes: 0 additions & 62 deletions docs/concepts/entities.md
Original file line number Diff line number Diff line change
@@ -1,64 +1,2 @@
# Entities

## Overview

An entity is any domain object that can be modeled and about which information can be stored. Entities are usually recognizable concepts, either concrete or abstract, such as persons, places, things, or events.

Examples of entities in the context of ride-hailing and food delivery: `customer`, `order`, `driver`, `restaurant`, `dish`, `area`.

Entities are important in the context of feature stores since features are always properties of a specific entity. For example, we could have a feature `total_trips_24h` for driver `D011234` with a feature value of `11`.

Feast uses entities in the following way:

* Entities serve as the keys used to look up features for producing training datasets and online feature values.
* Entities serve as a natural grouping of features in a feature table. A feature table must belong to an entity \(which could be a composite entity\).

## Structure of an Entity

When creating an entity specification, consider the following fields:

* **Name**: Name of the entity
* **Description**: Description of the entity
* **Value Type**: Value type of the entity. Feast will attempt to coerce entity columns in your data sources into this type.
* **Labels**: Labels are maps that allow users to attach their own metadata to entities

A valid entity specification is shown below:

```python
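from feast import Entity, ValueType

# Entity keyed on customer_id; labels carry optional user metadata.
customer = Entity(
    name="customer_id",
    description="Customer id for ride customer",
    value_type=ValueType.INT64,
    labels={},
)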
```

## Working with an Entity

### Creating an Entity:

```python
# Create a customer entity
customer_entity = Entity(name="customer_id", description="ID of car customer")
client.apply(customer_entity)
```

### Updating an Entity:

```python
# Update a customer entity
customer_entity = client.get_entity("customer_id")
customer_entity.description = "ID of bike customer"
client.apply(customer_entity)
```

Permitted changes include:

* The entity's description and labels

The following changes are not permitted:

* Project
* Name of an entity
* Type

2 changes: 2 additions & 0 deletions docs/concepts/feature-views.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,2 @@
# Feature Views

88 changes: 0 additions & 88 deletions docs/concepts/sources.md
Original file line number Diff line number Diff line change
@@ -1,90 +1,2 @@
# Sources

### Overview

Sources are descriptions of external feature data and are registered to Feast as part of [feature tables](feature-tables.md). Once registered, Feast can ingest feature data from these sources into stores.

Currently, Feast supports the following source types:

#### Batch Source

* File \(as in Spark\): Parquet \(only\).
* BigQuery

#### Stream Source

* Kafka
* Kinesis

The following encodings are supported on streams:

* Avro
* Protobuf

### Structure of a Source

For both batch and stream sources, the following configurations are necessary:

* **Event timestamp column**: Name of column containing timestamp when event data occurred. Used during point-in-time join of feature values to [entity timestamps](glossary.md#entity-timestamp).
* **Created timestamp column**: Name of column containing timestamp when data is created. Used to deduplicate data when multiple copies of the same [entity key](glossary.md#entity-key) are ingested.
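
How the two timestamp columns interact can be sketched in plain Python. This is a simplified illustration, not Feast's implementation: `FeatureRow` and `point_in_time_lookup` are hypothetical names, and details such as maximum feature age are ignored. The sketch picks the most recent row whose event timestamp is not after the entity timestamp, and uses the created timestamp to choose between duplicate copies of the same event.

```python
from datetime import datetime
from typing import Any, List, NamedTuple, Optional


class FeatureRow(NamedTuple):
    entity_key: str
    event_timestamp: datetime
    created_timestamp: datetime
    value: Any


def point_in_time_lookup(
    rows: List[FeatureRow], entity_key: str, entity_timestamp: datetime
) -> Optional[Any]:
    """Return the feature value as it was known at entity_timestamp."""
    candidates = [
        r
        for r in rows
        if r.entity_key == entity_key and r.event_timestamp <= entity_timestamp
    ]
    if not candidates:
        return None
    # Latest event wins; among copies of the same event, the latest-created copy wins.
    best = max(candidates, key=lambda r: (r.event_timestamp, r.created_timestamp))
    return best.value


rows = [
    FeatureRow("D011234", datetime(2021, 4, 1), datetime(2021, 4, 1, 1), 9),
    # Re-ingested copy of the same event with a later created timestamp.
    FeatureRow("D011234", datetime(2021, 4, 1), datetime(2021, 4, 1, 2), 11),
    FeatureRow("D011234", datetime(2021, 4, 3), datetime(2021, 4, 3, 1), 15),
]
value_on_april_2 = point_in_time_lookup(rows, "D011234", datetime(2021, 4, 2))
```

Rows with an event timestamp after the entity timestamp are never considered, which is what keeps future information from leaking into a training dataset.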

Example data source specifications:

{% tabs %}
{% tab title="batch\_sources.py" %}
```python
from feast import FileSource
from feast.data_format import ParquetFormat

batch_file_source = FileSource(
    file_format=ParquetFormat(),
    file_url="file:///feast/customer.parquet",
    event_timestamp_column="event_timestamp",
    created_timestamp_column="created_timestamp",
)
```
{% endtab %}

{% tab title="stream\_sources.py" %}
```python
from feast import KafkaSource
from feast.data_format import ProtoFormat

stream_kafka_source = KafkaSource(
    bootstrap_servers="localhost:9094",
    message_format=ProtoFormat(class_path="class.path"),
    topic="driver_trips",
    event_timestamp_column="event_timestamp",
    created_timestamp_column="created_timestamp",
)
```
{% endtab %}
{% endtabs %}

The [Feast Python API documentation](https://api.docs.feast.dev/python/) provides more information about options to specify for the above sources.

### Working with a Source

#### Creating a Source

Sources are defined as part of [feature tables](feature-tables.md):

```python
from feast import BigQuerySource, KinesisSource
from feast.data_format import ProtoFormat

batch_bigquery_source = BigQuerySource(
    table_ref="gcp_project:bq_dataset.bq_table",
    event_timestamp_column="event_timestamp",
    created_timestamp_column="created_timestamp",
)

# A Kinesis stream is addressed by region and stream name, not by bootstrap servers.
stream_kinesis_source = KinesisSource(
    record_format=ProtoFormat(class_path="class.path"),
    region="us-east-1",
    stream_name="driver_trips",
    event_timestamp_column="event_timestamp",
    created_timestamp_column="created_timestamp",
)
```

Feast ensures that the source complies with the schema of the feature table. These specified data sources can then be included inside a feature table specification and registered to Feast Core.

15 changes: 0 additions & 15 deletions docs/contributing/release-process.md
Original file line number Diff line number Diff line change
@@ -1,20 +1,5 @@
# Release Process

## Versioning policy and branch workflow

Feast uses [semantic versioning](https://semver.org/). As such, while Feast is still pre-1.0, breaking changes will happen in minor versions.

Contributors are encouraged to understand our branch workflow described below, for choosing where to branch when making a change \(and thus the merge base for a pull request\).

* Major and minor releases are cut from the `master` branch.
* Each major and minor release has a long-lived maintenance branch, for example `v0.3-branch`. This is called a "release branch".
* From the release branch, pre-release release candidates are tagged, for example `v0.3.0-rc.1`.
* From the release candidates, stable patch version releases are tagged, for example `v0.3.0`.

A release branch should be substantially _feature complete_ with respect to the intended release. Code that is committed to `master` may be merged or cherry-picked on to a release branch, but code that is directly committed to a release branch should be solely applicable to that release \(and should not be committed back to master\).

In general, unless you're committing code that only applies to a particular release stream \(for example, temporary hot-fixes, back-ported security fixes, or image hashes\), you should base changes from `master` and then merge or cherry-pick to the release branch.

## Release process

For Feast maintainers, these are the concrete steps for making a new release.
83 changes: 83 additions & 0 deletions docs/contributing/versioning-policy.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,83 @@
---
description: Versioning policies and status of Feast components
---

# Versioning Policy

### Versioning policy and branch workflow

Feast uses [semantic versioning](https://semver.org/).

Contributors are encouraged to understand our branch workflow described below, for choosing where to branch when making a change \(and thus the merge base for a pull request\).

* Major and minor releases are cut from the `master` branch.
* Each major and minor release has a long-lived maintenance branch, e.g., `v0.3-branch`. This is called a "release branch".
* From the release branch, the pre-release release candidates are tagged, e.g., `v0.3.0-rc.1`.
* From the release candidates, the stable patch version releases are tagged, e.g., `v0.3.0`.
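
The naming convention in the list above can be checked mechanically. The following is a minimal sketch, assuming tags follow exactly the `vMAJOR.MINOR.PATCH` and `vMAJOR.MINOR.PATCH-rc.N` shapes shown in the examples; `release_branch_for` and `is_release_candidate` are hypothetical helpers, not part of any Feast tooling.

```python
import re
from typing import Optional

# Matches tags such as v0.3.0 and v0.3.0-rc.1 (shapes taken from the examples above).
_TAG = re.compile(r"^v(\d+)\.(\d+)\.(\d+)(?:-rc\.(\d+))?$")


def release_branch_for(tag: str) -> Optional[str]:
    """Return the release branch a tag belongs to, or None if the tag is malformed."""
    match = _TAG.match(tag)
    if match is None:
        return None
    major, minor = match.group(1), match.group(2)
    return f"v{major}.{minor}-branch"


def is_release_candidate(tag: str) -> bool:
    """True only for well-formed tags that carry an -rc.N suffix."""
    match = _TAG.match(tag)
    return match is not None and match.group(4) is not None


branch = release_branch_for("v0.3.0-rc.1")
```

Both the release candidate `v0.3.0-rc.1` and the stable tag `v0.3.0` resolve to the same release branch, which reflects the rule that patch releases are tagged from the release branch rather than from `master`.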

A release branch should be substantially _feature complete_ with respect to the intended release. Code that is committed to `master` may be merged or cherry-picked on to a release branch, but code that is directly committed to a release branch should be solely applicable to that release \(and should not be committed back to master\).

In general, unless you're committing code that only applies to a particular release stream \(for example, temporary hot-fixes, back-ported security fixes, or image hashes\), you should base changes from `master` and then merge or cherry-pick to the release branch.

### Feast Component Matrix

The following table shows the **status** \(stable, beta, or alpha\) of Feast components.

Application status indicators for Feast:

* **Stable** means that the component has reached a sufficient level of stability and adoption that the Feast community has deemed the component stable. Please see the stability criteria below.
* **Beta** means that the component is working towards a version 1.0 release. Beta does not mean a component is unstable; it simply means the component has not met the full criteria of stability.
* **Alpha** means that the component is in the early phases of development and/or integration into Feast.

| Application | Status | Version | Notes |
| :--- | :--- | :--- | :--- |
| [Feast Serving](https://github.com/feast-dev/feast-java) | Beta | [v0.25.2](https://github.com/feast-dev/feast-java/releases/tag/v0.25.2) | APIs are considered stable and will not have breaking changes within 3 minor versions. |
| [Feast Core](https://github.com/feast-dev/feast-java) | Beta | [v0.25.2](https://github.com/feast-dev/feast-java/releases/tag/v0.25.2) | At risk of deprecation |
| [Feast Java Client](https://github.com/feast-dev/feast-java) | Beta | [v0.25.2](https://github.com/feast-dev/feast-java/releases/tag/v0.25.2) | |
| [Feast Python SDK](https://github.com/feast-dev/feast) | Beta | [v0.9.4](https://github.com/feast-dev/feast/releases/tag/v0.9.4) | |
| [Feast Go Client](https://github.com/feast-dev/feast) | Beta | [v0.9.4](https://github.com/feast-dev/feast/releases/tag/v0.9.4) | |
| [Feast Spark Python SDK](https://github.com/feast-dev/feast-spark) | Alpha | [v0.1.2](https://github.com/feast-dev/feast-spark/releases/tag/v0.1.2) | |
| [Feast Spark Launchers](https://github.com/feast-dev/feast-spark) | Alpha | [v0.1.2](https://github.com/feast-dev/feast-spark/releases/tag/v0.1.2) | |
| [Feast Job Service](https://github.com/feast-dev/feast-spark) | Alpha | [v0.1.2](https://github.com/feast-dev/feast-spark/releases/tag/v0.1.2) | At risk of deprecation |
| [Feast Helm Chart](https://github.com/feast-dev/feast-helm-charts) | Beta | [v0.100.4](https://github.com/feast-dev/feast-helm-charts/releases/tag/v0.100.4) | |

Criteria for reaching _**stable**_ status:

* Contributors from at least two organizations
* Complete end-to-end test suite
* Scalability and load testing if applicable
* Automated release process \(docker images, PyPI packages, etc.\)
* API reference documentation
* No deprecative changes
* Must include logging and monitoring

Criteria for reaching _**beta**_ status:

* Contributors from at least two organizations
* End-to-end test suite
* API reference documentation
* Deprecative changes must span multiple minor versions and allow for an upgrade path.

### Levels of support <a id="levels-of-support"></a>

Feast components have various levels of support based on the component status.

| Application status | Level of support |
| :--- | :--- |
| Stable | The Feast community offers best-effort support for stable applications. Stable components will be offered long-term support. |
| Beta | The Feast community offers best-effort support for beta applications. Beta applications will be supported for at least 2 more minor releases. |
| Alpha | The response differs per application in alpha status, depending on the size of the community for that application and the current level of active development of the application. |

### Support from the Feast community <a id="support-from-the-kubeflow-community"></a>

Feast has an active and helpful community of users and contributors.

The Feast community offers support on a best-effort basis for stable and beta applications. Best-effort support means that there’s no formal agreement or commitment to solve a problem but the community appreciates the importance of addressing the problem as soon as possible. The community commits to helping you diagnose and address the problem if all the following are true:

* The cause falls within the technical framework that Feast controls. For example, the Feast community may not be able to help if the problem is caused by a specific network configuration within your organization.
* Community members can reproduce the problem.
* The reporter of the problem can help with further diagnosis and troubleshooting.

Please see the [Community](../community.md) page for channels through which support can be requested.

2 changes: 1 addition & 1 deletion docs/feast-on-kubernetes/advanced-1/security.md
@@ -449,7 +449,7 @@ Authorization provides access control to FeatureTables and/or Features based on

#### **Authorization API/Server**

![Feast Authorization Flow](../../.gitbook/assets/rsz_untitled23%20%282%29%20%282%29%20%282%29%20%283%29%20%283%29%20%283%29%20%283%29%20%283%29%20%283%29%20%283%29.jpg)
![Feast Authorization Flow](../../.gitbook/assets/rsz_untitled23%20%282%29%20%282%29%20%282%29%20%283%29%20%283%29%20%283%29%20%283%29%20%283%29%20%283%29%20%283%29%20%283%29.jpg)

Feast delegates Authorization grants to an external Authorization Server that implements the [Authorization Open API specification](https://github.com/feast-dev/feast/blob/master/common/src/main/resources/api.yaml).

2 changes: 1 addition & 1 deletion docs/feast-on-kubernetes/concepts/architecture.md
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
# Architecture

![](../../.gitbook/assets/image%20%286%29%20%283%29%20%283%29%20%283%29%20%283%29%20%283%29%20%283%29%20%283%29%20%283%29.png)
![](../../.gitbook/assets/image%20%286%29%20%283%29%20%283%29%20%283%29%20%283%29%20%283%29%20%283%29%20%283%29%20%283%29%20%283%29.png)

## Sequence description

