Documentation fixes
Signed-off-by: Willem Pienaar <git@willem.co>
woop committed May 13, 2021
1 parent 81f81d7 commit 7cb9739
Showing 35 changed files with 344 additions and 289 deletions.
23 changes: 10 additions & 13 deletions docs/README.md
@@ -1,12 +1,12 @@
# Introduction

## What is Feast?
### What is Feast?

Feast \(**Fea**ture **St**ore\) is an operational data system for managing and serving machine learning features to models in production.

![](.gitbook/assets/feast-architecture-diagrams%20%281%29%20%281%29%20%281%29%20%282%29%20%283%29%20%284%29%20%283%29%20%281%29%20%281%29%20%281%29%20%281%29%20%281%29.svg)
![](.gitbook/assets/feast_hero_010.png)

## Problems Feast Solves
### Problems Feast Solves

**Models need consistent access to data:** ML systems built on traditional data infrastructure are often coupled to databases, object stores, streams, and files. A result of this coupling, however, is that any change in data infrastructure may break dependent ML systems. Another challenge is that dual implementations of data retrieval for training and serving can lead to inconsistencies in data, which in turn can lead to training-serving skew.

@@ -24,35 +24,32 @@ Feast solves the challenge of data leakage by providing point-in-time correct fe

Feast addresses this problem by introducing feature reuse through a centralized system \(a registry\). This registry enables multiple teams working on different projects not only to contribute features, but also to reuse these same features. With Feast, data scientists can start new ML projects by selecting previously engineered features from a centralized registry, and are no longer required to develop new features for each project.

## Problems Feast does not yet solve
### Problems Feast does not yet solve

**Feature engineering:** We aim for Feast to support light-weight feature engineering as part of our API.

**Feature discovery:** We also aim for Feast to include a first-class user interface for exploring and discovering entities and features.

**Feature validation:** We additionally aim for Feast to improve support for statistics generation of feature data and subsequent validation of these statistics. Current support is limited.

## What Feast is not
### What Feast is not

[**ETL**](https://en.wikipedia.org/wiki/Extract,_transform,_load) **or** [**ELT**](https://en.wikipedia.org/wiki/Extract,_load,_transform) **system:** Feast is not \(and does not plan to become\) a general purpose data transformation or pipelining system. Feast plans to include a light-weight feature engineering toolkit, but we encourage teams to integrate Feast with upstream ETL/ELT systems that are specialized in transformation.

**Data warehouse:** Feast is not a replacement for your data warehouse or the source of truth for all transformed data in your organization. Rather, Feast is a light-weight downstream layer that can serve data from an existing data warehouse \(or other data sources\) to models in production.

**Data catalog:** Feast is not a general purpose data catalog for your organization. Feast is purely focused on cataloging features for use in ML pipelines or systems, and only to the extent of facilitating the reuse of features.

## How can I get started?
### How can I get started?

{% hint style="info" %}
The best way to learn Feast is to use it. Head over to our [Quickstart](feast-on-kubernetes/getting-started/install-feast/quickstart.md) and try out our examples!
The best way to learn Feast is to use it. Head over to our [Quickstart](quickstart.md) and try it out!
{% endhint %}

Explore the following resources to get started with Feast:

* [Getting Started](feast-on-kubernetes/getting-started/) provides guides on [Installing Feast](feast-on-kubernetes/getting-started/install-feast/) and [Connecting to Feast](feast-on-kubernetes/getting-started/connect-to-feast/).
* [Concepts](feast-on-kubernetes/concepts/overview.md) describes all important Feast API concepts.
* [User guide](feast-on-kubernetes/user-guide/define-and-ingest-features.md) provides guidance on completing Feast workflows.
* [Examples](https://github.com/feast-dev/feast/tree/master/examples) contains a Jupyter notebook that you can run on your Feast deployment.
* [Advanced](feast-on-kubernetes/advanced-1/troubleshooting.md) contains information about both advanced and operational aspects of Feast.
* [Reference](feast-on-kubernetes/reference-1/api/) contains detailed API and design documents for advanced users.
* [How-to guides](how-to-guides/create-a-feature-repository.md) show you how to complete typical Feast workflows.
* [Concepts](concepts/architecture.md) describes all important Feast API concepts.
* [Reference](reference/feature-store-yaml.md) contains detailed API and design documents.
* [Contributing](contributing/contributing.md) contains resources for anyone who wants to contribute to Feast.

2 changes: 1 addition & 1 deletion docs/SUMMARY.md
@@ -20,6 +20,7 @@
* [Feature Repository](concepts/feature-repository.md)
* [Feature Views](concepts/feature-views.md)
* [Apply](concepts/apply.md)
* [Glossary](concepts/glossary.md)

## Reference

@@ -49,7 +50,6 @@
* [Sources](feast-on-kubernetes/concepts/sources.md)
* [Feature Tables](feast-on-kubernetes/concepts/feature-tables.md)
* [Stores](feast-on-kubernetes/concepts/stores.md)
* [Glossary](feast-on-kubernetes/concepts/glossary.md)
* [Tutorials](feast-on-kubernetes/tutorials-1/README.md)
* [Minimal Ride Hailing Example](https://github.com/feast-dev/feast/blob/master/examples/minimal/minimal_ride_hailing.ipynb)
* [User Guide](feast-on-kubernetes/user-guide/README.md)
13 changes: 8 additions & 5 deletions docs/community.md
@@ -10,7 +10,7 @@
* [Mailing list](https://groups.google.com/d/forum/feast-dev): We have both a user and developer mailing list.
* Feast users should join [feast-discuss@googlegroups.com](mailto:feast-discuss@googlegroups.com) group by clicking [here](https://groups.google.com/g/feast-discuss).
* Feast developers should join [feast-dev@googlegroups.com](mailto:feast-dev@googlegroups.com) group by clicking [here](https://groups.google.com/d/forum/feast-dev).
* [Google Drive](https://drive.google.com/drive/u/0/folders/0AAe8j7ZK3sxSUk9PVA): The above groups will also grant access to our public [Feast Google Drive](https://drive.google.com/drive/u/0/folders/0AAe8j7ZK3sxSUk9PVA). The drive is used as a central repository for all Feast resources. For example:
* [Google Folder](https://drive.google.com/drive/u/0/folders/1jgMHOPDT2DvBlJeO9LCM79DP4lm4eOrR): This folder is used as a central repository for all Feast resources. For example:
* Design proposals in the form of Request for Comments \(RFC\).
* User surveys and meeting minutes.
* Slide decks of conferences our contributors have spoken at.
@@ -27,14 +27,17 @@

We have a user and contributor community call every two weeks \(Asia & US friendly\).

### Frequency \(every 2 weeks\)
{% hint style="info" %}
Please join the above Feast user groups in order to see calendar invites to the community calls.
{% endhint %}

### Frequency \(alternating times every 2 weeks\)

* **Asia \(UTC+08:00\):** Wednesday 10:00 am to 10:30 am.
* **US West Coast \(PT\):** Tuesday 18:00 pm to 18:30 pm.
* Tuesday 6:00 pm to 6:30 pm \(US, Asia\)
* Tuesday 10:00 am to 10:30 am \(US, Europe\)

### Links

* Calendar: [Feast Community Calendar \(Linux Foundation\)](https://wiki.lfaidata.foundation/pages/viewpage.action?pageId=30408973)
* Zoom: [https://zoom.us/j/6325193230](https://zoom.us/j/6325193230)
* Meeting notes: [https://bit.ly/feast-notes](https://bit.ly/feast-notes%20)

4 changes: 2 additions & 2 deletions docs/concepts/architecture.md
@@ -2,7 +2,7 @@

![Feast 0.10 Architecture Diagram](../.gitbook/assets/image%20%284%29.png)

## Functionality
### Functionality

* **Create Batch Features:** ELT/ETL systems like Spark and SQL are used to transform data in the batch store.
* **Feast Apply:** The user \(or CI\) publishes version-controlled feature definitions using `feast apply`. This CLI command updates infrastructure and persists definitions in the object store registry.
@@ -13,7 +13,7 @@
* **Prediction:** A backend system makes a request for a prediction from the model serving service.
* **Get Online Features:** The model serving service makes a request to the Feast Online Serving service for online features using a Feast SDK.

## Components:
### Components

A complete Feast deployment contains the following components:

21 changes: 11 additions & 10 deletions docs/concepts/feature-views.md
@@ -1,6 +1,6 @@
# Feature Views

## Overview
### Overview

Feature views are objects used to define and productionize logical groups of features for training and serving.

@@ -17,7 +17,7 @@ Feast does not yet apply feature transformations. Feast acts as the productioniz

Entities, features, and sources must be defined in order to define a feature view.

## Entity
### Entity

Define an entity for the driver. Entities can be thought of as primary keys used to retrieve features. Entities are also used to join multiple tables/views during the construction of feature vectors.

@@ -36,9 +36,9 @@ driver = Entity(
)
```

## Feature
### Feature

A feature is an individual measurable property observed on an entity. For example, the number of transactions \(feature\) a customer \(entity\) has completed.
A feature is an individual measurable property observed on an entity. For example, the number of transactions \(feature\) a customer \(entity\) has completed.

Features are defined as part of feature views. Since Feast does not transform data, a feature is essentially a schema that only contains a name and a type:

@@ -47,35 +47,36 @@ conversion_rate = Feature(
    # Name of the feature. Used during lookup of features from the feature store.
    # The name must be unique.
    name="conv_rate",

    # The type used for storage of features (both at source and when materialized
    # into a store)
    dtype=ValueType.FLOAT
)
```

## Source
### Source

Indicates a data source from which feature values can be retrieved. Sources are queried when building training datasets or materializing features into an online store.

```python

driver_stats_source = BigQuerySource(
    # The BigQuery table where features can be found
    table_ref="feast-oss.demo_data.driver_stats",

    # The event timestamp is used for point-in-time joins and for ensuring only
    # features within the TTL are returned
    event_timestamp_column="datetime",

    # The (optional) created timestamp is used to ensure there are no duplicate
    # feature rows in the offline store or when building training datasets
    created_timestamp_column="created",
)
```
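
The roles of the two timestamp columns can be illustrated without Feast at all. The following is a minimal sketch in plain Python, not Feast's implementation: the `ROWS` data, the `point_in_time_lookup` helper, and the inclusive TTL bound are all assumptions made for illustration. It shows how an event timestamp can select the latest feature row within a TTL window, and how a created timestamp can break ties between duplicate rows.

```python
from datetime import datetime, timedelta

# Hypothetical feature rows using the column names from the source above:
# an entity key, an event timestamp ("datetime"), a created timestamp
# ("created"), and one feature value.
ROWS = [
    {"driver_id": 1001, "datetime": datetime(2021, 4, 12, 10),
     "created": datetime(2021, 4, 12, 11), "conv_rate": 0.30},
    # Same event timestamp, written later: the newer "created" value wins.
    {"driver_id": 1001, "datetime": datetime(2021, 4, 12, 10),
     "created": datetime(2021, 4, 12, 12), "conv_rate": 0.35},
    {"driver_id": 1001, "datetime": datetime(2021, 4, 12, 14),
     "created": datetime(2021, 4, 12, 15), "conv_rate": 0.50},
]


def point_in_time_lookup(rows, driver_id, as_of, ttl):
    """Return the latest row for driver_id whose event time lies in
    [as_of - ttl, as_of], or None if no row qualifies.

    The inclusive bounds are an assumption of this sketch.
    """
    candidates = [
        r for r in rows
        if r["driver_id"] == driver_id and as_of - ttl <= r["datetime"] <= as_of
    ]
    if not candidates:
        return None
    # Latest event wins; ties on event time are broken by the created timestamp.
    return max(candidates, key=lambda r: (r["datetime"], r["created"]))


# The row from 14:00 is in the future relative to 13:00, so it is ignored.
row = point_in_time_lookup(ROWS, 1001, datetime(2021, 4, 12, 13), timedelta(hours=4))
print(row["conv_rate"])  # prints 0.35
```

Ignoring rows newer than the lookup time is what prevents future feature values from leaking into a training dataset.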

## Feature View
### Feature View

A Feature View is a
A Feature View is a

{% tabs %}
{% tab title="driver\_trips\_feature\_table.py" %}
10 changes: 6 additions & 4 deletions docs/contributing/development-guide.md
@@ -92,11 +92,11 @@ Feast is composed of [multiple components](https://docs.feast.dev/v/master/conce

## Making a Pull Request

### Incorporating upstream changes from master
#### Incorporating upstream changes from master

We prefer `git rebase` over `git merge`: `git pull -r`

### Signing commits
#### Signing commits

Commits have to be signed before they are allowed to be merged into the Feast codebase:

@@ -105,7 +105,7 @@ Commits have to be signed before they are allowed to be merged into the Feast co
git commit -s -m "My first commit"
```

### Good practices to keep in mind
#### Good practices to keep in mind

* Fill in the description based on the default template configured when you first open the PR
* What this PR does/why we need it
@@ -136,7 +136,7 @@ Feast Protobuf API defines the common API used by Feast's Components:
* Feast Protobuf API specifications are written in [proto3](https://developers.google.com/protocol-buffers/docs/proto3) in the Main Feast Repository.
* Changes to the API should be proposed via a [GitHub Issue](https://github.com/feast-dev/feast/issues/new/choose) for discussion first.

### Generating Language Bindings
#### Generating Language Bindings

The language specific bindings have to be regenerated when changes are made to the Feast Protobuf API:

@@ -146,3 +146,5 @@ The language specific bindings have to be regenerated when changes are made to t
| [Main Feast Repository](https://github.com/feast-dev/feast) | Golang | Run `make compile-protos-go` to generate bindings |
| [Feast Java](https://github.com/feast-dev/feast-java) | Java | No action required: bindings are generated automatically during compilation. |

####

5 changes: 2 additions & 3 deletions docs/contributing/release-process.md
@@ -23,13 +23,12 @@ For Feast maintainers, these are the concrete steps for making a new release.
3. Check that versions are updated with `env TARGET_MERGE_BRANCH=master make lint-versions`
7. Create a [GitHub release](https://github.com/feast-dev/feast/releases) which includes a summary of important changes as well as any artifacts associated with the release. Make sure to include the same change log as added in [CHANGELOG.md](https://github.com/feast-dev/feast/blob/master/CHANGELOG.md). Use `Feast vX.Y.Z` as the title.
8. Update the [Upgrade Guide](../feast-on-kubernetes/advanced-1/upgrading.md) to include the action required instructions for users to upgrade to this new release. Instructions should include a migration for each breaking change made to this release.
9. Update [Feast Supported Versions](release-process.md) to include the supported versions of each component.

When a tag that matches a Semantic Version string is pushed, CI will automatically build and push the relevant artifacts to their repositories or package managers \(Docker images, Python wheels, etc.\). JVM artifacts are promoted from Sonatype OSSRH to Maven Central, but it can take some time for them to become available. The `sdk/go/v` tag is required to version the Go SDK module so that users can `go get` a specific tagged release of the Go SDK.
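
To make the tag-matching rule concrete, here is a minimal sketch of a check that decides whether a tag looks like a Semantic Version string. The pattern, the optional `v` prefix, and the `is_release_tag` helper are assumptions for illustration only; the actual rule is defined in the CI configuration, which is not shown here.

```python
import re

# Assumed tag shape: an optional "v" prefix, then MAJOR.MINOR.PATCH without
# leading zeros, then an optional pre-release suffix such as "-rc.1".
SEMVER_TAG = re.compile(
    r"^v?(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)(-[0-9A-Za-z.-]+)?$"
)


def is_release_tag(tag: str) -> bool:
    """Return True if the tag looks like a Semantic Version string."""
    return SEMVER_TAG.match(tag) is not None


# A version tag matches, while a release branch name such as v0.4-branch does not.
for tag in ("v0.10.3", "v0.4-branch"):
    print(tag, is_release_tag(tag))
```

Under this pattern a branch-style name cannot be mistaken for a release tag, because it lacks the third numeric component.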

### Creating a change log

We use an [open source change log generator](https://hub.docker.com/r/ferrarimarco/github-changelog-generator/) to generate change logs. The process still requires a little bit of manual effort.
We use an [open source change log generator](https://hub.docker.com/r/ferrarimarco/github-changelog-generator/) to generate change logs. The process still requires a little bit of manual effort.

1. Create a GitHub token as [per these instructions](https://github.com/github-changelog-generator/github-changelog-generator#github-token). The token is used as an input argument \(`-t`\) to the change log generator.
2. The change log generator configuration below will look for unreleased changes on a specific branch. The branch will be `master` for a major/minor release, or a release branch \(`v0.4-branch`\) for a patch release. You will need to set the branch using the `--release-branch` argument.
@@ -62,5 +61,5 @@ docker run -it --rm ferrarimarco/github-changelog-generator \

It's important to flag breaking changes and deprecations to the API for each release so that we can maintain API compatibility.

Developers should have flagged PRs with breaking changes with the `compat/breaking` label. However, it's important to double-check each PR's release notes and contents for changes that will break API compatibility and manually add the `compat/breaking` label to PRs with undeclared breaking changes. The change log will have to be regenerated if any new labels are added.
Developers should have flagged PRs with breaking changes with the `compat/breaking` label. However, it's important to double-check each PR's release notes and contents for changes that will break API compatibility and manually add the `compat/breaking` label to PRs with undeclared breaking changes. The change log will have to be regenerated if any new labels are added.
