From 7068085b6a8735d8eda2267ebee4e979b3ee9e4a Mon Sep 17 00:00:00 2001 From: Xiaoyong Zhu Date: Sat, 9 Jul 2022 06:09:55 -0700 Subject: [PATCH 1/9] reduce redundant docs --- README.md | 175 ------------------------------------------------- docs/README.md | 112 +++++++++++++++++-------------- 2 files changed, 63 insertions(+), 224 deletions(-) delete mode 100644 README.md diff --git a/README.md b/README.md deleted file mode 100644 index 816cfcf7f..000000000 --- a/README.md +++ /dev/null @@ -1,175 +0,0 @@ -# Feathr – An Enterprise-Grade, High Performance Feature Store - -[![License](https://img.shields.io/badge/License-Apache%202.0-blue)](https://github.com/linkedin/feathr/blob/main/LICENSE) -[![GitHub Release](https://img.shields.io/github/v/release/linkedin/feathr.svg?style=flat&sort=semver&color=blue)](https://github.com/linkedin/feathr/releases) -[![Docs Latest](https://img.shields.io/badge/docs-latest-blue.svg)](https://linkedin.github.io/feathr/) -[![Python API](https://img.shields.io/readthedocs/feathr?label=Python%20API)](https://feathr.readthedocs.io/en/latest/) - -## What is Feathr? - -Feathr is the feature store that is used in production in LinkedIn for many years and was open sourced in April 2022. Read our announcement on [Open Sourcing Feathr](https://engineering.linkedin.com/blog/2022/open-sourcing-feathr---linkedin-s-feature-store-for-productive-m) and [Feathr on Azure](https://azure.microsoft.com/en-us/blog/feathr-linkedin-s-feature-store-is-now-available-on-azure/). - -Feathr lets you: - -- **Define features** based on raw data sources (batch and streaming) using pythonic APIs. -- **Register and get features by names** during model training and model inference. -- **Share features** across your team and company. - -Feathr automatically computes your feature values and joins them to your training data, using point-in-time-correct semantics to avoid data leakage, and supports materializing and deploying your features for use online in production. 
- -## 🌟 Feathr Highlights - -- **Battle tested in production for more than 6 years:** LinkedIn has been using Feathr in production for over 6 years and have a dedicated team improving it. -- **Scalable with built-in optimizations:** For example, based on some internal use case, Feathr can process billions of rows and PB scale data with built-in optimizations such as bloom filters and salted joins. -- **Rich support for point-in-time joins and aggregations:** Feathr has high performant built-in operators designed for Feature Store, including time-based aggregation, sliding window joins, look-up features, all with point-in-time correctness. -- **Highly customizable user-defined functions (UDFs)** with native PySpark and Spark SQL support to lower the learning curve for data scientists. -- **Pythonic APIs** to access everything with low learning curve; Integrated with model building so data scientists can be productive from day one. -- **Derived Features** which is a unique capability across all the feature store solutions. This encourage feature consumers to build features on existing features and encouraging feature reuse. -- **Rich type system** including support for embeddings for advanced machine learning/deep learning scenarios. One of the common use cases is to build embeddings for customer profiles, and those embeddings can be reused across an organization in all the machine learning applications. -- **Native cloud integration** with simplified and scalable architecture, which is illustrated in the next section. -- **Feature sharing and reuse made easy:** Feathr has built-in feature registry so that features can be easily shared across different teams and boost team productivity. - -## 📓 Documentation - -- For more details on Feathr, read our [documentation](https://linkedin.github.io/feathr/). -- For Python API references, read the [Python API Reference](https://feathr.readthedocs.io/). 
-- For technical talks on Feathr, see the [slides here](./docs/talks/Feathr%20Feature%20Store%20Talk.pdf). The recording is [here](https://www.youtube.com/watch?v=gZg01UKQMTY). - -## 🛠️ Install Feathr Client Locally - -If you want to install Feathr client in a python environment, use this: - -```bash -pip install feathr -``` - -Or use the latest code from GitHub: - -```bash -pip install git+https://github.com/linkedin/feathr.git#subdirectory=feathr_project -``` - -## ☁️ Running Feathr on Cloud - -Feathr has native integrations with Databricks and Azure Synapse: - -- Please read the [Quick Start Guide for Feathr on Databricks](./docs/quickstart_databricks.md) to run Feathr with Databricks. -- Please read the [Quick Start Guide for Feathr on Azure Synapse](./docs/quickstart_synapse.md) to run Feathr with Azure Synapse. - -## 🔡 Feathr Examples - -Please read [Feathr Capabilities](https://linkedin.github.io/feathr/concepts/feathr-capabilities.html) for more examples. Below are a few selected ones: - -### Rich UDF Support - -Feathr has highly customizable UDFs with native PySpark and Spark SQL integration to lower learning curve for data scientists: - -```python -def add_new_dropoff_and_fare_amount_column(df: DataFrame): - df = df.withColumn("f_day_of_week", dayofweek("lpep_dropoff_datetime")) - df = df.withColumn("fare_amount_cents", df.fare_amount.cast('double') * 100) - return df - -batch_source = HdfsSource(name="nycTaxiBatchSource", - path="abfss://feathrazuretest3fs@feathrazuretest3storage.dfs.core.windows.net/demo_data/green_tripdata_2020-04.csv", - preprocessing=add_new_dropoff_and_fare_amount_column, - event_timestamp_column="new_lpep_dropoff_datetime", - timestamp_format="yyyy-MM-dd HH:mm:ss") -``` - -### Defining Window Aggregation Features - -```python -agg_features = [Feature(name="f_location_avg_fare", - key=location_id, # Query/join key of the feature(group) - feature_type=FLOAT, - transform=WindowAggTransformation( # Window Aggregation transformation - 
agg_expr="cast_float(fare_amount)", - agg_func="AVG", # Apply average aggregation over the window - window="90d")), # Over a 90-day window - ] - -agg_anchor = FeatureAnchor(name="aggregationFeatures", - source=batch_source, - features=agg_features) -``` - -### Define features on top of other features - Derived Features - -```python -# Compute a new feature(a.k.a. derived feature) on top of an existing feature -derived_feature = DerivedFeature(name="f_trip_time_distance", - feature_type=FLOAT, - key=trip_key, - input_features=[f_trip_distance, f_trip_time_duration], - transform="f_trip_distance * f_trip_time_duration") - -# Another example to compute embedding similarity -user_embedding = Feature(name="user_embedding", feature_type=DENSE_VECTOR, key=user_key) -item_embedding = Feature(name="item_embedding", feature_type=DENSE_VECTOR, key=item_key) - -user_item_similarity = DerivedFeature(name="user_item_similarity", - feature_type=FLOAT, - key=[user_key, item_key], - input_features=[user_embedding, item_embedding], - transform="cosine_similarity(user_embedding, item_embedding)") -``` - -### Define Streaming Features - -Read the [Streaming Source Ingestion Guide](https://linkedin.github.io/feathr/how-to-guides/streaming-source-ingestion.html) for more details. - -### Point in Time Joins - -Read [Point-in-time Correctness and Point-in-time Join in Feathr](https://linkedin.github.io/feathr/concepts/point-in-time-join.html) for more details. - -### Running Feathr Examples - -Follow the [quick start Jupyter Notebook](./feathr_project/feathrcli/data/feathr_user_workspace/product_recommendation_demo.ipynb) to try it out. There is also a companion [quick start guide](https://linkedin.github.io/feathr/quickstart_synapse.html) containing a bit more explanation on the notebook. 
- -## 🗣️ Tech Talks on Feathr - -- [Introduction to Feathr - Beginner's guide](https://www.youtube.com/watch?v=gZg01UKQMTY) -- [Document Intelligence using Azure Feature Store (Feathr) and SynapseML - ](https://mybuild.microsoft.com/en-US/sessions/5bdff7d5-23e6-4f0d-9175-da8325d05c2a?source=sessions) - -## ⚙️ Cloud Integrations and Architecture - -![Architecture Diagram](./docs/images/architecture.png) - -| Feathr component | Cloud Integrations | -| ------------------------------- | --------------------------------------------------------------------------- | -| Offline store – Object Store | Azure Blob Storage, Azure ADLS Gen2, AWS S3 | -| Offline store – SQL | Azure SQL DB, Azure Synapse Dedicated SQL Pools, Azure SQL in VM, Snowflake | -| Streaming Source | Kafka, EventHub | -| Online store | Azure Cache for Redis | -| Feature Registry and Governance | Azure Purview | -| Compute Engine | Azure Synapse Spark Pools, Databricks | -| Machine Learning Platform | Azure Machine Learning, Jupyter Notebook, Databricks Notebook | -| File Format | Parquet, ORC, Avro, JSON, Delta Lake | -| Credentials | Azure Key Vault | - -## 🚀 Roadmap - -For a complete roadmap with estimated dates, please [visit this page](https://github.com/linkedin/feathr/milestones?direction=asc&sort=title&state=open). - -- [x] Private Preview release -- [x] Public Preview release -- [ ] Future release - - [x] Support streaming - - [x] Support common data sources - - [ ] Support online transformation - - [ ] Support feature versioning - - [ ] Support feature monitoring - - [ ] Support feature store UI - - [ ] Lineage - - [ ] Search - - [ ] Support feature data deletion and retention - -## 👨‍👨‍👦‍👦 Community Guidelines - -Build for the community and build by the community. Check out [Community Guidelines](CONTRIBUTING.md). 
- -## 📢 Slack Channel - -Join our [Slack channel](https://feathrai.slack.com) for questions and discussions (or click the [invitation link](https://join.slack.com/t/feathrai/shared_invite/zt-1bgiu8eup-yOAKsOOIVGBVjT8B~XMu~A)). diff --git a/docs/README.md b/docs/README.md index d663d0a87..816cfcf7f 100644 --- a/docs/README.md +++ b/docs/README.md @@ -1,10 +1,9 @@ ---- -layout: default -title: Home -nav_order: 1 -description: "Feathr – An Enterprise-Grade, High Performance Feature Store" -permalink: / ---- +# Feathr – An Enterprise-Grade, High Performance Feature Store + +[![License](https://img.shields.io/badge/License-Apache%202.0-blue)](https://github.com/linkedin/feathr/blob/main/LICENSE) +[![GitHub Release](https://img.shields.io/github/v/release/linkedin/feathr.svg?style=flat&sort=semver&color=blue)](https://github.com/linkedin/feathr/releases) +[![Docs Latest](https://img.shields.io/badge/docs-latest-blue.svg)](https://linkedin.github.io/feathr/) +[![Python API](https://img.shields.io/readthedocs/feathr?label=Python%20API)](https://feathr.readthedocs.io/en/latest/) ## What is Feathr? @@ -13,40 +12,32 @@ Feathr is the feature store that is used in production in LinkedIn for many year Feathr lets you: - **Define features** based on raw data sources (batch and streaming) using pythonic APIs. -- **Register and get features by names** during model training and model inferencing. +- **Register and get features by names** during model training and model inference. - **Share features** across your team and company. Feathr automatically computes your feature values and joins them to your training data, using point-in-time-correct semantics to avoid data leakage, and supports materializing and deploying your features for use online in production. 
-## Feathr Highlights +## 🌟 Feathr Highlights -- **Scalable with built-in optimizations.** For example, based on some internal use case, Feathr can process billions of rows and PB scale data with built-in optimizations such as bloom filters and salted joins. +- **Battle tested in production for more than 6 years:** LinkedIn has been using Feathr in production for over 6 years and have a dedicated team improving it. +- **Scalable with built-in optimizations:** For example, based on some internal use case, Feathr can process billions of rows and PB scale data with built-in optimizations such as bloom filters and salted joins. - **Rich support for point-in-time joins and aggregations:** Feathr has high performant built-in operators designed for Feature Store, including time-based aggregation, sliding window joins, look-up features, all with point-in-time correctness. - **Highly customizable user-defined functions (UDFs)** with native PySpark and Spark SQL support to lower the learning curve for data scientists. - **Pythonic APIs** to access everything with low learning curve; Integrated with model building so data scientists can be productive from day one. +- **Derived Features** which is a unique capability across all the feature store solutions. This encourage feature consumers to build features on existing features and encouraging feature reuse. - **Rich type system** including support for embeddings for advanced machine learning/deep learning scenarios. One of the common use cases is to build embeddings for customer profiles, and those embeddings can be reused across an organization in all the machine learning applications. - **Native cloud integration** with simplified and scalable architecture, which is illustrated in the next section. - **Feature sharing and reuse made easy:** Feathr has built-in feature registry so that features can be easily shared across different teams and boost team productivity. 
-## Running Feathr on Azure with 3 Simple Steps - -Feathr has native cloud integration. To use Feathr on Azure, you only need three steps: - -1. Get the `Principal ID` of your account by running `az ad signed-in-user show --query id -o tsv` in the link below (Select "Bash" if asked), and write down that value (something like `b65ef2e0-42b8-44a7-9b55-abbccddeefff`). Think this ID as something representing you when accessing Azure, and it will be used to grant permissions in the next step in the UI. - -[Launch Cloud Shell](https://shell.azure.com/bash) - -2. Click the button below to deploy a minimal set of Feathr resources for demo purpose. You will need to fill in the `Principal ID` and `Resource Prefix`. You will need "Owner" permission of the selected subscription. - -[![Deploy to Azure](https://aka.ms/deploytoazurebutton)](https://portal.azure.com/#create/Microsoft.Template/uri/https%3A%2F%2Fraw.githubusercontent.com%2Flinkedin%2Ffeathr%2Fmain%2Fdocs%2Fhow-to-guides%2Fazure_resource_provision.json) +## 📓 Documentation -3. Run the Feathr Jupyter Notebook by clicking the button below. You only need to change the specified `Resource Prefix`. +- For more details on Feathr, read our [documentation](https://linkedin.github.io/feathr/). +- For Python API references, read the [Python API Reference](https://feathr.readthedocs.io/). +- For technical talks on Feathr, see the [slides here](./docs/talks/Feathr%20Feature%20Store%20Talk.pdf). The recording is [here](https://www.youtube.com/watch?v=gZg01UKQMTY). 
-[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/linkedin/feathr/main?labpath=feathr_project%2Ffeathrcli%2Fdata%2Ffeathr_user_workspace%2Fproduct_recommendation_demo.ipynb) +## 🛠️ Install Feathr Client Locally -## Installing Feathr Client Locally - -If you are not using the above Jupyter Notebook and want to install Feathr client locally, use this: +If you want to install Feathr client in a python environment, use this: ```bash pip install feathr @@ -58,7 +49,14 @@ Or use the latest code from GitHub: pip install git+https://github.com/linkedin/feathr.git#subdirectory=feathr_project ``` -## Feathr Examples +## ☁️ Running Feathr on Cloud + +Feathr has native integrations with Databricks and Azure Synapse: + +- Please read the [Quick Start Guide for Feathr on Databricks](./docs/quickstart_databricks.md) to run Feathr with Databricks. +- Please read the [Quick Start Guide for Feathr on Azure Synapse](./docs/quickstart_synapse.md) to run Feathr with Azure Synapse. + +## 🔡 Feathr Examples Please read [Feathr Capabilities](https://linkedin.github.io/feathr/concepts/feathr-capabilities.html) for more examples. Below are a few selected ones: @@ -125,37 +123,53 @@ Read the [Streaming Source Ingestion Guide](https://linkedin.github.io/feathr/ho Read [Point-in-time Correctness and Point-in-time Join in Feathr](https://linkedin.github.io/feathr/concepts/point-in-time-join.html) for more details. -## Running Feathr Examples +### Running Feathr Examples + +Follow the [quick start Jupyter Notebook](./feathr_project/feathrcli/data/feathr_user_workspace/product_recommendation_demo.ipynb) to try it out. There is also a companion [quick start guide](https://linkedin.github.io/feathr/quickstart_synapse.html) containing a bit more explanation on the notebook. -Follow the [quick start Jupyter Notebook](https://github.com/linkedin/feathr/blob/main/feathr_project/feathrcli/data/feathr_user_workspace/product_recommendation_demo.ipynb) to try it out. 
-There is also a companion [quick start guide](https://linkedin.github.io/feathr/quickstart_synapse.html) containing a bit more explanation on the notebook. +## 🗣️ Tech Talks on Feathr -## Cloud Architecture +- [Introduction to Feathr - Beginner's guide](https://www.youtube.com/watch?v=gZg01UKQMTY) +- [Document Intelligence using Azure Feature Store (Feathr) and SynapseML + ](https://mybuild.microsoft.com/en-US/sessions/5bdff7d5-23e6-4f0d-9175-da8325d05c2a?source=sessions) -Feathr has native integration with Azure and other cloud services, and here's the high-level architecture to help you get started. -![Architecture](images/architecture.png) +## ⚙️ Cloud Integrations and Architecture -## Next Steps +![Architecture Diagram](./docs/images/architecture.png) -### Quickstart +| Feathr component | Cloud Integrations | +| ------------------------------- | --------------------------------------------------------------------------- | +| Offline store – Object Store | Azure Blob Storage, Azure ADLS Gen2, AWS S3 | +| Offline store – SQL | Azure SQL DB, Azure Synapse Dedicated SQL Pools, Azure SQL in VM, Snowflake | +| Streaming Source | Kafka, EventHub | +| Online store | Azure Cache for Redis | +| Feature Registry and Governance | Azure Purview | +| Compute Engine | Azure Synapse Spark Pools, Databricks | +| Machine Learning Platform | Azure Machine Learning, Jupyter Notebook, Databricks Notebook | +| File Format | Parquet, ORC, Avro, JSON, Delta Lake | +| Credentials | Azure Key Vault | -- [Quickstart for Azure Synapse](quickstart_synapse.md) +## 🚀 Roadmap -### Concepts +For a complete roadmap with estimated dates, please [visit this page](https://github.com/linkedin/feathr/milestones?direction=asc&sort=title&state=open). 
-- [Feature Definition](concepts/feature-definition.md) -- [Feature Generation](concepts/feature-generation.md) -- [Feature Join](concepts/feature-join.md) -- [Point-in-time Correctness](concepts/point-in-time-join.md) +- [x] Private Preview release +- [x] Public Preview release +- [ ] Future release + - [x] Support streaming + - [x] Support common data sources + - [ ] Support online transformation + - [ ] Support feature versioning + - [ ] Support feature monitoring + - [ ] Support feature store UI + - [ ] Lineage + - [ ] Search + - [ ] Support feature data deletion and retention -### How-to-guides +## 👨‍👨‍👦‍👦 Community Guidelines -- [Azure Deployment](how-to-guides/azure-deployment.md) -- [Local Feature Testing](how-to-guides/local-feature-testing.md) -- [Feature Definition Troubleshooting Guide](how-to-guides/troubleshoot-feature-definition.md) -- [Feathr Expression Language](how-to-guides/expression-language.md) -- [Feathr Job Configuration](how-to-guides/feathr-job-configuration.md) +Build for the community and build by the community. Check out [Community Guidelines](CONTRIBUTING.md). -## API Documentation +## 📢 Slack Channel -- [Python API Documentation](https://feathr.readthedocs.io/en/latest/) +Join our [Slack channel](https://feathrai.slack.com) for questions and discussions (or click the [invitation link](https://join.slack.com/t/feathrai/shared_invite/zt-1bgiu8eup-yOAKsOOIVGBVjT8B~XMu~A)). 
From e85ca3b1fa2029f14e6c6abf220bccbc5886680f Mon Sep 17 00:00:00 2001
From: Xiaoyong Zhu
Date: Sat, 9 Jul 2022 06:19:47 -0700
Subject: [PATCH 2/9] Update quickstart_synapse.md

---
 docs/quickstart_synapse.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/quickstart_synapse.md b/docs/quickstart_synapse.md
index 279b51916..874ef64bf 100644
--- a/docs/quickstart_synapse.md
+++ b/docs/quickstart_synapse.md
@@ -1,7 +1,7 @@
 ---
 layout: default
 title: Quick Start Guide With Azure Synapse
-nav_order: 2
+nav_order: 4
 ---
 
 # Feathr Quickstart Guide With Azure Synapse

From ec80e578d95a41a4bd3b480d0ec67936f80ef98f Mon Sep 17 00:00:00 2001
From: Xiaoyong Zhu
Date: Sat, 9 Jul 2022 06:22:25 -0700
Subject: [PATCH 3/9] Update README.md

---
 docs/README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/README.md b/docs/README.md
index 816cfcf7f..161d81fd0 100644
--- a/docs/README.md
+++ b/docs/README.md
@@ -1,4 +1,4 @@
-# Feathr – An Enterprise-Grade, High Performance Feature Store
+# An Enterprise-Grade, High Performance Feature Store - Feathr
 
 [![License](https://img.shields.io/badge/License-Apache%202.0-blue)](https://github.com/linkedin/feathr/blob/main/LICENSE)
 [![GitHub Release](https://img.shields.io/github/v/release/linkedin/feathr.svg?style=flat&sort=semver&color=blue)](https://github.com/linkedin/feathr/releases)

From 17b7e8d44672aa58e77ec8d6942e515e7cc40fc5 Mon Sep 17 00:00:00 2001
From: Xiaoyong Zhu
Date: Sat, 9 Jul 2022 06:24:42 -0700
Subject: [PATCH 4/9] update sequence

---
 docs/concepts/feathr-concepts-for-beginners.md | 1 -
 docs/quickstart_databricks.md                  | 1 -
 docs/quickstart_synapse.md                     | 1 -
 3 files changed, 3 deletions(-)

diff --git a/docs/concepts/feathr-concepts-for-beginners.md b/docs/concepts/feathr-concepts-for-beginners.md
index 9615447fc..825fb3e6d 100644
--- a/docs/concepts/feathr-concepts-for-beginners.md
+++ b/docs/concepts/feathr-concepts-for-beginners.md
@@ -1,7 +1,6 @@
 ---
 layout: default
 title: Concepts for Beginners
-nav_order: 2
 ---
 
 # Concepts for Beginners
diff --git a/docs/quickstart_databricks.md b/docs/quickstart_databricks.md
index 2aeb35608..70fb6541e 100644
--- a/docs/quickstart_databricks.md
+++ b/docs/quickstart_databricks.md
@@ -1,7 +1,6 @@
 ---
 layout: default
 title: Quick Start Guide With Databricks
-nav_order: 3
 ---
 
 # Feathr Quickstart Guide for Databricks
diff --git a/docs/quickstart_synapse.md b/docs/quickstart_synapse.md
index 874ef64bf..b9b6a08be 100644
--- a/docs/quickstart_synapse.md
+++ b/docs/quickstart_synapse.md
@@ -1,7 +1,6 @@
 ---
 layout: default
 title: Quick Start Guide With Azure Synapse
-nav_order: 4
 ---
 
 # Feathr Quickstart Guide With Azure Synapse

From 3e6c74110648f7b3ecac40bd1cd5d4fc3f97d451 Mon Sep 17 00:00:00 2001
From: Xiaoyong Zhu
Date: Sat, 9 Jul 2022 06:27:39 -0700
Subject: [PATCH 5/9] Update README.md

---
 docs/README.md | 20 ++++++++------------
 1 file changed, 8 insertions(+), 12 deletions(-)

diff --git a/docs/README.md b/docs/README.md
index 161d81fd0..f29b121f4 100644
--- a/docs/README.md
+++ b/docs/README.md
@@ -153,18 +153,14 @@ Follow the [quick start Jupyter Notebook](./feathr_project/feathrcli/data/feathr
 
 For a complete roadmap with estimated dates, please [visit this page](https://github.com/linkedin/feathr/milestones?direction=asc&sort=title&state=open).
 
-- [x] Private Preview release
-- [x] Public Preview release
-- [ ] Future release
-  - [x] Support streaming
-  - [x] Support common data sources
-  - [ ] Support online transformation
-  - [ ] Support feature versioning
-  - [ ] Support feature monitoring
-  - [ ] Support feature store UI
-    - [ ] Lineage
-    - [ ] Search
-  - [ ] Support feature data deletion and retention
+
+- [x] Support streaming
+- [x] Support common data sources
+- [ ] Support online transformation
+- [ ] Support feature versioning
+- [ ] Support feature monitoring
+- [ ] Support feature store UI, including Lineage and Search functionalities
+- [ ] Support feature data deletion and retention
 
 ## 👨‍👨‍👦‍👦 Community Guidelines
 

From d6782891b8d424a3da0fa07ed406ad0c00aa92c9 Mon Sep 17 00:00:00 2001
From: Xiaoyong Zhu
Date: Sat, 9 Jul 2022 06:32:28 -0700
Subject: [PATCH 6/9] Update README.md

---
 docs/README.md | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/docs/README.md b/docs/README.md
index f29b121f4..8f34bfdb1 100644
--- a/docs/README.md
+++ b/docs/README.md
@@ -33,7 +33,7 @@ Feathr automatically computes your feature values and joins them to your trainin
 
 - For more details on Feathr, read our [documentation](https://linkedin.github.io/feathr/).
 - For Python API references, read the [Python API Reference](https://feathr.readthedocs.io/).
-- For technical talks on Feathr, see the [slides here](./docs/talks/Feathr%20Feature%20Store%20Talk.pdf). The recording is [here](https://www.youtube.com/watch?v=gZg01UKQMTY).
+- For technical talks on Feathr, see the [slides here](./talks/Feathr%20Feature%20Store%20Talk.pdf). The recording is [here](https://www.youtube.com/watch?v=gZg01UKQMTY).
 
 ## 🛠️ Install Feathr Client Locally
 
@@ -53,8 +53,8 @@ pip install git+https://github.com/linkedin/feathr.git#subdirectory=feathr_proje
 
 Feathr has native integrations with Databricks and Azure Synapse:
 
-- Please read the [Quick Start Guide for Feathr on Databricks](./docs/quickstart_databricks.md) to run Feathr with Databricks.
-- Please read the [Quick Start Guide for Feathr on Azure Synapse](./docs/quickstart_synapse.md) to run Feathr with Azure Synapse.
+- Please read the [Quick Start Guide for Feathr on Databricks](./quickstart_databricks.md) to run Feathr with Databricks.
+- Please read the [Quick Start Guide for Feathr on Azure Synapse](./quickstart_synapse.md) to run Feathr with Azure Synapse.
 
 ## 🔡 Feathr Examples
 
@@ -135,7 +135,7 @@ Follow the [quick start Jupyter Notebook](./feathr_project/feathrcli/data/feathr
 
 ## ⚙️ Cloud Integrations and Architecture
 
-![Architecture Diagram](./docs/images/architecture.png)
+![Architecture Diagram](./images/architecture.png)
 
 | Feathr component                | Cloud Integrations                                                          |
 | ------------------------------- | --------------------------------------------------------------------------- |

From 7a7de1fcfb37df642ba29cab15a7191637cfff6d Mon Sep 17 00:00:00 2001
From: Xiaoyong Zhu
Date: Sat, 9 Jul 2022 06:33:33 -0700
Subject: [PATCH 7/9] Update README.md

---
 docs/README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/README.md b/docs/README.md
index 8f34bfdb1..514f27334 100644
--- a/docs/README.md
+++ b/docs/README.md
@@ -125,7 +125,7 @@ Read [Point-in-time Correctness and Point-in-time Join in Feathr](https://linked
 
 ### Running Feathr Examples
 
-Follow the [quick start Jupyter Notebook](./feathr_project/feathrcli/data/feathr_user_workspace/product_recommendation_demo.ipynb) to try it out. There is also a companion [quick start guide](https://linkedin.github.io/feathr/quickstart_synapse.html) containing a bit more explanation on the notebook.
+Follow the [quick start Jupyter Notebook](../feathr_project/feathrcli/data/feathr_user_workspace/product_recommendation_demo.ipynb) to try it out. There is also a companion [quick start guide](https://linkedin.github.io/feathr/quickstart_synapse.html) containing a bit more explanation on the notebook.
 
 ## 🗣️ Tech Talks on Feathr
 

From 5be4e6fc27d27f9615351a686cabbd5397ead055 Mon Sep 17 00:00:00 2001
From: Xiaoyong Zhu
Date: Sun, 10 Jul 2022 20:42:36 -0700
Subject: [PATCH 8/9] resolve comments

---
 docs/quickstart_databricks.md | 2 +-
 docs/quickstart_synapse.md    | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/docs/quickstart_databricks.md b/docs/quickstart_databricks.md
index 70fb6541e..a549ef3e4 100644
--- a/docs/quickstart_databricks.md
+++ b/docs/quickstart_databricks.md
@@ -3,7 +3,7 @@ layout: default
 title: Quick Start Guide With Databricks
 ---
 
-# Feathr Quickstart Guide for Databricks
+# Feathr Quick Start Guide With Databricks
 
 For Databricks, you can simply upload [this notebook](./samples/databricks/databricks_quickstart_nyc_taxi_driver.ipynb) to your Databricks cluster and just run it in the Databricks cluster. It has been pre-configured to use the current Databricks cluster to submit jobs.
 
diff --git a/docs/quickstart_synapse.md b/docs/quickstart_synapse.md
index b9b6a08be..4eabd027b 100644
--- a/docs/quickstart_synapse.md
+++ b/docs/quickstart_synapse.md
@@ -1,9 +1,9 @@
 ---
 layout: default
-title: Quick Start Guide With Azure Synapse
+title: Feathr Quick Start Guide With Databricks
 ---
 
-# Feathr Quickstart Guide With Azure Synapse
+# Feathr Feathr Quick Start Guide With Databricks
 
 ## Overview
 

From 866bd1061a3ed23cc7b2c26ec32160ed6c80be39 Mon Sep 17 00:00:00 2001
From: Xiaoyong Zhu
Date: Sun, 10 Jul 2022 20:47:07 -0700
Subject: [PATCH 9/9] update typo

---
 docs/quickstart_databricks.md | 4 ++--
 docs/quickstart_synapse.md    | 4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/docs/quickstart_databricks.md b/docs/quickstart_databricks.md
index a549ef3e4..fee93738a 100644
--- a/docs/quickstart_databricks.md
+++ b/docs/quickstart_databricks.md
@@ -1,9 +1,9 @@
 ---
 layout: default
-title: Quick Start Guide With Databricks
+title: Quick Start Guide with Databricks
 ---
 
-# Feathr Quick Start Guide With Databricks
+# Feathr Quick Start Guide with Databricks
 
 For Databricks, you can simply upload [this notebook](./samples/databricks/databricks_quickstart_nyc_taxi_driver.ipynb) to your Databricks cluster and just run it in the Databricks cluster. It has been pre-configured to use the current Databricks cluster to submit jobs.
 
diff --git a/docs/quickstart_synapse.md b/docs/quickstart_synapse.md
index 4eabd027b..0d0a536bf 100644
--- a/docs/quickstart_synapse.md
+++ b/docs/quickstart_synapse.md
@@ -1,9 +1,9 @@
 ---
 layout: default
-title: Feathr Quick Start Guide With Databricks
+title: Quick Start Guide with Azure Synapse
 ---
 
-# Feathr Feathr Quick Start Guide With Databricks
+# Feathr Quick Start Guide with Azure Synapse
 
 ## Overview
 