Merge pull request #853 from metarank/master

Docs update
metarank · Jan 31, 2023 · 9148705
2 parents aeacec8 + b8172cc
commit 9148705
Showing 7 changed files with 60 additions and 53 deletions.
diff --git a/doc/_toc.md b/doc/_toc.md
@@ -21,9 +21,9 @@
     * [Scalars](configuration/features/scalar.md)
     * [Text](configuration/features/text.md)
     * [User Profile](configuration/features/user-session.md)
-  * [Recommendations](configuration/recommendations/overview.md)
+  * [Recommendations](configuration/recommendations.md)
+    * [Trending items](configuration/recommendations/trending.md)
     * [Similar items](configuration/recommendations/similar.md)
-    * [Popular items](configuration/recommendations/trending.md)
   * [Models](configuration/supported-ranking-models.md)
   * [Data Sources](configuration/data-sources.md)
   * [Persistence](configuration/persistence.md)

diff --git a/doc/api.md b/doc/api.md
@@ -5,7 +5,7 @@ Metarank's API provides an easy way to integrate Metarank with your applications
 - [Feedback API](#feedback) receives the stream of events
 - [Train API](#train) trains the model on your data
 - [Ranking API](#ranking) provides personalized results generated by the trained model
-- [Recommend API]() - retrieval of recommendations.
+- [Recommend API](#recommendations) - retrieval of recommendations.
 - [Prometheus endpoint](#prometheus-metrics) - to have a nice metrics dashboard about Metarank internals.
 
 ## Feedback

diff --git a/doc/configuration/recommendations.md b/doc/configuration/recommendations.md
@@ -0,0 +1,5 @@
+# Recommendations in Metarank
+
+Starting from version `0.6.x`, Metarank supports two types of recommendations:
+* [Trending](recommendations/trending.md): popularity-sorted list of items with customized ordering.
+* [Similar items](recommendations/similar.md): matrix-factorization collaborative filtering recommender of items you may also like. 
diff --git a/doc/configuration/recommendations/overview.md b/doc/configuration/recommendations/overview.md
diff --git a/doc/configuration/recommendations/similar.md b/doc/configuration/recommendations/similar.md
@@ -1,44 +1,44 @@
 # Similar items
 
-A `similar` recommender model can give you items other visitors also liked, while viewing the item you're currently observing. 
+The `similar` recommendation model returns items that other visitors also liked, based on the item you're currently viewing.
 
-Common use-cases for such model are:
+Common use-cases for this model are:
 * you-may-also-like recommendations on item page: the context of the recommendation is a single item you're viewing now.
 * also-purchased widget on the cart page: the context of the recommendation is the contents of your cart.
 
-## Underlying model
-
-Metarank uses a variation of [Matrix Factorization](https://developers.google.com/machine-learning/recommendation/collaborative/matrix) collaborative filtering algorithm for recommendations based on a paper [Fast Matrix Factorization for Online Recommendation with Implicit Feedback](https://arxiv.org/abs/1708.05024) by X.He, H.Zhang, MY.Kan and TS.Chua.
-
-![matrix factorization](../../img/mf.svg)
-
-The ALS family of algorithms for recommendations decompose a sparse matrix of user-item interactions into a set of smaller dense vectors of implicit user and item features (or user and item embeddings). The cool things about these embeddings is that similar items will have similar embeddings!
-
-So Metarank does the following:
-* computes item embeddings
-* pre-builds a [HNSW](https://www.pinecone.io/learn/hnsw/) index for fast lookups for similar embeddings
-* on inference time (when you call the [/recommend/modelname](../../api.md) endpoint), it makes a k-NN index lookup of similar items.
-
-Main pros and cons of such apporach:
-* pros: fast even for giant inventories, simple to implement
-* cons: lower precision as neural networks based methods like [BERT4rec](https://arxiv.org/abs/1904.06690), recommendations are not personalized.
-
-There is an ongoing work in Metarank project to implement NN-based methods and make current ALS implementation personalized.
-
 ## Configuration
 
 ```yaml
   similar:
     type: als
-    interactions: [click] # which interactions to use
+    interactions: [click, like, purchase] # which interactions to use
     factors: 100 # optional, number of implicit factors in the model, default 100
     iterations: 100 # optional, number of training iterations, default 100
 ```
 
 There are two important parameters in the configuration:
 * `factors`: how many hidden parameters the model tries to compute. The more - the better, but slower. Usually defined within the range of 50-500.
-* `iterations`: how many factor refinements attempts are made. The more - the better, but slower. Normal range - 50-300.
+* `iterations`: how many factor refinement attempts are made. The more - the better, but slower. Normal range - 50-300.
 
-Rule of thump - set these parameters low, and then increase slightly until training time becomes completely unreasonable.
+Rule of thumb - set these parameters low, and then increase slightly until training time becomes completely unreasonable.
+
+See request & response formats in the [API section](../../api.md#recommendations).
+
+## Underlying model
+
+Metarank uses a variation of the [Matrix Factorization](https://developers.google.com/machine-learning/recommendation/collaborative/matrix) collaborative filtering algorithm for recommendations, based on the paper [Fast Matrix Factorization for Online Recommendation with Implicit Feedback](https://arxiv.org/abs/1708.05024) by X.He, H.Zhang, MY.Kan and TS.Chua.
+
+![matrix factorization](../../img/mf.svg)
+
+The ALS family of algorithms for recommendations decomposes a sparse matrix of user-item interactions into a set of smaller dense vectors of implicit user and item features (or user and item embeddings). The cool thing about these embeddings is that similar items will have similar embeddings!
+
+So Metarank does the following:
+* computes item embeddings.
+* pre-builds a [HNSW](https://www.pinecone.io/learn/hnsw/) index for fast lookups for similar embeddings.
+* during inference (when you call the [/recommend/modelname](../../api.md#recommendations) endpoint), it makes a k-NN index lookup of similar items.
+
+Main pros and cons of this approach:
+* *pros*: fast even for giant inventories, simple to implement
+* *cons*: lower precision compared to neural-network-based methods like [BERT4rec](https://arxiv.org/abs/1904.06690), recommendations are not personalized.
 
-See request & response formats in the [API section](../../api.md).
+*There is ongoing work in the Metarank project to implement NN-based methods and make the current ALS implementation personalized.*
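
Note on the `similar.md` change above: the lookup step it describes (find the items whose embeddings are closest to the current item's embedding) can be illustrated with a minimal Python sketch. This is not Metarank's implementation. It uses a brute-force scan instead of an HNSW index, assumes cosine similarity as the distance measure (the docs do not state which one is used), and the `similar_items` function and the toy embeddings are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def similar_items(item_id, embeddings, k=2):
    """Return the k item ids whose embeddings are closest to item_id's.

    Brute-force stand-in for an approximate k-NN index such as HNSW.
    """
    query = embeddings[item_id]
    scored = [
        (other, cosine(query, vec))
        for other, vec in embeddings.items()
        if other != item_id  # never recommend the item being viewed
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [other for other, _ in scored[:k]]


# Toy 2-factor item embeddings (hypothetical values; real models use ~100 factors).
embeddings = {
    "p1": [1.0, 0.0],
    "p2": [0.9, 0.1],
    "p3": [0.0, 1.0],
    "p4": [0.7, 0.7],
}

print(similar_items("p1", embeddings, k=2))
```

A full scan like this costs time linear in the inventory size per request, which is why the docs mention a pre-built index for large inventories.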
diff --git a/doc/configuration/recommendations/trending.md b/doc/configuration/recommendations/trending.md
@@ -1,8 +1,10 @@
 # Trending items
 
-`trending` recommendation model is used to highlight the most popular items on your site. But it's not about sorting items by popularity! Metarank can:
-* combine multiple types of interactions: you can mix clicks and purchases with multiple weights.
-* time decay: clicks made yesterday are much more important than clicks from the last months.
+The `trending` recommendation model is used to highlight the trending (or in other words, most popular) items in your application. But it's not just about sorting items by popularity!
+
+Metarank can:
+* combine multiple types of interactions: you can mix clicks, likes and purchases with different weights.
+* time decay: clicks made yesterday are much more important than clicks from previous months.
 * multiple configurations: trending over the last week, and bestsellers over the last year.
 
 ## Configuration
@@ -17,31 +19,36 @@ models:
         decay: 0.8 # optional, default 1.0 - no decay
         weight: 1.0 # optional, default 1.0
         window: 30d # optional, default 30 days
+      - interaction: like
+        decay: 0.9
+        weight: 1.5
+        window: 60d
       - interaction: purchase
         decay: 0.95
         weight: 3.0
+
 ```
 
 The config above defines a trending model, accessible over the `/recommend/yolo-trending` [API endpoint](../../api.md):
-* the final item score combines click and purchase events
-* purchase has 3x more weight than click
+* the final item score combines click, like and purchase events
+* purchase has 3x more weight than click, like has 1.5x more weight than click
 * purchase has less aggressive time decay
-* only the last 30 days of data are used
+* only the last 30 days of data are used for clicks and purchases, but 60 days are used for likes
 
 ## Time decay and weight
 
 The final score used to sort the items is defined by the following formula:
 ```
 score = count * weight * decay ^ days_diff(now, timestamp)
 ```
-If there's multiple interaction types defined, each per-type score is added together for the final score.
+When multiple interaction types are defined, per-type scores are added together to get the final score.
 
-Such an unusual way of defining decay can allow a more granular control over the decaying. For example, that's how click importance is weighted for different `decay` values:
+The time decay configuration allows granular control over decay. Here's how click importance is weighted for different `decay` values:
 
 ![decay with different options](../../img/decay.png)
 
-We recommend setting a decay:
+We recommend setting decay:
 * within a range of 0.8-0.95 for 1-month periods.
 * within a range of 0.95-0.99 for larger periods.
 
-See request & response formats in the [API section](../../api.md).
+See request & response formats in the [API section](../../api.md#recommendations).
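
Note on the `trending.md` change above: the scoring formula can be worked through with a short Python sketch. It is only an illustration under stated assumptions, not Metarank's code: each event is counted once (`count = 1`), `days_diff` is taken as whole elapsed days, events older than the window are dropped, and the `CONFIG` dictionary and `trending_score` function are hypothetical names that mirror the YAML example.

```python
from datetime import datetime, timedelta

# Hypothetical per-interaction settings mirroring the YAML example;
# purchase has no explicit window there, so the 30-day default is assumed.
CONFIG = {
    "click": {"decay": 0.8, "weight": 1.0, "window_days": 30},
    "like": {"decay": 0.9, "weight": 1.5, "window_days": 60},
    "purchase": {"decay": 0.95, "weight": 3.0, "window_days": 30},
}


def trending_score(events, now, config=CONFIG):
    """Sum weight * decay ** days_diff over all events of one item.

    events: iterable of (interaction_type, timestamp) pairs.
    """
    score = 0.0
    for kind, ts in events:
        params = config.get(kind)
        if params is None:
            continue  # interaction type not configured for this model
        days = (now - ts).days  # assumption: whole days, not fractional
        if days < 0 or days > params["window_days"]:
            continue  # in the future or outside the window
        score += params["weight"] * params["decay"] ** days
    return score


now = datetime(2023, 1, 31)
events = [
    ("click", now - timedelta(days=1)),
    ("click", now - timedelta(days=40)),  # outside the 30-day click window
    ("purchase", now - timedelta(days=2)),
]
print(round(trending_score(events, now), 4))
```

Here the day-old click contributes `1.0 * 0.8`, the two-day-old purchase contributes `3.0 * 0.95 ** 2`, and the 40-day-old click is ignored, so a single recent purchase outweighs several recent clicks.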
diff --git a/doc/intro.md b/doc/intro.md
@@ -1,22 +1,22 @@
 # What is Metarank?
 
-Metarank is a recommendation and personalization service - a self-hosted reranking API to improve CTR and conversion. 
+[Metarank](https://metarank.ai) is a recommendation and personalization service - a self-hosted reranking API to improve CTR and conversion. 
 
 Main features:
-* Recommendations: [trending](configuration/recommendations/trending.md) and [similar-items](configuration/recommendations/trending.md) (MF ALS). 
+* Recommendations: [trending](configuration/recommendations/trending.md) and [similar-items](configuration/recommendations/similar.md) (MF ALS). 
 * Personalization: [secondary reranking](quickstart/quickstart.md) (LambdaMART)
-* A/B testing, [multiple model serving](configuration/overview.md#models)
-* [Bootstrapping](quickstart/quickstart.md#quickstart) on historical traffic data
+* AutoML: [automatic feature generation](howto/autofeature.md) and [model re-training](howto/model-retraining.md)
+* A/B testing: [multiple model serving](configuration/overview.md#models)
 
 ## Common use-cases
 
 Metarank is an open-source service for:
-* Algorithmic feed like on FB/Twitter.
-* CTR-optimized category/search page ordering on Airbnb. 
-* Items similar to the one you're viewing on Amazon.
-* Popular items on any ecommerce store.
+* Algorithmic feed like on Facebook or Twitter.
+* CTR-optimized category/search page ordering like on Airbnb. 
+* Items similar to the one you're viewing like on Amazon.
+* Popular items like on any ecommerce store.
 
-Metarank's recommendations are based on interaction history (like clicks and purchases), and secondary reranking - on user & item metadata and a rich set of typical ranking feature generators:
+Metarank can generate recommendations based on the interaction history: clicks, likes or purchases. Secondary reranking can use user and item metadata and a rich set of typical ranking feature generators to provide personalized results:
 * [User-Agent](configuration/features/user-session.md#user-agent-field-extractor), [Referer](configuration/features/user-session.md#referer) field parsers
 * [Counters](configuration/features/counters.md#counters), [rolling window counters](configuration/features/counters.md#windowed-counter), [rates](configuration/features/counters.md#rate) (CTR & conversion)
 * [categorical](configuration/features/scalar.md#index-vs-one-hot-what-to-choose) (with one-hot, label and XGBoost/LightGBM native encodings)
@@ -155,4 +155,4 @@ curl http://localhost:8080/rank/xgboost \
 
 Check out a more in-depth [Quickstart](quickstart/quickstart.md) and full [Reference](installation.md). 
 
-If you have any questions, don't hesitate to join our [Slack](https://communityinviter.com/apps/metarank/metarank)!
+If you have any questions, don't hesitate to join our [Slack](https://metarank.ai/slack)!