67 changes: 0 additions & 67 deletions src/docs/master/api/apidocs.md
Original file line number Diff line number Diff line change
Expand Up @@ -1169,73 +1169,6 @@ Get the latest items in category.

:::

::: details GET /api/popular

Get popular items.

#### Parameters

| Name | Locate | Type | Description | Required |
|-|-|-|-|-|
| `n` | query | integer | Number of returned recommendations | |
| `offset` | query | integer | Offset of returned recommendations | |
| `user-id` | query | string | Remove read items of a user | |

#### Response Body

```json
[
{
"Id": "crocodilia",
"Score": 3.1415926
},
{
"Id": "crocodilia",
"Score": 3.1415926
},
{
"Id": "crocodilia",
"Score": 3.1415926
}
]
```

:::

::: details GET /api/popular/{category}

Get popular items in category.

#### Parameters

| Name | Locate | Type | Description | Required |
|-|-|-|-|-|
| `category` | path | string | Category of returned items. | ✅ |
| `n` | query | integer | Number of returned items | |
| `offset` | query | integer | Offset of returned items | |
| `user-id` | query | string | Remove read items of a user | |

#### Response Body

```json
[
{
"Id": "crocodilia",
"Score": 3.1415926
},
{
"Id": "crocodilia",
"Score": 3.1415926
},
{
"Id": "crocodilia",
"Score": 3.1415926
}
]
```

:::

::: details GET /api/recommend/{user-id}

Get recommendation for user.
Expand Down
28 changes: 1 addition & 27 deletions src/docs/master/concepts/algorithms.md
Original file line number Diff line number Diff line change
Expand Up @@ -32,7 +32,7 @@ The `<cache_size>` comes from the configuration file:
```toml
[recommend]

# The cache size for recommended/popular/latest items. The default value is 100.
# The cache size for recommended/latest items. The default value is 100.
cache_size = 100
```

Expand All @@ -52,32 +52,6 @@ The latest items recommendation is equivalent to the following SQL:
select item_id from items order by time_stamp desc limit <cache_size>;
```

### Popular Items

Many websites show recent popular items to users such as Twitter trending. The popular items recommendation is equivalent to the following SQL:

```sql
select item_id from (
select item_id, count(*) as feedback_count from feedback
where feedback_type in <positive_feedback_types>
and time_stamp >= NOW() - INTERVAL <popular_window>
group by item_id) t
order by feedback_count desc limit <cache_size>;
```

::: tip

The `<popular_window> ` in the configuration file corresponds to the window of popular items.

```toml
[recommend.popular]

# The time window of popular items. The default values is 4320h.
popular_window = "720h"
```

:::

## Similarity Algorithms

In some scenarios, users like specific types of items, for example, gamers like to solve puzzles and young children like to watch cartoons.
Expand Down
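For reviewers who want to see what the removed "Popular Items" section computed, the SQL above can be read as the following minimal Python sketch. The function name `popular_items`, its parameters, and the tuple layout of `feedback` are hypothetical and chosen only for illustration; this is not Gorse's implementation.

```python
from collections import Counter
from datetime import datetime, timedelta


def popular_items(feedback, positive_types, window, cache_size, now):
    """Rank items by positive-feedback count inside a time window.

    feedback is an iterable of (item_id, feedback_type, time_stamp) tuples
    (an assumed layout mirroring the columns used in the SQL).
    """
    cutoff = now - window
    counts = Counter(
        item_id
        for item_id, feedback_type, time_stamp in feedback
        if feedback_type in positive_types and time_stamp >= cutoff
    )
    # most_common sorts by count, descending; ties keep first-seen order.
    return [item_id for item_id, _ in counts.most_common(cache_size)]


now = datetime(2024, 1, 31)
feedback = [
    ("a", "like", now - timedelta(days=1)),
    ("a", "like", now - timedelta(days=2)),
    ("b", "like", now - timedelta(days=1)),
    ("b", "read", now - timedelta(days=1)),   # not a positive type here
    ("c", "like", now - timedelta(days=60)),  # outside the 720h window
]
top = popular_items(feedback, {"like"}, timedelta(hours=720), 2, now)
```

Here `window` plays the role of `<popular_window>` and `cache_size` the role of `<cache_size>`.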
2 changes: 0 additions & 2 deletions src/docs/master/concepts/data-objects.md
Original file line number Diff line number Diff line change
Expand Up @@ -71,8 +71,6 @@ Multiple categories can be distinguished by topics such as food, travel, etc., o
|-|-|-|
| `GET` | `/api/latest` | Get latest items. |
| `GET` | `/api/latest/{category}` | Get latest items in specified category. |
| `GET` | `/api/popular` | Get popular items. |
| `GET` | `/api/popular/{category}` | Get popular items in specified category. |
| `GET` | `/api/recommend/{user-id}` | Get recommendation for user. |
| `GET` | `/api/recommend/{user-id}/{category}` | Get recommendation for user in specified category. |
| `GET` | `/api/item/{item-id}/neighbors` | Get neighbors of an item. |
Expand Down
33 changes: 14 additions & 19 deletions src/docs/master/concepts/how-it-works.md
Original file line number Diff line number Diff line change
Expand Up @@ -8,7 +8,7 @@ The workflow of Gorse is depicted in the following flowchart:
```mermaid
flowchart TD
database[(Database)]--user, items and feedback-->load[Load dataset]
load--latest and popular items-->cache[(Cache)]
load--latest items-->cache[(Cache)]
find_users--User Neighbors-->cache
find_items--Item Neighbors-->cache
subgraph Master Node
Expand All @@ -21,12 +21,12 @@ flowchart TD
cache2--cached recommendations-->api[RESTful APIs]
database2[(Database)]<--user, items and feedback-->api
subgraph Server Node
api<--popular and hidden items-->local_cache[Local Cache]
api<--hidden items-->local_cache[Local Cache]
end

cache--user neighbors-->user_based(User Similarity-based\nRecommendation)
cache--item neighbors-->item_based(Item Similarity-based\nRecommendation)
cache--latest and popular items-->fm_predict
cache--latest items-->fm_predict
fm_predict--recommendation-->cache2[(Cache)]
subgraph Worker Node
user_based--recommendation-->fm_predict
Expand All @@ -41,7 +41,7 @@ flowchart TD

## Architecture

The master node loads data from the database. In the process of loading data, popular items and the latest items are written to the cache. Then, the master node searches for neighbors and trains recommendation models. In the background, random search is used to find the optimal recommendation model for the current data. The worker nodes pull recommendation models from the master node and generate recommendations for each user. The server nodes provide RESTful APIs. Workers and servers connect to the master node via gRPC, which is configured in the configuration file.
The master node loads data from the database. In the process of loading data, the latest items are written to the cache. Then, the master node searches for neighbors and trains recommendation models. In the background, random search is used to find the optimal recommendation model for the current data. The worker nodes pull recommendation models from the master node and generate recommendations for each user. The server nodes provide RESTful APIs. Workers and servers connect to the master node via gRPC, which is configured in the configuration file.

```toml
[master]
Expand Down Expand Up @@ -76,7 +76,7 @@ The intermediate cache is configurable. Increasing cache size might improve reco
```toml
[recommend]

# The cache size for recommended/popular/latest items. The default value is 10.
# The cache size for recommended/latest items. The default value is 10.
cache_size = 100

# Recommended cache expire time. The default value is 72h.
Expand All @@ -87,7 +87,7 @@ The recommendation flow will be introduced in the top-down method.

### Master: Neighbors and Models

The master node is driven by data loading. Data loading happens once every `model_fit_period`. The latest items and popular items are collected while loading data. Once data is loaded, the following tasks start.
The master node is driven by data loading. Data loading happens once every `model_fit_period`. The latest items are collected while loading data. Once data is loaded, the following tasks start.

- **Find Neighbors:** User neighbors and item neighbors are found and cached.
- **Fit MF and FM:** The matrix factorization model and factorization machine model are trained and delivered to workers.
Expand Down Expand Up @@ -138,7 +138,7 @@ Workers nodes generate and write offline recommendations to the cache database.

```mermaid
flowchart TD
cache[(Cache)]--"latest items\n{{if enable_latest_recommend }}\n\npopular items\n{{ if enable_popular_recommend }}"-->concat
cache[(Cache)]--"latest items\n{{if enable_latest_recommend }}"-->concat
cache--user neighbors-->user_based["User Similarity-based Recommendation\n{{ if enable_user_based_recommend }}"]
cache--item neighbors-->item_based["Item Similarity-based Recommendation\n{{ if enable_item_based_recommend }}"]
user_based--recommendation-->concat
Expand All @@ -149,11 +149,11 @@ flowchart TD
fm--recommendation-->remove[Remove\nRead Items]
database2[(Database)]--feedback-->remove
remove--recommendation-->explore[Explore\nRecommendation]
cache2[(Cache)]--latest and popular items-->explore
cache2[(Cache)]--latest items-->explore
explore--recommendation-->cache3[(Cache)]
```

First, the worker collects candidates from the latest items, popular items, user similarity-based recommendations, item similarity-based recommendations and matrix factorization recommendations. Sources of candidates can be enabled or disabled in the configuration. Then, candidates are ranked by the factorization machine and read items are removed. If `enable_click_through_prediction` is `false`, candidates are ranked randomly. Finally, popular items and the latest items will be injected into recommendations with probabilities defined in `explore_recommend`. Offline recommendation results will be written to the cache.
First, the worker collects candidates from the latest items, user similarity-based recommendations, item similarity-based recommendations and matrix factorization recommendations. Sources of candidates can be enabled or disabled in the configuration. Then, candidates are ranked by the factorization machine and read items are removed. If `enable_click_through_prediction` is `false`, candidates are ranked randomly. Finally, the latest items will be injected into recommendations with probabilities defined in `explore_recommend`. Offline recommendation results will be written to the cache.

```toml
[recommend.offline]
Expand All @@ -167,9 +167,6 @@ refresh_recommend_period = "24h"
# Enable latest recommendation during offline recommendation. The default value is false.
enable_latest_recommend = true

# Enable popular recommendation during offline recommendation. The default value is false.
enable_popular_recommend = false

# Enable user-based similarity recommendation during offline recommendation. The default value is false.
enable_user_based_recommend = true

Expand All @@ -183,11 +180,10 @@ enable_collaborative_recommend = true
# would be merged randomly. The default value is false.
enable_click_through_prediction = true

# The explore recommendation method is used to inject popular items or latest items into recommended result:
# popular: Recommend popular items to cold-start users.
# The explore recommendation method is used to inject latest items into recommended result:
# latest: Recommend latest items to cold-start users.
# The default values is { popular = 0.0, latest = 0.0 }.
explore_recommend = { popular = 0.1, latest = 0.2 }
# The default values is { latest = 0.0 }.
explore_recommend = { latest = 0.2 }
```

### Server: Online Recommendation
Expand All @@ -210,7 +206,7 @@ auto_insert_item = true

#### Recommendation APIs

Recommendation APIs return recommendation results. For non-personalized recommendations (latest items, popular items or neighbors), the server node fetches cached recommendations from the cache database and sends responses. But the server node needs to do more work for personalized recommendations.
Recommendation APIs return recommendation results. For non-personalized recommendations (latest items or neighbors), the server node fetches cached recommendations from the cache database and sends responses. But the server node needs to do more work for personalized recommendations.

- **Recommendation:** Offline recommendations by workers are written to responses and read items will be removed. But if the offline recommendation cache is consumed, fallback recommenders will be used. Recommenders in `fallback_recommend` will be tried in order.

Expand Down Expand Up @@ -244,7 +240,6 @@ flowchart LR

# The fallback recommendation method is used when cached recommendation drained out:
# item_based: Recommend similar items.
# popular: Recommend popular items.
# latest: Recommend latest items.
# Recommenders are used in order. The default values is ["latest"].
fallback_recommend = ["item_based", "latest"]
Expand All @@ -256,7 +251,7 @@ num_feedback_fallback_item_based = 10
Besides recommendations, there are two important configurations for servers.

- **Clock Error:** Gorse supports feedback with future timestamps, thus Gorse relies on a correct clock. However, clocks in different hosts might be different, `clock_error` should be the maximal difference between clocks.
- **Cache Expiration:** Servers cache hidden items and popular items in the local cache to avoid accessing the external database too frequently. The local cache is refreshed every `cache_expire`.
- **Cache Expiration:** Servers cache hidden items in the local cache to avoid accessing the external database too frequently. The local cache is refreshed every `cache_expire`.

```toml
[server]
Expand Down
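The statement that recommenders in `fallback_recommend` are tried in order once the offline cache is drained can be sketched as below. This is an illustrative sketch only: `recommend_with_fallback`, the `recommenders` mapping, and the callables in it are hypothetical names, not part of Gorse's API, and the real server logic may differ.

```python
def recommend_with_fallback(cached, fallback_order, recommenders, n):
    """Return up to n items: cached recommendations first, then fallbacks in order.

    cached         -- offline recommendations already read from the cache
    fallback_order -- names as listed in fallback_recommend, e.g. ["item_based", "latest"]
    recommenders   -- hypothetical mapping from name to a zero-argument callable
                      that returns a list of item IDs
    """
    result = list(cached[:n])
    seen = set(result)
    for name in fallback_order:
        if len(result) >= n:
            break  # enough items, later fallbacks are never consulted
        for item in recommenders[name]():
            if item not in seen:
                seen.add(item)
                result.append(item)
                if len(result) >= n:
                    break
    return result


recommenders = {
    "item_based": lambda: ["b", "c"],
    "latest": lambda: ["c", "d", "e"],
}
items = recommend_with_fallback(["a"], ["item_based", "latest"], recommenders, 4)
```

With this change, `popular` is no longer a valid entry in the list, so a configuration that still names it would need to drop it or switch to another recommender.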
2 changes: 1 addition & 1 deletion src/docs/master/concepts/non-personalized.md
Original file line number Diff line number Diff line change
Expand Up @@ -4,7 +4,7 @@ shortTitle: Non-personalized
---
# Non-personalized Recommenders

Gorse recommender system offers two non-personalized recommenders: the latest recommender and the popular recommender. However, they are not flexible enough. To improve the flexibility of non-personalized recommendations, Gorse provides a new implementation of non-personalized recommenders.
Gorse recommender system offers a non-personalized recommender: the latest recommender. However, it is not flexible enough. To improve the flexibility of non-personalized recommendations, Gorse provides a new implementation of non-personalized recommenders.

## Configuration

Expand Down
9 changes: 1 addition & 8 deletions src/docs/master/config.md
Original file line number Diff line number Diff line change
Expand Up @@ -122,12 +122,6 @@ Document: https://github.com/mailru/go-clickhouse#dsn
| `positive_feedback_ttl` | string | `0` | [Time-to-live of positive feedback](./concepts/data-objects#time-to-live-1) |
| `item_ttl` | string | `0` | [Time-to-live of items](./concepts/data-objects#time-to-live) |

### `recommend.popular`

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| `popular_window` | integer | `4320h` | [Time window of popular items in days](./concepts/algorithms#popular-items) |

### `recommend.user_neighbors`

| Key | Type | Default | Description |
Expand Down Expand Up @@ -174,12 +168,11 @@ Document: https://github.com/mailru/go-clickhouse#dsn
| `check_recommend_period` | integer | `1m` | [Period to check recommendation for users](./concepts/how-it-works#worker-offline-recommendation) |
| `refresh_recommend_period` | integer | `24h` | [Period to refresh offline recommendation cache](./concepts/how-it-works#worker-offline-recommendation) |
| `enable_latest_recommend` | boolean | `false` | [Enable latest recommendation during offline recommendation](./concepts/how-it-works.html#worker-offline-recommendation) |
| `enable_popular_recommend` | boolean | `false` | [Enable popular recommendation during offline recommendation](./concepts/how-it-works.html#worker-offline-recommendation) |
| `enable_user_based_recommend` | boolean | `false` | [Enable user-based similarity recommendation during offline recommendation](./concepts/how-it-works.html#worker-offline-recommendation) |
| `enable_item_based_recommend` | boolean | `false` | [Enable item-based similarity recommendation during offline recommendation](./concepts/how-it-works.html#worker-offline-recommendation) |
| `enable_collaborative_recommend` | boolean | `true` | [Enable collaborative filtering recommendation during offline recommendation](./concepts/how-it-works.html#worker-offline-recommendation) |
| `enable_click_through_prediction` | boolean | `false` | [Enable click-through rate prediction during offline recommendation. Otherwise, results from multi-way recommendation would be merged randomly](./concepts/how-it-works.html#worker-offline-recommendation) |
| `explore_recommend` | map | `{ popular = 0.0, latest = 0.0 }` | [The explore recommendation method is used to inject popular items or latest items into recommended result](./concepts/how-it-works.html#worker-offline-recommendation) |
| `explore_recommend` | map | `{ latest = 0.0 }` | [The explore recommendation method is used to inject latest items into recommended result](./concepts/how-it-works.html#worker-offline-recommendation) |

### `recommend.online`

Expand Down
67 changes: 0 additions & 67 deletions src/zh/docs/master/api/apidocs.md
Original file line number Diff line number Diff line change
Expand Up @@ -1169,73 +1169,6 @@ Get the latest items in category.

:::

::: details GET /api/popular

Get popular items.

#### Parameters

| Name | Locate | Type | Description | Required |
|-|-|-|-|-|
| `n` | query | integer | Number of returned recommendations | |
| `offset` | query | integer | Offset of returned recommendations | |
| `user-id` | query | string | Remove read items of a user | |

#### Response Body

```json
[
{
"Id": "crocodilia",
"Score": 3.1415926
},
{
"Id": "crocodilia",
"Score": 3.1415926
},
{
"Id": "crocodilia",
"Score": 3.1415926
}
]
```

:::

::: details GET /api/popular/{category}

Get popular items in category.

#### Parameters

| Name | Locate | Type | Description | Required |
|-|-|-|-|-|
| `category` | path | string | Category of returned items. | ✅ |
| `n` | query | integer | Number of returned items | |
| `offset` | query | integer | Offset of returned items | |
| `user-id` | query | string | Remove read items of a user | |

#### Response Body

```json
[
{
"Id": "crocodilia",
"Score": 3.1415926
},
{
"Id": "crocodilia",
"Score": 3.1415926
},
{
"Id": "crocodilia",
"Score": 3.1415926
}
]
```

:::

::: details GET /api/recommend/{user-id}

Get recommendation for user.
Expand Down