# Recommender System

## Methods

- Popularity based: recommend the most popular items (e.g. dresses), ranked by purchase count
- Classification: train a classifier that outputs a binary label for whether a given user will like a given product
- Collaborative Filtering
 - Nearest Neighbor Filter:
   - User based: find users whose tastes are similar to the current user's and recommend what they liked
   - Item based: recommend items that are similar to the items the user has already bought
 - Matrix Factorization: http://www.quuxlabs.com/blog/2010/09/matrix-factorization-a-simple-tutorial-and-implementation-in-python/
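
As a rough illustration of the item-based nearest-neighbor idea above, the sketch below scores each unseen item by its cosine similarity to the items a user has already bought. The purchase matrix, the user names, and the function names `cosine` and `recommend_item_based` are hypothetical, chosen only for this example.

```python
from math import sqrt

def cosine(u, v):
    """Cosine similarity between two equal-length interaction vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def recommend_item_based(ratings, user, top_n=2):
    """Rank items the user has not bought by similarity to items they have.

    ratings: dict mapping user -> list of 0/1 purchases, one slot per item.
    """
    n_items = len(ratings[user])
    # Column j collects every user's interaction with item j.
    columns = [[row[j] for row in ratings.values()] for j in range(n_items)]
    seen = [j for j in range(n_items) if ratings[user][j] > 0]
    scores = {}
    for j in range(n_items):
        if j in seen:
            continue
        # Sum of similarities to everything the user already bought.
        scores[j] = sum(cosine(columns[j], columns[s]) for s in seen)
    return sorted(scores, key=scores.get, reverse=True)[:top_n]

# Hypothetical purchase matrix: rows are users, columns are items 0..3.
ratings = {
    "alice": [1, 1, 0, 0],
    "bob":   [1, 1, 1, 0],
    "carol": [0, 0, 1, 1],
}
print(recommend_item_based(ratings, "alice"))
```

Item 2 ranks first for `alice` because `bob`, who shares her purchases, also bought it. A user-based variant would compare rows instead of columns.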

## Features

https://zhuanlan.zhihu.com/p/221783604

https://zhuanlan.zhihu.com/p/99259582

### Original Features
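- Intrinsic user attributes: gender, age, region, education, occupation, registration time
- User behavior history: recently read articles, articles recently shown (as valid impressions) but not read, and other recent positive or negative feedback on articles such as likes, comments, shares, bookmarks, and dislikes
- User context features: current time, recommendation scenario, network type, current location, etc.
- Item and article features: keywords, category, publication time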

### Derived Features
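- To distinguish new users from existing ones, construct features such as (bucketed) user registration time and user activity over the past week or month (bucketed by number of app opens, dwell time, etc.), plus their crosses with article features.
- To distinguish new articles from old ones, add the article's publication time (the gap between publication and the current time, bucketed), the article's cumulative and recent (e.g. last day, last week) impression, click, and reading-time counts, and their crosses with article keywords, article type, user keywords, etc.
- To capture article quality, add features such as recent reading time, average reading time, click-to-impression ratio, number of complete plays and completion rate, and counts of likes, shares, and comments.
- To capture users' reading habits, add the number and duration of a user's recent reads in different time slots (e.g. morning, noon, and evening; weekdays and weekends), the quality of the articles the user recently read, etc.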
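
The bucketing and crossing described above could be sketched roughly as follows. The bucket edges, feature names, and the `_x_` separator are assumptions made for illustration; the source does not specify any of them.

```python
from bisect import bisect_right

# Hypothetical bucket edges in days; tune these to the real distributions.
REGISTRATION_EDGES = [7, 30, 365]
ARTICLE_AGE_EDGES = [1, 7, 30]

def bucketize(value, edges):
    """Return the index of the bucket that value falls into."""
    return bisect_right(edges, value)

def cross(*parts):
    """Build a categorical cross feature by joining its component values."""
    return "_x_".join(str(p) for p in parts)

def derived_features(days_since_registration, days_since_publish, article_category):
    """Bucket user age and article age, then cross user age with category."""
    user_bucket = bucketize(days_since_registration, REGISTRATION_EDGES)
    article_bucket = bucketize(days_since_publish, ARTICLE_AGE_EDGES)
    return {
        "user_age_bucket": user_bucket,
        "article_age_bucket": article_bucket,
        "user_age_x_category": cross(user_bucket, article_category),
    }

features = derived_features(3, 10, "sports")
print(features)
```

The crossed value is a plain string, so it can be hashed or one-hot encoded like any other categorical feature.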

### Important Features
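- Long-term and short-term user keywords and article keywords, and their crosses (measures how well the user profile matches the article)
- Keywords and categories of the user's recently read articles, crossed with the current article's keywords and category (measures how well the user's recent reading preferences match the article)
- Statistics of the article's recent clicks, impressions, and reading time (measures article quality and popularity)
- Click, impression, and reading-time statistics of similar articles (article cold start)
- Types and keywords of articles recently read by similar users (user cold start)
- Categories and keywords of articles the user recently liked, shared, or commented on (strong positive feedback)
- Title and content keywords of articles that drew negative feedback or were repeatedly recommended without being clicked (strong negative feedback)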


## Model Evaluation

https://zhuanlan.zhihu.com/p/67287992

![image.png](attachment:image.png)

### Offline
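- Precision@k
- AUC-ROC
- DCG and NDCG, where $r_i$ = $I$(user likes the $i^{th}$ product) is the relevance of the item at rank $i$, and $IDCG@k$ is the $DCG@k$ of the ideal ranking

$$DCG@k = \sum_{i=1}^k \frac{r_i}{\log_2 (i+1)}, \quad NDCG@k = \frac{DCG@k}{IDCG@k}$$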
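
A minimal sketch of the DCG and NDCG formulas above, truncated at rank k. It assumes the ideal ranking is obtained by sorting the same relevance list in descending order; a stricter version would build the ideal list from all relevant items, including ones that were never recommended. The function names are hypothetical.

```python
from math import log2

def dcg_at_k(relevances, k):
    """Sum r_i / log2(i + 1) over the first k ranks (ranks start at 1)."""
    return sum(r / log2(i + 1) for i, r in enumerate(relevances[:k], start=1))

def ndcg_at_k(relevances, k):
    """DCG divided by the DCG of the ideal (descending) ordering."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal else 0.0

# r_i = 1 if the user liked the item at rank i, else 0.
print(dcg_at_k([1, 0, 1], 3))
print(ndcg_at_k([1, 0, 1], 3))
```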

### Online
- input data distribution, real-time performance, variety, personalization
- diversity: whether the model keeps recommending the same items
- coverage = % of items in the training data that are recommended on the test set
- personalization = 1 - avg(cosine similarity between users' recommendation lists)
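
The coverage and personalization metrics above might be computed along these lines. This sketch assumes each user's recommendation list is encoded as a binary vector over the catalog before taking cosine similarity, which is one common reading of the formula rather than something the notes state. The catalog and user ids are made up.

```python
from itertools import combinations
from math import sqrt

def coverage(recommendations, catalog):
    """Fraction of catalog items that appear in at least one user's list."""
    recommended = set().union(*recommendations.values())
    return len(recommended & set(catalog)) / len(catalog)

def personalization(recommendations, catalog):
    """1 minus the average pairwise cosine similarity between users' lists."""
    def vec(items):
        return [1 if c in items else 0 for c in catalog]

    def cos(u, v):
        dot = sum(a * b for a, b in zip(u, v))
        norm = sqrt(sum(u)) * sqrt(sum(v))  # binary vectors: sum equals squared norm
        return dot / norm if norm else 0.0

    pairs = list(combinations(recommendations.values(), 2))
    if not pairs:  # fewer than two users: nothing to compare
        return 0.0
    avg_sim = sum(cos(vec(a), vec(b)) for a, b in pairs) / len(pairs)
    return 1 - avg_sim

catalog = ["a", "b", "c", "d"]
recs = {"u1": ["a", "b"], "u2": ["a", "c"]}
print(coverage(recs, catalog), personalization(recs, catalog))
```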

### Why might offline and online results differ?
- latency of time-dependent feature updates
- overlap between the train and test sets, which inflates offline metrics
- overfitting

### What to do if the data shifts?
- monitor real-time performance
- trigger retraining if necessary
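
One simple way to wire the two steps above together is a sliding-window monitor that flags retraining when a live metric drops well below its baseline. The sketch below uses CTR as the metric. The class name `CtrMonitor`, the window size, and the 20% tolerance are all hypothetical choices, not values from these notes.

```python
from collections import deque

class CtrMonitor:
    """Track CTR over the most recent impressions and flag large drops."""

    def __init__(self, baseline_ctr, window=1000, tolerance=0.2):
        self.baseline_ctr = baseline_ctr
        self.tolerance = tolerance           # allowed relative drop
        self.events = deque(maxlen=window)   # old events fall off automatically

    def record(self, clicked):
        self.events.append(1 if clicked else 0)

    def needs_retraining(self):
        # Wait for a full window so a few early events cannot trigger a retrain.
        if len(self.events) < self.events.maxlen:
            return False
        ctr = sum(self.events) / len(self.events)
        return ctr < self.baseline_ctr * (1 - self.tolerance)

monitor = CtrMonitor(baseline_ctr=0.10, window=10)
for clicked in [True] + [False] * 9:
    monitor.record(clicked)
print(monitor.needs_retraining())
```

In practice the same pattern can watch input feature distributions as well as outcome metrics.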


## Cold-Start User

### Content-based filtering
- use metadata about the new product
- collect additional information from the user
- advantages:
  - independent of other users' data
  - no data-sparsity problem
  - can recommend new items to users

### Popularity-based
- recommend the products on the list that almost all new customers buy
- disadvantage: lack of personalization

### Multi-Armed Bandit Model (Reinforcement Learning)
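- choose the product $j$ that maximizes the upper-confidence-bound score $\mu_j + \sqrt{\frac{\ln(t)}{t_j}}$, where $\mu_j$ is the average reward observed for product $j$, $t$ is the total number of rounds so far, and $t_j$ is the number of times product $j$ has been shown
- tradeoff between exploration and exploitation
  - exploitation: if an item is found to sell well, show it to users more often
  - exploration: also show items that have not been shown as much, because they may turn out to be even more popular than the items already shown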
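
A minimal sketch of the selection rule above, scoring each item by $\mu_j + \sqrt{\ln(t)/t_j}$. It assumes $\mu_j$ is the running mean reward of item $j$, $t$ is the total number of rounds so far, and $t_j$ is how often item $j$ has been shown. Showing every item once before applying the formula is an added assumption that avoids dividing by zero. Function names are hypothetical.

```python
from math import log, sqrt

def ucb_select(mean_reward, pulls, t):
    """Return the index of the item with the highest upper confidence bound."""
    # Show each item once first so that every t_j is positive.
    for j, n in enumerate(pulls):
        if n == 0:
            return j
    scores = [mu + sqrt(log(t) / n) for mu, n in zip(mean_reward, pulls)]
    return max(range(len(scores)), key=scores.__getitem__)

def update(mean_reward, pulls, j, reward):
    """Update the running mean reward of item j after observing a reward."""
    pulls[j] += 1
    mean_reward[j] += (reward - mean_reward[j]) / pulls[j]

# Item 1 has a lower mean but was shown only once, so exploration wins.
print(ucb_select([0.5, 0.4], [100, 1], 101))
```

The bonus term shrinks as an item is shown more often, so well-explored items are chosen mainly on their mean reward.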

### Deep Learning

## Case Study: Job Recommendation

### 1. How to Collect Labels?
Collecting labels is crucial for training supervised machine learning models. In a job recommendation system, labels might represent whether a job recommendation was relevant or whether a candidate applied for or was hired for a job.

**Approaches:**
- **Historical Data:** Use historical data where resumes are matched with jobs the candidates applied to or were hired for. These applications or hires can serve as positive labels.
- **User Feedback:** Incorporate explicit feedback from users (e.g., thumbs up/down on a recommendation, ratings, or preferences). This feedback can be treated as positive or negative labels.
- **Implicit Feedback:** Analyze user behavior, such as clicks on job recommendations, time spent viewing job details, or application submissions. Positive behavior can be used as positive labels, while the absence of interaction can be negative or neutral labels.
- **Expert Annotation:** If historical data is limited, you might use domain experts to manually label data, though this is time-consuming and expensive.
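
As a sketch of the implicit-feedback approach above, an event log can be collapsed into one binary label per candidate-job pair. The event names (`impression`, `click`, `apply`, `hire`) and the choice to treat a click as positive are assumptions for illustration; a real system might weight events differently or treat impression-only pairs as neutral.

```python
# Hypothetical event types that count as positive signals.
POSITIVE_EVENTS = {"click", "apply", "hire"}

def build_labels(events):
    """Map (candidate_id, job_id) -> 1 if any positive event occurred, else 0.

    events: iterable of (candidate_id, job_id, event_type) tuples.
    """
    labels = {}
    for candidate, job, event in events:
        key = (candidate, job)
        is_positive = 1 if event in POSITIVE_EVENTS else 0
        # A pair stays positive once any positive event has been seen.
        labels[key] = max(labels.get(key, 0), is_positive)
    return labels

events = [
    ("c1", "j1", "impression"),
    ("c1", "j1", "click"),
    ("c1", "j2", "impression"),
]
print(build_labels(events))
```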

### 2. How to Select Features?

**Potential Features:**
- **Resume-Based Features:**
  - **Skills:** Extract skills from the resume using NLP techniques like Named Entity Recognition (NER).
  - **Experience:** Quantify the years of experience in specific industries or roles.
  - **Education:** Degree level, fields of study, and institutions attended.
  - **Past Roles and Companies:** Titles held, companies worked at, and the hierarchy of positions.
  - **Location:** Current location of the candidate, willingness to relocate, or work remotely.
  
- **Job-Based Features:**
  - **Job Title:** The title of the job and its similarity to the candidate's past job titles.
  - **Required Skills:** A match between required skills and the candidate’s skills.
  - **Location:** Proximity of the job location to the candidate's location.
  - **Industry:** Match between the candidate's previous industry experience and the industry of the job.

- **Interaction Features:**
  - **Click-Through Rate (CTR):** Previous interaction history with similar jobs.
  - **Application History:** Whether the candidate has applied for similar jobs in the past.
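
For instance, the required-skills match listed above could be turned into a numeric feature as the fraction of required skills found on the resume. This sketch assumes skills have already been extracted as strings and only normalizes case and whitespace. The function name `skill_match` is hypothetical.

```python
def skill_match(candidate_skills, required_skills):
    """Fraction of the job's required skills that the candidate lists."""
    candidate = {s.strip().lower() for s in candidate_skills}
    required = {s.strip().lower() for s in required_skills}
    if not required:
        return 0.0
    return len(candidate & required) / len(required)

print(skill_match(["Python", "SQL"], ["python", "spark"]))
```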

### 3. What ML Models are Appropriate to Solve the Problems?

**Models:**
- **Content-Based Filtering:** Recommends jobs based on the similarity between the resume and job descriptions. This can involve vectorizing text features using methods like TF-IDF or word embeddings (e.g., Word2Vec, BERT) and then computing cosine similarity.
  
- **Collaborative Filtering:** Uses past interactions to recommend jobs based on what similar candidates have applied to or been hired for. This can be done using matrix factorization techniques (e.g., SVD) or deep learning methods (e.g., Neural Collaborative Filtering).

- **Hybrid Models:** Combine content-based and collaborative filtering to leverage both resume content and interaction history.

- **Deep Learning Models:** 
  - **Deep Neural Networks (DNNs):** For feature-rich environments, DNNs can be trained to learn complex relationships between features.
  - **Siamese Networks:** Can be used to learn similarity between resumes and job descriptions by comparing pairs.
  - **Transformer Models:** Pre-trained language models like BERT can be fine-tuned to understand and match resumes with job descriptions.
    - Concatenate tabular features into a sentence/sequence (e.g. “NYU, Math, Masters, Amazon, Applied Scientist, 4 YOE, LinkedIn, MLE, Senior”)

### 4. How to Conduct A/B Test Experiments? What Metrics Can Be Used?

**Steps:**
- **Randomization:** Randomly assign users to different groups, where one group receives recommendations from the new model and the other from the baseline model.
- **Duration:** Run the experiment for a sufficient duration to gather enough data for statistical significance.
  
**Metrics:**
- **Click-Through Rate (CTR):** The percentage of job recommendations clicked by users.
- **Conversion Rate:** The percentage of recommendations that result in a job application.
- **Time to First Interaction:** How quickly users interact with the recommended jobs.
- **User Engagement:** Metrics like session duration, number of job views, etc.
- **Long-Term Metrics:** Job acceptance rate, user retention, and satisfaction surveys.
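
When comparing CTR or conversion rate between the two groups, one common check for statistical significance is a two-proportion z-test. The sketch below is one possible way to run it, not the only valid test. It assumes independent users and samples large enough for the normal approximation. The counts in the example are invented.

```python
from math import erf, sqrt

def two_proportion_z_test(clicks_a, views_a, clicks_b, views_b):
    """Return (z, two-sided p-value) for the difference in rates, B minus A."""
    p_a = clicks_a / views_a
    p_b = clicks_b / views_b
    pooled = (clicks_a + clicks_b) / (views_a + views_b)
    se = sqrt(pooled * (1 - pooled) * (1 / views_a + 1 / views_b))
    if se == 0:  # both groups have a rate of exactly 0 or 1
        return 0.0, 1.0
    z = (p_b - p_a) / se
    # Standard normal CDF expressed through the error function.
    cdf = 0.5 * (1 + erf(abs(z) / sqrt(2)))
    return z, 2 * (1 - cdf)

# Hypothetical counts: baseline (A) vs. new model (B).
z, p_value = two_proportion_z_test(100, 1000, 150, 1000)
print(z, p_value)
```

A small p-value suggests the difference is unlikely to be noise, though the experiment length and any repeated peeking at results also affect how far it can be trusted.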

### 5. How to Evaluate Online and Offline Performance During Experiments?

**Offline Evaluation:**
- **Precision@k and Recall@k:** Precision@k is the fraction of the top-k recommendations that are relevant; Recall@k is the fraction of all relevant jobs that appear in the top k.
- **Mean Reciprocal Rank (MRR):** Evaluates how high the first relevant recommendation is ranked.
- **Area Under the Curve (AUC):** AUC-ROC summarizes the trade-off between true positive rate and false positive rate across all decision thresholds.
- **Cross-Validation:** Use cross-validation on historical data to assess model performance before deployment.
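
The ranking metrics in the offline list above can be computed from a ranked recommendation list and a set of relevant jobs, for example as sketched below. The job ids are hypothetical. Dividing precision by k even when fewer than k items are recommended is a convention chosen here.

```python
def precision_at_k(recommended, relevant, k):
    """Fraction of the top-k recommendations that are relevant."""
    hits = sum(1 for item in recommended[:k] if item in relevant)
    return hits / k

def recall_at_k(recommended, relevant, k):
    """Fraction of all relevant items that appear in the top k."""
    if not relevant:
        return 0.0
    hits = sum(1 for item in recommended[:k] if item in relevant)
    return hits / len(relevant)

def mean_reciprocal_rank(all_recommended, all_relevant):
    """Average over users of 1 / rank of the first relevant item (0 if none)."""
    total = 0.0
    for recommended, relevant in zip(all_recommended, all_relevant):
        for rank, item in enumerate(recommended, start=1):
            if item in relevant:
                total += 1.0 / rank
                break
    return total / len(all_recommended)

recommended = ["j1", "j2", "j3", "j4"]
relevant = {"j2", "j4", "j9"}
print(precision_at_k(recommended, relevant, 2))
print(recall_at_k(recommended, relevant, 2))
print(mean_reciprocal_rank([recommended], [relevant]))
```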

**Online Evaluation:**
- **Real-Time Metrics:** Track metrics like CTR, conversion rate, and user engagement in real-time.
- **A/B Testing:** Compare performance metrics between different models in the live environment.
- **User Feedback:** Collect and analyze qualitative feedback from users to identify potential improvements.
