KDD2022 Paper List

Paper | Authors | Organization | Abstract | Code | Citations
Contrastive Cross-domain Recommendation in Matching Ruobing Xie, Qi Liu, Liangdong Wang, Shukai Liu, Bo Zhang, Leyu Lin WeChat, Tencent, Beijing, China Cross-domain recommendation (CDR) aims to provide better recommendation results in the target domain with the help of the source domain, and is widely used and explored in real-world systems. However, CDR in the matching (i.e., candidate generation) module struggles with data sparsity and popularity bias in both representation learning and knowledge transfer. In this work, we propose a novel Contrastive Cross-Domain Recommendation (CCDR) framework for CDR in matching. Specifically, we build a huge diversified preference network to capture multiple types of information reflecting users' diverse interests, and design an intra-domain contrastive learning (intra-CL) task and three inter-domain contrastive learning (inter-CL) tasks for better representation learning and knowledge transfer. The intra-CL enables more effective and balanced training inside the target domain via a graph augmentation, while the inter-CL builds different types of cross-domain interactions from the user, taxonomy, and neighbor aspects. In experiments, CCDR achieves significant improvements on both offline and online evaluations in a real-world system. Currently, we have deployed CCDR on WeChat Top Stories, affecting plenty of users. The source code is available at https://github.com/lqfarmer/CCDR. code 12
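The intra-CL task above is a contrastive objective over two graph-augmented views of each node. A minimal InfoNCE-style sketch (NumPy, with illustrative names and shapes; not the released CCDR code):

```python
import numpy as np

def info_nce(z1, z2, tau=0.1):
    """InfoNCE loss between two augmented views.

    z1, z2: (n, d) arrays; row i of z1 and row i of z2 are two views
    of the same node (the positive pair); all other rows act as negatives.
    """
    # L2-normalize so dot products are cosine similarities
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    logits = z1 @ z2.T / tau                      # (n, n) similarity matrix
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # positives sit on the diagonal
    return -np.mean(np.diag(log_prob))
```

Identical views give a near-zero loss, while mismatched views score much higher, which is what drives the representations of the two views together.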
Graph-Flashback Network for Next Location Recommendation Xuan Rao, Lisi Chen, Yong Liu, Shuo Shang, Bin Yao, Peng Han code 9
Meta-Learned Metrics over Multi-Evolution Temporal Graphs Dongqi Fu, Liri Fang, Ross Maciejewski, Vetle I. Torvik, Jingrui He code 9
FLDetector: Defending Federated Learning Against Model Poisoning Attacks via Detecting Malicious Clients Zaixi Zhang, Xiaoyu Cao, Jinyuan Jia, Neil Zhenqiang Gong code 8
Surrogate for Long-Term User Experience in Recommender Systems Yuyan Wang, Mohit Sharma, Can Xu, Sriraj Badam, Qian Sun, Lee Richardson, Lisa Chung, Ed H. Chi, Minmin Chen Google Research, Brain Team, Mountain View, CA, USA; Google, Mountain View, CA, USA Over the years we have seen recommender systems shifting focus from optimizing short-term engagement toward improving long-term user experience on the platforms. While defining good long-term user experience is still an active research area, we focus on one specific aspect of improved long-term user experience here, which is users revisiting the platform. These long-term outcomes, however, are much harder to optimize due to the sparsity of observed events and the low signal-to-noise ratio (weak connection) between these long-term outcomes and a single recommendation. To address these challenges, we propose to establish the association between these long-term outcomes and a set of more immediate user behavior signals that can serve as surrogates for optimization. To this end, we conduct a large-scale study of user behavior logs on one of the largest industrial recommendation platforms, serving billions of users. We study a broad set of sequential user behavior patterns and standardize a procedure to pinpoint the subset that has strong predictive power for the change in users' long-term visiting frequency. Specifically, they are predictive of users' increased visiting to the platform over 5 months among the group of users with the same initial visiting frequency. We validate the identified subset of user behaviors by incorporating them as reward surrogates for long-term user experience in a reinforcement learning (RL) based recommender. Results from multiple live experiments on the industrial recommendation platform demonstrate the effectiveness of the proposed set of surrogates in improving long-term user experience.
code 7
Multi-modal Siamese Network for Entity Alignment Liyi Chen, Zhi Li, Tong Xu, Han Wu, Zhefeng Wang, Nicholas Jing Yuan, Enhong Chen code 7
GraphMAE: Self-Supervised Masked Graph Autoencoders Zhenyu Hou, Xiao Liu, Yukuo Cen, Yuxiao Dong, Hongxia Yang, Chunjie Wang, Jie Tang code 7
MSDR: Multi-Step Dependency Relation Networks for Spatial Temporal Forecasting Dachuan Liu, Jin Wang, Shuo Shang, Peng Han code 7
Joint Knowledge Graph Completion and Question Answering Lihui Liu, Boxin Du, Jiejun Xu, Yinglong Xia, Hanghang Tong code 7
Graph Neural Networks for Multimodal Single-Cell Data Integration Hongzhi Wen, Jiayuan Ding, Wei Jin, Yiqi Wang, Yuying Xie, Jiliang Tang code 7
Multi-Behavior Hypergraph-Enhanced Transformer for Sequential Recommendation Yuhao Yang, Chao Huang, Lianghao Xia, Yuxuan Liang, Yanwei Yu, Chenliang Li National University of Singapore, Singapore, Singapore; University of Hong Kong, Hong Kong, China; Ocean University of China, Qingdao, China; Wuhan University, Wuhan, China Learning dynamic user preferences has become an increasingly important component for many online platforms (e.g., video-sharing sites, e-commerce systems) to make sequential recommendations. Previous works have made many efforts to model item-item transitions over user interaction sequences, based on various architectures, e.g., recurrent neural networks and the self-attention mechanism. Recently emerged graph neural networks also serve as useful backbone models to capture item dependencies in sequential recommendation scenarios. Despite their effectiveness, existing methods have so far focused on item sequence representation with a single type of interaction, and are thus limited in capturing the dynamic heterogeneous relational structures between users and items (e.g., page view, add-to-favorite, purchase). To tackle this challenge, we design a Multi-Behavior Hypergraph-enhanced Transformer framework (MBHT) to capture both short-term and long-term cross-type behavior dependencies. Specifically, a multi-scale Transformer is equipped with low-rank self-attention to jointly encode behavior-aware sequential patterns at fine-grained and coarse-grained levels. Additionally, we incorporate the global multi-behavior dependency into the hypergraph neural architecture to capture the hierarchical long-range item correlations in a customized manner. Experimental results demonstrate the superiority of our MBHT over various state-of-the-art recommendation solutions across different settings. Further ablation studies validate the effectiveness of our model design and the benefits of the new MBHT framework.
Our implementation code is released at: https://github.com/yuh-yang/MBHT-KDD22. code 6
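The low-rank self-attention mentioned above avoids the O(n²) cost of full attention by projecting the n keys and values down to a fixed rank r, as in Linformer. A sketch with illustrative shapes (not the MBHT implementation):

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def low_rank_attention(q, k, v, e, f):
    """Self-attention with keys/values projected to rank r.

    q, k, v: (n, d) query/key/value matrices for one head.
    e, f:    (r, n) learned projections that compress the n keys and
             n values down to r << n rows, so scores are (n, r) not (n, n).
    """
    k_low = e @ k                                  # (r, d)
    v_low = f @ v                                  # (r, d)
    scores = q @ k_low.T / np.sqrt(q.shape[1])     # (n, r)
    return softmax(scores, axis=-1) @ v_low        # (n, d)
```

With sequence length n and rank r, the score matrix shrinks from n×n to n×r, which is what makes long multi-behavior sequences tractable.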
CommerceMM: Large-Scale Commerce MultiModal Representation Learning with Omni Retrieval Licheng Yu, Jun Chen, Animesh Sinha, Mengjiao Wang, Yu Chen, Tamara L. Berg, Ning Zhang Meta AI, Menlo Park, CA, USA We introduce CommerceMM - a multimodal model capable of providing a diverse and granular understanding of commerce topics associated with a given piece of content (image, text, image+text), and capable of generalizing to a wide range of tasks, including Multimodal Categorization, Image-Text Retrieval, Query-to-Product Retrieval, Image-to-Product Retrieval, etc. We follow the pre-training + fine-tuning regime and present 5 effective pre-training tasks on image-text pairs. To embrace more common and diverse commerce data with text-to-multimodal, image-to-multimodal, and multimodal-to-multimodal mappings, we propose another 9 novel cross-modal and cross-pair retrieval tasks, called Omni-Retrieval pre-training. We also propose a novel approach of modality randomization to dynamically adjust our model under different efficiency constraints. The pre-training is conducted in an efficient manner, with only two forward/backward updates for the combined 14 tasks. Extensive experiments and analysis show the effectiveness of each task. When combining all pre-training tasks, our model achieves state-of-the-art performance on 7 commerce-related downstream tasks after fine-tuning. code 6
Learning to Rotate: Quaternion Transformer for Complicated Periodical Time Series Forecasting Weiqi Chen, Wenwei Wang, Bingqing Peng, Qingsong Wen, Tian Zhou, Liang Sun code 6
FederatedScope-GNN: Towards a Unified, Comprehensive and Efficient Package for Federated Graph Learning Zhen Wang, Weirui Kuang, Yuexiang Xie, Liuyi Yao, Yaliang Li, Bolin Ding, Jingren Zhou code 6
CrossCBR: Cross-view Contrastive Learning for Bundle Recommendation Yunshan Ma, Yingzhi He, An Zhang, Xiang Wang, TatSeng Chua code 5
Data-Efficient Brain Connectome Analysis via Multi-Task Meta-Learning Yi Yang, Yanqiao Zhu, Hejie Cui, Xuan Kan, Lifang He, Ying Guo, Carl Yang code 5
Matrix Profile XXIV: Scaling Time Series Anomaly Detection to Trillions of Datapoints and Ultra-fast Arriving Data Streams Yue Lu, Renjie Wu, Abdullah Mueen, Maria A. Zuluaga, Eamonn J. Keogh code 5
ROLAND: Graph Learning Framework for Dynamic Graphs Jiaxuan You, Tianyu Du, Jure Leskovec code 5
Multiplex Heterogeneous Graph Convolutional Network Pengyang Yu, Chaofan Fu, Yanwei Yu, Chao Huang, Zhongying Zhao, Junyu Dong code 5
ERNIE-GeoL: A Geography-and-Language Pre-trained Model and its Applications in Baidu Maps Jizhou Huang, Haifeng Wang, Yibo Sun, Yunsheng Shi, Zhengjie Huang, An Zhuo, Shikun Feng code 5
ChemicalX: A Deep Learning Library for Drug Pair Scoring Benedek Rozemberczki, Charles Tapley Hoyt, Anna Gogleva, Piotr Grabowski, Klas Karis, Andrej Lamov, Andriy Nikolov, Sebastian Nilsson, Michaël Ughetto, Yu Wang, Tyler Derr, Benjamin M. Gyori code 5
DuARE: Automatic Road Extraction with Aerial Images and Trajectory Data at Baidu Maps Jianzhong Yang, Xiaoqing Ye, Bin Wu, Yanlei Gu, Ziyu Wang, Deguo Xia, Jizhou Huang code 5
TwHIN: Embedding the Twitter Heterogeneous Information Network for Personalized Recommendation Ahmed ElKishky, Thomas Markovich, Serim Park, Chetan Verma, Baekjin Kim, Ramy Eskander, Yury Malkov, Frank Portman, Sofía Samaniego, Ying Xiao, Aria Haghighi Twitter, San Francisco, CA, USA; Twitter Cortex, New York, NY, USA; Twitter Cortex, Seattle, WA, USA; Twitter Cortex, Boston, MA, USA; Twitter Cortex, San Francisco, CA, USA Social networks, such as Twitter, form a heterogeneous information network (HIN) where nodes represent domain entities (e.g., user, content, advertiser) and edges represent one of many entity interactions (e.g., a user re-sharing content or "following" another). Interactions from multiple relation types can encode valuable information about social network entities not fully captured by a single relation; for instance, a user's preference for accounts to follow may depend on both user-content engagement interactions and the other users they follow. In this work, we investigate knowledge-graph embeddings for entities in the Twitter HIN (TwHIN); we show that these pretrained representations yield significant offline and online improvements for a diverse range of downstream recommendation and classification tasks: personalized ads ranking, account follow-recommendation, offensive content detection, and search ranking. We discuss design choices and practical challenges of deploying industry-scale HIN embeddings, including compressing them to reduce end-to-end model latency and handling parameter drift across versions. code 4
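Knowledge-graph embedding methods of the kind TwHIN builds on score a (head, relation, tail) edge by translation in embedding space. A minimal TransE-style scorer with a margin loss (illustrative only; not Twitter's implementation):

```python
import numpy as np

def transe_score(h, r, t):
    """Higher (less negative) score = more plausible edge.
    h, r, t: (d,) embeddings of head entity, relation, and tail entity."""
    return -np.linalg.norm(h + r - t)

def margin_loss(pos_score, neg_score, margin=1.0):
    """Hinge loss pushing positive edges to score above corrupted ones."""
    return max(0.0, margin - pos_score + neg_score)
```

Training minimizes the margin loss over observed edges paired with randomly corrupted negatives, so that e.g. (user, follows, account) edges score higher than fabricated ones.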
Multi-Task Fusion via Reinforcement Learning for Long-Term User Satisfaction in Recommender Systems Qihua Zhang, Junning Liu, Yuzhuo Dai, Yiyan Qi, Yifan Yuan, Kunlun Zheng, Fan Huang, Xianfeng Tan Tencent, Shenzhen, China; Tencent, Beijing, China Recommender System (RS) is an important online application that affects billions of users every day. The mainstream RS ranking framework is composed of two parts: a Multi-Task Learning model (MTL) that predicts various user feedback, i.e., clicks, likes, and shares, and a Multi-Task Fusion model (MTF) that combines the multi-task outputs into one final ranking score with respect to user satisfaction. There has not been much research on the fusion model, even though, as the last crucial step of ranking, it has a great impact on the final recommendation. To optimize long-term user satisfaction rather than greedily obtain instant returns, we formulate the MTF task as a Markov Decision Process (MDP) within a recommendation session and propose a Batch Reinforcement Learning (RL) based Multi-Task Fusion framework (BatchRL-MTF) that includes a Batch RL framework and an online exploration component. The former exploits Batch RL to learn an optimal recommendation policy offline from fixed batch data for long-term user satisfaction, while the latter explores potential high-value actions online to break out of the local-optimum dilemma. Through a comprehensive investigation of user behaviors, we model the user satisfaction reward with subtle heuristics from two aspects: user stickiness and user activeness. Finally, we conduct extensive experiments on a billion-sample-level real-world dataset to show the effectiveness of our model. We propose a conservative offline policy estimator (Conservative-OPEstimator) to test our model offline. Furthermore, we run online experiments in a real recommendation environment to compare the performance of different models.
As one of the few successful applications of Batch RL to the MTF task, our model has also been deployed on a large-scale industrial short-video platform, serving hundreds of millions of users. code 4
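In its simplest form, the MTF step above reduces to choosing a weight vector (the RL action) over the per-task predictions and ranking by the fused score. A generic sketch of that fusion (not BatchRL-MTF's learned policy):

```python
import numpy as np

def fuse_scores(task_preds, weights):
    """Combine multi-task predictions (click, like, share, ...) into a
    single per-item ranking score via a weighted sum.
    task_preds: (n_items, n_tasks) predicted probabilities.
    weights:    (n_tasks,) fusion action, e.g. chosen by an RL policy."""
    return task_preds @ weights

def rank(task_preds, weights):
    """Return item indices ordered best-first under the fused score."""
    return np.argsort(-fuse_scores(task_preds, weights))
```

Different weight vectors induce different rankings, which is exactly the degree of freedom the RL policy optimizes for long-term satisfaction rather than any single feedback signal.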
Feature-aware Diversified Re-ranking with Disentangled Representations for Relevant Recommendation Zihan Lin, Hui Wang, Jingshu Mao, Wayne Xin Zhao, Cheng Wang, Peng Jiang, JiRong Wen Renmin University of China, Beijing Key Laboratory of Big Data Management and Analysis Methods, & Beijing Academy of Artificial Intelligence, Beijing, China; Kuaishou Inc., Beijing, China; Renmin University of China, Beijing, China Relevant recommendation is a special recommendation scenario which provides relevant items when users express interest in one target item (e.g., click, like, and purchase). Besides considering the relevance between recommendations and the trigger item, the recommendations should also be diversified to avoid information cocoons. However, existing diversified recommendation methods mainly focus on item-level diversity, which is insufficient when the recommended items are all relevant to the target item. Moreover, redundant or noisy item features might affect the performance of simple feature-aware recommendation approaches. Faced with these issues, we propose a Feature Disentanglement Self-Balancing Re-ranking framework (FDSB) to capture feature-aware diversity. The framework consists of two major modules, namely a disentangled attention encoder (DAE) and a self-balanced multi-aspect ranker. In the DAE, we use multi-head attention to learn disentangled aspects from rich item features. In the ranker, we develop an aspect-specific ranking mechanism that is able to adaptively balance the relevance and diversity for each aspect. In experiments, we conduct offline evaluation on the collected dataset and deploy FDSB on the Kuaishou app for an online A/B test of the relevant-recommendation function. The significant improvements in both recommendation quality and user experience verify the effectiveness of our approach.
code 4
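The relevance/diversity balancing above is in the spirit of maximal marginal relevance (MMR). A generic greedy MMR re-ranker for reference (the fixed λ here is illustrative; FDSB instead learns the balance adaptively per aspect):

```python
def mmr_rerank(relevance, sim, k, lam=0.7):
    """Greedily pick k items maximizing
    lam * relevance[i] - (1 - lam) * (max similarity to already-picked items).
    relevance: length-n relevance scores; sim: n x n item-item similarity."""
    selected, remaining = [], list(range(len(relevance)))
    while remaining and len(selected) < k:
        def score(i):
            redundancy = max((sim[i][j] for j in selected), default=0.0)
            return lam * relevance[i] - (1 - lam) * redundancy
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected
```

A near-duplicate of an already-selected item is penalized by its similarity term, so the second pick favors a less relevant but novel item.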
Counteracting User Attention Bias in Music Streaming Recommendation via Reward Modification Xiao Zhang, Sunhao Dai, Jun Xu, Zhenhua Dong, Quanyu Dai, JiRong Wen Huawei Noah's Ark Lab, Shenzhen, China; Renmin University of China, Beijing, China In streaming media applications, like music Apps, songs are recommended in a continuous way in users' daily life. The recommended songs are played automatically although users may not pay any attention to them, posing a challenge of user attention bias in training recommendation models, i.e., the training instances contain a large number of false-positive labels (users' feedback). Existing approaches either directly use the auto-feedbacks or heuristically delete the potential false-positive labels. Both of the approaches lead to biased results because the false-positive labels cause the shift of training data distribution, hurting the accuracy of the recommendation models. In this paper, we propose a learning-based counterfactual approach to adjusting the user auto-feedbacks and learning the recommendation models using Neural Dueling Bandit algorithm, called NDB. Specifically, NDB maintains two neural networks: a user attention network for computing the importance weights that are used for modifying the original rewards, and another random network trained with dueling bandit for conducting online recommendations based on the modified rewards. Theoretical analysis showed that the modified rewards are statistically unbiased, and the learned bandit policy enjoys a sub-linear regret bound. Experimental results demonstrated that NDB can significantly outperform the state-of-the-art baselines. 
code 4
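The reward modification above uses attention-derived importance weights to de-bias auto-played feedback. One standard way to obtain a statistically unbiased reward is an inverse-propensity correction; this sketch assumes that form and is not the paper's exact estimator:

```python
def modified_reward(observed_reward, attention_prob, eps=1e-6):
    """Importance-weight an auto-played song's feedback by the estimated
    probability that the user actually attended to it.
    observed_reward: raw auto-feedback signal (e.g. 1.0 = played through).
    attention_prob:  attention model's estimate in (0, 1]; eps guards
                     against division by a vanishing propensity."""
    return observed_reward / max(attention_prob, eps)
```

Feedback from songs the user likely ignored (low attention probability) is up-weighted when it does occur and contributes nothing when it does not, which counteracts the false-positive auto-feedback bias in expectation.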
Knowledge-enhanced Black-box Attacks for Recommendations Jingfan Chen, Wenqi Fan, Guanghui Zhu, Xiangyu Zhao, Chunfeng Yuan, Qing Li, Yihua Huang code 4
Towards Universal Sequence Representation Learning for Recommender Systems Yupeng Hou, Shanlei Mu, Wayne Xin Zhao, Yaliang Li, Bolin Ding, JiRong Wen code 4
On Structural Explanation of Bias in Graph Neural Networks Yushun Dong, Song Wang, Yu Wang, Tyler Derr, Jundong Li code 4
SMORE: Knowledge Graph Completion and Multi-hop Reasoning in Massive Knowledge Graphs Hongyu Ren, Hanjun Dai, Bo Dai, Xinyun Chen, Denny Zhou, Jure Leskovec, Dale Schuurmans code 4
Improving Fairness in Graph Neural Networks via Mitigating Sensitive Attribute Leakage Yu Wang, Yuying Zhao, Yushun Dong, Huiyuan Chen, Jundong Li, Tyler Derr code 4
COSTA: Covariance-Preserving Feature Augmentation for Graph Contrastive Learning Yifei Zhang, Hao Zhu, Zixing Song, Piotr Koniusz, Irwin King code 4
How does Heterophily Impact the Robustness of Graph Neural Networks?: Theoretical Connections and Practical Implications Jiong Zhu, Junchen Jin, Donald Loveland, Michael T. Schaub, Danai Koutra code 4
Company-as-Tribe: Company Financial Risk Assessment on Tribe-Style Graph with Hierarchical Graph Neural Networks Wendong Bi, Bingbing Xu, Xiaoqian Sun, Zidong Wang, Huawei Shen, Xueqi Cheng code 4
Distributed Hybrid CPU and GPU training for Graph Neural Networks on Billion-Scale Heterogeneous Graphs Da Zheng, Xiang Song, Chengru Yang, Dominique LaSalle, George Karypis code 4
Graph Neural Networks: Foundation, Frontiers and Applications Lingfei Wu, Peng Cui, Jian Pei, Liang Zhao, Xiaojie Guo code 4
Deconfounding Duration Bias in Watch-time Prediction for Video Recommendation Ruohan Zhan, Changhua Pei, Qiang Su, Jianfeng Wen, Xueliang Wang, Guanyu Mu, Dong Zheng, Peng Jiang, Kun Gai code 3
Learning Binarized Graph Representations with Multi-faceted Quantization Reinforcement for Top-K Recommendation Yankai Chen, Huifeng Guo, Yingxue Zhang, Chen Ma, Ruiming Tang, Jingjie Li, Irwin King code 3
Addressing Unmeasured Confounder for Recommendation with Sensitivity Analysis Sihao Ding, Peng Wu, Fuli Feng, Yitong Wang, Xiangnan He, Yong Liao, Yongdong Zhang code 3
Disentangled Ontology Embedding for Zero-shot Learning Yuxia Geng, Jiaoyan Chen, Wen Zhang, Yajing Xu, Zhuo Chen, Jeff Z. Pan, Yufeng Huang, Feiyu Xiong, Huajun Chen code 3
Detecting Arbitrary Order Beneficial Feature Interactions for Recommender Systems Yixin Su, Yunxiang Zhao, Sarah M. Erfani, Junhao Gan, Rui Zhang code 3
AdaFS: Adaptive Feature Selection in Deep Recommender System Weilin Lin, Xiangyu Zhao, Yejing Wang, Tong Xu, Xian Wu code 3
ClusterEA: Scalable Entity Alignment with Stochastic Training and Normalized Mini-batch Similarities Yunjun Gao, Xiaoze Liu, Junyang Wu, Tianyi Li, Pengfei Wang, Lu Chen code 3
Source Localization of Graph Diffusion via Variational Autoencoders for Graph Inverse Problems Chen Ling, Junji Jiang, Junxiang Wang, Liang Zhao code 3
Variational Flow Graphical Model Shaogang Ren, Belhal Karimi, Dingcheng Li, Ping Li code 3
Interpretability, Then What? Editing Machine Learning Models to Reflect Human Knowledge and Values Zijie J. Wang, Alex Kale, Harsha Nori, Peter Stella, Mark E. Nunnally, Duen Horng Chau, Mihaela Vorvoreanu, Jennifer Wortman Vaughan, Rich Caruana code 3
GBPNet: Universal Geometric Representation Learning on Protein Structures Sarp Aykent, Tian Xia Auburn University, Auburn, AL, USA Representation learning of protein 3D structures is challenging and essential for applications such as computational protein design and protein engineering. Recently, geometric deep learning has achieved great success in non-Euclidean domains. Although a protein can be represented naturally as a graph, this direction remains under-explored, mainly due to the significant challenges in modeling the complex representations and capturing the inherent correlations in 3D structure modeling. These challenges include: 1) extracting and preserving multi-level rotation- and translation-equivariant information during learning; 2) developing appropriate tools to effectively leverage the input spatial representations to capture complex geometries across the spatial dimension; 3) incorporating various geometric features while preserving the inherent structural relations. In this work, we introduce the geometric bottleneck perceptron, and a general SO(3)-equivariant message passing neural network built on top of it for protein structure representation learning. The proposed geometric bottleneck perceptron can be incorporated into diverse network architecture backbones to process geometric data in different domains. This research sheds new light on geometric deep learning in 3D structure studies. Empirically, we demonstrate the strength of our proposed approach on three core downstream tasks, where our model achieves significant improvements and outperforms existing benchmarks. The implementation is available at https://github.com/sarpaykent/GBPNet.
code 3
Motif Prediction with Graph Neural Networks Maciej Besta, Raphael Grob, Cesare Miglioli, Nicola Bernold, Grzegorz Kwasniewski, Gabriel Gjini, Raghavendra Kanakagiri, Saleh Ashkboos, Lukas Gianinazzi, Nikoli Dryden, Torsten Hoefler University of Illinois at Urbana-Champaign, Urbana-Champaign, IL, USA; ETH Zurich, Zurich, Switzerland; University of Geneva, Geneva, Switzerland; ETH Zürich, Zurich, Switzerland Link prediction is one of the central problems in graph mining. However, recent studies highlight the importance of higher-order network analysis, where complex structures called motifs are the first-class citizens. We first show that existing link prediction schemes fail to effectively predict motifs. To alleviate this, we establish a general motif prediction problem and we propose several heuristics that assess the chances for a specified motif to appear. To make the scores realistic, our heuristics consider - among others - correlations between links, i.e., the potential impact of some arriving links on the appearance of other links in a given motif. Finally, for highest accuracy, we develop a graph neural network (GNN) architecture for motif prediction. Our architecture offers vertex features and sampling schemes that capture the rich structural properties of motifs. While our heuristics are fast and do not need any training, GNNs ensure highest accuracy of predicting motifs, both for dense (e.g., k-cliques) and for sparse ones (e.g., k-stars). We consistently outperform the best available competitor by more than 10% on average and up to 32% in area under the curve. Importantly, the advantages of our approach over schemes based on uncorrelated link prediction increase with the increasing motif size and complexity. We also successfully apply our architecture for predicting more arbitrary clusters and communities, illustrating its potential for graph mining beyond motif analysis. 
code 3
Efficient Orthogonal Multi-view Subspace Clustering Mansheng Chen, ChangDong Wang, Dong Huang, JianHuang Lai, Philip S. Yu code 3
Local Evaluation of Time Series Anomaly Detection Algorithms Alexis Huet, José Manuel Navarro, Dario Rossi code 3
Feature Overcorrelation in Deep Graph Neural Networks: A New Perspective Wei Jin, Xiaorui Liu, Yao Ma, Charu C. Aggarwal, Jiliang Tang code 3
Learned Token Pruning for Transformers Sehoon Kim, Sheng Shen, David Thorsley, Amir Gholami, Woosuk Kwon, Joseph Hassoun, Kurt Keutzer code 3
KPGT: Knowledge-Guided Pre-training of Graph Transformer for Molecular Property Prediction Han Li, Dan Zhao, Jianyang Zeng code 3
Learning Causal Effects on Hypergraphs Jing Ma, Mengting Wan, Longqi Yang, Jundong Li, Brent J. Hecht, Jaime Teevan code 3
Pre-training Enhanced Spatial-temporal Graph Neural Network for Multivariate Time Series Forecasting Zezhi Shao, Zhao Zhang, Fei Wang, Yongjun Xu code 3
GPPT: Graph Pre-training and Prompt Tuning to Generalize Graph Neural Networks Mingchen Sun, Kaixiong Zhou, Xin He, Ying Wang, Xin Wang code 3
Reinforcement Subgraph Reasoning for Fake News Detection Ruichao Yang, Xiting Wang, Yiqiao Jin, Chaozhuo Li, Jianxun Lian, Xing Xie code 3
Unsupervised Key Event Detection from Massive Text Corpora Yunyi Zhang, Fang Guo, Jiaming Shen, Jiawei Han code 3
Learning Sparse Latent Graph Representations for Anomaly Detection in Multivariate Time Series Siho Han, Simon S. Woo code 3
DuIVA: An Intelligent Voice Assistant for Hands-free and Eyes-free Voice Interaction with the Baidu Maps App Jizhou Huang, Haifeng Wang, Shiqiang Ding, Shaolei Wang code 3
A New Generation of Perspective API: Efficient Multilingual Character-level Transformers Alyssa Lees, Vinh Q. Tran, Yi Tay, Jeffrey Sorensen, Jai Prakash Gupta, Donald Metzler, Lucy Vasserman code 3
OAG-BERT: Towards a Unified Backbone Language Model for Academic Knowledge Services Xiao Liu, Da Yin, Jingnan Zheng, Xingjian Zhang, Peng Zhang, Hongxia Yang, Yuxiao Dong, Jie Tang code 3
Fed-LTD: Towards Cross-Platform Ride Hailing via Federated Learning to Dispatch Yansheng Wang, Yongxin Tong, Zimu Zhou, Ziyao Ren, Yi Xu, Guobin Wu, Weifeng Lv code 3
Graph Attention Multi-Layer Perceptron Wentao Zhang, Ziqi Yin, Zeang Sheng, Yang Li, Wen Ouyang, Xiaosen Li, Yangyu Tao, Zhi Yang, Bin Cui code 3
ItemSage: Learning Product Embeddings for Shopping Recommendations at Pinterest Paul Baltescu, Haoyu Chen, Nikil Pancha, Andrew Zhai, Jure Leskovec, Charles Rosenberg Pinterest, San Francisco, CA, USA Learned embeddings for products are an important building block for web-scale e-commerce recommendation systems. At Pinterest, we build a single set of product embeddings called ItemSage to provide relevant recommendations in all shopping use cases, including user-, image-, and search-based recommendations. This approach has led to significant improvements in engagement and conversion metrics, while reducing both infrastructure and maintenance cost. While most prior work focuses on building product embeddings from features coming from a single modality, we introduce a transformer-based architecture capable of aggregating information from both text and image modalities and show that it significantly outperforms single-modality baselines. We also utilize multi-task learning to make ItemSage optimized for several engagement types, leading to a candidate generation system that is efficient for all of the engagement objectives of the end-to-end recommendation system. Extensive offline experiments are conducted to illustrate the effectiveness of our approach, and results from online A/B experiments show substantial gains in key business metrics (up to +7% gross merchandise value/user and +11% click volume). code 2
Towards Unified Conversational Recommender Systems via Knowledge-Enhanced Prompt Learning Xiaolei Wang, Kun Zhou, JiRong Wen, Wayne Xin Zhao Renmin University of China, Beijing, China Conversational recommender systems (CRS) aim to proactively elicit user preference and recommend high-quality items through natural language conversations. Typically, a CRS consists of a recommendation module to predict preferred items for users and a conversation module to generate appropriate responses. To develop an effective CRS, it is essential to seamlessly integrate the two modules. Existing works either design semantic alignment strategies, or share knowledge resources and representations between the two modules. However, these approaches still rely on different architectures or techniques to develop the two modules, making it difficult for effective module integration. To address this problem, we propose a unified CRS model named UniCRS based on knowledge-enhanced prompt learning. Our approach unifies the recommendation and conversation subtasks into the prompt learning paradigm, and utilizes knowledge-enhanced prompts based on a fixed pre-trained language model (PLM) to fulfill both subtasks in a unified approach. In the prompt design, we include fused knowledge representations, task-specific soft tokens, and the dialogue context, which can provide sufficient contextual information to adapt the PLM for the CRS task. Besides, for the recommendation subtask, we also incorporate the generated response template as an important part of the prompt, to enhance the information interaction between the two subtasks. Extensive experiments on two public CRS datasets have demonstrated the effectiveness of our approach. Our code is publicly available at the link: https://github.com/RUCAIBox/UniCRS. 会话推荐系统(CRS)的目标是通过自然语言的对话主动地引导用户偏好,并推荐高质量的项目。通常,CRS 由一个推荐模块和一个会话模块组成,前者用于预测用户的首选项,后者用于生成适当的响应。为了开发一个有效的 CRS 系统,必须将这两个模块无缝地结合起来。现有的工作或者设计语义对齐策略,或者在两个模块之间共享知识资源和表示。然而,这些方法仍然依赖于不同的体系结构或技术来开发这两个模块,这使得有效的模块集成变得困难。针对这一问题,提出了一种基于知识增强的提示学习的统一 CRS 模型 UniCRS。该方法将推荐子任务和会话子任务统一到提示学习范式中,并利用基于固定预训练语言模型(PLM)的知识增强提示来统一实现推荐子任务和会话子任务。在提示设计中,我们包括融合知识表示、任务特定的软标记和对话上下文,它们可以提供足够的上下文信息来使 PLM 适应 CRS 任务。此外,对于推荐子任务,我们还将生成的响应模板作为提示的重要组成部分,以增强两个子任务之间的信息交互。在两个公共 CRS 数据集上的大量实验已经证明了我们方法的有效性。我们的代码可在 https://github.com/RUCAIBox/UniCRS 公开获得。 code 2
Device-cloud Collaborative Recommendation via Meta Controller Jiangchao Yao, Feng Wang, Xichen Ding, Shaohu Chen, Bo Han, Jingren Zhou, Hongxia Yang Hong Kong Baptist University, Hong Kong, China; DAMO Academy, Alibaba Group, Hangzhou, China; CMIC, Shanghai Jiao Tong University, Shanghai, China; Ant Group, Beijing, China On-device machine learning enables the lightweight deployment of recommendation models in local clients, which reduces the burden of the cloud-based recommenders and simultaneously incorporates more real-time user features. Nevertheless, the cloud-based recommendation in the industry is still very important considering its powerful model capacity and the efficient candidate generation from the billion-scale item pool. Previous attempts to integrate the merits of both paradigms mainly resort to a sequential mechanism, which builds the on-device recommender on top of the cloud-based recommendation. However, such a design is inflexible when user interests dramatically change: the on-device model is stuck by the limited item cache while the cloud-based recommendation based on the large item pool does not respond without the new refresh feedback. To overcome this issue, we propose a meta controller to dynamically manage the collaboration between the on-device recommender and the cloud-based recommender, and introduce a novel efficient sample construction from the causal perspective to solve the dataset absence issue of meta controller. On the basis of the counterfactual samples and the extended training, extensive experiments in the industrial recommendation scenarios show the promise of meta controller in the device-cloud collaboration. 设备上的机器学习支持在本地客户机中轻量级部署推荐模型,这减轻了基于云的推荐模型的负担,同时包含了更多的实时用户特性。尽管如此,考虑到其强大的模型容量和从数十亿规模的项目库中有效地生成候选项,基于云的推荐在业界仍然非常重要。以前整合这两种模式优点的尝试主要依赖于一种顺序机制,这种机制在基于云的推荐之上构建设备上的推荐。然而,当用户的兴趣发生巨大变化时,这样的设计是不灵活的: 设备上的模型被有限的项目缓存卡住了,而基于大型项目池的基于云的推荐在没有新的更新反馈的情况下不会响应。针对这一问题,本文提出了一种元控制器来动态管理设备上的推荐器和基于云的推荐器之间的协作,并从因果关系的角度提出了一种新颖有效的样本构造方法来解决元控制器数据集缺失的问题。在反事实样本和扩展训练的基础上,在工业推荐场景中的大量实验显示了元控制器在设备-云协作中的应用前景。 code 2
MolSearch: Search-based Multi-objective Molecular Generation and Property Optimization Mengying Sun, Jing Xing, Han Meng, Huijun Wang, Bin Chen, Jiayu Zhou Michigan State University, Grand Rapids, MI, USA; Michigan State University, East Lansing, MI, USA; Agios Pharmaceuticals, Cambridge, MA, USA Leveraging computational methods to generate small molecules with desired properties has been an active research area in the drug discovery field. Towards real-world applications, however, efficient generation of molecules that satisfy multiple property requirements simultaneously remains a key challenge. In this paper, we tackle this challenge using a search-based approach and propose a simple yet effective framework called MolSearch for multi-objective molecular generation (optimization).We show that given proper design and sufficient domain information, search-based methods can achieve performance comparable or even better than deep learning methods while being computationally efficient. Such efficiency enables massive exploration of chemical space given constrained computational resources. In particular, MolSearch starts with existing molecules and uses a two-stage search strategy to gradually modify them into new ones, based on transformation rules derived systematically and exhaustively from large compound libraries. We evaluate MolSearch in multiple benchmark generation settings and demonstrate its effectiveness and efficiency. 利用计算方法生成具有期望特性的小分子已经成为药物发现领域的一个活跃的研究领域。然而,对于实际应用来说,同时满足多种性能要求的高效生产分子仍然是一个关键的挑战。在本文中,我们使用基于搜索的方法来解决这一挑战,并提出了一个简单而有效的框架,称为 MolSearch 的多目标分子生成(优化)。我们表明,给定适当的设计和充分的领域信息,基于搜索的方法可以实现性能可比,甚至比深度学习方法更好,同时计算效率。这样的效率使得在计算资源有限的情况下对化学空间进行大规模探索成为可能。特别是,MolSearch 从现有的分子开始,使用一个两阶段的搜索策略,逐渐修改成新的,基于转换规则,系统地和详尽地从大型化合物库。我们评估了 MolSearch 在多个基准测试生成环境中的性能,并证明了它的有效性和效率。 code 2
Invariant Preference Learning for General Debiasing in Recommendation Zimu Wang, Yue He, Jiashuo Liu, Wenchao Zou, Philip S. Yu, Peng Cui University of Illinois at Chicago, Chicago, IL, USA; Siemens China, Shanghai, China; Tsinghua University, Beijing, China Current recommender systems have achieved great successes in online services, such as E-commerce and social media. However, they still suffer from the performance degradation in real scenarios, because various biases always occur in the generation process of user behaviors. Despite the recent development of addressing some specific type of bias, a variety of data bias, some of which are even unknown, are often mixed up in real applications. Although the uniform (or unbiased) data may help for the purpose of general debiasing, such data can either be hardly available or induce high experimental cost. In this paper, we consider a more practical setting where we aim to conduct general debiasing with the biased observational data alone. We assume that the observational user behaviors are determined by invariant preference (i.e. a user's true preference) and the variant preference (affected by some unobserved confounders). We propose a novel recommendation framework called InvPref which iteratively decomposes the invariant preference and variant preference from biased observational user behaviors by estimating heterogeneous environments corresponding to different types of latent bias. Extensive experiments, including the settings of general debiasing and specific debiasing, verify the advantages of our method. 现有的推荐系统在电子商务和社交媒体等在线服务领域取得了巨大的成功。然而,在实际场景中,它们仍然会受到性能下降的影响,因为在用户行为的生成过程中总是会出现各种偏差。尽管最近的发展解决一些特定类型的偏差,各种各样的数据偏差,其中一些甚至是未知的,往往是混合在实际应用。虽然统一(或无偏)的数据可能有助于一般的去偏目的,这样的数据可能难以获得或诱导高实验成本。在本文中,我们考虑一个更实际的设置,其中我们的目的是进行一般的去偏与有偏的观测数据单独。我们假设观察用户行为是由不变偏好(即用户的真实偏好)和变异偏好(受一些未观察到的混杂因素的影响)决定的。提出了一种新的推荐框架 InvPref,该框架通过估计不同类型潜在偏差对应的异质环境,迭代分解有偏差的观察用户行为的不变偏好和变异偏好。广泛的实验,包括一般消偏和具体消偏的设置,验证了我们的方法的优点。 code 2
Automatic Controllable Product Copywriting for E-Commerce Xiaojie Guo, Qingkai Zeng, Meng Jiang, Yun Xiao, Bo Long, Lingfei Wu JD.COM, Beijing, China; JD.COM Silicon Valley Research Center, Mountain View, CA, USA; University of Notre Dame, Notre Dame, IN, USA Automatic product description generation for e-commerce has witnessed significant advancement in the past decade. Product copywriting aims to attract users' interest and improve user experience by highlighting product characteristics with textual descriptions. As the services provided by e-commerce platforms become diverse, it is necessary to adapt the patterns of automatically-generated descriptions dynamically. In this paper, we report our experience in deploying an E-commerce Prefix-based Controllable Copywriting Generation (EPCCG) system into the JD.com e-commerce product recommendation platform. The development of the system contains four main components: 1) copywriting aspect extraction; 2) weakly supervised aspect labelling; 3) text generation with a prefix-based language model; and 4) copywriting quality control. We conduct experiments to validate the effectiveness of the proposed EPCCG. In addition, we introduce the deployed architecture which cooperates the EPCCG into the real-time JD.com e-commerce recommendation platform and the significant payoff since deployment. The codes for implementation are provided at https://github.com/xguo7/Automatic-Controllable-Product-Copywriting-for-E-Commerce.git. 电子商务中的产品描述自动生成技术在过去的十年中取得了长足的进步。产品文案的目的是通过文字描述突出产品特征,吸引用户的兴趣,提高用户体验。随着电子商务平台提供的服务变得多样化,有必要动态调整自动生成描述的模式。在本文中,我们报告了在京东电子商务产品推荐平台上部署基于前缀的可控文案生成(EPCCG)系统的经验。该系统的开发包括四个主要组成部分: 1)文案方面提取; 2)弱监督方面标注; 3)基于前缀语言模型的文本生成; 4)文案质量控制。我们进行了实验,以验证所提出的 EPCCG 的有效性。此外,我们还介绍了将 EPCCG 集成到实时 JD.com 电子商务推荐平台的已部署体系结构,以及部署以来取得的显著收益。实施守则载于 https://github.com/xguo7/automatic-controllable-product-copywriting-for-e-commerce.git。 code 2
Multi-task Hierarchical Classification for Disk Failure Prediction in Online Service Systems Yudong Liu, Hailan Yang, Pu Zhao, Minghua Ma, Chengwu Wen, Hongyu Zhang, Chuan Luo, Qingwei Lin, Chang Yi, Jiaojian Wang, Chenjian Zhang, Paul Wang, Yingnong Dang, Saravan Rajmohan, Dongmei Zhang code 2
EGM: Enhanced Graph-based Model for Large-scale Video Advertisement Search Tan Yu, Jie Liu, Yi Yang, Yi Li, Hongliang Fei, Ping Li code 2
Saliency-Regularized Deep Multi-Task Learning Guangji Bai, Liang Zhao Emory University, Atlanta, GA, USA Multi-task learning (MTL) is a framework that enforces multiple learning tasks to share their knowledge to improve their generalization abilities. While shallow multi-task learning can learn task relations, it can only handle pre-defined features. Modern deep multi-task learning can jointly learn latent features and task sharing, but they are obscure in task relation. Also, they pre-define which layers and neurons should share across tasks and cannot learn adaptively. To address these challenges, this paper proposes a new multi-task learning framework that jointly learns latent features and explicit task relations by complementing the strength of existing shallow and deep multitask learning scenarios. Specifically, we propose to model the task relation as the similarity between tasks' input gradients, with a theoretical analysis of their equivalency. In addition, we innovatively propose a multi-task learning objective that explicitly learns task relations by a new regularizer. Theoretical analysis shows that the generalizability error has been reduced thanks to the proposed regularizer. Extensive experiments on several multi-task learning and image classification benchmarks demonstrate the proposed method's effectiveness, efficiency as well as reasonableness in the learned task relation patterns. 多任务学习(MTL)是一种强制多任务共享知识以提高其泛化能力的学习框架。浅层多任务学习虽然可以学习任务关系,但只能处理预定义的特征。现代深度多任务学习可以联合学习任务的潜在特征和任务共享,但在任务关系方面较为模糊。此外,他们预先定义了哪些层和神经元应该跨任务共享,而不能自适应地学习。针对这些挑战,本文提出了一种新的多任务学习框架,通过补充现有浅层和深层多任务学习场景的优势,联合学习潜在特征和显性任务关系。具体来说,我们提出将任务关系建模为任务输入梯度之间的相似性,并对其等效性进行了理论分析。此外,我们创新地提出了一个多任务学习目标,通过一个新的正则化器显式学习任务关系。理论分析表明,该正则化器可以减小泛化误差。通过对多个多任务学习和图像分类基准的大量实验,证明了该方法在学习任务关系模式方面的有效性、高效性和合理性。 code 2
Discovering Significant Patterns under Sequential False Discovery Control Sebastian Dalleiger, Jilles Vreeken code 2
Connecting Low-Loss Subspace for Personalized Federated Learning SeokJu Hahn, Minwoo Jeong, Junghye Lee code 2
Dual-Geometric Space Embedding Model for Two-View Knowledge Graphs Roshni G. Iyer, Yunsheng Bai, Wei Wang, Yizhou Sun code 2
Rep2Vec: Repository Embedding via Heterogeneous Graph Adversarial Contrastive Learning Yiyue Qian, Yiming Zhang, Qianlong Wen, Yanfang Ye, Chuxu Zhang code 2
Multi-Agent Graph Convolutional Reinforcement Learning for Dynamic Electric Vehicle Charging Pricing Weijia Zhang, Hao Liu, Jindong Han, Yong Ge, Hui Xiong code 2
Learning Backward Compatible Embeddings Weihua Hu, Rajas Bansal, Kaidi Cao, Nikhil Rao, Karthik Subbian, Jure Leskovec code 2
EdgeWatch: Collaborative Investigation of Data Integrity at the Edge based on Blockchain Bo Li, Qiang He, Liang Yuan, Feifei Chen, Lingjuan Lyu, Yun Yang code 2
Persia: An Open, Hybrid System Scaling Deep Learning-based Recommenders up to 100 Trillion Parameters Xiangru Lian, Binhang Yuan, Xuefeng Zhu, Yulong Wang, Yongjun He, Honghuan Wu, Lei Sun, Haodong Lyu, Chengjun Liu, Xing Dong, Yiqiao Liao, Mingnan Luo, Congfei Zhang, Jingru Xie, Haonan Li, Lei Chen, Renjie Huang, Jianying Lin, Chengchun Shu, Xuezhong Qiu, Zhishan Liu, Dongying Kong, Lei Yuan, Hai Yu, Sen Yang, Ce Zhang, Ji Liu code 2
HiPAL: A Deep Framework for Physician Burnout Prediction Using Activity Logs in Electronic Health Records Hanyang Liu, Sunny S. Lou, Benjamin C. Warner, Derek R. Harford, Thomas George Kannampallil, Chenyang Lu code 2
Multiwave COVID-19 Prediction from Social Awareness Using Web Search and Mobility Data Jiawei Xue, Takahiro Yabe, Kota Tsubouchi, Jianzhu Ma, Satish V. Ukkusuri code 2
Make Fairness More Fair: Fair Item Utility Estimation and Exposure Re-Distribution Jiayin Wang, Weizhi Ma, Jiayu Li, Hongyu Lu, Min Zhang, Biao Li, Yiqun Liu, Peng Jiang, Shaoping Ma code 2
Adaptive Model Pooling for Online Deep Anomaly Detection from a Complex Evolving Data Stream Susik Yoon, Youngjun Lee, JaeGil Lee, Byung Suk Lee code 2
Automatically Discovering User Consumption Intents in Meituan Yinfeng Li, Chen Gao, Xiaoyi Du, Huazhou Wei, Hengliang Luo, Depeng Jin, Yong Li code 2
Streaming Hierarchical Clustering Based on Point-Set Kernel Xin Han, Ye Zhu, Kai Ming Ting, DeChuan Zhan, Gang Li code 2
Compressing Deep Graph Neural Networks via Adversarial Knowledge Distillation Huarui He, Jie Wang, Zhanqiu Zhang, Feng Wu code 2
UD-GNN: Uncertainty-aware Debiased Training on Semi-Homophilous Graphs Yang Liu, Xiang Ao, Fuli Feng, Qing He code 2
Interpreting Trajectories from Multiple Views: A Hierarchical Self-Attention Network for Estimating the Time of Arrival Zebin Chen, Xiaolin Xiao, YueJiao Gong, Jun Fang, Nan Ma, Hua Chai, Zhiguang Cao code 2
Causal Inference-Based Root Cause Analysis for Online Service Systems with Intervention Recognition Mingjie Li, Zeyan Li, Kanglin Yin, Xiaohui Nie, Wenchi Zhang, Kaixin Sui, Dan Pei code 2
RES: A Robust Framework for Guiding Visual Explanation Yuyang Gao, Tong Steven Sun, Guangji Bai, Siyi Gu, Sungsoo Ray Hong, Liang Zhao code 2
ProActive: Self-Attentive Temporal Point Process Flows for Activity Sequences Vinayak Gupta, Srikanta Bedathur code 2
Communication-Efficient Robust Federated Learning with Noisy Labels Junyi Li, Jian Pei, Heng Huang code 2
Reliable Representations Make A Stronger Defender: Unsupervised Structure Refinement for Robust GNN Kuan Li, Yang Liu, Xiang Ao, Jianfeng Chi, Jinghua Feng, Hao Yang, Qing He code 2
Mask and Reason: Pre-Training Knowledge Graph Transformers for Complex Logical Queries Xiao Liu, Shiyu Zhao, Kai Su, Yukuo Cen, Jiezhong Qiu, Mengdi Zhang, Wei Wu, Yuxiao Dong, Jie Tang code 2
Learning on Graphs with Out-of-Distribution Nodes Yu Song, Donglin Wang code 2
Causal Attention for Interpretable and Generalizable Graph Classification Yongduo Sui, Xiang Wang, Jiancan Wu, Min Lin, Xiangnan He, TatSeng Chua code 2
Graph Neural Networks with Node-wise Architecture Zhen Wang, Zhewei Wei, Yaliang Li, Weirui Kuang, Bolin Ding code 2
CLARE: A Semi-supervised Community Detection Algorithm Xixi Wu, Yun Xiong, Yao Zhang, Yizhu Jiao, Caihua Shan, Yiheng Sun, Yangyong Zhu, Philip S. Yu code 2
Learning the Evolutionary and Multi-scale Graph Structure for Multivariate Time Series Forecasting Junchen Ye, Zihan Liu, Bowen Du, Leilei Sun, Weimiao Li, Yanjie Fu, Hui Xiong code 2
M3Care: Learning with Missing Modalities in Multimodal Healthcare Data Chaohe Zhang, Xu Chu, Liantao Ma, Yinghao Zhu, Yasha Wang, Jiangtao Wang, Junfeng Zhao code 2
SAMCNet: Towards a Spatially Explainable AI Approach for Classifying MxIF Oncology Data Majid Farhadloo, Carl Molnar, Gaoxiang Luo, Yan Li, Shashi Shekhar, Rachel L. Maus, Svetomir N. Markovic, Alexey A. Leontovich, Raymond Moore code 2
No One Left Behind: Inclusive Federated Learning over Heterogeneous Devices Ruixuan Liu, Fangzhao Wu, Chuhan Wu, Yanlin Wang, Lingjuan Lyu, Hong Chen, Xing Xie code 2
What Makes Good Contrastive Learning on Small-Scale Wearable-based Tasks? Hangwei Qian, Tian Tian, Chunyan Miao code 2
Shallow and Deep Non-IID Learning on Complex Data Longbing Cao, Philip S. Yu, Zhilin Zhao code 2
Gradual AutoML using Lale Martin Hirzel, Kiran Kate, Parikshit Ram, Avraham Shinnar, Jason Tsay code 2
Robust Time Series Analysis and Applications: An Industrial Perspective Qingsong Wen, Linxiao Yang, Tian Zhou, Liang Sun code 2
PECOS: Prediction for Enormous and Correlated Output Spaces HsiangFu Yu, Jiong Zhang, WeiCheng Chang, JyunYu Jiang, Wei Li, ChoJui Hsieh code 2
Extracting Relevant Information from User's Utterances in Conversational Search and Recommendation Ali Montazeralghaem, James Allan University of Massachusetts Amherst, Amherst, MA, USA Conversational search and recommendation systems can ask clarifying questions through the conversation and collect valuable information from users. However, an important question remains: how can we extract relevant information from the user's utterances and use it in the retrieval or recommendation in the next turn of the conversation? Utilizing relevant information from users' utterances leads the system to better results at the end of the conversation. In this paper, we propose a model based on reinforcement learning, namely RelInCo, which takes the user's utterances and the context of the conversation and classifies each word in the user's utterances as belonging to the relevant or non-relevant class. RelInCo uses two Actors: 1) Arrangement-Actor, which finds the most relevant order of words in user's utterances, and 2) Selector-Actor, which determines which words, in the order provided by the arrangement Actor, can bring the system closer to the target of the conversation. In this way, we can find relevant information in the user's utterance and use it in the conversation. The objective function in our model is designed in such a way that it can maximize any desired retrieval and recommendation metrics (i.e., the ultimate 会话搜索和推荐系统可以通过会话提出澄清问题,并从用户那里收集有价值的信息。然而,一个重要的问题仍然存在: 我们如何从用户的话语中提取相关信息,并将其用于下一轮对话中的检索或推荐?利用用户话语中的相关信息,可以使系统在对话结束时获得更好的结果。在本文中,我们提出了一个基于强化学习的模型,即 RelInCo,该模型根据用户的话语和对话的上下文,将用户话语中的每个单词归类为相关或非相关类别。RelInCo 使用了两个参与者: 1)安排-参与者,它找到用户话语中最相关的词语顺序; 2)选择-参与者,它根据安排-参与者提供的顺序决定哪些词语可以使系统更接近对话的目标。通过这种方式,我们可以在用户的话语中找到相关信息,并在对话中加以利用。我们模型中的目标函数是这样设计的,它可以最大化任何所需的检索和推荐指标(即,最终的 code 1
Uni-Retriever: Towards Learning the Unified Embedding Based Retriever in Bing Sponsored Search Jianjin Zhang, Zheng Liu, Weihao Han, Shitao Xiao, Ruicheng Zheng, Yingxia Shao, Hao Sun, Hanqing Zhu, Premkumar Srinivasan, Weiwei Deng, Qi Zhang, Xing Xie Microsoft, Beijing, China; Beijing University of Posts and Telecommunications, Beijing, China; Microsoft, Newark, NJ, USA; Microsoft, Seattle, DC, USA Embedding based retrieval (EBR) is a fundamental building block in many web applications. However, EBR in sponsored search is distinguished from other generic scenarios and technically challenging due to the need of serving multiple retrieval purposes: firstly, it has to retrieve high-relevance ads, which may exactly serve user's search intent; secondly, it needs to retrieve high-CTR ads so as to maximize the overall user clicks. In this paper, we present a novel representation learning framework Uni-Retriever developed for Bing Search, which unifies two different training modes knowledge distillation and contrastive learning to realize both required objectives. On one hand, the capability of making high-relevance retrieval is established by distilling knowledge from the "relevance teacher model''. On the other hand, the capability of making high-CTR retrieval is optimized by learning to discriminate user's clicked ads from the entire corpus. The two training modes are jointly performed as a multi-objective learning process, such that the ads of high relevance and CTR can be favored by the generated embeddings. Besides the learning strategy, we also elaborate our solution for EBR serving pipeline built upon the substantially optimized DiskANN, where massive-scale EBR can be performed with competitive time and memory efficiency, and accomplished in high-quality. We make comprehensive offline and online experiments to evaluate the proposed techniques, whose findings may provide useful insights for the future development of EBR systems. Uni-Retriever has been mainstreamed as the major retrieval path in Bing's production thanks to the notable improvements on the representation and EBR serving quality. 基于嵌入的检索(EBR)是许多 Web 应用程序的基础构件。然而,由于需要服务于多种检索目的,赞助商搜索中的 EBR 不同于其他一般情况,在技术上具有挑战性: 首先,它必须检索高相关度的广告,这可能恰好服务于用户的搜索意图; 其次,它需要检索高点击率的广告,以最大限度地提高用户的总体点击率。本文提出了一种新的面向 Bing 搜索的 Uni-Retriever 表示学习框架,该框架将知识蒸馏和对比学习这两种不同的训练模式相结合,以同时实现这两个目标。一方面,从“关联教师模型”中提取知识,建立高关联检索能力。另一方面,通过学习从整个语料库中区分用户点击广告,优化了高点击率检索的能力。这两种训练模式作为一个多目标学习过程共同执行,使得生成的嵌入更偏好高相关度和高点击率的广告。除了学习策略,我们还详细阐述了基于大幅优化的 DiskANN 构建的 EBR 服务流水线解决方案,使得大规模 EBR 能够以有竞争力的时间和内存效率高质量地完成。我们进行了全面的离线和在线实验来评估所提出的技术,其结果可能为未来 EBR 系统的发展提供有用的见解。由于在表示和 EBR 服务质量方面的显著改进,Uni-Retriever 已成为必应生产环境中的主要检索路径。 code 1
An Online Multi-task Learning Framework for Google Feed Ads Auction Models Ning Ma, Mustafa Ispir, Yuan Li, Yongpeng Yang, Zhe Chen, Derek Zhiyuan Cheng, Lan Nie, Kishor Barman Google Inc., Mountain View, CA, USA In this paper, we introduce a large scale online multi-task deep learning framework for modeling multiple feed ads auction prediction tasks on an industry-scale feed ads recommendation platform. Multiple prediction tasks are combined into one single model which is continuously trained on real time new ads data. Multi-tasking ads auction models in real-time faces many real-world challenges. For example, each task may be trained on different set of training data; the labels of different tasks may have different arrival time due to label delay; different tasks will interact with each other; combining the losses of each task is non-trivial. We tackle these challenges using practical and novel techniques such as multi-stage training for handling label delay, Multi-gate Mixture-of-Experts (MMoE) to optimize model interaction and an auto-parameter learning algorithm to optimize the loss weights of different tasks. We demonstrate that our proposed techniques can lead to quality improvements and substantial resource saving compared to modeling each single task independently. 本文介绍了一个大规模的在线多任务深度学习框架,用于在行业规模的信息流广告推荐平台上对多个信息流广告拍卖预测任务进行建模。将多个预测任务组合成一个单独的模型,对实时的新广告数据进行连续的训练。实时的多任务广告拍卖建模面临着许多现实挑战。例如,每个任务可以在不同的训练数据集上进行训练; 由于标签延迟,不同任务的标签可能有不同的到达时间; 不同的任务将相互作用; 组合各任务的损失并非易事。针对这些问题,我们采用了多阶段训练来处理标签延迟,多门专家混合(MMoE)来优化模型交互,以及自动参数学习算法来优化不同任务的损失权重。我们证明,与独立建模每个单独的任务相比,我们提出的技术可以导致质量改进和大量资源节省。 code 1
NxtPost: User To Post Recommendations In Facebook Groups Kaushik Rangadurai, Yiqun Liu, Siddarth Malreddy, Xiaoyi Liu, Piyush Maheshwari, Vishwanath Sangale, Fedor Borisyuk Meta Platforms Inc., Menlo Park, CA, USA In this paper, we present NxtPost, a deployed user-to-post content based sequential recommender system for Facebook Groups. Inspired by recent advances in NLP, we have adapted a Transformer based model to the domain of sequential recommendation. We explore causal masked multi-head attention that optimizes both short and long-term user interests. From a user's past activities validated by defined safety process, NxtPost seeks to learn a representation for the user's dynamic content preference and to predict the next post user may be interested in. In contrast to previous Transformer based methods, we do not assume that the recommendable posts have a fixed corpus. Accordingly, we use an external item/token embedding to extend a sequence-based approach to a large vocabulary. We achieve 49% abs. improvement in offline evaluation. As a result of NxtPost deployment, 0.6% more users are meeting new people, engaging with the community, sharing knowledge and getting support. The paper shares our experience in developing a personalized sequential recommender system, lessons deploying the model for cold start users, how to deal with freshness, and tuning strategies to reach higher efficiency in online A/B experiments. 在本文中,我们介绍了 NxtPost,这是一个为 Facebook group 部署的基于用户到发布内容的顺序推荐系统。受自然语言处理最新进展的启发,我们将一个基于 Transformer 的模型应用于顺序推荐领域。我们探索因果掩盖多头注意,优化短期和长期用户的兴趣。通过定义的安全过程验证用户过去的活动,NxtPost 试图学习用户动态内容偏好的表示,并预测下一个帖子用户可能感兴趣的内容。与以前基于 Transformer 的方法相比,我们不假定推荐的帖子具有固定的语料库。因此,我们使用外部项/令牌嵌入来将基于序列的方法扩展到大型词汇表。我们在离线评估中取得了49% 的绝对提升。作为 NxtPost 部署的结果,0.6% 的用户正在结识新朋友,参与社区活动,分享知识并获得支持。本文分享了我们在开发个性化连续推荐系统的经验、为冷启动用户部署模型的教训、如何处理新鲜感,以及在线 A/B 实验中为提高效率而调整策略的经验。 code 1
ReprBERT: Distilling BERT to an Efficient Representation-Based Relevance Model for E-Commerce Shaowei Yao, Jiwei Tan, Xi Chen, Juhao Zhang, Xiaoyi Zeng, Keping Yang Alibaba Group, Hangzhou, China Text relevance or text matching of query and product is an essential technique for e-commerce search engine, which helps users find the desirable products and is also crucial to ensuring user experience. A major difficulty for e-commerce text relevance is the severe vocabulary gap between query and product. Recently, neural networks have been the mainstream for the text matching task owing to the better performance for semantic matching. Practical e-commerce relevance models are usually representation-based architecture, which can pre-compute representations offline and are therefore online efficient. Interaction-based models, although can achieve better performance, are mostly time-consuming and hard to be deployed online. Recently BERT has achieved significant progress on many NLP tasks including text matching, and it is of great value but also big challenge to deploy BERT to the e-commerce relevance task. To realize this goal, we propose ReprBERT, which has the advantages of both excellent performance and low latency, by distilling the interaction-based BERT model to a representation-based architecture. To reduce the performance decline, we investigate the key reasons and propose two novel interaction strategies to resolve the absence of representation interaction and low-level semantic interaction. Finally, ReprBERT can achieve only about 1.5% AUC loss from the interaction-based BERT, but has more than 10% AUC improvement compared to previous state-of-the-art representation-based models. ReprBERT has already been deployed on the search engine of Taobao and serving the entire search traffic, achieving significant gain of user experience and business profit. 查询和产品的文本相关性或文本匹配是电子商务搜索引擎的关键技术,它可以帮助用户找到想要的产品,也是保证用户体验的关键。电子商务文本相关性的一个主要困难是查询和产品之间严重的词汇差距。近年来,神经网络以其较好的语义匹配性能成为文本匹配的主流。实用的电子商务相关性模型通常是基于表示的体系结构,它可以离线预先计算表示,因此具有在线效率。基于交互的模型,尽管可以获得更好的性能,但是大部分都是耗时的,并且很难在线部署。近年来,BERT 在包括文本匹配在内的许多自然语言处理任务中取得了显著的进展,将 BERT 部署到电子商务相关任务中具有很大的价值,但也面临很大的挑战。为了实现这一目标,我们提出了 ReprBERT,它具有良好的性能和低延迟的优点,通过将基于交互的 BERT 模型蒸馏到基于表示的体系结构中。为了减少性能下降,我们研究了其关键原因,并提出了两种新颖的交互策略,以解决表示交互和低层语义交互缺失的问题。最后,与基于交互的 BERT 相比,ReprBERT 仅有约1.5% 的 AUC 损失,而与以前最先进的基于表示的模型相比,则有超过10% 的 AUC 提升。ReprBERT 已经部署在淘宝的搜索引擎上,服务于整个搜索流量,取得了显著的用户体验和商业利润收益。 code 1
Learning Supplementary NLP Features for CTR Prediction in Sponsored Search Dong Wang, Shaoguang Yan, Yunqing Xia, Kavé Salamatian, Weiwei Deng, Qi Zhang University of Savoie & Tallinn University of Technology, Annecy, France; Microsoft Corporation, Beijing, China In sponsored search engines, pre-trained language models have shown promising performance improvements on Click-Through-Rate (CTR) prediction. A widely used approach for utilizing pre-trained language models in CTR prediction consists of fine-tuning the language models with click labels and early stopping on peak value of the obtained Area Under the ROC Curve (AUC). Thereafter the output of these fine-tuned models, i.e., the final score or intermediate embedding generated by language model, is used as a new Natural Language Processing (NLP) feature into CTR prediction baseline. This cascade approach avoids complicating the CTR prediction baseline, while keeping flexibility and agility. However, we show in this work that calibrating separately the language model based on the peak single model AUC does not always yield NLP features that give the best performance in CTR prediction model ultimately. Our analysis reveals that the misalignment is due to overlap and redundancy between the new NLP features and the existing features in CTR prediction baseline. In other words, the NLP features can improve CTR prediction better if such overlap can be reduced. For this purpose, we introduce a simple and general joint-training framework for fine-tuning of language models, combined with the already existing features in CTR prediction baseline, to extract supplementary knowledge for NLP feature. Moreover, we develop an efficient Supplementary Knowledge Distillation (SuKD) that transfers the supplementary knowledge learned by a heavy language model to a light and serviceable model. Comprehensive experiments on both public data and commercial data presented in this work demonstrate that the new NLP features resulting from the joint-training framework can outperform significantly the ones from the independent fine-tuning based on click labels. We also show that the light model distilled with SuKD can provide obvious AUC improvement in CTR prediction over the traditional feature-based knowledge distillation. 在赞助商搜索引擎中,预先训练好的语言模型在点击率(Click-Through-Rate,CTR)预测方面显示出有希望的性能改进。在点击率预测中使用预训练语言模型的一种常见做法是: 用点击标签对语言模型进行微调,并在所得 ROC 曲线下面积(AUC)达到峰值时提前停止。然后,这些微调模型的输出,即语言模型生成的最终分数或中间嵌入,被用作 CTR 预测基线的一个新的自然语言处理(NLP)特征。这种级联方法避免了使 CTR 预测基线复杂化,同时保持了灵活性和敏捷性。然而,我们的工作表明,基于单模型 AUC 峰值单独校准语言模型,并不总能产生最终使 CTR 预测模型性能最佳的 NLP 特征。我们的分析表明,这种不一致源于新的 NLP 特征与 CTR 预测基线中现有特征之间的重叠和冗余。换句话说,如果能够减少这种重叠,NLP 特征能够更好地提高 CTR 预测。为此,本文提出了一种简单通用的语言模型微调联合训练框架,结合 CTR 预测基线中已有的特征,提取 NLP 特征的补充知识。此外,我们开发了一种有效的补充知识蒸馏(SuKD)方法,将重量级语言模型学到的补充知识迁移到一个轻量可用的模型中。对公共数据和商业数据的综合实验表明,联合训练框架所产生的新的自然语言处理特征可以显著优于基于点击标签的独立微调。与传统的基于特征的知识蒸馏相比,用 SuKD 蒸馏得到的轻量模型在 CTR 预测方面可以提供明显的 AUC 改进。 code 1
AutoShard: Automated Embedding Table Sharding for Recommender Systems Daochen Zha, Louis Feng, Bhargav Bhushanam, Dhruv Choudhary, Jade Nie, Yuandong Tian, Jay Chae, Yinbin Ma, Arun Kejariwal, Xia Hu Meta Platforms, Inc., Menlo Park, CA, USA; Rice University, Houston, TX, USA Embedding learning is an important technique in deep recommendation models to map categorical features to dense vectors. However, the embedding tables often demand an extremely large number of parameters, which become the storage and efficiency bottlenecks. Distributed training solutions have been adopted to partition the embedding tables into multiple devices. However, the embedding tables can easily lead to imbalances if not carefully partitioned. This is a significant design challenge of distributed systems named embedding table sharding, i.e., how we should partition the embedding tables to balance the costs across devices, which is a non-trivial task because 1) it is hard to efficiently and precisely measure the cost, and 2) the partition problem is known to be NP-hard. In this work, we introduce our novel practice in Meta, namely AutoShard, which uses a neural cost model to directly predict the multi-table costs and leverages deep reinforcement learning to solve the partition problem. Experimental results on an open-sourced large-scale synthetic dataset and Meta's production dataset demonstrate the superiority of AutoShard over the heuristics. Moreover, the learned policy of AutoShard can transfer to sharding tasks with various numbers of tables and different ratios of the unseen tables without any fine-tuning. Furthermore, AutoShard can efficiently shard hundreds of tables in seconds. The effectiveness, transferability, and efficiency of AutoShard make it desirable for production use. Our algorithms have been deployed in Meta production environment. A prototype is available at https://github.com/daochenzha/autoshard 嵌入学习是深度推荐模型中将分类特征映射到密集向量的一项重要技术。然而,嵌入表往往需要大量的参数,成为存储和效率的瓶颈。采用分布式训练解决方案将嵌入表划分为多个设备。然而,如果不仔细分区,嵌入表很容易导致不平衡。这是分布式系统嵌入表分片的一个重大设计挑战,即我们应该如何划分嵌入表来平衡设备之间的成本,这并非易事,因为1)很难有效和精确地度量成本,2)划分问题是已知的 NP 难题。在这项工作中,我们介绍了我们在 Meta 中的新实践,即 AutoShard,它使用一个神经成本模型来直接预测多表成本,并利用深度强化学习来解决分区问题。在一个开源的大规模合成数据集和 Meta 生产数据集上的实验结果证明了 AutoShard 相对于启发式算法的优越性。此外,AutoShard 的学习策略可以迁移到具有不同表数量和不同未见表比例的分片任务,而无需任何微调。此外,AutoShard 可以在几秒钟内高效地切分数百个表。AutoShard 的有效性、可迁移性和效率使其适合生产使用。我们的算法已经部署在 Meta 生产环境中。https://github.com/daochenzha/autoshard 上有一个原型。 code 1
On-Device Learning for Model Personalization with Large-Scale Cloud-Coordinated Domain Adaption Yikai Yan, Chaoyue Niu, Renjie Gu, Fan Wu, Shaojie Tang, Lifeng Hua, Chengfei Lyu, Guihai Chen University of Texas at Dallas, Richardson, TX, USA; Alibaba Group, Hangzhou, China; Shanghai Jiao Tong University, Shanghai, China Cloud-based learning is currently the mainstream in both academia and industry. However, the global data distribution, as a mixture of all the users' data distributions, for training a global model may deviate from each user's local distribution for inference, making the global model non-optimal for each individual user. To mitigate distribution discrepancy, on-device training over local data for model personalization is a potential solution, but suffers from serious overfitting. In this work, we propose a new device-cloud collaborative learning framework under the paradigm of domain adaption, called MPDA, to break the dilemmas of purely cloud-based learning and on-device training. From the perspective of a certain user, the general idea of MPDA is to retrieve some similar data from the cloud's global pool, which functions as large-scale source domains, to augment the user's local data as the target domain. The key principle of choosing which outside data depends on whether the model trained over these data can generalize well over the local data. We theoretically analyze that MPDA can reduce distribution discrepancy and overfitting risk. We also extensively evaluate over the public MovieLens 20M and Amazon Electronics datasets, as well as an industrial dataset collected from Mobile Taobao over a period of 30 days. We finally build a device-tunnel-cloud system pipeline, deploy MPDA in the icon area of Mobile Taobao for click-through rate prediction, and conduct online A/B testing. Both offline and online results demonstrate that MPDA outperforms the baselines of cloud-based learning and on-device training only over local data, from multiple offline and online metrics. 基于云的学习是目前学术界和工业界的主流。然而,全局数据分布作为所有用户数据分布的混合,用于训练全局模型可能偏离每个用户的局部分布进行推理,使得全局模型对于每个用户不是最优的。为了缓解分布差异,对模型个性化的本地数据进行设备上的训练是一个潜在的解决方案,但是存在严重的过拟合问题。在这项工作中,我们提出了一个新的设备-云计算合作学习框架,在领域适应的范例下称为 MPDA,以打破纯粹基于云的学习和设备上培训的困境。从某个用户的角度来看,MPDA 的总体思想是从作为大规模源域的云的全局池中检索一些类似的数据,以增加用户的本地数据作为目标域。选择哪些外部数据的关键原则取决于对这些数据进行训练的模型是否能够比本地数据更好地推广。从理论上分析了 MPDA 可以降低分布差异和过拟合风险。我们还广泛评估了公开的 MovieLens 20M 和亚马逊电子数据集,以及在30天内从移动淘宝收集的工业数据集。最后,我们建立了设备-隧道-云系统流水线,在移动淘宝的图标区域部署 MPDA 进行点进率预测,并进行在线 A/B 测试。离线和在线结果都表明,MPDA 仅在多个离线和在线指标的本地数据上优于基于云的学习和设备上培训的基线。 code 1
Debiasing Learning for Membership Inference Attacks Against Recommender Systems Zihan Wang, Na Huang, Fei Sun, Pengjie Ren, Zhumin Chen, Hengliang Luo, Maarten de Rijke, Zhaochun Ren Meituan, Beijing, China; University of Amsterdam, Amsterdam, Netherlands; Shandong University, Qingdao, China; Alibaba Group, Beijing, China Learned recommender systems may inadvertently leak information about their training data, leading to privacy violations. We investigate privacy threats faced by recommender systems through the lens of membership inference. In such attacks, an adversary aims to infer whether a user's data is used to train the target recommender. To achieve this, previous work has used a shadow recommender to derive training data for the attack model, and then predicts the membership by calculating difference vectors between users' historical interactions and recommended items. State-of-the-art methods face two challenging problems: (i) training data for the attack model is biased due to the gap between shadow and target recommenders, and (ii) hidden states in recommenders are not observational, resulting in inaccurate estimations of difference vectors. To address the above limitations, we propose a Debiasing Learning for Membership Inference Attacks against recommender systems (DL-MIA) framework that has four main components: (i) a difference vector generator, (ii) a disentangled encoder, (iii) a weight estimator, and (iv) an attack model. To mitigate the gap between recommenders, a variational auto-encoder (VAE) based disentangled encoder is devised to identify recommender invariant and specific features. To reduce the estimation bias, we design a weight estimator, assigning a truth-level score for each difference vector to indicate estimation accuracy. We evaluate DL-MIA against both general recommenders and sequential recommenders on three real-world datasets. Experimental results show that DL-MIA effectively alleviates training and estimation biases simultaneously, and achieves state-of-the-art attack performance. 经验丰富的推荐系统可能无意中泄露有关其培训数据的信息,从而导致侵犯隐私。我们通过成员推理的视角来研究推荐系统所面临的隐私威胁。在这种攻击中,对手的目的是推断用户的数据是否被用来训练目标推荐器。为了实现这一目标,以前的工作是使用阴影推荐来获取攻击模型的训练数据,然后通过计算用户历史交互和推荐项目之间的差异向量来预测成员关系。最先进的方法面临两个具有挑战性的问题: (i)攻击模型的训练数据由于阴影和目标推荐器之间的差距而有偏差,以及(ii)推荐器中的隐藏状态不是观察性的,导致差异向量的估计不准确。为了解决上述局限性,我们提出了针对推荐系统(DL-MIA)的成员推断攻击的去偏学习框架,其具有四个主要组成部分: (i)差分矢量生成器,(ii)分离编码器,(iii)权重估计器和(iv)攻击模型。为了缩小推荐器之间的差距,设计了一种基于变分自动编码器(VAE)的解纠缠编码器来识别推荐器的不变性和特定特征。为了减少估计偏差,我们设计了一个权重估计器,为每个差异向量指定一个真值水平分数来表示估计的准确性。我们在三个真实世界的数据集上评估 DL-MIA 与通用推荐和顺序推荐的对比。实验结果表明,DL-MIA 同时有效地减小了训练偏差和估计偏差,并取得了一流的攻击性能。 code 1
Automatic Generation of Product-Image Sequence in E-commerce Xiaochuan Fan, Chi Zhang, Yong Yang, Yue Shang, Xueying Zhang, Zhen He, Yun Xiao, Bo Long, Lingfei Wu JD.COM, Beijing, UNK, China; JD.COM Research, Mountain View, CA, USA Product images are essential for providing desirable user experience in an e-commerce platform. For a platform with billions of products, it is extremely time-costly and labor-expensive to manually pick and organize qualified images. Furthermore, there are the numerous and complicated image rules that a product image needs to comply in order to be generated/selected. To address these challenges, in this paper, we present a new learning framework in order to achieve Automatic Generation of Product-Image Sequence (AGPIS) in e-commerce. To this end, we propose a Multi-modality Unified Image-sequence Classifier (MUIsC), which is able to simultaneously detect all categories of rule violations through learning. MUIsC leverages textual review feedback as the additional training target and utilizes product textual description to provide extra semantic information. Without using prior knowledge or manually-crafted task, a single MUIsC model is able to learn the holistic knowledge of image reviewing and detect all categories of rule violations simultaneously. Based on offline evaluations, we show that the proposed MUIsC significantly outperforms various baselines. Besides MUIsC, we also integrate some other important modules in the proposed framework, such as primary image selection, non-compliant content detection, and image deduplication. With all these modules, our framework works effectively and efficiently in JD.com recommendation platform. By Dec 2021, our AGPIS framework has generated high-standard images for about 1.5 million products and achieves 13.6% in reject rate. Code of this work is available at https://github.com/efan3000/muisc. 在电子商务平台中,产品图像对于提供理想的用户体验至关重要。对于一个拥有数十亿产品的平台来说,手动挑选和组织合格的图像是非常耗费时间和人力的。此外,还有许多复杂的图像规则,产品图像需要遵守这些规则才能生成/选择。针对这些挑战,本文提出了一种新的学习框架,以实现电子商务中产品图像序列(AGPIS)的自动生成。为此,我们提出了一种多模态统一图像序列分类器(MUIsC),它能够通过学习同时检测所有类别的违规行为。MUIsC 利用文本评论反馈作为额外的培训目标,并利用产品文本描述提供额外的语义信息。在不使用先前知识或手工制作任务的情况下,单一的 MUIsC 模型能够学习图像审查的整体知识,并同时发现所有类别的违规行为。基于离线评估,我们表明所提出的 MUIsC 明显优于各种基线。除了 MUIsC,我们还整合了一些其他的重要模块,如初始图像选择、不兼容的内容检测和图像去重。通过所有这些模块,我们的框架在 JD.com 推荐平台上高效地工作。到2021年12月,我们的 AGPIS 框架已经为大约150万个产品生成了高标准的图像,并且实现了13.6% 的拒绝率。这项工作的代码可在 https://github.com/efan3000/muisc 查阅。 code 1
Semantic Retrieval at Walmart Alessandro Magnani, Feng Liu, Suthee Chaidaroon, Sachin Yadav, Praveen Reddy Suram, Ajit Puthenputhussery, Sijie Chen, Min Xie, Anirudh Kashi, Tony Lee, Ciya Liao code 1
Training Large-Scale News Recommenders with Pretrained Language Models in the Loop Shitao Xiao, Zheng Liu, Yingxia Shao, Tao Di, Bhuvan Middha, Fangzhao Wu, Xing Xie code 1
DDR: Dialogue Based Doctor Recommendation for Online Medical Service Zhi Zheng, Zhaopeng Qiu, Hui Xiong, Xian Wu, Tong Xu, Enhong Chen, Xiangyu Zhao code 1
FedMSplit: Correlation-Adaptive Federated Multi-Task Learning across Multimodal Split Networks Jiayi Chen, Aidong Zhang code 1
A Spectral Representation of Networks: The Path of Subgraphs Shengmin Jin, Hao Tian, Jiayu Li, Reza Zafarani code 1
Condensing Graphs via One-Step Gradient Matching Wei Jin, Xianfeng Tang, Haoming Jiang, Zheng Li, Danqing Zhang, Jiliang Tang, Bing Yin code 1
RGVisNet: A Hybrid Retrieval-Generation Neural Framework Towards Automatic Data Visualization Generation Yuanfeng Song, Xuefang Zhao, Raymond Chi-Wing Wong, Di Jiang code 1
Clustering with Fair-Center Representation: Parameterized Approximation Algorithms and Heuristics Suhas Thejaswi, Ameet Gadekar, Bruno Ordozgoiti, Michal Osadnik code 1
Towards Representation Alignment and Uniformity in Collaborative Filtering Chenyang Wang, Yuanqing Yu, Weizhi Ma, Min Zhang, Chong Chen, Yiqun Liu, Shaoping Ma code 1
Comprehensive Fair Meta-learned Recommender System Tianxin Wei, Jingrui He code 1
RetroGraph: Retrosynthetic Planning with Graph Search Shufang Xie, Rui Yan, Peng Han, Yingce Xia, Lijun Wu, Chenjuan Guo, Bin Yang, Tao Qin code 1
Ultrahyperbolic Knowledge Graph Embeddings Bo Xiong, Shichao Zhu, Mojtaba Nayyeri, Chengjin Xu, Shirui Pan, Chuan Zhou, Steffen Staab code 1
HICF: Hyperbolic Informative Collaborative Filtering Menglin Yang, Zhihao Li, Min Zhou, Jiahong Liu, Irwin King code 1
Improving Social Network Embedding via New Second-Order Continuous Graph Neural Networks Yanfu Zhang, Shangqian Gao, Jian Pei, Heng Huang code 1
SoccerCPD: Formation and Role Change-Point Detection in Soccer Matches Using Spatiotemporal Tracking Data Hyunsung Kim, Bit Kim, Dongwook Chung, Jinsung Yoon, Sang-Ki Ko code 1
Multi-Aspect Dense Retrieval Weize Kong, Swaraj Khadanga, Cheng Li, Shaleen Kumar Gupta, Mingyang Zhang, Wensong Xu, Michael Bendersky code 1
Multi-objective Optimization of Notifications Using Offline Reinforcement Learning Prakruthi Prabhakar, Yiping Yuan, Guangyu Yang, Wensheng Sun, Ajith Muralidharan code 1
Seq2Event: Learning the Language of Soccer Using Transformer-based Match Event Prediction Ian Simpson, Ryan J. Beal, Duncan Locke, Timothy J. Norman code 1
Friend Recommendations with Self-Rescaling Graph Neural Networks Xiran Song, Jianxun Lian, Hong Huang, Mingqi Wu, Hai Jin, Xing Xie code 1
4SDrug: Symptom-based Set-to-set Small and Safe Drug Recommendation Yanchao Tan, Chengjun Kong, Leisheng Yu, Pan Li, Chaochao Chen, Xiaolin Zheng, Vicki Hertzberg, Carl Yang code 1
Interpretable Personalized Experimentation Han Wu, Sarah Tan, Weiwei Li, Mia Garrard, Adam Obeng, Drew Dimmery, Shaun Singh, Hanson Wang, Daniel R. Jiang, Eytan Bakshy code 1
Graph-based Representation Learning for Web-scale Recommender Systems Ahmed ElKishky, Michael M. Bronstein, Ying Xiao, Aria Haghighi code 1
concept2code: Deep Reinforcement Learning for Conversational AI Omprakash Sonie, Abir Chakraborty, Ankan Mullick code 1
Variational Inference for Training Graph Neural Networks in Low-Data Regime through Joint Structure-Label Estimation Danning Lao, Xinyu Yang, Qitian Wu, Junchi Yan code 1
Pairwise Adversarial Training for Unsupervised Class-imbalanced Domain Adaptation Weili Shi, Ronghang Zhu, Sheng Li code 1
TrajGAT: A Graph-based Long-term Dependency Modeling Approach for Trajectory Similarity Computation Di Yao, Haonan Hu, Lun Du, Gao Cong, Shi Han, Jingping Bi code 1
A/B Testing Intuition Busters: Common Misunderstandings in Online Controlled Experiments Ron Kohavi, Alex Deng, Lukas Vermeer code 1
Pricing the Long Tail by Explainable Product Aggregation and Monotonic Bandits Marco Mussi, Gianmarco Genalti, Francesco Trovò, Alessandro Nuara, Nicola Gatti, Marcello Restelli code 1
PinnerFormer: Sequence Modeling for User Representation at Pinterest Nikil Pancha, Andrew Zhai, Jure Leskovec, Charles Rosenberg code 1
Open-Domain Aspect-Opinion Co-Mining with Double-Layer Span Extraction Mohna Chakraborty, Adithya Kulkarni, Qi Li Iowa State University, Ames, IA, USA The aspect-opinion extraction tasks extract aspect terms and opinion terms from reviews. The supervised extraction methods achieve state-of-the-art performance but require large-scale human-annotated training data. Thus, they are restricted for open-domain tasks due to the lack of training data. This work addresses this challenge and simultaneously mines aspect terms, opinion terms, and their correspondence in a joint model. We propose an Open-Domain Aspect-Opinion Co-Mining (ODAO) method with a Double-Layer span extraction framework. Instead of acquiring human annotations, ODAO first generates weak labels for unannotated corpus by employing rules based on universal dependency parsing. Then, ODAO utilizes this weak supervision to train a double-layer span extraction framework to extract aspect terms (ATE), opinion terms (OTE), and aspect-opinion pairs (AOPE). ODAO applies canonical correlation analysis as an early stopping indicator to avoid the model over-fitting to the noise to tackle the noisy weak supervision. ODAO applies a self-training process to gradually enrich the training data to tackle the weak supervision bias issue. We conduct extensive experiments and demonstrate the power of the proposed ODAO. The results on four benchmark datasets for aspect-opinion co-extraction and pair extraction tasks show that ODAO can achieve competitive or even better performance compared with the state-of-the-art fully supervised methods. 方面意见提取任务从评论中提取方面术语和意见术语。有监督的提取方法取得了最先进的性能,但需要大规模的人工注释的训练数据。因此,由于缺乏训练数据,它们在开放域任务中受到限制。这项工作解决了这个挑战,同时挖掘方面术语,意见术语,以及它们在联合模型中的对应关系。我们提出了一个开放领域的方面-意见共同挖掘(ODAO)方法与双层跨度提取框架。ODAO 不是获取人工注释,而是首先通过使用基于通用依赖解析的规则为未注释的语料库生成弱标签。然后,ODAO 利用这种弱监督训练一个双层跨度提取框架来提取方面术语(ATE)、观点术语(OTE)和方面-观点对(AOPE)。ODAO 采用典型相关分析作为及早停止的指标,以避免模型过分拟合噪声,从而应对带噪的弱监督。ODAO 采用自我训练过程,逐步丰富训练数据,解决监督偏差问题。我们进行了广泛的实验,并演示了所提出的 ODAO 的功能。通过对四个基准数据集的侧面意见协同提取和对提取任务的实验结果表明,ODAO 算法可以获得比现有全监督算法更好的性能。 code 1
Spatio-Temporal Trajectory Similarity Learning in Road Networks Ziquan Fang, Yuntao Du, Xinjun Zhu, Danlei Hu, Lu Chen, Yunjun Gao, Christian S. Jensen code 1
Detecting Cash-out Users via Dense Subgraphs Yingsheng Ji, Zheng Zhang, Xinlei Tang, Jiachen Shen, Xi Zhang, Guangwen Yang code 1
Semantic Enhanced Text-to-SQL Parsing via Iteratively Learning Schema Linking Graph Aiwei Liu, Xuming Hu, Li Lin, Lijie Wen code 1
Semi-supervised Drifted Stream Learning with Short Lookback Weijieying Ren, Pengyang Wang, Xiaolin Li, Charles E. Hughes, Yanjie Fu code 1
State Dependent Parallel Neural Hawkes Process for Limit Order Book Event Stream Prediction and Simulation Zijian Shi, John Cartlidge code 1
Improving Data-driven Heterogeneous Treatment Effect Estimation Under Structure Uncertainty Christopher Tran, Elena Zheleva code 1
Variational Graph Author Topic Modeling Delvin Ce Zhang, Hady Wirawan Lauw code 1
Alexa Teacher Model: Pretraining and Distilling Multi-Billion-Parameter Encoders for Natural Language Understanding Systems Jack FitzGerald, Shankar Ananthakrishnan, Konstantine Arkoudas, Davide Bernardi, Abhishek Bhagia, Claudio Delli Bovi, Jin Cao, Rakesh Chada, Amit Chauhan, Luoxin Chen, Anurag Dwarakanath, Satyam Dwivedi, Turan Gojayev, Karthik Gopalakrishnan, Thomas Gueudré, Dilek Hakkani-Tür, Wael Hamza, Jonathan J. Hüser, Kevin Martin Jose, Haidar Khan, Beiye Liu, Jianhua Lu, Alessandro Manzotti, Pradeep Natarajan, Karolina Owczarzak, Gokmen Oz, Enrico Palumbo, Charith Peris, Chandana Satya Prakash, Stephen Rawls, Andy Rosenbaum, Anjali Shenoy, Saleh Soltan, Mukund Harakere Sridhar, Lizhen Tan, Fabian Triefenbach, Pan Wei, Haiyang Yu, Shuai Zheng, Gökhan Tür, Prem Natarajan code 1
Augmenting Log-based Anomaly Detection Models to Reduce False Anomalies with Human Feedback Tong Jia, Ying Li, Yong Yang, Gang Huang, Zhonghai Wu code 1
DNA-Stabilized Silver Nanocluster Design via Regularized Variational Autoencoders Fariha Moomtaheen, Matthew Killeen, James T. Oswald, Anna Gonzàlez-Rosell, Peter Mastracco, Alexander Gorovits, Stacy M. Copp, Petko Bogdanov code 1
Amazon SageMaker Model Monitor: A System for Real-Time Insights into Deployed Machine Learning Models David Nigenda, Zohar Karnin, Muhammad Bilal Zafar, Raghu Ramesha, Alan Tan, Michele Donini, Krishnaram Kenthapadi code 1
A Graph Learning Based Framework for Billion-Scale Offline User Identification Daixin Wang, Zujian Weng, Zhengwei Wu, Zhiqiang Zhang, Peng Cui, Hongwei Zhao, Jun Zhou code 1
Learning Large-scale Subsurface Simulations with a Hybrid Graph Network Simulator Tailin Wu, Qinchen Wang, Yinan Zhang, Rex Ying, Kaidi Cao, Rok Sosic, Ridwan Jalali, Hassan Hamam, Marko Maucec, Jure Leskovec code 1
User Engagement in Mobile Health Applications Babaniyi Yusuf Olaniyi, Ana Fernández del Río, África Periáñez, Lauren Bellhouse code 1
Advances in Exploratory Data Analysis, Visualisation and Quality for Data Centric AI Systems Hima Patel, Shanmukha C. Guttula, Ruhi Sharma Mittal, Naresh Manwani, Laure Berti-Équille, Abhijit Manatkar code 1
Submodular Feature Selection for Partial Label Learning Wei-Xuan Bao, Jun-Yi Hang, Min-Ling Zhang Southeast University & Key Laboratory of Computer Network and Information Integration (Southeast University), Ministry of Education, Nanjing, China Partial label learning induces a multi-class classifier from training examples each associated with a candidate label set where the ground-truth label is concealed. Feature selection improves the generalization ability of learning system via selecting essential features for classification from the original feature set, while the task of partial label feature selection is challenging due to ambiguous labeling information. In this paper, the first attempt towards partial label feature selection is investigated via mutual-information-based dependency maximization. Specifically, the proposed approach SAUTE iteratively maximizes the dependency between selected features and labeling information, where the value of mutual information is estimated from confidence-based latent variable inference. In each iteration, the near-optimal features are selected greedily according to properties of submodular mutual information function, while the density of latent label variable is inferred with the help of updated labeling confidences over candidate labels by resorting to kNN aggregation in the induced lower-dimensional feature space. Extensive experiments over synthetic as well as real-world partial label data sets show that the generalization ability of well-established partial label learning algorithms can be significantly improved after coupling with the proposed feature selection approach. 部分标签学习从训练样本中归纳出一个多类分类器,每个样本与一个隐藏地面真实标签的候选标签集相关联。特征选择通过从原始特征集中选择分类所需的基本特征来提高学习系统的泛化能力,而部分标记特征选择则由于标记信息不明确而面临挑战。本文首次研究了基于互信息的依赖最大化方法在部分标签特征选择中的应用。特别地,提出的方法 SAUTE 迭代地最大化选择的特征和标记信息之间的依赖性,其中互信息的价值是估计基于置信度的潜变量推断。在每次迭代中,根据子模互信息函数的性质贪婪地选择接近最优的特征,利用诱导的低维特征空间中的 kNN 聚集,借助候选标签上更新的标签置信度推断潜在标签变量的密度。通过对合成和实际部分标签数据集的大量实验表明,与所提出的特征选择方法相结合,可以显著提高已有部分标签学习算法的泛化能力。 code 1
Practical Lossless Federated Singular Vector Decomposition over Billion-Scale Data Di Chai, Leye Wang, Junxue Zhang, Liu Yang, Shuowei Cai, Kai Chen, Qiang Yang Hong Kong University of Science and Technology, Hong Kong, China; Peking University, Beijing, China With the enactment of privacy-preserving regulations, e.g., GDPR, federated SVD is proposed to enable SVD-based applications over different data sources without revealing the original data. However, many SVD-based applications cannot be well supported by existing federated SVD solutions. The crux is that these solutions, adopting either differential privacy (DP) or homomorphic encryption (HE), suffer from accuracy loss caused by unremovable noise or degraded efficiency due to inflated data. In this paper, we propose FedSVD, a practical lossless federated SVD method over billion-scale data, which can simultaneously achieve lossless accuracy and high efficiency. At the heart of FedSVD is a lossless matrix masking scheme delicately designed for SVD: 1) While adopting the masks to protect private data, FedSVD completely removes them from the final results of SVD to achieve lossless accuracy; and 2) As the masks do not inflate the data, FedSVD avoids extra computation and communication overhead during the factorization to maintain high efficiency. Experiments with real-world datasets show that FedSVD is over 10000x faster than the HE-based method and has 10 orders of magnitude smaller error than the DP-based solution (ε=0.1, δ=0.1) on SVD tasks. We further build and evaluate FedSVD over three real-world applications: principal components analysis (PCA), linear regression (LR), and latent semantic analysis (LSA), to show its superior performance in practice. On federated LR tasks, compared with two state-of-the-art solutions: FATE [17] and SecureML [19], FedSVD-LR is 100x faster than SecureML and 10x faster than FATE. 随着 GDPR 等隐私保护规则的制定,联邦奇异值分解被提出,以使基于奇异值分解的应用能够在不同数据源之间进行而不暴露原始数据。然而,现有的联邦 SVD 解决方案不能很好地支持许多基于 SVD 的应用程序。问题的关键是,这些解决方案,无论是采用差分隐私(DP)或同态加密(HE),都会受到不可去除的噪声或因数据膨胀而导致效率降低所造成的精度损失。在本文中,我们提出了 FedSVD,一种实用的无损联邦 SVD 方法,它可以同时达到无损精度和高效率。FedSVD 的核心是一种为 SVD 精心设计的无损矩阵掩蔽方案: 1)在采用掩蔽保护私有数据的同时,FedSVD 从 SVD 的最终结果中完全去除掩蔽,以实现无损精度; 2)由于掩蔽不会使数据膨胀,FedSVD 避免了因子分解过程中的额外计算和通信开销,保持了高效率。实际数据集的实验表明,在奇异值分解任务中,FedSVD 比基于 HE 的方法快10000倍以上,并且比基于 DP 的方法(ε = 0.1,δ = 0.1)误差小10数量级。我们进一步构建和评估 FedSVD 在三个现实世界中的应用: 主成分分析(PCA)、线性回归分析(LR)和潜在语义学分析(LSA),以显示其在实践中的卓越性能。在联邦 LR 任务上,与 FATE [17]和 SecureML [19]这两种最先进的解决方案相比,FedSVD-LR 比 SecureML 快100倍,比 FATE 快10倍。 code 1
Efficient Join Order Selection Learning with Graph-based Representation Jin Chen, Guanyu Ye, Yan Zhao, Shuncheng Liu, Liwei Deng, Xu Chen, Rui Zhou, Kai Zheng code 1
RLogic: Recursive Logical Rule Learning from Knowledge Graphs Kewei Cheng, Jiahao Liu, Wei Wang, Yizhou Sun code 1
TARNet: Task-Aware Reconstruction for Time-Series Transformer Ranak Roy Chowdhury, Xiyuan Zhang, Jingbo Shang, Rajesh K. Gupta, Dezhi Hong code 1
Sufficient Vision Transformer Zhi Cheng, Xiu Su, Xueyu Wang, Shan You, Chang Xu code 1
Collaboration Equilibrium in Federated Learning Sen Cui, Jian Liang, Weishen Pan, Kun Chen, Changshui Zhang, Fei Wang code 1
Robust Event Forecasting with Spatiotemporal Confounder Learning Songgaojun Deng, Huzefa Rangwala, Yue Ning code 1
Robust Inverse Framework using Knowledge-guided Self-Supervised Learning: An application to Hydrology Rahul Ghosh, Arvind Renganathan, Kshitij Tayal, Xiang Li, Ankush Khandelwal, Xiaowei Jia, Christopher J. Duffy, John Nieber, Vipin Kumar code 1
Core-periphery Partitioning and Quantum Annealing Catherine F. Higham, Desmond J. Higham, Francesco Tudisco code 1
Few-Shot Fine-Grained Entity Typing with Automatic Label Interpretation and Instance Generation Jiaxin Huang, Yu Meng, Jiawei Han code 1
Global Self-Attention as a Replacement for Graph Convolution Md. Shamim Hussain, Mohammed J. Zaki, Dharmashankar Subramanian code 1
JuryGCN: Quantifying Jackknife Uncertainty on Graph Convolutional Networks Jian Kang, Qinghai Zhou, Hanghang Tong code 1
Graph Rationalization with Environment-based Augmentations Gang Liu, Tong Zhao, Jiaxin Xu, Tengfei Luo, Meng Jiang code 1
Graph-in-Graph Network for Automatic Gene Ontology Description Generation Fenglin Liu, Bang Yang, Chenyu You, Xian Wu, Shen Ge, Adelaide Woicik, Sheng Wang code 1
Geometer: Graph Few-Shot Class-Incremental Learning via Prototype Representation Bin Lu, Xiaoying Gan, Lina Yang, Weinan Zhang, Luoyi Fu, Xinbing Wang code 1
Spatio-Temporal Graph Few-Shot Learning with Cross-City Knowledge Transfer Bin Lu, Xiaoying Gan, Weinan Zhang, Huaxiu Yao, Luoyi Fu, Xinbing Wang code 1
Learning Differential Operators for Interpretable Time Series Modeling Yingtao Luo, Chang Xu, Yang Liu, Weiqing Liu, Shun Zheng, Jiang Bian code 1
Core-periphery Models for Hypergraphs Marios Papachristou, Jon M. Kleinberg code 1
Synthesising Audio Adversarial Examples for Automatic Speech Recognition Xinghua Qu, Pengfei Wei, Mingyong Gao, Zhu Sun, Yew Soon Ong, Zejun Ma code 1
On Missing Labels, Long-tails and Propensities in Extreme Multi-label Classification Erik Schultheis, Marek Wydmuch, Rohit Babbar, Krzysztof Dembczynski code 1
ERNet: Unsupervised Collective Extraction and Registration in Neuroimaging Data Yao Su, Zhentian Qian, Lifang He, Xiangnan Kong code 1
Towards an Optimal Asymmetric Graph Structure for Robust Semi-supervised Node Classification Zixing Song, Yifei Zhang, Irwin King code 1
Stabilizing Voltage in Power Distribution Networks via Multi-Agent Reinforcement Learning with Transformer Minrui Wang, Mingxiao Feng, Wengang Zhou, Houqiang Li code 1
Task-Adaptive Few-shot Node Classification Song Wang, Kaize Ding, Chuxu Zhang, Chen Chen, Jundong Li code 1
Disentangled Dynamic Heterogeneous Graph Learning for Opioid Overdose Prediction Qianlong Wen, Zhongyu Ouyang, Jianfei Zhang, Yiyue Qian, Yanfang Ye, Chuxu Zhang code 1
Multi-fidelity Hierarchical Neural Processes Dongxia Wu, Matteo Chinazzi, Alessandro Vespignani, Yi-An Ma, Rose Yu code 1
Availability Attacks Create Shortcuts Da Yu, Huishuai Zhang, Wei Chen, Jian Yin, Tie-Yan Liu code 1
Model Degradation Hinders Deep Graph Neural Networks Wentao Zhang, Zeang Sheng, Ziqi Yin, Yuezihan Jiang, Yikuan Xia, Jun Gao, Zhi Yang, Bin Cui code 1
Contrastive Learning with Complex Heterogeneity Lecheng Zheng, Jinjun Xiong, Yada Zhu, Jingrui He code 1
AntiBenford Subgraphs: Unsupervised Anomaly Detection in Financial Networks Tianyi Chen, Charalampos E. Tsourakakis code 1
Talent Demand-Supply Joint Prediction with Dynamic Heterogeneous Graph Enhanced Meta-Learning Zhuoning Guo, Hao Liu, Le Zhang, Qi Zhang, Hengshu Zhu, Hui Xiong code 1
Greykite: Deploying Flexible Forecasting at Scale at LinkedIn Reza Hosseini, Albert Chen, Kaixu Yang, Sayan Patra, Yi Su, Saad Eddin Al Orjany, Sishi Tang, Parvez Ahammad code 1
A Fully Differentiable Set Autoencoder Nikita Janakarajan, Jannis Born, Matteo Manica code 1
Precision CityShield Against Hazardous Chemicals Threats via Location Mining and Self-Supervised Learning Jiahao Ji, Jingyuan Wang, Junjie Wu, Boyang Han, Junbo Zhang, Yu Zheng code 1
Towards Learning Disentangled Representations for Time Series Yuening Li, Zhengzhang Chen, Daochen Zha, Mengnan Du, Jingchao Ni, Denghui Zhang, Haifeng Chen, Xia Hu code 1
CS-RAD: Conditional Member Status Refinement and Ability Discovery for Social Network Applications Yiming Ma code 1
GraphWorld: Fake Graphs Bring Real Insights for GNNs John Palowitch, Anton Tsitsulin, Brandon Mayer, Bryan Perozzi code 1
Temporal Multimodal Multivariate Learning Hyoshin Park, Justice Darko, Niharika Deshpande, Venktesh Pandey, Hui Su, Masahiro Ono, Dedrick Barkely, Larkin Folsom, Derek J. Posselt, Steve Chien code 1
Downscaling Earth System Models with Deep Learning Sungwon Park, Karandeep Singh, Arjun Nellikkattil, Elke Zeller, Tung-Duong Mai, Meeyoung Cha code 1
DocLayNet: A Large Human-Annotated Dataset for Document-Layout Segmentation Birgit Pfitzmann, Christoph Auer, Michele Dolfi, Ahmed S. Nassar, Peter W. J. Staar code 1
What is the Most Effective Intervention to Increase Job Retention for this Disabled Worker? Ha Xuan Tran, Thuc Duy Le, Jiuyong Li, Lin Liu, Jixue Liu, Yanchang Zhao, Tony Waters code 1
Reinforcement Learning-based Placement of Charging Stations in Urban Road Networks Leonie von Wahl, Nicolas Tempelmeier, Ashutosh Sao, Elena Demidova code 1
Learning to Discover Causes of Traffic Congestion with Limited Labeled Data Mudan Wang, Huan Yan, Hongjie Sui, Fan Zuo, Yue Liu, Yong Li code 1
A Framework for Multi-stage Bonus Allocation in Meal Delivery Platform Zhuolin Wu, Li Wang, Fangsheng Huang, Linjun Zhou, Yu Song, Chengpeng Ye, Pengyu Nie, Hao Ren, Jinghua Hao, Renqing He, Zhizhao Sun code 1
Uncertainty Quantification of Sparse Travel Demand Prediction with Spatial-Temporal Graph Neural Networks Dingyi Zhuang, Shenhao Wang, Haris N. Koutsopoulos, Jinhua Zhao code 1
Effective Social Network-Based Allocation of COVID-19 Vaccines Jiangzhuo Chen, Stefan Hoops, Achla Marathe, Henning S. Mortveit, Bryan L. Lewis, Srinivasan Venkatramanan, Arash Haddadan, Parantapa Bhattacharya, Abhijin Adiga, Anil Vullikanti, Aravind Srinivasan, Mandy L. Wilson, Gal Ehrlich, Maier Fenster, Stephen G. Eubank, Christopher L. Barrett, Madhav V. Marathe code 1
Automatic Phenotyping by a Seed-guided Topic Model Ziyang Song, Yuanyi Hu, Aman Verma, David L. Buckeridge, Yue Li code 1
Activity Trajectory Generation via Modeling Spatiotemporal Dynamics Yuan Yuan, Jingtao Ding, Huandong Wang, Depeng Jin, Yong Li code 1
Multimodal AutoML for Image, Text and Tabular Data Nick Erickson, Xingjian Shi, James Sharpnack, Alexander J. Smola code 1
Model Monitoring in Practice: Lessons Learned and Open Challenges Krishnaram Kenthapadi, Himabindu Lakkaraju, Pradeep Natarajan, Mehrnoosh Sameki code 1
Algorithmic Fairness on Graphs: Methods and Trends Jian Kang, Hanghang Tong code 1
A Practical Introduction to Federated Learning Yaliang Li, Bolin Ding, Jingren Zhou code 1
Toolkit for Time Series Anomaly Detection Dhaval Patel, Dzung Phan, Markus Mueller, Amaresh Rajasekharan code 1
Epidemic Forecasting with a Data-Centric Lens Alexander Rodríguez, Harshavardhan Kamarthi, B. Aditya Prakash code 1
EXTR: Click-Through Rate Prediction with Externalities in E-Commerce Sponsored Search Chi Chen, Hui Chen, Kangzhi Zhao, Junsheng Zhou, Li He, Hongbo Deng, Jian Xu, Bo Zheng, Yong Zhang, Chunxiao Xing Alibaba Group, Beijing, China; Tsinghua University, Beijing, China Click-Through Rate (CTR) prediction, estimating the probability of a user clicking on items, plays a key fundamental role in sponsored search. E-commerce platforms display organic search results and advertisements (ads), collectively called items, together as a mixed list. The items displayed around the predicted ad, i.e. external items, may affect the user clicking on the predicted. Previous CTR models assume the user click only relies on the ad itself, which overlooks the effects of external items, referred to as external effects, or externalities. During the advertising prediction, the organic results have been generated by the organic system, while the final displayed ads on multiple ad slots have not been figured out, which leads to two challenges: 1) the predicted (target) ad may win any ad slot, bringing about diverse externalities. 2) external ads are undetermined, resulting in incomplete externalities. Facing the above challenges, inspired by the Transformer, we propose EXternality TRansformer (EXTR) which regards target ad with all slots as query and external items as key&value to model externalities in all exposure situations in parallel. Furthermore, we design a Potential Allocation Generator (PAG) for EXTR, to learn the allocation of potential external ads to complete the externalities. Extensive experimental results on Alibaba datasets demonstrate the effectiveness of externalities in the task of CTR prediction and illustrate that our proposed approach can bring significant profits to the real-world e-commerce platform. EXTR now has been successfully deployed in the online search advertising system in Alibaba, serving the main traffic. 点进率(CTR)预测,估计用户点击项目的概率,在赞助商搜索中起着关键的基础作用。电子商务平台显示有机搜索结果和广告(广告),统称项目,一起作为一个混合清单。在预测广告周围显示的项目,即外部项目,可能会影响用户点击预测广告。以前的 CTR 模型假设用户的点击只依赖于广告本身,它忽略了外部项目的影响,称为外部影响,或外部性。在广告预测过程中,有机结果是由有机系统产生的,而最终在多个广告时段上显示的广告还没有计算出来,这就带来了两个挑战: 1)预测的(目标)广告可能赢得任何一个广告时段,带来不同的外部性。2)外部广告不确定性,导致外部性不完全。面对上述挑战,我们提出外部性变压器(EXTR)的启发,以所有时隙为查询目标广告和外部项目为关键和价值模型的外部性在所有曝光情况下并行。此外,我们还为 EXTR 设计了一个潜在分配生成器(PAG),学习如何分配潜在的外部广告来完成外部性。对阿里巴巴数据集的大量实验结果显示了外部性在点击率预测任务中的有效性,并说明我们建议的方法可以为现实世界的电子商务平台带来显著的利润。EXTR 现已成功应用于阿里巴巴的在线搜索广告系统,为主要流量提供服务。 code 0
PARSRec: Explainable Personalized Attention-fused Recurrent Sequential Recommendation Using Session Partial Actions Ehsan Gholami, Mohammad Motamedi, Ashwin Aravindakshan University of California, Davis, Davis, CA, USA The emerging meta- and multi-verse landscape is yet another step towards the more prevalent use of already ubiquitous online markets. In such markets, recommender systems play critical roles by offering items of interest to the users, thereby narrowing down a vast search space that comprises hundreds of thousands of products. Recommender systems are usually designed to learn common user behaviors and rely on them for inference. This approach, while effective, is oblivious to subtle idiosyncrasies that differentiate humans from each other. Focusing on this observation, we propose an architecture that relies on common patterns as well as individual behaviors to tailor its recommendations for each person. Simulations under a controlled environment show that our proposed model learns interpretable personalized user behaviors. Our empirical results on Nielsen Consumer Panel dataset indicate that the proposed approach achieves up to 27.9% performance improvement compared to the state-of-the-art. 新兴的元和多元宇宙景观是朝着更普遍地使用已经无处不在的在线市场迈出的又一步。在这样的市场中,推荐系统通过向用户提供感兴趣的项目发挥着关键作用,从而缩小了由成千上万个产品组成的巨大搜索空间。推荐系统通常被设计用来学习常见的用户行为,并依赖它们进行推理。这种方法虽然有效,却忽略了区分人与人之间的微妙特质。基于这一观察,我们提出了一个依赖于公共模式和个人行为的体系结构,以便为每个人量身定制其建议。在受控环境下的仿真表明,我们提出的模型学习可解释的个性化用户行为。我们对 AC尼尔森面板数据集的实验结果表明,与最先进的技术相比,提出的方法实现了高达27.9% 的性能改进。 code 0
Pretraining Representations of Multi-modal Multi-query E-commerce Search Xinyi Liu, Wanxian Guan, Lianyun Li, Hui Li, Chen Lin, Xubin Li, Si Chen, Jian Xu, Hongbo Deng, Bo Zheng Xiamen University, Xiamen, China; Alibaba Group, Hangzhou, China The importance of modeling contextual information within a search session has been widely acknowledged. However, learning representations of multi-query multi-modal (MM) search, in which Mobile Taobao users repeatedly submit textual and visual queries, remains unexplored in literature. Previous work which learns task-specific representations of textual query sessions fails to capture diverse query types and correlations in MM search sessions. This paper presents to represent MM search sessions by heterogeneous graph neural network (HGN). A multi-view contrastive learning framework is proposed to pretrain the HGN, with two views to model different intra-query, inter-query, and inter-modality information diffusion in MM search. Extensive experiments demonstrate that, the pretrained session representation can benefit state-of-the-art baselines on various downstream tasks, such as personalized click prediction, query suggestion, and intent classification. 在搜索会话中建模上下文信息的重要性已经得到了广泛的认可。然而,多查询多模态(MM)搜索的学习表征,其中移动淘宝用户重复提交文本和视觉查询,仍然没有文献探索。前面的工作学习了文本查询会话的特定任务表示,但未能在 MM 搜索会话中捕获不同的查询类型和相关性。本文提出用异构图神经网络(HGN)来表示 MM 搜索会话。提出了一种多视图对比学习框架对 HGN 进行预训练,使用两种视图对 MM 搜索中不同的查询内、查询间和模态间信息扩散进行建模。大量的实验表明,预先训练的会话表示可以使各种下游任务的最先进的基线受益,例如个性化的点击预测、查询建议和意图分类。 code 0
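Multi-view contrastive pretraining frameworks like the one above typically optimize an InfoNCE-style objective that pulls two views of the same session together and pushes other sessions away; the function below is a generic illustrative sketch (temperature, similarity choice, and toy vectors are our assumptions).

```python
import numpy as np

def info_nce(anchor, positive, negatives, tau=0.1):
    """Minimal InfoNCE contrastive loss: -log of the softmax probability
    assigned to the positive view among positive + negative candidates."""
    def cos(a, b):
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
    pos = np.exp(cos(anchor, positive) / tau)
    neg = sum(np.exp(cos(anchor, n) / tau) for n in negatives)
    return -np.log(pos / (pos + neg))

a = np.array([1.0, 0.0])
loss_aligned   = info_nce(a, np.array([0.9, 0.1]), [np.array([0.0, 1.0])])
loss_unrelated = info_nce(a, np.array([0.0, 1.0]), [np.array([0.9, 0.1])])
assert loss_aligned < loss_unrelated  # aligned views incur a lower loss
```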
Deep Search Relevance Ranking in Practice Linsey Pang, Wei Liu, Kenghao Chang, Xue Li, Moumita Bhattacharya, Xianjing Liu, Stephen Guo Walmart Global Tech, Sunnyvale, CA, USA; Salesforce, San Francisco, CA, USA; Microsoft, Mountain View, CA, USA; University of Technology Sydney, Sydney, Australia; Twitter, San Jose, CA, USA; Netflix, Los Gatos, CA, USA Machine learning techniques for developing industry-scale search engines have long been a prominent part of most domains and their online products. Search relevance algorithms are key components of products across different fields, including e-commerce, streaming services, and social networks. In this tutorial, we give an introduction to such large-scale search ranking systems, specifically focusing on deep learning techniques in this area. The topics we cover are the following: (1) Overview of search ranking systems in practice, including classical and machine learning techniques; (2) Introduction to sequential and language models in the context of search ranking; and (3) Knowledge distillation approaches for this area. For each of the aforementioned sessions, we first give an introductory talk and then go over a hands-on tutorial to really home in on the concepts. We cover fundamental concepts using demos, case studies, and hands-on examples, including the latest Deep Learning methods that have achieved state-of-the-art results in generating the most relevant search results. Moreover, we show example implementations of these methods in python, leveraging a variety of open-source machine-learning/deep-learning libraries as well as real industrial data or open-source data. 用于开发行业规模搜索引擎的机器学习技术长期以来一直是大多数领域及其在线产品的重要组成部分。搜索相关算法是不同领域产品的关键组成部分,包括电子商务、流媒体服务和社交网络。在本教程中,我们将介绍这种大规模的搜索排名系统,特别关注这一领域的深度学习技术。我们讨论的主题如下: (1)搜索排名系统在实践中的概述,包括经典的和机器学习技术; (2)在搜索排名的背景下序列和语言模型的介绍; 和(3)这个领域的知识提取方法。对于前面提到的每一个会议,我们首先做一个介绍性的演讲,然后通过一个实践教程来真正地深入理解这些概念。我们使用演示、案例研究和实践例子介绍基本概念,包括最新的深度学习方法,这些方法在生成最相关的搜索结果时取得了最先进的结果。此外,我们还展示了这些方法在 python 中的实现示例,利用了各种开源机器学习/深度学习库以及真实的工业数据或开源数据。 code 0
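The knowledge distillation topic in this tutorial is commonly implemented as cross-entropy between temperature-softened teacher and student score distributions over a query's candidates; the listwise sketch below is our illustration, not the tutorial's own code (temperature and scores are made up).

```python
import numpy as np

def softmax(x, T=1.0):
    z = np.exp((x - x.max()) / T)
    return z / z.sum()

def distill_loss(teacher_scores, student_scores, T=2.0):
    """Listwise distillation: the student's softened ranking distribution
    is pushed toward the teacher's via cross-entropy."""
    p_t = softmax(teacher_scores, T)
    p_s = softmax(student_scores, T)
    return -np.sum(p_t * np.log(p_s + 1e-12))

teacher      = np.array([3.0, 1.0, 0.2])   # large ranker's scores for 3 docs
good_student = np.array([2.8, 1.1, 0.1])   # agrees on the ordering
bad_student  = np.array([0.1, 1.1, 2.8])   # reversed ordering
assert distill_loss(teacher, good_student) < distill_loss(teacher, bad_student)
```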
Debiasing the Cloze Task in Sequential Recommendation with Bidirectional Transformers Khalil Damak, Sami Khenissi, Olfa Nasraoui University of Louisville, Louisville, KY, USA Bidirectional Transformer architectures are state-of-the-art sequential recommendation models that use a bi-directional representation capacity based on the Cloze task, a.k.a. Masked Language Modeling. The latter aims to predict randomly masked items within the sequence. Because they assume that the true interacted item is the most relevant one, an exposure bias results, where non-interacted items with low exposure propensities are assumed to be irrelevant. The most common approach to mitigating exposure bias in recommendation has been Inverse Propensity Scoring (IPS), which consists of down-weighting the interacted predictions in the loss function in proportion to their propensities of exposure, yielding a theoretically unbiased learning. In this work, we argue and prove that IPS does not extend to sequential recommendation because it fails to account for the temporal nature of the problem. We then propose a novel propensity scoring mechanism, which can theoretically debias the Cloze task in sequential recommendation. Finally we empirically demonstrate the debiasing capabilities of our proposed approach and its robustness to the severity of exposure bias. 双向转换器体系结构是最先进的顺序推荐模型,它使用基于完形填空任务的双向表示能力,也就是掩码语言建模。后者旨在预测序列中随机掩盖的项目。因为他们假设真正的相互作用的项目是最相关的一个,暴露偏差的结果,其中没有相互作用的项目低暴露倾向被认为是无关紧要的。减轻推荐中暴露偏倚的最常见方法是逆倾向评分(IPS),其包括按照暴露倾向的比例降低损失函数中的相互作用预测的权重,从而产生理论上无偏倚的学习。在这项工作中,我们争论和证明 IPS 没有扩展到顺序推荐,因为它没有考虑到问题的时间性质。然后,我们提出了一种新的倾向评分机制,它可以在理论上对顺序推荐中的完形填空任务进行去偏。最后,我们通过实证证明了我们提出的方法的去偏能力及其对暴露偏差严重程度的鲁棒性。 code 0
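For reference, the standard (non-sequential) IPS objective that this paper critiques can be sketched as a propensity-weighted negative log-likelihood; the clipping constant and array shapes here are our assumptions, and the paper's point is precisely that this plain form is insufficient in the sequential setting.

```python
import numpy as np

def ips_loss(log_probs, propensities, clip=0.1):
    """Inverse-propensity-scored loss: each observed (interacted) item's
    negative log-likelihood is reweighted by the inverse of its estimated
    exposure propensity, so rarely exposed items count more."""
    w = 1.0 / np.maximum(propensities, clip)   # clip to bound the variance
    return float(np.mean(-log_probs * w))

log_probs    = np.log(np.array([0.5, 0.2, 0.8]))  # model prob. of observed items
propensities = np.array([0.9, 0.3, 0.6])          # estimated exposure propensities
print(ips_loss(log_probs, propensities))
```

With uniform propensities the estimator reduces to the ordinary mean negative log-likelihood, which is a quick sanity check on any implementation.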
A Generalized Doubly Robust Learning Framework for Debiasing Post-Click Conversion Rate Prediction Quanyu Dai, Haoxuan Li, Peng Wu, Zhenhua Dong, XiaoHua Zhou, Rui Zhang, Rui Zhang, Jie Sun Peking University, Beijing, China; ruizhang.info, Shenzhen, China; Beijing Technology and Business University, Beijing, China; Huawei Noah's Ark Lab, Shenzhen, China; Huawei Hong Kong Theory Lab, Hong Kong, China Post-click conversion rate (CVR) prediction is an essential task for discovering user interests and increasing platform revenues in a range of industrial applications. One of the most challenging problems of this task is the existence of severe selection bias caused by the inherent self-selection behavior of users and the item selection process of systems. Currently, doubly robust (DR) learning approaches achieve the state-of-the-art performance for debiasing CVR prediction. However, in this paper, by theoretically analyzing the bias, variance and generalization bounds of DR methods, we find that existing DR approaches may have poor generalization caused by inaccurate estimation of propensity scores and imputation errors, which often occur in practice. Motivated by such analysis, we propose a generalized learning framework that not only unifies existing DR methods, but also provides a valuable opportunity to develop a series of new debiasing techniques to accommodate different application scenarios. Based on the framework, we propose two new DR methods, namely DR-BIAS and DR-MSE. DR-BIAS directly controls the bias of DR loss, while DR-MSE balances the bias and variance flexibly, which achieves better generalization performance. In addition, we propose a novel tri-level joint learning optimization method for DR-MSE in CVR prediction, and an efficient training algorithm correspondingly. We conduct extensive experiments on both real-world and semi-synthetic datasets, which validate the effectiveness of our proposed methods. 点击后转换率(CVR)预测是发现用户兴趣和增加平台收入的一个重要任务,在一系列的工业应用。这项任务最具挑战性的问题之一是由于用户固有的自我选择行为和系统的项目选择过程所引起的严重选择偏差的存在。目前,双鲁棒(DR)学习方法在降低 CVR 预测偏差方面取得了最好的效果。然而,通过对 DR 方法的偏差、方差和泛化界限的理论分析,我们发现现有的 DR 方法可能由于在实际应用中经常出现的倾向分数估计不准确和插补错误而导致泛化能力较差。基于这样的分析,我们提出了一个通用的学习框架,它不仅统一了现有的 DR 方法,而且为开发一系列新的去偏技术以适应不同的应用场景提供了宝贵的机会。在此基础上,提出了两种新的 DR 方法: DR-BIAS 和 DR-MSE。DR-BIAS 直接控制 DR 损失的偏差,而 DR-MSE 灵活地平衡偏差和方差,从而获得更好的泛化性能。此外,本文还提出了一种新的基于 DR-MSE 的 CVR 预测三层联合学习优化方法,并给出了相应的训练算法。我们在真实世界和半合成数据集上进行了广泛的实验,验证了我们提出的方法的有效性。 code 0
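The textbook doubly robust estimator that this framework generalizes combines an error imputation model with an inverse-propensity-weighted correction on observed pairs; the sketch below illustrates that principle only (it is not DR-BIAS or DR-MSE, and all numbers are toy values).

```python
import numpy as np

def dr_estimate(observed, propensity, loss, imputed_loss):
    """Doubly robust estimate of the average loss over all user-item pairs:
    impute a loss everywhere, then correct it on observed pairs with
    inverse-propensity weights. Unbiased if either the propensities or the
    imputations are accurate."""
    o = observed.astype(float)
    return float(np.mean(imputed_loss + o * (loss - imputed_loss) / propensity))

loss       = np.array([0.4, 0.8, 0.2, 0.6])  # true per-pair loss (known in this demo)
imputed    = loss.copy()                     # a perfectly accurate imputation model
observed   = np.array([1, 0, 1, 0])          # which pairs the system exposed
propensity = np.array([0.5, 0.5, 0.5, 0.5])
# with exact imputations the correction term vanishes, so DR recovers the truth
assert np.isclose(dr_estimate(observed, propensity, loss, imputed), loss.mean())
```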
User-Event Graph Embedding Learning for Context-Aware Recommendation Dugang Liu, Mingkai He, Jinwei Luo, Jiangxu Lin, Meng Wang, Xiaolian Zhang, Weike Pan, Zhong Ming Huawei Technologies Co Ltd, Shenzhen, China; Southeast University, Nanjing, China; Shenzhen University, Shenzhen, China; Shenzhen University & Guangdong Laboratory of Artificial Intelligence and Digital Economy (SZ), Shenzhen, China Most methods for context-aware recommendation focus on improving the feature interaction layer, but overlook the embedding layer. However, an embedding layer with random initialization often suffers in practice from the sparsity of the contextual features, as well as the interactions between the users (or items) and context. In this paper, we propose a novel user-event graph embedding learning (UEG-EL) framework to address these two sparsity challenges. Specifically, our UEG-EL contains three modules: 1) a graph construction module is used to obtain a user-event graph containing nodes for users, intents and items, where the intent nodes are generated by applying intent node attention (INA) on nodes of the contextual features; 2) a user-event collaborative graph convolution module is designed to obtain the refined embeddings of all features by executing a new convolution strategy on the user-event graph, where each intent node acts as a hub to efficiently propagate the information among different features; 3) a recommendation module is equipped to integrate some existing context-aware recommendation model, where the feature embeddings are directly initialized with the obtained refined embeddings. Moreover, we identify a unique challenge of the basic framework, that is, the contextual features associated with too many instances may suffer from noise when aggregating the information. We thus further propose a simple but effective variant, i.e., UEG-EL-V, in order to prune the information propagation of the contextual features. Finally, we conduct extensive experiments on three public datasets to verify the effectiveness and compatibility of our UEG-EL and its variant. 大多数上下文感知的推荐方法侧重于改进特征交互层,而忽略了嵌入层。然而,具有随机初始化的嵌入层在实践中经常受到上下文特征稀疏性以及用户(或项目)与上下文之间交互的影响。本文提出了一种新的用户事件图嵌入学习(UEG-EL)框架来解决这两个稀疏性问题。具体来说,我们的 UEG-EL 包含三个模块: 1)一个图形构造模块用于获得一个包含用户、意图和项目节点的用户事件图,其中意图节点是通过在上下文特征的节点上应用意图节点注意力(INA)来生成的; 2)一个用户事件协作图卷积模块用于通过在用户事件图上执行一个新的卷积策略来获得所有特征的精细嵌入,其中每个意图节点作为一个中心来有效地传播不同特征之间的信息; 3)一个推荐模块用于集成一些现有的上下文感知的推荐模型,其中特征嵌入直接用所获得的精细嵌入进行初始化。此外,我们发现了基本框架的一个独特的挑战,即与太多实例相关的上下文特征在聚合信息时可能会受到噪声的影响。因此,我们进一步提出了一个简单而有效的变体,即 UEG-EL-V,以修剪信息传播的上下文特征。最后,我们在三个公共数据集上进行了广泛的实验,以验证我们的 UEG-EL 及其变体的有效性和兼容性。 code 0
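The hub-style propagation described in module 2) can be pictured with a single round of mean aggregation on a small graph; this is our deliberately simplified sketch (the paper's convolution strategy and INA attention are considerably richer), with made-up node roles.

```python
import numpy as np

def hub_propagate(node_emb, edges, n_nodes):
    """One round of row-normalized neighbor averaging: each node, including
    the intent hub, takes the mean of its neighbors' embeddings, so features
    reachable through the same hub become correlated."""
    A = np.zeros((n_nodes, n_nodes))
    for u, v in edges:
        A[u, v] = A[v, u] = 1.0
    deg = np.maximum(A.sum(axis=1, keepdims=True), 1.0)
    return (A / deg) @ node_emb

# toy graph: nodes 0,1 = users, node 2 = intent hub, nodes 3,4 = items
emb = np.eye(5)
out = hub_propagate(emb, [(0, 2), (1, 2), (2, 3), (2, 4)], 5)
assert np.allclose(out[0], emb[2])  # a user with one edge pulls the hub's embedding
assert np.allclose(out[2], emb[[0, 1, 3, 4]].mean(axis=0))  # hub mixes all neighbors
```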
Adversarial Gradient Driven Exploration for Deep Click-Through Rate Prediction Kailun Wu, Weijie Bian, Zhangming Chan, Lejian Ren, Shiming Xiang, Shuguang Han, Hongbo Deng, Bo Zheng Alibaba Group, Beijing, China; Institute of Automation, Chinese Academy of Sciences, Beijing, China Exploration-Exploitation (E&E) algorithms are commonly adopted to deal with the feedback-loop issue in large-scale online recommender systems. Most of existing studies believe that high uncertainty can be a good indicator of potential reward, and thus primarily focus on the estimation of model uncertainty. We argue that such an approach overlooks the subsequent effect of exploration on model training. From the perspective of online learning, the adoption of an exploration strategy would also affect the collecting of training data, which further influences model learning. To understand the interaction between exploration and training, we design a Pseudo-Exploration module that simulates the model updating process after a certain item is explored and the corresponding feedback is received. We further show that such a process is equivalent to adding an adversarial perturbation to the model input, and thereby name our proposed approach as the Adversarial Gradient Driven Exploration (AGE). For production deployment, we propose a dynamic gating unit to pre-determine the utility of an exploration. This enables us to utilize the limited amount of resources for exploration, and avoid wasting pageview resources on ineffective exploration. The effectiveness of AGE was firstly examined through an extensive number of ablation studies on an academic dataset. Meanwhile, AGE has also been deployed to one of the world-leading display advertising platforms, and we observe significant improvements on various top-line evaluation metrics. 在大规模在线推荐系统中,探索-开发(E&E)算法是处理反馈回路问题的常用算法。大多数已有的研究认为高不确定性可以作为潜在报酬的一个很好的指标,因此主要集中在模型不确定性的估计上。我们认为这种方法忽视了探索对模型训练的后续影响。从在线学习的角度来看,探索策略的采用也会影响训练数据的收集,从而进一步影响模型学习。为了理解探索与训练的相互作用,我们设计了一个拟探索模块,模拟探索某一项目并收到相应反馈后的模型更新过程。我们进一步表明,这样一个过程是相当于添加一个对抗扰动的模型输入,从而命名我们提出的方法作为一个对抗梯度驱动探索(AGE)。对于生产部署,我们提出了一个动态门控单元来预先确定勘探的效用。这使我们能够利用有限的资源进行探索,避免在无效探索上浪费页面浏览资源。AGE 的有效性首先通过一个学术数据集上的大量消融研究进行了检验。与此同时,AGE 也被部署到世界领先的展示广告平台之一,我们观察到各种顶线评估指标的显著改进。 code 0
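The adversarial-perturbation equivalence the abstract mentions can be illustrated with an FGSM-style step on a toy logistic scorer; the model, step size, and everything else here are our assumptions (the real CTR model and the dynamic gating unit are far richer).

```python
import numpy as np

def adversarial_perturb(x, w, eps=0.1):
    """FGSM-style perturbation: the gradient of sigmoid(w.x) w.r.t. x points
    along w, so stepping along sign(w) moves the input adversarially."""
    return x + eps * np.sign(w)

def score(x, w):
    return 1.0 / (1.0 + np.exp(-(w @ x)))  # toy logistic CTR scorer

rng = np.random.default_rng(0)
w = rng.normal(size=5)
x = rng.normal(size=5)
x_adv = adversarial_perturb(x, w)
assert score(x_adv, w) > score(x, w)  # gradient-direction step raises the score
```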
Graph-based Multilingual Language Model: Leveraging Product Relations for Search Relevance Nurendra Choudhary, Nikhil Rao, Karthik Subbian, Chandan K. Reddy Virginia Tech, Arlington, VA, USA; Amazon, Palo Alto, CA, USA The large-scale nature of product catalog and the changing demands of customer queries makes product search a challenging problem. The customer queries are ambiguous and implicit. They may be looking for an exact match of their query, or a functional equivalent (i.e., substitute), or an accessory to go with it (i.e., complement). It is important to distinguish these three categories from merely classifying an item for a customer query as relevant or not. This information can help direct the customer and improve search applications to understand the customer mission. In this paper, we formulate search relevance as a multi-class classification problem and propose a graph-based solution to classify a given query-item pair as exact, substitute, complement, or irrelevant (ESCI). The customer engagement (clicks, add-to-cart, and purchases) between query and items serve as a crucial information for this problem. However, existing approaches rely purely on the textual information (such as BERT) and do not sufficiently focus on the structural relationships. Another challenge in including the structural information is the sparsity of such data in some regions. We propose Structure-Aware multilingual LAnguage Model (SALAM), that utilizes a language model along with a graph neural network, to extract region-specific semantics as well as relational information for the classification of query-product pairs. Our model is first pre-trained on a large region-agnostic dataset and behavioral graph data and then fine-tuned on region-specific versions to address the sparsity. We show in our experiments that SALAM significantly outperforms the current matching frameworks on the ESCI classification task in several regions. We also demonstrate the effectiveness of using a two-phased training setup (i.e., pre-training and fine-tuning) in capturing region-specific information. Also, we provide various challenges and solutions for using the model in an industrial setting and outline its contribution to the e-commerce engine. 产品目录的大规模性和客户查询需求的变化使得产品搜索成为一个具有挑战性的问题。客户查询是模糊和隐式的。他们可能在寻找与他们的查询完全匹配的查询,或者功能等价的查询(即替代查询),或者附属查询(即补充查询)。区分这三个类别与仅仅为客户查询分类一个项目是否相关是很重要的。这些信息可以帮助指导客户并改进搜索应用程序,以理解客户的使命。本文将搜索相关性表述为一个多类分类问题,并提出了一种基于图的解决方案,将给定的查询项对分类为精确、替代、补充或不相关(ESCI)。查询和项目之间的客户参与(单击、添加到购物车和购买)是解决此问题的关键信息。然而,现有的方法仅仅依赖于文本信息(比如 BERT),并没有充分关注结构关系。在纳入结构信息方面的另一个挑战是,一些区域的此类数据稀少。提出了一种基于结构感知的多语言语言模型(SALAM),该模型利用语言模型和图神经网络提取区域特定的语义和关系信息,用于查询产品对的分类。我们的模型首先在大型区域不可知数据集和行为图数据上进行预训练,然后在区域特定版本上进行微调,以解决稀疏性问题。我们的实验表明,在 ESCI 分类任务中,SALAM 在几个地区的性能明显优于目前的匹配框架。我们还演示了使用两阶段的训练设置(即预训练和微调)在捕获特定区域的信息方面的有效性。此外,我们提供了在工业环境中使用该模型的各种挑战和解决方案,并概述了其对电子商务引擎的贡献。 code 0
ASPIRE: Air Shipping Recommendation for E-commerce Products via Causal Inference Framework Abhirup Mondal, Anirban Majumder, Vineet Chaoji Amazon, Bengaluru, India Speed of delivery is critical for the success of e-commerce platforms. Faster delivery promise to the customer results in increased conversion and revenue. There are typically two mechanisms to control the delivery speed - a) replication of products across warehouses, and b) air-shipping the product. In this paper, we present a machine learning based framework to recommend air-shipping eligibility for products. Specifically, we develop a causal inference framework (referred to as Air Shipping Recommendation or ASPIRE) that balances the trade-off between revenue or conversion and delivery cost to decide whether a product should be shipped via air. We propose a doubly-robust estimation technique followed by an optimization algorithm to determine air eligibility of products and calculate the uplift in revenue and shipping cost. We ran extensive experiments (both offline and online) to demonstrate the superiority of our technique as compared to the incumbent policies and baseline approaches. ASPIRE resulted in a lift of +79 bps of revenue as measured through an A/B experiment in an emerging marketplace on Amazon. 交付速度对电子商务平台的成功至关重要。更快的交付承诺给客户的结果增加转换和收入。通常有两种机制来控制交付速度: a)在仓库之间复制产品,b)空运产品。在本文中,我们提出了一个基于机器学习的框架来推荐产品的空运资格。具体来说,我们开发了一个因果推理框架(称为航空运输建议书或 ASPIRE) ,平衡收入或转换和交付成本之间的权衡,以决定是否应该通过空运运输产品。我们提出了一个双稳健估计技术和一个优化算法来确定产品的空气合格性,并计算收入和运输成本的提高。我们进行了大量的实验(线下和线上) ,以证明我们的技术相对于现有的策略和基线方法的优越性。通过在亚马逊新兴市场的 A/B 实验,ASPIRE 的收入提高了79个基点。 code 0
Improving Relevance Modeling via Heterogeneous Behavior Graph Learning in Bing Ads Bochen Pang, Chaozhuo Li, Yuming Liu, Jianxun Lian, Jianan Zhao, Hao Sun, Weiwei Deng, Xing Xie, Qi Zhang Microsoft, Beijing, China; Microsoft Research Asia, Beijing, China; University of Notre Dame, Indiana, IN, USA As the fundamental basis of sponsored search, relevance modeling measures the closeness between the input queries and the candidate ads. Conventional relevance models solely rely on the textual data, which suffer from the scarce semantic signals within the short queries. Recently, user historical click behaviors are incorporated in the format of click graphs to provide additional correlations beyond pure textual semantics, which contributes to advancing the relevance modeling performance. However, user behaviors are usually arbitrary and unpredictable, leading to the noisy and sparse graph topology. In addition, there exist other types of user behaviors besides clicks, which may also provide complementary information. In this paper, we study the novel problem of heterogeneous behavior graph learning to facilitate relevance modeling task. Our motivation lies in learning an optimal and task-relevant heterogeneous behavior graph consisting of multiple types of user behaviors. We further propose a novel HBGLR model to learn the behavior graph structure by mining the sophisticated correlations between node semantics and graph topology, and encode the textual semantics and structural heterogeneity into the learned representations. Our proposal is evaluated over real-world industry datasets, and has been mainstreamed in the Bing ads. Both offline and online experimental results demonstrate its superiority. 作为赞助商搜索的基础,相关性建模测量了输入查询和候选广告之间的密切程度。传统的关联模型仅仅依赖于文本数据,而文本数据受到短查询中语义信号稀缺的影响。近年来,用户的历史点击行为被整合到点击图的格式中,提供了超越纯文本语义的额外相关性,这有助于提高相关性建模的性能。然而,用户行为通常是任意和不可预测的,导致噪声和稀疏图拓扑。此外,除了点击之外,还存在其他类型的用户行为,这些行为也可能提供补充信息。本文研究了异构行为图学习的新问题,以促进相关建模任务的完成。我们的动机在于学习一个由多种类型的用户行为组成的最优的和与任务相关的异构行为图。我们进一步提出了一种新的 HBGLR 模型,通过挖掘节点语义和图拓扑之间复杂的相关性来学习行为图结构,并将文本语义和结构异质性编码到所学习的表示中。我们的建议是评估在现实世界的行业数据集,并已成为主流的必应广告。离线和在线实验结果都证明了该方法的优越性。 code 0
Type Linking for Query Understanding and Semantic Search Giorgos Stoilos, Nikos Papasarantopoulos, Pavlos Vougiouklis, Patrik Bansky Huawei Technologies, Edinburgh, United Kingdom Huawei is currently undertaking an effort to build map and web search services using query understanding and semantic search techniques. We present our efforts to built a low-latency type mention detection and linking service for map search. In addition to latency challenges, we only had access to low quality and biased training data plus we had to support 13 languages. Consequently, our service is based mostly on unsupervised term- and vector-based methods. Nevertheless, we trained a Transformer-based query tagger which we integrated with the rest of the pipeline using a reward and penalisation approach. We present techniques that we designed in order to address challenges with the type dictionary, incompatibilities in scoring between the term-based and vector-based methods as well as over-segmentation issues in Thai, Chinese, and Japanese. We have evaluated our approach on the Huawei map search use case as well as on community Question Answering benchmarks. 华为目前正致力于利用查询理解和语义搜索技术建立地图和网络搜索服务。我们介绍了我们的努力,建立一个低延迟类型提及检测和地图搜索链接服务。除了延迟挑战,我们只能访问低质量和有偏见的培训数据,加上我们必须支持13种语言。因此,我们的服务主要是基于无监督的术语和向量方法。尽管如此,我们还是训练了一个基于 Transformer 的查询标记器,并使用奖励和惩罚方法将其与管道的其他部分集成在一起。我们提出的技术,我们设计的目的是为了解决类型字典的挑战,在评分之间的基于术语和基于向量的方法以及在泰国,中国和日本的过分割问题。我们评估了华为地图搜索用例和社区问答基准的方法。 code 0
Combo-Fashion: Fashion Clothes Matching CTR Prediction with Item History Chenxu Zhu, Peng Du, Weinan Zhang, Yong Yu, Yang Cao Alibaba Group, Hangzhou, China; Shanghai Jiao Tong University, Shanghai, China As one of the fundamental trends for future development of recommender systems, Fashion Clothes Matching Recommendation for click-through rate (CTR) prediction has become an increasingly essential task. Unlike traditional single-item recommendation, a combo item, composed of a top item (e.g. a shirt) and a bottom item (e.g. a skirt), is recommended. In such a task, the matching effect between these two single items plays a crucial role, and greatly influences the users' preferences; however, it is usually neglected by previous approaches in CTR prediction. In this work, we tackle this problem by designing a novel algorithm called Combo-Fashion, which extracts the matching effect by introducing the matching history of the combo item with two cascaded modules: (i) Matching Search Module (MSM) seeks the popular combo items and undesirable ones as a positive set and a negative set, respectively; (ii) Matching Prediction Module (MPM) models the precise relationship between the candidate combo item and the positive/negative set by an attention-based deep model. Besides, the CPM Fashion Attribute, considered from characteristic, pattern and material, is applied to capture the matching effect further. As part of this work, we release two large-scale datasets consisting of 3.56 million and 6.01 million user behaviors with rich context and fashion information in millions of combo items. The experimental results over these two real-world datasets have demonstrated the superiority of our proposed model with significant improvements. Furthermore, we have deployed Combo-Fashion onto the platform of Taobao to recommend the combo items to the users, where an 8-day online A/B test proved the effectiveness of Combo-Fashion with an improvement of pCTR by 1.02% and uCTR by 0.70%. 作为推荐系统未来发展的基本趋势之一,服装搭配推荐系统的点进率预测已经成为一项日益重要的任务。不同于传统的单一项目推荐,一个组合项目,组成的顶部项目(如衬衫)和底部项目(如裙子),是推荐的。在这样一个任务中,这两个项目之间的匹配效果起着至关重要的作用,并且对用户的偏好有很大的影响,但是在以往的 CTR 预测方法中往往忽略了这一点。针对这一问题,本文设计了一种新的组合时尚算法,该算法通过引入组合项目的匹配历史来提取匹配效果,该算法由两个级联模块组成: (1)匹配搜索模块(MSM)分别将流行的组合项目和不受欢迎的组合项目作为一个正集和一个负集来搜索; (2)匹配预测模块(MPM)通过基于注意的深度模型来建立候选组合项目与正/负集之间的精确关系。此外,从特征、图案和材质三个方面考虑,运用 CPM 时尚属性进一步捕捉匹配效果。作为这项工作的一部分,我们发布了两个大型数据集,包括356万和601万用户行为,其中包含数百万个组合项目的丰富上下文和时尚信息。在这两个实际数据集上的实验结果显示了我们提出的模型的优越性,并有显著的改进。此外,我们还在淘宝平台上部署了 Combo-Fashion,向用户推荐组合项目,通过8天的在线 A/B 测试证明了 Combo-Fashion 的有效性,pCTR 提高了1.02%,uCTR 提高了0.70%。 code 0
Reward Optimizing Recommendation using Deep Learning and Fast Maximum Inner Product Search Imad Aouali, Amine Benhalloum, Martin Bompaire, Achraf Ait Sidi Hammou, Sergey Ivanov, Benjamin Heymann, David Rohde, Otmane Sakhi, Flavian Vasile, Maxime Vono Criteo, Paris, France How can we build and optimize a recommender system that must rapidly fill slates (i.e. banners) of personalized recommendations? The combination of deep learning stacks with fast maximum inner product search (MIPS) algorithms have shown it is possible to deploy flexible models in production that can rapidly deliver personalized recommendations to users. Albeit promising, this methodology is unfortunately not sufficient to build a recommender system which maximizes the reward, e.g. the probability of click. Usually instead a proxy loss is optimized and A/B testing is used to test if the system actually improved performance. This tutorial takes participants through the necessary steps to model the reward and directly optimize the reward of recommendation engines built upon fast search algorithms to produce high-performance reward-optimizing recommender systems. 我们如何构建和优化一个必须快速填充个性化推荐板块(即横幅)的推荐系统?深度学习栈与快速最大内积搜索(MIPS)算法的结合表明,在生产中部署灵活的模型可以迅速向用户提供个性化的建议。尽管这种方法很有前途,但不幸的是,它不足以建立一个最大化回报的推荐系统,例如点击的概率。通常的做法是优化一个代理损失,并使用 A/B 测试检验系统是否真正提高了性能。本教程将带领参与者通过必要的步骤来建立奖励模型,并直接优化建立在快速搜索算法基础上的推荐引擎的奖励,从而产生高性能的奖励优化推荐系统。 code 0
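The MIPS primitive this tutorial builds on is easy to pin down exactly: retrieve the items whose inner product with the user vector is largest. The brute-force numpy version below defines the target that approximate indexes are measured against (vectors and sizes here are toy values).

```python
import numpy as np

def mips(user_vec, item_matrix, k=3):
    """Exact (brute-force) maximum inner product search: score every item
    by its inner product with the user vector and return the top k."""
    scores = item_matrix @ user_vec
    topk = np.argsort(-scores)[:k]   # indices of the k largest inner products
    return topk, scores[topk]

items = np.array([[1.0, 0.0], [0.0, 1.0], [2.0, 2.0], [-1.0, 0.5]])
ids, scores = mips(np.array([1.0, 1.0]), items, k=2)
print(ids)  # the slate is filled with the highest inner-product items
```

In production, an approximate nearest-neighbor index typically replaces the exhaustive matrix product; the interface stays the same.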
Low-rank Nonnegative Tensor Decomposition in Hyperbolic Space Bo Hui, WeiShinn Ku Auburn University, Auburn, AL, USA Tensor decomposition aims to factorize an input tensor into a number of latent factors. Due to the low-rank nature of tensor in real applications, the latent factors can be used to perform tensor completion in numerous tasks, such as knowledge graph completion and timely recommendation. However, existing works solve the problem in Euclidean space, where the tensor is decomposed into Euclidean vectors. Recent studies show that hyperbolic space is roomier than Euclidean space. With the same dimension, a hyperbolic vector can represent richer information (e.g., hierarchical structure) than a Euclidean vector. In this paper, we propose to decompose tensor in hyperbolic space. Considering that the most popular optimization tools (e.g., SGD, Adam) have not been generalized in hyperbolic space, we design an adaptive optimization algorithm according to the distinctive property of hyperbolic manifold. To address the non-convex property of the problem, we adopt gradient ascent in our optimization algorithm to avoid getting trapped in local optimal landscapes. We conduct experiments on various tensor completion tasks and the result validates the superiority of our method over these baselines that solve the problem in Euclidean space. 张量分解旨在将输入张量分解为若干潜在因子。由于张量在实际应用中的低秩特性,潜在因子可以用来完成许多任务,如知识图的完成和及时推荐。然而,现有的工作解决了欧氏空间中的问题,其中张量分解成欧氏向量。最近的研究表明双曲空间比欧几里得空间更宽敞。在相同的维度下,双曲向量可以比欧氏向量表示更丰富的信息(例如,层次结构)。在这篇文章中,我们提出在双曲空间中分解张量。考虑到最流行的优化工具(例如,SGD,Adam)还没有在双曲空间中推广,我们根据双曲流形的独特性质设计了一个自适应优化算法。为了解决该问题的非凸性,在优化算法中采用了梯度上升的方法,以避免陷入局部最优景观中。我们对各种张量完成任务进行了实验,实验结果验证了该方法相对于这些基线解决欧氏空间问题的优越性。 code 0
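For orientation, the Euclidean rank-R CP decomposition that this paper moves into hyperbolic space reconstructs a tensor as T[i,j,k] = Σ_r A[i,r]·B[j,r]·C[k,r]; the snippet below shows only that reference formulation (the paper's contribution, the hyperbolic factors and manifold-aware optimizer, is not sketched here).

```python
import numpy as np

def cp_reconstruct(A, B, C):
    """Rebuild a 3-way tensor from rank-R CP factor matrices:
    T[i,j,k] = sum_r A[i,r] * B[j,r] * C[k,r]."""
    return np.einsum('ir,jr,kr->ijk', A, B, C)

rng = np.random.default_rng(0)
R = 2
A, B, C = (np.abs(rng.normal(size=(n, R))) for n in (4, 5, 6))  # nonnegative factors
T = cp_reconstruct(A, B, C)
# a low-rank tensor lets us "complete" any entry from the shared factors alone
i, j, k = 1, 2, 3
assert np.isclose(T[i, j, k], A[i] @ (B[j] * C[k]))
```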
Personalized Chit-Chat Generation for Recommendation Using External Chat Corpora Changyu Chen, Xiting Wang, Xiaoyuan Yi, Fangzhao Wu, Xing Xie, Rui Yan Microsoft Research Asia, Beijing, China; Renmin University of China, Beijing, China Chit-chat has been shown effective in engaging users in human-computer interaction. We find with a user study that generating appropriate chit-chat for news articles can help expand user interest and increase the probability that a user reads a recommended news article. Based on this observation, we propose a method to generate personalized chit-chat for news recommendation. Different from existing methods for personalized text generation, our method only requires an external chat corpus obtained from an online forum, which can be disconnected from the recommendation dataset from both the user and item (news) perspectives. This is achieved by designing a weak supervision method for estimating users' personalized interest in a chit-chat post by transferring knowledge learned by a news recommendation model. Based on the method for estimating user interest, a reinforcement learning framework is proposed to generate personalized chit-chat. Extensive experiments, including the automatic offline evaluation and user studies, demonstrate the effectiveness of our method. 聊天已被证明能有效地吸引用户参与人机交互。我们通过用户研究发现,为新闻文章产生适当的闲聊可以帮助扩大用户的兴趣,并增加用户阅读推荐新闻文章的可能性。在此基础上,本文提出了一种新闻推荐个性化聊天的生成方法。与现有的个性化文本生成方法不同,该方法只需要一个从在线论坛获得的外部聊天语料库,该语料库可以从用户和项目(新闻)的角度与推荐数据集分离。这是通过设计一种弱监督方法,通过传递新闻推荐模型中学到的知识来估计用户在闲聊帖子中的个性化兴趣来实现的。基于评估用户兴趣的方法,提出了一个强化学习框架来生成个性化的聊天。广泛的实验,包括自动离线评估和用户研究,证明了我们的方法的有效性。 code 0
G2NET: A General Geography-Aware Representation Network for Hotel Search Ranking Jia Xu, Fei Xiong, Zulong Chen, Mingyuan Tao, Liangyue Li, Quan Lu Guangxi University, Nanning, China; Alibaba Group, Hangzhou, China Hotel search ranking is the core function of Online Travel Platforms (OTPs), while geography information of location entities involved in it plays a critically important role in guaranteeing its ranking quality. The closest line of works to the hotel search ranking problem is thus the next POI (or location) recommendation problem, which has extensive works but fails to cope with two new challenges, i.e., consideration of two more location entities and effective utilization of geographical information, in a hotel search ranking scenario. To this end, we propose a General Geography-aware representation NETwork (G2NET for short) to better represent geography information of location entities so as to optimize the hotel search ranking. In G2NET, to address the first challenge, we first propose the concept of Geography Interaction Schema (GIS) which is a meta template for representing the arbitrary number of location entity types and their interactions. Then, a novel geography interaction encoder is devised providing general representation ability for an instance of GIS, followed by an attentive operation that aggregates representations of instances corresponding to all historically interacted hotels of a user in a weighted manner. The second challenge is handled by the combined application of three proposed geography embedding modules in G2NET, each of which focuses on computing embeddings of location entities based on a certain aspect of geographical information of location entities. Moreover, a self-attention layer is deployed in G2NET, to capture correlations among historically interacted hotels of a user which provides non-trivial functionality of understanding the user's behaviors. Both offline and online experiments show that G2NET outperforms the state-of-the-art methods. G2NET has now been successfully deployed to provide the high-quality hotel search ranking service at Fliggy, one of the most popular OTPs in China, serving tens of millions of users. 酒店搜索排名是在线旅游平台(OTP)的核心功能,而位置实体的地理信息对于保证其排名质量起着至关重要的作用。因此,与酒店搜索排名问题最接近的工作是下一个 POI (或位置)推荐问题,这个问题有大量的工作,但未能应对两个新的挑战,即在一个酒店搜索排名场景中考虑另外两个位置实体和有效利用地理信息。为此,我们提出了一个通用地理感知表示网络(G2NET),以更好地表示位置实体的地理信息,从而优化酒店搜索排名。在 G2NET 中,为了应对第一个挑战,我们首先提出了地理交互模式(GIS)的概念,它是一个元模板,用于表示任意数量的位置实体类型及其交互。然后,设计了一种新颖的地理交互编码器,提供了 GIS 实例的一般表示能力,然后进行了注意操作,以加权的方式聚合了对应于用户的所有历史交互酒店的实例表示。第二个挑战是通过在 G2NET 中联合应用三个地理嵌入模块来解决,每个模块的重点都是基于位置实体的地理信息的某一方面来计算位置实体的嵌入。此外,在 G2NET 中部署了一个自我关注层,以捕获用户在历史上交互的酒店之间的相关性,从而提供了理解用户行为的重要功能。离线和在线实验都表明,G2NET 的性能优于最先进的方法。目前,G2NET 已成功部署到中国最受欢迎的在线旅游平台之一 Fliggy,为数千万用户提供高质量的酒店搜索排名服务。 code 0
Avoiding Biases due to Similarity Assumptions in Node Embeddings Deepayan Chakrabarti University of Texas at Austin, Austin, TX, USA Node embeddings are vectors, one per node, that capture a graph's structure. The basic structure is the adjacency matrix of the graph. Recent methods also make assumptions about the similarity of unlinked nodes. However, such assumptions can lead to unintentional but systematic biases against groups of nodes. Calculating similarities between far-off nodes is also difficult under privacy constraints and in dynamic graphs. Our proposed embedding, called NEWS, makes no similarity assumptions, avoiding potential risks to privacy and fairness. NEWS is parameter-free, enables fast link prediction, and has linear complexity. These gains from avoiding assumptions do not significantly affect accuracy, as we show via comparisons against several existing methods on 21 real-world networks. Code is available at https://github.com/deepayan12/news. 节点嵌入是向量,每个节点一个,它捕获图的结构。基本结构是图形的邻接矩阵。最近的方法也对未链接节点的相似性做了假设。然而,这样的假设可能会导致对节点群的无意的但是系统性的偏见。在隐私约束和动态图中,计算远程节点之间的相似度也很困难。我们提出的嵌入,称为 NEWS,没有相似的假设,避免了隐私和公平的潜在风险。NEWS 是无参数的,支持快速链路预测,具有线性复杂度。这些从避免假设中获得的收益不会显著影响准确性,正如我们通过与现有的几种方法在 21 个真实世界网络上的比较所显示的。代码可于 https://github.com/deepayan12/news 获取。 code 0
Task-optimized User Clustering based on Mobile App Usage for Cold-start Recommendations Bulou Liu, Bing Bai, Weibang Xie, Yiwen Guo, Hao Chen University of California, Davis, Davis, CA, USA; Independent Researcher, Beijing, China; Tencent Security Big Data Lab, Beijing, China; Tencent Inc., Guangzhou, China This paper reports our recent practice of recommending articles to cold-start users at Tencent. Transferring knowledge from information-rich domains to help user modeling is an effective way to address the user-side cold-start problem. Our previous work demonstrated that general-purpose user embeddings based on mobile app usage helped article recommendations. However, high-dimensional embeddings are cumbersome for online usage, thus limiting the adoption. On the other hand, user clustering, which partitions users into several groups, can provide a lightweight, online-friendly, and explainable way to help recommendations. Effective user clustering for article recommendations based on mobile app usage faces unique challenges, including (1) the gap between an active user's behavior of mobile app usage and article reading, and (2) the gap between mobile app usage patterns of active and cold-start users. To address the challenges, we propose a tailored Dual Alignment User Clustering (DAUC) model, which applies a sample-wise contrastive alignment to eliminate the gap between active users' mobile app usage and article reading behavior, and a distribution-wise adversarial alignment to eliminate the gap between active users' and cold-start users' app usage behavior. With DAUC, cold-start recommendation-optimized user clustering based on mobile app usage can be achieved. On top of the user clusters, we further build candidate generation strategies, real-time features, and corresponding ranking models without much engineering difficulty. Both online and offline experiments demonstrate the effectiveness of our work. 本文报道了我们最近向腾讯的冷启动用户推荐文章的做法。从信息丰富的领域转移知识以帮助用户建模是解决用户端冷启动问题的有效途径。我们以前的工作表明,基于移动应用程序使用的通用用户嵌入有助于文章推荐。但是,高维嵌入对于在线使用来说很麻烦,因此限制了采用。另一方面,用户集群(将用户划分为几个组)可以提供一种轻量级的、在线友好的、可解释的方式来帮助推荐。基于移动应用使用的文章推荐的有效用户聚类面临独特的挑战,包括(1)活跃用户的移动应用使用行为和文章阅读之间的差距,以及(2)活跃用户和冷启动用户的移动应用使用模式之间的差距。为了应对这些挑战,我们提出了一个定制的双对齐用户聚类(DAUC)模型,该模型应用样本对比对齐来消除活跃用户的移动应用程序使用和文章阅读行为之间的差距,以及分布式对抗对齐来消除活跃用户和冷启动用户的应用程序使用行为之间的差距。利用 DAUC,可以实现基于移动应用使用情况的冷启动推荐优化用户聚类。在用户集群的基础上,我们进一步构建了候选生成策略、实时特征以及相应的排序模型,这些都不需要很大的工程难度。这两个在线和离线实验都证明了我们工作的有效性。 code 0
Promotheus: An End-to-End Machine Learning Framework for Optimizing Markdown in Online Fashion E-commerce Eleanor Loh, Jalaj Khandelwal, Brian Regan, Duncan A. Little ASOS.com, London, United Kingdom Managing discount promotional events ("markdown") is a significant part of running an e-commerce business, and inefficiencies here can significantly hamper a retailer's profitability. Traditional approaches for tackling this problem rely heavily on price elasticity modelling. However, the partial information nature of price elasticity modelling, together with the non-negotiable responsibility for protecting profitability, mean that machine learning practitioners must often go through great lengths to define strategies for measuring offline model quality. In the face of this, many retailers fall back on rule-based methods, thus forgoing significant gains in profitability that can be captured by machine learning. In this paper, we introduce two novel end-to-end markdown management systems for optimising markdown at different stages of a retailer's journey. The first system, "Ithax," enacts a rational supply-side pricing strategy without demand estimation, and can be usefully deployed as a "cold start" solution to collect markdown data while maintaining revenue control. The second system, "Promotheus," presents a full framework for markdown optimization with price elasticity. We describe in detail the specific modelling and validation procedures that, within our experience, have been crucial to building a system that performs robustly in the real world. Both markdown systems achieve superior profitability compared to decisions made by our experienced operations teams in a controlled online test, with improvements of 86% (Promotheus) and 79% (Ithax) relative to manual strategies. These systems have been deployed to manage markdown at ASOS.com, and both systems can be fruitfully deployed for price optimization across a wide variety of retail e-commerce settings. 管理折扣促销活动(“降价”)是经营电子商务业务的一个重要组成部分,这里的低效率会严重阻碍零售商的盈利能力。解决这一问题的传统方法在很大程度上依赖于价格弹性模型。然而,价格弹性建模的部分信息性质,加上保护盈利能力的不可协商的责任,意味着机器学习从业人员必须经常花费大量的时间来确定衡量离线模型质量的策略。面对这种情况,许多零售商退回到基于规则的方法,因此放弃了可以通过机器学习获得的利润率的显著增长。在本文中,我们介绍了两个新颖的端到端降价管理系统优化降价在不同阶段的零售商的旅程。第一个系统,“ Ithax”,制定了一个合理的供应侧定价策略,没有需求估计,可以作为一个有用的“冷启动”解决方案,收集降价数据,同时保持收入控制。第二个系统,“ Promotheus”,提出了一个完整的框架降价优化与价格弹性。我们详细描述了具体的建模和验证程序,根据我们的经验,这些程序对于建立一个在现实世界中运行良好的系统至关重要。与我们经验丰富的运营团队在受控的在线测试中做出的决策相比,这两种降价系统都实现了更高的盈利能力,相对于手工策略,降价系统的改进率分别为86% (Promotheus)和79% (Ithax)。这些系统已经部署到 ASOS.com 管理降价,这两个系统都可以在各种零售电子商务环境中进行价格优化,从而取得丰硕成果。 code 0
Scalar is Not Enough: Vectorization-based Unbiased Learning to Rank Mouxiang Chen, Chenghao Liu, Zemin Liu, Jianling Sun Salesforce Research Area, Singapore, Singapore; Zhejiang University & Alibaba-Zhejiang University Joint Institute of Frontier Technologies, Hangzhou, China; Singapore Management University, Singapore, Singapore Unbiased learning to rank (ULTR) aims to train an unbiased ranking model from biased user click logs. Most of the current ULTR methods are based on the examination hypothesis (EH), which assumes that the click probability can be factorized into two scalar functions, one related to ranking features and the other related to bias factors. Unfortunately, the interactions among features, bias factors and clicks are complicated in practice, and usually cannot be factorized in this independent way. Fitting click data with EH could lead to model misspecification and bring the approximation error. In this paper, we propose a vector-based EH and formulate the click probability as a dot product of two vector functions. This solution is complete due to its universality in fitting arbitrary click functions. Based on it, we propose a novel model named Vectorization to adaptively learn the relevance embeddings and sort documents by projecting embeddings onto a base vector. Extensive experiments show that our method significantly outperforms the state-of-the-art ULTR methods on complex real clicks as well as simple simulated clicks. 无偏学习排名(ULTR)的目的是从有偏见的用户点击日志中训练一个无偏见的排名模型。目前的 ULTR 方法大多基于检验假设(EH) ,假设点击概率可以分解为两个标量函数,一个与排序特征有关,另一个与偏差因子有关。遗憾的是,特征、偏差因素和点击之间的相互作用在实践中是复杂的,通常不能以这种独立的方式进行因子分解。将 click 数据与 EH 进行匹配可能导致模型错误说明,并带来逼近误差。本文提出了一种基于向量的 EH,并将点击概率表示为两个向量函数的点乘。该解决方案是完整的,因为它在拟合任意点击函数的通用性。在此基础上,提出了一种新的向量化模型,通过向基向量投影来自适应地学习相关嵌入和排序文档。大量的实验表明,我们的方法在复杂的真实点击和简单的模拟点击方面明显优于最先进的 ULTR 方法。 code 0
Efficient Approximate Algorithms for Empirical Variance with Hashed Block Sampling Xingguang Chen, Fangyuan Zhang, Sibo Wang The Chinese University of Hong Kong, Hong Kong, China Empirical variance is a fundamental concept widely used in data management and data analytics, e.g., query optimization, approximate query processing, and feature selection. A direct solution to derive the empirical variance is scanning the whole data table, which is expensive when the data size is huge. Hence, most current works focus on approximate answers by sampling. For results with approximation guarantees, the samples usually need to be uniformly independent random, incurring high cache miss rates especially in compact columnar style layouts. An alternative uses block sampling to avoid this issue, which directly samples a block of consecutive records fitting page sizes instead of sampling one record each time. However, this provides no theoretical guarantee. Existing studies show that the practical estimations can be inaccurate as the records within a block can be correlated. Motivated by this, we investigate how to provide approximation guarantees for empirical variances with block sampling from a theoretical perspective. Our results shows that if the records stored in a table are 4-wise independent to each other according to keys, a slightly modified block sampling can provide the same approximation guarantee with the same asymptotic sampling cost as that of independent random sampling. In practice, storing records via hash clusters or hash organized tables are typical scenarios in modern commercial database systems. Thus, for data analysis on tables in the data lake or OLAP stores that are exported from such hash-based storage, our strategy can be easily integrated to improve the sampling efficiency. Based on our sampling strategy, we present an approximate algorithm for empirical variance and an approximate top-k algorithm to return the k columns with the highest empirical variance scores. Extensive experiments show that our solutions outperform existing solutions by up to an order of magnitude. 经验方差是数据管理和数据分析中广泛使用的一个基本概念,如查询优化、近似查询处理和特征选择。推导经验方差的直接方法是对整个数据表进行扫描,当数据量很大时,扫描成本很高。因此,目前大多数的工作集中在抽样近似答案。对于具有近似保证的结果,样本通常需要是一致独立的随机的,特别是在紧凑的柱状样式布局中,会导致高缓存错过率。另一种方法是使用块抽样来避免这个问题,即直接抽样一个连续的记录块来适应页面大小,而不是每次抽样一个记录。然而,这并不能提供理论上的保证。现有的研究表明,实际的估计可能是不准确的,因为一个区块内的记录可以相关。在此基础上,我们从理论的角度研究了如何为区组抽样的经验方差提供近似保证。结果表明,如果存储在表中的记录按键相互独立,稍加修改的块抽样可以提供与独立随机抽样相同的渐近抽样代价的近似保证。在实践中,通过散列集群或散列组织表存储记录是现代商业数据库系统中的典型场景。因此,对于从这种基于散列的存储器导出的数据湖或 OLAP 存储器中的表的数据分析,可以很容易地将我们的策略集成起来以提高采样效率。基于我们的抽样策略,我们提出了一个经验方差的近似算法和一个近似 top-k 算法来返回经验方差得分最高的 k 列。大量的实验表明,我们的解决方案比现有解决方案的性能高出一个数量级。 code 0
Towards a Native Quantum Paradigm for Graph Representation Learning: A Sampling-based Recurrent Embedding Approach Ge Yan, Yehui Tang, Junchi Yan Shanghai Jiao Tong University, Shanghai, China Graph representation learning has been extensively studied, and recent models can well incorporate both node features and graph structures. Despite these progress, the inherent scalability challenge for classical computers of processing graph data and solving the downstream tasks (many are NP-hard) is still a bottleneck for existing classical graph learning models. On the other hand, quantum computing is known a promising direction for its theoretically verified scalability as well as the increasing evidence for the access to physical quantum machine in near-term. Different from many existing classical-quantum hybrid machine learning models on graphs, in this paper we take a more aggressive initiative for developing a native quantum paradigm for (attributed) graph representation learning, which to our best knowledge, has not been fulfilled in literature yet. Specifically, our model adopts the well-established theory and technique in quantum computing e.g. quantum random walk, and adapt it to the attributed graph. Then the node attribute quantum state sequence is fed into a quantum recurrent network to obtain the final node embedding. Experimental results on three public datasets show the effectiveness of our quantum model which also outperforms a classical learning approach GraphRNA notably in terms of efficiency even on a classical computer. Though it is still restricted to the classical loss-based learning paradigm with gradient descent for model parameter training, while our computing scheme is compatible with quantum computing without involving classical computers. This is in fact largely in contrast to many hybrid quantum graph learning models which often involve many steps and modules having to be performed on classical computers. 图表示学习已经得到了广泛的研究,现有的模型能够很好地结合节点特征和图结构。尽管取得了这些进展,经典计算机在处理图形数据和解决下游任务(许多是 NP 难的)方面固有的可伸缩性挑战仍然是现有经典图形学习模型的瓶颈。另一方面,量子计算因其在理论上被证实的可扩展性以及近期越来越多的物理量子计算的证据而被认为是一个有前途的方向。与许多现有的经典-量子混合机器学习模型不同,本文采取了更积极的主动性,开发了一个本土的量子范式(属性)图表示学习,据我们所知,这尚未在文献中得到实现。具体地说,我们的模型采用了量子计算中已经成熟的理论和技术,例如量子随机游走,并将其适用于属性图。然后将节点属性量子状态序列输入到量子递归网络中,得到最终的节点嵌入。在三个公共数据集上的实验结果表明了量子模型的有效性,即使在经典的计算机上,量子模型的效率也明显优于经典的学习方法 GraphRNA。虽然我们的计算机系统仍然局限于传统的以损失为基础的学习范式,并且只有模型参数训练的梯度下降法,但我们的计算机系统可以与量子计算兼容,而不需要使用传统的计算机。这实际上在很大程度上与许多混合量子图学习模型形成对比,这些模型通常涉及许多步骤和模块,必须在经典计算机上执行。 code 0
Toward Real-life Dialogue State Tracking Involving Negative Feedback Utterances Puhai Yang, Heyan Huang, Wei Wei, XianLing Mao Huazhong University of Science and Technology, Wuhan, China; Beijing Institute of Technology, Beijing, China; Beijing Institute of Technology & Beijing Engineering Research Center of High Volume Language Information Processing and Cloud Computing Applications, Beijing, China Recently, the research of dialogue systems has been widely concerned, especially task-oriented dialogue systems, which have received increased attention due to their wide application prospect. As a core component, dialogue state tracking (DST) plays a key role in task-oriented dialogue systems, and its function is to parse natural language dialogues into dialogue state formed by slot-value pairs. It is well known that dialogue state tracking has been well studied and explored on current benchmark datasets such as the MultiWOZ. However, almost all current research completely ignores the user negative feedback utterances that exist in real-life conversations when a system error occurs, which often contains user-provided corrective information for the system error. Obviously, user negative feedback utterances can be used to correct the inevitable errors in automatic speech recognition and model generalization. Thus, in this paper, we will explore the role of negative feedback utterances in dialogue state tracking in detail through simulated negative feedback utterances. Specifically, due to the lack of dataset involving negative feedback utterances, first, we have to define the schema of user negative feedback utterances and propose a joint modeling method for feedback utterance generation and filtering. Then, we explore three aspects of interaction mechanism that should be considered in real-life conversations involving negative feedback utterances and propose evaluation metrics related to negative feedback utterances. Finally, on WOZ2.0 and MultiWOZ2.1 datasets, by constructing simulated negative feedback utterances in training and testing, we not only verify the important role of negative feedback utterances in dialogue state tracking, but also analyze the advantages and disadvantages of different interaction mechanisms involving negative feedback utterances, lighting future research on negative feedback utterances. 近年来,对话系统的研究受到了广泛的关注,尤其是面向任务的对话系统,由于其广阔的应用前景而受到越来越多的关注。对话状态跟踪(DST)是任务导向对话系统的核心组成部分,其功能是将自然语言对话解析为由插槽值对形成的对话状态。众所周知,对话状态跟踪已经在当前的基准数据集(如 MultiWOZ)上得到了很好的研究和探索。然而,目前几乎所有的研究都完全忽视了系统错误发生时用户在现实交谈中的负面反馈语,其中往往包含用户提供的系统错误纠正信息。显然,用户负反馈话语可以用来纠正语音自动识别和模型推广中不可避免的错误。因此,本文将通过模拟负反馈话语来详细探讨负反馈话语在对话状态跟踪中的作用。具体来说,由于缺乏涉及负反馈话语的数据集,首先,我们必须定义用户负反馈话语的模式,并提出一种联合建模的方法来生成和过滤反馈话语。然后,从三个方面探讨了负反馈话语在现实会话中应该考虑的互动机制,并提出了与负反馈话语相关的评价指标。最后,在 WOZ2.0和 MultiWOZ2.1数据集上,通过构建训练和测试中的模拟负反馈话语,不仅验证了负反馈话语在对话状态跟踪中的重要作用,而且分析了负反馈话语不同交互机制的优缺点,为进一步研究负反馈话语提供参考。 code 0
M-Mix: Generating Hard Negatives via Multi-sample Mixing for Contrastive Learning Shaofeng Zhang, Meng Liu, Junchi Yan, Hengrui Zhang, Lingxiao Huang, Xiaokang Yang, Pinyan Lu Huawei TCS Lab, Shanghai, China; Shanghai University of Finance and Economics & Huawei TCS Lab, Shanghai, China; Shanghai Jiao Tong University, Shanghai, China Negative pairs, especially hard negatives as combined with common negatives (easy to discriminate), are essential in contrastive learning, which plays a role of avoiding degenerate solutions in the sense of constant representation across different instances. Inspired by recent hard negative mining methods via pairwise mixup operation in vision, we propose M-Mix, which dynamically generates a sequence of hard negatives. Compared with previous methods, M-Mix mainly has three features: 1) adaptively choose samples to mix; 2) simultaneously mix multiple samples; 3) automatically assign different mixing weights to the selected samples. We evaluate our method on two image datasets (CIFAR-10, CIFAR-100), five node classification datasets (PPI, DBLP, Pubmed, etc), five graph classification datasets (IMDB, PTC_MR, etc), and two downstream combinatorial tasks (graph edit distance and node clustering). Results show that it achieves state-of-the-art performance under self-supervised settings. Code is available at: https://github.com/Sherrylone/m-mix. 否定对,尤其是硬否定与普通否定(容易区分)的结合,在对比学习中是必不可少的,对比学习的作用是避免退化的解决方案在不同情况下的持续表征。受当前视觉硬负片挖掘方法的启发,提出了 M-Mix 算法,该算法动态生成硬负片序列。与以往的混合方法相比,M-Mix 方法主要有三个特点: 1)自适应地选择混合样本; 2)同时混合多个样本; 3)自动分配不同的混合权重给选定的样本。我们在两个图像数据集(CIFAR-10,CIFAR-100) ,五个节点分类数据集(PPI,DBLP,Pubmed 等) ,五个图形分类数据集(IMDB,PTC _ MR 等)和两个下游组合任务(图形编辑距离和节点聚类)上评估我们的方法。结果表明,该算法在自监督设置下达到了最佳性能。代码可于 https://github.com/Sherrylone/m-mix 获取。 code 0
Modeling Persuasion Factor of User Decision for Recommendation Chang Liu, Chen Gao, Yuan Yuan, Chen Bai, Lingrui Luo, Xiaoyi Du, Xinlei Shi, Hengliang Luo, Depeng Jin, Yong Li Meituan Inc., Beijing, China; Tsinghua University, Beijing, China In online information systems, users make decisions based on factors of several specific aspects, such as brand, price, etc. Existing recommendation engines ignore the explicit modeling of these factors, leading to sub-optimal recommendation performance. In this paper, we focus on the real-world scenario where these factors can be explicitly captured (the users are exposed with decision factor-based persuasion texts, i.e., persuasion factors). Although it allows us for explicit modeling of user-decision process, there are critical challenges including the persuasion factor's representation learning and effect estimation, along with the data-sparsity problem. To address them, in this work, we present our POEM (short for Persuasion factOr Effect Modeling) system. We first propose the persuasion-factor graph convolutional layers for encoding and learning representations from the persuasion-aware interaction data. Then we develop a prediction layer that fully considers the user sensitivity to the persuasion factors. Finally, to address the data-sparsity issue, we propose a counterfactual learning-based data augmentation method to enhance the supervision signal. Real-world experiments demonstrate the effectiveness of our proposed framework of modeling the effect of persuasion factors. 在网络信息系统中,用户根据品牌、价格等几个具体方面的因素进行决策。现有的推荐引擎忽略了这些因素的显式建模,导致推荐性能不理想。在本文中,我们关注的是真实世界中这些因素可以被明确地捕获的场景(用户暴露在基于决策因素的说服文本中,即,说服因素)。尽管它允许我们对用户决策过程进行明确的建模,但是仍然存在一些关键的挑战,包括说服因子的表示学习和效果估计,以及数据稀疏问题。为了解决这些问题,在本文中,我们提出了我们的 POEM (劝导因素效果建模的缩写)系统。我们首先提出了说服因子图卷积层,用于从感知说服的交互数据中编码和学习表示。然后我们开发了一个预测层,充分考虑了用户对说服因素的敏感性。最后,针对数据稀疏问题,提出了一种基于反事实学习的数据增强方法来增强监控信号。现实世界的实验证明了我们提出的说服因素效应建模框架的有效性。 code 0
Lion: A GPU-Accelerated Online Serving System for Web-Scale Recommendation at Baidu Hao Liu, Qian Gao, Xiaochao Liao, Guangxing Chen, Hao Xiong, Silin Ren, Guobao Yang, Zhiwei Zha Baidu, Inc., Beijing, China; HKUST(GZ), HKUST, Guangzhou, China Deep Neural Network (DNN) based recommendation systems are widely used in the modern internet industry for a variety of services. However, the rapid expansion of application scenarios and the explosive global internet traffic growth have caused the industry to face increasing challenges to serve the complicated recommendation workflow regarding online recommendation efficiency and compute resource overhead. In this paper, we present a GPU-accelerated online serving system, namely Lion, which consists of the staged event-driven heterogeneous pipeline, unified memory manager, and automatic execution optimizer to handle web-scale traffic in a real-time and cost-effective way. Moreover, Lion provides a heterogeneous template library to enable fast development and migration for diverse in-house web-scale recommendation systems without requiring knowledge of heterogeneous programming. The system is currently deployed at Baidu, supporting over twenty recommendation services, including news feed, short video clips, and the search engine. Extensive experimental studies on five real-world deployed online recommendation services demonstrate the superiority of the proposed GPU-accelerated online serving system. Since launched in early 2020, Lion has answered billions of recommendation requests per day, and has helped Baidu successfully save millions of U.S. dollars in hardware and utility costs per year. 基于深度神经网络(DNN)的推荐系统广泛应用于现代互联网行业的各种服务。然而,应用场景的快速扩展和全球互联网流量的爆炸性增长,使得业界面临着越来越多的挑战,以服务复杂的推荐工作流,包括在线推荐效率和计算资源开销。本文提出了一种基于 GPU 加速的在线服务系统 Lion,该系统由分级事件驱动的异构流水线、统一内存管理器和自动执行优化器组成,能够实时、高效地处理网络流量。此外,Lion 还提供了一个异构模板库,可以在不需要异构编程知识的情况下快速开发和迁移各种内部 Web 规模的推荐系统。该系统目前部署在百度,支持超过20种推荐服务,包括新闻馈送、短视频剪辑和搜索引擎。通过对五个实际部署的在线推荐服务的大量实验研究,证明了所提出的 GPU 加速在线服务系统的优越性。自2020年初推出以来,Lion 每天回应了数十亿的推荐请求,并帮助百度每年成功节省了数百万美元的硬件和公用事业成本。 code 0
CognitionNet: A Collaborative Neural Network for Play Style Discovery in Online Skill Gaming Platform Rukma Talwadker, Surajit Chakrabarty, Aditya Pareek, Tridib Mukherjee, Deepak Saini code 0
FedAttack: Effective and Covert Poisoning Attack on Federated Recommendation via Hard Sampling Chuhan Wu, Fangzhao Wu, Tao Qi, Yongfeng Huang, Xing Xie code 0
Mixture of Virtual-Kernel Experts for Multi-Objective User Profile Modeling Zhenhui Xu, Meng Zhao, Liqun Liu, Lei Xiao, Xiaopeng Zhang, Bifeng Zhang code 0
Embedding Compression with Hashing for Efficient Representation Learning in Large-Scale Graph ChinChia Michael Yeh, Mengting Gu, Yan Zheng, Huiyuan Chen, Javid Ebrahimi, Zhongfang Zhuang, Junpeng Wang, Liang Wang, Wei Zhang code 0
Medical Symptom Detection in Intelligent Pre-Consultation Using Bi-directional Hard-Negative Noise Contrastive Estimation Shiwei Zhang, Jichao Sun, Yu Huang, Xueqi Ding, Yefeng Zheng code 0
User-tag Profile Modeling in Recommendation System via Contrast Weighted Tag Masking Chenxu Zhu, Peng Du, Xianghui Zhu, Weinan Zhang, Yong Yu, Yang Cao code 0
AI for Social Impact: Results from Deployments for Public Health and Conversation Milind Tambe Harvard University & Google Research, Cambridge, MA, USA With the maturing of AI and multiagent systems research, we have a tremendous opportunity to direct these advances towards addressing complex societal problems. I will focus on domains of public health and conservation, and address one key cross-cutting challenge: how to effectively deploy our limited intervention resources in these problem domains. I will present results from work around the globe in using AI for challenges in public health such as Maternal and Child care interventions, HIV prevention, and in conservation such as endangered wildlife protection. Achieving social impact in these domains often requires methodological advances. To that end, I will highlight key research advances in multiagent reasoning and learning, in particular in, restless multiarmed bandits, influence maximization in social networks, computational game theory and decision-focused learning. In pushing this research agenda, our ultimate goal is to facilitate local communities and non-profits to directly benefit from advances in AI tools and techniques 随着人工智能和多智能体系统研究的成熟,我们有一个巨大的机会来指导这些进步,以解决复杂的社会问题。我将侧重于公共卫生和保护领域,并解决一个关键的跨领域挑战: 如何在这些问题领域有效部署我们有限的干预资源。我将介绍全球在利用人工智能应对公共卫生挑战方面的工作成果,如母婴保健干预、艾滋病毒预防以及濒危野生动物保护等方面的工作。要在这些领域产生社会影响,往往需要方法上的进步。为此,我将重点介绍多智能体推理和学习方面的关键研究进展,特别是在不安分的多武装匪徒、社交网络中的影响最大化、计算博弈理论和决策集中学习方面。在推动这一研究议程的过程中,我们的最终目标是促进当地社区和非营利组织直接受益于人工智能工具和技术的进步 code 0
Noisy Interactive Graph Search Qianhao Cong, Jing Tang, Kai Han, Yuming Huang, Lei Chen, Yeow Meng Chee code 0
LinE: Logical Query Reasoning over Hierarchical Knowledge Graphs Zijian Huang, MengFen Chiang, WangChien Lee code 0
Transfer Learning based Search Space Design for Hyperparameter Tuning Yang Li, Yu Shen, Huaijun Jiang, Tianyi Bai, Wentao Zhang, Ce Zhang, Bin Cui code 0
Graph Structural Attack by Perturbing Spectral Distance Lu Lin, Ethan Blaser, Hongning Wang code 0
Practical Counterfactual Policy Learning for Top-K Recommendations Yaxu Liu, JuiNan Yen, BoWen Yuan, Rundong Shi, Peng Yan, ChihJen Lin code 0
FedWalk: Communication Efficient Federated Unsupervised Node Embedding with Differential Privacy Qiying Pan, Yifei Zhu code 0
Fair Ranking as Fair Division: Impact-Based Individual Fairness in Ranking Yuta Saito, Thorsten Joachims code 0
Knowledge Enhanced Search Result Diversification Zhan Su, Zhicheng Dou, Yutao Zhu, JiRong Wen code 0
Self-Supervised Hypergraph Transformer for Recommender Systems Lianghao Xia, Chao Huang, Chuxu Zhang code 0
MetaPTP: An Adaptive Meta-optimized Model for Personalized Spatial Trajectory Prediction Yuan Xu, Jiajie Xu, Jing Zhao, Kai Zheng, An Liu, Lei Zhao, Xiaofang Zhou code 0
Nimble GNN Embedding with Tensor-Train Decomposition Chunxing Yin, Da Zheng, Israt Nisa, Christos Faloutsos, George Karypis, Richard W. Vuduc code 0
MDP2 Forest: A Constrained Continuous Multi-dimensional Policy Optimization Approach for Short-video Recommendation Sizhe Yu, Ziyi Liu, Shixiang Wan, Jia Zheng, Zang Li, Fan Zhou code 0
RCAD: Real-time Collaborative Anomaly Detection System for Mobile Broadband Networks Azza H. Ahmed, Michael A. Riegler, Steven Alexander Hicks, Ahmed Elmokashfi code 0
Generalizable Floorplanner through Corner Block List Representation and Hypergraph Embedding Mohammad Amini, Zhanguang Zhang, Surya Penmetsa, Yingxue Zhang, Jianye Hao, Wulong Liu code 0
Amazon Shop the Look: A Visual Search System for Fashion and Home Ming Du, Arnau Ramisa, Amit Kumar K. C, Sampath Chanda, Mengjiao Wang, Neelakandan Rajesh, Shasha Li, Yingchuan Hu, Tao Zhou, Nagashri Lakshminarayana, Son Tran, Doug Gray code 0
Affective Signals in a Social Media Recommender System Jane DwivediYu, YiChia Wang, Lijing Qin, Cristian CantonFerrer, Alon Y. Halevy code 0
Collaborative Intelligence Orchestration: Inconsistency-Based Fusion of Semi-Supervised Learning and Active Learning Jiannan Guo, Yangyang Kang, Yu Duan, Xiaozhong Liu, Siliang Tang, Wenqiao Zhang, Kun Kuang, Changlong Sun, Fei Wu code 0
Rax: Composable Learning-to-Rank Using JAX Rolf Jagerman, Xuanhui Wang, Honglei Zhuang, Zhen Qin, Michael Bendersky, Marc Najork code 0
AutoFAS: Automatic Feature and Architecture Selection for Pre-Ranking System Xiang Li, Xiaojiang Zhou, Yao Xiao, Peihao Huang, Dayao Chen, Sheng Chen, Yunsen Xian code 0
Duplex Conversation: Towards Human-like Interaction in Spoken Dialogue Systems TingEn Lin, Yuchuan Wu, Fei Huang, Luo Si, Jian Sun, Yongbin Li code 0
Rapid Regression Detection in Software Deployments through Sequential Testing Michael Lindon, Chris Sanden, Vaché Shirikian code 0
Retrieval-Based Gradient Boosting Decision Trees for Disease Risk Assessment Handong Ma, Jiahang Cao, Yuchen Fang, Weinan Zhang, Wenbo Sheng, Shaodian Zhang, Yong Yu code 0
Towards Reliable Detection of Dielectric Hotspots in Thermal Images of the Underground Distribution Network François Mirallès, Luc Cauchon, MarcAndré Magnan, François Grégoire, Mouhamadou Makhtar Dione, Arnaud Zinflou code 0
CERAM: Coverage Expansion for Recommendations by Associating Discarded Models Yoshiki Matsune, Kota Tsubouchi, Nobuhiko Nishio code 0
Intelligent Request Strategy Design in Recommender System Xufeng Qian, Yue Xu, Fuyu Lv, Shengyu Zhang, Ziwen Jiang, Qingwen Liu, Xiaoyi Zeng, TatSeng Chua, Fei Wu code 0
Profiling Deep Learning Workloads at Scale using Amazon SageMaker Nathalie Rauschmayr, Sami Kama, Muhyun Kim, Miyoung Choi, Krishnaram Kenthapadi code 0
Generative Adversarial Networks Enhanced Pre-training for Insufficient Electronic Health Records Modeling Houxing Ren, Jingyuan Wang, Wayne Xin Zhao code 0
Recommendation in Offline Stores: A Gamification Approach for Learning the Spatiotemporal Representation of Indoor Shopping Jongkyung Shin, Changhun Lee, Chiehyeon Lim, Yunmo Shin, Junseok Lim code 0
CausalInt: Causal Inspired Intervention for Multi-Scenario Recommendation Yichao Wang, Huifeng Guo, Bo Chen, Weiwen Liu, Zhirong Liu, Qi Zhang, Zhicheng He, Hongkun Zheng, Weiwei Yao, Muyu Zhang, Zhenhua Dong, Ruiming Tang code 0
COSSUM: Towards Conversation-Oriented Structured Summarization for Automatic Medical Insurance Assessment Sheng Xu, Xiaojun Wan, Sen Hu, Mengdi Zhou, Teng Xu, Hongbin Wang, Haitao Mi code 0
Scale Calibration of Deep Ranking Models Le Yan, Zhen Qin, Xuanhui Wang, Michael Bendersky, Marc Najork code 0
Multi-task Envisioning Transformer-based Autoencoder for Corporate Credit Rating Migration Early Prediction Han Yue, Steve Q. Xia, Hongfu Liu code 0
Felicitas: Federated Learning in Distributed Cross Device Collaborative Frameworks Qi Zhang, Tiancheng Wu, Peichen Zhou, Shan Zhou, Yuan Yang, Xiulang Jin code 0
Reducing the Friction for Building Recommender Systems with Merlin Sara Rabhi, Ronay Ak, Marc Romeijn, Gabriel de Souza Pereira Moreira, Benedikt D. Schifferer code 0
Modern Theoretical Tools for Designing Information Retrieval System Da Xu, Chuanwei Ruan code 0
Data Science and Artificial Intelligence for Responsible Recommendations Shoujin Wang, Ninghao Liu, Xiuzhen Zhang, Yan Wang, Francesco Ricci, Bamshad Mobasher code 0
User Behavior Pre-training for Online Fraud Detection Can Liu, Yuncong Gao, Li Sun, Jinghua Feng, Hao Yang, Xiang Ao code 0
Sampling-based Estimation of the Number of Distinct Values in Distributed Environment Jiajun Li, Zhewei Wei, Bolin Ding, Xiening Dai, Lu Lu, Jingren Zhou code 0
Sample-Efficient Kernel Mean Estimator with Marginalized Corrupted Data Xiaobo Xia, Shuo Shan, Mingming Gong, Nannan Wang, Fei Gao, Haikun Wei, Tongliang Liu code 0
Accurate Node Feature Estimation with Structured Variational Graph Autoencoder Jaemin Yoo, Hyunsik Jeon, Jinhong Jung, U Kang code 0
Semantic Aware Answer Sentence Selection Using Self-Learning Based Domain Adaptation Rajdeep Sarkar, Sourav Dutta, Haytham Assem, Mihael Arcan, John P. McCrae code 0
CONFLUX: A Request-level Fusion Framework for Impression Allocation via Cascade Distillation XiaoYu Wang, Bin Tan, Yonghui Guo, Tao Yang, Dongbo Huang, Lan Xu, Nikolaos M. Freris, Hao Zhou, Xiangyang Li code 0
Multi Armed Bandit vs. A/B Tests in E-commence - Confidence Interval and Hypothesis Test Power Perspectives Ding Xiang, Rebecca West, Jiaqi Wang, Xiquan Cui, Jinzhou Huang code 0
CausalMTA: Eliminating the User Confounding Bias for Causal Multi-touch Attribution Di Yao, Chang Gong, Lei Zhang, Sheng Chen, Jingping Bi code 0
Why Data Scientists Prefer Glassbox Machine Learning: Algorithms, Differential Privacy, Editing and Bias Mitigation Rich Caruana, Harsha Nori code 0
Efficient Machine Learning on Large-Scale Graphs Parker Erickson, Victor E. Lee, Feng Shi, Jiliang Tang code 0
FreeKD: Free-direction Knowledge Distillation for Graph Neural Networks Kaituo Feng, Changsheng Li, Ye Yuan, Guoren Wang code 0
SIPF: Sampling Method for Inverse Protein Folding Tianfan Fu, Jimeng Sun code 0
Antibody Complementarity Determining Regions (CDRs) design using Constrained Energy Model Tianfan Fu, Jimeng Sun code 0
Partial Label Learning with Semantic Label Representations Shuo He, Lei Feng, Fengmao Lv, Wen Li, Guowu Yang code 0
HyperLogLogLog: Cardinality Estimation With One Log More Matti Karppa, Rasmus Pagh code 0
SOS: Score-based Oversampling for Tabular Data Jayoung Kim, Chaejeong Lee, Yehjin Shin, Sewon Park, Minjung Kim, Noseong Park, Jihoon Cho code 0
Domain Adaptation in Physical Systems via Graph Kernel Haoran Li, Hanghang Tong, Yang Weng code 0
RL2: A Call for Simultaneous Representation Learning and Rule Learning for Graph Streams Qu Liu, Tingjian Ge code 0
Learning Models of Individual Behavior in Chess Reid McIlroy-Young, Russell Wang, Siddhartha Sen, Jon M. Kleinberg, Ashton Anderson code 0
Nonlinearity Encoding for Extrapolation of Neural Networks Gyoung S. Na, Chanyoung Park code 0
Neural Bandit with Arm Group Graph Yunzhe Qi, Yikun Ban, Jingrui He code 0
Importance Prioritized Policy Distillation Xinghua Qu, Yew Soon Ong, Abhishek Gupta, Pengfei Wei, Zhu Sun, Zejun Ma code 0
DICE: Domain-attack Invariant Causal Learning for Improved Data Privacy Protection and Adversarial Robustness Qibing Ren, Yiting Chen, Yichuan Mo, Qitian Wu, Junchi Yan code 0
Balancing Bias and Variance for Active Weakly Supervised Learning Hitesh Sapkota, Qi Yu code 0
Learning Optimal Priors for Task-Invariant Representations in Variational Autoencoders Hiroshi Takahashi, Tomoharu Iwata, Atsutoshi Kumagai, Sekitoshi Kanai, Masanori Yamada, Yuki Yamanaka, Hisashi Kashima code 0
Aligning Dual Disentangled User Representations from Ratings and Textual Content Nhu-Thuat Tran, Hady W. Lauw code 0
Estimating Individualized Causal Effect with Confounded Instruments Haotian Wang, Wenjing Yang, Longqi Yang, Anpeng Wu, Liyang Xu, Jing Ren, Fei Wu, Kun Kuang code 0
Streaming Graph Neural Networks with Generative Replay Junshan Wang, Wenhao Zhu, Guojie Song, Liang Wang code 0
Domain Adaptation with Dynamic Open-Set Targets Jun Wu, Jingrui He code 0
Non-stationary A/B Tests Yuhang Wu, Zeyu Zheng, Guangyu Zhang, Zuohua Zhang, Chu Wang code 0
Enhancing Machine Learning Approaches for Graph Optimization Problems with Diversifying Graph Augmentation Chen-Hsu Yang, Chih-Ya Shen code 0
Learning Classifiers under Delayed Feedback with a Time Window Assumption Shota Yasui, Masahiro Kato code 0
Intrinsic-Motivated Sensor Management: Exploring with Physical Surprise Jingyi Yuan, Yang Weng, Erik Blasch code 0
Dual Bidirectional Graph Convolutional Networks for Zero-shot Node Classification Qin Yue, Jiye Liang, Junbiao Cui, Liang Bai code 0
Physics-infused Machine Learning for Crowd Simulation Guozhen Zhang, Zihan Yu, Depeng Jin, Yong Li code 0
Few-shot Heterogeneous Graph Learning via Cross-domain Knowledge Transfer Qiannan Zhang, Xiaodong Wu, Qiang Yang, Chuxu Zhang, Xiangliang Zhang code 0
Adaptive Learning for Weakly Labeled Streams Zhen-Yu Zhang, Yu-Yang Qian, Yu-Jie Zhang, Yuan Jiang, Zhi-Hua Zhou code 0
Adaptive Fairness-Aware Online Meta-Learning for Changing Environments Chen Zhao, Feng Mi, Xintao Wu, Kai Jiang, Latifur Khan, Feng Chen code 0
Physics-Guided Graph Meta Learning for Predicting Water Temperature and Streamflow in Stream Networks Shengyu Chen, Jacob A. Zwart, Xiaowei Jia code 0
ILASR: Privacy-Preserving Incremental Learning for Automatic Speech Recognition at Production Scale Gopinath Chennupati, Milind Rao, Gurpreet Chadha, Aaron Eakin, Anirudh Raju, Gautam Tiwari, Anit Kumar Sahu, Ariya Rastrow, Jasha Droppo, Andy Oberlin, Buddha Nandanoor, Prahalad Venkataramanan, Zheng Wu, Pankaj Sitpure code 0
Large-Scale Acoustic Automobile Fault Detection: Diagnosing Engines Through Sound Dennis Fedorishin, Justas Birgiolas, Deen Dayal Mohan, Livio Forte, Philip Schneider, Srirangaraj Setlur, Venu Govindaraju code 0
Real-Time Rideshare Driver Supply Values Using Online Reinforcement Learning Benjamin Han, Hyungjun Lee, Sébastien Martin code 0
Three-Stage Root Cause Analysis for Logistics Time Efficiency via Explainable Machine Learning Shiqi Hao, Yang Liu, Yu Wang, Yuan Wang, Wenming Zhe code 0
Unsupervised Learning Style Classification for Learning Path Generation in Online Education Platforms Zhicheng He, Wei Xia, Kai Dong, Huifeng Guo, Ruiming Tang, Dingyin Xia, Rui Zhang code 0
Analyzing Online Transaction Networks with Network Motifs Jiawei Jiang, Yusong Hu, Xiaosen Li, Wen Ouyang, Zhitao Wang, Fangcheng Fu, Bin Cui code 0
COBART: Controlled, Optimized, Bidirectional and Auto-Regressive Transformer for Ad Headline Generation Yashal Shakti Kanungo, Gyanendra Das, Pooja A, Sumit Negi code 0
Fast Mining and Forecasting of Co-evolving Epidemiological Data Streams Tasuku Kimura, Yasuko Matsubara, Koki Kawabata, Yasushi Sakurai code 0
Design Domain Specific Neural Network via Symbolic Testing Hui Li, Xing Fu, Ruofan Wu, Jinyu Xu, Kai Xiao, Xiaofu Chang, Weiqiang Wang, Shuai Chen, Leilei Shi, Tao Xiong, Yuan Qi code 0
Arbitrary Distribution Modeling with Censorship in Real-Time Bidding Advertising Xu Li, Michelle Ma Zhang, Zhenya Wang, Youjun Tong code 0
Para-Pred: Addressing Heterogeneity for City-Wide Indoor Status Estimation in On-Demand Delivery Wei Liu, Yi Ding, Shuai Wang, Yu Yang, Desheng Zhang code 0
Uncovering the Heterogeneous Effects of Preference Diversity on User Activeness: A Dynamic Mixture Model Yunfei Lu, Peng Cui, Linyun Yu, Lei Li, Wenwu Zhu code 0
Looper: An End-to-End ML Platform for Product Decisions Igor L. Markov, Hanson Wang, Nitya S. Kasturi, Shaun Singh, Mia R. Garrard, Yin Huang, Sze Wai Celeste Yuen, Sarah Tran, Zehui Wang, Igor Glotov, Tanvi Gupta, Peng Chen, Boshuang Huang, Xiaowen Xie, Michael Belkin, Sal Uryasev, Sam Howie, Eytan Bakshy, Norm Zhou code 0
Proactively Reducing the Hate Intensity of Online Posts via Hate Speech Normalization Sarah Masud, Manjot Bedi, Mohammad Aflah Khan, Md. Shad Akhtar, Tanmoy Chakraborty code 0
Human-in-the-Loop Large-Scale Predictive Maintenance of Workstations Alexander V. Nikitin, Samuel Kaski code 0
Regional-Local Adversarially Learned One-Class Classifier Anomalous Sound Detection in Global Long-Term Space Yu Sha, Shuiping Gou, Johannes Faber, Bo Liu, Wei Li, Stefan Schramm, Horst Stoecker, Thomas Steckenreiter, Domagoj Vnucec, Nadine Wetzstein, Andreas Widl, Kai Zhou code 0
Septor: Seismic Depth Estimation Using Hierarchical Neural Networks M. Ashraf Siddiquee, Vinicius M. A. Souza, Glenn Eli Baker, Abdullah Mueen code 0
Optimizing Long-Term Efficiency and Fairness in Ride-Hailing via Joint Order Dispatching and Driver Repositioning Jiahui Sun, Haiming Jin, Zhaoxing Yang, Lu Su, Xinbing Wang code 0
NENYA: Cascade Reinforcement Learning for Cost-Aware Failure Mitigation at Microsoft 365 Lu Wang, Pu Zhao, Chao Du, Chuan Luo, Mengna Su, Fangkai Yang, Yudong Liu, Qingwei Lin, Min Wang, Yingnong Dang, Hongyu Zhang, Saravan Rajmohan, Dongmei Zhang code 0
ROI-Constrained Bidding via Curriculum-Guided Bayesian Reinforcement Learning Haozhe Wang, Chao Du, Panyan Fang, Shuo Yuan, Xuming He, Liang Wang, Bo Zheng code 0
Adaptive Multi-view Rule Discovery for Weakly-Supervised Compatible Products Prediction Rongzhi Zhang, Rebecca West, Xiquan Cui, Chao Zhang code 0
DESCN: Deep Entire Space Cross Networks for Individual Treatment Effect Estimation Kailiang Zhong, Fengtong Xiao, Yan Ren, Yaorong Liang, Wenqing Yao, Xiaofeng Yang, Ling Cen code 0
RBG: Hierarchically Solving Large-Scale Routing Problems in Logistic Systems via Reinforcement Learning Zefang Zong, Hansen Wang, Jingwei Wang, Meng Zheng, Yong Li code 0
Scalable Online Disease Diagnosis via Multi-Model-Fused Actor-Critic Reinforcement Learning Weijie He, Ting Chen code 0
Reinforcement Learning Enhances the Experts: Large-scale COVID-19 Vaccine Allocation with Multi-factor Contact Network Qianyue Hao, Wenzhen Huang, Fengli Xu, Kun Tang, Yong Li code 0
The Battlefront of Combating Misinformation and Coping with Media Bias Yi R. Fung, Kung-Hsiang Huang, Preslav Nakov, Heng Ji code 0
Large-Scale Information Extraction under Privacy-Aware Constraints Rajeev Gupta, Ranganath Kondapally code 0
Online Clustering: Algorithms, Evaluation, Metrics, Applications and Benchmarking Jacob Montiel, Hoang-Anh Ngo, Minh-Huong Le Nguyen, Albert Bifet code 0
Automated Machine Learning & Tuning with FLAML Chi Wang, Qingyun Wu, Xueqing Liu, Luis Quintanilla code 0
Decision Intelligence and Analytics for Online Marketplaces: Jobs, Ridesharing, Retail and Beyond Zhiwei (Tony) Qin, Liangjie Hong, Rui Song, Hongtu Zhu, Mohammed Korayem, Haiyan Luo, Michael I. Jordan code 0
Machine Learning for Materials Science (MLMS) Avadhut Sardeshmukh, Sreedhar Reddy, Gautham B. P., Ankit Agrawal code 0
The Power of (Statistical) Relational Thinking Lise Getoor UC Santa Cruz, Santa Cruz, CA, USA Taking into account relational structure during data mining can lead to better results, both in terms of quality and computational efficiency. This structure may be captured in the schema, in links between entities (e.g., graphs) or in rules describing the domain (e.g., knowledge graphs). Further, for richly structured prediction problems, there is often a need for a mix of both logical reasoning and statistical inference. In this talk, I will give an introduction to the field of Statistical Relational Learning (SRL), and I'll identify useful tips and tricks for exploiting structure in both the input and output space. I'll describe our recent work on highly scalable approaches for statistical relational inference. I'll close by introducing a broader interpretation of relational thinking that reveals new research opportunities (and challenges!). 在数据挖掘过程中考虑到关系结构,可以在质量和计算效率方面取得更好的结果。这种结构可以在模式、实体之间的链接(例如图形)或描述领域的规则(例如知识图形)中捕获。此外,对于结构丰富的预测问题,通常需要同时考虑逻辑推理和推论统计学。在这个演讲中,我将介绍统计关系学习(SRL)领域,并且我将确定在输入和输出空间中利用结构的有用提示和技巧。我将描述我们最近关于统计关系推理的高度可伸缩方法的工作。最后,我将介绍关系思维的更广泛的解释,揭示新的研究机会(和挑战!). code 0
Beyond Traditional Characterizations in the Age of Data: Big Models, Scalable Algorithms, and Meaningful Solutions Shang-Hua Teng University of Southern California, Los Angeles, CA, USA What are data and network models? What are efficient algorithms? What are meaningful solutions? Big Data, Network Sciences, and Machine Learning have fundamentally challenged the basic characterizations in computing, from the conventional graph-theoretical modeling of networks to the traditional polynomial-time worst-case measures of efficiency: For a long time, graphs have been widely used for defining the structure of social and information networks. However, real-world network data and phenomena are much richer and more complex than what can be captured by nodes and edges. Network data is multifaceted, and thus network sciences require new theories, going beyond classic graph theory and graph-theoretical frameworks, to capture the multifaceted data. More than ever before, it is not just desirable, but essential, that efficient algorithms should be scalable. In other words, their complexity should be nearly linear or even sub-linear with respect to the problem size. Thus, scalability, not just polynomial-time computability, should be elevated as the central complexity notion for characterizing efficient computation. 什么是数据和网络模型?什么是高效算法?什么是有意义的解决方案?大数据、网络科学和机器学习从根本上挑战了计算的基本特征,从传统的网络图形理论建模到传统的多项式时间最坏情况的效率度量: 长期以来,图形被广泛用于定义社会和信息网络的结构。然而,真实世界的网络数据和现象比节点和边所能捕获的要丰富和复杂得多。网络数据是多方面的,因此网络科学需要超越经典图论和图论框架的新理论来捕获多方面的数据。与以往任何时候相比,有效的算法应该是可伸缩的,这不仅是可取的,而且是必要的。换句话说,相对于问题的大小,它们的复杂度应该接近线性,甚至是次线性。因此,可伸缩性,而不仅仅是多项式时间的可计算性,应该被提升为表征有效计算的核心复杂性概念。 code 0
Multi-Variate Time Series Forecasting on Variable Subsets Jatin Chauhan, Aravindan Raghuveer, Rishi Saket, Jay Nandy, Balaraman Ravindran code 0
HyperAid: Denoising in Hyperbolic Spaces for Tree-fitting and Hierarchical Clustering Eli Chien, Puoya Tabaghi, Olgica Milenkovic code 0
Scalable Differentially Private Clustering via Hierarchically Separated Trees Vincent Cohen-Addad, Alessandro Epasto, Silvio Lattanzi, Vahab Mirrokni, Andres Muñoz Medina, David Saulpic, Chris Schwiegelshohn, Sergei Vassilvitskii code 0
Framing Algorithmic Recourse for Anomaly Detection Debanjan Datta, Feng Chen, Naren Ramakrishnan code 0
Fair Labeled Clustering Seyed A. Esmaeili, Sharmila Duppala, John P. Dickerson, Brian Brubach code 0
On Aligning Tuples for Regression Chenguang Fang, Shaoxu Song, Yinan Mei, Ye Yuan, Jianmin Wang code 0
Optimal Interpretable Clustering Using Oblique Decision Trees Magzhan Gabidolla, Miguel Á. Carreira-Perpiñán code 0
Finding Meta Winning Ticket to Train Your MAML Dawei Gao, Yuexiang Xie, Zimu Zhou, Zhen Wang, Yaliang Li, Bolin Ding code 0
BLISS: A Billion scale Index using Iterative Re-partitioning Gaurav Gupta, Tharun Medini, Anshumali Shrivastava, Alexander J. Smola code 0
Subset Node Anomaly Tracking over Large Dynamic Graphs Xingzhi Guo, Baojian Zhou, Steven Skiena code 0
Continuous-Time and Multi-Level Graph Representation Learning for Origin-Destination Demand Prediction Liangzhe Han, Xiaojian Ma, Leilei Sun, Bowen Du, Yanjie Fu, Weifeng Lv, Hui Xiong code 0
Quantifying and Reducing Registration Uncertainty of Spatial Vector Labels on Earth Imagery Wenchong He, Zhe Jiang, Marcus Kriby, Yiqun Xie, Xiaowei Jia, Da Yan, Yang Zhou code 0
AdaAX: Explaining Recurrent Neural Networks by Learning Automata with Adaptive States Dat Hong, Alberto Maria Segre, Tong Wang code 0
Flexible Modeling and Multitask Learning using Differentiable Tree Ensembles Shibal Ibrahim, Hussein Hazimeh, Rahul Mazumder code 0
Selective Cross-City Transfer Learning for Traffic Prediction via Source City Region Re-Weighting Yilun Jin, Kai Chen, Qiang Yang code 0
CoRGi: Content-Rich Graph Neural Networks with Attention Jooyeon Kim, Angus Lamb, Simon Woodhead, Simon Peyton Jones, Cheng Zhang, Miltiadis Allamanis code 0
ExMeshCNN: An Explainable Convolutional Neural Network Architecture for 3D Shape Analysis Seonggyeom Kim, Dong-Kyu Chae code 0
In Defense of Core-set: A Density-aware Core-set Selection for Active Learning Yeachan Kim, Bonggun Shin code 0
Modeling Network-level Traffic Flow Transitions on Sparse Data Xiaoliang Lei, Hao Mei, Bin Shi, Hua Wei code 0
FlowGEN: A Generative Model for Flow Graphs Furkan Kocayusufoglu, Arlei Silva, Ambuj K. Singh code 0
The DipEncoder: Enforcing Multimodality in Autoencoders Collin Leiber, Lena G. M. Bauer, Michael Neumayr, Claudia Plant, Christian Böhm code 0
HierCDF: A Bayesian Network-based Hierarchical Cognitive Diagnosis Framework Jiatong Li, Fei Wang, Qi Liu, Mengxiao Zhu, Wei Huang, Zhenya Huang, Enhong Chen, Yu Su, Shijin Wang code 0
Mining Spatio-Temporal Relations via Self-Paced Graph Contrastive Learning Rongfan Li, Ting Zhong, Xinke Jiang, Goce Trajcevski, Jin Wu, Fan Zhou code 0
PAC-Wrap: Semi-Supervised PAC Anomaly Detection Shuo Li, Xiayan Ji, Edgar Dobriban, Oleg Sokolsky, Insup Lee code 0
TransBO: Hyperparameter Optimization via Two-Phase Transfer Learning Yang Li, Yu Shen, Huaijun Jiang, Wentao Zhang, Zhi Yang, Ce Zhang, Bin Cui code 0
Sparse Conditional Hidden Markov Model for Weakly Supervised Named Entity Recognition Yinghao Li, Le Song, Chao Zhang code 0
Deep Representations for Time-varying Brain Datasets Sikun Lin, Shuyun Tang, Scott T. Grafton, Ambuj K. Singh code 0
Partial-Quasi-Newton Methods: Efficient Algorithms for Minimax Optimization Problems with Unbalanced Dimensionality Chengchang Liu, Shuxian Bi, Luo Luo, John C. S. Lui code 0
Label-enhanced Prototypical Network with Contrastive Learning for Multi-label Few-shot Aspect Category Detection Han Liu, Feng Zhang, Xiaotong Zhang, Siyang Zhao, Junjie Sun, Hong Yu, Xianchao Zhang code 0
Fair Representation Learning: An Alternative to Mutual Information Ji Liu, Zenan Li, Yuan Yao, Feng Xu, Xiaoxing Ma, Miao Xu, Hanghang Tong code 0
S2RL: Do We Really Need to Perceive All States in Deep Multi-Agent Reinforcement Learning? Shuang Luo, Yinchuan Li, Jiahui Li, Kun Kuang, Furui Liu, Yunfeng Shao, Chao Wu code 0
ML4S: Learning Causal Skeleton from Vicinal Graphs Pingchuan Ma, Rui Ding, Haoyue Dai, Yuanyuan Jiang, Shuai Wang, Shi Han, Dongmei Zhang code 0
Non-stationary Time-aware Kernelized Attention for Temporal Event Prediction Yu Ma, Zhining Liu, Chenyi Zhuang, Yize Tan, Yi Dong, Wenliang Zhong, Jinjie Gu code 0
Discovering Invariant and Changing Mechanisms from Data Sarah Mameche, David Kaltenpoth, Jilles Vreeken code 0
Minimizing Congestion for Balanced Dominators Yosuke Mizutani, Annie Staker, Blair D. Sullivan code 0
Learning Fair Representation via Distributional Contrastive Disentanglement Changdae Oh, Heeji Won, Junhyuk So, Taero Kim, Yewon Kim, Hosik Choi, Kyungwoo Song code 0
MetaV: A Meta-Verifier Approach to Task-Agnostic Model Fingerprinting Xudong Pan, Yifan Yan, Mi Zhang, Min Yang code 0
Predicting Opinion Dynamics via Sociologically-Informed Neural Networks Maya Okawa, Tomoharu Iwata code 0
Bilateral Dependency Optimization: Defending Against Model-inversion Attacks Xiong Peng, Feng Liu, Jingfeng Zhang, Long Lan, Junjie Ye, Tongliang Liu, Bo Han code 0
Compute Like Humans: Interpretable Step-by-step Symbolic Computation with Deep Neural Network Shuai Peng, Di Fu, Yong Cao, Yijun Liang, Gu Xu, Liangcai Gao, Zhi Tang code 0
Evaluating Knowledge Graph Accuracy Powered by Optimized Human-machine Collaboration Yifan Qi, Weiguo Zheng, Liang Hong, Lei Zou code 0
Releasing Private Data for Numerical Queries Yuan Qiu, Wei Dong, Ke Yi, Bin Wu, Feifei Li code 0
External Knowledge Infusion for Tabular Pre-training Models with Dual-adapters Can Qin, Sungchul Kim, Handong Zhao, Tong Yu, Ryan A. Rossi, Yun Fu code 0
p-Meta: Towards On-device Deep Model Adaptation Zhongnan Qu, Zimu Zhou, Yongxin Tong, Lothar Thiele code 0
Fair and Interpretable Models for Survival Analysis Md. Mahmudur Rahman, Sanjay Purushotham code 0
A Generalized Backward Compatibility Metric Tomoya Sakai code 0
Multi-View Clustering for Open Knowledge Base Canonicalization Wei Shen, Yang Yang, Yinan Liu code 0
Deep Learning for Prognosis Using Task-fMRI: A Novel Architecture and Training Scheme Ge Shi, Jason Smucny, Ian Davidson code 0
Active Model Adaptation Under Unknown Shift Jie-Jing Shao, Yunlu Xu, Zhanzhan Cheng, Yu-Feng Li code 0
GUIDE: Group Equality Informed Individual Fairness in Graph Neural Networks Weihao Song, Yushun Dong, Ninghao Liu, Jundong Li code 0
Robust and Informative Text Augmentation (RITA) via Constrained Worst-Case Transformations for Low-Resource Named Entity Recognition Hyunwoo Sohn, Baekkwan Park code 0
pureGAM: Learning an Inherently Pure Additive Model Xingzhi Sun, Ziyu Wang, Rui Ding, Shi Han, Dongmei Zhang code 0
Demystify Hyperparameters for Stochastic Optimization with Transferable Representations Jianhui Sun, Mengdi Huai, Kishlay Jha, Aidong Zhang code 0
Dense Feature Tracking of Atmospheric Winds with Deep Optical Flow Thomas J. Vandal, Kate Duffy, Will McCarty, Akira Sewnath, Ramakrishna R. Nemani code 0
Incremental Cognitive Diagnosis for Intelligent Education Shiwei Tong, Jiayu Liu, Yuting Hong, Zhenya Huang, Le Wu, Qi Liu, Wei Huang, Enhong Chen, Dan Zhang code 0
A Model-Agnostic Approach to Differentially Private Topic Mining Han Wang, Jayashree Sharma, Shuya Feng, Kai Shu, Yuan Hong code 0
Toward Learning Robust and Invariant Representations with Alignment Regularization and Data Augmentation Haohan Wang, Zeyi Huang, Xindi Wu, Eric P. Xing code 0
Group-wise Reinforcement Feature Generation for Optimal and Explainable Representation Space Reconstruction Dongjie Wang, Yanjie Fu, Kunpeng Liu, Xiaolin Li, Yan Solihin code 0
Proton: Probing Schema Linking Information from Pre-trained Language Models for Text-to-SQL Parsing Lihan Wang, Bowen Qin, Binyuan Hui, Bowen Li, Min Yang, Bailin Wang, Binhua Li, Jian Sun, Fei Huang, Luo Si, Yongbin Li code 0
Partial Label Learning with Discrimination Augmentation Wei Wang, Min-Ling Zhang code 0
An Embedded Feature Selection Framework for Control Jiawen Wei, Fangyuan Wang, Wanxin Zeng, Wenwei Lin, Ning Gui code 0
SagDRE: Sequence-Aware Graph-Based Document-Level Relation Extraction with Adaptive Margin Loss Ying Wei, Qi Li code 0
Beyond Point Prediction: Capturing Zero-Inflated &amp; Heavy-Tailed Spatiotemporal Data with Deep Extreme Mixture Models Tyler Wilson, Andrew McDonald, Asadullah Hill Galib, Pang-Ning Tan, Lifeng Luo code 0
Geometric Policy Iteration for Markov Decision Processes Yue Wu, Jesús A. De Loera code 0
Robust Tensor Graph Convolutional Networks via T-SVD based Graph Augmentation Zhebin Wu, Lin Shu, Ziyue Xu, Yaomin Chang, Chuan Chen, Zibin Zheng code 0
End-to-End Semi-Supervised Ordinal Regression AUC Maximization with Convolutional Kernel Networks Ziran Xiong, Wanli Shi, Bin Gu code 0
Solving the Batch Stochastic Bin Packing Problem in Cloud: A Chance-constrained Optimization Approach Jie Yan, Yunlei Lu, Liting Chen, Si Qin, Yixin Fang, Qingwei Lin, Thomas Moscibroda, Saravan Rajmohan, Dongmei Zhang code 0
Causal Discovery on Non-Euclidean Data Jing Yang, Kai Xie, Ning An code 0
Learning Task-relevant Representations for Generalization via Characteristic Functions of Reward Sequence Distributions Rui Yang, Jie Wang, Zijie Geng, Mingxuan Ye, Shuiwang Ji, Bin Li, Feng Wu code 0
Numerical Tuple Extraction from Tables with Pre-training Qingping Yang, Yixuan Cao, Ping Luo code 0
Deconfounding Actor-Critic Network with Policy Adaptation for Dynamic Treatment Regimes Changchang Yin, Ruoqi Liu, Jeffrey M. Caterino, Ping Zhang code 0
LeapAttack: Hard-Label Adversarial Attack on Text via Gradient-Based Optimization Muchao Ye, Jinghui Chen, Chenglin Miao, Ting Wang, Fenglong Ma code 0
MetroGAN: Simulating Urban Morphology with Generative Adversarial Network Weiyu Zhang, Yiyang Ma, Di Zhu, Lei Dong, Yu Liu code 0
MT-FlowFormer: A Semi-Supervised Flow Transformer for Encrypted Traffic Classification Ruijie Zhao, Xianwen Deng, Zhicong Yan, Jun Ma, Zhi Xue, Yijun Wang code 0
Integrity Authentication in Tree Models Weijie Zhao, Yingjie Lao, Ping Li code 0
Instant Graph Neural Networks for Dynamic Graphs Yanping Zheng, Hanzhi Wang, Zhewei Wei, Jiajun Liu, Sibo Wang code 0
KRATOS: Context-Aware Cell Type Classification and Interpretation using Joint Dimensionality Reduction and Clustering Zihan Zhou, Zijia Du, Somali Chaterji code 0
Unified 2D and 3D Pre-Training of Molecular Representations Jinhua Zhu, Yingce Xia, Lijun Wu, Shufang Xie, Tao Qin, Wengang Zhou, Houqiang Li, Tie-Yan Liu code 0
A Nearly-Linear Time Algorithm for Minimizing Risk of Conflict in Social Networks Liwang Zhu, Zhongzhi Zhang code 0
A Process-Aware Decision Support System for Business Processes Prerna Agarwal, Buyu Gao, Siyu Huo, Prabhat Reddy, Sampath Dechu, Yazan Obeidi, Vinod Muthusamy, Vatche Isahagian, Sebastian Carbajales code 0
BrainNet: Epileptic Wave Detection from SEEG with Hierarchical Graph Diffusion Learning Junru Chen, Yang Yang, Tao Yu, Yingying Fan, Xiaolong Mo, Carl Yang code 0
Ask to Know More: Generating Counterfactual Explanations for Fake Claims Shih-Chieh Dai, Yi-Li Hsu, Aiping Xiong, Lun-Wei Ku code 0
The Good, the Bad, and the Outliers: A Testing Framework for Decision Optimization Model Learning Orit Davidovich, Gheorghe-Teodor Bercea, Segev Wasserkrug code 0
Precise Mobility Intervention for Epidemic Control Using Unobservable Information via Deep Reinforcement Learning Tao Feng, Tong Xia, Xiaochen Fan, Huandong Wang, Zefang Zong, Yong Li code 0
DP-GAT: A Framework for Image-based Disease Progression Prediction Alex Foo, Wynne Hsu, Mong-Li Lee, Gavin Siew Wei Tan code 0
Graph Meta-Reinforcement Learning for Transferable Autonomous Mobility-on-Demand Daniele Gammelli, Kaidi Yang, James Harrison, Filipe Rodrigues, Francisco C. Pereira, Marco Pavone code 0
Applying Deep Learning Based Probabilistic Forecasting to Food Preparation Time for On-Demand Delivery Service Chengliang Gao, Fan Zhang, Yue Zhou, Ronggen Feng, Qiang Ru, Kaigui Bian, Renqing He, Zhizhao Sun code 0
T-Cell Receptor-Peptide Interaction Prediction with Physical Model Augmented Pseudo-Labeling Yiren Jian, Erik Kruus, Martin Renqiang Min code 0
Predicting Bearings Degradation Stages for Predictive Maintenance in the Pharmaceutical Industry Dovile Juodelyte, Veronika Cheplygina, Therese Graversen, Philippe Bonnet code 0
Vexation-Aware Active Learning for On-Menu Restaurant Dish Availability Jean-François Kagy, Flip Korn, Afshin Rostamizadeh, Chris Welty code 0
Preventing Catastrophic Forgetting in Continual Learning of New Natural Language Tasks Sudipta Kar, Giuseppe Castellucci, Simone Filice, Shervin Malmasi, Oleg Rokhlenko code 0
Self-Supervised Augmentation and Generation for Multi-lingual Text Advertisements at Bing Xiaoyu Kou, Tianqi Zhao, Fan Zhang, Song Li, Qi Zhang code 0
TaxoTrans: Taxonomy-Guided Entity Translation Zhuliu Li, Yiming Wang, Xiao Yan, Weizhi Meng, Yanen Li, Jaewon Yang code 0
A Logic Aware Neural Generation Method for Explainable Data-to-text Xiexiong Lin, Huaisong Li, Tao Huang, Feng Wang, Linlin Chao, Fuzhen Zhuang, Taifeng Wang, Tianyi Zhang code 0
BE3R: BERT based Early-Exit Using Expert Routing Sourab Mangrulkar, Ankith M. S, Vivek Sembium code 0
Graph Neural Network Training and Data Tiering Seungwon Min, Kun Wu, Mert Hidayetoglu, Jinjun Xiong, Xiang Song, Wen-Mei Hwu code 0
Generating Examples from CLI Usage: Can Transformers Help? Roshanak Zilouchian Moghaddam, Spandan Garg, Colin B. Clement, Yevhen Mohylevskyy, Neel Sundaresan code 0
GradMask: Gradient-Guided Token Masking for Textual Adversarial Example Detection Han Cheol Moon, Shafiq R. Joty, Xu Chi code 0
Counterfactual Phenotyping with Censored Time-to-Events Chirag Nagpal, Mononito Goswami, Keith Dufendach, Artur Dubrawski code 0
Crowdsourcing with Contextual Uncertainty Viet-An Nguyen, Peibei Shi, Jagdish Ramakrishnan, Narjes Torabi, Nimar S. Arora, Udi Weinsberg, Michael Tingley code 0
Solar: Science of Entity Loss Attribution Anshuman Mourya, Prateek Sircar, Anirban Majumder, Deepak Gupta code 0
Packet Representation Learning for Traffic Classification Xuying Meng, Yequan Wang, Runxin Ma, Haitong Luo, Xiang Li, Yujun Zhang code 0
Characterizing Covid Waves via Spatio-Temporal Decomposition Kevin Quinn, Evimaria Terzi, Mark Crovella code 0
Service Time Prediction for Delivery Tasks via Spatial Meta-Learning Sijie Ruan, Cheng Long, Zhipeng Ma, Jie Bao, Tianfu He, Ruiyuan Li, Yiheng Chen, Shengnan Wu, Yu Zheng code 0
Reinforcement Learning in the Wild: Scalable RL Dispatching Algorithm Deployed in Ridehailing Marketplace Soheil Sadeghi Eshkevari, Xiaocheng Tang, Zhiwei Qin, Jinhan Mei, Cheng Zhang, Qianying Meng, Jia Xu code 0
Generalized Deep Mixed Models Jun Shi, Chengming Jiang, Aman Gupta, Mingzhou Zhou, Yunbo Ouyang, Qiang Charles Xiao, Qingquan Song, Yi (Alice) Wu, Haichao Wei, Huiji Gao code 0
Counseling Summarization Using Mental Health Knowledge Guided Utterance Filtering Aseem Srivastava, Tharun Suresh, Sarah Peregrine Lord, Md. Shad Akhtar, Tanmoy Chakraborty code 0
Few-shot Learning for Trajectory-based Mobile Game Cheating Detection Yueyang Su, Di Yao, Xiaokai Chu, Wenbin Li, Jingping Bi, Shiwei Zhao, Runze Wu, Shize Zhang, Jianrong Tao, Hao Deng code 0
RT-VeD: Real-Time VoI Detection on Edge Nodes with an Adaptive Model Selection Framework Shuai Wang, Junke Lu, Baoshen Guo, Zheng Dong code 0
Representative Routes Discovery from Massive Trajectories Tingting Wang, Shixun Huang, Zhifeng Bao, J. Shane Culpepper, Reza Arablouei code 0
Connecting the Hosts: Street-Level IP Geolocation with Graph Neural Networks Zhiyuan Wang, Fan Zhou, Wenxuan Zeng, Goce Trajcevski, Chunjing Xiao, Yong Wang, Kai Chen code 0
Graph2Route: A Dynamic Spatial-Temporal Graph Neural Network for Pick-up and Delivery Route Prediction Haomin Wen, Youfang Lin, Xiaowei Mao, Fan Wu, Yiji Zhao, Haochen Wang, Jianbin Zheng, Lixia Wu, Haoyuan Hu, Huaiyu Wan code 0
Perioperative Predictions with Interpretable Latent Representation Bing Xue, York Jiao, Thomas George Kannampallil, Bradley A. Fritz, Christopher Ryan King, Joanna Abraham, Michael Avidan, Chenyang Lu code 0
A Meta Reinforcement Learning Approach for Predictive Autoscaling in the Cloud Siqiao Xue, Chao Qu, Xiaoming Shi, Cong Liao, Shiyi Zhu, Xiaoyu Tan, Lintao Ma, Shiyu Wang, Shijun Wang, Yun Hu, Lei Lei, Yangfei Zheng, Jianguo Li, James Zhang code 0
CMMD: Cross-Metric Multi-Dimensional Root Cause Analysis Shifu Yan, Caihua Shan, Wenyi Yang, Bixiong Xu, Dongsheng Li, Lili Qiu, Jie Tong, Qi Zhang code 0
TAG: Toward Accurate Social Media Content Tagging with a Concept Graph Jiuding Yang, Weidong Guo, Bang Liu, Yakun Yu, Chaoyue Wang, Jinwen Luo, Linglong Kong, Di Niu, Zhen Wen code 0
Multilingual Taxonomic Web Page Classification for Contextual Targeting at Yahoo Eric Ye, Xiao Bai, Neil O'Hare, Eliyar Asgarieh, Kapil Thadani, Francisco Perez-Sorrosal, Sujyothi Adiga code 0
A Stochastic Shortest Path Algorithm for Optimizing Spaced Repetition Scheduling Junyao Ye, Jingyong Su, Yilong Cao code 0
Predicting Age-Related Macular Degeneration Progression with Contrastive Attention and Time-Aware LSTM Changchang Yin, Sayoko E. Moroi, Ping Zhang code 0
Spatio-Temporal Vehicle Trajectory Recovery on Road Network Based on Traffic Camera Video Data Fudan Yu, Wenxuan Ao, Huan Yan, Guozhen Zhang, Wei Wu, Yong Li code 0
XDAI: A Tuning-free Framework for Exploiting Pre-trained Language Models in Knowledge Grounded Dialogue Generation Jifan Yu, Xiaohan Zhang, Yifan Xu, Xuanyu Lei, Xinyu Guan, Jing Zhang, Lei Hou, Juanzi Li, Jie Tang code 0
Data-Driven Oracle Bone Rejoining: A Dataset and Practical Self-Supervised Learning Scheme Chongsheng Zhang, Bin Wang, Ke Chen, Ruixing Zong, Bofeng Mo, Yi Men, George Almpanidis, Shanxiong Chen, Xiangliang Zhang code 0
Sparx: Distributed Outlier Detection at Scale Sean Zhang, Varun Ursekar, Leman Akoglu code 0
CAT: Beyond Efficient Transformer for Content-Aware Anomaly Detection in Event Sequences Shengming Zhang, Yanchi Liu, Xuchao Zhang, Wei Cheng, Haifeng Chen, Hui Xiong code 0
JiuZhang: A Chinese Pre-trained Language Model for Mathematical Problem Understanding Wayne Xin Zhao, Kun Zhou, Zheng Gong, Beichen Zhang, Yuanhang Zhou, Jing Sha, Zhigang Chen, Shijin Wang, Cong Liu, Ji-Rong Wen code 0
Dynamic Graph Segmentation for Deep Graph Neural Networks Johan Kok Zhi Kang, Suwei Yang, Suriya Venkatesan, Sien Yi Tan, Feng Cheng, Bingsheng He code 0
Dynamic Network Anomaly Modeling of Cell-Phone Call Detail Records for Infectious Disease Surveillance Carl Yang, Hongwen Song, Mingyue Tang, Leon Danon, Ymir Vigfusson code 0
Medical Dialogue Response Generation with Pivotal Information Recalling Yu Zhao, Yunxin Li, Yuxiang Wu, Baotian Hu, Qingcai Chen, Xiaolong Wang, Yuxin Ding, Min Zhang code 0
Classifying Multimodal Data Using Transformers Watson W. K. Chua, Lu Li, Alvina Goh code 0
Hyperbolic Neural Networks: Theory, Architectures and Applications Nurendra Choudhary, Nikhil Rao, Karthik Subbian, Srinivasan H. Sengamedu, Chandan K. Reddy code 0
Toward Graph Minimally-Supervised Learning Kaize Ding, Chuxu Zhang, Jie Tang, Nitesh V. Chawla, Huan Liu code 0
Frontiers of Graph Neural Networks with DIG Shuiwang Ji, Meng Liu, Yi Liu, Youzhi Luo, Limei Wang, Yaochen Xie, Zhao Xu, Haiyang Yu code 0
Adapting Pretrained Representations for Text Mining Yu Meng, Jiaxin Huang, Yu Zhang, Jiawei Han code 0
Deep Learning for Network Traffic Data Manish Marwah, Martin F. Arlitt code 0
Temporal Graph Learning for Financial World: Algorithms, Scalability, Explainability & Fairness Nitendra Rajput, Karamjit Singh code 0
Accelerated GNN Training with DGL and RAPIDS cuGraph in a Fraud Detection Workflow Brad Rees, Xiaoyun Wang, Joe Eaton, Onur Yilmaz, Rick Ratzel, Dominique LaSalle code 0
Counterfactual Evaluation and Learning for Interactive Systems: Foundations, Implementations, and Recent Advances Yuta Saito, Thorsten Joachims code 0
Towards Adversarial Learning: From Evasion Attacks to Poisoning Attacks Wentao Wang, Han Xu, Yuxuan Wan, Jie Ren, Jiliang Tang code 0
New Frontiers of Scientific Text Mining: Tasks, Data, and Tools Xuan Wang, Hongwei Wang, Heng Ji, Jiawei Han code 0
Graph Neural Networks in Life Sciences: Opportunities and Solutions Zichen Wang, Vassilis N. Ioannidis, Huzefa Rangwala, Tatsuya Arai, Ryan Brand, Mufei Li, Yohei Nakayama code 0
Trustworthy Graph Learning: Reliability, Explainability, and Privacy Protection Bingzhe Wu, Yatao Bian, Hengtong Zhang, Jintang Li, Junchi Yu, Liang Chen, Chaochao Chen, Junzhou Huang code 0
Anomaly Detection for Spatiotemporal Data in Action Guang Yang, Ninad Kulkarni, Paavani Dua, Dipika Khullar, Alex Anto Chirayath code 0
HoloViz: Visualization and Interactive Dashboards in Python Sophia Yang, Marc Skov Madsen, James A. Bednar code 0
AdKDD 2022 Abraham Bagherjeiran, Nemanja Djuric, Mihajlo Grbovic, Kuang-Chih Lee, Kun Liu, Wei Liu, Linsey Pang, Vladan Radosavljevic, Suju Rajan, Kexin Xie code 0
Fragile Earth: AI for Climate Mitigation, Adaptation, and Environmental Justice Naoki Abe, Kathleen Buckingham, Bistra Dilkina, Emre Eftelioglu, Auroop R. Ganguly, James Hodson, Ramakrishnan Kannan, Rose Yu code 0
Data-driven Humanitarian Mapping and Policymaking: Toward Planetary-Scale Resilience, Equity, and Sustainability Snehalkumar (Neil) S. Gaikwad, Shankar Iyer, Dalton D. Lunga, Takahiro Yabe, Xiaofan Liang, Bhavani Ananthabhotla, Nikhil Behari, Sreelekha Guggilam, Guanghua Chi code 0
ANDEA: Anomaly and Novelty Detection, Explanation, and Accommodation Guansong Pang, Jundong Li, Anton van den Hengel, Longbing Cao, Thomas G. Dietterich code 0
Visualization in Data Science VDS @ KDD 2022 Claudia Plant, Nina C. Hubig, Junming Shao, Alvitta Ottley, Liang Gou, Torsten Möller, Adam Perer, Alexander Lex, Anamaria Crisan code 0
Deep Learning on Graphs: Methods and Applications (DLG-KDD2022) Lingfei Wu, Jian Pei, Jiliang Tang, Yinglong Xia, Xiaojie Guo code 0