Bachelor of Engineering(computer science and technology): School of Computer Science and Technology, University of Chinese Academy of Sciences. My thesis was advised by Beihong Jin
Master of Philosophy(computer science and technology): Center for Biomedical Information Technology, Institute of Advanced Computing and Digital Engineering, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences. I was advised by Yunpeng Cai
AI4Science
Encoding and Decoding for higher cognitive function of human's brain.
Incomplete understanding of the pathophysiology of mental disorders.
Graph Embedding and Geometry Learning.
人工智能赋能自然科学(AI for Science, AI4Sci)的科研方法杂谈
数字图像处理中的二维离散傅里叶变换的性质
[1] Yufu HUO,Beihong JIN,Zhaoyi LIAO. Multi-modal information augmented model for micro-video recommendation. Journal of ZheJiang University (Engineering Science), 2024, 58(6): 1142-1152.
Code: https://github.com/RosalindFok/MMa4CTR
Abstract: A multi-modal augmented model for click through rate (MMa4CTR) tailored for micro-videos recommendation was proposed. Multi-modal data derived from user interactions with micro-videos were effectively leveraged to construct embedded user representations and capture diverse user interests across multi-modal. The aim was to reveal the latent semantic commonalities, by combining and crossing features across modalities. The overall recommendation performance was boosted via two training strategies, automatic learning rate adjustment and validation interruption. A computationally efficient multi-layer perceptron architecture was employed, in order to address the computational demands brought on by the vast amount of multi-modal data. Performance comparison experiments and sensitivity analyses of hyperparameter on WeChat Video Channel and TikTok datasets demonstrated that MMa4CTR outperformed baseline models, delivering superior recommendation results with minimal computational resources. Additionally, ablation studies performed on both datasets further validated the significance and efficacy of the micro-video modality cross module, the user multi-modal embedding layer, and the strategies for automatic learning rate adjustment and validation interruption in enhancing recommendation performance.
Key words: recommender system, click through rate, multi modal, micro-video, machine learning