Research Project (2022/Q4) conducted to conclude the BSc programme in Computer Science and Engineering (CSE) at TU Delft.
Paper: A Comprehensive Taxonomy of User Intents for Search Queries
Abstract: Search engines act as an oracle between user queries and information: the user types a query and receives the requested information in return. To do so, search engines need to interpret human language and, most importantly, understand the intent behind a query, so that they can retrieve the most appropriate sources of information. The purpose of our research is to introduce a new hierarchical taxonomy that better captures the underlying intents of users asking questions online (on search engines and Q&A platforms). We first review prior work and findings on the topic. We then assemble a new dataset of queries aggregated from MS MARCO, AskReddit and Quora, and examine and label its questions to construct a new fine-grained ontology. Finally, we use Deep Learning models and Active Learning (AL) to evaluate the quality of our work. The results show that the taxonomy can effectively assess users’ goals. Our taxonomy, the assembled dataset and the codebase are publicly available to support future research.