MSGABench: A Comprehensive Evaluation Benchmark for Multi-Scenario Government Affairs
We introduce the Multi-Scenario Government Affairs Benchmark (MSGABench), an evaluation framework comprising three evaluation dimensions and twelve metrics. We also release the corresponding Chinese benchmark dataset, which covers 3 application scenarios, 8 task types, and 14 fields.
Data distribution
Scenario | Task | Description | Scale |
Document Automation Processing Scenario | Text Classification | Covers common data types and content themes in government affairs, with sub-tasks including the classification of government user comments, government terminology, locations, and documents. The data varies in text length, domain department, and text format, and is annotated by professionals. | 13,572 entries |
Document Automation Processing Scenario | Text Generation | Targets the automatic generation of official documents such as receipts, meeting minutes, and notifications, with clearly defined sample generation rules and evaluation criteria. | 1,000 entries |
Document Automation Processing Scenario | Text Summarization | Targets the extraction of key points from long government texts; unified content and format requirements keep the annotations consistent and effective. | 6,022 entries |
Government Affairs Q&A Service Scenario | Natural Language Understanding | Composed of various types of user queries; the goal is to capture the asker's true intent and extract keywords. | 233 entries |
Government Affairs Q&A Service Scenario | Question Answering | Based on Q&A pairs constructed from government data and policy documents, focusing on official and authoritative answers to single questions. | 2,081 entries |
Government Affairs Q&A Service Scenario | Sentiment Analysis | Classifies public sentiment in user comments collected from government websites into four emotional types: sadness, anger, fear, and neutral. The data covers messages from 28 government departments. | 11,529 entries |
Understanding and Decision Support Scenario | Government Affairs Understanding | Examines basic understanding of government knowledge and decision-making. | 3,537 entries |
Understanding and Decision Support Scenario | Decision Support | Tests basic government decision-making and judgment abilities, which can help automate the processing of various approval requests and improve decision-making efficiency. | 8,477 entries |
Total | | | 46,451 entries |
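As a quick sanity check on the distribution, the per-task counts in the table above can be tallied per scenario. The snippet below is only an illustrative sketch: the dictionary layout and the names `SCALE` and `scenario_totals` are assumptions made for this example and are not part of the released dataset or any MSGABench tooling.

```python
# Entry counts copied from the table above.
# The dict layout and all names here are illustrative assumptions,
# not an official MSGABench data format.
SCALE = {
    "Document Automation Processing": {
        "Text Classification": 13572,
        "Text Generation": 1000,
        "Text Summarization": 6022,
    },
    "Government Affairs Q&A Service": {
        "Natural Language Understanding": 233,
        "Question Answering": 2081,
        "Sentiment Analysis": 11529,
    },
    "Understanding and Decision Support": {
        "Government Affairs Understanding": 3537,
        "Decision Support": 8477,
    },
}


def scenario_totals(scale):
    """Sum the task-level entry counts within each scenario."""
    return {scenario: sum(tasks.values()) for scenario, tasks in scale.items()}


totals = scenario_totals(SCALE)
grand_total = sum(totals.values())

# Print each scenario's size and its share of the whole dataset.
for scenario, count in totals.items():
    print(f"{scenario}: {count} entries ({count / grand_total:.1%})")
print(f"Total: {grand_total} entries")
```

The grand total computed this way is 46,451, which agrees with the Total row of the table.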