
# Question Answering over Linked Data (QALD)

QALD is a series of evaluation campaigns on question answering over linked data, which aims at providing an up-to-date benchmark for assessing and comparing state-of-the-art systems that mediate between a user, expressing his or her information need in natural language, and RDF data. Thus, it targets all researchers and practitioners working on querying Linked Data, natural language processing for question answering, multilingual information retrieval and related topics. The main goal is to gain insights into the strengths and shortcomings of different approaches and into possible solutions for coping with the large, heterogeneous and distributed nature of Semantic Web data.

The QALD challenge began in 2011 and has since developed a series of benchmarks that are increasingly used as a standard evaluation venue for question answering over Linked Data. Overviews of past editions of the challenge are available in the CLEF working notes, CEUR workshop proceedings, and the ESWC proceedings.

The key challenge for QA over Linked Data is to translate a user's natural language query into a form that can be evaluated using standard Semantic Web query processing and inferencing techniques. The main task of QALD is therefore the following:

> Given one or several RDF dataset(s) as well as additional knowledge sources and natural language questions or keywords, return the correct answers or a SPARQL query that retrieves these answers.
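As a concrete illustration, here is a minimal sketch of such a question–query pair and the per-question F1 used to score answer sets. The question, SPARQL query, and field names are hypothetical (the exact JSON schema varies across QALD editions):

```python
# Hypothetical QALD-style record: a natural-language question paired with
# a gold SPARQL query over DBpedia and its gold answer set.
example = {
    "question": "What is the capital of Germany?",
    "sparql": (
        "PREFIX dbo: <http://dbpedia.org/ontology/> "
        "PREFIX dbr: <http://dbpedia.org/resource/> "
        "SELECT ?capital WHERE { dbr:Germany dbo:capital ?capital }"
    ),
    "answers": {"http://dbpedia.org/resource/Berlin"},
}

def question_f1(gold, predicted):
    """Per-question F1 over answer sets: a system is scored by comparing
    the answers its generated query returns against the gold answers."""
    gold, predicted = set(gold), set(predicted)
    if not gold or not predicted:
        return 0.0
    overlap = len(gold & predicted)
    if overlap == 0:
        return 0.0
    precision = overlap / len(predicted)
    recall = overlap / len(gold)
    return 2 * precision * recall / (precision + recall)

# A system returning exactly the gold answer scores 1.0 on this question.
print(question_f1(example["answers"], {"http://dbpedia.org/resource/Berlin"}))
```

Returning a superset or subset of the gold answers lowers precision or recall respectively, which is why the leaderboards below report all three measures.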

## Table of contents

- QALD-9-Plus
- QALD-9
- QALD-8
- QALD-7
- QALD-6
- QALD-5
- QALD-4
- QALD-3
- QALD-2
- QALD-1

## QALD-9-Plus

Please see the original paper for details about the dataset creation process, data format, task and participating systems.

### Leaderboard

| Model / System | Year | Precision | Recall | F1 | Language | Reported by |
| --- | --- | --- | --- | --- | --- | --- |
| QAnswer | 2022 | - | - | 30.39 (Macro F1) | EN | Perevalov et al. |
| QAnswer | 2022 | - | - | 19.98 (Macro F1) | DE | Perevalov et al. |
| QAnswer | 2022 | - | - | 15.06 (Macro F1) | FR | Perevalov et al. |
| QAnswer | 2022 | - | - | 9.57 (Macro F1) | RU | Perevalov et al. |
| QAnswer | 2022 | - | - | 5.27 (Micro F1) | EN | Perevalov et al. |
| QAnswer | 2022 | - | - | 2.19 (Micro F1) | DE | Perevalov et al. |
| QAnswer | 2022 | - | - | 4.06 (Micro F1) | FR | Perevalov et al. |
| QAnswer | 2022 | - | - | 1.53 (Micro F1) | RU | Perevalov et al. |
| DeepPavlov | 2022 | - | - | 12.40 (Macro F1) | EN | Perevalov et al. |
| DeepPavlov | 2022 | - | - | 0.13 (Micro F1) | EN | Perevalov et al. |
| Platypus | 2022 | - | - | 15.03 (Macro F1) | EN | Perevalov et al. |
| Platypus | 2022 | - | - | 1.26 (Micro F1) | EN | Perevalov et al. |
| DeepPavlov | 2022 | - | - | 8.70 (Macro F1) | RU | Perevalov et al. |
| DeepPavlov | 2022 | - | - | 0.05 (Micro F1) | RU | Perevalov et al. |
| Platypus | 2022 | - | - | 4.17 (Macro F1) | FR | Perevalov et al. |
| Platypus | 2022 | - | - | 0.00 (Micro F1) | FR | Perevalov et al. |
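The QALD-9-Plus leaderboard reports both macro and micro F1. As a minimal sketch of the difference, using made-up answer sets rather than QALD data: macro F1 averages the per-question F1 scores, so every question counts equally, while micro F1 pools true positives, predicted answers, and gold answers across all questions before computing a single precision and recall:

```python
# Toy data (not QALD results): (gold, predicted) answer sets per question.
def prf(gold, pred):
    """True positives, precision, recall, F1 for one question."""
    tp = len(gold & pred)
    p = tp / len(pred) if pred else 0.0
    r = tp / len(gold) if gold else 0.0
    f = 2 * p * r / (p + r) if p + r else 0.0
    return tp, p, r, f

questions = [
    ({"Berlin"}, {"Berlin"}),       # fully correct
    ({"a", "b", "c", "d"}, {"a"}),  # one of four gold answers found
    ({"x"}, set()),                 # unanswered
]

# Macro F1: average the per-question F1 scores.
macro_f1 = sum(prf(g, p)[3] for g, p in questions) / len(questions)

# Micro F1: pool counts across all questions, then compute one P/R/F1.
tp = sum(prf(g, p)[0] for g, p in questions)
n_pred = sum(len(p) for _, p in questions)
n_gold = sum(len(g) for g, _ in questions)
micro_p = tp / n_pred if n_pred else 0.0
micro_r = tp / n_gold if n_gold else 0.0
micro_f1 = (2 * micro_p * micro_r / (micro_p + micro_r)
            if micro_p + micro_r else 0.0)

print(round(macro_f1, 4), round(micro_f1, 4))
```

Unanswered questions drag macro F1 down directly, while micro F1 is dominated by questions with many answers, which is one reason the two numbers in the table diverge so strongly for the same system.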

## QALD-9

Please see the original paper for details about the dataset creation process, data format, task and participating systems.

### Leaderboard

| Model / System | Year | Precision | Recall | F1 | Language | Reported by |
| --- | --- | --- | --- | --- | --- | --- |
| SGPT_Q,K | 2022 | - | - | 67.82 | EN | Al Hasan Rony et al. |
| SPARQLGEN | 2023 | - | - | 67.07 | EN | Kovriguina et al. |
| SGPT_Q | 2022 | - | - | 60.22 | EN | Al Hasan Rony et al. |
| Stage I No Noise [2] | 2022 | 80.40 | 42.10 | 55.30 | EN | Purkayastha et al. |
| LingTeQA [1] | 2020 | 52.60 | 64.20 | 53.50 | EN | P. Nhuan et al. |
| qaSQP | 2019 | 45.80 | 47.10 | 46.30 | EN | Zheng et al. |
| chatGPT | 2023 | - | - | 45.71 | EN | Tan et al. |
| GPT-3.5v3 | 2023 | - | - | 46.19 | EN | Tan et al. |
| NSpM | 2022 | - | - | 45.34 | EN | Al Hasan Rony et al. |
| GPT-3.5v2 | 2023 | - | - | 44.95 | EN | Tan et al. |
| KGQAn | 2023 | 49.81 | 39.39 | 43.99 | EN | Omar et al. |
| Ensemble BR framework | 2023 | 42.40 | 47.60 | 43.00 | EN | Chen et al. |
| KGQAn | 2021 | 50.61 | 34.67 | 41.15 | EN | Omar et al. |
| Light-QAWizard | 2022 | 39.80 | 42.60 | 40.60 | EN | Chen et al. |
| Stage-I Part Noise [7] | 2022 | 63.90 | 28.70 | 39.60 | EN | Purkayastha et al. |
| GPT-3 | 2023 | - | - | 38.54 | EN | Tan et al. |
| Stage-II w/o type [5] | 2022 | 59.40 | 26.10 | 36.20 | EN | Purkayastha et al. |
| Stage-II w/ type [6] | 2022 | 59.40 | 26.10 | 36.20 | EN | Purkayastha et al. |
| Stage-I Full Noise [8] | 2022 | 82.60 | 23 | 36.00 | EN | Purkayastha et al. |
| QAWizard | 2022 | 31.10 | 46.90 | 33.00 | EN | - |
| QAmp | 2019 | 25 | 50 | 33 | EN | Vakulenko et al. |
| QAwizard | 2023 | 31.10 | 46.90 | 33 | EN | Chen et al. |
| WDAqua-core0 | 2021 | - | - | 32 | EN | Orogat et al. |
| NSQA | 2021 | 31.89 | 32.05 | 31.26 | EN | P. Kapanipathi et al. |
| DTQA | 2021 | 31.41 | 32.16 | 30.88 | EN | Abdelaziz et al. |
| NSQA | 2021 | 31.40 | 32.10 | 30.80 | EN | M. Borroto et al. |
| sparql-qa | 2021 | 31 | 32.48 | 30.60 | EN | M. Borroto et al. |
| FLAN-T5 | 2023 | - | - | 30.17 | EN | Tan et al. |
| DTQA | 2023 | 31.40 | 32.20 | 30.10 | EN | Chen et al. |
| gAnswer | 2021 | - | - | 30 | EN | Orogat et al. |
| gAnswer | 2021 | 29.34 | 32.68 | 29.81 | EN | Abdelaziz et al. |
| gAnswer [3] | 2021 | 29.30 | 32.70 | 29.80 | EN | Purkayastha et al. |
| gAnswer2 | 2019 | 29.30 | 32.70 | 29.80 | EN | Zheng et al. |
| gAnswer2 | 2023 | 29.30 | 32.70 | 29.80 | EN | Chen et al. |
| gAnswer | 2021 | 60.70 | 31.60 | 29.60 | EN | L. Siciliani et al. |
| TeBaQA | 2022 | - | - | 28.81 | EN | Al Hasan Rony et al. |
| WDAqua-core1 | 2019 | 22 | 38 | 28 | EN | Vakulenko et al. |
| SQG | 2022 | - | - | 27.85 | EN | Al Hasan Rony et al. |
| WDAqua-core1 | 2019 | 26.10 | 26.70 | 25 | EN | Zheng et al. |
| WDAqua | 2023 | 26.10 | 26.70 | 25 | EN | Chen et al. |
| WDAqua-core1 | 2021 | 26.09 | 26.70 | 24.99 | EN | Abdelaziz et al. |
| qaSearch | 2019 | 23.60 | 24.10 | 23.70 | EN | Zheng et al. |
| QAnswer | 2021 | 45.90 | 22.20 | 19.70 | EN | L. Siciliani et al. |
| QASparql | 2021 | - | - | 19 | EN | Orogat et al. |
| TeBaQA | 2021 | 64.40 | 14.10 | 13.90 | EN | L. Siciliani et al. |
| TeBaQA | 2019 | 12.90 | 13.40 | 13 | EN | Zheng et al. |
| QASystem | 2019 | 9.70 | 11.60 | 9.80 | EN | Zheng et al. |
| AskNow | 2021 | - | - | 8 | EN | Orogat et al. |
| Qanary(TM+DP+QB) | 2021 | - | - | 7 | EN | Orogat et al. |
| Elon | 2021 | 4.90 | 5.30 | 5 | EN | Steinmetz et al. |

- [1]–[8] DBpedia 2016-10.

## QALD-8

Please see the original paper for details about the dataset creation process, data format, task and participating systems.

### Leaderboard

| Model / System | Year | Precision | Recall | F1 | Accuracy | Language | Reported by |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Ensemble BR framework | 2023 | 52.20 | 56.10 | 51.70 | - | EN | Chen et al. |
| qaSQP | 2019 | 45.90 | 46.30 | 46.10 | - | EN | Zheng et al. |
| Light-QAWizard | 2022 | 46.20 | 50 | 45.70 | - | EN | Chen et al. |
| gAnswer2 | 2023 | 38.62 | 39.02 | 38.80 | - | EN | Chen et al. |
| gAnswer2 | 2019 | 38.60 | 39 | 38.80 | - | EN | Zheng et al. |
| gAnswer | 2021 | 38.62 | 39.02 | 38.80 | - | EN | Steinmetz et al. |
| WDAqua-core0 | 2021 | 39.12 | 40.65 | 38.72 | - | EN | Steinmetz et al. |
| WDAqua-core0 | 2019 | 39.10 | 40.70 | 38.70 | - | EN | Zheng et al. |
| WDAqua | 2023 | 39.10 | 40.70 | 38.70 | - | EN | Chen et al. |
| QAwizard | 2023 | 37.50 | 35.80 | 34.30 | - | EN | Chen et al. |
| WDAqua-core0 | 2021 | - | - | 33 | - | EN | Orogat et al. |
| QASparql | 2021 | - | - | 30 | - | EN | Orogat et al. |
| qaSearch | 2019 | 24.40 | 24.40 | 24.40 | - | EN | Zheng et al. |
| AskNow | 2021 | - | - | 13 | - | EN | Orogat et al. |
| Platypus | 2021 | - | - | 6 | - | EN | Orogat et al. |
| QAKiS | 2021 | 6.10 | 5.28 | 5.63 | - | EN | Steinmetz et al. |
| QAKiS | 2019 | 6.10 | 5.30 | 5.60 | - | EN | Zheng et al. |
| Qanary(TM+DP+QB) | 2021 | - | - | 4 | - | EN | Orogat et al. |
| Entity Type Tags Modified | 2022 | - | - | - | 88.15 | EN | Lin and Lu |
| SPARQL Generator | 2022 | - | - | - | 40.09 | EN | Lin and Lu |

## QALD-7

Please see the original paper for details about the dataset creation process, data format, task and participating systems.

### Leaderboard

| Model / System | Year | Precision | Recall | F1 | Accuracy | Language | Reported by |
| --- | --- | --- | --- | --- | --- | --- | --- |
| LAMA | 2019 | - | - | 90.50 | - | EN | Radoev et al. |
| LingTeQA [1] | 2020 | 63.40 | 73.50 | 64.20 | - | EN | D. Nhuan et al. |
| Liang et al. | 2021 | 81.30 | 52.70 | 63.90 | - | EN | Liang et al. |
| Ensemble BR framework | 2023 | 59.80 | 69.60 | 61.20 | - | EN | Chen et al. |
| Light-QAWizard | 2022 | 56.50 | 65.20 | 59.40 | - | EN | Chen et al. |
| QAwizard | 2023 | 59 | 59 | 59 | - | EN | Chen et al. |
| gAnswer2 | 2020 | 55.70 | 59.20 | 55.60 | - | EN | Athreya et al. |
| WDAqua-core0 | 2021 | 48.80 | 53.50 | 51.10 | - | EN | Liang et al. |
| WDAqua-core0 | 2020 | 49 | 54 | 51 | - | EN | Athreya et al. |
| gAnswer2 | 2023 | 46.90 | 49.80 | 48.70 | - | EN | Chen et al. |
| TeBaQA RNN | 2020 | 41.60 | 42.30 | 41.70 | - | EN | Athreya et al. |
| GSM | 2022 | 38 | 39 | 38 | - | EN | Liu et al. |
| G. Maheshwari et al. Pointwise | 2019 | 28 | 43 | 34 | - | EN | G. Maheshwari et al. |
| AQG-Net | 2022 | 30 | 37 | 33 | - | EN | Liu et al. |
| gRGCN | 2021 | 31.33 | 35.41 | 30.24 | - | EN | Wu et al. |
| WDAqua-core0 | 2021 | - | - | 29 | - | EN | Orogat et al. |
| G. Maheshwari et al. Pairwise | 2019 | 22 | 38 | 28 | - | EN | G. Maheshwari et al. |
| gGCN | 2021 | 23.34 | 31.09 | 24.37 | - | EN | Wu et al. |
| GGNN | 2021 | 21.76 | 27.51 | 21.10 | - | EN | Wu et al. |
| Luo et al. | 2021 | 21.17 | 24.38 | 20.16 | - | EN | Wu et al. |
| HR-BiLSTM | 2022 | 20 | 19 | 19 | - | EN | Liu et al. |
| Yu et al. | 2021 | 19.72 | 21.03 | 19.23 | - | EN | Wu et al. |
| STAGG | 2021 | 19.34 | 24.63 | 18.61 | - | EN | Wu et al. |
| QASparql | 2021 | - | - | 17 | - | EN | Orogat et al. |
| WDAqua | 2023 | 16 | 16.20 | 16.30 | - | EN | Chen et al. |
| AskNow | 2021 | - | - | 15 | - | EN | Orogat et al. |
| Platypus | 2021 | - | - | 8 | - | EN | Orogat et al. |
| Qanary(TM+DP+QB) | 2021 | - | - | 6 | - | EN | Orogat et al. |
| Entity Type Tags Modified | 2022 | - | - | - | 76.69 | EN | Lin and Lu |
| SPARQL Generator | 2022 | - | - | - | 60.74 | EN | Lin and Lu |

- [1] Wikidata.

## QALD-6

Please see the original paper for details about the dataset creation process, data format, task and participating systems.

### Leaderboard

| Model / System | Year | Precision | Recall | F1 | Language | Reported by |
| --- | --- | --- | --- | --- | --- | --- |
| gAnswer | 2017 | 70 | 89 | 78 | EN | Hu et al. |
| gAnswer | 2021 | - | - | 25 | EN | Orogat et al. |
| WDAqua-core0 | 2021 | - | - | 24 | EN | Orogat et al. |
| QASparql | 2021 | - | - | 17 | EN | Orogat et al. |
| AskNow | 2021 | - | - | 9 | EN | Orogat et al. |
| Qanary(TM+DP+QB) | 2021 | - | - | 2 | EN | Orogat et al. |

## QALD-5

Please see the original paper for details about the dataset creation process, data format, task and participating systems.

### Leaderboard

| Model / System | Year | Precision | Recall | F1 | Language | Reported by |
| --- | --- | --- | --- | --- | --- | --- |
| Xser | 2020 | 74 | 72 | 73 | EN | Diefenbach et al. |
| UTQA | 2016 | - | - | 65.2 | EN | Ben Veyseh |
| UTQA | 2020 | - | - | 65 | EN | Diefenbach et al. |
| UTQA | 2020 | 55 | 53 | 54 | ES | Diefenbach et al. |
| UTQA | 2020 | 53 | 51 | 52 | FA | Diefenbach et al. |
| WDAqua-core1 | 2020 | 56 | 41 | 47 | EN | Diefenbach et al. |
| AskNow | 2020 | 32 | 34 | 33 | EN | Diefenbach et al. |
| WDAqua-core1 | 2020 | 88 | 18 | 30 | IT | Diefenbach et al. |
| QAnswer | 2020 | 34 | 26 | 29 | EN | Diefenbach et al. |
| WDAqua-core1 | 2020 | 92 | 16 | 28 | DE | Diefenbach et al. |
| WDAqua-core1 | 2020 | 90 | 16 | 28 | FR | Diefenbach et al. |
| WDAqua-core1 | 2020 | 88 | 14 | 25 | ES | Diefenbach et al. |
| gAnswer | 2021 | - | - | 20 | EN | Orogat et al. |
| SemGraphQA | 2020 | 19 | 20 | 20 | EN | Diefenbach et al. |
| WDAqua-core0 | 2021 | - | - | 18 | EN | Orogat et al. |
| YodaQA | 2020 | 18 | 17 | 18 | EN | Diefenbach et al. |
| QASparql | 2021 | - | - | 12 | EN | Orogat et al. |
| AskNow | 2021 | - | - | 9 | EN | Orogat et al. |
| Qanary(TM+DP+QB) | 2021 | - | - | 2 | EN | Orogat et al. |

## QALD-4

Please see the original paper for details about the dataset creation process, data format, task and participating systems.

### Leaderboard

| Model / System | Year | Precision | Recall | F1 | Language | Reported by |
| --- | --- | --- | --- | --- | --- | --- |
| Zhang et al. | 2016 | 89 | 88 | 88 | EN | Zhang et al. |
| POMELO | 2016 | 82 | 87 | 85 | EN | Zhang et al. |
| SINA | 2016 | 80 | 78 | 79 | EN | Zhang et al. |
| Xser | 2020 | 72 | 71 | 72 | EN | Diefenbach et al. |
| WDAqua-core1 | 2020 | 56 | 30 | 39 | EN | Diefenbach et al. |
| gAnswer | 2020 | 37 | 37 | 37 | EN | Diefenbach et al. |
| CASIA | 2020 | 32 | 40 | 36 | EN | Diefenbach et al. |
| WDAqua-core1 | 2020 | 90 | 20 | 32 | DE | Diefenbach et al. |
| WDAqua-core1 | 2020 | 92 | 20 | 32 | IT | Diefenbach et al. |
| WDAqua-core1 | 2020 | 90 | 20 | 32 | ES | Diefenbach et al. |
| WDAqua-core1 | 2020 | 86 | 18 | 29 | FR | Diefenbach et al. |
| Intui3 | 2020 | 23 | 25 | 24 | EN | Diefenbach et al. |
| ISOFT | 2020 | 21 | 26 | 23 | EN | Diefenbach et al. |
| Hakimov | 2020 | 52 | 13 | 21 | EN | Diefenbach et al. |
| gAnswer | 2021 | - | - | 16 | EN | Orogat et al. |
| RO FII | 2016 | 16 | 16 | 16 | EN | Zhang et al. |
| WDAqua-core0 | 2021 | - | - | 12 | EN | Orogat et al. |
| QASparql | 2021 | - | - | 8 | EN | Orogat et al. |
| AskNow | 2021 | - | - | 8 | EN | Orogat et al. |
| Qanary(TM+DP+QB) | 2021 | - | - | 1 | EN | Orogat et al. |

## QALD-3

Please see the original paper for details about the dataset creation process, data format, task and participating systems.

### Leaderboard

| Model / System | Year | Precision | Recall | F1 | Language | Reported by |
| --- | --- | --- | --- | --- | --- | --- |
| virtual player | 2015 | - | - | 64.29 | EN | Molino et al. |
| virtual player | 2015 | - | - | 59.47 | EN | Molino et al. |
| WDAqua-core1 | 2020 | 64 | 42 | 51 | EN | Diefenbach et al. |
| WDAqua-core1 | 2020 | 79 | 28 | 42 | DE | Diefenbach et al. |
| WDAqua-core1 | 2020 | 83 | 27 | 41 | FR | Diefenbach et al. |
| gAnswer | 2020 | 40 | 40 | 40 | EN | Diefenbach et al. |
| WDAqua-core1 | 2020 | 70 | 26 | 38 | FR | Diefenbach et al. |
| Zhu et al. | 2020 | 38 | 42 | 38 | EN | Diefenbach et al. |
| WDAqua-core1 | 2020 | 77 | 24 | 37 | ES | Diefenbach et al. |
| CASIA | 2013 | 35 | 36 | 36 | EN | S. He et al. |
| WDAqua-core1 | 2020 | 79 | 23 | 36 | IT | Diefenbach et al. |
| RTV | 2020 | 32 | 34 | 33 | EN | Diefenbach et al. |
| Intui2 | 2020 | 32 | 32 | 32 | EN | Diefenbach et al. |
| SINA | 2020 | 32 | 32 | 32 | EN | Diefenbach et al. |
| Intui2 [1] | 2013 | 32 | 32 | 32 | EN | Corina Dima |
| DEANNA | 2020 | 21 | 21 | 21 | EN | Diefenbach et al. |
| SWIP | 2020 | 16 | 17 | 17 | EN | Diefenbach et al. |
| gAnswer | 2021 | - | - | 16 | EN | Orogat et al. |
| AskNow | 2021 | - | - | 13 | EN | Orogat et al. |
| WDAqua-core0 [2] | 2021 | - | - | 11 | EN | Orogat et al. |
| QASparql | 2021 | - | - | 6 | EN | Orogat et al. |
| Qanary(TM+DP+QB) | 2021 | - | - | 2 | EN | Orogat et al. |

- [1] DBpedia 3.8.
- [2] DBpedia 2016-04.

## QALD-2

Please see the original paper for details about the dataset creation process, data format, task and participating systems.

### Leaderboard

| Model / System | Year | Precision | Recall | F1 | Language | Reported by |
| --- | --- | --- | --- | --- | --- | --- |
| robustQA [1] | 2013 | 68 | 68 | 68 | EN | Yahya et al. |
| BELA | 2012 | 73 | 62 | 67 | EN | Walter et al. |
| TLDRet [2] | 2018 | 63 | 63 | 63 | EN | Rahoman and Ichise |
| SenseAware | 2013 | 51 | 53 | 52 | EN | Elbedweihy et al. |
| semanticQA | 2013 | 83 | 32 | 46 | EN | Hakimov et al. |
| SemSeK [3] | 2013 | 44 | 48 | 46 | EN | Lopez et al. |
| Alexandria | 2013 | 43 | 46 | 45 | EN | Lopez et al. |
| QAKiS | 2013 | 39 | 37 | 38 | EN | Cabrio et al. |
| MHE | 2013 | 36 | 40 | 38 | EN | Lopez et al. |
| QAKiS | 2013 | 39 | 37 | 38 | EN | Lopez et al. |
| WolframAlpha | 2012 | 32 | 30 | 30.9 | EN | Walter et al. |
| robustQA [4] | 2013 | 50 | 15 | 23 | EN | Yahya et al. |
| gAnswer | 2021 | - | - | 21 | EN | Orogat et al. |
| WDAqua-core0 | 2021 | - | - | 16 | EN | Orogat et al. |
| AskNow | 2021 | - | - | 10 | EN | Orogat et al. |
| QASparql | 2021 | - | - | 1 | EN | Orogat et al. |

- [1] Factoid-type questions.
- [2] Only temporal questions.
- [3] DBpedia 3.7.
- [4] List-type questions.

## QALD-1

Please see the original paper for details about the dataset creation process, data format, task and participating systems.

### Leaderboard

| Model / System | Year | Precision | Recall | F1 | Language | Reported by |
| --- | --- | --- | --- | --- | --- | --- |
| FREyA [1] | 2013 | 63 | 54 | 58 | EN | Lopez et al. |
| PowerAqua | 2013 | 52 | 48 | 50 | EN | Lopez et al. |
| gAnswer | 2021 | - | - | 24 | EN | Orogat et al. |
| WDAqua-core0 | 2021 | - | - | 14 | EN | Orogat et al. |
| AskNow | 2021 | - | - | 7 | EN | Orogat et al. |
| QASparql | 2021 | - | - | 1 | EN | Orogat et al. |

- [1] DBpedia 3.6.

Go back to the README