Statistics subquery incorrect when using repetition and boundaries #354

Open
arildm opened this issue Mar 28, 2024 · 0 comments

I previously "fixed" #289 by adding a []* to the subquery, but that change seems to have introduced new problems.

This query in the NPEGL mode, https://spraakbanken.gu.se/korplabb/?mode=npegl#?cqp=%3Cnp%3E%20%5B%5D%7B0,10%7D%20%5Bword%20%3D%20%22b%C3%A6%C3%B0i%22%5D%20%5B%5D%7B0,10%7D%20%3C%2Fnp%3E&corpus=npegl-ice&search_tab=1&within=text&show_stats&result_tab=2&search=cqp, shows a statistics row with 6 hits for "bæði", but clicking that row yields 14 hits, namely all matches that begin with "bæði".

An example from an "ordinary" corpus: https://spraakbanken.gu.se/korplabb/#?cqp=%5Bpos%20%3D%20%22MID%7CMAD%7CPAD%22%5D%20%5B%5D%7B0,1%7D%20%3C%2Fsentence%3E&corpus=attasidor&search_tab=1&show_stats&result_tab=2&search=cqp. Here the value ")" is reported 8 times, but the link shows 17 hits, including some ") ." matches. (This example is no longer needed now that NPEGL is public.)
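
For readability, the CQP expressions behind the two links can be recovered by decoding the `cqp` parameter in each URL fragment. This is only a minimal sketch using Python's standard library; the helper name `cqp_from_korp_url` is made up for illustration and is not part of Korp.

```python
from urllib.parse import parse_qs, urlsplit

NPEGL_URL = (
    "https://spraakbanken.gu.se/korplabb/?mode=npegl#?cqp=%3Cnp%3E%20%5B%5D%7B0,10%7D"
    "%20%5Bword%20%3D%20%22b%C3%A6%C3%B0i%22%5D%20%5B%5D%7B0,10%7D%20%3C%2Fnp%3E"
    "&corpus=npegl-ice&search_tab=1&within=text&show_stats&result_tab=2&search=cqp"
)
ATTASIDOR_URL = (
    "https://spraakbanken.gu.se/korplabb/#?cqp=%5Bpos%20%3D%20%22MID%7CMAD%7CPAD%22%5D"
    "%20%5B%5D%7B0,1%7D%20%3C%2Fsentence%3E"
    "&corpus=attasidor&search_tab=1&show_stats&result_tab=2&search=cqp"
)


def cqp_from_korp_url(url: str) -> str:
    """Return the percent-decoded `cqp` parameter from a Korp URL.

    Hypothetical helper: it assumes the search state is stored in the URL
    fragment as `?key=value&...`, as in the two links above.
    """
    fragment = urlsplit(url).fragment      # everything after '#'
    params = parse_qs(fragment.lstrip("?"))  # decodes %XX escapes as UTF-8
    return params["cqp"][0]


print(cqp_from_korp_url(NPEGL_URL))
# <np> []{0,10} [word = "bæði"] []{0,10} </np>
print(cqp_from_korp_url(ATTASIDOR_URL))
# [pos = "MID|MAD|PAD"] []{0,1} </sentence>
```

Both queries combine a repetition (`[]{0,10}` or `[]{0,1}`) with a structural boundary (`<np> ... </np>` or `</sentence>`), which is the combination named in the issue title.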
