Petroles/regis-collection


REGIS Collection - Retrieval Evaluation for Geoscientific Information Systems

DOI: 10.5281/zenodo.4726013

Public repository for the paper REGIS: A Test Collection for Geoscientific Documents in Portuguese

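Lucas Lima de Oliveira, Regis Kruel Romeu, Viviane Pereira Moreira. "REGIS: A Test Collection for Geoscientific Documents in Portuguese". In Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '21), Association for Computing Machinery, 2021, pp. 2363-2368. DOI: 10.1145/3404835.3463256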

Paper link: https://doi.org/10.1145/3404835.3463256
Video presentation: https://www.youtube.com/watch?v=WkwszoWEqLY
Slides presentation: https://drive.google.com/file/d/15HLmChpmTI7N2YsBy73gaY_Lfrja5O6f/view

About

Experimental validation is key to the development of Information Retrieval (IR) systems. The standard evaluation paradigm requires a test collection with documents, queries, and relevance judgments. Creating test collections requires significant human effort, mainly for providing relevance judgments. As a result, there are still many domains and languages that, to this day, lack a proper evaluation testbed. Portuguese is an example of a major world language that has been overlooked in terms of IR research -- the only test collection available is composed of news articles from 1994 and a hundred queries. With the aim of bridging this gap, in this paper, we developed REGIS (Retrieval Evaluation for Geoscientific Information Systems), a test collection for the geoscientific domain in Portuguese. REGIS contains 20K documents and 34 query topics along with relevance assessments. We describe the procedures for document collection, topic creation, and relevance assessment. In addition, we report on results of standard IR techniques on REGIS so that they can serve as a baseline for future research.
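As a rough illustration of the evaluation paradigm described above (documents, queries, and relevance judgments), the sketch below scores a ranked run against judgments using mean average precision. It is a minimal sketch, not the evaluation code used in the paper: the function names, the in-memory dictionary layout, the topic and document ids, and the binary notion of relevance are all assumptions made for illustration and do not reflect the actual REGIS file formats.

```python
from typing import Dict, List, Set


def average_precision(ranking: List[str], relevant: Set[str]) -> float:
    """Average precision of one ranked list against a set of relevant doc ids."""
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranking, start=1):
        if doc_id in relevant:
            hits += 1
            # Precision at this rank, accumulated only at relevant positions.
            precision_sum += hits / rank
    return precision_sum / len(relevant)


def mean_average_precision(
    run: Dict[str, List[str]], qrels: Dict[str, Set[str]]
) -> float:
    """Mean of per-topic average precision over all judged topics."""
    scores = [
        average_precision(run.get(topic, []), relevant)
        for topic, relevant in qrels.items()
    ]
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical toy data: topic id -> relevant doc ids, topic id -> ranked doc ids.
qrels = {"Q1": {"d1", "d3"}, "Q2": {"d7"}}
run = {"Q1": ["d1", "d2", "d3"], "Q2": ["d5", "d7"]}

print(f"MAP = {mean_average_precision(run, qrels):.4f}")
```

Topics missing from the run are scored as zero rather than skipped, so a system is not rewarded for returning nothing on hard topics.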

How to cite this work?

@inproceedings{oliveira2021,
  author = {Lima de Oliveira, Lucas and Romeu, Regis Kruel and Moreira, Viviane Pereira},
  title = {REGIS: A Test Collection for Geoscientific Documents in Portuguese},
  year = {2021},
  isbn = {9781450380379},
  publisher = {Association for Computing Machinery},
  address = {New York, NY, USA},
  url = {https://doi.org/10.1145/3404835.3463256},
  doi = {10.1145/3404835.3463256},
  booktitle = {Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval},
  pages = {2363--2368},
  numpages = {6},
  keywords = {test collection, information retrieval, geoscientific data},
  location = {Virtual Event, Canada},
  series = {SIGIR '21}
}
