Missing Files for Running COS Inference on OTT-QA #7

Closed
pshlego opened this issue Dec 14, 2023 · 5 comments

Comments

@pshlego

pshlego commented Dec 14, 2023

Hello!

First, I would like to express my gratitude for your work.
I have a question regarding a problem I'm facing while trying to run COS inference on OTT-QA.

It seems that the files expected by the following arguments do not exist on Hugging Face. Could you help me with this?

  • encoded_ctx_files=[/path/to/ott_table_original*]
  • ctx_datatsets=[/path/to/ott_table_chunks_original.json,/path/to/ott_wiki_passages.json,[/path/to/table_chunks_to_passages*]]
@Mayer123
Owner

Hi there,

Sorry for the late reply. The missing data are actually released under the CORE repo. You can download them here:
https://huggingface.co/kaixinm/CORE/blob/main/data/knowledge.zip

@pshlego
Author

pshlego commented Dec 18, 2023

Thank you for your response!
I have confirmed that ott_table_chunks_original.json and ott_wiki_passages.json are present in the link you shared.

However, it seems that table_chunks_to_passages*, ott_table_original*, and ott_wiki_linker* are not included. Could you please let me know where these files are located?

@Mayer123
Owner

Because of their large size, table_chunks_to_passages*, ott_table_original*, and ott_wiki_linker* were not migrated to HF previously. You can generate them by running inference with the model and data you already have:

ott_table_original*: use the retrieval index expert (expert id 1) to compute embeddings for ott_table_chunks_original.json

ott_wiki_linker*: use the entity index expert (expert id 3) to compute embeddings for ott_wiki_passages.json

table_chunks_to_passages*: use the entity span expert (expert id 5) to identify entities in the tables in ott_table_chunks_original.json, encode those entities with the entity representation expert (expert id 2), and then run retrieval against ott_wiki_linker*

You can refer to the COS README for how to use each of the experts for inference. I will also try to migrate table_chunks_to_passages* in the next few days.
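
For readers following along, the three steps above have one ordering constraint: table_chunks_to_passages* needs ott_wiki_linker* to exist first, while ott_table_original* is independent. A minimal Python sketch of that bookkeeping follows. It only uses the file names and expert ids mentioned in this comment; the `STEPS` layout and the `build_order` helper are hypothetical and not part of the COS codebase, and the sketch does not run any expert (see the COS README for the actual inference commands).

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical summary of the comment above: each derived artifact maps to
# the expert id(s) used to produce it and the inputs it is computed from.
STEPS = {
    "ott_table_original*": {
        "experts": [1],  # retrieval index expert
        "inputs": ["ott_table_chunks_original.json"],
    },
    "ott_wiki_linker*": {
        "experts": [3],  # entity index expert
        "inputs": ["ott_wiki_passages.json"],
    },
    "table_chunks_to_passages*": {
        "experts": [5, 2],  # entity span expert, then entity representation expert
        "inputs": ["ott_table_chunks_original.json", "ott_wiki_linker*"],
    },
}


def build_order(steps):
    """Return the derived artifacts in an order that respects their dependencies."""
    # Only inputs that are themselves derived artifacts create ordering
    # constraints; downloaded files such as the .json inputs do not.
    graph = {
        name: [dep for dep in spec["inputs"] if dep in steps]
        for name, spec in steps.items()
    }
    return list(TopologicalSorter(graph).static_order())


for artifact in build_order(STEPS):
    spec = STEPS[artifact]
    print(f"{artifact}: expert id(s) {spec['experts']} on {spec['inputs']}")
```

Any order that places ott_wiki_linker* before table_chunks_to_passages* is valid; the two embedding steps can be run in either order or in parallel.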

@pshlego
Author

pshlego commented Dec 19, 2023

Thank you for your guidance on how to generate the files. It's clear now.
It is also great to hear that you are planning to migrate table_chunks_to_passages* in the coming days.
Please let me know once the migration is complete, as it will certainly be helpful.

@Mayer123
Owner

Update: table_chunks_to_passages* has been added to the HF repo (data/OTT_table_to_pasg_links.zip).

@pshlego pshlego closed this as completed Jan 6, 2024