Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add tests for CKY #53

Merged
merged 3 commits into from
Mar 8, 2020
Merged

add tests for CKY #53

merged 3 commits into from
Mar 8, 2020

Conversation

zhaoyanpeng
Copy link
Contributor

This PR fixes several bugs in k-best parsing with dist.topk() and includes a simple test to test the function.

I made incremental changes so that existing modules relying on the CKY will not be affected.

@srush
Copy link
Collaborator

srush commented Mar 7, 2020

Hi Yanpeng,

This is great, thanks.

  • Mind running black on your new test file (https://github.com/psf/black) that should fix the test failure?

  • Is TopK fast enough for your use case? I was thinking of trying a different approach.

@srush
Copy link
Collaborator

srush commented Mar 7, 2020

Could that also be the issue here as well?

#50

@zhaoyanpeng
Copy link
Contributor Author

Hi Yanpeng,

This is great, thanks.

working on it now.

  • Is TopK fast enough for your use case? I was thinking of trying a different approach.

Yes. But speed is not a big concern. TopK consumes quite a lot of GPU memory with large K. I will give you more details later.

@zhaoyanpeng
Copy link
Contributor Author

Could that also be the issue here as well?

#50

Yes, it could be. I set cache=False in initializing Chart and fixed the issue in CKY. Could you double check if this can resolve the issue once for all?

@zhaoyanpeng zhaoyanpeng force-pushed the test_cky branch 2 times, most recently from ce8123e to b0d3f42 Compare March 8, 2020 00:16
@srush
Copy link
Collaborator

srush commented Mar 8, 2020

Gotcha. Maybe we can add a GPU topk.

@srush srush merged commit 67aa60d into harvardnlp:master Mar 8, 2020
@zhaoyanpeng
Copy link
Contributor Author

Gotcha. Maybe we can add a GPU topk.

I tested topk on GTX 1080 with 11g GPU memory. Setting k = 15 parsing sentences of length above 25 will run out of memory.

It might be far from being practically usable since re-ranking based parsers require more than 15 best parses (e.g., Richard Socher et. al., 2013 require top 200 parses), and most sentences are longer than 25.

What is the GPU topk? The current topk can run on GPUs but is memory-hungry.

@srush
Copy link
Collaborator

srush commented Mar 8, 2020

Gotcha. Let me know if you personally need topk for research. I can take a look at some memory reduction ideas. I mostly included it because it was fun.

For some semirings I wrote custom cuda implementations to save memory. We could do that for topk.

@zhaoyanpeng
Copy link
Contributor Author

Gotcha. Let me know if you personally need topk for research. I can take a look at some memory reduction ideas. I mostly included it because it was fun.

Yes. My project involves analyzing the top k parses of a parser. I can do k passes to get k best parses without memory issues. But I find this repo amazing! So It would be great to include a memory-efficient implementation of topk.

@zhaoyanpeng zhaoyanpeng deleted the test_cky branch March 10, 2020 19:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants