nlp@21 class projects task list:

Dataset(Arxiv6K):

About: 6000+ arXiv papers from the AI category, dated 2020. The dataset contains LaTeX source files and images, which makes it a good research dataset for multimodal learning.

Dataset URL: https://pan.baidu.com/s/1DsLVmZno7JSWxNQ9CBbBJQ
Dataset size: ~20 GB (compressed).

Task1(ResearchKG):

About: Build a fine-grained knowledge graph from the research papers in Arxiv6K. Consider answering the following questions:

Which sentence is most similar to a given sentence?
What concepts can be extracted from the corpus?
Which concept is relevant to a given phrase/concept and in what manner?
Which concepts are relevant to a given research problem?
Which concepts are clustered together in one paragraph/section/paper?
... other important questions...
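
As a starting point for the first question above (finding the sentence most similar to a given one), here is a minimal baseline sketch using TF-IDF vectors and cosine similarity with only the Python standard library. The function names (`tokenize`, `build_tfidf`, `most_similar`) and the smoothed IDF formula are illustrative assumptions, not part of the task specification; a real project would likely swap in learned sentence embeddings.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def vectorize(tokens, idf):
    """Build a sparse TF-IDF vector (dict), ignoring tokens unseen in the corpus."""
    counts = Counter(tokens)
    return {tok: c * idf[tok] for tok, c in counts.items() if tok in idf}


def build_tfidf(sentences):
    """Return one sparse TF-IDF vector per sentence, plus the IDF table."""
    tokenized = [tokenize(s) for s in sentences]
    n = len(tokenized)
    doc_freq = Counter(tok for toks in tokenized for tok in set(toks))
    # Smoothed IDF (an assumption): common terms keep a small positive weight.
    idf = {tok: math.log((1 + n) / (1 + df)) + 1.0 for tok, df in doc_freq.items()}
    return [vectorize(toks, idf) for toks in tokenized], idf


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(tok, 0.0) for tok, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(query, sentences):
    """Return (best_sentence, score) for the corpus sentence closest to the query."""
    vectors, idf = build_tfidf(sentences)
    query_vec = vectorize(tokenize(query), idf)
    scores = [cosine(query_vec, v) for v in vectors]
    best = max(range(len(sentences)), key=scores.__getitem__)
    return sentences[best], scores[best]
```

Calling `most_similar(query, sentences)` with sentences split out of the papers returns the closest sentence and its score. Because this baseline only matches surface tokens, it misses paraphrases, which is one reason to compare it against embedding-based approaches.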

Task2(MultiModalSys):

About: Build a multimodal retrieval or recommendation system supporting text, images, formulas, and tables. Consider answering the following questions:

Which image is most relevant to a given sentence/query?
Which sentence/paragraph is most relevant to a given image?
Which formulas are relevant to a given sentence/query?
Which tables are relevant to a given sentence/query?
What concepts are relevant to a given formula?
... other important questions ...
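
Since the dataset ships LaTeX sources, one possible first step toward the questions above is to pull figures, tables, and formulas out of each `.tex` file so they can be indexed. The sketch below is a rough illustration under stated assumptions: it only handles the `figure`, `table`, `equation`, and `align` environments (and their starred forms), assumes captions contain no nested braces, and uses a hypothetical helper name `extract_items`. Real papers use many more environments and macros.

```python
import re

# Matches \begin{env} ... \end{env} for a small, assumed set of environments.
ENV_PATTERN = re.compile(
    r"\\begin\{(figure|table|equation|align)\*?\}(.*?)\\end\{\1\*?\}",
    re.DOTALL,
)
# Image path inside \includegraphics[options]{path}.
GRAPHICS_PATTERN = re.compile(r"\\includegraphics(?:\[[^\]]*\])?\{([^}]+)\}")
# Caption text; captions with nested braces are not handled by this sketch.
CAPTION_PATTERN = re.compile(r"\\caption\{([^{}]*)\}")


def extract_items(latex):
    """Return a list of dicts describing figures, tables, and formulas."""
    items = []
    for match in ENV_PATTERN.finditer(latex):
        kind, body = match.group(1), match.group(2).strip()
        caption = CAPTION_PATTERN.search(body)
        items.append(
            {
                "kind": kind,
                "body": body,
                "caption": caption.group(1).strip() if caption else None,
                "images": GRAPHICS_PATTERN.findall(body),
            }
        )
    return items


sample = r"""
\begin{figure}
\includegraphics[width=\linewidth]{figs/model.png}
\caption{Model overview.}
\end{figure}
\begin{equation}
E = mc^2
\end{equation}
"""

for item in extract_items(sample):
    print(item["kind"], item["caption"], item["images"])
```

Each extracted item pairs an image path or formula body with nearby text (the caption), which gives a simple text handle for retrieval before any image or formula encoder is introduced.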

Task3(AIHelper):

About: Build an AI helper system for computer science.

See Home for Researchers for reference.

Task4(DIY):

About: Build your own dataset and develop some interesting models with it.

Suggestions are welcome. Current tasks may be updated, and new tasks may be added in the future.
