SMU-AI-Lab/COVID-19-Research

COVID-19-Open-Research-Dataset-Challenge

Eli Laird
SMU AI Lab
ejlaird@smu.edu

This repo contains our efforts to build Natural Language Processing tools to help researchers around the world process ~51,000 research papers related to COVID-19 and similar coronaviruses. All work was done by several undergraduate members of the SMU AI Lab with the guidance of SMU AI Lab faculty.

Note: This repo is a work in progress. The documentation is still light and unorganized. Please be patient for a final release of our work.

The dataset used in this repo is provided by the Allen Institute For AI and posted on Kaggle.com. The dataset can be found here: https://www.kaggle.com/allen-institute-for-ai/CORD-19-research-challenge

RESULTS:

topics.csv:

lda_25_model.p

bhattacharya_similarity_matrix.csv

metadata.csv

  • Description: Metadata for bhattacharya_similarity_matrix.csv
    • 'Corpus' : Describes the corpus each document comes from
      • Values: {'main', 'task'}
    • 'Doc_Index' : Index for document in corresponding corpus
    • 'Topic' : Topic assignment for document
  • Link: https://smu.box.com/s/tflz9uyoul3yg3jbheynilce3lma5x35
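
A minimal sketch of how metadata.csv could be read and grouped, using only the Python standard library. The column names 'Corpus', 'Doc_Index', and 'Topic' come from the description above. Everything else is an assumption for illustration: that the file has a header row, that 'Doc_Index' and 'Topic' are integers, the helper names `load_metadata` and `group_by_topic`, and the sample rows, which are made up and do not come from the real file.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample rows for illustration only; the real metadata.csv
# is downloaded from the Box link above and may differ in layout.
SAMPLE_METADATA = """Corpus,Doc_Index,Topic
main,0,3
main,1,17
task,0,3
task,1,8
"""


def load_metadata(handle):
    """Parse metadata rows into dicts (assumes a header row and integer fields)."""
    rows = []
    for record in csv.DictReader(handle):
        rows.append({
            "corpus": record["Corpus"],              # 'main' or 'task'
            "doc_index": int(record["Doc_Index"]),   # index within that corpus
            "topic": int(record["Topic"]),           # assumed integer topic id
        })
    return rows


def group_by_topic(rows, corpus):
    """Map each topic to the document indices of one corpus assigned to it."""
    grouped = defaultdict(list)
    for row in rows:
        if row["corpus"] == corpus:
            grouped[row["topic"]].append(row["doc_index"])
    return dict(grouped)


rows = load_metadata(io.StringIO(SAMPLE_METADATA))
main_by_topic = group_by_topic(rows, "main")
task_by_topic = group_by_topic(rows, "task")
print(main_by_topic)
print(task_by_topic)
```

To run this against the downloaded file, replace `io.StringIO(SAMPLE_METADATA)` with `open("metadata.csv", newline="")`, adjusting the path and the field parsing if the real file is laid out differently.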
