This project explores the COVID-19 Open Research Dataset (CORD-19) metadata, using a Jupyter Notebook for analysis and a Streamlit app for interactive exploration.
- Load and filter research papers by publication year.
- Visualize publication trends over time.
- Generate a Word Cloud from paper titles.
- Explore metadata samples interactively.
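The first three features can be sketched with the standard library alone. This is a minimal illustration, not the notebook's actual code (the notebook uses Pandas): it assumes the metadata CSV has `title` and `publish_time` columns, and the helper names, stopword list, and sample rows are made up for the example.

```python
import csv
import io
import re
from collections import Counter

# Assumption: `publish_time` starts with a four-digit year (e.g. "2020-03-15").
# The stopword list is illustrative, not the one used by the project.
STOPWORDS = {"the", "of", "and", "in", "a", "an", "for", "to", "on", "with"}


def read_rows(handle):
    """Read CSV rows into a list of dicts keyed by column name."""
    return list(csv.DictReader(handle))


def publication_year(row):
    """Return the publication year as an int, or None if missing or malformed."""
    match = re.match(r"(\d{4})", row.get("publish_time") or "")
    return int(match.group(1)) if match else None


def filter_by_year(rows, start, end):
    """Keep rows whose publication year falls within [start, end]."""
    kept = []
    for row in rows:
        year = publication_year(row)
        if year is not None and start <= year <= end:
            kept.append(row)
    return kept


def papers_per_year(rows):
    """Count papers per publication year, sorted by year (rows without a year are skipped)."""
    counts = Counter(publication_year(row) for row in rows)
    counts.pop(None, None)
    return dict(sorted(counts.items()))


def title_word_counts(rows):
    """Count words in titles; these frequencies are what a word cloud visualizes."""
    counts = Counter()
    for row in rows:
        words = re.findall(r"[a-z][a-z0-9-]*", (row.get("title") or "").lower())
        counts.update(w for w in words if w not in STOPWORDS and len(w) > 2)
    return counts


# Made-up sample rows standing in for the real metadata file.
SAMPLE = """title,publish_time
Transmission of a novel coronavirus,2020-01-24
Coronavirus vaccine candidates,2020-06-10
SARS outbreak review,2003-11-01
Untitled preprint,
"""

rows = read_rows(io.StringIO(SAMPLE))
recent = filter_by_year(rows, 2019, 2021)
print(papers_per_year(rows))
print(title_word_counts(recent).most_common(1))
```

To try it on the real data, replace `io.StringIO(SAMPLE)` with an open file handle, for example `open("metadata.csv", newline="", encoding="utf-8")`, assuming that is where the downloaded file was saved.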
CORD19-Explorer/
│── notebooks/
│   └── COVID-Analysis.ipynb   # Jupyter Notebook for data exploration
│── app.py                     # Streamlit application
│── requirements.txt           # Python dependencies
│── README.md                  # Project documentation
Clone the repository:
git clone https://github.com/yourusername/cord19-explorer.git
cd cord19-explorer
📓 Usage – Jupyter Notebook
Open Jupyter, navigate to notebooks/COVID-Analysis.ipynb, and run the cells to explore the dataset with Pandas, Seaborn, and Matplotlib.
🖥️ Usage – Streamlit App
Run the Streamlit app from PowerShell:
python -m streamlit run streamlit_app.py
Then open the local URL in your browser (usually http://localhost:8501).
📥 Dataset
The dataset is not included in this repo because it is too large. Download the CORD-19 metadata file from the Kaggle CORD-19 Research Challenge page:
https://www.kaggle.com/datasets/allen-institute-for-ai/CORD-19-research-challenge?resource=download
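Because the file has to be downloaded separately, it can help to check for it before loading and fail with a clear message. The sketch below is one possible way to do that. The `find_metadata` helper, the `metadata.csv` filename, and the default location (the current directory) are assumptions for illustration, not something this repo defines.

```python
from pathlib import Path

# Download page taken from this README.
KAGGLE_URL = (
    "https://www.kaggle.com/datasets/allen-institute-for-ai/"
    "CORD-19-research-challenge?resource=download"
)


def find_metadata(data_dir=".", filename="metadata.csv"):
    """Return the path to the metadata file, or raise with a download hint.

    Both the filename and the directory are assumptions; adjust them to
    wherever the downloaded file was actually saved.
    """
    path = Path(data_dir) / filename
    if not path.is_file():
        raise FileNotFoundError(
            f"{path} not found. Download the CORD-19 metadata file from "
            f"{KAGGLE_URL} and place it in {Path(data_dir).resolve()}."
        )
    return path


# Usage: prints the path if the file is present, otherwise the hint.
try:
    print(find_metadata())
except FileNotFoundError as err:
    print(err)
```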