This project is part of the Frameworks Assignment. It uses the CORD-19 dataset to explore COVID-19 research papers through data analysis and a simple Streamlit web app.
- Load and explore the CORD-19 metadata.csv file
- Data cleaning and sampling (to avoid memory errors)
- Interactive filters (sketched in code after this list):
  - Select year range
  - Filter by journal
- Visualizations:
  - Publications per year (bar chart)
  - Heatmap of publications per journal per year
  - Word cloud of paper titles
- Download filtered data as CSV
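As a rough illustration, here is a minimal Streamlit sketch of the filters and the CSV download. It is not the exact code in app.py; the metadata.csv path, the nrows value, and the publish_time/journal column names are assumptions based on the CORD-19 metadata schema.

```python
import pandas as pd
import streamlit as st

# Load a small sample of the metadata (path and row count are assumptions)
df = pd.read_csv("metadata.csv", nrows=5000)
df["year"] = pd.to_datetime(df["publish_time"], errors="coerce").dt.year
df = df.dropna(subset=["year"])

# Sidebar filters: year range slider and journal selector
min_year, max_year = int(df["year"].min()), int(df["year"].max())
year_range = st.sidebar.slider("Year range", min_year, max_year, (min_year, max_year))
journals = st.sidebar.multiselect("Journal", sorted(df["journal"].dropna().unique()))

# Apply the filters
filtered = df[df["year"].between(*year_range)]
if journals:
    filtered = filtered[filtered["journal"].isin(journals)]

# Let the user export the filtered rows as CSV
st.download_button(
    "Download filtered data (CSV)",
    filtered.to_csv(index=False).encode("utf-8"),
    file_name="filtered_metadata.csv",
    mime="text/csv",
)
```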
- Python 3.7+
- pandas (data manipulation)
- matplotlib & seaborn (visualizations)
- wordcloud (word cloud generation)
- streamlit (web application)
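If you ever need to recreate requirements.txt, a minimal version based on the list above (unpinned versions, for illustration only) would be:

```text
pandas
matplotlib
seaborn
wordcloud
streamlit
```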
Clone the repository:
git clone https://github.com/iampunit123/week-8-python-assignment-frameworks-.git
cd Frameworks_Assignment
Install dependencies:
pip install -r requirements.txt
Run the Streamlit app:
streamlit run app.py
or (if streamlit is not in PATH):
python -m streamlit run app.py
The app will open in your browser at:
http://localhost:8501
- app.py → Streamlit app
- metadata.csv → dataset file (or metadata_sample.csv if the dataset is too big)
- requirements.txt → dependencies list
- README.md → this file
- Publications by Year (bar chart)
- Heatmap of publications per journal vs year
- Word Cloud of paper titles (these charts are sketched in code after this list)
- Download Button to export filtered results as CSV
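Continuing the sketch above, and assuming the same `filtered` DataFrame, the three charts could be produced roughly like this (the exact styling in app.py may differ):

```python
import matplotlib.pyplot as plt
import seaborn as sns
import streamlit as st
from wordcloud import WordCloud

# Publications per year (bar chart)
year_counts = filtered["year"].value_counts().sort_index()
fig, ax = plt.subplots()
ax.bar(year_counts.index, year_counts.values)
ax.set_xlabel("Year")
ax.set_ylabel("Publications")
st.pyplot(fig)

# Heatmap of publications per journal per year (top 10 journals)
top_journals = filtered["journal"].value_counts().head(10).index
pivot = (
    filtered[filtered["journal"].isin(top_journals)]
    .pivot_table(index="journal", columns="year", values="title", aggfunc="count")
    .fillna(0)
)
fig, ax = plt.subplots()
sns.heatmap(pivot, ax=ax)
st.pyplot(fig)

# Word cloud built from paper titles
text = " ".join(filtered["title"].dropna())
if text:
    wc = WordCloud(width=800, height=400, background_color="white").generate(text)
    fig, ax = plt.subplots()
    ax.imshow(wc, interpolation="bilinear")
    ax.axis("off")
    st.pyplot(fig)
```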
During this project, I learned how to:
- Load and clean real-world datasets (handling missing data, sampling large files)
- Perform basic exploratory data analysis with pandas
- Create visualizations with matplotlib, seaborn, and wordcloud
- Build an interactive dashboard with Streamlit
- Document and share my work using GitHub
Challenges included dealing with the very large dataset (20+ GB). To solve this, I used only the metadata.csv file and sampled rows (nrows=5000) to make the app lightweight and fast.
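A minimal sketch of that loading-and-sampling step (the column list is an assumption based on the CORD-19 metadata schema):

```python
import pandas as pd

# Read only the first 5,000 rows and a few useful columns to keep memory low
cols = ["title", "abstract", "publish_time", "journal"]
df = pd.read_csv("metadata.csv", usecols=cols, nrows=5000)

# Basic cleaning: drop rows without a title and derive a publication year
df = df.dropna(subset=["title"])
df["year"] = pd.to_datetime(df["publish_time"], errors="coerce").dt.year
```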