PreciousAnagwu/Week-8-Python-Data-Frame-Assignment

📊 Metadata Explorer (Week 8 Python Dataframe Assignment)

This project is a Streamlit web application for exploring the CORD-19 metadata dataset. It allows users to search research papers, filter by publication year, view publication trends, explore top journals, and generate word clouds of paper titles.


🚀 Features

  • Dataset Preview: Displays a sample of the metadata for quick inspection.
  • Search Tool: Enter a keyword to search through titles and abstracts.
  • Year Filter: Filter publications by year and see how many papers were published.
  • Publications Over Time: Line chart showing the trend of publications by year.
  • Top Journals: Bar chart of the most common journals in the dataset.
  • Word Cloud: Visualization of the most frequent words in paper titles.
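The features above map onto a handful of pandas operations. The following is a minimal sketch, not the app's actual code: the function names are hypothetical, and it assumes the CSV exposes `title`, `abstract`, `publish_time`, and `journal` columns, with `publish_time` stored as text that starts with a four-digit year.

```python
import re
from collections import Counter

import pandas as pd

# Small illustrative stopword list (an assumption; the app may use a different one).
STOPWORDS = {"the", "of", "and", "in", "a", "for", "to", "on", "with"}


def add_year(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy with a nullable integer `year` column taken from `publish_time`."""
    out = df.copy()
    # Take the leading four digits so both "2020-03-01" and "2020" are handled.
    year_text = out["publish_time"].astype(str).str.extract(r"^(\d{4})")[0]
    out["year"] = pd.to_numeric(year_text, errors="coerce").astype("Int64")
    return out


def search_papers(df: pd.DataFrame, keyword: str) -> pd.DataFrame:
    """Case-insensitive literal keyword match over titles and abstracts."""
    keyword = keyword.strip()
    if not keyword:
        return df
    text = df["title"].fillna("") + " " + df["abstract"].fillna("")
    mask = text.str.contains(keyword, case=False, regex=False)
    return df[mask]


def filter_by_year(df: pd.DataFrame, year: int) -> pd.DataFrame:
    """Keep rows whose `year` equals the requested year; missing years never match."""
    mask = (df["year"] == year).fillna(False).astype(bool)
    return df[mask]


def publications_per_year(df: pd.DataFrame) -> pd.Series:
    """Paper counts indexed by year, sorted chronologically (line chart input)."""
    return df["year"].dropna().astype(int).value_counts().sort_index()


def top_journals(df: pd.DataFrame, n: int = 10) -> pd.Series:
    """The n most frequent journal names (bar chart input)."""
    return df["journal"].dropna().value_counts().head(n)


def title_word_counts(df: pd.DataFrame) -> Counter:
    """Word frequencies across titles, the kind of data a word cloud is drawn from."""
    counts = Counter()
    for title in df["title"].dropna():
        words = re.findall(r"[a-z]+", title.lower())
        counts.update(w for w in words if w not in STOPWORDS)
    return counts
```

In a Streamlit script, the results of `publications_per_year` and `top_journals` could be passed to `st.line_chart` and `st.bar_chart` respectively, and the search box could feed `search_papers` with the value returned by `st.text_input`.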

📂 Dataset

  • metadata_sample.csv → a smaller, shareable sample (used for GitHub/Streamlit Cloud).

  • metadata.csv → the full dataset (~1.5GB).

    • ⚠️ This file is very large and should not be pushed to GitHub.
    • The app will fall back to the sample file if the full dataset is unavailable.
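The fallback described above can be sketched as a small path lookup. This is an illustration under stated assumptions rather than the app's actual loader: `pick_dataset` is a hypothetical helper, and both files are assumed to sit in the same directory as the app.

```python
from pathlib import Path


def pick_dataset(base_dir=".", candidates=("metadata.csv", "metadata_sample.csv")):
    """Return the first candidate file that exists, preferring the full dataset."""
    for name in candidates:
        path = Path(base_dir) / name
        if path.is_file():
            return path
    raise FileNotFoundError(
        f"None of {list(candidates)} found in {Path(base_dir).resolve()}"
    )
```

The returned path can then be handed to `pandas.read_csv`. Given the size of the full file, passing `usecols` to load only the needed columns may help keep memory use down.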

🛠️ Installation & Setup

  1. Clone the repository and move into it

     git clone https://github.com/PreciousAnagwu/Week-8-Python-Data-Frame-Assignment.git
     cd Week-8-Python-Data-Frame-Assignment

  2. Create and activate a virtual environment (recommended)

     python -m venv venv
     source venv/bin/activate   # Mac/Linux
     venv\Scripts\activate      # Windows

  3. Install dependencies

     pip install -r requirements.txt

  4. Run the Streamlit app

     streamlit run streamlit_app.py

📦 Requirements

Main Python libraries used:

  • pandas
  • streamlit
  • matplotlib
  • seaborn
  • wordcloud

See requirements.txt for the full list.
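If the app fails on startup with an import error, a quick check like the one below can show which of the listed libraries are missing from the active environment. It is a small sketch using only the standard library; `missing_packages` is a hypothetical helper, and the names passed in are import names, assumed here to match the package names above.

```python
from importlib.util import find_spec


def missing_packages(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    required = ["pandas", "streamlit", "matplotlib", "seaborn", "wordcloud"]
    missing = missing_packages(required)
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All main libraries are importable.")
```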

🌐 Live Deployment

You can view the live app here: 👉 Live Demo


✨ Acknowledgements

  • CORD-19 dataset provided by the Allen Institute for AI.
  • Built as part of the PLP Academy Week 8 Python Assignment.
