Python web scraper and info hub aggregating news, research and reddit posts about long covid

josephburgess/long-covid-web-scraper

This project is still being built, but it is deployed on Render so that changes can be viewed on a live site as they are made. Feel free to browse the live site, but note that it is a work in progress. It is currently deployed on the free tier, which scales down when inactive, so the web services may take a moment to start up when first accessed.

Long COVID Hub

This project is an information hub for research, news and social posts about Long Covid. It uses several APIs, web scrapers and a neural summarisation model, all built in Python, to gather information and display it to the user.

Technology

The project uses a Python/Flask backend and a TypeScript/React frontend, with MongoDB used to store scraped data. The hub currently has three sections: News, Data and Reddit Feed. News articles and Reddit posts are obtained through API calls to the respective services.

The Research section visualises data on long COVID articles from PubMed. The data is scraped from the PubMed website using BeautifulSoup, which extracts the title, author, date and abstract of each article. Each abstract is summarised using an AI model and stored in MongoDB. The Flask backend reads the stored data using Pandas and serves it as JSON, which the React frontend fetches and visualises using Plotly.
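
The extraction step described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual scraper: it uses the standard library's `html.parser` as a dependency-free stand-in for BeautifulSoup, and the class names (`article-title`, `article-authors`, `article-date`, `article-abstract`), the `ArticleParser` and `parse_article` helpers, and the sample HTML are all hypothetical. PubMed's real markup will differ, and the summarisation and MongoDB steps are left out.

```python
import json
from html.parser import HTMLParser

# Hypothetical mapping from CSS class to record field.
# These class names are assumptions, not PubMed's real markup.
FIELD_CLASSES = {
    "article-title": "title",
    "article-authors": "authors",
    "article-date": "date",
    "article-abstract": "abstract",
}

# Elements that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class ArticleParser(HTMLParser):
    """Collects the text inside elements whose class maps to a known field."""

    def __init__(self):
        super().__init__()
        self.fields = {name: "" for name in FIELD_CLASSES.values()}
        self._current = None  # field currently being read, if any
        self._depth = 0       # nesting depth inside that field's element

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self._current is not None:
            self._depth += 1
            return
        classes = (dict(attrs).get("class") or "").split()
        for cls in classes:
            if cls in FIELD_CLASSES:
                self._current = FIELD_CLASSES[cls]
                self._depth = 1
                break

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or self._current is None:
            return
        self._depth -= 1
        if self._depth == 0:
            self._current = None

    def handle_data(self, data):
        if self._current is not None:
            self.fields[self._current] += data


def parse_article(html: str) -> dict:
    """Return a dict with title, authors, date and abstract as plain text."""
    parser = ArticleParser()
    parser.feed(html)
    parser.close()
    # Collapse runs of whitespace left over from the markup.
    return {key: " ".join(value.split()) for key, value in parser.fields.items()}


# Placeholder page, not real PubMed content.
SAMPLE_HTML = """
<div>
  <h1 class="article-title">Example article title</h1>
  <span class="article-authors">A. Author, B. Author</span>
  <span class="article-date">2023 Jan 1</span>
  <div class="article-abstract"><p>First sentence.</p> <p>Second sentence.</p></div>
</div>
"""

record = parse_article(SAMPLE_HTML)
print(json.dumps(record, indent=2))
```

A record shaped like this is what would then be passed to the summarisation model and stored, one document per article.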

Installation

To install the project, follow these steps:

  1. Clone the repository:

  2. Navigate into the project directory and install dependencies:

    cd backend && pip install -r requirements.txt
    cd ../frontend && npm install

  3. Install MongoDB:

    brew tap mongodb/brew
    brew install mongodb-community@5.0

  4. Start MongoDB:

    brew services start mongodb-community@5.0

Usage

  1. To start the backend, run:

    cd backend && flask run

  2. To start the frontend, run the following in a separate terminal:

    cd frontend && npm start

  3. Open http://localhost:3000/ in your web browser.

Testing

To run the backend tests, run:

cd backend
pytest

To run the frontend tests, run:

cd frontend
npm run test
