Web crawling application for dynamic web pages, built with Python, Scrapy, Selenium, and FastAPI.

TanyaAng/Articles_API

ARTICLES API

Table of Contents
  1. About The Project
  2. Getting Started
  3. Usage
  4. API endpoints
  5. License
  6. Contact

About The Project

A Python application for crawling a dynamic web page that generates its content through asynchronous JavaScript calls after the page has loaded. The Scrapy spider runs a Selenium WebDriver internally to handle the asynchronous JS calls. The project covers:

  • collecting NBS articles across multiple pages;
  • validating the collected data;
  • saving the collected, valid articles to an SQLite database;
  • exposing the collected data through endpoints built with the FastAPI framework.
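
The validation and storage steps listed above can be sketched with the standard library alone. This is a minimal illustration, not the project's actual pipeline: the field names (`title`, `url`, `date`, `label`), the ISO date format, and the `articles` table layout are all assumptions made for the example.

```python
import sqlite3
from datetime import datetime

# Hypothetical required fields; the project's real article schema may differ.
REQUIRED_FIELDS = ("title", "url", "date")


def is_valid_article(article: dict) -> bool:
    """Return True if every required field is a non-empty string and the date parses."""
    for field in REQUIRED_FIELDS:
        value = article.get(field)
        if not isinstance(value, str) or not value.strip():
            return False
    try:
        datetime.strptime(article["date"], "%Y-%m-%d")  # assumed date format
    except ValueError:
        return False
    return True


def save_articles(conn: sqlite3.Connection, articles: list) -> int:
    """Store valid articles and return how many new rows were added."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS articles ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "title TEXT NOT NULL, url TEXT NOT NULL UNIQUE, "
        "date TEXT NOT NULL, label TEXT)"
    )
    before = conn.execute("SELECT COUNT(*) FROM articles").fetchone()[0]
    for article in articles:
        if not is_valid_article(article):
            continue  # skip incomplete or malformed items
        # OR IGNORE leaves an already stored URL untouched instead of raising.
        conn.execute(
            "INSERT OR IGNORE INTO articles (title, url, date, label) "
            "VALUES (?, ?, ?, ?)",
            (article["title"], article["url"], article["date"], article.get("label")),
        )
    conn.commit()
    after = conn.execute("SELECT COUNT(*) FROM articles").fetchone()[0]
    return after - before
```

Making the URL column unique is one simple way to keep repeated crawls from storing the same article twice; the real project may deduplicate differently or not at all.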

Built With

  • Python
  • Scrapy
  • Selenium
  • FastAPI

Getting Started

Installation

  1. Clone the repo
    git clone https://github.com/TanyaAng/Articles_API.git
  2. Install all Python libraries
    pip install -r requirements.txt
  3. Make nbs_articles the root directory

[Screenshot: Article_API folder structure]

Usage

  1. Run the spider from the terminal:
    (venv) ..\nbs_articles> scrapy crawl article
  2. Run FastAPI: [Screenshot: Article_API FastAPI run configuration]

API endpoints

| Endpoint | HTTP Method | Description |
|---|---|---|
| /articles/ | GET | Get all crawled articles and their properties |
| /articles/?label={label} | GET | Get the list of articles with the given label |
| /articles/?date={date} | GET | Get the list of articles from the given date |
| /article/{article_id} | GET | Get a single article |
| /article/{article_id} | DELETE | Delete a single article |
| /article/{article_id} | PUT | Update a single article |
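
As a rough client-side illustration of the routes above, the helpers below build request URLs with the standard library. The base URL is an assumption (uvicorn's default local address), the helper names are hypothetical, and whether the API accepts `label` and `date` together is not stated in this README.

```python
from urllib.parse import quote, urlencode

# Assumed base URL: the default host and port of a locally run uvicorn server.
BASE_URL = "http://127.0.0.1:8000"


def articles_url(label=None, date=None, base_url=BASE_URL):
    """Build the list URL, adding label and/or date filters when given."""
    params = {}
    if label is not None:
        params["label"] = label
    if date is not None:
        params["date"] = date
    query = urlencode(params)  # percent-encodes values, spaces become '+'
    return f"{base_url}/articles/" + (f"?{query}" if query else "")


def article_url(article_id, base_url=BASE_URL):
    """Build the URL of a single article, used with GET, DELETE, or PUT."""
    return f"{base_url}/article/{quote(str(article_id), safe='')}"
```

To send a DELETE or PUT, one of these URLs can be passed to `urllib.request.Request(url, method="DELETE")` and opened with `urllib.request.urlopen`. The expected PUT body is not documented here, so it is left out.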

License

MIT License

Contact

Tanya Angelova - LinkedIn - t.j.angelova@gmail.com

Project Link: https://github.com/TanyaAng/Articles_API
