Python web scraping script and MySQL database design 🐛

yixin0829/web-data-scraping-project

web_data_scraping_project

A project developed during the winter break that uses Python to scrape environmental data from a website and build visualizations from it. The final result is to be displayed on a self-built webpage.

Research & Workflow

Most of the information and the workflow used in this project come from this article.

Workflow

  • Pick a website
  • Web scraping
  • Store the data in two .csv files (found in doc)
  • Call the Gender API to enrich the data
  • Generate basic charts using matplotlib and nltk
  • Create a MySQL database / (Firebox) and store the data
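
The scraping and .csv steps above can be sketched roughly as follows. This is only an illustration, not the project's actual script: the sample HTML, the `title` and `author` class names, and the helper names are all assumptions, since the target site's markup is not described here. To stay dependency-free the sketch uses the standard library's `html.parser` rather than bs4, which the project itself uses.

```python
import csv
import io
from html.parser import HTMLParser

# Hypothetical markup: the real site's structure is not given in this README.
SAMPLE_HTML = """
<div class="article"><h2 class="title">Air quality trends</h2><span class="author">Jane Doe</span></div>
<div class="article"><h2 class="title">River pollution survey</h2><span class="author">John Smith</span></div>
"""


class TitleAuthorParser(HTMLParser):
    """Collect [title, author] rows from elements with the assumed classes."""

    def __init__(self):
        super().__init__()
        self.rows = []        # finished [title, author] rows
        self._field = None    # name of the field currently being read
        self._current = {}    # partially collected row

    def handle_starttag(self, tag, attrs):
        cls = dict(attrs).get("class")
        if cls in ("title", "author"):
            self._field = cls

    def handle_data(self, data):
        if self._field:
            self._current[self._field] = data.strip()

    def handle_endtag(self, tag):
        if self._field is None:
            return
        self._field = None
        # Emit a row once both fields of an article have been seen.
        if "title" in self._current and "author" in self._current:
            self.rows.append([self._current["title"], self._current["author"]])
            self._current = {}


def scrape_rows(html):
    """Parse an HTML string and return the collected rows."""
    parser = TitleAuthorParser()
    parser.feed(html)
    parser.close()
    return parser.rows


def write_csv(rows, fileobj):
    """Write a header line followed by the rows to an open text file."""
    writer = csv.writer(fileobj)
    writer.writerow(["title", "author"])
    writer.writerows(rows)


rows = scrape_rows(SAMPLE_HTML)
buffer = io.StringIO()  # stands in for open("data_title_author.csv", "w", newline="")
write_csv(rows, buffer)
csv_text = buffer.getvalue()
```

In a real run, the HTML would come from an HTTP request to the chosen site, and the in-memory buffer would be replaced by a file on disk.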

Needed Tools

  1. Python 3 (bs4, requests, matplotlib, nltk, Flask)
  2. MySQL
  3. Gender API

Result

The scraped data is stored in data_title_author.csv and key_words.csv, along with genders.csv for the predicted genders (data enrichment).

Some plots can be found under /plots.
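
As a rough idea of how keyword data might be summarized before plotting, the sketch below counts keyword frequencies with the standard library. The layout of key_words.csv is not documented here, so the single `keyword` column, the sample values, and the `count_keywords` helper are assumptions made for illustration.

```python
import csv
import io
from collections import Counter

# Hypothetical contents: assumes one keyword per row under a "keyword" header.
SAMPLE_CSV = "keyword\nclimate\nwater\nclimate\nemissions\nwater\nclimate\n"


def count_keywords(fileobj, column="keyword"):
    """Count how often each non-empty keyword appears, ignoring case."""
    reader = csv.DictReader(fileobj)
    return Counter(
        row[column].strip().lower() for row in reader if row[column].strip()
    )


counts = count_keywords(io.StringIO(SAMPLE_CSV))
top = counts.most_common(2)  # the two most frequent keywords with their counts
```

The resulting labels and counts could then be passed to a matplotlib bar chart, for example with `plt.bar`.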
