A web scraping project that scrapes data from websites that don't require a login and serves that data through FastAPI endpoints.

mysterious-shailendr/Web-Scraping-and-Fast-API

Web Scraping Project with FastAPI

A web scraping project that scrapes data from websites that don't require a login.

The Beautiful Soup library is used to extract data from the HTTP responses. Beautiful Soup is a Python library for pulling data out of HTML and XML files: it builds a parse tree from the page source, which lets you extract data in a hierarchical, more readable way.
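
To make the parse-tree idea concrete, here is a minimal sketch that parses an inline HTML string rather than a live page. The markup, the class names (`event`, `date`) and the output field names are hypothetical; the real structure of the scraped pages may differ, and this is not necessarily how collect_events.py does it.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML fragment, hand-written for illustration only.
html = """
<ul class="event-list">
  <li class="event"><a class="date" href="#">1969</a> Apollo 11 lands on the Moon</li>
  <li class="event"><a class="date" href="#">1976</a> Viking 1 lands on Mars</li>
</ul>
"""

# Build the parse tree with Python's built-in HTML parser.
soup = BeautifulSoup(html, "html.parser")

events = []
for item in soup.find_all("li", class_="event"):
    # The year lives in the nested <a class="date"> tag (assumed layout).
    year = item.find("a", class_="date").get_text(strip=True)
    # Full text of the <li>, with the leading year stripped off.
    text = item.get_text(" ", strip=True)
    description = text[len(year):].strip()
    events.append({"year": year, "description": description})

print(events)
```

`find_all` walks the tree and returns every matching tag, so the same loop works no matter how many list items the page contains.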

I scraped this website: https://www.onthisday.com
"On This Day" describes itself as the world's largest, most accurate and popular site for "on this day in history"; it lists historical events day by day.
I scraped the events for every day of the year, grouped them by month, and stored them in a JSON file.
Using FastAPI, I defined endpoints that return the historical events for today's date, for a given month, for a particular day and month, and more.
It's a basic demo, intended for learning purposes.
You can adapt this code to scrape any website that doesn't require a login.
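
The month-wise grouping and JSON storage described above could look roughly like the sketch below. The record fields (`month`, `day`, `year`, `event`), the nested month/day layout and the file name `events_sample.json` are assumptions made for illustration; the actual events.json produced by collect_events.py may be organized differently.

```python
import json
from collections import defaultdict

# Hand-written sample records standing in for scraped data (assumed shape).
scraped = [
    {"month": "january", "day": 1, "year": "1863",
     "event": "Emancipation Proclamation takes effect"},
    {"month": "july", "day": 4, "year": "1776",
     "event": "Declaration of Independence adopted"},
    {"month": "july", "day": 20, "year": "1969",
     "event": "Apollo 11 lands on the Moon"},
]


def group_by_month(records):
    """Return {month: {day: [events]}} built from a flat list of records."""
    grouped = defaultdict(lambda: defaultdict(list))
    for rec in records:
        # JSON object keys are strings, so the day is stored as a string.
        grouped[rec["month"]][str(rec["day"])].append(
            {"year": rec["year"], "event": rec["event"]}
        )
    # Convert the nested defaultdicts to plain dicts before serializing.
    return {month: dict(days) for month, days in grouped.items()}


events_by_month = group_by_month(scraped)

# Written to a separate sample file so a real events.json is not overwritten.
with open("events_sample.json", "w", encoding="utf-8") as fh:
    json.dump(events_by_month, fh, indent=2, ensure_ascii=False)
```

Keying first by month and then by day keeps both the "whole month" and the "single day" lookups down to plain dictionary access.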

Install the dependencies before running the project:

pip install beautifulsoup4

If you get a missing-module error (for example for fastapi or uvicorn, which the run step uses), install that module with:

pip install module_name

To use the API, run collect_events.py first (it creates the events.json file), then start the server:

python collect_events.py
uvicorn main:app --reload

Then open the server URL (e.g. http://127.0.0.1:8000/). For a better view, append "docs" or "redoc" to the URL (e.g. http://127.0.0.1:8000/docs or http://127.0.0.1:8000/redoc) and explore the endpoints.

References

1. https://beautiful-soup-4.readthedocs.io/
2. https://www.onthisday.com/
3. https://fastapi.tiangolo.com/

If you have any questions, open an issue; I'm happy to help.
