Scrapping_MUBAWAB.ma-

Collects data from the mubawab.ma website for use in building a predictive model.

How does it work?

This web scraper extracts the URLs of the posted articles from each listing page, then uses each article URL to open the article's detail page. The needed content of that page is extracted and returned as a Python dictionary. Each article's data is then stored as one row in a CSV file using a dictionary writer.
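The storage step can be sketched as follows. This is a minimal illustration, not the code in Scrapper_code.py: the field names, the `write_articles` helper, and the sample records are all hypothetical, and an in-memory buffer stands in for the real CSV file.

```python
import csv
import io

# Hypothetical column names; the real scraper's fields are not listed in this README.
FIELDNAMES = ["url", "title", "price"]


def write_articles(articles, out_file):
    """Write one CSV row per article dictionary and return the row count."""
    # extrasaction="ignore" drops keys that are not in FIELDNAMES;
    # missing keys are written as empty strings (the default restval).
    writer = csv.DictWriter(out_file, fieldnames=FIELDNAMES, extrasaction="ignore")
    writer.writeheader()
    count = 0
    for article in articles:
        writer.writerow(article)
        count += 1
    return count


# Made-up sample records standing in for scraped article details.
sample_articles = [
    {"url": "https://example.com/a/1", "title": "Apartment A", "price": 950000},
    {"url": "https://example.com/a/2", "title": "Villa B", "price": 3200000},
]

buffer = io.StringIO()  # in-memory stand-in for an open CSV file
rows_written = write_articles(sample_articles, buffer)
csv_text = buffer.getvalue()
```

When writing to a real file instead of a buffer, the `csv` module documentation recommends opening it with `newline=""` so that row endings are not translated twice.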

How long does it take?

On my personal computer (8 GB RAM, 10th-generation Intel i7), it takes 3 hours to extract data from 18,100 web pages, which is roughly 0.6 seconds per page.

Libraries used:

I used BeautifulSoup4 to parse the HTML retrieved from the web server with the request library. I also used Python regular expressions to extract and clean alphanumeric data from the web page.
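The regular-expression cleaning step might look something like the sketch below. The helper names `extract_number` and `clean_text` and the sample strings are hypothetical; the actual patterns used in Scrapper_code.py are not shown in this README. The sketch assumes that spaces, dots, and commas inside a number are thousands separators, which would be wrong for decimal values.

```python
import re


def extract_number(text):
    """Return the first number found in text as an int, or None if there is none."""
    # Match a digit followed by any run of digits, whitespace, dots, or commas.
    match = re.search(r"\d[\d\s.,]*", text)
    if match is None:
        return None
    # Assumption: separators inside the match are thousands separators, so drop them.
    digits = re.sub(r"\D", "", match.group())
    return int(digits)


def clean_text(text):
    """Collapse runs of whitespace into single spaces and strip both ends."""
    return re.sub(r"\s+", " ", text).strip()


# Made-up strings of the kind a listing page might contain.
price = extract_number("1 200 000 DH")
area = extract_number("85 m²")
no_price = extract_number("Prix à consulter")
label = clean_text("  3\n  chambres ")
```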
