This project automates the collection of enterprise-related data from multiple sources using Python and Selenium for web scraping. It includes validating email addresses, extracting comments from a Facebook post, and gathering enterprise information (e.g., name, address, SIREN) from two French business directories. The scraped data is saved as CSV files for further analysis.
- Email Validation: addresses are checked against VerifyEmailAddress.org (see the sketch after the WebDriver setup below).
- Facebook Comments: extracted from a specific post on MBI Network.
- Enterprise Info: gathered from the Le Figaro and Manageo business directories (name, address, SIREN); a CSV-writing sketch follows the setup steps below.
Configure and initialize the Chrome WebDriver for scraping:

```python
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.chrome.options import Options
from webdriver_manager.chrome import ChromeDriverManager

# Launch Chrome maximized; webdriver-manager downloads a matching ChromeDriver.
chrome_options = Options()
chrome_options.add_argument("--start-maximized")
driver = webdriver.Chrome(
    service=Service(ChromeDriverManager().install()),
    options=chrome_options,
)
```
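With the driver in place, an email check against VerifyEmailAddress.org can be driven as in the minimal sketch below. The form field name and result selector are assumptions about the page's markup, not confirmed identifiers.

```python
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

def check_email(driver, email: str) -> str:
    """Submit an address to VerifyEmailAddress.org and return the verdict text."""
    driver.get("https://www.verifyemailaddress.org/")
    # Hypothetical field name; inspect the live form before relying on it.
    field = WebDriverWait(driver, 10).until(
        EC.presence_of_element_located((By.NAME, "email"))
    )
    field.clear()
    field.send_keys(email)
    field.submit()
    # Hypothetical result container holding the validity verdict.
    result = WebDriverWait(driver, 10).until(
        EC.presence_of_element_located((By.CSS_SELECTOR, ".result"))
    )
    return result.text
```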
- Clone the repository:
```bash
git clone https://github.com/ali27kh/Python_Web_Scraping.git
cd Python_Web_Scraping
```
- Install the dependencies:
```bash
pip install selenium webdriver-manager pandas requests
```
- Run the scraping script.
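For reference, the directory scrapers follow one common pattern: collect a record per enterprise, then write the batch to CSV with pandas. The row and field selectors below are illustrative placeholders, not Le Figaro's or Manageo's actual markup.

```python
import pandas as pd
from selenium.webdriver.common.by import By

def scrape_directory(driver, url: str, out_path: str) -> None:
    """Collect name/address/SIREN rows from a directory page and save them as CSV."""
    driver.get(url)
    records = []
    # Placeholder selectors; each directory uses its own listing markup.
    for row in driver.find_elements(By.CSS_SELECTOR, ".company-row"):
        records.append({
            "name": row.find_element(By.CSS_SELECTOR, ".name").text,
            "address": row.find_element(By.CSS_SELECTOR, ".address").text,
            "siren": row.find_element(By.CSS_SELECTOR, ".siren").text,
        })
    pd.DataFrame(records).to_csv(out_path, index=False)
```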
- Email validation against VerifyEmailAddress.org ensures contact information is reliable.
- Facebook comments are scraped directly from the post page.
- Le Figaro and Manageo provide complementary enterprise data.
- Selenium automation handles dynamic web content robustly (sketched below).
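As one illustration of that last point, lazily loaded content such as Facebook comments can be surfaced by scrolling and pausing until the page stops growing. The comment selector below is an assumption, not Facebook's real markup.

```python
import time
from selenium.webdriver.common.by import By

def load_all_comments(driver, max_scrolls: int = 10):
    """Scroll to the bottom repeatedly until no new content loads, then grab comments."""
    last_height = driver.execute_script("return document.body.scrollHeight")
    for _ in range(max_scrolls):
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        time.sleep(2)  # crude pause; an explicit wait is more robust
        new_height = driver.execute_script("return document.body.scrollHeight")
        if new_height == last_height:
            break
        last_height = new_height
    # Hypothetical selector for rendered comment elements.
    return driver.find_elements(By.CSS_SELECTOR, "[data-testid='comment']")
```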
MIT License