## Web Scraping Demo
This notebook demonstrates basic web scraping in Python. It uses `requests` to download pages and `BeautifulSoup` to parse them, extracting job listings from two job boards, Wuzzuf and Bayt.


### Step 1: Install Required Libraries
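Install the `lxml` library, an optional HTML parser that BeautifulSoup can use. The cells below pass `"html.parser"` (Python's built-in parser), so they also run without `lxml`. The `requests` and `beautifulsoup4` packages must be installed as well.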

In [1]:
!pip install lxml






### Step 2: Import Required Libraries
Import the necessary libraries for making HTTP requests and parsing HTML.

In [2]:
import requests
from bs4 import BeautifulSoup

### Step 3: Scrape Wuzzuf for Job Data
Define the target URL and fetch the page content.

In [3]:
# Define the URL for Wuzzuf search page
u = "https://wuzzuf.net/search/jobs?a=spbg&q=machine%20learning"

# Send a GET request to the URL
page = requests.get(u)

# Parse the page content using BeautifulSoup
soup = BeautifulSoup(page.content, "html.parser")
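The query string in the URL above can also be assembled from a dictionary, which makes it easier to change the search term. A minimal sketch using only the standard library: `build_search_url` is a hypothetical helper, and the meaning of the `a=spbg` parameter is not documented in this notebook, so it is simply copied through unchanged.

```python
from urllib.parse import quote, urlencode

BASE_URL = "https://wuzzuf.net/search/jobs"

def build_search_url(query, base_url=BASE_URL):
    """Return the search URL for a free-text query (hypothetical helper)."""
    # "a=spbg" is copied from the URL used in this notebook
    params = {"a": "spbg", "q": query}
    # quote_via=quote encodes spaces as %20 instead of "+"
    return f"{base_url}?{urlencode(params, quote_via=quote)}"

print(build_search_url("machine learning"))
```

The returned string can be passed to `requests.get` in place of the hard-coded `u`.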

#### Step 3.1: Extract Job Titles
Use BeautifulSoup to find and print all job titles on the page.

In [4]:
# Find all job titles on the page
j_t = soup.find_all('h2', class_="css-m604qf")
for i in j_t:
    print(i.text)

Machine Learning Manager
AI Engineer
Robotics and Programming Engineer
Sales Specialist
Senior Backend Developer in Node/Express
Machinist / Mechanical Engineer
AI Technical Team Lead (Computer Vision Focus & NLP)
Senior ML JD
Online Coding Instructor
Senior Full Stack/ Embedded Engineering
Senior AI Engineer
Senior Full Stack Developer (MERN or Laravel Stack)
Technical sales and marketing manager
Process Electronics Engineer
Electrical Maintenance Engineer (Fresh Grads)
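To experiment with `find_all` without contacting the site, the same call can be run on an inline HTML string. The sketch below reuses the `<h2 class="css-m604qf">` markup that appears in this notebook; the two-entry `sample_html` string is an assumption standing in for the real page. Class names such as `css-m604qf` look auto-generated, so they may change whenever the site is updated.

```python
from bs4 import BeautifulSoup

# Stand-in for the downloaded page: two headings in the format used by the search results
sample_html = """
<h2 class="css-m604qf"><a href="https://wuzzuf.net/jobs/p/ow9y8zcxQbne-Machine-Learning-Manager-kcsc-Cairo-Egypt" class="css-o171kl">Machine Learning Manager</a></h2>
<h2 class="css-m604qf"><a href="https://wuzzuf.net/jobs/p/mJweLehwNSbz-AI-Engineer-Integrated-Technology-Group-Cairo-Egypt" class="css-o171kl">AI Engineer</a></h2>
"""

sample_soup = BeautifulSoup(sample_html, "html.parser")

# get_text(strip=True) removes surrounding whitespace from each title
titles = [
    h2.get_text(strip=True)
    for h2 in sample_soup.find_all("h2", class_="css-m604qf")
]
print(titles)
```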


#### Step 3.2: Extract Job Locations
Extract and print the locations of the jobs.

In [5]:
# Find all job locations
loc = soup.find_all("span", class_="css-5wys0k")
for i in loc:
    print(i.text)

Cairo, Egypt 
Heliopolis, Cairo, Egypt 
Alexandria, Egypt 
Downtown, Cairo, Egypt 
Manchester, United Kingdom 
Larnaca, Cyprus 
Sheikh Zayed, Giza, Egypt 
Damietta, Egypt 
Riyadh, Saudi Arabia 
Riyadh, Saudi Arabia 
Cairo, Egypt 
Cairo, Egypt 
Hadayek Alahram, Giza, Egypt 
10th of Ramadan City, Sharqia, Egypt 
10th of Ramadan City, Sharqia, Egypt 
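The location strings above end with a trailing space and pack area, city, and country into one comma-separated value. A small sketch of splitting them into parts: `split_location` is a hypothetical helper, and treating the last part as the country is an assumption based on the output above.

```python
def split_location(raw):
    """Split 'Area, City, Country ' into stripped parts and pick out the country."""
    parts = [part.strip() for part in raw.split(",") if part.strip()]
    country = parts[-1] if parts else None
    return {"parts": parts, "country": country}

for raw in ["Cairo, Egypt ", "Heliopolis, Cairo, Egypt ", "Manchester, United Kingdom "]:
    print(split_location(raw))
```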


#### Step 3.3: Extract Company Names
Extract and print the company names associated with the jobs.

In [6]:
# Find all company names
company = soup.find_all("a", class_="css-17s97q8")
for i in company:
    print(i.text)

kcsc -
Integrated Technology Group -
Smart Technology -
Gila Electric -
Give Brite  -
kpec international -
Lumin -
ysolution -
Confidential -
Qudra Tech -
RMG -
 Si-Vision -
Etkaan -
VIVO -
Bakeland Egypt  -
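Each company name above carries a trailing ` -` separator, and some have extra spaces. A minimal cleanup sketch follows; `clean_company` is a hypothetical helper, and it only removes a hyphen at the very end so that names such as `Si-Vision` keep their internal hyphen.

```python
def clean_company(raw):
    """Remove surrounding whitespace and a single trailing '-' separator."""
    name = raw.strip()
    if name.endswith("-"):
        name = name[:-1].rstrip()
    return name

for raw in ["kcsc -", "Give Brite  -", " Si-Vision -"]:
    print(repr(clean_company(raw)))
```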


#### Step 3.4: Extract Job Post Dates
Extract and print when the jobs were posted.

In [7]:
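# Find all job posting dates; date elements were seen with either of these classes:
# <div class="css-do6t5g">1 month ago</div>
# <div class="css-4c4ojb">5 hours ago</div>
# Passing a list to class_ matches elements that carry any of the listed classes
dates = soup.find_all("div", class_=["css-do6t5g", "css-4c4ojb"])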
for i in dates:
    print(i.text)

1 month ago
4 days ago
17 days ago
3 days ago
2 months ago
1 month ago
20 days ago
10 days ago
2 months ago
6 days ago
1 month ago
11 days ago
11 days ago
18 days ago
3 days ago
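Relative strings such as `4 days ago` are hard to sort or filter. The sketch below converts them into approximate timestamps. It is only an illustration: `parse_posted` is a hypothetical helper, it handles just the hour, day, and month units seen in this notebook, and it treats a month as 30 days.

```python
import re
from datetime import datetime, timedelta

# Approximate unit lengths; a month is treated as 30 days (an assumption)
UNIT_TO_DELTA = {
    "hour": timedelta(hours=1),
    "day": timedelta(days=1),
    "month": timedelta(days=30),
}

def parse_posted(text, now=None):
    """Convert text such as '4 days ago' into an approximate datetime, or None."""
    now = now or datetime.now()
    match = re.fullmatch(r"(\d+)\s+(hour|day|month)s?\s+ago", text.strip())
    if match is None:
        return None  # unrecognized format
    amount, unit = int(match.group(1)), match.group(2)
    return now - amount * UNIT_TO_DELTA[unit]

reference = datetime(2024, 11, 1, 12, 0)  # fixed reference time for a repeatable demo
for text in ["1 month ago", "4 days ago", "5 hours ago"]:
    print(text, "->", parse_posted(text, now=reference))
```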


#### Step 3.5: Extract Job Types
Extract and print the types of jobs (e.g., Full-time, Part-time).

In [8]:
# Find all job types
# <span class="css-1ve4b75 eoyjyou0">Full Time</span>
# A class_ string containing a space matches only when the class attribute is exactly this string
job_type = soup.find_all("span", class_="css-1ve4b75 eoyjyou0")
for i in job_type:
    print(i.text)

Full Time
Full Time
Full Time
Full Time
Full Time
Full Time
Full Time
Full Time
Full Time
Part Time
Full Time
Full Time
Full Time
Full Time
Full Time
Full Time
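Because each field is collected into its own list, the lists only line up if they have the same length. In the outputs above, the job-type search returned 16 entries while the title search returned 15, so pairing them by position would be unreliable. A minimal sketch of a guard to run before combining fields: `combine_fields` is a hypothetical helper, and the sample values are shortened from the outputs above.

```python
def combine_fields(**columns):
    """Zip equally long lists into a list of row dictionaries."""
    lengths = {name: len(values) for name, values in columns.items()}
    if len(set(lengths.values())) > 1:
        raise ValueError(f"Field lists differ in length: {lengths}")
    names = list(columns)
    return [dict(zip(names, row)) for row in zip(*columns.values())]

rows = combine_fields(
    title=["Machine Learning Manager", "AI Engineer"],
    location=["Cairo, Egypt", "Heliopolis, Cairo, Egypt"],
    company=["kcsc", "Integrated Technology Group"],
)
print(rows[0])
```

If the lengths differ, the helper raises a `ValueError` that lists each field's length, which points to the selector that needs attention.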


#### Step 3.6: Extract Job URLs
Extract and print the URLs for individual job postings.

In [9]:
# Extract job URLs
# <h2 class="css-m604qf">
# <a href="https://wuzzuf.net/jobs/p/ow9y8zcxQbne-Machine-Learning-Manager-kcsc-Cairo-Egypt" target="_blank" rel="noreferrer" class="css-o171kl">Machine Learning Manager</a></h2>
jobs_url = []
urls = soup.find_all("h2", class_="css-m604qf")
for i in urls:
    jobs_url.append(i.find('a').attrs["href"])
jobs_url

['https://wuzzuf.net/jobs/p/ow9y8zcxQbne-Machine-Learning-Manager-kcsc-Cairo-Egypt',
 'https://wuzzuf.net/jobs/p/mJweLehwNSbz-AI-Engineer-Integrated-Technology-Group-Cairo-Egypt',
 'https://wuzzuf.net/jobs/p/jjEycNgftu1U-Robotics-and-Programming-Engineer-Smart-Technology-Alexandria-Egypt',
 'https://wuzzuf.net/jobs/p/PfXPkSfnCCNs-Sales-Specialist-Gila-Electric-Cairo-Egypt',
 'https://wuzzuf.net/jobs/p/3k4DYwP9VAza-Senior-Backend-Developer-in-NodeExpress-Give-Brite-Manchester-United-Kingdom',
 'https://wuzzuf.net/jobs/p/eRVXqix8zgJQ-Machinist-Mechanical-Engineer-kpec-international-Larnaca-Cyprus',
 'https://wuzzuf.net/jobs/p/1kRUuVPxsf8W-AI-Technical-Team-Lead-Computer-Vision-Focus-NLP-Lumin-Giza-Egypt',
 'https://wuzzuf.net/jobs/p/iitM79Aq3BJp-Senior-ML-JD-ysolution-Damietta-Egypt',
 'https://wuzzuf.net/jobs/p/Ej8oiMp2sfqj-Online-Coding-Instructor-Riyadh-Saudi-Arabia',
 'https://wuzzuf.net/jobs/p/lewDJYtw0FhS-Senior-Full-Stack-Embedded-Engineering-Qudra-Tech-Riyadh-Saudi-Arabia',
 'htt

#### Step 3.7: Fetch Individual Job Details
Fetch and print the titles of individual jobs from their URLs.

In [10]:
for url in jobs_url[0:3]:
    url = url.replace(" ", "")
    print(url)
    page = requests.get(url)
    soup = BeautifulSoup(page.content, "html.parser")
    print(soup.title.text)

https://wuzzuf.net/jobs/p/ow9y8zcxQbne-Machine-Learning-Manager-kcsc-Cairo-Egypt
Machine Learning Manager Job at kcsc in Cairo, Egypt – Apply Now!
https://wuzzuf.net/jobs/p/mJweLehwNSbz-AI-Engineer-Integrated-Technology-Group-Cairo-Egypt
AI Engineer Job at Integrated Technology Group in Heliopolis, Cairo – Apply Now!
https://wuzzuf.net/jobs/p/jjEycNgftu1U-Robotics-and-Programming-Engineer-Smart-Technology-Alexandria-Egypt
Robotics and Programming Engineer Job at Smart Technology in Alexandria, Egypt – Apply Now!
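When visiting many job pages in a loop, it is common to pause between requests and to keep going when a single page fails. The sketch below separates that loop from the download itself so it can be tried offline. `visit_all` and `fake_fetch` are hypothetical names; in this notebook the `fetch` argument would be a function that wraps `requests.get` and `BeautifulSoup` and returns `soup.title.text`.

```python
import time

def visit_all(urls, fetch, delay_seconds=1.0):
    """Call fetch(url) for each URL, pausing between calls and recording failures as None."""
    results = {}
    for index, url in enumerate(urls):
        if index > 0:
            time.sleep(delay_seconds)  # pause so the server is not flooded
        try:
            results[url] = fetch(url)
        except Exception as error:  # broad on purpose: one bad page should not stop the loop
            print(f"Skipping {url}: {error}")
            results[url] = None
    return results

def fake_fetch(url):
    """Offline stand-in for a real download; fails for one URL to show the error path."""
    if url.endswith("/broken"):
        raise RuntimeError("simulated network error")
    return f"title of {url}"

pages = visit_all(
    ["https://example.com/a", "https://example.com/broken"],
    fake_fetch,
    delay_seconds=0,
)
print(pages)
```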

In [3]:
# Example URL for testing
# url = "https://wuzzuf.net/jobs/p/5NFxSMKMH5K0-SeniorMid-Senior-Deep-Learning-Engineer-Cairo-Egypt?o=1&l=sp&t=sj&a=machine%20learning|search-v3|spbg"
import requests
from bs4 import BeautifulSoup

url = "https://www.bayt.com/en/egypt/jobs/data-science-jobs/"

# Send a GET request and parse the content
page = requests.get(url)
soup = BeautifulSoup(page.content, "html.parser")
print(soup.title.text)

Data Science Jobs in Egypt (2024) - Bayt.com


In [4]:
soup

<!DOCTYPE html>

<html dir="ltr" lang="en">
<head>
<link href="https://secure.b8cdn.com" rel="preconnect"/>
<link crossorigin="" href="https://fonts.gstatic.com/" rel="preconnect"/>
<link href="https://cdn-cookieyes.com/" rel="preconnect"/>
<link href="https://secure.b8cdn.com" rel="dns-prefetch"/>
<link href="https://fonts.gstatic.com/" rel="dns-prefetch"/>
<link href="https://cdn-cookieyes.com/" rel="dns-prefetch"/>
<link href="https://log.cookieyes.com/" rel="dns-prefetch"/>
<link href="https://directory.cookieyes.com/" rel="dns-prefetch"/>
<meta content="data science jobs in egypt, data science vacancies egypt, data science career opportunities in egypt" name="keywords"/>
<meta charset="utf-8"/>
<meta content="width=device-width, initial-scale=1.0" name="viewport"/>
<meta content="Bayt.com FZ-LLC" name="copyright"/>
<meta content="strict-origin-when-cross-origin" name="referrer"/>
<meta content="116830345011417" data-js-id="meta-fb_app_id" property="fb:app_id"/>
<meta content="Bayt

### Step 4: Scrape Bayt for Job Data
Switch to a different platform and extract job data from Bayt.

In [5]:
# Define the Bayt URL for data science jobs
url = "https://www.bayt.com/en/egypt/jobs/data-science-jobs/"
# Send a GET request (with the default requests headers) and parse the page content
page = requests.get(url)
soup = BeautifulSoup(page.content, "html.parser")
print(soup.title.text)

Data Science Jobs in Egypt (2024) - Bayt.com
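The request above is sent with the default `requests` headers. Some sites respond differently to non-browser clients; in that case a `User-Agent` header can be passed through the `headers` argument of `requests.get`. A hedged sketch: the header value and the 10-second timeout are illustrative assumptions, and `fetch_page` is a hypothetical helper. The last lines build the request without sending it, so the outgoing header can be inspected offline.

```python
import requests

# Illustrative value only; substitute any User-Agent string appropriate for your use
HEADERS = {"User-Agent": "Mozilla/5.0 (compatible; job-scraper-demo)"}

def fetch_page(url, timeout=10):
    """Download a page with custom headers and raise an error on HTTP 4xx/5xx."""
    response = requests.get(url, headers=HEADERS, timeout=timeout)
    response.raise_for_status()
    return response

# Build (but do not send) the request to check which header would go out
prepared = requests.Request(
    "GET",
    "https://www.bayt.com/en/egypt/jobs/data-science-jobs/",
    headers=HEADERS,
).prepare()
print(prepared.headers["User-Agent"])
```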


#### Step 4.1: Extract Job Titles on Bayt
Extract and print job titles from Bayt.

In [6]:
# Find all job titles on the page
# <h2 class="col u-stretch t-large m0 t-nowrap-d t-trim">
# <a data-automation-is_aggregated="0" data-automation-is_external="1" data-js-aid="jobID" data-js-link="" href="/en/egypt/jobs/immdiate-hiring-for-product-managerfor-a-factory-in-egypt-5214512/">
# immdiate hiring for product managerfor a factory in Egypt </a>
# </h2>
title = soup.find_all("h2", class_="m0")
for j in title:
    print(j.text.strip())

Senior Presales Solution Architect " Data "
immdiate hiring for product managerfor a factory in Egypt
Power BI Developer
Microsoft CRM Administrator
Network & Security Head
Business Analytics & Insights Lead ELI & North Africa
Pharmacist
Data Analytics - Data Science Team Lead - Cairo
Customer Success Manager – Focused on Data Science - 218456
Safety Coordinator, Workplace Health and Safety
HCM Techno-Functional Consultant
Business Analyst (Retail Loan Origination System)
Data Science Manager
Senior Data Science Engineer
Enterprise Account Executive - MENA
Enterprise Account Executive - Saudi/UAE/Egypt
Accountant-OUTSOURCED OPPORTUNITY-THREE MONTHS CONTRACT
Accountant-Outsourced Opportunity-Three Months Contract
Senior Scientist, Computational Biotherapeutics Engineering
DAM security Engineer


#### Step 4.2: Extract Job URLs on Bayt
Extract and print the URLs for job postings on Bayt.

In [7]:
# Extract job URLs on Bayt
jobs_url = []
urls = soup.find_all("h2", class_="m0")
for i in urls:
    jobs_url.append("https://www.bayt.com" + i.find('a').attrs['href'])
jobs_url

['https://www.bayt.com/en/egypt/jobs/senior-presales-solution-architect-quot-data-quot-5206515/',
 'https://www.bayt.com/en/egypt/jobs/immdiate-hiring-for-product-managerfor-a-factory-in-egypt-5214512/',
 'https://www.bayt.com/en/egypt/jobs/power-bi-developer-5201473/',
 'https://www.bayt.com/en/egypt/jobs/microsoft-crm-administrator-5199788/',
 'https://www.bayt.com/en/egypt/jobs/network-security-head-5209500/',
 'https://www.bayt.com/en/egypt/jobs/business-analytics-amp-insights-lead-eli-amp-north-africa-5198838/',
 'https://www.bayt.com/en/egypt/jobs/pharmacist-5205410/',
 'https://www.bayt.com/en/egypt/jobs/data-analytics-data-science-team-lead-cairo-72038421/',
 'https://www.bayt.com/en/egypt/jobs/customer-success-manager-focused-on-data-science-218456-72042911/',
 'https://www.bayt.com/en/egypt/jobs/safety-coordinator-workplace-health-and-safety-5205289/',
 'https://www.bayt.com/en/egypt/jobs/hcm-techno-functional-consultant-5192293/',
 'https://www.bayt.com/en/egypt/jobs/busines
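String concatenation works as long as every `href` starts with `/`. The standard library's `urllib.parse.urljoin` also copes with links that are already absolute. A short sketch: the second `href` is written in absolute form purely for illustration.

```python
from urllib.parse import urljoin

PAGE_URL = "https://www.bayt.com/en/egypt/jobs/data-science-jobs/"

hrefs = [
    "/en/egypt/jobs/power-bi-developer-5201473/",              # relative to the site root
    "https://www.bayt.com/en/egypt/jobs/pharmacist-5205410/",  # already absolute (illustrative)
]

# urljoin resolves each href against the page it was found on
absolute = [urljoin(PAGE_URL, href) for href in hrefs]
for link in absolute:
    print(link)
```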

### Quizzes
Practice extracting additional fields from the scraped Bayt page with the following tasks:

#### Quiz 1: Find the Locations of Jobs

In [9]:
# Find the job locations
location = soup.find_all("div", class_="t-mute t-small")
for l in location:
    print(l.text)

Cairo · Egypt
Alexandria · Egypt
Cairo · Egypt
Cairo · Egypt
Egypt
Cairo · Egypt
Cairo · Egypt
Egypt
Cairo · Egypt
Cairo · Egypt
Cairo · Egypt
Cairo · Egypt
Cairo · Egypt
Cairo · Egypt
Egypt
Egypt
Cairo · Egypt
Cairo · Egypt
Rosetta · Egypt
Cairo · Egypt


#### Quiz 2: Find the Company Names of Jobs

In [12]:
# Find the company names
company = soup.find_all("div", class_="t-nowrap")
for c in company:
    print(c.text)


GIZA Systems 
Cairo · Egypt

GIZA Systems 

Shaheen Farouk 
Alexandria · Egypt

Shaheen Farouk 

Flint Consulting Ltd 
Cairo · Egypt

Flint Consulting Ltd 

VeriPark Gulf 
Cairo · Egypt

VeriPark Gulf 

Job Hunting 
Egypt

Job Hunting 

Pfizer - United Arab Emirates 
Cairo · Egypt

Pfizer - United Arab Emirates 

Axios International Consultants 
Cairo · Egypt

Axios International Consultants 

Infomineo 
Egypt

Infomineo 

Teradata 
Cairo · Egypt

Teradata 

Amazon MENA 
Cairo · Egypt

Amazon MENA 

شركة البدوي للتوظيف/ Albadwy Recruitment 
Cairo · Egypt

شركة البدوي للتوظيف/ Albadwy Recruitment 

VeriPark Gulf 
Cairo · Egypt

VeriPark Gulf 

Sylndr 
Cairo · Egypt

Sylndr 

Sylndr 
Cairo · Egypt

Sylndr 

Canonical 
Egypt

Canonical 

Canonical 
Egypt

Canonical 

Abbott 
Cairo · Egypt

Abbott 

Abbott 
Cairo · Egypt

Abbott 

Pfizer Manufacturing Belgium NV 
Rosetta · Egypt

Pfizer Manufacturing Belgium NV 

Accenture 
Cairo · Egypt

Accenture 
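In the output above each company name is printed more than once, and some matches also carry the location line. One way to tidy this, sketched under the assumption that the company name is the first non-empty line of each match, is to keep that line and drop repeats while preserving order. The `raw_matches` strings are hand-written approximations of the printed output, and `first_line` and `unique_companies` are hypothetical helpers.

```python
def first_line(text):
    """Return the first non-empty line of a text block, stripped, or None."""
    for line in text.splitlines():
        if line.strip():
            return line.strip()
    return None

def unique_companies(texts):
    """Return company names in first-seen order without repeats."""
    names = [first_line(text) for text in texts]
    # dict.fromkeys keeps the first occurrence of each key in insertion order
    return list(dict.fromkeys(name for name in names if name))

# Approximation of the matched text blocks (an assumption, not captured output)
raw_matches = [
    "\nGIZA Systems \nCairo · Egypt\n",
    "\nGIZA Systems \n",
    "\nShaheen Farouk \nAlexandria · Egypt\n",
    "\nShaheen Farouk \n",
]
print(unique_companies(raw_matches))
```

This yields the set of distinct employers rather than one company per job, so it does not replace a per-listing extraction.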


#### Quiz 3: Extract Job Descriptions

In [13]:
# Find the job descriptions
desc = soup.find_all("div", class_="jb-descr m10t t-small")
for d in desc:
    print(d.text)


Data Science Solution Arch plays a crucial role in helping businesses adopt data ... client needs and demonstrate how data science solutions can solve businessproblems. Key ...

This is a full-time on-site role for an enthusiactic, active and full of energy Senior Product Manager for a small factory for producing…

, dashboards, and data visualizations.• Strong understanding of data modeling, data warehousing, and ETL ... Ability to integrate various data sources and maintain data...

We enable financial institutions to become digital leaders. As a professional team of global scale, we work with best clients for…

Job Summary: We are a prominent food manufacturing company based in 10th of Ramadan City, seeking an experienced Network & Security…

need for advanced analytics and data science to address business opportunities and ... leadership, including articulation of the data and its implications for the ...

Position: Pharmacist Location: Egypt Position Purpose The purpose of this r