- Visit the [WUZZUF](https://wuzzuf.net/searchtran) website.

- "Wuzzuf" is an Egyptian online job portal that provides recruitment and human resources management services. It is one of the leading job sites in Egypt, offering its services in all governorates of the country. The site allows job seekers and companies to search for and connect with each other online. It also provides various services to companies, institutions, and individuals to search for suitable jobs and improve their chances of finding new job opportunities. 

- In this project, we will extract data from the Wuzzuf website and save it in a CSV file.

**1st: import the libraries we need**

In [1]:
# imports 
import requests 
import csv 
from bs4 import BeautifulSoup 
from itertools import zip_longest

**2nd: use requests to fetch the page**

In [2]:
# use requests to fetch the URL
# requests.get returns a response object for the page
# result stores that response

result=requests.get("https://wuzzuf.net/search/jobs/?q=data+scientist&a=navbg")
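
The part of the URL after `?` is the query string. As a minimal sketch, the same URL can be assembled from a dictionary with the standard library. The helper name `build_search_url` is made up for illustration and is not part of the original notebook; the `a=navbg` value is copied unchanged from the URL above.

```python
from urllib.parse import urlencode

BASE_URL = "https://wuzzuf.net/search/jobs/"

def build_search_url(query: str) -> str:
    """Return the search URL for a free-text query (hypothetical helper)."""
    # "a": "navbg" is copied as-is from the URL used in this notebook
    params = {"q": query, "a": "navbg"}
    # urlencode escapes the space in the query as '+'
    return f"{BASE_URL}?{urlencode(params)}"

url = build_search_url("data scientist")
print(url)
```

Changing the argument to another search phrase gives the URL for that search, assuming the site keeps the same query parameters.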

**3rd: use content to store the raw page content in the src variable**

In [3]:
# we need the content of the page
src=result.content
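
`result.content` is the raw response body as a `bytes` object, which is why its printed form starts with `b'`. A small illustration with a hand-written byte string (a stand-in, not the real page):

```python
# A short hand-written byte string standing in for result.content
raw = b'<!DOCTYPE html>\n<html lang="en">\n<head>\n<meta charset="utf-8">\n</head>\n</html>'

print(type(raw))            # the raw body is bytes
text = raw.decode("utf-8")  # bytes -> str, using the charset the page declares
print(type(text))
print(text.splitlines()[0])
```

BeautifulSoup accepts either form, so the notebook can pass `src` to it without decoding first.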

In [4]:
src

b'<!DOCTYPE html>\n<html lang="en">\n<head>\n    <meta charset="utf-8">\n    <meta http-equiv="X-UA-Compatible" content="IE=edge">\n    <meta name="viewport" content="width=device-width, initial-scale=1.0, shrink-to-fit=no">\n    <meta http-equiv="expires" content="Thu Dec 08 2022 18:30:44 GMT+0200" />\n\n    <meta http-equiv="Pragma" content="no-cache">\n    <meta http-equiv="cache-control" content="no-cache, no-store, must-revalidate">\n\n    <title data-react-helmet="true">Job Search | WUZZUF</title>\n\n<meta data-react-helmet="true" charset="utf-8"/><meta data-react-helmet="true" name="description" content="Searching for jobs in Egypt? Wuzzuf helps you in your online job search to find Jobs in Egypt and Middle East. Choose the right job using our online recruitment services."/><meta data-react-helmet="true" name="keywords" content="jobs in Egypt, job in Egypt, careers egypt, jobs in Cairo, jobs in alexandria, employment in egypt, Egypt jobs, jobs vacancies, job vacancies in egypt, 

**4th: parse the content and extract information with the Beautiful Soup library**

In [5]:
# the soup object needs the content and the parser
soup=BeautifulSoup(src,"lxml")

**5th: find the elements containing info we need**

- We need the job title, job skills, company name, and location.
- Any web page is built from HTML, CSS, and JavaScript.
- The content we need is written in the HTML.
- If we right-click any part of the page and choose Inspect, the browser shows the HTML that the developers used to build that part of the page.

- HTML is made up of tags.

- The text we need is located inside tags.

- So we need to find the tags that contain our text.

- After we click Inspect on the text we need, we can see the HTML code around that text.

- From that code we read off the tag that wraps the text.

- Once we know the tag, we can search for it with the "find_all" function in Beautiful Soup.

- Here the function takes two parameters: the first is the name of the tag we are searching for, and the second is the properties (attributes) that describe the tag. On this page, each tag we need carries a class that identifies it.

- So we pass the class name of each tag in the second parameter.

- The "find_all" function returns a list of all the tags that match the filter.

In [6]:
# the 1st parameter is the tag name
# the 2nd is a dictionary of the tag's attributes, here its 'class'
job_titles=soup.find_all("h2", {"class":"css-m604qf"})
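
To see what `find_all` does without fetching anything, here is a minimal sketch on hand-written HTML. The markup and the class names `job-title` and `other` are made up for illustration, and the built-in `html.parser` is used instead of `lxml` so the example has one dependency less.

```python
from bs4 import BeautifulSoup

# Hand-written HTML that mimics a results page (class names are made up)
html = """
<div>
  <h2 class="job-title"><a href="/jobs/1">Data Scientist</a></h2>
  <h2 class="job-title"><a href="/jobs/2">Data Analyst</a></h2>
  <h2 class="other">Not a job</h2>
</div>
"""

# html.parser ships with Python, so this sketch does not need lxml
demo_soup = BeautifulSoup(html, "html.parser")

# find_all returns a list-like ResultSet with every matching tag
matches = demo_soup.find_all("h2", {"class": "job-title"})
demo_titles = [tag.text for tag in matches]
print(demo_titles)
```

Only the two `h2` tags with the requested class are returned; the third one is filtered out by the second parameter.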

In [7]:
job_titles

[<h2 class="css-m604qf"><style data-emotion="css o171kl">.css-o171kl{-webkit-text-decoration:none;text-decoration:none;color:inherit;}</style><a class="css-o171kl" href="/jobs/p/yt1kTxoZWRA7-DATA-ENGINEER-DATA-SCIENTIST--NCR-REMOTE-EURO-Rates-Pillars-Cairo-Egypt?o=1&amp;l=sp&amp;t=sj&amp;a=data scientist|search-v3|navbg" rel="noreferrer" target="_blank">DATA ENGINEER (DATA SCIENTIST) -NCR (REMOTE) (EURO Rates)</a></h2>,
 <h2 class="css-m604qf"><a class="css-o171kl" href="/jobs/p/C26QAehr65Rx-Data-Scientist-Analyst--Remote---Urgent-GetTechForce-com-Cairo-Egypt?o=2&amp;l=sp&amp;t=sj&amp;a=data scientist|search-v3|navbg" rel="noreferrer" target="_blank">Data Scientist/ Analyst- Remote - Urgent</a></h2>,
 <h2 class="css-m604qf"><a class="css-o171kl" href="/jobs/p/EVzEDYkzxJg2-Data-Scientist-Giza-Egypt?o=3&amp;l=sp&amp;t=sj&amp;a=data scientist|search-v3|navbg" rel="noreferrer" target="_blank">Data Scientist</a></h2>,
 <h2 class="css-m604qf"><a class="css-o171kl" href="/jobs/p/8MHaaeCAwJzU-

In [8]:
company_names=soup.find_all("a",{"class":"css-17s97q8"}) 
company_names

[<a class="css-17s97q8" href="https://wuzzuf.net/jobs/careers/Pillars-Egypt-4586" rel="noreferrer" target="_blank">Pillars -</a>,
 <a class="css-17s97q8" href="https://wuzzuf.net/jobs/careers/GetTechForce-com-Egypt-57284" rel="noreferrer" target="_blank">GetTechForce.com -</a>,
 <a class="css-17s97q8" rel="noreferrer" target="_blank">Confidential -</a>,
 <a class="css-17s97q8" rel="noreferrer" target="_blank">Confidential -</a>,
 <a class="css-17s97q8" href="https://wuzzuf.net/jobs/careers/United-Grocers-Egypt-16038" rel="noreferrer" target="_blank">Seoudi Supermarket -</a>,
 <a class="css-17s97q8" href="https://wuzzuf.net/jobs/careers/Proteinea-Egypt-51669" rel="noreferrer" target="_blank">Proteinea -</a>,
 <a class="css-17s97q8" href="https://wuzzuf.net/jobs/careers/gbrands-com-555" rel="noreferrer" target="_blank">Global Brands -</a>,
 <a class="css-17s97q8" href="https://wuzzuf.net/jobs/careers/Care-Dental-Egypt-27404" rel="noreferrer" target="_blank">Sequel Solutions -</a>,
 <a cl

In [9]:
locations_names=soup.find_all("span",{"class":"css-5wys0k"}) 
locations_names

[<span class="css-5wys0k">Cairo, <!-- -->Egypt </span>,
 <span class="css-5wys0k">Cairo, <!-- -->Egypt </span>,
 <span class="css-5wys0k">Giza, <!-- -->Egypt </span>,
 <span class="css-5wys0k">New Nozha, <!-- -->Cairo, <!-- -->Egypt </span>,
 <span class="css-5wys0k">Sheikh Zayed, <!-- -->Giza, <!-- -->Egypt </span>,
 <span class="css-5wys0k">Cairo, <!-- -->Egypt </span>,
 <span class="css-5wys0k">New Cairo, <!-- -->Cairo, <!-- -->Egypt </span>,
 <span class="css-5wys0k">Cairo, <!-- -->Egypt </span>,
 <span class="css-5wys0k">Cairo, <!-- -->Egypt </span>,
 <span class="css-5wys0k">Cairo, <!-- -->Egypt </span>,
 <span class="css-5wys0k">Cairo, <!-- -->Egypt </span>,
 <span class="css-5wys0k">Tanta, <!-- -->Gharbia, <!-- -->Egypt </span>,
 <span class="css-5wys0k">10th of Ramadan City, <!-- -->Cairo, <!-- -->Egypt </span>,
 <span class="css-5wys0k">Cairo, <!-- -->Egypt </span>,
 <span class="css-5wys0k">Cairo, <!-- -->Egypt </span>]

In [10]:
# select every <div> that has no class attribute
job_skills=soup.find_all('div', attrs={'class': None}) 
# job_skills

In [11]:
job_nature=soup.find_all("div",{"class":"css-1lh32fc"}) 
job_nature

[<div class="css-1lh32fc"><style data-emotion="css n2jc4m">.css-n2jc4m{display:-webkit-inline-box;display:-webkit-inline-flex;display:-ms-inline-flexbox;display:inline-flex;-webkit-align-items:center;-webkit-box-align:center;-ms-flex-align:center;align-items:center;-webkit-text-decoration:none;text-decoration:none;color:inherit;margin-bottom:4px;}</style><a class="css-n2jc4m" href="/a/Full-Time-Jobs-in-Egypt"><style data-emotion="css adtuo7">.css-adtuo7{cursor:pointer;padding:0 4px;border-radius:4px;}</style><style data-emotion="css 1ve4b75">.css-1ve4b75{font-size:12px;font-weight:600;display:-webkit-inline-box;display:-webkit-inline-flex;display:-ms-inline-flexbox;display:inline-flex;-webkit-align-items:center;-webkit-box-align:center;-ms-flex-align:center;align-items:center;min-height:20px;margin-right:4px;border-radius:2px;max-width:196px;white-space:nowrap;overflow:hidden;cursor:default;text-overflow:ellipsis;padding:2px 4px;background-color:#EBEDF0;color:#001433;cursor:pointer;pad

In [12]:
titles=[] 
names=[]
locations=[] 
nature=[]
skills=[] 
links=[] 
salaries=[] 
responsibilities=[]

In [13]:
# take the 15 entries at indices 6-20 (position-based, so it depends on the page layout)
for i in range(6, 21):
    skills.append(job_skills[i].text)

In [14]:
for i in range(len(job_titles)) : 
    titles.append(job_titles[i].text) 
    names.append(company_names[i].text)
    nature.append(job_nature[i].text)
    locations.append(locations_names[i].text)
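
The `.text` values collected here keep the page's raw spacing: in the outputs above, the company names end with ` -` and the locations end with a space. One possible cleanup, sketched with two hypothetical helpers (`clean_company` and `clean_location` are not part of the original notebook):

```python
def clean_company(raw: str) -> str:
    """Drop surrounding whitespace and the trailing '-' separator."""
    return raw.strip().rstrip("-").strip()

def clean_location(raw: str) -> str:
    """Collapse runs of whitespace and trim both ends."""
    return " ".join(raw.split())

print(clean_company("Pillars -"))
print(clean_location("New Nozha, Cairo, Egypt "))
```

These could be applied inside the loop, for example `names.append(clean_company(company_names[i].text))`, if tidy values are wanted in the CSV.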

In [15]:
from urllib.parse import urljoin, quote

In [16]:
for title in job_titles:
    link = 'https://wuzzuf.net' + title.find('a')['href']
    # keep '&' unescaped so the query parameters stay separate
    links.append(quote(link, safe=':/?&=-'))

In [17]:
links

['https://wuzzuf.net/jobs/p/yt1kTxoZWRA7-DATA-ENGINEER-DATA-SCIENTIST--NCR-REMOTE-EURO-Rates-Pillars-Cairo-Egypt?o=1&l=sp&t=sj&a=data%20scientist%7Csearch-v3%7Cnavbg',
 'https://wuzzuf.net/jobs/p/C26QAehr65Rx-Data-Scientist-Analyst--Remote---Urgent-GetTechForce-com-Cairo-Egypt?o=2&l=sp&t=sj&a=data%20scientist%7Csearch-v3%7Cnavbg',
 'https://wuzzuf.net/jobs/p/EVzEDYkzxJg2-Data-Scientist-Giza-Egypt?o=3&l=sp&t=sj&a=data%20scientist%7Csearch-v3%7Cnavbg',
 'https://wuzzuf.net/jobs/p/8MHaaeCAwJzU-Senior-Data-Scientist-Cairo-Egypt?o=4&l=sp&t=sj&a=data%20scientist%7Csearch-v3%7Cnavbg',
 'https://wuzzuf.net/jobs/p/VssduzQBDTOi-Data-Scientist-Seoudi-Supermarket-Giza-Egypt?o=5&l=sp&t=sj&a=data%20scientist%7Csearch-v3%7Cnavbg',
 'https://wuzzuf.net/jobs/p/TDIl3dpTtpD8-immunologyimmuno-oncology-scientist-Proteinea-Cairo-Egypt?o=6&l=sp&t=sj&a=data%20scientist%7Csearch-v3%7Cnavbg',
 'https://wuzzuf.net/jobs/p/C89OQXgLP7Yy-Business-Analyst-Data-AI-Global-Brands-Cair
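
The `safe` argument of `quote` lists the characters that are left unescaped, which matters for the `&` between query parameters. A short comparison on a hypothetical href (the path `abc123-Data-Scientist` is made up; `urljoin` is shown as one way to resolve the relative link):

```python
from urllib.parse import quote, urljoin

# Hypothetical relative href, shaped like the ones in the search results
href = "/jobs/p/abc123-Data-Scientist?o=1&l=sp&a=data scientist|search-v3"

# urljoin resolves the relative path against the site root
absolute = urljoin("https://wuzzuf.net", href)

# Without '&' in safe, the parameter separators are escaped as %26
without_amp = quote(absolute, safe=":/?-=")
# With '&' in safe, the query string keeps its separate parameters
with_amp = quote(absolute, safe=":/?&=-")

print(without_amp)
print(with_amp)
```

In both cases the space and the `|` are percent-encoded; only the treatment of `&` differs.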

In [None]:
for link in links:
    result = requests.get(link)
    src = result.content
    soup = BeautifulSoup(src, "lxml")
    # find() returns None when no matching element is in the fetched HTML
    salary = soup.find("span", {"class": "css-47jx3m"})
#     salaries.append(salary.text if salary is not None else None)
    print(salary)
#     requirements = soup.find('div', {"class": "css-1t5f0fr"}).find("ul").find_all("li")
#     responsibilities.append([li.text for li in requirements])

None
None
None
None
None
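
Each `None` printed above is what `find` returns when no element matches. Why the `css-47jx3m` span is missing from the fetched HTML is not established here; one possibility is that the value is rendered by JavaScript after the page loads, which `requests` does not execute. Either way, calling `.text` on `None` would raise an `AttributeError`, so a guard is needed before reading the text. A minimal sketch with a hypothetical helper `text_or_none`, run on hand-written HTML with made-up class names:

```python
from bs4 import BeautifulSoup

def text_or_none(soup, tag, class_name):
    """Return the stripped text of the first match, or None if nothing matches."""
    element = soup.find(tag, {"class": class_name})
    return element.get_text(strip=True) if element is not None else None

# Hand-written HTML; the class names are made up for this sketch
page = BeautifulSoup('<span class="salary">Confidential</span>', "html.parser")

print(text_or_none(page, "span", "salary"))
print(text_or_none(page, "span", "missing"))
```

With a helper like this, the loop could append a value for every link and leave gaps as `None` instead of stopping on the first page where the element is absent.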


In [None]:
salaries

In [None]:
titles

In [None]:
names

In [None]:
locations

In [None]:
nature

In [None]:
file_list=[titles,names,locations,nature,skills] 
exported=zip_longest(*file_list)
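
`zip_longest` turns the column lists into rows and pads shorter columns with `None`, so no row is lost when the lists differ in length. A toy illustration with made-up values:

```python
from itertools import zip_longest

# Toy columns of unequal length (made-up values)
demo_titles = ["Data Scientist", "Data Analyst", "ML Engineer"]
demo_skills = ["Python", "SQL"]

# zip_longest builds one tuple per row and fills the gap with None
rows = list(zip_longest(demo_titles, demo_skills))
print(rows)

# zip, by contrast, stops at the shortest column
short_rows = list(zip(demo_titles, demo_skills))
print(short_rows)
```

Note that `zip_longest` returns an iterator, so `exported` can be consumed only once.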

In [None]:
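# newline='' stops the csv module from writing blank rows on Windows
with open('wuzzuf.csv', 'w', newline='', encoding='utf-8') as myfile:
    wr = csv.writer(myfile)
    wr.writerow(['job_title', 'company_name', 'job_location', 'job_nature', 'required_skills'])
    wr.writerows(exported)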
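
To check what ends up in the file, the rows can be written and read back. The sketch below uses an in-memory buffer in place of `wuzzuf.csv` so it leaves nothing on disk; the two sample rows are made up.

```python
import csv
import io

header = ["job_title", "company_name"]
rows = [("Data Scientist", "Pillars"), ("Data Analyst", None)]

# StringIO stands in for a file opened with newline=""
buffer = io.StringIO(newline="")
writer = csv.writer(buffer)
writer.writerow(header)
writer.writerows(rows)  # None is written as an empty field

# Rewind and read the rows back as dictionaries keyed by the header
buffer.seek(0)
records = list(csv.DictReader(buffer))
print(records)
```

The `None` padding produced by `zip_longest` therefore shows up as an empty cell in the CSV rather than the text "None".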