# Web Scraping Tutorial

This notebook provides a step-by-step guide to scraping data from a website. Web scraping extracts information from websites by transforming the data on web pages into a structured format, which is particularly useful for data analysis, machine learning, and other data-driven tasks.

In this tutorial, we will walk through the process of scraping product information from a sample e-commerce site. By following these steps, you will learn how to:

1. Send HTTP requests to retrieve web pages.
2. Parse HTML content using BeautifulSoup.
3. Identify and extract relevant data elements from the parsed HTML.
4. Store the extracted data in a structured format using pandas.
5. Save the data to a CSV file.
6. Optionally, save the data to a database such as MongoDB.

The website we will be scraping is [ScrapeMe](https://scrapeme.live/shop/). This site is designed for practice purposes and contains a variety of products with details such as names and prices, which makes it an ideal candidate for learning web scraping techniques.

Before you begin, please visit the site to understand its structure. This will help you identify the elements you need to scrape.

Let's get started!

## Import libraries here

In [141]:
import requests
from bs4 import BeautifulSoup


## Step 1: Send a request to the website

In [143]:
url = 'https://tuwaiq.edu.sa/TestCenter'

response = requests.get(url)

response.status_code

200
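
A request without a timeout can hang indefinitely, and a status code that is only displayed is easy to overlook. The following is a minimal sketch of a more defensive fetch. The helper name `fetch_html`, the 10-second timeout, and the User-Agent string are assumptions chosen for illustration, not part of the original notebook.

```python
import requests


def fetch_html(url, session=None, timeout=10):
    """Return the page body as text, raising on HTTP errors.

    `fetch_html` is a hypothetical helper, not part of requests.
    """
    session = session or requests.Session()
    # Assumed User-Agent string; some sites reject the library default.
    headers = {"User-Agent": "scraping-tutorial/0.1"}
    response = session.get(url, headers=headers, timeout=timeout)
    # Raises requests.HTTPError for 4xx/5xx instead of silently continuing.
    response.raise_for_status()
    return response.text
```

It could be called as `html = fetch_html('https://tuwaiq.edu.sa/TestCenter')`, after which `html` can be passed to `BeautifulSoup` in place of `response.text`.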

## Step 2: Parse the HTML content of the page

In [144]:
soup = BeautifulSoup(response.text,'html.parser')

soup.title

<title>أكاديمية طويق</title>
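
Note that `soup.title` returns the whole tag, not just its text. The sketch below shows the difference on a small stand-in HTML string (an assumption, so the example runs without a network request); the variable name `sample_soup` is used to avoid overwriting the notebook's `soup`.

```python
from bs4 import BeautifulSoup

# Stand-in HTML so the example runs without a network request.
html = """
<html>
  <head><title>Sample Shop</title></head>
  <body><h1>  Products  </h1></body>
</html>
"""

sample_soup = BeautifulSoup(html, "html.parser")

title_tag = sample_soup.title            # the <title> Tag object
title_text = sample_soup.title.string    # only the text inside the tag
heading = sample_soup.h1.get_text(strip=True)  # text with whitespace trimmed

print(title_tag)
print(title_text)
print(heading)
```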

## Step 3: Inspect the website and identify the elements to scrape
Use the browser's developer tools to inspect the page and identify the elements to scrape (e.g., product names and prices).
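
Selectors found during inspection can be written in more than one way. Passing a multi-class string to `class_` matches only when the element's `class` attribute is exactly that string, which is brittle on sites that use long utility-class lists. As a sketch of an alternative, the example below uses a CSS selector via `select`; the HTML is a simplified stand-in, and its class names and item text are assumptions.

```python
from bs4 import BeautifulSoup

# Simplified stand-in for the kind of markup found while inspecting a page.
html = """
<ul class="grid items-start gap-4">
  <li class="flex flex-col"><span></span> Artificial Intelligence </li>
  <li class="flex flex-col"><span></span> Cybersecurity </li>
</ul>
"""

sample_soup = BeautifulSoup(html, "html.parser")

# Exact-string match: fails if the class order or any class changes.
exact = sample_soup.find_all("ul", class_="grid items-start gap-4")

# CSS selector: matches any <li> directly inside a <ul> with the "grid" class.
items = sample_soup.select("ul.grid > li")

print(len(exact), [li.get_text(strip=True) for li in items])
```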

In [145]:

print(soup.find_all('h1', class_='text-[#1E1E1E] font-bold text-3xl'))
soup.find_all('ul', class_='grid items-start justify-between grid-cols-2 gap-4 md:flex')

[<h1 class="text-[#1E1E1E] font-bold text-3xl">
                    +300 شهادة احترافية عالمية في مجالات:
                </h1>]


[<ul class="grid items-start justify-between grid-cols-2 gap-4 md:flex">
 <li class="flex flex-col items-center w-full md:-ms-10">
 <span class="w-10 pt-5 border-t-8 border-primary-400"></span>
                             الذكاء الاصطناعي
                         </li>
 <li class="flex flex-col items-center w-full">
 <span class="w-10 pt-5 border-t-8 border-primary-400"></span>
                             تحليل وإدارة البيانات
                         </li>
 <li class="flex flex-col items-center w-full">
 <span class="w-10 pt-5 border-t-8 border-primary-400"></span>
                             الأمن السيبراني
                         </li>
 <li class="flex flex-col items-center w-full">
 <span class="w-10 pt-5 border-t-8 border-primary-400"></span>
                             الحوسبة السحابية
                         </li>
 <li class="flex flex-col items-center w-full">
 <span class="w-10 pt-5 border-t-8 border-primary-400"></span>
                             تقنية المعلومات
     

## Step 4: Extract the desired data

In [186]:
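arr = []
# Each matching <ul> holds the category names in its <li> children
for ul in soup.find_all('ul', class_='grid items-start justify-between grid-cols-2 gap-4 md:flex'):
    for li in ul.find_all('li'):
        arr.append(li.get_text(strip=True))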
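
When each item has several fields, it is often convenient to extract them into a list of dictionaries, one per item. The sketch below illustrates this on hypothetical product markup; the class names `product`, `name`, and `price`, and the sample values, are assumptions rather than the structure of any particular site.

```python
from bs4 import BeautifulSoup

# Hypothetical product markup; the class names are assumptions.
html = """
<ul>
  <li class="product"><h2 class="name">Widget</h2><span class="price">$12.50</span></li>
  <li class="product"><h2 class="name">Gadget</h2><span class="price">$7.00</span></li>
</ul>
"""

sample_soup = BeautifulSoup(html, "html.parser")

records = []
for product in sample_soup.select("li.product"):
    name = product.select_one(".name").get_text(strip=True)
    price_text = product.select_one(".price").get_text(strip=True)
    # Strip the currency symbol and convert to a number for later analysis.
    price = float(price_text.lstrip("$"))
    records.append({"name": name, "price": price})

print(records)
```

A list of dictionaries with consistent keys maps directly onto the rows and columns of a table, which keeps the next step simple.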

## Step 5: Create a DataFrame to store the extracted data
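
One possible way to build the DataFrame, assuming the extracted data is a list of dictionaries with the same keys. The sample records here are made up for illustration.

```python
import pandas as pd

# Assumed sample records standing in for the scraped data.
records = [
    {"name": "Widget", "price": 12.5},
    {"name": "Gadget", "price": 7.0},
]

# Each dict becomes a row; the keys become column names.
df = pd.DataFrame(records)

print(df)
```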

## Step 6: Save the data to a CSV file
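
A minimal sketch of writing a DataFrame to CSV and reading it back. The file name `scraped_data.csv`, the temporary directory, and the sample data are assumptions; `utf-8-sig` is chosen here because it tends to help spreadsheet programs detect the encoding of non-ASCII text such as Arabic.

```python
import os
import tempfile

import pandas as pd

df = pd.DataFrame({"name": ["Widget", "Gadget"], "price": [12.5, 7.0]})

# Assumed output location; a temporary directory keeps the example tidy.
csv_path = os.path.join(tempfile.mkdtemp(), "scraped_data.csv")

# index=False omits the row index from the file.
df.to_csv(csv_path, index=False, encoding="utf-8-sig")

# Read the file back to confirm the round trip.
restored = pd.read_csv(csv_path, encoding="utf-8-sig")
print(restored)
```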

In [1]:

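import requests
from bs4 import BeautifulSoup

questions = []  # question headings collected from every page
answers = []    # paragraph text collected from every page

for i in range(1, 7):
    url = f'https://binbaz.org.sa/fatwas/kind/{i}'
    response = requests.get(url)
    soup = BeautifulSoup(response.text, 'html.parser')

    # Accumulate results inside the loop so earlier pages are not overwritten
    for heading in soup.find_all('h1'):
        print("Question:", heading.get_text(strip=True))
        questions.append(heading.get_text(strip=True))
    answers.extend(p.get_text(strip=True) for p in soup.find_all('p'))

questions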

## Step 7: Print the DataFrame to verify the extracted data
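
A few quick checks that may help verify the extracted data, sketched on made-up sample data (the column names are assumptions).

```python
import pandas as pd

df = pd.DataFrame({"name": ["Widget", "Gadget"], "price": [12.5, 7.0]})

print(df.head())          # first rows (up to five)
print(df.shape)           # (rows, columns)
print(df.isna().sum())    # missing values per column

# A simple sanity check: no row should lack a name.
missing_names = int(df["name"].isna().sum())
```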

## Step 8: Save the data to a database of your choice. If you are using MongoDB, include the code here.
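
A hedged sketch of saving records to MongoDB with `pymongo`. The connection URI, the database name `scraping`, the collection name `products`, and both helper functions are assumptions for illustration; a MongoDB server must be running for the final commented-out call to work.

```python
def get_collection(uri="mongodb://localhost:27017",
                   db_name="scraping", coll_name="products"):
    """Return a pymongo collection. URI and names are assumptions."""
    # Imported lazily so the rest of the example runs without pymongo.
    from pymongo import MongoClient
    return MongoClient(uri)[db_name][coll_name]


def save_records(records, collection):
    """Insert the records and return how many documents were written."""
    if not records:
        return 0  # insert_many rejects an empty list
    result = collection.insert_many(records)
    return len(result.inserted_ids)


records = [
    {"name": "Widget", "price": 12.5},
    {"name": "Gadget", "price": 7.0},
]
# With a MongoDB server running:
# save_records(records, get_collection())
```

If the data lives in a DataFrame, `df.to_dict("records")` produces a list of dictionaries in the shape `insert_many` expects.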