Commit c3e0965

Merge pull request #364 from AHNAF14924/main
Added Scripts by AHNAF14924
2 parents 4275206 + 455b462

File tree

3 files changed: +98 -0 lines

Binary file (9.37 KB) not shown.

AUTOMATION/Web_Scraper/README.md

# Introduction

This Python program is a web scraper that extracts data about graphics cards from a specific website. It uses the BeautifulSoup library to parse the HTML content of the page and the Requests library to fetch it.
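To make the parsing step concrete, here is a minimal, self-contained sketch that applies the same `find`/`class_` pattern the script uses to an inline HTML snippet; the markup below is illustrative, not the real site's.

```python
from bs4 import BeautifulSoup

# Illustrative HTML mimicking one product card; the real site's markup may differ.
html = """
<div class="product-layout">
  <div class="name"><a>MSI GeForce RTX 4060</a></div>
  <div class="price"><span>32,500</span></div>
</div>
"""

soup = BeautifulSoup(html, 'html.parser')
name = soup.find('div', class_='name').a.text    # text of the <a> inside div.name
price = soup.find('div', class_='price').span.text
print(name, price)
```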
## Requirements

- Python 3.x
- BeautifulSoup library (`beautifulsoup4`)
- Requests library (`requests`)
- Openpyxl library (`openpyxl`)

You can install the required libraries using pip:

```
pip install beautifulsoup4 requests openpyxl
```
## How to Use

1. Clone this repository or download the files.

2. Open a terminal or command prompt and navigate to the project directory.

3. Run the Python script `app.py`:

```
python app.py
```

4. The program will start scraping data from the website and display the brand, name, and price of each graphics card on the console.

5. Once the scraping is complete, the program will save the data to an Excel file named `Graphics Card.xlsx`.
## Configuration

You can modify the URL in the `scrape_graphics_cards_data()` function inside the `app.py` file to scrape data from a different website or adjust the query parameters as needed.
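For example, rather than hand-editing the query string, the URL can be rebuilt with the standard library. The parameter names (`sort`, `order`, `fq`, `limit`) are taken from the URL in `app.py`; whether the site accepts other values is an assumption.

```python
from urllib.parse import urlencode

# Rebuild the query string from app.py with a smaller page size (limit=50).
base = 'https://www.techlandbd.com/pc-components/graphics-card'
params = {'sort': 'p.price', 'order': 'ASC', 'fq': 1, 'limit': 50}
url = f'{base}?{urlencode(params)}'
print(url)
```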
## Output

The program will generate an Excel file `Graphics Card.xlsx` containing the scraped data. Each row in the Excel file represents a graphics card and includes the columns `Brand`, `Name`, and `Price`.
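As a quick sanity check of the output format, this sketch writes a workbook in the same `Brand`/`Name`/`Price` layout and reads it back with `openpyxl`; the sample row is made up.

```python
import openpyxl

# Write a workbook in the layout described above (sample row is made up).
wb = openpyxl.Workbook()
sheet = wb.active
sheet.title = "price"
sheet.append(['Brand', 'Name', 'Price'])
sheet.append(['MSI', 'GeForce RTX 4060', '32,500'])
wb.save('Graphics Card.xlsx')

# Read it back and inspect the rows as tuples.
rows = list(openpyxl.load_workbook('Graphics Card.xlsx')['price'].values)
print(rows[0])  # header row
print(rows[1])  # data row
```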
## Disclaimer

This web scraper is provided for educational and informational purposes only. Please be respectful of the website's terms of service and scraping policies. Always obtain proper authorization before scraping any website, and use the scraper responsibly and ethically.

AUTOMATION/Web_Scraper/app.py

```python
from bs4 import BeautifulSoup
import requests
import openpyxl


def extract_brand_name_and_title(name):
    # The first word is the brand; the rest is the title.
    # partition (unlike split) does not raise if the name has no space.
    brand, _, title = name.partition(' ')
    return brand, title


def scrape_graphics_cards_data():
    try:
        # Create a new Excel workbook and set up the worksheet
        excel = openpyxl.Workbook()
        sheet = excel.active
        sheet.title = "price"
        sheet.append(['Brand', 'Name', 'Price'])

        url = 'https://www.techlandbd.com/pc-components/graphics-card?sort=p.price&order=ASC&fq=1&limit=100'
        response = requests.get(url)
        response.raise_for_status()

        # Parse the HTML content
        soup = BeautifulSoup(response.text, 'html.parser')

        # Find all product cards on the webpage
        cards = soup.find('div', class_='main-products product-grid').find_all(
            'div', class_='product-layout has-extra-button')

        for card in cards:
            # Extract the product name
            name = card.find('div', class_='name').a.text

            # Split the name to get the brand and title
            brand, title = extract_brand_name_and_title(name)

            # Extract the product price
            price = card.find('div', class_='price').span.text

            # Print the product details and add them to the Excel sheet
            print(brand, title, price)
            sheet.append([brand, title, price])

        # Save the Excel file
        excel.save('Graphics Card.xlsx')

    except Exception as e:
        print("An error occurred:", e)


if __name__ == "__main__":
    # Call the main scraping function
    scrape_graphics_cards_data()
```
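The brand/title split can be exercised on its own. This standalone sketch uses `str.partition`, which, unlike `str.split`, returns an empty title instead of raising when a product name contains no space:

```python
def extract_brand_name_and_title(name):
    # First word is the brand; the remainder (possibly empty) is the title.
    brand, _, title = name.partition(' ')
    return brand, title

print(extract_brand_name_and_title('MSI GeForce RTX 4060'))
print(extract_brand_name_and_title('ZOTAC'))  # no space: empty title, no crash
```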
