# Scrape ebay.com

### Objective: Scrape ebay.com using the link below and generate the following info: Product Name, Price, Shipping Cost, Image URL, Number of Watchers, and Product Condition (Brand New, Opened Box, or Pre-Owned).

In [1]:
# Importing the Required Libraries
import requests # Sends HTTP requests and returns the server's response
from bs4 import BeautifulSoup # Used for scraping data out of HTML and XML files
import pandas as pd # Used to construct a DataFrame

In [2]:
# Setting a User-Agent header and requesting the search results page
user_agent = {"User-Agent": "chrome"}
link = "https://www.ebay.com/sch/i.html?_nkw=table"
page = requests.get(link, headers=user_agent) # Sends an HTTP GET request to the search URL
print(page) # Response 200 means the request succeeded and the page content can be parsed

<Response [200]>


In [3]:
# Parse the page's HTML with Beautiful Soup and store the parsed tree in the "soup" variable.
soup = BeautifulSoup(page.content, features="html.parser")

In [4]:
# Generating a Parent Structure from which the desired items will be scraped. 
products = soup.find_all("li", class_="s-item s-item__pl-on-bottom")

**Now, we will extract the following items:**
* Product Name
* Product Condition (Brand New, Opened Box, or Pre-Owned)
* Price
* Number of Watchers
* Image URL
* Webpage Link
* Shipping Cost

### 1. Product Name

This code creates an empty list called "product_name" to store the results. It then loops through each item in the "products" list and finds the product name inside the "div" tag with class "s-item__title". It strips surrounding whitespace and removes commas from the name, appends it to the "product_name" list, and prints it.

In [5]:
# Creating list where the data will be stored...
product_name = [] 

for prod in products:
    # Finding Product Name & saving it in var 'name'
    name = prod.find("div", class_="s-item__title").find("span").text.strip().replace(",", "")
    
    # Appending the Product Name in 'product_name' list.
    product_name.append(name)
    
    # Print the Product Names...
    print(f"Product Name:{name}\n")

Product Name:Shop on eBay

Product Name:Malachite Round Top Coffee Table Top Semi Precious Mosaic Art Inlay Home Decor

Product Name:Live edge very old giant size great shape olive Acacia wood slab coffee table

Product Name:Foldable Indoor Plastic Round Dining Table Portable Outdoor Picnic Desk w/Handle

Product Name:Foldable Adjustable Height TV Tray Home Portable Sofa Bed Side Table Laptop Desk

Product Name:End Table Espresso Storage Living Room Half Round Table Moon Drawer Shelf Design

Product Name:Industrial Style Porch Table Single Layer Black Oak Triamine Board [105 * 30 * 7

Product Name:Industrial Style Porch Table Single Layer Light Walnut Color Triamine Board

Product Name:New ListingShellmond Rustic Distressed Metal Accent Cocktail Table with Lift Top 20" Gray

Product Name:Silicone Chair Leg Caps Covers Furniture Table Feet Pads Floor Protectors

Product Name:46*46*46cm Single Layer Round HDPE Side Table Black

Product Name:Million Dollar Cube Side Table - coffee side ta
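
The output above shows two quirks that may be worth handling before analysis: the first entry, "Shop on eBay", does not look like a real listing, and one title has a "New Listing" badge fused onto its start. The following is a minimal sketch of a cleanup step, assuming those two patterns are the only ones to handle. The helper name `clean_title` is hypothetical and not part of the original notebook.

```python
def clean_title(raw_title):
    """Return a cleaned title, or None for the placeholder entry."""
    title = raw_title.strip()
    # Assumption: "Shop on eBay" marks a placeholder row, not a product.
    if title == "Shop on eBay":
        return None
    # Assumption: the "New Listing" badge text is prepended without a space.
    if title.startswith("New Listing"):
        title = title[len("New Listing"):].strip()
    return title


sample = [
    "Shop on eBay",
    "New ListingShellmond Rustic Distressed Metal Accent Cocktail Table",
    "46*46*46cm Single Layer Round HDPE Side Table Black",
]
cleaned = [clean_title(t) for t in sample]
print(cleaned)
```

If placeholder rows are dropped, the same rows would need to be dropped from every other list so that the columns stay aligned.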

### 2. Product Condition

This code creates a list called "prod_condition" to store product condition data. It loops through each product in the "products" list and extracts its condition status by searching for a specific HTML tag and class. The condition status is then stripped of any unnecessary spaces and added to the "prod_condition" list. Lastly, it prints the condition status of each product, which can be "Brand New", "Opened Box", "Pre-Owned", or other similar terms.

In [6]:
# Product Condition(Brand New, Opened Box or Pre-Owned)
prod_condition=[]

for prod in products:
    status = prod.find("div", class_="s-item__subtitle").find_next("span", class_="SECONDARY_INFO").text.strip()
    
    # Appending the Product Condition Status in the 'prod_condition' list
    prod_condition.append(status)
    
    # Print the Product Condition Status...
    print(f"Product_Condition:{status}\n")

Product_Condition:Brand New

Product_Condition:Brand New

Product_Condition:Brand New

Product_Condition:Brand New

Product_Condition:Brand New

Product_Condition:Brand New

Product_Condition:Brand New

Product_Condition:Brand New

Product_Condition:Brand New

Product_Condition:Brand New

Product_Condition:Brand New

Product_Condition:Brand New

Product_Condition:Brand New

Product_Condition:Brand New

Product_Condition:Brand New

Product_Condition:Brand New

Product_Condition:Brand New

Product_Condition:Brand New

Product_Condition:Brand New

Product_Condition:Brand New

Product_Condition:Brand New

Product_Condition:Brand New

Product_Condition:Pre-Owned

Product_Condition:Pre-Owned

Product_Condition:Brand New

Product_Condition:Brand New

Product_Condition:Brand New

Product_Condition:Brand New

Product_Condition:Brand New

Product_Condition:Brand New

Product_Condition:Brand New

Product_Condition:Brand New

Product_Condition:Brand New

Product_Condition:Brand New

Product_Condit

### 3. Price

This code finds the price of each product by searching for the "span" tag with the class "s-item__price". It then stores the price text in a list called "price" and prints it.

In [7]:
price = []

for prod in products:
    amt = prod.find("span", class_="s-item__price").text.strip()
    
    # Appending it in the 'price' list
    price.append(amt)
    
    # Print the Price..
    print(f"Price:{amt}\n")

Price:$20.00

Price:$333.00 to $2,300.40

Price:$704.96 to $6,471.47

Price:$105.99 to $199.99

Price:$38.99

Price:$55.15

Price:$55.87

Price:$55.63

Price:$189.19

Price:$2.41 to $12.20

Price:$54.69

Price:$260.00

Price:$284.00

Price:$29.95

Price:$19.99

Price:$3.13 to $4.67

Price:$40.99 to $47.22

Price:$38.69

Price:$49.27

Price:$38.99 to $39.99

Price:$38.47

Price:$319.95

Price:$75.00

Price:$500.00

Price:$55.32

Price:$59.99

Price:$89.00

Price:$59.80

Price:$103.14

Price:$24.99

Price:$39.99

Price:$3.03 to $13.28

Price:$48.99

Price:$54.69

Price:$54.63

Price:$79.00

Price:$55.85

Price:$0.33

Price:$59.80

Price:$247.61

Price:$74.22 to $81.65

Price:$55.36

Price:$129.81

Price:$55.63

Price:$175.00

Price:$1.65 to $53.05

Price:$55.98

Price:$75.00

Price:$57.61

Price:$54.20

Price:$15.00

Price:$23.18

Price:$47.49

Price:$55.12

Price:$107.27

Price:$2.07 to $5.22

Price:$9.72

Price:$1.39 to $49.85

Price:$5.44 to $10.61

Price:$49.98
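
The scraped prices are strings, and some are ranges such as "$333.00 to $2,300.40". A minimal sketch of turning them into numbers, assuming every price is in dollars and uses the "$" prefix seen above. The helper name `parse_price_range` is hypothetical.

```python
import re

# Matches amounts such as "$20.00" or "$2,300.40".
_AMOUNT = re.compile(r"\$([\d,]+(?:\.\d+)?)")


def parse_price_range(text):
    """Return (low, high) as floats; high equals low for a single price."""
    amounts = [float(m.replace(",", "")) for m in _AMOUNT.findall(text)]
    if not amounts:
        return None  # no dollar amount found in the text
    return (min(amounts), max(amounts))


print(parse_price_range("$20.00"))
print(parse_price_range("$333.00 to $2,300.40"))
```

Keeping both ends of the range avoids silently picking one price when a listing has several variants.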



### 4. Number of Watchers

* This code creates an empty list called "no_watchers" to store the results. It loops through each product in the "products" list and looks for the watcher count in the "span" tag with class "s-item__dynamic s-item__watchCountTotal".

* If a listing shows no watcher count, the value is set to "Not Available". The value is then appended to the "no_watchers" list and printed.

In [8]:
no_watchers = []

for prod in products:
    watchers = prod.find("span", class_="s-item__dynamic s-item__watchCountTotal")
    
    # If the listing shows no watcher count, store "Not Available" instead
    watch_tags = (watchers.text.strip() if watchers else "Not Available")
    
    # Append the watch_tags in 'no_watchers' list
    no_watchers.append(watch_tags)
    
    # Print the Watch tags...
    print(f"Watcher Count: {watch_tags}\n")

Watcher Count: Not Available

Watcher Count: Not Available

Watcher Count: Not Available

Watcher Count: Not Available

Watcher Count: Not Available

Watcher Count: Not Available

Watcher Count: 49 watchers

Watcher Count: 36 watchers

Watcher Count: Not Available

Watcher Count: 6+ watchers

Watcher Count: 6 watchers

Watcher Count: Not Available

Watcher Count: Not Available

Watcher Count: Not Available

Watcher Count: Not Available

Watcher Count: Not Available

Watcher Count: Not Available

Watcher Count: 18 watchers

Watcher Count: Not Available

Watcher Count: Not Available

Watcher Count: Not Available

Watcher Count: Not Available

Watcher Count: Not Available

Watcher Count: Not Available

Watcher Count: Not Available

Watcher Count: Not Available

Watcher Count: Not Available

Watcher Count: Not Available

Watcher Count: Not Available

Watcher Count: Not Available

Watcher Count: Not Available

Watcher Count: Not Available

Watcher Count: Not Available

Watcher Count: 1 watc
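
The watcher values above mix text such as "49 watchers", "6+ watchers" and "Not Available". A minimal sketch of converting them to integers, assuming the formats seen in the output are the only ones. The helper name `parse_watchers` is hypothetical, and a value like "6+" is treated as its lower bound.

```python
import re


def parse_watchers(text):
    """Return the watcher count as an int, or None when it is missing."""
    # Accepts "49 watchers", "1 watcher" and "6+ watchers".
    match = re.match(r"(\d+)\+?\s+watchers?", text.strip())
    return int(match.group(1)) if match else None


for value in ["49 watchers", "6+ watchers", "Not Available"]:
    print(value, "->", parse_watchers(value))
```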

### 5. Image Links  


In [38]:
# Generating Image Links
img_url = []

for prod in products:
    # Find the image tag inside the image wrapper and read its "src" attribute
    img_src = prod.find("div", class_="s-item__image-wrapper image-treatment").find("img").get("src")
    # Append the image link to the 'img_url' list
    img_url.append(img_src)
    # Print the image source
    print(f"Image Source:{img_src}\n")

Image Source:https://ir.ebaystatic.com/rs/v/fxxj3ttftm5ltcqnto1o4baovyl.png

Image Source:https://i.ebayimg.com/thumbs/images/g/NPcAAOSwHbZjfKCh/s-l225.jpg

Image Source:https://i.ebayimg.com/thumbs/images/g/IroAAOSwhw9gzwqd/s-l225.jpg

Image Source:https://i.ebayimg.com/thumbs/images/g/0usAAOSwePxjWJh5/s-l225.jpg

Image Source:https://i.ebayimg.com/thumbs/images/g/9eEAAOSw8whjLUuH/s-l225.jpg

Image Source:https://i.ebayimg.com/thumbs/images/g/ZUEAAOSwZrtarDcb/s-l225.jpg

Image Source:https://i.ebayimg.com/thumbs/images/g/IiAAAOSwl01kHXRb/s-l225.jpg

Image Source:https://i.ebayimg.com/thumbs/images/g/NaoAAOSw2EtkHWqG/s-l225.jpg

Image Source:https://i.ebayimg.com/thumbs/images/g/6BsAAOSwBzFkJxgx/s-l225.jpg

Image Source:https://i.ebayimg.com/thumbs/images/g/KlcAAOSwawxhytd~/s-l225.jpg

Image Source:https://i.ebayimg.com/thumbs/images/g/87YAAOSwB2NkFXZH/s-l225.jpg

Image Source:https://i.ebayimg.com/thumbs/images/g/hJMAAOSwpfxfmbMW/s-l225.jpg

Image Source:https://i.ebayimg.com/thumbs/i

### 6. Webpage Links

This step collects the link to each product's page and stores it in the 'links' list, which is used in Question 2.

In [19]:
# Generating Webpage Links - It would be used for Question 2.
links=[]

for prod in products:
    
    # Extracting the link and storing it in var 'link'
    link = prod.find("div", class_= "s-item__info clearfix").find("a").get("href")
    
    # Appending the Links in 'links' list.
    links.append(link)
    
    # Printing the Link
    print(f"Link:{link}\n")

Link:https://ebay.com/itm/123456?hash=item28caef0a3a:g:E3kAAOSwlGJiMikD&amdata=enc%3AAQAHAAAAsJoWXGf0hxNZspTmhb8%2FTJCCurAWCHuXJ2Xi3S9cwXL6BX04zSEiVaDMCvsUbApftgXEAHGJU1ZGugZO%2FnW1U7Gb6vgoL%2BmXlqCbLkwoZfF3AUAK8YvJ5B4%2BnhFA7ID4dxpYs4jjExEnN5SR2g1mQe7QtLkmGt%2FZ%2FbH2W62cXPuKbf550ExbnBPO2QJyZTXYCuw5KVkMdFMDuoB4p3FwJKcSPzez5kyQyVjyiIq6PB2q%7Ctkp%3ABlBMULq7kqyXYA

Link:https://www.ebay.com/itm/125624687528?hash=item1d3fd09ba8:g:NPcAAOSwHbZjfKCh&amdata=enc%3AAQAHAAAA4CSWdvDQNNrPCXslFkR%2F2olgcqWz3VDg46IbUuJmGoSdTiOz%2FNQje0nkplS6a4O6W005ZB1%2BJZYKMkkhy5Fs5QDGTHHo2truDwy3SAvdaxJAUF%2FOzzQQJaTwjJ%2FNDQWRcHpzewcJP3RNAxJ4rHGsuGvbZ7ZtZW8VHVdARMiExPjmPwRz9FAuve2nHbiRMH0PlmQbfGgK48AmC4Glv5%2BI%2FsR9%2BDNfaTAi9NEEf0Z7gPxFrCJ51vQq95Pa9Upi9ap0IwIqvvqETdqXUwaKUglQ1G5l4EYjVsJJZ6ammv7uZ%2FPK%7Ctkp%3ABFBMgO33i-dh

Link:https://www.ebay.com/itm/114858552175?hash=item1abe1a6b6f:g:IroAAOSwhw9gzwqd&amdata=enc%3AAQAHAAAA4Ph8L%2F35sYUBwl2c06VCDomTwuJO7TMee3T2YvtwjFnEWCpj9Lye1RuxZo0LYJ1pR%2BVqilcmutHrnyaoWnu
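
The links above carry long tracking query strings, while the item number sits in the path after "/itm/". A minimal sketch using only the standard library, assuming every product link follows that path pattern. The helper names `item_id_from_link` and `strip_query` are hypothetical, and whether eBay serves the shortened URL identically is not verified here.

```python
from urllib.parse import urlsplit


def item_id_from_link(link):
    """Return the numeric item id from an ".../itm/<id>?..." link, or None."""
    parts = urlsplit(link).path.strip("/").split("/")
    # Assumption: the id is the path segment right after "itm".
    if "itm" in parts:
        idx = parts.index("itm")
        if idx + 1 < len(parts) and parts[idx + 1].isdigit():
            return parts[idx + 1]
    return None


def strip_query(link):
    """Drop the query string, keeping scheme, host and path."""
    u = urlsplit(link)
    return f"{u.scheme}://{u.netloc}{u.path}"


example = "https://www.ebay.com/itm/125624687528?hash=item1d3fd09ba8:g:NPcAAOSwHbZjfKCh"
print(item_id_from_link(example))
print(strip_query(example))
```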

### 7. Shipping Cost

In [20]:
# Shipping Cost

shipping_cost=[]

for prod in products:
    # Finding Shipping Cost & if the Shipping Cost is N.A., it will generate N/A
    shipping_tag = prod.find("span", class_="s-item__shipping s-item__logisticsCost")
    # saved the shipping cost in var 'tags'
    tags = (shipping_tag.text.strip() if shipping_tag else "N/A")
    # Appending it in the 'shipping_cost' list
    shipping_cost.append(tags)
    # Print the Shipping Cost...
    print(f"Shipping Cost:{tags}\n")

Shipping Cost:N/A

Shipping Cost:Free shipping

Shipping Cost:Free shipping

Shipping Cost:+$206.01 shipping

Shipping Cost:+$203.74 shipping

Shipping Cost:+$203.98 shipping

Shipping Cost:+$3.99 shipping

Shipping Cost:+$3.99 shipping

Shipping Cost:Shipping not specified

Shipping Cost:Free International Shipping

Shipping Cost:+$2.00 shipping

Shipping Cost:+$1,307.69 shipping

Shipping Cost:+$1,058.69 shipping

Shipping Cost:+$75.03 shipping

Shipping Cost:+$203.46 shipping

Shipping Cost:Free International Shipping

Shipping Cost:+$8.00 shipping

Shipping Cost:Free International Shipping

Shipping Cost:+$202.80 shipping

Shipping Cost:+$203.74 shipping

Shipping Cost:+$216.72 shipping

Shipping Cost:+$90.00 shipping

Shipping Cost:Shipping not specified

Shipping Cost:Shipping not specified

Shipping Cost:+$3.99 shipping

Shipping Cost:+$204.06 shipping

Shipping Cost:+$19.06 shipping

Shipping Cost:Free shipping

Shipping Cost:+$197.53 shipping

Shipping Cost:+$107.25 shipping
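
The shipping values above are free text ("Free shipping", "+$206.01 shipping", "Shipping not specified", "N/A"). A minimal sketch of mapping them to numbers, assuming dollar amounts and the phrasings seen in the output. The helper name `parse_shipping` is hypothetical.

```python
import re


def parse_shipping(text):
    """Return shipping cost as a float, 0.0 for free, None if unknown."""
    cleaned = text.strip().lower()
    if cleaned.startswith("free"):
        return 0.0
    match = re.search(r"\$([\d,]+(?:\.\d+)?)", cleaned)
    if match:
        return float(match.group(1).replace(",", ""))
    return None  # "N/A", "Shipping not specified", or anything unrecognized


for value in ["Free shipping", "+$1,307.69 shipping", "Shipping not specified"]:
    print(value, "->", parse_shipping(value))
```

Returning None for unknown values keeps "not specified" distinct from a genuine cost of zero.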



## Create a DataFrame and Generate the tsv file named 'tableList.tsv' as instructed

In [21]:
# Creating Dataframe..

tableList = pd.DataFrame(product_name, columns=["Product_Name"])
tableList["Product_Condition"] = pd.Series(prod_condition)
tableList["Price"] = pd.Series(price)
tableList["No_Watchers"] = pd.Series(no_watchers)
tableList["Image_URL"] = pd.Series(img_url)
tableList["Webpage_Link"] = pd.Series(links)
tableList["Shipping_Cost"] = pd.Series(shipping_cost)

In [22]:
tableList.head()

| | Product_Name | Product_Condition | Price | No_Watchers | Image_URL | Webpage_Link | Shipping_Cost |
|---|---|---|---|---|---|---|---|
| 0 | Shop on eBay | Brand New | $20.00 | Not Available | https://ir.ebaystatic.com/rs/v/fxxj3ttftm5ltcq... | https://ebay.com/itm/123456?hash=item28caef0a3... | |
| 1 | Malachite Round Top Coffee Table Top Semi Prec... | Brand New | $333.00 to $2,300.40 | Not Available | https://i.ebayimg.com/thumbs/images/g/NPcAAOSw... | https://www.ebay.com/itm/125624687528?hash=ite... | Free shipping |
| 2 | Live edge very old giant size great shape oliv... | Brand New | $704.96 to $6,471.47 | Not Available | https://i.ebayimg.com/thumbs/images/g/IroAAOSw... | https://www.ebay.com/itm/114858552175?hash=ite... | Free shipping |
| 3 | Foldable Indoor Plastic Round Dining Table Por... | Brand New | $105.99 to $199.99 | Not Available | https://i.ebayimg.com/thumbs/images/g/0usAAOSw... | https://www.ebay.com/itm/134177164095?hash=ite... | +$206.01 shipping |
| 4 | Foldable Adjustable Height TV Tray Home Portab... | Brand New | $38.99 | Not Available | https://i.ebayimg.com/thumbs/images/g/9eEAAOSw... | https://www.ebay.com/itm/134246341723?hash=ite... | +$203.74 shipping |


### Exporting the tsv file

In [23]:
# Export the DataFrame as a tab-separated file named 'tableList.tsv'
tableList.to_csv("tableList.tsv", sep="\t", index=False)
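
In a .tsv file the fields are separated by tabs, which matters here because several prices contain commas. The following sketch uses only the standard library `csv` module (not the notebook's pandas call) to show a tab-delimited round trip. The sample rows are illustrative.

```python
import csv
import io

rows = [
    ["Product_Name", "Price"],
    ["Malachite Round Top Coffee Table Top", "$333.00 to $2,300.40"],
]

# Write the rows with a tab as the field delimiter.
buffer = io.StringIO()
csv.writer(buffer, delimiter="\t", lineterminator="\n").writerows(rows)
tsv_text = buffer.getvalue()
print(tsv_text)

# Reading back with the same delimiter recovers the original fields.
read_back = list(csv.reader(io.StringIO(tsv_text), delimiter="\t"))
```

With a tab delimiter the comma inside the price needs no quoting, so the field is written exactly as scraped.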

## For each product, we will follow its link and extract the Product Name, Return Policy, Ships to, Item Location, Shipping Cost, Estimated delivery date, Payment modes available, Price, Starting bid, eBay item number, Condition, Brand, Color, Type and Material

### Product Name

* Now we will use the "links" list created above. It holds the link to each product page, and we loop over those pages to extract the information listed above.

* The key step is to loop over the links and find the tag on each page that holds the desired information.

* The code below extracts the name of each product. For each link, it sends an HTTP request, parses the returned HTML with Beautiful Soup, and reads the product name from the title element. The names are stored in a list called "product_titles" and printed.

In [25]:
# Extract Product Names
product_titles = []

for link in links:
    page_urls = requests.get(link, headers=user_agent)
    page_soup = BeautifulSoup(page_urls.content, features="html.parser")

    # Extracting Product Names...
    title_sections = page_soup.find_all("div", class_="vim x-item-title")
    for section in title_sections:
        # Read the bold text span inside the main title heading
        title = section.find("h1", class_="x-item-title__mainTitle").find("span", class_="ux-textspans ux-textspans--BOLD").text
        product_titles.append(title)
        print(f"Product Name: {title}\n")

Product Name: Malachite Round Top Coffee Table Top Semi Precious Mosaic Art Inlay Home Decor

Product Name: Live edge very old giant size, great shape, olive Acacia wood slab, coffee table

Product Name: Foldable Indoor Plastic Round Dining Table Portable Outdoor Picnic Desk w/Handle

Product Name: Foldable Adjustable Height TV Tray Home Portable Sofa Bed Side Table Laptop Desk

Product Name: End Table Espresso Storage Living Room Half Round Table Moon Drawer Shelf Design

Product Name: Industrial Style Porch Table Single Layer Black Oak Triamine Board [105 * 30 * 7

Product Name: Industrial Style Porch Table Single Layer Light Walnut Color Triamine Board

Product Name: Shellmond Rustic Distressed Metal Accent Cocktail Table with Lift Top 20", Gray

Product Name: Silicone Chair Leg Caps Covers Furniture Table Feet Pads Floor Protectors

Product Name: 46*46*46cm Single Layer Round HDPE Side Table Black

Product Name: Million Dollar Cube Side Table - coffee side table great for home and 
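
Each of the sections that follow requests every link again, so the same pages are downloaded once per field. A minimal sketch of fetching each page once and reusing it, assuming the fetch step is passed in as a function. The names `fetch_all` and `fake_fetch` are hypothetical, and the stand-in fetcher only exists so the sketch runs offline.

```python
import time


def fetch_all(links, fetch, delay_seconds=0.0):
    """Fetch each distinct link once and return {link: page content}."""
    pages = {}
    for link in links:
        if link in pages:
            continue  # already fetched; skip the repeat request
        pages[link] = fetch(link)
        time.sleep(delay_seconds)  # optional pause between requests
    return pages


# Stand-in fetcher; in the notebook this could be something like
# lambda url: requests.get(url, headers=user_agent).content
calls = []


def fake_fetch(url):
    calls.append(url)
    return f"<html>{url}</html>"


pages = fetch_all(["a", "b", "a"], fake_fetch)
print(len(pages), "pages fetched with", len(calls), "requests")
```

Each extraction step could then parse `pages[link]` instead of calling `requests.get` again.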

### Return Policy

The code below extracts the return policy from each product page. The policy has up to three parts, which are stored in refund_policy1, refund_policy2 and refund_policy3.

* First, the code creates three empty lists to hold the values. Then, for each link in the 'links' list, it requests the page and parses its content with BeautifulSoup.

* It then searches for all div tags with class 'vim x-returns-maxview' using BeautifulSoup's 'find_all' method. For each of these, it extracts the first part of the policy and appends it to the 'refund_policy1' list.

* If another cell with class 'ux-table-section__cell' follows the first one, the code extracts the second part and appends it to 'refund_policy2', and likewise for the third part. Because 'find_next' searches the rest of the document rather than only the returns section, a value from a later table can occasionally be picked up, and listings without a second or third part leave the three lists with different lengths.

In [28]:
# Extracting Return Policy
refund_policy1=[]
refund_policy2=[]
refund_policy3=[]

for link in links:
    # Extracting Refund Policy
    page_urls = requests.get(link, headers=user_agent)
    page_soup = BeautifulSoup(page_urls.content, features="html.parser")
    refunds = page_soup.find_all("div", class_="vim x-returns-maxview")
    for refund in refunds:
        refund1 = refund.find("td", class_="ux-table-section__cell").find("span")
        refund_policy1.append(refund1.text.strip())
        print(f"Refund Policy1: {refund1.text.strip()}\n")
        
        
        refund2 = refund1.find_next("td", class_="ux-table-section__cell")
        if refund2 is not None:
            refund2 = refund2.find("span", class_="ux-textspans")
            refund_policy2.append(refund2.text.strip())
            print(f"Refund Policy2: {refund2.text.strip()}\n")
        
            refund3 = refund2.find_next("td", class_="ux-table-section__cell")
            if refund3 is not None:
                refund3 = refund3.find("span", class_="ux-textspans")
                refund_policy3.append(refund3.text.strip())
                print(f"Refund Policy3: {refund3.text.strip()}\n")

Refund Policy1: 30 days

Refund Policy2: Money Back, Replacement

Refund Policy3: Seller pays for return shipping

Refund Policy1: 30 days

Refund Policy2: Money Back, Replacement

Refund Policy3: Seller pays for return shipping

Refund Policy1: 30 days

Refund Policy2: Money Back

Refund Policy3: Buyer pays for return shipping

Refund Policy1: 30 days

Refund Policy2: Money Back

Refund Policy3: Buyer pays for return shipping

Refund Policy1: 30 days

Refund Policy2: Money Back

Refund Policy3: Buyer pays for return shipping

Refund Policy1: Seller does not accept returns

Refund Policy1: Seller does not accept returns

Refund Policy1: 30 days

Refund Policy2: Money Back

Refund Policy3: Buyer pays for return shipping

Refund Policy1: 30 days

Refund Policy2: Buyer pays for return shipping

Refund Policy3: Free shipping

Refund Policy1: Seller does not accept returns

Refund Policy1: 30 days

Refund Policy2: Money Back

Refund Policy3: Buyer pays for return shipping

Refund Policy1: 1
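
Since some listings have only one policy part, the three lists above can end up with different lengths and no longer line up row by row. A minimal sketch of padding each listing to exactly three parts, assuming the parts for one page have already been collected into a list. The helper name `pad_parts` and the sample data are hypothetical.

```python
def pad_parts(parts, size=3, fill=None):
    """Return exactly `size` items, filling missing ones with `fill`."""
    parts = list(parts)[:size]
    return parts + [fill] * (size - len(parts))


# Sample policy parts for two listings, taken from the output above.
pages = [
    ["30 days", "Money Back, Replacement", "Seller pays for return shipping"],
    ["Seller does not accept returns"],
]
rows = [pad_parts(p) for p in pages]

# Transpose the rows back into three equally long columns.
refund_policy1, refund_policy2, refund_policy3 = (list(col) for col in zip(*rows))
print(refund_policy2)
```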

### Ships to:

* This code extracts where each product can be shipped to. For each link, it requests the page with the requests library and parses the HTML response with BeautifulSoup.

* It then looks for the shipping section of the page, reads the shipping locations from it, removes the commas, prints the result, and appends it to a list called ships_to.

In [29]:
# Ships to...
ships_to=[]

for link in links:
    page_urls = requests.get(link, headers=user_agent)
    page_soup = BeautifulSoup(page_urls.content, features="html.parser")
    main = page_soup.find_all("div", class_="vim sh-tab-bdr")
    for ship in main:
        loc = ship.find("div", class_="ux-layout-section-module").find("div", class_="ux-layout-section ux-layout-section--shipping")
        shipto_loc = loc.find_next("span", class_="ux-textspans ux-textspans--SECONDARY").text.strip().replace(",", "")
        print(f"Shipping to: {shipto_loc}\n")
        ships_to.append(shipto_loc)

Shipping to: Worldwide

Shipping to: Worldwide

Shipping to: Afghanistan Albania Algeria Andorra Angola Anguilla Antigua and Barbuda Argentina Armenia Aruba Australia Austria Azerbaijan Republic Bahamas Bahrain Bangladesh Belgium Belize Benin Bermuda Bhutan Bolivia Bosnia and Herzegovina Botswana Brazil Brunei Darussalam Bulgaria Burkina Faso Burundi Cambodia Cameroon Canada Cape Verde Islands Cayman Islands Central African Republic Chad Chile China Colombia Costa Rica Cyprus Czech Republic Côte d'Ivoire (Ivory Coast) Democratic Republic of the Congo Denmark Djibouti Dominican Republic Ecuador Egypt El Salvador Equatorial Guinea Eritrea Estonia Ethiopia Fiji Finland France Gabon Republic Gambia Georgia Germany Ghana Gibraltar Greece Greenland Grenada Guatemala Guinea Guinea-Bissau Guyana Haiti Honduras Hong Kong Hungary Iceland India Indonesia Ireland Israel Italy Jamaica Japan Jordan Kazakhstan Kenya Kiribati Kuwait Kyrgyzstan Laos Latvia Lebanon Lesotho Liberia Liechtenstein Lithuani

### Item Location

* This code extracts the location of each item. It first creates an empty list called item_location and then loops over each link in the links list. Inside the loop, it requests the page with the requests library and parses the HTML with BeautifulSoup.

* It then looks for the element that holds the item location, using find(), extracts the location text, prints it, and appends it to the item_location list.

In [41]:
# Item Location
item_location=[]

for link in links:
    page_urls = requests.get(link, headers=user_agent)
    page_soup = BeautifulSoup(page_urls.content, features="html.parser")
    main = page_soup.find_all("div", class_="vim sh-tab-bdr")
    for ship in main:
        item = ship.find("div", class_="ux-labels-values col-12 ux-labels-values--itemLocation").find("div", class_="ux-labels-values__values col-6")
        for locs in item:
            item_locs = locs.find("span", class_="ux-textspans ux-textspans--BOLD").text.strip()
            print(f"Item Location: {item_locs}\n")
            item_location.append(item_locs)

Item Location: AGRA, UTTAR PRADESH, India

Item Location: AGRA, UTTAR PRADESH, India

Item Location: Houston, TX, United States

Item Location: Houston, TX, United States

Item Location: Englewood, Colorado, United States

Item Location: china, China

Item Location: china, China

Item Location: Rosedale, Maryland, United States

Item Location: Guangzhou, China

Item Location: china, China

Item Location: Miami, Florida, United States

Item Location: Miami, Florida, United States

Item Location: Las Vegas, Nevada, United States

Item Location: El Monte, California, United States

Item Location: putian, China

Item Location: Shenzhen, China

Item Location: XiXianXinQu, null, China

Item Location: Rosedale, Maryland, United States

Item Location: El Monte, California, United States

Item Location: Rosedale, Maryland, United States

Item Location: ANKARA, Turkey

Item Location: Santa Fe, New Mexico, United States

Item Location: Miami, Florida, United States

Item Location: china, China

I
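
The locations above come as comma-separated text with the country last, and one entry contains a literal "null" region. A minimal sketch of splitting out the country, assuming the country is always the final component. The helper name `split_location` is hypothetical.

```python
def split_location(text):
    """Split "City, Region, Country" into (place, country)."""
    parts = [p.strip() for p in text.split(",") if p.strip()]
    # Assumption: a literal "null" stands for a missing region.
    parts = [p for p in parts if p.lower() != "null"]
    if not parts:
        return (None, None)
    country = parts[-1]
    place = ", ".join(parts[:-1]) or None
    return (place, country)


for value in ["Houston, TX, United States", "XiXianXinQu, null, China"]:
    print(split_location(value))
```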

### Payment Types

This code collects the payment methods available for each product. It visits each link in the 'links' list, gets the page content, and looks for the payments section. The "title" attribute of each payment icon in that section gives the name of the payment method.

In [31]:
# Payment Types

payment_types=[]

for link in links:
    page_urls = requests.get(link, headers=user_agent)
    page_soup = BeautifulSoup(page_urls.content, features="html.parser")
    modes_of_pay = page_soup.find_all("div", class_="ux-layout-section ux-layout-section--payments")
    for payment_modes in modes_of_pay:
        payments = payment_modes.find_all("span", role = "img")
        paymentmodes_ = [payment["title"] for payment in payments]
        print("Payment Modes: ", paymentmodes_)
        payment_types.append(paymentmodes_)

Payment Modes:  ['PayPal', 'Apple Pay', 'Google Pay', 'Visa', 'Master Card', 'American Express', 'Discover']
Payment Modes:  ['PayPal', 'Apple Pay', 'Google Pay', 'Visa', 'Master Card', 'American Express', 'Discover']
Payment Modes:  ['PayPal', 'Apple Pay', 'Google Pay', 'Visa', 'Master Card', 'American Express', 'Discover']
Payment Modes:  ['PayPal', 'Apple Pay', 'Google Pay', 'Visa', 'Master Card', 'American Express', 'Discover']
Payment Modes:  ['PayPal', 'Apple Pay', 'Google Pay', 'Visa', 'Master Card', 'American Express', 'Discover']
Payment Modes:  ['PayPal', 'Apple Pay', 'Google Pay', 'Visa', 'Master Card', 'American Express', 'Discover']
Payment Modes:  ['PayPal', 'Apple Pay', 'Google Pay', 'Visa', 'Master Card', 'American Express', 'Discover']
Payment Modes:  ['PayPal', 'Apple Pay', 'Google Pay', 'Visa', 'Master Card', 'American Express', 'Discover']
Payment Modes:  ['PayPal', 'Apple Pay', 'Google Pay', 'Visa', 'Master Card', 'American Express']
Payment Modes:  ['PayPal', 'App
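
Because each listing yields a list of payment methods, a quick tally shows how often each method is offered. A minimal sketch with `collections.Counter`, using two sample lists copied from the output above rather than the full scraped data.

```python
from collections import Counter

# Sample data in the same shape as the notebook's payment_types list.
payment_types = [
    ["PayPal", "Apple Pay", "Google Pay", "Visa", "Master Card", "American Express", "Discover"],
    ["PayPal", "Apple Pay", "Google Pay", "Visa", "Master Card", "American Express"],
]

# Flatten the nested lists and count each payment method.
counts = Counter(mode for modes in payment_types for mode in modes)
print(counts.most_common())
```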

### Shipping Cost

* This code extracts the shipping cost from each product page. For each link, it requests the page and parses it with BeautifulSoup.
* It looks for the shipping table and takes the text of the first "span" in the table's first cell. The extracted cost is printed and appended to a list called "cost_of_shipping".

In [33]:
# Shipping Cost

cost_of_shipping =[]
for link in links:
    page_urls = requests.get(link, headers=user_agent)
    page_soup = BeautifulSoup(page_urls.content, features="html.parser")
    costing = page_soup.find_all("table", class_="ux-table-section ux-table-section--html-table ux-table-section-with-hints--shippingTable")
    for cost in costing:
        # The first cell of the shipping table holds the shipping cost
        transit_cost = cost.find("td", class_="ux-table-section__cell").find("span").text.strip()
        print(f"Shipping Cost: {transit_cost}\n") 
        cost_of_shipping.append(transit_cost)
        

Shipping Cost: Free shipping

Shipping Cost: Free shipping

Shipping Cost: US $14.12

Shipping Cost: US $13.12

Shipping Cost: US $203.98

Shipping Cost: US $3.99

Shipping Cost: US $3.99

Shipping Cost: Free shipping

Shipping Cost: US $2.00

Shipping Cost: US $1,307.69

Shipping Cost: US $1,058.69

Shipping Cost: US $75.03

Shipping Cost: US $49.98

Shipping Cost: Free shipping

Shipping Cost: US $8.00

Shipping Cost: Free shipping

Shipping Cost: US $202.80

Shipping Cost: US $75.16

Shipping Cost: US $216.72

Shipping Cost: US $90.00

Shipping Cost: US $3.99

Shipping Cost: US $13.43

Shipping Cost: US $19.06

Shipping Cost: Free shipping

Shipping Cost: US $197.53

Shipping Cost: US $107.25

Shipping Cost: US $203.76

Shipping Cost: GBP 0.50

Shipping Cost: US $3.99

Shipping Cost: US $3.99

Shipping Cost: Free shipping

Shipping Cost: US $3.99

Shipping Cost: Free shipping

Shipping Cost: US $2.00

Shipping Cost: US $153.06

Shipping Cost: US $3.00

Shipping Cost: US $0.80

Shipp

### Price

* This code finds the price shown on each product page. It goes through the list of links and uses the requests library to get the HTML content of each page.

* Then it uses BeautifulSoup to parse the HTML and find the section of the page that displays the product price. It extracts the price text from that section, prints it, and appends it to a list called "price_of_product".

In [35]:
# Price

price_of_product=[]

for link in links:
    page_urls = requests.get(link, headers=user_agent)
    page_soup = BeautifulSoup(page_urls.content, features="html.parser")
    price_prod = page_soup.find_all("div", class_="x-price-primary")
    for p in price_prod:
        product_price = p.find("span").text.strip()
        print(f"Price: {product_price}\n")
        price_of_product.append(product_price)

Price: US $333.00

Price: US $704.96

Price: US $105.99/ea

Price: US $38.99/ea

Price: US $55.15

Price: US $55.87

Price: US $55.63

Price: US $189.19

Price: GBP 1.95

Price: US $54.69

Price: US $260.00/ea

Price: US $284.00/ea

Price: US $29.95

Price: US $19.99/ea

Price: GBP 2.53

Price: US $40.99

Price: US $38.69

Price: US $49.27

Price: US $38.99/ea

Price: US $38.47

Price: US $319.95/ea

Price: US $75.00

Price: US $500.00

Price: US $55.32

Price: US $59.99/ea

Price: US $89.00

Price: US $59.80

Price: US $103.14

Price: US $24.99/ea

Price: US $39.99

Price: GBP 2.45

Price: US $48.99/ea

Price: US $54.69

Price: US $54.63

Price: US $79.00

Price: US $55.85

Price: US $0.33

Price: US $59.80

Price: GBP 199.99

Price: GBP 59.95

Price: US $55.36

Price: US $129.81

Price: US $55.63

Price: US $175.00

Price: US $1.65

Price: US $55.98

Price: US $75.00

Price: US $57.61

Price: US $54.20

Price: US $15.00

Price: US $23.18/ea

Price: US $47.49

Price: US $55.12

Price:
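
The product-page prices above carry a currency label ("US", "GBP") and sometimes a "/ea" suffix. A minimal sketch of splitting these into parts, assuming only the formats seen in the output. The helper name `parse_detail_price` is hypothetical, and the currency label is kept exactly as the page shows it.

```python
import re

# Currency label, optional "$", amount, optional "/ea" suffix.
_PRICE = re.compile(r"^([A-Z]{2,3})\s*\$?\s*([\d,]+(?:\.\d+)?)(/ea)?$")


def parse_detail_price(text):
    """Return (currency, amount, per_item) or None if the text does not match."""
    match = _PRICE.match(text.strip())
    if not match:
        return None
    currency, amount, per_item = match.groups()
    return (currency, float(amount.replace(",", "")), per_item is not None)


for value in ["US $333.00", "US $105.99/ea", "GBP 1.95"]:
    print(parse_detail_price(value))
```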

### eBay Item Number

* This code finds the eBay item number of each product. It requests each link in the list and uses BeautifulSoup to extract the item number from the page.

* It then appends each item number to a list called ebay_number and prints it.

In [36]:
# Ebay Number

ebay_number=[]

for link in links:
    page_urls = requests.get(link, headers=user_agent)
    page_soup = BeautifulSoup(page_urls.content, features="html.parser")
    number = page_soup.find_all("div", class_="ux-layout-section__textual-display ux-layout-section__textual-display--itemId")
    for ebayno in number:
        ebay_no = ebayno.find("span", class_="ux-textspans ux-textspans--BOLD").text.strip()
        print(f"Ebay Number: {ebay_no}\n")
        ebay_number.append(ebay_no)

Ebay Number: 125624687528

Ebay Number: 125624687528

Ebay Number: 114858552175

Ebay Number: 114858552175

Ebay Number: 134177164095

Ebay Number: 134177164095

Ebay Number: 134246341723

Ebay Number: 134246341723

Ebay Number: 253496191779

Ebay Number: 253496191779

Ebay Number: 295583123463

Ebay Number: 295583123463

Ebay Number: 314481011464

Ebay Number: 314481011464

Ebay Number: 275769401156

Ebay Number: 275769401156

Ebay Number: 265473237961

Ebay Number: 265473237961

Ebay Number: 175655606617

Ebay Number: 175655606617

Ebay Number: 283997113265

Ebay Number: 283997113265

Ebay Number: 284402248308

Ebay Number: 284402248308

Ebay Number: 195178653961

Ebay Number: 195178653961

Ebay Number: 303730033918

Ebay Number: 303730033918

Ebay Number: 403595806095

Ebay Number: 403595806095

Ebay Number: 304710855999

Ebay Number: 304710855999

Ebay Number: 404202395176

Ebay Number: 404202395176

Ebay Number: 285213074253

Ebay Number: 285213074253

Ebay Number: 303729898354

E
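
The output above lists every item number twice, which suggests the selector matches two blocks on each product page. One way to keep the list aligned with `links` is to keep only the first value found per page. The sketch below is a minimal illustration on hypothetical, already-extracted strings; `first_match` and `scraped_per_page` are names made up for this example and are not part of the notebook.

```python
from typing import Iterable, Optional


def first_match(values: Iterable[str], default: Optional[str] = None) -> Optional[str]:
    """Return the first non-empty value scraped from one page, or `default`."""
    for value in values:
        cleaned = value.strip()
        if cleaned:
            return cleaned
    return default


# Hypothetical scrape results: one inner list per product page.
scraped_per_page = [
    ["125624687528", "125624687528"],  # selector matched twice on this page
    ["114858552175", "114858552175"],
    [],                                # page without an item number
]

# Exactly one entry per page, so the list stays as long as the list of links.
ebay_number = [first_match(page, default="Not Available") for page in scraped_per_page]
print(ebay_number)
```

In the scraping loop itself, the same idea amounts to appending one value per `link` rather than one per matched block.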

## Condition, Brand & Color

* This code extracts the condition description, the brand, and the color of each product.
* It uses the requests and BeautifulSoup libraries to scrape each product page. The code first finds the section of the page that contains the product details, then extracts the condition description, brand, and color using the `find` method.

* If the color is not available, it appends "Not Available" to the `color` list. Finally, the extracted values are printed for each link and appended to the corresponding lists.

In [37]:
# Product Condition, Brand and Color

product_desc=[]
brand=[]
color = []

for link in links:
    page_urls = requests.get(link, headers=user_agent)
    page_soup = BeautifulSoup(page_urls.content, features="html.parser")
    condition = page_soup.find_all("div", class_="tab-content-m")
    for product_detail in condition:
        desc = product_detail.find("div", class_="ux-labels-values__values-content").find("span").text.strip().replace(",", "")
        print(f"Product Desc: {desc}\n")
        product_desc.append(desc)
        # Brand
        prod_brand = product_detail.find("span", itemprop = "brand").find("span", class_="ux-textspans").text.strip()    
        print(f"Brand: {prod_brand}\n")
        brand.append(prod_brand)
        
        # Color
        prod_color = product_detail.find("span", itemprop = "color")
        if prod_color is not None:
            color_prod = prod_color.find("span", class_="ux-textspans").text.strip()
            print(f"Color: {color_prod}\n")
            color.append(color_prod)
        else:
            print("Not Available")
            color.append("Not Available")


Product Desc: New: A brand-new unused unopened undamaged item in its original packaging (where packaging is ...  Read moreabout the condition

Brand: Agra Heritage Marble Crafts

Color: Green

Product Desc: New: A brand-new unused unopened undamaged item in its original packaging (where packaging is ...  Read moreabout the condition

Brand: Tabletopsemporium

Color: As Photo

Product Desc: New: A brand-new unused unopened undamaged item in its original packaging (where packaging is ...  Read moreabout the condition

Brand: MoNiBloom

Color: White

Product Desc: New: A brand-new unused unopened undamaged item in its original packaging (where packaging is ...  Read moreabout the condition

Brand: MoNiBloom

Not Available
Product Desc: New: A brand-new unused unopened undamaged item in its original packaging (where packaging is ...  Read moreabout the condition

Brand: MegaHome

Color: Dark Brown

Product Desc: New: A brand-new unused unopened undamaged item in its original packaging (wher
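
The scraped condition text is a long sentence that begins with a short label ("New: ..."), while the objective only asks for the condition category. A possible post-processing step is to keep the part before the first colon. This is a minimal sketch; `condition_label` is a hypothetical helper, and the second sample string is invented for illustration.

```python
def condition_label(desc: str) -> str:
    """Return the text before the first colon, or the whole string if there is none."""
    label, _, _ = desc.partition(":")
    return label.strip()


samples = [
    "New: A brand-new unused unopened undamaged item",  # prefix of the scraped text above
    "Pre-owned",                                         # invented value without a colon
]

labels = [condition_label(text) for text in samples]
print(labels)
```

Applied to `product_desc`, this would give a compact column that is easier to group or filter than the full description.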

## Create a DataFrame and Generate the tsv file named 'tableDetails.tsv' as instructed

In [42]:
# Generate a DataFrame

tableDetails = pd.DataFrame(product_titles, columns=["Product_Name"])
tableDetails["Return_Policy_1"] = pd.Series(refund_policy1)
tableDetails["Return_Policy_2"] = pd.Series(refund_policy2)
tableDetails["Return_Policy_3"] = pd.Series(refund_policy3)
tableDetails["Ships_To"] = pd.Series(ships_to)
tableDetails["Item_Location"] = pd.Series(item_location)
tableDetails["Shipping_Cost"] = pd.Series(cost_of_shipping)
tableDetails["Payment_Modes"] = pd.Series(payment_types)
tableDetails["Price"] = pd.Series(price_of_product)

tableDetails["eBay_Item_Number"] = pd.Series(ebay_number)
tableDetails["Condition"] = pd.Series(product_desc)
tableDetails["Brand"] = pd.Series(brand)
tableDetails["Color"] = pd.Series(color)
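
Because each column is assigned as a `pd.Series`, pandas aligns it on the row index: a list shorter than the frame is padded with `NaN`, and a longer one is silently cut off. A quick length check before building the frame makes such mismatches visible. The sketch below uses plain Python and hypothetical shortened lists; `length_mismatches` is not part of the notebook.

```python
from typing import Dict, List


def length_mismatches(columns: Dict[str, List[str]], expected: int) -> Dict[str, int]:
    """Return the columns whose length differs from `expected`, with their lengths."""
    return {
        name: len(values)
        for name, values in columns.items()
        if len(values) != expected
    }


# Hypothetical stand-ins for the scraped lists.
columns = {
    "Product_Name": ["table A", "table B", "table C"],
    "Price": ["US $333.00", "US $704.96"],               # one value missing
    "eBay_Item_Number": ["1", "1", "2", "2", "3", "3"],  # every value duplicated
}

expected_rows = len(columns["Product_Name"])
problems = length_mismatches(columns, expected_rows)
print(problems)  # {'Price': 2, 'eBay_Item_Number': 6}
```

A non-empty result means at least one column would be shifted, padded, or truncated relative to `Product_Name`.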

In [43]:
# Preview the Dataset Created

tableDetails.head()


|  | Product_Name | Return_Policy_1 | Return_Policy_2 | Return_Policy_3 | Ships_To | Item_Location | Shipping_Cost | Payment_Modes | Price | eBay_Item_Number | Condition | Brand | Color |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Malachite Round Top Coffee Table Top Semi Prec... | 30 days | Money Back, Replacement | Seller pays for return shipping | Worldwide | AGRA, UTTAR PRADESH, India | Free shipping | [PayPal, Apple Pay, Google Pay, Visa, Master C... | US $333.00 | 125624687528 | New: A brand-new unused unopened undamaged ite... | Agra Heritage Marble Crafts | Green |
| 1 | Live edge very old giant size, great shape, ol... | 30 days | Money Back, Replacement | Seller pays for return shipping | Worldwide | AGRA, UTTAR PRADESH, India | Free shipping | [PayPal, Apple Pay, Google Pay, Visa, Master C... | US $704.96 | 125624687528 | New: A brand-new unused unopened undamaged ite... | Tabletopsemporium | As Photo |
| 2 | Foldable Indoor Plastic Round Dining Table Por... | 30 days | Money Back | Buyer pays for return shipping | Afghanistan Albania Algeria Andorra Angola Ang... | Houston, TX, United States | US $14.12 | [PayPal, Apple Pay, Google Pay, Visa, Master C... | US $105.99/ea | 114858552175 | New: A brand-new unused unopened undamaged ite... | MoNiBloom | White |
| 3 | Foldable Adjustable Height TV Tray Home Portab... | 30 days | Money Back | Buyer pays for return shipping | Afghanistan Albania Algeria Andorra Angola Ang... | Houston, TX, United States | US $13.12 | [PayPal, Apple Pay, Google Pay, Visa, Master C... | US $38.99/ea | 114858552175 | New: A brand-new unused unopened undamaged ite... | MoNiBloom | Not Available |
| 4 | End Table Espresso Storage Living Room Half Ro... | 30 days | Money Back | Buyer pays for return shipping | Afghanistan Albania Algeria Andorra Angola Ang... | Englewood, Colorado, United States | US $203.98 | [PayPal, Apple Pay, Google Pay, Visa, Master C... | US $55.15 | 134177164095 | New: A brand-new unused unopened undamaged ite... | MegaHome | Dark Brown |


In [45]:
# Export the data as a tab-separated (.tsv) file
tableDetails.to_csv("tableDetails.tsv", sep="\t", index=False)
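
A file is only tab-separated if a tab is used as the delimiter; pandas `to_csv` writes commas unless `sep="\t"` is passed. Tabs also suit this dataset because several fields contain commas (for example the item location). The sketch below illustrates the round trip with the standard `csv` module on a single sample row taken from the preview above; the in-memory buffer stands in for the real file and is an assumption made for this example.

```python
import csv
import io

# One sample row; the location field contains commas.
rows = [{"Item_Location": "AGRA, UTTAR PRADESH, India", "Price": "US $333.00"}]

# Write tab-separated text into an in-memory buffer instead of a file on disk.
buffer = io.StringIO()
writer = csv.DictWriter(
    buffer,
    fieldnames=["Item_Location", "Price"],
    delimiter="\t",
    lineterminator="\n",
)
writer.writeheader()
writer.writerows(rows)
tsv_text = buffer.getvalue()

# Read it back with the same delimiter to confirm the fields survive intact.
buffer.seek(0)
read_back = list(csv.DictReader(buffer, delimiter="\t"))
print(read_back[0]["Item_Location"])
```

Reading the exported file back with the same delimiter is a cheap way to confirm that comma-containing fields were not split.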

