# In this file we scrape our required data off Etsy

We will use beautifulsoup4 to get our data: https://pypi.org/project/beautifulsoup4/

Finally, we will export all acquired data to a .csv for further processing.

**NOTE I:** Because we are building an HTML parser, the code below may stop working whenever Etsy changes its website. If it does not work for your purposes, check whether the HTML on the site has changed and adjust the selectors accordingly.

**NOTE II:** Etsy offers an API for developers who build shop management apps. It could likely complete these actions more effectively and would be easier to maintain over time. I did register for API access, but Etsy had not activated my access key in time for this project. If you have a similar project in mind, sign up for API access as early as possible!

#### Libraries used:

In [4]:
import pandas as pd
import csv
import requests
from bs4 import BeautifulSoup
from random import randint
from time import sleep
import tqdm
import re

#### Method:
Two-step approach

- 1st step: 
    Parse the search results page by page to get listing_name, listing_price, listing_url, listing_image 

- 2nd step: 
    Use listing_url to visit each listing and gather further info

**URL:**
https://www.etsy.com/search?q=flower+girl+dress&explicit=1&order=highest_reviews&page=1&ref=pagination
- Note the `page=1` parameter. Change the 1 to any number to jump to that results page. The limit is page 250!
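
As a minimal sketch of the pagination described above, the page URLs can be generated from the query parameters with the standard library. The helper name `search_page_url` is made up for illustration. The parameters are the ones visible in the URL above, and the 250-page cap is the limit observed in this notebook, not a documented Etsy constant.

```python
from urllib.parse import urlencode

BASE_URL = "https://www.etsy.com/search"
MAX_PAGE = 250  # highest results page reachable during this project (observed, not documented)


def search_page_url(query: str, page: int) -> str:
    """Build the search URL for one results page (hypothetical helper)."""
    if not 1 <= page <= MAX_PAGE:
        raise ValueError(f"page must be between 1 and {MAX_PAGE}, got {page}")
    params = {
        "q": query,
        "explicit": 1,
        "order": "highest_reviews",
        "page": page,
        "ref": "pagination",
    }
    # urlencode escapes spaces as "+", matching the URL format shown above
    return f"{BASE_URL}?{urlencode(params)}"


print(search_page_url("flower girl dress", 1))
```

Keeping the parameters in a dict makes it easy to swap the query, for example for a second search term, without editing a long URL string by hand.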

**Observations:**

Interestingly, Etsy reports 60,722 listing results (with ads) for our search query "flower girl dress". With the default results page showing a 16x4 grid of listings (64 per page), this should amount to ca. 949 results pages! However, we cannot go beyond page 250, which imposes a limit of 16,000 listings. That said, I noticed the relevance of the search results diminishing on the later pages (an increasing number of listings we would need to filter out anyway), so this limit is not much of an issue.
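
The page arithmetic above can be checked quickly. This is only a worked calculation using the numbers quoted in this section, and the variable names are illustrative.

```python
import math

total_results = 60_722      # results Etsy reported for "flower girl dress"
listings_per_page = 16 * 4  # default results grid: 64 listings per page
max_page = 250              # last results page that can be opened

expected_pages = math.ceil(total_results / listings_per_page)
reachable_listings = max_page * listings_per_page
share_reachable = reachable_listings / total_results

print(expected_pages)            # 949
print(reachable_listings)        # 16000
print(f"{share_reachable:.0%}")  # 26%
```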

There are already some items in there that should not match "flower girl dress", such as "Dog wedding dress". There is also quite a noticeable difference between my personal search results on the page and the ones pulled by the scraper.

Ideas to increase dataset from ca. 16k (before cleanup) listings:

We may run several query operations, not only for "flower girl dress", but also "christening dress" which is another item category our client offers. 

During the second step I plan to collect shop owner name and url, so we may use the shop url to do further scraping operations shop by shop.

This would surely result in lots of duplicates we will have to clean, but ultimately may result in a more complete picture of the leading shops' offerings.
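
One way to handle those duplicates, sketched under the assumption that the numeric ID in a listing URL (the part after `/listing/`) identifies a listing uniquely, is to keep only the first URL seen for each ID. The helper names are hypothetical, and the example URLs are shortened placeholders.

```python
import re

# matches the numeric ID in URLs of the form https://www.etsy.com/listing/<id>/<slug>
LISTING_ID = re.compile(r"/listing/(\d+)")


def listing_id(url):
    """Return the listing ID found in the URL, or None if there is none."""
    match = LISTING_ID.search(url)
    return match.group(1) if match else None


def drop_duplicate_listings(urls):
    """Keep the first URL per listing ID, preserving the original order."""
    seen = set()
    unique = []
    for url in urls:
        key = listing_id(url) or url  # fall back to the full URL if no ID is found
        if key not in seen:
            seen.add(key)
            unique.append(url)
    return unique


urls = [
    "https://www.etsy.com/listing/111/white-dress?ref=search_in_grid-1-1",
    "https://www.etsy.com/listing/222/blue-dress",
    "https://www.etsy.com/listing/111/white-dress?ref=shop_home",
]
print(drop_duplicate_listings(urls))  # the third URL is dropped
```

Comparing IDs instead of full URLs matters because the same listing can appear with different tracking parameters in its query string.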

#### Step 1 Data Collection

In [17]:
# empty scraping lists to append to:
listing_name = []
listing_url = []
listing_image = []

# link that will iterate by page to url in loop:
link = "https://www.etsy.com/search/clothing/girls-clothing?explicit=1&q=flower+girl+dress&ship_to=DE&order=highest_reviews&page={}&ref=pagination"

# key that will iterate listing by listing in nested loop:
key = "#content > div > div.wt-bg-white.wt-grid__item-md-12.wt-pl-xs-1.wt-pr-xs-0.wt-pr-md-1.wt-pl-lg-0.wt-pr-lg-0.wt-bb-xs-1 > div > div.wt-mt-xs-3.wt-text-black > div.wt-grid.wt-pl-xs-0.wt-pr-xs-0.search-listings-group > div:nth-child(3) > div.wt-bg-white.wt-display-block.wt-pb-xs-2.wt-mt-xs-0 > div:nth-child(1) > div > div > ul > li:nth-child({})"

# cycle through search results until page 250 and scrape soup:
for i in tqdm.tqdm(range(1,251)): # tqdm code for progress bar
    url = link.format(i)
    response = requests.get(url)
    soup = BeautifulSoup(response.text,"html.parser")
    
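    # nested loop to scrape listing info; only the first 8 <li> items matched by
    # `key` are read per page, which gives 8 x 250 = 2,000 rows in total:
    for o in range(1,9):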
        x = key.format(o)
        # items to scrape:
        title = soup.select(x)[0].a["title"]
        href = soup.select(x)[0].a["href"]
        src = soup.select(x)[0].a.img["src"]
        # append results to scraping lists:
        listing_name.append(title)
        listing_url.append(href)
        listing_image.append(src)
    
    # random sleep duration between page cycles:
    wait_time = randint(1,4000)
    sleep(wait_time/1000)

100%|█████████████████████████████████████████| 250/250 [15:54<00:00,  3.82s/it]
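
The loop above calls `requests.get` once per page, so a single failed request ends the whole run with an exception. A possible hardening, sketched here and not part of the original run, is to retry a failed fetch a few times with a growing, randomised pause. `fetch_with_retry` and its parameters are hypothetical; `fetch` stands for any callable, such as `lambda: requests.get(url, timeout=10)`.

```python
import random
import time


def fetch_with_retry(fetch, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call fetch() until it succeeds or `attempts` tries are used up.

    After the n-th failure the pause is base_delay * 2**(n-1) seconds plus
    up to one second of random jitter. The last error is re-raised.
    """
    for attempt in range(1, attempts + 1):
        try:
            return fetch()
        except Exception:
            if attempt == attempts:
                raise
            sleep(base_delay * 2 ** (attempt - 1) + random.random())


# usage with a stand-in fetcher that fails twice before succeeding
calls = {"n": 0}


def flaky_fetch():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("temporary failure")
    return "<html>ok</html>"


pauses = []
html = fetch_with_retry(flaky_fetch, sleep=pauses.append)  # record pauses instead of sleeping
print(html, len(pauses))
```

Passing `sleep` as a parameter is only a convenience for trying the helper without waiting; in the scraper the default `time.sleep` would be used.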


**Step 1 Data converted to Pandas DF**

In [18]:
data_tuples = list(zip(listing_name,listing_url,listing_image))
step1_data = pd.DataFrame(data_tuples, columns=['listing_name','listing_url','listing_image'])

In [19]:
step1_data.shape

(2000, 3)

In [20]:
step1_data.to_csv('step1_data.csv',index=False)

In [5]:
step1_data = pd.read_csv('step1_data.csv')

In [None]:
test.loc[test['shop_name'] == "PetiteLuluStudio"]

#### Step 2 Data Collection

In [36]:
test_small=step1_data.sample(10)

In [None]:
test_small

In [38]:
step2(test_small)

100%|███████████████████████████████████████████| 10/10 [00:51<00:00,  5.18s/it]


In [39]:
col_dict = {"shop_name":shop_name,
            "start_price":start_price,
            "size_options":size_options,
            "ship_cost_germany":ship_cost_germany,
            "allows_returns":allows_returns,
            "shop_location":shop_location,
            "shop_rating":shop_rating,
            "shop_review_counts":shop_review_counts,
            "item_review_counts":item_review_counts,
            "customer_item_reviews":customer_item_reviews}

for col,val in col_dict.items():
    test_small[col] = val

In [40]:
test_small

Unnamed: 0,listing_name,listing_url,listing_image,shop_name,start_price,size_options,ship_cost_germany,allows_returns,shop_location,shop_rating,shop_review_counts,item_review_counts,customer_item_reviews
1655,Lemons on Blue Gingham Dress. Baby Dress. Litt...,https://www.etsy.com/listing/1036672238/lemons...,https://i.etsystatic.com/23243567/c/2400/1907/...,PinsAndNeedlesKRipa,27.47,11,0,0,"Wallingford, CT",5.0,416.0,0,N.A.
550,Vintage style dress baby girl coming home outf...,https://www.etsy.com/listing/667752702/vintage...,https://i.etsystatic.com/16079692/r/il/484767/...,KidcycleCo,86.17,17,0,1,"Patchogue, NY",4.5,139.0,0,N.A.
921,"Black flower girl dress,Lace flower girl dress...",https://www.etsy.com/listing/1134445298/black-...,https://i.etsystatic.com/21818448/c/1526/1213/...,BoutiqueDeCharme,140.49,0,0,0,Ukraine,5.0,1.556,0,N.A.
1195,"Tulle flower girl dress, Christmas eve girl dr...",https://www.etsy.com/listing/927244695/tulle-f...,https://i.etsystatic.com/21512279/r/il/50490e/...,ColibriFashionStudio,137.12,0,0,0,Ukraine,5.0,1.2,18,[Ahhh it came out even better than expected!!!...
965,Rose gold flower girl dress Sequin flower girl...,https://www.etsy.com/listing/601071275/rose-go...,https://i.etsystatic.com/16838036/r/il/ecd79e/...,classygown,56.2,22,0,1,Thailand,5.0,212.0,0,N.A.
1815,"Dress for Flower Girl, Junior Bridesmaid, Whit...",https://www.etsy.com/listing/726457079/dress-f...,https://i.etsystatic.com/10608562/c/540/429/0/...,OliveLaneWeddings,101.25,11,0,1,United Kingdom,5.0,835.0,0,N.A.
296,"Flower girl dress, Lace girl dress, Birthday d...",https://www.etsy.com/listing/1217880492/flower...,https://i.etsystatic.com/24901926/c/2000/1589/...,GavrylivDress,113.34,14,0,1,Poland,4.5,257.0,0,N.A.
1917,Blue velvet dress navy blue flower girl dress ...,https://www.etsy.com/listing/774378662/blue-ve...,https://i.etsystatic.com/18637524/c/3000/2381/...,Tangerinegift,69.93,25,0,1,Ukraine,5.0,932.0,0,N.A.
243,Flower girls dress girls special occasion whit...,https://www.etsy.com/listing/1177770034/flower...,https://i.etsystatic.com/32788823/r/il/f49f6b/...,AngelvalleyCreations,52.07,10,0,0,United Kingdom,4.5,71.0,0,N.A.
1936,"White dress, Flower Girl Dress , Wedding Dress...",https://www.etsy.com/listing/1178826776/white-...,https://i.etsystatic.com/23982628/c/2250/1786/...,ByJacquelynNelson,84.92,10,0,1,"Cheraw, SC",5.0,127.0,0,N.A.


In [15]:
def step2(df, default=0):
    
    # define empty value lists:
    global shop_name
    shop_name = []
    global start_price
    start_price = []
    global size_options
    size_options = []
    global ship_cost_germany
    ship_cost_germany = []
    global allows_returns
    allows_returns = []
    global shop_location
    shop_location = []
    global shop_rating
    shop_rating = []
    global shop_review_counts
    shop_review_counts = []
    global item_review_counts
    item_review_counts = []
    global customer_item_reviews
    customer_item_reviews = []
  
    # turn listing_url into link list for iterative loop:
    link = list(df["listing_url"])
    
    # define a regular expression pattern for the DE currency format with decimal "," separator
    currency = re.compile(r'[-+]? (?: (?: \d* \, \d+ ) | (?: \d+ \,? ) )(?: [Ee] [+-]? \d+ ) ?', re.VERBOSE)
   
    # english url soup
    for url in tqdm.tqdm(link): # tqdm code for progress bar
        response = requests.get(url)
        soup = BeautifulSoup(response.text,"html.parser")
       
    
        # extract shop name, set up failsafe in case listing has been removed
        try:
            name = soup.select("#listing-page-cart > div:nth-child(1) > div > div.wt-display-flex-xs.wt-align-items-center.wt-mb-xs-1 > p")[0].get_text()
            name = name[31:]
            name = name[:-27]
            shop_name.append(name)

            
            # extract size options count for this item
            try:
                sizes = len(soup.select("#listing-page-cart > div.wt-mb-xs-6.wt-mb-lg-0 > div:nth-child(1) > div.wt-mb-xs-3 > div:nth-child(4) > div")[0].select("option")[1:])
            except IndexError:
                sizes = len(soup.select("#listing-page-cart > div.wt-mb-xs-6.wt-mb-lg-0 > div:nth-child(1) > div.wt-mb-xs-3")[0].select("option")[1:])
            size_options.append(sizes)

            
            # extract shop location
            location = soup.select("#shipping-variant-div > div > div.wt-grid.wt-mb-xs-3 > div.wt-grid__item-xs-12.wt-text-black.wt-text-caption")[0].get_text()
            location = location[16:]
            location = location[:-1]
            shop_location.append(location)

            
            # extract returns allowed yes/no
            ret = soup.select("#shipping-variant-div > div > div.wt-grid.wt-mb-xs-3")[0].get_text()
            if ret.find("Accepted") != -1:
                ret = 1
            else:
                ret = 0
            allows_returns.append(ret)

            
            # extract rating and review counts 
            try:
                shop_rev = int(soup.select("#listing-right-column > div > div.body-wrap.wt-body-max-width.wt-display-flex-md.wt-flex-direction-column-xs > div.listing-info.review-col.wt-order-xs-6 > div > div > div:nth-child(2)")[0].get_text()[19:].split(" ")[0].replace(",",""))
                shop_rt = soup.select("#listing-right-column > div > div.body-wrap.wt-body-max-width.wt-display-flex-md.wt-flex-direction-column-xs > div.listing-info.review-col.wt-order-xs-6 > div > div > div:nth-child(2)")[0].text.split("reviews\n        \n\n\n\n")[1]
                shop_rt = int(shop_rt.split(" ")[0].replace(",",""))
                shop_review_counts.append(shop_rev)
                shop_rating.append(shop_rt)
                try:
                    item_rev = soup.select("#listing-right-column > div > div.body-wrap.wt-body-max-width.wt-display-flex-md.wt-flex-direction-column-xs > div.listing-info.review-col.wt-order-xs-6 > div > div > div:nth-child(2)")[0].text.split("Reviews for this item\n                \n                    ")[1]
                    item_rev = int(item_rev.split("\n")[0].replace(",",""))
                    item_review_counts.append(item_rev)
                except IndexError:
                    item_rev = 0
                    item_review_counts.append(item_rev)
            except IndexError:
                shop_rev = 0
                shop_rt = 0
                item_rev = 0
                shop_review_counts.append(shop_rev)
                shop_rating.append(shop_rt)
                item_review_counts.append(item_rev)

                
            # extract customer reviews per item
            rev_list = []
            if soup.select("#listing-right-column > div > div.body-wrap.wt-body-max-width.wt-display-flex-md.wt-flex-direction-column-xs > div.listing-info.review-col.wt-order-xs-6 > div > div > div:nth-child(2)")[0].text.find("Reviews for this item") != -1:
                rev_key = "#review-preview-toggle-{}"
                for iterate in range(0,4):
                    key = rev_key.format(iterate)
                    try:
                        rev = soup.select(key)[0].text.split("\n                    ")[1]
                        rev = rev.split(" \n\n    ")[0]
                        rev_list.append(rev)
                    except IndexError:
                        rev_list.append("N.A.") 
                customer_item_reviews.append(rev_list)
            else:
                customer_item_reviews.append("N.A.")

                
            # create DE url variant to scrape prices in EUR instead of USD
            url_de = url.replace("https://www.etsy.com/listing/","https://www.etsy.com/de/listing/")
            response_de = requests.get(url_de)
            soup_de = BeautifulSoup(response_de.text,"html.parser")     

            
            # extract starting price quote from soup and applying proper decimal "." separator
            p = currency.findall(soup_de.select("#listing-page-cart > div.wt-mb-xs-6.wt-mb-lg-0 > div:nth-child(1) > div.wt-mb-xs-3 > div.wt-mb-xs-3 > div:nth-child(1) > div > div.wt-display-flex-xs.wt-align-items-center.wt-flex-wrap > p")[0].get_text())[0]
            p = float(p.replace(",","."))
            start_price.append(p)

            
            # extract shipping to Germany cost quote
            sc = soup_de.select("#shipping-variant-div > div > div.wt-grid.wt-mb-xs-3")[0].text.split('\nVersandkosten\n')[1]
            sc = sc.split('\n\n\n\n\n\n')[0]
            if sc == 'Kostenlos':
                ship_cost = 0           
            else:
                sc = sc[:-2]
                ship_cost = float(sc.replace(",","."))
            ship_cost_germany.append(ship_cost)
        
        
        # failsafe if listing has been removed
        except IndexError:
            shop_name.append("skipped")
            size_options.append("skipped")
            shop_location.append("skipped")
            allows_returns.append("skipped")
            shop_review_counts.append(0)
            shop_rating.append(0)
            item_review_counts.append(0)
            customer_item_reviews.append("skipped")
            start_price.append(0)
            ship_cost_germany.append(0)
        
        
        # wait time between cycles
        wait_time = randint(1,3000)
        sleep(wait_time/1000)
        
    return
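
Both the price and the shipping cost are parsed from German-formatted text, where "," is the decimal separator. As a small sketch, this conversion could live in one helper that also tolerates "." as a thousands separator. `parse_de_number` is a hypothetical name, and the sample strings are made-up examples of the format, not scraped values.

```python
import re

# digits with optional "." thousands groups, optionally followed by ",<decimals>"
DE_NUMBER = re.compile(r"\d{1,3}(?:\.\d{3})+(?:,\d+)?|\d+(?:,\d+)?")


def parse_de_number(text):
    """Return the first German-formatted number in `text` as a float, or None."""
    match = DE_NUMBER.search(text)
    if match is None:
        return None
    # drop thousands separators, then switch the decimal comma to a point
    return float(match.group(0).replace(".", "").replace(",", "."))


print(parse_de_number("83,36 €"))        # 83.36
print(parse_de_number("ab 1.234,50 €"))  # 1234.5
print(parse_de_number("Kostenlos"))      # None
```

Returning `None` when no number is present keeps a free-shipping label distinguishable from a parsed price of 0; the caller decides how to map it.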

In [16]:
step2(step1_data)

  8%|███▏                                    | 159/2000 [04:35<53:07,  1.73s/it]


KeyboardInterrupt: 
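
The run above was interrupted after 159 of 2,000 listings, with the progress bar estimating close to an hour for the full pass. It may help to make the loop resumable. The sketch below is one possible approach, not what this notebook does: results are written to a JSON file every few items, and a rerun skips URLs that are already stored. `scrape_resumable`, `scrape_one`, and the checkpoint file name are all hypothetical.

```python
import json
import os
import tempfile


def scrape_resumable(urls, scrape_one, checkpoint_path, save_every=25):
    """Scrape each URL once, persisting results so an interrupted run can resume."""
    results = {}
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path, encoding="utf-8") as fh:
            results = json.load(fh)  # maps url -> scraped record

    def save():
        with open(checkpoint_path, "w", encoding="utf-8") as fh:
            json.dump(results, fh)

    try:
        for url in urls:
            if url in results:
                continue  # already scraped in an earlier run
            results[url] = scrape_one(url)
            if len(results) % save_every == 0:
                save()
    finally:
        save()  # also runs on KeyboardInterrupt, so finished items are kept
    return results


# usage with a stand-in scraper; a real one would fetch and parse the page
path = os.path.join(tempfile.mkdtemp(), "step2_checkpoint.json")
first = scrape_resumable(["a", "b"], lambda u: {"shop_name": u.upper()}, path)
second = scrape_resumable(["a", "b", "c"], lambda u: {"shop_name": "new"}, path)
print(second)
```

In the second call only "c" is scraped again; "a" and "b" come from the checkpoint file.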

In [14]:
step1_data["listing_url"].iloc[123]

'https://www.etsy.com/listing/1207013068/ruffle-flower-girl-dress-linen?click_key=83101a73d7fae19fdd3b9d1dca80cd997c64d190%3A1207013068&click_sum=db6bbeb1&ga_order=highest_reviews&ga_search_type=all&ga_view_type=gallery&ga_search_query=flower+girl+dress&ref=search_in_grid-16-4'

In [42]:
col_dict = {"shop_name":shop_name,
            "start_price":start_price,
            "size_options":size_options,
            "ship_cost_germany":ship_cost_germany,
            "allows_returns":allows_returns,
            "shop_location":shop_location,
            "shop_rating":shop_rating,
            "shop_review_counts":shop_review_counts,
            "item_review_counts":item_review_counts,
            "customer_item_reviews":customer_item_reviews}

for col,val in col_dict.items():
    step1_data[col] = val
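
Assigning the lists as columns only works if every list has exactly one entry per row. In the `step2` function above, an `IndexError` raised partway through a listing leaves some lists already extended before the fallback branch appends to all of them, so the lengths can drift apart. A quick check before the assignment, sketched here with a hypothetical helper name, makes such a mismatch visible.

```python
def mismatched_columns(columns, n_rows):
    """Return {column name: length} for every list whose length is not n_rows."""
    return {name: len(values) for name, values in columns.items() if len(values) != n_rows}


# made-up example: three rows expected, but one list received an extra entry
columns = {
    "shop_name": ["A", "B", "skipped", "C"],
    "start_price": [10.0, 0, 12.5],
    "allows_returns": [1, 0, 1],
}
bad = mismatched_columns(columns, n_rows=3)
print(bad)  # {'shop_name': 4}
```

With the notebook's variables this would be called as `mismatched_columns(col_dict, len(step1_data))`; an empty dict means the columns line up.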

In [44]:
step1_data.head()

Unnamed: 0,listing_name,listing_url,listing_image,shop_name,start_price,size_options,ship_cost_germany,allows_returns,shop_location,shop_rating,shop_review_counts,item_review_counts,customer_item_reviews
0,"White flower girl dress, Lace flower girl dres...",https://www.etsy.com/listing/933095128/white-f...,https://i.etsystatic.com/23909643/c/1721/1367/...,BeverlyCoStore,83.36,0,0,1,Ukraine,5,84.0,0,N.A.
1,"Flower Girl Dress, Wedding, Ivory, Chiffon, Tu...",https://www.etsy.com/listing/1125944983/flower...,https://i.etsystatic.com/14118926/r/il/65855c/...,PetiteLuluStudio,60.0,15,0,0,Germany,5,93.0,5,[PERFECT!!! Everything is as in the picture al...
2,Communion Dress Communion Dress Flower Girl Dress,https://www.etsy.com/listing/779701917/communi...,https://i.etsystatic.com/18224465/c/1319/1048/...,LoliweKids,95.9,4,0,0,Germany,5,1.683,0,N.A.
3,dusty rose flower girl dress baby linen dress,https://www.etsy.com/listing/1209369337/dusty-...,https://i.etsystatic.com/30993712/r/il/562f0e/...,SchoenBoutique,68.2,15,0,1,Germany,5,69.0,0,N.A.
4,Communion Dress Communion Dress Flower Girl Dress,https://www.etsy.com/listing/769082893/communi...,https://i.etsystatic.com/18224465/c/2029/1613/...,LoliweKids,108.9,4,0,0,Germany,5,1.683,0,N.A.


In [8]:
step1_data["ship_cost_germany"].nunique()


1

In [45]:
step1_data.shape

(2000, 13)

In [46]:
step1_data.to_csv('step1_data.csv',index=False)

In [None]:
step1_data = pd.read_csv('step1_data.csv')

#### My original Step 2

In [None]:
def step2(df_link, default=0):
    
    
    # define empty value lists:
    global cols
    cols = ['listing_url',
            'shop_name',
            'size_options',            
            'shop_location',
            'allows_returns',
            'total_shop_sales',
            'shop_reviews_count',
            'shop_5_star_rating_percentage',
            'shop_4_star_rating_percentage',
            'shop_3_star_rating_percentage',
            'shop_2_star_rating_percentage',
            'shop_1_star_rating_percentage',
            'item_reviews_count',
            'listing_customer_review_1',
            'listing_customer_review_2',
            'listing_customer_review_3',
            'listing_customer_review_4',
            'start_price',
            'ship_cost_germany',]
    
    errors = []
    
    global df
    df = pd.DataFrame(columns = cols)
  
    # turn listing_url into link list for iterative loop:
    link = list(df_link["listing_url"])
    
    # define a regular expression pattern for the DE currency format with decimal "," separator
    currency = re.compile(r'[-+]? (?: (?: \d* \, \d+ ) | (?: \d+ \,? ) )(?: [Ee] [+-]? \d+ ) ?', re.VERBOSE)
    
    # define a regular expression pattern for finding total sales
    sales_select = r'[\d]{1},[\d]{3} sales|[\d]{2},[\d]{3} sales|[\d]{3} sales|[\d]{2} sales|[\d]{1} sales'

   
    # english url soup
    for url in tqdm.tqdm(link): # tqdm code for progress bar
        response = requests.get(url, timeout=3)
        soup = BeautifulSoup(response.text,"html.parser")
        if response.status_code == 200:  
            row = []
            row.append(url)

            # extract shop name, set up failsafe in case listing has been removed
            try:
                name = soup.select("#listing-page-cart > div:nth-child(1) > div > div.wt-display-flex-xs.wt-align-items-center.wt-mb-xs-1 > p")[0].get_text()
                name = name[31:]
                name = name[:-27]
                row.append(name)


                # extract size options count for this item
             
                sizes = len(soup.select("#listing-page-cart > div.wt-mb-xs-6.wt-mb-lg-0 > div:nth-child(1) > div.wt-mb-xs-3 > div:nth-child(4) > div")[0].select("option")[1:])
                row.append(sizes)

                # extract shop location
                location = soup.select("#shipping-variant-div > div > div.wt-grid.wt-mb-xs-3 > div.wt-grid__item-xs-12.wt-text-black.wt-text-caption")[0].get_text()
                location = location[16:]
                location = location[:-1]
                row.append(location)


                # extract returns allowed yes/no
                ret = soup.select("#shipping-variant-div > div > div.wt-grid.wt-mb-xs-3")[0].get_text()
                if ret.find("Accepted") != -1:
                    ret = 1
                else:
                    ret = 0
                row.append(ret)

                
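                # extract total shop sales
                # NOTE: `i` is not defined in this function (in this notebook it is at most the
                # integer page counter left over from Step 1), so this lookup fails and 0 is recorded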
                try:
                    tot_sales = re.findall(sales_select, soup.select(i)[0].text)[0]
                    tot_sales = int(tot_sales[:-6].replace(",",""))
                    row.append(tot_sales)
                except:
                    row.append(0)


                # extract review count for the shop 
                try:
                    shop_rev = int(soup.select("#listing-right-column > div > div.body-wrap.wt-body-max-width.wt-display-flex-md.wt-flex-direction-column-xs > div.listing-info.review-col.wt-order-xs-6 > div > div > div:nth-child(2)")[0].get_text()[19:].split(" ")[0].replace(",",""))
                    row.append(shop_rev)
                except:
                    row.append(0)

                
                #extract 5/5 star rating for this shop (only way to do this is in percentage counts per star rating level)
                try:
                    five = soup.select("#listing-right-column > div > div.body-wrap.wt-body-max-width.wt-display-flex-md.wt-flex-direction-column-xs > div.listing-info.review-col.wt-order-xs-6 > div > div > div:nth-child(2)")[0].text.split("5 out of 5 stars\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n5 stars\n\n\n\n\n\n\n")[1]
                    five = float(five.split("%\n\n\n\n\n4 stars")[0])
                    five = five/100
                    row.append(five)
                except:
                    row.append(0)
                    
                try:
                    four = soup.select("#listing-right-column > div > div.body-wrap.wt-body-max-width.wt-display-flex-md.wt-flex-direction-column-xs > div.listing-info.review-col.wt-order-xs-6 > div > div > div:nth-child(2)")[0].text.split("4 stars\n\n\n\n\n\n\n")[1]
                    four = float(four.split("%\n\n\n\n\n3 stars")[0])
                    four = four/100
                    row.append(four)
                except:
                    row.append(0)
                    
                try:
                    three = soup.select("#listing-right-column > div > div.body-wrap.wt-body-max-width.wt-display-flex-md.wt-flex-direction-column-xs > div.listing-info.review-col.wt-order-xs-6 > div > div > div:nth-child(2)")[0].text.split("3 stars\n\n\n\n\n\n\n")[1]
                    three = float(three.split("%\n\n\n\n\n")[0])
                    three = three/100
                    row.append(three)
                except:
                    row.append(0)
                    
                try:
                    two = soup.select("#listing-right-column > div > div.body-wrap.wt-body-max-width.wt-display-flex-md.wt-flex-direction-column-xs > div.listing-info.review-col.wt-order-xs-6 > div > div > div:nth-child(2)")[0].text.split("2 stars\n\n\n\n\n\n\n")[1]
                    two = float(two.split("%\n\n\n\n\n")[0])
                    two = two/100
                    row.append(two)
                except:
                    row.append(0)
                
                try:
                    one = soup.select("#listing-right-column > div > div.body-wrap.wt-body-max-width.wt-display-flex-md.wt-flex-direction-column-xs > div.listing-info.review-col.wt-order-xs-6 > div > div > div:nth-child(2)")[0].text.split("1 star\n\n\n\n\n\n\n")[1]
                    one = float(one.split("%\n\n\n\n\n")[0])
                    one = one/100
                    row.append(one)
                except:
                    row.append(0)
                
                # extract review count for this item
                try:
                    item_rev = soup.select("#listing-right-column > div > div.body-wrap.wt-body-max-width.wt-display-flex-md.wt-flex-direction-column-xs > div.listing-info.review-col.wt-order-xs-6 > div > div > div:nth-child(2)")[0].text.split("Reviews for this item\n                \n                    ")[1]
                    item_rev = int(item_rev.split("\n")[0].replace(",",""))
                    row.append(item_rev)
                except:
                    row.append(0)
                
                # extract customer reviews per item
                try:
                    if soup.select("#listing-right-column > div > div.body-wrap.wt-body-max-width.wt-display-flex-md.wt-flex-direction-column-xs > div.listing-info.review-col.wt-order-xs-6 > div > div > div:nth-child(2)")[0].text.find("Reviews for this item") != -1:
                        rev_key = "#review-preview-toggle-{}"
                        for iterate in range(0,4):
                            key = rev_key.format(iterate)
                            try:
                                rev = soup.select(key)[0].text.split("\n                    ")[1]
                                rev = rev.split(" \n\n    ")[0]
                                row.append(rev)
                            except:
                                row.append("N.A.")
                    else:
                        row.extend(["N.A."] * 4)  # one placeholder per review column
                except:
                    row.extend(["N.A."] * 4)

                flag = True
                try:
                    # create DE url variant to scrape prices in EUR instead of USD
                    url_de = url.replace("https://www.etsy.com/listing/","https://www.etsy.com/de/listing/")
                    response_de = requests.get(url_de)
                    soup_de = BeautifulSoup(response_de.text,"html.parser")     
                except:
                    flag = False
                    row.append(0)
                    row.append(0)

                if flag == True:
                    
                    try:
                        # extract starting price quote from soup and applying proper decimal "." separator
                        p = currency.findall(soup_de.select("#listing-page-cart > div.wt-mb-xs-6.wt-mb-lg-0 > div:nth-child(1) > div.wt-mb-xs-3 > div.wt-mb-xs-3 > div:nth-child(1) > div > div.wt-display-flex-xs.wt-align-items-center.wt-flex-wrap > p")[0].get_text())[0]
                        p = float(p.replace(",","."))
                        row.append(p)
                    except:
                        row.append(0)

                    try:
                        # extract shipping to Germany cost quote
                        sc = soup_de.select("#shipping-variant-div > div > div.wt-grid.wt-mb-xs-3")[0].text.split('\nVersandkosten\n')[1]
                        sc = sc.split('\n\n\n\n\n\n')[0]
                        if sc == 'Kostenlos':
                            ship_cost = 0           
                        else:
                            ship_cost = float(sc[:-2].replace(",","."))
                        row.append(ship_cost)
                    except:
                        row.append(0)                 

                #df.loc[len(df)] = row
        
            
            # failsafe if listing has been removed or ANYTHING goes wrong
            except:
                errors.append(url)

            # wait time between cycles
            wait_time = randint(1,2000)
            sleep(wait_time/1000)
                

    return df, errors
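
The sales pattern and the chained `split` calls above depend on exact digit counts and exact runs of whitespace. A more tolerant variant, sketched here under the assumption that the page text still contains phrases like "1,234 sales" and "5 stars ... 95%", lets the regular expression absorb the whitespace. The function names and the sample text are made up for illustration.

```python
import re

SALES = re.compile(r"(\d{1,3}(?:,\d{3})*|\d+) sales?\b")
STAR_SHARE = re.compile(r"([1-5]) stars?\s+(\d+(?:\.\d+)?)%")


def parse_total_sales(text):
    """Return the number in front of 'sales' as an int, or 0 if none is found."""
    match = SALES.search(text)
    return int(match.group(1).replace(",", "")) if match else 0


def parse_star_shares(text):
    """Return {stars: share between 0 and 1} for each 'N stars  P%' found."""
    return {int(stars): float(pct) / 100 for stars, pct in STAR_SHARE.findall(text)}


sample = "12,345 sales\n5 out of 5 stars\n\n5 stars\n\n\n90%\n\n4 stars\n\n\n7%\n\n1 star\n\n3%"
print(parse_total_sales(sample))  # 12345
print(parse_star_shares(sample))  # {5: 0.9, 4: 0.07, 1: 0.03}
```

Star levels that do not appear in the text are simply absent from the returned dict, so a caller can fill in 0 with `shares.get(level, 0)`.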