## Vivino: experimenting with data extraction

In [1]:
# main requirements

import pandas as pd
import numpy as np

### A. Working with ready datasets

There are some [existing wine datasets on Kaggle](https://www.kaggle.com/zynicide/wine-reviews) which were scraped from [WineEnthusiast](https://www.winemag.com/?s=&drink_type=wine) that however do not meet the purposes of our research (the sample is too small, the number of features is limited). Example provided below. 

In [2]:
data1 = pd.read_csv('external_dataset/winemag-data_first150k.csv')
data2 = pd.read_csv('external_dataset/winemag-data-130k-v2.csv')
data = [data1, data2]

In [3]:
for d in data:
    print(d.columns)
    display(d.head())

Index(['Unnamed: 0', 'country', 'description', 'designation', 'points',
       'price', 'province', 'region_1', 'region_2', 'variety', 'winery'],
      dtype='object')


Unnamed: 0.1,Unnamed: 0,country,description,designation,points,price,province,region_1,region_2,variety,winery
0,0,US,This tremendous 100% varietal wine hails from ...,Martha's Vineyard,96,235.0,California,Napa Valley,Napa,Cabernet Sauvignon,Heitz
1,1,Spain,"Ripe aromas of fig, blackberry and cassis are ...",Carodorum Selección Especial Reserva,96,110.0,Northern Spain,Toro,,Tinta de Toro,Bodega Carmen Rodríguez
2,2,US,Mac Watson honors the memory of a wine once ma...,Special Selected Late Harvest,96,90.0,California,Knights Valley,Sonoma,Sauvignon Blanc,Macauley
3,3,US,"This spent 20 months in 30% new French oak, an...",Reserve,96,65.0,Oregon,Willamette Valley,Willamette Valley,Pinot Noir,Ponzi
4,4,France,"This is the top wine from La Bégude, named aft...",La Brûlade,95,66.0,Provence,Bandol,,Provence red blend,Domaine de la Bégude


Index(['Unnamed: 0', 'country', 'description', 'designation', 'points',
       'price', 'province', 'region_1', 'region_2', 'taster_name',
       'taster_twitter_handle', 'title', 'variety', 'winery'],
      dtype='object')


Unnamed: 0.1,Unnamed: 0,country,description,designation,points,price,province,region_1,region_2,taster_name,taster_twitter_handle,title,variety,winery
0,0,Italy,"Aromas include tropical fruit, broom, brimston...",Vulkà Bianco,87,,Sicily & Sardinia,Etna,,Kerin O’Keefe,@kerinokeefe,Nicosia 2013 Vulkà Bianco (Etna),White Blend,Nicosia
1,1,Portugal,"This is ripe and fruity, a wine that is smooth...",Avidagos,87,15.0,Douro,,,Roger Voss,@vossroger,Quinta dos Avidagos 2011 Avidagos Red (Douro),Portuguese Red,Quinta dos Avidagos
2,2,US,"Tart and snappy, the flavors of lime flesh and...",,87,14.0,Oregon,Willamette Valley,Willamette Valley,Paul Gregutt,@paulgwine,Rainstorm 2013 Pinot Gris (Willamette Valley),Pinot Gris,Rainstorm
3,3,US,"Pineapple rind, lemon pith and orange blossom ...",Reserve Late Harvest,87,13.0,Michigan,Lake Michigan Shore,,Alexander Peartree,,St. Julian 2013 Reserve Late Harvest Riesling ...,Riesling,St. Julian
4,4,US,"Much like the regular bottling from 2012, this...",Vintner's Reserve Wild Child Block,87,65.0,Oregon,Willamette Valley,Willamette Valley,Paul Gregutt,@paulgwine,Sweet Cheeks 2012 Vintner's Reserve Wild Child...,Pinot Noir,Sweet Cheeks


In [4]:
data2.taster_name.unique()

array(['Kerin O’Keefe', 'Roger Voss', 'Paul Gregutt',
       'Alexander Peartree', 'Michael Schachner', 'Anna Lee C. Iijima',
       'Virginie Boone', 'Matt Kettmann', nan, 'Sean P. Sullivan',
       'Jim Gordon', 'Joe Czerwinski', 'Anne Krebiehl\xa0MW',
       'Lauren Buzzeo', 'Mike DeSimone', 'Jeff Jenssen',
       'Susan Kostrzewa', 'Carrie Dykes', 'Fiona Adams',
       'Christina Pickard'], dtype=object)

### B. Manually parsing vivino.com

### 1. Beautiful Soup

First, I was trying to extract individual wine properties by parsing the HTML tree of Vivino [explore page](https://www.vivino.com/explore?e=eJzLLbI11jNVy83MszVXy02ssDU2UEuutHULUku2dQ0NUiuwNVRLT7MtSyzKTC1JzFHLT7ItSizJzEsvjk8sSy1KTE9Vy7dNSS1OVisviY4FKgZTRgDL1Bz4) using `requests` and `BeautifulSoup` Python libraries. 


The idea was to extract links to individul wine pages, so that those pages are analyzed one-by-one at a later step. 

Among other things, the unrolled HTML tree classes of the original explore page include:
* `body class="inner-page"`
* `div class="wrap"`
* `div id="explore-page-app"`
* `div class="explorerPage__explorePage--26aGH layout__outer--S05yQ"`
* `div class="layout__inner--3JC-x"`
* `div class="explorerPage__columns--1TTaK"`
* `div class="explorerPage__results--3wqLw"`
* `div class="explorerCard__explorerCard--3Q7_0 explorerPageResults__explorerCard--3q6Qe"`
* etc

Eventually, the following element was needed: `"anchor__anchor--2QZvA"` visible in order to get the link to the page of each wine.

After experimenting, it was discovered that vivino.com is using JavaScript to generate dynamic web pages, therefore, individual elements of the wine list can not be parsed without JavaScript. The deepest element of HTML page that can be achieved with Beautiful soup is the container with id `explore-page-app`.

So, BeautifulSoup was not a working solution. 


In [5]:
import requests
from bs4 import BeautifulSoup


headers = {
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
    "Host": "www.vivino.com",
    "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/14.0 Safari/605.1.15",
    "Accept-Language": "en-gb",
    "Accept-Encoding": "gzip, deflate, br",
    "Connection": "keep-alive"    
}

response = requests.get("https://www.vivino.com/explore?e=eJwNyb0KgCAYBdC3ubNB611aWtsj4stMBH9CzertaznLCZkdgotUCPKwVwr65ThA_0w4_22a8fIe9mCT7EwVj7QxS3XRllWayWINEndTNO46L-w-zyUdWg==", headers=headers)
content = response.content

parser = BeautifulSoup(content, 'html.parser')
body = parser.body
# print(body)
wine_titles = body.find_all(class_="vintageTitle__wine--U7t9G")
for title in wine_titles:
    type(title.text)
# type(wine_titles)
wine_titles

[]

### 2. Selenium web driver

First, I considered using PhantomJS to generate web pages with datascript, as suggested [here](https://stackoverflow.com/questions/8049520/web-scraping-javascript-page-with-python).

However, after some attempts it appears that PhantomJS has been depricated, and developers of Selenium suggest using headless versions of Chrome or Firefox instead 
(see details [here](https://stackoverflow.com/questions/50416538/python-phantomjs-says-i-am-not-using-headless))


In [6]:
from selenium import webdriver

Basically, everything is done on the WebDriver instance object. The `find_element(By)` or `find_element_by_id` methods return WebElements, each having contents and/or properties.
See documentation [here](https://www.selenium.dev/documentation/en/getting_started_with_webdriver/locating_elements/).

Below is the code required to initialize a web driver. Here, I added an optional argument `long_screen` which (if chosen) returns the long version of the page. Default value, however, is `long_screen=False` which initializes the window of size 1920x1080.

Also, after several attempts to load the webpage I get the response that my IP has been temporarily blocked for exceeding bulk request limits. Therefore, I added some properties to the Web Driver, such as setting a User agent that does not reveal headless nature of Chrome


In [33]:
# Initialize web driver

def initialize_chrome_driver(long_screen=False):
    """
    Initialize a chrome web driver. Only works if the driver is passed to lib/ directory.
    long_screen is a boolean argument that initializes browser with increased window size.
    """
    options = webdriver.ChromeOptions()
    options.add_argument('--headless')
    options.add_argument('--user-agent=Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/14.0 Safari/605.1.15')
    if long_screen:
        options.add_argument("--window-size=1920x20800")
    else:
        options.add_argument("--window-size=1920x1080")
    # options.add_argument("start-maximized")
    # options.add_argument("disable-infobars")
    # options.add_argument("--no-sandbox")
    # options.add_argument("--disable-extensions")
    # options.add_argument("--disable-dev-shm-usage")
    # options.add_argument('--lang=en')
    # options.add_argument('--incognito')

    browser = webdriver.Chrome("lib/chromedriver.uu", options=options)
    return browser

### Parsing the wine list (no scroll)

In [10]:
# script to check functionality of bbc.co.uk
browser = initialize_chrome_driver()
browser.get('https://www.bbc.co.uk/')
res = browser.find_element_by_class_name("css-4wew9k-MastheadText")
res.text

'Welcome to the BBC'

In order to see whether at this point vivino.com explore page is loaded correctly and JavaScript is run, the screenshot is made and saved to the folder. 

In [11]:
# get the screenshot of the explore page

browser = initialize_chrome_driver()
test_explore_page = "https://www.vivino.com/explore?e=eJzLLbI1VMvNzLM1UMtNrLA1MTBQS660dXdSSwYSAWoFQNn0NNuyxKLM1JLEHLX8JNuixJLMvPTi-MSy1KLE9FS1fNuU1OJktfKS6FhbQwDu-xpj"
browser.get(test_explore_page)
browser.get_screenshot_as_file('test_screenshot.png')

True

Once the page is loaded correctly, we are trying to get the individual wine properties.

Those can be found within the class `"anchor__anchor--2QZvA"`
The property of each wine has the link that leads to the individual web page of that wine. 

Once extracted, each link is  added to a Pyhon list.

The page contains some other elements with the same class, unnecessary for our study (such as wine-regions, wine-sountries, etc). Those are specifically prevented from being added to the list. 

In [2]:
import time

def add_link_to_list(web_element, wine_page_list):
    """
    Function that appends wine link extracted from the property of a web element to a list, unless it already exists or is irrelevant.
    """
#     print(web_element)
    cur_link = web_element.get_property("href")
    if cur_link.startswith('https://www.vivino.com/wine-countries/')\
    or cur_link.startswith('https://www.vivino.com/wine-regions/')\
    or cur_link.startswith('https://www.vivino.com/redirect/')\
    or cur_link.startswith('https://instagram')\
    or cur_link.startswith('https://facebook')\
    or cur_link.startswith('https://twitter')\
    or cur_link in wine_page_list:
        pass
    else:
        wine_page_list.append(cur_link)
    return wine_page_list


def get_list_no_scroll(browser, page, class_name="anchor__anchor--2QZvA"):
    """
    The function that returns the list of links to all wines found on the page.
    Arguments include: 
    * browser - an instance of chrome driver
    * page - a page that needs to be parsed 
    * class_name - CSS selector that should contain wine properties such as links.
    
    """
    browser.get(page)
#     results_list = []
    time.sleep(5)  # allows for the page to load
    results_list = browser.find_elements_by_class_name(class_name)
    wine_pages_list = []
    for el in results_list:
        wine_pages_list = add_link_to_list(el, wine_pages_list)
    return wine_pages_list

In [40]:
browser = initialize_chrome_driver()
wine_pages_list = get_list_no_scroll(browser, test_explore_page)
# browser.get_screenshot_as_file('test_screenshot_1.png')
len(wine_pages_list)

25

We can see that even with JavaScript-enabled web driver having a long window size, the wines are not all loaded simultaneously, and the search results in just a few elements. Therefore, scrolling needs to be implemented. 

### Parsing a wine list (with JavaScript scrolling)

It appears that we need a scroller to get the whole list of wines, since they are not all loaded simultaneously.

One way to do this, is need find an element that is located in the bottom of the page, and run a scrolling script until such element becomes visible.

First, I try to find an element with a class `"addWidgetLink__addWidgetLink--aPZ_V"` to indicate the page bottom, and see if the scrolling works

In [41]:
# scroll until the end of a specific wine page 

from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import TimeoutException

def scroll_until_explore_page_bottom(browser, page, page_end_class="addWidgetLink__addWidgetLink--aPZ_V"):
    """
    Function that scrolls until the bottom of the page, if certain element in the bottom is located within 5 seconds. Arguments are:
    * browser - an instance of chrome driver
    * page - a page that needs to be scrolled
    * page_end_class - CSS selector that can indicate that the page reached its bottom.
    """
    browser.get(page)
    browser.execute_script("window.scrollTo(0, document.body.scrollHeight);")
    try:
        WebDriverWait(browser, 5).until(EC.visibility_of_element_located((By.CLASS_NAME, page_end_class)))
    except TimeoutException:
        print("exception raised")

In [42]:
# experimenting with scrolling on different pages 

# red wines with ratings of 4.5+ and price above 400 
explore_link_1 = "https://www.vivino.com/explore?e=eJzLLbI10TNVy83MszUxMFDLTawA08mVtu5OaslAIkCtwNZQLT3NtiyxKDO1JDFHLT_JtiixJDMvvTg-sSy1KDE9VS3fNiW1OFmtvCQ61tYQADLIGy0="
# desert wines with ratings of 4.5+ and price above 25 
explore_link_2 = "https://www.vivino.com/explore?e=eJwNxbEKgCAUBdC_eWNYGE13aWltj4iXmQipoWL197mc4yJk05OzHl2NX0ghSH2YRlKVmW60ZE4UjlZnvijsiJytN2njoiMbTQGHToqevKwYfiK_GwY="

scroll_until_explore_page_bottom(browser, explore_link_1)
results_list_1 = browser.find_elements_by_class_name("anchor__anchor--2QZvA")

scroll_until_explore_page_bottom(browser, explore_link_2)
results_list_2 = browser.find_elements_by_class_name("anchor__anchor--2QZvA")

wine_count_1 = len(results_list_1)
wine_count_2 = len(results_list_2)
print(wine_count_1)
print(wine_count_2)

exception raised
exception raised
105
105


Scrolling until certain element is located does not seem to work properly (likely, because page elements are not loaded simultaneously). 
Therefore, the next try is to scroll a page down certain number of times, and wait a bit after each scroll (to allow the page to load more wines). This solution is ispired by [this article](https://dev.to/mr_h/python-selenium-infinite-scrolling-3o12).

Therefore, I implement a function that scrolls down, enables a sleep timer (eg. 2 sec) after each scroll, loads the web elements that have a certain class (`"anchor__anchor--2QZvA"`), returns a list with such elements, and then continues. 

In order to limit the desired number of elements, we can either manually restrict the length of the resulting list (using a variable `total_wine_num`, or target the overall number of wines that meet the filter criteria (those can be seen in the top of the explore page, and are stored in the element with class `"querySummary__querySummary--39WP2"`). 

For the sake of experimenting, we try to load 50 wines first. 

Also, since the page loads only a few new elements with each scroll, in order to increase efficiency, we update the list of results every 10 scrolls (which explains the usage of the iteration counter).

In [43]:
def _scroll_load_scroll(driver, page, timeout=2, manual_input=True, wine_num=0, class_name="anchor__anchor--2QZvA"):
    """
    Function that loads a page, scrolls down with a given sleep timer until the whole wine list is loaded (if manual_input is False), 
    or a certain number of wines (if manual_input is True and wine_num is provided).
    In the latter case, wine_num should also be provided.
    
    """
    
    if manual_input and wine_num <= 0:
        raise ValueError('wine_num must be positive if manual_input is true')
    
    driver.get(page)     # Opens the explore page
    
    if manual_input==False: # extract the total number of wines with a given search criteria
        time.sleep(timeout)
        total_wine_string = browser.find_element_by_class_name("querySummary__querySummary--39WP2").text
        total_wine_num = int(total_wine_string.split()[1])

    total_wine_num = 50 # 50 elements chosen for the sake of experiment
    results_list = []
    count_iter = 0

    while len(results_list) < total_wine_num: # iterate until the desired number of elements is loaded
        if count_iter % 10 == 0:    # new items are loaded
            timepoint_1 = time.time()
            all_results = browser.find_elements_by_class_name(class_name)
            timepoint_2 = time.time()
            print("time to parse elements: " + str(timepoint_2 - timepoint_1))

            for el in all_results:
                add_link_to_list(el, results_list)   # get the link and check whether it meets the criteria to be included to the list 

        timepoint_3 = time.time()
        print("time to update list of links: " + str(timepoint_3 - timepoint_2))
        
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")         # Scroll down to bottom
        time.sleep(timeout)         # Wait to load page
        
        timepoint_4 = time.time()
        print("time to scroll and wait: " + str(timepoint_4 - timepoint_3))
        
        count_iter += 1
        
    print("list has {} elements".format(len(results_list)))
    return results_list

The above solution `scroll_load_scroll` that loads after each scroll (or each 10 scrolls) seems pretty slow, so, alternatively `scroll_and_load` function takes a given number of scrolls as an argument, and performs this number of scrolls before loading the page.

In [44]:
def scroll_and_load(driver, page, timeout=1, scrolls=10, class_name="anchor__anchor--2QZvA"):
    """
    Function that loads a page, scrolls down with a given sleep timer certain number of times.
    """
    timepoint_0 = time.time()
    results_list = []
    driver.get(page)
    timepoint_1 = time.time()
    for i in range(scrolls):
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        time.sleep(timeout)
    timepoint_2 = time.time()
#     print("time to scroll and wait: {} s.".format(timepoint_2 - timepoint_1))
    loaded_elements = browser.find_elements_by_class_name(class_name)
    timepoint_3 = time.time()
#     print("time to load elements: {} s.".format(timepoint_3 - timepoint_2))
    for el in loaded_elements:
        results_list = add_link_to_list(el, results_list)
    timepoint_4 = time.time()
#     print("time to extract the list of links: {} s.".format(timepoint_4 - timepoint_3))
    print("total time elapsed: {} s.".format(timepoint_4 - timepoint_0))
    print("list has {} elements".format(len(results_list)))
    print("average time per element is {} s".format((timepoint_4 - timepoint_0)/len(results_list)))
    return results_list

In [45]:
# some other links for testing

# all fortified wines with a rating above 4.5 (the overal result should be 423)
explore_link_3 = "https://www.vivino.com/explore?e=eJwNxL0KgCAYBdC3uWNY2HiXltb2iPgyEyE1TPp5-zrDCZm6ahF8pEKQh1opmJd9B_M34GANt_GS7G2RHWlhluKjO2e5bBZnkbja0-Au48RGfyshGv4="
# all red wines, sorted by rating
explore_link_4 = "https://www.vivino.com/explore?e=eJzLLbI1VMvNzLM1UMtNrLA1MTBQS660dXdSSwYSAWoFQNn0NNuyxKLM1JLEHLX8JNuixJLMvPTi-MSy1KLE9FS1fNuU1OJktfKS6FhbQwDu-xpj"
# all red wines, sorted by popularity
explore_link_5 = "https://www.vivino.com/explore?e=eJzLLbI1VMvNzLM1UMtNrLA1MTBQS660dXdSSwYSAWoFQNn0NNuyxKLM1JLEHLX8JNuixJLMvPTi-OT80rwStXzblNTiZLXykuhYW0MAulcZsQ=="

In [46]:
browser = initialize_chrome_driver()
list_with_scroll = scroll_and_load(browser, explore_link_5)

total time elapsed: 14.743172883987427 s.
list has 250 elements
average time per element is 0.05897269153594971 s


The above strategy (10 scrolls, each with 1s timeout) gained 250 elements. We should experiment with a number of scrolls and/or the value of timeout to see if it helps to increase the speed of loading of new elements. 

Experiments performed below. Since the timing and the list size differ in different experiments, one should consider average loading time per element to see whether a strategy is efficient or not. 

In [47]:
# browser = initialize_chrome_driver()
# list_with_scroll = scroll_and_load(browser, explore_link_5, scrolls=100)

In [48]:
# browser = initialize_chrome_driver()
# list_with_scroll = scroll_and_load(browser, explore_link_5, timeout=0.5)

In [49]:
# browser = initialize_chrome_driver()
# list_with_scroll = scroll_and_load(browser, explore_link_5, timeout=0.5, scrolls=100)

After severla tries we can see that timeout of 0.5s and 100 scrolls gains relatively high speed of loading and achieves relatively higher number of results. If needed, above cells can be re-run. We will use the strategy of 0.5s and 200 scrolls further on, and apply it to various wine groups (red vs. white, and also divided by country).

To use this approach we feed individual web pages per each country and wine type to our program (sorted in the order of descending popularity), and run a web crawler through each of them. The results are saved in a list. 

For the sake of time, the following code extracting red and white samples (using the chosen strategy of of 0.5s and 200 scrolls) does not need to be re-run. During the initial run, its results were saved to pickle files, and can be loaded in the similar way. Therefore, for the time being, the below code is commented out.

In [50]:
# argentina_red = "https://www.vivino.com/explore?e=eJwNirsOgCAMAP-mMyauXVxc3Y0xtSIhETClPvh7u9wNd0mwgxQzOkj0Ye8ccMNxADZMcFkNBz4k0SudUDYU0phDXbncWaHg7ivDq_NiK7dqJvkBsTQc7g=="
# australia_red = "https://www.vivino.com/explore?e=eJwNijsKgDAQBW_z6gi229jY2ovIumoImETyUXN7t5kpZnyiDt4FMvD8UW8MpNE4QBQTbq32pIeTOwpfiBslLi7YvEqsoSDSfmTBW-ZFV2lZzfUHsTcc8Q=="
# austria_red = "https://www.vivino.com/explore?e=eJwNijsOgCAQBW_zakxst7GxtTfGrCsSEgED64fbSzNTzIRMHYKPZBD4o94YSKVxgDRMuFp1Bz2cvVU-kTbKrD66skq6oyLRbovg1Xlpq9TSzPoDsTYc8A=="
# chile_red = "https://www.vivino.com/explore?e=eJwNijsKgDAQBW_z6gi229jY2otIXGMI5CPJ-ru928wUM6lShxQyGST7Um8M-KNxACsmnFr9QbetwYmNKBtVKyH7tnK5sqDQ7hrjkXnRlb-m5vgDsTIc6g=="
# france_red = "https://www.vivino.com/explore?e=eJwNirsOgCAMAP-mMyauXVxc3Y0xtQIhETClvv7eLnfDXRbsIKeCDjK92DsH_OE4ABsmOK3GgDdJ8koH1A2FNJXYVq5XUai4-8bw6LzYyl8zB_kBsT4c8w=="
# germany_red = "https://www.vivino.com/explore?e=eJwNijsOgCAQBW_zakxst7GxtTfG4IqERMDA-uH2bjNTzMRCHWJIZBDtR70x4EbjAFZMuLT6gx5bghN7Im9UrITk68r5ToJMu6uMV-ZFV25VvbsfsS0c5A=="
# italy_red = "https://www.vivino.com/explore?e=eJwNirsOgCAMAP-mMyauXVxc3Y0xtSJpImCgPvh7u9wNd7FgB1ESOoj0Ye8ccMNxADZMcFkNBz5UxCudkDcspJJCXTnfSSHj7ivDq_NiK7dqFv0BsUYc-A=="
# portugal_red = "https://www.vivino.com/explore?e=eJwNijsKgDAQBW_z6gi229jY2otIXGMImA_J-ru928wUM7FShxgSGUT7Um8M-KNxACsmFK3-oNvW4MSeyBtVKyH5tnK-kiDT7hrjkXnRlb-mLvIDsVQc_w=="
# spain_red = "https://www.vivino.com/explore?e=eJwNirsOgCAMAP-mMyauXVxc3Y0xtSIhETC0Pvh7u9wNd6liBylmdJDow9454IbjAGyY4LIaDnyoRq90QtmwksYcZOVyZ4WCuxeGV-fFVm5i9vIDsT0c8w=="
# usa_red = "https://www.vivino.com/explore?e=eJwNijEOgCAMAH_TGRPXLi6u7saYWpWQCBhaVH5vl7vhLhbsIIaEDiJ92DsH3HAcgA0T3Fb9iQ-VcChdkDcspCF5WTnXpJBxP4Th1XmxlZuYq_yxXR0D"

# full_red_list = []
# length_list = []

# browser = initialize_chrome_driver()
# country_pages = [argentina_red, australia_red, austria_red, chile_red, france_red, germany_red, italy_red, portugal_red, spain_red, usa_red]
# for page in country_pages:
#     country_list = scroll_and_load(browser, page, timeout=0.5, scrolls=200)
#     length_list.append(len(country_list))
#     full_red_list.append(country_list)
#     print("{} finished and gained {} links".format(page, len(country_list)))

In [51]:
# argentina_white = "https://www.vivino.com/explore?e=eJwNi7sKgDAMAP8mcxXXLC6u7iISYy0F20oaX39vlrvhuCTYQIoZHSR6sXMO-MOhBzaMcFoNO94k0SsdUFYU0phDXbhcWaHg5ivDo9OMra3VTPIDsT4c7w=="
# australia_white = "https://www.vivino.com/explore?e=eJwNizsOgCAQBW_zajS229jY2htj1lUJiYCBxc_tpZkpJuMTNfAukIHnlzpjIB8NPaRixFWrPejm5HblE3GlxOqCzYvEEhSRtj0LHp1mauuaq7n8sUEc8g=="
# austria_white = "https://www.vivino.com/explore?e=eJwNizsOgCAQBW_zajS229jY2htj1hUJiYCB9Xd7aWaKyYRMDYKPZBD4pc4YyEdDD6kYcdbqdro5e6t8IK2UWX10ZZF0RUWizRbBo9NMbV1LNesPsUAc8Q=="
# chile_white = "https://www.vivino.com/explore?e=eJwNizsKgDAQBW_z6ii229jY2ovIumoI5CPJ-ru9aWaKYUKmBsFFMgj8UmcM5KOhh1SMOGu1B92c3a7skVbKrC7aski6oiLRthfBo9NMbV1LtfgfsTwc6w=="
# france_white = "https://www.vivino.com/explore?e=eJwNi7sKgDAMAP8mcxXXLC6u7iISoy0F20oaX39vlrvhuCTYQIoZHSR6sXMO-MOhBzaMcFoNHm-SuCsdUFYU0phDXbhcWaHgtleGR6cZW1ur2csPsUgc9A=="
# germany_white = "https://www.vivino.com/explore?e=eJwNizsOgCAQBW_zajS229jY2htj1hUJiYCB9Xd7aWaKyYRMDYKPZBD4pc4YyEdDD6kYcdbqdro5e6t8IK2UWX10ZZF0RUWizRbBo9NMbV1L9WZ_sTcc5Q=="
# italy_white = "https://www.vivino.com/explore?e=eJwNi7sKgDAMAP8mcxXXLC6u7iISYy0B20obX39vlrvhuFiwgSgJHUR6sXMO-MOhBzaMcFoNO95UxCsdkFcspJJCXThfSSHj5ivDo9OMra3VLPoDsVAc-Q=="
# portugal_white = "https://www.vivino.com/explore?e=eJwNizsKgDAQBW-zdRTb19jY2ovIumoImA_J-ru9aWaKYXxGQ94FGPL8ojOG5MPQk1SMlGq1B27Oblc-Ka7IrC7Yski8glLEthehR6cZbV1LddIfsV4dAA=="
# spain_white = "https://www.vivino.com/explore?e=eJwNi7sKgDAMAP8mcxXXLC6u7iISYy0F20oTX39vlrvhuFSxgRQzOkj0Yucc8IdDD2wY4bQadrypRq90QFmxksYcZOFyZYWCmxeGR6cZW1vF7OUHsUcc9A=="
# usa_white = "https://www.vivino.com/explore?e=eJwNi7sKgDAMAP8mcxXXLC6u7iISo5aCbaVJffy9We6G42LBBmJI6CDSi51zwB8OPbBhhMuqP_CmEnalE_KKhTQkLwvnmhQybrswPDrN2Noq5io_sWcdBA=="

# full_white_list = []
# white_length_list = []

# browser = initialize_chrome_driver()
# country_white_pages = [argentina_white, australia_white, austria_white, chile_white, france_white, germany_white, italy_white, portugal_white, spain_white, usa_white]
# for page in country_white_pages:
#     country_list = scroll_and_load(browser, page, timeout=1, scrolls=150)
#     white_length_list.append(len(country_list))
#     full_white_list.append(country_list)
#     print("{} finished and gained {} links".format(page, len(country_list)))

Check the resulting sample size for red and white wines:

In [52]:
# records = 0
# for i in full_red_list:
#     records += len(i)
# print('our search gained {} red wine records'.format(records))

In [53]:
# records = 0
# for i in full_white_list:
#     records += len(i)
# print('our search gained {} white wine records'.format(records))

Save results to a file (for convenience, using [pickle library](https://docs.python.org/3/library/pickle.html))

In [3]:
import pickle

In [55]:
# with open("popular_reds_sample", 'wb') as f:
#     pickle.dump(full_red_list, f)

In [56]:
# with open("popular_whites_sample", 'wb') as f:
#     pickle.dump(full_white_list, f)

### Parsing the properties of each wine from the list 

Trying to get interesting information per each wine, we might need the following web page elements: 
- id:
    * wine name (this usualy includes also the year) (`class="wine"`)
- properties:
    * winery (`class="winery"`)
    * wine type (`class="wineLocationHeader__wineType--14nrC"`)
    * grapes (`class="wineFacts__fact--3BAsi"`) 
    * wine style (`class="wineFacts__fact--3BAsi"`) 
    * region (`class="anchor__anchor--3DOSm"`)
    * country (`class="wineLocationHeader__country--1RcW2"`)
- quality information:
    * number of reviews (`class="vivinoRatingWide__basedOn--s6y0t"`)
    * average price (`class="purchaseAvailabilityPPC__amount--2_4GT"`)
    * rating score (`class="vivinoRatingWide__averageValue--1zL_5"`)
    * 3 random community reviews (`class="reviewCard__reviewNote--fbIdd"`)
- taste structure:
    * there are 4 progress bars: light/bold, smooth/tannic, dry/sweet, soft/acidic 
    * each taste progressbar has the following class: `class="indicatorBar__progress--3aXLX"` with the following style properties identifying the bar location: `style="width: 15%; left: 85%;"` 
    * notes mention (`class="tasteNote__popularKeywords--1q7RG"`) 
    
Some of these elements are not visible until the page is scrolled down till the end. Therefore, just as with the wine list, we might need to scroll down, but this time without a pause. Instead, we will scroll until the specific element in the bottom becomes visible. This will ensure that all other elements are visible too. The bottom element chosen for this purposes has a class `addWidgetLink__addWidgetLink--aPZ_V`. 

In [57]:
# scroll until the end of a specific wine page 

def scroll_until_wine_page_bottom(browser, page, page_end_class="addWidgetLink__addWidgetLink--aPZ_V"):
    """
    Function that scrolls until the bottom of the page, if certain element in the bottom is located within 10 seconds. Arguments are:
    * browser - an instance of chrome driver
    * page - a page that needs to be scrolled
    * page_end_class - CSS selector that can indicate that the page reached its bottom.
    """
    browser.get(page)
    browser.execute_script("window.scrollTo(0, document.body.scrollHeight);")
    try:
        WebDriverWait(browser, 10).until(EC.visibility_of_element_located((By.CLASS_NAME, page_end_class)))
    except TimeoutException:
        print("exception raised")

Once the whole wine page is loaded, we will need to check existence of each item that we need for analysis. 

To parse the data, important to note that each of page items has either a unique class name (in this case a single element can be extracted using a method `browser.find_element_by_class_name()`), or a generic class name that applies to more than one element (and therefore the list of elements can be extracted using a method `browser.find_elements_by_class_name()` and then indexed to get the desired data). 

Generally, we are interested in the text property of a given web element, however, in specific cases (eg. for the wine taste structure that is represented as a slidebar on vivino web page) we are interested in web element attributes instead.  

Some of the required information may be missing on a wine page, in such cases no exception should be raised. Instead, the program should consider `np.nan` as the correct data input. 

In order to consider the above specifics, we introduce a function `extract_data_with_exceptions` that takes several optional boolean arguments (`multiple_elements` tells whether a given class has more than one elements on a page, `get_style` tells whether we need attribute of a web element instead of text)


In [58]:
# extract either a single element, or a list of multiple elements
def extract_data_with_exceptions(browser, class_name, multiple_elements=False, get_style=False):
    """
    function that extracts certain elements (or np.nan if elements are absent) from a web page
    """
    if multiple_elements:
        try:
            list_with_data = []
            data_list = browser.find_elements_by_class_name(class_name)
            for element in data_list:
                if get_style:
                    list_with_data.append(element.get_attribute("style"))
                else:
                    list_with_data.append(element.text)
        except:
            list_with_data = "Not available"
        return list_with_data
    else:
        try:
            if get_style:
                data = browser.find_element_by_class_name(class_name).get_attribute("style")
            else:
                data = browser.find_element_by_class_name(class_name).text
        except:
            data = np.nan
        return data

The data will be stored in a pandas DataFrame. Therefore, for convenience, it was chosen to extract individual wine data in the form of a Python dictionary, that will be appended to the target DataFrame one by one. 

In [59]:
def extract_data_as_dict(browser):
    """
    extract certain data from a browser page to a dictionary
    """
    data_dict = {}
    data_dict["wine_name"] = extract_data_with_exceptions(browser, "vintage")     # extract wine name 
    data_dict["winery"] = extract_data_with_exceptions(browser, "winery")     # extract winery
    data_dict["wine_type"] = extract_data_with_exceptions(browser, "wineLocationHeader__wineType--14nrC")        # extract wine type
    facts_list = extract_data_with_exceptions(browser, "wineFacts__fact--3BAsi", multiple_elements=True)     # extract facts about grapes and wine style
    data_dict["grapes"] = facts_list[1] if len(facts_list) > 1 else np.nan         # extract grapes
    data_dict["wine_style"] = facts_list[3] if len(facts_list) > 3 else np.nan         # extract wine style
    data_dict["region"] = extract_data_with_exceptions(browser, "anchor__anchor--3DOSm")         # extract region 
    data_dict["country"] = extract_data_with_exceptions(browser, "wineLocationHeader__country--1RcW2")       # extract country
    data_dict["reviews"] = extract_data_with_exceptions(browser, "vivinoRatingWide__basedOn--s6y0t")        # extract the number of reviews
    data_dict["price"] = extract_data_with_exceptions(browser, "purchaseAvailabilityPPC__amount--2_4GT")      # extract the average price 
    data_dict["score"] = extract_data_with_exceptions(browser, "vivinoRatingWide__averageValue--1zL_5")         # extract the rating score
    reviews_list = extract_data_with_exceptions(browser, "reviewCard__reviewNote--fbIdd", multiple_elements=True)       # extract 3 reviews
    data_dict["review_1"] = reviews_list[0] if len(reviews_list) > 0 else np.nan
    data_dict["review_2"] = reviews_list[1] if len(reviews_list) > 1 else np.nan
    data_dict["review_3"] = reviews_list[2] if len(reviews_list) > 2 else np.nan
    taste_list = extract_data_with_exceptions(browser, "indicatorBar__progress--3aXLX", multiple_elements=True, get_style=True)    # extract taste bar
    data_dict["taste_light_bold"] = taste_list[0] if len(taste_list) > 0 else np.nan
    data_dict["taste_smooth_tannic"] = taste_list[1] if len(taste_list) > 1 else np.nan
    data_dict["taste_dry_sweet"] = taste_list[2] if len(taste_list) > 2 else np.nan
    data_dict["taste_soft_acidic"] = taste_list[3] if len(taste_list) > 3 else np.nan
    keyword_list = extract_data_with_exceptions(browser, "tasteNote__popularKeywords--1q7RG", multiple_elements=True)        # extract keywords
    data_dict["keywords_1"] = keyword_list[0] if len(keyword_list) > 0 else np.nan
    data_dict["keywords_2"] = keyword_list[1] if len(keyword_list) > 1 else np.nan
    data_dict["keywords_3"] = keyword_list[2] if len(keyword_list) > 2 else np.nan
    return data_dict

In [60]:
df_25_wines = pd.DataFrame()
browser = initialize_chrome_driver()

for page in wine_pages_list:
    scroll_until_wine_page_bottom(browser, page, page_end_class="addWidgetLink__addWidgetLink--aPZ_V")
    cur_wine_data = extract_data_as_dict(browser)
    df_25_wines = df_25_wines.append(cur_wine_data, ignore_index=True)

In [61]:
df_25_wines.head()

Unnamed: 0,country,grapes,keywords_1,keywords_2,keywords_3,price,region,review_1,review_2,review_3,reviews,score,taste_dry_sweet,taste_light_bold,taste_smooth_tannic,taste_soft_acidic,wine_name,wine_style,wine_type,winery
0,United States,Cabernet Sauvignon,"tobacco, oak, vanilla","blackberry, dark fruit,...","leather, cocoa, earthy",£185.40,Rutherford,,,,81 ratings,4.9,width: 15%; left: 1.48415%;,width: 15%; left: 85%;,width: 15%; left: 49.9412%;,width: 15%; left: 50.714%;,Patriarch 2012,Napa Valley Cabernet Sauvignon,Red wine,Frank Family
1,France,Pinot Noir,,,,£1504.68,Échezeaux Grand Cru,,,,32 ratings,4.9,,,,,Échezeaux Grand Cru 1986,Burgundy Côte de Nuits Red,Red wine,Domaine de La Romanée-Conti
2,United States,100% Cabernet Sauvignon,"oak, tobacco, vanilla","blackberry, cassis, bla...","leather, earthy, minera...",£562.68,Napa Valley,,,,60 ratings,4.9,width: 15%; left: 12.4875%;,width: 15%; left: 85%;,width: 15%; left: 51.1701%;,width: 15%; left: 52.7505%;,Few and Far Between Cabernet Sauvignon 2013,Napa Valley Cabernet Sauvignon,Red wine,Hundred Acre
3,United States,Cabernet Sauvignon,"black fruit, blackberry...","vanilla, cola, baking s...","flint, minerals",£138.28,Oakville,,,,59 ratings,4.9,width: 15%; left: 5.32852%;,width: 15%; left: 85%;,width: 15%; left: 50.0542%;,width: 15%; left: 39.0767%;,Echion Cabernet Sauvignon 2014,Napa Valley Cabernet Sauvignon,Red wine,Amici
4,United States,100% Grenache,"vanilla, oak, cola","pepper, anise, licorice","pomegranate, cherry col...",£625,Sta. Rita Hills,,,,58 ratings,4.9,,,,,Rattrapante Grenache 2012,15.6%,Red wine,Sine Qua Non


Now when we know that the web scraper works correctly, we might want to apply it to a bigger subset of wines at the same time (eg. about 20k red wines that were saved in the script above).

However, once the script was run, it managed to load data about app. 340 wines before it triggered the bulk request limits on vivino.com.

The following message was thrown: 
*Your IP address (185.192.69.14) has been temporarily blocked for exceeding bulk request limits. If you believe this was done in error or you have legitimate needs to access our pages and data above and beyond these limits please contact admin@vivino.com with the subject 'Requests Blocked' and we'll try and resolve the issue.*

Still, for documentation purposes, the below code is kept here, and might be un-commented if needed. 

In [62]:
# wine_df_new = pd.DataFrame()

# browser = initialize_chrome_driver()

# start_time = time.time()
# for page in list_with_scroll:
#     scroll_until_the_end_of_wine_page(browser, page, page_end_class="addWidgetLink__addWidgetLink--aPZ_V")
#     cur_wine_data = extract_data_as_dict(browser)
#     wine_df_new = wine_df_new.append(cur_wine_data, ignore_index=True)
# end_time = time.time()

# print(wine_df_new)
# print("time elapsed: " + str(end_time - start_time) + " s.")

# wine_df_new.to_csv('red_wine_highest_score.csv')

### 3. Using Vivino API

#### Wine data overview

During a more detailed review of vivino request/response pairs, in Dev tools of vivino page it was discovered that vivino has its own internal API with various endpoints (including explore), it returns a structured JSON string. The GET request to explore API sent by JavaScript looks as follows:
https://www.vivino.com/api/explore/explore?country_code=GB&currency_code=GBP&grape_filter=varietal&min_rating=1&order_by=ratings_average&order=desc&page=4&price_range_max=400&price_range_min=0&wine_type_ids[]=1

From the format of the request sent to vivino servers, it can be seen that the following arguments are passed to the request to explore page: 
* `country_code`
* `currency_code`
* `grape_filter`
* `min_rating`
* `order_by`
* `order`
* `page`
* `price_range_max`
* `price_range_min`
* `wine_type_ids[]`

It means that certain filters can be applied to specify the results, and since the data is loaded by pages, we can iterate by page numbers to get information on the whole subset.

We will simply use `requests` library and pass the headers specifying that json is expected in return (otherwise, the request might throw IP blocking for bulk request).

First, we check which data can be retrieved (wine data & reviews) using this method.

In [63]:
test_page = "https://www.vivino.com/api/explore/explore?country_code=GB&currency_code=GBP&grape_filter=varietal&min_rating=1&order_by=ratings_average&order=desc&page=4&price_range_max=400&price_range_min=0&wine_type_ids[]=1"

headers_api = {
    'Accept': 'application/json',
    'Content-Type': 'application/json',
    'User-Agent': 'python/requests',
}

response = requests.get(test_page, headers=headers_api)
test_result = response.content

Since vivino.com server returns a JSON string, we might want to convert it to a Python-readable format. We will use `json` library for this task that converts string into a Python dictionary. 

In [4]:
import json 
json_obj = json.loads(test_result)

NameError: name 'test_result' is not defined

In [65]:
test_df = pd.DataFrame(json_obj['explore_vintage']['matches'])
test_df.shape

(25, 3)

We can see that a single page request yields 25 wines. We need to 'unroll' the vintage column which is by itself a dictionary. 
`pd.json_normalize()` fits perfectly for that purposes.
The results are stored in the `new_test_df`.

In [66]:
new_test_df = pd.json_normalize(test_df['vintage'])

In [67]:
new_test_df.head(2)

Unnamed: 0,id,seo_name,name,year,grapes,has_valid_ratings,statistics.status,statistics.ratings_count,statistics.ratings_average,statistics.labels_count,...,wine.style.region.country.wineries_count,wine.style.region.country.most_used_grapes,wine.style.region.background_image.location,wine.style.region.background_image.variations.large,wine.style.region.background_image.variations.medium,wine.has_valid_ratings,top_list_rankings,wine.region.background_image,wine.style.background_image,wine.style.region
0,1438498,vega-sicilia-unico-1970,Vega Sicilia Unico 1970,1970,,True,Normal,379,4.8,919,...,16721.0,"[{'id': 19, 'name': 'Tempranillo', 'seo_name':...",//images.vivino.com/regions/backgrounds/YX2wax...,//thumbs.vivino.com/region_backgrounds/YX2waxK...,//thumbs.vivino.com/region_backgrounds/YX2waxK...,True,,,,
1,1232825,domaine-de-la-romanee-conti-la-tache-grand-cru...,Domaine de La Romanée-Conti La Tâche Grand Cru...,2000,,True,Normal,368,4.8,2204,...,65187.0,"[{'id': 14, 'name': 'Pinot Noir', 'seo_name': ...",//images.vivino.com/regions/backgrounds/oTEcWw...,//thumbs.vivino.com/region_backgrounds/oTEcWwU...,//thumbs.vivino.com/region_backgrounds/oTEcWwU...,True,"[{'rank': 1, 'previous_rank': 1, 'description'...",,,


We can see that a bunch of useful data is stored in 'wine' and 'statistics' columns. 

#### Reviews data overview

From the looks of the API request we can see that it takes the following arguments:
* wine ID
* year
* number of reviews per page

We test sending request to reviews API, and it also results in a JSON string with requested information.

In [68]:
test_reviews_page = "https://www.vivino.com/api/wines/83496/reviews?year=2016&per_page=15"
response = requests.get(test_reviews_page, headers=headers_api)
reviews_test_result = response.content

In [69]:
json_obj = json.loads(reviews_test_result)
reviews_test_df = pd.DataFrame(json_obj['reviews'])

In [70]:
reviews_test_df.head(2)

Unnamed: 0,id,rating,note,language,created_at,aggregated,user,vintage,activity,flavor_word_matches,tagged_note
0,125161577,4.0,"Deep flavor. Strong notes of black fruit, toba...",en,2019-04-27T05:12:23.000Z,True,"{'id': 34010037, 'seo_name': 'christopher.sugo...","{'id': 91466354, 'seo_name': 'joel-gott-cabern...","{'id': 325755121, 'statistics': {'likes_count'...","[{'id': 39, 'match': 'black fruit'}, {'id': 15...","Deep flavor. Strong notes of black fruit, toba..."
1,111290859,4.0,Oak and berries on the nose with vanilla backg...,en,2018-12-03T04:11:47.000Z,True,"{'id': 8907902, 'seo_name': 'mike-herma', 'ali...","{'id': 91466354, 'seo_name': 'joel-gott-cabern...",,"[{'id': 292, 'match': 'oak'}, {'id': 434, 'mat...",Oak and berries on the nose with vanilla backg...


### Actual data extraction

According to the email from Birkir Barkarson (the CTO of vivino), the requests should be limited to 1000 per 10 minute window in order to avoid major interrution to their servers and resulting IP blockerage. 

Therefore, the `ratelimiter` is set accordingly. It will respect the limit of 1 request per second.

In [5]:
from ratelimiter import RateLimiter

#### Wine data extraction

By default, API would return a JSON string that can be further converted to a Python object using `json.loads()` method.

As is, JSON string returned from explore API has the following root structure:
```
{
"e": ...,
"explore_vintage": {
                    "bottle_type_errors": ...,
                    "market": ..., 
                    "matches": [{
                                 price: ...,
                                 prices: ...,
                                 vintage: ...
                                },
                                {},
                                ...
                                ],
                    "records": ...,
                    "records_matched": ...
                    },
"selected_filters": ...
}
```

From the above structure we can see that the list with results needed for the current research are stored in `explore_vintage/matches` (including both the wine data, and its market price), while the overall number of records meeting the search criteria are stored in `explore_vintage/records_matched`. We won't be needing any other information returned from explore API. 

Since we'll be running the below script iteratively, the function is written in a way to accept a list as an argument, and append the new results to that list. If no list is provided, the function would create a clean one, and append results to it. 

In [7]:
# extracting data to JSON object

@RateLimiter(max_calls=1, period=1)
def get_wine_json(page, headers, matches=[]):
    """
    Function extracting total number of records and JSON list from vivino API explore endpoint
    """
    response = requests.get(page, headers=headers)
    json_str = response.content
    json_obj = json.loads(json_str)
    records_num = json_obj['explore_vintage']['records_matched']
    for match in json_obj['explore_vintage']['matches']:
        matches.append(match)
    return records_num, matches

For quick overview, it might be also convenient to store data from a JSON string in a pandas DataFrame. This can be done either by a built-in [pandas method](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_json.html) or by our custom method `extract_json_to_df` provided below. 

In [8]:
# extracting data to a pandas DataFrame

def extract_json_to_df(df, json_str):
    """
    Function that converts json string containing vivino data on vintages to a pandas DataFrame
    """
    json_obj = json.loads(json_str)
    for wine in json_obj['explore_vintage']['matches']:
        df = df.append(wine, ignore_index=True)
#     df = pd.json_normalize(df['vintage'])  # this flattens json into a single wide table 
    return df

@RateLimiter(max_calls=1, period=1)
def get_wine_df(page, headers, df=pd.DataFrame()):
    """
    Function extracting DataFrame from vivino API explore endpoint
    """
    response = requests.get(page, headers=headers)
    json_str = response.content
    df = extract_json_to_df(df, json_str)
    return df

In order to check the above functions, we are sending a couple of GET requests to explore API endpoint passing the following parameters: 
* `country_code=GB` in order to make sure the results are returned in English, and given that the research is performed in the UK 
* `currency_code=GBP` for consistency purposes
* `grape_filter=varietal` to allow for all grapes
* `min_rating=1` to allow for any ratings
* `order_by=ratings_average` and `order=desc` to load the popular wines first
* `page={}` a variable that we are iterating upon
* `per_page=100` that was deduced to be the highest possible value for that parameter
* `price_range_max=400` the highest possible value
* `price_range_min=0` the lowest possible value

It should be noted that those wines for which vivino.com does not have any market data, price is considered zero for the purpose of filtering. Therefore, price filters of [0,400] will return considerably more results than [1,400] most of which will not contain any information about the price. 

Also, wines with a price exceeding 400 are considered to have the price of 400 for the purpsoe of filtering.

Later on, in order to count the number of results, we avoid setting the price parameters to [0,400] in order to make sure that all wine records include at least some information about their price. 

In the code below we are experimenting to load only 5 pages (therefore, iterating 5 items, and appending new results to the existing JSON object and/or DataFrame on each iteration). 

In [74]:
#experimenting with data extraction using rate limiter
subset_df = pd.DataFrame()
subset_list = []

headers_api = {
    'Accept': 'application/json',
    'Content-Type': 'application/json',
    'User-Agent': 'python/requests',
}

for i in range(1,6):  # here, the number of required pages was set to 5 in order to extract more data, the upper bound can be increased if needed
    page = "https://www.vivino.com/api/explore/explore?country_code=GB&currency_code=GBP&grape_filter=varietal&min_rating=1&\
    order_by=ratings_average&order=desc&page={}&per_page=100&price_range_max=400&price_range_min=0".format(i)
    subset_df = get_wine_df(page, headers_api, subset_df)
    _, subset_list = get_wine_json(page, headers_api, subset_list)

In [75]:
subset_df = pd.json_normalize(subset_df['vintage'])

In [76]:
print(f"The resulting DataFrame has the following shape: {subset_df.shape}")
print(f"The DataFrame contains {subset_df['id'].nunique()} unique results")
subset_df.head(2)

The resulting DataFrame has the following shape: (500, 129)
The DataFrame contains 500 unique results


Unnamed: 0,id,seo_name,name,year,grapes,has_valid_ratings,statistics.status,statistics.ratings_count,statistics.ratings_average,statistics.labels_count,...,wine.taste.structure.calculated_structure_count,wine.style.background_image.location,wine.style.background_image.variations.small,wine.region,wine.winery,wine.style,wine.region.background_image,wine.style.region,wine.style.region.background_image,top_list_rankings
0,6803795,real-companhia-velha-colheita-port-1944,Real Companhia Velha Colheita Port 1944,1944,,True,Normal,35,5.0,335,...,,,,,,,,,,
1,3595824,chateau-d-yquem-sauternes-sauternes-dessert-wi...,Château d'Yquem Sauternes 1945,1945,,True,Normal,30,5.0,99,...,1782.0,,,,,,,,,


From the above experiment, we can see that the data is returned correctly. Indeed, the program loads the number of unique items equal to 100 multiplied by the number of pages requested. 

However, after trying to load a higher number of pages using the same script, it was discovered that data is not returned correctly after page #81 (meaning, all requests to page number 81 and above return the same results). I assume that either this is a protective feature against scrapers, or the backend on vivino side is not working properly for requests to pages exceeding 80.

Vivino contains about 55 thousand wine records with information about their prices. Loading those results with 100 records per page, with the current filter criteria, would require iterating for at least 550 pages. Therefore, I need to specify the filtering criteria in a way to avoid approaching page number 81. 

One of the possible solutions is to divide the whole dataset into 400 small subsets based on their price filters. Like this, we'll load the data iteratively for each narrow price range (apparently, none of them contains more than 8000 records, so, the paging problem would be solved).

In [9]:
import math

In [10]:
# slightly extended headings list to make sure not getting blocked during loading data

headers_browser = {
    'Accept': 'application/json',
    'Content-Type': 'application/json',
    'Accept-Encoding': 'gzip, deflate, br',
    'Accept-Language': 'en-GB,en-US;q=0.9,en;q=0.8',
    'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.183 Safari/537.36',
    'Set-Fetch-Dest': 'empty',
    'Sec-Fetch-Mode': 'cors',
    'Sec-Fetch-Site': 'same-origin',
    'X-Requested-With': 'XMLHttpRequest'
}

In [11]:
def extract_wines_to_json(price_min, price_max, write_intermediate_backup=False, write_final_backup=False):
    """
    Function that iterates over small price ranges to extract all the data within a given range between min price and max price. 
    If necessary, the function may store intermediate backups and/or a final backup inside pickle files. 
    
    """

    page_template = 'https://www.vivino.com/api/explore/explore?country_code=GB&currency_code=GBP&grape_filter=varietal&\
    min_rating=1&order_by=ratings_average&order=desc&page={}&per_page=100&price_range_min={}&price_range_max={}'

    records_total, _ = get_wine_json(page_template.format(1, price_max, max(price_min,1)), headers_browser)  # here, we consider price min to be at least 1 to avoid inaccurate results 
    print(f"The program recognized app.{records_total} unique records with a price from {price_min} to {price_max} and will run for app.{math.ceil(records_total/100)} iterations in total")

    timepoint_start = time.time()
    max_records_bunch = 0       # piece of code necessary to write intermediate backups in the process

    for i in range(price_min, price_max):
        if i == price_min:
            timepoint_iter = timepoint_start
            match_list = []

        cur_price_records_num, _ = get_wine_json(page_template.format(1, price_min + 1, price_min), headers_browser)
        iterations_required = math.ceil(cur_price_records_num/100)  # to make sure the necessary number of iterations to catch all of the records with a given price 
        max_records = iterations_required * 100     # piece of code necessary to write intermediate backups in the process
        max_records_bunch += max_records         # piece of code necessary to write intermediate backups in the process
        for it in range(1, iterations_required + 1):
            cur_page = page_template.format(it, i+1, i)
            _, match_list = get_wine_json(cur_page, headers_browser, match_list)

            if i % 10 == 0 and it == iterations_required:
                
                if write_intermediate_backup:
                    with open(f"backup_data/match_list_{i-1}", 'wb') as f:          # piece of code necessary to write intermediate backups in the process
                        pickle.dump(match_list[-max_records_bunch:], f)        # piece of code necessary to write intermediate backups in the process
                max_records_bunch = 0           # piece of code necessary to write intermediate backups in the process
                time_elapsed_bunch = time.time() - timepoint_iter
                timepoint_iter = time.time()
                print(f"Bunch of records with price up to {i+1} uploaded and took {round(time_elapsed_bunch/60, 2)} minutes. Next, running prices from {i+1}...")

    full_match_list = match_list
    if write_final_backup:
        with open(f"backup_data/full_match_list", 'wb') as f:
            pickle.dump(full_match_list, f)
    timepoint_end = time.time()
    time_elapsed = timepoint_end - timepoint_start
    print(f"Program finished and successfully uploaded {len(full_match_list)} wine records with prices(need to check duplicates). \
    The program took app. {round(time_elapsed/60, 2)} minutes to run")
    
    return full_match_list

In [80]:
test_list = extract_wines_to_json(100, 105)

The program recognized app.321 unique records with a price from 100 to 105 and will run for app.4 iterations in total
Bunch of records with price up to 101 uploaded and took 0.08 minutes. Next, running prices from 101...
Program finished and successfully uploaded 321 wine records with prices(need to check duplicates).     The program took app. 0.43 minutes to run


We can see that the code works correctly and can be run for the desired price range (in our case, from 0 to 400). 
The loading was performed on November 13, 2020, and returned 55839 records. During the process, it stored intermediary and final backup files inside a pickle file. 

Further in this notebook, for the sake of time efficiency, the code should not be re-run. Instead, the resulting data can be extracted from the backup. 

In [12]:
with open(f"backup_data/full_match_list", 'rb') as f:
    recovered_data = pickle.load(f)
    
len(recovered_data)

55839

Since the code took some time to run, we can not exclude that certain records might be duplicated (if, for example, some of their parameters were changed, and they appeared in more than one page). We would like to specifically exclude data duplicates, which is done in the function below. This function will return the wine record with the last occurence of specific vintage id. 

After removing duplicates, we get 55819 unique wine records. 

In [13]:
def remove_wine_duplicates(json_data):
    distinct_dict = {entry['vintage']['id']: entry for entry in json_data}
    recovered_data_distinct = distinct_dict.values()
    return list(recovered_data_distinct)

recovered_data_distinct = remove_wine_duplicates(recovered_data)
len(recovered_data_distinct)

55819

#### Extracting reviews

Results extracted from Explore API include 55819 unique vintages belonging to 29811 unique wines.

There are the following ways to get information about reviews: 
* **based on wine IDs**

Standard request to API looks as follows: 
`https://www.vivino.com/api/wines/1324035/reviews?year=2013&per_page=10&page=1`
To do this, we need to prepare a subset of wines with their id's, vintage years and respective number of reviews (that can be found in the explore results), after which send iterative GET requests for each of them. This will return JSON files structured as follows:
```
{
"reviews": [{"activity": {"id": 331298026, 
                          "statistics": {"likes_count": 58, 
                                         "comments_count": 3}
                          },
             "aggregated": true,
             "created_at": "2019-05-18T21:36:04.000Z",
             "id": 127250463,
             "language": "en",
             "note": "Excellent red! Complex blend of Primitivo, Sangiovese, Negroamaro and plus 2 more. Deep. Complex. Excellent balance!4.5* Russian- in comments ",
             "rating": 4.5,
             "tagged_note": "Excellent red! Complex blend of Primitivo, Sangiovese, Negroamaro and plus 2 more. Deep. Complex. Excellent balance!4.5* Russian- in comments ",
             "user": {"alias": "Alexandre Kondrashov YPO",
                      "background_image": null,
                      "id": 1183892,
                      "is_featured": false,
                      "seo_name": "alexandre_ko",
                      "statistics": {"followers_count": 2040, 
                                     "followings_count": 2274, 
                                     "ratings_count": 3085, 
                                     "ratings_sum": 11981,
                                     "reviews_count": 2727
                                     },
                      "visibility": "all"
                      },
             "vintage": {"id": 150615005, 
                         "seo_name": "farnese-edizione-cinque-autoctoni-2017",…}
             },
             {...},
             ...]
}
```

* **based on user IDs**

To do this, first we need to prepare a list of interesting users on vivino.com (one way to do this is by looking into the country rankings). POST request to `https://www.vivino.com/users/anthony_rob/country_rankings` with the respective form data (page number and country code) and (if necessary) a valid x-csrf-token, returns a JavaScript code that contains data about top users (i.e. links to their pages) returned by batches of 10. 
Once the links to the user pages are collected, we need to extract the user id (for example, by looking at the following web element on the user page: `<meta content='vivino://?user_id=2930225' name='twitter:app:url:iphone'>`). Once the list of user IDs is ready, information about their reviews can be extracted in a similar way: sending GET request to `/users/4461180/activities?limit=10&start_from_id=0&_=1605868240796`. This request returns a JavaScript code with the list of reviews. 

* **based on activity IDs**

For this we need to find activity IDs that we're interested in. The request will look as follows: 
`https://www.vivino.com/api/activities/489116864/comments?per_page=3`

First, we'll try to extract reviews based on wine IDs, that seems to be the quickest and most reliable method. In order to do this, we'll prepare a table of vintages (with information about wine IDs, vintage years and country). Then, we'll split the table into batches based on different countries and years. First, we need to overview the amount of reviews in each batch.

In [1]:
import requests
import json
import pandas as pd
import pickle
from ratelimiter import RateLimiter
import math

In [2]:
def remove_wine_duplicates(json_data):
    distinct_dict = {entry['vintage']['id']: entry for entry in json_data}
    recovered_data_distinct = distinct_dict.values()
    return list(recovered_data_distinct)

In [38]:
with open(f"backup_data/full_match_list", 'rb') as f:
    recovered_data = pickle.load(f)
    
recovered_data_distinct = remove_wine_duplicates(recovered_data)
full_df = pd.DataFrame(recovered_data_distinct)
full_df_normalized = pd.json_normalize(full_df['vintage'])

In [39]:
full_df = None
recovered_data_distinct = None
recovered_data = None

In [4]:
full_df_normalized.head(2)

Unnamed: 0,id,seo_name,name,year,grapes,has_valid_ratings,statistics.status,statistics.ratings_count,statistics.ratings_average,statistics.labels_count,...,wine.style.region,wine.style.background_image.location,wine.style.background_image.variations.small,wine.taste.structure,wine.style,wine.region.background_image,wine.style.region.background_image,top_list_rankings,wine.region,wine.winery
0,111604237,esporao-alandra-tinto-2016,Esporão Alandra Tinto 2016,2016,,True,Normal,2142,3.2,17293,...,,,,,,,,,,
1,7290004,bacalhoa-vinhos-de-portugal-alentejano-monte-d...,Bacalhôa Alentejano Monte das Ânforas Tinto 2014,2014,,True,Normal,755,3.4,3737,...,,,,,,,,,,


In [5]:
full_df_normalized.iloc[:,:40].columns

Index(['id', 'seo_name', 'name', 'year', 'grapes', 'has_valid_ratings',
       'statistics.status', 'statistics.ratings_count',
       'statistics.ratings_average', 'statistics.labels_count',
       'image.location', 'image.variations.bottle_large',
       'image.variations.bottle_medium',
       'image.variations.bottle_medium_square',
       'image.variations.bottle_small', 'image.variations.bottle_small_square',
       'image.variations.label', 'image.variations.label_large',
       'image.variations.label_medium', 'image.variations.label_medium_square',
       'image.variations.label_small_square', 'image.variations.large',
       'image.variations.medium', 'image.variations.medium_square',
       'image.variations.small_square', 'wine.id', 'wine.name',
       'wine.seo_name', 'wine.type_id', 'wine.vintage_type', 'wine.is_natural',
       'wine.region.id', 'wine.region.name', 'wine.region.name_en',
       'wine.region.seo_name', 'wine.region.country.code',
       'wine.region.count

Now we need to choose specific columns that can help us with grouping reviews: 
* `id` - ID of vintage
* `year` - year of vintage
* `has_valid_ratings` - should be set to True
* `statistics.ratings_count` - the number of reviews for this particular vintage
* `wine.id` - the ID of a wine
* `wine.region.country.name` - country name of a wine

In [6]:
review_filtering_df = full_df_normalized[['id', 'year', 'has_valid_ratings', 'statistics.ratings_count', 'wine.id', 'wine.region.country.name']]

In [7]:
filter_df = review_filtering_df.loc[review_filtering_df['has_valid_ratings']==True, ['id', 'year', 'statistics.ratings_count',
       'wine.id', 'wine.region.country.name']]

In [16]:
filter_df.columns = ['id', 'year', 'rating_count', 'wine_id', 'country']

In [19]:
filter_df.head(5)

Unnamed: 0,id,year,rating_count,wine_id,country
0,111604237,2016,2142,1105374,Portugal
1,7290004,2014,755,1706071,Portugal
2,156234290,2018,349,4269600,Portugal
3,156633179,2018,235,1200770,Portugal
4,158257747,2018,198,4269602,Portugal


In [20]:
review_filtering_df = pd.DataFrame()
full_df_normalized = pd.DataFrame()
full_df = pd.DataFrame()

In [21]:
filter_gb = filter_df.groupby(['country', 'year']).agg({'id': 'count', 'rating_count': 'sum'})

In [22]:
filter_gb['average_ratings'] = filter_gb['rating_count'] / filter_gb['id']
filter_gb.sort_values('rating_count', ascending=False)[:15]
# filter_gb.sort_values('average_reviews', ascending=False)[:15]

Unnamed: 0_level_0,Unnamed: 1_level_0,id,rating_count,average_ratings
country,year,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1
France,N.V.,282,947143,3358.663121
Italy,2015,1261,487043,386.235527
France,2016,2837,473987,167.073317
France,2015,2456,470415,191.537052
Italy,2016,1207,452299,374.729909
France,2017,2605,385098,147.830326
Italy,2017,1244,378245,304.055466
Italy,2018,1518,370566,244.114625
France,2018,2237,351309,157.044703
Spain,2016,814,351233,431.490172


From the above breakdown we can see that vintages from France and Italy appear to have the highest total number of reviews.

Now, we'll look individually to wines from Italy to decide which year should be analyzed first.

In [23]:
filter_gb.loc['Italy'].sort_values('rating_count', ascending=False).head(10)

Unnamed: 0_level_0,id,rating_count,average_ratings
year,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1
2015,1261,487043,386.235527
2016,1207,452299,374.729909
2017,1244,378245,304.055466
2018,1518,370566,244.114625
N.V.,123,340017,2764.365854
2013,707,322209,455.74116
2014,641,289794,452.096724
2012,498,202827,407.283133
2010,344,134182,390.063953
2011,340,132066,388.429412


We'll start with extracting wines from Italy having a year 'N.V.' (meaning it was made from grapes of various years), since those have the highest average number of reviews, and it can allow us to get more quick results.

First, we'll prepare a DataFrame containing information about wine ID's of those wines from year 'N.V.' and the number of vintage reviews.

In [24]:
data_Italy_NV = filter_df[(filter_df['country'] == 'Italy') & (filter_df['year'] == 'N.V.')].sort_values('rating_count', ascending=False)
print(data_Italy_NV['wine_id'].nunique())
data_Italy_NV.head(10)

123


Unnamed: 0,id,year,rating_count,wine_id,country
28460,1469676,N.V.,49646,21646,Italy
847,164942584,N.V.,44563,1155726,Italy
21598,6401709,N.V.,21159,2459848,Italy
33191,164943083,N.V.,20249,1191278,Italy
2684,164942602,N.V.,16739,4022142,Italy
20615,1749818,N.V.,16728,1367887,Italy
2686,1538925,N.V.,15277,1236719,Italy
7068,164942605,N.V.,13380,1377390,Italy
7069,1471106,N.V.,9673,1134756,Italy
33194,164943133,N.V.,8451,4381953,Italy


In [25]:
with open("backup_data/reviews/Italy_NV", "rb") as f:
    reviews_Italy_NV = pickle.load(f)

In [29]:
reviews_Italy_NV_df = pd.json_normalize(reviews_Italy_NV)

In [30]:
reviews_Italy_NV_df.head()

Unnamed: 0,id,rating,note,language,created_at,aggregated,tagged_note,user.id,user.seo_name,user.alias,...,user.image.variations.small_square,flavor_word_matches,vintage.wine.statistics.status,vintage.wine.statistics.ratings_count,vintage.wine.statistics.ratings_average,vintage.wine.statistics.labels_count,vintage.wine.statistics.vintages_count,vintage.wine.vintage_mask,vintage.wine.region.background_image,user.image
0,183949523,4.5,\n,un,2020-11-20T17:09:08.000Z,True,,39504912,lennart.kanth,Lennart Kanth,...,,,,,,,,,,
1,183934246,5.0,적당한 탄닌 적은 산미 오크향이 가득\n좋아요! 미디엄풀바디 되는것 같아요 oak,ko,2020-11-20T13:57:09.000Z,True,적당한 탄닌 적은 산미 오크향이 가득\n좋아요! 미디엄풀바디 되는것 같아요 oak,28254423,yuri_kim2,yuri kim,...,,,,,,,,,,
2,183928468,2.5,Too fruity,un,2020-11-20T12:21:13.000Z,True,Too fruity,27394194,jaehyeon.jun,JaeHyeon Jun,...,//thumbs.vivino.com/avatars/r4Y5Kax7Rieq71UmE0...,,,,,,,,,
3,183874762,4.5,"Cinnamon and cherry on the nose, and full of f...",en,2020-11-19T19:50:13.000Z,True,"Cinnamon and cherry on the nose, and full of f...",13603746,tor-sta,Tor Sta,...,,"[{'id': 93, 'match': 'cherry'}, {'id': 105, 'm...",,,,,,,,
4,183812835,5.0,야스,un,2020-11-18T23:34:40.000Z,True,야스,43962096,i-jinho1,이 진호,...,,,,,,,,,,


In [None]:
# create a function that eliminates reviews which have wrong vintage ID or are duplicated 

In [93]:
reviews_Italy_NV_short = reviews_Italy_NV_df[['id', 'rating', 'note', 'language', 'created_at', 'user.id', 'user.seo_name', 'vintage.id', 'vintage.wine.id']]

In [87]:
reviews_Italy_NV_users = reviews_Italy_NV_df[['user.id', 'user.seo_name', 'user.alias',
       'user.is_featured', 'user.visibility', 'user.image.location',
       'user.image.variations', 'user.statistics.followers_count',
       'user.statistics.followings_count', 'user.statistics.ratings_count',
       'user.statistics.ratings_sum', 'user.statistics.reviews_count',
       'user.background_image']]

In [89]:
reviews_Italy_NV_users.head()

Unnamed: 0,user.id,user.seo_name,user.alias,user.is_featured,user.visibility,user.image.location,user.image.variations,user.statistics.followers_count,user.statistics.followings_count,user.statistics.ratings_count,user.statistics.ratings_sum,user.statistics.reviews_count,user.background_image
0,39504912,lennart.kanth,Lennart Kanth,False,all,//images.vivino.com/avatars/default_user.png,,1.0,0.0,12.0,45.0,2.0,
1,28254423,yuri_kim2,yuri kim,False,all,//images.vivino.com/avatars/default_user.png,,0.0,0.0,23.0,90.0,22.0,
2,27394194,jaehyeon.jun,JaeHyeon Jun,False,all,//images.vivino.com/avatars/r4Y5Kax7Rieq71UmE0...,,0.0,0.0,15.0,54.0,13.0,
3,13603746,tor-sta,Tor Sta,False,all,//images.vivino.com/avatars/default_user.png,,1.0,0.0,57.0,221.5,52.0,
4,43962096,i-jinho1,이 진호,False,all,//images.vivino.com/avatars/default_user.png,,2.0,0.0,14.0,60.0,12.0,


In [90]:
reviews_Italy_NV_users['user.id'].nunique()

73641

In [68]:
reviews_Italy_NV_df.columns

Index(['id', 'rating', 'note', 'language', 'created_at', 'aggregated',
       'tagged_note', 'user.id', 'user.seo_name', 'user.alias',
       'user.is_featured', 'user.visibility', 'user.image.location',
       'user.image.variations', 'user.statistics.followers_count',
       'user.statistics.followings_count', 'user.statistics.ratings_count',
       'user.statistics.ratings_sum', 'user.statistics.reviews_count',
       'user.background_image', 'vintage.id', 'vintage.seo_name',
       'vintage.name', 'vintage.statistics.status',
       'vintage.statistics.ratings_count',
       'vintage.statistics.ratings_average', 'vintage.statistics.labels_count',
       'vintage.statistics.reviews_count', 'vintage.organic_certification_id',
       'vintage.certified_biodynamic', 'vintage.image.location',
       'vintage.image.variations.bottle_large',
       'vintage.image.variations.bottle_medium',
       'vintage.image.variations.bottle_medium_square',
       'vintage.image.variations.bottle_smal

In [32]:
reviews_Italy_NV_df['vintage.wine.id'].nunique()

123

In [34]:
reviews_Italy_NV_df['vintage.id'].nunique()

1095

In [66]:
reviews_Italy_NV_df.shape

(95543, 100)

In [69]:
review_count_Italy_NV = reviews_Italy_NV_df[['vintage.wine.id', 'vintage.id', 'vintage.seo_name', 'id']].groupby(['vintage.wine.id', 'vintage.id', 'vintage.seo_name'])\
.agg({'id':'count'})
review_count_Italy_NV.head()

Unnamed: 0_level_0,Unnamed: 1_level_0,Unnamed: 2_level_0,id
vintage.wine.id,vintage.id,vintage.seo_name,Unnamed: 3_level_1
4965,5605,ferrari-demi-sec-2006,3
4965,1170792,ferrari-demi-sec-2008,8
4965,1441679,ferrari-demi-sec-2010,4
4965,1606317,ferrari-demi-sec-2012,3
4965,1741811,ferrari-demi-sec-nv,227


In [65]:
review_count_Italy_NV['id'].sum()

95543

In [53]:
wines_main_df = full_df_normalized[['id', 'name', 'year', 'wine.id', 'wine.name', 'wine.vintage_type', 'wine.region.country.name', 'statistics.ratings_count']]

In [56]:
wines_Italy_NV = wines_main_df[(wines_main_df['year'] == 'N.V.') & (wines_main_df['wine.region.country.name'] == 'Italy')]
wines_Italy_NV

Unnamed: 0,id,name,year,wine.id,wine.name,wine.vintage_type,wine.region.country.name,statistics.ratings_count
45,2339584,Riunite Lambrusco Emilia Rosé,N.V.,1583687,Lambrusco Emilia Rosé,1,Italy,714
218,2605648,Righi Lambrusco Grasparossa di Castelvetro Ama...,N.V.,1677707,Lambrusco Grasparossa di Castelvetro Amabile,1,Italy,281
237,2192509,Albinea Canali Foglie Rosse Lambrusco,N.V.,78716,Foglie Rosse Lambrusco,1,Italy,212
435,164943011,Maschio dei Cavalieri Prosecco Treviso Extra Dry,N.V.,1245489,Prosecco Treviso Extra Dry,1,Italy,2783
450,3823417,Albinea Canali Ottocentonero Lambrusco,N.V.,2063574,Ottocentonero Lambrusco,1,Italy,865
...,...,...,...,...,...,...,...,...
30524,143361301,Dubl Esse,N.V.,4483749,Esse,1,Italy,112
33191,164943083,Ca' del Bosco Franciacorta Cuvée Prestige,N.V.,1191278,Franciacorta Cuvée Prestige,1,Italy,20249
33194,164943133,Bellavista Alma Gran Cuvée Brut,N.V.,4381953,Alma Gran Cuvée Brut,1,Italy,8451
36203,154817910,1701 Franciacorta Saten,N.V.,5772902,Saten,1,Italy,137


In [58]:
merged_Italy_NV = pd.merge(left = wines_Italy_NV, right = review_count_Italy_NV, left_on = ['id', 'wine.id'], right_on = ['vintage.id', 'vintage.wine.id'])
merged_Italy_NV

Unnamed: 0,id_x,name,year,wine.id,wine.name,wine.vintage_type,wine.region.country.name,statistics.ratings_count,id_y
0,2339584,Riunite Lambrusco Emilia Rosé,N.V.,1583687,Lambrusco Emilia Rosé,1,Italy,714,149
1,2605648,Righi Lambrusco Grasparossa di Castelvetro Ama...,N.V.,1677707,Lambrusco Grasparossa di Castelvetro Amabile,1,Italy,281,79
2,2192509,Albinea Canali Foglie Rosse Lambrusco,N.V.,78716,Foglie Rosse Lambrusco,1,Italy,212,69
3,164943011,Maschio dei Cavalieri Prosecco Treviso Extra Dry,N.V.,1245489,Prosecco Treviso Extra Dry,1,Italy,2783,748
4,3823417,Albinea Canali Ottocentonero Lambrusco,N.V.,2063574,Ottocentonero Lambrusco,1,Italy,865,223
...,...,...,...,...,...,...,...,...,...
118,143361301,Dubl Esse,N.V.,4483749,Esse,1,Italy,112,18
119,164943083,Ca' del Bosco Franciacorta Cuvée Prestige,N.V.,1191278,Franciacorta Cuvée Prestige,1,Italy,20249,5096
120,164943133,Bellavista Alma Gran Cuvée Brut,N.V.,4381953,Alma Gran Cuvée Brut,1,Italy,8451,1952
121,154817910,1701 Franciacorta Saten,N.V.,5772902,Saten,1,Italy,137,31


In [62]:
merged_Italy_NV['percentage_reviews'] = merged_Italy_NV['id_y']/merged_Italy_NV['statistics.ratings_count']

In [63]:
merged_Italy_NV.head()

Unnamed: 0,id_x,name,year,wine.id,wine.name,wine.vintage_type,wine.region.country.name,statistics.ratings_count,id_y,percentage_reviews
0,2339584,Riunite Lambrusco Emilia Rosé,N.V.,1583687,Lambrusco Emilia Rosé,1,Italy,714,149,0.208683
1,2605648,Righi Lambrusco Grasparossa di Castelvetro Ama...,N.V.,1677707,Lambrusco Grasparossa di Castelvetro Amabile,1,Italy,281,79,0.281139
2,2192509,Albinea Canali Foglie Rosse Lambrusco,N.V.,78716,Foglie Rosse Lambrusco,1,Italy,212,69,0.325472
3,164943011,Maschio dei Cavalieri Prosecco Treviso Extra Dry,N.V.,1245489,Prosecco Treviso Extra Dry,1,Italy,2783,748,0.268775
4,3823417,Albinea Canali Ottocentonero Lambrusco,N.V.,2063574,Ottocentonero Lambrusco,1,Italy,865,223,0.257803


In [64]:
merged_Italy_NV['id_y'].sum()

57179

In [61]:
review_count_Italy_NV.loc[78716]

Unnamed: 0_level_0,id
vintage.id,Unnamed: 1_level_1
1412766,4
2192509,69
3792957,1
11359305,3


In [51]:
wines_main_df[wines_main_df['year'] == 'N.V.']

673

In [94]:
merged_reviews_Italy_NV = pd.merge(left = reviews_Italy_NV_short, right = wines_Italy_NV, left_on = ['vintage.wine.id', 'vintage.id'], right_on = ['wine.id', 'id'])
merged_reviews_Italy_NV.head()

Unnamed: 0,id_x,rating,note,language,created_at,user.id,user.seo_name,vintage.id,vintage.wine.id,id_y,name,year,wine.id,wine.name,wine.vintage_type,wine.region.country.name,statistics.ratings_count
0,183949523,4.5,\n,un,2020-11-20T17:09:08.000Z,39504912,lennart.kanth,1469676,21646,1469676,Farnese Edizione Cinque Autoctoni,N.V.,21646,Edizione Cinque Autoctoni,2,Italy,49646
1,183934246,5.0,적당한 탄닌 적은 산미 오크향이 가득\n좋아요! 미디엄풀바디 되는것 같아요 oak,ko,2020-11-20T13:57:09.000Z,28254423,yuri_kim2,1469676,21646,1469676,Farnese Edizione Cinque Autoctoni,N.V.,21646,Edizione Cinque Autoctoni,2,Italy,49646
2,183928468,2.5,Too fruity,un,2020-11-20T12:21:13.000Z,27394194,jaehyeon.jun,1469676,21646,1469676,Farnese Edizione Cinque Autoctoni,N.V.,21646,Edizione Cinque Autoctoni,2,Italy,49646
3,183874762,4.5,"Cinnamon and cherry on the nose, and full of f...",en,2020-11-19T19:50:13.000Z,13603746,tor-sta,1469676,21646,1469676,Farnese Edizione Cinque Autoctoni,N.V.,21646,Edizione Cinque Autoctoni,2,Italy,49646
4,183812835,5.0,야스,un,2020-11-18T23:34:40.000Z,43962096,i-jinho1,1469676,21646,1469676,Farnese Edizione Cinque Autoctoni,N.V.,21646,Edizione Cinque Autoctoni,2,Italy,49646


In [98]:
merged_reviews_Italy_NV.groupby(['user.seo_name', 'user.id']).agg({'id_x': 'count'}).sort_values(by='id_x', ascending=False)

Unnamed: 0_level_0,Unnamed: 1_level_0,id_x
user.seo_name,user.id,Unnamed: 2_level_1
victor_yanko,536293,23
simonj10,14698124,23
enricopez0,8955375,23
mauros5,6077018,21
sergey.sl,5162166,19
...,...,...
graham-war,3907129,1
graham-wild,14864984,1
graham.care,13266109,1
graham.doz,3527187,1


In [75]:
merged_reviews_Italy_NV['id_x'].nunique()

57178

In [82]:
merged_reviews_Italy_NV.groupby('language').agg({'id_x': 'count', 'rating': 'mean'}).sort_values(by='id_x', ascending=False)

Unnamed: 0_level_0,id_x,rating
language,Unnamed: 1_level_1,Unnamed: 2_level_1
en,19064,3.719261
un,16510,3.897729
ru,6830,3.968521
it,5765,3.726366
pt,1889,3.876654
...,...,...
mn,1,5.000000
jw,1,4.500000
kl,1,4.000000
ln,1,4.000000


In [86]:
merged_reviews_Italy_NV[(merged_reviews_Italy_NV['language'] == 'un') & (merged_reviews_Italy_NV['note'] != '\n')]

Unnamed: 0,id_x,rating,note,language,created_at,user.id,vintage.id,vintage.wine.id,id_y,name,year,wine.id,wine.name,wine.vintage_type,wine.region.country.name,statistics.ratings_count
2,183928468,2.5,Too fruity,un,2020-11-20T12:21:13.000Z,27394194,1469676,21646,1469676,Farnese Edizione Cinque Autoctoni,N.V.,21646,Edizione Cinque Autoctoni,2,Italy,49646
4,183812835,5.0,야스,un,2020-11-18T23:34:40.000Z,43962096,1469676,21646,1469676,Farnese Edizione Cinque Autoctoni,N.V.,21646,Edizione Cinque Autoctoni,2,Italy,49646
7,183782250,5.0,Kräftiger Italiener \n,un,2020-11-18T17:35:59.000Z,25469956,1469676,21646,1469676,Farnese Edizione Cinque Autoctoni,N.V.,21646,Edizione Cinque Autoctoni,2,Italy,49646
10,183754222,5.0,Vinhaço,un,2020-11-18T03:16:16.000Z,615079,1469676,21646,1469676,Farnese Edizione Cinque Autoctoni,N.V.,21646,Edizione Cinque Autoctoni,2,Italy,49646
13,183706901,4.5,k,un,2020-11-17T15:24:44.000Z,10913182,1469676,21646,1469676,Farnese Edizione Cinque Autoctoni,N.V.,21646,Edizione Cinque Autoctoni,2,Italy,49646
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
57143,98485297,3.5,36,un,2018-06-29T21:34:55.000Z,10044056,15893309,2514179,15893309,Fidora Tenuta Civranetta Prosecco Frizzante,N.V.,2514179,Tenuta Civranetta Prosecco Frizzante,1,Italy,43
57149,173026344,4.5,Awesome,un,2020-08-09T09:13:32.000Z,20836649,67128392,3963257,67128392,Ca' di Alte Rosé Spumante,N.V.,3963257,Rosé Spumante,1,Italy,35
57152,138547452,5.0,Lovely \n,un,2019-09-19T20:25:33.000Z,26497456,67128392,3963257,67128392,Ca' di Alte Rosé Spumante,N.V.,3963257,Rosé Spumante,1,Italy,35
57156,111615971,3.5,Bono\n,un,2018-12-08T00:15:07.000Z,31396817,67128392,3963257,67128392,Ca' di Alte Rosé Spumante,N.V.,3963257,Rosé Spumante,1,Italy,35


In [92]:
merged_reviews_Italy_NV['user.id'].nunique()

45509

Wines dataframe contains 673 unique vintage IDs with a year N.V., and the same number of unique wine IDs, that suggests that there is only one vintage per one wine, if the year is chosen. Now, we'll return the number of reviews found for each wine ID and vintage ID from the Wines dataframe. 

In the reviews DataFrame, there is more than one vintage given the same wine ID and year. 
Overall, the Reviews df for Italy N.V. contains 95.543 values of which 57.179 values (60%) have the right vintage ID. The other values of vintage IDs found in reviews df relate to a diiferent year and we assume they show up in the reviews df mistakenly, therefore, we disregard those. One of the resulting values is duplicated, so the number of unique reviews for Italian wines N.V. is 57.178. 

Of those, 784 do not contain any note, so they should not be considered, 19.064 are written in English which represents app. 20% of the initially uploaded dataset.

Reviews are given by 45.509 unique users.

In [41]:
full_df_normalized.iloc[:, :50].columns

Index(['id', 'seo_name', 'name', 'year', 'grapes', 'has_valid_ratings',
       'statistics.status', 'statistics.ratings_count',
       'statistics.ratings_average', 'statistics.labels_count',
       'image.location', 'image.variations.bottle_large',
       'image.variations.bottle_medium',
       'image.variations.bottle_medium_square',
       'image.variations.bottle_small', 'image.variations.bottle_small_square',
       'image.variations.label', 'image.variations.label_large',
       'image.variations.label_medium', 'image.variations.label_medium_square',
       'image.variations.label_small_square', 'image.variations.large',
       'image.variations.medium', 'image.variations.medium_square',
       'image.variations.small_square', 'wine.id', 'wine.name',
       'wine.seo_name', 'wine.type_id', 'wine.vintage_type', 'wine.is_natural',
       'wine.region.id', 'wine.region.name', 'wine.region.name_en',
       'wine.region.seo_name', 'wine.region.country.code',
       'wine.region.count

In [43]:
# full_df_normalized.iloc[:, 50:].columns

In [27]:
prepared_filter_df['reviews_count'].sum()

340017

In [28]:
prepared_filter_df.shape

(123, 5)

In [110]:
s = requests.Session()

s.headers.update({
    'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.193 Safari/537.36'
})

s.headers.update({
    'Accept': 'application/json',
    'Content-Type': 'application/json',
})

profile_page = 'http://app.vivino.com/api/users/13603746?'
response = s.get(profile_page)
json_obj = json.loads(response.content)
print(response.status_code)
print(response.content)
# print(json_obj['address']['country'])
# pd.json_normalize(response.content)

200
b'{"id":13603746,"seo_name":"tor-sta","alias":"Tor Sta","is_featured":false,"visibility":"all","image":{"location":"//images.vivino.com/avatars/default_user.png","variations":null},"background_image":{"location":"//images.vivino.com/users/backgrounds/default_1.jpg","variations":{"large":"//images.vivino.com/users/backgrounds/default_1_1200x400.jpg","medium":"//images.vivino.com/users/backgrounds/default_1_600x200.jpg","small":"//images.vivino.com/users/backgrounds/default_1_140x60.jpg"}},"bio":null,"website":null,"address":{"title":null,"name":null,"street":null,"street2":null,"neighborhood":null,"city":null,"zip":null,"state":null,"country":"gb","company":null,"phone":null,"external_id":null,"residential":null,"vat_number":null,"vat_code":null,"addition":null},"statistics":{"followers_count":1,"followings_count":0,"ratings_count":58,"ratings_sum":225.5,"reviews_count":53,"badges_count":null,"wishlist_count":61,"activity_stories_count":0}}'


First, we'll try to extract data about wine_id `21646` with the highest number of ratings (about 50 thousand).

In [29]:
# extracting data to a pandas DataFrame

def extract_reviews_json_to_df(df, json_str):
    """
    Function that converts json string containing vivino data on reviews to a pandas DataFrame
    """
    json_obj = json.loads(json_str)
    for review in json_obj['reviews']:
        df = df.append(review, ignore_index=True)
    return df

@RateLimiter(max_calls=1, period=1)
def get_reviews_df(s, page, df):
    """
    Function extracting DataFrame from vivino API review endpoint respecting rate limiting and keeping connection open
    """
    response = s.get(page)
    if response.status_code // 100 == 2:
        json_str = response.content
        df = extract_json_to_df(df, json_str)
    else: 
        print(response.content)
    return df

In [30]:
@RateLimiter(max_calls=1, period=1)
def get_reviews_json(s, page): 
    """
    Function extracting list from vivino API review endpoint respecting rate limiting and keeping connection open
    """
    review_list = []
    response = s.get(page)
    if response.status_code // 100 == 2:
        json_str = response.content
        json_obj = json.loads(json_str)
        for review in json_obj['reviews']:
            review_list.append(review)
    else: 
        print(response.content)
    return review_list

In [44]:
def extract_reviews_to_json(s, wine_id, year, num_reviews, write_intermediate_backup=False):
    """
    Function that returns reviews extracted for a particular wine ID and year.
    If write_intermediate_backup is chosen, it saves the results to a pickle file.
    """
    
    page_template = 'https://www.vivino.com/api/wines/{}/latest_reviews?year={}&per_page=50&page={}'
        
    timepoint_0 = time.time()
    wine_id = wine_id
    year = year 
    num_reviews = num_reviews
    num_pages = math.ceil(num_reviews/50)
#     max_reviews = num_pages * 50
    
#     reviews_previous_state = []
    reviews = []
    
#     print(f"The program will send {num_pages} requests to reviews API to extract up to {num_reviews} reviews for wine {wine_id}")
    # for wine_id in wine_id_list[:6]:
    for it in range(1, num_pages + 1):
        page = page_template.format(wine_id, year, it)
        new_reviews = get_reviews_json(s, page)
        if len(new_reviews) == 0:
            break
        else:
            reviews.extend(new_reviews)
        # check whether this iteration brought any extra information
#         if reviews == reviews_previous_state:
#             break
#         else:
#             reviews_previous_state = reviews.copy()

#         if it % 100 == 0:
#             print(f"Iteration {it} finished, program extracted {len(set([review['id'] for review in reviews]))} unique reviews so far...")

#     if write_intermediate_backup:
#         with open(f"backup_data/reviews/{wine_id}_{year}", 'wb') as f:
#             pickle.dump(reviews, f)

#     print(f"""Program uploaded {len(set([review['id'] for review in reviews]))} unique reviews on wine {wine_id} for the year {year}. 
#     It took app. {round((time.time() - timepoint_0)/60, 2)} minutes to run""")
    
    return reviews

In [33]:
import requests

In [40]:
s = requests.Session()

s.headers.update({
    'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.193 Safari/537.36'
})

s.headers.update({
    'Accept': 'application/json',
    'Content-Type': 'application/json',
})

In [165]:
# wine_data = [(21646, 'N.V.', 49646), (1155726, 'N.V.', 44563), (2459848, 'N.V.', 21159)]
# for wine, year, num in wine_data:
#     print(wine)
#     print(year)
#     print(num)

wine 21646 generated 13038 reviews (request to latest_reviews API) and 11735 reviews (request to latest_reviews API) while the total number of ratings should be 49646. This includes not only reviews, but also ratings without review notes.
I tried to merge results from `reviews` and `latest_reviews`, but even after merge the result contains only 13037 unique records suggesting that all of them are incliuded to the result from `latest_reviews` API. That means that reviews API does not bring any extra information to the analysis.

After trying to upload only 5-star reviews, search yielded 4164 while the overal number of rankings is 14710 (28%). 
After trying to upload only 4-star reviews, search yielded 7463 while the overal number of rankings is 29611 (25%).
After trying to upload only 3-star reviews, search yielded 1196 while the overal number of rankings is 4768 (25%).
After trying to upload only 2-star reviews, search yielded 140 while the overal number of rankings is 471 (30%).
After trying to upload only 1-star reviews, search yielded 74 while the overal number of rankings is 329 (22%).
In total, when extracting reviews by rating, the search yielded 13037 records

In [None]:
# total_reviews = []
# # test_test_test_reviews = []
# wine_data = [(21646, 'N.V.', 49646), (1155726, 'N.V.', 44563), (2459848, 'N.V.', 21159)]
# for wine, year, num in wine_data:
#     reviews = extract_reviews_to_json(s, wine, year, num,  True)
#     print(f"{len(reviews)} reviews extracted and will be added to the total list")
#     total_reviews.extend(reviews)

Time required for 115368 rankings: 16.66 minutes
With 90 minutes it's possible to upload 650 thousand rankings

In [None]:
# data_Italy_NV = prepared_filter_df
# reviews_Italy_NV = []

# for index, row in data_Italy_NV.iterrows():
#     wine = row['wine_id']
#     year = row['year']
#     num = row['reviews_count']
#     reviews = extract_reviews_to_json(s, wine, year, num,  False)
#     reviews_Italy_NV.extend(reviews)

# with open(f"backup_data/reviews/Italy_NV", 'wb') as f:
#     pickle.dump(reviews_Italy_NV, f)
# #     print(f"The data is {wine_id} and {ratings_total}")

In [35]:
data_Italy_2014 = filter_df[(filter_df['country'] == 'Italy') & (filter_df['year'] == 2014)].sort_values('reviews_count', ascending=False)
data_Italy_2018 = filter_df[(filter_df['country'] == 'Italy') & (filter_df['year'] == 2018)].sort_values('reviews_count', ascending=False)

In [289]:
data_Italy_2015.head()

Unnamed: 0,id,year,reviews_count,wine_id,country
37700,14423362,2015,15474,11890,Italy
32649,15518330,2015,13436,1633194,Italy
48209,14155244,2015,10598,1652,Italy
40032,14384714,2015,7751,1480684,Italy
5943,16363571,2015,7590,1500493,Italy


In [8]:
data_Italy_2015.head()

NameError: name 'data_Italy_2015' is not defined

In [None]:
# reviews_Italy_2015 = []

# for index, row in data_Italy_2015.iterrows():
#     wine = row['wine_id']
#     year = row['year']
#     num = row['reviews_count']
#     reviews = extract_reviews_to_json(s, wine, year, num,  False)
#     reviews_Italy_2015.extend(reviews)
#     if index % 100 == 0:
#         print(f"Record no. {index} with id {wine} finished. Currently stores {len(set([review['id'] for review in reviews_Italy_2015]))} records")

# with open(f"backup_data/reviews/Italy_2015", 'wb') as f:
#     pickle.dump(reviews_Italy_2015, f)

In [45]:
# reviews_Italy_2015 = []

# for index, row in data_Italy_2015.iterrows():
#     wine = row['wine_id']
#     year = row['year']
#     num = row['reviews_count']
#     reviews = extract_reviews_to_json(s, wine, year, num,  False)
#     reviews_Italy_2015.extend(reviews)
#     if index % 100 == 0:
#         print(f"Record no. {index} with id {wine} finished. Currently stores {len(set([review['id'] for review in reviews_Italy_2015]))} records")

# with open(f"backup_data/reviews/Italy_2015", 'wb') as f:
#     pickle.dump(reviews_Italy_2015, f)

In [46]:
# reviews_Italy_2016 = []

# for index, row in data_Italy_2016.iterrows():
#     wine = row['wine_id']
#     year = row['year']
#     num = row['reviews_count']
#     reviews = extract_reviews_to_json(s, wine, year, num,  False)
#     reviews_Italy_2016.extend(reviews)
#     if index % 100 == 0:
#         print(f"Record no. {index} with id {wine} finished. Currently stores {len(set([review['id'] for review in reviews_Italy_2016]))} records")

# with open(f"backup_data/reviews/Italy_2016", 'wb') as f:
#     pickle.dump(reviews_Italy_2016, f)
    
# reviews_Italy_2017 = []

# for index, row in data_Italy_2017.iterrows():
#     wine = row['wine_id']
#     year = row['year']
#     num = row['reviews_count']
#     reviews = extract_reviews_to_json(s, wine, year, num,  False)
#     reviews_Italy_2017.extend(reviews)
#     if index % 100 == 0:
#         print(f"Record no. {index} with id {wine} finished. Currently stores {len(set([review['id'] for review in reviews_Italy_2017]))} records")

# with open(f"backup_data/reviews/Italy_2017", 'wb') as f:
#     pickle.dump(reviews_Italy_2017, f)

Record no. 7100 with id 1436740 finished. Currently stores 227915 records
Record no. 23900 with id 23035 finished. Currently stores 235365 records
Record no. 42800 with id 22945 finished. Currently stores 264490 records
Record no. 42500 with id 2127846 finished. Currently stores 285870 records
Record no. 34900 with id 1084872 finished. Currently stores 286520 records
Record no. 41700 with id 7338920 finished. Currently stores 457667 records
Record no. 35000 with id 4533978 finished. Currently stores 462433 records
Record no. 14600 with id 20810 finished. Currently stores 143259 records
Record no. 14200 with id 12378 finished. Currently stores 144409 records
Record no. 16100 with id 77441 finished. Currently stores 161175 records
Record no. 10700 with id 1785035 finished. Currently stores 322514 records

In [301]:
with open(f"backup_data/reviews/Italy_2017", 'wb') as f:
    pickle.dump(reviews_Italy_2017, f)

In [302]:
len(reviews_Italy_2016)

465756

In [303]:
len(reviews_Italy_2017)

353696

In [39]:
len(reviews_Italy_2018)

47250

In [None]:
reviews_Italy_2018 = []

for index, row in data_Italy_2018.iterrows():
    wine = row['wine_id']
    year = row['year']
    num = row['reviews_count']
    reviews = extract_reviews_to_json(s, wine, year, num,  False)
    reviews_Italy_2018.extend(reviews)
#     print("Process started")
    if index % 100 == 0:
        print(f"Record no. {index} with id {wine} finished. Currently stores {len(set([review['id'] for review in reviews_Italy_2018]))} records")

with open(f"backup_data/reviews/Italy_2018", 'wb') as f:
    pickle.dump(reviews_Italy_2018, f)
    
reviews_Italy_2014 = []

for index, row in data_Italy_2014.iterrows():
    wine = row['wine_id']
    year = row['year']
    num = row['reviews_count']
    reviews = extract_reviews_to_json(s, wine, year, num,  False)
    reviews_Italy_2014.extend(reviews)
    if index % 100 == 0:
        print(f"Record no. {index} with id {wine} finished. Currently stores {len(set([review['id'] for review in reviews_Italy_2014]))} records")

with open(f"backup_data/reviews/Italy_2014", 'wb') as f:
    pickle.dump(reviews_Italy_2014, f)

Record no. 7600 with id 80041 finished. Currently stores 128911 records
Record no. 500 with id 28434 finished. Currently stores 234512 records


In [296]:
len(reviews_Italy_2015)

504717

In [280]:
Italy_NV_df = pd.DataFrame(reviews_Italy_NV)

In [282]:
Italy_NV_df.shape

(95543, 11)

In [284]:
Italy_NV_df['id'].nunique()

95541

In [241]:
len(set([review['note'] for review in test_test_test_reviews]))

73

In [224]:
len(test_test_test_reviews)

13038

In [226]:
recovered_reviews.extend(test_test_test_reviews)

In [227]:
len(recovered_reviews)

24773

In [228]:
len(set([review['id'] for review in recovered_reviews]))

13037

In [216]:
page = 'https://www.vivino.com/api/wines/21646/latest_reviews?year=N.V.&per_page=50&page=239'

# one_more_test_list = []
response = s.get(page)
json_str = response.content
# print(json_str)
json_obj = json.loads(json_str)
for review in json_obj['reviews']:
    one_more_test_list.append(review)

In [217]:
len(one_more_test_list)

200

In [218]:
len(set([review['id'] for review in one_more_test_list]))
# len(test_list)

200

In [190]:
test_list[-5:]

[{'id': 38246048,
  'rating': 2.0,
  'note': 'Sehr trocken, etwas bittere Note',
  'language': 'de',
  'created_at': '2015-12-27T19:28:51.000Z',
  'aggregated': True,
  'user': {'id': 11321257,
   'seo_name': 'bjorn-schl',
   'alias': 'Björn Schläfli',
   'is_featured': False,
   'visibility': 'all',
   'image': {'location': '//images.vivino.com/avatars/default_user.png',
    'variations': None},
   'statistics': {'followers_count': 0,
    'followings_count': 0,
    'ratings_count': 8,
    'ratings_sum': 27,
    'reviews_count': 8},
   'background_image': None},
  'vintage': {'id': 1489236,
   'seo_name': 'farnese-edizione-cinque-autoctoni-2012',
   'name': 'Farnese Edizione Cinque Autoctoni 2012',
   'statistics': {'status': 'Normal',
    'ratings_count': 6013,
    'ratings_average': 4.3,
    'labels_count': 31946,
    'reviews_count': 1802},
   'organic_certification_id': None,
   'certified_biodynamic': None,
   'image': {'location': '//images.vivino.com/labels/63pDC-p9Q9q1fTCL4pQ5Q

In [221]:
with open(f"backup_data/reviews/21646_N.V.", 'rb') as f:
    recovered_reviews = pickle.load(f)

In [225]:
len(recovered_reviews)

11735

In [207]:
test_test_df = pd.DataFrame(recovered_reviews)

In [208]:
test_test_df.head(10)

Unnamed: 0,id,rating,note,language,created_at,aggregated,user,vintage,activity,flavor_word_matches,tagged_note
0,92612380,4.5,Wow! Great experience. Excellent balance betwe...,en,2018-04-13T17:31:19.000Z,True,"{'id': 10419860, 'seo_name': 'guilherme.sega',...","{'id': 7010464, 'seo_name': 'san-marzano-cinqu...","{'id': 246411903, 'statistics': {'likes_count'...","[{'id': 292, 'match': 'oak'}]",Wow! Great experience. Excellent balance betwe...
1,63353852,4.5,Revisited ´Cinquanta' after a year or so. It's...,en,2017-02-24T20:27:35.000Z,True,"{'id': 2165793, 'seo_name': 'stefanvadocz', 'a...","{'id': 7010464, 'seo_name': 'san-marzano-cinqu...","{'id': 168647084, 'statistics': {'likes_count'...",,Revisited ´Cinquanta&#39; after a year or so. ...
2,161006896,4.5,"Excelent Wine! 92/100. Prunes, red fruits, som...",en,2020-04-21T18:16:20.000Z,True,"{'id': 5513308, 'seo_name': 'cesar_gonc', 'ali...","{'id': 7010464, 'seo_name': 'san-marzano-cinqu...","{'id': 435377919, 'statistics': {'likes_count'...","[{'id': 101, 'match': 'chocolate'}, {'id': 341...","Excelent Wine! 92/100. Prunes, red fruits, som..."
3,123518445,4.5,4.3 is a good rating. It is light and smooth b...,en,2019-04-10T01:41:24.000Z,True,"{'id': 28164520, 'seo_name': 'tom.rauck', 'ali...","{'id': 7010464, 'seo_name': 'san-marzano-cinqu...","{'id': 321417718, 'statistics': {'likes_count'...","[{'id': 434, 'match': 'vanilla'}, {'id': 93, '...",4.3 is a good rating. It is light and smooth b...
4,107326292,4.5,"Very drinkable, light, flavoursome wine that g...",en,2018-10-19T22:51:02.000Z,True,"{'id': 6020317, 'seo_name': 'steve.bye', 'alia...","{'id': 7010464, 'seo_name': 'san-marzano-cinqu...","{'id': 279169499, 'statistics': {'likes_count'...",,"Very drinkable, light, flavoursome wine that g..."
5,37867614,4.5,"I really like the wines from San marzano, and ...",en,2015-12-24T13:50:39.000Z,True,"{'id': 5689140, 'seo_name': 'simon-vestergaard...","{'id': 7010464, 'seo_name': 'san-marzano-cinqu...","{'id': 71316934, 'statistics': {'likes_count':...",,"I really like the wines from San marzano, and ..."
6,128459517,4.5,"One of the great ones. Full body, rich in flav...",en,2019-06-01T10:07:44.000Z,True,"{'id': 2696938, 'seo_name': 'marius-vel', 'ali...","{'id': 7010464, 'seo_name': 'san-marzano-cinqu...","{'id': 334701960, 'statistics': {'likes_count'...",,"One of the great ones. Full body, rich in flav..."
7,157045340,4.5,"Fantastic bouquet of ripe berries, cocoa and o...",en,2020-03-14T20:31:01.000Z,True,"{'id': 5371048, 'seo_name': 'sandro.tob', 'ali...","{'id': 7010464, 'seo_name': 'san-marzano-cinqu...","{'id': 425883217, 'statistics': {'likes_count'...","[{'id': 93, 'match': 'cherry'}, {'id': 113, 'm...","Fantastic bouquet of ripe berries, cocoa and o..."
8,162642751,4.5,Definitely the best quality price ratio so far...,en,2020-05-06T11:24:59.000Z,True,"{'id': 43526320, 'seo_name': 'p.joksimovic', '...","{'id': 7010464, 'seo_name': 'san-marzano-cinqu...","{'id': 439028240, 'statistics': {'likes_count'...","[{'id': 341, 'match': 'prunes'}, {'id': 93, 'm...",Definitely the best quality price ratio so far...
9,143978225,4.5,"Excelente vinho, complexo, agradavel , excelen...",en,2019-11-15T05:13:14.000Z,True,"{'id': 4435597, 'seo_name': 'joao.vis', 'alias...","{'id': 7010464, 'seo_name': 'san-marzano-cinqu...","{'id': 381915739, 'statistics': {'likes_count'...","[{'id': 93, 'match': 'cherry'}, {'id': 292, 'm...","Excelente vinho, complexo, agradavel , excelen..."


In [203]:
np.mean([review['rating'] for review in recovered_reviews])

3.8350021958717613

In [206]:
recovered_reviews[0]

{'id': 92612380,
 'rating': 4.5,
 'note': 'Wow! Great experience. Excellent balance between fruits and oak. It is everything in there...right acidity and taninos. Amazing.',
 'language': 'en',
 'created_at': '2018-04-13T17:31:19.000Z',
 'aggregated': True,
 'user': {'id': 10419860,
  'seo_name': 'guilherme.sega',
  'alias': 'Guilherme Segantini',
  'is_featured': False,
  'visibility': 'all',
  'image': {'location': '//images.vivino.com/avatars/0067c0ks12jk9190efa960f.jpg',
   'variations': {'large': '//thumbs.vivino.com/avatars/0067c0ks12jk9190efa960f_300x300.jpg',
    'small_square': '//thumbs.vivino.com/avatars/0067c0ks12jk9190efa960f_50x50.jpg'}},
  'statistics': {'followers_count': 23,
   'followings_count': 8,
   'ratings_count': 1013,
   'ratings_sum': 3784,
   'reviews_count': 619},
  'background_image': None},
 'vintage': {'id': 7010464,
  'seo_name': 'san-marzano-cinquanta-collezione-2012',
  'name': 'San Marzano Cinquanta Collezione 2012',
  'statistics': {'status': 'Normal'

In [196]:
min([review['created_at'] for review in recovered_reviews])

'2012-07-19T12:30:49.000Z'

In [None]:
# @RateLimiter(max_calls=1, period=1)
# def get_wine_json(page, headers, matches):
#     response = requests.get(page, headers=headers)
#     json_str = response.content
#     json_obj = json.loads(json_str)
#     records_num = json_obj['explore_vintage']['records_matched']
# #     print(records_num)
# #     records.append(records_num)
#     for match in json_obj['explore_vintage']['matches']:
#         matches.append(match)
#     return records_num, matches
# #     print(time.time())

In [None]:
# def extract_json_to_df(df, json_str):
#     json_obj = json.loads(json_str)
#     for wine in json_obj['explore_vintage']['matches']:
# #         print(type(wine))
#         df = df.append(wine['vintage'], ignore_index=True)
#     return df

In [597]:
import re
response_pattern = r'2.'
re.search(response_pattern, str(201)) is not None

True

In [148]:
# reviews_df = pd.DataFrame()

@RateLimiter(max_calls=1, period=1)
def extract_reviews_to_df(s, page, df):
#     for wine_id in wine_id_list:
#         for page in range(1, 6):
#             page = f'https://www.vivino.com/api/wines/{wine_id}/reviews?year=N.V.&per_page=50&page={page}'
#             print(page)
#     proxies = {
#   "http": "http://scraperapi:b65a0deee126a85a36e64532b1d7ebeb@proxy-server.scraperapi.com:8001",
#   "https": "http://scraperapi:b65a0deee126a85a36e64532b1d7ebeb@proxy-server.scraperapi.com:8001"}
    
    response = s.get(page)
#     response_pattern = r'2.'
#     print(response.status_code)
    if response.status_code // 100 == 2:
        json_str = response.content
#         print(response.content)
        json_obj = json.loads(json_str)
        for review in json_obj['reviews']:
            df = df.append(review, ignore_index=True)
    else: 
        print(response.content)
    return df

In [134]:
# wine_id_list

In [145]:
reviews_df_1 = pd.DataFrame()

s = requests.Session()

s.headers.update({
    'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.193 Safari/537.36'
})
s.headers.update({
    'Accept': 'application/json',
    'Content-Type': 'application/json',
})

timepoint_0 = time.time()

for wine_id in wine_id_list[:6]:
    for page in range(1, 10):
        # page = 'https://www.vivino.com/api/explore/explore?min_rating=4.5&order_by=ratings_average&order=desc&page=1&per_page=10&price_range_max=21.838499999999996&wine_style_ids[]=235&wine_type_ids[]=2&vc_only=true'
        page = f'https://www.vivino.com/api/wines/{wine_id}/reviews?year=N.V.&per_page=50&page={page}'
#         print(page)
        reviews_df_1 = extract_reviews_to_df(s, page, reviews_df_1)
        # rev_df_test = get_reviews_to_df(wine_id_list[:5], headers_browser, reviews_df)
        
with open(f"backup_data/reviews/{wine_id}", 'wb') as f:
    pickle.dump(full_match_list, f)
print(f"Loading of {reviews_df_1['id'].nunique()} reviews related to {len(wine_id_list[:6])} wines took app. {(time.time() - timepoint_0)/60} min")

Loading of 2412 reviews related to 6 wines took app. 1.5054479837417603 min


In [338]:
full_df = pd.read_json(json.dumps(list(recovered_data_distinct)))

In [340]:
full_df.head()

Unnamed: 0,vintage,price,prices
0,"{'id': 111604237, 'seo_name': 'esporao-alandra...","{'id': 21003666, 'amount': 1.47, 'discounted_f...","[{'id': 21003666, 'amount': 1.47, 'discounted_..."
1,"{'id': 7290004, 'seo_name': 'bacalhoa-vinhos-d...","{'id': 16543955, 'amount': 1.67, 'discounted_f...","[{'id': 16543955, 'amount': 1.67, 'discounted_..."
2,"{'id': 156234290, 'seo_name': 'cartuxa-vinea-t...","{'id': 20763698, 'amount': 1.98, 'discounted_f...","[{'id': 20763698, 'amount': 1.98, 'discounted_..."
3,"{'id': 156633179, 'seo_name': 'cerejeiras-lisb...","{'id': 22559942, 'amount': 1.88, 'discounted_f...","[{'id': 22559942, 'amount': 1.88, 'discounted_..."
4,"{'id': 158257747, 'seo_name': 'cartuxa-vinea-b...","{'id': 20763702, 'amount': 1.98, 'discounted_f...","[{'id': 20763702, 'amount': 1.98, 'discounted_..."


In [342]:
full_df_normalized = pd.json_normalize(full_df['vintage'])

In [343]:
full_df_normalized.head()

Unnamed: 0,id,seo_name,name,year,grapes,has_valid_ratings,statistics.status,statistics.ratings_count,statistics.ratings_average,statistics.labels_count,...,wine.style.region,wine.style.background_image.location,wine.style.background_image.variations.small,wine.taste.structure,wine.style,wine.region.background_image,wine.style.region.background_image,top_list_rankings,wine.region,wine.winery
0,111604237,esporao-alandra-tinto-2016,Esporão Alandra Tinto 2016,2016,,True,Normal,2142,3.2,17293,...,,,,,,,,,,
1,7290004,bacalhoa-vinhos-de-portugal-alentejano-monte-d...,Bacalhôa Alentejano Monte das Ânforas Tinto 2014,2014,,True,Normal,755,3.4,3737,...,,,,,,,,,,
2,156234290,cartuxa-vinea-tinto-2018,Cartuxa Vinea Tinto 2018,2018,,True,Normal,349,3.5,1891,...,,,,,,,,,,
3,156633179,cerejeiras-lisboa-tinto-2018,Quinta das Cerejeiras Lisboa Tinto 2018,2018,,True,Normal,235,3.5,1778,...,,,,,,,,,,
4,158257747,cartuxa-vinea-branco-2018,Cartuxa Vinea Branco 2018,2018,,True,Normal,198,3.5,1661,...,,,,,,,,,,


In [352]:
review_filtering_df = full_df_normalized[['id', 'year', 'statistics.ratings_count', 'wine.id', 'wine.region.country.name']]

In [353]:
review_filtering_df.head()

Unnamed: 0,id,year,statistics.ratings_count,wine.id,wine.region.country.name
0,111604237,2016,2142,1105374,Portugal
1,7290004,2014,755,1706071,Portugal
2,156234290,2018,349,4269600,Portugal
3,156633179,2018,235,1200770,Portugal
4,158257747,2018,198,4269602,Portugal


In [359]:
review_filtering_gb_df = review_filtering_df.groupby(['wine.region.country.name', 'year']).agg({'id': 'count', 'statistics.ratings_count': 'sum'})

In [366]:
review_filtering_gb_df.loc['Italy'].sort_values(by='statistics.ratings_count', ascending=False)

Unnamed: 0_level_0,id,statistics.ratings_count
year,Unnamed: 1_level_1,Unnamed: 2_level_1
2015,1261,487043
2016,1207,452299
2017,1244,378245
2018,1518,370566
N.V.,123,340017
2013,707,322209
2014,641,289794
2012,498,202827
2010,344,134182
2011,340,132066


In [296]:
vintage_rating_count = [item['vintage']['statistics']['ratings_count'] if item['vintage']['statistics']['ratings_count'] is not None else np.nan for item in list(recovered_data_distinct)]

In [295]:
list(recovered_data_distinct)[0]['vintage']['statistics']

{'status': 'Normal',
 'ratings_count': 2142,
 'ratings_average': 3.2,
 'labels_count': 17293}

In [297]:
vintage_rating_count[0]

2142

In [290]:
len(vintage_rating_count)

55819

In [298]:
vintage_rating_count_df = pd.DataFrame()

In [304]:
vintage_rating_count_df['id'] = vintage_ids
vintage_rating_count_df['ratings'] = vintage_rating_count
vintage_rating_count_df['country'] = wine_country
vintage_rating_count_df['year'] = vintage_year

In [None]:
[str(year) for year in range(2010, 2021)]

In [327]:
print([str(year) for year in range(2010, 2021)])

['2010', '2011', '2012', '2013', '2014', '2015', '2016', '2017', '2018', '2019', '2020']


In [None]:
df_full = p

In [335]:
vintage_rating_count_df[(vintage_rating_count_df['country']=='Italy') & \
#                         ((vintage_rating_count_df['year'] == 2010) | \
#                         (vintage_rating_count_df['year'] == 2011) | \
#                         (vintage_rating_count_df['year'] == 2012) | \
#                         (vintage_rating_count_df['year'] == 2013) | \
#                         (vintage_rating_count_df['year'] == 2014) | \
                        ((vintage_rating_count_df['year'] == 2015) | \
                        (vintage_rating_count_df['year'] == 2016) | \
                        (vintage_rating_count_df['year'] == 2017) | \
                        (vintage_rating_count_df['year'] == 2018) | \
                        (vintage_rating_count_df['year'] == 2019) | \
                        (vintage_rating_count_df['year'] == 2020))].ratings.sum()

1791076

In [328]:
vintage_rating_count_df[vintage_rating_count_df['year'] in [str(year) for year in range(2010, 2021)]] #.ratings.sum()

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

In [332]:
[vintage_rating_count_df['year'] in [str(year) for year in range(2010, 2021)]

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

In [316]:
vintage_rating_count_df['year'].replace('N.V.', 'NaN', inplace=True)

In [306]:
rating_gb_vintage_country = vintage_rating_count_df.groupby(['country', 'year']).agg({'id': 'count', 'ratings': 'mean'})

In [309]:
rating_gb_vintage_country.loc['Italy', :].sort_values(by='id', ascending=False)

Unnamed: 0_level_0,id,ratings
year,Unnamed: 1_level_1,Unnamed: 2_level_1
2018,1518,244.114625
2015,1261,386.235527
2017,1244,304.055466
2016,1207,374.729909
2013,707,455.74116
2019,658,156.323708
2014,641,452.096724
2012,498,407.283133
2010,344,390.063953
2011,340,388.429412


In [216]:
wine_ids[0]

1105374

In [212]:
# for item in list(recovered_data_distinct):
#     if item['vintage']['wine']['id'] == '63':
#         print(item)

In [245]:
rating_count = [item['vintage']['wine']['statistics']['ratings_count'] if item['vintage']['wine']['statistics']['ratings_count'] is not None else np.nan for item in list(recovered_data_distinct)]

In [230]:
wine_country = [item['vintage']['wine']['region']['country']['name'] if item['vintage']['wine']['region'] is not None else np.nan for item in list(recovered_data_distinct)]

In [283]:
for item in list(recovered_data_distinct):
    if item['vintage']['wine']['id'] == 86559:
        print(item['vintage']['wine']['statistics'])

{'status': 'Normal', 'ratings_count': 6582, 'ratings_average': 4.2, 'labels_count': 24661, 'vintages_count': 42}
{'status': 'Normal', 'ratings_count': 6582, 'ratings_average': 4.2, 'labels_count': 24661, 'vintages_count': 42}
{'status': 'Normal', 'ratings_count': 6582, 'ratings_average': 4.2, 'labels_count': 24661, 'vintages_count': 42}


In [221]:
len(wine_ids)

55819

In [241]:
wine_ids[0]

1105374

In [246]:
type(rating_count)

list

In [231]:
len(wine_country)

55819

In [247]:
len(rating_count)

55819

In [218]:
# wine_rating_dict = {'id': wine_ids, 'ratings': wine_rating_count}

In [248]:
wine_rating_count = pd.DataFrame()

In [249]:
wine_rating_count['id'] = wine_ids
wine_rating_count['ratings'] = rating_count
wine_rating_count['country'] = wine_country

In [275]:
# wine_rating_count.columns = ['rating_count']
wine_rating_dist = wine_rating_count.drop_duplicates()

In [273]:
wine_rating_count[wine_rating_count['id']==86559]

Unnamed: 0,id,ratings,country
2829,86559,6582,Italy
4537,86559,6582,Italy
14219,86559,6582,Italy


In [269]:
wine_rating_count.ratings.sum()

213407778

In [276]:
rating_gb_country = wine_rating_dist.groupby('country').agg({'id': 'count', 'ratings': 'mean'})
# rating_gb_country[['']]

In [277]:
rating_gb_country.sort_values(by='id', ascending=False)

Unnamed: 0_level_0,id,ratings
country,Unnamed: 1_level_1,Unnamed: 2_level_1
France,9671,1524.14042
Italy,5819,2434.069943
Spain,3170,2861.962776
Portugal,2268,1937.43739
South Africa,1763,1165.027794
United States,1441,3251.425399
Australia,1406,1294.027027
Argentina,839,4848.464839
Chile,768,3551.606771
New Zealand,733,1597.25648


In [205]:
wine_rating_count_dist = wine_rating_count['rating_count'].sort_values(ascending=False).drop_duplicates()

In [209]:
wine_rating_count_dist.head()

63    8953437
29    8953423
54    8761218
35    8754616
70    8675139
Name: rating_count, dtype: int64

In [210]:
wine_rating_count_dist[63]

63    8953437
63    8171983
63    8138743
63    7899836
63    7732911
63    7377040
63    6844596
63    6477749
63    6207941
63    5145828
63    4993029
63    4654271
63    3604522
63    2808499
63    2536378
63    2374012
63    2264358
63    1772869
Name: rating_count, dtype: int64

In [201]:
wine_rating_count.loc['1324035',:]

KeyError: '1324035'

In [172]:
wine_rating_count = [{'id': wine_ids[i], 'rating_count': wine_rating_count[i]} for i in range(len(wine_ids))]

SyntaxError: invalid syntax (<ipython-input-172-205e46dcb48c>, line 1)

In [165]:
wine_ids = list(set(wine_ids))

In [166]:
len(wine_ids)

29811

In [159]:
len(ratings_num)

55819

In [167]:
vintage_list = list(recovered_data_distinct)

In [None]:
# for wine_id in wine_ids:
#     for item in vintage_list:
#         if item['wine']['id'] == wine_id:
#     if recovered_data_distinct

In [268]:
# list(recovered_data_distinct)[0]

In [156]:
pd.json_normalize(reviews_df['vintage'])

Unnamed: 0,id,seo_name,name,organic_certification_id,certified_biodynamic,year,grapes,has_valid_ratings,statistics.status,statistics.ratings_count,...,image.variations.label,image.variations.label_large,image.variations.label_medium,image.variations.label_medium_square,image.variations.label_small_square,wine.statistics.status,wine.statistics.ratings_count,wine.statistics.ratings_average,wine.statistics.labels_count,wine.statistics.vintages_count
0,1540565,esporao-alandra-tinto-uv,Esporão Alandra Tinto,,,U.V.,,True,Normal,18142,...,,,,,,,,,,
1,1540565,esporao-alandra-tinto-uv,Esporão Alandra Tinto,,,U.V.,,True,Normal,18142,...,//images.vivino.com/thumbs/ZH8kSRcfRE2CL0wIYVu...,//images.vivino.com/thumbs/ZH8kSRcfRE2CL0wIYVu...,//images.vivino.com/thumbs/ZH8kSRcfRE2CL0wIYVu...,//images.vivino.com/thumbs/ZH8kSRcfRE2CL0wIYVu...,//images.vivino.com/thumbs/ZH8kSRcfRE2CL0wIYVu...,,,,,
2,1540565,esporao-alandra-tinto-uv,Esporão Alandra Tinto,,,U.V.,,True,Normal,18142,...,//images.vivino.com/thumbs/ZH8kSRcfRE2CL0wIYVu...,//images.vivino.com/thumbs/ZH8kSRcfRE2CL0wIYVu...,//images.vivino.com/thumbs/ZH8kSRcfRE2CL0wIYVu...,//images.vivino.com/thumbs/ZH8kSRcfRE2CL0wIYVu...,//images.vivino.com/thumbs/ZH8kSRcfRE2CL0wIYVu...,,,,,
3,1540565,esporao-alandra-tinto-uv,Esporão Alandra Tinto,,,U.V.,,True,Normal,18142,...,//images.vivino.com/thumbs/ZH8kSRcfRE2CL0wIYVu...,//images.vivino.com/thumbs/ZH8kSRcfRE2CL0wIYVu...,//images.vivino.com/thumbs/ZH8kSRcfRE2CL0wIYVu...,//images.vivino.com/thumbs/ZH8kSRcfRE2CL0wIYVu...,//images.vivino.com/thumbs/ZH8kSRcfRE2CL0wIYVu...,,,,,
4,1540565,esporao-alandra-tinto-uv,Esporão Alandra Tinto,,,U.V.,,True,Normal,18142,...,//images.vivino.com/thumbs/ZH8kSRcfRE2CL0wIYVu...,//images.vivino.com/thumbs/ZH8kSRcfRE2CL0wIYVu...,//images.vivino.com/thumbs/ZH8kSRcfRE2CL0wIYVu...,//images.vivino.com/thumbs/ZH8kSRcfRE2CL0wIYVu...,//images.vivino.com/thumbs/ZH8kSRcfRE2CL0wIYVu...,,,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2407,32230520,maal-rebelion-malbec-blend-2014,Maal Rebelion Malbec Blend 2014,,,2014,,True,Normal,99,...,,,,,,,,,,
2408,32230520,maal-rebelion-malbec-blend-2014,Maal Rebelion Malbec Blend 2014,,,2014,,True,Normal,99,...,,,,,,,,,,
2409,32230520,maal-rebelion-malbec-blend-2014,Maal Rebelion Malbec Blend 2014,,,2014,,True,Normal,99,...,,,,,,,,,,
2410,32230520,maal-rebelion-malbec-blend-2014,Maal Rebelion Malbec Blend 2014,,,2014,,True,Normal,99,...,,,,,,,,,,


In [154]:
reviews_df.head()

Unnamed: 0,activity,aggregated,created_at,flavor_word_matches,id,language,note,rating,tagged_note,user,vintage
0,"{'id': 37813379, 'statistics': {'likes_count':...",1.0,2015-05-17T00:55:25.000Z,"[{'id': 49, 'match': 'blackberry'}, {'id': 347...",27887240.0,en,A dry raisin start with a sight blackberry fin...,3.0,A dry raisin start with a sight blackberry fin...,"{'id': 950437, 'seo_name': 'christian_hav', 'a...","{'id': 1540565, 'seo_name': 'esporao-alandra-t..."
1,"{'id': 272319839, 'statistics': {'likes_count'...",1.0,2018-09-11T18:51:43.000Z,"[{'id': 384, 'match': 'smoke'}]",104293043.0,en,Slightly sweet however a little flat. Lacks d...,3.0,Slightly sweet however a little flat. Lacks d...,"{'id': 15643826, 'seo_name': 'tron.em', 'alias...","{'id': 1540565, 'seo_name': 'esporao-alandra-t..."
2,"{'id': 243365981, 'statistics': {'likes_count'...",1.0,2018-03-26T23:35:59.000Z,"[{'id': 49, 'match': 'blackberry'}, {'id': 93,...",91245283.0,en,"A nice spice on the nose, with cherry red frui...",3.0,"A nice spice on the nose, with cherry red frui...","{'id': 12609403, 'seo_name': 'oliver-sho', 'al...","{'id': 1540565, 'seo_name': 'esporao-alandra-t..."
3,"{'id': 356154948, 'statistics': {'likes_count'...",1.0,2019-08-24T21:00:31.000Z,"[{'id': 135, 'match': 'dark fruit'}]",136260274.0,en,Some dark fruit but overall this was a little ...,3.0,Some dark fruit but overall this was a little ...,"{'id': 16114186, 'seo_name': 'scott_alexande',...","{'id': 1540565, 'seo_name': 'esporao-alandra-t..."
4,"{'id': 11405174, 'statistics': {'likes_count':...",1.0,2014-05-23T19:18:43.000Z,,11309568.0,en,"Fruity, nice finish",4.0,"Fruity, nice finish","{'id': 4301557, 'seo_name': 'janhart1', 'alias...","{'id': 1540565, 'seo_name': 'esporao-alandra-t..."


In [104]:
browser = initialize_chrome_driver()
test_wine_page = "https://www.vivino.com/domaine-hippolyte-reverdy-sancerre/w/1112140?year=2019&price_id=22907185&cart_item_source="

# "https://www.vivino.com/explore?e=eJzLLbI1VMvNzLM1UMtNrLA1MTBQS660dXdSSwYSAWoFQNn0NNuyxKLM1JLEHLX8JNuixJLMvPTi-MSy1KLE9FS1fNuU1OJktfKS6FhbQwDu-xpj"
browser.get(test_wine_page)
# res = browser.find_element_by_class_name("inner-page")
browser.get_screenshot_as_file('test_screenshot.png')
# # browser.text

True

In [None]:
# sends request 1 to load the page
cookie_request_1_ending = "_ruby-web_session=ZFUgjJW5mJwlOYZswY8NfAuSPiBDkA18lNES5wCIqdJx2yxxc7KYki6Duc6GDojGQgssWjptW0uN7knNpxIJ8mhGtbl82MiLRHjpu%2BgQMmIconQU0K8jfhPCWVKqHVWlIJem7gMNHsrJAeGAkZv0YEakOVLyd3qzBP1y6iIRY37%2BGfo0HU48A%2BSslPsz4TJ%2Ft2UFrgjYZmj%2BGHjeamt9HncGGHF4eCoRiSDB3Y8u8At%2FBB33xrqAvR2ZTMpm29ngRoRjAiw5qYZ76vH%2FTZQaQbdWnPUJeYZTJ6nE8NLLUIy9pmvGWu9rR18H461yZ%2BPXGwfJbkWJQUZuy%2FOwi9MvEsbcYc%2Bl4OYAscE3MkDgRbroHCE59JPLoTOkUJw0FlOZIMo22SmZvOarIHMacN2KculJGKhLoLE6Fp0JkWappbr3uNsCJBIFEHbCWqP5I5fmA9g0OXdjYIznd4rIJgIlz5tE%2BmIj4EN6o6bRDHA0JIhM9%2FOmJ62k8pZWn%2BT8xWhKA361veCGG62IClAfqhwi1xEGPFAKznUUbTwTAzZ9e7rv90N2dn8qcD3TU5bMGyXyJmV3RG6sYE1M%2FsoUncZmPmGTLpvHqkoCFiORUU4GdPMdgzLVJGoD4UoQQ%2B5eJ8v2dJUnfNS8NUIbiNnnB16%2BQuOkSDZbIDvmzsUTZ08yB5ilLhURD1mxG%2FSdIGVOrO6e325zFPJamiNQNft3%2BFAjLq2mRBv47U8c2itR4DD39Sz7k4QxugCoZJo1JZmvIQUUDKzHoAYaN89Liz4SgY5PALU%3D--QM6wguzltewL%2FWwE--Up%2Bk5RhESSoqLe0VhnMvAA%3D%3D"

cookie_response_1_set = "_ruby-web_session=ojWEad5gcHym50K%2F9aXQ08Vo3qYrJ8rxVB5SW0kRVIFkVymVYp02FlVYMk1fzUOJ2BAQDRMeLkGgnWy3J4ORuQJu5BrfLHDA8KaK9p5gdgLD4AYhFjEQpwYwKxg0TZ23Oz4Iiig%2Bc88D6SpPrAJj0uvtgsRM%2B4lNOZNgf4bssgvxi4LdNwgxmntKhbM1%2F8OvWONIyxSpZiC2eTqzRIFeeb2g6rJJm2zgMzKrk60cPucOWuMo2900HKEyVsQRfWLZKj9wc9sUXWzYyW0IJuQNF0%2BSGt%2FrJeM6iWotfngysEXM2o7I2jQgeRwsrQDFUH7u2thXTCHUEdFq8%2B11ZWzS7qaxYEMzAvADNiS0DFumxquZ9wDZTbAl3UZpx%2Blw9I2icw9gMNRvaFxgZdp8EySFUNSVyRBCihpMPcLw1S2piAHbVIjPpZoT2ypFFOrqcHU0wVaZRIXrcIJaJWgKHnFBf5kthyRCALnOQbOVHrplCxzUeAgCOaOJ1DKiJ3K68vWElvgLM5DWZnMaYYFjrja19X4yK3CL37JwmUopHc%2BTSd8BbRE5xCgjVVUPD4Uxdgze%2Fa0k6CQ6XH1aQfxwB6oMJzUbcxYbpB81Rp%2BQ8yUeTU7EmRF7VvEvEUNLIUuiVqbr%2B6RghvFuzlRdzAga0953g6eShVfhwBVoDCb1%2BYLaydtYSfZBmlUpzQ7iVK1sgfl7CxdxOuQDz2uvMEanSgll99oQasiT9hxWj9ec8yTD%2FUEXqP9gLWvJhgO5pGBqErk1NUV9vYPEqVrX9K4zJij5laM%3D--GEYwb07Zv7XUBeW4--ZrhLwtHiGVhgMIYVyrDYkA%3D%3D; path=/; expires=Tue, 16 Nov 2021 14:12:54 GMT; HttpOnly; secure; SameSite=Lax"

# sends GET request 2 to /api/grapes/?cache_key=446940327868b531465ef20d93fb20c8e7a1f4ceb068388a6aea
cookie_request_2_ends = "_ruby-web_session=ojWEad5gcHym50K%2F9aXQ08Vo3qYrJ8rxVB5SW0kRVIFkVymVYp02FlVYMk1fzUOJ2BAQDRMeLkGgnWy3J4ORuQJu5BrfLHDA8KaK9p5gdgLD4AYhFjEQpwYwKxg0TZ23Oz4Iiig%2Bc88D6SpPrAJj0uvtgsRM%2B4lNOZNgf4bssgvxi4LdNwgxmntKhbM1%2F8OvWONIyxSpZiC2eTqzRIFeeb2g6rJJm2zgMzKrk60cPucOWuMo2900HKEyVsQRfWLZKj9wc9sUXWzYyW0IJuQNF0%2BSGt%2FrJeM6iWotfngysEXM2o7I2jQgeRwsrQDFUH7u2thXTCHUEdFq8%2B11ZWzS7qaxYEMzAvADNiS0DFumxquZ9wDZTbAl3UZpx%2Blw9I2icw9gMNRvaFxgZdp8EySFUNSVyRBCihpMPcLw1S2piAHbVIjPpZoT2ypFFOrqcHU0wVaZRIXrcIJaJWgKHnFBf5kthyRCALnOQbOVHrplCxzUeAgCOaOJ1DKiJ3K68vWElvgLM5DWZnMaYYFjrja19X4yK3CL37JwmUopHc%2BTSd8BbRE5xCgjVVUPD4Uxdgze%2Fa0k6CQ6XH1aQfxwB6oMJzUbcxYbpB81Rp%2BQ8yUeTU7EmRF7VvEvEUNLIUuiVqbr%2B6RghvFuzlRdzAga0953g6eShVfhwBVoDCb1%2BYLaydtYSfZBmlUpzQ7iVK1sgfl7CxdxOuQDz2uvMEanSgll99oQasiT9hxWj9ec8yTD%2FUEXqP9gLWvJhgO5pGBqErk1NUV9vYPEqVrX9K4zJij5laM%3D--GEYwb07Zv7XUBeW4--ZrhLwtHiGVhgMIYVyrDYkA%3D%3D"

#sends GET request 3 to /api/countries?cache_key=446940327868b531465ef20d93fb20c8e7a1f4ceb068388a6aea
cookie_request_3_ends = "_ruby-web_session=ojWEad5gcHym50K%2F9aXQ08Vo3qYrJ8rxVB5SW0kRVIFkVymVYp02FlVYMk1fzUOJ2BAQDRMeLkGgnWy3J4ORuQJu5BrfLHDA8KaK9p5gdgLD4AYhFjEQpwYwKxg0TZ23Oz4Iiig%2Bc88D6SpPrAJj0uvtgsRM%2B4lNOZNgf4bssgvxi4LdNwgxmntKhbM1%2F8OvWONIyxSpZiC2eTqzRIFeeb2g6rJJm2zgMzKrk60cPucOWuMo2900HKEyVsQRfWLZKj9wc9sUXWzYyW0IJuQNF0%2BSGt%2FrJeM6iWotfngysEXM2o7I2jQgeRwsrQDFUH7u2thXTCHUEdFq8%2B11ZWzS7qaxYEMzAvADNiS0DFumxquZ9wDZTbAl3UZpx%2Blw9I2icw9gMNRvaFxgZdp8EySFUNSVyRBCihpMPcLw1S2piAHbVIjPpZoT2ypFFOrqcHU0wVaZRIXrcIJaJWgKHnFBf5kthyRCALnOQbOVHrplCxzUeAgCOaOJ1DKiJ3K68vWElvgLM5DWZnMaYYFjrja19X4yK3CL37JwmUopHc%2BTSd8BbRE5xCgjVVUPD4Uxdgze%2Fa0k6CQ6XH1aQfxwB6oMJzUbcxYbpB81Rp%2BQ8yUeTU7EmRF7VvEvEUNLIUuiVqbr%2B6RghvFuzlRdzAga0953g6eShVfhwBVoDCb1%2BYLaydtYSfZBmlUpzQ7iVK1sgfl7CxdxOuQDz2uvMEanSgll99oQasiT9hxWj9ec8yTD%2FUEXqP9gLWvJhgO5pGBqErk1NUV9vYPEqVrX9K4zJij5laM%3D--GEYwb07Zv7XUBeW4--ZrhLwtHiGVhgMIYVyrDYkA%3D%3D"

#sends GET request 4 to /api/wine_styles/?cache_key=446940327868b531465ef20d93fb20c8e7a1f4ceb068388a6aea
cookie_request_4_ends = "_ruby-web_session=ojWEad5gcHym50K%2F9aXQ08Vo3qYrJ8rxVB5SW0kRVIFkVymVYp02FlVYMk1fzUOJ2BAQDRMeLkGgnWy3J4ORuQJu5BrfLHDA8KaK9p5gdgLD4AYhFjEQpwYwKxg0TZ23Oz4Iiig%2Bc88D6SpPrAJj0uvtgsRM%2B4lNOZNgf4bssgvxi4LdNwgxmntKhbM1%2F8OvWONIyxSpZiC2eTqzRIFeeb2g6rJJm2zgMzKrk60cPucOWuMo2900HKEyVsQRfWLZKj9wc9sUXWzYyW0IJuQNF0%2BSGt%2FrJeM6iWotfngysEXM2o7I2jQgeRwsrQDFUH7u2thXTCHUEdFq8%2B11ZWzS7qaxYEMzAvADNiS0DFumxquZ9wDZTbAl3UZpx%2Blw9I2icw9gMNRvaFxgZdp8EySFUNSVyRBCihpMPcLw1S2piAHbVIjPpZoT2ypFFOrqcHU0wVaZRIXrcIJaJWgKHnFBf5kthyRCALnOQbOVHrplCxzUeAgCOaOJ1DKiJ3K68vWElvgLM5DWZnMaYYFjrja19X4yK3CL37JwmUopHc%2BTSd8BbRE5xCgjVVUPD4Uxdgze%2Fa0k6CQ6XH1aQfxwB6oMJzUbcxYbpB81Rp%2BQ8yUeTU7EmRF7VvEvEUNLIUuiVqbr%2B6RghvFuzlRdzAga0953g6eShVfhwBVoDCb1%2BYLaydtYSfZBmlUpzQ7iVK1sgfl7CxdxOuQDz2uvMEanSgll99oQasiT9hxWj9ec8yTD%2FUEXqP9gLWvJhgO5pGBqErk1NUV9vYPEqVrX9K4zJij5laM%3D--GEYwb07Zv7XUBeW4--ZrhLwtHiGVhgMIYVyrDYkA%3D%3D"

#sends GET request 5 to /api/foods/?cache_key=446940327868b531465ef20d93fb20c8e7a1f4ceb068388a6aea
cookie_request_5_ends = "_ruby-web_session=ojWEad5gcHym50K%2F9aXQ08Vo3qYrJ8rxVB5SW0kRVIFkVymVYp02FlVYMk1fzUOJ2BAQDRMeLkGgnWy3J4ORuQJu5BrfLHDA8KaK9p5gdgLD4AYhFjEQpwYwKxg0TZ23Oz4Iiig%2Bc88D6SpPrAJj0uvtgsRM%2B4lNOZNgf4bssgvxi4LdNwgxmntKhbM1%2F8OvWONIyxSpZiC2eTqzRIFeeb2g6rJJm2zgMzKrk60cPucOWuMo2900HKEyVsQRfWLZKj9wc9sUXWzYyW0IJuQNF0%2BSGt%2FrJeM6iWotfngysEXM2o7I2jQgeRwsrQDFUH7u2thXTCHUEdFq8%2B11ZWzS7qaxYEMzAvADNiS0DFumxquZ9wDZTbAl3UZpx%2Blw9I2icw9gMNRvaFxgZdp8EySFUNSVyRBCihpMPcLw1S2piAHbVIjPpZoT2ypFFOrqcHU0wVaZRIXrcIJaJWgKHnFBf5kthyRCALnOQbOVHrplCxzUeAgCOaOJ1DKiJ3K68vWElvgLM5DWZnMaYYFjrja19X4yK3CL37JwmUopHc%2BTSd8BbRE5xCgjVVUPD4Uxdgze%2Fa0k6CQ6XH1aQfxwB6oMJzUbcxYbpB81Rp%2BQ8yUeTU7EmRF7VvEvEUNLIUuiVqbr%2B6RghvFuzlRdzAga0953g6eShVfhwBVoDCb1%2BYLaydtYSfZBmlUpzQ7iVK1sgfl7CxdxOuQDz2uvMEanSgll99oQasiT9hxWj9ec8yTD%2FUEXqP9gLWvJhgO5pGBqErk1NUV9vYPEqVrX9K4zJij5laM%3D--GEYwb07Zv7XUBeW4--ZrhLwtHiGVhgMIYVyrDYkA%3D%3D"

#sends GET request 6 to /api/explore/explore?country_code=GB&currency_code=GBP&grape_filter=varietal&min_rating=3.5&order_by=ratings_average&order=desc&page=1&price_range_max=20&price_range_min=5&wine_type_ids[]=1&wine_type_ids[]=2
cookie_request_6_ends = "_ruby-web_session=ojWEad5gcHym50K%2F9aXQ08Vo3qYrJ8rxVB5SW0kRVIFkVymVYp02FlVYMk1fzUOJ2BAQDRMeLkGgnWy3J4ORuQJu5BrfLHDA8KaK9p5gdgLD4AYhFjEQpwYwKxg0TZ23Oz4Iiig%2Bc88D6SpPrAJj0uvtgsRM%2B4lNOZNgf4bssgvxi4LdNwgxmntKhbM1%2F8OvWONIyxSpZiC2eTqzRIFeeb2g6rJJm2zgMzKrk60cPucOWuMo2900HKEyVsQRfWLZKj9wc9sUXWzYyW0IJuQNF0%2BSGt%2FrJeM6iWotfngysEXM2o7I2jQgeRwsrQDFUH7u2thXTCHUEdFq8%2B11ZWzS7qaxYEMzAvADNiS0DFumxquZ9wDZTbAl3UZpx%2Blw9I2icw9gMNRvaFxgZdp8EySFUNSVyRBCihpMPcLw1S2piAHbVIjPpZoT2ypFFOrqcHU0wVaZRIXrcIJaJWgKHnFBf5kthyRCALnOQbOVHrplCxzUeAgCOaOJ1DKiJ3K68vWElvgLM5DWZnMaYYFjrja19X4yK3CL37JwmUopHc%2BTSd8BbRE5xCgjVVUPD4Uxdgze%2Fa0k6CQ6XH1aQfxwB6oMJzUbcxYbpB81Rp%2BQ8yUeTU7EmRF7VvEvEUNLIUuiVqbr%2B6RghvFuzlRdzAga0953g6eShVfhwBVoDCb1%2BYLaydtYSfZBmlUpzQ7iVK1sgfl7CxdxOuQDz2uvMEanSgll99oQasiT9hxWj9ec8yTD%2FUEXqP9gLWvJhgO5pGBqErk1NUV9vYPEqVrX9K4zJij5laM%3D--GEYwb07Zv7XUBeW4--ZrhLwtHiGVhgMIYVyrDYkA%3D%3D"

#sends GET request 7 to https://www.vivino.com/api/carts/
cookie_request_7_ends = "_ruby-web_session=ojWEad5gcHym50K%2F9aXQ08Vo3qYrJ8rxVB5SW0kRVIFkVymVYp02FlVYMk1fzUOJ2BAQDRMeLkGgnWy3J4ORuQJu5BrfLHDA8KaK9p5gdgLD4AYhFjEQpwYwKxg0TZ23Oz4Iiig%2Bc88D6SpPrAJj0uvtgsRM%2B4lNOZNgf4bssgvxi4LdNwgxmntKhbM1%2F8OvWONIyxSpZiC2eTqzRIFeeb2g6rJJm2zgMzKrk60cPucOWuMo2900HKEyVsQRfWLZKj9wc9sUXWzYyW0IJuQNF0%2BSGt%2FrJeM6iWotfngysEXM2o7I2jQgeRwsrQDFUH7u2thXTCHUEdFq8%2B11ZWzS7qaxYEMzAvADNiS0DFumxquZ9wDZTbAl3UZpx%2Blw9I2icw9gMNRvaFxgZdp8EySFUNSVyRBCihpMPcLw1S2piAHbVIjPpZoT2ypFFOrqcHU0wVaZRIXrcIJaJWgKHnFBf5kthyRCALnOQbOVHrplCxzUeAgCOaOJ1DKiJ3K68vWElvgLM5DWZnMaYYFjrja19X4yK3CL37JwmUopHc%2BTSd8BbRE5xCgjVVUPD4Uxdgze%2Fa0k6CQ6XH1aQfxwB6oMJzUbcxYbpB81Rp%2BQ8yUeTU7EmRF7VvEvEUNLIUuiVqbr%2B6RghvFuzlRdzAga0953g6eShVfhwBVoDCb1%2BYLaydtYSfZBmlUpzQ7iVK1sgfl7CxdxOuQDz2uvMEanSgll99oQasiT9hxWj9ec8yTD%2FUEXqP9gLWvJhgO5pGBqErk1NUV9vYPEqVrX9K4zJij5laM%3D--GEYwb07Zv7XUBeW4--ZrhLwtHiGVhgMIYVyrDYkA%3D%3D"

#sends GET request 8 to /api/vintages/159806425/highlights
cookie_request_8_ends = "_ruby-web_session=oNuxPOQPixXfw27wIOnDeLVu5H%2FAJycGg9RKhuzatmMswm0SYT28f4MTMhFFLXaYGwL9y2o1l7NvgD%2FgP1XCGMvjI3rkvXAkvcYmE%2F9yJ2%2FlLM4HKc5i22qxppbKhU%2FA7dUp9jPYulk76rLLFZAhQ8jx4bfqlotLwK8%2BZ9%2FYEVcv6PO351cp%2FedXew1I2FXcdzt%2FQzqiQxlHKkfrP1dSUK8yLmcHU4wMU07SYuWfnnneDwstR7x8yQ%2F3kW2EGERbIFvkWwGe06f9spTNoXhJdRiyTB7C5lkEUt1Tn0ysVvAoKKGafRDllJ9mwUQDRImpXBIoBTS5%2FbxGzGP4NFG%2BHcpYya7HFEL5LXfntZ0lixNtduLYm9ZQXwyFDRbH35VKZWq9Bo9ajNYXYzT6ruwfUw8pHzD4e%2BrVIAkEIvaxr2f25Rt8V%2FBqXODOq37r8kIuLtC8c3ytQm8wMy7RJPjqk8v4MJoHKJ9SNswYtyS06%2FH5Z%2BJ1vTvCXLEE2xUrR0LaovKepMjafvKgHgL1b8L5d%2BAacNt4t6uiF%2FVfNY950IN38QKA0SEx2ns6ybK37SkxqZiPLv8ayui5LnY8h4QtWFzC9Ii9fmYZotyCjlOaGQ9TsfXOGYMI%2F%2BP6deoXOkk8AekDfW5v6c%2BUJmP55AH579BZWKuftrhl%2FxmM5xSH8yQeGBVOPq7SubcHFRLbWauAZgtI1k911QwZfVms9RlETYezDaR2QWdLYbv9Y76361eXX4WtElUoQXzoV3WGbfNUGvAbdqxi5GaUEENI%2BiD0e0s%3D--csIqewSs32s3n7lO--zgiHxzoWyqiIPXHtLIGqCg%3D%3D"

#sends GET request 9 to /api/vintages/158872529/highlights
cookie_request_9_ends = "_ruby-web_session=ChUK7wrLY3M5YOuaPmdxsxIx28AAhk4bW3PgXUXOpC95Uw4DbNYrz5%2BlGeGowSKDJ2jLrXQ8l4gzQBLqME7O7yUawaQEL21vfwDe0mueQ2nFfgJBdu8QPTOpo%2F6tjYKwdle%2F9Q5hn9VqzpJZBFvHF86n4bQNPfRg%2F8q3lsMICH0mD9hSgP44djNotXWt90x09KuuHAVxopPs%2F6sPI4rsnRQfWvbwkSAhaXVgFppvIZX2tIJh2nI%2BJwk0EYa6Ijmm5IbiQYqLpKqm2Nips0vGFBWrn2ywh%2BMGHZy5H9z0gmwVtKQ4gvZ77g9hjUlfUDPoeuObAz6zIon7K7KaSeQhBlga%2FUgjy%2FKOTO93URdOsx4%2FVQcK%2FvWEYZGVuPCarTTHDnYeX690GjubU8WQWczx9hxRufsBm%2FlKB4jGDbMaRPZ8NSRrhyT4vxa28w5LP1k9W%2F%2BE4t76Ig4WRTfNt1UM6VmbNNgwt0etciOPYUFqSLwCMSGU9GwLt9kAOxY4BMTgwN9Y0n8LzATSqOpuMUuH4FUP1%2F9s2Jf2BIlnWR7nCYjOUWBqOMdmtZayS2Gw%2BUspgnCuf0SwxBMmvzWFME87%2FK%2BlRqKC%2Fn6GcXFtGOpRNjYW0EXolyRYmg6PakPTJaITv7LMRwnOa9VJd0NLN9qUDcGK9axaWlPRHbBvT1DTGvxJsDoWtraXx6234aOJ42jFCwqYrt7c3gqnh%2BP8vN%2FydnDe8K4VaxY45QMTD6Sj7IosdNqvjJKRI78JO6nUX1sofBlJ2tJq4smt3sWtY4aCpy4%3D--ltY5UTjlrgUvDLxO--owbtE9ZU6sxbQ6Vh6bmDCg%3D%3D"

#sends GET request 10 to /api/regions/?cache_key=446940327868b531465ef20d93fb20c8e7a1f4ceb068388a6aea
cookie_request_10_ends = "_ruby-web_session=0Cr1sRY%2F4LW7c2SQJ4Wdbq5LFxH5UTCbcxwLg4ryRJkyy%2BncKtfizZteqcvWq4C9fTepIWhJyJIWisS4v3g4ixxkpIUw%2F1DfHt5JMgH0v3CBftxG8I9OtHg5m1m5mvUy3p3tIGMdeOBlAyvdmNRE8vP4S8nQnnzx5J6NnMpMJzQysOmxCloXyugNKIN8JQxOz8vWVeRo76SKvenoB%2BD0ZAinTx496riDuTkmC770Ett5lQ0vWArXqDCarHDp0Of1BuEHI10CCDRvyFRw70LnmZnoe3CTGPa5Up1qgCfe9ZS4Dygv69XVAReYl2Ov7%2FiIrOsHe9DT9ta%2BggCjxGutFRo4LYsXHw5JjNcHX%2BbRAVvb3nEQxdvosoyR4LvUN5SfAZytjUVNmuebJtFuneHSQWcFsBv0YcL0xLIcJis4%2BYic2jTlmJddkioLu6A5qWWIr64Nyw4%2B7BI1JueDFlvctMphNBwk9qNPelqiPIZo9Z5YV%2BE96Nj8%2F9ZiU9ot77oxnURjYkcQdQdF%2FUOH9w7YvCjcLsOnTkeQLuIQG1akj4MtzWyUiniBM05RtrPPwT0M1k05TaQPqmLxdBF032DWTcLToR0rf1cZLrklC4CxEhDo%2BBntDljaTDqkKvYV4u5DHatq0BRfaRe8IAestHqCKlDubo0q%2BL5vAwB%2B10aRe1MXLKqdetVY65eT1XIUKxu32iuP8Li0XUa4%2FjURNoLq3ONUroRRPvtKzhgEu3IB0g%2FBzi19ELD6WaVR566ggslVViwT%2F89SQi1JZ9Vof8lwsn4%3D--Ad99Km%2BUiC2tE6Vl--cIY37u3jVM9qttn6f1DD0A%3D%3D"

#sends GET request 11 to /api/vintages/159883127/highlights
cookie_request_11_ends = "_ruby-web_session=0Cr1sRY%2F4LW7c2SQJ4Wdbq5LFxH5UTCbcxwLg4ryRJkyy%2BncKtfizZteqcvWq4C9fTepIWhJyJIWisS4v3g4ixxkpIUw%2F1DfHt5JMgH0v3CBftxG8I9OtHg5m1m5mvUy3p3tIGMdeOBlAyvdmNRE8vP4S8nQnnzx5J6NnMpMJzQysOmxCloXyugNKIN8JQxOz8vWVeRo76SKvenoB%2BD0ZAinTx496riDuTkmC770Ett5lQ0vWArXqDCarHDp0Of1BuEHI10CCDRvyFRw70LnmZnoe3CTGPa5Up1qgCfe9ZS4Dygv69XVAReYl2Ov7%2FiIrOsHe9DT9ta%2BggCjxGutFRo4LYsXHw5JjNcHX%2BbRAVvb3nEQxdvosoyR4LvUN5SfAZytjUVNmuebJtFuneHSQWcFsBv0YcL0xLIcJis4%2BYic2jTlmJddkioLu6A5qWWIr64Nyw4%2B7BI1JueDFlvctMphNBwk9qNPelqiPIZo9Z5YV%2BE96Nj8%2F9ZiU9ot77oxnURjYkcQdQdF%2FUOH9w7YvCjcLsOnTkeQLuIQG1akj4MtzWyUiniBM05RtrPPwT0M1k05TaQPqmLxdBF032DWTcLToR0rf1cZLrklC4CxEhDo%2BBntDljaTDqkKvYV4u5DHatq0BRfaRe8IAestHqCKlDubo0q%2BL5vAwB%2B10aRe1MXLKqdetVY65eT1XIUKxu32iuP8Li0XUa4%2FjURNoLq3ONUroRRPvtKzhgEu3IB0g%2FBzi19ELD6WaVR566ggslVViwT%2F89SQi1JZ9Vof8lwsn4%3D--Ad99Km%2BUiC2tE6Vl--cIY37u3jVM9qttn6f1DD0A%3D%3D

#sends GET request 12 to /api/vintages/155278137/highlights
cookie_request_12_ends = "_ruby-web_session=Xtil9lCKKzmVPy90RUIeWWUWaoXcMqsCtVUEg6d6nL8m9UNrI4fs0O%2Bp45I9H6oR8ssNLYtJQDzHvtOpuKCpV%2FgVueCngEf%2FfpGDLwivDaUT10oyLqBe45LIfoTqcb5HzMUU%2BG3O70UF8tCzdJpJNH6toKLOXT8mY%2FIfCIlrWF3TuKdyZefApuDwI3cLsuCATJpl9ubhMVwNREYrtc3Jyrb1p3VAucg3BU8zIbmH22lLc%2F2%2FEQYSzLhZuydlf2XtBu9%2FQ0zuX5bZcw%2FLm%2BC3085D2QwYe8xaEC8%2BHmANZZ7jksnw7cDSx9EZQI%2F4dzjn2wZLKhRP8MY%2FEzr3Jxdh3dAyqtmsR9nOyALcBpL42PJvP68ocpslgDgFlUv5wr%2FtrfCF2%2BD4CbQNlsSKVB5QS%2F9vUaKjPyS8Evo%2BtAXRTofqCxjuoRo1MAjg5zQuiYpIcrhIiixUeRO1vERwOeqXZmpSXvu%2BRW9TUAWcnK1GQGWiAVYdmxD1vfoQNPqbkzPL3K9zouY0jndWo6K%2BkSB492%2BqLmhjrFVwLN42wkj3ovwWz34z8GQ2ezjIPvDGpCW8XWeZUBxWLSeNq6esFXiunzB9uq5LFSk1NZnoUxIHRt3iBhdH%2BEuwqy%2BJ0LKfSjCsph5zvYH%2FEe%2Bmr96ak8b7hsale91PkhkmuTEC7cvnw5KdF7shRICusDOGJMjN3y3ZQwhTiN7pCoLNPFRi0XOsWDVmasM1yhSsZBY363TwJAgYjoeXsCSqh%2BK%2FuHJbG5T9k3zi%2BqeDrK%2BzjlrEvvZEDic%3D--5BtR8CL8xHZe5eJE--j0OsfEweo73dxbBbacqg4w%3D%3D"

#sends GET request 13 to reviews
cookie_request_13_ends = "_ruby-web_session=H46hKaA5WLweuvW50117jHIXRg%2FgiUpEwYjSdwpkOAmchg92tPrlUmh5qEimUsMfg70JUy4f1gU0M%2FkFKtbWSKQUsZtH1QkeKvlNekizyjo6kpTgRy5Z9mrKIDYqwfkBeNBkbMtEv%2FdTk29Co2fRtt%2FsBdQRt1iRHSFcGl%2F3FK8zC2ZxkXvoCL0gPELpMki7t%2Bx3juz8DbzaoKGhmMC2kKvGYTTASAAprAvQQqP2frXGB249PuW9Vvaf79%2F%2FMSU26HUd8x8iav9R72pdzKcgBHyVUo7F5igkIQOgESryCBtLMI9oURWBIHr%2FnFzGR%2BJvh2wwkn4Ubl6stPDrxp1mV2gi2M7f6APXi9jEfFcn0l5ZAFsbF36C5O%2Bj51phgPkI8sUwoez%2FEan0uLKZKA6IO%2BGcJLuwXz6KEHpysRz1VlLG63hrLmgi4SbaSZOj6vWq34CyBQyynSNYBVMOsEndtY1XjrStK3K7rSyGQDmqcm0cY7x7wgYCxAbADAF11fbS9GXdLRblfs%2FbbHbF0VpvWWA9KZnWHAB1f8W00ELpS3jmUcpk1qShDUEiiB4fveg7WNicq0myJo4A%2BN4yiTPi6O934h3iOMJAFnDR97epl68X1T%2B6dEIcXjfnqKOqjQM8Ty1ws1Af3fawRHFo7rvphe6IKDO2fRW8Puu%2BlVd1CfqFjP%2BMB9jEuNLqyKSe19zuS6oaBoF9prZa3Lee2eACKNF%2BcQtXZH5TBIngVZU5SyXJFRukb35aM6P%2FqMtnQdla8yJwCxtpc4F6DBmaXOWU8Gw%3D--x8Olh7KGOhU3crkf--Ld9I0%2BDs4ktZ8x3d78kWWA%3D%3D"

cookie_response_2_set = "_ruby-web_session=610ABg35m89YnOUYXSmoIDeFarHza2RRYsAlc4YPkVAf2q8hM8Mq%2FYO0rdzO3WxDxApbQ2L1MWDJ4YPPBW97o4539%2Fq2L2eHtGWJ%2Fs%2FIlSzCLMEzXEBl8%2BTWaCCAM4GDg3MvxIO9T%2BIWKKIX2jUKccD1M1lXn6cwo%2BfRXF51zq6KtQ%2FhRImjamIFQbzO67rclZZa0q2cftGcH%2B3B3%2FyNUX%2B1tAsEK8uDtcQL7%2F7bwS7glVuFPPgEowtYjQRFjc7XYrFqOJ2L0l3DRwnoDmNJRLa2r3XrfdPzVQ6z%2FpKaz43TGOKGPVZd%2FB7MQQUSwyK1SU8ZcLJJO9EYR2sJIm9rn2m0yk9fyQ2dVTSAGPJh1n8eACyBxXwg0V4%2BG3w%2FfXXrcQBD6bS6hXs5JTZSEfDclIHsXxTwdBo%2FNW1Iq%2FP3TE6upDmyavyonNjDQuwVSSshZ9Y1zV0p9LkGsec%2FfiKRjoLfL13Jk%2FrZ4ugTSQZuNVQtOSVYD%2F70hUjW1xLFd6oMQToQbhaQOpzBQkOvAbpDFlywnbFjsqdP7%2B8vCACXhrOlLW%2F%2FfyfygqKXqO0xVCqqfs9730tK3qOxnVqZLWjaQNWUWUb5JBHvD7eP6gMlnQJalvVZ1sAraT%2BC5ML1%2FiFm1eXRW1rH5pN9WoBtxxZyohPHvT96nitSKWmHpP%2FjJgqJMEkr48S8jB%2BaSMF91jJ8O%2F8S8IlW7vsq%2FZsO0RAUxeSaUI2Qx89KkeJ3uk3iY9cteBvQmmSo6v5rCu8%2FFvAMlzSxsSn%2BHyEH6ovVNcXaNiI%3D--ENToRogmTxffKBJo--a4FJAOylSQik%2BcSWvtV0Tw%3D%3D; path=/; expires=Tue, 16 Nov 2021 14:12:55 GMT; HttpOnly; secure; SameSite=Lax"
cookie_response_3_set = "_ruby-web_session=kB2p%2F%2FrYzRwu91d5MsKwrJ6luZgeos4LRW%2B0iqZ7Q9faLiyIUX9T7uNRNFfok8VoSnF%2FrbfsLn4EgFN5wcPWKO7pIiESf8FPTvi%2FVC3Pmv0P%2BJNEGcuif1sgMdEMCTypauejSWQceF99OMqi2Hc3fDp5srxdCgSUB0VFjRe6Jpt16kbgQgJLK2J5XsC%2FqKXE2M2mTk4eNlEksvNmBCRdAZLFul9IHCAPut647Q5rt55AhrrKQQ7tkyGw3JK9yz1l1nHpRIoYV0imv4WGvmqb85gIRKsiQVXe6uOZJ5vVvOTDupgwR4G%2FK6ByxbENam4CA9z3X6j06w1CnewKI72ySsEtESYAxSMyBQDzKaPd%2Fr9cMKCdW%2Bve11AXXV9FzqlUs17nCILM93ktKWCE7Fyfrerotgrj8Dh%2BhYG6jfJDt2nqBUo4CULXWZv%2FweLMPLzyBP6DZ5gIVegz%2B7ovuPCQLNMhs6HfRVoVV6TRuJdjhcGIUicdR40%2BsOeXkKPQD0WeLBImDekljtkSD8C4pjKXevP38is0OvbyLd8jRLSPMgO57wtQa6XdjID1CiXPqICNnRPELi5WQdqFhofTxsMXU0mMygxrkYoRfxqKCHJHAqCNtOvaPRpv06j0%2F3NMyQAKC26HAAFwg7nW1zgGjDKx8VbRKUJuF1lb6XbpvhdlkwgpPTBwDTy1As2pFgLwRngGwSuppYWigBhlOP%2BLrrF%2BrwnrkjM9D1P7hBwr%2BTBNqacWtstSOWPF5UaVDsMVDrg1tf82KBG6Z5R72gObtNJVtF8%3D--duYfzdIX8l0nuK6%2F--gcfoA7AecFLk8XViWsDH6g%3D%3D; path=/; expires=Tue, 16 Nov 2021 14:12:55 GMT; HttpOnly; secure; SameSite=Lax"
cookie_response_4_set = "_ruby-web_session=VJi46fimOfqCzHnWcojAKlJDBqK%2F0l713pQ0ksI1nx2mmljF%2FK0sEo5XdHGBxFYo%2FdSJ8MeZLnn7qD%2FiooBT63VSXxpK6QM2ozjoiq7zGOj8dumTVybZ%2FxVqlwEHiW6ZblQneeZj2pKxR7p9HSkOoL6G66jS0Lu3jMVZ38d40Bzx9yGQT2RGujVxW%2F9nQ8pIkyJwp%2BW2jryPvuD3DV7n4CF0uuGuay1UN%2BFidxgl0g4MpyAT8WvKK9FexgAvs5iVxXPtkIz6LIJBDNJSYj42%2FN6zVDW22mfTkNlz8Zn31auLDloTehRbVLNXy9ru6mBZGjBKCLzwtJwt40mlFYeVQsk0dtz67fJydEZ0%2BS64yxWhrw2sx%2Bss%2BV%2B5MAPN8f5jaipkWAEZ569ekq%2BWc32uBbOjr43r1ZS57LFbM1vuTR711Wq5nbMG1Q2P69QROSqwS6NGbkPX6hASJfyCY1buHJcQI0RT9eIRxq2fnPzY9jSjT%2BFbYOoWfqRRhhLMorRrjlYvbycYLlCltZJw4wrADmib6DEXDouYWGuuOlxeQPwHHkFKj7wlTiMMKQdcS8JmAxcznuUc71CXD4UUFzN905%2FqmhsBVFjIBl8NvdxEE4XD9usoHifY3fZaRH5tkEf0Jqh6NbrFOJAfsLeJ%2BeNyr%2B6AGS1%2FuhNN6We4pIgYxcd0HX%2FkX0Il%2B0hrZ%2FYXPm0I%2FA6N%2F3K83qXX0jjI3%2FrmivTaagSv8iyOfMc5qyhkxOrOb3RwvnFoLIJT0Z0XtIe4Xi3I0tSxm5LyzoSsvtF1CkA%3D--VTU%2BgpG5G8ewfnvd--BUSfbCTOKvH%2FFCRZqL2eTg%3D%3D; path=/; expires=Tue, 16 Nov 2021 14:12:55 GMT; HttpOnly; secure; SameSite=Lax"
cookie_response_5_set = "_ruby-web_session=ZcJJN9OEfqz3gFD72n%2BJSuh7t1SFtQABn3%2FChYMZkWPz2MvFbudtcHWN6rZYSWmhr1u1qQ8tLCOgv17%2FEo4D1cqSzbT2D1tTku81iNwCAGhNPnfG%2BC0skBjYy4KmicyUfTNmhCnBSjvdnqHROiJXqiES4K6fJkUOkm9cYFIF1UCXhdecMrt1Ppbsjfddqlxk6uxCqdu40%2BYMlv5AUKurhWHy1%2B3qdn%2FRLIlJcmI0cdAUUleDKRMWg2GVbO72UWBg%2B62PbPoYAIfJsu06wAJE4heBegDrtoFXnHceOAvMmVefFbSPqFj23zlB%2BZwuA5YAnQvr0BXN%2BIAf4Oya9qgbhqPhQCLFCkTFmng2FWIc2q4n4JKt8RuW90hXyRfDVhCQtdVJLaxpFHF7QR8S%2FIVybBaIf7zAacT8BzpoebbuqwuvVVG%2FD%2FVWz%2Ba6uEKb1hFNwHU2PEWvthmzcL%2BLqxMMTtBzg%2Fcz5gWyMNI%2F%2BmxqB7GK0jEcSwhOfc%2FLIv7iszBdrrGcqgZe8zC%2F68zsnjsgPhgAWBta%2Fd0H21QAb20ulp2Ocptjxl2ttcW8QDDjaop1YWEfZ4sXvTczGzg7BEcY%2BJpxxZZ%2Feyb%2BWSpGpcQkO5GiuU2bGnf1JSQ9JJ7ePOiLY%2Ft%2BwUy9zTD%2Fb8WFjL7GuH1Cx1lgoae8Rf2IPrerQ0YNLiW1vp%2F1zxSkfMdO1z6P7b07KYtVa4pMzg5p74quridbLT9DrylB7L4WF4Qil7zllVVuFA6GT46VYZeXKm2PWKxL%2F%2Fn6F0TClhIcwY6W6ac%3D--j%2FkYlpzD8ZiV9XRs--LeYC2299zH89lMiCRFT6Yg%3D%3D; path=/; expires=Tue, 16 Nov 2021 14:12:55 GMT; HttpOnly; secure; SameSite=Lax"
cookie_response_6_set = "_ruby-web_session=oNuxPOQPixXfw27wIOnDeLVu5H%2FAJycGg9RKhuzatmMswm0SYT28f4MTMhFFLXaYGwL9y2o1l7NvgD%2FgP1XCGMvjI3rkvXAkvcYmE%2F9yJ2%2FlLM4HKc5i22qxppbKhU%2FA7dUp9jPYulk76rLLFZAhQ8jx4bfqlotLwK8%2BZ9%2FYEVcv6PO351cp%2FedXew1I2FXcdzt%2FQzqiQxlHKkfrP1dSUK8yLmcHU4wMU07SYuWfnnneDwstR7x8yQ%2F3kW2EGERbIFvkWwGe06f9spTNoXhJdRiyTB7C5lkEUt1Tn0ysVvAoKKGafRDllJ9mwUQDRImpXBIoBTS5%2FbxGzGP4NFG%2BHcpYya7HFEL5LXfntZ0lixNtduLYm9ZQXwyFDRbH35VKZWq9Bo9ajNYXYzT6ruwfUw8pHzD4e%2BrVIAkEIvaxr2f25Rt8V%2FBqXODOq37r8kIuLtC8c3ytQm8wMy7RJPjqk8v4MJoHKJ9SNswYtyS06%2FH5Z%2BJ1vTvCXLEE2xUrR0LaovKepMjafvKgHgL1b8L5d%2BAacNt4t6uiF%2FVfNY950IN38QKA0SEx2ns6ybK37SkxqZiPLv8ayui5LnY8h4QtWFzC9Ii9fmYZotyCjlOaGQ9TsfXOGYMI%2F%2BP6deoXOkk8AekDfW5v6c%2BUJmP55AH579BZWKuftrhl%2FxmM5xSH8yQeGBVOPq7SubcHFRLbWauAZgtI1k911QwZfVms9RlETYezDaR2QWdLYbv9Y76361eXX4WtElUoQXzoV3WGbfNUGvAbdqxi5GaUEENI%2BiD0e0s%3D--csIqewSs32s3n7lO--zgiHxzoWyqiIPXHtLIGqCg%3D%3D; path=/; expires=Tue, 16 Nov 2021 14:12:55 GMT; HttpOnly; secure; SameSite=Lax"
cookie_response_7_set = "_ruby-web_session=5hwnPtM%2BRN0xTzbyVhrdpj5I0O%2Fet7RQFhWqmnu9Enn9cJGDdr5xjqgfUtySZIy%2B%2BXoRf74BTtQT6uLu8jVi8be68zJUhgBL1oLR7h4VoTN86VZqhdjiM4c0OWSVsa2IBaktsior%2FHN8Vso10uE7CZmfH1U0Fu%2FapOE4PY7H%2BHmlj5oMlsFBjDRysuBYvgq6faJ6sxqsGFbCfuGrHn9qH6o7nSY8v%2BADinM90InvRUr73OiOEO%2BdHbxzO3VVRro1W%2BW47ckw0HrdVQmwCHvUdZ4bQhXPI8mLWh7vfXsf2S0Q%2FsUCJZZh%2B0ddvtBB0fawNU7kHpws9X%2FSfNEvPF3W7KEDOkEA49wK7Zqi1zlSFrICnH5VXRzs%2FpXcseu%2BfubBfxXshcaz1NupcZWDAShOIpUfretl%2FXEdcDBreV8LMp%2BzsaOsUoOR9p3HRaPVO3xkaaHmEAt1BO82Nz7EWrqS%2BrWERW%2FGvfYv9VGIeaiF6JTzoZ1ZyNUAl%2F%2F51B5pTm5AIlMvWhpPWpMNLDufba9V2lufJ2eBFyNTc1tXxdUVdo8EsGXmKjta4W81oBIXR073zKCsbOtH4QmjN5f9s4aJ8YWc4R0YK5AQZA1SbBLgKyPaluO5ptlYSlMXjJ3B2uPkK8cv2Xz3ew112Voz%2BWTxY5c%2BN2CBKI7ZqmbmQAZE4%2BaoX23YRamASQUFVXSFVFeEz3GDKwGshrqndDbsUOjcfnE8or6eD0TBssSmaNEem8dpC0wgZ%2FklRqzxjDcGrBpKNNA3GjedzGMXkkbooaLoJBo%3D--0wjLG4JtOYIHAJd%2F--ulZkgJIyscBTds%2Fu2r2asQ%3D%3D; path=/; expires=Tue, 16 Nov 2021 14:12:55 GMT; HttpOnly; secure; SameSite=Lax"
cookie_response_8_set = "_ruby-web_session=ChUK7wrLY3M5YOuaPmdxsxIx28AAhk4bW3PgXUXOpC95Uw4DbNYrz5%2BlGeGowSKDJ2jLrXQ8l4gzQBLqME7O7yUawaQEL21vfwDe0mueQ2nFfgJBdu8QPTOpo%2F6tjYKwdle%2F9Q5hn9VqzpJZBFvHF86n4bQNPfRg%2F8q3lsMICH0mD9hSgP44djNotXWt90x09KuuHAVxopPs%2F6sPI4rsnRQfWvbwkSAhaXVgFppvIZX2tIJh2nI%2BJwk0EYa6Ijmm5IbiQYqLpKqm2Nips0vGFBWrn2ywh%2BMGHZy5H9z0gmwVtKQ4gvZ77g9hjUlfUDPoeuObAz6zIon7K7KaSeQhBlga%2FUgjy%2FKOTO93URdOsx4%2FVQcK%2FvWEYZGVuPCarTTHDnYeX690GjubU8WQWczx9hxRufsBm%2FlKB4jGDbMaRPZ8NSRrhyT4vxa28w5LP1k9W%2F%2BE4t76Ig4WRTfNt1UM6VmbNNgwt0etciOPYUFqSLwCMSGU9GwLt9kAOxY4BMTgwN9Y0n8LzATSqOpuMUuH4FUP1%2F9s2Jf2BIlnWR7nCYjOUWBqOMdmtZayS2Gw%2BUspgnCuf0SwxBMmvzWFME87%2FK%2BlRqKC%2Fn6GcXFtGOpRNjYW0EXolyRYmg6PakPTJaITv7LMRwnOa9VJd0NLN9qUDcGK9axaWlPRHbBvT1DTGvxJsDoWtraXx6234aOJ42jFCwqYrt7c3gqnh%2BP8vN%2FydnDe8K4VaxY45QMTD6Sj7IosdNqvjJKRI78JO6nUX1sofBlJ2tJq4smt3sWtY4aCpy4%3D--ltY5UTjlrgUvDLxO--owbtE9ZU6sxbQ6Vh6bmDCg%3D%3D; path=/; expires=Tue, 16 Nov 2021 14:12:56 GMT; HttpOnly; secure; SameSite=Lax"
cookie_response_9_set = "_ruby-web_session=0Cr1sRY%2F4LW7c2SQJ4Wdbq5LFxH5UTCbcxwLg4ryRJkyy%2BncKtfizZteqcvWq4C9fTepIWhJyJIWisS4v3g4ixxkpIUw%2F1DfHt5JMgH0v3CBftxG8I9OtHg5m1m5mvUy3p3tIGMdeOBlAyvdmNRE8vP4S8nQnnzx5J6NnMpMJzQysOmxCloXyugNKIN8JQxOz8vWVeRo76SKvenoB%2BD0ZAinTx496riDuTkmC770Ett5lQ0vWArXqDCarHDp0Of1BuEHI10CCDRvyFRw70LnmZnoe3CTGPa5Up1qgCfe9ZS4Dygv69XVAReYl2Ov7%2FiIrOsHe9DT9ta%2BggCjxGutFRo4LYsXHw5JjNcHX%2BbRAVvb3nEQxdvosoyR4LvUN5SfAZytjUVNmuebJtFuneHSQWcFsBv0YcL0xLIcJis4%2BYic2jTlmJddkioLu6A5qWWIr64Nyw4%2B7BI1JueDFlvctMphNBwk9qNPelqiPIZo9Z5YV%2BE96Nj8%2F9ZiU9ot77oxnURjYkcQdQdF%2FUOH9w7YvCjcLsOnTkeQLuIQG1akj4MtzWyUiniBM05RtrPPwT0M1k05TaQPqmLxdBF032DWTcLToR0rf1cZLrklC4CxEhDo%2BBntDljaTDqkKvYV4u5DHatq0BRfaRe8IAestHqCKlDubo0q%2BL5vAwB%2B10aRe1MXLKqdetVY65eT1XIUKxu32iuP8Li0XUa4%2FjURNoLq3ONUroRRPvtKzhgEu3IB0g%2FBzi19ELD6WaVR566ggslVViwT%2F89SQi1JZ9Vof8lwsn4%3D--Ad99Km%2BUiC2tE6Vl--cIY37u3jVM9qttn6f1DD0A%3D%3D; path=/; expires=Tue, 16 Nov 2021 14:16:54 GMT; HttpOnly; secure; SameSite=Lax"
cookie_response_10_set = "_ruby-web_session=UikTXic38Oleukz8hNgtmmxTXZYPQWKGVL1QdVdNOttvf5tYZj42sEIoPBtk%2FGMzz2wE3w74ILj6tOdvQtREGi1uPLynPdKGC1AFnpi7JS8CzlbOpi7nrZy%2BQUfvJEwLi8PSpTyPzdKeAYxHk5kAz4tbePwGpzHPO39hP%2BJEr%2Bkme3TksetyrMyC4DdDb58HUzX0AjMxrCOxubhhNOQZeCfUm6KIccU%2BwUMljNjhOoTSReUmcuWD%2FN%2FnrnxrY26kzf2wLk%2Fjyy%2BZrsYaTf5%2BhcUu4GzmR0PpqUsd7HZGE%2FFwGfj%2FMPgJ%2FPplflyr%2FBhdqbkar8iE1qgDfzY6%2BNsxK9V4YBNlCotPp5j%2BaVoiZrKmdFFQnoxBBlUoL8VD%2FKG92PgtK3EZkZXjX82l71EYciFRd7LTskHzEanFTTEhXreJ0D2dENTfJLes7fLtPZapBSYV5vbXbMQ8H6kNZENlW6h%2BX5D1te%2FHZS8bI4CxCgKAMxiHN%2FUkJC%2FEBjTO1rzZIX%2BK%2FtH8K45e6%2BDiymy%2FcGUe5wvQFjVe%2Fbz4%2FOt768LZ206R6qkg2DTWXbbL%2BKQD9twxhhhJ%2FboSmxNNqP3P5VX02skB3cN9af0OmQCAgOm3l8R0M98rMZhSvyp8zcK9a66aPUkkqwOjOVFu%2B3riXMGNm7EA6%2Fq9uz4a33ZK8aE8goV1PYuRPGgmQDxTrvz5mx%2B6e9kbhsEm1tYnUo7wIjDYdnK7zOh5GCweeSSQuRc3lba8CokM2yx5%2FVVpbK%2FLmStScsif%2BBzXXjqPgOmslQ4%3D--ZpUOsMJpAgImfFFc--P5lWxssFB4WB3OXFeviXVg%3D%3D; path=/; expires=Tue, 16 Nov 2021 14:16:59 GMT; HttpOnly; secure; SameSite=Lax"
cookie_response_11_set = "_ruby-web_session=Xtil9lCKKzmVPy90RUIeWWUWaoXcMqsCtVUEg6d6nL8m9UNrI4fs0O%2Bp45I9H6oR8ssNLYtJQDzHvtOpuKCpV%2FgVueCngEf%2FfpGDLwivDaUT10oyLqBe45LIfoTqcb5HzMUU%2BG3O70UF8tCzdJpJNH6toKLOXT8mY%2FIfCIlrWF3TuKdyZefApuDwI3cLsuCATJpl9ubhMVwNREYrtc3Jyrb1p3VAucg3BU8zIbmH22lLc%2F2%2FEQYSzLhZuydlf2XtBu9%2FQ0zuX5bZcw%2FLm%2BC3085D2QwYe8xaEC8%2BHmANZZ7jksnw7cDSx9EZQI%2F4dzjn2wZLKhRP8MY%2FEzr3Jxdh3dAyqtmsR9nOyALcBpL42PJvP68ocpslgDgFlUv5wr%2FtrfCF2%2BD4CbQNlsSKVB5QS%2F9vUaKjPyS8Evo%2BtAXRTofqCxjuoRo1MAjg5zQuiYpIcrhIiixUeRO1vERwOeqXZmpSXvu%2BRW9TUAWcnK1GQGWiAVYdmxD1vfoQNPqbkzPL3K9zouY0jndWo6K%2BkSB492%2BqLmhjrFVwLN42wkj3ovwWz34z8GQ2ezjIPvDGpCW8XWeZUBxWLSeNq6esFXiunzB9uq5LFSk1NZnoUxIHRt3iBhdH%2BEuwqy%2BJ0LKfSjCsph5zvYH%2FEe%2Bmr96ak8b7hsale91PkhkmuTEC7cvnw5KdF7shRICusDOGJMjN3y3ZQwhTiN7pCoLNPFRi0XOsWDVmasM1yhSsZBY363TwJAgYjoeXsCSqh%2BK%2FuHJbG5T9k3zi%2BqeDrK%2BzjlrEvvZEDic%3D--5BtR8CL8xHZe5eJE--j0OsfEweo73dxbBbacqg4w%3D%3D; path=/; expires=Tue, 16 Nov 2021 14:16:55 GMT; HttpOnly; secure; SameSite=Lax"
cookie_response_12_set = "_ruby-web_session=H46hKaA5WLweuvW50117jHIXRg%2FgiUpEwYjSdwpkOAmchg92tPrlUmh5qEimUsMfg70JUy4f1gU0M%2FkFKtbWSKQUsZtH1QkeKvlNekizyjo6kpTgRy5Z9mrKIDYqwfkBeNBkbMtEv%2FdTk29Co2fRtt%2FsBdQRt1iRHSFcGl%2F3FK8zC2ZxkXvoCL0gPELpMki7t%2Bx3juz8DbzaoKGhmMC2kKvGYTTASAAprAvQQqP2frXGB249PuW9Vvaf79%2F%2FMSU26HUd8x8iav9R72pdzKcgBHyVUo7F5igkIQOgESryCBtLMI9oURWBIHr%2FnFzGR%2BJvh2wwkn4Ubl6stPDrxp1mV2gi2M7f6APXi9jEfFcn0l5ZAFsbF36C5O%2Bj51phgPkI8sUwoez%2FEan0uLKZKA6IO%2BGcJLuwXz6KEHpysRz1VlLG63hrLmgi4SbaSZOj6vWq34CyBQyynSNYBVMOsEndtY1XjrStK3K7rSyGQDmqcm0cY7x7wgYCxAbADAF11fbS9GXdLRblfs%2FbbHbF0VpvWWA9KZnWHAB1f8W00ELpS3jmUcpk1qShDUEiiB4fveg7WNicq0myJo4A%2BN4yiTPi6O934h3iOMJAFnDR97epl68X1T%2B6dEIcXjfnqKOqjQM8Ty1ws1Af3fawRHFo7rvphe6IKDO2fRW8Puu%2BlVd1CfqFjP%2BMB9jEuNLqyKSe19zuS6oaBoF9prZa3Lee2eACKNF%2BcQtXZH5TBIngVZU5SyXJFRukb35aM6P%2FqMtnQdla8yJwCxtpc4F6DBmaXOWU8Gw%3D--x8Olh7KGOhU3crkf--Ld9I0%2BDs4ktZ8x3d78kWWA%3D%3D; path=/; expires=Tue, 16 Nov 2021 14:16:57 GMT; HttpOnly; secure; SameSite=Lax"



cookie_response_2_set = "_ruby-web_session=oNuxPOQPixXfw27wIOnDeLVu5H%2FAJycGg9RKhuzatmMswm0SYT28f4MTMhFFLXaYGwL9y2o1l7NvgD%2FgP1XCGMvjI3rkvXAkvcYmE%2F9yJ2%2FlLM4HKc5i22qxppbKhU%2FA7dUp9jPYulk76rLLFZAhQ8jx4bfqlotLwK8%2BZ9%2FYEVcv6PO351cp%2FedXew1I2FXcdzt%2FQzqiQxlHKkfrP1dSUK8yLmcHU4wMU07SYuWfnnneDwstR7x8yQ%2F3kW2EGERbIFvkWwGe06f9spTNoXhJdRiyTB7C5lkEUt1Tn0ysVvAoKKGafRDllJ9mwUQDRImpXBIoBTS5%2FbxGzGP4NFG%2BHcpYya7HFEL5LXfntZ0lixNtduLYm9ZQXwyFDRbH35VKZWq9Bo9ajNYXYzT6ruwfUw8pHzD4e%2BrVIAkEIvaxr2f25Rt8V%2FBqXODOq37r8kIuLtC8c3ytQm8wMy7RJPjqk8v4MJoHKJ9SNswYtyS06%2FH5Z%2BJ1vTvCXLEE2xUrR0LaovKepMjafvKgHgL1b8L5d%2BAacNt4t6uiF%2FVfNY950IN38QKA0SEx2ns6ybK37SkxqZiPLv8ayui5LnY8h4QtWFzC9Ii9fmYZotyCjlOaGQ9TsfXOGYMI%2F%2BP6deoXOkk8AekDfW5v6c%2BUJmP55AH579BZWKuftrhl%2FxmM5xSH8yQeGBVOPq7SubcHFRLbWauAZgtI1k911QwZfVms9RlETYezDaR2QWdLYbv9Y76361eXX4WtElUoQXzoV3WGbfNUGvAbdqxi5GaUEENI%2BiD0e0s%3D--csIqewSs32s3n7lO--zgiHxzoWyqiIPXHtLIGqCg%3D%3D; path=/; expires=Tue, 16 Nov 2021 14:12:55 GMT; HttpOnly; secure; SameSite=Lax"
cookie_request_2_ending = "_ruby-web_session=ojWEad5gcHym50K%2F9aXQ08Vo3qYrJ8rxVB5SW0kRVIFkVymVYp02FlVYMk1fzUOJ2BAQDRMeLkGgnWy3J4ORuQJu5BrfLHDA8KaK9p5gdgLD4AYhFjEQpwYwKxg0TZ23Oz4Iiig%2Bc88D6SpPrAJj0uvtgsRM%2B4lNOZNgf4bssgvxi4LdNwgxmntKhbM1%2F8OvWONIyxSpZiC2eTqzRIFeeb2g6rJJm2zgMzKrk60cPucOWuMo2900HKEyVsQRfWLZKj9wc9sUXWzYyW0IJuQNF0%2BSGt%2FrJeM6iWotfngysEXM2o7I2jQgeRwsrQDFUH7u2thXTCHUEdFq8%2B11ZWzS7qaxYEMzAvADNiS0DFumxquZ9wDZTbAl3UZpx%2Blw9I2icw9gMNRvaFxgZdp8EySFUNSVyRBCihpMPcLw1S2piAHbVIjPpZoT2ypFFOrqcHU0wVaZRIXrcIJaJWgKHnFBf5kthyRCALnOQbOVHrplCxzUeAgCOaOJ1DKiJ3K68vWElvgLM5DWZnMaYYFjrja19X4yK3CL37JwmUopHc%2BTSd8BbRE5xCgjVVUPD4Uxdgze%2Fa0k6CQ6XH1aQfxwB6oMJzUbcxYbpB81Rp%2BQ8yUeTU7EmRF7VvEvEUNLIUuiVqbr%2B6RghvFuzlRdzAga0953g6eShVfhwBVoDCb1%2BYLaydtYSfZBmlUpzQ7iVK1sgfl7CxdxOuQDz2uvMEanSgll99oQasiT9hxWj9ec8yTD%2FUEXqP9gLWvJhgO5pGBqErk1NUV9vYPEqVrX9K4zJij5laM%3D--GEYwb07Zv7XUBeW4--ZrhLwtHiGVhgMIYVyrDYkA%3D%3D"

cookie_response_3_set = "_ruby-web_session=Jgq%2FbjiZMkAcP6r1fEuyriwRFg%2BFqnACwjjf4vH1IApsO3JjWRed0E84Re17kx96AiHoX2JcOXyqCc5oCCsvIoNUSgnycfB22lpyJK8NCVXmWwmuHjqJNUeEMRBK9PheUVgUK49YN2vqrsK5lztmM7LY80TaDo6YHEov60N4%2BVnIAFLGmvefs%2FoNsVDG5baHCatz2UulROMMdOQZPx4O56h%2Fn0KWOg9hLz6K21wi3ujaL9j%2FkWUTlOp5%2BNaKWLjFDD5fardlho7genwRw9w5G%2F2k%2FqyegpdX7pZrKrgjzZkQJkRqAwHVg13ztA4LVAFsCT0s1eGtSTmgFd0oUwmc5zJrvVARc1TcxDNbBk%2F79%2BS6vOCJh3b7%2FishHZCsDmPbwplO586k6GH%2FIGhN0XdzuHUY4B4XI8GVQlMg336KfwmFj0WGe18YSvavcFpCG3jinoIpDcb4%2FQUGvXUVoXJ0F2BWYD1mBNsaax%2B5DDd15iQYviB9Q0NjG%2F1E%2BN3OGDotVbIdiiEob2Adihxc%2FCUS00ghhcX8yvNYbmFQa8vvhiuu3g6ccAblKGpINNxF07SAnsGyO709oCjz6sBF%2F6nGUpa5UrxZKqQyYs9xgSjt5lLVbIb4kPSkpRCurET4iKJxkBXWjg6K2ts3pl1YZnInWkhpqIiZ7RjDHdV86ApeQi961HGKU6o5nDG6V%2Fjp8UIbypiuPTsVcpiZMQnmDq0EMt5%2Fy%2BZYLGyTNQhUUI2ToUYSbhug01PG59aYYh3476%2FtgfDW79Fie8QtfP1R0AlCZEA%3D--bdMYRAmgs89k5aPF--WxTV%2BWnmlWlCIlPsN7EBJw%3D%3D; path=/; expires=Tue, 16 Nov 2021 14:16:58 GMT; HttpOnly; secure; SameSite=Lax"
cookie_request_3_ending = "_ruby-web_session=H46hKaA5WLweuvW50117jHIXRg%2FgiUpEwYjSdwpkOAmchg92tPrlUmh5qEimUsMfg70JUy4f1gU0M%2FkFKtbWSKQUsZtH1QkeKvlNekizyjo6kpTgRy5Z9mrKIDYqwfkBeNBkbMtEv%2FdTk29Co2fRtt%2FsBdQRt1iRHSFcGl%2F3FK8zC2ZxkXvoCL0gPELpMki7t%2Bx3juz8DbzaoKGhmMC2kKvGYTTASAAprAvQQqP2frXGB249PuW9Vvaf79%2F%2FMSU26HUd8x8iav9R72pdzKcgBHyVUo7F5igkIQOgESryCBtLMI9oURWBIHr%2FnFzGR%2BJvh2wwkn4Ubl6stPDrxp1mV2gi2M7f6APXi9jEfFcn0l5ZAFsbF36C5O%2Bj51phgPkI8sUwoez%2FEan0uLKZKA6IO%2BGcJLuwXz6KEHpysRz1VlLG63hrLmgi4SbaSZOj6vWq34CyBQyynSNYBVMOsEndtY1XjrStK3K7rSyGQDmqcm0cY7x7wgYCxAbADAF11fbS9GXdLRblfs%2FbbHbF0VpvWWA9KZnWHAB1f8W00ELpS3jmUcpk1qShDUEiiB4fveg7WNicq0myJo4A%2BN4yiTPi6O934h3iOMJAFnDR97epl68X1T%2B6dEIcXjfnqKOqjQM8Ty1ws1Af3fawRHFo7rvphe6IKDO2fRW8Puu%2BlVd1CfqFjP%2BMB9jEuNLqyKSe19zuS6oaBoF9prZa3Lee2eACKNF%2BcQtXZH5TBIngVZU5SyXJFRukb35aM6P%2FqMtnQdla8yJwCxtpc4F6DBmaXOWU8Gw%3D--x8Olh7KGOhU3crkf--Ld9I0%2BDs4ktZ8x3d78kWWA%3D%3D"


In [84]:
reviews_df = pd.DataFrame()

for page in range(1, 3):
    page = f'https://www.vivino.com/api/wines/{wine_id_list[0]}/latest_reviews?year=N.V.&per_page=10&page={page}'
    print(page)
    reviews_df = extract_reviews_to_df(page, headers_test, reviews_df)

# https://www.vivino.com/api/wines/1112140/latest_reviews?year=2019&per_page=4


https://www.vivino.com/api/wines/1105374/latest_reviews?year=N.V.&per_page=10&page=1
429
b"Your IP address (45.41.132.223) has been temporarily blocked for exceeding bulk request limits. If you believe this was done in error or you have legitimate needs to access our pages and data above and beyond these limits please contact admin@vivino.com with the subject 'Requests Blocked' and we'll try and resolve the issue."
https://www.vivino.com/api/wines/1105374/latest_reviews?year=N.V.&per_page=10&page=2
429
b"Your IP address (45.41.132.223) has been temporarily blocked for exceeding bulk request limits. If you believe this was done in error or you have legitimate needs to access our pages and data above and beyond these limits please contact admin@vivino.com with the subject 'Requests Blocked' and we'll try and resolve the issue."


In [None]:
""" Reviews:general (same)
Request URL: https://www.vivino.com/domaine-hippolyte-reverdy-sancerre/w/1112140?year=2019&price_id=22907185&cart_item_source=
Request Method: GET
Status Code: 200 
Remote Address: 99.86.111.13:443
Referrer Policy: strict-origin-when-cross-origin



Response headers

cache-control: max-age=0, private, must-revalidate
content-encoding: gzip
content-type: text/html; charset=utf-8
date: Mon, 16 Nov 2020 12:06:15 GMT
etag: W/"0e1480bf1220ba2b7288be500e188753"
referrer-policy: origin-when-cross-origin
set-cookie: deal_merchant_context=zurY--XU3d93SUAGnBACGh--8FEIXEcHe6Xbf1Xpuju3uQ%3D%3D; path=/; secure; HttpOnly; SameSite=Lax
set-cookie: recently_viewed=0stn86aboz1B2X3DoHk9QYsvQuAG6B4PfLOFjVotsnCPUESkdR%2FM0ODO1hMp%2FQ4IN9oSu3TlTYQEv0b3iSCkGJxFwt9LtsPxSsC9r%2BgWLYUoknVqYg8tmW8Ik887honTZGJvRyqs2ju3YcCT0leJJ2t9SEqK6zKnrijSWveLVfrnGhANJn1Tkbkf7TdG6FM2Sl3dGd6Hjw4Qvfe4YgS%2Fc3Npz8LQHQwGsrqoA%2BId9keOQRXBbw5xCWjUdCJEfkynv2EAI4Hq%2FRFG8y5r4v25I6E%2Fo3z04MkDAU5w%2BLAWl0Kd5yOvgXa3lJ4eXqEigVx4JI5wK6QqLTd4Aq2xJsoD%2FCZEBOEs3Lq1Btmc623MoIXu2w%2F2V3QoK3qOQzZNEm%2BedgNzFLLQv2f9HUWFIxO2B5KvShHTJgj08QMnWl0A5yT9d7uRX24WJP7BBzvhGpIfVw%3D%3D--3JY7hvNUwRZRMDdn--auSoKwxjQJ1PKvNY0KgKkA%3D%3D; path=/; expires=Tue, 16 Feb 2021 12:06:15 GMT; secure; HttpOnly; SameSite=Lax
set-cookie: _ruby-web_session=KMKdFfctTJPWhbmAgcH5NNeSTAdxXFbEBn6zWzsmVzfwn615z3VbobzBy6sdnY26YIMun%2Fytqa95I9syO1kMIbeM8KolphimCZcxiNUTu2hfpAVDEp5WuG0XKqtFWBlvmTmi%2FJspvsLFXg8MA56dXmGfFK6MVcGnBqQau6ctv26phNSRraPgVqnPBU9Q9v1zx%2B1L2nO4dX8D%2FcvgPk%2FScLmmQUTzLX%2F9X8OB6y3Yh%2FMdzUkMATXntTi7GjdxF6QBWY%2FMXAv6LsIkh200Pn%2Fl2NOCBF5q3nq4Puw1aN60TsDReI4Rxmju2G1W1lMpZlMyFimIstzR2AAyCUK39078uX2Use4iakRWN2rl%2BeRWl6T2O3ZqPIIRgIZZAnnMbCbDTBaORXMzFfLMLXYg4%2BTzcxLqLbIkPeAoeuVo0qAXkcSdfbaQxvuMx5jCK%2Bca3Sg61ZvtC1JHncvch5fTdR3H4HczTWHQycZEsw%2B2AbmSh1NzfvPPsgJUhjMSdFiP35BQU2SatIqBhSbTKkbdF2JMMuh0dbUlgIuWVnZGpsAOyBXmc6ONROFknXoo7FFG8DKdJ7VnWCHTm55j9moLRFwd3DjDKmHNPiIlfbD0aacIYCt%2FItGpK913biTsljto5QT1tdytOj2G0eE6NqaGmjmSZqCuqTP8fj8iTMVeF2SMJcS2--PnbDrtpAbREHpeHp--AA%2Bn3SYl8LWoVTDuv%2FXcJw%3D%3D; path=/; expires=Tue, 16 Nov 2021 12:06:15 GMT; HttpOnly; secure; SameSite=Lax
status: 200
status: 200 OK
strict-transport-security: max-age=631139040; includeSubdomains; preload
vary: Accept-Encoding
via: 1.1 695eb63e742ec6b6e245772eb313e747.cloudfront.net (CloudFront)
x-amz-cf-id: 4Dsnll7y0QXRSjR_JgA4-PBoYbpLB8d64f_3yumIXXWXhzJPGgG9iQ==
x-amz-cf-pop: LHR61-C1
x-cache: Miss from cloudfront
x-content-type-options: nosniff
x-download-options: noopen
x-frame-options: SAMEORIGIN
x-permitted-cross-domain-policies: none
x-request-id: 5bfe4c59-e899-4d91-bec2-029d287d4cdc
x-xss-protection: 1; mode=block


cache-control: max-age=0, private, must-revalidate
content-encoding: gzip
content-length: 1938
content-type: application/json; charset=utf-8
date: Mon, 16 Nov 2020 12:06:28 GMT
etag: W/"c3e7b0d34a2e5c169fab1244e2c53972"
referrer-policy: origin-when-cross-origin
set-cookie: _ruby-web_session=xRh96DyFUTxQo9zBr3NWeSY9L%2FQVbTNa5mlESaBDtLP3xv%2Fd5iJSDC6YGpDcX796LcUJYkLvGUGTYqkMNypsrtubLw%2FPdWLDkAqC8OYFi%2B%2FkkunDnMFoX2CtjQ08HtkBhxI%2BM%2FetAIkLjwVSkGM49LQDY%2BA5Jrext%2Bmv2W2tykys8TaqFHIKIIBhWKrHZ5VRPVyQzyhjEvcGFoGclwGa86pBueiqduZse%2FMFNKB4LJdl8Vni5SqqCC%2BJikGyiWYra%2F1BBOmewO8%2BtfjtdVEtcR2FMDvkuq6m2Fn2iCAJ83TcXRzThbIsSYP5u%2BmX4KP4FVW6gNFNoruDyE56RXn1P4YIvnxhTIoEm%2BM1p6Zpk6P0XDEhFQfsIcreB6Et0j6jEXd91EKISmNGK2egAqJShtXfKfXhdFqYrWA322cQfwReUSGSEsEGK8rdgrkVce6hwpWgFtrj7V1nBALS9%2BESyu0LRNc%2BB2BY3w8UMYbXucu3X%2FfLru2tBsapr%2B04b1gPsNw%2B307rq9LjjGddtB9KrwkqQ0wenTt4sEhHEAFzz1cbOPEhpI3bX9VdpUIsworUa461DmrDPFGrrI2LWGauYNtz7kPaLgmCJ2iUrPKIHRjnG5%2F6JuTe5D8327Ojqx5us9tnzeyl%2FhpqS7AMrq624RN2QGyY6IvOqbkzv6wrnP4D--oJRdu7%2BtM%2F0gNrT1--Vkvh0ekH70JpdgU7uOWTrQ%3D%3D; path=/; expires=Tue, 16 Nov 2021 12:06:28 GMT; HttpOnly; secure; SameSite=Lax
status: 200
status: 200 OK
strict-transport-security: max-age=631139040; includeSubdomains; preload
vary: Accept-Encoding
via: 1.1 695eb63e742ec6b6e245772eb313e747.cloudfront.net (CloudFront)
x-amz-cf-id: Pic3B99JXIaHkmtFs8TzX4wGexMVCf8FPhgBflvo-7umBjoF0C00aQ==
x-amz-cf-pop: LHR61-C1
x-cache: Miss from cloudfront
x-content-type-options: nosniff
x-download-options: noopen
x-frame-options: SAMEORIGIN
x-permitted-cross-domain-policies: none
x-request-id: 44f661c3-5e48-44c6-95ab-7dcf0325e747
x-xss-protection: 1; mode=block


:authority: www.vivino.com
:method: GET
:path: /domaine-hippolyte-reverdy-sancerre/w/1112140?year=2019&price_id=22907185&cart_item_source=
:scheme: https
accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9
accept-encoding: gzip, deflate, br
accept-language: en-GB,en-US;q=0.9,en;q=0.8
cache-control: max-age=0
cookie: first_time_visit=hS4ZPmom4T%2B8EVq37TqqHsTIJ79zRIt74blMfmiOLRrZSV%2FKl%2Btdl1LCq%2B%2Bpykg%2BUaEAQ18kFslfd1ZcYZHTz1Nn4SpvM0JrFg%3D%3D--qUx%2BZrxgAtAYUGLp--y6cajJQq6BpPk0nGFurazQ%3D%3D; _ga=GA1.2.1380591255.1605016601; _fbp=fb.1.1605016601435.202579630; __auc=38ecb0ac175b271c5f4507ed27d; _hjid=c884efdc-f057-45ed-b800-fff7a167c3fb; G_ENABLED_IDPS=google; eeny_meeny_test_checkout_login_v1=sK%2FI9zE679CE86lwcE9g35kaDMUSKheRprVggC%2FraGma2sJW%2FiWP8LFxKHEhSJ%2FLMZYeTVeexc08z%2FSETLuANQ%3D%3D; eeny_meeny_test_forced_merchant_filter_v1=xwgOK3y6fIfG1BXZD2h3Ungi9QMKd8VE%2BOX2SZWOrlx4EuEOO1ueQoYFXXm%2Fgrr3mr%2F4H12ZPH7XNhMEDJaLxg%3D%3D; client_cache_key=vs6Zd4plwAIrapyY64f8gOo7rNk815aXJf0o%2BYJWVy4UTLwnPKh%2Bfe8XwYWOFZGeZe7Ct7wFtyA7wszM66guWFam%2FS3e8K6fVCHla3DDh61qsUeHhZCL6PBemOTpEUY1Xd8Tdgw6UwYu--AsWqhmZ9YQBWf22o--xywX9K47mskF7W8%2BYyVZmQ%3D%3D; _gid=GA1.2.292313021.1605521568; _hjTLDTest=1; _hjIncludedInSessionSample=1; recently_viewed=6pqASacGKigLaGazupxtJIfxamh88t7opSHMV7B6%2FC%2F6jE%2Bk2%2B6NeLSLFz7gyZ6ZhBcQnaQfe%2Bh6nEtr3dPrkSfSzDA2vJlHHpujks3qvT2Q%2BeJze8GxOaSLdzwZ97I55n4SQQ2fqIE0G426q00uX8zTZWh9t%2FLxu09AFR9F1qvCsuK8mXRh0oMtVuBgLQdWEt7i7TKjtOb4FA9JP6cGgyhgo6ALr0cY1sh17LKdB%2BjaMvxfX5ENXZOMN5SjggI5VCwCz1HJ7LvXU9gd%2BD6zrHKWdoyWbCRFhldPtsg%2BQ4FudkWjBYf88i6k6Qt3TAq3YcBRBE3tSUg%2BXZ1qiEYDg%2FoREOB6Hi1Mx1mgmR0k%2FiCdXB4%2BwIxdXKBrAh7oozwy9KQ%2BDfAVICQ6pZpvG2YEJ88AiohmHZsen3deej28pWoNcDMpWQ4IOVqqOcn6a%2BBkzQ%3D%3D--FovLrmeBUPrexDhy--uOwahd9BMlmM5NWE6NAXDA%3D%3D; __asc=f9d2b60f175d0f296e6b6e4ccd0; _gat_vivinoTracker=1; _hjAbsoluteSessionInProgress=1; _hp2_id.3503103446=%7B%22userId%22%3A%221928037028880653%22%2C%22pageviewId%22%3A%221332257746617635%22%2C%22sessionId%22%3A%226503745282515727%22%2C%22identity%22%3Anull%2C%22trackerVersion%22%3A%224.0%22%7D; _hp2_ses_props.3503103446=%7B%22r%22%3A%22https%3A%2F%2Fwww.vivino.com%2Fexplore%3Fe%3DeJzLLbI1VMvNzLM1UMtNrACykytt3Z3UkoFEgFoBkJ-eZluWWJSZWpKYo5afZFuUWJKZl14cn1iWWpSYnqqWb5uSWpwMAB0IF8k%3D%22%2C%22ts%22%3A1605528361531%2C%22d%22%3A%22www.vivino.com%22%2C%22h%22%3A%22%2Fexplore%22%2C%22q%22%3A%22%3Fe%3DeJzLLbI11jNVy83MswWSiRW2RgZqyZW27k5qyUAiQK3A1lAtPc22LLEoM7UkMUctP8m2KLEkMy-9OD6xLLUoMT1VLd82JbU4Wa28JDoWqBhMGQEAwvwc0w%3D%3D%22%7D; _ruby-web_session=XSDG51H90Mii%2Bz60htImJTVAzdQewEUlDHXhDTKjBzqD9H0yhrVBgsaHYrwCCDzXL5WC2G0aLQjdouEH0q%2BqoZRx00AEW9yQfqjV0nhvhzlrgmEDpO0qMsi8FpI5NBivJYiVZkDwp4AcBzBBBFLRqUScRiezIVUJheb%2BUOjHqtP6k2%2FnJUZ2IJQkdRmhm6NF%2FYZmx4Ij%2BAowKXbQPzLPHMQXtTvZjTbF0GthL1GgcALqfsD1C1Py9SypjlOJE72VqKeygbCX5mIMTQne72POSbDtR%2Bu%2Fg3OGMIuRofky%2FcJutAqnSeLolhRLD2QGJ3Jt61KlQGWhB4gG3ebt4UXfyZJlXD0XeUskwY4R9eKGVYl%2FTvdco5r8OnLA%2BIHReiZM1a%2FHaLpvatpbnFBjmkK0XylAkCgIWrwIObJp46pESnZeDtc%2Fy%2F62H538zDfLmrd5dB1z4siXT%2BIcekWPzi9WvqquvIwSvak2KXnxCtHV0q2FTr0%2BJGzZvY%2B%2BvLD5FmpfnUyt9mAFnF7zX0MfW37BZd83NVHje07gRncJaf2svjUaYj6wKXZSJQH0bilDfgwypA5eS5Kp1NfHovWerTRTE7ZiEzsguxTrOK9BKUfFim825qycxJL1LeUyoRKxPuS7jieev%2B%2B79smYEGeK0Tf%2F%2FYNn7SzkIKhO35fzzCWwIUUO--oVtmMdHgfed5t8WS--ZqQ%2B3kWncJq%2BO%2Feh1y%2F%2Frg%3D%3D
if-none-match: W/"f5fcb238f60161d4102f6b5b9d0c4f7b"
sec-fetch-dest: document
sec-fetch-mode: navigate
sec-fetch-site: none
sec-fetch-user: ?1
upgrade-insecure-requests: 1
user-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.193 Safari/537.36


:authority: www.vivino.com
:method: GET
:path: /api/wines/1112140/reviews?year=2019&per_page=10&page=1
:scheme: https
accept: application/json
accept-encoding: gzip, deflate, br
accept-language: en-GB,en-US;q=0.9,en;q=0.8
content-type: application/json
cookie: first_time_visit=hS4ZPmom4T%2B8EVq37TqqHsTIJ79zRIt74blMfmiOLRrZSV%2FKl%2Btdl1LCq%2B%2Bpykg%2BUaEAQ18kFslfd1ZcYZHTz1Nn4SpvM0JrFg%3D%3D--qUx%2BZrxgAtAYUGLp--y6cajJQq6BpPk0nGFurazQ%3D%3D; _ga=GA1.2.1380591255.1605016601; _fbp=fb.1.1605016601435.202579630; __auc=38ecb0ac175b271c5f4507ed27d; _hjid=c884efdc-f057-45ed-b800-fff7a167c3fb; G_ENABLED_IDPS=google; eeny_meeny_test_checkout_login_v1=sK%2FI9zE679CE86lwcE9g35kaDMUSKheRprVggC%2FraGma2sJW%2FiWP8LFxKHEhSJ%2FLMZYeTVeexc08z%2FSETLuANQ%3D%3D; eeny_meeny_test_forced_merchant_filter_v1=xwgOK3y6fIfG1BXZD2h3Ungi9QMKd8VE%2BOX2SZWOrlx4EuEOO1ueQoYFXXm%2Fgrr3mr%2F4H12ZPH7XNhMEDJaLxg%3D%3D; client_cache_key=vs6Zd4plwAIrapyY64f8gOo7rNk815aXJf0o%2BYJWVy4UTLwnPKh%2Bfe8XwYWOFZGeZe7Ct7wFtyA7wszM66guWFam%2FS3e8K6fVCHla3DDh61qsUeHhZCL6PBemOTpEUY1Xd8Tdgw6UwYu--AsWqhmZ9YQBWf22o--xywX9K47mskF7W8%2BYyVZmQ%3D%3D; _gid=GA1.2.292313021.1605521568; _hjTLDTest=1; _hjIncludedInSessionSample=1; __asc=f9d2b60f175d0f296e6b6e4ccd0; _hjAbsoluteSessionInProgress=1; _hp2_ses_props.3503103446=%7B%22r%22%3A%22https%3A%2F%2Fwww.vivino.com%2Fexplore%3Fe%3DeJzLLbI1VMvNzLM1UMtNrACykytt3Z3UkoFEgFoBkJ-eZluWWJSZWpKYo5afZFuUWJKZl14cn1iWWpSYnqqWb5uSWpwMAB0IF8k%3D%22%2C%22ts%22%3A1605528361531%2C%22d%22%3A%22www.vivino.com%22%2C%22h%22%3A%22%2Fexplore%22%2C%22q%22%3A%22%3Fe%3DeJzLLbI11jNVy83MswWSiRW2RgZqyZW27k5qyUAiQK3A1lAtPc22LLEoM7UkMUctP8m2KLEkMy-9OD6xLLUoMT1VLd82JbU4Wa28JDoWqBhMGQEAwvwc0w%3D%3D%22%7D; deal_merchant_context=zurY--XU3d93SUAGnBACGh--8FEIXEcHe6Xbf1Xpuju3uQ%3D%3D; recently_viewed=0stn86aboz1B2X3DoHk9QYsvQuAG6B4PfLOFjVotsnCPUESkdR%2FM0ODO1hMp%2FQ4IN9oSu3TlTYQEv0b3iSCkGJxFwt9LtsPxSsC9r%2BgWLYUoknVqYg8tmW8Ik887honTZGJvRyqs2ju3YcCT0leJJ2t9SEqK6zKnrijSWveLVfrnGhANJn1Tkbkf7TdG6FM2Sl3dGd6Hjw4Qvfe4YgS%2Fc3Npz8LQHQwGsrqoA%2BId9keOQRXBbw5xCWjUdCJEfkynv2EAI4Hq%2FRFG8y5r4v25I6E%2Fo3z04MkDAU5w%2BLAWl0Kd5yOvgXa3lJ4eXqEigVx4JI5wK6QqLTd4Aq2xJsoD%2FCZEBOEs3Lq1Btmc623MoIXu2w%2F2V3QoK3qOQzZNEm%2BedgNzFLLQv2f9HUWFIxO2B5KvShHTJgj08QMnWl0A5yT9d7uRX24WJP7BBzvhGpIfVw%3D%3D--3JY7hvNUwRZRMDdn--auSoKwxjQJ1PKvNY0KgKkA%3D%3D; _hp2_id.3503103446=%7B%22userId%22%3A%221928037028880653%22%2C%22pageviewId%22%3A%22881277229361610%22%2C%22sessionId%22%3A%226503745282515727%22%2C%22identity%22%3Anull%2C%22trackerVersion%22%3A%224.0%22%7D; _ruby-web_session=uXYkgxGoYI1jgzqCIT%2F%2Bd%2FiHKjvX4HCxtE7lFs0haSL2LI9EZkuh3wKp0No5yj3vuN1bMSp3cPn7%2BQeeCOt8pxEeS0otmQ1Zj%2BwuRmo82cVkjq4h%2BVs9tDKfz%2FP%2BVagZ5%2FXZ2h4QEhV74%2FdnR068C5ZsLzL851hN53rJeTrgDnRGph5ZkjyZG8qW4GUtecicGsz1gU6RbU3MyyRjXNK2xKG8dl31y%2FDXeYH0Et0rLvlIlH4%2BrJQJ2KX62MmEVDoiKrUxfTZoEulWiMd5ZnJo%2B6%2F6uooY%2BpFjNpWQSYNPnsniJ3UH2MTLnbbyHmvphHYZc1enFBiDTI6h9bZyU%2F7CtHzm1A4rmh5mNkGvjLH5ral1YTM95rJdNFqK7MdSxyPBk38Ps0fD193H%2Ffdv4hmRItSdXaGTZVXfm%2F2bBovv6gyaisFabJB34a0xYLYNSu63dF1x4ISTVZvUtxW%2BB%2BJoPgNRJK9vcjp9nWfCh1PCXreqLe3RPwfBHLpQ1ZCmtLfN3HE8HYGRNIEc8j7fJkZ4KVvTfLIQWZ87c06C0dFMgNe8XGQm8c5hqyVYb6tn9%2Figl5KcUYH1T20hRZj%2F3udsh519LuMN4cEBC3qmCmKPxVHr173RFOK3gxMR65SIM9HMR3xgi6C%2BXYUVIzQdZjmU4l3mpv6NO3w65Q9JmMClpv1E--L1qomdWM4TNCVte8--UPlFT4q4F9EMscolHrtt1A%3D%3D
if-none-match: W/"58fb991283b82f735046b9e11c64f71d"
referer: https://www.vivino.com/domaine-hippolyte-reverdy-sancerre/w/1112140?year=2019&price_id=22907185&cart_item_source=
sec-fetch-dest: empty
sec-fetch-mode: cors
sec-fetch-site: same-origin
user-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.193 Safari/537.36
x-requested-with: XMLHttpRequest

:authority: www.vivino.com
:method: GET
:path: /api/explore/explore?min_rating=4.5&order_by=ratings_average&order=desc&page=1&per_page=10&price_range_max=21.838499999999996&wine_style_ids[]=235&wine_type_ids[]=2&vc_only=true
:scheme: https
accept: application/json
accept-encoding: gzip, deflate, br
accept-language: en-GB,en-US;q=0.9,en;q=0.8
content-type: application/json
cookie: first_time_visit=hS4ZPmom4T%2B8EVq37TqqHsTIJ79zRIt74blMfmiOLRrZSV%2FKl%2Btdl1LCq%2B%2Bpykg%2BUaEAQ18kFslfd1ZcYZHTz1Nn4SpvM0JrFg%3D%3D--qUx%2BZrxgAtAYUGLp--y6cajJQq6BpPk0nGFurazQ%3D%3D; _ga=GA1.2.1380591255.1605016601; _fbp=fb.1.1605016601435.202579630; __auc=38ecb0ac175b271c5f4507ed27d; _hjid=c884efdc-f057-45ed-b800-fff7a167c3fb; G_ENABLED_IDPS=google; eeny_meeny_test_checkout_login_v1=sK%2FI9zE679CE86lwcE9g35kaDMUSKheRprVggC%2FraGma2sJW%2FiWP8LFxKHEhSJ%2FLMZYeTVeexc08z%2FSETLuANQ%3D%3D; eeny_meeny_test_forced_merchant_filter_v1=xwgOK3y6fIfG1BXZD2h3Ungi9QMKd8VE%2BOX2SZWOrlx4EuEOO1ueQoYFXXm%2Fgrr3mr%2F4H12ZPH7XNhMEDJaLxg%3D%3D; client_cache_key=vs6Zd4plwAIrapyY64f8gOo7rNk815aXJf0o%2BYJWVy4UTLwnPKh%2Bfe8XwYWOFZGeZe7Ct7wFtyA7wszM66guWFam%2FS3e8K6fVCHla3DDh61qsUeHhZCL6PBemOTpEUY1Xd8Tdgw6UwYu--AsWqhmZ9YQBWf22o--xywX9K47mskF7W8%2BYyVZmQ%3D%3D; _gid=GA1.2.292313021.1605521568; _hjTLDTest=1; _hjIncludedInSessionSample=1; __asc=f9d2b60f175d0f296e6b6e4ccd0; _gat_vivinoTracker=1; _hjAbsoluteSessionInProgress=1; _hp2_ses_props.3503103446=%7B%22r%22%3A%22https%3A%2F%2Fwww.vivino.com%2Fexplore%3Fe%3DeJzLLbI1VMvNzLM1UMtNrACykytt3Z3UkoFEgFoBkJ-eZluWWJSZWpKYo5afZFuUWJKZl14cn1iWWpSYnqqWb5uSWpwMAB0IF8k%3D%22%2C%22ts%22%3A1605528361531%2C%22d%22%3A%22www.vivino.com%22%2C%22h%22%3A%22%2Fexplore%22%2C%22q%22%3A%22%3Fe%3DeJzLLbI11jNVy83MswWSiRW2RgZqyZW27k5qyUAiQK3A1lAtPc22LLEoM7UkMUctP8m2KLEkMy-9OD6xLLUoMT1VLd82JbU4Wa28JDoWqBhMGQEAwvwc0w%3D%3D%22%7D; deal_merchant_context=zurY--XU3d93SUAGnBACGh--8FEIXEcHe6Xbf1Xpuju3uQ%3D%3D; recently_viewed=0stn86aboz1B2X3DoHk9QYsvQuAG6B4PfLOFjVotsnCPUESkdR%2FM0ODO1hMp%2FQ4IN9oSu3TlTYQEv0b3iSCkGJxFwt9LtsPxSsC9r%2BgWLYUoknVqYg8tmW8Ik887honTZGJvRyqs2ju3YcCT0leJJ2t9SEqK6zKnrijSWveLVfrnGhANJn1Tkbkf7TdG6FM2Sl3dGd6Hjw4Qvfe4YgS%2Fc3Npz8LQHQwGsrqoA%2BId9keOQRXBbw5xCWjUdCJEfkynv2EAI4Hq%2FRFG8y5r4v25I6E%2Fo3z04MkDAU5w%2BLAWl0Kd5yOvgXa3lJ4eXqEigVx4JI5wK6QqLTd4Aq2xJsoD%2FCZEBOEs3Lq1Btmc623MoIXu2w%2F2V3QoK3qOQzZNEm%2BedgNzFLLQv2f9HUWFIxO2B5KvShHTJgj08QMnWl0A5yT9d7uRX24WJP7BBzvhGpIfVw%3D%3D--3JY7hvNUwRZRMDdn--auSoKwxjQJ1PKvNY0KgKkA%3D%3D; _hp2_id.3503103446=%7B%22userId%22%3A%221928037028880653%22%2C%22pageviewId%22%3A%22881277229361610%22%2C%22sessionId%22%3A%226503745282515727%22%2C%22identity%22%3Anull%2C%22trackerVersion%22%3A%224.0%22%7D; _ruby-web_session=Tj1wfR3RBNgCM8eWVi5x6G%2BNxts4VwwLok3xaAVCQhvr2uUQ6vwIuTOxT2kLIqXozApiKMJ6ionSgS7PpIQkX83iGqKzfgRLrHz2XM%2B237nKuekru5fI1C1qo%2BJtUqQnfzG19%2Fyf9X80IGpqO1iASNuWfe0HoUb%2FD%2B%2FHXQwOSWzt9jLjIrkKppQOLBVqhnLNopy4yqVrP7oUa%2FIFOfQ7gKJpQY4ALvXYYQoB8bA2ZsTXbLkVuputDz%2B%2BacFvesn4uX0BSA5pMsFyZNU%2FYjqYk5Ay4YGiCf1YsiKlHQmzze1KFGZFtTw9ODzOrTN7zGfRVGKMFqWEGPXHHPitKzTEhVECgr2EU2FqwrOj5P%2BJ6DMsZHatv%2BqMtms5ahbKibBcPDy75if%2BggJl3wZlLlMEV%2Bzhet4CadXCxif%2BtoIqoHJNtrVkbUJUwjtqJ%2BzKgExWh%2Bzkwkj9eGl9wJ6fPa%2B4ubxy8Q8MzEYlpdXODpZmOFs0jWqPg9AcDq0%2BC1NE22L55IC26ncJUaK%2BDgLGOu1J7Km%2FxIS8sSM1xhQhjFgdtroiviPbgA6NAm%2FOW9IevDWPr9ilEZqXgjTcHjig85jAjesK62PxXsiUgS49ayLG4AeaUPOPrhuKN8PtPMg6l%2Fo7b%2Fnk1tZWpf3ZoynUDVwTx77unVl%2FpN2DQPpDQuKgQV2u--0Zptc%2B1zA1uxdPZc--3UfLlu2luElDOyWV6MJZfg%3D%3D
if-none-match: W/"124b19340cff1761b2e405b5daf24385"
referer: https://www.vivino.com/domaine-hippolyte-reverdy-sancerre/w/1112140?year=2019&price_id=22907185&cart_item_source=
sec-fetch-dest: empty
sec-fetch-mode: cors
sec-fetch-site: same-origin
user-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.193 Safari/537.36
x-requested-with: XMLHttpRequest
"""




In [118]:
headers_test_1 = {

':authority': 'www.vivino.com',
':method': 'GET',
':path': '/api/explore/explore?min_rating=4.5&order_by=ratings_average&order=desc&page=1&per_page=10&price_range_max=21.838499999999996&wine_style_ids[]=235&wine_type_ids[]=2&vc_only=true',
':scheme': 'https',
'accept': 'application/json',
'accept-encoding': 'gzip, deflate, br',
'accept-language': 'en-GB,en-US;q=0.9,en;q=0.8',
'content-type': 'application/json',
'cookie': 'first_time_visit=hS4ZPmom4T%2B8EVq37TqqHsTIJ79zRIt74blMfmiOLRrZSV%2FKl%2Btdl1LCq%2B%2Bpykg%2BUaEAQ18kFslfd1ZcYZHTz1Nn4SpvM0JrFg%3D%3D--qUx%2BZrxgAtAYUGLp--y6cajJQq6BpPk0nGFurazQ%3D%3D; _ga=GA1.2.1380591255.1605016601; _fbp=fb.1.1605016601435.202579630; __auc=38ecb0ac175b271c5f4507ed27d; _hjid=c884efdc-f057-45ed-b800-fff7a167c3fb; G_ENABLED_IDPS=google; eeny_meeny_test_checkout_login_v1=sK%2FI9zE679CE86lwcE9g35kaDMUSKheRprVggC%2FraGma2sJW%2FiWP8LFxKHEhSJ%2FLMZYeTVeexc08z%2FSETLuANQ%3D%3D; eeny_meeny_test_forced_merchant_filter_v1=xwgOK3y6fIfG1BXZD2h3Ungi9QMKd8VE%2BOX2SZWOrlx4EuEOO1ueQoYFXXm%2Fgrr3mr%2F4H12ZPH7XNhMEDJaLxg%3D%3D; client_cache_key=vs6Zd4plwAIrapyY64f8gOo7rNk815aXJf0o%2BYJWVy4UTLwnPKh%2Bfe8XwYWOFZGeZe7Ct7wFtyA7wszM66guWFam%2FS3e8K6fVCHla3DDh61qsUeHhZCL6PBemOTpEUY1Xd8Tdgw6UwYu--AsWqhmZ9YQBWf22o--xywX9K47mskF7W8%2BYyVZmQ%3D%3D; _gid=GA1.2.292313021.1605521568; _hjTLDTest=1; _hjIncludedInSessionSample=1; __asc=f9d2b60f175d0f296e6b6e4ccd0; _gat_vivinoTracker=1; _hjAbsoluteSessionInProgress=1; _hp2_ses_props.3503103446=%7B%22r%22%3A%22https%3A%2F%2Fwww.vivino.com%2Fexplore%3Fe%3DeJzLLbI1VMvNzLM1UMtNrACykytt3Z3UkoFEgFoBkJ-eZluWWJSZWpKYo5afZFuUWJKZl14cn1iWWpSYnqqWb5uSWpwMAB0IF8k%3D%22%2C%22ts%22%3A1605528361531%2C%22d%22%3A%22www.vivino.com%22%2C%22h%22%3A%22%2Fexplore%22%2C%22q%22%3A%22%3Fe%3DeJzLLbI11jNVy83MswWSiRW2RgZqyZW27k5qyUAiQK3A1lAtPc22LLEoM7UkMUctP8m2KLEkMy-9OD6xLLUoMT1VLd82JbU4Wa28JDoWqBhMGQEAwvwc0w%3D%3D%22%7D; deal_merchant_context=zurY--XU3d93SUAGnBACGh--8FEIXEcHe6Xbf1Xpuju3uQ%3D%3D; recently_viewed=0stn86aboz1B2X3DoHk9QYsvQuAG6B4PfLOFjVotsnCPUESkdR%2FM0ODO1hMp%2FQ4IN9oSu3TlTYQEv0b3iSCkGJxFwt9LtsPxSsC9r%2BgWLYUoknVqYg8tmW8Ik887honTZGJvRyqs2ju3YcCT0leJJ2t9SEqK6zKnrijSWveLVfrnGhANJn1Tkbkf7TdG6FM2Sl3dGd6Hjw4Qvfe4YgS%2Fc3Npz8LQHQwGsrqoA%2BId9keOQRXBbw5xCWjUdCJEfkynv2EAI4Hq%2FRFG8y5r4v25I6E%2Fo3z04MkDAU5w%2BLAWl0Kd5yOvgXa3lJ4eXqEigVx4JI5wK6QqLTd4Aq2xJsoD%2FCZEBOEs3Lq1Btmc623MoIXu2w%2F2V3QoK3qOQzZNEm%2BedgNzFLLQv2f9HUWFIxO2B5KvShHTJgj08QMnWl0A5yT9d7uRX24WJP7BBzvhGpIfVw%3D%3D--3JY7hvNUwRZRMDdn--auSoKwxjQJ1PKvNY0KgKkA%3D%3D; _hp2_id.3503103446=%7B%22userId%22%3A%221928037028880653%22%2C%22pageviewId%22%3A%22881277229361610%22%2C%22sessionId%22%3A%226503745282515727%22%2C%22identity%22%3Anull%2C%22trackerVersion%22%3A%224.0%22%7D; _ruby-web_session=Tj1wfR3RBNgCM8eWVi5x6G%2BNxts4VwwLok3xaAVCQhvr2uUQ6vwIuTOxT2kLIqXozApiKMJ6ionSgS7PpIQkX83iGqKzfgRLrHz2XM%2B237nKuekru5fI1C1qo%2BJtUqQnfzG19%2Fyf9X80IGpqO1iASNuWfe0HoUb%2FD%2B%2FHXQwOSWzt9jLjIrkKppQOLBVqhnLNopy4yqVrP7oUa%2FIFOfQ7gKJpQY4ALvXYYQoB8bA2ZsTXbLkVuputDz%2B%2BacFvesn4uX0BSA5pMsFyZNU%2FYjqYk5Ay4YGiCf1YsiKlHQmzze1KFGZFtTw9ODzOrTN7zGfRVGKMFqWEGPXHHPitKzTEhVECgr2EU2FqwrOj5P%2BJ6DMsZHatv%2BqMtms5ahbKibBcPDy75if%2BggJl3wZlLlMEV%2Bzhet4CadXCxif%2BtoIqoHJNtrVkbUJUwjtqJ%2BzKgExWh%2Bzkwkj9eGl9wJ6fPa%2B4ubxy8Q8MzEYlpdXODpZmOFs0jWqPg9AcDq0%2BC1NE22L55IC26ncJUaK%2BDgLGOu1J7Km%2FxIS8sSM1xhQhjFgdtroiviPbgA6NAm%2FOW9IevDWPr9ilEZqXgjTcHjig85jAjesK62PxXsiUgS49ayLG4AeaUPOPrhuKN8PtPMg6l%2Fo7b%2Fnk1tZWpf3ZoynUDVwTx77unVl%2FpN2DQPpDQuKgQV2u--0Zptc%2B1zA1uxdPZc--3UfLlu2luElDOyWV6MJZfg%3D%3D',
'if-none-match': 'W/"124b19340cff1761b2e405b5daf24385"',
'referer': 'https://www.vivino.com/domaine-hippolyte-reverdy-sancerre/w/1112140?year=2019&price_id=22907185&cart_item_source=',
'sec-fetch-dest': 'empty',
'sec-fetch-mode': 'cors',
'sec-fetch-site': 'same-origin',
'user-agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.193 Safari/537.36',
'x-requested-with': 'XMLHttpRequest'
    
}

In [126]:
headers_test = {
# 'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
'Accept': 'application/json',
# 'accept-encoding': 'gzip, deflate, br',
# 'accept-language': 'en-GB,en-US;q=0.9,en;q=0.8',
'Content-Type': 'application/json',
   'User-Agent': 'curl/7.71.1',
    
# 'Cookie': 'first_time_visit=hS4ZPmom4T%2B8EVq37TqqHsTIJ79zRIt74blMfmiOLRrZSV%2FKl%2Btdl1LCq%2B%2Bpykg%2BUaEAQ18kFslfd1ZcYZHTz1Nn4SpvM0JrFg%3D%3D--qUx%2BZrxgAtAYUGLp--y6cajJQq6BpPk0nGFurazQ%3D%3D; eeny_meeny_test_checkout_login_v1=I%2FV0u9fALvvYvlXgTiDLvDJ7qTFF69POSotjBIVC8xggqMg6gSNg%2FRjGaTCnuNjzjXzMAC%2BdVUKS6DZa%2FI4D5Q%3D%3D; eeny_meeny_test_forced_merchant_filter_v1=nZoDv6SJ2FhC%2Fu%2Fc1WbAFi82MMN6VI5XI67vKEyf%2B2%2FIir%2BkNTg%2Bt8C2wvCm0BnuPyHcCYadh%2FhV9vl4%2BJqrIg%3D%3D; _ga=GA1.2.1380591255.1605016601; _gid=GA1.2.2136092280.1605016601; _fbp=fb.1.1605016601435.202579630; __auc=38ecb0ac175b271c5f4507ed27d; _hjTLDTest=1; _hjid=c884efdc-f057-45ed-b800-fff7a167c3fb; G_ENABLED_IDPS=google; client_cache_key=eJ3pugRtkwSo41BH9Upy3Zn8FsoWOph7sa43r0wD02cvi0lZmQ1LYb9gNdg2NCVudgV%2BgDrUiSy1fOpCPr95O%2BsYyBgycmevWRgrmstFvQ6sWLHxwpaqhzJ6uD7XZjewKeQs8v0hUAhT--S6JIjEvAeChnru68--w%2B6eUQVbtaeWRpX4UQVMvw%3D%3D; __asc=880e4335175c2888e60df77f8dd; _hjIncludedInSessionSample=1; _hjAbsoluteSessionInProgress=0; _hp2_ses_props.3503103446=%7B%22r%22%3A%22https%3A%2F%2Fwww.vivino.com%2Flittle-beauty-black-edition-sauvignon-blanc%2Fw%2F1156374%3Fyear%3D2012%22%2C%22ts%22%3A1605287044134%2C%22d%22%3A%22www.vivino.com%22%2C%22h%22%3A%22%2Fexplore%22%2C%22q%22%3A%22%3Fe%3DeJwNxLEOQDAUBdC_eaNQjHexWO0i8lQ1TbSVpyn-Xs9wvKCtevIuoMwvVE36wziQLk10oSF7ILM4k_ikuEE4uWDvlbMRtoYidnNretK8QHU_Gkka0A%3D%3D%22%7D; deal_merchant_context=bzxH--w2ZkPrRqYOlL4RhO--Am3k94zfE83ikw5CpOny7w%3D%3D; recently_viewed=opvFQ28lFcK0BiPA9e8OOUmSRCOYPIQzt8p9LmhtwX%2F4OLgK8ld%2FaoIfgNgepRreHg%2BF5IZVmldFD6A4bvPOcWAkz1k3HN6dUt%2Fv%2BITdmIR512VMKRBwajgZ4usNxru7MjpKvmqSzdmmOoLm153TyfenFO0ROhgco3y2sn0B%2FCJ4VAo3o0C8zT8N8oommq6F9eyjAkxtu4YSur9bf%2B4oHsEdQvRK3qZuq0Q3VHGQQv6O%2FVf7YJ2kiCHQtCwEXJxKaCkD%2Fedb7MrhXAOwJapx1yEilHnnl6w7poXNTGnm8CQLaj69KjiRxDGySnsLhL5fbnzmMITri%2BvlM9l0H5LEptJyJ0o1iThB0odoN%2FJv5jSaBeliGOflGogkv08MCR5tgOPYUvDTy9eUr%2Fsa9PigI%2BbdKONjWslZiz5tWm8L00jYC9%2Bb4SpkRx43ANe7B%2BHeOA%3D%3D--fJFOavgy6a%2FzqITf--StL9OzIan6eAHI7KHzIgOA%3D%3D; _hp2_id.3503103446=%7B%22userId%22%3A%221928037028880653%22%2C%22pageviewId%22%3A%225462612720765078%22%2C%22sessionId%22%3A%226545714585431235%22%2C%22identity%22%3Anull%2C%22trackerVersion%22%3A%224.0%22%7D; _ruby-web_session=u9PzoLNxYP6Oc%2FHTI3do9s8VP03ZwLA94LHvvw0vXXrxyUqErae11Ol3jJkURzRhp%2B5T%2FGK6JRl8GYQ8WWDuWc%2BWvA2FzuGcdgBmzEhDA%2BF3Iw%2FqNuJRM5jJc%2FG0H2%2BmKSd%2Bo9%2B%2FT1ROwui%2FrDJOiBddfRjIDUFt6hRj9BzwYl%2FV6ojxlEYAJpgpGXhmT6rCJ2eyp%2BV%2Fji2N%2BEG8zvsgzrIb3O4xSH4BAPyGsotZ3ti0WEXhT%2Btg%2BQSPrICu7WzWrWDliWOjX1A%2BsO9lJzumBOpKoteN8zKF0AkruaKif9tPvfnAHRjp2VMuDy7EpaBB%2BSvEnWP8WHT3%2FS8v7dtw1Vloj3eCL23zsQC72x%2BHpzJsgAaPTBpj26xBVj1yO7gmcd2wp%2BvUUF4hlBHDxYYFwHLQtGWgEqeKxGprvn86O4WTDMMHCATABJ2Yc0w3AXr1zp58GNoLB37nEFYG3bB732VaeJd%2BQjyv1qc1bMfh4T8mMhqnUuroIq02KxQpuBDQwg9TA18EoW0yzX32TyJGhjXywRg%2BaE%2BeEkQnGR6xnRo9u5XyJesWv%2Bp4E%2BoPwDzzINSk%2FHMNL%2Bcsymm%2BnLYBQAXLyPfJwZdUcwYkgpHZ6KLFuBtf624uf%2Bqm8ieG--cju69LbnRF7APwkZ--5aUzzqQpfnVyCcMY9c7cSg%3D%3D',
# 'Cookie': 'first_time_visit=hS4ZPmom4T%2B8EVq37TqqHsTIJ79zRIt74blMfmiOLRrZSV%2FKl%2Btdl1LCq%2B%2Bpykg%2BUaEAQ18kFslfd1ZcYZHTz1Nn4SpvM0JrFg%3D%3D--qUx%2BZrxgAtAYUGLp--y6cajJQq6BpPk0nGFurazQ%3D%3D; eeny_meeny_test_checkout_login_v1=I%2FV0u9fALvvYvlXgTiDLvDJ7qTFF69POSotjBIVC8xggqMg6gSNg%2FRjGaTCnuNjzjXzMAC%2BdVUKS6DZa%2FI4D5Q%3D%3D; eeny_meeny_test_forced_merchant_filter_v1=nZoDv6SJ2FhC%2Fu%2Fc1WbAFi82MMN6VI5XI67vKEyf%2B2%2FIir%2BkNTg%2Bt8C2wvCm0BnuPyHcCYadh%2FhV9vl4%2BJqrIg%3D%3D; _ga=GA1.2.1380591255.1605016601; _gid=GA1.2.2136092280.1605016601; _fbp=fb.1.1605016601435.202579630; __auc=38ecb0ac175b271c5f4507ed27d; _hjTLDTest=1; _hjid=c884efdc-f057-45ed-b800-fff7a167c3fb; G_ENABLED_IDPS=google; client_cache_key=eJ3pugRtkwSo41BH9Upy3Zn8FsoWOph7sa43r0wD02cvi0lZmQ1LYb9gNdg2NCVudgV%2BgDrUiSy1fOpCPr95O%2BsYyBgycmevWRgrmstFvQ6sWLHxwpaqhzJ6uD7XZjewKeQs8v0hUAhT--S6JIjEvAeChnru68--w%2B6eUQVbtaeWRpX4UQVMvw%3D%3D; __asc=880e4335175c2888e60df77f8dd; _hjIncludedInSessionSample=1; _hjAbsoluteSessionInProgress=0; _hp2_ses_props.3503103446=%7B%22r%22%3A%22https%3A%2F%2Fwww.vivino.com%2Flittle-beauty-black-edition-sauvignon-blanc%2Fw%2F1156374%3Fyear%3D2012%22%2C%22ts%22%3A1605287044134%2C%22d%22%3A%22www.vivino.com%22%2C%22h%22%3A%22%2Fexplore%22%2C%22q%22%3A%22%3Fe%3DeJwNxLEOQDAUBdC_eaNQjHexWO0i8lQ1TbSVpyn-Xs9wvKCtevIuoMwvVE36wziQLk10oSF7ILM4k_ikuEE4uWDvlbMRtoYidnNretK8QHU_Gkka0A%3D%3D%22%7D; deal_merchant_context=bzxH--w2ZkPrRqYOlL4RhO--Am3k94zfE83ikw5CpOny7w%3D%3D; recently_viewed=opvFQ28lFcK0BiPA9e8OOUmSRCOYPIQzt8p9LmhtwX%2F4OLgK8ld%2FaoIfgNgepRreHg%2BF5IZVmldFD6A4bvPOcWAkz1k3HN6dUt%2Fv%2BITdmIR512VMKRBwajgZ4usNxru7MjpKvmqSzdmmOoLm153TyfenFO0ROhgco3y2sn0B%2FCJ4VAo3o0C8zT8N8oommq6F9eyjAkxtu4YSur9bf%2B4oHsEdQvRK3qZuq0Q3VHGQQv6O%2FVf7YJ2kiCHQtCwEXJxKaCkD%2Fedb7MrhXAOwJapx1yEilHnnl6w7poXNTGnm8CQLaj69KjiRxDGySnsLhL5fbnzmMITri%2BvlM9l0H5LEptJyJ0o1iThB0odoN%2FJv5jSaBeliGOflGogkv08MCR5tgOPYUvDTy9eUr%2Fsa9PigI%2BbdKONjWslZiz5tWm8L00jYC9%2Bb4SpkRx43ANe7B%2BHeOA%3D%3D--fJFOavgy6a%2FzqITf--StL9OzIan6eAHI7KHzIgOA%3D%3D; _hp2_id.3503103446=%7B%22userId%22%3A%221928037028880653%22%2C%22pageviewId%22%3A%225462612720765078%22%2C%22sessionId%22%3A%226545714585431235%22%2C%22identity%22%3Anull%2C%22trackerVersion%22%3A%224.0%22%7D; _ruby-web_session=u9PzoLNxYP6Oc%2FHTI3do9s8VP03ZwLA94LHvvw0vXXrxyUqErae11Ol3jJkURzRhp%2B5T%2FGK6JRl8GYQ8WWDuWc%2BWvA2FzuGcdgBmzEhDA%2BF3Iw%2FqNuJRM5jJc%2FG0H2%2BmKSd%2Bo9%2B%2FT1ROwui%2FrDJOiBddfRjIDUFt6hRj9BzwYl%2FV6ojxlEYAJpgpGXhmT6rCJ2eyp%2BV%2Fji2N%2BEG8zvsgzrIb3O4xSH4BAPyGsotZ3ti0WEXhT%2Btg%2BQSPrICu7WzWrWDliWOjX1A%2BsO9lJzumBOpKoteN8zKF0AkruaKif9tPvfnAHRjp2VMuDy7EpaBB%2BSvEnWP8WHT3%2FS8v7dtw1Vloj3eCL23zsQC72x%2BHpzJsgAaPTBpj26xBVj1yO7gmcd2wp%2BvUUF4hlBHDxYYFwHLQtGWgEqeKxGprvn86O4WTDMMHCATABJ2Yc0w3AXr1zp58GNoLB37nEFYG3bB732VaeJd%2BQjyv1qc1bMfh4T8mMhqnUuroIq02KxQpuBDQwg9TA18EoW0yzX32TyJGhjXywRg%2BaE%2BeEkQnGR6xnRo9u5XyJesWv%2Bp4E%2BoPwDzzINSk%2FHMNL%2Bcsymm%2BnLYBQAXLyPfJwZdUcwYkgpHZ6KLFuBtf624uf%2Bqm8ieG--cju69LbnRF7APwkZ--5aUzzqQpfnVyCcMY9c7cSg%3D%3D',
# 'Referer': 'https://www.google.com/',
# 'Referer': 'https://www.vivino.com/w-and-j-grahams-ten-year-old-tawny-port/w/1145628?year=N.V.&price_id=21883038&cart_item_source=',
# 'Referer': 'https://www.vivino.com/domaine-hippolyte-reverdy-sancerre/w/1112140?year=2019&price_id=22907185&cart_item_source=',
# 'cookie': 'first_time_visit=hS4ZPmom4T%2B8EVq37TqqHsTIJ79zRIt74blMfmiOLRrZSV%2FKl%2Btdl1LCq%2B%2Bpykg%2BUaEAQ18kFslfd1ZcYZHTz1Nn4SpvM0JrFg%3D%3D--qUx%2BZrxgAtAYUGLp--y6cajJQq6BpPk0nGFurazQ%3D%3D; _ga=GA1.2.1380591255.1605016601; _fbp=fb.1.1605016601435.202579630; __auc=38ecb0ac175b271c5f4507ed27d; _hjid=c884efdc-f057-45ed-b800-fff7a167c3fb; G_ENABLED_IDPS=google; eeny_meeny_test_checkout_login_v1=sK%2FI9zE679CE86lwcE9g35kaDMUSKheRprVggC%2FraGma2sJW%2FiWP8LFxKHEhSJ%2FLMZYeTVeexc08z%2FSETLuANQ%3D%3D; eeny_meeny_test_forced_merchant_filter_v1=xwgOK3y6fIfG1BXZD2h3Ungi9QMKd8VE%2BOX2SZWOrlx4EuEOO1ueQoYFXXm%2Fgrr3mr%2F4H12ZPH7XNhMEDJaLxg%3D%3D; _gid=GA1.2.292313021.1605521568; _hjTLDTest=1; _hjIncludedInSessionSample=1; _hjAbsoluteSessionInProgress=1; client_cache_key=4ceHPrM9NFy%2BQnkmuB2i9MjrCJfZZrHu2CQabFiSlDWLYTsgIESo7OxdvBIAzdAkOGYo1l%2BRdQjbfjkz4bDCLt0KZxGEBm6OJvjiCCsAj0Y1eWPFJXyTylg8d0fx1KIqCEl6nHBX5Vlw--%2BVAF4ZPQQXn9j8A6--BqgQZouS5Eh%2F1kTjEudLjw%3D%3D; deal_merchant_context=Fjwo--Z3c7pODjaaVchA04--L9PrRgSdWbkaHDSiYTMDCQ%3D%3D; recently_viewed=SXek80lDyIO5VimnuX7%2BgtFWQyXgrOkeADhfUOPl4aGGoo%2BqdLSz6IZ7i95UgeR6Z%2FY2a9IDpNEZdhNVk3oqFD36MFUiiUUchycn8EOqoUluECWuJG8PlxPXkOL7JLiQAN0auOXZ3TWueck5CqtIWoRG6BCdetQvYgTEcalGBJLwGjTB1YCA65%2BxbNt6DuO8NrULx8UJmiyhapZzQqsg3YjuHJ%2BYtaTwv3BG%2Bf6F1FTGS7ivn5dKSc1mA%2F%2FdK5Jv5g0CWyoh4feE4x57fyD6Shs6qtFQQ%2F9evkq0eGgpKqspF3Ri8fGBFoeFNVgEI9B%2BTZ42ZgK2OX6bJB0oub3X8dZlklFizYUCfwg1Vmhx%2FeUq45RUwdPU3YF4%2B0%2FaZyYPzsvj03HE8xkoxVGuNxKPWJe0FgEiGskVPZMns4PhTfZUyI5I1vjvrG1gY4%2Ba6mHOgw%3D%3D--FYSdLUrdbUtYTw6K--6p004kHzvinkLKtD9otIDg%3D%3D; __asc=dc307b7f175d1b45e860d8a2d2e; _hp2_id.3503103446=%7B%22userId%22%3A%221928037028880653%22%2C%22pageviewId%22%3A%223146993210316020%22%2C%22sessionId%22%3A%223202205139644097%22%2C%22identity%22%3Anull%2C%22trackerVersion%22%3A%224.0%22%7D; _gat_vivinoTracker=1; _hp2_ses_props.3503103446=%7B%22ts%22%3A1605541060238%2C%22d%22%3A%22www.vivino.com%22%2C%22h%22%3A%22%2Fexplore%22%2C%22q%22%3A%22%3Fe%3DeJzLLbI11jNVy83MswWSiRW2RgZqyZW27k5qyUAiQK3A1lAtPc22LLEoM7UkMUctP8m2KLEkMy-9OD6xLLUoMT1VLd82JbU4Wa28JDoWqBhMGQEAwvwc0w%3D%3D%22%7D; _ruby-web_session=mkOiWkfCL%2B%2Fj8uoOS9FcTUChu3bEpUQ0dBxQEQTmSh7SoucomWYkqJaquIxEgHZmLHxWEfbXZArNeJyNUDElPCxBju%2FaTLrqxzfMXN0RebPtMufr9Ma1PVY%2BGyb4DTB9MdMyLN8usPa5gag6vM2UsE0PVQKPcn947T22hcvUtbURLY0LAGdQ1emupxc%2B8aOXGigEffNXEyQEiUSHaImKubsf7JNEiit1rswtNlCIAAXaQdnKpbb4Ve%2BVpQ59fm%2FChC0ONlTNoz9QhKMb%2Bb46RtwRo%2FvKxQoxCEPZblPecVYotA2Z1WKM11SSvmcQFQnEWaIj0xfmMSnikbqPu9h7DF66RAtzj94cYR4G1wuTC8asE0rSai0%2Bn9Ubw4DRMQ%2BAPUi6wwrIgmb04KDx0n1BIHydzu5z3pHneWdZO%2FrpFBcPExT13xjUaSa4HS5OPokL%2FW%2F6%2F7B5rPjyaEf%2FElmNFqpIF67djb6QLwEGUdgJZM8nlODxnIzFZOcyyjXmWylfh0g5yPmoenasgsIo40pu%2BcUUhyt7efzSsTXMPZ2yO2A8vcKuvrzy53tkvUfkrkOdqEHm9qhd6alYBxA%2BeHT3hcWDR1o5zZEPqkwseWVFDlVCNpIMhKLcQaNoh65g9aM%2FrQwcksmLgBIFbPNviN3Z1%2FZ04tmj3WRhW3xfETLmJD0IWrEoTkI%2FAbASmtvA31zEEWU3u3MS1fT1ZRgxmcTZCW%2BiEibjgmPJLGWdmK%2F1Si9RRToGdqMy0YUAr5b3vKh2ZczNSo8O9By0aKwxO95ujQs%3D--pF5svWlg5PuF6ZfB--7kTulEh3NUxuwg48br%2FkNQ%3D%3D',
    # 'Sec-Fetch-Dest': 'empty',
# 'Sec-Fetch-Mode': 'cors',
# 'Sec-Fetch-Site': 'same-origin',
# 'referer': 'https://www.vivino.com/explore?e=eJzLLbI11jNVy83MswWSiRW2RgZqyZW27k5qyUAiQK3A1lAtPc22LLEoM7UkMUctP8m2KLEkMy-9OD6xLLUoMT1VLd82JbU4Wa28JDoWqBhMGQEAwvwc0w==',
# 'user-agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.193 Safari/537.36',
# 'x-requested-with': 'XMLHttpRequest'
    # 'Sec-Fetch-Dest': 'document',
# 'Sec-Fetch-Mode': 'navigate',
# 'Sec-Fetch-Site': 'none',
# 'Sec-Fetch-User': '?1',
# 'Upgrade-Insecure-Requests': '1'
# 'X-Requested-With': 'XMLHttpRequest'
}

In [114]:
headers_stack = {"pragma": "no-cache",
"sec-fetch-dest": "empty",
"sec-fetch-mode": "cors",
"sec-fetch-site": "same-origin",
"user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.122 Safari/537.36",
"x-requested-with": "XMLHttpRequest"}

In [131]:
reviews_df = pd.DataFrame()

# for page in range(1, 3):
page = "https://www.vivino.com/api/wines/1123733/reviews?year=2018&per_page=1&page=1"
# page = f'https://www.vivino.com/api/wines/{wine_id_list[0]}/latest_reviews?year=N.V.&per_page=10&page={page}'
print(page)

# proxies = {
#   "http": "http://scraperapi:b65a0deee126a85a36e64532b1d7ebeb@proxy-server.scraperapi.com:8001",
#   "https": "http://scraperapi:b65a0deee126a85a36e64532b1d7ebeb@proxy-server.scraperapi.com:8001"}

response = requests.get(page, headers_test)
#     response_pattern = r'2.'
print(response.status_code)
print(response.headers)
print(response.)
if response.status_code // 100 == 2:
    json_str = response.content
#         print(response.content)
    json_obj = json.loads(json_str)
    for review in json_obj['reviews']:
        reviews_df = reviews_df.append(review, ignore_index=True)
else: 
    print(response.content)


https://www.vivino.com/api/wines/1123733/reviews?year=2018&per_page=1&page=1
429
{'Content-Type': 'text/plain', 'Content-Length': '251', 'Connection': 'keep-alive', 'Date': 'Mon, 16 Nov 2020 17:20:18 GMT', 'Cache-Control': 'no-cache', 'Content-Encoding': 'gzip', 'Referrer-Policy': 'origin-when-cross-origin', 'Set-Cookie': 'eeny_meeny_test_checkout_login_v1=E3xiEqphDPa%2FYjTZRKyx9lueYHpKkkxnQ7lkXJuoHmR8R2dg3ZOcKaor6qJyl%2B3CL8mEvSkTEaBwe8E1H2g7sQ%3D%3D; path=/; expires=Tue, 30 Mar 2021 12:00:00 GMT; HttpOnly; SameSite=Strict; secure, eeny_meeny_test_forced_merchant_filter_v1=yYuBrkSroJEnetUBZJecqQL7mlXKgx4kcP3in62B40AE00INvxniruJ%2Bt1PhroPls1fEZjwlQE7zUOJ2IKY6kQ%3D%3D; path=/; expires=Thu, 31 Dec 2020 12:00:00 GMT; HttpOnly; SameSite=Strict; secure', 'Status': '429 Too Many Requests', 'Strict-Transport-Security': 'max-age=631139040; includeSubdomains; preload', 'Vary': 'Accept-Encoding', 'X-Content-Type-Options': 'nosniff', 'X-Download-Options': 'noopen', 'X-Frame-Options': 'SAMEORIGIN'

AttributeError: 'Response' object has no attribute 'retry_after'

In [110]:
s = requests.Session()

In [612]:
# 1145628 in wine_id_list

True

In [617]:
len(reviews)

500

In [643]:
reviews_df = pd.DataFrame()

# for wine_id in wine_id_list[:2]:
# for page in range(1, 3):
# page = f'https://www.vivino.com/api/wines/{wine_id_list[130]}/reviews?year=N.V.&per_page=10&page=1'

# page = 'https://www.vivino.com/api/wines/1145628/reviews?year=N.V.&per_page=10&page=1'


page = 'https://www.vivino.com/api/explore/explore?country_code=GB&currency_code=GBP&grape_filter=varietal&min_rating=1&order_by=ratings_average&order=desc&page=1&price_range_max=400&price_range_min=1&wine_type_ids[]=2'
print(page)
# reviews_df = extract_reviews_to_df(page, headers_test, reviews_df)

test_df = pd.DataFrame()
test_list = []
price_list = []

for i in range(1,2):
#     print(i)
    page = "https://www.vivino.com/api/explore/explore?country_code=GB&currency_code=GBP&grape_filter=varietal&min_rating=1&\
    order_by=ratings_average&order=desc&\
page={}&price_range_max=400&price_range_min=0&wine_type_ids[]=1".format(i)
    test_df = get_wine_df(page, headers_test, test_df)
#     test_list, price_list = get_wine_json(page, headers_api, subset_red_list, price_list)

# rev_df_test = get_reviews_to_df(wine_id_list[:5], headers_browser, reviews_df)

https://www.vivino.com/api/explore/explore?country_code=GB&currency_code=GBP&grape_filter=varietal&min_rating=1&order_by=ratings_average&order=desc&page=1&price_range_max=400&price_range_min=1&wine_type_ids[]=2


In [644]:
test_df

Unnamed: 0,grapes,has_valid_ratings,id,image,name,seo_name,statistics,wine,year
0,,1.0,142749808.0,{'location': '//images.vivino.com/thumbs/BFAcH...,Frank Family Patriarch 2012,frank-family-patriarch-rutherford-red-wine-2012,"{'status': 'Normal', 'ratings_count': 81, 'rat...","{'id': 4382344, 'name': 'Patriarch', 'seo_name...",2012.0
1,,1.0,57646109.0,{'location': '//images.vivino.com/labels/V5JCH...,Chateau D Yguene 2001,chateau-d-yguene-red-wine-v-jdmvq-2001,"{'status': 'Normal', 'ratings_count': 75, 'rat...","{'id': 3474900, 'name': 'Chateau D Yguene', 's...",2001.0
2,,1.0,126346084.0,{'location': '//images.vivino.com/thumbs/4BpJn...,Hundred Acre Few and Far Between Cabernet Sauv...,hundred-acre-few-and-far-between-cabernet-sauv...,"{'status': 'Normal', 'ratings_count': 60, 'rat...","{'id': 4110288, 'name': 'Few and Far Between C...",2013.0
3,,1.0,154595351.0,{'location': '//images.vivino.com/thumbs/0667T...,Amici Echion Cabernet Sauvignon 2014,amici-cellars-echion-cabernet-sauvignon-oak-vi...,"{'status': 'Normal', 'ratings_count': 59, 'rat...","{'id': 5509315, 'name': 'Echion Cabernet Sauvi...",2014.0
4,,1.0,149005571.0,{'location': '//images.vivino.com/thumbs/DtBMh...,Sine Qua Non Rattrapante Grenache 2012,sine-qua-non-rattrapante-grenache-2012,"{'status': 'Normal', 'ratings_count': 58, 'rat...","{'id': 5089513, 'name': 'Rattrapante Grenache'...",2012.0
5,,1.0,145451346.0,{'location': '//images.vivino.com/thumbs/_IiGn...,Realm Beckstoffer Dr. Crane Vineyard 2015,realm-cellars-beckstoffer-dr-crane-vineyard-2015,"{'status': 'Normal', 'ratings_count': 50, 'rat...","{'id': 2103151, 'name': 'Beckstoffer Dr. Crane...",2015.0
6,,1.0,12423443.0,{'location': '//images.vivino.com/thumbs/axBtS...,Teso La Monja Tinto 2013,teso-la-monja-tinto-2013,"{'status': 'Normal', 'ratings_count': 49, 'rat...","{'id': 1450977, 'name': 'Tinto', 'seo_name': '...",2013.0
7,,1.0,6350045.0,{'location': '//images.vivino.com/thumbs/57Ed8...,Henri Jayer Vosne-Romanée Cros Parantoux 1996,domaine-henri-jayer-vosne-romanee-cros-paranto...,"{'status': 'Normal', 'ratings_count': 46, 'rat...","{'id': 1823657, 'name': 'Vosne-Romanée Cros Pa...",1996.0
8,,1.0,11239020.0,{'location': '//images.vivino.com/thumbs/79xOz...,Pine Ridge Fortis 2013,pine-ridge-fortis-2013,"{'status': 'Normal', 'ratings_count': 43, 'rat...","{'id': 3553, 'name': 'Fortis', 'seo_name': 'fo...",2013.0
9,,1.0,2708738.0,{'location': '//images.vivino.com/thumbs/JL0BH...,Bryant Family Vineyard Cabernet Sauvignon Prop...,bryant-family-cabernet-sauvignon-proprietor-gr...,"{'status': 'Normal', 'ratings_count': 41, 'rat...","{'id': 1645873, 'name': 'Cabernet Sauvignon Pr...",2002.0


In [633]:
reviews_df

In [631]:
reviews_df['id'].nunique()

KeyError: 'id'

In [579]:
reviews_df[reviews_df['id'] == 82946742.0]

Unnamed: 0,activity,aggregated,created_at,flavor_word_matches,id,language,note,rating,tagged_note,user,vintage
11,"{'id': 219148789, 'statistics': {'likes_count'...",1.0,2017-12-14T01:15:17.000Z,,82946742.0,en,A way of passing the time. But then so is watc...,2.0,A way of passing the time. But then so is watc...,"{'id': 3982743, 'seo_name': 'sean-blac', 'alia...","{'id': 1540565, 'seo_name': 'esporao-alandra-t..."
61,"{'id': 219148789, 'statistics': {'likes_count'...",1.0,2017-12-14T01:15:17.000Z,,82946742.0,en,A way of passing the time. But then so is watc...,2.0,A way of passing the time. But then so is watc...,"{'id': 3982743, 'seo_name': 'sean-blac', 'alia...","{'id': 1540565, 'seo_name': 'esporao-alandra-t..."


In [569]:
len(reviews)

500

In [582]:
len(set([review['id'] for review in reviews]))

400

### Eugene comments 

Идеи по гипотезам
* Чем меньше популяция, относительно которой мы хотим делать выводы, тем проще нам будет собрать репрезентативную выборку. Поэтому вместо того, чтобы пытаться сделать вывод относительно всего рынка вин по миру, проще (и реалестичнее) пытаться делать выводы о более локальных популяциях- по странам, континентам.
* Например, может быть интересно посравнивать вина америки и европы, как два основных континента-поставщика.
    * Есть ли разница по оценкам между сортами? (некоторые сорта растут лучше в одном регионе, некоторые в другом)
    * Есть ли разница по сочетаниям? (гипотеза, что в европе больше сыра, в америке мяска)
    * Раазница по описаниям (какие слова используют для описания)
    * Сделать что-нибудь вроде описания прототипичного американского и европейского вин (взять средние вкусы/сорта/оценки, вывести средние, найти самое близкое из существующих к этому среднему - у нас есть прототипичные вина континетов! Можно разбить еще по сортам/ белому-красному/ еще чему-нибудь)
* Отедьно, конечно, интересно посмотреть выборку дешевых и дорогих вин.
    * Возможно, это будет сложно, но у меня есть фантазия взять данные по температурам в разные годы в регионах, где делается вино, и покоррелировать температуру и оценку/цену на вина. Можно ли предсказать цену на вино в винодельне по температуре?
    * Посмотреть облако слов для дешевых-средниих-супердорогих вин. Гипотеза, что описания дорогих вин будут более пафосными)
    * Посравнивать описания хороших оценок и плохих оценок для дешевых-средних-супер дорогих вин. Гипотеза - в дорогих винах людей неустраивают другие штуки, в сравнении с дешевыми (например, я посмотрел, что часто единички к дорогим винам ставят с пометкой crooked, плохо хранилось. Может еще будут инсайты)

### Касательно реперезентативности выборки
В зависимости от того, что мы в итоге будем хотеть проверять, намн ужно будет нагенерить репрезентативную выборку относительно именно той популяции, которой мы исследуем. Условно, если мы сравниваем америку и европу, нам нужны репрезентативные выборки по этим двум континентам. Мне кажется, что можно ограничиться следующими парааметрами:
* Страна
* Регион
* Год изготовления
* Тип (белое/красное/розовое/пузырики)
* Сорта
* Цена 

Нам нужно посмотреть распределение вин по этим переменным, и постараться заиметь похожее соотношение в нашей выборке. У нас есть информация по распределению по этим факторам по пупуляции? (например, по континентаам, странам, или по миру).

### One more thing...
Все эти идеи родились у меня в голове, а я про вино знаю чуть больше, чем ничего) Мне кажется, что по-настоящему клевые и интересные гипотезы у нас могут родиться, если мы почитаем про винаа чуть больше. Мне кжется, это вообще важный этап такой работы. Как в науке - делаешь литературный обрзор, потом формируешь гипотезы, потом думаешь о том, какие даанные тебе нужны, потом проверяешь гипотезы) Возможно, это overkill, хотя если цель - поупражняться в аналитике, то предварительный анализ - это важная ее часть. Тогда мы сможем оформить проектик почти как настоящую статью - с интродакшеном, референсами и прочим)


В общем скажи, что думаешь. Можно созвониться и обсудить. Может ты про вино знаешь больше меня и у тебя будут более интересные гипотезы)