# Web Scraping for Tarisio stringed instrument auction sales
[Tarisio.com](https://tarisio.com/cozio-archive/browse-the-archive/makers/?letter=A)

# Contents 

[Extract maker data](#Extract-maker-data) <br>
[Scrape all makers by last initial](#Scrape-all-makers-by-last-initial) <br>
[Create makers dataframe](#Create-makers-dataframe) <br>
[Fix datatypes](#Fix-datatypes) <br>

## Dataframes: 
Tarisio2_raw.csv - dataframe with original, unformatted data <br>
Tarisio2.csv - dataframe with proper datatypes

This notebook creates the final .csv files to be used in the app.
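As a rough illustration of the difference between the raw and the typed file, the sketch below converts one scraped row's strings into Python types. The helper names `parse_price` and `parse_sale_date` are hypothetical, not part of this notebook. It assumes prices are whole-dollar strings like `$4,200`, that dates look like `Feb 20, 2010`, and that month abbreviations are parsed in an English locale.

```python
from datetime import date, datetime


def parse_price(text):
    """Convert a price string such as '$4,200' to an int; empty text gives None."""
    cleaned = text.replace('$', '').replace(',', '').strip()
    return int(cleaned) if cleaned else None


def parse_sale_date(text):
    """Convert a date string such as 'Feb 20, 2010' to a datetime.date; empty text gives None."""
    text = text.strip()
    return datetime.strptime(text, '%b %d, %Y').date() if text else None


# One raw row in the shape produced by the scraper (strings only)
raw_row = {'SaleDate': 'Feb 20, 2010', 'SalePrice': '$4,200'}

typed_row = {
    'SaleDate': parse_sale_date(raw_row['SaleDate']),
    'SalePrice': parse_price(raw_row['SalePrice']),
}
print(typed_row)
```

Empty strings map to `None` here so that missing values stay distinguishable from a price of zero.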

In [59]:
from bs4 import BeautifulSoup
import requests

headers = {'User-Agent':
           'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/128.0.0.0 Safari/537.36'
}

response = requests.get('https://tarisio.com/cozio-archive/browse-the-archive/makers/', headers = headers)
soup = BeautifulSoup(response.text, 'html.parser')

In [61]:
response.url

'https://tarisio.com/cozio-archive/browse-the-archive/makers/'

In [None]:
response.status_code

# Extract maker data
[Return to Table of Contents](#Contents) <br>


In [92]:
# Step 1: Scrape the maker page for last names  

from bs4 import BeautifulSoup
import requests

headers = {'User-Agent':
           'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/128.0.0.0 Safari/537.36'
}

response = requests.get('https://tarisio.com/cozio-archive/browse-the-archive/makers/', headers = headers)
soup = BeautifulSoup(response.text, 'html.parser')

# Create empty lists to hold maker data
Maker_urls = []
Maker_data = []

In [94]:
# Step 2: Scrape the maker page for price page urls for each maker

# Set a Maker initial page to scrape
maker_initial = 'A'

# Find the div with rel=maker_initial
makers_section = soup.find('div', {'rel': maker_initial})
makers_section

<div class="letter clearfix" rel="A" style="display: none"><ul class="col"><li><a href="maker/?Maker_ID=2919">Achner, Michael </a></li><li><a href="maker/?Maker_ID=2611">Achner, Philip </a></li><li><a href="maker/?Maker_ID=890">Acoulon, Alexandre Alfred </a></li><li><a href="maker/?Maker_ID=1662">Acton, William John </a></li><li><a href="maker/?Maker_ID=1988">Adam, Jean 'Grand'</a></li><li><a href="maker/?Maker_ID=1946">Adam, Jean I</a></li><li><a href="maker/?Maker_ID=2">Adam, Jean Dominique </a></li><li><a href="maker/?Maker_ID=15692">Ádámi, Géza </a></li><li><a href="maker/?Maker_ID=1242">Adams, Henry T. </a></li><li><a href="maker/?Maker_ID=2260">Adamsen, Peter Petersen </a></li><li><a href="maker/?Maker_ID=15973">Adelmann, Olga </a></li><li><a href="maker/?Maker_ID=3221">Adin, Charles </a></li><li><a href="maker/?Maker_ID=3227">Adler, Johann Georg </a></li><li><a href="maker/?Maker_ID=2261">Aerninck, Hendrick </a></li><li><a href="maker/?Maker_ID=2805">Aerts, Marcel </a></li><li><

In [96]:
# Extract all maker links from within that section
maker_links = makers_section.find_all('a')
maker_links

[<a href="maker/?Maker_ID=2919">Achner, Michael </a>,
 <a href="maker/?Maker_ID=2611">Achner, Philip </a>,
 <a href="maker/?Maker_ID=890">Acoulon, Alexandre Alfred </a>,
 <a href="maker/?Maker_ID=1662">Acton, William John </a>,
 <a href="maker/?Maker_ID=1988">Adam, Jean 'Grand'</a>,
 <a href="maker/?Maker_ID=1946">Adam, Jean I</a>,
 <a href="maker/?Maker_ID=2">Adam, Jean Dominique </a>,
 <a href="maker/?Maker_ID=15692">Ádámi, Géza </a>,
 <a href="maker/?Maker_ID=1242">Adams, Henry T. </a>,
 <a href="maker/?Maker_ID=2260">Adamsen, Peter Petersen </a>,
 <a href="maker/?Maker_ID=15973">Adelmann, Olga </a>,
 <a href="maker/?Maker_ID=3221">Adin, Charles </a>,
 <a href="maker/?Maker_ID=3227">Adler, Johann Georg </a>,
 <a href="maker/?Maker_ID=2261">Aerninck, Hendrick </a>,
 <a href="maker/?Maker_ID=2805">Aerts, Marcel </a>,
 <a href="maker/?Maker_ID=2262">Aerts, Rene </a>,
 <a href="maker/?Maker_ID=15467">Aerts, René &amp; Marcel </a>,
 <a href="maker/?Maker_ID=2665">Agostinelli, Luigi </a>,

In [71]:
# Loop to extract hrefs, urls and IDs for Makers
for maker in maker_links:

    # Initialize a dictionary to store our maker data
    maker_urls_dict = {}
    
    maker_name = maker.get_text().strip()
    maker_href = maker['href']
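    # str.strip() removes a set of characters rather than a prefix, so handle the prefix explicitly
    maker_href_stripped = maker_href.removeprefix('maker/')  # requires Python 3.9+
    maker_ID = maker_href.split('Maker_ID=')[-1]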
    maker_urls = f'https://tarisio.com/cozio-archive/browse-the-archive/makers/{maker_href}'
    maker_price_urls = f'https://tarisio.com/cozio-archive/price-history/{maker_href_stripped}'

    # Store maker data in the dictionary
    maker_urls_dict['Maker'] = maker_name
    maker_urls_dict['Price_URL'] = maker_price_urls
    maker_urls_dict['MakerID'] = maker_ID

    # Add dictionary to list
    Maker_urls.append(maker_urls_dict)

In [73]:
maker_ID

'27'

In [49]:
maker_name

'Azzola, Luigi'

In [51]:
maker_href

'maker/?Maker_ID=27'

In [77]:
maker_urls

'https://tarisio.com/cozio-archive/browse-the-archive/makers/maker/?Maker_ID=27'

In [79]:
maker_price_urls

'https://tarisio.com/cozio-archive/price-history/?Maker_ID=27'

In [75]:
Maker_urls

[{'Maker': 'Achner, Michael',
  'Price_URL': 'https://tarisio.com/cozio-archive/price-history/?Maker_ID=2919',
  'MakerID': '2919'},
 {'Maker': 'Achner, Philip',
  'Price_URL': 'https://tarisio.com/cozio-archive/price-history/?Maker_ID=2611',
  'MakerID': '2611'},
 {'Maker': 'Acoulon, Alexandre Alfred',
  'Price_URL': 'https://tarisio.com/cozio-archive/price-history/?Maker_ID=890',
  'MakerID': '890'},
 {'Maker': 'Acton, William John',
  'Price_URL': 'https://tarisio.com/cozio-archive/price-history/?Maker_ID=1662',
  'MakerID': '1662'},
 {'Maker': "Adam, Jean 'Grand'",
  'Price_URL': 'https://tarisio.com/cozio-archive/price-history/?Maker_ID=1988',
  'MakerID': '1988'},
 {'Maker': 'Adam, Jean I',
  'Price_URL': 'https://tarisio.com/cozio-archive/price-history/?Maker_ID=1946',
  'MakerID': '1946'},
 {'Maker': 'Adam, Jean Dominique',
  'Price_URL': 'https://tarisio.com/cozio-archive/price-history/?Maker_ID=2',
  'MakerID': '2'},
 {'Maker': 'Ádámi, Géza',
  'Price_URL': 'https://tarisio.c
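The URL handling above can also be sketched with the standard library's `urllib.parse`, which reads the `Maker_ID` query parameter directly instead of relying on string stripping. This is only an alternative sketch: the constants `MAKERS_BASE` and `PRICE_BASE` and the helpers `maker_id_from_href` and `build_maker_urls` are names introduced here for illustration.

```python
from urllib.parse import parse_qs, urlencode, urljoin, urlparse

# Base URLs taken from the cells above; the constant names are illustrative
MAKERS_BASE = 'https://tarisio.com/cozio-archive/browse-the-archive/makers/'
PRICE_BASE = 'https://tarisio.com/cozio-archive/price-history/'


def maker_id_from_href(href):
    """Return the Maker_ID query value from an href like 'maker/?Maker_ID=27'."""
    query = parse_qs(urlparse(href).query)
    return query['Maker_ID'][0]


def build_maker_urls(href):
    """Build the maker page URL and price-history URL for one relative href."""
    maker_id = maker_id_from_href(href)
    return {
        'MakerID': maker_id,
        'Maker_URL': urljoin(MAKERS_BASE, href),
        'Price_URL': f"{PRICE_BASE}?{urlencode({'Maker_ID': maker_id})}",
    }


print(build_maker_urls('maker/?Maker_ID=27'))
```

Parsing the query string means the ID is still extracted correctly if the href ever gains extra parameters or a different prefix.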

# Scrape each maker page for data

In [108]:
# Loop over makers: build each maker's URLs, then scrape that maker's price-history page
for maker in maker_links:

    # Initialize a dictionary to store our maker data
    maker_urls_dict = {}
    
    maker_name = maker.get_text().strip()
    maker_href = maker['href']
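    # str.strip() removes a set of characters rather than a prefix, so handle the prefix explicitly
    maker_href_stripped = maker_href.removeprefix('maker/')  # requires Python 3.9+
    maker_ID = maker_href.split('Maker_ID=')[-1]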
    maker_urls = f'https://tarisio.com/cozio-archive/browse-the-archive/makers/{maker_href}'
    maker_price_urls = f'https://tarisio.com/cozio-archive/price-history/{maker_href_stripped}'

    # Store maker data in the dictionary
    maker_urls_dict['Maker'] = maker_name
    maker_urls_dict['Price_URL'] = maker_price_urls
    maker_urls_dict['MakerID'] = maker_ID

    # Add dictionary to list
    Maker_urls.append(maker_urls_dict)
    
    # URL of this maker's price-history page
    url = maker_price_urls

    # Send a GET request to fetch that page's content
    response = requests.get(url)

    # Parse the content with BeautifulSoup
    soup = BeautifulSoup(response.content, 'html.parser')

    # Find all <tr> rows with valign='top'; these rows contain the sale records
    rows = soup.find_all('tr', valign='top')
    
    # Loop through each row and extract the instrument type, sale date, sale price, auction house, and city

    for row in rows:

        # Initialize a dictionary to store instrument data
        maker_data_dict = {}

        # Collect the row's cells once instead of searching the row again for every field
        cells = row.find_all('td')

        # Instrument type
        inst_type = cells[0].text.strip()  # Assuming the 1st <td> contains the instrument type
        print(f"Instrument type: {inst_type}")

        # Sale date
        sale_date = cells[3].text.strip()  # Assuming the 4th <td> contains the sale date
        print(f"Sale Date: {sale_date}")

        # Sale price
        sale_price = cells[5].text.strip()  # Assuming the 6th <td> contains the sale price
        print(f"Sale price: {sale_price}")

        # Auction house
        auction_house = cells[2].text.strip()  # Assuming the 3rd <td> contains the auction house
        print(f"Auction house: {auction_house}")

        # City
        city = cells[1].text.strip()  # Assuming the 2nd <td> contains the city
        print(f"City: {city}")

        # Store data in the dictionary
        maker_data_dict['Instrument'] = inst_type
        maker_data_dict['SaleDate'] = sale_date
        maker_data_dict['SalePrice'] = sale_price
        maker_data_dict['AuctionHouse'] = auction_house
        maker_data_dict['City'] = city
        maker_data_dict['MakerID'] = maker_ID
        print(f"MakerID: {maker_ID}")

        # Add the maker to the dictionary 
        # Find the <span> tag with class 'darkGrey' which contains the name
        name_tag = soup.find('span', class_='darkGrey')
            
        # Extract the text inside the <span> tag
        maker = name_tag.text.strip()
        print(f"Maker: {maker}")
        maker_data_dict['Maker'] = maker

        # Add dictionary to list
        Maker_data.append(maker_data_dict)

Instrument type: Violin
Sale Date: Feb 20, 2010
Sale price: $4,200
Auction house: Tarisio
City: Wallgau
MakerID: 2919
Maker: Achner, Michael
Instrument type: Violin
Sale Date: Apr 27, 1987
Sale price: $2,838
Auction house: Bongartz's
City: Mittenwald
MakerID: 2919
Maker: Achner, Michael
Instrument type: Violin
Sale Date: May 17, 2018
Sale price: $24,000
Auction house: Tarisio
City: Mittenwald
MakerID: 2611
Maker: Achner, Philip
Instrument type: Violin
Sale Date: Nov 15, 2008
Sale price: $3,884
Auction house: Bongartz's
City: 
MakerID: 2611
Maker: Achner, Philip
Instrument type: Violin
Sale Date: Mar 27, 1990
Sale price: $2,146
Auction house: Sotheby's
City: Mittenwald
MakerID: 2611
Maker: Achner, Philip
Instrument type: Violin
Sale Date: Nov 8, 2019
Sale price: $2,950
Auction house: Tarisio
City: 
MakerID: 890
Maker: Acoulon, Alexandre Alfred
Instrument type: Violin
Sale Date: Oct 16, 2013
Sale price: $3,000
Auction house: Tarisio
City: Paris
MakerID: 890
Maker: Acoulon, Alexandre Alfr

KeyboardInterrupt: 
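The run above was interrupted, and because `Maker_data` persists between cell executions, rerunning the cell would append the already-scraped rows a second time. One possible way to make the loop safe to rerun is sketched below. It is not the notebook's own code: `scrape_pending`, `done_ids`, and `fake_fetch_rows` are hypothetical names, and `fake_fetch_rows` is a stand-in for the real network request and row parsing.

```python
def scrape_pending(makers, fetch_rows, results, done_ids):
    """Scrape only makers whose MakerID is not yet recorded in done_ids."""
    for maker in makers:
        maker_id = maker['MakerID']
        if maker_id in done_ids:
            continue  # already scraped in an earlier, possibly interrupted, run
        rows = fetch_rows(maker)   # may raise, e.g. KeyboardInterrupt
        results.extend(rows)
        done_ids.add(maker_id)     # mark as done only after the rows are stored
    return results


def fake_fetch_rows(maker):
    """Stand-in for the real request; returns one placeholder row per maker."""
    return [{'MakerID': maker['MakerID'], 'Maker': maker['Maker']}]


makers = [
    {'Maker': 'Achner, Michael', 'MakerID': '2919'},
    {'Maker': 'Achner, Philip', 'MakerID': '2611'},
]
results, done_ids = [], set()

scrape_pending(makers, fake_fetch_rows, results, done_ids)
scrape_pending(makers, fake_fetch_rows, results, done_ids)  # rerun adds nothing
print(len(results))  # prints 2
```

Because a maker is marked as done only after its rows are stored, an interrupt in the middle of a fetch leaves that maker pending for the next run.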

In [110]:
Maker_data

[{'MakerID': '2919',
  'Instrument': 'Violin',
  'SaleDate': 'Feb 20, 2010',
  'SalePrice': '$4,200',
  'AuctionHouse': 'Tarisio',
  'City': 'Wallgau',
  'Maker': 'Achner, Michael'},
 {'MakerID': '2919',
  'Instrument': 'Violin',
  'SaleDate': 'Apr 27, 1987',
  'SalePrice': '$2,838',
  'AuctionHouse': "Bongartz's",
  'City': 'Mittenwald',
  'Maker': 'Achner, Michael'},
 {'MakerID': '2611',
  'Instrument': 'Violin',
  'SaleDate': 'May 17, 2018',
  'SalePrice': '$24,000',
  'AuctionHouse': 'Tarisio',
  'City': 'Mittenwald',
  'Maker': 'Achner, Philip'},
 {'MakerID': '2611',
  'Instrument': 'Violin',
  'SaleDate': 'Nov 15, 2008',
  'SalePrice': '$3,884',
  'AuctionHouse': "Bongartz's",
  'City': '',
  'Maker': 'Achner, Philip'},
 {'MakerID': '2611',
  'Instrument': 'Violin',
  'SaleDate': 'Mar 27, 1990',
  'SalePrice': '$2,146',
  'AuctionHouse': "Sotheby's",
  'City': 'Mittenwald',
  'Maker': 'Achner, Philip'},
 {'MakerID': '890',
  'Instrument': 'Violin',
  'SaleDate': 'Nov 8, 2019',
  

# Scrape all makers by last initial
[Return to Table of Contents](#Contents) <br>

## Scrape A names
Execution time: 376.39132809638977 seconds
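The code that produced the execution-time lines in this section is not shown. A minimal sketch of how such a measurement could be taken is given below, assuming a per-letter scraping function exists. Both `timed` and `scrape_letter` are hypothetical names, and `scrape_letter` is an empty placeholder rather than the real scraper.

```python
import time


def timed(func, *args, **kwargs):
    """Call func, print the elapsed wall-clock time, and return (result, seconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    elapsed = time.perf_counter() - start
    print(f"Execution time: {elapsed} seconds")
    return result, elapsed


def scrape_letter(letter):
    """Placeholder for the real per-letter scrape; returns no rows."""
    return []


A_rows, elapsed = timed(scrape_letter, 'A')
```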

In [121]:
A_makers

Unnamed: 0,Instrument,SaleDate,SalePrice,AuctionHouse,City,MakerID,Maker
0,Violin,"Feb 20, 2010","$4,200",Tarisio,Wallgau,2919,"Achner, Michael"
1,Violin,"Apr 27, 1987","$2,838",Bongartz's,Mittenwald,2919,"Achner, Michael"
2,Violin,"May 17, 2018","$24,000",Tarisio,Mittenwald,2611,"Achner, Philip"
3,Violin,"Nov 15, 2008","$3,884",Bongartz's,,2611,"Achner, Philip"
4,Violin,"Mar 27, 1990","$2,146",Sotheby's,Mittenwald,2611,"Achner, Philip"
...,...,...,...,...,...,...,...
1360,Cello,"May 21, 2005","$11,840",Bongartz's,,27,"Azzola, Luigi"
1361,Violin,"May 6, 2004","$21,850",Tarisio,Turin,27,"Azzola, Luigi"
1362,Violin,"May 6, 2001","$8,337",Skinner,Turin,27,"Azzola, Luigi"
1363,Violin,"Jul 27, 1989","$3,834",Phillip's,Turin,27,"Azzola, Luigi"


In [123]:
A_makers['Maker'].nunique()

122

In [125]:
A_makers['MakerID'].nunique()

122
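Equal unique counts for `Maker` and `MakerID` are consistent with a one-to-one mapping but do not prove it, since one ID with two spellings and two IDs sharing one name would cancel out. A stricter check can be sketched in plain Python as below. The function names are hypothetical, and the sample rows are a small subset of the values shown above rather than the full dataframe.

```python
from collections import defaultdict


def names_per_id(rows):
    """Map each MakerID to the set of Maker names seen with it."""
    mapping = defaultdict(set)
    for row in rows:
        mapping[row['MakerID']].add(row['Maker'])
    return mapping


def ids_with_multiple_names(rows):
    """Return only the MakerIDs that appear with more than one Maker name."""
    return {mid: names for mid, names in names_per_id(rows).items() if len(names) > 1}


rows = [
    {'MakerID': '2919', 'Maker': 'Achner, Michael'},
    {'MakerID': '2919', 'Maker': 'Achner, Michael'},
    {'MakerID': '2611', 'Maker': 'Achner, Philip'},
    {'MakerID': '27', 'Maker': 'Azzola, Luigi'},
]
print(ids_with_multiple_names(rows))
```

An empty result means every ID in the sample maps to exactly one name. The same idea could be applied in the other direction to find names shared by several IDs.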

## Scrape B names
Execution time: 978.5993416309357 seconds

In [129]:
B_makers

Unnamed: 0,Instrument,SaleDate,SalePrice,AuctionHouse,City,MakerID,Maker
0,Violin,"Oct 6, 2010","$3,244",Bonhams,Mittenwald,2932,"Baader, Johann Evangelist"
1,Violin,"Oct 31, 2012","$2,215",Bonhams,Mittenwald,1264,"Baader & Co., J. A."
2,Violin,"Jul 27, 2012","$1,920",Tarisio,,1264,"Baader & Co., J. A."
3,Violin,"Oct 14, 2007",$823,Skinner,Mittenwald,1264,"Baader & Co., J. A."
4,Violin,"May 7, 2006",$206,Skinner,,1264,"Baader & Co., J. A."
...,...,...,...,...,...,...,...
5021,Violin Bow,"Mar 30, 1989",$371,Sotheby's,Liverpool,4684,"Byrom, George"
5022,Violin,"Dec 18, 1986",$299,Phillip's,Liverpool,4684,"Byrom, George"
5023,Violin,"Oct 15, 1998","$7,833",Phillip's,,1342,"Byrom, John"
5024,Violin Bow,"Jun 18, 1998",$692,Bonhams,,1342,"Byrom, John"


## Scrape C names
Execution time: 819.7238759994507 seconds

In [134]:
C_makers

Unnamed: 0,Instrument,SaleDate,SalePrice,AuctionHouse,City,MakerID,Maker
0,Violin,"Jun 2, 2022","$2,439",,Mirecourt,1555,"Cabasse, Prosper II"
1,Violin,"Mar 19, 1992",$753,Phillip's,Mirecourt,1555,"Cabasse, Prosper II"
2,Violin,"Nov 15, 1990",$645,Phillip's,Mirecourt,1555,"Cabasse, Prosper II"
3,Cello,"Oct 15, 1989","$2,283",Ader Tajan,,1555,"Cabasse, Prosper II"
4,Viola,"Nov 19, 1988","$1,204",Bongartz's,Mirecourt,1555,"Cabasse, Prosper II"
...,...,...,...,...,...,...,...
4715,Cello,"Nov 19, 1980","$8,078",Christie's,,1338,"Cuypers, Johannes Theodorus"
4716,Violin,"Jan 18, 1979","$2,612",Phillip's,,1338,"Cuypers, Johannes Theodorus"
4717,Violin,"May 16, 1978","$5,789",Sotheby's,,1338,"Cuypers, Johannes Theodorus"
4718,Violin,"Oct 26, 1972","$2,696",Sotheby's,,1338,"Cuypers, Johannes Theodorus"


## Scrape D names
Execution time: 604.3089759349823 seconds

In [139]:
D_makers

Unnamed: 0,Instrument,SaleDate,SalePrice,AuctionHouse,City,MakerID,Maker
0,Violin,"Oct 30, 2023","$4,662",,Naples,1936,"D'Aguanno, Carmine"
1,Viola,"Oct 2, 2007","$8,932",Sotheby's,,1936,"D'Aguanno, Carmine"
2,Violin,"May 4, 2008","$1,422",Skinner,,5281,"D'Alessandro, Mario"
3,Violin,"Mar 19, 2018","$6,403",Brompton's,,2749,"D'Aria, Vincenzo"
4,Violin,"Apr 27, 2012","$12,000",Tarisio,Naples,2749,"D'Aria, Vincenzo"
...,...,...,...,...,...,...,...
2645,Violin,"Dec 10, 1998","$1,227",Phillip's,Leeds,5772,"Dykes, Arthur W."
2646,Violin,"Nov 22, 1984",$432,Sotheby's,Leeds,5772,"Dykes, Arthur W."
2647,Violin,"Mar 10, 2010","$5,032",Bonhams,Leeds,5773,"Dykes, George Langton"
2648,Violin,"May 4, 2001","$2,820",Christie's,Leeds,5773,"Dykes, George Langton"


## Scrape E names
Execution time: 145.52475309371948 seconds

In [144]:
E_makers

Unnamed: 0,Instrument,SaleDate,SalePrice,AuctionHouse,City,MakerID,Maker
0,Violin Bow,"Mar 13, 1991","$1,225",Phillip's,,2645,"Earl, David"
1,Violin,"Nov 8, 2018","$4,500",Tarisio,Rotterdam,1097,"Eberle, Eugène I"
2,Violin,"Mar 28, 2013","$6,360",Tarisio,Rotterdam,1097,"Eberle, Eugène I"
3,Violin,"Feb 25, 2013","$5,100",Tarisio,Rotterdam,1097,"Eberle, Eugène I"
4,Cello,"Oct 31, 2012","$20,139",Bonhams,The Hague,1097,"Eberle, Eugène I"
...,...,...,...,...,...,...,...
422,Violin,"Feb 1, 2002",$814,Bonhams,Cowdenbeath,6030,"Ewan, David"
423,Violin,"Jan 26, 1989",$839,Phillip's,Cowdenbeath,6030,"Ewan, David"
424,Violin,"Jan 28, 1988",$141,Phillip's,,6032,"Eyles, Charles"
425,Violin,"Jan 17, 1985",$271,Phillip's,,6032,"Eyles, Charles"


## Scrape F names
Execution time: 509.83846974372864 seconds

In [149]:
F_makers

Unnamed: 0,Instrument,SaleDate,SalePrice,AuctionHouse,City,MakerID,Maker
0,Cello,"Mar 25, 2024","$45,516",,Ascoli Piceno,1796,"Fabiani, Antonio"
1,Violin,"May 6, 2004","$7,475",Tarisio,Ascoli Piceno,1796,"Fabiani, Antonio"
2,Cello,"Jun 27, 1985","$1,251",Phillip's,,1796,"Fabiani, Antonio"
3,Violin,"Apr 30, 1990","$28,172",Bongartz's,Naples,6040,"Fabricatore, Gennaro II"
4,Viola,"Jan 18, 1984",$800,Sotheby Parke Bernet,Naples,6040,"Fabricatore, Gennaro II"
...,...,...,...,...,...,...,...
1965,Violin,"Mar 14, 1991","$5,919",Sotheby's,London,1474,"Furber, Matthew"
1966,Cello,"Nov 23, 1989","$18,937",Sotheby's,London,1474,"Furber, Matthew"
1967,Small Violin,"Mar 30, 1989","$1,485",Sotheby's,London,1474,"Furber, Matthew"
1968,Viola,"Nov 1, 1994","$20,645",Sotheby's,Füssen,6557,"Fürst, Georg"


## Scrape G names
Execution time: 796.1523900032043 seconds

In [154]:
G_makers

Unnamed: 0,Instrument,SaleDate,SalePrice,AuctionHouse,City,MakerID,Maker
0,Violin Bow,"Oct 24, 2023","$2,674",,,15761,"Gabriel, Joseph P."
1,Violin Bow,"Jun 13, 2023","$2,461",,,15761,"Gabriel, Joseph P."
2,Violin,"Nov 24, 2019","$12,300",Skinner,Florence,183,"Gabrielli, Giovanni Battista"
3,Violin,"Oct 8, 2024","$71,273",,Florence,183,"Gabrielli, Giovanni Battista"
4,Violin,"Dec 1, 2022","$49,589",,Florence,183,"Gabrielli, Giovanni Battista"
...,...,...,...,...,...,...,...
3640,Violin,"Nov 20, 1989","$1,797",Bongartz's,Markneukirchen,249,"Gütter, Kurt Arno"
3641,Violin,"Nov 20, 1989","$1,797",Bongartz's,Markneukirchen,249,"Gütter, Kurt Arno"
3642,Violin,"Feb 20, 2015","$2,596",Tarisio,Markneukirchen,3145,"Gütter, Oskar Richard"
3643,Violin,"Aug 15, 2014","$2,360",Tarisio,Markneukirchen,3145,"Gütter, Oskar Richard"


## Scrape H names
Execution time: 665.340888261795 seconds

In [159]:
H_makers

Unnamed: 0,Instrument,SaleDate,SalePrice,AuctionHouse,City,MakerID,Maker
0,Violin,"Jul 15, 2006",$460,Tarisio,,251,"Haas, John"
1,Cello,"Mar 22, 2024","$9,833",,Cremona,1912,"Haddad, Dawn M."
2,Cello,"Nov 8, 2018","$5,900",Tarisio,Cremona,1912,"Haddad, Dawn M."
3,Cello,"Oct 23, 2008","$7,605",Tarisio,Cremona,1912,"Haddad, Dawn M."
4,Cello,"May 9, 2008","$14,040",Tarisio,Cremona,1912,"Haddad, Dawn M."
...,...,...,...,...,...,...,...
5485,Violin,"Feb 20, 2008","$1,404",Tarisio,"Northampton, MA",303,"Hyde, Andrew"
5486,Violin,"Oct 22, 2004","$2,300",Tarisio,"Northampton, MA",303,"Hyde, Andrew"
5487,Violin,"May 2, 2004",$764,Skinner,,303,"Hyde, Andrew"
5488,Viola,"Nov 11, 1990",$715,Skinner,,303,"Hyde, Andrew"


## Scrape I names 
Execution time: 18.59822678565979 seconds

In [164]:
I_makers

Unnamed: 0,Instrument,SaleDate,SalePrice,AuctionHouse,City,MakerID,Maker
0,Violin,"Jun 8, 2021","$5,085",Ingles & Hayday,Florence,305,"Ignesti, Roberto"
1,Violin,"May 19, 2006","$6,325",Tarisio,Livorno,305,"Ignesti, Roberto"
2,Violin,"Mar 21, 1995","$4,194",Sotheby's,Florence,305,"Ignesti, Roberto"
3,Violin,"May 2, 1988","$2,621",Bongartz's,Florence,305,"Ignesti, Roberto"
4,Violin,"Oct 25, 2022","$23,923",,London,15706,"Ihle, Philip"
5,Violin,"Nov 4, 2005","$1,265",Tarisio,Linlithgow,306,"Ingram, William"
6,Violin,"May 5, 2002","$1,763",Skinner,,2875,"Injeian, Phillip"
7,Violin,"Nov 4, 2001","$3,738",Skinner,,2875,"Injeian, Phillip"
8,Violin,"Oct 18, 2020","$23,247",Brompton's,Naples,1843,"Iorio, Vincenzo"
9,Violin,"Mar 11, 2008","$28,859",Brompton's,Naples,1843,"Iorio, Vincenzo"


## Scrape J names
Execution time: 156.963458776474 seconds

In [169]:
J_makers

Unnamed: 0,Instrument,SaleDate,SalePrice,AuctionHouse,City,MakerID,Maker
0,Violin,"Apr 14, 1988","$1,688",Phillip's,Leeds,8093,"J'Anson, Edward Poppleweil"
1,Violin,"Jul 27, 2012","$1,800",Tarisio,Ewnie,2178,"Jack, William G."
2,Violin,"Feb 20, 2010","$1,800",Tarisio,Ewnie,2178,"Jack, William G."
3,Violin,"Nov 1, 2005","$2,962",Sotheby's,,8095,"Jacklin, Cyril William"
4,Violin,"Nov 11, 1986",$253,Christie's,London,8095,"Jacklin, Cyril William"
...,...,...,...,...,...,...,...
391,Violin,"Nov 20, 1994","$2,070",Skinner,Schönbach,8303,"Juzek, Jan (John)"
392,Viola,"Nov 8, 1992",$424,Skinner,Prague,8303,"Juzek, Jan (John)"
393,Violin,"Mar 19, 1992","$1,548",Phillip's,Prague,8303,"Juzek, Jan (John)"
394,Violin,"Jun 21, 1990","$4,175",Phillip's,Prague,8303,"Juzek, Jan (John)"


## Scrape K names
Execution time: 386.228059053421 seconds

In [174]:
K_makers

Unnamed: 0,Instrument,SaleDate,SalePrice,AuctionHouse,City,MakerID,Maker
0,Violin,"May 12, 2005","$2,645",Tarisio,,319,"Kabay, Emmerich"
1,Violin Bow,"May 31, 2024","$3,600",,,15241,"Kagan, Gerald"
2,Cello Bow,"Nov 18, 2022","$2,596",,,15241,"Kagan, Gerald"
3,Cello Bow,"Jun 10, 2022","$1,560",,,15241,"Kagan, Gerald"
4,Cello Bow,"Mar 18, 2022",$960,,,15241,"Kagan, Gerald"
...,...,...,...,...,...,...,...
1178,Viola,"Nov 9, 1997",$920,Skinner,,8706,"Kuster, Frederick"
1179,Violin,"May 9, 2014","$17,700",Tarisio,"San Francisco, CA",2147,"Kuttner, Francis"
1180,Violin,"Oct 16, 2013","$16,800",Tarisio,"San Francisco, CA",2147,"Kuttner, Francis"
1181,Viola,"Oct 18, 2009","$9,600",Tarisio,"San Francisco, CA",2147,"Kuttner, Francis"


## Scrape L names
Execution time: 511.11951303482056 seconds

In [179]:
L_makers

Unnamed: 0,Instrument,SaleDate,SalePrice,AuctionHouse,City,MakerID,Maker
0,Violin,"Jun 6, 2024","$5,436",,Paris,8754,"L'Humbert, Emile"
1,Violin,"Mar 31, 2015","$8,876",Tarisio,Paris,8754,"L'Humbert, Emile"
2,Violin,"Oct 15, 2010","$7,200",Tarisio,,8754,"L'Humbert, Emile"
3,Violin,"Mar 10, 2009","$11,245",Sotheby's,Paris,8754,"L'Humbert, Emile"
4,Violin,"Jun 11, 1996","$6,180",Sotheby's,Paris,8754,"L'Humbert, Emile"
...,...,...,...,...,...,...,...
3362,Violin Bow,"May 8, 2003",$738,Bonhams,,1457,"Lyon & Healy, Firm"
3363,Violin Bow,"Oct 4, 2001","$1,500",Sotheby's,,1457,"Lyon & Healy, Firm"
3364,Violin,"Jun 29, 1993","$1,984",Phillip's,,1457,"Lyon & Healy, Firm"
3365,Violin Bow,"Jun 13, 1990",$526,Christie's,,1457,"Lyon & Healy, Firm"


## Scrape M names
Execution time: 872.6006419658661 seconds

In [187]:
M_makers

Unnamed: 0,Instrument,SaleDate,SalePrice,AuctionHouse,City,MakerID,Maker
0,Violin,"Feb 9, 1998",$173,Butterfield & Butterfield,,1240,"Maag, Henry"
1,Violin,"Feb 9, 1998",$173,Butterfield & Butterfield,,1240,"Maag, Henry"
2,Violin,"Feb 9, 1998",$161,Butterfield & Butterfield,,1240,"Maag, Henry"
3,Violin,"Jun 9, 2011","$1,811",Vichy-Enchères,,2375,"Maccaferri, Mario"
4,Cello,"Apr 22, 1985","$4,735",Bongartz's,Cento,2375,"Maccaferri, Mario"
...,...,...,...,...,...,...,...
3798,Violin,"Nov 18, 1997","$4,680",Sotheby's,Udine,477,"Muschietti, Umberto"
3799,Violin,"Mar 28, 1989","$2,323",Bongartz's,Frankfurt,10149,"Muschke, Johann"
3800,Violin,"Nov 19, 1988","$1,458",Bongartz's,Frankfurt,10149,"Muschke, Johann"
3801,Viola,"May 6, 1991","$3,164",Bongartz's,Prague,10154,"Musil, Cestmir"


## Scrape N names
Execution time: 217.5417869091034 seconds

In [192]:
N_makers

Unnamed: 0,Instrument,SaleDate,SalePrice,AuctionHouse,City,MakerID,Maker
0,Violin,"Oct 31, 2023","$5,098",,Paris,15617,"Nadegini, Jean-Baptiste Léonidas"
1,Violin,"Dec 2, 2021","$3,655",,Paris,15617,"Nadegini, Jean-Baptiste Léonidas"
2,Cello,"Aug 15, 2014","$10,030",Tarisio,,15617,"Nadegini, Jean-Baptiste Léonidas"
3,Violin,"Dec 8, 2011","$2,160",Vichy-Enchères,,15617,"Nadegini, Jean-Baptiste Léonidas"
4,Violin,"Jun 9, 2011","$9,055",Vichy-Enchères,,15617,"Nadegini, Jean-Baptiste Léonidas"
...,...,...,...,...,...,...,...
1827,Violin Bow,"May 16, 2015","$2,360",Tarisio,,943,"Nürnberger-Süss, August"
1828,Violin Bow,"May 9, 2014","$3,600",Tarisio,,943,"Nürnberger-Süss, August"
1829,Violin Bow,"Mar 10, 2011","$3,855",Tarisio,"Novato, CA",943,"Nürnberger-Süss, August"
1830,Violin Bow,"Oct 20, 2009","$2,359",Tarisio,"Novato, CA",943,"Nürnberger-Süss, August"


## Scrape O names
Execution time: 126.16039085388184 seconds

In [197]:
O_makers

Unnamed: 0,Instrument,SaleDate,SalePrice,AuctionHouse,City,MakerID,Maker
0,Violin,"Apr 26, 2015",$861,Skinner,,498,"O'Laughlin, Terrence"
1,Violin,"Jul 21, 2005",$920,Tarisio,"Los Angeles, CA",498,"O'Laughlin, Terrence"
2,Violin,"Feb 16, 2005",$748,Tarisio,"Boston, MA",498,"O'Laughlin, Terrence"
3,Violin,"Oct 19, 2003",$705,Skinner,,498,"O'Laughlin, Terrence"
4,Violin,"May 6, 2001",$460,Skinner,,498,"O'Laughlin, Terrence"
...,...,...,...,...,...,...,...
852,Violin,"Sep 20, 1984","$1,007",Sotheby's,Leeds,1367,"Owen, John William"
853,Violin,"Apr 7, 1983",$926,Sotheby's,Leeds,1367,"Owen, John William"
854,Cello Bow,"Apr 26, 2022","$7,686",,,506,"Oxley, Peter"
855,Viola Bow,"Mar 2, 2007","$2,909",Tarisio,,506,"Oxley, Peter"


## Scrape P names
Execution time: 746.7027010917664 seconds

In [204]:
P_makers

Unnamed: 0,Instrument,SaleDate,SalePrice,AuctionHouse,City,MakerID,Maker
0,Violin,"Sep 15, 1988","$1,324",Phillip's,Mirecourt,2401,"Pacherele, Michel"
1,Violin,"Mar 1, 2019","$8,850",Tarisio,Mirecourt,507,"Pacherele, Pierre"
2,Violin,"May 9, 2014","$18,000",Tarisio,Mirecourt,507,"Pacherele, Pierre"
3,Violin,"Dec 12, 2011","$5,625",Brompton's,,507,"Pacherele, Pierre"
4,Violin,"Oct 22, 2008","$17,031",Tarisio,Mirecourt,507,"Pacherele, Pierre"
...,...,...,...,...,...,...,...
3842,Violin,"Jun 21, 1984","$1,948",Sotheby's,London,584,"Pyne, George"
3843,Viola,"May 10, 1984","$2,434",Phillip's,London,584,"Pyne, George"
3844,Violin,"Mar 29, 1984","$1,277",Phillip's,London,584,"Pyne, George"
3845,Violin,"Jun 23, 1983","$1,695",Sotheby's,London,584,"Pyne, George"


## Scrape Q names
Execution time: 18.37739586830139 seconds

In [209]:
Q_makers

Unnamed: 0,Instrument,SaleDate,SalePrice,AuctionHouse,City,MakerID,Maker
0,Cello Bow,"Nov 19, 2021","$2,832",,,2149,"Quade, Roy G."
1,Cello Bow,"Apr 27, 2012","$3,600",Tarisio,Calgary,2149,"Quade, Roy G."
2,Cello Bow,"Oct 18, 2009","$2,400",Tarisio,Calgary,2149,"Quade, Roy G."
3,Violin Bow,"Oct 18, 2009","$2,640",Tarisio,Calgary,2149,"Quade, Roy G."
4,Violin,"Mar 18, 2019","$7,626",Brompton's,Mantua,15326,"Quareni, Vincenzo"
5,Cello,"Jun 6, 2024","$15,629",,,1957,"Quenoil, Charles"
6,Bass,"Jun 2, 2022","$39,801",,,1957,"Quenoil, Charles"
7,Cello,"Mar 15, 2016","$20,415",Ingles & Hayday,Igny,1957,"Quenoil, Charles"
8,Cello,"May 24, 2013","$5,775",Tajan,Mirecourt,1957,"Quenoil, Charles"
9,Cello,"Jun 9, 2011","$25,354",Vichy-Enchères,,1957,"Quenoil, Charles"


## Scrape R names
Execution time: 514.1028988361359 seconds

In [214]:
R_makers

Unnamed: 0,Instrument,SaleDate,SalePrice,AuctionHouse,City,MakerID,Maker
0,Violin,"Nov 23, 1992","$2,746",Bongartz's,Bubenreuth,11534,"Raab, Willibald"
1,Violin,"Nov 5, 1989",$468,Skinner,Schönbach,11534,"Raab, Willibald"
2,Viola,"Oct 17, 2014","$2,832",Tarisio,Bubenreuth,585,"Raabs, Gottfried"
3,Violin,"Mar 1, 2014","$1,140",Tarisio,Bubenreuth,585,"Raabs, Gottfried"
4,Viola,"Oct 24, 2002","$1,575",Tarisio,Bubenreuth,585,"Raabs, Gottfried"
...,...,...,...,...,...,...,...
1763,Violin,"Oct 6, 2010","$4,771",Bonhams,Amsterdam,12170,"Rutz, Karl"
1764,Viola,"Mar 30, 1989","$1,856",Sotheby's,Luzern,12170,"Rutz, Karl"
1765,Violin,"Dec 15, 2010","$4,309",Bonhams,,12172,"Ruzicka, Josef Vojtech"
1766,Violin,"Aug 7, 1997","$3,286",Phillip's,,12172,"Ruzicka, Josef Vojtech"


## Scrape S names
Execution time: 916.2442841529846 seconds

In [219]:
S_makers

Unnamed: 0,Instrument,SaleDate,SalePrice,AuctionHouse,City,MakerID,Maker
0,Violin,"Oct 8, 2024","$7,907",,Milan,637,"Saccani, Benigno"
1,Violin,"Oct 19, 2019","$6,231",Ingles & Hayday,Milan,637,"Saccani, Benigno"
2,Violin,"Mar 30, 2015","$7,104",Ingles & Hayday,Milan,637,"Saccani, Benigno"
3,Violin,"Jun 26, 2014","$9,929",Tarisio,Milan,637,"Saccani, Benigno"
4,Violin,"Mar 1, 2014","$5,400",Tarisio,Milan,637,"Saccani, Benigno"
...,...,...,...,...,...,...,...
4399,Violin,"Nov 22, 1984","$1,350",Sotheby's,London,956,"Szepessy, Béla"
4400,Violin,"Mar 29, 1984","$3,511",Phillip's,London,956,"Szepessy, Béla"
4401,Violin,"Apr 7, 1983","$3,804",Sotheby's,London,956,"Szepessy, Béla"
4402,Violin,"Feb 26, 2004","$2,070",Tarisio,Arad,730,"Szilagyi, Albert"


## Scrape T names
Execution time: 409.2382938861847 seconds

In [224]:
T_makers

Unnamed: 0,Instrument,SaleDate,SalePrice,AuctionHouse,City,MakerID,Maker
0,Violin,"Dec 2, 2021","$1,687",,Toulon,15625,"Tabiasco,"
1,Violin,"May 31, 2024","$15,600",,Cremona,731,"Tadioli, Maurizio"
2,Violin,"Jun 9, 2023","$19,200",,Cremona,731,"Tadioli, Maurizio"
3,Viola,"Nov 13, 2020","$16,800",Tarisio,Cremona,731,"Tadioli, Maurizio"
4,Violin,"Feb 24, 2017","$9,600",Tarisio,Cremona,731,"Tadioli, Maurizio"
...,...,...,...,...,...,...,...
3499,Violin,"Jun 12, 1986",$142,Phillip's,Weston,767,"Tweedale, Charles L."
3500,Violin,"Nov 26, 1984","$1,078",Bongartz's,,767,"Tweedale, Charles L."
3501,Cello,"Mar 11, 2009",$529,Bonhams,,2457,"Tyson, Herbert William"
3502,Cello,"Mar 10, 2003","$2,820",Bonhams,Louth,2457,"Tyson, Herbert William"


## Scrape U names
Execution time: 35.353904724121094 seconds

In [229]:
U_makers

Unnamed: 0,Instrument,SaleDate,SalePrice,AuctionHouse,City,MakerID,Maker
0,Violin,"Mar 2, 2007","$2,909",Tarisio,Padua,768,"Ubaldino, Cavestro"
1,Violin Bow,"Mar 24, 2020",$772,Tarisio,,1396,"Uebel, Kurt Werner"
2,Violin Bow,"Oct 6, 2010",$725,Brompton's,,1396,"Uebel, Kurt Werner"
3,Cello Bow,"Mar 11, 2009",$827,Bonhams,Markneukirchen,1396,"Uebel, Kurt Werner"
4,Violin Bow,"Jul 14, 2004",$444,Christie's,,1396,"Uebel, Kurt Werner"
...,...,...,...,...,...,...,...
58,Cello,"May 3, 2010","$15,600",Tarisio,,1995,"Utili, Nicola"
59,Violin,"Nov 4, 2001","$4,255",Skinner,Bologna,1995,"Utili, Nicola"
60,Violin,"Oct 22, 1998",$424,Sotheby's,Castelbolognese,1995,"Utili, Nicola"
61,Violin,"Nov 2, 1993","$5,115",Sotheby's,Castelbolognese,1995,"Utili, Nicola"


## Scrape V names
Execution time: 355.8200342655182 seconds

In [234]:
V_makers

Unnamed: 0,Instrument,SaleDate,SalePrice,AuctionHouse,City,MakerID,Maker
0,Violin,"May 4, 2008","$5,925",Skinner,,770,"Vaccari, Alberto"
1,Violin,"May 12, 2005","$7,475",Tarisio,Reggio Emilia,770,"Vaccari, Alberto"
2,Violin,"Nov 15, 1995","$3,046",Christie's,,770,"Vaccari, Alberto"
3,Violin,"May 12, 2017","$18,000",Tarisio,Reggio Emilia,771,"Vaccari, Raffaele"
4,Violin,"Nov 28, 2012","$22,800",Tarisio,,771,"Vaccari, Raffaele"
...,...,...,...,...,...,...,...
2901,Violin,"Jun 22, 1994","$6,693",Christie's,,2544,"Vuillaume (St. Cecile mark), Jean-Baptiste & N..."
2902,Violin,"Oct 11, 1973","$3,760",Sotheby's,,2544,"Vuillaume (St. Cecile mark), Jean-Baptiste & N..."
2903,Violin,"Jun 3, 1971","$1,500",Sotheby's,,2544,"Vuillaume (St. Cecile mark), Jean-Baptiste & N..."
2904,Violin,"May 13, 1971",$314,Sotheby's,,2544,"Vuillaume (St. Cecile mark), Jean-Baptiste & N..."


## Scrape W names
Execution time: 375.236661195755 seconds

In [239]:
W_makers

Unnamed: 0,Instrument,SaleDate,SalePrice,AuctionHouse,City,MakerID,Maker
0,Violin,"Apr 26, 2015","$1,599",Skinner,Faulenbach,14418,"Wachter, Anton"
1,Violin,"Nov 11, 2011","$5,700",Tarisio,Füssen,14418,"Wachter, Anton"
2,Viola,"Mar 6, 2012","$7,504",Sotheby's,,14421,"Wachter, Lorenz"
3,Violin,"Mar 11, 2009","$1,819",Bonhams,Leeds,1708,"Wade, Joseph"
4,Violin,"Oct 25, 2007","$1,872",Tarisio,Leeds,1708,"Wade, Joseph"
...,...,...,...,...,...,...,...
1796,Violin Bow,"Oct 17, 2004","$1,116",Skinner,,14992,"Wunderlich, Otto Felix"
1797,Cello Bow,"Oct 8, 1988",$385,Skinner,,14992,"Wunderlich, Otto Felix"
1798,Violin,"Oct 6, 2000","$4,500",Sotheby's,,856,"Wurlitzer, Firm"
1799,Violin Bow,"Nov 16, 1999",$650,Sotheby's,"New York, NY",856,"Wurlitzer, Firm"


## Scrape Y names
Execution time: 21.329189777374268 seconds


In [244]:
Y_makers

Unnamed: 0,Instrument,SaleDate,SalePrice,AuctionHouse,City,MakerID,Maker
0,Violin,"Feb 16, 2006",$690,Tarisio,,835,"Yarrington, B.A."
1,Violin Bow,"Jun 6, 2024","$7,882",,,1719,"Yeoman, Sydney Braithwaite"
2,Violin Bow,"Nov 17, 2023","$2,950",,,1719,"Yeoman, Sydney Braithwaite"
3,Violin Bow,"Jun 1, 2023","$6,123",,,1719,"Yeoman, Sydney Braithwaite"
4,Viola Bow,"Jun 5, 2023","$8,172",,,1719,"Yeoman, Sydney Braithwaite"
5,Violin Bow,"Dec 1, 2022","$4,029",,,1719,"Yeoman, Sydney Braithwaite"
6,Violin Bow,"Apr 26, 2022","$4,723",,,1719,"Yeoman, Sydney Braithwaite"
7,Cello Bow,"Mar 29, 2022","$3,782",,,1719,"Yeoman, Sydney Braithwaite"
8,Viola Bow,"Mar 24, 2020","$8,427",Tarisio,,1719,"Yeoman, Sydney Braithwaite"
9,Cello Bow,"Feb 28, 2020","$4,200",Tarisio,,1719,"Yeoman, Sydney Braithwaite"


## Scrape Z names
Execution time: 101.7685661315918 seconds

In [249]:
Z_makers

Unnamed: 0,Instrument,SaleDate,SalePrice,AuctionHouse,City,MakerID,Maker
0,Cello Bow,"May 31, 2024","$2,360",,,2843,"Zabinski, Roger"
1,Violin Bow,"Jun 9, 2023","$3,000",,,2843,"Zabinski, Roger"
2,Violin Bow,"Jun 10, 2022","$3,900",,,2843,"Zabinski, Roger"
3,Violin Bow,"Mar 18, 2022","$1,534",,,2843,"Zabinski, Roger"
4,Cello Bow,"Oct 24, 2015","$3,000",Tarisio,,2843,"Zabinski, Roger"
...,...,...,...,...,...,...,...
165,Violin,"Apr 27, 2012","$108,000",Tarisio,"Brooklyn, NY",844,"Zygmuntowicz, Samuel"
166,Violin,"Dec 14, 2009","$15,634",Brompton's,,844,"Zygmuntowicz, Samuel"
167,Violin,"Dec 14, 2009","$15,634",Brompton's,,844,"Zygmuntowicz, Samuel"
168,Violin,"May 8, 2003","$130,000",Tarisio,"Brooklyn, NY",844,"Zygmuntowicz, Samuel"


In [246]:
# Code for scraping a page of makers by last initial
# Be sure to: choose the alphabet initial to scrape at the top of the code
#             and rename the dataframe with this letter at the end of the code!

# Time the procedure
import time

start_time = time.time()  # Record the start time

# Step 1: Scrape the maker page for last names  

from bs4 import BeautifulSoup
import requests

headers = {'User-Agent':
           'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/128.0.0.0 Safari/537.36'
}

response = requests.get('https://tarisio.com/cozio-archive/browse-the-archive/makers/', headers = headers)
soup = BeautifulSoup(response.text, 'html.parser')

# Create empty lists to hold maker data
Maker_urls = []
Maker_data = []

# Step 2: Scrape the maker page for price page urls for each maker

# Set a Maker initial page to scrape
maker_initial = 'Z'

# Find the div with rel=maker_initial
makers_section = soup.find('div', {'rel': maker_initial})
makers_section

# Extract all maker links from within that section
maker_links = makers_section.find_all('a')
maker_links

# Loop to extract hrefs and urls for Makers
for maker in maker_links:

    # Initialize a dictionary to store our maker data
    maker_urls_dict = {}
    
    maker_name = maker.get_text().strip()
    maker_href = maker['href']
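    # Split on the known prefixes; str.strip() removes a set of characters, not a prefix
    maker_href_stripped = maker_href.split('maker/', 1)[-1]   # e.g. '?Maker_ID=2843'
    maker_ID = maker_href.split('Maker_ID=', 1)[-1]           # e.g. '2843'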
    maker_urls = f'https://tarisio.com/cozio-archive/browse-the-archive/makers/{maker_href}'
    maker_price_urls = f'https://tarisio.com/cozio-archive/price-history/{maker_href_stripped}'

    # Store maker data in the dictionary
    maker_urls_dict['Maker'] = maker_name
    maker_urls_dict['Price_URL'] = maker_price_urls
    maker_urls_dict['MakerID'] = maker_ID

    # Add dictionary to list
    Maker_urls.append(maker_urls_dict)

    # URL of this maker's price-history page
    url = maker_price_urls

    # Send a GET request to fetch that page's content
    response = requests.get(url)
    
    # Parse the content with BeautifulSoup
    soup = BeautifulSoup(response.content, 'html.parser')

    # Find all <tr> rows with valign='top'; each row holds one sale record
    rows = soup.find_all('tr', valign='top')  # Target the rows containing the relevant data
    
    # Loop through each row and extract the sale price, instrument type, sale date, auction house, and city

    for row in rows:

        # Initialize a dictionary to store instrument data
        maker_data_dict = {}
        
        # All <td> cells in this row; the positions used below are assumed from the page layout
        cells = row.find_all('td')

        # Instrument type (1st <td>)
        inst_type = cells[0].text.strip()
        print(f"Instrument type: {inst_type}")

        # Sale date (4th <td>)
        sale_date = cells[3].text.strip()
        print(f"Sale Date: {sale_date}")

        # Sale price (6th <td>)
        sale_price = cells[5].text.strip()
        print(f"Sale price: {sale_price}")

        # Auction house (3rd <td>)
        auction_house = cells[2].text.strip()
        print(f"Auction house: {auction_house}")

        # City (2nd <td>)
        city = cells[1].text.strip()
        print(f"City: {city}")

        # Store data in the dictionary
        maker_data_dict['Instrument'] = inst_type
        maker_data_dict['SaleDate'] = sale_date
        maker_data_dict['SalePrice'] = sale_price
        maker_data_dict['AuctionHouse'] = auction_house
        maker_data_dict['City'] = city
        maker_data_dict['MakerID'] = maker_ID
        print(f"MakerID: {maker_ID}")

        # Add the maker to the dictionary 
        # Find the <span> tag with class 'darkGrey' which contains the name
        name_tag = soup.find('span', class_='darkGrey')
            
        # Extract the text inside the <span> tag
        maker = name_tag.text.strip()
        print(f"Maker: {maker}")
        maker_data_dict['Maker'] = maker

        # Add dictionary to list
        Maker_data.append(maker_data_dict)

# Store makers in a dataframe
import pandas as pd

Z_makers = pd.DataFrame(Maker_data)

# Record the end time
end_time = time.time()  
            
# Calculate the execution time
execution_time = end_time - start_time  
print(f'Execution time: {execution_time} seconds')

Instrument type: Cello Bow
Sale Date: May 31, 2024
Sale price: $2,360
Auction house: 
City: 
MakerID: 2843
Maker: Zabinski, Roger
Instrument type: Violin Bow
Sale Date: Jun 9, 2023
Sale price: $3,000
Auction house: 
City: 
MakerID: 2843
Maker: Zabinski, Roger
Instrument type: Violin Bow
Sale Date: Jun 10, 2022
Sale price: $3,900
Auction house: 
City: 
MakerID: 2843
Maker: Zabinski, Roger
Instrument type: Violin Bow
Sale Date: Mar 18, 2022
Sale price: $1,534
Auction house: 
City: 
MakerID: 2843
Maker: Zabinski, Roger
Instrument type: Cello Bow
Sale Date: Oct 24, 2015
Sale price: $3,000
Auction house: Tarisio
City: 
MakerID: 2843
Maker: Zabinski, Roger
Instrument type: Violin Bow
Sale Date: May 9, 2014
Sale price: $3,300
Auction house: Tarisio
City: 
MakerID: 2843
Maker: Zabinski, Roger
Instrument type: Violin Bow
Sale Date: Nov 28, 2012
Sale price: $1,800
Auction house: Tarisio
City: 
MakerID: 2843
Maker: Zabinski, Roger
Instrument type: Violin Bow
Sale Date: May 6, 2011
Sale price: $2,
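
When pulling the ID out of a link such as `maker/?Maker_ID=2843`, note that `str.strip` treats its argument as a set of characters to trim from both ends, not as a prefix. It is therefore only safe when the ID cannot begin or end with one of those characters, which holds for the all-digit IDs here. The following is a minimal sketch of a more explicit alternative using the standard library's `urllib.parse`. The helper name `maker_id_from_href` is hypothetical, and the href shape is assumed from the links on the makers page.

```python
from urllib.parse import urlsplit, parse_qs

def maker_id_from_href(href):
    """Return the Maker_ID query parameter from a relative maker link."""
    query = urlsplit(href).query            # the part after '?'
    return parse_qs(query)['Maker_ID'][0]   # parse_qs maps each key to a list of values

href = 'maker/?Maker_ID=2919'
maker_id = maker_id_from_href(href)
print(maker_id)

# str.strip removes any of the listed characters from both ends,
# so a value that starts with one of them loses characters.
damaged = 'maker/?Maker_ID=abc'.strip('maker/?Maker_ID=')
print(repr(damaged))
```

The second print shows how a non-numeric ID would be silently shortened by `strip`, which is the case the query-string parser avoids.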

# Create makers dataframe
[Return to Table of Contents](#Contents) <br>

Combine the per-letter dataframes into a single dataframe, makers_df.

In [251]:
# Do a small test first
A_makers.shape

(1365, 7)

In [253]:
B_makers.shape

(5026, 7)

In [255]:
test_frames = [A_makers, B_makers]
test = pd.concat(test_frames, ignore_index=True)
test.shape

(6391, 7)

In [257]:
test

Unnamed: 0,Instrument,SaleDate,SalePrice,AuctionHouse,City,MakerID,Maker
0,Violin,"Feb 20, 2010","$4,200",Tarisio,Wallgau,2919,"Achner, Michael"
1,Violin,"Apr 27, 1987","$2,838",Bongartz's,Mittenwald,2919,"Achner, Michael"
2,Violin,"May 17, 2018","$24,000",Tarisio,Mittenwald,2611,"Achner, Philip"
3,Violin,"Nov 15, 2008","$3,884",Bongartz's,,2611,"Achner, Philip"
4,Violin,"Mar 27, 1990","$2,146",Sotheby's,Mittenwald,2611,"Achner, Philip"
...,...,...,...,...,...,...,...
6386,Violin Bow,"Mar 30, 1989",$371,Sotheby's,Liverpool,4684,"Byrom, George"
6387,Violin,"Dec 18, 1986",$299,Phillip's,Liverpool,4684,"Byrom, George"
6388,Violin,"Oct 15, 1998","$7,833",Phillip's,,1342,"Byrom, John"
6389,Violin Bow,"Jun 18, 1998",$692,Bonhams,,1342,"Byrom, John"


In [259]:
frames = [A_makers, B_makers, C_makers, D_makers, E_makers, F_makers, G_makers, 
          H_makers, I_makers, J_makers, K_makers, L_makers, M_makers, N_makers, 
          O_makers, P_makers, Q_makers, R_makers, S_makers, T_makers, U_makers, 
          V_makers, W_makers, Y_makers, Z_makers]
#frames = [A_makers, B_makers, C_makers]
makers_df = pd.concat(frames, ignore_index=True)
makers_df
makers_df.shape

(55283, 7)
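
The hand-written list of per-letter dataframes can also be built programmatically. The sketch below assumes the frames are kept in a dictionary keyed by initial. The name `frames_by_letter` and the tiny sample frames are hypothetical stand-ins, not the scraped data. A letter with no dataframe (the list above has no `X_makers`, for example) is simply skipped.

```python
import string
import pandas as pd

# Hypothetical stand-ins for the scraped per-letter dataframes.
frames_by_letter = {
    'A': pd.DataFrame({'Instrument': ['Violin'], 'SalePrice': ['$4,200']}),
    'B': pd.DataFrame({'Instrument': ['Cello', 'Viola'],
                       'SalePrice': ['$15,600', '$7,504']}),
}

# Concatenate in alphabetical order, skipping letters that have no dataframe.
frames = [frames_by_letter[letter]
          for letter in string.ascii_uppercase
          if letter in frames_by_letter]
combined = pd.concat(frames, ignore_index=True)
print(combined.shape)
```

Keeping the frames in a dictionary also removes the need to rename a variable by hand after each scraping run.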

In [261]:
makers_df

Unnamed: 0,Instrument,SaleDate,SalePrice,AuctionHouse,City,MakerID,Maker
0,Violin,"Feb 20, 2010","$4,200",Tarisio,Wallgau,2919,"Achner, Michael"
1,Violin,"Apr 27, 1987","$2,838",Bongartz's,Mittenwald,2919,"Achner, Michael"
2,Violin,"May 17, 2018","$24,000",Tarisio,Mittenwald,2611,"Achner, Philip"
3,Violin,"Nov 15, 2008","$3,884",Bongartz's,,2611,"Achner, Philip"
4,Violin,"Mar 27, 1990","$2,146",Sotheby's,Mittenwald,2611,"Achner, Philip"
...,...,...,...,...,...,...,...
55278,Violin,"Apr 27, 2012","$108,000",Tarisio,"Brooklyn, NY",844,"Zygmuntowicz, Samuel"
55279,Violin,"Dec 14, 2009","$15,634",Brompton's,,844,"Zygmuntowicz, Samuel"
55280,Violin,"Dec 14, 2009","$15,634",Brompton's,,844,"Zygmuntowicz, Samuel"
55281,Violin,"May 8, 2003","$130,000",Tarisio,"Brooklyn, NY",844,"Zygmuntowicz, Samuel"


In [263]:
# Save raw data to a .csv file

makers_df.to_csv('Tarisio2_raw.csv')

# Fix datatypes
[Return to Table of Contents](#Contents) <br>

### EDA with Vinod

In [265]:
makers_df.columns

Index(['Instrument', 'SaleDate', 'SalePrice', 'AuctionHouse', 'City',
       'MakerID', 'Maker'],
      dtype='object')

In [267]:
instr = makers_df.groupby(['Instrument']).count()
instr

Instrument,SaleDate,SalePrice,AuctionHouse,City,MakerID,Maker
Bass,74,74,74,74,74,74
Bass Bow,188,188,188,188,188,188
Bass Viol,13,13,13,13,13,13
Cello,2635,2635,2635,2635,2635,2635
Cello Bow,4205,4205,4205,4205,4205,4205
Miscellaneous,4,4,4,4,4,4
Small Violin,295,295,295,295,295,295
Tenor Viol,16,16,16,16,16,16
Treble Viol,12,12,12,12,12,12
Viola,2891,2891,2891,2891,2891,2891


In [269]:
summary = makers_df.groupby(['Instrument']).describe()
summary

,SaleDate,SaleDate,SaleDate,SaleDate,SalePrice,SalePrice,SalePrice,SalePrice,AuctionHouse,AuctionHouse,...,City,City,MakerID,MakerID,MakerID,MakerID,Maker,Maker,Maker,Maker
Instrument,count,unique,top,freq,count,unique,top,freq,count,unique,...,top,freq,count,unique,top,freq,count,unique,top,freq
Bass,74,61,"Dec 3, 2020",3,74,74,"$33,929",1,74,12,...,,33,74,46,861,7,74,46,"Prescott, Abraham",7
Bass Bow,188,121,"Dec 8, 2011",8,188,166,"$1,320",4,188,16,...,,117,188,62,1446,24,188,62,"Morizot, Louis Joseph père",24
Bass Viol,13,11,"Jul 26, 2024",2,13,13,"$7,800",1,13,6,...,,9,13,7,954,7,13,7,"Norman, Barak",7
Cello,2635,755,"Jun 15, 1989",19,2635,2266,"$30,000",10,2635,29,...,,696,2635,826,356,70,2635,825,"Thibouville-Lamy, Jérôme Firm",70
Cello Bow,4205,687,"Dec 3, 2020",38,4205,2697,"$3,600",23,4205,24,...,,2010,4205,398,1626,594,4205,398,"Hill & Sons, W. E. Firm",594
Miscellaneous,4,4,"Jun 2, 2022",1,4,4,"$3,852",1,4,2,...,,2,4,4,368,1,4,4,"Lauxerrois, Jean Paul",1
Small Violin,295,202,"Nov 22, 1984",6,295,270,$402,3,295,15,...,,79,295,127,356,42,295,127,"Thibouville-Lamy, Jérôme Firm",42
Tenor Viol,16,14,"Dec 1, 1985",2,16,16,"$1,702",1,16,6,...,Cremona,3,16,12,1781,3,16,12,"Galetti, Pierre Luigi",3
Treble Viol,12,9,"Jun 28, 2010",3,12,12,"$5,205",1,12,3,...,Pieve di Cento,2,12,11,108,2,12,11,"Carletti, Natale",2
Viola,2891,735,"Nov 24, 1988",20,2891,2086,"$6,600",15,2891,27,...,,687,2891,1006,141,40,2891,1005,"Craske, George",40


In [271]:
# instrument data frame to play with
dfi = makers_df[['Instrument','SalePrice']]
dfi

Unnamed: 0,Instrument,SalePrice
0,Violin,"$4,200"
1,Violin,"$2,838"
2,Violin,"$24,000"
3,Violin,"$3,884"
4,Violin,"$2,146"
...,...,...
55278,Violin,"$108,000"
55279,Violin,"$15,634"
55280,Violin,"$15,634"
55281,Violin,"$130,000"


In [273]:
dfi.groupby('Instrument').describe()

,SalePrice,SalePrice,SalePrice,SalePrice
Instrument,count,unique,top,freq
Bass,74,74,"$33,929",1
Bass Bow,188,166,"$1,320",4
Bass Viol,13,13,"$7,800",1
Cello,2635,2266,"$30,000",10
Cello Bow,4205,2697,"$3,600",23
Miscellaneous,4,4,"$3,852",1
Small Violin,295,270,$402,3
Tenor Viol,16,16,"$1,702",1
Treble Viol,12,12,"$5,205",1
Viola,2891,2086,"$6,600",15


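This doesn't make sense for prices: the summary shows count, unique, top and freq rather than numeric statistics such as mean and max.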

In [275]:
dfi.groupby('Instrument').max()

Instrument,SalePrice
Bass,"$9,761"
Bass Bow,$918
Bass Viol,"$7,800"
Cello,$999
Cello Bow,$999
Miscellaneous,"$31,665"
Small Violin,$984
Tenor Viol,$920
Treble Viol,"$9,945"
Viola,$999


SalePrice has dtype object (the prices are strings), not a numeric dtype, so max() compares them character by character and values such as $999 come out as the maximum.
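
The effect can be reproduced without pandas. The following is a minimal sketch using three price strings in the same format as the SalePrice column. The list is illustrative, not taken from a specific group in the data.

```python
prices = ['$4,200', '$999', '$130,000']

# Strings compare character by character, so '$9...' sorts after '$1...' and '$4...'.
string_max = max(prices)

# Stripping '$' and ',' and converting to int gives the numeric maximum.
numeric_max = max(int(p.replace('$', '').replace(',', '')) for p in prices)

print(string_max, numeric_max)
```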

In [277]:
dfi.query('Instrument == "Violin"')

Unnamed: 0,Instrument,SalePrice
0,Violin,"$4,200"
1,Violin,"$2,838"
2,Violin,"$24,000"
3,Violin,"$3,884"
4,Violin,"$2,146"
...,...,...
55278,Violin,"$108,000"
55279,Violin,"$15,634"
55280,Violin,"$15,634"
55281,Violin,"$130,000"


In [279]:
dfi.query('Instrument == "Violin"').max()

Instrument    Violin
SalePrice       $999
dtype: object

In [281]:
type(dfi.SalePrice)

pandas.core.series.Series

In [283]:
type(dfi[['SalePrice']])

pandas.core.frame.DataFrame

In [285]:
dfi.dtypes

Instrument    object
SalePrice     object
dtype: object

In [318]:
# Create a function to convert prices to the right data type
def convert_price(df):
    # Remove the literal '$' and ',' characters (regex=False), chaining each step
    price = df['SalePrice'].str.replace('$', '', regex=False)
    price = price.str.replace(',', '', regex=False)
    # Cast via float to pandas' nullable integer type
    df['price'] = price.astype('float').astype('Int64')
    return df

In [2]:
# Apply to dfi
#convert_price(dfi)

In [4]:
#dfi['Price'] = dfi['SalePrice'].str.replace('US\$', '', regex=True).astype(int)

In [293]:
dfi['Price'] = dfi['SalePrice'].str.replace('[^\d]', '', regex=True).astype(int)

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  dfi['Price'] = dfi['SalePrice'].str.replace('[^\d]', '', regex=True).astype(int)


This is a SettingWithCopyWarning rather than an error: the Price column is still created, as the next output shows.

In [295]:
dfi

Unnamed: 0,Instrument,SalePrice,price,Price
0,Violin,"$4,200",$4200,4200
1,Violin,"$2,838",$2838,2838
2,Violin,"$24,000",$24000,24000
3,Violin,"$3,884",$3884,3884
4,Violin,"$2,146",$2146,2146
...,...,...,...,...
55278,Violin,"$108,000",$108000,108000
55279,Violin,"$15,634",$15634,15634
55280,Violin,"$15,634",$15634,15634
55281,Violin,"$130,000",$130000,130000


The warning appears because dfi was created as a slice of makers_df and is being modified without first making a copy.
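
One way to avoid the warning is to copy the column subset at the moment it is created, so that pandas knows the new dataframe is independent of its source. The following is a minimal sketch under that assumption. `sample_df` and `subset` are hypothetical stand-ins for makers_df and dfi.

```python
import pandas as pd

# Hypothetical two-row stand-in for makers_df.
sample_df = pd.DataFrame({
    'Instrument': ['Violin', 'Violin'],
    'SalePrice': ['$4,200', '$130,000'],
    'Maker': ['Achner, Michael', 'Zygmuntowicz, Samuel'],
})

# .copy() makes the subset an independent dataframe, so adding a column is unambiguous.
subset = sample_df[['Instrument', 'SalePrice']].copy()

# r'\D' matches every non-digit character; removing them leaves only the digits.
subset['Price'] = subset['SalePrice'].str.replace(r'\D', '', regex=True).astype(int)

print(subset)
print(sample_df.columns.tolist())
```

The source dataframe is left untouched, which is usually what is wanted when the subset exists only for exploration.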

In [303]:
dfi_copy = dfi.copy()
dfi_copy['Price'] = dfi_copy['SalePrice'].str.replace(r'[^\d]', '', regex=True).astype(int)

In [305]:
dfi_copy

Unnamed: 0,Instrument,SalePrice,price,Price
0,Violin,"$4,200",$4200,4200
1,Violin,"$2,838",$2838,2838
2,Violin,"$24,000",$24000,24000
3,Violin,"$3,884",$3884,3884
4,Violin,"$2,146",$2146,2146
...,...,...,...,...
55278,Violin,"$108,000",$108000,108000
55279,Violin,"$15,634",$15634,15634
55280,Violin,"$15,634",$15634,15634
55281,Violin,"$130,000",$130000,130000


In [309]:
# Changes dfi in place
dfi.loc[:, 'Price'] = dfi['SalePrice'].str.replace(r'[^\d]', '', regex=True).astype(int)

In [311]:
dfi

Unnamed: 0,Instrument,SalePrice,price,Price
0,Violin,"$4,200",$4200,4200
1,Violin,"$2,838",$2838,2838
2,Violin,"$24,000",$24000,24000
3,Violin,"$3,884",$3884,3884
4,Violin,"$2,146",$2146,2146
...,...,...,...,...
55278,Violin,"$108,000",$108000,108000
55279,Violin,"$15,634",$15634,15634
55280,Violin,"$15,634",$15634,15634
55281,Violin,"$130,000",$130000,130000


In [320]:
dfi.groupby('Instrument').describe()

,Price,Price,Price,Price,Price,Price,Price,Price
Instrument,count,mean,std,min,25%,50%,75%,max
Bass,74.0,19846.256757,39342.902669,86.0,3982.75,7994.0,17254.5,252532.0
Bass Bow,188.0,2708.712766,5690.354608,72.0,774.5,1474.0,2857.75,71036.0
Bass Viol,13.0,32253.692308,56438.903451,20.0,6086.0,11318.0,30736.0,212500.0
Cello,2635.0,35397.424668,100422.869398,110.0,5100.0,12361.0,28616.5,2466386.0
Cello Bow,4205.0,6284.183591,13158.907224,42.0,1359.0,2853.0,6182.0,224648.0
Miscellaneous,4.0,19103.0,12103.047137,3852.0,12723.75,20447.5,26826.75,31665.0
Small Violin,295.0,6209.820339,18772.093939,29.0,412.5,868.0,3076.0,240000.0
Tenor Viol,16.0,7385.0625,12973.710872,680.0,1234.25,2057.0,4155.25,47368.0
Treble Viol,12.0,7381.333333,7056.05011,2280.0,3078.75,5765.0,8205.0,28066.0
Viola,2891.0,13418.816327,67311.050934,95.0,2191.0,4800.0,10334.0,2889958.0


This makes more sense. Now fix the date type.

In [322]:
makers_df

Unnamed: 0,Instrument,SaleDate,SalePrice,AuctionHouse,City,MakerID,Maker
0,Violin,"Feb 20, 2010","$4,200",Tarisio,Wallgau,2919,"Achner, Michael"
1,Violin,"Apr 27, 1987","$2,838",Bongartz's,Mittenwald,2919,"Achner, Michael"
2,Violin,"May 17, 2018","$24,000",Tarisio,Mittenwald,2611,"Achner, Philip"
3,Violin,"Nov 15, 2008","$3,884",Bongartz's,,2611,"Achner, Philip"
4,Violin,"Mar 27, 1990","$2,146",Sotheby's,Mittenwald,2611,"Achner, Philip"
...,...,...,...,...,...,...,...
55278,Violin,"Apr 27, 2012","$108,000",Tarisio,"Brooklyn, NY",844,"Zygmuntowicz, Samuel"
55279,Violin,"Dec 14, 2009","$15,634",Brompton's,,844,"Zygmuntowicz, Samuel"
55280,Violin,"Dec 14, 2009","$15,634",Brompton's,,844,"Zygmuntowicz, Samuel"
55281,Violin,"May 8, 2003","$130,000",Tarisio,"Brooklyn, NY",844,"Zygmuntowicz, Samuel"


In [324]:
df_date = makers_df[['Instrument','SaleDate']]

In [326]:
df_date

Unnamed: 0,Instrument,SaleDate
0,Violin,"Feb 20, 2010"
1,Violin,"Apr 27, 1987"
2,Violin,"May 17, 2018"
3,Violin,"Nov 15, 2008"
4,Violin,"Mar 27, 1990"
...,...,...
55278,Violin,"Apr 27, 2012"
55279,Violin,"Dec 14, 2009"
55280,Violin,"Dec 14, 2009"
55281,Violin,"May 8, 2003"


In [328]:
df_date.dtypes

Instrument    object
SaleDate      object
dtype: object

### Resources from Vinod: 
https://www.statology.org/how-to-parse-dates-from-text-in-python/ <br>
https://www.statology.org/pandas-to-datetime-format/ <br>
https://www.dataquest.io/blog/datetime-in-pandas/#:~:text=Now%2C%20the%20data%20type%20of,precision%20of%20the%20DateTime%20object.
<br><br>

Month name, DD, YYYY format: <br>
date_string2 = "July 26, 2023" <br>
date_object2 = datetime.strptime(date_string2, "%B %d, %Y") <br>
print(date_object2)

In [330]:
from datetime import datetime

date_string = 'Feb 20, 2010'
date_object = datetime.strptime(date_string, '%b %d, %Y').date()
print(date_object)

2010-02-20
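
The format string has to match the month style in the data: `%b` is the abbreviated month name and `%B` the full name. The following is a minimal sketch contrasting the two, assuming an English locale (month-name parsing in `strptime` is locale-dependent).

```python
from datetime import datetime

# %b matches an abbreviated month name ('Feb'); %B matches the full name ('July').
abbreviated = datetime.strptime('Feb 20, 2010', '%b %d, %Y').date()
full = datetime.strptime('July 26, 2023', '%B %d, %Y').date()
print(abbreviated, full)

# Using the wrong directive raises ValueError rather than guessing.
try:
    datetime.strptime('Feb 20, 2010', '%B %d, %Y')
    mismatch_raised = False
except ValueError:
    mismatch_raised = True
print(mismatch_raised)
```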


In [332]:
# Convert the date strings to Python date objects (.dt.date leaves the column dtype as object)
df_date['SaleDate'] = pd.to_datetime(df_date['SaleDate'], format='%b %d, %Y').dt.date

df_date

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df_date['SaleDate'] = pd.to_datetime(df_date['SaleDate'], format='%b %d, %Y').dt.date


Unnamed: 0,Instrument,SaleDate
0,Violin,2010-02-20
1,Violin,1987-04-27
2,Violin,2018-05-17
3,Violin,2008-11-15
4,Violin,1990-03-27
...,...,...
55278,Violin,2012-04-27
55279,Violin,2009-12-14
55280,Violin,2009-12-14
55281,Violin,2003-05-08


In [334]:
df_date.dtypes

Instrument    object
SaleDate      object
dtype: object
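
The dtype is still object because `.dt.date` turns each timestamp into a Python `datetime.date`, which pandas stores as generic objects. The following is a minimal sketch with two sample date strings in the SaleDate format. It shows that the result of `pd.to_datetime` has a datetime64 dtype until `.dt.date` is applied.

```python
import pandas as pd

dates = pd.Series(['Feb 20, 2010', 'Apr 27, 1987'])

# to_datetime alone yields a datetime64 column.
parsed = pd.to_datetime(dates, format='%b %d, %Y')

# .dt.date converts each value to datetime.date, and the dtype falls back to object.
as_python_dates = parsed.dt.date

print(parsed.dtype)
print(as_python_dates.dtype)
```

Keeping the datetime64 column preserves access to the `.dt` accessor (for example `parsed.dt.year`) and to date-aware summaries in `describe()`.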

In [336]:
makers_df

Unnamed: 0,Instrument,SaleDate,SalePrice,AuctionHouse,City,MakerID,Maker
0,Violin,"Feb 20, 2010","$4,200",Tarisio,Wallgau,2919,"Achner, Michael"
1,Violin,"Apr 27, 1987","$2,838",Bongartz's,Mittenwald,2919,"Achner, Michael"
2,Violin,"May 17, 2018","$24,000",Tarisio,Mittenwald,2611,"Achner, Philip"
3,Violin,"Nov 15, 2008","$3,884",Bongartz's,,2611,"Achner, Philip"
4,Violin,"Mar 27, 1990","$2,146",Sotheby's,Mittenwald,2611,"Achner, Philip"
...,...,...,...,...,...,...,...
55278,Violin,"Apr 27, 2012","$108,000",Tarisio,"Brooklyn, NY",844,"Zygmuntowicz, Samuel"
55279,Violin,"Dec 14, 2009","$15,634",Brompton's,,844,"Zygmuntowicz, Samuel"
55280,Violin,"Dec 14, 2009","$15,634",Brompton's,,844,"Zygmuntowicz, Samuel"
55281,Violin,"May 8, 2003","$130,000",Tarisio,"Brooklyn, NY",844,"Zygmuntowicz, Samuel"


## Fix dates and times 

Convert the prices and dates in the makers dataframe, makers_df, to proper datatypes. <br>
The resulting dataframe is makers_dt_df.

In [340]:
# Make a copy of the raw dataframe
makers_dt_df = makers_df.copy()

# Change the SalePrice to an integer
makers_dt_df['SalePrice'] = makers_dt_df['SalePrice'].str.replace(r'[^\d]', '', regex=True).astype(int)

In [342]:
makers_dt_df

Unnamed: 0,Instrument,SaleDate,SalePrice,AuctionHouse,City,MakerID,Maker
0,Violin,"Feb 20, 2010",4200,Tarisio,Wallgau,2919,"Achner, Michael"
1,Violin,"Apr 27, 1987",2838,Bongartz's,Mittenwald,2919,"Achner, Michael"
2,Violin,"May 17, 2018",24000,Tarisio,Mittenwald,2611,"Achner, Philip"
3,Violin,"Nov 15, 2008",3884,Bongartz's,,2611,"Achner, Philip"
4,Violin,"Mar 27, 1990",2146,Sotheby's,Mittenwald,2611,"Achner, Philip"
...,...,...,...,...,...,...,...
55278,Violin,"Apr 27, 2012",108000,Tarisio,"Brooklyn, NY",844,"Zygmuntowicz, Samuel"
55279,Violin,"Dec 14, 2009",15634,Brompton's,,844,"Zygmuntowicz, Samuel"
55280,Violin,"Dec 14, 2009",15634,Brompton's,,844,"Zygmuntowicz, Samuel"
55281,Violin,"May 8, 2003",130000,Tarisio,"Brooklyn, NY",844,"Zygmuntowicz, Samuel"


In [346]:
makers_dt_df.SalePrice.max()

15821285

In [348]:
# Convert the date strings to Python date objects (.dt.date leaves the column dtype as object)
makers_dt_df['SaleDate'] = pd.to_datetime(makers_dt_df['SaleDate'], format='%b %d, %Y').dt.date
makers_dt_df

Unnamed: 0,Instrument,SaleDate,SalePrice,AuctionHouse,City,MakerID,Maker
0,Violin,2010-02-20,4200,Tarisio,Wallgau,2919,"Achner, Michael"
1,Violin,1987-04-27,2838,Bongartz's,Mittenwald,2919,"Achner, Michael"
2,Violin,2018-05-17,24000,Tarisio,Mittenwald,2611,"Achner, Philip"
3,Violin,2008-11-15,3884,Bongartz's,,2611,"Achner, Philip"
4,Violin,1990-03-27,2146,Sotheby's,Mittenwald,2611,"Achner, Philip"
...,...,...,...,...,...,...,...
55278,Violin,2012-04-27,108000,Tarisio,"Brooklyn, NY",844,"Zygmuntowicz, Samuel"
55279,Violin,2009-12-14,15634,Brompton's,,844,"Zygmuntowicz, Samuel"
55280,Violin,2009-12-14,15634,Brompton's,,844,"Zygmuntowicz, Samuel"
55281,Violin,2003-05-08,130000,Tarisio,"Brooklyn, NY",844,"Zygmuntowicz, Samuel"


In [350]:
# Change column 'City' to 'AuctionCity'
makers_dt_df.rename(columns = {'City':'AuctionCity'}, inplace=True)
makers_dt_df

Unnamed: 0,Instrument,SaleDate,SalePrice,AuctionHouse,AuctionCity,MakerID,Maker
0,Violin,2010-02-20,4200,Tarisio,Wallgau,2919,"Achner, Michael"
1,Violin,1987-04-27,2838,Bongartz's,Mittenwald,2919,"Achner, Michael"
2,Violin,2018-05-17,24000,Tarisio,Mittenwald,2611,"Achner, Philip"
3,Violin,2008-11-15,3884,Bongartz's,,2611,"Achner, Philip"
4,Violin,1990-03-27,2146,Sotheby's,Mittenwald,2611,"Achner, Philip"
...,...,...,...,...,...,...,...
55278,Violin,2012-04-27,108000,Tarisio,"Brooklyn, NY",844,"Zygmuntowicz, Samuel"
55279,Violin,2009-12-14,15634,Brompton's,,844,"Zygmuntowicz, Samuel"
55280,Violin,2009-12-14,15634,Brompton's,,844,"Zygmuntowicz, Samuel"
55281,Violin,2003-05-08,130000,Tarisio,"Brooklyn, NY",844,"Zygmuntowicz, Samuel"


In [352]:
makers_dt_df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 55283 entries, 0 to 55282
Data columns (total 7 columns):
 #   Column        Non-Null Count  Dtype 
---  ------        --------------  ----- 
 0   Instrument    55283 non-null  object
 1   SaleDate      55283 non-null  object
 2   SalePrice     55283 non-null  int64 
 3   AuctionHouse  55283 non-null  object
 4   AuctionCity   55283 non-null  object
 5   MakerID       55283 non-null  object
 6   Maker         55283 non-null  object
dtypes: int64(1), object(6)
memory usage: 3.0+ MB


In [360]:
# Convert SaleDate to a datetime object
makers_dt_df['SaleDate'] = pd.to_datetime(makers_dt_df['SaleDate'])
makers_dt_df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 55283 entries, 0 to 55282
Data columns (total 7 columns):
 #   Column        Non-Null Count  Dtype         
---  ------        --------------  -----         
 0   Instrument    55283 non-null  object        
 1   SaleDate      55283 non-null  datetime64[ns]
 2   SalePrice     55283 non-null  int64         
 3   AuctionHouse  55283 non-null  object        
 4   AuctionCity   55283 non-null  object        
 5   MakerID       55283 non-null  object        
 6   Maker         55283 non-null  object        
dtypes: datetime64[ns](1), int64(1), object(5)
memory usage: 3.0+ MB


In [362]:
print(type(makers_dt_df.SaleDate))

<class 'pandas.core.series.Series'>


In [364]:
makers_dt_df.describe()

Unnamed: 0,SaleDate,SalePrice
count,55283,55283.0
mean,2002-09-03 05:56:51.283757952,14882.76
min,1829-05-03 00:00:00,0.0
25%,1991-11-11 00:00:00,1507.0
50%,2003-10-19 00:00:00,3577.0
75%,2012-06-25 00:00:00,9216.5
max,2024-10-08 00:00:00,15821280.0
std,,122954.7


In [366]:
makers_dt_df.head()

Unnamed: 0,Instrument,SaleDate,SalePrice,AuctionHouse,AuctionCity,MakerID,Maker
0,Violin,2010-02-20,4200,Tarisio,Wallgau,2919,"Achner, Michael"
1,Violin,1987-04-27,2838,Bongartz's,Mittenwald,2919,"Achner, Michael"
2,Violin,2018-05-17,24000,Tarisio,Mittenwald,2611,"Achner, Philip"
3,Violin,2008-11-15,3884,Bongartz's,,2611,"Achner, Philip"
4,Violin,1990-03-27,2146,Sotheby's,Mittenwald,2611,"Achner, Philip"


In [368]:
# Save dataframe with formatted datatypes to a .csv file

makers_dt_df.to_csv('Tarisio2.csv')
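
A CSV file stores only text, so the datetime dtype is not preserved on disk and has to be re-applied when the file is read back. The following is a minimal sketch of a round trip through an in-memory buffer. `typed` is a hypothetical stand-in for makers_dt_df. Passing `index=False` is an optional choice that keeps the row index from coming back as an extra `Unnamed: 0` column.

```python
import io
import pandas as pd

# Hypothetical two-row stand-in for makers_dt_df.
typed = pd.DataFrame({
    'Instrument': ['Violin', 'Violin'],
    'SaleDate': pd.to_datetime(['2010-02-20', '1987-04-27']),
    'SalePrice': [4200, 2838],
})

buffer = io.StringIO()               # in-memory stand-in for a .csv file
typed.to_csv(buffer, index=False)    # do not write the row index
buffer.seek(0)

# parse_dates restores the datetime dtype that plain CSV text cannot carry.
restored = pd.read_csv(buffer, parse_dates=['SaleDate'])
print(restored.dtypes)
```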

In [381]:
makers_dt_df[makers_dt_df['Maker'] == 'Schindler, Gustav']

Unnamed: 0,Instrument,SaleDate,SalePrice,AuctionHouse,AuctionCity,MakerID,Maker
44117,Violin Bow,2002-10-13,823,Skinner,,12492,"Schindler, Gustav"
44118,Violin Bow,2000-11-05,862,Skinner,,12492,"Schindler, Gustav"
44119,Violin Bow,1992-05-11,201,Bongartz's,Markneukirchen,12492,"Schindler, Gustav"
44120,Violin Bow,1987-12-01,293,Ader Tajan,,12492,"Schindler, Gustav"


[Return to Table of Contents](#Contents) <br>