## Modules

In [1]:
from bs4 import BeautifulSoup
import requests
import pandas as pd

## Variables

In [2]:
stingray_url = 'https://www.music-man.com/instruments/'
basses_url = 'https://www.music-man.com/instruments/basses/'

## Get data

### Make the call

In [3]:
response = requests.get(basses_url)
content = BeautifulSoup(response.content,'lxml')

### Obtain the products

In [4]:
products = content.find('section', {'class': 'guitar-bg'}).find_all('div', {'class': 'guitar-imgs text-center'})

For each product, get the name and the relative URL (`href`)

In [5]:
products_dict = {}

for product in products:
    
    product = product.find('a', {'class': 'the-steve'})
    products_dict[product.text] = [product.attrs['href']]

Convert to a DataFrame to manipulate it more easily

In [6]:
products_df = pd.DataFrame(products_dict).transpose()
products_df.columns = ['href']
products_df.index = products_df.reset_index()['index'].apply(lambda x : x.split('\n')[1])
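
The index cleanup above relies on the product name sitting on the second line of the anchor text (`split('\n')[1]`). As a hedged alternative, the sketch below takes the first non-empty line instead, which gives the same result when the text starts with a newline. The helper name `clean_name` and the sample string are assumptions for illustration, not part of the original notebook.

```python
def clean_name(raw_text):
    """Return the first non-empty line of a tag's text, stripped (hypothetical helper)."""
    for line in raw_text.splitlines():
        if line.strip():
            return line.strip()
    return ''  # the text held nothing but whitespace


# Illustrative input; the real anchor text may be laid out differently.
print(clean_name('\nShort Scale StingRay\n'))  # Short Scale StingRay
```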

At the moment, we only have the relative URL of each instrument

In [7]:
products_df.head()

index,href
StingRay 5 35th Anniversary,basses/stingray-5-35th-anniversary
StingRay Special Collection,families/basses/stingray-special
Short Scale StingRay,basses/short-scale-stingray
DarkRay,families/basses/darkray
Joe Dart Collection,families/basses/joe-dart
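
The hrefs in this table are relative, so each one has to be combined with a base URL before it can be requested. A minimal sketch of that step using `urllib.parse.urljoin` from the standard library; `base_url` holds the same value as `stingray_url` above, and the helper name `absolute_url` is an assumption made for this example.

```python
from urllib.parse import urljoin

# Same value as `stingray_url` in the Variables section.
base_url = 'https://www.music-man.com/instruments/'


def absolute_url(href, base=base_url):
    """Resolve a relative href against the base URL (hypothetical helper)."""
    return urljoin(base, href)


print(absolute_url('basses/short-scale-stingray'))
print(absolute_url('families/basses/darkray'))
```

For these hrefs the result matches plain string concatenation; `urljoin` additionally handles hrefs that begin with `/` or that are already absolute.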


### Obtain info per product

In [8]:
for name, url in zip(products_df['href'].index, products_df['href']):
    
    print(f'Getting info for {name} ...')
    
    # make the call to the url
    response = requests.get(stingray_url + url)
    content = BeautifulSoup(response.content,'lxml')
    
    # Some pages will fail here because they do not describe an individual bass; instead they
    # link to further pages with more basses. Exploring those pages is a possible later improvement.
    try:
        products_df.loc[name, 'Description'] = content.find('p', {'class': 'instrument-desc'}).text
    except AttributeError:
        continue
    
    # Get the color
    products_df.loc[name, 'Color'] = content.find('div', {'class': 'color-set row active'}).text
    
    # There are many specs contained in a table. Assign a column to each
    columns = content.find_all('td', {'class': 'text-right'})
    columns = [aux.text for aux in columns]
    # Some of the later rows are not specs but comparisons with other basses.
    # Drop these by keeping everything up to and including 'Strings'.
    where_to_end = columns.index('Strings') + 1
    columns = columns[:where_to_end]

    # We get the values for each of the specs
    values = content.find_all('td', {'class': 'text-left'})
    values = [aux.text for aux in values]
    values = values[:where_to_end]
    
    # And assign them to the DF
    for column, value in zip(columns, values):
        products_df.loc[name, column] = value

Getting info for StingRay 5 35th Anniversary ...
Getting info for StingRay Special Collection ...
Getting info for Short Scale StingRay ...
Getting info for DarkRay ...
Getting info for Joe Dart Collection ...
Getting info for Tim Commerford ...
Getting info for Bongo Collection ...
Getting info for John Myung ...
Getting info for Mike Herrera ...
Getting info for Sterling Collection ...
Getting info for Cliff Williams ...
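
Inside the loop, the spec labels are cut off at `Strings`, and the lookup raises `ValueError` on any page whose table lacks that row. The sketch below shows one possible way to pair labels with values while tolerating a missing `Strings` row. The helper name `specs_up_to` and the sample lists are assumptions for illustration; the data is not scraped output.

```python
def specs_up_to(labels, values, last_label='Strings'):
    """Pair spec labels with values, stopping after `last_label` (hypothetical helper).

    If `last_label` is absent, every label/value pair is kept instead of raising.
    """
    end = labels.index(last_label) + 1 if last_label in labels else len(labels)
    return dict(zip(labels[:end], values[:end]))


# Illustrative data; 'Compare' stands in for a non-spec row after 'Strings'.
labels = ['Model', 'Body Wood', 'Strings', 'Compare']
values = ['Short Scale StingRay Bass', 'Ash', '45w-65w-85w-105w', 'n/a']

print(specs_up_to(labels, values))
# {'Model': 'Short Scale StingRay Bass', 'Body Wood': 'Ash', 'Strings': '45w-65w-85w-105w'}
```

Returning a dict keeps each label attached to its value, so the pairs could be written to `products_df` in a single pass.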


We can see that some URLs did not return any data. These are the ones with `families/` in the href: each of those pages lists further links to individual basses instead of describing a single instrument. To cover them, we could follow every sublink on each family page and collect the data for each bass.
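
As a first step toward that improvement, the hrefs can be separated into family pages and individual product pages. This is a small sketch under the assumption that the `families/` prefix is a reliable marker; the hrefs are copied from the `products_df.head()` output earlier, and `is_family_page` is a hypothetical helper.

```python
# hrefs copied from the products_df.head() output shown earlier
hrefs = {
    'StingRay 5 35th Anniversary': 'basses/stingray-5-35th-anniversary',
    'StingRay Special Collection': 'families/basses/stingray-special',
    'Short Scale StingRay': 'basses/short-scale-stingray',
    'DarkRay': 'families/basses/darkray',
    'Joe Dart Collection': 'families/basses/joe-dart',
}


def is_family_page(href):
    """True when the href points at a family page (assumed marker: 'families/' prefix)."""
    return href.startswith('families/')


family_pages = {name: href for name, href in hrefs.items() if is_family_page(href)}
product_pages = {name: href for name, href in hrefs.items() if not is_family_page(href)}

print(sorted(family_pages))   # pages whose sublinks still need to be explored
print(sorted(product_pages))  # pages that can be scraped directly
```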

In [9]:
products_df

index,href,Description,Color,Model,Size,Body Wood,Body Finish,Body Colors,Bridge,Scale Length,...,Electronic Shielding,Controls,Switching,Pickups,Left Handed,Strings,Pickguard,Order,Neck Binding,Body Bindings
StingRay 5 35th Anniversary,basses/stingray-5-35th-anniversary,\nIn 1987 Ernie Ball Music Man created the leg...,\n\n\n\nSpalted Sunburst\n\n\n\n,StingRay 5 35th Anniversary,"13-3/8"" wide, 1-3/4"" thick, 45-3/4"" long (34.0...",Ash,High gloss polyester,Spalted Sunburst,"Standard - Music Man® Black plated, hardened s...","34"" (86.4 cm)",...,Graphite acrylic resin coated body cavity and ...,3-band active preamp with 18 volts of headroom...,H: 3-way lever pickup selector; HH: 5-way leve...,Single or Dual Humbucking with Neodymium Magnets,No,46w-65w-80w-100w-130w (Regular Slinky Bass #2836),,,,
StingRay Special Collection,families/basses/stingray-special,,,,,,,,,,...,,,,,,,,,,
Short Scale StingRay,basses/short-scale-stingray,\nThe Ernie Ball Music Man Short Scale StingRa...,\n\n\n\nCandy Man\n\n\n\n\n\n\nBurnt Ends\n\n\...,Short Scale StingRay Bass,"12-3/8"" wide, 1-5/8"" thick, 40-7/8"" long (31.4...",Ash,High gloss polyester,,"Vintage Music Man® top loaded chrome plated, s...","30"" (76.2 cm)",...,Chrome plated aluminum control cover,Passive 500kohm push/push volume POT for gain ...,3-way rotary pickup selector,Single Music Man humbucking pickup- neodymium ...,No,45w-65w-85w-105w (Short Scale Regular Slinky B...,,,,
DarkRay,families/basses/darkray,,,,,,,,,,...,,,,,,,,,,
Joe Dart Collection,families/basses/joe-dart,,,,,,,,,,...,,,,,,,,,,
Tim Commerford,basses/tim-commerford,\nWelcome to the Tim Commerford Artist Series ...,\n\n\n\nBlack Full-Scale (Active)\n\n\n\n\n\n\...,Tim Commerford Active Short-Scale,"12-3/8"" wide, 1-5/8"" thick, 40-7/8"" long (31.4...",Ash,High gloss polyester,Natural Gloss,"Music Man® chrome plated, strings-thru-the-bod...","30"" (76.2 cm)",...,Chrome plated aluminum control cover,3-band active preamp with 18 volts of headroom...,,Single Humbucking with Neodymium magnets,,45w-65w-85w-105w (Medium Scale Regular Slinky ...,Black,,,
Bongo Collection,families/basses/bongo,,,,,,,,,,...,,,,,,,,,,
John Myung,basses/john-myung,\nThe John Myung artist series Bongo is a slee...,\n\n\n\nPlatinum Silver\n\n\n\n\n\n\nBlack\n\n...,John Myung Bongo 6 HH,"12-3/4"" wide, 1-5/8"" thick, 47-5/8"" long (32.4...",Basswood,High gloss polyester,Black,"Music Man® chrome plated, steel bridge plate w...","34"" (86.4 cm)",...,Graphite acrylic resin coated body cavity and ...,Volume,5-way pickup blend knob,Dual Humbucking with Neodymium magnets,No,32w-45w-65w-80w-100w-130w (Cobalt Bass),Black,1.0,,
Mike Herrera,basses/mike-herrera,\nIntroducing the Ernie Ball Music Man Mike He...,\n\n\n\nSeafoam Green\n\n\n\n,Mike Herrera StingRay Bass,"13-1/2"" wide x 1-5/8"" thick x 44-7/8"" long (34...",Ash,High gloss polyester,Seafoam Green,"Music Man® Chrome plated, hardened steel bridg...","34"" (86.4 cm)",...,,"3 dummy knobs, pickup wired directly to jack",,Music Man® humbucking with Alnico magnets,no,50w-70w-85w-105w (Regular Slinky Bass #2832),"Engraved White Pearloid, 3-ply with black cent...",,,
Sterling Collection,families/basses/sterling,,,,,,,,,,...,,,,,,,,,,
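
The `Color` column still holds the raw tag text, with color names separated by runs of newlines. A minimal sketch of one way to turn such a string into a list of names; the helper name `split_colors` is an assumption, and the sample string only imitates the pattern visible in the table, where the real value is truncated.

```python
def split_colors(raw_text):
    """Split newline-padded text into a list of non-empty, stripped names (hypothetical helper)."""
    return [line.strip() for line in raw_text.splitlines() if line.strip()]


# Illustrative string shaped like the values in the Color column.
raw = '\n\n\n\nCandy Man\n\n\n\n\n\n\nBurnt Ends\n\n\n\n'

print(split_colors(raw))  # ['Candy Man', 'Burnt Ends']
```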


In [15]:
products_df.to_csv("/Users/cross/OneDrive/Documentos/GitHub/project-data-extraction/bass_prods.csv")