## Warmup

Let's practice on a sample file that lives at `../practice-table.html`. We don't need the `requests` library because we'll be operating on a local file. Let's pretend we downloaded this page and are working on our local copy. (Which, again, is a good idea.)

The Mountain Goats [released a new album last month](https://themountaingoats.bandcamp.com/album/goths) (it is good, you should buy it); the HTML we're going to operate on is just a `<table>` showing the track listing.

Let's start by importing the `BeautifulSoup` class from the `bs4` package.

In [None]:
from bs4 import BeautifulSoup

Next, we're going to open the HTML file with our practice table in it. We're going to use something called a `with` block to open it, and then we'll use the `read()` method to read its contents into memory. Note:

- We are opening it in `r` mode, which means "read" (as opposed to "write," which we'll do later)
- We are defining a variable for the file object using `as`
- The lines under the `with` block are indented

In [None]:
with open('../practice-table.html', 'r') as html_file:
    html_code = html_file.read()
    print(html_code)
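
To see what the `with` block buys us, here's a minimal sketch. It assumes nothing about the practice file: it writes a tiny stand-in file (the name `sample.html` and its contents are made up) to a temporary folder, reads it back, and checks whether the file is still open.

```python
import os
import tempfile

# Write a tiny stand-in HTML file so this sketch runs anywhere
# (the file name and contents are made up for illustration)
sample_path = os.path.join(tempfile.mkdtemp(), 'sample.html')
with open(sample_path, 'w') as out_file:
    out_file.write('<p>hello</p>')

with open(sample_path, 'r') as html_file:
    html_code = html_file.read()
    print(html_file.closed)  # False: we're still inside the block

# Once the indented block ends, Python closes the file for us,
# but the string we read into `html_code` sticks around
print(html_file.closed)  # True
print(html_code)         # <p>hello</p>
```

So the indentation isn't just cosmetic: it marks the stretch of code during which the file is open.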

Now let's feed the file contents to a BeautifulSoup object and assign the result to the variable `soup`. You might get a warning unless you also pass `'html.parser'` as the second argument, which tells BeautifulSoup which parser to use. Now print `type(soup)`.

In [None]:
with open('../practice-table.html', 'r') as html_file:
    html_code = html_file.read()
    soup = BeautifulSoup(html_code, 'html.parser')
    print(type(soup))
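
BeautifulSoup doesn't care whether the HTML came from a file or not; it just wants a string. Here's a small sketch using a made-up snippet of HTML (not the practice file) to show the string turning into something we can query.

```python
from bs4 import BeautifulSoup

# A made-up snippet standing in for the contents of a file
html_code = '<html><body><h1>Track listing</h1><p>Twelve songs</p></body></html>'

soup = BeautifulSoup(html_code, 'html.parser')

# The plain string is now a BeautifulSoup object ...
print(isinstance(soup, BeautifulSoup))  # True

# ... and we can reach into it by tag name
print(soup.h1.text)         # Track listing
print(soup.find('p').text)  # Twelve songs
```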

Cool. We're locked and loaded. Our string of HTML is now a tree that we can climb through to find the things we want.

There are several ways to isolate the table we want using the `find` or `find_all` methods: by class, by ID, by position on the page, by style. (There are others, remember.) Let's try this:

In [None]:
with open('../practice-table.html', 'r') as html_file:
    html_code = html_file.read()
    soup = BeautifulSoup(html_code, 'html.parser')
    
    # by position on the page
    # find_all returns a list of matching elements, and we want the second ([1]) one
    # song_table = soup.find_all('table')[1]
    
    # by class name
    # => with `find`, you can pass in a dictionary of element attributes to match on
    # song_table = soup.find('table', {'class': 'song-table'})
    
    # by ID
    # song_table = soup.find('table', {'id': 'my-cool-table'})
    
    # by style
    song_table = soup.find('table', {'style': 'width: 95%;'})
    
    print(song_table)
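
Here's a self-contained sketch of those four approaches side by side. The HTML below is hypothetical: it borrows the class, ID and style values from the cell above, but the real practice file may be laid out differently. It also uses the keyword-argument shortcuts `class_` and `id`, which do the same job as the dictionary.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML: two tables, the second one carrying the attributes
# used in the cell above. The real practice file may differ.
html_code = """
<table><tr><td>some other table</td></tr></table>
<table class="song-table" id="my-cool-table" style="width: 95%;">
  <tr><th>Track</th><th>Title</th></tr>
  <tr><td>1</td><td>First Song</td></tr>
</table>
"""
soup = BeautifulSoup(html_code, 'html.parser')

by_position = soup.find_all('table')[1]
# `class_` has a trailing underscore because `class` is a reserved word in Python
by_class = soup.find('table', class_='song-table')
by_id = soup.find('table', id='my-cool-table')
by_style = soup.find('table', {'style': 'width: 95%;'})

# All four lookups land on the same element
print(by_position is by_class is by_id is by_style)  # True

# The style match is an exact string comparison, so a small
# difference in spacing means no match, and `find` returns None
print(soup.find('table', {'style': 'width:95%'}))  # None
```

That last line is a reason to prefer a class or an ID when the page offers one: matching on position or on an inline style tends to break as soon as someone tweaks the page.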

We've targeted the correct table. Now what if we wanted to print a list of track numbers and song titles? Look at the structure of the table -- a `table` has rows represented by the tag `tr`, and within each row there are cells represented by `td`. The `find_all()` method, you'll recall, returns a _list_. And we know how to iterate over lists: with a `for` loop.

In [None]:
with open('../practice-table.html', 'r') as html_file:
    html_code = html_file.read()
    soup = BeautifulSoup(html_code, 'html.parser')

    song_table = soup.find('table', {'class': 'song-table'})
    
    table_rows = song_table.find_all('tr')
    
    # let's skip the header row
    # more on list slicing: http://pythoncentral.io/how-to-slice-listsarrays-and-tuples-in-python/
    for row in table_rows[1:]:
        # get a list of cells in the row
        cols = row.find_all('td')
        
        # the track number is in the first ([0]) "column"
        # the `.text` attribute gets the text contents of a BeautifulSoup Tag object
        track_number = cols[0].text
        
        # the song title is in the second ([1]) "column"
        song_title = cols[1].text

        print(track_number + '.', song_title)
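
Printing is fine for a warmup, but often we'll want to hang on to the data. Here's one way that might look, again on a hypothetical table with placeholder song titles rather than the real practice file. It collects each row into a dictionary and uses `get_text(strip=True)` to trim stray whitespace around the cell contents.

```python
from bs4 import BeautifulSoup

# Hypothetical table with the same shape as the one described above
html_code = """
<table class="song-table">
  <tr><th>Track</th><th>Title</th></tr>
  <tr><td> 1 </td><td> First Song </td></tr>
  <tr><td> 2 </td><td> Second Song </td></tr>
</table>
"""
soup = BeautifulSoup(html_code, 'html.parser')
song_table = soup.find('table', class_='song-table')

tracks = []
for row in song_table.find_all('tr')[1:]:  # [1:] skips the header row
    cols = row.find_all('td')
    tracks.append({
        # strip=True trims the whitespace around the cell text
        'number': int(cols[0].get_text(strip=True)),
        'title': cols[1].get_text(strip=True),
    })

print(tracks)
# [{'number': 1, 'title': 'First Song'}, {'number': 2, 'title': 'Second Song'}]
```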
