# Beautiful Soup Selectors

*(N.B. Hacker News now uses the `.titleline` class instead of the `.storylink` class, so whenever you see me use `.storylink` in the next video, enter `.titleline` instead.)*

One of the best ways to use the soup object is to use the `select` method on it:


In [None]:
import requests
from bs4 import BeautifulSoup

res = requests.get('https://news.ycombinator.com/')
soup = BeautifulSoup(res.text, 'html.parser')
# select() takes a CSS selector string, so an id is written with a leading '#'
print(soup.select('#score_33256446'))

This allows us to grab a piece of data, __using a CSS selector__, from the soup object that we just downloaded and created. The selector can target that data by class, by id, or by element name:

- .class
- #id
- element
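
To make the three selector forms concrete, here is a minimal sketch that runs against a small hand-written HTML string rather than the live page. The markup, URLs, and ids (`score_1`, `score_2`) are made up for illustration and only loosely resemble Hacker News.

```python
from bs4 import BeautifulSoup

# Hypothetical markup (an assumption for this example, not the real page)
html = """
<div>
  <span class="titleline"><a href="https://example.com/a">First story</a></span>
  <span class="score" id="score_1">10 points</span>
  <span class="titleline"><a href="https://example.com/b">Second story</a></span>
  <span class="score" id="score_2">25 points</span>
</div>
"""
soup = BeautifulSoup(html, 'html.parser')

by_class = soup.select('.score')    # .class  -> every element with class="score"
by_id = soup.select('#score_2')     # #id     -> the element with id="score_2"
by_element = soup.select('a')       # element -> every <a> tag

# select() always returns a list, even when only one element matches
print(len(by_class), len(by_id), len(by_element))
```

Because nothing is downloaded, the result stays the same on every run, which makes it easier to experiment with selectors before pointing them at the real page.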

The code below selects all of the `<a>` tags:

In [None]:
import requests
from bs4 import BeautifulSoup

res = requests.get('https://news.ycombinator.com/')
soup = BeautifulSoup(res.text, 'html.parser')
print(soup.select('a'))

But maybe we want to select by class instead. With the dot notation, `.score` matches every element with `class="score"`, which grabs all of the scores on the page:

In [2]:
import requests
from bs4 import BeautifulSoup

res = requests.get('https://news.ycombinator.com/')
soup = BeautifulSoup(res.text, 'html.parser')
print(soup.select('.score'))

[<span class="score" id="score_33261846">106 points</span>, <span class="score" id="score_33261125">76 points</span>, <span class="score" id="score_33261766">18 points</span>, <span class="score" id="score_33261288">21 points</span>, <span class="score" id="score_33261399">14 points</span>, <span class="score" id="score_33228398">52 points</span>, <span class="score" id="score_33260525">238 points</span>, <span class="score" id="score_33251954">321 points</span>, <span class="score" id="score_33256446">272 points</span>, <span class="score" id="score_33260380">15 points</span>, <span class="score" id="score_33249215">2606 points</span>, <span class="score" id="score_33257197">138 points</span>, <span class="score" id="score_33259351">113 points</span>, <span class="score" id="score_33259835">23 points</span>, <span class="score" id="score_33244819">110 points</span>, <span class="score" id="score_33256378">396 points</span>, <span class="score" id="score_33228387">90 points</span>, <sp

It's just a list of all the spans, because each span has the `score` class. We can do the same with an `id`, using the `#` notation:

In [4]:
import requests
from bs4 import BeautifulSoup

res = requests.get('https://news.ycombinator.com/')
soup = BeautifulSoup(res.text, 'html.parser')
print(soup.select('#score_33256446'))

[<span class="score" id="score_33256446">273 points</span>]
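
Notice that `select` returns a list even when an id matches a single element. As a small aside, Beautiful Soup also provides `select_one`, which returns the first match directly (or `None` if nothing matches). The sketch below uses a made-up one-line HTML string and hypothetical ids.

```python
from bs4 import BeautifulSoup

# Hypothetical markup used only for this example
html = '<span class="score" id="score_1">10 points</span>'
soup = BeautifulSoup(html, 'html.parser')

as_list = soup.select('#score_1')         # a list containing one tag
single = soup.select_one('#score_1')      # the tag itself, no indexing needed
missing = soup.select_one('#score_999')   # None, because no element has this id

print(as_list)
print(single)
print(missing)
```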


What we want to do is grab all elements with the class `.titleline`. We get a long list, and each item is a `<span>` that wraps the `<a>` tag for a story:

In [6]:
import requests
from bs4 import BeautifulSoup

res = requests.get('https://news.ycombinator.com/')
soup = BeautifulSoup(res.text, 'html.parser')
print(soup.select('.titleline'))

[<span class="titleline"><a href="https://brave.com/privacy-updates/21-blocking-cookie-notices/">Brave browser now blocks cookie banners</a><span class="sitebit comhead"> (<a href="from?site=brave.com"><span class="sitestr">brave.com</span></a>)</span></span>, <span class="titleline"><a href="https://wix-ux.com/when-life-gives-you-lemons-write-better-error-messages-46c5223e1a2f">Write Better Error Messages</a><span class="sitebit comhead"> (<a href="from?site=wix-ux.com"><span class="sitestr">wix-ux.com</span></a>)</span></span>, <span class="titleline"><a href="https://ctan.org/pkg/wargame">Wargame: LaTeX package to prepare hex'n'counter wargames</a><span class="sitebit comhead"> (<a href="from?site=ctan.org"><span class="sitestr">ctan.org</span></a>)</span></span>, <span class="titleline"><a href="https://lwn.net/SubscriberLink/910766/7678f8c4ede60928/">Identity management for WireGuard networks</a><span class="sitebit comhead"> (<a href="from?site=lwn.net"><span class="sitestr">lwn.

We can grab the first element using `[0]`:

In [7]:
import requests
from bs4 import BeautifulSoup

res = requests.get('https://news.ycombinator.com/')
soup = BeautifulSoup(res.text, 'html.parser')
print(soup.select('.titleline')[0])

<span class="titleline"><a href="https://brave.com/privacy-updates/21-blocking-cookie-notices/">Brave browser now blocks cookie banners</a><span class="sitebit comhead"> (<a href="from?site=brave.com"><span class="sitestr">brave.com</span></a>)</span></span>
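
The output above shows that the story link sits inside the `.titleline` span, next to a second `<a>` for the site name. One possible way to reach only the story link is the CSS child combinator (`>`). This is a sketch on a hand-written string modeled on that output, with placeholder URLs and titles.

```python
from bs4 import BeautifulSoup

# Hypothetical markup shaped like one .titleline span (placeholder values)
html = (
    '<span class="titleline">'
    '<a href="https://example.com/a">First story</a>'
    '<span class="sitebit comhead"> (<a href="from?site=example.com">'
    '<span class="sitestr">example.com</span></a>)</span>'
    '</span>'
)
soup = BeautifulSoup(html, 'html.parser')

# '.titleline > a' matches only <a> tags that are direct children of .titleline,
# so the nested site link inside .sitebit is skipped
story_links = soup.select('.titleline > a')
first = story_links[0]

print(first.get_text())   # the visible link text
print(first.get('href'))  # the value of the href attribute
```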


`.titleline` grabs us the span containing the `<a>` tag for each one of the story titles. But we could just as easily grab the votes, which live in the `score` spans:

In [None]:
<span class="score" id="score_33228398">27 points</span>

So, we'll grab both sets of elements, the title lines and the scores:

In [9]:
import requests
from bs4 import BeautifulSoup

res = requests.get('https://news.ycombinator.com/')
soup = BeautifulSoup(res.text, 'html.parser')
links = soup.select('.titleline')
votes = soup.select('.score')
print(votes[0])

<span class="score" id="score_33262637">267 points</span>


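With Beautiful Soup, we can keep chaining. Each item in the list is a tag object, so we can call `.get('id')` on it to read its `id` attribute: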

In [12]:
import requests
from bs4 import BeautifulSoup

res = requests.get('https://news.ycombinator.com/')
soup = BeautifulSoup(res.text, 'html.parser')
links = soup.select('.titleline')
votes = soup.select('.score')
print(votes[0].get('id'))

score_33262637
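
The same chaining works for other attributes and for the tag's text. Here is a minimal sketch on a made-up score span (the id and point count are placeholders) showing what a few of these calls return.

```python
from bs4 import BeautifulSoup

# Hypothetical markup used only for this example
html = '<span class="score" id="score_1">10 points</span>'
soup = BeautifulSoup(html, 'html.parser')

vote = soup.select('.score')[0]

vote_id = vote.get('id')         # attribute value as a string
vote_class = vote.get('class')   # class can hold several values, so it comes back as a list
vote_text = vote.get_text()      # the text between the opening and closing tags
vote_href = vote.get('href')     # None, since this span has no href attribute

print(vote_id, vote_class, vote_text, vote_href)
```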


We have the information that we need. It's time to go to the next step, which is to try and combine what's above and make it more useful.