## Web Scraping with `Requests` and `BeautifulSoup`


#### 1. Import `requests` and `BeautifulSoup` and read a URL of your choice:

In [None]:
import requests
from bs4 import BeautifulSoup

response = requests.get('https://www.pokemon.com/us/')

In [None]:
response.status_code
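
A status code of 200 means the request succeeded; 4xx and 5xx codes indicate client and server errors. One way to guard against a failed request is sketched below. The helper name `html_or_raise` is hypothetical (it is not part of this notebook), while `raise_for_status()` and the `timeout` argument are standard `requests` features.

```python
import requests


def html_or_raise(response):
    """Return the decoded HTML of a response, or raise if the request failed.

    Hypothetical helper for illustration; not part of the original notebook.
    """
    # raise_for_status() raises requests.HTTPError for 4xx and 5xx codes
    # and does nothing for successful responses.
    response.raise_for_status()
    return response.text


# Typical use (commented out so the sketch runs without network access):
# response = requests.get('https://www.pokemon.com/us/', timeout=10)
# html = html_or_raise(response)
```

Passing a `timeout` is a design choice worth considering: without one, `requests.get` can wait indefinitely on an unresponsive server.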

- The raw HTML of the response, as bytes:

In [None]:
response.content

- The raw bytes are hard to read; let's parse them with BeautifulSoup to get a more readable version:

In [None]:
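def makesoup(response):
    # 'html5lib' is a separate package (pip install html5lib);
    # the built-in 'html.parser' needs no extra install.
    return BeautifulSoup(response.content, 'html5lib')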

makesoup(response)

- We can also use the `prettify()` method from BeautifulSoup to indent the HTML:

In [None]:
def makeprettysoup(response):
    return BeautifulSoup(response.content, 'html5lib').prettify()

# prettify() returns a string; print() renders its newlines and indentation
print(makeprettysoup(response))
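
To see what `prettify()` does without fetching a page, here is a minimal sketch on a small inline document. The HTML snippet is made up for illustration, and the built-in `'html.parser'` is used so that no extra parser package is needed.

```python
from bs4 import BeautifulSoup

# Small inline document (an assumption for illustration) so the example
# runs without a network request.
html = "<html><body><p>Hello <b>world</b></p></body></html>"

soup = BeautifulSoup(html, "html.parser")
pretty = soup.prettify()

# prettify() returns a str with one tag or text node per line, indented by
# nesting depth. print() renders those newlines; displaying the bare string
# in a notebook shows them as escaped "\n" sequences instead.
print(pretty)
```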

#### 2. Create a `BeautifulSoup` object and navigate the data structure:

In [None]:
soup = BeautifulSoup(response.text, 'html.parser')
soup

In [None]:
soup.title

In [None]:
soup.title.name

In [None]:
soup.title.string

In [None]:
soup.title.parent.name

In [None]:
soup.p

In [None]:
soup.a

In [None]:
soup.div

In [None]:
soup.a['class']

In [None]:
soup.a['href']

In [None]:
soup.div['id']
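
Square-bracket access such as `soup.a['class']` raises a `KeyError` when the attribute is absent, and `soup.<tag>` returns `None` when the document has no such tag. The sketch below shows both cases on a made-up HTML snippet (an assumption for illustration, not taken from the page above).

```python
from bs4 import BeautifulSoup

# Made-up snippet: one link with attributes, one without.
html = (
    '<div id="main">'
    '<a class="btn primary" href="/us/pokedex/">Home</a>'
    '<a>No attributes</a>'
    '</div>'
)
soup = BeautifulSoup(html, "html.parser")

first, second = soup.find_all("a")

print(first["href"])        # attribute present: its value as a string
print(first["class"])       # class is multi-valued, so this is a list
print(second.get("href"))   # attribute missing: .get() returns None
print(soup.table)           # no <table> in the document: None

try:
    second["href"]          # attribute missing: [] raises KeyError
except KeyError:
    print("second link has no href")
```

Using `.get()` is usually the safer choice when scraping real pages, where not every tag carries every attribute.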

#### 3. Use the `find()` method to get a result from a specific tag:

In [None]:
soup.find('a')

In [None]:
soup.find('p')

In [None]:
soup.find('div')

In [None]:
soup.find(id="gus-wrapper")
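
`find()` returns only the first match, or `None` when nothing matches, and it accepts attribute filters in addition to a tag name. A minimal sketch on an inline snippet follows; the markup is assumed for illustration and only the `gus-wrapper` id is borrowed from the cell above.

```python
from bs4 import BeautifulSoup

# Made-up snippet reusing the "gus-wrapper" id from the cell above.
html = """
<div id="gus-wrapper"><p class="intro">Welcome</p></div>
<div class="footer"><p>Contact</p></div>
"""
soup = BeautifulSoup(html, "html.parser")

wrapper = soup.find(id="gus-wrapper")       # match by id attribute
footer = soup.find("div", class_="footer")  # class_ because class is a keyword
missing = soup.find("table")                # no match: returns None

print(wrapper.p.string)
print(footer.get_text(strip=True))

# Check for None before using a result, otherwise attribute access fails.
if missing is None:
    print("no <table> found")
```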

#### 4. Use `find_all()` to get all the results of a given type:

In [None]:
soup.find_all('a')

In [None]:
all_links = soup.find_all('a')
all_images = soup.find_all('img')
print(len(all_links))
print(len(all_images))

In [None]:
all_links[0]

In [None]:
all_links[-1]

In [None]:
all_images[:3]
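
`find_all()` returns a list-like result, so indexing, slicing, and `len()` all work as shown above. It also accepts filters. The sketch below keeps only links that have an `href` and resolves relative paths against a base URL; the inline HTML and the `example.com` link are assumptions for illustration.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

base_url = "https://www.pokemon.com/us/"

# Made-up snippet: a relative link, an absolute link, and an anchor
# without an href.
html = """
<a href="/us/pokedex/">Pokedex</a>
<a href="https://example.com/news">News</a>
<a name="top">Anchor without href</a>
"""
soup = BeautifulSoup(html, "html.parser")

# href=True keeps only <a> tags that have an href attribute at all.
links_with_href = soup.find_all("a", href=True)

# urljoin() resolves relative paths and leaves absolute URLs unchanged.
absolute_urls = [urljoin(base_url, a["href"]) for a in links_with_href]

# limit stops the search after the given number of matches.
first_only = soup.find_all("a", limit=1)

print(absolute_urls)
print(len(first_only))
```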

#### 5. Get the `alt` attributes from all the `<img>` elements:

In [None]:
for img in soup.find_all('img'):
    # .get() returns None when an <img> has no alt attribute
    print(img.get('alt'))
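
On real pages some images have an empty `alt` or none at all, so the loop above may print blank lines or `None`. One possible way to collect only the meaningful descriptions is sketched here on made-up markup (the image paths and alt text are assumptions).

```python
from bs4 import BeautifulSoup

# Made-up snippet: one described image, one empty alt, one missing alt.
html = """
<img src="/a.png" alt="Pikachu">
<img src="/b.png" alt="">
<img src="/c.png">
"""
soup = BeautifulSoup(html, "html.parser")

# .get("alt") yields the attribute value, "" for an empty alt,
# and None when the attribute is missing.
alts = [img.get("alt") for img in soup.find_all("img")]

# Empty strings and None are both falsy, so this keeps real descriptions only.
described = [alt for alt in alts if alt]

print(alts)
print(described)
```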