# Fetching HTML with requests

The first step in scraping a website is to request a copy of the page from the server where it lives -- the same thing that happens when you type a URL into a browser. But in this case, we're not going to _render_ the HTML in a browser -- we're going to (eventually) parse it and extract that sweet, sweet data.

To accomplish this, we're going to use a popular third-party Python library called [requests](http://docs.python-requests.org/en/master/), which has become the de facto standard for making HTTP requests in Python.

First, we need to import the requests library (you only need to do this once per notebook session):

In [1]:
import requests

Now that we've imported the library, we can use its functionality. Specifically, we're going to use its `get()` function to, well, get a web page. Let's start with the Texas death row offenders page -- notice that we're wrapping the URL in single quotes, because `get()` expects a URL as text (a _string_, to use the lingo).

`'https://www.tdcj.texas.gov/death_row/dr_offenders_on_dr.html'`

And as we do this, we're also going to save the results of the operation as a _variable_ that we can access later. Variable names can be pretty much anything, but it's usually best to name them something that describes the value they're holding on to. So in this case, I'll call my variable `deathrow_page`.

👉 For more information on strings and variable assignment, [see this notebook](../_Python%20syntax%20cheat%20sheet.ipynb).
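
As a quick illustration of strings and variables on their own, here's a small sketch. The names `deathrow_url` and `same_url` are just examples for this snippet and aren't used elsewhere in the notebook.

```python
# A URL is just text, so it goes inside quotes to make it a string.
deathrow_url = 'https://www.tdcj.texas.gov/death_row/dr_offenders_on_dr.html'

# Single and double quotes produce the same string.
same_url = "https://www.tdcj.texas.gov/death_row/dr_offenders_on_dr.html"

print(type(deathrow_url))        # <class 'str'>
print(deathrow_url == same_url)  # True
```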

In [10]:
deathrow_page = requests.get('https://www.tdcj.texas.gov/death_row/dr_offenders_on_dr.html')

The response object that `get()` returns has a lot of potentially useful stuff, but for now we're just interested in its `.text` attribute, which in this case is the page's HTML -- same as if we'd done `view-source` in a browser.

In [13]:
deathrow_page.text

'<!doctype html>\r\n<html lang="en-US"><!-- InstanceBegin template="/Templates/generic_inside.dwt" codeOutsideHTMLIsLocked="false" -->\r\n<head>\r\n<meta charset="utf-8">\r\n<meta name="viewport" content="width=device-width, initial-scale=1">\r\n<!-- stylesheet: global -->\r\n<link rel="stylesheet" href="/stylesheets/global.css">\r\n<!-- stylesheet: page-specific -->\r\n<link rel="stylesheet" href="/stylesheets/content.css">\r\n<link rel="stylesheet" href="/stylesheets/menu_style.css">\r\n<!-- InstanceBeginEditable name="stylesheets" -->\r\n\r\n<!-- InstanceEndEditable -->\r\n<!-- jQuery library (if CDN fails, use local copy) -->\r\n<script type="text/javascript" src="//ajax.googleapis.com/ajax/libs/jquery/3.2.1/jquery.min.js"></script>\r\n<script type="text/javascript"> window.jQuery || document.write(\'<script src="/javascripts/jquery.min.js"><\\/script>\') </script>\r\n<!-- javascripts -->\r\n<script type="text/javascript" src="/javascripts/google_analytics.js"></script>\r\n<script 
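
The cell above assumes the request worked and dumps the whole page into the notebook. Here's a minimal sketch of a slightly more careful version. `fetch_html` and `preview` are hypothetical helper names (they're not part of requests), and the 10-second timeout is an arbitrary choice.

```python
import requests


def fetch_html(url, timeout=10):
    """Return the HTML of `url` as a string, or raise an error if the request failed."""
    # The timeout keeps the request from hanging forever if the server never answers.
    response = requests.get(url, timeout=timeout)
    # Raises requests.HTTPError for 4xx/5xx responses, so an error page
    # isn't mistaken for the page we actually wanted.
    response.raise_for_status()
    return response.text


def preview(html, length=300):
    """Return only the first `length` characters, so the output stays readable."""
    return html[:length]
```

For example, `deathrow_html = fetch_html('https://www.tdcj.texas.gov/death_row/dr_offenders_on_dr.html')` followed by `print(preview(deathrow_html))` should show just the top of the page instead of the whole thing.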

## Your turn

In groups, practice fetching the HTML for these pages:
- https://www.newportbeachca.gov
- https://www.texasagriculture.gov/Portals/0/Reports/PIR/certified_lead_burn_instructors.html
- https://web.archive.org/web/20031202214318/http://www.tdcj.state.tx.us:80/stat/finalmeals.htm