# Downloading Stuff from the Internet

As you saw on Monday, we can use **curl** to download files from the internet on the command line. We can do the same thing in Python.

While Python has several modules for working with internet data, we recommend the **Requests** module. Before we can use it, however, we need to import it.

In [1]:
import requests

Let's take a quick look at the help manual (in Python) for **requests**.

You can also find more detailed information on how to use this library on the developer's website at http://docs.python-requests.org/en/master/.

In [2]:
help(requests)

Help on package requests:

NAME
    requests

DESCRIPTION
    Requests HTTP library
    ~~~~~~~~~~~~~~~~~~~~~
    
    Requests is an HTTP library, written in Python, for human beings. Basic GET
    usage:
    
       >>> import requests
       >>> r = requests.get('https://www.python.org')
       >>> r.status_code
       200
       >>> 'Python is a programming language' in r.content
       True
    
    ... or POST:
    
       >>> payload = dict(key1='value1', key2='value2')
       >>> r = requests.post('http://httpbin.org/post', data=payload)
       >>> print(r.text)
       {
         ...
         "form": {
           "key2": "value2",
           "key1": "value1"
         },
         ...
       }
    
    The other HTTP methods are supported - see `requests.api`. Full documentation
    is at <http://python-requests.org>.
    
    :copyright: (c) 2016 by Kenneth Reitz.
    :license: Apache 2.0, see LICENSE for more details.

PACKAGE CONTENTS
    _internal_utils
    adapters
    api
 

Next, we can use the **get()** function to retrieve the contents of a specified web address.

For this example, I'm going to use Jane Austen's *Sense and Sensibility* from Project Gutenberg. The text version of the book can be found at http://www.gutenberg.org/cache/epub/161/pg161.txt.

In [3]:
Sense = requests.get("http://www.gutenberg.org/cache/epub/161/pg161.txt")
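
Before reading the data, it can be worth checking that the download actually succeeded. The sketch below wraps the same **get()** call in a small helper; the name `download_text` and the 30-second timeout are assumptions made for illustration, not part of the original notebook.

```python
def download_text(url, timeout=30):
    """Hypothetical helper: fetch a URL and return its body as a string."""
    # Imported here so that simply defining the helper needs neither
    # the library nor a network connection.
    import requests

    response = requests.get(url, timeout=timeout)
    # raise_for_status() raises an HTTPError for 4xx/5xx responses,
    # so a failed download is not mistaken for a book.
    response.raise_for_status()
    return response.text

# Example call (requires an internet connection, so it is left commented out):
# sense_text = download_text("http://www.gutenberg.org/cache/epub/161/pg161.txt")
```

A status code of 200, as shown in the help text above, means the request succeeded.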

Next, we can use the **text** attribute to read the data that was retrieved with the **get()** function.

**Note:** **text** is an attribute of the response rather than a function, which is why it is used without brackets ().
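
The reason no brackets are needed is that, in Requests, **text** is a property: it looks like a plain attribute, but a little code runs when you read it. The class below is a hypothetical stand-in written only to illustrate the idea; it is not how Requests itself is implemented.

```python
class Page:
    """A hypothetical stand-in for a downloaded page (not part of Requests)."""

    def __init__(self, raw_bytes):
        self.raw_bytes = raw_bytes

    @property
    def text(self):
        # Runs when you write page.text, with no brackets needed
        return self.raw_bytes.decode("utf-8")

    def upper_text(self):
        # An ordinary method: it must be called with brackets
        return self.text.upper()


page = Page(b"Sense and Sensibility")
print(page.text)          # property access, no brackets
print(page.upper_text())  # method call, brackets required
```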

In [5]:
Sense.text



*What do you notice about the text?*

Is it just me, or does it seem to include a heck of a lot of backslashes?!?

We met the backslash on the command line when we were creating new folders. A backslash essentially tells the computer to STOP and not read the following character in the usual way; instead, the backslash and that character together mean something else. These combinations are referred to as **escape characters** (also called escape sequences).

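\n means a newline
\r means a carriage return (a return to the start of the line, used in Windows-style line endings)
\' means a single quotation mark
\" means a double quotation mark
\? means a question mark in some languages such as C; Python does not treat it as an escape, and a question mark needs no backslash there

You'll notice that a \r almost always occurs with a \n. This is because Windows-style text files end every line with both characters: the \r returns to the start of the line and the \n moves down to the next one. Files created on Unix or macOS typically use \n alone.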
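
To see the difference between the escaped and the interpreted form, here is a minimal sketch. The `sample` string is a made-up stand-in for the downloaded text, not the actual contents of the file.

```python
# A short sample standing in for the downloaded text (hypothetical data).
sample = "SENSE AND SENSIBILITY\r\n\r\nby Jane Austen\r\n"

# repr() shows the escape sequences; this is the form the notebook displays
# when a cell ends with a bare expression such as Sense.text.
print(repr(sample))

# print() interprets the escapes, so the line breaks actually appear.
print(sample)

# splitlines() recognises \r\n, \n, and \r, so it gives clean lines
# regardless of which line-ending style the file uses.
lines = sample.splitlines()
print(lines)
```

In other words, the backslashes are not really in the book; they are just how Python shows invisible characters when it displays a string.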

#### Looking for Stuff in the Data

In Section 2.1.7b, we'll go into detail on how to search for and manipulate data within a text file. But you can also do some simple searching at this stage, while your downloaded data is saved in a variable.
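
As a minimal sketch of what such simple searching might look like, the example below uses ordinary string operations. The variable `sample_text` is an assumed stand-in for **Sense.text**, so the example runs without downloading anything.

```python
# A one-sentence stand-in for Sense.text (assumption for illustration).
sample_text = "The family of Dashwood had long been settled in Sussex."

# 'in' answers a yes/no question: does the phrase appear at all?
found = "Dashwood" in sample_text
print(found)

# find() returns the position of the first match, or -1 if there is none.
position = sample_text.find("Sussex")
print(position)

# count() returns how many non-overlapping times the phrase appears.
occurrences = sample_text.count("Dashwood")
print(occurrences)
```

The same three operations work on the full downloaded text; just swap `sample_text` for `Sense.text`.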

#### Saving the File

We'll go into more detail on these functions in the next section, but let's take a moment to save our downloaded data into a text file.

In [7]:
Book = open("SenseAndSensibility2.txt", 'w')

Book.write(Sense.content)




TypeError: write() argument must be str, not bytes
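
The error arises because **content** holds raw bytes, while a file opened with 'w' expects a string. Two possible ways around this are sketched below. The sample data, the temporary folder, and the output file names are assumptions chosen so the example runs on its own; in the notebook you would use **Sense.content** or **Sense.text** instead.

```python
import os
import tempfile

# Stand-ins for Sense.content (bytes) and Sense.text (str); hypothetical data.
content_bytes = b"SENSE AND SENSIBILITY\r\n"
content_text = content_bytes.decode("utf-8")

# A temporary folder, so the sketch does not clutter your working directory.
out_dir = tempfile.mkdtemp()

# Option 1: binary mode ('wb') accepts bytes, so the raw download
# is written to disk unchanged.
bytes_path = os.path.join(out_dir, "SenseAndSensibility_bytes.txt")
with open(bytes_path, "wb") as book:
    book.write(content_bytes)

# Option 2: text mode ('w') accepts a string. newline="" stops Python from
# translating the line endings, so the \r\n pairs are kept as downloaded.
text_path = os.path.join(out_dir, "SenseAndSensibility_text.txt")
with open(text_path, "w", encoding="utf-8", newline="") as book:
    book.write(content_text)
```

Using `with` also closes the file automatically, which the original cell never does.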