# The requests module: HTTP for Humans


## Introduction

In [1]:
import requests

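The `urllib2` module in the Python standard library (`urllib.request` in Python 3) provides most of the HTTP functionality you need, but its API is not especially convenient to use. With `requests`, each HTTP method is a single function call: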

In [2]:
r = requests.get("http://httpbin.org/get")
r = requests.post("http://httpbin.org/post", data={"key": "value"})
r = requests.put("http://httpbin.org/put")
r = requests.delete("http://httpbin.org/delete")
r = requests.head("http://httpbin.org/get")
r = requests.options("http://httpbin.org/get")
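
For comparison, here is a minimal sketch of preparing the same POST request with the standard library alone. It assumes Python 3, where `urllib2` became `urllib.request`. The request is only constructed, not sent, so it runs without network access.

```python
from urllib.parse import urlencode
from urllib.request import Request

# The form data has to be URL-encoded and converted to bytes by hand.
form = {"key": "value"}
body = urlencode(form).encode("ascii")

# Passing `data` already implies POST; the method is spelled out for clarity.
req = Request("http://httpbin.org/post", data=body, method="POST")

print(req.get_method())  # POST
print(req.full_url)      # http://httpbin.org/post
print(req.data)          # b'key=value'
```

Sending it would take one more call, `urllib.request.urlopen(req)`, and the response body would then have to be decoded from bytes manually.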

## Passing URL parameters

To request a URL such as `httpbin.org/get?key=val`, pass the query parameters as a dictionary through the `params` keyword argument:

In [3]:
payload = {'key1': 'value1', 'key2': 'value2'}
r = requests.get("http://httpbin.org/get", params=payload)

In [4]:
# Inspect the URL that was actually requested
print(r.url)

http://httpbin.org/get?key2=value2&key1=value1
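
To see what `params` produces without sending a request, the query string can be built by hand. This is a sketch using only the Python 3 standard library. `build_url` is a hypothetical helper, and dropping `None` values and repeating the key for list values are assumptions that mirror how the Requests documentation describes `params`.

```python
from urllib.parse import urlencode


def build_url(base, params):
    """Append URL-encoded query parameters to a base URL (hypothetical helper)."""
    # Skip keys whose value is None, so they never reach the query string.
    cleaned = {k: v for k, v in params.items() if v is not None}
    # doseq=True expands list values into repeated keys (key2=a&key2=b).
    query = urlencode(cleaned, doseq=True)
    return base + "?" + query if query else base


print(build_url("http://httpbin.org/get", {"key1": "value1", "key2": "value2"}))
# http://httpbin.org/get?key1=value1&key2=value2

print(build_url("http://httpbin.org/get", {"key1": "value1", "key2": ["value2", "value3"]}))
# http://httpbin.org/get?key1=value1&key2=value2&key2=value3
```

On Python 3.7 and later, dictionaries keep insertion order, so the parameter order here can differ from the Python 2 output shown above.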


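## Reading the response content

Requests **automatically decodes** content from the server, and most Unicode charsets are decoded seamlessly. The decoded body is available as `r.text`: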

In [5]:
r = requests.get('https://github.com/timeline.json')
print(r.text)

{"message":"Hello there, wayfaring stranger. If you’re reading this then you probably didn’t see our blog post a couple of years back announcing that this API would go away: http://git.io/17AROg Fear not, you should be able to get what you need from the shiny new Events API instead.","documentation_url":"https://developer.github.com/v3/activity/events/#list-public-events"}


In [6]:
# Check which text encoding Requests is using:
r.encoding

'utf-8'

In [7]:
# Changing the encoding changes how r.text is decoded:
r.encoding = "ISO-8859-1"
r.text

u'{"message":"Hello there, wayfaring stranger. If you\xe2\x80\x99re reading this then you probably didn\xe2\x80\x99t see our blog post a couple of years back announcing that this API would go away: http://git.io/17AROg Fear not, you should be able to get what you need from the shiny new Events API instead.","documentation_url":"https://developer.github.com/v3/activity/events/#list-public-events"}'
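
The `\xe2\x80\x99` sequences in this output are the three UTF-8 bytes of the apostrophe `’`, each read as a separate ISO-8859-1 character. A small sketch with plain byte strings (Python 3, no network access needed) reproduces the effect:

```python
# "you’re" with a right single quotation mark (U+2019), encoded as UTF-8
# the way a server would send it.
raw = "you\u2019re".encode("utf-8")
print(raw)  # b'you\xe2\x80\x99re'

# Decoding with the correct charset restores the original text.
as_utf8 = raw.decode("utf-8")
print(ascii(as_utf8))  # 'you\u2019re'

# ISO-8859-1 maps every byte to exactly one character, so the
# three-byte apostrophe becomes three unrelated characters.
as_latin1 = raw.decode("ISO-8859-1")
print(ascii(as_latin1))  # 'you\xe2\x80\x99re'

print(len(as_utf8), len(as_latin1))  # 6 8
```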

In [8]:
# Requests also has a built-in JSON decoder for JSON data.
# If JSON decoding fails, r.json() raises an exception.
r.json()

{u'documentation_url': u'https://developer.github.com/v3/activity/events/#list-public-events',
 u'message': u'Hello there, wayfaring stranger. If you\xe2\x80\x99re reading this then you probably didn\xe2\x80\x99t see our blog post a couple of years back announcing that this API would go away: http://git.io/17AROg Fear not, you should be able to get what you need from the shiny new Events API instead.'}
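
Because a failed decode raises an exception, calls to `r.json()` are usually wrapped in `try`/`except`. The sketch below shows the pattern with the standard `json` module so that it runs offline. `parse_json_body` is a hypothetical helper, and catching `ValueError` is an assumption based on the standard library's `json.JSONDecodeError` being a `ValueError` subclass.

```python
import json


def parse_json_body(text):
    """Return the decoded JSON value, or None if `text` is not valid JSON."""
    try:
        return json.loads(text)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return None


print(parse_json_body('{"message": "Hello there"}'))  # {'message': 'Hello there'}
print(parse_json_body("<html>not json</html>"))       # None
```

Note that a body consisting of the JSON literal `null` also decodes to `None`, so code that must tell the two cases apart should let the exception propagate instead.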

## Response status codes

In [9]:
r = requests.get('http://httpbin.org/get')
r.status_code

200
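
`r.status_code` is a plain integer, so it can be compared directly or looked up in the standard library's `http.HTTPStatus` enum. A small sketch (Python 3; `describe_status` is a hypothetical helper, not part of Requests):

```python
from http import HTTPStatus


def describe_status(code):
    """Return a short label such as '200 OK (success)' for a status code."""
    phrase = HTTPStatus(code).phrase  # raises ValueError for unknown codes
    if 200 <= code < 300:
        kind = "success"
    elif 400 <= code < 500:
        kind = "client error"
    elif 500 <= code < 600:
        kind = "server error"
    else:
        kind = "informational or redirect"
    return "{} {} ({})".format(code, phrase, kind)


print(describe_status(200))  # 200 OK (success)
print(describe_status(404))  # 404 Not Found (client error)
```

Requests also provides `r.raise_for_status()`, which raises an exception for 4xx and 5xx responses instead of requiring a manual check.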

## Response headers

In [10]:
r.headers['Content-Type']

'application/json'
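
A `Content-Type` value often carries a `charset` parameter as well, for example `text/html; charset=ISO-8859-1`. As a sketch, the standard library's `email.message.Message` can split such a value into its parts (Python 3; the header value below is made up for illustration and is not taken from the response above):

```python
from email.message import Message

# Hypothetical header value, chosen only to show the parsing.
msg = Message()
msg["Content-Type"] = "text/html; charset=ISO-8859-1"

print(msg.get_content_type())     # text/html
print(msg.get_content_charset())  # iso-8859-1

# Header names are matched case-insensitively.
print(msg["content-type"])        # text/html; charset=ISO-8859-1
```

Header names are case-insensitive in HTTP, and `r.headers` follows that as well, so `r.headers['content-type']` returns the same value as `r.headers['Content-Type']`.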

## Fetching a web page

In [11]:
# Fetch the README.md from my notes repository
r = requests.get('https://raw.githubusercontent.com/ds-ebooks/jupyter-notebook/f57ce8f10a83250bce9833568103520d5c8bba34/README.md')
print(r.text)

# About Jupyter Notebook
* 该笔记共三个分支：
  - Jupyter是笔记源文件；
  - master主要是Jupyter生成的md文件，以及其他的重要md笔记；
  - gh-pages是用来展示笔记的静态网站源文件.
---
* 笔记展示：
  - [md格式展示](https://github.com/ds-ebooks/jupyter-notebook/tree/master/docs)
  - [Jupyter渲染](https://nbviewer.jupyter.org/github/ds-ebooks/jupyter-notebook/tree/jupyter/)
  - [Python静态网站](https://ds-ebooks.github.io/jupyter-notebook/README/index.html)
---
* 书籍推荐：
  - [Python书籍大全](https://ds-ebooks.github.io/jupyter-notebook/Guide/02-python-cn/index.html)
  - [Guide整理](https://github.com/ds-ebooks/jupyter-notebook/tree/master/docs/Guide)


In [12]:
r.encoding

'utf-8'