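### Response object attributes
+ r.status_code: HTTP status code of the response
+ r.text: the response body as a string
+ r.encoding: the response encoding guessed from the HTTP headers
+ r.apparent_encoding: the response encoding inferred from the content (fallback)
+ r.content: the response body as raw bytes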
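
To make the relationship between `r.content`, `r.encoding`, and `r.text` concrete, here is a minimal sketch that works on plain bytes rather than a live response. It assumes, as the Requests documentation describes, that `r.text` is `r.content` decoded with `r.encoding`, and that the header-based guess falls back to ISO-8859-1 when a text response declares no charset. The sample string is made up for illustration.

```python
# Stand-in for r.content: the raw bytes of a UTF-8 encoded page (made-up sample).
content = "百度一下，你就知道".encode("utf-8")

# Roughly what r.text looks like when r.encoding is the ISO-8859-1 fallback.
garbled = content.decode("iso-8859-1")

# What r.text looks like once r.encoding names the real encoding,
# for example after r.encoding = r.apparent_encoding.
readable = content.decode("utf-8")

print(readable)
print(garbled == readable)

# ISO-8859-1 maps every byte to exactly one character, so decoding never
# fails; it silently produces mojibake, and the bytes remain recoverable.
print(garbled.encode("iso-8859-1") == content)
```

This is why the cells in this notebook set `r.encoding` before reading `r.text`.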

In [None]:
import requests

url = 'http://www.baidu.com'
r = requests.get(url)
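if r.status_code == 200:
    # Set the encoding explicitly before reading r.text
    r.encoding = 'utf-8'
    data = r.text
    # Print every response header as "name:value"
    for key, value in r.headers.items():
        print(key + ":" + value)
else:
    print("connection error, status_code is " + str(r.status_code))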

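### Requests library exceptions
+ requests.ConnectionError: network connection error
+ requests.HTTPError: HTTP error
+ requests.URLRequired: missing URL
+ requests.TooManyRedirects: maximum number of redirects exceeded
+ requests.ConnectTimeout: timed out while connecting to the server
+ requests.Timeout: the request to the URL timed out
+ r.raise_for_status(): raises requests.HTTPError if the status code is 4xx or 5xx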
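
The sketch below illustrates, without touching the network, how `raise_for_status()` behaves and how the exceptions in the list relate to each other. It assumes a hand-built `requests.Response` with only `status_code` set, which is a stand-in for a real server reply; `check_status` is a hypothetical helper name.

```python
import requests

def check_status(status_code):
    """Return 'ok' or a short error label for a given HTTP status code."""
    r = requests.Response()          # empty stand-in, no network access
    r.status_code = status_code
    try:
        r.raise_for_status()         # raises HTTPError for 4xx and 5xx codes
        return "ok"
    except requests.HTTPError as exc:
        return "HTTPError: " + str(exc.response.status_code)

print(check_status(200))
print(check_status(404))

# Every exception in the list derives from requests.RequestException,
# so a single except clause can catch all of them.
for exc_type in (requests.ConnectionError, requests.HTTPError,
                 requests.URLRequired, requests.TooManyRedirects,
                 requests.ConnectTimeout, requests.Timeout):
    print(exc_type.__name__, issubclass(exc_type, requests.RequestException))
```

Catching `requests.RequestException` rather than using a bare `except:` keeps unrelated errors, such as a `KeyboardInterrupt` or a typo in the code, from being silently swallowed.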

In [5]:
def gethtmltext(url):
    try:
        r = requests.get(url, timeout = 30)
        r.raise_for_status()
        r.encoding = r.apparent_encoding
        return r.text
    except requests.RequestException:
        return 'request failed'
print(len(gethtmltext(url)))

2287


In [8]:
r = requests.head(url)
print(r.headers.items())
print(r.text)

ItemsView({'Server': 'bfe/1.0.8.18', 'Date': 'Mon, 18 Sep 2017 03:11:56 GMT', 'Content-Type': 'text/html', 'Last-Modified': 'Mon, 13 Jun 2016 02:50:08 GMT', 'Connection': 'Keep-Alive', 'Cache-Control': 'private, no-cache, no-store, proxy-revalidate, no-transform', 'Pragma': 'no-cache', 'Content-Encoding': 'gzip'})



### requests.request(method, url, **kwargs)
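+ params : dict or byte sequence, appended to the URL as query parameters
+ data : dict, byte sequence, or file object, sent as the body of the Request; used to submit content to the server
+ json : JSON-format data, sent as the body of the Request
+ headers : dict of HTTP request headers
+ cookies : dict or CookieJar, the cookies of the Request
+ auth : tuple, enables HTTP authentication
+ files : dict, used to upload files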
 ```python
 fs = {'file': open('data.xls', 'rb')}
 r = requests.request('POST', 'http://python123.io/ws', files = fs)
 ```
+ timeout : request timeout, in seconds
+ proxies : dict, sets the proxy servers to use; login credentials can be included
 ```python
 pxs = {
    'http':'http://user:pass@10.10.10.1:1234',
    'https':'https://10.10.10.1:4321'}
 requests.request('GET', url, proxies = pxs)
 ```
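+ allow_redirects : True/False, redirect switch (default True)
+ stream : True/False; when False (the default) the response body is downloaded immediately, when True the download is deferred until the content is accessed
+ verify : True/False, SSL certificate verification switch (default True)
+ cert : path to a local SSL certificate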
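
As an offline illustration of how some of these keyword arguments shape a request, the sketch below builds a `requests.Request` and prepares it without sending anything. `Request` accepts a subset of the arguments listed above (`params`, `json`, `headers`, and a few others). The URL is the one used in this notebook; the parameter values and the `User-Agent` string are made up.

```python
import json
import requests

req = requests.Request(
    "POST",
    "http://python123.io/ws",
    params={"key1": "value1", "key2": "value2"},   # appended to the URL
    json={"name": "demo"},                          # serialized as the JSON body
    headers={"User-Agent": "demo-agent/0.1"},       # hypothetical header value
)
prepared = req.prepare()   # builds the final request; nothing is sent

print(prepared.url)
print(prepared.headers["Content-Type"])
print(json.loads(prepared.body))
```

Inspecting a prepared request this way is a convenient check that `params` ended up in the query string and `json` in the body before any traffic goes out.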

In [15]:
# params argument
par = {"key1":"value1", 'key2':'value2'}
# data argument
body = "body content"
my_headers = {
    "User-Agent":"Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:55.0) Gecko/20100101 Firefox/55.0",
    "Accept":"text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
    "Accept-Language":"zh-CN,zh;q=0.8,en-US;q=0.5,en;q=0.3",
    "Accept-Encoding":"gzip, deflate",
    "Connection": "keep-alive"
}
r = requests.request("GET",
                     'http://python123.io/ws',
                     params=par,
                     data = body.encode('utf-8'),
                     headers = my_headers)
print(r.url)

https://python123.io/ws?key1=value1&key2=value2


In [20]:
import requests
r = requests.get("http://www.imooc.com/course/programdetail/pid/32")
print("r.encoding:" + r.encoding)
print("r.apparent_encoding:" + r.apparent_encoding)
print(r.request.headers)

r.encoding:utf-8
r.apparent_encoding:utf-8
{'User-Agent': 'python-requests/2.18.4', 'Accept-Encoding': 'gzip, deflate', 'Accept': '*/*', 'Connection': 'keep-alive'}


### Image scraping code

In [23]:
import requests
import os
url = "https://gss2.bdstatic.com/5K1IcD3801kUm2zgoZyCKT6zgBYnreHg-_/2e2aecc6-291b-4f80-8ac4-8a29a4555f23.jpeg"
root = "D:/python/img/"
path = root + url.split('/')[-1]
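try:
    if not os.path.exists(root):
        os.makedirs(root)
    if not os.path.exists(path):
        r = requests.get(url)
        r.raise_for_status()  # do not save an error page as an image
        # The with block closes the file automatically
        with open(path, 'wb') as f:
            f.write(r.content)
        print('File saved')
    else:
        print("File already exists")
    os.system(path)  # pass the path to the system shell (intended to open the image)
except (requests.RequestException, OSError):
    print("Download failed")

File already exists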
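
The file name in the cell above comes from `url.split('/')[-1]`, which works for this URL but would keep any query string in the name. A slightly more defensive sketch, using only the standard library, could look like this; `filename_from_url` and the fallback name `download.bin` are hypothetical choices, not part of the original code.

```python
import posixpath
from urllib.parse import urlsplit, unquote

def filename_from_url(url, default="download.bin"):
    """Derive a local file name from the path part of a URL."""
    path = urlsplit(url).path                 # drops the query string and fragment
    name = posixpath.basename(unquote(path))  # last path segment, percent-decoded
    return name or default                    # fall back when the path ends in '/'

print(filename_from_url("https://example.com/img/cat.jpeg?size=large"))
print(filename_from_url("https://example.com/"))
```

The result could then be joined to the target directory with `os.path.join(root, filename_from_url(url))`.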
