# Chapter 17 Working with APIs

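### In this chapter, you will learn how to write a self-contained program that visualizes the data it retrieves. The program uses a web application programming interface (API) to automatically request specific information from a website, rather than entire pages, and then visualizes that information. Because a program written this way always uses current data to generate its visualization, the information it presents stays up to date even when the data changes rapidly.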

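## 17.1 Using a Web API

#### A web API is the part of a website designed to interact with programs that request specific information through very specific URLs. Such a request is called an API call. The requested data is returned in an easily processed format such as JSON or CSV. Most applications that depend on external data sources rely on API calls.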

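### 17.1.1 Git and GitHub

#### 1. GitHub is a website that lets programmers collaborate on projects.
#### 2. We will use GitHub's API to request information about Python projects on the site.

##### GitHub (https://github.com/) takes its name from Git, a distributed version control system that lets teams of programmers collaborate on projects. Git helps people manage their work on a project so that one person's changes do not interfere with another's. As you implement a new feature, Git tracks the changes you make to each file. Once the code works, you commit your changes and Git records the new state of the project. If you make a mistake and want to undo your changes, you can easily return to any earlier working state (for more on version control with Git, see Appendix D). Projects on GitHub are stored in repositories, which contain everything associated with a project: its code, information about its collaborators, issues or bug reports, and so on.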

### 17.1.2 Requesting Data Using an API Call

#### GitHub's API lets you request many kinds of information through API calls, such as the following:

#### https://api.github.com/search/repositories?q=language:python&sort=stars

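#### This call returns the number of Python projects currently hosted on GitHub, along with information about the most popular Python repositories. Let's examine the call piece by piece. The first part (https://api.github.com/) directs the request to the part of GitHub's website that responds to API calls. The next part (search/repositories) tells the API to search all repositories on GitHub. The question mark after repositories signals that we are about to pass an argument. The q stands for query, and the equals sign lets us begin specifying the query (q=). With language:python, we indicate that we want information only about repositories whose primary language is Python. The final part (&sort=stars) sorts the projects by the number of stars they have received.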

![UPL.PNG](./screenshot/UPL.PNG)
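
#### The parts of the call described above can also be pulled apart programmatically. The following is a minimal sketch that uses only the standard library's urllib.parse module (it is not part of the original program) to split the same URL into host, path, and query parameters:

```python
from urllib.parse import urlsplit, parse_qs

url = 'https://api.github.com/search/repositories?q=language:python&sort=stars'

# Split the URL into its named components.
parts = urlsplit(url)
print("Host:", parts.netloc)
print("Path:", parts.path)

# Decode the query string into a dictionary that maps each
# parameter name to a list of its values.
params = parse_qs(parts.query)
print("Query:", params['q'])
print("Sort:", params['sort'])
```

#### Each query parameter maps to a list because a parameter name may legally appear more than once in a URL.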

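### 17.1.3 Installing requests

#### The requests package lets a Python program easily request information from a website and examine the response that is returned. To install requests, run a command like the following: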
#### $ pip install --user requests

### 17.1.4 Processing an API Response

#### python_repos.py

In [11]:
import requests

# Make an API call and store the response.
url = 'https://api.github.com/search/repositories?q=language:python&sort=stars'
r = requests.get(url)
print("Status code: ", r.status_code)

# Store the API response in a variable.
response_dict = r.json()

# Process the results.
print(response_dict.keys())

Status code:  200
dict_keys(['total_count', 'incomplete_results', 'items'])
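
#### The r.json() call parses the JSON text of the response into a Python dictionary. The same kind of conversion can be tried offline with the standard json module. The sketch below uses a made-up, heavily shortened response body; only the three top-level key names are taken from the output above, and the item contents are hypothetical.

```python
import json

# Hypothetical, shortened stand-in for the text of an API response.
sample_text = (
    '{"total_count": 2, "incomplete_results": false, '
    '"items": [{"name": "a"}, {"name": "b"}]}'
)

# Parse the JSON text into ordinary Python objects.
response_dict = json.loads(sample_text)
print(response_dict.keys())
print("Items returned:", len(response_dict['items']))
```

#### JSON objects become dictionaries, JSON arrays become lists, and the JSON literal false becomes Python's False.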


### 17.1.5 Working with the Response Dictionary

#### python_repos.py

In [1]:
import requests

# Make an API call and store the response.
url = 'https://api.github.com/search/repositories?q=language:python&sort=stars'
r = requests.get(url)
print("Status code:", r.status_code)

# Store the API response in a variable.
response_dict = r.json()
print("Total repositories:", response_dict['total_count'])

# Explore information about the repositories.
repo_dicts = response_dict['items']
print("Repositories returned:", len(repo_dicts))

# Examine the first repository.
repo_dict = repo_dicts[0]
print("\nKeys:", len(repo_dict))
for key in sorted(repo_dict.keys()):
    print(key)

Status code: 200
Total repositories: 6059993
Repositories returned: 30

Keys: 74
archive_url
archived
assignees_url
blobs_url
branches_url
clone_url
collaborators_url
comments_url
commits_url
compare_url
contents_url
contributors_url
created_at
default_branch
deployments_url
description
disabled
downloads_url
events_url
fork
forks
forks_count
forks_url
full_name
git_commits_url
git_refs_url
git_tags_url
git_url
has_downloads
has_issues
has_pages
has_projects
has_wiki
homepage
hooks_url
html_url
id
issue_comment_url
issue_events_url
issues_url
keys_url
labels_url
language
languages_url
license
merges_url
milestones_url
mirror_url
name
node_id
notifications_url
open_issues
open_issues_count
owner
private
pulls_url
pushed_at
releases_url
score
size
ssh_url
stargazers_count
stargazers_url
statuses_url
subscribers_url
subscription_url
svn_url
tags_url
teams_url
trees_url
updated_at
url
watchers
watchers_count


#### Now let's pull out the values associated with some of the keys in repo_dict:

#### python_repos.py

In [2]:
import requests

# Make an API call and store the response.
url = 'https://api.github.com/search/repositories?q=language:python&sort=stars'
r = requests.get(url)
print("Status code:", r.status_code)

# Store the API response in a variable.
response_dict = r.json()
print("Total repositories:", response_dict['total_count'])

# Explore information about the repositories.
repo_dicts = response_dict['items']
print("Repositories returned:", len(repo_dicts))

# Examine the first repository.
repo_dict = repo_dicts[0]
print("\nSelected information about first repository:")
print('Name:', repo_dict['name'])
print('Owner:', repo_dict['owner']['login'])
print('Stars:', repo_dict['stargazers_count'])
print('Repository:', repo_dict['html_url'])
print('Created:', repo_dict['created_at'])
print('Updated:', repo_dict['updated_at'])
print('Description:', repo_dict['description'])

Status code: 200
Total repositories: 6190091
Repositories returned: 30

Selected information about first repository:
Name: system-design-primer
Owner: donnemartin
Stars: 112403
Repository: https://github.com/donnemartin/system-design-primer
Created: 2017-02-26T16:15:28Z
Updated: 2020-11-14T10:35:48Z
Description: Learn how to design large-scale systems. Prep for the system design interview.  Includes Anki flashcards.
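
#### The created_at and updated_at values arrive as plain strings. If you want to compute with them, one option is to convert them to datetime objects. The following is a minimal sketch that assumes the timestamps always have the form shown in the output above; the helper name parse_github_time is hypothetical.

```python
from datetime import datetime, timezone

def parse_github_time(stamp):
    """Convert a timestamp such as '2017-02-26T16:15:28Z' to an aware datetime."""
    parsed = datetime.strptime(stamp, '%Y-%m-%dT%H:%M:%SZ')
    # The trailing 'Z' marks UTC, so attach that time zone explicitly.
    return parsed.replace(tzinfo=timezone.utc)

created = parse_github_time('2017-02-26T16:15:28Z')
updated = parse_github_time('2020-11-14T10:35:48Z')
print("Created:", created.date())
print("Days between creation and last update:", (updated - created).days)
```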


### 17.1.6 Summarizing the Top Repositories

#### python_repos.py

In [3]:
import requests

# Make an API call and store the response.
url = 'https://api.github.com/search/repositories?q=language:python&sort=stars'
r = requests.get(url)
print("Status code:", r.status_code)

# Store the API response in a variable.
response_dict = r.json()
print("Total repositories:", response_dict['total_count'])

# Explore information about the repositories.
repo_dicts = response_dict['items']
print("Repositories returned:", len(repo_dicts))

print("\nSelected information about each repository:")
for repo_dict in repo_dicts:
    print('\nName:', repo_dict['name'])
    print('Owner:', repo_dict['owner']['login'])
    print('Stars:', repo_dict['stargazers_count'])
    print('Repository:', repo_dict['html_url'])
    print('Description:', repo_dict['description'])

Status code: 200
Total repositories: 6203959
Repositories returned: 30

Selected information about each repository:

Name: system-design-primer
Owner: donnemartin
Stars: 112404
Repository: https://github.com/donnemartin/system-design-primer
Description: Learn how to design large-scale systems. Prep for the system design interview.  Includes Anki flashcards.

Name: public-apis
Owner: public-apis
Stars: 100108
Repository: https://github.com/public-apis/public-apis
Description: A collective list of free APIs for use in software and web development.

Name: Python-100-Days
Owner: jackfrued
Stars: 95435
Repository: https://github.com/jackfrued/Python-100-Days
Description: Python - 100天从新手到大师

Name: Python
Owner: TheAlgorithms
Stars: 91559
Repository: https://github.com/TheAlgorithms/Python
Description: All Algorithms implemented in Python

Name: awesome-python
Owner: vinta
Stars: 88733
Repository: https://github.com/vinta/awesome-python
Description: A curated list of awesome Python framework

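### 17.1.7 Monitoring API Rate Limits

#### Most APIs are rate limited, meaning there is a limit on how many requests you can make in a given amount of time. To see whether you are approaching GitHub's limits, enter https://api.github.com/rate_limit into your browser; you will see a response like the following: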

![rate_limit.PNG](./screenshot/rate_limit.PNG)

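#### The reset value is the Unix time, or epoch time (the number of seconds since midnight on January 1, 1970), at which your quota will reset. Once you use up your quota, you will get a short response telling you that you have reached the API limit, and you must then wait until the quota resets.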
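
#### Because the reset value is a count of seconds, it is not easy to read at a glance. The following is a minimal sketch of converting it to a readable time and to a waiting period; the helper names and the sample reset value are hypothetical and used only for illustration.

```python
from datetime import datetime, timezone
import time

def reset_time(reset):
    """Convert a Unix timestamp to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(reset, tz=timezone.utc)

def seconds_until_reset(reset, now=None):
    """Return how many seconds remain until the quota resets (never negative)."""
    if now is None:
        now = time.time()
    return max(0, int(reset - now))

# Hypothetical reset value, for illustration only.
sample_reset = 1605350000
print("Quota resets at:", reset_time(sample_reset).isoformat())
print("Seconds to wait:", seconds_until_reset(sample_reset, now=sample_reset - 90))
```

#### Passing now explicitly, as in the last line, makes the function easy to try out without depending on the current clock.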

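### Note:
#### Many APIs require you to register and obtain an API key before you can make API calls. At the time the book was written, GitHub had no such requirement, but your limits are much higher once you obtain an API key.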

## 17.2 Visualizing Repositories Using Pygal

#### python_repos.py

In [8]:
import requests
import pygal
from pygal.style import LightColorizedStyle as LCS, LightenStyle as LS

# Make an API call and store the response.
URL = 'https://api.github.com/search/repositories?q=language:python&sort=stars'
r = requests.get(URL)
print("Status code:", r.status_code)

# Store the API response in a variable.
response_dict = r.json()
print("Total repositories:", response_dict['total_count'])

# Explore information about the repositories.
repo_dicts = response_dict['items']

names, stars = [], []
for repo_dict in repo_dicts:
    names.append(repo_dict['name'])
    stars.append(repo_dict['stargazers_count'])
    
# Make visualization.
my_style = LS('#333366', base_style=LCS)
chart = pygal.Bar(style=my_style, x_label_rotation=45, show_legend=False)
chart.title = 'Most-Starred Python Projects on GitHub'
chart.x_labels = names

chart.add('', stars)
chart.render_to_file('python_repos.svg')

Status code: 200
Total repositories: 6168645


![17_1.PNG](./result/17_1.PNG)

### 17.2.1 Refining Pygal Charts

#### python_repos.py

#### Create a configuration object that holds all the customizations to pass to Bar():

In [13]:
import requests
import pygal
from pygal.style import LightColorizedStyle as LCS, LightenStyle as LS

# Make an API call and store the response.
URL = 'https://api.github.com/search/repositories?q=language:python&sort=stars'
r = requests.get(URL)
print("Status code:", r.status_code)

# Store the API response in a variable.
response_dict = r.json()
print("Total repositories:", response_dict['total_count'])

# Explore information about the repositories.
repo_dicts = response_dict['items']

names, stars = [], []
for repo_dict in repo_dicts:
    names.append(repo_dict['name'])
    stars.append(repo_dict['stargazers_count'])
    
# Make visualization.
my_style = LS('#333366', base_style=LCS)

my_config = pygal.Config()
my_config.x_label_rotation = 45
my_config.show_legend = False
my_config.title_font_size = 24
my_config.label_font_size = 14
my_config.major_label_font_size = 18
my_config.truncate_label = 15
my_config.show_y_guides = False
my_config.width = 1000

chart = pygal.Bar(my_config, style=my_style)
chart.title = 'Most-Starred Python Projects on GitHub'
chart.x_labels = names

chart.add('', stars)
chart.render_to_file('python_repos_1.svg')

Status code: 200
Total repositories: 6121992


![17_2.PNG](./result/17_2.PNG)

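### 17.2.2 Adding Custom Tooltips

#### In Pygal, hovering the cursor over a bar shows the information that the bar represents; this is commonly called a tooltip. In this example, the tooltip currently shows the number of stars a project has. Let's create a custom tooltip that also shows each project's description.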

#### bar_descriptions.py

In [3]:
import pygal
from pygal.style import LightColorizedStyle as LCS, LightenStyle as LS

my_style = LS('#333366', base_style=LCS)
chart = pygal.Bar(style=my_style, x_label_rotation=45, show_legend=False)

chart.title = 'Python Projects'
chart.x_labels = ['httpie', 'django', 'flask']

plot_dicts = [
    {'value': 16101, 'label': 'Description of httpie.'},
    {'value': 15028, 'label': 'Description of django.'},
    {'value': 14798, 'label': 'Description of flask.'},
    ]

chart.add('', plot_dicts)
chart.render_to_file('bar_descriptions.svg')

![17_3.PNG](./result/17_3.PNG)

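#### 1. Pygal uses the number associated with the key 'value' to determine the height of each bar, and uses the string associated with 'label' to create the bar's tooltip.
#### 2. The add() method accepts a string and a list.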

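### 17.2.3 Plotting the Data

#### To plot our data, we will generate plot_dicts automatically so that it contains information about the 30 projects returned by the API call. The code that does this is as follows: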

#### python_repos.py

In [7]:
import requests
import pygal
from pygal.style import LightColorizedStyle as LCS, LightenStyle as LS

# Make an API call and store the response.
URL = 'https://api.github.com/search/repositories?q=language:python&sort=stars'
r = requests.get(URL)
print("Status code:", r.status_code)

# Store the API response in a variable.
response_dict = r.json()
print("Total repositories:", response_dict['total_count'])

# Explore information about the repositories.
repo_dicts = response_dict['items']
print("Number of items: ", len(repo_dicts))

names, plot_dicts = [], []
for repo_dict in repo_dicts:
    names.append(repo_dict['name'])
    
    plot_dict = {
        'value': repo_dict['stargazers_count'],
        'label': str(repo_dict['description']),
        }
    plot_dicts.append(plot_dict)
    
# Make visualization.
my_style = LS('#333366', base_style=LCS)

my_config = pygal.Config()
my_config.x_label_rotation = 45
my_config.show_legend = False
my_config.title_font_size = 24
my_config.label_font_size = 14
my_config.major_label_font_size = 18
my_config.truncate_label = 15
my_config.show_y_guides = False
my_config.width = 1000

chart = pygal.Bar(my_config, style=my_style)
chart.title = 'Most-Starred Python Projects on GitHub'
chart.x_labels = names

chart.add('', plot_dicts)
chart.render_to_file('python_repos_2.svg')

Status code: 200
Total repositories: 6113117
Number of items:  30


![17_4.PNG](./result/17_4.PNG)
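
#### Wrapping the description in str() guards against repositories whose description is None, but the label then becomes the literal text 'None'. A possible alternative, sketched below on made-up sample data, substitutes a placeholder string instead. The helper name build_plot_dicts and the sample repositories are assumptions for illustration, not part of the original program.

```python
def build_plot_dicts(repo_dicts, placeholder='No description provided.'):
    """Return (names, plot_dicts) shaped for chart.x_labels and chart.add()."""
    names, plot_dicts = [], []
    for repo_dict in repo_dicts:
        names.append(repo_dict['name'])
        plot_dicts.append({
            'value': repo_dict['stargazers_count'],
            # Fall back to the placeholder when the description is None or empty.
            'label': repo_dict.get('description') or placeholder,
        })
    return names, plot_dicts

# Made-up sample data shaped like the items in the API response.
sample_repos = [
    {'name': 'alpha', 'stargazers_count': 120, 'description': 'First project.'},
    {'name': 'beta', 'stargazers_count': 75, 'description': None},
]
names, plot_dicts = build_plot_dicts(sample_repos)
print(names)
print(plot_dicts)
```

#### Keeping this step in a function that takes plain dictionaries means it can be checked without making an API call.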

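### 17.2.4 Adding Clickable Links to the Chart

#### Pygal also lets you use each bar in the chart as a link to a website. To do so, add just one line of code: in the dictionary created for each project, add a key-value pair whose key is 'xlink':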

#### python_repos.py

In [9]:
import requests
import pygal
from pygal.style import LightColorizedStyle as LCS, LightenStyle as LS

# Make an API call and store the response.
URL = 'https://api.github.com/search/repositories?q=language:python&sort=stars'
r = requests.get(URL)
print("Status code:", r.status_code)

# Store the API response in a variable.
response_dict = r.json()
print("Total repositories:", response_dict['total_count'])

# Explore information about the repositories.
repo_dicts = response_dict['items']
print("Number of items: ", len(repo_dicts))

names, plot_dicts = [], []
for repo_dict in repo_dicts:
    names.append(repo_dict['name'])
    
    plot_dict = {
        'value': repo_dict['stargazers_count'],
        'label': str(repo_dict['description']),
        'xlink': repo_dict['html_url'],
        }
    plot_dicts.append(plot_dict)
    
# Make visualization.
my_style = LS('#333366', base_style=LCS)

my_config = pygal.Config()
my_config.x_label_rotation = 45
my_config.show_legend = False
my_config.title_font_size = 24
my_config.label_font_size = 14
my_config.major_label_font_size = 18
my_config.truncate_label = 15
my_config.show_y_guides = False
my_config.width = 1000

chart = pygal.Bar(my_config, style=my_style)
chart.title = 'Most-Starred Python Projects on GitHub'
chart.x_labels = names

chart.add('', plot_dicts)
chart.render_to_file('python_repos_3.svg')

Status code: 200
Total repositories: 6051123
Number of items:  30


![17_5.PNG](./result/17_5.PNG)

#### Pygal uses the URL associated with the key 'xlink' to turn each bar into an active link. Clicking any bar in the chart opens a new browser tab showing that project's GitHub page.

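## 17.3 The Hacker News API

#### To explore how to make API calls on other sites, let's look at Hacker News (http://news.ycombinator.com/). The Hacker News API gives you access to data about all the submissions and comments on the site, and it does not require you to register for a key.

#### The following call returns information about the top article at the time the book was written:
#### https://hacker-news.firebaseio.com/v0/item/9884165.json

#### The response is a dictionary of information about the article with the ID 9884165: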

![17_6.PNG](./result/17_6.PNG)

#### Now let's make an API call that returns the IDs of the current top articles on Hacker News, and then examine each of those top articles:

#### hn_submissions.py

In [10]:
import requests
from operator import itemgetter

# Make an API call and store the response.
url = 'https://hacker-news.firebaseio.com/v0/topstories.json'
r = requests.get(url)
print("Status code:", r.status_code)

# Process information about each submission.
submission_ids = r.json()
submission_dicts = []
for submission_id in submission_ids[:30]:
    # Make a separate API call for each submission.
    
    url = ('https://hacker-news.firebaseio.com/v0/item/' + str(submission_id) + '.json')
    submission_r = requests.get(url)
    print(submission_r.status_code)
    response_dict = submission_r.json()
    
    submission_dict = {
        'title': response_dict['title'],
        'link': 'http://news.ycombinator.com/item?id=' + str(submission_id),
        'comments': response_dict.get('descendants', 0)
        }
    submission_dicts.append(submission_dict)
    
submission_dicts = sorted(submission_dicts, key=itemgetter('comments'), reverse=True)

for submission_dict in submission_dicts:
    print("\nTitle:", submission_dict['title'])
    print("Discussion link:", submission_dict['link'])
    print("Comments:", submission_dict['comments'])

ConnectionError: HTTPSConnectionPool(host='hacker-news.firebaseio.com', port=443): Max retries exceeded with url: /v0/topstories.json (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x0000025DB0275DC0>: Failed to establish a new connection: [WinError 10060] A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond'))

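### Note:
#### The Hacker News site could not be reached when this code was run, so the API call failed; reading through the code is enough to follow along.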
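
#### When a failure like this is temporary, retrying the call a few times can help. The following is a rough sketch of a generic retry helper, not something the original program does. The name call_with_retries is hypothetical, and the flaky_call function merely simulates a network call that fails twice before succeeding. The helper catches OSError, the base class of the built-in ConnectionError; the exceptions raised by requests also derive from it.

```python
import time

def call_with_retries(func, attempts=3, delay=1.0):
    """Call func() up to `attempts` times, pausing `delay` seconds between failures."""
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except OSError as error:
            print(f"Attempt {attempt} failed: {error}")
            # Give up and re-raise once the last attempt has failed.
            if attempt == attempts:
                raise
            time.sleep(delay)

# Stand-in for a network call: fails twice, then succeeds.
outcomes = iter([ConnectionError("no route"), ConnectionError("timed out"), [1, 2, 3]])

def flaky_call():
    outcome = next(outcomes)
    if isinstance(outcome, Exception):
        raise outcome
    return outcome

result = call_with_retries(flaky_call, attempts=3, delay=0)
print("Result:", result)
```

#### In a real program, func might be something like lambda: requests.get(url, timeout=10). Retrying cannot help when the host is unreachable from your network altogether.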

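#### 1. When you are not sure whether a key exists in a dictionary, use the dict.get() method, which returns the value associated with the given key if the key exists, and the value you provide (here, 0) if it does not.
#### 2. To sort the list of dictionaries submission_dicts by number of comments, we use the itemgetter() function from the operator module. We pass it the key 'comments', so it pulls the value associated with 'comments' from each dictionary in the list. The sorted() function then uses these values to sort the list.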
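
#### Both points can be tried without a network connection. The following minimal sketch applies dict.get() and itemgetter() to made-up sample data in which one story has no 'descendants' key:

```python
from operator import itemgetter

# Made-up sample data; the second story has no 'descendants' key.
stories = [
    {'title': 'Story A', 'descendants': 12},
    {'title': 'Story B'},
    {'title': 'Story C', 'descendants': 40},
]

# Use get() so that a missing comment count becomes 0 instead of a KeyError.
summaries = [
    {'title': story['title'], 'comments': story.get('descendants', 0)}
    for story in stories
]

# Sort from most to fewest comments.
summaries = sorted(summaries, key=itemgetter('comments'), reverse=True)

for summary in summaries:
    print(summary['title'], summary['comments'])
```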

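## 17.4 Summary

### In this chapter, you learned:
#### 1. How to use APIs to write self-contained programs that automatically gather the data they need and visualize it.
#### 2. How to use the GitHub API to explore the most-starred Python projects on GitHub, along with a brief look at the Hacker News API.
#### 3. How to use the requests package to automatically issue GitHub API calls, and how to process the results of those calls.
#### 4. A few Pygal settings that further customize the appearance of the charts you generate.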

### Try It Yourself

#### 17-1 Other Languages

In [11]:
# Import the required modules.
import requests
import pygal
from pygal.style import LightColorizedStyle as LCS, LightenStyle as LS

# Make an API call, and store the response.
url = 'https://api.github.com/search/repositories?q=language:javascript&sort=stars'
r = requests.get(url)
print("Status code:", r.status_code)

# Store API response in a variable.
response_dict = r.json()
print("Total repositories:", response_dict['total_count'])

# Explore information about the repositories.
repo_dicts = response_dict['items']

names, plot_dicts = [], []
for repo_dict in repo_dicts:
    names.append(repo_dict['name'])

    # When a project is removed, it's still listed with stars.
    #   So it's in the top projects, but has no description. The description
    #   is None, which causes an exception when being used as a label.
    if repo_dict['description']:
        desc = repo_dict['description']
    else:
        desc = 'No description provided.'
    
    plot_dict = {
        'value': repo_dict['stargazers_count'],
        'label': desc,
        'xlink': repo_dict['html_url'],
        }
    plot_dicts.append(plot_dict)

# Make visualization.
my_style = LS('#333366', base_style=LCS)
my_style.title_font_size = 24
my_style.label_font_size = 14
my_style.major_label_font_size = 18

my_config = pygal.Config()
my_config.x_label_rotation = 45
my_config.show_legend = False
my_config.truncate_label = 15
my_config.show_y_guides = False
my_config.width = 1000

chart = pygal.Bar(my_config, style=my_style)
chart.title = 'Most-Starred JavaScript Projects on GitHub'
chart.x_labels = names

chart.add('', plot_dicts)
# Use a raw string so the backslashes in the Windows path are not treated as escapes.
chart.render_to_file(r'E:\VSCode_work\chapter17\js_repos.svg')

Status code: 200
Total repositories: 11225716


![javascript_17_1.PNG](./result/javascript_17_1.PNG)

#### 17-2 Active Discussions

In [12]:
# Import modules.
import requests
import pygal
from pygal.style import LightColorizedStyle as LCS, LightenStyle as LS
from operator import itemgetter

# Make an API call, and store the response.
url = 'https://hacker-news.firebaseio.com/v0/topstories.json'
r = requests.get(url)
print("Status code:", r.status_code)

# Process information about each submission.
submission_ids = r.json()
submission_dicts = []
for submission_id in submission_ids[:30]:
    # Make a separate API call for each submission.
    url = ('https://hacker-news.firebaseio.com/v0/item/' +
            str(submission_id) + '.json')
    submission_r = requests.get(url)
    print(submission_r.status_code)
    response_dict = submission_r.json()
    
    submission_dict = {
        'title': response_dict['title'],
        'link': 'http://news.ycombinator.com/item?id=' + str(submission_id),
        'comments': response_dict.get('descendants', 0)
        }
    submission_dicts.append(submission_dict)
    
submission_dicts = sorted(submission_dicts, key=itemgetter('comments'),
                            reverse=True)

for submission_dict in submission_dicts:
    print("\nTitle:", submission_dict['title'])
    print("Discussion link:", submission_dict['link'])
    print("Comments:", submission_dict['comments'])

titles, plot_dicts = [], []
for submission_dict in submission_dicts:
    titles.append(submission_dict['title'])
    plot_dict = {
        'value': submission_dict['comments'],
        'label': submission_dict['title'],
        'xlink': submission_dict['link'],
        }
    plot_dicts.append(plot_dict)

# Make visualization.
my_style = LS('#333366', base_style=LCS)
my_style.title_font_size = 24
my_style.label_font_size = 14
my_style.major_label_font_size = 18

my_config = pygal.Config()
my_config.x_label_rotation = 45
my_config.show_legend = False
my_config.truncate_label = 15
my_config.show_y_guides = False
my_config.width = 1000
my_config.y_title = 'Number of Comments'

chart = pygal.Bar(my_config, style=my_style)
chart.title = 'Most Active Discussions on Hacker News'
chart.x_labels = titles

chart.add('', plot_dicts)
chart.render_to_file('hn_discussions_17_2.svg')

ConnectionError: HTTPSConnectionPool(host='hacker-news.firebaseio.com', port=443): Max retries exceeded with url: /v0/topstories.json (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x0000025DB078E0A0>: Failed to establish a new connection: [WinError 10060] A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond'))

### Note: For reasons that are unclear, API calls to Hacker News did not go through from this environment.

Official Python documentation in Chinese: https://docs.python.org/zh-cn/3/