# Built-In Data Structures

We have seen Python's simple types: ``int``, ``float``, ``complex``, ``bool``, ``str``, and so on.
Python also has several built-in compound types, which act as containers for other types.
These compound types are:

| Type Name | Example                   | Description                                              |
|-----------|---------------------------|----------------------------------------------------------|
| ``list``  | ``[1, 2, 3]``             | Ordered collection                                       |
| ``tuple`` | ``(1, 2, 3)``             | Immutable ordered collection                             |
| ``dict``  | ``{'a':1, 'b':2, 'c':3}`` | (key, value) mapping; insertion-ordered since Python 3.7 |
| ``set``   | ``{1, 2, 3}``             | Unordered collection of unique values                    |

As you can see, round, square, and curly brackets have distinct meanings when it comes to the type of collection produced.
We'll take a quick tour of these data structures here.
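
As a small illustration of how the bracket style determines the type, the sketch below builds one object of each kind and prints its type name and length. The variable names are chosen only for this example. It also shows two easy-to-miss corner cases: empty curly braces produce a ``dict`` rather than a ``set``, and a one-element tuple needs a trailing comma.

```python
# One example of each built-in compound type (names are illustrative only)
as_list = [1, 2, 3]
as_tuple = (1, 2, 3)
as_dict = {'a': 1, 'b': 2, 'c': 3}
as_set = {1, 2, 3}

for obj in (as_list, as_tuple, as_dict, as_set):
    print(type(obj).__name__, len(obj))

# Empty curly braces create a dict; an empty set needs set()
empty_dict = {}
empty_set = set()

# A one-element tuple needs a trailing comma; plain parentheses only group
single = (1,)
not_a_tuple = (1)
```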

## Lists

Lists are the basic *ordered* and *mutable* data collection type in Python.
They can be defined with comma-separated values between square brackets; for example, here is a list of the first several prime numbers:

In [1]:
L = [2, 3, 5, 7]

Lists have a number of useful properties and methods available to them.
Here we'll take a quick look at some of the more common and useful ones:

In [2]:
# Length of a list
len(L)

4

In [3]:
# Append a value to the end
L.append(11)
L

[2, 3, 5, 7, 11]

In [4]:
# Addition concatenates lists
L + [13, 17, 19]

[2, 3, 5, 7, 11, 13, 17, 19]

In [5]:
# sort() method sorts in-place
L = [2, 5, 1, 6, 3, 4]
L.sort()
L

[1, 2, 3, 4, 5, 6]

In addition, there are many more built-in list methods; they are well covered in Python's [online documentation](https://docs.python.org/3/tutorial/datastructures.html).
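
To give a flavor of those additional methods, here is a minimal sketch using a handful of them on a small list. The selection is arbitrary and the variable names are only for illustration; the comments show the state of the list after each step.

```python
L = [2, 3, 5, 7]

L.insert(0, 1)         # insert 1 at index 0   -> [1, 2, 3, 5, 7]
L.extend([11, 13])     # append several values -> [1, 2, 3, 5, 7, 11, 13]
last = L.pop()         # remove and return the last value (13)
L.remove(1)            # remove the first occurrence of 1 -> [2, 3, 5, 7, 11]
position = L.index(5)  # index of the first occurrence of 5 (2)
count = L.count(7)     # number of times 7 appears (1)

# sorted() returns a new sorted list and leaves its argument unchanged,
# in contrast to the in-place sort() method
unsorted = [2, 5, 1, 6, 3, 4]
ordered = sorted(unsorted)
print(L, last, position, count, ordered)
```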

While we've been demonstrating lists containing values of a single type, one of the powerful features of Python's compound objects is that they can contain objects of *any* type, or even a mix of types. For example:

In [6]:
L = [1, 'two', 3.14, [0, 3, 5]]

This flexibility is a consequence of Python's dynamic type system.
Creating such a mixed sequence in a statically-typed language like C can be much more of a headache!
We see that lists can even contain other lists as elements.
Such type flexibility is an essential piece of what makes Python code relatively quick and easy to write.
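
As a quick check of the mixed list defined above, the following sketch inspects the type of each element and reaches into the nested list. The names ``type_names`` and ``inner_second`` are introduced only for this example.

```python
L = [1, 'two', 3.14, [0, 3, 5]]

# Each element keeps its own type
type_names = [type(item).__name__ for item in L]
print(type_names)  # ['int', 'str', 'float', 'list']

# The last element is itself a list, so it can be indexed again
inner_second = L[3][1]
print(inner_second)  # 3
```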

So far we've been considering manipulations of lists as a whole; another essential piece is the accessing of individual elements.
This is done in Python via *indexing* and *slicing*, which we'll explore next.

### List indexing and slicing

Python provides access to elements in compound types through *indexing* for single elements, and *slicing* for multiple elements.
As we'll see, both are indicated by a square-bracket syntax.
Suppose we return to our list of the first several primes:

In [7]:
L = [2, 3, 5, 7, 11]

Python uses *zero-based* indexing, so we can access the first and second elements using the following syntax:

In [8]:
L[0]

2

In [9]:
L[1]

3

Elements at the end of the list can be accessed with negative numbers, starting from -1:

In [10]:
L[-1]

11

In [11]:
L[-2]

7

You can visualize this indexing scheme this way:

![List Indexing Figure](fig/list-indexing.png)

Here values in the list are represented by large numbers in the squares; list indices are represented by small numbers above and below.
In this case, ``L[2]`` returns ``5``, because that is the value stored at index ``2``.

Where *indexing* is a means of fetching a single value from the list, *slicing* is a means of accessing multiple values in sub-lists.
It uses a colon to indicate the start point (inclusive) and end point (non-inclusive) of the sub-list.
For example, to get the first three elements of the list, we can write:

In [12]:
L[0:3]

[2, 3, 5]

Notice where ``0`` and ``3`` lie in the preceding diagram, and how the slice takes just the values between the indices.
If we leave out the first index, ``0`` is assumed, so we can equivalently write:

In [13]:
L[:3]

[2, 3, 5]

Similarly, if we leave out the last index, it defaults to the length of the list.
Thus, the last three elements can be accessed as follows:

In [14]:
L[-3:]

[5, 7, 11]

Finally, it is possible to specify a third integer that represents the step size; for example, to select every second element of the list (those at even indices), we can write:

In [15]:
L[::2]  # equivalent to L[0:len(L):2]

[2, 5, 11]

A particularly useful version of this is to specify a negative step, which will reverse the list:

In [16]:
L[::-1]

[11, 7, 5, 3, 2]
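
The slicing rules above combine freely. The sketch below collects a few combinations on the same list of primes; the variable names are illustrative. It also shows one difference between the two operations that is easy to overlook: a slice that runs past the end of the list is silently truncated, whereas indexing past the end raises an ``IndexError``.

```python
L = [2, 3, 5, 7, 11]

middle = L[1:4]           # indices 1, 2, 3      -> [3, 5, 7]
odd_positions = L[1::2]   # indices 1, 3         -> [3, 7]
reversed_tail = L[:1:-1]  # indices 4, 3, 2      -> [11, 7, 5]
beyond = L[2:100]         # truncated at the end -> [5, 7, 11]

# Indexing (unlike slicing) fails when the index is out of range
try:
    L[10]
    out_of_range = False
except IndexError:
    out_of_range = True

print(middle, odd_positions, reversed_tail, beyond, out_of_range)
```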

Both indexing and slicing can be used to set elements as well as access them.
The syntax is as you would expect:

In [17]:
L[0] = 100
print(L)

[100, 3, 5, 7, 11]


In [18]:
L[1:3] = [55, 56]
print(L)

[100, 55, 56, 7, 11]
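
Two further details of slice assignment are worth a brief sketch; the variable names are only for illustration. First, the replacement does not need to have the same length as the slice it replaces, so the list can grow or shrink. Second, a slice such as ``original[:]`` produces a new list (a shallow copy), while plain assignment only creates a second name for the same list.

```python
L = [100, 55, 56, 7, 11]

# Replace two elements with four: the list grows
L[1:3] = [1, 2, 3, 4]
print(L)  # [100, 1, 2, 3, 4, 7, 11]

# Assigning an empty list to a slice removes those elements
L[1:5] = []
print(L)  # [100, 7, 11]

# del removes a single element by index
del L[0]
print(L)  # [7, 11]

# Plain assignment shares the list; a full slice copies it
original = [2, 3, 5]
alias = original
copy = original[:]
original[0] = 99
print(alias, copy)  # [99, 3, 5] [2, 3, 5]
```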


A very similar slicing syntax is also used in many data science-oriented packages, including NumPy and Pandas (mentioned in the introduction).

Now that we have seen Python lists and how to access elements in ordered compound types, let's take a look at the other three standard compound data types mentioned earlier.

## Tuples

Tuples are in many ways similar to lists, but they are defined with parentheses rather than square brackets:

In [19]:
t = (1, 2, 3)

They can also be defined without any brackets at all:

In [20]:
t = 1, 2, 3
print(t)

(1, 2, 3)


Like the lists discussed before, tuples have a length, and individual elements can be extracted using square-bracket indexing:

In [21]:
len(t)

3

In [22]:
t[0]

1

The main distinguishing feature of tuples is that they are *immutable*: this means that once they are created, their size and contents cannot be changed:

In [23]:
t[1] = 4

TypeError: 'tuple' object does not support item assignment

In [24]:
t.append(4)

AttributeError: 'tuple' object has no attribute 'append'
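
The sketch below explores what immutability does and does not imply; the example values and names are invented for illustration. A tuple cannot be modified, but a new tuple can be built from it. Because a tuple of immutable values is hashable, it can serve as a dictionary key, which a list cannot. Immutability is also shallow: a mutable object stored inside a tuple can still be changed.

```python
t = (1, 2, 3)

# Item assignment fails, as shown above
try:
    t[1] = 4
except TypeError as err:
    message = str(err)

# Concatenation builds a new tuple and leaves t untouched
extended = t + (4,)

# Tuples of immutable values can be used as dictionary keys
distances = {(0, 0): 0.0, (3, 4): 5.0}
origin_distance = distances[(0, 0)]

# Immutability is shallow: the list inside the tuple can still change
mixed = (1, [2, 3])
mixed[1].append(4)

print(message)
print(extended, origin_distance, mixed)
```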

Tuples are often used in Python programs; a particularly common case is in functions that have multiple return values.
For example, the ``as_integer_ratio()`` method of floating-point objects returns a numerator and a denominator; this dual return value comes in the form of a tuple:

In [25]:
x = 0.125
x.as_integer_ratio()

(1, 8)

These multiple return values can be individually assigned as follows:

In [26]:
numerator, denominator = x.as_integer_ratio()
print(numerator / denominator)

0.125
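
The same unpacking pattern appears in several other places. As a minimal sketch, the code below uses the built-in ``divmod`` function, a swap of two variables, starred unpacking, and a small hypothetical helper ``min_max`` (defined here only for illustration) that returns two values.

```python
# divmod returns a (quotient, remainder) tuple
quotient, remainder = divmod(17, 5)

# Swapping two variables relies on tuple packing and unpacking
a, b = 1, 2
a, b = b, a

# A starred name collects the remaining values into a list
first, *rest = (2, 3, 5, 7)

# A hypothetical helper returning two values as a tuple
def min_max(values):
    return min(values), max(values)

smallest, largest = min_max([4, 1, 9, 6])
print(quotient, remainder, a, b, first, rest, smallest, largest)
```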


The indexing and slicing logic covered earlier for lists works for tuples as well, along with a host of other methods.
Refer to the online [Python documentation](https://docs.python.org/3/tutorial/datastructures.html) for a more complete list of these.

## Dictionaries

Dictionaries are extremely flexible mappings of keys to values, and form the basis of much of Python's internal implementation.
They can be created via a comma-separated list of ``key:value`` pairs within curly braces:

In [27]:
numbers = {'one':1, 'two':2, 'three':3}

Items are accessed and set via the indexing syntax used for lists and tuples, except here the index is not a zero-based position but a valid key in the dictionary:

In [28]:
# Access a value via the key
numbers['two']

2

New items can be added to the dictionary using indexing as well:

In [29]:
# Set a new key:value pair
numbers['ninety'] = 90
print(numbers)

{'one': 1, 'two': 2, 'three': 3, 'ninety': 90}


Keep in mind that dictionary items are looked up by key, not by position.
Since Python 3.7, dictionaries preserve the order in which keys were inserted, as in the output above; earlier versions made no ordering guarantee and could print the items in an arbitrary order.
Dictionaries are implemented very efficiently, so that random element access is very fast, regardless of the size of the dictionary (if you're curious how this works, read about the concept of a *hash table*).
The [Python documentation](https://docs.python.org/3/library/stdtypes.html) has a complete list of the methods available for dictionaries.
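
As a short sample of those methods, the sketch below tests for a key, looks up a key that may be missing, loops over the items, and removes entries. The chosen keys and variable names are only for illustration.

```python
numbers = {'one': 1, 'two': 2, 'three': 3}

# Membership tests look at the keys
has_two = 'two' in numbers

# Indexing a missing key raises KeyError; get() returns a default instead
missing = numbers.get('four')        # None
fallback = numbers.get('four', 0)    # 0

# items() yields (key, value) tuples, which can be unpacked in a loop
for key, value in numbers.items():
    print(key, value)

# pop() removes a key and returns its value; del removes without returning
removed = numbers.pop('one')
del numbers['two']
remaining_keys = list(numbers)
print(remaining_keys)  # ['three']
```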

## Sets

The fourth basic collection is the set, an unordered collection of unique items.
Sets are defined much like lists and tuples, except they use the curly brackets of dictionaries:

In [30]:
primes = {2, 3, 5, 7}
odds = {1, 3, 5, 7, 9}

If you're familiar with the mathematics of sets, you'll be familiar with operations like the union, intersection, difference, symmetric difference, and others.
Python's sets have all of these operations built in, via methods or operators.
For each, we'll show the two equivalent forms:

In [31]:
# union: items appearing in either
primes | odds      # with an operator
primes.union(odds) # equivalently with a method

{1, 2, 3, 5, 7, 9}

In [32]:
# intersection: items appearing in both
primes & odds             # with an operator
primes.intersection(odds) # equivalently with a method

{3, 5, 7}

In [33]:
# difference: items in primes but not in odds
primes - odds           # with an operator
primes.difference(odds) # equivalently with a method

{2}

In [34]:
# symmetric difference: items appearing in only one set
primes ^ odds                     # with an operator
primes.symmetric_difference(odds) # equivalently with a method

{1, 2, 9}
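
Beyond these four operations, a few everyday uses of sets can be sketched as follows; the values and names are chosen only for illustration. Converting a list to a set removes duplicates, membership and subset tests use ``in`` and ``<=``, and sets can be modified in place.

```python
primes = {2, 3, 5, 7}
odds = {1, 3, 5, 7, 9}

# Building a set from a list drops duplicate values
unique = set([3, 1, 3, 2, 1])

# Membership and subset tests
has_seven = 7 in primes
is_subset = {3, 5} <= odds

# Sets are mutable: add() inserts, discard() removes if present
primes.add(11)
primes.discard(4)   # 4 is not in the set, so nothing happens

# In-place union with another set
primes |= {13}

odd_primes = primes & odds
print(sorted(unique), has_seven, is_subset, sorted(primes), sorted(odd_primes))
```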

Many more set methods and operations are available.
You've probably already guessed what I'll say next: refer to Python's [online documentation](https://docs.python.org/3/library/stdtypes.html) for a complete reference.

## More Specialized Data Structures

Python contains several other data structures that you might find useful; these can generally be found in the standard library's ``collections`` module.
The collections module is fully documented in [Python's online documentation](https://docs.python.org/3/library/collections.html), and you can read more about the various objects available there.

In particular, I've found the following very useful on occasion:

- ``collections.namedtuple``: Like a tuple, but each value has a name
- ``collections.defaultdict``: Like a dictionary, but unspecified keys have a user-specified default value
- ``collections.OrderedDict``: Like a dictionary, but the order of keys is maintained

Once you've seen the standard built-in collection types, the use of these extended functionalities is very intuitive, and I'd suggest [reading about their use](https://docs.python.org/3/library/collections.html).
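
As a brief taste of the three types listed above, here is a minimal sketch. The ``Point`` type, the word list, and the variable names are invented for this example.

```python
from collections import namedtuple, defaultdict, OrderedDict

# namedtuple: fields can be read by name or by position
Point = namedtuple('Point', ['x', 'y'])
p = Point(x=1, y=2)
print(p.x, p[1])  # 1 2

# defaultdict: a missing key starts from the default value (int() is 0)
counts = defaultdict(int)
for word in ['a', 'b', 'a']:
    counts[word] += 1
print(dict(counts))  # {'a': 2, 'b': 1}

# OrderedDict: keeps key order and offers order-specific methods
od = OrderedDict([('one', 1), ('two', 2)])
od.move_to_end('one')
print(list(od))  # ['two', 'one']
```

Since regular dictionaries preserve insertion order in Python 3.7 and later, ``OrderedDict`` is now mostly useful for its extra methods such as ``move_to_end``.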