## Introduction

We have thus far avoided discussing *types* directly. The *type* of a variable is the kind of object that the variable is associated with. This affects how a computer stores the object in memory, and how operations, such as multiplication and division, are performed.

In *statically typed* languages, like C and C++, types come up from the very beginning because
you usually need to specify types explicitly in your programs. Python is a *dynamically typed* language, which means types are deduced when a program is run. This is why we have been able to postpone the discussion until now.
It is important to have a basic understanding of types, and how types can affect how your programs behave. One can go very deep into this topic, especially for numerical computations, but we will cover the general concept from a high level,
show some examples, and highlight some potential pitfalls for engineering computations.

This is a dry topic, but it contains important background information that you need to know for later, so hang in there. The account below highlights what can go wrong without an awareness of types and how computers process numbers.

### Patriot Missile failure and the Ariane 5 explosion

There have been numerous accidents due to programs not correctly handling types, type conversions and floating point arithmetic. Here are two examples:

1. In 1991, a US Patriot missile failed to intercept an Iraqi Scud missile at Dhahran in Saudi Arabia, leading to
   a loss of life. The subsequent investigation found that the Patriot missile failed to intercept the Scud
   missile due to a software flaw. The software developers did not account for the effects of 'floating point
   arithmetic'.
   This led to a small error in computing the time, which in turn caused the Patriot to miss the incoming Scud
   missile.

<img src="https://upload.wikimedia.org/wikipedia/commons/e/eb/Patriot_System_2.jpg" width="300" />

We will reproduce the precise mistake the developers of the Patriot Missile software made. See
https://en.wikipedia.org/wiki/MIM-104_Patriot#Failure_at_Dhahran for more background on the interception
failure.

2. Poor programming related to how computers store numbers led in 1996 to a European Space Agency *Ariane 5*
   unmanned rocket exploding shortly after lift-off. The rocket payload, worth US\$500 M,
   was destroyed. You can find background at https://en.wikipedia.org/wiki/Cluster_(spacecraft)#Launch_failure.
   We will reproduce their mistake, and show how a few lines of code would have saved over US\$500 M.

   <img src="https://upload.wikimedia.org/wikipedia/commons/thumb/3/3c/Ariane_5ES_with_ATV_4_on_its_way_to_ELA-3.jpg/320px-Ariane_5ES_with_ATV_4_on_its_way_to_ELA-3.jpg" width="200" />

### Background: bits and bytes

An important part of understanding types is appreciating how computer storage works. Computer memory is made up of *bits*, and each bit can take on one of two
values - 0 or 1. A bit is the smallest building block of memory.
Bits are very fine-grained, so for many computer architectures the smallest 'block' we can normally work with is a *byte*. One byte is made up of 8 bits. This is why, when we talk about bits, e.g. a 64-bit operating system, the number of bits will almost always be a multiple of 8 (one byte).

The 'bigger' a thing we want to store, the more bytes we need. This is important for engineering computations since the number of bytes used to store a number determines the accuracy with which the number can be stored,
and how big or small the number can be. The more bytes the greater the accuracy, but the price to be paid is higher memory usage. Also, it can be more expensive to perform operations like multiplication and division when using more bytes.
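
As a rough illustration of how the number of bytes bounds what can be stored, the sketch below tabulates the largest signed integer for a few storage sizes. It assumes that one bit is reserved for the sign; the helper name `max_signed_int` is made up for this example.

```python
def max_signed_int(num_bytes):
    """Largest signed integer that fits in num_bytes bytes (one bit is used for the sign)."""
    num_bits = 8 * num_bytes
    return 2**(num_bits - 1) - 1

# Each extra byte multiplies the representable range by 2**8 = 256
for num_bytes in (1, 2, 4, 8):
    print(num_bytes, "byte(s):", max_signed_int(num_bytes))
```

The 2-byte and 4-byte values are the 16-bit and 32-bit limits that appear in the incidents discussed in this notebook.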

**XUE.cn exercises**:

- [Multiple choice ★★ How many bytes does the UTF-8 string encoding use to represent one Chinese character?](https://xue.cn/hub/app/exercise/821)

### Objectives

- Introduce primitive data types (booleans, strings and numerical types)
- Type inspection
- Basic type conversion
- Introduction to pitfalls of floating point arithmetic

## What is type?

All variables have a 'type', which indicates what the variable is, e.g. a number, a string of characters, etc. In 'statically typed' languages we usually need to be explicit in declaring the type of a variable in a program. In a dynamically typed language, such as Python, variables still have types but the interpreter can determine types dynamically.

Type is important because it determines how a variable is stored, how it behaves when we perform operations on it, and how it interacts with other variables. For example, multiplication of two real numbers is different from multiplication of two complex numbers.
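
To make the last point concrete, here is a small sketch (the variable names `p` and `q` are arbitrary). The same `*` operator follows different rules depending on the types of its operands:

```python
# Multiplication of two real numbers
p = 2.0 * 3.0
print(p, type(p))

# Multiplication of two complex numbers follows the complex product rule:
# (1 + 2j)(3 + 4j) = 3 + 4j + 6j + 8j^2 = -5 + 10j
q = (1 + 2j) * (3 + 4j)
print(q, type(q))
```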

## Introspection

Before getting into types, we look at how we can check the type in Python. A powerful feature of Python is *introspection*. This means that we can probe a program to ask about the type of a variable. To check
the type of a variable we use the function `type`:

In [1]:
x = True
print(type(x))

a = 1
print(type(a))

a = 1.0
print(type(a))

<class 'bool'>
<class 'int'>
<class 'float'>


Note that `a = 1` and `a = 1.0` are different types! This distinction is very important for numerical computations.
More on this further down.

Use `type` freely when exploring and testing, to develop an understanding for what your program is doing.

## Booleans

You have already seen the 'Boolean' type that can take on one of two values - true or false. This is the simplest type.

In [2]:
a = True
b = False
test = a or b # test will be True if a or b are True
print(test, type(test))

True <class 'bool'>


In principle, we could represent a boolean with just one bit (0 or 1 switch).
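
In practice Python does not store a boolean in a single bit. The sketch below is a quick way to see this; the size reported by `sys.getsizeof` depends on the Python implementation, so no particular value is assumed here.

```python
import sys

flag = True

# bool is a subclass of int in Python, so True behaves like 1 in arithmetic
print(isinstance(flag, int))
print(True + True)

# Size in bytes of the object that represents True; the exact figure
# depends on the Python implementation, but it is far more than one bit
print(sys.getsizeof(flag))
```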

**XUE.cn exercises**:

- [Multiple choice ★ Boolean values in Python](https://xue.cn/hub/app/exercise/63)

- [Multiple choice ★ Identifying Boolean values in Python](https://xue.cn/hub/app/exercise/70)

## Strings

A string is a collection of characters. We have been using strings in previous activities for printing informative messages. In Python we create a string using single or double quotes (the choice is personal preference), e.g.

In [None]:
my_string = 'This is a string.'

or

In [None]:
my_string = "This is a string."

Below we assign a string to a variable, display the string, and then check its type:

In [3]:
my_string = "This is a string."
print(my_string)
print(type(my_string))

This is a string.
<class 'str'>


We can perform many different operations on strings. We can extract a particular character as a new string:

In [4]:
# Get 3rd character (Python counts from zero)
s2 = my_string[2]
print(s2)
print(type(s2))

i
<class 'str'>


or extract a range of characters:

In [5]:
# Get first six characters, print and check type
s3 = my_string[0:6]
print(s3)
print(type(s3))

# Get last four characters and print
s4 = my_string[-4:]
print(s4)

This i
<class 'str'>
ing.


We can add strings together:

In [6]:
introduction = "My name is:"
name = "Joe"

personal_introduction = introduction + " " + name
print(personal_introduction)

My name is: Joe


We can also check the length (number of characters) of a string using `len`:

In [7]:
print(len(personal_introduction))

15


There are *many* more operations that can be performed on strings. We will see more in later activities.
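
As a small preview, here is a sketch of a few standard string methods. The example sentence is made up for illustration.

```python
sentence = "Types affect how programs behave."

print(sentence.upper())                     # convert to upper case
print(sentence.replace("Types", "Bugs"))    # replace a substring
print(sentence.split())                     # split into a list of words
print("programs" in sentence)               # test for a substring
print(sentence.find("affect"))              # index where a substring starts
```

Each of these returns a new object; the original string is left unchanged.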

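**XUE.cn exercises**:

- [True/false ★ The value of the expression 'a'+1 is 'b'. Is this statement correct?](https://xue.cn/hub/app/exercise/957)

- [Programming ★ Replace every 'are' in a string with 'were'](https://xue.cn/hub/app/exercise/6)

- [Multiple choice ★ The effect of r or R as a string prefix](https://xue.cn/hub/app/exercise/253)

- [Multiple choice ★ Which of the following statements about strings is wrong?](https://xue.cn/hub/app/exercise/332)

- [Multiple choice ★ What does the escape character '\n' mean?](https://xue.cn/hub/app/exercise/447)

- [Multiple choice ★ What is the value of the expression 'Hello world. I like Python.'.rfind('python')?](https://xue.cn/hub/app/exercise/547)

- [True/false ★ Given that x is a non-empty string, the expression ','.join(x.split(',')) == x is always True. Is this statement correct?](https://xue.cn/hub/app/exercise/1023)

- [Programming ★★★ Determine whether a number is a palindrome](https://xue.cn/hub/app/exercise/1168)

- [Programming ★★★ Print a diamond made of *](https://xue.cn/hub/app/exercise/1216)

- [Programming ★★★ Print a diamond made of \ and /](https://xue.cn/hub/app/exercise/1217)

- [Programming ★★★ Print a large C made of *](https://xue.cn/hub/app/exercise/1218)

- [Programming ★★ Help Xiao Ming correct the scores](https://xue.cn/hub/app/exercise/1229)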

## Numeric types

Numeric types are important in many computing applications, and particularly in scientific and engineering programs. Python 3 has three native numerical types:

- integers (`int`)
- floating point numbers (`float`)
- complex numbers (`complex`)

This is typical for most programming languages, although there can be some subtle differences.

### Integers

Integers (`int`) are whole numbers, and can be positive or negative. Integers should be used when a value can only take on a whole number, e.g. the year, or the number of students following this course. Python infers the type of a number from the way we input it. It will infer an `int` if we assign a number with no decimal point:

In [8]:
a = 2
print(type(a))

<class 'int'>


If we add a decimal point, the variable type becomes a `float` (more on this later):

In [9]:
a = 2.0
print(type(a))

<class 'float'>


Integer operations that result in an integer, such as multiplying or adding two integers, are performed exactly (there is no error). This does however depend on a variable having enough memory (sufficient bytes) to represent the result.

<font size=4 weight=600 >Integer storage and overflow</font>

In most languages, a fixed number of bits are used to store a given type of integer. In C and C++ a standard integer (`int`) is usually stored using 32 bits (it is possible to declare shorter and longer integer types).
The largest integer that can be stored using 32 bits is $2^{31} - 1 = 2,147,483,647$.
We explain later where this comes from. The message for now is that for a fixed number of bits, there is a bound on the largest number that can be represented/stored.

<font size=4>Integer overflow</font>

Integer overflow is when an operation creates an integer that is too big to be represented by the given integer type. For example, attempting to assign $2^{31} + 1$ to a 32-bit integer will cause an overflow and potentially unpredictable program response. This would usually be a *bug*.
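
What actually happens on overflow depends on the language and hardware. One common outcome, assumed here purely for illustration, is 'wrap-around': the value jumps from the largest positive number to the most negative one. The sketch below mimics this for a signed 32-bit integer using Python's unbounded integers; `wrap_to_int32` is a hypothetical helper, not a library function.

```python
def wrap_to_int32(n):
    """Mimic wrap-around of a signed 32-bit integer (an assumed overflow behaviour)."""
    return (n + 2**31) % 2**32 - 2**31

largest = 2**31 - 1
print(wrap_to_int32(largest))       # fits, so it is unchanged
print(wrap_to_int32(largest + 1))   # one too many: wraps to the most negative value
print(wrap_to_int32(2**31 + 1))
```

A large positive number silently turning into a large negative number is exactly the kind of unpredictable response that makes overflow bugs dangerous.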

The Ariane 5 rocket explosion in 1996 was caused by integer overflow. The rocket navigation software was taken from the older, slower Ariane 4 rocket. The program assigned the rocket speed to a 16-bit integer (the largest number a 16-bit integer can store is $2^{15} - 1 = 32767$), but the Ariane 5 could travel faster than the older generation of rocket and the speed value exceeded $32767$. The resulting integer overflow led to
failure of the rocket's navigation system and
explosion of the rocket; a very costly rocket and a very expensive payload were destroyed.
We will reproduce the error that caused this failure when we look at *type conversions*.

Python avoids integer overflows by dynamically changing the number of bits used to represent an integer. You can inspect the number of bits required to store an integer in binary (not including the bit for the sign) using the function [bit_length](https://docs.python.org/3/library/stdtypes.html#int.bit_length):

In [10]:
a = 8
print(type(a))
print(a.bit_length())

<class 'int'>
4


We see that 4 bits are necessary to represent the number 8. If we increase the size of the number dramatically by raising it to the power of 12:

In [11]:
b = a**12
print(b)
print(type(b))
print(b.bit_length())

68719476736
<class 'int'>
37


We see that 37 bits are required to represent the number. If the `int` type were limited to 32 bits for storing the value, this operation would have caused an overflow.

**XUE.cn exercises**:

- [Multiple choice ★★ What is the value of the expression int('123', 8)?](https://xue.cn/hub/app/exercise/453)

- [Multiple choice ★ What is the value of the expression int('123')?](https://xue.cn/hub/app/exercise/454)

<font size=4>Gangnam Style</font>

In 2014, Google switched from 32-bit integers to 64-bit integers to count views when the video "Gangnam Style" was viewed more than 2,147,483,647 times, which is the limit of 32-bit integers (see https://plus.google.com/+YouTube/posts/BUXfdWqu86Q ).

<font size=4>Boeing 787 Dreamliner bug</font>

Due to an integer overflow bug, the electricity generators on a Boeing 787 will shut down if the plane is
powered continuously for 248 days. The 'quick fix' was to make sure that
generator control units do not operate continuously for more than 248 days.
See
https://www.theguardian.com/business/2015/may/01/us-aviation-authority-boeing-787-dreamliner-bug-could-cause-loss-of-control and
https://s3.amazonaws.com/public-inspection.federalregister.gov/2015-10066.pdf for background.
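
The 248-day figure is consistent with one widely reported explanation, which is treated here as an assumption rather than a confirmed detail: a signed 32-bit counter that ticks every hundredth of a second. A quick check of the arithmetic:

```python
ticks_per_second = 100              # assumed counter resolution (hundredths of a second)
max_ticks = 2**31                   # a signed 32-bit counter overflows after this many ticks
seconds_per_day = 24 * 60 * 60

days_to_overflow = max_ticks / ticks_per_second / seconds_per_day
print(days_to_overflow)
```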

### Floating point storage

Most engineering calculations involve numbers that cannot be represented as integers. Numbers that have a
decimal point are stored using the `float` type. Computers store floating point numbers by storing the sign, the significand (also known as the mantissa) and the exponent, e.g. for $10.45$:

$$
10.45 = \underbrace{+}_{\text{sign}} \underbrace{1045}_{\text{significand}} \times \underbrace{10^{-2}}_{\text{exponent} = -2}
$$

Python uses 64 bits to store a `float` (in C and C++ this is known as a `double`). The sign requires one bit, and there are standards that specify how many bits should be used for the significand and how many for the exponent.

Since a finite number of bits are used to store a number, the precision with which numbers can be represented is limited. As a guide, using 64 bits a floating point number is precise to 15 to 17 significant figures.
More on this, and why the Patriot missile failed, later.
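
For the curious, the sketch below pulls apart the 64 bits of a float. It assumes the IEEE 754 double precision layout (1 sign bit, 11 exponent bits, 52 significand bits), which is what essentially all current hardware uses; note that the fields are in base 2, not base 10 as in the formula above. The helper `float_fields` is made up for this example.

```python
import struct
import sys

def float_fields(value):
    """Split a 64-bit float into its sign, exponent and fraction bit fields."""
    # Reinterpret the 8 bytes of the float as one unsigned 64-bit integer
    (bits,) = struct.unpack(">Q", struct.pack(">d", value))
    sign = bits >> 63                    # 1 bit
    exponent = (bits >> 52) & 0x7FF      # 11 bits, stored with an offset of 1023
    fraction = bits & ((1 << 52) - 1)    # 52 bits
    return sign, exponent, fraction

print(float_fields(1.0))
print(float_fields(-2.0))

# Bits of precision in the significand, and decimal digits that are always preserved
print(sys.float_info.mant_dig, sys.float_info.dig)
```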

<font size=4 weight=600 >Floats</font>

We can declare a float by adding a decimal point:

In [12]:
a = 2.0
print(a)
print(type(a))

b = 3.
print(b)
print(type(b))

2.0
<class 'float'>
3.0
<class 'float'>


or by using `e` or `E` (the choice between `e` and `E` is just a matter of taste):

In [13]:
a = 2e0
print(a, type(a))

b = 2e3
print(b, type(b))

c = 2.1E3
print(c, type(c))

2.0 <class 'float'>
2000.0 <class 'float'>
2100.0 <class 'float'>


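**XUE.cn exercises**:

- [Multiple choice ★ Understanding the floating point type](https://xue.cn/hub/app/exercise/245)

- [Multiple choice ★ Which of the following descriptions of the floating point type in the Python language is wrong?](https://xue.cn/hub/app/exercise/351)

- [Multiple choice ★ Which of the following descriptions of Python's floating point type is wrong?](https://xue.cn/hub/app/exercise/359)

- [Multiple choice ★ Which of the following options correctly describes whether the float 0.0 and the integer 0 are the same?](https://xue.cn/hub/app/exercise/412)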

<font size=4 weight=600 >Complex numbers</font>

A complex number is a more elaborate float with two parts - the real and imaginary components. We can declare a complex number in Python by adding `j` or `J` after the imaginary part of the number:

In [14]:
a = 2j
print(a, type(a))

b = 4 - 3j
print(b, type(b))

2j <class 'complex'>
(4-3j) <class 'complex'>


The usual addition, subtraction, multiplication and division operations can all be performed on complex numbers. The real and imaginary parts can be extracted:

In [15]:
print(b.imag)
print(b.real)

-3.0
4.0


and the complex conjugate can be taken:

In [16]:
print(b.conjugate())

(4+3j)


We can compute the modulus of a complex number using `abs`:

In [17]:
print(abs(b))

5.0


More generally, `abs` returns the absolute value, e.g.:

In [18]:
a = -21.6
a = abs(a)
print(a)

21.6


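**XUE.cn exercises**:

- [Multiple choice ★ The type of complex numbers](https://xue.cn/hub/app/exercise/242)

- [Multiple choice ★ Which of the following statements about complex numbers in Python is wrong?](https://xue.cn/hub/app/exercise/329)

- [Multiple choice ★ Which of the following descriptions of Python's numeric types is wrong?](https://xue.cn/hub/app/exercise/358)

- [Multiple choice ★ With real part 3 and imaginary part 4, how is the complex number written in Python?](https://xue.cn/hub/app/exercise/425)

- [True/false ★ 3+4j is not a valid Python expression. Is this statement correct?](https://xue.cn/hub/app/exercise/868)

- [True/false ★ 3+4j is a valid Python numeric type. Is this statement correct?](https://xue.cn/hub/app/exercise/874)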

## Type conversions (casting)

We can often change between types. This is called *type conversion* or *type casting*. In some cases it happens implicitly, and in other cases we can instruct our program to change the type.

If we add two integers, the result will be an integer:

In [19]:
a = 4
b = 15
c = a + b
print(c, type(c))

19 <class 'int'>


However, if we add an `int` and a `float`, the result will be a float:

In [20]:
a = 4
b = 15.0 # Adding the '.0' tells Python that it is a float
c = a + b
print(c, type(c))

19.0 <class 'float'>


If we divide two integers, the result will be a `float`:

In [21]:
a = 16
b = 4
c = a/b
print(c, type(c))
b = 2

4.0 <class 'float'>


When dividing two integers, we can do 'integer division' using `//`, e.g.

In [22]:
a = 16
b = 3
c = a//b
print(c, type(c))

5 <class 'int'>


in which case the result is an `int`.

In general, operations that mix an `int` and `float` will generate a `float`, and operations that mix an `int` or a `float` with `complex` will return a `complex` type. If in doubt, use `type` to experiment and check.
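
The rules above can be checked directly. The sketch below also shows two details of `//` that are easy to overlook: it returns a `float` if either operand is a `float`, and it rounds down rather than towards zero.

```python
print(type(2 + 3.0))        # int and float
print(type(2 + 3j))         # int and complex
print(type(2.0 * 3j))       # float and complex

# '//' returns a float if either operand is a float
print(16.0 // 3, type(16.0 // 3))

# '//' rounds down (towards minus infinity), which matters for negative numbers
print(-7 // 2)
```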

### Explicit type conversion

We can explicitly change the type (perform a cast), e.g. cast from an `int` to a `float`:

In [23]:
a = 1
print(a, type(a))

a = float(a) # This converts the int associated with 'a' to a float, and assigns the result to the variable 'a'
print(a, type(a))

1 <class 'int'>
1.0 <class 'float'>


Going the other way,

In [24]:
y = 1.99
print(y, type(y))

z = int(y)
print(z, type(z))

1.99 <class 'float'>
1 <class 'int'>


Note that rounding is applied when converting from a `float` to an `int`; the values after the decimal point are discarded. This type of rounding is called 'round towards zero' or 'truncation'.
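
Truncation is only one of several rounding rules, and the differences show up most clearly for negative numbers. A short comparison using the built-in `round` and the standard `math` module:

```python
import math

y = -1.99
print(int(y))          # truncation: rounds towards zero
print(math.floor(y))   # rounds down
print(math.ceil(y))    # rounds up
print(round(y))        # rounds to the nearest integer
print(round(2.5))      # ties go to the nearest even integer
```

If a calculation depends on a particular rounding rule, it is safer to call the corresponding function explicitly than to rely on `int`.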

A common task is converting numerical types to-and-from strings. We might read a number from a file as a string, or a user might input a value which Python reads in as a string. Converting a float to a string:

In [25]:
a = 1.023
b = str(a)
print(b, type(b))

1.023 <class 'str'>


and in the other direction:

In [26]:
a = "15.07"
b = "18.07"

print(a + b)
print(float(a) + float(b))

15.0718.07
33.14


If we tried

In [None]:
print(int(a) + int(b))

we would get an error (a `ValueError`), because strings such as `'15.07'` cannot be converted directly to `int`. It works in the case:

In [27]:
a = "15"
b = "18"
print(int(a) + int(b))

33


since these strings can be correctly cast to integers.
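
When the input comes from a file or a user, we often do not know in advance whether a string holds a whole number. One possible way to handle this, sketched below, is to catch the `ValueError` raised by `int` and fall back to converting via `float`. The helper `to_int` is hypothetical, and truncating the fractional part is a design choice that may not suit every program.

```python
def to_int(text):
    """Convert text to an int, going via float if needed (hypothetical helper)."""
    try:
        return int(text)
    except ValueError:
        # e.g. "15.07" is not a valid integer literal, so convert to float and truncate
        return int(float(text))

print(to_int("15"))
print(to_int("15.07"))
```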

**XUE.cn exercises**:

- [Programming ★★ Converting numbers](https://xue.cn/hub/app/exercise/85)

- [Multiple choice ★ What is the value of the expression int(4**0.5)?](https://xue.cn/hub/app/exercise/464)

- [Multiple choice ★ What is the value of the expression int(str(34)) == 34?](https://xue.cn/hub/app/exercise/718)

- [Multiple choice ★★ What is the value of the expression print(0b10101)?](https://xue.cn/hub/app/exercise/745)

### Ariane 5 rocket explosion and type conversion

The Ariane 5 rocket explosion was caused by an integer overflow. The speed of the rocket was stored as a 64-bit float, and this was converted in the navigation software to a 16-bit integer. However, the value of the float was greater than $32767$, the largest number a 16-bit integer can represent, and this led to an overflow that in turn caused the navigation system to fail and the rocket to explode.

We can demonstrate what happened in the rocket program. We consider a speed of $40000.54$ (units are not relevant to what is being demonstrated), stored as a `float` (64 bits):

In [28]:
speed_float = 40000.54

If we first convert the float to a 32-bit `int` (we use NumPy to get integers with a fixed number of bits, more on NumPy in a later notebook):

In [29]:
import numpy as np
speed_int = np.int32(speed_float) # Convert the speed to a 32-bit int
print(speed_int)

40000


The conversion behaves as we would expect. Now, if we convert the speed from the `float` to a 16-bit integer:

In [30]:
speed_int = np.int16(speed_float)
print(speed_int)

-25536


We see clearly the result of an integer overflow since the 16-bit integer has too few bits to represent the number
40000.

The Ariane 5 failure would have been averted with pre-launch testing and the following few lines:

In [31]:
if abs(speed_float) > np.iinfo(np.int16).max:
    print("***Error, cannot assign speed to 16-bit int. Will cause overflow.")
    # Call command here to exit program
else:
    speed_int = np.int16(speed_float)

***Error, cannot assign speed to 16-bit int. Will cause overflow.


These few lines and careful testing would have saved the US\$500 M payload and the cost of the rocket.

The Ariane 5 incident is an example not only of a poor piece of programming, but also very poor testing and software engineering. Careful pre-launch testing of the software would have detected this problem. The program should have checked the value of the velocity before performing the conversion, and triggered an error message that the type conversion would cause an overflow.
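
The same idea can be sketched without NumPy as a reusable function that raises an exception instead of printing a message. This is an illustration only, not the fix that was applied to the flight software; the names `to_int16_checked` and `wrap_to_int16` are made up, and the wrap-around model is an assumption that happens to reproduce the value seen above.

```python
INT16_MIN = -2**15       # -32768
INT16_MAX = 2**15 - 1    #  32767

def to_int16_checked(value):
    """Truncate value to an integer, refusing values a 16-bit integer cannot hold."""
    result = int(value)
    if not INT16_MIN <= result <= INT16_MAX:
        raise OverflowError(f"{value} does not fit in a 16-bit integer")
    return result

def wrap_to_int16(value):
    """What an unchecked conversion with wrap-around would give (assumed behaviour)."""
    return (int(value) + 2**15) % 2**16 - 2**15

print(to_int16_checked(1234.5))
print(wrap_to_int16(40000.54))   # 40000 - 2**16 = -25536
try:
    to_int16_checked(40000.54)
except OverflowError as error:
    print("Conversion refused:", error)
```

Raising an exception forces the calling code to deal with the problem, whereas a wrapped value carries on silently through the rest of the program.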

## Binary representation and floating point arithmetic

### Binary (base 2) representation

Computers store data using 'bits', and a bit is a switch that can have a value of 0 or 1. This means that computers store numbers in binary (base 2), whereas we almost always work with decimal numbers (base 10).
For example, the binary number $110$ is equal to $0 \times 2^{0} + 1 \times 2^{1} + 1 \times 2^{2} = 6$
(read $110$ right-to-left).
Below is a table with decimal (base 10) and the corresponding binary (base 2) representation of some numbers. See <https://en.wikipedia.org/wiki/Binary_number> if you want to learn more.

| Decimal | Binary |
| ------- | ------ |
| 0 | 0 |
| 1 | 1 |
| 2 | 10 |
| 3 | 11 |
| 4 | 100 |
| 5 | 101 |
| 6 | 110 |
| 7 | 111 |
| 8 | 1000 |
| 9 | 1001 |
| 10 | 1010 |
| 11 | 1011 |
| 12 | 1100 |
| 13 | 1101 |
| 14 | 1110 |
| 15 | 1111 |
To represent any integer, all we need are enough bits to store the binary representation. If we have $n$ bits, the largest number we can store is $2^{n -1} - 1$ (the power is $n - 1$ because we use one bit to store the sign of the integer).

We can display the binary representation of an integer in Python using the function `bin`:

In [32]:
print(bin(2))
print(bin(6))
print(bin(110))

0b10
0b110
0b1101110


The prefix `0b` denotes that the representation is binary.
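
Going the other way, from binary digits to an integer, follows the right-to-left sum of powers of two described earlier. The sketch below spells this out with a made-up helper, `binary_to_int`, and compares it with the built-in conversion.

```python
def binary_to_int(digits):
    """Convert a string of binary digits, e.g. '110', to an integer."""
    total = 0
    # Read right-to-left: the rightmost digit multiplies 2**0
    for power, digit in enumerate(reversed(digits)):
        total += int(digit) * 2**power
    return total

print(binary_to_int("110"))
print(int("110", 2))        # built-in conversion from a base-2 string
print(format(6, "b"))       # binary digits without the '0b' prefix
```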

**XUE.cn exercises**:

- [Multiple choice ★★ What is the value of the expression int(bin(54321), 2)?](https://xue.cn/hub/app/exercise/797)

- [Multiple choice ★★ What is the value of the expression int('101',2)?](https://xue.cn/hub/app/exercise/455)

### Floating point numbers

We introduced the representation

$$
10.45 = \underbrace{+}_{\text{sign}} \underbrace{1045}_{\text{significand}} \times \underbrace{10^{-2}}_{\text{exponent}}
$$

earlier. However, this was a little misleading because computers do not use base 10
to store the significand and the exponent, but base 2.

When using the familiar base 10, we cannot represent $1/3$ exactly as a decimal. If we liked using base 3 (ternary numeral system) for our mental arithmetic (which we really don't), we could represent $1/3$ exactly. However, fractions that are simple to represent exactly in base 10 might not be representable in another base.
A consequence is that fractions that are simple in base 10 cannot necessarily be represented exactly by computers using binary.

A classic example is $1/10 = 0.1$. This simple number cannot be represented exactly in
binary. In contrast, $1/2 = 0.5$ can be represented exactly. To explore this, let's assign the number 0.1 to the variable `x` and print the result:

In [33]:
x = 0.1
print(x)

0.1


This looks fine, but `print` is hiding some details. Asking `print` to display 30 decimal places, we see that `x` is not exactly 0.1:

In [34]:
print('{0:.30f}'.format(x))

0.100000000000000005551115123126


The difference between 0.1 and the binary representation is the *roundoff error* (we'll look at print formatting syntax in a later activity). From the above, we can see that the representation is accurate to about 17 significant figures.

Checking for 0.5, we see that it appears to be represented exactly:

In [35]:
print('{0:.30f}'.format(0.5))

0.500000000000000000000000000000


The round-off error for the 0.1 case is small, and in many cases will not present a problem. However, sometimes round-off errors can accumulate and destroy accuracy.
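
A well-known everyday consequence is that comparing floats with `==` can give surprising answers. The sketch below uses the standard `fractions` module to show the exact value that is stored for a float, and `math.isclose` as one common way to compare floats with a tolerance.

```python
import math
from fractions import Fraction

# Fraction(float) gives the exact value that is stored for the float
print(Fraction(0.1))
print(Fraction(0.1) == Fraction(1, 10))   # the stored value is not exactly 1/10
print(Fraction(0.5) == Fraction(1, 2))    # 0.5 is stored exactly

# A typical consequence: comparing floats with '==' can surprise
print(0.1 + 0.2 == 0.3)
print(math.isclose(0.1 + 0.2, 0.3))       # compare with a tolerance instead
```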

<font size=4 weight=600 >Example: inexact representation</font>

It is trivial that

$$
x = 11x - 10x
$$

If $x = 0.1$, we can write

$$
x = 11x - 1
$$

Now, starting with $x = 0.1$ we evaluate the right-hand side to get a 'new' $x$, and use this new $x$ to then evaluate the right-hand side again. The arithmetic is trivial: $x$ should remain equal to $0.1$.
We test this in a program that repeats this process 20 times:

In [36]:
x = 0.1
for i in range(20):
    x = x*11 - 1
    print(x)

0.10000000000000009
0.10000000000000098
0.10000000000001075
0.10000000000011822
0.10000000000130038
0.1000000000143042
0.10000000015734622
0.10000000173080847
0.10000001903889322
0.10000020942782539
0.10000230370607932
0.10002534076687253
0.10027874843559781
0.1030662327915759
0.13372856070733485
0.4710141677806834
4.181155845587517
44.992714301462684
493.9198573160895
5432.118430476985


The solution blows up and deviates widely from $x = 0.1$. Round-off errors are amplified at each step, leading to a completely wrong answer. The computer representation of $0.1$ is not exact, and every time we multiply $0.1$ by $11$, we increase the error by around a factor of 10 (we can see above that we lose a digit of accuracy in each step).
You can observe the same issue using spreadsheet programs.
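
The amplification can be measured directly. The sketch below repeats the iteration and uses the standard-library `fractions` module (an addition for illustration, not part of the original notebook) to compute the exact error of each iterate, then prints the ratio of successive errors.

```python
from fractions import Fraction

exact = Fraction(1, 10)   # the value x would keep in exact arithmetic
x = 0.1
errors = []
for i in range(20):
    x = x*11 - 1
    # Fraction(x) converts the float exactly, so this is the true error
    errors.append(abs(Fraction(x) - exact))

# Ratio of successive errors: how much each step amplifies the error
for previous, current in zip(errors, errors[1:]):
    print(float(current / previous))
```

The ratios stay close to 11, which is the factor by which the iteration $x = 11x - 1$ multiplies any error already present in $x$.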

If we instead use $x = 0.5$, which can be represented exactly in binary, the corresponding iteration is $x = 11x - 5$:

In [37]:
x = 0.5
for i in range(20):
    x = x*11 - 5
    print(x)

0.5
0.5
0.5
0.5
0.5
0.5
0.5
0.5
0.5
0.5
0.5
0.5
0.5
0.5
0.5
0.5
0.5
0.5
0.5
0.5


The result is exact in this case.

By default, Python uses 64 bits to store a float. We can use the module NumPy to create a
float that uses only 32 bits. Testing this for the $x = 0.1$ case:

In [38]:
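import numpy as np  ## harmless if already imported earlier in the notebook
## The output shown is from a NumPy version that promotes x*11 - 1 to 64 bits;
## NumPy 2 keeps the arithmetic in 32 bits, so its printed digits differ.
x = np.float32(0.1)
for i in range(20):
    x = x*11 - 1
    print(x)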

0.10000001639127731
0.10000018030405045
0.1000019833445549
0.10002181679010391
0.10023998469114304
0.1026398316025734
0.12903814762830734
0.41941962391138077
3.6136158630251884
38.74977449327707
425.2475194260478
4676.722713686526
51442.949850551784
565871.4483560696
6224584.931916766
68470433.25108442
753174764.7619286
8284922411.381214
91134146524.19336
1002475611765.127


The error blows up faster in this case than in the 64-bit case, because a 32-bit float is a poorer approximation of $0.1$ than a 64-bit float.
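
To put numbers on the difference, the sketch below computes the exact representation error of $0.1$ for each float size. It assumes NumPy is available as `np`; the `fractions` module and the `errors` dictionary are additions for illustration only.

```python
from fractions import Fraction
import numpy as np

exact = Fraction(1, 10)
errors = {}
for float_type in (np.float32, np.float64):
    stored = float_type(0.1)
    # float() widens the stored value to 64 bits without changing it,
    # and Fraction() then converts it exactly
    errors[float_type.__name__] = abs(Fraction(float(stored)) - exact)
    print(float_type.__name__,
          "error in 0.1:", float(errors[float_type.__name__]),
          "| machine epsilon:", np.finfo(float_type).eps)
```

The 32-bit error is larger by many orders of magnitude, so the iteration above starts from a much worse value and reaches a visibly wrong answer in fewer steps.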

*Note:* Some languages have special tools for performing decimal (base 10) arithmetic (e.g., Python's `decimal` module, https://docs.python.org/3/library/decimal.html). This would, for example, allow $0.1$ to be represented exactly. However, decimal is not the 'natural' arithmetic of computers, so operations in decimal can be expected to be much slower.
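
As a brief sketch of the decimal alternative, the same iteration can be repeated with Python's standard-library `decimal` module. Note that the value is constructed from the string `"0.1"`; this is a deliberate choice for the example, since constructing it from the float `0.1` would carry the binary round-off error along.

```python
from decimal import Decimal

# Construct from a string: Decimal(0.1) would inherit the binary round-off error.
x = Decimal("0.1")
for i in range(20):
    x = x*11 - 1

print(x)   # 0.1
```

Here every intermediate value ($1.1$ and $0.1$) has a short decimal expansion, so no rounding occurs and $x$ stays equal to $0.1$.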

### Patriot Missile Failure

The inexact representation of $0.1$ was the cause of the software error in the Patriot missile system (see preamble to this notebook).
The missile system tracked time from boot (system start) using an integer counter that was incremented every $1/10$ of a second. To
get the time in seconds, the missile software multiplied the counter by the float representation of $0.1$.
The control software used 24 bits to store floats. The round-off error due to the inexact representation of $0.1$ led to a timing error of about $0.34$ s after 100 hours of operation (time since boot), which, given the high velocity of the missile, was enough to cause the failure to intercept the incoming Scud.

We don't have 24-bit floats in Python, but we can test with 16-, 32- and 64-bit floats.
We first compute what the system counter (an integer) would be after 100 hours:

In [39]:
## Compute internal system counter after 100 hours (counter increments every 1/10 s)
num_hours = 100
num_seconds = num_hours*60*60
system_counter = num_seconds*10 # system clock counter

Converting the system counter to seconds using different representations of 0.1:

In [40]:
## float(dt) converts the stored value to a Python float exactly, so the product is
## computed in 64 bits whatever the NumPy version; only the error in dt is measured.

## Test with 16 bit float
dt = np.float16(0.1)
time = float(dt)*system_counter
print("Time error after 100 hours using 16 bit float (s):", abs(time - num_seconds))

## Test with 32 bit float
dt = np.float32(0.1)
time = float(dt)*system_counter
print("Time error after 100 hours using 32 bit float (s):", abs(time - num_seconds))

## Test with 64 bit float
dt = np.float64(0.1)
time = float(dt)*system_counter
print("Time error after 100 hours using 64 bit float (s):", abs(time - num_seconds))

Time error after 100 hours using 16 bit float (s): 87.890625
Time error after 100 hours using 32 bit float (s): 0.005364418029785156
Time error after 100 hours using 64 bit float (s): 0.0


The time computation with 16-bit floats is more than a minute off after 100 hours! The stop-gap measure
for the Patriot missiles at the time was to reboot the missile systems frequently, thereby resetting the system counter and reducing the time error.
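
The 24-bit case can be approximated with exact rational arithmetic. The sketch below rests on an assumption that is not stated in this notebook: that $0.1$ was stored in fixed-point form, chopped (not rounded) to 23 binary digits after the point. The names `FRACTIONAL_BITS` and `clock_drift` are hypothetical and chosen for this example only.

```python
from fractions import Fraction

# Assumption: 0.1 is chopped (not rounded) to 23 binary digits after the point.
FRACTIONAL_BITS = 23
scale = 2**FRACTIONAL_BITS
stored_tenth = Fraction(int(Fraction(1, 10) * scale), scale)   # int() truncates
error_per_tick = Fraction(1, 10) - stored_tenth

def clock_drift(hours):
    """Accumulated clock error in seconds after the given number of hours since boot."""
    ticks = hours*60*60*10   # the counter increments every 1/10 s
    return float(error_per_tick * ticks)

for hours in (1, 8, 100):
    print(hours, "hours since boot:", clock_drift(hours), "s")
```

Under this assumption the drift after 100 hours is about a third of a second, in line with the figure quoted above, and it grows in proportion to the time since boot. That proportionality is why frequent reboots reduced the error.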

## Summary

The key points from this activity are:

- The size of an integer that a computer can store is determined by the number of bits used to represent the
  integer.
- Computers do not perform exact arithmetic with non-integer numbers. This does not usually cause a problem, but
  it can in some cases. Problems can often be avoided with careful programming.
- Be thoughtful when converting between types: careless conversions can have undesirable
  consequences.

## Exercises

Now complete [03 Exercises](https://xue.cn/hub/reader?bookId=2&path=PartIA-Computing-Michaelmas-zh-CN/zh-CN/Exercises/03_Exercises.ipynb).