# 5 Sequences

- Use `make_array()` to put several values into one array (import `datascience` first)

In [1]:
from datascience import *

baseline_high = 14.48
highs = make_array(
    baseline_high - 0.880, baseline_high - 0.093, baseline_high + 0.105, baseline_high + 0.684
)
highs

array([13.6  , 14.387, 14.585, 15.164])

- `sum()` computes the sum of a collection
- `len()` computes the length of a collection

In [2]:
sum(highs) / len(highs)

14.434000000000001

---

## 5.1 Arrays

There are many kinds of collections in Python; we will work primarily with arrays in this class.

- Arrays can also hold values of other types (for example strings), but a single array can contain only one data type

In [3]:
english_parts_of_speech = make_array(
    "noun", "pronoun", "verb", "adverb", "adjective", "conjunction", "preposition", "interjection"
)
english_parts_of_speech

array(['noun', 'pronoun', 'verb', 'adverb', 'adjective', 'conjunction',
       'preposition', 'interjection'], dtype='<U12')
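
To illustrate the one-data-type rule, here is a minimal sketch of what happens when mixed values are put into one array. It uses `np.array` as a stand-in for `make_array` (an assumption: both are taken to build a plain NumPy array), and the variable names are made up for this example.

```python
import numpy as np

# np.array stands in for make_array here (assumption: both build a NumPy array).
mixed_numbers = np.array([1, 2.5])     # int and float: every element becomes a float
mixed_values = np.array([1, "two"])    # int and string: every element becomes a string

print(mixed_numbers.dtype)   # float64
print(mixed_values)          # ['1' 'two']
```

NumPy picks a single type that can represent every element, so the integer `1` silently turns into `1.0` or `'1'`.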

- Arithmetic on an array works like arithmetic on a vector (similar to Matlab)

In [4]:
(9/5) * highs + 32  # convert Celsius to Fahrenheit

array([56.48  , 57.8966, 58.253 , 59.2952])

- Arrays also have methods that can be called, such as `sum()`
- Like a Table, an array also has properties, such as `size`

In [5]:
print(highs.size)
print(highs.sum())
print(highs.mean())

4
57.736000000000004
14.434000000000001


---

### 5.1.1 Functions on Arrays

- The `numpy` package (abbreviated `np` in programs) provides convenient and powerful functions for creating and manipulating arrays
- [full Numpy reference](https://numpy.org/doc/stable/reference/)
- A small subset of these functions, commonly used in data processing applications, is listed below. They are grouped into different packages within `np`.

In [6]:
import numpy as np
np.diff(highs)  # differences (each element minus the one before it; the result has one fewer element)

array([0.787, 0.198, 0.579])

#### `np` Function List for Reference

- Each of these functions takes an array as an argument and returns a single value.

    | **Function**       | Description                                                  |
    |:------------------ |:------------------------------------------------------------ |
    | `np.prod`          | Multiply all elements together                               |
    | `np.sum`           | Add all elements together                                    |
    | `np.all`           | Test whether all elements are true values (non-zero numbers are true) |
    | `np.any`           | Test whether any elements are true values (non-zero numbers are true) |
    | `np.count_nonzero` | Count the number of non-zero elements                        |

- Each of these functions takes an array as an argument and returns an array of values.

    | **Function** | Description                                                  |
    |:------------ |:------------------------------------------------------------ |
    | `np.diff`    | Difference between adjacent elements                         |
    | `np.round`   | Round each number to the nearest integer (whole number)      |
    | `np.cumprod` | A cumulative product: for each element, multiply all elements so far |
    | `np.cumsum`  | A cumulative sum: for each element, add all elements so far  |
    | `np.exp`     | Exponentiate each element                                    |
    | `np.log`     | Take the natural logarithm of each element                   |
    | `np.sqrt`    | Take the square root of each element                         |
    | `np.sort`    | Sort the elements                                            |
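
The difference between the two groups above (single value versus array result) can be seen in a small sketch. The array `values` and the result names are invented for this illustration.

```python
import numpy as np

values = np.array([1, 2, 3, 4])

# Functions that reduce the whole array to a single value
total = np.sum(values)              # 10
product = np.prod(values)           # 24

# Functions that return an array
running_total = np.cumsum(values)   # [1, 3, 6, 10]
running_product = np.cumprod(values)  # [1, 2, 6, 24]
gaps = np.diff(values)              # [1, 1, 1], one fewer element than values

print(total, product)
print(running_total, running_product, gaps)
```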

- Each of these functions takes an array of strings and returns an array.
	
    | **Function**        | **Description**                                              |
    |:------------------- |:------------------------------------------------------------ |
    | `np.char.lower`     | Lowercase each element                                       |
    | `np.char.upper`     | Uppercase each element                                       |
    | `np.char.strip`     | Remove spaces at the beginning or end of each element        |
    | `np.char.isalpha`   | Whether each element is only letters (no numbers or symbols) |
    | `np.char.isnumeric` | Whether each element is only numeric (no letters)            |

- Each of these functions takes both an array of strings and a *search string*; each returns an array.

    | **Function**         | **Description**                                              |
    |:-------------------- |:------------------------------------------------------------ |
    | `np.char.count`      | Count the number of times a search string appears within each element |
    | `np.char.find`       | The position within each element that a search string is found first |
    | `np.char.rfind`      | The position within each element that a search string is found last |
    | `np.char.startswith` | Whether each element starts with the search string           |
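
A short sketch of a few of the string functions listed above, applied to a made-up array `words`. Each call returns one result per element; `np.char.find` reports `-1` for elements that do not contain the search string.

```python
import numpy as np

words = np.array(["Noun", " verb", "adverb"])   # hypothetical sample data

lowered = np.char.lower(words)             # ['noun', ' verb', 'adverb']
stripped = np.char.strip(words)            # ['Noun', 'verb', 'adverb']
counts = np.char.count(words, "b")         # [0, 1, 1]
positions = np.char.find(words, "verb")    # [-1, 1, 2]
starts = np.char.startswith(words, "ad")   # [False, False, True]

print(lowered, stripped, counts, positions, starts)
```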


---

## 5.2 Ranges

- Def. A **range** is an array of numbers in increasing or decreasing order, each separated by a regular interval.
- Ranges are defined with `np.arange()`
    - It accepts 1, 2, or 3 arguments
    - 1 argument: it is used as `end`, with `start = 0` and `step = 1` assumed
    - 2 arguments: they are used as `start` and `end`, with `step = 1` assumed
    - 3 arguments: they are used as `start`, `end`, and `step`
    

In [26]:
np.arange(5)     # note: stops at 4 (end is not included in the range)

array([0, 1, 2, 3, 4])

In [8]:
np.arange(3, 9)  # note: stops at 8

array([3, 4, 5, 6, 7, 8])

In [9]:
np.arange(3, 30, 5)

array([ 3,  8, 13, 18, 23, 28])

In [27]:
np.arange(1.5, -2, -0.5)    # step may be negative or a float, e.g. -0.5
# note: stops at -1.5

array([ 1.5,  1. ,  0.5,  0. , -0.5, -1. , -1.5])
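
Because `end` is excluded, the number of elements in a range follows from `start`, `end`, and `step`. The sketch below checks the formula ceil((end - start) / step) against the four ranges above. The helper `arange_length` is hypothetical, written only for this note; with float steps, rounding error can occasionally change the count by one, so treat it as a rule of thumb.

```python
import math
import numpy as np

def arange_length(start, end, step):
    """Expected number of elements in np.arange(start, end, step)."""
    return max(0, math.ceil((end - start) / step))

cases = [(0, 5, 1), (3, 9, 1), (3, 30, 5), (1.5, -2, -0.5)]
for start, end, step in cases:
    actual = len(np.arange(start, end, step))
    print(start, end, step, actual, arange_length(start, end, step))
```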

### 5.2.1 Example: Leibniz's formula for $\pi$

Consider approximating $\pi$ with the following formula:
$$
\pi = 4 \left( 1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \frac{1}{9} - \frac{1}{11} + \cdots \right)
$$

In [15]:
positive_term_denominators = np.arange(1, 10000, 4)
positive_terms = 1 / positive_term_denominators      # unlike Matlab, / is already elementwise (Matlab's ./)
negative_terms = 1 / (positive_term_denominators + 2)
approx_to_pi = 4 * (sum(positive_terms) - sum(negative_terms))
approx_to_pi

3.1413926535917955
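
As a variation on the cell above, the following sketch builds the alternating signs explicitly so that the number of terms can be varied. The function `leibniz_pi` is a hypothetical helper, not part of the course material. It suggests how slowly the series converges: the error shrinks roughly in proportion to 1 / (number of terms).

```python
import numpy as np

def leibniz_pi(n_terms):
    """Approximate pi using the first n_terms terms of Leibniz's series."""
    k = np.arange(n_terms)                     # 0, 1, 2, ...
    signs = np.where(k % 2 == 0, 1.0, -1.0)    # +1, -1, +1, ...
    return 4 * np.sum(signs / (2 * k + 1))     # denominators 1, 3, 5, ...

for n in (10, 1000, 100000):
    approx = leibniz_pi(n)
    print(n, approx, abs(np.pi - approx))
```

With 5000 terms (2500 positive and 2500 negative) this sums the same terms as the cell above, up to floating-point rounding.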

---

## 5.3 More on Arrays

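- When an arithmetic operator is applied to two arrays of the same size, the operation is performed on the corresponding elements of the two arrays
    - For example `+` `-` `*` `/`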

In [18]:
baseline_low = 3.00
lows = make_array(baseline_low - 0.872, baseline_low - 0.629,
                  baseline_low - 0.126, baseline_low + 0.728)
make_array(
    highs.item(0) - lows.item(0),
    highs.item(1) - lows.item(1),
    highs.item(2) - lows.item(2),
    highs.item(3) - lows.item(3),
)

array([11.472, 12.016, 11.711, 11.436])

In [19]:
highs - lows # compare with the result above

array([11.472, 12.016, 11.711, 11.436])
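
The same-size requirement matters. In this minimal sketch, the values of `highs` and `lows` are copied from the outputs above and rebuilt with `np.array` (standing in for `make_array`, an assumption). The three-element array is invented to trigger the failure case.

```python
import numpy as np

highs = np.array([13.6, 14.387, 14.585, 15.164])
lows = np.array([2.128, 2.371, 2.874, 3.728])
three_lows = np.array([2.128, 2.371, 2.874])   # hypothetical array of a different size

same_size = highs - lows       # elementwise: four differences

try:
    highs - three_lows         # 4 elements vs 3 elements
    mismatch_error = None
except ValueError as err:
    mismatch_error = str(err)  # NumPy cannot pair up the elements

print(same_size)
print(mismatch_error)
```

A single number (or a one-element array) is the exception: NumPy applies it to every element, which is why `(9/5) * highs + 32` works.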

### 5.3.2 Example: Wallis' Formula for $\pi$

$\pi$ can also be expressed as an infinite product:
$$
\pi = 2 \left( \frac{2}{1} \cdot \frac{2}{3} \cdot \frac{4}{3} \cdot \frac{4}{5} \cdot \frac{6}{5} \cdot \frac{6}{7}
\cdot \cdots \right)
\approx 2 \left( \frac{2}{1} \cdot \frac{4}{3} \cdot \frac{6}{5} \cdot \cdots \cdot \frac{1,000,000}{999,999} \right)
\cdot \left( \frac{2}{3} \cdot \frac{4}{5} \cdot \frac{6}{7}
\cdot \cdots \cdot \frac{1,000,000}{1,000,001} \right)
$$

In [21]:
even = np.arange(2, 1000001, 2)  # numerators
one_below_even = even - 1        # denominators of the first group
one_above_even = even + 1        # denominators of the second group
approx_to_pi = 2 * np.prod(even / one_below_even) * np.prod(even / one_above_even)
approx_to_pi

3.1415910827951143

---

<br />

## Extra (hw02)

### hw02-1 Creating Arrays

- The string `join()` method

In [22]:
book_title_words = make_array("Eats", "Shoots", "and Leaves")
with_commas = ", ".join(book_title_words)
without_commas = " ".join(book_title_words)
print('with_commas:', with_commas)
print('without_commas:', without_commas)

with_commas: Eats, Shoots, and Leaves
without_commas: Eats Shoots and Leaves


### hw02-2 Indexing Arrays

- Use `array.item(index)`; in Data 8, do not index with square brackets such as `arr[0]`

In [23]:
some_numbers = make_array(-1, -3, -6, -10, -15)
some_numbers.item(2)

-6

In [24]:
some_numbers[2]

-6
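
Both cells print `-6`, but the two results are not the same kind of object. The sketch below shows one observable difference; it is not necessarily the reason for the course's advice. It rebuilds `some_numbers` with `np.array` as a stand-in for `make_array` (an assumption).

```python
import numpy as np

some_numbers = np.array([-1, -3, -6, -10, -15])

via_item = some_numbers.item(2)     # a plain Python int
via_brackets = some_numbers[2]      # a NumPy integer scalar

print(type(via_item))
print(type(via_brackets))
print(via_item == via_brackets)     # True: the values compare equal
```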

In [25]:
blank_a = "third"
blank_b = "fourth"
blank_c = 0
blank_d = 3
elements_of_some_numbers = Table().with_columns(
    "English name for position", make_array("first", "second", blank_a, blank_b, "fifth"),
    "Index",                     make_array(blank_c, 1, 2, blank_d, 4),
    "Element",                   some_numbers)
elements_of_some_numbers

English name for position,Index,Element
first,0,-1
second,1,-3
third,2,-6
fourth,3,-10
fifth,4,-15


### hw02-3 Basic Array Arithmetic

- Reading a column of a table returns an array, for example
    ```python
    max_temperatures = Table.read_table("temperatures.csv").column("Daily Max Temperature")
    celsius_max_temperatures = np.round((max_temperatures - 32) * (5/9))
    ```
    
### hw02-4 Old Faithful

- The table `take` method



---

## Extra (Lab03)

- [list of string methods](https://docs.python.org/3/library/stdtypes.html#string-methods)
- Strings and numbers cannot be added directly, for example
    ```python
    8 + "8"   # error
    ```
- Consider converting types instead

    | Function |                                                  Description |
    |:------- |:----------------------------------------------------------- |
    |    `int` | Converts a string of digits or a float to an integer ("int") value |
    |  `float` | Converts a string of digits (perhaps with a decimal point) or an int to a decimal ("float") value |
    |    `str` |                               Converts any value to a string |
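
A brief sketch of the three conversion functions in the table, using only built-in Python. The variable names are made up for this example.

```python
as_int = int("8")            # 8
as_float = float("8.5")      # 8.5
as_text = str(8)             # "8"
truncated = int(8.9)         # 8: converting a float to an int drops the fractional part

number_sum = 8 + int("8")    # 16, both operands are numbers
text_join = str(8) + "8"     # "88", both operands are strings

try:
    8 + "8"                  # mixing the two types fails
    error_name = None
except TypeError as err:
    error_name = type(err).__name__

print(number_sum, text_join, error_name)
```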
    
- The `round` function takes two arguments: the number to be rounded and the number of decimal places to round to. The second argument can be thought of as how many steps right (positive) or left (negative) of the decimal point the rounding happens.
- NumPy provides its own version of `round`, `np.round()`, which rounds each element of an array, i.e. it works **elementwise**.

In [1]:
round(1234.5, -2)

1200.0
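
A few more calls may make the second argument clearer. This is a small sketch with made-up numbers, comparing the built-in `round` with `np.round` on an array.

```python
import numpy as np

two_places = round(1234.5678, 2)      # 1234.57, two steps right of the decimal point
to_tens = round(1234.5678, -1)        # 1230.0, one step left of the decimal point
to_hundreds = round(1234.5678, -2)    # 1200.0, two steps left of the decimal point

# np.round applies the same idea to every element of an array
each_rounded = np.round(np.array([1.234, 5.678, 9.876]), 1)   # [1.2, 5.7, 9.9]

print(two_places, to_tens, to_hundreds, each_rounded)
```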

- Creating a table
    - create an empty table with the expression `Table()`,
    - add two columns by calling `with_columns` with four arguments (a label and an array for each column).
- Some Table methods
    - `with_columns()`
    - `read_table()`
    - `column()`
    - `take()`: takes a range as its argument and returns the rows at the indices in that range