### Main differences from Python lists

1. Arrays support vectorised operations, while lists don’t.
2. Once an array is created, you cannot change its size. You will have to create a new array or overwrite the existing one.
3. Every array has one and only one dtype. All items in it should be of that dtype.
4. An equivalent NumPy array occupies much less space than a Python list of lists.

> equivalent: equal in value; occupy: take up (space)
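
A small sketch of the first three points. The variable names are illustrative and not from the original notes, and the integer dtype printed may differ by platform.

```python
import numpy as np

py_list = [1, 2, 3]
arr = np.array(py_list)

# Vectorised: the operation applies to every element at once
doubled = arr * 2              # array([2, 4, 6])
# The same operator on a list repeats the list instead
repeated = py_list * 2         # [1, 2, 3, 1, 2, 3]

# One dtype per array: mixing ints and floats upcasts everything to float
mixed = np.array([1, 2.5, 3])
print(arr.dtype, mixed.dtype)

# Fixed size: np.append returns a new array, arr itself is unchanged
longer = np.append(arr, 4)
print(arr.shape, longer.shape)  # (3,) (4,)
```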

### bool index
A boolean index array has the same shape as the array being filtered and contains only True and False values. The values at the True positions are retained in the output.

> corresponding: matching, respective
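
A minimal sketch of boolean indexing, assuming a small made-up 2-D array:

```python
import numpy as np

arr = np.array([[1, 5, 3],
                [8, 2, 7]])

# Boolean mask with the same shape as arr
mask = arr > 4
print(mask)

# Only the values at True positions are kept; the result is 1-D
selected = arr[mask]
print(selected)   # [5 8 7]
```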

### compute mean, min, max on the ndarray
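arr.mean(), arr.min() and arr.max() work over the whole array. To compute the minimum row wise or column wise, use np.amin with the axis argument; for an arbitrary function along an axis, use np.apply_along_axis.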
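
A short sketch of whole-array versus per-axis reductions. The array values are made up for illustration.

```python
import numpy as np

arr = np.array([[4, 9, 1],
                [7, 2, 8]])

# Reductions over the whole array
print(arr.mean(), arr.min(), arr.max())

col_min = np.amin(arr, axis=0)   # minimum of each column -> [4 2 1]
row_min = np.amin(arr, axis=1)   # minimum of each row    -> [1 2]
print(col_min, row_min)
```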

### reshape and flatten
Reshaping changes the arrangement of items so that the shape of the array changes while the total number of elements stays the same.

Flattening, however, always converts a multi-dimensional array to a flat 1d array, never to any other shape.

> arrangement: layout, ordering; maintain: keep unchanged
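
A quick sketch of the difference, using a made-up 12-element array:

```python
import numpy as np

arr = np.arange(12)              # 12 elements, shape (12,)

grid = arr.reshape(3, 4)         # same 12 elements, now 2-D
cube = arr.reshape(2, 3, -1)     # -1 lets NumPy infer the last axis (2)

flat = cube.flatten()            # always back to 1-D
print(grid.shape, cube.shape, flat.shape)   # (3, 4) (2, 3, 2) (12,)
```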

### difference between flatten() and ravel()
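flatten() always returns a copy, whereas ravel() returns a view of the original array whenever possible, so changes made through the raveled array can affect the parent.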


> https://www.machinelearningplus.com/python/numpy-tutorial-part1-array-python-examples/


In [None]:
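import numpy as np

a = np.full((3, 4), 1)
b = a.reshape((2, 6))
# reshape returns a view here, so a and b share memory
np.shares_memory(a, b)


# flatten() always copies; ravel() returns a view when it can
np.shares_memory(a, a.flatten()), np.shares_memory(a, a.ravel())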


### where
1. `np.where` locates the positions in the array where a given condition holds true.
2. `np.where` also accepts two optional arguments, x and y: wherever the condition is true, x is yielded, otherwise y.
3. `argmax` gives the location of the maximum.
4. `argmin` gives the location of the minimum.
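
The four points can be sketched on one small array. The values and the 'big'/'small' labels are made up for illustration.

```python
import numpy as np

arr = np.array([3, 9, 1, 7, 9])

# Indices where the condition holds -> [1 3 4]
positions = np.where(arr > 5)[0]
# Pick 'big' where the condition is true, 'small' elsewhere
labels = np.where(arr > 5, 'big', 'small')

max_pos = np.argmax(arr)   # first position of the maximum -> 1
min_pos = np.argmin(arr)   # position of the minimum -> 2
print(positions, labels, max_pos, min_pos)
```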

### import and export data as a csv file
Use `np.genfromtxt` to import. It can read datasets from web URLs and handle missing values, multiple delimiters, an irregular number of columns, and so on.

### handle datasets that have both numbers and text columns
Set the dtype to 'object' or None.

`np.savetxt` exports the array as a CSV file.
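
A minimal round-trip sketch. The CSV content is hypothetical, and `io.StringIO` stands in for a file on disk so the example needs no external data.

```python
import io
import numpy as np

# Hypothetical CSV content; the empty field in the second data row is a missing value
csv_text = "a,b,c\n1,2,3\n4,,6\n"

# Skip the header line and fill missing fields with -1
data = np.genfromtxt(io.StringIO(csv_text), delimiter=',',
                     skip_header=1, filling_values=-1)
print(data)

# Write the array back out as CSV text
out = io.StringIO()
np.savetxt(out, data, delimiter=',', fmt='%d')
print(out.getvalue())
```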


### How to save and load numpy objects?
At some point we will want to save large transformed NumPy arrays to disk and load them back directly, without having to re-run the data transformation code. NumPy provides the .npy and .npz file types for this purpose. To store a single ndarray object, save it as a .npy file using np.save; it can be loaded back using np.load. To store more than one ndarray object in a single file, save it as a .npz file.
> transformed: changed in form, processed; re-run: run again
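
A sketch of both file types, written to a temporary directory so nothing is left behind. The file names and array contents are made up. `np.savez` is not named in the notes above; it is NumPy's function for writing .npz archives.

```python
import os
import tempfile
import numpy as np

arr_a = np.arange(6).reshape(2, 3)
arr_b = np.linspace(0.0, 1.0, 5)

with tempfile.TemporaryDirectory() as tmp:
    # Single array -> .npy
    npy_path = os.path.join(tmp, 'arr_a.npy')
    np.save(npy_path, arr_a)
    loaded_a = np.load(npy_path)

    # Several arrays in one file -> .npz, retrieved by keyword name
    npz_path = os.path.join(tmp, 'both.npz')
    np.savez(npz_path, a=arr_a, b=arr_b)
    with np.load(npz_path) as archive:
        loaded_b = archive['b']

print(np.array_equal(arr_a, loaded_a), np.array_equal(arr_b, loaded_b))
```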

### concatenate two numpy arrays columnwise and row wise
- np.concatenate, setting the axis parameter to 0 or 1
- np.vstack and np.hstack
- np.r_[] and np.c_[]
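
The three options can be compared side by side. This is a sketch with two made-up arrays of the same shape.

```python
import numpy as np

a = np.zeros((2, 3), dtype=int)
b = np.ones((2, 3), dtype=int)

# Row wise (stack vertically): three equivalent spellings
rows_1 = np.concatenate([a, b], axis=0)
rows_2 = np.vstack([a, b])
rows_3 = np.r_[a, b]

# Columnwise (stack horizontally)
cols_1 = np.concatenate([a, b], axis=1)
cols_2 = np.hstack([a, b])
cols_3 = np.c_[a, b]

print(rows_1.shape, cols_1.shape)   # (4, 3) (2, 6)
```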

### sort a numpy array based on one or more columns
- np.sort(arr, axis=0) sorts each column independently, so rows are not kept together
- argsort returns the index positions that would make a given 1d array sorted
- lexsort sorts on several columns: pass a tuple of the columns to sort by, and remember to place the column to be sorted first at the rightmost position inside the tuple
```python
# sort by col 0, then by col 1 (col 1 breaks ties in col 0)
lexsorted_index = np.lexsort((arr[:, 1], arr[:, 0]))
```
> place: put, position

### working with dates
NumPy represents dates with the np.datetime64 object, which supports sub-second precision, including nanoseconds. You can create one from a standard YYYY-MM-DD formatted date string.

### create a sequence of dates
Use np.arange (the stop date is excluded): `dates = np.arange(np.datetime64('2018-02-01'), np.datetime64('2018-02-10'))`

### vectorize - Make a scalar function work on vectors
With vectorize() you can make a function that is meant to work on individual numbers work on arrays.

### apply_along_axis – Apply a function column wise or row wise
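
A minimal sketch, assuming a hypothetical helper `max_over_min` that is not from the original notes. `np.apply_along_axis` passes each 1-D slice along the chosen axis to the function.

```python
import numpy as np

def max_over_min(x):
    # x is a 1-D slice (one row or one column)
    return np.max(x) / np.min(x)

arr = np.array([[2, 8, 4],
                [1, 6, 3]])

per_column = np.apply_along_axis(max_over_min, 0, arr)   # one value per column
per_row = np.apply_along_axis(max_over_min, 1, arr)      # one value per row
print(per_column, per_row)
```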


### searchsorted – Find the location to insert so the array will remain sorted
It gives the index position at which a number should be inserted in order to keep the array sorted.

### Digitize
Use np.digitize to return the index position of the bin each element belongs to.
> bin: an interval (bucket) that values are grouped into
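
A short sketch with made-up values and bin edges:

```python
import numpy as np

values = np.array([0.5, 2.0, 3.7, 9.1])
bin_edges = np.array([0, 2, 4, 6])

# With the default right=False, index i means bin_edges[i-1] <= value < bin_edges[i];
# values beyond the last edge get len(bin_edges)
bin_index = np.digitize(values, bin_edges)
print(bin_index)   # [1 2 2 4]
```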

### Clip
Use np.clip to cap the numbers within a given cutoff range. All numbers less than the lower limit will be replaced by the lower limit, and likewise all numbers greater than the upper limit will be replaced by the upper limit.

> clip: cut short; cap: limit; cutoff: threshold


### Histogram and Bincount
Both histogram() and bincount() give the frequency of occurrences, but with certain differences. While histogram() gives the frequency counts of the bins, bincount() gives the frequency count of every integer from 0 up to the array's maximum value, including the values that did not occur.

> occurrences: number of times something appears; occur: appear, happen


> https://www.machinelearningplus.com/python/numpy-tutorial-python-part2/

In [None]:
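# where
a = np.random.randint(1, 4, size=10)
# extract the matching values, by indexing or with take
# (np.where returns a tuple, so [0] picks out the index array)
a[np.where(a > 2)], a.take(np.where(a > 2)[0])

np.where(a > 2, 'gt2', 'le2')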

In [None]:
# sort
matrix = np.random.randint(1, 20, size=20)

matrix = matrix.reshape(4, -1)
# reorder the rows by the values in the first column
matrix[matrix[:, 0].argsort()]

In [None]:
# sort rows by column 3, breaking ties with column 1 (the last key is the primary one)
index = np.lexsort((matrix[:, 1], matrix[:, 3]))
matrix[index]

In [None]:
# sort by the first column; where it ties, sort by the last column (the primary key goes rightmost in the tuple)
index=np.lexsort((matrix[:,-1],matrix[:,0]))
matrix[index]


In [None]:
# datetime64
np.datetime64('2024-01-01 12:01:01')

# with second precision, arange steps one second at a time, so this array is long
np.arange(np.datetime64('2024-01-01 12:01:01'), np.datetime64('2024-01-10 12:01:01'))

In [None]:
# vectorize

def mmax(a,b):
    return a if a>b else b

a=np.random.randint(1,9,size=4)
b=np.random.randint(1,9,size=4)

# make the function work element-wise on arrays
v_mmax=np.vectorize(mmax)

v_mmax(a,b),a,b

In [None]:
# searchsorted

a = np.arange(1, 10)

# leftmost (default) and rightmost insertion index for 9: 8 and 9
np.searchsorted(a, 9), np.searchsorted(a, 9, side='right')

# clip
np.clip(a,3,8)

In [None]:
x = np.array([1,1,2,2,2,4,4,5,6,6,6])

np.bincount(x)  # 0 occurs 0 times, 1 occurs 2 times, 2 occurs thrice, 3 occurs 0 times, and so on


# Histogram example: counts per bin for the edges [0, 2, 4, 6, 8]
counts, bins = np.histogram(x, [0, 2, 4, 6, 8])
print('Counts: ', counts)
print('Bins: ', bins)