In [None]:
# Chapter 4: NumPy Basics: Arrays and Vectorized Computation

## Outline

Here are some of the things you'll find in NumPy:

- ndarray, an efficient multidimensional array providing fast array-oriented arithmetic operations and flexible broadcasting capabilities.

- Mathematical functions for fast operations on entire arrays of data without having to write loops.

- Tools for reading/writing array data to disk and working with memory-mapped files.

- Linear algebra, random number generation, and Fourier transform capabilities.

- A C API for connecting NumPy with libraries written in C, C++, or FORTRAN.
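As a small illustration of the first two items, the sketch below applies arithmetic and a math function to a whole array at once, then broadcasts a 1-D row across a 2-D array. The array values and variable names are made up for this example.

```python
import numpy as np

data = np.array([[1.0, 2.0, 3.0],
                 [4.0, 5.0, 6.0]])

# Element-wise arithmetic on the whole array, with no explicit loop.
doubled = data * 2

# A universal function (ufunc) applied to every element.
roots = np.sqrt(data)

# Broadcasting: the 1-D row is stretched across both rows of `data`.
row = np.array([10.0, 20.0, 30.0])
shifted = data + row

print(doubled)
print(roots)
print(shifted)
```

Broadcasting works here because the trailing dimension of `data` (3) matches the length of `row`.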

For most data analysis applications, the main areas of functionality I’ll focus on are:

- Fast array-based operations for data munging and cleaning, subsetting and filtering, transformation, and any other kinds of computations

- Common array algorithms like sorting, unique, and set operations

- Efficient descriptive statistics and aggregating/summarizing data

- Data alignment and relational data manipulations for merging and joining together heterogeneous datasets

- Expressing conditional logic as array expressions instead of loops with if-elif-else branches

- Group-wise data manipulations (aggregation, transformation, function application)
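A few of the array-level items above can be sketched with plain NumPy calls. This is a minimal example on invented data; it covers conditional logic, sorting, unique values, and descriptive statistics only.

```python
import numpy as np

# Illustrative data, not taken from any dataset.
values = np.array([3, -1, 4, -1, 5, -9, 2, 6])

# Conditional logic as an array expression: replace negatives with 0.
clipped = np.where(values < 0, 0, values)

# Sorting and unique values (np.unique also returns its result sorted).
sorted_values = np.sort(values)
unique_values = np.unique(values)

# Descriptive statistics over the whole array.
total = values.sum()
mean = values.mean()

print(clipped)
print(sorted_values)
print(unique_values)
print(total, mean)
```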

One of the reasons NumPy is so important for numerical computations in Python is that it is designed for efficiency on large arrays of data. There are a number of reasons for this:


- NumPy internally stores data in a contiguous block of memory, independent of other built-in Python objects. NumPy's library of algorithms written in the C language can operate on this memory without any type checking or other overhead. NumPy arrays also use much less memory than built-in Python sequences.

- NumPy operations perform complex computations on entire arrays without the need for Python for loops, which can be slow for large sequences. NumPy is faster than regular Python code because its C-based algorithms avoid overhead present with regular interpreted Python code.
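The two points above can be checked with a rough experiment. The sketch below runs the same doubling operation as a Python-level loop and as a single vectorized expression, times both with `timeit`, and inspects the array's memory layout. The size `n` and the repeat count are arbitrary choices, and the timings will vary by machine.

```python
import timeit
import numpy as np

n = 100_000
py_list = list(range(n))
np_arr = np.arange(n)

# Same computation two ways: a Python-level loop vs. one vectorized expression.
doubled_list = [x * 2 for x in py_list]
doubled_arr = np_arr * 2

# Time 10 repetitions of each approach.
loop_time = timeit.timeit(lambda: [x * 2 for x in py_list], number=10)
vec_time = timeit.timeit(lambda: np_arr * 2, number=10)
print(f'loop: {loop_time:.4f}s, vectorized: {vec_time:.4f}s')

# The array keeps fixed-size items in one contiguous buffer.
print(f'bytes per element: {np_arr.itemsize}')
print(f'total data bytes:  {np_arr.nbytes}')
print(f'C-contiguous:      {np_arr.flags["C_CONTIGUOUS"]}')
```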

## 4.1 The NumPy ndarray: A Multidimensional Array Object

In [1]:
# What is an array?
# An array is a multidimensional object that stores a collection of elements of the same type.
# Here, nested Python lists stand in for 1-D, 2-D, and 3-D arrays.
arr_1d = [1, 2]
arr_2d = [[1, 2], [3, 4]]
arr_3d = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]
if arr_3d[0] == arr_2d:
    print("arr_3d[0] == arr_2d")

arr_3d[0] == arr_2d


In [9]:
import numpy as np

# ndarray
# An ndarray is NumPy's N-dimensional array: all elements share one type (the dtype),
# and the dimensions are described by the shape tuple.
my_ndarr = np.array(arr_2d)
print(f'Type is {type(my_ndarr)}')
print(f'array is\n{my_ndarr}')
print(f'Shape is {my_ndarr.shape}')
print(f'Dtype is {my_ndarr.dtype}')

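# axis=0 takes the minimum down each column; axis=1 takes it across each row.
print(f'Column-wise min (axis=0) is {my_ndarr.min(axis=0)}')
print(f'Row-wise min (axis=1) is {my_ndarr.min(axis=1)}')


Type is <class 'numpy.ndarray'>
array is
[[1 2]
 [3 4]]
Shape is (2, 2)
Dtype is int64
Column-wise min (axis=0) is [1 2]
Row-wise min (axis=1) is [1 3]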


In [24]:
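# Zeros, ones, full, and empty arrays

# np.arange uses a half-open interval: start at 1, stop before 12, step 2.
arr = np.arange(1, 12, 2)
print(f'type arr is {type(arr)}')
print(f'arr is\n{arr}')

# The default dtype of np.zeros is float64. Note that (5) is just the int 5, not a tuple.
arr_zeros = np.zeros(5)
print(f'arr zeros is\n{arr_zeros}')

arr_zeros_int = np.zeros(5, dtype=int)
print(f'arr zeros int is\n{arr_zeros_int}')

# zeros_like copies the shape and dtype of an existing array.
arr_zeros_like = np.zeros_like(arr)
print(f'arr zeros like {arr} is\n{arr_zeros_like}')

arr_full = np.full(5, 10)
print(f'arr full is\n{arr_full}')

arr_full_3d = np.full((4, 3, 2), 10)
print(f'arr full 3d is\n{arr_full_3d}')

# Roundabout alternative to np.full: np.ones is float64, so the result is float, not int.
arr_full_3d_dumb = np.ones((4, 3, 2)) * 10
print(f'arr full 3d dumb is\n{arr_full_3d_dumb}')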

type arr is <class 'numpy.ndarray'>
arr is
[ 1  3  5  7  9 11]
arr zeros is
[0. 0. 0. 0. 0.]
arr zeros int is
[0 0 0 0 0]
arr zeros like [ 1  3  5  7  9 11] is
[0 0 0 0 0 0]
arr full is
[10 10 10 10 10]
arr full 3d is
[[[10 10]
  [10 10]
  [10 10]]

 [[10 10]
  [10 10]
  [10 10]]

 [[10 10]
  [10 10]
  [10 10]]

 [[10 10]
  [10 10]
  [10 10]]]
arr full 3d dumb is
[[[10. 10.]
  [10. 10.]
  [10. 10.]]

 [[10. 10.]
  [10. 10.]
  [10. 10.]]

 [[10. 10.]
  [10. 10.]
  [10. 10.]]

 [[10. 10.]
  [10. 10.]
  [10. 10.]]]


In [27]:
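# np.empty allocates memory without initializing it, so the contents are arbitrary.
# An empty shape tuple () produces a zero-dimensional array holding a single value.
arr2 = np.empty(())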

print('pause')

pause
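Since the cells above do not show `np.ones` or a shaped `np.empty`, here is a minimal sketch of both. The shapes and variable names are chosen only for illustration.

```python
import numpy as np

# np.empty only allocates memory; treat the initial contents as garbage.
uninit = np.empty((2, 3))
uninit[:] = 0.0   # fill explicitly before reading the values

# np.ones mirrors np.zeros; dtype defaults to float64 unless given.
ones_int = np.ones((2, 3), dtype=int)

# ones_like copies the shape and dtype of an existing array.
ones_like = np.ones_like(uninit)

print(uninit.shape, uninit.dtype)
print(ones_int)
print(ones_like.dtype)
```

`np.empty` is worth using only when every element will be overwritten before it is read.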


### Interval

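- closed interval [a, b]: contains both a and b

- open interval (a, b): contains neither a nor b

- half-open interval [a, b): contains a but not b √ (the convention used by Python slicing and `np.arange`)

- half-open interval (a, b]: contains b but not a


[a:] from index a to the end

[:b] from the start up to, but not including, index b

[:] all elements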
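Python slicing and `np.arange` both follow the half-open convention [a, b). The short sketch below applies the slice forms listed above to a small array; the array itself is just an example.

```python
import numpy as np

arr = np.arange(0, 10)   # half-open: 0 is included, 10 is not

middle = arr[2:5]    # indices 2, 3, 4
tail = arr[7:]       # index 7 to the end
head = arr[:3]       # start up to, but not including, index 3
everything = arr[:]  # every element

# A convenient consequence of [a, b): the slice arr[a:b] has length b - a.
a, b = 2, 5
print(middle, len(middle) == b - a)
print(tail)
print(head)
print(everything)
```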

