## Lecture III: NumPy

March 3, 2025

https://numpy.org/doc/stable/user/index.html

# 5 min warmup - 5 activity points!

In [None]:
import keyword
print(keyword.kwlist)

In [None]:
# will be useful today
# a helper for inspecting objects - similar to R's str()
print(list('abcd'))
dir(list('abcd')) 

### Warm up 2nd round:

In [None]:
# lambda function

In [None]:
x = [1, 2, 3, 4, 5]
doubled = list(map(lambda x: x * 2, x)) # map applies the function to each element of the list
print(doubled)

In [None]:
x = [1, 2, 3, 4, 5, 6, 7, 8]
even_numbers = list(filter(lambda x: x % 2 == 0, x)) # filter keeps the elements for which the function returns True
print(even_numbers)

In [None]:
pairs = [(1, 3), (2, 2), (3, 1)]
sorted_pairs = sorted(pairs, key=lambda x: x[1]) # sort by the second element of the tuple
print(sorted_pairs) 

# Moving forward from Python's primitive data types

In [None]:
# Integers
# Float
# Strings
# Boolean

In [None]:
[1, 'a', 3.14, True]

# Numpy <a name="numpy"></a>

* Num(erical) Py(thon)
* NumPy is at the base of Python's scientific stack of tools 
* Python already has *high-level number objects* (integers, floating point) and *containers*  (lists, dictionaries ) 
* np arrays contain *only one type* - unlike general lists
* **Memory-efficient container that provides fast numerical operations.**
* **ndarray** = block of memory + indexing scheme + data type descriptor
    * raw data 
    * how to locate an element
    * how to interpret an element
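The three ingredients listed above can be inspected on any array. A minimal sketch, assuming a C-contiguous array; the explicit `dtype=np.int64` is my choice so that the byte counts do not depend on the platform:

```python
import numpy as np

a = np.arange(6, dtype=np.int64).reshape(2, 3)

print(a.dtype)     # data type descriptor: how to interpret an element
print(a.itemsize)  # bytes used by one element
print(a.nbytes)    # size of the raw data block in bytes
print(a.strides)   # indexing scheme: bytes to skip for one step along each axis
```

With 8-byte elements and 3 columns, moving one row down skips 24 bytes and moving one column right skips 8 bytes.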

<img src="03_pics/ndarray.png" width="600">

Key Features of an `ndarray`:

- **Homogeneous:** All elements must have the same data type.
- **Multidimensional:** It can have any number of dimensions (1D, 2D, 3D, etc.).
- **Fixed size:** The size of the array is determined at creation and cannot change, though you can reshape it.
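A small sketch of what "homogeneous" means in practice: when the inputs have mixed types, NumPy picks one common type for all elements. The variable names are only illustrative:

```python
import numpy as np

floats = np.array([1, 2.5, True])      # int, float and bool are promoted to float
print(floats, floats.dtype)

texts = np.array([1, 'a', 3.14])       # with a string present, everything becomes a string
print(texts, texts.dtype)

reshaped = np.arange(6).reshape(3, 2)  # same 6 elements, only the shape changes
print(reshaped.shape)
```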


In [7]:
!pip freeze 

appnope==0.1.4
asttokens==3.0.0
comm==0.2.2
debugpy==1.8.12
decorator==5.2.1
executing==2.2.0
ipykernel==6.29.5
ipython==9.0.0
ipython_pygments_lexers==1.1.1
jedi==0.19.2
jupyter_client==8.6.3
jupyter_core==5.7.2
matplotlib-inline==0.1.7
nest-asyncio==1.6.0
numpy==2.2.3
packaging==24.2
parso==0.8.4
pexpect==4.9.0
platformdirs==4.3.6
prompt_toolkit==3.0.50
psutil==7.0.0
ptyprocess==0.7.0
pure_eval==0.2.3
Pygments==2.19.1
python-dateutil==2.9.0.post0
pyzmq==26.2.1
six==1.17.0
stack-data==0.6.3
tornado==6.4.2
traitlets==5.14.3
wcwidth==0.2.13


In [4]:
# NumPy is not part of the Python standard library, do you have it installed?

!pip freeze | grep numpy # bit of bash magic

In [5]:
!pip freeze > requirements.txt

In [None]:
# executing shell commands from the jupyter notebook
# !ls -lha
#!pip install numpy

In [6]:
!pip install numpy

Collecting numpy
  Using cached numpy-2.2.3-cp313-cp313-macosx_14_0_arm64.whl.metadata (62 kB)
Using cached numpy-2.2.3-cp313-cp313-macosx_14_0_arm64.whl (5.1 MB)
Installing collected packages: numpy
Successfully installed numpy-2.2.3


#### Import numpy

In [8]:
# np is an alias (aliases are used when package names are too long or coders are rightly lazy)
import numpy as np # very common usage

In [9]:
# Simple array
a = np.array(1, 2, 3, 4)

TypeError: array() takes from 1 to 2 positional arguments but 4 were given

In [10]:
# Simple array - you need to pass a single list (or another sequence)
a = np.array([0, 1, 2, 3, 4])
a

array([0, 1, 2, 3, 4])

In [11]:
a.append(5) # this will not work, numpy arrays are fixed size

AttributeError: 'numpy.ndarray' object has no attribute 'append'

In [12]:
a = np.append(a, 5) # this will work, but it returns a new array (a copy) instead of growing a in place
a

array([0, 1, 2, 3, 4, 5])
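A short check of what `np.append` does with the original array. This is only a sketch with illustrative names:

```python
import numpy as np

a = np.array([0, 1, 2])
b = np.append(a, 3)            # builds a new array with one more element

print(a)                       # the original is unchanged
print(b)
print(np.shares_memory(a, b))  # the new array has its own memory
```

Because every call copies the data, appending in a loop is slow. Collecting values in a Python list and converting once with `np.array` is usually the cheaper pattern.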

In [13]:
for i in a:
    print(i)

0
1
2
3
4
5


In [14]:
dir(a)

['T',
 '__abs__',
 '__add__',
 '__and__',
 '__array__',
 '__array_finalize__',
 '__array_function__',
 '__array_interface__',
 '__array_namespace__',
 '__array_priority__',
 '__array_struct__',
 '__array_ufunc__',
 '__array_wrap__',
 '__bool__',
 '__buffer__',
 '__class__',
 '__class_getitem__',
 '__complex__',
 '__contains__',
 '__copy__',
 '__deepcopy__',
 '__delattr__',
 '__delitem__',
 '__dir__',
 '__divmod__',
 '__dlpack__',
 '__dlpack_device__',
 '__doc__',
 '__eq__',
 '__float__',
 '__floordiv__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__getitem__',
 '__getstate__',
 '__gt__',
 '__hash__',
 '__iadd__',
 '__iand__',
 '__ifloordiv__',
 '__ilshift__',
 '__imatmul__',
 '__imod__',
 '__imul__',
 '__index__',
 '__init__',
 '__init_subclass__',
 '__int__',
 '__invert__',
 '__ior__',
 '__ipow__',
 '__irshift__',
 '__isub__',
 '__iter__',
 '__itruediv__',
 '__ixor__',
 '__le__',
 '__len__',
 '__lshift__',
 '__lt__',
 '__matmul__',
 '__mod__',
 '__mul__',
 '__ne__',
 '__neg__',


In [None]:
# a.__dir__()

- Use `dir()` to inspect an object's attributes.
- Use `__dir__()` when you need to customize what `dir()` returns for an object.

### NumPy array

In [15]:
a

array([0, 1, 2, 3, 4, 5])

In [16]:
print(a.ndim)

len(a)

1


6

In [17]:
print(a) # prints almost like list

[0 1 2 3 4 5]


In [18]:
a.shape # tuple with the length of each dimension

(6,)

An array → the axes (dimensions) we index along

<img src="03_pics/np-axis.png">

A different way to look at an array

<img src="03_pics/array_construct.png">

In [19]:
np?

Type:        module
String form: <module 'numpy' from '/Users/annabrichackova/Documents/GitHub/Data-Processing-in-Python/.venv/lib/python3.13/site-packages/numpy/__init__.py'>
File:        ~/Documents/GitHub/Data-Processing-in-Python/.venv/lib/python3.13/site-packages/numpy/__init__.py
Docstring:  
NumPy
=====

Provides
  1. An array object of arbitrary homogeneous items
  2. Fast mathematical operations over arrays
  3. Linear Algebra, Fourier Transforms, Random Number Generation

How to use the documentation
----------------------------
Documentation is available in two forms: docstrings provided
with the code, and a loose standing reference guide, available from
`the NumPy homepage <https://numpy.org>`_.

We recommend exploring the docstrings using
`IPython <https://ipython.org>`_, an advanced Python shell with
TAB-completion and introspection capabilities.  See below for further
instructions.

The docstring examples assume that `numpy` has be

In [20]:
np.array?

Docstring:
array(object, dtype=None, *, copy=True, order='K', subok=False, ndmin=0,
      like=None)

Create an array.

Parameters
----------
object : array_like
    An array, any object exposing the array interface, an object whose
    ``__array__`` method returns an array, or any (nested) sequence.
    If object is a scalar, a 0-dimensional array containing object is
    returned.
dtype : data-type, optional
    The desired data-type for the array. If not given, NumPy will try to use
    a default ``dtype`` that can represent the values (by applying promotion
    rules when necessary.)
copy : bool, optional
    If ``True`` (default), then the array data is copied. If ``None``,
    a copy will only be made if ``__array__`` returns a copy, if obj is
    a nested sequence, or if a copy is needed to satisfy any of the other
    requirements (``dtype``, ``order``, etc.). Note that any copy of
    the data is shallow, i.e., for arrays with object dtype, the new
    array will poi

**Create an ndarray:** You can create an `ndarray` from:

- A Python list or nested lists (for multi-dimensional arrays).
- Array-creation functions like `np.zeros()`, `np.ones()`, `np.arange()`, the functions in `np.random`, etc.

In [21]:
np.empty((2, 3)) # allocates memory without initializing it: the values are arbitrary (zeros here by chance)

array([[0., 0., 0.],
       [0., 0., 0.]])
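Besides `np.empty`, which leaves the memory uninitialized, a few creation functions come up all the time. A minimal sketch; the shapes, the fill value 7 and the seed 0 are arbitrary choices for illustration:

```python
import numpy as np

zeros = np.zeros((2, 3))        # 2x3 array of 0.0
ones = np.ones(4, dtype=int)    # length-4 array of integer ones
full = np.full((2, 2), 7)       # 2x2 array filled with 7
eye = np.eye(3)                 # 3x3 identity matrix

rng = np.random.default_rng(0)  # random generator with an arbitrary seed
uniform = rng.random((2, 2))    # uniform samples from [0, 1)

print(zeros, ones, full, eye, uniform, sep="\n")
```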

In [22]:
# Simple array
List = [2.0, 3.0, 16.9, 17.2, 1.0]
List2 = np.array(List) # converting list to ndarray
print("Elements in List2 variable are : ", List2)
print("Type of our List2 variable is : ", type(List2))

Elements in List2 variable are :  [ 2.   3.  16.9 17.2  1. ]
Type of our List2 variable is :  <class 'numpy.ndarray'>


## How can you define a matrix?

### array of arrays?

In [23]:
# multi-dimensional objects
# an array of arrays is a matrix
a = np.array([
    [1,3], [2,3]
])

In [24]:
a.shape # 2 rows, 2 columns

(2, 2)

In [25]:
a

array([[1, 3],
       [2, 3]])

In [26]:
arr = np.array([[1, 2, 3], [4, 5, 6]])

In [27]:
print("Array:\n", arr)
print("Shape:", arr.shape)    # (2, 3)
print("Data type:", arr.dtype)  # int64
print("Number of dimensions:", arr.ndim)  # 2
print("Size:", arr.size)    # 6

Array:
 [[1 2 3]
 [4 5 6]]
Shape: (2, 3)
Data type: int64
Number of dimensions: 2
Size: 6


Construct array like a civilized person. (Martin Hronec's way! :)

In [28]:
xx = np.arange(6) # like list(range(6)), but returns an ndarray

In [29]:
xx

array([0, 1, 2, 3, 4, 5])

In [30]:
xx.dtype

dtype('int64')

In [31]:
nda = np.arange(8).reshape((2,2,2))
nda

array([[[0, 1],
        [2, 3]],

       [[4, 5],
        [6, 7]]])

In [32]:
nda.shape # corresponds to the shape from initialization

(2, 2, 2)

In [33]:
np.arange(10).reshape((2,6)) # is this ok? 2*6 = 12 elements, but the array has only 10

ValueError: cannot reshape array of size 10 into shape (2,6)
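A reshape only works when the number of elements stays the same. A small sketch of the rule, including the `-1` placeholder that lets NumPy infer one dimension:

```python
import numpy as np

x = np.arange(10)

print(x.reshape((2, 5)).shape)   # 2 * 5 == 10, so this works
print(x.reshape((2, -1)).shape)  # -1 asks NumPy to infer the missing dimension

try:
    x.reshape((2, 6))            # 2 * 6 == 12, which does not match 10 elements
except ValueError as err:
    print("ValueError:", err)
```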

Dimensions and manipulation

In [34]:
a = np.arange(6)

In [35]:
a

array([0, 1, 2, 3, 4, 5])

In [36]:
a.shape

(6,)

In [None]:
a

In [37]:
a[np.newaxis, :]

array([[0, 1, 2, 3, 4, 5]])

In [38]:
a[np.newaxis, :].shape

(1, 6)

In [39]:
a[:, np.newaxis]

array([[0],
       [1],
       [2],
       [3],
       [4],
       [5]])

In [None]:
a[:, np.newaxis].shape

In [40]:
np.expand_dims(a, axis=0).shape

(1, 6)

In [41]:
np.expand_dims(a, axis=1).shape

(6, 1)

In [42]:
b = np.expand_dims(a, axis=1)
b

array([[0],
       [1],
       [2],
       [3],
       [4],
       [5]])

In [None]:
a[0]

In [None]:
b[0]

In [None]:
b[0][0]

Difference between `list` and `ndarray`

In [43]:
n = [1,2]
m = [3,4]
n + m

[1, 2, 3, 4]

In [44]:
k = np.array([1,2])
l = np.array([3,4])
k + l

array([4, 6])

## Broadcasting !!!

In [45]:
k * l

array([3, 8])
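The product `k * l` is element-wise on two arrays of the same shape. Broadcasting in the stricter sense kicks in when the shapes differ: NumPy compares the shapes from the last dimension backwards, and two dimensions are compatible when they are equal or one of them is 1. A minimal sketch with made-up arrays:

```python
import numpy as np

col = np.arange(3).reshape(3, 1)  # shape (3, 1)
row = np.arange(4)                # shape (4,)

print(np.broadcast_shapes(col.shape, row.shape))  # shape of the result
table = col * 10 + row            # every combination of a col entry and a row entry
print(table)

try:
    np.arange(3) + np.arange(4)   # lengths 3 and 4 are not compatible
except ValueError as err:
    print("ValueError:", err)
```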

`NumPy` provides familiar mathematical functions such as sin, cos, and exp. In NumPy, these are called “universal functions” (ufunc). Within NumPy, these functions operate element-wise on an array, producing an array as output. (from docs)

```
all, any, apply_along_axis, argmax, argmin, argsort, average, bincount, ceil, clip, conj, corrcoef, cov, cross, cumprod, cumsum, diff, dot, floor, inner, invert, lexsort, max, maximum, mean, median, min, minimum, nonzero, outer, prod, re, round, sort, std, sum, trace, transpose, var, vdot, vectorize, where
```

In [46]:
a = np.array([1,2,3,4,5])
np.add(a, 1) # element-wise addition; the scalar 1 is broadcast to every element

array([2, 3, 4, 5, 6])

In [47]:
a + 1

array([2, 3, 4, 5, 6])

In [48]:
(a > 3).all() # are all elements > 3? note: a.all() > 3 would compare a single boolean with 3

np.False_

In [49]:
a > 3

array([False, False, False,  True,  True])

In [50]:
np.all(a == a) # all elements are equal to themselves

np.True_
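Comparisons give boolean arrays, and those are useful on their own. A short sketch of the usual follow-up steps (reduce the mask, count with it, index with it):

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5])
mask = a > 3       # element-wise comparison gives a boolean array

print(mask.any())  # is at least one element > 3?
print(mask.all())  # are all elements > 3?
print(mask.sum())  # how many elements are > 3 (True counts as 1)
print(a[mask])     # boolean indexing keeps the elements where mask is True
```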

In [51]:
# evenly spaced

# chain operations on a single object
a = np.arange(10).reshape((2,5))

print(a)
print(a.mean())

[[0 1 2 3 4]
 [5 6 7 8 9]]
4.5


In [52]:
# Built-in Functions
x = np.arange(1, 6)
np.sqrt(x), np.exp(x), np.log(x), np.sin(x), np.cos(x)

(array([1.        , 1.41421356, 1.73205081, 2.        , 2.23606798]),
 array([  2.71828183,   7.3890561 ,  20.08553692,  54.59815003,
        148.4131591 ]),
 array([0.        , 0.69314718, 1.09861229, 1.38629436, 1.60943791]),
 array([ 0.84147098,  0.90929743,  0.14112001, -0.7568025 , -0.95892427]),
 array([ 0.54030231, -0.41614684, -0.9899925 , -0.65364362,  0.28366219]))

In [53]:
c = np.array([1, 2, 3])
d = np.array([[1], [2], [3]])
print(c + d)

[[2 3 4]
 [3 4 5]
 [4 5 6]]


In [54]:
a.mean(axis=0) # mean of each column

array([2.5, 3.5, 4.5, 5.5, 6.5])

In [55]:
a.mean(axis=0)  # why axis 0? we move down the rows, so we get one mean per column
b = a.mean(axis=1)  # moving along the columns: one mean per row
b

array([2., 7.])

In [56]:
a.std(axis=1)

array([1.41421356, 1.41421356])
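The `axis` argument names the dimension that gets collapsed. A small sketch on the same 2x5 array; `keepdims=True` is shown because it keeps the result broadcastable against the original:

```python
import numpy as np

a = np.arange(10).reshape((2, 5))

col_means = a.mean(axis=0)                    # collapse axis 0 (rows): one value per column
row_means = a.mean(axis=1)                    # collapse axis 1 (columns): one value per row
row_means_2d = a.mean(axis=1, keepdims=True)  # keep the collapsed axis with length 1

centered = a - row_means_2d                   # (2, 5) minus (2, 1) broadcasts across each row
print(col_means.shape, row_means.shape, row_means_2d.shape)
print(centered)
```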

In [57]:
a.argmax() # index (in the flattened array) of the largest element; here it happens to equal the value

np.int64(9)
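`argmax` returns a position, not a value. A sketch with a hypothetical matrix `m` where the two differ:

```python
import numpy as np

m = np.array([[3, 9, 1],
              [7, 2, 8]])

flat_idx = m.argmax()                     # position in the flattened array
pos = np.unravel_index(flat_idx, m.shape) # convert it to a (row, column) index
per_row = m.argmax(axis=1)                # column of the largest element in each row

print(flat_idx, pos, m[pos], per_row)
```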

In [58]:
a

array([[0, 1, 2, 3, 4],
       [5, 6, 7, 8, 9]])

In [59]:
a.cumsum()

array([ 0,  1,  3,  6, 10, 15, 21, 28, 36, 45])

In [60]:
a.cumprod(axis=1) #why is it useful?

array([[    0,     0,     0,     0,     0],
       [    5,    30,   210,  1680, 15120]])
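One common use of a cumulative product is compounding. A minimal sketch with made-up period returns:

```python
import numpy as np

returns = np.array([0.10, -0.05, 0.02])  # hypothetical returns per period
growth = np.cumprod(1 + returns)         # value of 1 unit invested, after each period

print(growth)
```

The last entry is the total growth factor over all periods.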

In [61]:
a.shape

(2, 5)

In [62]:
a.T.dot(a) #what is going on here?

array([[25, 30, 35, 40, 45],
       [30, 37, 44, 51, 58],
       [35, 44, 53, 62, 71],
       [40, 51, 62, 73, 84],
       [45, 58, 71, 84, 97]])

In [63]:
a.dot(a.T) #what is going on here?

array([[ 30,  80],
       [ 80, 255]])
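Both results are matrix products, and the shapes tell which one you get. A short sketch using the `@` operator, which is equivalent to `.dot` for 2D arrays:

```python
import numpy as np

a = np.arange(10).reshape((2, 5))

gram_cols = a.T @ a  # (5, 2) @ (2, 5) -> (5, 5)
gram_rows = a @ a.T  # (2, 5) @ (5, 2) -> (2, 2)

print(gram_cols.shape, gram_rows.shape)
print(gram_rows[0, 1] == (a[0] * a[1]).sum())  # each entry is a dot product of two rows
```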

In [64]:
# bit of extra linear algebra

sq = np.arange(4).reshape((1,-1))
sq

array([[0, 1, 2, 3]])

In [65]:
np.arange(4).reshape(1,-1)

array([[0, 1, 2, 3]])

In [66]:
print("Shape:", sq.shape)
sq

Shape: (1, 4)


array([[0, 1, 2, 3]])

In [67]:
x

array([1, 2, 3, 4, 5])

In [68]:
sq.T.dot(sq)

array([[0, 0, 0, 0],
       [0, 1, 2, 3],
       [0, 2, 4, 6],
       [0, 3, 6, 9]])


In [70]:
# generate sequences

# number of points from an interval
start = 0
end = 1
n_points = 100
a = np.linspace(start, end, n_points) #R: seq()
a

array([0.        , 0.01010101, 0.02020202, 0.03030303, 0.04040404,
       0.05050505, 0.06060606, 0.07070707, 0.08080808, 0.09090909,
       0.1010101 , 0.11111111, 0.12121212, 0.13131313, 0.14141414,
       0.15151515, 0.16161616, 0.17171717, 0.18181818, 0.19191919,
       0.2020202 , 0.21212121, 0.22222222, 0.23232323, 0.24242424,
       0.25252525, 0.26262626, 0.27272727, 0.28282828, 0.29292929,
       0.3030303 , 0.31313131, 0.32323232, 0.33333333, 0.34343434,
       0.35353535, 0.36363636, 0.37373737, 0.38383838, 0.39393939,
       0.4040404 , 0.41414141, 0.42424242, 0.43434343, 0.44444444,
       0.45454545, 0.46464646, 0.47474747, 0.48484848, 0.49494949,
       0.50505051, 0.51515152, 0.52525253, 0.53535354, 0.54545455,
       0.55555556, 0.56565657, 0.57575758, 0.58585859, 0.5959596 ,
       0.60606061, 0.61616162, 0.62626263, 0.63636364, 0.64646465,
       0.65656566, 0.66666667, 0.67676768, 0.68686869, 0.6969697 ,
       0.70707071, 0.71717172, 0.72727273, 0.73737374, 0.74747
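`np.linspace` takes the number of points and includes the end point, while `np.arange` takes a step and excludes the end. A small sketch of the difference:

```python
import numpy as np

points, step = np.linspace(0, 1, 5, retstep=True)  # 5 points, end point included
grid = np.arange(0, 1, 0.25)                       # step 0.25, end point excluded

print(points, step)
print(grid)
```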

### Why is it useful?

In [None]:
### generating data with numpy is easy

In [71]:
dir(np.random)

['BitGenerator',
 'Generator',
 'MT19937',
 'PCG64',
 'PCG64DXSM',
 'Philox',
 'RandomState',
 'SFC64',
 'SeedSequence',
 '__RandomState_ctor',
 '__all__',
 '__builtins__',
 '__cached__',
 '__doc__',
 '__file__',
 '__loader__',
 '__name__',
 '__package__',
 '__path__',
 '__spec__',
 '_bounded_integers',
 '_common',
 '_generator',
 '_mt19937',
 '_pcg64',
 '_philox',
 '_pickle',
 '_sfc64',
 'beta',
 'binomial',
 'bit_generator',
 'bytes',
 'chisquare',
 'choice',
 'default_rng',
 'dirichlet',
 'exponential',
 'f',
 'gamma',
 'geometric',
 'get_bit_generator',
 'get_state',
 'gumbel',
 'hypergeometric',
 'laplace',
 'logistic',
 'lognormal',
 'logseries',
 'mtrand',
 'multinomial',
 'multivariate_normal',
 'negative_binomial',
 'noncentral_chisquare',
 'noncentral_f',
 'normal',
 'pareto',
 'permutation',
 'poisson',
 'power',
 'rand',
 'randint',
 'randn',
 'random',
 'random_integers',
 'random_sample',
 'ranf',
 'rayleigh',
 'sample',
 'seed',
 'set_bit_generator',
 'set_state',
 'shuf

In [72]:
np.random?

Type:        module
String form: <module 'numpy.random' from '/Users/annabrichackova/Documents/GitHub/Data-Processing-in-Python/.venv/lib/python3.13/site-packages/numpy/random/__init__.py'>
File:        ~/Documents/GitHub/Data-Processing-in-Python/.venv/lib/python3.13/site-packages/numpy/random/__init__.py
Docstring:  
Random Number Generation

Use ``default_rng()`` to create a `Generator` and call its methods.

Generator
--------------- ---------------------------------------------------------
Generator       Class implementing all of the random number distributions
default_rng     Default constructor for ``Generator``

BitGenerator Streams that work with Generator
--------------------------------------------- ---
MT19937
PCG64
PCG64DXSM
Philox
SFC64

Getting entropy to initialize a BitGenerator
--------------------------------------------- ---
SeedSequence


Legacy
------

For backwards compatibility with previous versions of numpy before 1.17,

In [75]:
# the seed sets the state of NumPy's global (legacy) generator; it is not cell-specific!
np.random.seed(1234)

# random (normal)
r = np.random.randn(4)
r

array([ 0.47143516, -1.19097569,  1.43270697, -0.3126519 ])

In [76]:
np.random.randn(4) # re-running this yields a different result -> every call advances the generator, the seed only fixes the starting point

array([-0.72058873,  0.88716294,  0.85958841, -0.6365235 ])
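A small check of how the legacy seed behaves: setting the same seed again replays the same sequence, while consecutive calls without re-seeding keep moving forward.

```python
import numpy as np

np.random.seed(1234)
first = np.random.randn(4)
second = np.random.randn(4)  # continues the sequence, so it differs from first

np.random.seed(1234)         # reset the global state to the same starting point
replay = np.random.randn(4)

print(np.array_equal(first, replay))
print(np.array_equal(first, second))
```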

In [78]:
np.random.standard_t(4, 4) #t-distribution

array([ 0.89486102,  0.06915798,  0.3986898 , -0.70403755])

In [77]:
np.random.uniform(0, 1, 4) #uniform distribution

array([0.95813935, 0.87593263, 0.35781727, 0.50099513])
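The `np.random` docstring recommends `default_rng()` over the legacy functions. A minimal sketch of the same three draws with a `Generator`; the seed 1234 is reused only for illustration, and the numbers will not match the legacy output above:

```python
import numpy as np

rng = np.random.default_rng(1234)     # generator object with its own state
normal = rng.standard_normal(4)       # counterpart of np.random.randn(4)
t_draws = rng.standard_t(4, size=4)   # t-distribution with 4 degrees of freedom
uniform = rng.uniform(0, 1, size=4)   # uniform on [0, 1)

rng_again = np.random.default_rng(1234)
print(np.array_equal(normal, rng_again.standard_normal(4)))  # same seed, same first draw
```

Passing the generator around explicitly avoids depending on hidden global state.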

# A crucial skill

### Indexing and Slicing
* In 2D, the first dimension corresponds to rows, the second to columns.
* In the multidimensional case, `a[0]` returns all elements along the remaining (unspecified) dimensions

<img src="03_pics/indexing.png">

In [None]:
a = np.arange(27).reshape((3,3,3))

<img src="03_pics/array_construct.png">

In [None]:
a

In [None]:
a[0]

In [None]:
a[0,:,:]

In [None]:
a[0,0,:]

In [None]:
a[1,2,:]
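The cells above can be checked with a little arithmetic: in `np.arange(27).reshape((3,3,3))` the element at `[i, j, k]` equals `9*i + 3*j + k`. A short sketch:

```python
import numpy as np

a = np.arange(27).reshape((3, 3, 3))

print(a[0].shape)                        # (3, 3): the first 3x3 block
print(np.array_equal(a[0], a[0, :, :]))  # True: both select the same block
print(a[0, 0, :])                        # [0 1 2]
print(a[1, 2, :])                        # [15 16 17]
print(a[1, 2, 0])                        # 15
```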

In [None]:
# create toy diagonal matrix
a = np.diag([1,2,3,4])
a

In [None]:
a > 0

Indexing and Slicing:

In [None]:
# print(a[2]) 

# print(a[2,:]) #slicing - equivalent to the line above

# print(a[2][2]) #access a single element of the matrix
# print(a[2,2])

# print(a[:,1])

# print(a[:,-2:])

In [None]:
a

In [None]:
a[0:3:2] # by default going by axis 0

In [None]:
a[:,0:3:2] # now going along axis 1 - notice the ":," at the beginning

In [None]:
# select from start to end with a certain step (an omitted start is the same as 0)
# advanced tricks

a[:3:2,:3:2] # step n takes every n-th element
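A sketch that runs the slices from the cells above on the toy diagonal matrix and prints what each one selects:

```python
import numpy as np

a = np.diag([1, 2, 3, 4])

print(a[2])            # third row
print(a[2, 2])         # single element
print(a[:, 1])         # second column
print(a[:, -2:])       # last two columns
print(a[0:3:2])        # rows 0 and 2
print(a[:, 0:3:2])     # columns 0 and 2
print(a[:3:2, :3:2])   # rows 0 and 2, columns 0 and 2
```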

In [None]:
s = np.arange(100)
print(s)
# step can also be negative
# start:end:step
print(s[:80:-3])

The sliced array `s[:80:-3]` results in `[99, 96, 93, 90, 87, 84, 81]`. 
This array is generated by starting from the end of s (since the start index is omitted and step is negative, it defaults to the last item, which is 99), and includes every third element in reverse order until just before it reaches the index 80.
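A few more negative-step slices on the same array, as a quick sketch of the `start:end:step` rule (the end index is always excluded):

```python
import numpy as np

s = np.arange(100)

print(s[:80:-3])    # from the last element down to, but not including, index 80
print(s[::-1][:3])  # reversed array, first three elements
print(s[10:0:-3])   # from index 10 down to, but not including, index 0
print(s[5::-1])     # from index 5 down to the first element
```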

**Copies vs. views**
* slicing creates a **view** on the original array (just a way of accessing the array data)
    * the original array is not copied in memory
* when modifying the view, the original array is modified as well! (SURPRISE, SURPRISE)
    * this saves memory and time
* In CS this is related to the distinction between a **shallow copy** and a **deep copy**
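A minimal sketch of the points above: a basic slice is a view, while indexing with a list of positions returns a copy. The names `view` and `picked` are only illustrative:

```python
import numpy as np

a = np.arange(10)

view = a[2:5]                     # basic slicing: a view on a
view[0] = 99                      # writes through to the original
print(a)
print(view.base is a)             # the view remembers which array owns the data
print(np.shares_memory(a, view))

picked = a[[2, 3, 4]]             # indexing with a list of positions: a copy
picked[0] = -1                    # does not touch a
print(a[2])
```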

In [None]:
a = [10]
b = a
print(id(a), id(b))
print(a, b)
a[0] = 5
print(a, b)
print(id(a), id(b))

In [None]:
a = np.arange(10)
print(a)
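b = a  # plain assignment: b is another name for the same array (no copy, not even a view)
b[2] = 22

# #print(a.data, b.data)

print(a)
print(b) # a and b are the same?? yes, both names refer to one array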

In [None]:
print(np.may_share_memory(a, b)) # True: a and b refer to the same memory

In [None]:
a = np.arange(10)
c = a.copy()  # force a copy -> create new memory
c[0] = 12
print(c)
print(a)

print(np.may_share_memory(a, c))
#print(a.data, c.data)

### Typical mistake in pandas slicing dataframes → stay tuned for next lecture!

SettingWithCopyWarning:
 
A value is trying to be set on a copy of a slice from a DataFrame.

Try using .loc[row_indexer,col_indexer] = value instead

**Speed of basic numpy operations**
   * much faster than in pure Python

In [79]:
# if unsure about an algorithm,
# run it on a small sample first to get a time estimate, before starting a run that takes days...
a = np.arange(10000)
%timeit -n 100 a + 1

#caching results is good!

The slowest run took 4.93 times longer than the fastest. This could mean that an intermediate result is being cached.
6.07 μs ± 4.82 μs per loop (mean ± std. dev. of 7 runs, 100 loops each)


In [80]:
l = range(10000)
%timeit -n 100 [i+1 for i in l] 

232 μs ± 44.4 μs per loop (mean ± std. dev. of 7 runs, 100 loops each)


In [None]:
# remember the difference between %time and %timeit
# timeit runs a number of loops
# time times just one evaluation of the cell

%time res = [i+1 for i in a]

In [None]:
%timeit res =  a + 1 

**Changing shape of an array**
* flattening
* reshaping (inverse of flattening)

In [81]:
a = np.array([[1, 2, 3], [4, 5, 6]])
a

array([[1, 2, 3],
       [4, 5, 6]])

In [82]:
# flattening
print(a.flatten())


[1 2 3 4 5 6]


In [None]:
print(a.shape)
a.T

In [None]:
a.T.flatten() # or use flatten/ravel with order='F' for Fortran-style (column-wise) order

In [None]:
a.T.flatten(order='F')

In [None]:
a.T.flatten(order='C')
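A sketch that spells out what the `order` argument does on the 2x3 array, plus the difference between `flatten` (always a copy) and `ravel` (a view when the memory layout allows it):

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])

print(a.flatten())             # row by row: [1 2 3 4 5 6]
print(a.flatten(order='F'))    # column by column: [1 4 2 5 3 6]
print(a.T.flatten())           # rows of the transpose: [1 4 2 5 3 6]
print(a.T.flatten(order='F'))  # columns of the transpose: [1 2 3 4 5 6]

print(np.shares_memory(a, a.flatten()))  # False: flatten copies
print(np.shares_memory(a, a.ravel()))    # True: ravel returns a view here
```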

In [None]:
np.exp(a) # a bit like R - vectorized (element-wise) operations

**Pictures? Just pixels.**

In [83]:
# we will get to matplotlib and pyplot in the last part of the lecture
import matplotlib.pyplot as plt
# another ipython magic
%matplotlib inline 

ModuleNotFoundError: No module named 'matplotlib'

In [None]:
# for more M.C. Escher's pictures: https://www.mcescher.com/
import matplotlib.pyplot as plt

img = plt.imread("03_pics/mc_escher_print gallery.png")
plt.imshow(img, interpolation="nearest", aspect="auto")

In [None]:
type(img)

In [None]:
img

In [None]:
# image shape as (H, W, D), depth: https://www.wikiwand.com/en/Color_depth
img.shape

In [None]:
# just an array!
img

In [None]:
self_centered = img[200:,200:500]

In [None]:
plt.imshow(self_centered)

In [None]:
lx, ly, ld = img.shape
X, Y = np.ogrid[0:lx, 0:ly]
mask = (X - lx / 2) ** 2 + (Y - ly / 2) ** 2 > lx * ly / 4
img[mask] = 0
img[range(300), range(300)] = 255

plt.figure(figsize=(3, 3))
plt.axes([0, 0, 1, 1])
plt.imshow(img, cmap=plt.cm.gray)
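The masking trick works without an image file too. A minimal sketch on a synthetic 5x5 "image" of ones; the size, the centre (2, 2) and the radius 2 are arbitrary choices for illustration:

```python
import numpy as np

h, w = 5, 5
img = np.ones((h, w))                   # stand-in for a grayscale image
X, Y = np.ogrid[0:h, 0:w]               # X has shape (h, 1), Y has shape (1, w)

mask = (X - 2) ** 2 + (Y - 2) ** 2 > 4  # True outside a circle of radius 2 around (2, 2)
img[mask] = 0                           # boolean mask assignment blacks out those pixels

print(mask.shape)
print(img)
```

`np.ogrid` returns two thin arrays, and broadcasting expands them to the full grid only inside the comparison.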

In [None]:
# Element-wise operations
array1 = np.array([1, 2, 3])
array2 = np.array([4, 5, 6])

# Addition
sum_arrays = array1 + array2

# Multiplication
product_arrays = array1 * array2

sum_arrays, product_arrays

# Other topics

In [None]:

# Joining arrays
array1 = np.array([1, 2, 3])
array2 = np.array([4, 5, 6])
joined_array = np.concatenate((array1, array2)) #concat is a general concept

# Splitting arrays
split_arrays = np.split(joined_array, 2)
#from docs:
# If indices_or_sections is an integer, N, the array will be divided into N equal arrays along axis. If such a split is not possible, an error is raised
# If indices_or_sections is a 1-D array of sorted integers, the entries indicate where along axis the array is split. For example, [2, 3] would, for axis=0, result in
    # ary[:2]
    # ary[2:3]
    # ary[3:]

joined_array, split_arrays
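A short sketch of the two forms of `np.split` quoted above, plus `np.array_split`, which tolerates uneven sections:

```python
import numpy as np

x = np.arange(6)

halves = np.split(x, 2)                   # integer: 2 equal parts
pieces = np.split(x, [2, 3])              # indices: x[:2], x[2:3], x[3:]
uneven = np.array_split(np.arange(7), 3)  # 7 elements into 3 parts of sizes 3, 2, 2

print(halves, pieces, uneven, sep="\n")

try:
    np.split(np.arange(7), 3)             # 7 is not divisible by 3
except ValueError as err:
    print("ValueError:", err)
```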

In [None]:
#other operations
print(array1.sum())
# for a 2D array, argmax returns the index into the flattened array
print(array1.argmax())
print(array1.mean())

### Vectorize function

A function that computes `f(x) = x^2 + 2x + 1`, applied element-wise to a NumPy array using `np.vectorize()`.

In [None]:
def poly_fun(x):
    return x**2 + 2*x + 1

vectorized_func = np.vectorize(poly_fun) # vectorize the function
arr = np.arange(-5, 6) # array from -5 to 5

print("Original array:", arr)
print("Transformed array:", vectorized_func(arr))

In [None]:
%timeit poly_fun(arr)

In [None]:
%timeit vectorized_func(arr)

In [None]:
poly_fun([-5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5]) # a plain list is not vectorized: this raises a TypeError

In [None]:
vectorized_func([-5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5])
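For a function built only from arithmetic, the array operations are already element-wise, so converting the list to an array is enough. `np.vectorize` is mainly a convenience wrapper around a Python loop, not a speed-up. A minimal sketch comparing the two routes:

```python
import numpy as np

def poly_fun(x):
    return x**2 + 2*x + 1

values = [-5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5]

direct = poly_fun(np.asarray(values))           # arithmetic on an ndarray is element-wise
via_vectorize = np.vectorize(poly_fun)(values)  # wrapper that calls poly_fun per element

print(direct)
print(np.array_equal(direct, via_vectorize))
```

`np.vectorize` earns its place when the function contains logic that does not work on whole arrays, such as an `if` on a scalar.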

### Simulate Random walk

In [None]:
import matplotlib.pyplot as plt

steps = np.random.choice([-1, 1], size=300) # random steps
position = np.cumsum(steps) # cumulative sum of the steps

plt.plot(position)
plt.title("1D Random Walk")
plt.xlabel("Step")
plt.ylabel("Position")
plt.show()

### Play with matrices

In [None]:
A = np.random.randint(1, 10, (3, 3))
B = np.random.randint(1, 10, (3, 3))

C = np.dot(A, B) # matrix multiplication
inv_A = np.linalg.inv(A) # matrix inverse (raises LinAlgError if A happens to be singular)
det_A = np.linalg.det(A) # matrix determinant
eigenvalues, eigenvectors = np.linalg.eig(A) # matrix eigenvalues and eigenvectors

print("Matrix A:\n", A)
print("Matrix B:\n", B)
print("Matrix Multiplication A @ B:\n", C)
print("Inverse of A:\n", inv_A)
print("Determinant of A:", det_A)
print("Eigenvalues of A:", eigenvalues)
print("Eigenvectors of A:\n", eigenvectors)
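The results can be sanity-checked against their definitions. A sketch with a small fixed matrix (chosen by me so that it is certainly invertible) instead of random integers:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])

inv_A = np.linalg.inv(A)
print(np.allclose(A @ inv_A, np.eye(2)))  # A times its inverse is the identity

eigenvalues, eigenvectors = np.linalg.eig(A)
# every column v of eigenvectors satisfies A @ v == eigenvalue * v
print(np.allclose(A @ eigenvectors, eigenvectors * eigenvalues))

# the determinant equals the product of the eigenvalues
print(np.isclose(np.linalg.det(A), np.prod(eigenvalues)))
```

`np.allclose` is used instead of `==` because floating point results are only accurate up to rounding.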