# What is NumPy?
NumPy is the fundamental package for scientific computing in Python. 

It is a Python library that provides a multidimensional array object, various derived objects (such as masked arrays and matrices), and an assortment of routines for fast operations on arrays, including mathematical, logical, shape manipulation, sorting, selecting, I/O, discrete Fourier transforms, basic linear algebra, basic statistical operations, random simulation and much more.

### The NumPy ndarray: A Multidimensional Array Object
At the core of the NumPy package is the `ndarray` object. It encapsulates n-dimensional arrays of homogeneous data types, with many operations performed in compiled code for performance.

Arrays enable you to perform mathematical operations on whole blocks of data using similar syntax to the equivalent operations between scalar elements.
In other words, we can operate on whole arrays of numbers with the same syntax we would use on single numbers.

An ndarray is a generic multidimensional container for "homogeneous" data; that is, all of the elements must be the same type. 
Every array has a shape, a tuple indicating the size of each dimension, and a dtype, an object describing the data type of the array.
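As a small illustration of homogeneity (a sketch; the variable names `mixed` and `grid` are chosen only for this example), NumPy settles on one dtype that can hold every element when the input mixes types:

```python
import numpy as np

# Mixed ints and floats: NumPy upcasts everything to one common dtype.
mixed = np.array([1, 2, 3.5])
print(mixed.dtype)    # float64
print(mixed.shape)    # (3,)

# A nested list becomes a 2-D array; shape holds the size of each dimension.
grid = np.array([[1, 2, 3], [4, 5, 6]])
print(grid.shape)     # (2, 3)
```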

- NumPy arrays have a **fixed size at creation**, unlike Python lists (which can grow dynamically). Changing the size of an ndarray will create a new array and delete the original.
- NumPy arrays facilitate advanced mathematical and other types of operations on large numbers of data. Typically, such operations are executed more efficiently and with less code than is possible using Python’s built-in sequences.
- A growing plethora of scientific and mathematical Python-based packages are using NumPy arrays; though these typically support Python-sequence input, they convert such input to NumPy arrays prior to processing, and they often output NumPy arrays.
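The fixed-size behaviour from the first bullet can be sketched as follows. This example uses `np.append`, which is not mentioned above, as one way to "resize" an array. Here the original array survives only because the name `a` still refers to it.

```python
import numpy as np

a = np.array([1, 2, 3])

# np.append does not grow `a` in place; it returns a new, larger array.
b = np.append(a, 4)

print(a)          # [1 2 3]  (the original is unchanged)
print(b)          # [1 2 3 4]
print(b is a)     # False
```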


## Why is NumPy Fast?
**Vectorization** describes the absence of any explicit looping, indexing, etc., in the code - these things are taking place, of course, just “behind the scenes” in optimized, pre-compiled C code.

- Vectorized code is more concise and easier to read.
- The code more closely resembles standard mathematical notation.

Let's see some examples of how this works:

### Arithmetic with NumPy Arrays
Arrays are important because they enable you to express batch operations on data without writing any for loops. 
NumPy users call this "vectorization". Any arithmetic operation between equal-size arrays is applied element-wise.

Consider the case of multiplying each element in a 1-D sequence with the corresponding element in another sequence of the same length:

````
   a = [1, 3, 5]
   b = [2, 4, 6]
   c = []
   for i in range(len(a)):
      c.append(a[i] * b[i])   # we append (unlike C arrays that use a fast index lookup)

   print(c)

````

This produces the correct answer, but if a and b each contain millions of numbers, we will pay the price for the inefficiencies of looping in Python. We could accomplish the same task much more quickly in C by writing:

```
   int a[] = {1, 2, 3, 4, 5};
   int b[] = {10, 20, 30, 40, 50};
   int n = sizeof(a) / sizeof(a[0]);
   int result[n];

   for(int i=0; i<n; i++) {
      result[i] = a[i] * b[i];
   }
```

In NumPy, with `a` and `b` as ndarrays,
```
   c = a * b
```
does what the earlier examples do, at near-C speeds, but with the code simplicity we expect from something based on Python.
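A minimal runnable version of the NumPy one-liner, assuming the same values as the Python list example. Note that `a` and `b` must be ndarrays: with plain lists, `a * b` raises a `TypeError`.

```python
import numpy as np

a = np.array([1, 3, 5])
b = np.array([2, 4, 6])

# Element-wise product, computed in compiled code without a Python loop.
c = a * b
print(c)    # [ 2 12 30]
```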

In [33]:
a = [1, 3, 5]
b = [2, 4, 6]
c = []
for i in range(len(a)):
    c.append(a[i] * b[i])   # we append (unlike C arrays, where we can assign using a fast insertion using an index)

print(c)

[2, 12, 30]


NumPy-based algorithms are generally 10 to 100 times faster (or more) than their pure Python counterparts and use significantly less memory.
To give you an idea of the performance difference, consider a NumPy array of one million integers, and the equivalent Python list, and multiply each sequence by 2:

In [34]:
import numpy as np
import time

np_arr = np.arange(1_000_000)
my_list = list(range(1_000_000))

# Measure the starting time
start_cpu = time.process_time()
start_real = time.time()

for _ in range(10):
    my_arr2 = np_arr * 2

end_cpu = time.process_time()
end_real = time.time()

elapsed_cpu = end_cpu - start_cpu
elapsed_real_np = end_real - start_real

# print("Numpy array time: ")
# print(f"Elapsed CPU time: {elapsed_cpu} seconds")
# print(f"Elapsed real time: {elapsed_real_np} seconds")

# Measure the starting time
start_cpu = time.process_time()
start_real = time.time()

for _ in range(10):
    my_list2 = [x * 2 for x in my_list]

end_cpu = time.process_time()
end_real = time.time()

elapsed_cpu = end_cpu - start_cpu
elapsed_real = end_real - start_real

# print("Python list time: ")
# print(f"Elapsed CPU time: {elapsed_cpu} seconds")
# print(f"Elapsed real time: {elapsed_real} seconds")

print(f"{round(elapsed_real / elapsed_real_np)}", "times faster.")


19 times faster.
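The measurement above can also be sketched with the standard library's `timeit` module, which handles the repetition and timer choice. The array size and repetition count below are arbitrary choices for this sketch, and the ratio you see will vary by machine and NumPy version.

```python
import timeit

import numpy as np

n = 100_000
np_arr = np.arange(n)
my_list = list(range(n))

# timeit runs each callable `number` times and returns the total seconds.
t_numpy = timeit.timeit(lambda: np_arr * 2, number=20)
t_list = timeit.timeit(lambda: [x * 2 for x in my_list], number=20)

print(f"NumPy: {t_numpy:.4f} s, list: {t_list:.4f} s")
print(f"Ratio (list / NumPy): {t_list / t_numpy:.1f}")

# Both approaches compute the same values.
same_values = (np_arr * 2).tolist() == [x * 2 for x in my_list]
print(same_values)
```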


## Creating ndarrays
1. `np.array()`: 
<br>The easiest way to create an array is to use the `np.array()` function. It accepts any sequence-like object (including other arrays), such as a regular Python list or tuple, and produces a new NumPy array containing the passed data. <br><br>A frequent error is calling `array` with multiple arguments rather than providing a single sequence as the argument:
```
    a = np.array(1, 2, 3, 4)    # TypeError: array() takes from 1 to 2 positional arguments
```
2. `np.zeros()`, `np.ones()`, and `np.empty()`: 
<br>Often, the elements of an array are originally unknown, but its size is known. Hence, NumPy offers several functions to create arrays with initial placeholder content. These minimize the need to grow arrays, an expensive operation.<br><br> The function `zeros()` creates an array full of zeros, the function `ones()` creates an array full of ones, and the function `empty()` creates an uninitialized array whose initial content is arbitrary and depends on the state of the memory.

In [67]:
numbers = [1, 2, 3, 4, 5]
np_numbers = np.array(numbers)
# a = np.array(1, 2, 3, 4)    # TypeError: array() takes from 1 to 2 positional arguments

print(f"{numbers=}")        # [1, 2, 3, 4, 5]
print(f"{np_numbers=}")     # array([1, 2, 3, 4, 5])

# 2. empty(), zeros(), and ones() create arrays of a given length or shape.
scores_empty = np.empty(shape=(2, 3))
"""
array([[55.92639956, 66.10605961, 74.1785897 ],
       [59.17517896, 75.96718052, 68.6305342 ]])
"""

scores_0 = np.zeros(shape=10)       # given length as 1D shape - array([0., 0., 0., 0., 0., 0., 0., 0., 0., 0.])


scores_1 = np.ones(shape=(2, 3, 4), dtype=np.int8)   # given shape -
'''
array([[[1, 1, 1, 1],
        [1, 1, 1, 1],
        [1, 1, 1, 1]],

       [[1, 1, 1, 1],
        [1, 1, 1, 1],
        [1, 1, 1, 1]]], dtype=int8)
'''


print(f"{scores_empty=}")
print(f"{scores_0=}")
print(f"{scores_1=}")




numbers=[1, 2, 3, 4, 5]
np_numbers=array([1, 2, 3, 4, 5])
scores_empty=array([[55.92639956, 66.10605961, 74.1785897 ],
       [59.17517896, 75.96718052, 68.6305342 ]])
scores_0=array([0., 0., 0., 0., 0., 0., 0., 0., 0., 0.])
scores_1=array([[[1, 1, 1, 1],
        [1, 1, 1, 1],
        [1, 1, 1, 1]],

       [[1, 1, 1, 1],
        [1, 1, 1, 1],
        [1, 1, 1, 1]]], dtype=int8)



3. `np.full()`: 
<br>Produces an array of the given shape and dtype with all values set to the indicated "fill value".

In [68]:
scores3 = np.full(shape=(2, 4), fill_value=9)
print(f"{scores3=}")
"""
array([[9, 9, 9, 9],
       [9, 9, 9, 9]])
"""




scores3=array([[9, 9, 9, 9],
       [9, 9, 9, 9]])



4. `np.arange()`: 
<br>An array-valued version of the built-in Python `range` function:


In [69]:
# 4. arange() is an array-valued version of the built-in Python range function:
g7 = np.arange(7)   # array([0, 1, 2, 3, 4, 5, 6])
print(f"{g7=}")


g7=array([0, 1, 2, 3, 4, 5, 6])
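Like `range`, `np.arange()` also accepts start, stop, and step arguments. A short sketch, with values chosen only for illustration:

```python
import numpy as np

# start (inclusive), stop (exclusive), step
evens = np.arange(2, 10, 2)
print(evens)                    # [2 4 6 8]

# Same values as the built-in range, but stored in an ndarray.
print(list(range(2, 10, 2)))    # [2, 4, 6, 8]
```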



5. Returning a sample from the "normal" distribution.
- `np.random.randn()`
<br>Return a sample from the "standard normal" distribution. (mean 0 and standard deviation 1)
- `np.random.normal()`
<br>Generate random numbers from any normal distribution with a specified mean and standard deviation.

In [39]:
# 5. Generate 5 random numbers from a standard normal distribution (mean 0 and standard deviation 1)
sample_stdnormal = np.random.randn(5)  # array([-0.20175391, -0.87452102,  1.21002324,  0.45234304, -1.2349739])
print("Random numbers generated using np.random.randn:", sample_stdnormal)

mu, sigma = 70, 10  # mean and standard deviation
sample_normal = np.random.normal(loc=mu, scale=sigma, size=(2, 3))
print(f"Random numbers generated using np.random.normal, center:{mu}, spread:{sigma}:", sample_normal)
"""
array([ [86.11293119 64.7618898  72.96080438]
        [75.90800491 65.75190401 69.85628739] ])
"""


Random numbers generated using np.random.randn: [ 0.26776564  0.52120951  0.99939551  0.12674425 -0.36850836]
Random numbers generated using np.random.normal, center:70, spread:10: [[55.92639956 66.10605961 74.1785897 ]
 [59.17517896 75.96718052 68.6305342 ]]
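The relationship between the two functions can be checked empirically. The sketch below assumes a fixed seed and a sample size of 100,000 (both arbitrary choices) and compares `np.random.normal()` against standard normal samples that are scaled by `sigma` and shifted by `mu`.

```python
import numpy as np

np.random.seed(0)    # fixed seed so the run is repeatable (an assumption of this sketch)

mu, sigma = 70, 10
n = 100_000

# Drawing directly from a normal distribution with mean mu and std sigma.
direct = np.random.normal(loc=mu, scale=sigma, size=n)

# Equivalent construction: scale and shift standard normal samples.
shifted = mu + sigma * np.random.randn(n)

# Both sample means should be close to 70 and both standard deviations close to 10.
print(direct.mean(), direct.std())
print(shifted.mean(), shifted.std())
```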



### The Basics
NumPy’s main object is the homogeneous multidimensional array. 
<br><br>It is a table of elements (usually numbers), all of the same type, indexed by a tuple of non-negative integers. In NumPy, dimensions are called axes.

#### Important attributes of an ndarray
```
ndarray.dtype
```
an object describing the type of the elements in the array. You can create or specify dtypes using standard Python types. Additionally, NumPy provides types of its own; `numpy.int32`, `numpy.int16`, and `numpy.float64` are some examples.

```
ndarray.ndim
```
the number of axes (dimensions) of the array.

```
ndarray.shape
```
the dimensions of the array. This is a tuple of integers indicating the size of the array in each dimension. 
<br>For a matrix with r rows and c columns, shape will be (r, c). The length of the shape tuple is therefore the number of axes, ndim.

```
ndarray.size
```
the total number of elements of the array. This is equal to the **product of the elements of shape**.


```
ndarray.itemsize
```
the size in bytes of each element of the array. 
<br>For example, an array of elements of type `float64` has itemsize 8 (= 64/8).
<br>It is equivalent to `ndarray.dtype.itemsize`.
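The relationships between these attributes can be verified directly. This is a small sketch; `nbytes` is an additional ndarray attribute not described above, included here only to show how `size` and `itemsize` combine.

```python
import numpy as np

a = np.zeros((3, 5), dtype=np.float64)

print(a.ndim == len(a.shape))               # True
print(a.size == 3 * 5)                      # True: product of the shape entries
print(a.itemsize == a.dtype.itemsize)       # True
print(a.nbytes == a.size * a.itemsize)      # True: 15 elements * 8 bytes = 120
```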

In [54]:
import numpy as np

a = np.arange(15).reshape(3, 5)

print(f"{a=}")
print(f"{type(a)=}")    # <class 'numpy.ndarray'>
print(f"{a.shape=}")    # (3, 5)
print(f"{a.ndim=}")     # 2
print(f"{a.dtype=}")    # dtype('int64')
print(f"{a.itemsize=}") # 8
print(f"{a.size=}")     # 15    (number of elements in the array)


a=array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14]])
type(a)=<class 'numpy.ndarray'>
a.shape=(3, 5)
a.ndim=2
a.dtype=dtype('int64')
a.itemsize=8
a.size=15


## Arithmetic with NumPy Arrays
Arrays are important because they enable you to express batch operations on data without writing any for loops.
NumPy users call this "vectorization". 
<br><br>Any arithmetic operation between equal-size arrays is applied element-wise.
Arithmetic operations with scalars propagate the scalar argument to each element in the array:

In [78]:
a = np.ones((2,3))
a *= 5
print(a, a.dtype)  # float64
"""
[[5. 5. 5.]
 [5. 5. 5.]] float64
"""

b = np.arange(6).reshape((2, 3))
print(b, b.dtype)  # int64
"""
[[0 1 2]
 [3 4 5]] int64
"""

result = a + b # Note implicit type conversion 
print(result, result.dtype)

"""
[[ 5.  6.  7.]
 [ 8.  9. 10.]] float64
"""


[[5. 5. 5.]
 [5. 5. 5.]] float64
[[0 1 2]
 [3 4 5]] int64
[[ 5.  6.  7.]
 [ 8.  9. 10.]] float64
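The implicit type conversion in `a + b` can be inspected without performing the operation. The sketch below uses `np.result_type` and `astype`, neither of which appears in the cell above, so treat it as one possible way to look at the promotion rather than part of the original example.

```python
import numpy as np

a = np.full((2, 3), 5.0)             # float64
b = np.arange(6).reshape((2, 3))     # default integer dtype

# np.result_type reports the dtype NumPy would use when combining the operands.
print(np.result_type(a, b))          # float64

# An explicit conversion back to integers truncates toward zero.
as_int = (a + b).astype(np.int64)
print(as_int.dtype)                  # int64
```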



#### Example: Calculate BMI of a group of people
Assume we have heights in cm and weights in kg. Calculate their BMIs.

The body mass index (BMI) is a measure that uses your height and weight to work out if your weight is healthy. 
The BMI calculation divides an adult's weight in kilograms by their height in metres squared. For example, a BMI of 25 means 25 kg/m².

In [44]:
heights_cm = [180, 215, 210, 210, 188]
weights_kg = [100, 90, 90, 90, 78]

# Create an ndarray from python list and divide ndarray by a scalar
np_heights_m = np.array(heights_cm) / 100
np_weights = np.array(weights_kg)

# Calculate bmi
# mathematical operations on whole blocks of data using similar syntax to the equivalent operations between scalar elements:
bmi = np_weights / (np_heights_m ** 2)
print("BMIs:", bmi)


BMIs: [30.86419753 19.46998378 20.40816327 20.40816327 22.06880942]


### Filtering Arrays
In NumPy, you filter an array using a boolean index list. 
<br>A boolean index list is a list of booleans corresponding to indexes in the array.

```
    filtered_std = students[[True, False, False, True]]
```

#### Creating the Filter Array
With scalar values, a comparison returns a single value (`True` or `False`).
<br>With ndarrays, it returns an array of boolean values, one per element.

```
    grades = np.array([70, 56, 43, 95])
    failed = grades[grades < 60]
```


In [149]:
# Example 1 - Manual boolean index list
students = np.array(["Harry", "Ron", "Hermione", "Draco"])
bil_filter = [True, False, False, True]
filtered_std = students[bil_filter]    # ['Harry' 'Draco']
print(filtered_std)

# Example 2 - Create filter from the array itself:
grades = np.array([70, 56, 43, 95])

# With scalar values, a comparison returns a single value (True or False).
# With ndarrays, it returns an array of boolean values.
bil_filter = grades < 60
failed = grades[bil_filter]
# or simply we write:
failed = grades[grades < 60]
print(f"{failed= }")    # array([56, 43])

# Create a boolean numpy array: the element should be True if the corresponding BMI is below 21.
# bmi < 21    # [False  True  True  True False]
print(bmi[bmi < 21])    


['Harry' 'Draco']
failed= array([56, 43])
[19.46998378 20.40816327 20.40816327]
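To make the intermediate boolean array visible, and to show how several conditions can be combined, here is a short sketch reusing the `grades` values from above. The 50 to 59 "borderline" range is a made-up threshold for illustration.

```python
import numpy as np

grades = np.array([70, 56, 43, 95])

# The comparison itself yields a boolean array, one entry per element.
mask = grades < 60
print(mask)                     # [False  True  True False]

# Combine conditions with & (and), | (or), ~ (not); parentheses are required.
borderline = grades[(grades >= 50) & (grades < 60)]
print(borderline)               # [56]

# Count matches by summing the mask (True counts as 1).
print(mask.sum())               # 2
```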


## Random Sampling (numpy.random)
- Generate Random Number: `np.random.rand()`, `np.random.randint()`
- Generate Random Array: add size arguments to above functions.

In [143]:
import numpy as np

# Generate a random number from 0 to 1:
print(f"{np.random.rand()=}")

# Generate a random integer from 0 to 10:
print(f"{np.random.randint(low=0, high=11)=}")   # from low (inclusive) to high (exclusive).


# Generate a 2-D array with 3 rows, each row containing 5 random floats in [0, 1):
print(f"{np.random.rand(3, 5)= }")  # (d0, d1, ..., dn)
"""
array([[0.79770957, 0.25666303, 0.75178902, 0.53975694, 0.00344041],
       [0.31638409, 0.44651645, 0.81296002, 0.02374936, 0.19201632],
       [0.91262206, 0.15413818, 0.79251226, 0.16634575, 0.13350289]])
"""

# Generate an array containing 5 random integers from 0 to 99 (100 is excluded):
print(f"{np.random.randint(100, size=5)= }")    # array([30, 80, 65, 15, 22])

# Generate a 2-D array with 3 rows, each row containing 5 random integers from 0 to 99:
print(f"{np.random.randint(100, size=(3, 5))= }")
""" 
array([[73, 75, 15, 87,  0],
       [27, 99, 62,  2, 68],
       [20,  7, 57, 47, 24]])
"""


np.random.rand()=0.10896681558377519
np.random.randint(low=0, high=11)=5
np.random.rand(3, 5)= array([[0.21923359, 0.98309702, 0.21138138, 0.46382997, 0.21292258],
       [0.64907256, 0.99062101, 0.97918949, 0.45042682, 0.6898179 ],
       [0.59600844, 0.79305191, 0.42727444, 0.0335427 , 0.04755666]])
np.random.randint(100, size=5)= array([95, 96, 18, 91, 36])
np.random.randint(100, size=(3, 5))= array([[37, 73, 74, 28, 31],
       [30,  4, 73, 64, 10],
       [28, 52, 79, 20, 86]])
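The functions above belong to NumPy's legacy random interface. As a hedged sketch of the same draws with the newer `Generator` interface (`np.random.default_rng`), which is not used in the cell above, one might write the following. The seed value 42 is an arbitrary choice.

```python
import numpy as np

# default_rng returns a Generator; the seed makes the draws reproducible.
rng = np.random.default_rng(42)

one_int = rng.integers(low=0, high=11)        # integer in [0, 11), i.e. 0 to 10
floats = rng.random((3, 5))                   # floats in [0, 1), shape (3, 5)
ints = rng.integers(0, 100, size=(3, 5))      # integers in [0, 100)

print(one_int, floats.shape, ints.shape)

# Re-creating the generator with the same seed reproduces the same draws.
rng_again = np.random.default_rng(42)
print(rng_again.integers(low=0, high=11) == one_int)   # True
```

Passing the generator object around explicitly, instead of relying on global state, tends to make results easier to reproduce.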

