### Numerical Data

yield = w1 * temp + w2 * rainfall + w3 * humidity

In [3]:
w1,w2,w3 = 0.2,0.3,0.5

Given some climate data for a region, we can now predict the yield of apples. Here's some sample data:

<img src="https://i.imgur.com/TXPBiqv.png" style="width:360px;">

To begin, we can define some variables to record climate data for a region.

In [6]:
kanto_temp = 73
kanto_rain = 67
kanto_humidity = 43

In [8]:
kanto_yield = kanto_temp * w1 + kanto_rain * w2 + kanto_humidity * w3

kanto_yield

56.2

In [9]:
print("The expected yield of apples in Kanto region is {} tons per hectare.".format(kanto_yield))

The expected yield of apples in Kanto region is 56.2 tons per hectare.


In [15]:
kanto = [73,67,43] 
kanto

[73, 67, 43]

In [14]:
weights = [w1,w2,w3]
weights

[0.2, 0.3, 0.5]

In [27]:
def crop_yield(region, weights):
    result = 0
    for x, y in zip(region, weights):
        result += x * y
    return result

In [29]:
crop_yield(kanto, weights)

56.2

### Numpy Arrays

In [36]:
import numpy as np

In [38]:
#Numpy Array
kanto = np.array([73,67,43])

In [39]:
kanto

array([73, 67, 43])

In [43]:
weights = np.array([0.2,0.3,0.5])

In [45]:
weights

array([0.2, 0.3, 0.5])

In [46]:
type(weights)

numpy.ndarray

In [47]:
weights[0]

np.float64(0.2)

In [51]:
kanto[1]

np.int64(67)

### Numpy Operations

In [52]:
np.dot(kanto,weights)

np.float64(56.2)

In [55]:
(kanto * weights).sum()

np.float64(56.2)

`np.dot()` is a NumPy function that computes the dot product of two arrays. Its behavior depends on the dimensions of the inputs: for 1-D arrays it returns the inner product of two vectors, and for 2-D arrays it performs matrix multiplication.

### 1-D Arrays (Vectors)

When `np.dot()` is applied to two 1-D arrays, it computes the dot product of the two vectors.

#### Example:

```python
import numpy as np

# Define two 1-D arrays (vectors)
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

# Compute the dot product
dot_product = np.dot(a, b)
print(dot_product)  # Output: 32
```

### Explanation:

For vectors \(\mathbf{a} = [1, 2, 3]\) and \(\mathbf{b} = [4, 5, 6]\):

\[ \mathbf{a} \cdot \mathbf{b} = (1 \cdot 4) + (2 \cdot 5) + (3 \cdot 6) = 4 + 10 + 18 = 32 \]

### 2-D Arrays (Matrices)

When `np.dot()` is applied to 2-D arrays (matrices), it performs matrix multiplication.

#### Example:

```python
import numpy as np

# Define two 2-D arrays (matrices)
A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])

# Compute the matrix product
matrix_product = np.dot(A, B)
print(matrix_product)
# Output:
# [[19 22]
#  [43 50]]
```

### Explanation:

For matrices \(A\) and \(B\):

\[ A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}, \quad B = \begin{bmatrix} 5 & 6 \\ 7 & 8 \end{bmatrix} \]

The matrix product \(A \cdot B\) is computed as:

\[ A \cdot B = \begin{bmatrix} (1 \cdot 5 + 2 \cdot 7) & (1 \cdot 6 + 2 \cdot 8) \\ (3 \cdot 5 + 4 \cdot 7) & (3 \cdot 6 + 4 \cdot 8) \end{bmatrix} = \begin{bmatrix} 19 & 22 \\ 43 & 50 \end{bmatrix} \]

### Higher-Dimensional Arrays

When `np.dot()` is applied to arrays with more than two dimensions, it performs a sum-product over the last axis of the first array and the second-to-last axis of the second array.

#### Example:

```python
import numpy as np

# Define two higher-dimensional arrays
C = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])
D = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])

# Compute the dot product
dot_product = np.dot(C, D)
print(dot_product)
```
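The axis rule for higher-dimensional inputs can be checked directly. The sketch below rebuilds the same values as `C` and `D` with `np.arange` and compares `np.dot` against an explicit `np.einsum` contraction; the subscript string `'ijk,lkm->ijlm'` is one way to spell out the rule and is introduced here only for illustration.

```python
import numpy as np

# Same values as C and D in the example above
C = np.arange(1, 9).reshape(2, 2, 2)
D = np.arange(1, 9).reshape(2, 2, 2)

result = np.dot(C, D)

# Result shape is C.shape[:-1] + D.shape[:-2] + D.shape[-1:]
print(result.shape)  # (2, 2, 2, 2)

# The same sum-product written explicitly: C's last axis (k) is
# contracted with D's second-to-last axis (k)
explicit = np.einsum('ijk,lkm->ijlm', C, D)
print(np.array_equal(result, explicit))  # True

# One entry by hand: result[0, 0, 1, 0] = C[0, 0, :] . D[1, :, 0]
print(result[0, 0, 1, 0])  # 1*5 + 2*7 = 19
```

Note that the result has four dimensions, not three, which is often not what is wanted for stacks of matrices.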

### Summary

- **1-D Arrays**: Computes the dot product of two vectors.
- **2-D Arrays**: Performs matrix multiplication.
- **Higher-Dimensional Arrays**: Performs a sum-product over the last axis of the first array and the second-to-last axis of the second array.

### Common Use Cases

- **Vectors**: Calculating the dot product for projections, similarity measures, etc.
- **Matrices**: Performing linear transformations, solving systems of linear equations, etc.

The `np.dot()` function is versatile and efficient because the computation runs in NumPy's optimized compiled code rather than in a Python loop.

In [66]:
arr1 = np.array(0)
arr2 = np.array([1,2])
arr3 = np.array([[1,2],[11,22]])

In [67]:
arr1.ndim, arr2.ndim, arr3.ndim

(0, 1, 2)

In [68]:
arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])

In [69]:
np.dot(arr1,arr2)

np.int64(32)

### Multi-dimensional Numpy arrays 

In [70]:
climate_data = np.array([[73, 67, 43],
                         [91, 88, 64],
                         [87, 134, 58],
                         [102, 43, 37],
                         [69, 96, 70]])

In [71]:
climate_data.ndim

2

In [73]:
#2d Array

climate_data.shape

(5, 3)

In [74]:
weights

array([0.2, 0.3, 0.5])

In [83]:
weights.shape
weights.ndim

1

In [78]:
#3d Array
arr3 = np.array([
    [[11,12,13],
     [14,15,16]],
    [[17,18,19],
     [44,55,66.2]]])

In [79]:
arr3.shape

(2, 2, 3)

In [81]:
weights.dtype,climate_data.dtype, arr3.dtype

(dtype('float64'), dtype('int64'), dtype('float64'))

We can now compute the predicted yields of apples in all the regions, using a single matrix multiplication between `climate_data` (a 5x3 matrix) and `weights` (a vector of length 3). Here's what it looks like visually:

<img src="https://i.imgur.com/LJ2WKSI.png" width="240">

You can learn about matrices and matrix multiplication by watching the first 3-4 videos of this playlist: https://www.youtube.com/watch?v=xyAuNHPsq-g&list=PLFD0EB975BA0CC1E0&index=1

We can use the `np.matmul` function or the `@` operator to perform matrix multiplication.

In [91]:
climate_data

array([[ 73,  67,  43],
       [ 91,  88,  64],
       [ 87, 134,  58],
       [102,  43,  37],
       [ 69,  96,  70]])

In [119]:
np.dot(climate_data,weights)

array([56.2, 76.6, 86.6, 51.8, 77.6])

In [84]:
np.matmul(climate_data,weights)

array([56.2, 76.6, 86.6, 51.8, 77.6])

In [120]:
climate_data @ weights

array([56.2, 76.6, 86.6, 51.8, 77.6])
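All three calls compute one dot product per row of `climate_data`. As a minimal sketch, the comparison below spells that out with an explicit loop over the rows; the names `row_by_row` and `all_at_once` are illustrative only.

```python
import numpy as np

climate_data = np.array([[73, 67, 43],
                         [91, 88, 64],
                         [87, 134, 58],
                         [102, 43, 37],
                         [69, 96, 70]])
weights = np.array([0.2, 0.3, 0.5])

# One dot product per region (row), collected into a 1-D array
row_by_row = np.array([np.dot(row, weights) for row in climate_data])

# A single matrix-vector product over all regions at once
all_at_once = climate_data @ weights

print(all_at_once.shape)                     # (5,)
print(np.allclose(row_by_row, all_at_once))  # True
```

The comparison uses `np.allclose` rather than `==` because floating-point results can differ in the last few bits depending on the order of the additions.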

### Working with CSV Files

In [121]:
import urllib.request

urllib.request.urlretrieve(
    'https://gist.github.com/BirajCoder/a4ffcb76fd6fb221d76ac2ee2b8584e9/raw/4054f90adfd361b7aa4255e99c2e874664094cea/climate.csv', 
    'climate.txt'
)

('climate.txt', <http.client.HTTPMessage at 0x2ef619a7f50>)

In [127]:
climate_data = np.genfromtxt('climate.txt',  delimiter=",", skip_header=True)

In [128]:
climate_data

array([[25., 76., 99.],
       [39., 65., 70.],
       [59., 45., 77.],
       ...,
       [99., 62., 58.],
       [70., 71., 91.],
       [92., 39., 76.]])

In [129]:
climate_data.shape

(10000, 3)
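To see what `delimiter` and `skip_header` do without downloading anything, here is a small sketch that reads an in-memory CSV. The header names and the three-row sample are assumptions made for illustration; only the first three data rows mirror the output shown above.

```python
import io
import numpy as np

# Hypothetical three-row sample; the real file has 10,000 rows
sample_csv = io.StringIO(
    "temperature,rainfall,humidity\n"
    "25.0,76.0,99.0\n"
    "39.0,65.0,70.0\n"
    "59.0,45.0,77.0\n"
)

# skip_header is the number of leading lines to skip,
# so skip_header=True above behaves like skip_header=1
sample = np.genfromtxt(sample_csv, delimiter=",", skip_header=1)

print(sample.shape)  # (3, 3)
print(sample.dtype)  # float64
```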

In [133]:
weights = np.array([0.2, 0.3, 0.5])

In [136]:
yields = climate_data @ weights

In [137]:
yields.shape

(10000,)

In [151]:
yields.reshape(10000, 1)

array([[77.3],
       [62.3],
       [63.8],
       ...,
       [67.4],
       [80.8],
       [68.1]])

In [150]:
np.concatenate((climate_data, yields.reshape(10000, 1)), axis=1)

array([[25. , 76. , 99. , 77.3],
       [39. , 65. , 70. , 62.3],
       [59. , 45. , 77. , 63.8],
       ...,
       [99. , 62. , 58. , 67.4],
       [70. , 71. , 91. , 80.8],
       [92. , 39. , 76. , 68.1]])
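Hard-coding the row count in `reshape(10000, 1)` ties the code to one dataset size. A possible alternative, sketched here on a three-row stand-in for the real data, is to pass `-1` so NumPy infers the count, or to use `np.column_stack`, which accepts the 1-D array directly.

```python
import numpy as np

# Small stand-in for the 10,000-row dataset
climate_data = np.array([[25., 76., 99.],
                         [39., 65., 70.],
                         [59., 45., 77.]])
weights = np.array([0.2, 0.3, 0.5])
yields = climate_data @ weights

# -1 lets NumPy infer the number of rows from the array's size
as_column = yields.reshape(-1, 1)
combined = np.concatenate((climate_data, as_column), axis=1)

# np.column_stack treats a 1-D array as a column, so no reshape is needed
stacked = np.column_stack((climate_data, yields))

print(combined.shape)                     # (3, 4)
print(np.array_equal(combined, stacked))  # True
```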

In [162]:
np.concatenate((np.arange(1,10).reshape(3,3),np.arange(11,20).reshape(3,3)),axis=1)

array([[ 1,  2,  3, 11, 12, 13],
       [ 4,  5,  6, 14, 15, 16],
       [ 7,  8,  9, 17, 18, 19]])

In [158]:
np.concatenate((np.arange(1,10).reshape(3,3),np.arange(11,20).reshape(3,3)),axis=0)

array([[ 1,  2,  3],
       [ 4,  5,  6],
       [ 7,  8,  9],
       [11, 12, 13],
       [14, 15, 16],
       [17, 18, 19]])

### Numpy Arithmetic Operations

In [163]:
arr1 = np.arange(1,13).reshape(3,4)
arr1

array([[ 1,  2,  3,  4],
       [ 5,  6,  7,  8],
       [ 9, 10, 11, 12]])

In [169]:
arr2 = np.arange(11,23).reshape(3,4)
arr2

array([[11, 12, 13, 14],
       [15, 16, 17, 18],
       [19, 20, 21, 22]])

In [168]:
# Adding a Scalar
arr1 + 2

array([[ 3,  4,  5,  6],
       [ 7,  8,  9, 10],
       [11, 12, 13, 14]])

In [171]:
# element wise Subtract
 
arr2-arr1

array([[10, 10, 10, 10],
       [10, 10, 10, 10],
       [10, 10, 10, 10]])

In [173]:
#Division
arr2 / 2

array([[ 5.5,  6. ,  6.5,  7. ],
       [ 7.5,  8. ,  8.5,  9. ],
       [ 9.5, 10. , 10.5, 11. ]])
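The other arithmetic operators work element by element in the same way. The following sketch recreates the two arrays from the cells above and shows a few more operators; it is meant as a quick reference, not a complete list.

```python
import numpy as np

arr1 = np.arange(1, 13).reshape(3, 4)
arr2 = np.arange(11, 23).reshape(3, 4)

print((arr2 / 2).dtype)  # float64: true division always yields floats
print(arr2 // 2)         # floor division keeps the integer dtype
print(arr2 % 4)          # element-wise remainder
print(arr1 * arr2)       # element-wise product, not a matrix product
print(arr1 ** 2)         # element-wise power
```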

### Array Broadcasting

Numpy arrays also support *broadcasting*, allowing arithmetic operations between two arrays with different numbers of dimensions but compatible shapes. Let's look at an example to see how it works.

In [174]:
arr2 = np.array([[1, 2, 3, 4], 
                 [5, 6, 7, 8], 
                 [9, 1, 2, 3]])

In [175]:
arr2.shape

(3, 4)

In [176]:
arr4 = np.array([4,5,6,7])

In [178]:
arr4.shape, arr4.ndim

((4,), 1)

In [179]:
arr1 + arr4

array([[ 5,  7,  9, 11],
       [ 9, 11, 13, 15],
       [13, 15, 17, 19]])

In [180]:
arr1 * arr4

array([[ 4, 10, 18, 28],
       [20, 30, 42, 56],
       [36, 50, 66, 84]])

When the expression `arr1 + arr4` is evaluated, `arr4` (which has the shape `(4,)`) is replicated three times to match the shape `(3, 4)` of `arr1`. Numpy performs the replication without actually creating three copies of the smaller array, which improves performance and uses less memory.

<img src="https://jakevdp.github.io/PythonDataScienceHandbook/figures/02.05-broadcasting.png" width="360">
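The "no copies" claim can be inspected with `np.broadcast_to`, which returns the expanded view that broadcasting would use. This is a minimal sketch; the names `row` and `expanded` are illustrative.

```python
import numpy as np

row = np.array([4, 5, 6, 7])

# Read-only view of `row` with the shape it takes during broadcasting
expanded = np.broadcast_to(row, (3, 4))

print(expanded.shape)                   # (3, 4)
print(expanded.strides[0])              # 0: every "row" reuses the same bytes
print(np.shares_memory(expanded, row))  # True
```

A stride of 0 along the first axis means that moving to the next row does not advance in memory, so the three rows are the same four values read repeatedly.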

Broadcasting only works if one of the arrays can be replicated to match the other array's shape.

In [181]:
arr5 = np.array([4,6])

In [182]:
arr1 * arr5

ValueError: operands could not be broadcast together with shapes (3,4) (2,) 
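The rule behind this error is that shapes are compared from the last axis backwards, and each pair of sizes must either match or contain a 1. The sketch below checks a few shape pairs with `np.broadcast_shapes` (available in NumPy 1.20 and later) and shows one way to make a per-row vector compatible; `matrix` and `per_row` are illustrative names.

```python
import numpy as np

matrix = np.arange(1, 13).reshape(3, 4)

# Compatible: trailing sizes match, or one of them is 1
print(np.broadcast_shapes((3, 4), (4,)))    # (3, 4)
print(np.broadcast_shapes((3, 4), (3, 1)))  # (3, 4)

# Incompatible: trailing sizes 4 and 2 differ and neither is 1
try:
    np.broadcast_shapes((3, 4), (2,))
except ValueError as err:
    print("incompatible:", err)

# A length-3 vector does not line up with the trailing axis of size 4,
# but it broadcasts once it is reshaped into a (3, 1) column
per_row = np.array([10, 20, 30])
print(matrix + per_row.reshape(3, 1))
```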

### Array Comparison

In [183]:
arr1 = np.array([[1,2,3],[4,5,6]])
arr2 = np.array([[1,2,3],[7,8,9]])

In [184]:
arr1 == arr2

array([[ True,  True,  True],
       [False, False, False]])

In [185]:
arr1 != arr2

array([[False, False, False],
       [ True,  True,  True]])

In [188]:
arr1 >= arr2

array([[ True,  True,  True],
       [False, False, False]])

In [189]:
arr1 <= arr2

array([[ True,  True,  True],
       [ True,  True,  True]])

In [190]:
(arr1 == arr2).sum()

np.int64(3)
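Summing a boolean array counts its `True` values, because `True` is treated as 1 and `False` as 0. A short sketch of related checks on the same two arrays follows; the name `matches` is illustrative.

```python
import numpy as np

arr1 = np.array([[1, 2, 3], [4, 5, 6]])
arr2 = np.array([[1, 2, 3], [7, 8, 9]])

matches = arr1 == arr2

print(matches.sum())               # 3
print(matches.any())               # True: at least one element matches
print(matches.all())               # False: not every element matches
print(np.array_equal(arr1, arr2))  # False
print(matches.sum(axis=1))         # [3 0]: matches counted per row
```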

### Array Indexing & Slicing

In [191]:
arr3 = np.array([
    [[11, 12, 13, 14], 
     [13, 14, 15, 19]], 
    
    [[15, 16, 17, 21], 
     [63, 92, 36, 18]], 
    
    [[98, 32, 81, 23],      
     [17, 18, 19.5, 43]]])

In [196]:
arr3.shape

(3, 2, 4)

In [197]:
arr3.ndim

3

In [198]:
# Single element

arr3[1,1,2]

np.float64(36.0)

In [212]:
# Subarray using ranges
arr3[1:,:1,:2]

array([[[15., 16.]],

       [[98., 32.]]])

In [222]:
# Mixing indices and ranges
arr3[1:,1,3]

array([18., 43.])

In [232]:
# Using fewer indices
arr3[:2, 1]

array([[13., 14., 15., 19.],
       [63., 92., 36., 18.]])

The notation and its results can seem confusing at first, so take your time to experiment and become comfortable with it. Use the cells below to try out some examples of array indexing and slicing, with different combinations of indices and ranges. Here are some more examples demonstrated visually:

<img src="https://scipy-lectures.org/_images/numpy_indexing.png" width="360">
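One way to predict the result of an indexing expression is to track the shape: each integer index removes an axis, each slice keeps one, and axes that are not mentioned are kept whole. The sketch below uses a stand-in array of the same shape `(3, 2, 4)`, filled with `np.arange` for simplicity, so only the shapes are comparable with the cells above.

```python
import numpy as np

# Stand-in array with the same shape as arr3 above
arr3 = np.arange(24).reshape(3, 2, 4)

print(arr3[1, 1, 2].shape)     # (): three integers select a single element
print(arr3[1:, 1, 3].shape)    # (2,): two integers remove two axes
print(arr3[1:, :1, :2].shape)  # (2, 1, 2): slices keep every axis
print(arr3[:2, 1].shape)       # (2, 4): the omitted last axis is kept whole
print(arr3[:, 1:2, :].shape)   # (3, 1, 4): a length-1 slice still keeps its axis
```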

In [244]:
arr4 = np.vstack(
    (np.arange(6).reshape(1,6),
    np.arange(10,16).reshape(1,6),
    np.arange(20,26).reshape(1,6),
    np.arange(30,36).reshape(1,6),
    np.arange(40,46).reshape(1,6),
    np.arange(50,56).reshape(1,6)
    
    )
)

In [245]:
arr4

array([[ 0,  1,  2,  3,  4,  5],
       [10, 11, 12, 13, 14, 15],
       [20, 21, 22, 23, 24, 25],
       [30, 31, 32, 33, 34, 35],
       [40, 41, 42, 43, 44, 45],
       [50, 51, 52, 53, 54, 55]])

In [246]:
arr4[0,3:5]

array([3, 4])

In [257]:
arr4[:,2:3]

array([[ 2],
       [12],
       [22],
       [32],
       [42],
       [52]])

In [259]:
arr4[2:3,::2]

array([[20, 22, 24]])

In [267]:
arr4[4:,4:]

array([[44, 45],
       [54, 55]])

## Other ways of creating Numpy arrays

Numpy also provides some handy functions to create arrays of desired shapes with fixed or random values. Check out the [official documentation](https://numpy.org/doc/stable/reference/routines.array-creation.html) or use the `help` function to learn more.

In [275]:
# All zeros
np.zeros((3,2))

array([[0., 0.],
       [0., 0.],
       [0., 0.]])

In [276]:
# All ones
np.ones([2, 2, 3])

array([[[1., 1., 1.],
        [1., 1., 1.]],

       [[1., 1., 1.],
        [1., 1., 1.]]])

In [279]:
np.linspace(1,21,5)

array([ 1.,  6., 11., 16., 21.])
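`np.linspace` takes the number of points and includes both endpoints by default, whereas `np.arange` takes a step and excludes the stop value. The sketch below produces the same five values both ways; the variable names are illustrative.

```python
import numpy as np

# linspace: fixed number of points, both endpoints included by default
points = np.linspace(1, 21, 5)
print(points)  # [ 1.  6. 11. 16. 21.]

# arange: fixed step, stop value excluded
steps = np.arange(1, 22, 5)
print(steps)  # [ 1  6 11 16 21]

# retstep=True also returns the spacing that linspace chose
_, spacing = np.linspace(1, 21, 5, retstep=True)
print(spacing)  # 5.0
```

Note that `linspace` returns floats here while `arange` with integer arguments returns integers.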