## What is numpy?

NumPy is a Python library that provides large, multi-dimensional arrays and matrices, along with a collection of mathematical functions that operate on them. NumPy arrays are more efficient than Python lists for numerical computations and are designed to handle large datasets.

NumPy is a fundamental library that is widely used in data analysis, scientific computing, and machine learning applications.

## Why is numpy important?

NumPy offers several advantages over traditional Python lists:

1. Performance: For large numerical workloads, NumPy arrays are typically faster than Python lists. They store elements of a single type in a contiguous block of memory, and they support vectorized operations, which run the loop in compiled code and often make the code more readable.

2. Flexibility: NumPy arrays can have any number of dimensions, allowing for more complex data structures and operations. They also support a range of element types, such as integers, floating-point numbers, and complex numbers, although each array uses a single type for all of its elements.

3. Convenient mathematical functions: NumPy provides a collection of mathematical functions, such as `sin`, `cos`, `exp`, `log`, and `sqrt`, that can be applied to whole arrays at once. These functions are implemented in C, which can lead to significant performance improvements compared to calling Python's `math` functions on each element in a loop.

These advantages over traditional Python lists are why NumPy is widely used in data analysis, scientific computing, and machine learning applications.
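
As a rough illustration of the vectorization point above, the sketch below computes square roots once with a Python loop and once with a single NumPy call. The variable names are illustrative only, and any speed difference depends on array size and hardware.

```python
import math
import numpy as np

values = list(range(1, 6))

# Pure-Python approach: call math.sqrt once per element.
roots_list = [math.sqrt(v) for v in values]

# NumPy approach: one vectorized call over the whole array.
arr = np.array(values)
roots_arr = np.sqrt(arr)

print(roots_list)
print(roots_arr)

# Both approaches produce the same numbers.
print(np.allclose(roots_list, roots_arr))
```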

## How can I get started with numpy?

To get started with NumPy, you can follow these steps:

1. Import the NumPy library: Use the `import numpy as np` statement to import the NumPy library and give it a shorter alias, such as `np`.

2. Create NumPy arrays: Use the `np.array()` function to create arrays from Python lists. You can also create arrays using various other functions, such as `np.zeros()`, `np.ones()`, `np.full()`, `np.arange()`, and `np.linspace()`.

3. Perform mathematical operations: Use the NumPy array methods and functions to perform mathematical operations on arrays. Some common operations include addition, subtraction, multiplication, division, and exponentiation.

By following these steps, you can start using NumPy to work with large, multi-dimensional arrays and matrices efficiently.
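
The arithmetic operations listed in step 3 all work element by element. Here is a minimal sketch, assuming two small example arrays `a` and `b` that are not part of the original notebook:

```python
import numpy as np

a = np.array([1, 2, 3, 4])
b = np.full(4, 2)      # array of four 2s

added = a + b          # element-wise addition
subtracted = a - b     # element-wise subtraction
multiplied = a * b     # element-wise multiplication
divided = a / b        # true division, always yields floats
powered = a ** b       # element-wise exponentiation

print(added)
print(subtracted)
print(multiplied)
print(divided)
print(powered)
```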




In [1]:
import numpy as np

x = np.array([1, 2, 3, 4, 5])
print(x)
print(type(x))

[1 2 3 4 5]
<class 'numpy.ndarray'>


In [2]:
y = [1, 2, 3, 4, 5]
print(y)
print(type(y))

[1, 2, 3, 4, 5]
<class 'list'>


## Can numpy arrays store different data types?

Not really. A NumPy array has a single data type (`dtype`) shared by all of its elements, and there are important caveats when the input mixes types.

NumPy supports a variety of data types, including integers, floating-point numbers, and complex numbers. However, it's important to note that:

- If you pass values of mixed types, NumPy converts them all to one common type. For example, integers mixed with floats become floats, and numbers mixed with strings become strings. This can lead to unexpected behavior.
- If you need to keep different types of data, store each type in its own array. For example, you can create an array of integers and a separate array of floating-point numbers.

Here's an example of what happens when you create a NumPy array from values of mixed types:

In [3]:
d = np.array([1, 2, 3, 4, 5, 6.0, '7'])
print(d)

## In this example, the input mixes integers, a floating-point number, and a string, so NumPy converts every element of `d` to a string.

['1' '2' '3' '4' '5' '6.0' '7']


In [4]:
## By design, a NumPy array has one data type for the entire array, which in this case is an integer.
arr = np.array([1, 2, 3, 4, 5], dtype=int)
print(arr.dtype)


# If you mix types, NumPy upcasts everything to a common type. This is still homogeneous.
arr = np.array([1, 2.5, 3])
print(arr)         # [1.  2.5 3. ]
print(arr.dtype)   # float64


# NumPy does not convert strings to numbers automatically. Convert explicitly with `.astype(float)` before performing mathematical operations.

## By default, NumPy stores a mix of integers and floating-point numbers as floating-point numbers. However, if you want to store the values as integers, you can specify the `dtype` parameter (fractional parts are then discarded).

## Truly heterogeneous NumPy array -> You can force an array to hold arbitrary Python objects using the `object` dtype, but it's generally not recommended.
arr = np.array([1, "hello", 3.14, [1, 2, 3]], dtype=object)
print(arr)         # [1 'hello' 3.14 list([1, 2, 3])]
print(arr.dtype)   # object

## The downside is slower performance and increased memory usage.

int64
[1.  2.5 3. ]
float64
[1 'hello' 3.14 list([1, 2, 3])]
object
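
Because the array `d` from the earlier cell ends up holding strings, arithmetic on it requires an explicit conversion first. The following is a small sketch of that conversion; `d_num` is a name introduced here for illustration.

```python
import numpy as np

# Mixing numbers and a string makes NumPy store everything as strings.
d = np.array([1, 2, 3, 4, 5, 6.0, '7'])
print(d.dtype.kind)   # 'U' means a Unicode string dtype

# Convert explicitly before doing arithmetic.
d_num = d.astype(float)
print(d_num)
print(d_num.sum())
```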


In [5]:
# If the requirement is different types per column, use structured arrays

dt = np.dtype([
    ("id", np.int32),
    ("price", np.float64),
    ("symbol", "U10")
])

arr = np.array([
    (1, 1.2, "AAPL"),
    (2, 1.5, "GOOG"),
    (3, 0.8, "MSFT")
], dtype=dt)

print(arr)
print(arr.dtype)

[(1, 1.2, 'AAPL') (2, 1.5, 'GOOG') (3, 0.8, 'MSFT')]
[('id', '<i4'), ('price', '<f8'), ('symbol', '<U10')]
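
Once a structured array exists, each field can be read by name like a column. The sketch below rebuilds the same array and shows field access and filtering; the 1.0 price threshold is an arbitrary value chosen for illustration.

```python
import numpy as np

dt = np.dtype([("id", np.int32), ("price", np.float64), ("symbol", "U10")])
arr = np.array([(1, 1.2, "AAPL"), (2, 1.5, "GOOG"), (3, 0.8, "MSFT")], dtype=dt)

# Select a whole column by field name.
prices = arr["price"]
print(prices)

# A boolean mask built from one field filters entire records.
expensive = arr[arr["price"] > 1.0]
print(expensive["symbol"])
```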


### Performance of NumPy arrays compared to Python lists

In [6]:
## Performance of NumPy arrays compared to Python lists
%timeit np.arange(1, 9)**2
%timeit [i**2 for i in range(1, 9)]

# For an array this small, the list comprehension is actually faster, because NumPy's fixed per-call overhead dominates. NumPy's advantage in speed and memory appears with large arrays, where the loop runs in compiled C code.

560 ns ± 14.1 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)
266 ns ± 2.76 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)
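
The `%timeit` cell above uses only eight elements, a size at which fixed per-call overhead can dominate the measurement. As a hedged sketch, the script below repeats the comparison at an assumed size of 100,000 elements using the standard `timeit` module. No timings are shown here because they vary by machine.

```python
import timeit
import numpy as np

n = 100_000  # assumed size, chosen only to make per-call overhead negligible

def squares_numpy():
    # int64 is requested explicitly so n**2 cannot overflow on any platform
    return np.arange(1, n + 1, dtype=np.int64) ** 2

def squares_list():
    return [i ** 2 for i in range(1, n + 1)]

# Both functions produce the same values.
assert squares_numpy().tolist() == squares_list()

t_numpy = timeit.timeit(squares_numpy, number=20)
t_list = timeit.timeit(squares_list, number=20)
print(f"NumPy: {t_numpy:.4f} s, list comprehension: {t_list:.4f} s")
```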


In [7]:
arr = np.array([1, 2, 3, 4, 5])
print(arr.ndim)

1


## Creating an n-dimensional array in NumPy
- Pass an `n`-level nested list to `np.array()` to create an `n`-dimensional NumPy array.
- To get the number of dimensions of a NumPy array, use its `.ndim` attribute.

In [8]:
## Creating 1D array
arr1 = np.array([1, 2, 3])
print(arr1.ndim)


## Creating a 2D array. Every row must have the same length. Think of it like a matrix
arr2 = np.array([[1, 2, 3], [4, 5, 7]])
print(arr2.ndim)
print(arr2.dtype)

# Creating a 10D array: `ndmin` adds leading axes until the array has at least 10 dimensions
arr10 = np.array([[1, 2, 3], [4, 5, 6]], ndmin=10)
print(arr10)
print(arr10.ndim)

1
2
int64
[[[[[[[[[[1 2 3]
         [4 5 6]]]]]]]]]]
10
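
Alongside `.ndim`, the `.shape` and `.size` attributes describe how an array is laid out. A short sketch using the same arrays as the cell above:

```python
import numpy as np

arr2 = np.array([[1, 2, 3], [4, 5, 7]])
print(arr2.ndim)    # number of axes
print(arr2.shape)   # length along each axis
print(arr2.size)    # total number of elements

# ndmin pads the shape with leading axes of length 1.
arr10 = np.array([[1, 2, 3], [4, 5, 6]], ndmin=10)
print(arr10.shape)
```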


### Creating different types of NumPy arrays

In [9]:
## Creating a zero array
zeroarr = np.zeros(shape=(2, 3, 4), dtype=int)
print(zeroarr)


# [   
#     [  
#         [0, 0, 0, 0],
#         [0, 0, 0, 0],
#         [0, 0, 0, 0]
#     ],
#     [
#         [0, 0, 0, 0],
#         [0, 0, 0, 0],
#         [0, 0, 0, 0]
#     ]
# ]


[[[0 0 0 0]
  [0 0 0 0]
  [0 0 0 0]]

 [[0 0 0 0]
  [0 0 0 0]
  [0 0 0 0]]]


In [10]:
## Creating a ones array
onesarr = np.ones(shape=(2, 3, 4), dtype=int)
print(onesarr)


[[[1 1 1 1]
  [1 1 1 1]
  [1 1 1 1]]

 [[1 1 1 1]
  [1 1 1 1]
  [1 1 1 1]]]


In [11]:
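# Creating an empty array. `np.empty` does not initialize its contents, so the values are arbitrary (the zeros shown below are not guaranteed)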
emptyarr = np.empty(shape=(2,3), dtype=int)
print(emptyarr)

[[0 0 0]
 [0 0 0]]
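
Since `np.empty` leaves its contents uninitialized, a common pattern is to assign every element before reading the array. The sketch below shows that pattern next to `np.full`, which allocates and fills in one step; the fill value 7 is arbitrary.

```python
import numpy as np

# np.empty only allocates memory; its initial contents are arbitrary.
buf = np.empty(shape=(2, 3), dtype=int)

# Assign every element before reading from the array.
buf[:] = 7
print(buf)

# np.full allocates and fills in one step.
filled = np.full(shape=(2, 3), fill_value=7, dtype=int)
print(np.array_equal(buf, filled))
```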


In [12]:
## Creating a range array -> Always outputs a 1D array. Start is included; stop is excluded
rangearr = np.arange(start=1, stop=5, step=.5, dtype=float)
print(rangearr)


# Create an array of evenly spaced values over a range; both the start and the end of the range are included.
linarr = np.linspace(1, 10, num=3, dtype=int)
print(linarr)

[1.  1.5 2.  2.5 3.  3.5 4.  4.5]
[ 1  5 10]
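
The difference in endpoint handling between `np.arange` and `np.linspace` can be checked directly. This is a small sketch; the ranges and counts are chosen only for illustration.

```python
import numpy as np

# arange: half-open interval, so the stop value 5 is excluded.
a = np.arange(1, 5, 0.5)
print(a[0], a[-1])

# linspace: closed interval by default, so both endpoints appear.
# retstep=True also returns the spacing between samples.
pts, step = np.linspace(0, 1, num=5, retstep=True)
print(pts, step)

# endpoint=False makes linspace exclude the stop value as well.
half_open = np.linspace(0, 1, num=4, endpoint=False)
print(half_open)
```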


In [13]:
# Both `np.identity()` and `np.eye()` can create an identity matrix, but `np.eye()` gives
# more control (non-square shapes and a diagonal offset)
 
identityarr = np.identity(3)
print(identityarr)




identityarr2 = np.eye(3)
print(identityarr2)


# N → number of rows
# M → number of columns (default = N)
# k → diagonal offset
#     0 → main diagonal
#     >0 → upper diagonal
#     <0 → lower diagonal
identityarr3 = np.eye(M=5, N=4, k=1)
print(identityarr3)

[[1. 0. 0.]
 [0. 1. 0.]
 [0. 0. 1.]]
[[1. 0. 0.]
 [0. 1. 0.]
 [0. 0. 1.]]
[[0. 1. 0. 0. 0.]
 [0. 0. 1. 0. 0.]
 [0. 0. 0. 1. 0.]
 [0. 0. 0. 0. 1.]]
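
The cell above uses a positive `k`. For completeness, here is a brief sketch of a negative offset, plus a check that `np.identity` and `np.eye` agree in the square case:

```python
import numpy as np

# np.identity(n) and np.eye(n) agree for square matrices.
same = np.array_equal(np.identity(3), np.eye(3))
print(same)

# A negative k places the ones below the main diagonal.
lower = np.eye(3, k=-1, dtype=int)
print(lower)
```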


In [14]:
## Creating NumPy arrays of random numbers drawn uniformly from [0, 1)

randarr = np.random.rand(4) # Number of samples to draw
print(randarr)

print("=============")

randarr1 = np.random.rand(2, 5)  # Shape of the output: 2 rows, 5 columns
print(randarr1)
print(randarr1)

[0.4195     0.05318058 0.23997381 0.56362527]
=============
[[0.18168307 0.38060775 0.6644252  0.41587458 0.76807154]
 [0.02829926 0.75712041 0.06066739 0.37112529 0.21499683]]


In [15]:
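## Generate random numbers from the standard normal distribution (mean 0, standard deviation 1). Values cluster around 0 but are not bounded

r = np.random.randn(2, 3) ## Shape of the output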
print(r)

[[0.81436764 1.31015482 1.58669962]
 [1.41390187 0.02803093 0.51470789]]


In [16]:
## Generate random numbers in [0.0, 1.0) -> 1 is not included. `ranf` is an alias of `np.random.random_sample`

arr = np.random.ranf((3, 4))
print(arr)

[[0.26043936 0.93542289 0.03932523 0.5662606 ]
 [0.91751321 0.55042568 0.63880451 0.59223625]
 [0.45337012 0.34272014 0.45966359 0.54111321]]


In [17]:
## Generate random integers within a range -> `low` is included, `high` is excluded

rnum = np.random.randint(low=1, high=6, size=(2, 4)) 
rnum

array([[5, 5, 1, 4],
       [5, 5, 4, 2]])
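
The cells above use the legacy `np.random` functions. As an alternative sketch, the same three kinds of samples can be drawn from a `Generator` created with `np.random.default_rng`. The seed value 42 is an arbitrary choice made here so that repeated runs give the same numbers.

```python
import numpy as np

rng = np.random.default_rng(seed=42)  # arbitrary seed, for repeatability only

uniform = rng.random((2, 5))                      # floats in [0.0, 1.0)
normal = rng.standard_normal((2, 3))              # mean 0, standard deviation 1
ints = rng.integers(low=1, high=6, size=(2, 4))   # integers 1 to 5; high is excluded

print(uniform.shape, normal.shape, ints.shape)
```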

## Data types in NumPy arrays
- You can specify the data type of a NumPy array when creating it, using the `dtype` parameter.
- You can also convert an existing NumPy array to another data type using the `.astype()` method, which returns a new array.
- NumPy supports various data types such as `int`, `float`, `complex`, `bool`, and `object`.
- In summary, a NumPy array holds a single data type, which is what makes it efficient. If you need to work with heterogeneous data, consider using separate arrays for each data type, a structured array, or the `dtype=object` option with caution.

In [18]:
# Creating a NumPy array that holds mixed Python objects by using dtype=object
heterogeneous_array = np.array([1, 2.5, 3+4j, 'Hello'], dtype=object)
print(heterogeneous_array)

print("===========")

# Creating a NumPy array with a homogeneous int data type
int_array = np.array([1, 2, 3, 4, 5], dtype=int)
print(int_array)

print("===========")

# Creating a numpy array with floating-point numbers
float_array = np.array([1.0, 2.5, 3.3, 4.8, 5.1], dtype=float)
print(float_array)

print("===========")

# Creating a numpy array with complex numbers
complex_array = np.array([1+2j, 3+4j, 5+6j], dtype=complex)
print(complex_array)

print("===========")

# Changing the datatype of an existing numpy array
original_array = np.array([1, 2, 3, 4, 5])
float_array = original_array.astype(float)
print(float_array)

[1 2.5 (3+4j) 'Hello']
===========
[1 2 3 4 5]
===========
[1.  2.5 3.3 4.8 5.1]
===========
[1.+2.j 3.+4.j 5.+6.j]
===========
[1. 2. 3. 4. 5.]
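
Converting with `.astype()` can lose information, which is worth checking before relying on it. The sketch below uses example values chosen here to show float-to-int truncation and conversion to `bool`:

```python
import numpy as np

floats = np.array([1.9, -1.9, 2.5])

# Casting float -> int discards the fractional part (truncates toward zero).
as_int = floats.astype(int)
print(as_int)

# astype returns a new array; the original is left unchanged.
print(floats)

# Casting to bool maps zero to False and every other value to True.
flags = np.array([0, 1, 2]).astype(bool)
print(flags)
```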
