# NumPy Getting Started Guide

NumPy, short for Numerical Python, is a fundamental library for scientific computing in Python. Its capabilities go well beyond basic numerical work, as we will see in these videos. It also serves as the foundation for Pandas, one of the most popular Python libraries for data analysis.

In [None]:
# !pip install numpy

## NumPy Arrays

An array is a data structure that stores values of the same type. Because every element has the same type, NumPy can keep the values in a compact, contiguous block of memory, which saves space and makes operations on the whole array more efficient.

In [2]:
import numpy as np

array = np.array([1,2,3,4,5])

print(array)

[1 2 3 4 5]
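Because all elements share one type, the array itself can report that type along with its shape and memory footprint. The following is a minimal sketch using standard `ndarray` attributes; the exact integer `dtype` printed depends on the platform and NumPy version, so no output is shown for it.

```python
import numpy as np

array = np.array([1, 2, 3, 4, 5])

print(array.dtype)     # the single element type shared by all values
print(array.shape)     # (5,)
print(array.ndim)      # 1
print(array.size)      # 5

# Total bytes used by the data: number of elements times bytes per element.
print(array.nbytes == array.size * array.itemsize)  # True
```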


It is important to understand the difference between a list and an array.

A **list** is one of the most basic data structures in Python. It can contain elements of any type, such as numbers, strings, or other lists, and the elements do not have to share a type. For example:


In [14]:
list1=[1,'two',3.0]

print(list1)
print(type(list1))

for el in list1:
    print(type(el))

[1, 'two', 3.0]
<class 'list'>
<class 'int'>
<class 'str'>
<class 'float'>


An **array**, on the other hand, also stores elements, but all of them must be of the same type. If you create an array from elements of different types, NumPy converts them all to a common type that can represent every value (here, strings). For example:

In [8]:
array = np.array(list1)  # list1 is the mixed-type list defined above

print(array)

for el in array:
    print(type(el))

# or inspect the array's dtype directly
print(array.dtype) # <U32: Unicode strings of up to 32 characters

['1' 'two' '3.0']
<class 'numpy.str_'>
<class 'numpy.str_'>
<class 'numpy.str_'>
<U32
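The same conversion happens with purely numeric input. As a small sketch, mixing integers with a float yields a floating-point array, and the `dtype` argument of `np.array` can be used to request a type explicitly instead of letting NumPy infer one.

```python
import numpy as np

# Integers mixed with a float are all promoted to a floating-point type.
mixed_numbers = np.array([1, 2, 3.5])
print(mixed_numbers.dtype)  # float64

# Passing dtype requests a specific element type up front.
as_floats = np.array([1, 2, 3], dtype=np.float64)
print(as_floats)  # [1. 2. 3.]
```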


## Math operations

If you try to add a number to every element of a list with `+`, Python raises a `TypeError`, because `+` on a list means concatenation, not arithmetic.



In [9]:
list1 = [1,2,3,4,5]
new_list1 = list1 + 1  # raises TypeError: an int cannot be concatenated to a list

# A for loop solves this, as shown in the next cell.

TypeError: can only concatenate list (not "int") to list

In [10]:
new_list1=[]

for number in list1:
    new_list1.append(number+1)

print(new_list1)

# It works, but a NumPy array can do the same without an explicit loop (see the next cell).

[2, 3, 4, 5, 6]


With a NumPy array, you can add a number to every element at once, and likewise subtract, multiply, or divide.



In [12]:
array = np.array([1,2,3,4,5])

new_array = array + 1 # adding 1 to all elements
print(new_array)

[2 3 4 5 6]
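The other arithmetic operators behave the same way. The sketch below also contrasts `+` between two lists (concatenation) with `+` between two arrays (element-wise addition), and shows a list comprehension as the plain-Python counterpart to the loop used earlier.

```python
import numpy as np

array = np.array([1, 2, 3, 4, 5])

print(array - 1)  # [0 1 2 3 4]
print(array * 2)  # [ 2  4  6  8 10]
print(array / 2)  # [0.5 1.  1.5 2.  2.5]

# "+" concatenates lists but adds arrays element by element.
joined = [1, 2] + [3, 4]
summed = np.array([1, 2]) + np.array([3, 4])
print(joined)  # [1, 2, 3, 4]
print(summed)  # [4 6]

# Plain-Python equivalent of array * 2 for a list.
list1 = [1, 2, 3, 4, 5]
doubled_list = [number * 2 for number in list1]
print(doubled_list)  # [2, 4, 6, 8, 10]
```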


## Performance

For large amounts of numeric data, NumPy arrays use significantly less memory and are much faster to operate on than Python lists. Here is an example that demonstrates the speed difference:



In [3]:
import time

# create a list and an array with 10 million numbers
list1 = list(range(1,10_000_001))
array = np.arange(1,10_000_001)  # np.arange builds the array directly

# calculate the sum of all numbers - list
start = time.time()
sum_list1 = sum(list1)
end = time.time()

print(f'Time to sum all numbers in the List: {end-start} seconds')

# calculate the sum of all numbers - array
start = time.time()
sum_array = np.sum(array)
end = time.time()

print(f'Time to sum all numbers in the Array: {end-start} seconds')


Time to sum all numbers in the List: 0.3453941345214844 seconds
Time to sum all numbers in the Array: 0.0 seconds
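The `0.0 seconds` reported for the array most likely reflects the coarse resolution of `time.time()` on some platforms rather than a truly instantaneous sum. A hedged alternative is sketched below using the standard-library `timeit` module, which relies on a higher-resolution clock and repeats the measurement. The smaller size `n`, the repeat count of 10, and the explicit `int64` dtype are choices made for this sketch, not part of the original example. It also compares memory use, which the timing cell does not cover.

```python
import sys
import timeit

import numpy as np

n = 1_000_000  # smaller than the example above so the sketch runs quickly
list1 = list(range(1, n + 1))
array = np.arange(1, n + 1, dtype=np.int64)  # int64 avoids overflow in the sum

# Total time for 10 repetitions of each sum.
list_time = timeit.timeit(lambda: sum(list1), number=10)
array_time = timeit.timeit(lambda: array.sum(), number=10)

print(f'List:  {list_time:.6f} seconds for 10 sums')
print(f'Array: {array_time:.6f} seconds for 10 sums')

# Memory: the array stores 8 bytes per int64 element.
print(f'Array data: {array.nbytes} bytes')
# getsizeof counts only the list's own pointer storage,
# not the separate int objects those pointers refer to.
print(f'List container: {sys.getsizeof(list1)} bytes')
```

Exact timings vary by machine, so no expected values are shown.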


## In summary

Here are some key differences between lists and arrays:

1. **Data type**: Lists can store elements of different types at the same time, while arrays store elements of the same type.

2. **Mathematical operations**: You can perform mathematical operations on all elements of an array at once, whereas a list requires an explicit loop or comprehension.

3. **Performance**: Arrays use less memory and are faster than lists when working with large amounts of numeric data.

4. **Features**: NumPy arrays come with several built-in functions for mathematical and scientific operations such as averaging, summing, matrix multiplication, etc., which are not available with lists.
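As a brief illustration of the last point, the sketch below uses a few of these built-in operations: `sum`, `mean`, and the `@` operator for matrix multiplication. The variable names are chosen for this example only.

```python
import numpy as np

values = np.array([1, 2, 3, 4, 5])
print(values.sum())   # 15
print(values.mean())  # 3.0

# Matrix multiplication with the @ operator.
matrix = np.array([[1, 2],
                   [3, 4]])
product = matrix @ matrix
print(product.tolist())  # [[7, 10], [15, 22]]
```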