## Introduction to NumPy and Pandas Series

NumPy (Numerical Python) and Pandas are Python libraries used extensively for data analysis and manipulation. NumPy provides large, multi-dimensional arrays and matrices, along with a collection of mathematical functions that operate on them efficiently. Pandas builds on NumPy and adds high-level data structures and tools designed to make data analysis fast and easy. A Pandas Series is a one-dimensional labelled array that can hold any data type. This notebook introduces NumPy arrays and Pandas Series through short examples.

### Importing the NumPy and Pandas libraries

Before we begin, we need to import the libraries. The `numpy` and `pandas` libraries are typically imported under the `np` and `pd` aliases respectively.

In [1]:
import numpy as np
import pandas as pd

### The NumPy Array

The NumPy array, also known as `ndarray`, is a grid of values that all share the same data type. It resembles a Python list, but NumPy stores the data more compactly and provides faster data manipulation than Python lists. For example, arithmetic such as multiplication or subtraction can be applied to an entire ndarray at once, element by element.

In [2]:
# Code example: construct a numpy array
mylist = list(range(0,100))
myarray = np.array(mylist)
print(myarray)

[ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47
 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71
 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95
 96 97 98 99]
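
The element-wise arithmetic mentioned above can be sketched as follows. This is a minimal illustration on a small array; the names `values`, `doubled`, `shifted` and `products` are made up for this example and do not appear elsewhere in the notebook.

```python
import numpy as np

# A small array keeps the output easy to read.
values = np.array([1, 2, 3, 4])

doubled = values * 2        # multiply every element by a scalar
shifted = values - 1        # subtract a scalar from every element
products = values * values  # element-wise product of two arrays

print(doubled)   # [2 4 6 8]
print(shifted)   # [0 1 2 3]
print(products)  # [ 1  4  9 16]
```

By contrast, multiplying a plain Python list by 2 repeats the list rather than doubling each element.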


A more challenging example is to create a 2D NumPy array from nested lists:

In [3]:
# Convert nested lists into a 2D numpy array
mylist1 = list(range(0,50))
mylist2 = list(range(50,100))
mynList = [mylist1, mylist2]
my2DArray = np.array(mynList)
print(my2DArray)

[[ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
  24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47
  48 49]
 [50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73
  74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97
  98 99]]
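
Once a 2D array exists, its dimensions can be inspected and individual rows, columns or cells selected. The sketch below uses a smaller 2 x 5 array (the name `grid` is hypothetical) so the results are easy to check by eye.

```python
import numpy as np

# Two rows of five values each, built from nested lists.
grid = np.array([list(range(0, 5)), list(range(5, 10))])

print(grid.ndim)   # 2, the number of dimensions
print(grid.shape)  # (2, 5), i.e. 2 rows and 5 columns
print(grid[1, 2])  # 7, the value at row 1, column 2
print(grid[:, 0])  # [0 5], the first column
```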


### The Pandas Series

The Pandas Series is a one-dimensional labeled array capable of holding any type of data. The axis labels are collectively referred to as the index. A Series is comparable to a single column in a spreadsheet, with the index playing the role of the row labels. The index itself can also be given a name.

In [4]:
# Code example: Construct a pandas series
np.random.seed(10)
names = ["Luke", "Ken", "Han", "Chewy"]
ages = np.random.randint(20, 60, 4)
series = pd.Series(ages, index=names)
print(series)

Luke     29
Ken      56
Han      35
Chewy    20
dtype: int64
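
The index is what distinguishes a Series from a plain array: values can be looked up by label as well as by position, and both the Series and its index can be named. The sketch below hard-codes the ages printed above instead of drawing them randomly; the names `ages`, `"name"` and `"age"` are chosen only for illustration.

```python
import pandas as pd

# Same labels and values as the output above, written out explicitly.
ages = pd.Series([29, 56, 35, 20], index=["Luke", "Ken", "Han", "Chewy"])

ages.index.name = "name"  # name the index
ages.name = "age"         # name the Series itself

print(ages["Han"])    # 35, lookup by label
print(ages.iloc[0])   # 29, lookup by position
print(ages[ages > 30].index.tolist())  # ['Ken', 'Han']
```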


A more challenging example is to create a Pandas Series from a dictionary, where the keys become the index labels:

In [5]:
# Create a dictionary
my_dict = {'a': 50, 'b': 100, 'c': 150, 'd': 200, 'e': 250}
# Convert the dictionary into a series
seriesFromDict = pd.Series(my_dict)
print(seriesFromDict)

a     50
b    100
c    150
d    200
e    250
dtype: int64
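
When building a Series from a dictionary, an explicit `index` argument can select and reorder keys. The following sketch assumes a smaller, hypothetical dictionary `prices`; a label that is not a key of the dictionary (here `"z"`) receives a missing value.

```python
import pandas as pd

prices = {"a": 50, "b": 100, "c": 150}

# Only the requested labels are kept, in the requested order.
# "z" is not a key in the dictionary, so its value is NaN.
subset = pd.Series(prices, index=["c", "a", "z"])
print(subset)
```

Because NaN is a floating-point value, the resulting Series has a float dtype even though the dictionary held integers.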


### Merging NumPy arrays and Pandas Series

You can concatenate two or more NumPy arrays along an axis, or stack them vertically or horizontally. Similarly, Pandas provides various ways to combine Series and DataFrame objects.

In [6]:
# Code example: Join a numpy array and a pandas series
nparray = np.array(['lamb','cow','goat'])
pandasSeries = pd.Series(['apple', 'banana', 'melon'])
newSeries = pd.concat([pd.Series(nparray),pandasSeries])
print(newSeries)

0      lamb
1       cow
2      goat
0     apple
1    banana
2     melon
dtype: object
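
For the NumPy side of the description above, a minimal sketch of `np.concatenate`, `np.vstack` and `np.hstack` is shown below, followed by a variant of the Series concatenation that avoids the repeated index labels 0, 1, 2 seen in the output. The variable names are hypothetical.

```python
import numpy as np
import pandas as pd

first = np.array([1, 2, 3])
second = np.array([4, 5, 6])

joined = np.concatenate([first, second])   # one array of 6 values
stacked_rows = np.vstack([first, second])  # 2 rows, 3 columns
stacked_flat = np.hstack([first, second])  # same as joined for 1D input

print(joined.shape)        # (6,)
print(stacked_rows.shape)  # (2, 3)

# ignore_index=True discards the original labels and renumbers from 0.
animals = pd.Series(["lamb", "cow", "goat"])
fruits = pd.Series(["apple", "banana", "melon"])
renumbered = pd.concat([animals, fruits], ignore_index=True)
print(renumbered.index.tolist())  # [0, 1, 2, 3, 4, 5]
```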


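A more challenging example concatenates two Series that share the same index along the column axis, which produces a DataFrame: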

In [7]:
# Create two pandas series
series3 = pd.Series([100, 200, 300], ['Tom', 'Bob', 'Nancy'])
series4 = pd.Series([500, 600, 700], ['Tom', 'Bob', 'Nancy'])

# Concatenate the two series side by side (axis=1); the result is a DataFrame
newDataFrame = pd.concat([series3, series4], axis=1)
print(newDataFrame)

         0    1
Tom    100  500
Bob    200  600
Nancy  300  700
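
The default column labels `0` and `1` carry no meaning. One way to get descriptive column names is the `keys` argument of `pd.concat`, sketched below; the labels `"first"` and `"second"` and the variable names are placeholders chosen for this example.

```python
import pandas as pd

people = ["Tom", "Bob", "Nancy"]
first = pd.Series([100, 200, 300], index=people)
second = pd.Series([500, 600, 700], index=people)

# With axis=1, each key becomes the column name of the matching Series.
table = pd.concat([first, second], axis=1, keys=["first", "second"])

print(table.columns.tolist())      # ['first', 'second']
print(table.loc["Bob", "second"])  # 600
```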
