#### Pandas
<font size=2>
Pandas provides high-level data structures and manipulation tools designed to make data
analysis fast and easy in Python. It is built on top of NumPy, which makes it easy to
use in NumPy-centric applications.

Pandas is widely used in data science, machine learning, and data analysis due to its ability to handle large datasets and perform complex data operations efficiently.
</font>

In [26]:
import pandas as pd
import numpy as np      # NumPy is used later for np.exp

<font size=3>
Two major data structures in Pandas
</font>
<font size=2>

- <b>DataFrame:</b> A two-dimensional, size-mutable, and heterogeneous tabular data structure with labeled axes (rows and columns). It is similar to a spreadsheet or SQL table.
- <b>Series:</b> A one-dimensional, labeled array that can hold any data type. It is like a single column in a DataFrame.
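
To make the relationship between the two structures concrete, here is a minimal sketch. The table contents and the names `df` and `population` are made up for illustration; selecting a single column from a DataFrame gives back a Series.

```python
import pandas as pd

# Hypothetical example data, used only to illustrate the two structures
df = pd.DataFrame({
    "state": ["Ohio", "Texas", "Oregon"],
    "population": [35000, 71000, 16000],
})

# Selecting one column with [] returns a Series
population = df["population"]
print(type(population).__name__)
print(population.name)
```

The column keeps the DataFrame's row index, and its `name` attribute holds the column label.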

Let's get started with Series.

</font>
<br>
<b>Series</b>
<font size=2>The simplest Series is formed from only an array of data:</font>

In [3]:
s1 = pd.Series([4, 7, -5, 3])
s1

0    4
1    7
2   -5
3    3
dtype: int64

<font size=2>
The string representation of a Series displayed interactively shows the index on the left
and the values on the right. Since we did not specify an index for the data, a default
one consisting of the integers 0 through N - 1 (where N is the length of the data) is
created. You can get the array representation and index object of the Series via its values
and index attributes, respectively:
</font>

In [9]:
print(s1.values)
s1.index

[ 4  7 -5  3]


RangeIndex(start=0, stop=4, step=1)

<font size=2>Since the index attribute above returns a RangeIndex object, we can loop over it:</font>

In [13]:
for i in s1.index:
    print(i, end=" ")

0 1 2 3 

We can also assign our own index labels to identify each element:

In [16]:
s2 = pd.Series([4, 7, -5, 3], index=['a','b','c','d'])
s2

a    4
b    7
c   -5
d    3
dtype: int64

<font size=2>
Unlike with a regular NumPy array, you can use labels in the index when selecting
single values or a set of values:
</font>

In [21]:
print("Accessing element: ", s2['a'], "\n")  # access an element by its index label
s2['d'] = 6     # assign a value by index label
print("Accessing elements using multiple indexes:")
s2[['c', 'a', 'd']]  # select multiple elements with a list of labels

Accessing element:  4 

Accessing elements using multiple indexes:


c   -5
a    4
d    6
dtype: int64
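
The bracket syntax above selects by label. As a supplementary sketch (not part of the original walkthrough), pandas also offers the explicit `.loc` (label-based) and `.iloc` (position-based) accessors. The Series `s` below is a fresh copy of the data used above, and the variable names are chosen only for illustration.

```python
import pandas as pd

# Same data as s2 after the assignment s2['d'] = 6
s = pd.Series([4, 7, -5, 6], index=["a", "b", "c", "d"])

by_label = s.loc["c"]         # select by index label
by_position = s.iloc[2]       # select by integer position
label_slice = s.loc["a":"c"]  # label slices include the end label
pos_slice = s.iloc[0:2]       # positional slices exclude the end position

print(by_label, by_position)
print(list(label_slice.index))
print(list(pos_slice.index))
```

Being explicit about `.loc` versus `.iloc` avoids ambiguity when the index labels are themselves integers.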

<font size=2>
NumPy array operations, such as filtering with a boolean array, scalar multiplication,
or applying math functions, will preserve the index-value link:

</font>

In [23]:
s2[s2 > 0]

a    4
b    7
d    6
dtype: int64

In [24]:
s2 * 2

a     8
b    14
c   -10
d    12
dtype: int64

In [27]:
np.exp(s2)

a      54.598150
b    1096.633158
c       0.006738
d     403.428793
dtype: float64

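<font size=2>A Series can also be thought of as a fixed-length, ordered dict that maps index
labels to data values, so it can be substituted into many functions that expect a dict:</font>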

In [28]:
print('b' in s2)
print('e' in s2)

True
False
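
A few more dict-like operations can be sketched as follows. This is an illustrative example, assuming the same data as `s2` above; the names `s`, `value_b`, `missing`, `pairs`, and `as_dict` are introduced here only for the example. Note that `in` tests the index labels, not the values.

```python
import pandas as pd

# Same data as s2 above
s = pd.Series([4, 7, -5, 6], index=["a", "b", "c", "d"])

value_b = s.get("b")      # like dict.get: value stored under label 'b'
missing = s.get("e", 0)   # label 'e' is absent, so the default is returned
pairs = list(s.items())   # (label, value) pairs, like dict.items()
as_dict = s.to_dict()     # convert back to a plain dict

print(value_b, missing)
print(as_dict)
print(7 in s)             # membership checks labels, so this is False
```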


<font size=2>You can also create a Series from a dictionary; the dict's keys become the index:</font>

In [29]:
sdata = {'Ohio': 35000,
         'Texas': 71000,
         'Oregon': 16000,
         'Utah': 5000}
s3 = pd.Series(sdata)
s3

Ohio      35000
Texas     71000
Oregon    16000
Utah       5000
dtype: int64
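
When building a Series from a dict, you can also pass an explicit `index` to control which keys are used and in what order. The sketch below reuses `sdata`; the `states` list and the name `s4` are hypothetical, chosen for illustration. Labels with no matching key get a missing value (`NaN`), and keys not listed in the index are dropped.

```python
import pandas as pd

sdata = {'Ohio': 35000,
         'Texas': 71000,
         'Oregon': 16000,
         'Utah': 5000}

# Hypothetical label list: 'California' is not a key in sdata,
# and 'Utah' is deliberately left out
states = ['California', 'Ohio', 'Oregon', 'Texas']
s4 = pd.Series(sdata, index=states)

print(s4)
print(s4.isna())   # True only for the label with no matching key
```

Because `NaN` is a floating-point value, the resulting Series has dtype `float64` rather than `int64`.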