# What is Pandas?

**pandas** is a fast, powerful, flexible and easy to use open source data analysis and manipulation tool,
built on top of the Python programming language.

[source](https://pandas.pydata.org/)

pandas is built on top of NumPy (see [here](https://github.com/econdesousa/portfolio/tree/main/Intro-to-numpy) for an introduction to NumPy) and uses Series and DataFrames as its primary data structures.

## Install Pandas

Like most Python libraries, pandas can be installed with conda
>conda install pandas

or with pip
>pip install pandas

## Importing Pandas

To import pandas into your script:

In [1]:
import pandas as pd

## Pandas Series

In [2]:
numericalSeries = pd.Series([10,21,32,43])
numericalSeries

0    10
1    21
2    32
3    43
dtype: int64

In [3]:
CharacterSeries = pd.Series(['a','b','c'])
CharacterSeries

0    a
1    b
2    c
dtype: object

A Series can also be built from a NumPy array:

In [4]:
import numpy as np

numpySeries = pd.Series(np.random.randn(10))
numpySeries

0   -0.919858
1   -0.047902
2   -0.800299
3   -1.759447
4   -1.599635
5   -0.747060
6    1.648134
7   -1.346622
8    1.217698
9    1.179268
dtype: float64

## Assigning indexes to series

In [5]:
numpySeriesWithIndex = pd.Series(np.random.randn(10),index=['a','b','c','d','e','f','g','h','i','j'])
numpySeriesWithIndex

a    0.932594
b   -0.472996
c   -0.345444
d   -0.636789
e    0.859272
f   -1.082917
g   -0.169093
h    0.209436
i   -1.213947
j    0.556698
dtype: float64

A labelled Series can also be built from a dictionary, whose keys become the index:

In [6]:
dictionary = {'a' : 1,'b' : 2,'c' : 3,'d' : 4,'e' : 5,'f' : 6,'g' : 7,'h' : 8,'i' : 9,'j' : 10}
numpySeriesWithIndexFromDict = pd.Series(dictionary)
numpySeriesWithIndexFromDict

a     1
b     2
c     3
d     4
e     5
f     6
g     7
h     8
i     9
j    10
dtype: int64

## Indexing

### By position


In [7]:
numpySeriesWithIndexFromDict.iloc[0] # first element of the Series (.iloc selects by position)

1

In [8]:
numpySeriesWithIndexFromDict.iloc[-1] # last element of the Series (.iloc selects by position)

10

In [9]:
numpySeriesWithIndexFromDict.iloc[5:-2] # from the sixth element (position 5) up to, but not including, the penultimate element (position -2)

f    6
g    7
h    8
dtype: int64

### By index label

In [10]:
numpySeriesWithIndexFromDict['b']

2
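
The two indexing styles above can be made explicit with `.iloc` (position) and `.loc` (label). The following is a minimal sketch, assuming a small hypothetical Series `s` that mirrors the dictionary-based one; the variable names are illustrative only.

```python
import pandas as pd

# Hypothetical Series, similar to the one built from the dictionary above
s = pd.Series({'a': 1, 'b': 2, 'c': 3, 'd': 4})

first = s.iloc[0]              # by position: first element
last = s.iloc[-1]              # by position: last element
by_label = s.loc['b']          # by index label

pos_slice = s.iloc[1:3]        # position slices exclude the stop position
label_slice = s.loc['b':'d']   # label slices include both endpoints
```

Note that a slice by label keeps the end label, while a slice by position leaves the end position out, so the two slices above do not return the same number of elements.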

### Element-wise math operations (NumPy style)

In [11]:
numpySeriesWithIndexFromDict + 2

a     3
b     4
c     5
d     6
e     7
f     8
g     9
h    10
i    11
j    12
dtype: int64

In [12]:
numpySeriesWithIndexFromDict / 2

a    0.5
b    1.0
c    1.5
d    2.0
e    2.5
f    3.0
g    3.5
h    4.0
i    4.5
j    5.0
dtype: float64

In [13]:
numpySeriesWithIndexFromDict ** 2 

a      1
b      4
c      9
d     16
e     25
f     36
g     49
h     64
i     81
j    100
dtype: int64
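
The operations above return a new Series and leave the original untouched. When two Series are combined, pandas also aligns them on their index labels rather than on position. A minimal sketch, assuming two small hypothetical Series (`s` and `other`):

```python
import pandas as pd

s = pd.Series({'a': 1, 'b': 2, 'c': 3})
other = pd.Series({'b': 10, 'c': 20, 'd': 30})

doubled = s * 2       # new Series; s itself is unchanged
combined = s + other  # values are matched by label, not by position

# Labels present in only one of the two Series ('a' and 'd') yield NaN
missing = combined.isna()
```

To keep the result of an operation, assign it to a variable (or back to the original name), as done with `doubled` here.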

## Pandas Dataframe

A pandas DataFrame is a heterogeneous, two-dimensional data structure: each column can hold a different data type.
It can be thought of as the Python equivalent of an Excel spreadsheet.

In [14]:
twoColDataframe = pd.DataFrame({'var1' : [1,2,3,4,5], 'var2' : np.random.rand(5)})
twoColDataframe

   var1      var2
0     1  0.164482
1     2  0.792775
2     3  0.477068
3     4  0.001666
4     5  0.181086
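
To illustrate what "heterogeneous" means in practice, the sketch below builds a small hypothetical DataFrame (the `label` column is an addition for illustration) and inspects the type of each column. Each column is itself a Series.

```python
import pandas as pd

# Hypothetical DataFrame with an integer, a float and a text column
df = pd.DataFrame({
    'var1': [1, 2, 3],
    'var2': [0.5, 1.5, 2.5],
    'label': ['x', 'y', 'z'],
})

column_types = df.dtypes   # one dtype per column
shape = df.shape           # (number of rows, number of columns)
one_column = df['var1']    # selecting a single column returns a Series
```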


In [15]:
dataframeFromDictionary = pd.DataFrame([dictionary]) # wrapping the dictionary in a list makes each key a column and the values a single row
dataframeFromDictionary

   a  b  c  d  e  f  g  h  i   j
0  1  2  3  4  5  6  7  8  9  10


## Import and export to csv

To save a DataFrame to CSV, call the to_csv() method on the variable that holds it. By default the index is written as well, so it comes back as an extra column when the file is read again.

In [16]:
filename = 'data2CSV.csv'
twoColDataframe.to_csv(filename)

In [17]:
newDataframeFromFile = pd.read_csv(filename)
newDataframeFromFile

   Unnamed: 0  var1      var2
0           0     1  0.164482
1           1     2  0.792775
2           2     3  0.477068
3           3     4  0.001666
4           4     5  0.181086


Saving to and reading from CSV can be customized. For example, instead of a comma we can use a tab as the separator, and passing index=False leaves the index out of the file.

In [18]:
twoColDataframe.to_csv(filename, sep="\t", index=False)
newDataframeFromFile = pd.read_csv(filename, sep="\t")
newDataframeFromFile

   var1      var2
0     1  0.164482
1     2  0.792775
2     3  0.477068
3     4  0.001666
4     5  0.181086
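
Another option, when the index should be kept in the file, is to tell `read_csv` which column holds it. The sketch below assumes a small hypothetical DataFrame and writes to an in-memory `io.StringIO` buffer instead of a file, only to keep the example self-contained.

```python
import io
import pandas as pd

df = pd.DataFrame({'var1': [1, 2, 3], 'var2': [0.1, 0.2, 0.3]})

buffer = io.StringIO()
df.to_csv(buffer)   # the index is written as the first, unnamed column
buffer.seek(0)      # rewind the buffer before reading it back

# index_col=0 uses the first column as the index instead of loading it as data
restored = pd.read_csv(buffer, index_col=0)
```

With a real file, the same `index_col=0` argument can be passed together with the file name.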

