### An Introduction to Pandas
**Pandas** is a package for data manipulation and analysis in Python. The name is derived from the econometrics term *Panel Data*.
Pandas adds two data structures to Python: the **Pandas Series** and the **Pandas DataFrame**.

In this essay, we work through some of the most important features of Pandas so that, by the end, you can use its basic methods.

You'll learn:
 * How to import Pandas
 * How to create Pandas Series and work with them
 * How to access and change elements in Series and DataFrames
 * How to perform arithmetic operations on Series
 * How to load data into a DataFrame
 * How to deal with Not a Number (NaN) values

#### Downloading Pandas
You can install Pandas through **Anaconda**. Searching for "install anaconda python" together with your operating system (macOS, Linux, or Windows) turns up the installer and its instructions.
**After that**, run the command *conda install pandas*, and it's **done**.
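
Once the install finishes, a quick sanity check is to import the package and print its version. This is a minimal sketch that assumes pandas is installed in the active environment; the version string you see depends on your setup.

```python
import pandas as pd

# Confirm that pandas can be imported and report which version is installed.
print(pd.__version__)
```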

##### HOW TO IMPORT PANDAS & CREATE PANDAS SERIES

In [1965]:
import pandas as pd
groceries = pd.Series(data=[20,4,'Yes','No'], index=['eggs', 'age', 'sex','mashtehsan'])

In [1966]:
groceries.size


4

In [1967]:
groceries['eggs'] = 123

In [1968]:
groceries[['sex', 'age']]


sex    Yes
age      4
dtype: object

In [1969]:
groceries.iloc[0]  # position-based access; plain groceries[0] is deprecated on a label-indexed Series


123

In [1970]:
groceries.iloc[-1]  # last element by position


'No'

In [1971]:
groceries.iloc[[0, 1]]  # first two elements by position

eggs    123
age       4
dtype: object

In [1972]:
groceries.loc[['eggs', 'sex']]

eggs    123
sex     Yes
dtype: object

In [1973]:
groceries.iloc[[0, 1]]

eggs    123
age       4
dtype: object

In [1974]:
groceries.drop('sex', inplace=True)
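
The `groceries` Series above mixes numbers and strings, so it is not a good candidate for arithmetic. The sketch below uses a hypothetical, purely numeric Series (`fruits` and `restock` are illustrative names, not part of the cells above) to show that scalar arithmetic is applied element by element and that arithmetic between two Series aligns on index labels.

```python
import pandas as pd

# Hypothetical numeric Series, used only for illustration.
fruits = pd.Series(data=[10, 6, 3], index=['apples', 'oranges', 'bananas'])

# Arithmetic with a scalar is applied to every element.
doubled = fruits * 2   # 20, 12, 6
shifted = fruits + 5   # 15, 11, 8
halved = fruits / 2    # 5.0, 3.0, 1.5

# Arithmetic between two Series matches values by index label, not by position.
# A label missing from either side ('oranges' here) produces NaN.
restock = pd.Series(data=[1, 2], index=['bananas', 'apples'])
total = fruits + restock
print(total)
```

Because alignment is by label, the order of entries in `restock` does not matter: `apples` gets 10 + 2 and `bananas` gets 3 + 1.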

##### CREATING PANDAS DATAFRAMES

In [1975]:
items = {'Amir': pd.Series([245, 25, 55], index=['bike', 'pants', 'perfume']),
         'Shirin': pd.Series([40, 110, 56, 600], index=['perfume', 'boat', 'coffe', 'phone_charger'])}
type(items)

dict

In [1976]:
shoppin_carts = pd.DataFrame(items)
shoppin_carts

|  | Amir | Shirin |
| --- | --- | --- |
| bike | 245.0 | NaN |
| boat | NaN | 110.0 |
| coffe | NaN | 56.0 |
| pants | 25.0 | NaN |
| perfume | 55.0 | 40.0 |
| phone_charger | NaN | 600.0 |
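
The missing values in the output above come from index alignment: when a DataFrame is built from a dictionary of Series, pandas uses the union of all index labels as the row index and marks a label as NaN in any column whose Series lacks it. A small sketch with hypothetical Series `a` and `b`:

```python
import pandas as pd

# Two hypothetical Series that share only the label 'y'.
a = pd.Series([1, 2], index=['x', 'y'])
b = pd.Series([3, 4], index=['y', 'z'])

# The row index becomes the union of both indexes; gaps are filled with NaN.
aligned = pd.DataFrame({'a': a, 'b': b})
print(aligned)
print(aligned.index.tolist())
```

A column that contains NaN is stored as float, which is why the integer prices in the shopping-cart output are displayed with a decimal point.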


In [1977]:
shirin_shopping_cart = pd.DataFrame(items, columns=['Shirin'])
shirin_shopping_cart

|  | Shirin |
| --- | --- |
| perfume | 40 |
| boat | 110 |
| coffe | 56 |
| phone_charger | 600 |


In [1978]:
sel_shopping_cart = pd.DataFrame(items, index=['pants', 'book'])
sel_shopping_cart

|  | Amir | Shirin |
| --- | --- | --- |
| pants | 25.0 | NaN |
| book | NaN | NaN |


In [1979]:
shirin_sel_shopping_item = pd.DataFrame(items, index=['perfume', 'boat'], columns=['Shirin'])
shirin_sel_shopping_item

|  | Shirin |
| --- | --- |
| perfume | 40 |
| boat | 110 |


In [1980]:
data = {'Integers': [1,2,3],
        'Floats': [4.6, 7.4, 2.5]}
df = pd.DataFrame(data, index=['label 1', 'label 2', 'label 3'])
df

|  | Integers | Floats |
| --- | --- | --- |
| label 1 | 1 | 4.6 |
| label 2 | 2 | 7.4 |
| label 3 | 3 | 2.5 |


In [1981]:
items = [{'bikes': 20, 'pants': 30, 'watches': 35}, {'watches': 10, 'glasses': 50, 'bikes': 15, 'pants': 5}]
store_items = pd.DataFrame(items, index=['store 1', 'store 2'])
store_items

|  | bikes | pants | watches | glasses |
| --- | --- | --- | --- | --- |
| store 1 | 20 | 30 | 35 | NaN |
| store 2 | 15 | 5 | 10 | 50.0 |


In [1982]:
store_items[['bikes']]

|  | bikes |
| --- | --- |
| store 1 | 20 |
| store 2 | 15 |


In [1983]:
store_items[['bikes', 'pants']]

|  | bikes | pants |
| --- | --- | --- |
| store 1 | 20 | 30 |
| store 2 | 15 | 5 |
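
The double brackets used above matter: a list of column labels returns a DataFrame, while a single label returns a Series. A short sketch on a hypothetical two-store frame (`stores` is an illustrative name):

```python
import pandas as pd

# Hypothetical frame with the same shape as the example above.
stores = pd.DataFrame({'bikes': [20, 15], 'pants': [30, 5]},
                      index=['store 1', 'store 2'])

as_series = stores['bikes']    # single label: a Series
as_frame = stores[['bikes']]   # list of labels: a DataFrame with one column

print(type(as_series).__name__)  # Series
print(type(as_frame).__name__)   # DataFrame
print(as_frame.shape)            # (2, 1)
```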


In [1984]:
store_items.loc[['store 1']]

|  | bikes | pants | watches | glasses |
| --- | --- | --- | --- | --- |
| store 1 | 20 | 30 | 35 | NaN |


In [1985]:
store_items['bikes']['store 2']

15
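
Chained indexing such as `df[column][row]` works for reading a value, but assigning through a chain may not modify the original DataFrame. Passing the row and column labels together to `.loc` (or `.at` for a single cell) avoids that problem. A sketch on a hypothetical frame:

```python
import pandas as pd

# Hypothetical frame, used only for illustration.
stores = pd.DataFrame({'bikes': [20, 15], 'pants': [30, 5]},
                      index=['store 1', 'store 2'])

# Read one cell by row label and column label in a single step.
value = stores.loc['store 2', 'bikes']    # 15
same_value = stores.at['store 2', 'bikes']

# Assign to one cell; this updates the frame itself.
stores.loc['store 2', 'bikes'] = 18
print(stores)
```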

In [1986]:
store_items['shirts'] = [65, 78]
store_items

|  | bikes | pants | watches | glasses | shirts |
| --- | --- | --- | --- | --- | --- |
| store 1 | 20 | 30 | 35 | NaN | 65 |
| store 2 | 15 | 5 | 10 | 50.0 | 78 |


In [1987]:
store_items['suits'] = store_items['shirts'] + store_items['pants']
store_items

|  | bikes | pants | watches | glasses | shirts | suits |
| --- | --- | --- | --- | --- | --- | --- |
| store 1 | 20 | 30 | 35 | NaN | 65 | 95 |
| store 2 | 15 | 5 | 10 | 50.0 | 78 | 83 |


In [1988]:
new_items = [{'bikes': 20, 'pants': 30, 'watches': 35, 'glasses': 4 }]

new_store = pd.DataFrame(new_items, index=['store 3'])
new_store

|  | bikes | pants | watches | glasses |
| --- | --- | --- | --- | --- |
| store 3 | 20 | 30 | 35 | 4 |


In [1989]:
# DataFrame.append was removed in pandas 2.0; pd.concat stacks the rows instead
store_items = pd.concat([store_items, new_store])
store_items

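|  | bikes | pants | watches | glasses | shirts | suits |
| --- | --- | --- | --- | --- | --- | --- |
| store 1 | 20 | 30 | 35 | NaN | 65.0 | 95.0 |
| store 2 | 15 | 5 | 10 | 50.0 | 78.0 | 83.0 |
| store 3 | 20 | 30 | 35 | 4.0 | NaN | NaN |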


In [1990]:
store_items['new_watches'] = store_items['watches'][1:]
store_items

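|  | bikes | pants | watches | glasses | shirts | suits | new_watches |
| --- | --- | --- | --- | --- | --- | --- | --- |
| store 1 | 20 | 30 | 35 | NaN | 65.0 | 95.0 | NaN |
| store 2 | 15 | 5 | 10 | 50.0 | 78.0 | 83.0 | 10.0 |
| store 3 | 20 | 30 | 35 | 4.0 | NaN | NaN | 35.0 |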


In [1991]:
store_items.insert(5, 'shoes', [8, 5, 0])
store_items

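|  | bikes | pants | watches | glasses | shirts | shoes | suits | new_watches |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| store 1 | 20 | 30 | 35 | NaN | 65.0 | 8 | 95.0 | NaN |
| store 2 | 15 | 5 | 10 | 50.0 | 78.0 | 5 | 83.0 | 10.0 |
| store 3 | 20 | 30 | 35 | 4.0 | NaN | 0 | NaN | 35.0 |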


In [1992]:
store_items.pop('new_watches')
store_items

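|  | bikes | pants | watches | glasses | shirts | shoes | suits |
| --- | --- | --- | --- | --- | --- | --- | --- |
| store 1 | 20 | 30 | 35 | NaN | 65.0 | 8 | 95.0 |
| store 2 | 15 | 5 | 10 | 50.0 | 78.0 | 5 | 83.0 |
| store 3 | 20 | 30 | 35 | 4.0 | NaN | 0 | NaN |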


In [1993]:
store_items = store_items.drop(['watches', 'shoes'], axis=1)
store_items

|  | bikes | pants | glasses | shirts | suits |
| --- | --- | --- | --- | --- | --- |
| store 1 | 20 | 30 | NaN | 65.0 | 95.0 |
| store 2 | 15 | 5 | 50.0 | 78.0 | 83.0 |
| store 3 | 20 | 30 | 4.0 | NaN | NaN |


In [1994]:
store_items = store_items.drop(['store 2'], axis=0)
store_items

|  | bikes | pants | glasses | shirts | suits |
| --- | --- | --- | --- | --- | --- |
| store 1 | 20 | 30 | NaN | 65.0 | 95.0 |
| store 3 | 20 | 30 | 4.0 | NaN | NaN |


In [1995]:
store_items = store_items.rename(columns={'bikes': 'hats'})
store_items

|  | hats | pants | glasses | shirts | suits |
| --- | --- | --- | --- | --- | --- |
| store 1 | 20 | 30 | NaN | 65.0 | 95.0 |
| store 3 | 20 | 30 | 4.0 | NaN | NaN |


In [1996]:
store_items = store_items.rename(index={'store 3': 'store 2'})
store_items

|  | hats | pants | glasses | shirts | suits |
| --- | --- | --- | --- | --- | --- |
| store 1 | 20 | 30 | NaN | 65.0 | 95.0 |
| store 2 | 20 | 30 | 4.0 | NaN | NaN |


In [1997]:
store_items = store_items.set_index('pants')
store_items

| pants | hats | glasses | shirts | suits |
| --- | --- | --- | --- | --- |
| 30 | 20 | NaN | 65.0 | 95.0 |
| 30 | 20 | 4.0 | NaN | NaN |
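
`set_index` moves the chosen column into the row index, so `pants` no longer appears among the columns and the previous store labels are discarded. The sketch below, on a hypothetical frame, shows the effect and how `reset_index` turns the index back into an ordinary column.

```python
import pandas as pd

# Hypothetical frame, used only for illustration.
stores = pd.DataFrame({'hats': [20, 20], 'pants': [30, 30]},
                      index=['store 1', 'store 2'])

# 'pants' becomes the index; the old 'store 1'/'store 2' labels are dropped.
by_pants = stores.set_index('pants')
print(by_pants.index.name)     # pants
print(list(by_pants.columns))  # ['hats']

# reset_index moves the index back into a column and numbers the rows from 0.
restored = by_pants.reset_index()
print(list(restored.columns))  # ['pants', 'hats']
```

Note that the index values here are not unique (both rows are 30), which pandas allows but which makes label-based lookups return several rows.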


##### DEALING WITH NAN

In [1998]:
items = [{'bikes': 20, 'pants': 30, 'watches': 35, 'shirts': 15, 'shoes': 8, 'suits': 45},
         {'watches': 10, 'glasses': 50, 'bikes': 15, 'pants': 5, 'shirts': 2, 'shoes': 5, 'suits': 7},
         {'bikes': 20, 'pants': 30, 'watches': 35, 'glasses': 4, 'shoes': 10}]
store_items = pd.DataFrame(items, index=['store 1', 'store 2', 'store 3'])
store_items

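|  | bikes | pants | watches | shirts | shoes | suits | glasses |
| --- | --- | --- | --- | --- | --- | --- | --- |
| store 1 | 20 | 30 | 35 | 15.0 | 8 | 45.0 | NaN |
| store 2 | 15 | 5 | 10 | 2.0 | 5 | 7.0 | 50.0 |
| store 3 | 20 | 30 | 35 | NaN | 10 | NaN | 4.0 |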


In [1999]:
x = store_items.isnull().sum().sum()
print(x)

3


In [2000]:
x = store_items.isnull()
print(x)

         bikes  pants  watches  shirts  shoes  suits  glasses
store 1  False  False    False   False  False  False     True
store 2  False  False    False   False  False  False    False
store 3  False  False    False    True  False   True    False


In [2001]:
x = store_items.isnull().sum()
print(x)

bikes      0
pants      0
watches    0
shirts     1
shoes      0
suits      1
glasses    1
dtype: int64


In [2002]:
store_items.count()

bikes      3
pants      3
watches    3
shirts     2
shoes      3
suits      2
glasses    2
dtype: int64
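
`isnull()` returns a Boolean frame, so the same idea extends beyond counting. A sketch on a hypothetical frame with the same pattern of gaps: taking the mean of the Boolean values gives the share of missing entries per column, and `any(axis=1)` picks out the rows that contain at least one NaN.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with one NaN in 'shirts' and one in 'glasses'.
stores = pd.DataFrame({'bikes': [20, 15, 20],
                       'shirts': [15.0, 2.0, np.nan],
                       'glasses': [np.nan, 50.0, 4.0]},
                      index=['store 1', 'store 2', 'store 3'])

# True counts as 1 and False as 0, so the mean is the fraction of missing values.
missing_share = stores.isnull().mean()
print(missing_share)

# Keep only the rows that have at least one missing value.
rows_with_gaps = stores[stores.isnull().any(axis=1)]
print(rows_with_gaps)
```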

In [2003]:
store_items.dropna(axis=0)

|  | bikes | pants | watches | shirts | shoes | suits | glasses |
| --- | --- | --- | --- | --- | --- | --- | --- |
| store 2 | 15 | 5 | 10 | 2.0 | 5 | 7.0 | 50.0 |


In [2004]:
store_items.dropna(axis=1)

|  | bikes | pants | watches | shoes |
| --- | --- | --- | --- | --- |
| store 1 | 20 | 30 | 35 | 8 |
| store 2 | 15 | 5 | 10 | 5 |
| store 3 | 20 | 30 | 35 | 10 |


In [2005]:
##store_items.dropna(axis=1, inplace=True)
##store_items

In [2006]:
# store_items = store_items.fillna(0)
# store_items

In [2007]:
# ffill() replaces the deprecated fillna(method='ffill'); axis=0 fills down each column
store_items = store_items.ffill(axis=0)
store_items

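|  | bikes | pants | watches | shirts | shoes | suits | glasses |
| --- | --- | --- | --- | --- | --- | --- | --- |
| store 1 | 20 | 30 | 35 | 15.0 | 8 | 45.0 | NaN |
| store 2 | 15 | 5 | 10 | 2.0 | 5 | 7.0 | 50.0 |
| store 3 | 20 | 30 | 35 | 2.0 | 10 | 7.0 | 4.0 |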


In [2008]:
# axis=1 fills along each row, carrying the value from the column to the left
store_items = store_items.ffill(axis=1)
store_items

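|  | bikes | pants | watches | shirts | shoes | suits | glasses |
| --- | --- | --- | --- | --- | --- | --- | --- |
| store 1 | 20.0 | 30.0 | 35.0 | 15.0 | 8.0 | 45.0 | 45.0 |
| store 2 | 15.0 | 5.0 | 10.0 | 2.0 | 5.0 | 7.0 | 50.0 |
| store 3 | 20.0 | 30.0 | 35.0 | 2.0 | 10.0 | 7.0 | 4.0 |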
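
Forward filling is only one option, and whether it makes sense depends on the data. The sketch below compares three other common choices on a hypothetical frame: a constant, the column mean, and a backward fill. None of these is presented as the right answer for the store data above.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with one gap per column.
stores = pd.DataFrame({'shirts': [15.0, 2.0, np.nan],
                       'glasses': [np.nan, 50.0, 4.0]},
                      index=['store 1', 'store 2', 'store 3'])

# Replace every NaN with a constant.
zero_filled = stores.fillna(0)

# Replace each NaN with the mean of its own column (NaN is ignored by mean()).
mean_filled = stores.fillna(stores.mean())

# Backward fill down each column: take the next valid value below.
# A NaN in the last row has nothing below it and stays NaN.
back_filled = stores.bfill(axis=0)

print(mean_filled)
print(back_filled)
```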


##### LOADING DATA INTO A PANDAS DATAFRAME

In [2009]:
google_stock = pd.read_csv('./GOOG.csv')
google_stock

|  | Date | Open | High | Low | Close | Adj Close | Volume |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | 2004-08-20 | 50.316402 | 54.336334 | 50.062355 | 53.952770 | 53.952770 | 22942874 |
| 1 | 2004-08-23 | 55.168217 | 56.528118 | 54.321388 | 54.495735 | 54.495735 | 18342897 |
| 2 | 2004-08-24 | 55.412300 | 55.591629 | 51.591621 | 52.239197 | 52.239197 | 15319808 |
| 3 | 2004-08-25 | 52.284027 | 53.798351 | 51.746044 | 52.802086 | 52.802086 | 9232276 |
| 4 | 2004-08-26 | 52.279045 | 53.773445 | 52.134586 | 53.753517 | 53.753517 | 7128620 |
| ... | ... | ... | ... | ... | ... | ... | ... |
| 4211 | 2021-05-13 | 2261.090088 | 2276.601074 | 2242.719971 | 2261.969971 | 2261.969971 | 1333500 |
| 4212 | 2021-05-14 | 2291.830078 | 2321.139893 | 2283.320068 | 2316.159912 | 2316.159912 | 1330100 |
| 4213 | 2021-05-17 | 2309.320068 | 2323.340088 | 2295.000000 | 2321.409912 | 2321.409912 | 992100 |
| 4214 | 2021-05-18 | 2336.906006 | 2343.149902 | 2303.159912 | 2303.429932 | 2303.429932 | 864400 |


In [2010]:
google_stock.head()

|  | Date | Open | High | Low | Close | Adj Close | Volume |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | 2004-08-20 | 50.316402 | 54.336334 | 50.062355 | 53.95277 | 53.95277 | 22942874 |
| 1 | 2004-08-23 | 55.168217 | 56.528118 | 54.321388 | 54.495735 | 54.495735 | 18342897 |
| 2 | 2004-08-24 | 55.4123 | 55.591629 | 51.591621 | 52.239197 | 52.239197 | 15319808 |
| 3 | 2004-08-25 | 52.284027 | 53.798351 | 51.746044 | 52.802086 | 52.802086 | 9232276 |
| 4 | 2004-08-26 | 52.279045 | 53.773445 | 52.134586 | 53.753517 | 53.753517 | 7128620 |


In [2011]:
google_stock.tail()

|  | Date | Open | High | Low | Close | Adj Close | Volume |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 4211 | 2021-05-13 | 2261.090088 | 2276.601074 | 2242.719971 | 2261.969971 | 2261.969971 | 1333500 |
| 4212 | 2021-05-14 | 2291.830078 | 2321.139893 | 2283.320068 | 2316.159912 | 2316.159912 | 1330100 |
| 4213 | 2021-05-17 | 2309.320068 | 2323.340088 | 2295.0 | 2321.409912 | 2321.409912 | 992100 |
| 4214 | 2021-05-18 | 2336.906006 | 2343.149902 | 2303.159912 | 2303.429932 | 2303.429932 | 864400 |
| 4215 | 2021-05-19 | 2264.399902 | 2316.76001 | 2263.52002 | 2308.709961 | 2308.709961 | 943546 |
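
By default `read_csv` keeps the `Date` column as plain strings and numbers the rows from 0. For time series it is often convenient to parse the dates and use them as the index. The sketch below does this with the `parse_dates` and `index_col` parameters; it reads two rows copied from the output above out of an in-memory buffer, which stands in for the CSV file so the example runs without it.

```python
import io

import pandas as pd

# Two rows taken from the output above; the buffer is a stand-in for the file.
csv_text = """Date,Open,High,Low,Close,Adj Close,Volume
2004-08-20,50.316402,54.336334,50.062355,53.952770,53.952770,22942874
2004-08-23,55.168217,56.528118,54.321388,54.495735,54.495735,18342897
"""

# Parse the Date column as datetimes and use it as the row index.
stock = pd.read_csv(io.StringIO(csv_text), parse_dates=['Date'], index_col='Date')
print(stock.index)

# Rows can now be looked up by date.
volume = stock.loc[pd.Timestamp('2004-08-23'), 'Volume']
print(volume)  # 18342897
```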


In [2012]:
google_stock.isnull().any()

Date         False
Open         False
High         False
Low          False
Close        False
Adj Close    False
Volume       False
dtype: bool

In [2013]:
google_stock['Adj Close'].describe()

count    4216.000000
mean      585.575104
std       464.904377
min        49.818268
25%       242.594445
50%       372.082840
75%       829.807526
max      2429.889893
Name: Adj Close, dtype: float64

In [2014]:
google_stock.describe()

|  | Open | High | Low | Close | Adj Close | Volume |
| --- | --- | --- | --- | --- | --- | --- |
| count | 4216.0 | 4216.0 | 4216.0 | 4216.0 | 4216.0 | 4216.0 |
| mean | 585.391795 | 591.148159 | 579.689011 | 585.575104 | 585.575104 | 6643241.0 |
| std | 464.288073 | 469.273706 | 460.125847 | 464.904377 | 464.904377 | 7856374.0 |
| min | 49.409801 | 50.680038 | 49.285267 | 49.818268 | 49.818268 | 7922.0 |
| 25% | 242.767548 | 245.325462 | 240.0602 | 242.594445 | 242.594445 | 1647350.0 |
| 50% | 371.283341 | 376.573502 | 368.810104 | 372.08284 | 372.08284 | 3968118.0 |
| 75% | 828.707474 | 833.289993 | 825.232743 | 829.807526 | 829.807526 | 8480418.0 |
| max | 2410.330078 | 2452.37793 | 2402.280029 | 2429.889893 | 2429.889893 | 82541630.0 |


In [2015]:
google_stock.max()

Date          2021-05-19
Open         2410.330078
High          2452.37793
Low          2402.280029
Close        2429.889893
Adj Close    2429.889893
Volume          82541631
dtype: object

In [2016]:
google_stock.min()

Date         2004-08-20
Open          49.409801
High          50.680038
Low           49.285267
Close         49.818268
Adj Close     49.818268
Volume             7922
dtype: object

In [2017]:
google_stock['Close'].min()

49.818268

In [2018]:
google_stock.corr(numeric_only=True)  # skip the non-numeric Date column, which pandas 2.x no longer drops silently

|  | Open | High | Low | Close | Adj Close | Volume |
| --- | --- | --- | --- | --- | --- | --- |
| Open | 1.0 | 0.9999 | 0.999878 | 0.99978 | 0.99978 | -0.508078 |
| High | 0.9999 | 1.0 | 0.999845 | 0.999886 | 0.999886 | -0.506377 |
| Low | 0.999878 | 0.999845 | 1.0 | 0.999902 | 0.999902 | -0.509848 |
| Close | 0.99978 | 0.999886 | 0.999902 | 1.0 | 1.0 | -0.508197 |
| Adj Close | 0.99978 | 0.999886 | 0.999902 | 1.0 | 1.0 | -0.508197 |
| Volume | -0.508078 | -0.506377 | -0.509848 | -0.508197 | -0.508197 | 1.0 |
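
`corr()` computes the pairwise Pearson correlation between columns by default: values near 1 mean two columns rise and fall together, and values near -1 mean one rises as the other falls. The sketch below uses a small hypothetical frame (the numbers are made up for illustration) in which `Close` is `Open` plus a constant and `Volume` falls steadily. The `numeric_only=True` argument, available in pandas 1.5 and later, leaves the string `Date` column out of the calculation.

```python
import pandas as pd

# Hypothetical prices, invented for this illustration only.
prices = pd.DataFrame({'Date': ['2021-01-04', '2021-01-05', '2021-01-06', '2021-01-07'],
                       'Open': [10.0, 11.0, 12.0, 13.0],
                       'Close': [10.5, 11.5, 12.5, 13.5],
                       'Volume': [400, 300, 200, 100]})

# Pairwise Pearson correlation of the numeric columns.
corr = prices.corr(numeric_only=True)
print(corr)

open_close = corr.loc['Open', 'Close']    # perfectly linear relationship
open_volume = corr.loc['Open', 'Volume']  # perfectly inverse relationship
```

In this toy data the relationships are exactly linear, so the coefficients sit at the extremes; real price data, as in the output above, lands close to but not exactly at them.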
