# Pandas for Data Science
* [Pandas](#0)
* [Pandas Data Structure](#1)
* [Input/Output](#2)
* [Pandas Help](#3)
* [Selection](#4)
* [Dropping](#5)
* [Sort and Rank](#6)
* [Retrieving Series/DataFrame Information](#7)
* [Applying Functions](#8)
* [Data Alignment](#9)

<a id="0"></a>
## Pandas
![Title](0.png)

In [1]:
import pandas as pd

<a id="1"></a>
## Pandas Data Structure
![Title](1.png)

In [3]:
series = pd.Series([10,20,30], index = ["c1","c2","c3"])
series

c1    10
c2    20
c3    30
dtype: int64

In [9]:
data = {"Football_team":["barcelona", "real_madrid"],
        "Footballer":["messi","ramos"],
        "money":[30.0,20.0]}
df = pd.DataFrame(data, columns=["Football_team", "Footballer","money" ])
df

| | Football_team | Footballer | money |
|---|---|---|---|
| 0 | barcelona | messi | 30.0 |
| 1 | real_madrid | ramos | 20.0 |
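
A DataFrame can be read as a collection of Series that share one index. The sketch below rebuilds the same frame and pulls one column out to show this; the variable name `money` is only illustrative.

```python
import pandas as pd

data = {"Football_team": ["barcelona", "real_madrid"],
        "Footballer": ["messi", "ramos"],
        "money": [30.0, 20.0]}
df = pd.DataFrame(data)

# Selecting a single column returns a Series that keeps the frame's index.
money = df["money"]
print(type(money).__name__)           # Series
print(money.index.equals(df.index))   # True
print(money.name)                     # money
```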


<a id="2"></a>
## Input/Output
![Title](2.png)

In [10]:
data = {"Football_team":["barcelona", "real_madrid"],
        "Footballer":["messi","ramos"],
        "money":[30.0,20.0]}
df = pd.DataFrame(data, columns=["Football_team", "Footballer","money" ])
df

| | Football_team | Footballer | money |
|---|---|---|---|
| 0 | barcelona | messi | 30.0 |
| 1 | real_madrid | ramos | 20.0 |


In [19]:
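df.to_csv("first.csv")  # the index is written as the first, unnamed column
df1 = pd.read_csv("first.csv", index_col=0)  # index_col=0 reads that column back as the index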
df1

| | Football_team | Footballer | money |
|---|---|---|---|
| 0 | barcelona | messi | 30.0 |
| 1 | real_madrid | ramos | 20.0 |
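
If the row labels carry no information, an alternative is to leave them out of the file altogether. The sketch below assumes that is acceptable and writes to an in-memory `io.StringIO` buffer rather than to disk so it can run anywhere; `buffer` and `restored` are illustrative names.

```python
import io

import pandas as pd

df = pd.DataFrame({"Football_team": ["barcelona", "real_madrid"],
                   "Footballer": ["messi", "ramos"],
                   "money": [30.0, 20.0]})

# index=False skips the row labels, so reading the data back does not
# produce an extra "Unnamed: 0" column and index_col is not needed.
buffer = io.StringIO()
df.to_csv(buffer, index=False)
buffer.seek(0)

restored = pd.read_csv(buffer)
print(list(restored.columns))
print(restored.equals(df))
```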


<a id="3"></a>
## Pandas Help
![Title](3.png)

In [20]:
help(pd.Series.loc)

Help on property:

    Access a group of rows and columns by label(s) or a boolean array.
    
    ``.loc[]`` is primarily label based, but may also be used with a
    boolean array.
    
    Allowed inputs are:
    
    - A single label, e.g. ``5`` or ``'a'``, (note that ``5`` is
      interpreted as a *label* of the index, and **never** as an
      integer position along the index).
    - A list or array of labels, e.g. ``['a', 'b', 'c']``.
    - A slice object with labels, e.g. ``'a':'f'``.
    
      .. warning:: Note that contrary to usual python slices, **both** the
          start and the stop are included
    
    - A boolean array of the same length as the axis being sliced,
      e.g. ``[True, False, True]``.
    - A ``callable`` function with one argument (the calling Series, DataFrame
      or Panel) and that returns valid output for indexing (one of the above)
    
    See more at :ref:`Selection by Label <indexing.label>`
    
    See Also
    --------
    DataFrame.at : Access a single value for a row/column label pair
    DataFrame.iloc : Acce
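
The allowed inputs listed in the help text can be tried on the small Series from the data structure section. This is a minimal sketch; the result names (`single`, `several`, and so on) are illustrative.

```python
import pandas as pd

series = pd.Series([10, 20, 30], index=["c1", "c2", "c3"])

single = series.loc["c2"]                    # a single label
several = series.loc[["c1", "c3"]]           # a list of labels
sliced = series.loc["c1":"c2"]               # a label slice: both ends are included
masked = series.loc[[True, False, True]]     # a boolean array as long as the index
by_callable = series.loc[lambda s: s > 15]   # a callable that returns a boolean mask

print(single)                 # 20
print(several.tolist())       # [10, 30]
print(sliced.tolist())        # [10, 20]
print(masked.tolist())        # [10, 30]
print(by_callable.tolist())   # [20, 30]
```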

<a id="4"></a>
## Selection
![Title](4.png)

In [25]:
df

| | Football_team | Footballer | money |
|---|---|---|---|
| 0 | barcelona | messi | 30.0 |
| 1 | real_madrid | ramos | 20.0 |


In [26]:
df["money"]

0    30.0
1    20.0
Name: money, dtype: float64

In [28]:
df[:1]  # a slice inside [] selects rows by position

| | Football_team | Footballer | money |
|---|---|---|---|
| 0 | barcelona | messi | 30.0 |


In [29]:
df.iloc[1, 1]  # by integer position: second row, second column

'ramos'

In [30]:
df.loc[1, "Footballer"]  # by label: row 1, column "Footballer"

'ramos'

In [35]:
# df.ix[1, 1]  # .ix was removed in pandas 1.0; use .loc (labels) or .iloc (positions) instead

In [38]:
df

| | Football_team | Footballer | money |
|---|---|---|---|
| 0 | barcelona | messi | 30.0 |
| 1 | real_madrid | ramos | 20.0 |


In [39]:
filter_ = df["money"] < 25  # boolean mask: True where money is below 25
df[filter_]

| | Football_team | Footballer | money |
|---|---|---|---|
| 1 | real_madrid | ramos | 20.0 |
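
Boolean masks like the one above can be combined or negated before they are used for selection. A minimal sketch on the same data; the mask names `cheap` and `in_madrid` are illustrative.

```python
import pandas as pd

df = pd.DataFrame({"Football_team": ["barcelona", "real_madrid"],
                   "Footballer": ["messi", "ramos"],
                   "money": [30.0, 20.0]})

cheap = df["money"] < 25
in_madrid = df["Football_team"] == "real_madrid"

both = df[cheap & in_madrid]          # element-wise AND of two masks
not_cheap = df[~cheap]                # element-wise NOT
names = df.loc[cheap, "Footballer"]   # filter rows and pick one column in a single step

print(both.index.tolist())        # [1]
print(not_cheap.index.tolist())   # [0]
print(names.tolist())             # ['ramos']
```

Use `&`, `|` and `~` rather than `and`, `or` and `not`, and wrap comparisons written inline in parentheses, for example `df[(df["money"] < 25) & (df["money"] > 10)]`.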


In [44]:
series

c1    10
c2    20
c3    30
dtype: int64

In [46]:
series["c2"] = 100
series

c1     10
c2    100
c3     30
dtype: int64

<a id="5"></a>
## Dropping
![Title](5.png)

In [57]:
data = {"Football_team":["barcelona", "real_madrid"],
        "Footballer":["messi","ramos"],
        "money":[30.0,20.0]}
df = pd.DataFrame(data, columns=["Football_team", "Footballer","money" ])
df

| | Football_team | Footballer | money |
|---|---|---|---|
| 0 | barcelona | messi | 30.0 |
| 1 | real_madrid | ramos | 20.0 |


In [59]:
df.drop([0], inplace=True)  # drop the row labelled 0, modifying df in place
df

| | Football_team | Footballer | money |
|---|---|---|---|
| 1 | real_madrid | ramos | 20.0 |


In [60]:
df.drop(["money"], axis=1, inplace=True)  # axis=1 drops a column instead of a row
df

| | Football_team | Footballer |
|---|---|---|
| 1 | real_madrid | ramos |
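
`drop` can also return a new frame and leave the original untouched, which avoids having to rebuild `df` afterwards. A minimal sketch using the `index=` and `columns=` keywords; the result names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({"Football_team": ["barcelona", "real_madrid"],
                   "Footballer": ["messi", "ramos"],
                   "money": [30.0, 20.0]})

# Without inplace=True, drop returns a modified copy.
without_first_row = df.drop(index=0)
without_money = df.drop(columns="money")

print(without_first_row.index.tolist())   # [1]
print(list(without_money.columns))        # ['Football_team', 'Footballer']
print(df.shape)                           # (2, 3): the original is unchanged
```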


<a id="6"></a>
## Sort and Rank
![Title](6.png)

In [61]:
data = {"Football_team":["barcelona", "real_madrid"],
        "Footballer":["messi","ramos"],
        "money":[30.0,20.0]}
df = pd.DataFrame(data, columns=["Football_team", "Footballer","money" ])
df

| | Football_team | Footballer | money |
|---|---|---|---|
| 0 | barcelona | messi | 30.0 |
| 1 | real_madrid | ramos | 20.0 |


In [68]:
df = df.sort_values(by="money")  # ascending by default; row labels move with their rows
df

| | Football_team | Footballer | money |
|---|---|---|---|
| 1 | real_madrid | ramos | 20.0 |
| 0 | barcelona | messi | 30.0 |


In [69]:
df = df.sort_index()  # sort by row label, restoring the original order
df

| | Football_team | Footballer | money |
|---|---|---|---|
| 0 | barcelona | messi | 30.0 |
| 1 | real_madrid | ramos | 20.0 |


In [71]:
df.rank()  # rank of each value within its column (1 = smallest)

| | Football_team | Footballer | money |
|---|---|---|---|
| 0 | 1.0 | 1.0 | 2.0 |
| 1 | 2.0 | 2.0 | 1.0 |
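
Two common variations on the cells above are sorting with fresh row labels and ranking in descending order. A minimal sketch, assuming pandas 1.0 or later for `ignore_index`; `sorted_df` and `money_rank` are illustrative names.

```python
import pandas as pd

df = pd.DataFrame({"Football_team": ["barcelona", "real_madrid"],
                   "Footballer": ["messi", "ramos"],
                   "money": [30.0, 20.0]})

# ignore_index=True relabels the sorted rows 0..n-1 instead of keeping
# the original labels.
sorted_df = df.sort_values(by="money", ignore_index=True)
print(sorted_df["Footballer"].tolist())   # ['ramos', 'messi']
print(sorted_df.index.tolist())           # [0, 1]

# Rank a single column; ascending=False gives rank 1 to the largest value.
money_rank = df["money"].rank(ascending=False)
print(money_rank.tolist())                # [1.0, 2.0]
```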


<a id="7"></a>
## Retrieving Series/DataFrame Information
![Title](7.png)

In [72]:
df

| | Football_team | Footballer | money |
|---|---|---|---|
| 0 | barcelona | messi | 30.0 |
| 1 | real_madrid | ramos | 20.0 |


In [73]:
df.shape

(2, 3)

In [74]:
df.index

Int64Index([0, 1], dtype='int64')

In [75]:
df.columns

Index(['Football_team', 'Footballer', 'money'], dtype='object')

In [76]:
df.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 2 entries, 0 to 1
Data columns (total 3 columns):
Football_team    2 non-null object
Footballer       2 non-null object
money            2 non-null float64
dtypes: float64(1), object(2)
memory usage: 64.0+ bytes


In [78]:
df.count()  # number of non-NaN values in each column

Football_team    2
Footballer       2
money            2
dtype: int64

In [80]:
df

| | Football_team | Footballer | money |
|---|---|---|---|
| 0 | barcelona | messi | 30.0 |
| 1 | real_madrid | ramos | 20.0 |


In [79]:
df.sum()

Football_team    barcelonareal_madrid
Footballer                 messiramos
money                              50
dtype: object

In [82]:
df.max()

Football_team    real_madrid
Footballer             ramos
money                     30
dtype: object

In [89]:
df.index.argmax()  # integer position of the largest index label

1

In [90]:
df.describe()

| | money |
|---|---|
| count | 2.0 |
| mean | 25.0 |
| std | 7.071068 |
| min | 20.0 |
| 25% | 22.5 |
| 50% | 25.0 |
| 75% | 27.5 |
| max | 30.0 |


In [92]:
df.mean(numeric_only=True)  # without numeric_only=True, pandas 2.0+ raises a TypeError on the string columns

money    25.0
dtype: float64

In [93]:
df.median(numeric_only=True)  # same restriction to numeric columns as for mean

money    25.0
dtype: float64
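
A few related calls are often useful alongside these summaries. The sketch below assumes the same two-row frame; `df_missing` is a hypothetical copy with one value replaced by NaN, added only to show how missing data affects `count` and `mean`.

```python
import pandas as pd

df = pd.DataFrame({"Football_team": ["barcelona", "real_madrid"],
                   "Footballer": ["messi", "ramos"],
                   "money": [30.0, 20.0]})

# numeric_only=True restricts a reduction to the numeric columns.
numeric_sum = df.sum(numeric_only=True)
print(numeric_sum["money"])               # 50.0

# idxmax returns the index label of the largest value in a column.
top_label = df["money"].idxmax()
print(df.loc[top_label, "Footballer"])    # messi

# count() and mean() skip NaN by default.
df_missing = df.copy()
df_missing.loc[1, "money"] = float("nan")
print(df_missing.count().tolist())        # [2, 2, 1]
print(df_missing["money"].mean())         # 30.0
```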

<a id="8"></a>
## Applying Functions
![Title](8.png)

In [94]:
df

| | Football_team | Footballer | money |
|---|---|---|---|
| 0 | barcelona | messi | 30.0 |
| 1 | real_madrid | ramos | 20.0 |


In [95]:
f = lambda x: x * 2
df.apply(f)  # f receives each column as a Series

| | Football_team | Footballer | money |
|---|---|---|---|
| 0 | barcelonabarcelona | messimessi | 60.0 |
| 1 | real_madridreal_madrid | ramosramos | 40.0 |


In [97]:
df.applymap(f)  # f receives each element; applymap was renamed to DataFrame.map in pandas 2.1

| | Football_team | Footballer | money |
|---|---|---|---|
| 0 | barcelonabarcelona | messimessi | 60.0 |
| 1 | real_madridreal_madrid | ramosramos | 40.0 |
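
Both calls give the same result for `x * 2` because doubling a whole column and doubling each element amount to the same thing. The difference shows up when the function looks at what it is given. A minimal sketch; the `elementwise` helper is an assumption added here to pick whichever of `DataFrame.map` or `DataFrame.applymap` the installed pandas provides.

```python
import pandas as pd

df = pd.DataFrame({"Football_team": ["barcelona", "real_madrid"],
                   "Footballer": ["messi", "ramos"],
                   "money": [30.0, 20.0]})

# apply passes each column to the function as a Series (axis=0 is the default).
received = df.apply(lambda col: type(col).__name__)
print(received.tolist())   # ['Series', 'Series', 'Series']

# With axis=1 the function receives one row at a time.
labels = df.apply(lambda row: f"{row['Footballer']} ({row['Football_team']})", axis=1)
print(labels.tolist())     # ['messi (barcelona)', 'ramos (real_madrid)']

# Element-wise application: DataFrame.map in pandas 2.1+, applymap before that.
elementwise = df.map if hasattr(df, "map") else df.applymap
lengths = elementwise(lambda value: len(str(value)))
print(lengths.values.tolist())   # [[9, 5, 4], [11, 5, 4]]
```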


<a id="9"></a>
## Data Alignment
![Title](9.png)

In [124]:
series = pd.Series([10,20,30], index = ["c1","c2","c3"])
series

c1    10
c2    20
c3    30
dtype: int64

In [125]:
s = pd.Series([1,2], index = ["c1","c2"])
s


c1    1
c2    2
dtype: int64

In [126]:
series + s

c1    11.0
c2    22.0
c3     NaN
dtype: float64

In [128]:
s.add(series, fill_value=0)  # a label missing from one Series is treated as 0 instead of producing NaN

c1    11.0
c2    22.0
c3    30.0
dtype: float64
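
The other arithmetic methods align on labels in the same way. A minimal sketch with `sub` and `mul`, plus `dropna` for the case where only labels present in both Series should be kept; the result names are illustrative.

```python
import pandas as pd

series = pd.Series([10, 20, 30], index=["c1", "c2", "c3"])
s = pd.Series([1, 2], index=["c1", "c2"])

# fill_value stands in for a label that is missing from one of the operands.
diff = series.sub(s, fill_value=0)
prod = series.mul(s, fill_value=1)
print(diff.tolist())   # [9.0, 18.0, 30.0]
print(prod.tolist())   # [10.0, 40.0, 30.0]

# Keep only the labels that exist in both Series.
common = (series + s).dropna()
print(common.index.tolist())   # ['c1', 'c2']
print(common.tolist())         # [11.0, 22.0]
```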