# Operations

This section walks through some commonly used pandas operations on a small example DataFrame.

In [30]:
import pandas as pd
dataframe = pd.DataFrame({'custID':[1,2,3,4],'SaleType':['big','small','medium','big'],'SalesCode':['121','131','141','151']})
dataframe.head()

|   | SaleType | SalesCode | custID |
|---|----------|-----------|--------|
| 0 | big      | 121       | 1      |
| 1 | small    | 131       | 2      |
| 2 | medium   | 141       | 3      |
| 3 | big      | 151       | 4      |


### Info on Unique Values

In [8]:
dataframe['SaleType'].unique()

array(['big', 'small', 'medium'], dtype=object)

In [9]:
dataframe['SaleType'].nunique()

3

In [17]:
dataframe['SalesCode'].value_counts()

121    1
151    1
141    1
131    1
Name: SalesCode, dtype: int64
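
Every `SalesCode` in this frame is distinct, so the counts above are all 1. As a small sketch on the same data, `value_counts()` is more revealing on a column with repeated values such as `SaleType`. The `normalize=True` argument returns proportions instead of raw counts; the variable names `counts` and `shares` are only illustrative.

```python
import pandas as pd

dataframe = pd.DataFrame({'custID': [1, 2, 3, 4],
                          'SaleType': ['big', 'small', 'medium', 'big'],
                          'SalesCode': ['121', '131', '141', '151']})

# Count how often each SaleType occurs; the most frequent value comes first
counts = dataframe['SaleType'].value_counts()

# Same tally expressed as a fraction of all rows
shares = dataframe['SaleType'].value_counts(normalize=True)

print(counts)
print(shares)
```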

### Selecting Data

In [18]:
# Select rows using criteria from multiple columns.
# Wrap each condition in parentheses and combine them with & (and) or | (or).
newdataframe = dataframe[(dataframe['custID']!=3) & (dataframe['SaleType']=='big')]

In [19]:
newdataframe

|   | SaleType | SalesCode | custID |
|---|----------|-----------|--------|
| 0 | big      | 121       | 1      |
| 3 | big      | 151       | 4      |
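
The selection above combines two conditions with `&`. As a minimal sketch on the same data, the other common ways to build a boolean mask are `|` (or), `~` (not) and `Series.isin`. The result names `either`, `not_big` and `small_or_medium` are made up for this example.

```python
import pandas as pd

dataframe = pd.DataFrame({'custID': [1, 2, 3, 4],
                          'SaleType': ['big', 'small', 'medium', 'big'],
                          'SalesCode': ['121', '131', '141', '151']})

# OR: rows where custID is 3 or SaleType is 'big'
either = dataframe[(dataframe['custID'] == 3) | (dataframe['SaleType'] == 'big')]

# NOT: rows whose SaleType is anything except 'big'
not_big = dataframe[~(dataframe['SaleType'] == 'big')]

# isin: rows whose SaleType matches any value in a list
small_or_medium = dataframe[dataframe['SaleType'].isin(['small', 'medium'])]

print(either)
print(not_big)
print(small_or_medium)
```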


### Applying Functions

In [20]:
def profit(a):
    return a*4

In [21]:
dataframe['custID'].apply(profit)

0     4
1     8
2    12
3    16
Name: custID, dtype: int64

In [23]:
dataframe['SaleType'].apply(len)

0    3
1    5
2    6
3    3
Name: SaleType, dtype: int64

In [24]:
dataframe['custID'].sum()

10
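
The `apply` calls above run a Python function on each element. As a hedged sketch, the same results can usually be obtained with a `lambda` or, for simple arithmetic and string lengths, with vectorized operations that avoid the per-element function call. The names `quadrupled`, `vectorized` and `lengths` are illustrative only.

```python
import pandas as pd

dataframe = pd.DataFrame({'custID': [1, 2, 3, 4],
                          'SaleType': ['big', 'small', 'medium', 'big'],
                          'SalesCode': ['121', '131', '141', '151']})

# A lambda avoids defining a named function for a one-off transformation
quadrupled = dataframe['custID'].apply(lambda a: a * 4)

# Vectorized arithmetic gives the same values without calling a function per row
vectorized = dataframe['custID'] * 4

# The .str accessor offers a vectorized counterpart to apply(len)
lengths = dataframe['SaleType'].str.len()

print(quadrupled.tolist())
print(vectorized.tolist())
print(lengths.tolist())
```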

**Permanently Removing a Column**

In [25]:
del dataframe['custID']

In [26]:
dataframe

|   | SaleType | SalesCode |
|---|----------|-----------|
| 0 | big      | 121       |
| 1 | small    | 131       |
| 2 | medium   | 141       |
| 3 | big      | 151       |
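
`del` modifies the DataFrame in place. When the original should stay intact, one alternative is `DataFrame.drop`, which returns a new DataFrame by default. The sketch below rebuilds the example data so it runs on its own; `without_id` is a name chosen for illustration.

```python
import pandas as pd

dataframe = pd.DataFrame({'custID': [1, 2, 3, 4],
                          'SaleType': ['big', 'small', 'medium', 'big'],
                          'SalesCode': ['121', '131', '141', '151']})

# drop returns a copy without the column; the original frame is not modified
without_id = dataframe.drop(columns=['custID'])

print(without_id.columns.tolist())
print(dataframe.columns.tolist())
```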


**Get column and index names:**

In [27]:
dataframe.columns

Index(['SaleType', 'SalesCode'], dtype='object')

In [28]:
dataframe.index

RangeIndex(start=0, stop=4, step=1)
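
Both attributes return pandas `Index` objects. As a brief sketch, `tolist()` converts them to plain Python lists when that is more convenient, and `shape` reports the number of rows and columns. The names `names` and `labels` are illustrative.

```python
import pandas as pd

dataframe = pd.DataFrame({'custID': [1, 2, 3, 4],
                          'SaleType': ['big', 'small', 'medium', 'big'],
                          'SalesCode': ['121', '131', '141', '151']})
del dataframe['custID']  # match the state of the DataFrame at this point

# Convert the Index objects to ordinary Python lists
names = dataframe.columns.tolist()
labels = dataframe.index.tolist()

print(names)
print(labels)
print(dataframe.shape)  # (number of rows, number of columns)
```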

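**Sorting and Ordering a DataFrame:**

The cells below assume the DataFrame was recreated by re-running the first cell (`In [30]`), so the `custID` column is present again.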

In [31]:
dataframe

|   | SaleType | SalesCode | custID |
|---|----------|-----------|--------|
| 0 | big      | 121       | 1      |
| 1 | small    | 131       | 2      |
| 2 | medium   | 141       | 3      |
| 3 | big      | 151       | 4      |


In [32]:
dataframe.sort_values(by='SaleType') # returns a sorted copy; inplace=False by default

|   | SaleType | SalesCode | custID |
|---|----------|-----------|--------|
| 0 | big      | 121       | 1      |
| 3 | big      | 151       | 4      |
| 2 | medium   | 141       | 3      |
| 1 | small    | 131       | 2      |
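
Sorting is not limited to one column or to ascending order. As a minimal sketch on the same data, `by` accepts a list of columns and `ascending` accepts a matching list of booleans; `ordered` is an illustrative name.

```python
import pandas as pd

dataframe = pd.DataFrame({'custID': [1, 2, 3, 4],
                          'SaleType': ['big', 'small', 'medium', 'big'],
                          'SalesCode': ['121', '131', '141', '151']})

# Sort by SaleType (A to Z), breaking ties by custID from largest to smallest
ordered = dataframe.sort_values(by=['SaleType', 'custID'],
                                ascending=[True, False])

print(ordered)
print(dataframe)  # the original row order is unchanged
```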


**Find or Check for Null Values**

In [35]:
dataframe.isnull()

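|   | SaleType | SalesCode | custID |
|---|----------|-----------|--------|
| 0 | False    | False     | False  |
| 1 | False    | False     | False  |
| 2 | False    | False     | False  |
| 3 | False    | False     | False  |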


In [36]:
# Drop rows that contain NaN values (there are none here, so every row is kept)
dataframe.dropna()

|   | SaleType | SalesCode | custID |
|---|----------|-----------|--------|
| 0 | big      | 121       | 1      |
| 1 | small    | 131       | 2      |
| 2 | medium   | 141       | 3      |
| 3 | big      | 151       | 4      |
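
The example frame has no missing values, so `isnull()` and `dropna()` have nothing to act on. The sketch below uses a small frame with gaps (named `sales` here purely for illustration) to show how the same calls behave when NaN values are present.

```python
import numpy as np
import pandas as pd

# Illustrative frame: two numeric columns with gaps, one complete text column
sales = pd.DataFrame({'Sale1': [5, np.nan, 10, np.nan],
                      'Sale2': [np.nan, 121, np.nan, 141],
                      'Sale3': ['XUI', 'VYU', 'NMA', 'IUY']})

# Number of missing values in each column
missing_per_column = sales.isnull().sum()

# Default dropna removes every row that has at least one NaN
rows_dropped = sales.dropna()

# axis=1 removes columns that contain a NaN instead of rows
cols_dropped = sales.dropna(axis=1)

# subset restricts the check to the listed columns
sale1_present = sales.dropna(subset=['Sale1'])

print(missing_per_column)
print(len(rows_dropped))
print(cols_dropped.columns.tolist())
print(sale1_present.index.tolist())
```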


**Filling in NaN values with something else:**

In [37]:
import numpy as np

In [38]:
dataframe = pd.DataFrame({'Sale1':[5,np.nan,10,np.nan],
                   'Sale2':[np.nan,121,np.nan,141],
                   'Sale3':['XUI','VYU','NMA','IUY']})
dataframe.head()

|   | Sale1 | Sale2 | Sale3 |
|---|-------|-------|-------|
| 0 | 5.0   | NaN   | XUI   |
| 1 | NaN   | 121.0 | VYU   |
| 2 | 10.0  | NaN   | NMA   |
| 3 | NaN   | 141.0 | IUY   |


In [39]:
dataframe.fillna('Not nan')

|   | Sale1   | Sale2   | Sale3 |
|---|---------|---------|-------|
| 0 | 5       | Not nan | XUI   |
| 1 | Not nan | 121     | VYU   |
| 2 | 10      | Not nan | NMA   |
| 3 | Not nan | 141     | IUY   |
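
Filling numeric columns with a string, as above, mixes text and numbers in the same column. As a hedged sketch of one alternative, `fillna` also accepts a dictionary that maps each column to its own replacement value, which keeps the columns numeric. The choice of the column mean for `Sale1` and `0` for `Sale2` is an assumption made only for this example.

```python
import numpy as np
import pandas as pd

dataframe = pd.DataFrame({'Sale1': [5, np.nan, 10, np.nan],
                          'Sale2': [np.nan, 121, np.nan, 141],
                          'Sale3': ['XUI', 'VYU', 'NMA', 'IUY']})

# Fill each column with its own value: the mean of Sale1, and 0 for Sale2
filled = dataframe.fillna({'Sale1': dataframe['Sale1'].mean(), 'Sale2': 0})

print(filled)
```

Like `sort_values`, `fillna` returns a new DataFrame by default, so `dataframe` itself still contains the NaN values.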


### The END