# IoT & Smart Analytics
## A Program by IIIT-H and TalentSprint

## Learning Objectives
At the end of the experiment, you will be able to:
* Understand and apply frequently used Pandas functions and operations

#### Importing Required Packages

In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
print(np.__version__)         # Checking version of numpy
print(pd.__version__)         # checking version of pandas

1.21.5
1.3.5


### Two Data Structures in Pandas: A. Series & B. DataFrame
### A. Series
* It is a one-dimensional, NumPy-array-like data structure with an index label attached to each element.
* It behaves like a dictionary (the index labels act as keys) and can be used in many contexts where a dictionary might be used.
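
The dictionary-like behaviour described above can be sketched as follows. The `stock` Series and its values are hypothetical and exist only for this illustration.

```python
import pandas as pd

# Hypothetical stock counts, used only for illustration.
stock = pd.Series({'Apple': 5, 'Mango': 10, 'Banana': 3})

has_mango = 'Mango' in stock        # membership test checks the index labels, like dict keys
mango_count = stock['Mango']        # label lookup, like dict[key]
kiwi_count = stock.get('Kiwi', 0)   # .get() returns a default for a missing label
as_dict = stock.to_dict()           # convert back to a plain dictionary

print(has_mango, mango_count, kiwi_count)
print(as_dict)
```

Unlike a plain dictionary, the Series also keeps its elements in order and supports vectorised arithmetic.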
### Making Series

In [2]:
s1=pd.Series([10,20,30,40,50])  # Series with the default integer index, starting from zero.
print(s1)
print(s1.index)

0    10
1    20
2    30
3    40
4    50
dtype: int64
RangeIndex(start=0, stop=5, step=1)


In [3]:
# Making a Series with user-defined index values.
s2=pd.Series([4.5,7.2,-5.3,3.6],index=['d','b','a','c'])
s2

d    4.5
b    7.2
a   -5.3
c    3.6
dtype: float64

In [4]:
# Making Series from Dictionary
fruit={'Apple':5,'Mango':10,'Banana':3,'Grape':8,'Orange':3}
print(fruit)
s_fruit1=pd.Series(fruit)
print(s_fruit1)

{'Apple': 5, 'Mango': 10, 'Banana': 3, 'Grape': 8, 'Orange': 3}
Apple      5
Mango     10
Banana     3
Grape      8
Orange     3
dtype: int64
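
When building a Series from a dictionary, an explicit `index` can also be passed to pick and order the keys. The sketch below assumes a shortened version of the `fruit` dictionary; the label `'Kiwi'` is a hypothetical key that is deliberately missing from it.

```python
import pandas as pd

fruit = {'Apple': 5, 'Mango': 10, 'Banana': 3}

# Only the requested labels are kept, in the requested order.
# 'Kiwi' is not a key of the dictionary, so its value becomes NaN.
wanted = ['Mango', 'Apple', 'Kiwi']
s_selected = pd.Series(fruit, index=wanted)
print(s_selected)
```

Because NaN is a floating-point value, the resulting Series is stored as float64 rather than int64.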


In [5]:
# Define a new index by assigning a list of labels to the .index attribute of the Series.
s1=pd.Series([1,2,3,4,5])
print(s1)
s1.index=['Bob','John','Adam','Mathew','Smith']
s1.index.name='Name of student'
s1.name='Number of absentism of students'
s1

0    1
1    2
2    3
3    4
4    5
dtype: int64


Name of student
Bob       1
John      2
Adam      3
Mathew    4
Smith     5
Name: Number of absentism of students, dtype: int64

### B. DataFrame 
* It is a rectangular table of data containing an ordered collection of columns, each of which can hold a different value type (numeric, string, boolean, etc.). It has both a row index and a column index.


### Making DataFrame 

In [6]:
# A dictionary of equal-length lists can be used to construct a DataFrame
data={'Name':['Ramen','Ankur','Vinayak','Rahul','Divya','Sarthak','Adam'],
      'Subject Stream':['English','Maths','Biology','Physics','Computing','English','Math'],
      'Score':[435,234,986,562,12,600,900]}
data

{'Name': ['Ramen', 'Ankur', 'Vinayak', 'Rahul', 'Divya', 'Sarthak', 'Adam'],
 'Score': [435, 234, 986, 562, 12, 600, 900],
 'Subject Stream': ['English',
  'Maths',
  'Biology',
  'Physics',
  'Computing',
  'English',
  'Math']}

In [7]:
f1=pd.DataFrame(data) # Passing the above dict to make a DataFrame; the index is added automatically and can be edited as in a Series
f1        # Each dictionary key becomes a column header and each list becomes the values of that column

Unnamed: 0,Name,Subject Stream,Score
0,Ramen,English,435
1,Ankur,Maths,234
2,Vinayak,Biology,986
3,Rahul,Physics,562
4,Divya,Computing,12
5,Sarthak,English,600
6,Adam,Math,900


In [8]:
### Making a DataFrame by passing data, columns and index separately

data=np.arange(1,13).reshape(3,4) # Using NumPy to make an array of shape (3,4)
print(data,'\n')
f2=pd.DataFrame(data, columns=['A','B','C','D'],index=['X','Y','Z']) # Assigning column and row labels while making the DataFrame
f2

[[ 1  2  3  4]
 [ 5  6  7  8]
 [ 9 10 11 12]] 



Unnamed: 0,A,B,C,D
X,1,2,3,4
Y,5,6,7,8
Z,9,10,11,12


### Getting row and column index and converting it into a list

In [9]:
f2.index

Index(['X', 'Y', 'Z'], dtype='object')

In [10]:
f2.index.to_list()

['X', 'Y', 'Z']

In [11]:
f2.columns

Index(['A', 'B', 'C', 'D'], dtype='object')

In [12]:
f2.columns.to_list()

['A', 'B', 'C', 'D']

##### Similarly, .values returns all the data, without the row and column index, as a 2D NumPy array.

In [13]:
f2.values

array([[ 1,  2,  3,  4],
       [ 5,  6,  7,  8],
       [ 9, 10, 11, 12]])
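
As a small aside, pandas also provides the `.to_numpy()` method, which returns the same kind of array as `.values`. The sketch below rebuilds a frame like `f2` under the hypothetical name `grid`.

```python
import numpy as np
import pandas as pd

# Same shape and labels as the f2 example; 'grid' is a name used only here.
grid = pd.DataFrame(np.arange(1, 13).reshape(3, 4),
                    columns=['A', 'B', 'C', 'D'],
                    index=['X', 'Y', 'Z'])

as_array = grid.to_numpy()   # 2D NumPy array without row or column labels
print(type(as_array), as_array.shape)
```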

### Renaming Column and  Row index:  .rename()

In [14]:
f2=pd.DataFrame(np.arange(1,13).reshape(3,4), columns=['A','B','C','D'],index=['X','Y','Z'])# Assigning column and index for making DataFrame
f2

Unnamed: 0,A,B,C,D
X,1,2,3,4
Y,5,6,7,8
Z,9,10,11,12


In [15]:
# Changing column names
f2.rename({'A':'Apple','B':'Banana','C':'Cherry','D':'Dates'},axis=1,inplace=True)
f2   # Pass a dictionary whose keys are the old labels and whose values are the new labels; axis=1 targets columns.
     # inplace=True modifies f2 itself. Without this argument, rename() returns a new, renamed DataFrame
     # and f2 remains unchanged.

Unnamed: 0,Apple,Banana,Cherry,Dates
X,1,2,3,4
Y,5,6,7,8
Z,9,10,11,12
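
The effect of leaving out `inplace=True` can be checked with a minimal sketch. The frame `table` is hypothetical; the `columns=` and `index=` keywords of `rename()` are used here as an alternative to `axis=`.

```python
import numpy as np
import pandas as pd

# Hypothetical 2x3 frame, used only for illustration.
table = pd.DataFrame(np.arange(1, 7).reshape(2, 3),
                     columns=['A', 'B', 'C'],
                     index=['X', 'Y'])

# Without inplace=True, rename() returns a new DataFrame.
renamed = table.rename(columns={'A': 'Apple'}, index={'X': 'X-Man'})

print(table.columns.to_list())     # the original labels are untouched
print(renamed.columns.to_list())   # the copy carries the new labels
```

Assigning the result to a new name, as done here, keeps the original frame available for comparison.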


In [16]:
# Changing row labels works the same way, with axis=0 for rows.
f2.rename({'X':'X-Man','Y':'Yo-Yo','Z':'Zen_Monk'},axis=0,inplace=True)
f2

Unnamed: 0,Apple,Banana,Cherry,Dates
X-Man,1,2,3,4
Yo-Yo,5,6,7,8
Zen_Monk,9,10,11,12


### Adding columns to an existing DataFrame

In [17]:
# Using f1 DataFrame again:
# Using dictionary for making DataFrame f1
data={'Name':['Ramen','Ankur','Vinayak','Rahul','Divya','Sarthak','Adam'],
      'Subject Stream':['English','Maths','Biology','Physics',
                        'Computing','English','Math'],
      'Score':[435,234,986,562,12,600,900]}
f1=pd.DataFrame(data) # Passing the above dict to make a DataFrame; the index is added automatically and can be edited as in a Series
f1 

Unnamed: 0,Name,Subject Stream,Score
0,Ramen,English,435
1,Ankur,Maths,234
2,Vinayak,Biology,986
3,Rahul,Physics,562
4,Divya,Computing,12
5,Sarthak,English,600
6,Adam,Math,900


In [18]:
# Adding single column
f1['Remarks']=['G','BA','Exceptional','Average','Fail','AA','Excellent']
f1

Unnamed: 0,Name,Subject Stream,Score,Remarks
0,Ramen,English,435,G
1,Ankur,Maths,234,BA
2,Vinayak,Biology,986,Exceptional
3,Rahul,Physics,562,Average
4,Divya,Computing,12,Fail
5,Sarthak,English,600,AA
6,Adam,Math,900,Excellent


In [19]:
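# Adding multiple columns
# Note: np.array() coerces the mixed strings and integers to strings, so CGPA is stored as text (object dtype), not as numbers.
f1[['Grade','CGPA']]=pd.DataFrame(np.array([['C+','C','A+','B','F','B+','A'],[6,5,10,7,4,8,9]]).T)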
f1

Unnamed: 0,Name,Subject Stream,Score,Remarks,Grade,CGPA
0,Ramen,English,435,G,C+,6
1,Ankur,Maths,234,BA,C,5
2,Vinayak,Biology,986,Exceptional,A+,10
3,Rahul,Physics,562,Average,B,7
4,Divya,Computing,12,Fail,F,4
5,Sarthak,English,600,AA,B+,8
6,Adam,Math,900,Excellent,A,9
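
Building both new columns from a single NumPy array turns the numbers into strings. A possible alternative, sketched below on a hypothetical three-row frame `marks`, is `DataFrame.assign()`, which takes each column from its own list so that each keeps its own dtype.

```python
import numpy as np
import pandas as pd

# Hypothetical three-row frame, used only for illustration.
marks = pd.DataFrame({'Name': ['Ramen', 'Ankur', 'Vinayak']})

# Mixing strings and integers in one NumPy array coerces everything to strings.
mixed = np.array([['C+', 'C', 'A+'], [6, 5, 10]]).T
via_array = marks.copy()
via_array[['Grade', 'CGPA']] = pd.DataFrame(mixed)

# assign() returns a new frame in which CGPA stays an integer column.
via_assign = marks.assign(Grade=['C+', 'C', 'A+'], CGPA=[6, 5, 10])

print(via_array.dtypes)
print(via_assign.dtypes)
```

Keeping CGPA numeric matters as soon as it is used in arithmetic or in statistics such as `mean()`.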


### Making a unique column the row index: .set_index('column name') and resetting it again: .reset_index()

In [20]:
# Using f1 DataFrame for this operation
df=f1.copy()  # This DataFrame has the default index. We now want to replace it with a column of unique values.
df

Unnamed: 0,Name,Subject Stream,Score,Remarks,Grade,CGPA
0,Ramen,English,435,G,C+,6
1,Ankur,Maths,234,BA,C,5
2,Vinayak,Biology,986,Exceptional,A+,10
3,Rahul,Physics,562,Average,B,7
4,Divya,Computing,12,Fail,F,4
5,Sarthak,English,600,AA,B+,8
6,Adam,Math,900,Excellent,A,9


In [21]:
df['Name'].is_unique # Checking whether the Name column has unique entries

True

* DataFrame.set_index(keys, drop=True, append=False, inplace=False, verify_integrity=False)
* drop --> whether to remove the column used as the new index from the DataFrame's columns. Default --> True
* append --> whether to append the new index to the existing index. Default --> False

In [22]:
df.set_index('Name',drop=True,append=False,inplace=True) # Now making Name column as index
df

Name,Subject Stream,Score,Remarks,Grade,CGPA
Ramen,English,435,G,C+,6
Ankur,Maths,234,BA,C,5
Vinayak,Biology,986,Exceptional,A+,10
Rahul,Physics,562,Average,B,7
Divya,Computing,12,Fail,F,4
Sarthak,English,600,AA,B+,8
Adam,Math,900,Excellent,A,9


In [23]:
# Resetting the index
df.reset_index(drop=False,inplace=True)
# This restores the default integer index. drop is False by default, so the old index returns as a column; if True, the old index is discarded
df

Unnamed: 0,Name,Subject Stream,Score,Remarks,Grade,CGPA
0,Ramen,English,435,G,C+,6
1,Ankur,Maths,234,BA,C,5
2,Vinayak,Biology,986,Exceptional,A+,10
3,Rahul,Physics,562,Average,B,7
4,Divya,Computing,12,Fail,F,4
5,Sarthak,English,600,AA,B+,8
6,Adam,Math,900,Excellent,A,9
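
The non-default values of `drop` and `verify_integrity` can be explored with a short sketch. The frames `students` and `dupes` are hypothetical and used only for illustration.

```python
import pandas as pd

# Hypothetical frames, used only for illustration.
students = pd.DataFrame({'Name': ['Ramen', 'Ankur', 'Divya'],
                         'Score': [435, 234, 12]})

# drop=False keeps 'Name' as a column in addition to using it as the index.
kept = students.set_index('Name', drop=False)

# reset_index(drop=True) discards the old index instead of restoring it as a column.
discarded = students.set_index('Name').reset_index(drop=True)

# verify_integrity=True makes set_index raise if the new index has duplicates.
dupes = pd.DataFrame({'Name': ['Adam', 'Adam'], 'Score': [1, 2]})
try:
    dupes.set_index('Name', verify_integrity=True)
    duplicate_rejected = False
except ValueError:
    duplicate_rejected = True

print(kept.columns.to_list(), discarded.columns.to_list(), duplicate_rejected)
```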


### Getting information about the DataFrame

In [24]:
df.head()

Unnamed: 0,Name,Subject Stream,Score,Remarks,Grade,CGPA
0,Ramen,English,435,G,C+,6
1,Ankur,Maths,234,BA,C,5
2,Vinayak,Biology,986,Exceptional,A+,10
3,Rahul,Physics,562,Average,B,7
4,Divya,Computing,12,Fail,F,4


In [25]:
df.tail()

Unnamed: 0,Name,Subject Stream,Score,Remarks,Grade,CGPA
2,Vinayak,Biology,986,Exceptional,A+,10
3,Rahul,Physics,562,Average,B,7
4,Divya,Computing,12,Fail,F,4
5,Sarthak,English,600,AA,B+,8
6,Adam,Math,900,Excellent,A,9


In [26]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 7 entries, 0 to 6
Data columns (total 6 columns):
 #   Column          Non-Null Count  Dtype 
---  ------          --------------  ----- 
 0   Name            7 non-null      object
 1   Subject Stream  7 non-null      object
 2   Score           7 non-null      int64 
 3   Remarks         7 non-null      object
 4   Grade           7 non-null      object
 5   CGPA            7 non-null      object
dtypes: int64(1), object(5)
memory usage: 464.0+ bytes


In [27]:
df.dtypes

Name              object
Subject Stream    object
Score              int64
Remarks           object
Grade             object
CGPA              object
dtype: object

In [28]:
type(df)

pandas.core.frame.DataFrame

### Retrieving and Manipulating Rows and Columns: .loc / .iloc
 

In [29]:
df

Unnamed: 0,Name,Subject Stream,Score,Remarks,Grade,CGPA
0,Ramen,English,435,G,C+,6
1,Ankur,Maths,234,BA,C,5
2,Vinayak,Biology,986,Exceptional,A+,10
3,Rahul,Physics,562,Average,B,7
4,Divya,Computing,12,Fail,F,4
5,Sarthak,English,600,AA,B+,8
6,Adam,Math,900,Excellent,A,9


In [30]:
# Retrieving a column: pass the column name as a string inside square brackets applied to the DataFrame.
df['Name'] # or df.Name, but the latter only works when the column name is a valid Python identifier.
# The result can also be assigned to a variable, e.g. names=df['Name']

0      Ramen
1      Ankur
2    Vinayak
3      Rahul
4      Divya
5    Sarthak
6       Adam
Name: Name, dtype: object

In [31]:
# Retrieving multiple columns: pass a list of the column names inside the square brackets
df[['Name','Score','CGPA']]

Unnamed: 0,Name,Score,CGPA
0,Ramen,435,6
1,Ankur,234,5
2,Vinayak,986,10
3,Rahul,562,7
4,Divya,12,4
5,Sarthak,600,8
6,Adam,900,9


In [32]:
# Applying .values returns a 2D array of the values only (without the row and column index) of the selected DataFrame.
df[['Name','Score','CGPA']].values

array([['Ramen', 435, '6'],
       ['Ankur', 234, '5'],
       ['Vinayak', 986, '10'],
       ['Rahul', 562, '7'],
       ['Divya', 12, '4'],
       ['Sarthak', 600, '8'],
       ['Adam', 900, '9']], dtype=object)

For slicing and retrieving rows, columns, or a combination of both:
* .loc[ ]  : selects rows and columns by label
* .iloc[ ]  : selects rows and columns by integer position (starting from 0)
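
The difference between the two accessors is easiest to see when the row labels are themselves integers that do not match the positions. The sketch below uses a hypothetical frame `readings` for this purpose.

```python
import pandas as pd

# Hypothetical frame whose integer labels are deliberately out of order.
readings = pd.DataFrame({'temp': [21.5, 22.0, 19.8]}, index=[2, 0, 1])

by_label = readings.loc[0, 'temp']     # row whose LABEL is 0 (the second row)
by_position = readings.iloc[0, 0]      # row at POSITION 0 (the first row)

print(by_label, by_position)
```

With the default RangeIndex the labels and positions coincide, which is why the two accessors can look interchangeable until the index is changed.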

In [33]:
df.index=['one','two','three','four','five','six','seven'] ## We are changing index 
df

Unnamed: 0,Name,Subject Stream,Score,Remarks,Grade,CGPA
one,Ramen,English,435,G,C+,6
two,Ankur,Maths,234,BA,C,5
three,Vinayak,Biology,986,Exceptional,A+,10
four,Rahul,Physics,562,Average,B,7
five,Divya,Computing,12,Fail,F,4
six,Sarthak,English,600,AA,B+,8
seven,Adam,Math,900,Excellent,A,9


### Getting Rows

In [34]:
# Getting a single row: pass the row label as a string inside the square brackets
df.loc['one'] # 'one' is the label of the first row; its integer position is 0, increasing by 1 for each following row.

Name                Ramen
Subject Stream    English
Score                 435
Remarks                 G
Grade                  C+
CGPA                    6
Name: one, dtype: object

In [35]:
# Getting multiple rows: pass a list of the row labels you want.
df.loc[['one','three','six']] 

Unnamed: 0,Name,Subject Stream,Score,Remarks,Grade,CGPA
one,Ramen,English,435,G,C+,6
three,Vinayak,Biology,986,Exceptional,A+,10
six,Sarthak,English,600,AA,B+,8


#### The above two operations using .iloc[ ]

In [36]:
# Getting a single row: pass the integer position of the first row, which is 0
df.iloc[0] # 'one' is the label of the first row; its integer position is 0, increasing by 1 for each following row.

Name                Ramen
Subject Stream    English
Score                 435
Remarks                 G
Grade                  C+
CGPA                    6
Name: one, dtype: object

In [37]:
# Getting multiple rows: pass a list of the integer positions of the rows you want.
df.iloc[[0,2,5]] # 0, 2 and 5 are the integer positions of the rows labelled 'one', 'three' and 'six',
# so they are passed as a list inside the square brackets of .iloc[ ]

Unnamed: 0,Name,Subject Stream,Score,Remarks,Grade,CGPA
one,Ramen,English,435,G,C+,6
three,Vinayak,Biology,986,Exceptional,A+,10
six,Sarthak,English,600,AA,B+,8


### Getting both Rows and Columns 

In [38]:
# Getting a single row and a single column: pass the row label and the column label inside the square brackets, separated by a comma
df.loc['three','Score'] # The value before the comma always refers to rows and the value after it to columns.

986

In [39]:
# Getting multiple rows and multiple columns: pass a list of row labels and a list of column labels, separated by a comma
df.loc[['three','seven'],['Name','Score']] 
# The value before the comma always refers to rows and the value after it to columns.

Unnamed: 0,Name,Score
three,Vinayak,986
seven,Adam,900


#### The above two operations using .iloc[ ]

In [40]:
# Getting a single row and a single column: pass the integer positions of the row and the column, separated by a comma.
# Row and column positions start from 0. The 'Score' column is at position 2 and the row 'three' is at position 2.
df.iloc[2,2] # The value before the comma always refers to rows and the value after it to columns.

986

In [41]:
# Getting multiple rows and multiple columns:
# pass a list of row positions and a list of column positions, separated by a comma
df.iloc[[2,6],[0,2]] 

Unnamed: 0,Name,Score
three,Vinayak,986
seven,Adam,900


##### Note: Integer positions of columns counted from right to left are -1 (rightmost column), -2 (second column from the right), -3, and so on.
##### Note: Integer positions of rows counted from bottom to top are -1, -2, -3, and so on.
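
Negative positions with `.iloc` can be tried on a small frame. The sketch below rebuilds a 3x4 frame like `f2` under the hypothetical name `grid`.

```python
import numpy as np
import pandas as pd

# Same shape and labels as the f2 example; 'grid' is a name used only here.
grid = pd.DataFrame(np.arange(1, 13).reshape(3, 4),
                    columns=['A', 'B', 'C', 'D'],
                    index=['X', 'Y', 'Z'])

last_cell = grid.iloc[-1, -1]         # last row, last column
second_last_col = grid.iloc[:, -2]    # all rows, second column from the right ('C')
last_two_rows = grid.iloc[-2:]        # the bottom two rows ('Y' and 'Z')

print(last_cell)
print(second_last_col.to_list())
print(last_two_rows.index.to_list())
```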

### Try the following syntax and check each result against the DataFrame above


In [42]:
df.iloc[:,4] ## All rows (:) and the column at position 4 (the fifth column). Integer position --> .iloc

one      C+
two       C
three    A+
four      B
five      F
six      B+
seven     A
Name: Grade, dtype: object

In [43]:
df.iloc[4,:] ## The row at position 4 (the fifth row) and all columns (:). Integer position --> .iloc

Name                  Divya
Subject Stream    Computing
Score                    12
Remarks                Fail
Grade                     F
CGPA                      4
Name: five, dtype: object

In [44]:
df.iloc[:5,:]  # Rows from the beginning up to position 4 (the stop value after the colon is not included) and all columns.
# Integer position --> .iloc

Unnamed: 0,Name,Subject Stream,Score,Remarks,Grade,CGPA
one,Ramen,English,435,G,C+,6
two,Ankur,Maths,234,BA,C,5
three,Vinayak,Biology,986,Exceptional,A+,10
four,Rahul,Physics,562,Average,B,7
five,Divya,Computing,12,Fail,F,4


In [45]:
df.iloc[4:,:] # Rows from position 4 to the last row and all columns; the start value before the colon is always included.
# Integer position --> .iloc

Unnamed: 0,Name,Subject Stream,Score,Remarks,Grade,CGPA
five,Divya,Computing,12,Fail,F,4
six,Sarthak,English,600,AA,B+,8
seven,Adam,Math,900,Excellent,A,9


In [46]:
df.iloc[2:5,:] # Rows at positions 2 to 4 and all columns
# Integer position --> .iloc

Unnamed: 0,Name,Subject Stream,Score,Remarks,Grade,CGPA
three,Vinayak,Biology,986,Exceptional,A+,10
four,Rahul,Physics,562,Average,B,7
five,Divya,Computing,12,Fail,F,4


In [47]:
df.iloc[2:5,3:6] ## Exercise: explain this result yourself

Unnamed: 0,Remarks,Grade,CGPA
three,Exceptional,A+,10
four,Average,B,7
five,Fail,F,4


In [48]:
df.loc['five',:] # The row labelled 'five' and all columns

Name                  Divya
Subject Stream    Computing
Score                    12
Remarks                Fail
Grade                     F
CGPA                      4
Name: five, dtype: object

In [49]:
df.loc[:'five',:]  # Rows from the beginning up to the label 'five' (with labels, the stop value after the colon is included) and all columns.

Unnamed: 0,Name,Subject Stream,Score,Remarks,Grade,CGPA
one,Ramen,English,435,G,C+,6
two,Ankur,Maths,234,BA,C,5
three,Vinayak,Biology,986,Exceptional,A+,10
four,Rahul,Physics,562,Average,B,7
five,Divya,Computing,12,Fail,F,4
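
The different treatment of the stop value in `.iloc` and `.loc` slices can be confirmed on a hypothetical four-element Series `scores`.

```python
import pandas as pd

# Hypothetical labelled Series, used only for illustration.
scores = pd.Series([435, 234, 986, 562], index=['one', 'two', 'three', 'four'])

# 'three' sits at position 2.
by_position = scores.iloc[:2]      # stop position 2 is excluded
by_label = scores.loc[:'three']    # stop label 'three' is included

print(by_position.index.to_list())
print(by_label.index.to_list())
```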


In [50]:
df.iloc[-1] # Gives last row

Name                   Adam
Subject Stream         Math
Score                   900
Remarks           Excellent
Grade                     A
CGPA                      9
Name: seven, dtype: object

In [51]:
df.iloc[:,-1] # Gives last column

one       6
two       5
three    10
four      7
five      4
six       8
seven     9
Name: CGPA, dtype: object

### Exception in Slicing
*  What we have seen so far: a column is selected by passing its name as a string inside square brackets applied to the DataFrame.
###### BUT
*  If we pass a slice (values with a colon) inside the square brackets of a DataFrame, it returns the corresponding rows.
*  See the examples below:

In [52]:
# Selecting a column
df['Remarks']

one                G
two               BA
three    Exceptional
four         Average
five            Fail
six               AA
seven      Excellent
Name: Remarks, dtype: object

#### BUT

In [53]:
df[2:5] ## rows at positions 2 to 4
df[4:] # rows from position 4 to the last row
df[:5] # rows from the beginning up to position 4 (only this last expression is displayed)

Unnamed: 0,Name,Subject Stream,Score,Remarks,Grade,CGPA
one,Ramen,English,435,G,C+,6
two,Ankur,Maths,234,BA,C,5
three,Vinayak,Biology,986,Exceptional,A+,10
four,Rahul,Physics,562,Average,B,7
five,Divya,Computing,12,Fail,F,4
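
The column-versus-row behaviour of plain square brackets can be summarised in one sketch. The frame `small` is hypothetical; using `.iloc` for the row slice is shown as a more explicit equivalent, not as the only correct style.

```python
import pandas as pd

# Hypothetical three-row frame, used only for illustration.
small = pd.DataFrame({'Name': ['Ramen', 'Ankur', 'Vinayak'],
                      'Score': [435, 234, 986]},
                     index=['one', 'two', 'three'])

column = small['Score']    # a string selects a column (returns a Series)
rows = small[1:3]          # a slice selects rows by position (returns a DataFrame)

# A bare integer is treated as a column label, so it fails on this frame.
try:
    small[1]
    integer_key_failed = False
except KeyError:
    integer_key_failed = True

# The explicit equivalent of the row slice makes the intent unambiguous.
same_rows = small.iloc[1:3]

print(rows.index.to_list(), integer_key_failed)
```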


### Deleting Rows and Columns: .drop()

In [54]:
# Again using above DataFrame:--> df
df

Unnamed: 0,Name,Subject Stream,Score,Remarks,Grade,CGPA
one,Ramen,English,435,G,C+,6
two,Ankur,Maths,234,BA,C,5
three,Vinayak,Biology,986,Exceptional,A+,10
four,Rahul,Physics,562,Average,B,7
five,Divya,Computing,12,Fail,F,4
six,Sarthak,English,600,AA,B+,8
seven,Adam,Math,900,Excellent,A,9


In [55]:
df.drop(['one','five'],axis=0) # For multiple rows, pass a list of row labels.
# For a single row, pass just the row label, as in the next cell.
# axis=0 deletes rows
# This returns a new DataFrame and df is not changed. The result can be assigned to another variable.
# Passing inplace=True deletes the selected rows from df itself

Unnamed: 0,Name,Subject Stream,Score,Remarks,Grade,CGPA
two,Ankur,Maths,234,BA,C,5
three,Vinayak,Biology,986,Exceptional,A+,10
four,Rahul,Physics,562,Average,B,7
six,Sarthak,English,600,AA,B+,8
seven,Adam,Math,900,Excellent,A,9


In [56]:
df.drop('one',axis=0) # Deleting a single row
# This returns a new DataFrame and df is not changed
# Passing inplace=True deletes the selected row from df itself

Unnamed: 0,Name,Subject Stream,Score,Remarks,Grade,CGPA
two,Ankur,Maths,234,BA,C,5
three,Vinayak,Biology,986,Exceptional,A+,10
four,Rahul,Physics,562,Average,B,7
five,Divya,Computing,12,Fail,F,4
six,Sarthak,English,600,AA,B+,8
seven,Adam,Math,900,Excellent,A,9


In [57]:
df.drop(['Name','Score','CGPA'],axis=1) # For multiple columns, pass a list of column names.
# For a single column, pass just the column name, as in the next cell.
# axis=1 deletes columns
# This returns a new DataFrame and df is not changed
# Passing inplace=True deletes the selected columns from df itself

Unnamed: 0,Subject Stream,Remarks,Grade
one,English,G,C+
two,Maths,BA,C
three,Biology,Exceptional,A+
four,Physics,Average,B
five,Computing,Fail,F
six,English,AA,B+
seven,Math,Excellent,A


In [58]:
df.drop('Score',axis=1) # Deleting a single column

Unnamed: 0,Name,Subject Stream,Remarks,Grade,CGPA
one,Ramen,English,G,C+,6
two,Ankur,Maths,BA,C,5
three,Vinayak,Biology,Exceptional,A+,10
four,Rahul,Physics,Average,B,7
five,Divya,Computing,Fail,F,4
six,Sarthak,English,AA,B+,8
seven,Adam,Math,Excellent,A,9
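
As an alternative to the `axis` argument, `drop()` also accepts the `index=` and `columns=` keywords, which some readers find easier to follow. The sketch below uses a hypothetical frame `records` and also checks that the original is left unchanged.

```python
import pandas as pd

# Hypothetical two-row frame, used only for illustration.
records = pd.DataFrame({'Name': ['Ramen', 'Ankur'],
                        'Score': [435, 234],
                        'Grade': ['C+', 'C']},
                       index=['one', 'two'])

without_score = records.drop(columns='Score')   # same as drop('Score', axis=1)
without_one = records.drop(index='one')         # same as drop('one', axis=0)

print(without_score.columns.to_list())
print(without_one.index.to_list())
print(records.shape)   # the original frame still has every row and column
```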


#### 'del' can also be used, but it deletes only one column at a time, in place

In [59]:
del df['CGPA'] # The 'CGPA' column is permanently deleted from df
df

Unnamed: 0,Name,Subject Stream,Score,Remarks,Grade
one,Ramen,English,435,G,C+
two,Ankur,Maths,234,BA,C
three,Vinayak,Biology,986,Exceptional,A+
four,Rahul,Physics,562,Average,B
five,Divya,Computing,12,Fail,F
six,Sarthak,English,600,AA,B+
seven,Adam,Math,900,Excellent,A


###  NumPy Element-wise array methods and Descriptive statistics with pandas objects

In [60]:
df1=pd.DataFrame((np.random.randint(1,20,12).reshape(4,3)),
                 columns=['Mon','Tue','Wed'],
                 index=['Mango','Jackfruit','W.Melon','Pineapple'])
df1 

Unnamed: 0,Mon,Tue,Wed
Mango,4,13,16
Jackfruit,16,5,3
W.Melon,18,7,13
Pineapple,13,4,8


In [61]:
np.mean(df1,axis=1)

Mango        11.000000
Jackfruit     8.000000
W.Melon      12.666667
Pineapple     8.333333
dtype: float64

In [62]:
# OR
df1.mean(axis=1)

Mango        11.000000
Jackfruit     8.000000
W.Melon      12.666667
Pineapple     8.333333
dtype: float64

In [63]:
df1.sum()  # by default axis=0

Mon    51
Tue    29
Wed    40
dtype: int64

In [64]:
df1.sum(axis=1)

Mango        33
Jackfruit    24
W.Melon      38
Pineapple    25
dtype: int64
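
Because `df1` is filled with random numbers, the meaning of `axis` is easier to verify on fixed data. The sketch below assumes a hypothetical two-row frame `sales`.

```python
import pandas as pd

# Hypothetical fixed data, used only for illustration.
sales = pd.DataFrame({'Mon': [4, 16], 'Tue': [13, 5], 'Wed': [16, 3]},
                     index=['Mango', 'Jackfruit'])

per_day = sales.sum(axis=0)     # collapses the rows: one total per column
per_fruit = sales.sum(axis=1)   # collapses the columns: one total per row

print(per_day)
print(per_fruit)
```

A useful way to remember this: `axis` names the dimension that disappears, so `axis=0` leaves one value per column and `axis=1` leaves one value per row.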

In [65]:
np.exp(df1)

Unnamed: 0,Mon,Tue,Wed
Mango,54.59815,442413.392009,8886111.0
Jackfruit,8886111.0,148.413159,20.08554
W.Melon,65659970.0,1096.633158,442413.4
Pineapple,442413.4,54.59815,2980.958


In [66]:
df1.describe() # describe() does not accept an axis argument; to describe rows, transpose first with .T, as in the next cell

Unnamed: 0,Mon,Tue,Wed
count,4.0,4.0,4.0
mean,12.75,7.25,10.0
std,6.184658,4.031129,5.715476
min,4.0,4.0,3.0
25%,10.75,4.75,6.75
50%,14.5,6.0,10.5
75%,16.5,8.5,13.75
max,18.0,13.0,16.0


In [67]:
(df1.T).describe()

Unnamed: 0,Mango,Jackfruit,W.Melon,Pineapple
count,3.0,3.0,3.0,3.0
mean,11.0,8.0,12.666667,8.333333
std,6.244998,7.0,5.507571,4.50925
min,4.0,3.0,7.0,4.0
25%,8.5,4.0,10.0,6.0
50%,13.0,5.0,13.0,8.0
75%,14.5,10.5,15.5,10.5
max,16.0,16.0,18.0,13.0


###  .apply() and .applymap()
* Anonymous or Lambda Functions:

In [68]:
# A general function for calculating the square of any number.
def sqr_f(x):    
    return x**2

In [69]:
# Calling function:
sqr_f(6)

36

In [70]:
# The same operation can be written as a lambda function:
f1=lambda x:x**2   # Note: this rebinds the name f1, which previously held a DataFrame
# lambda is the keyword for making a lambda function (also known as an anonymous function); x before the colon is the argument
# The expression after the colon is evaluated on the argument and returned

In [71]:
# Calling lambda function
f1(6)

36

In [72]:
f_sum=lambda a,b:a+b # A lambda function can take multiple arguments as well

In [73]:
f_sum(3,4)

7

In [74]:
f2=lambda x:x.max()-x.min() # A lambda function that takes a Series or array as its argument.
# It returns the difference between the maximum and minimum values of the Series passed in.

In [75]:
S1=pd.Series([1,2,3,4,4,9])
print(S1)
f2(S1)  # calling lambda function f2 and passing S1 as an argument

0    1
1    2
2    3
3    4
4    4
5    9
dtype: int64


8

##### Note: the 'apply' method applies a function to each column or each row of a DataFrame, passing it in as a one-dimensional array (Series), as shown below.
* Each row and each column can be thought of as a one-dimensional array.
* Note: apply acts along rows or along columns, not on individual elements.

In [76]:
# using previous DataFrame df1
df1

Unnamed: 0,Mon,Tue,Wed
Mango,4,13,16
Jackfruit,16,5,3
W.Melon,18,7,13
Pineapple,13,4,8


In [77]:
df1.apply(f2) # The apply method is called on the DataFrame and the function is passed in
# We haven't specified the axis, so the default axis=0 is used and the function is applied to each column.
# It gives maximum - minimum of each column

Mon    14
Tue     9
Wed    13
dtype: int64

In [78]:
df1.apply(f2,axis=1) # The apply method is called on the DataFrame and the function is passed in
# We have specified axis=1, so the function is applied to each row.
# It gives maximum - minimum of each row.

Mango        12
Jackfruit    13
W.Melon      11
Pineapple     9
dtype: int64
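
The two `apply` calls can be checked by hand on fixed data. The sketch below assumes a hypothetical frame `sales` and a range function named `value_range`; it also shows that the column-wise result matches plain vectorised arithmetic.

```python
import pandas as pd

# Hypothetical fixed data, used only for illustration.
sales = pd.DataFrame({'Mon': [4, 16], 'Tue': [13, 5], 'Wed': [16, 3]},
                     index=['Mango', 'Jackfruit'])

value_range = lambda x: x.max() - x.min()

per_column = sales.apply(value_range)         # default axis=0: one result per column
per_row = sales.apply(value_range, axis=1)    # axis=1: one result per row

# For a simple case like this, built-in methods give the same answer without apply.
vectorised = sales.max() - sales.min()

print(per_column)
print(per_row)
```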

#### Note: applymap is another method; it applies a Python function element-wise.

In [79]:
f3=lambda x: '%.2f' % x # A function that formats any floating-point number to 2 decimal places (rounding, not truncating)
## but returns the result as a string

In [80]:
a=3.5678910
f3(a) # Notice the result is string

'3.57'

In [81]:
data=df1*2.5678910
data 
# This is a table and we want only two digits after the decimal point. applymap is used in such a situation; see the next cell.

Unnamed: 0,Mon,Tue,Wed
Mango,10.271564,33.382583,41.086256
Jackfruit,41.086256,12.839455,7.703673
W.Melon,46.222038,17.975237,33.382583
Pineapple,33.382583,10.271564,20.543128


In [82]:
data.applymap(f3) # f3 is passed to applymap, which applies it to each element
# (In pandas 2.1 and later, applymap is deprecated in favour of DataFrame.map.)

Unnamed: 0,Mon,Tue,Wed
Mango,10.27,33.38,41.09
Jackfruit,41.09,12.84,7.7
W.Melon,46.22,17.98,33.38
Pineapple,33.38,10.27,20.54
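
Formatting with `'%.2f'` turns every value into a string. If the values should stay numeric, `DataFrame.round()` is one alternative. The sketch below uses a hypothetical frame `prices`, and also selects between `DataFrame.map` and `applymap` depending on which the installed pandas provides.

```python
import pandas as pd

# Hypothetical values, used only for illustration.
prices = pd.DataFrame({'Mon': [10.271564, 41.086256],
                       'Tue': [33.382583, 12.839455]},
                      index=['Mango', 'Jackfruit'])

# round() keeps the values as floats, so arithmetic still works afterwards.
rounded = prices.round(2)

# Element-wise string formatting: newer pandas offers DataFrame.map,
# older versions (such as the 1.3.5 used here) only have applymap.
fmt = lambda x: '%.2f' % x
if hasattr(pd.DataFrame, 'map'):
    formatted = prices.map(fmt)
else:
    formatted = prices.applymap(fmt)

print(rounded.dtypes)
print(formatted)
```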


### Thank You ! Happy Learning !!