![](logo.png)

# Day Objectives
## Pandas
- Pandas is a third-party, open-source Python library for data analysis (it is not built in and must be installed). You'll use it heavily for data manipulation, visualisation, and preparing data for machine learning models.
- Pandas implements a number of powerful data operations familiar to users of both database frameworks and spreadsheet programs.
- There are two main data structures in Pandas: Series and DataFrame. Tabular data is usually stored in a DataFrame, so manipulating DataFrames quickly is probably the most important skill for data analysis.
- Source: https://pandas.pydata.org/pandas-docs/stable/overview.html
## Pandas Series
- A Series is similar to a 1-D NumPy array with an attached index, and usually holds values of a single type (numeric, string, datetime, etc.). A DataFrame is a table in which each column is a pandas Series.

## Creating Series
- List
- Tuple
- Dictionary
- Numpy
- Date_Range
- Series Indexing

## Data Analysis with Pandas


|S.No |Name |Gender|
|--|--|--|
|1 | Mercy | Female|
|2 | Cherry | Male |
|3 | Raju | Male |


* Pandas DataFrame
* Combining & Merging
* File I/O
* Indexing
* Sorting
* Filtering 


In [1]:
pip install pandas




In [3]:
import pandas  as pd

In [4]:
pd.__version__

'1.0.5'

## Creating Series

In [5]:
# convert list into pandas series
li = [213,345,456,6,776,4564,534345]
s1 = pd.Series(li)
s1
# each value in the Series has an index label
# the default index runs from 0 to n-1

0       213
1       345
2       456
3         6
4       776
5      4564
6    534345
dtype: int64

In [6]:
# converting a tuple into a Series
t = (123,34,345.45,"APSSDC")
s2 = pd.Series(t)
s2

0       123
1        34
2    345.45
3    APSSDC
dtype: object

In [7]:
# converting Dict into Series
di = {"a":112,"b":345,"c":68}
s3 = pd.Series(di)
s3
# the dict keys become the index

a    112
b    345
c     68
dtype: int64

In [10]:
# changing index values
l = [3234,456,34.8]
s4 = pd.Series(l,index = ["x",456,45.4])
s4
# index values can be any type of data

x       3234.0
456      456.0
45.4      34.8
dtype: float64

In [11]:
# converting a NumPy array into a Series
import numpy as np
n = np.array(l)
s5 = pd.Series(n)
s5

0    3234.0
1     456.0
2      34.8
dtype: float64

In [12]:
# date_range returns a DatetimeIndex of evenly spaced dates (daily by default)

s6 = pd.date_range(start = "2021-06-01",end = "2021-07-15")
s6

DatetimeIndex(['2021-06-01', '2021-06-02', '2021-06-03', '2021-06-04',
               '2021-06-05', '2021-06-06', '2021-06-07', '2021-06-08',
               '2021-06-09', '2021-06-10', '2021-06-11', '2021-06-12',
               '2021-06-13', '2021-06-14', '2021-06-15', '2021-06-16',
               '2021-06-17', '2021-06-18', '2021-06-19', '2021-06-20',
               '2021-06-21', '2021-06-22', '2021-06-23', '2021-06-24',
               '2021-06-25', '2021-06-26', '2021-06-27', '2021-06-28',
               '2021-06-29', '2021-06-30', '2021-07-01', '2021-07-02',
               '2021-07-03', '2021-07-04', '2021-07-05', '2021-07-06',
               '2021-07-07', '2021-07-08', '2021-07-09', '2021-07-10',
               '2021-07-11', '2021-07-12', '2021-07-13', '2021-07-14',
               '2021-07-15'],
              dtype='datetime64[ns]', freq='D')
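Note that `pd.date_range` returns a `DatetimeIndex`, not a Series. A minimal sketch of how such an index might be attached to a Series follows; the `readings` values are made up for illustration, and `periods` is used instead of `end` to fix the number of dates.

```python
import pandas as pd

# Hypothetical example: seven daily readings starting on 2021-06-01.
dates = pd.date_range(start="2021-06-01", periods=7, freq="D")
readings = pd.Series([10, 12, 9, 14, 11, 13, 15], index=dates)

# With a DatetimeIndex, .loc accepts date strings; both ends of the slice are included.
first_three = readings.loc["2021-06-01":"2021-06-03"]
print(first_three)
```

Here `first_three` holds the readings for 1 June through 3 June.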

In [14]:
help(pd.date_range)

Help on function date_range in module pandas.core.indexes.datetimes:

date_range(start=None, end=None, periods=None, freq=None, tz=None, normalize=False, name=None, closed=None, **kwargs) -> pandas.core.indexes.datetimes.DatetimeIndex
    Return a fixed frequency DatetimeIndex.
    
    Parameters
    ----------
    start : str or datetime-like, optional
        Left bound for generating dates.
    end : str or datetime-like, optional
        Right bound for generating dates.
    periods : int, optional
        Number of periods to generate.
    freq : str or DateOffset, default 'D'
        Frequency strings can have multiples, e.g. '5H'. See
        :ref:`here <timeseries.offset_aliases>` for a list of
        frequency aliases.
    tz : str or tzinfo, optional
        Time zone name for returning localized DatetimeIndex, for example
        'Asia/Hong_Kong'. By default, the resulting DatetimeIndex is
        timezone-naive.
    normalize : bool, default False
        Normalize start/

# Pandas Series Indexing

In [17]:
s1[0]

213

In [22]:
s1[::-1] # reverse of the series 

6    534345
5      4564
4       776
3         6
2       456
1       345
0       213
dtype: int64

In [24]:
s1[::2]

0       213
2       456
4       776
6    534345
dtype: int64

In [25]:
s2[1::2]

1        34
3    APSSDC
dtype: object

In [26]:
s1

0       213
1       345
2       456
3         6
4       776
5      4564
6    534345
dtype: int64

In [27]:
s1[3:]

3         6
4       776
5      4564
6    534345
dtype: int64

In [28]:
# fancy indexing: pass a list of labels to access 3, 6, 2 and 4 in that order
s1[[3,6,2,4]]

3         6
6    534345
2       456
4       776
dtype: int64
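Besides a list of labels, a Series can also be indexed with a boolean mask. The sketch below rebuilds the same values as `s1` under the hypothetical name `values` so that it runs on its own.

```python
import pandas as pd

# Same values as s1 above, under a separate (hypothetical) name.
values = pd.Series([213, 345, 456, 6, 776, 4564, 534345])

# Comparing a Series with a scalar yields a boolean Series of the same length.
mask = values > 400
# Indexing with the mask keeps only the entries where it is True.
large = values[mask]
print(large)
```

The original index labels (2, 4, 5, 6) are kept in the result, which makes it easy to trace rows back to the source data.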

In [29]:
s3

a    112
b    345
c     68
dtype: int64

In [30]:
s3["a"] # explicit (label-based) indexing

112

In [31]:
s3["c"]

68

In [32]:
s3[0]  # implicit (positional) indexing; deprecated in recent pandas releases, prefer s3.iloc[0]

112
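Labels and positions can disagree when the index itself holds integers, so it is usually safer to say which one is meant. A small sketch with a hypothetical Series `marks`, whose integer labels are deliberately out of order:

```python
import pandas as pd

# Hypothetical Series whose integer labels do not match the positions.
marks = pd.Series([55, 70, 85], index=[2, 0, 1])

by_label = marks.loc[0]      # the value whose label is 0
by_position = marks.iloc[0]  # the first value, whatever its label
print(by_label, by_position)
```

`.loc` always looks up labels and `.iloc` always looks up positions, so the two calls return different values here.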

In [34]:
# converting Dict into Series
di = {"a":112,"b":345,"d":np.nan,"c":68}
s7 = pd.Series(di)
s7
# NaN - not a number - a special type of float value

a    112.0
b    345.0
d      NaN
c     68.0
dtype: float64

In [35]:
s8 = pd.Series(di,index = ["a","d","c"])
s8

a    112.0
d      NaN
c     68.0
dtype: float64

In [36]:
s9 = pd.Series("SRM",index = [290,392,435,234,324])
s9

290    SRM
392    SRM
435    SRM
234    SRM
324    SRM
dtype: object

# Task
- Generate the multiplication table of n (n × 1 to n × 10) using a pandas Series. For example, for n = 5:

1 -- 5

2 -- 10

3 -- 15

In [37]:
n = int(input())
ls = [i for i in range(n, (n*10)+1, n)]
s = pd.Series(ls, index=[i for i in range(1, 11)])
print(s)

5
1      5
2     10
3     15
4     20
5     25
6     30
7     35
8     40
9     45
10    50
dtype: int64


In [38]:
di = {1:5,2:10,3:15}
s7 = pd.Series(di)
s7
# here the boundaries and the table number are both hard-coded

1     5
2    10
3    15
dtype: int64

In [39]:
list1=[x*5 for x in range(1,11)]
s1=pd.Series(list1,index=np.arange(1,11))
s1

1      5
2     10
3     15
4     20
5     25
6     30
7     35
8     40
9     45
10    50
dtype: int64

In [41]:
li = [5, 10, 15]
s = pd.Series(li, index=(1,2,3))
s

1     5
2    10
3    15
dtype: int64

In [45]:
pd.Series(np.arange(1,11)*5,index = np.arange(1,11))

1      5
2     10
3     15
4     20
5     25
6     30
7     35
8     40
9     45
10    50
dtype: int32

In [48]:
s5 ={1:5,2:10,3:15}
b7 = pd.Series(s5)
b7


1     5
2    10
3    15
dtype: int64
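The solutions above could be wrapped in a reusable function. This is only a sketch: the name `multiplication_table` and its `upto` parameter are made up for illustration.

```python
import numpy as np
import pandas as pd

def multiplication_table(n, upto=10):
    """Return the multiplication table of n as a Series indexed 1..upto."""
    multipliers = np.arange(1, upto + 1)
    # NumPy multiplies every element by n at once; the multipliers become the index.
    return pd.Series(multipliers * n, index=multipliers)

table = multiplication_table(5)
print(table)
```

Calling `multiplication_table(3, upto=3)` would give the three-row table for 3 without changing the function.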

# Pandas DataFrame

In [46]:
# converting dict into Dataframe
di

{1: 5, 2: 10, 3: 15}

In [51]:
df1 = pd.DataFrame(di, index = ["a","b","c"])
df1
# the dict keys become the column names; each scalar value is repeated for every index label

Unnamed: 0,1,2,3
a,5,10,15
b,5,10,15
c,5,10,15


In [53]:
df1.columns = ["X","Y","Z"]
df1

Unnamed: 0,X,Y,Z
a,5,10,15
b,5,10,15
c,5,10,15


In [54]:
df1.shape # (rows,columns)

(3, 3)

In [56]:
# converting list into Df
df2 = pd.DataFrame([[1,2,3],[3,4,5],[6,7,8]])
df2
# default column and row labels both start from 0

Unnamed: 0,0,1,2
0,1,2,3
1,3,4,5
2,6,7,8


In [57]:
df2.columns = ["er","56","df"]
df2

Unnamed: 0,er,56,df
0,1,2,3
1,3,4,5
2,6,7,8


In [58]:
df2.index = ["e","t","w"]
df2

Unnamed: 0,er,56,df
e,1,2,3
t,3,4,5
w,6,7,8


In [61]:
d2 = {
    "Name":["HemaSundar","Manish","Vamsi"],
    "Gender":["Male","Male",np.nan],
    "PIN" : [481,512,345]
}
df3 = pd.DataFrame(d2)
df3

Unnamed: 0,Name,Gender,PIN
0,HemaSundar,Male,481
1,Manish,Male,512
2,Vamsi,,345


In [63]:
type(df3["Name"])

pandas.core.series.Series

In [64]:
df3["Name"]

0    HemaSundar
1        Manish
2         Vamsi
Name: Name, dtype: object

In [65]:
df3["PIN"]

0    481
1    512
2    345
Name: PIN, dtype: int64

In [66]:
df3["Name","PIN"]  # a tuple is looked up as one column label, so this raises KeyError

KeyError: ('Name', 'PIN')

In [67]:
df3[["Name","PIN"]] # a list of column names returns a sub-DataFrame

Unnamed: 0,Name,PIN
0,HemaSundar,481
1,Manish,512
2,Vamsi,345


In [69]:
df3[2:3]

Unnamed: 0,Name,Gender,PIN
2,Vamsi,,345


In [71]:
df3[::-1]

Unnamed: 0,Name,Gender,PIN
2,Vamsi,,345
1,Manish,Male,512
0,HemaSundar,Male,481


In [72]:
df3["Name"]

0    HemaSundar
1        Manish
2         Vamsi
Name: Name, dtype: object

In [74]:
df3[:1]

Unnamed: 0,Name,Gender,PIN
0,HemaSundar,Male,481


In [77]:
df3.set_index("Name")

Name,Gender,PIN
HemaSundar,Male,481
Manish,Male,512
Vamsi,,345


In [78]:
df3

Unnamed: 0,Name,Gender,PIN
0,HemaSundar,Male,481
1,Manish,Male,512
2,Vamsi,,345


In [79]:
df3.set_index("Name", inplace = True) # inplace=True modifies df3 itself and returns None

In [80]:
df3

Name,Gender,PIN
HemaSundar,Male,481
Manish,Male,512
Vamsi,,345


## iloc -- for accessing rows by integer position
## loc -- for accessing rows by index label

In [83]:
df3.iloc[0]

Gender    Male
PIN        481
Name: HemaSundar, dtype: object

In [84]:
df3[0]  # df3[...] looks up a column label, and there is no column named 0

KeyError: 0

In [86]:
df3.iloc[2]

Gender    NaN
PIN       345
Name: Vamsi, dtype: object

In [89]:
df3.loc["HemaSundar"]

Gender    Male
PIN        481
Name: HemaSundar, dtype: object

In [92]:
df3.loc[["Manish","Vamsi"]]

Name,Gender,PIN
Manish,Male,512
Vamsi,,345


In [94]:
df3.loc[["Manish","Vamsi"], "PIN"]

Name
Manish    512
Vamsi     345
Name: PIN, dtype: int64
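`loc` and `iloc` also treat slice endpoints differently, which is easy to miss. The sketch below uses a hypothetical frame `people` that mirrors `df3` after `set_index("Name")`.

```python
import pandas as pd

# Hypothetical frame mirroring df3 with Name as the index.
people = pd.DataFrame(
    {"Gender": ["Male", "Male", None], "PIN": [481, 512, 345]},
    index=["HemaSundar", "Manish", "Vamsi"],
)

# Label slices include the end label.
by_label = people.loc["HemaSundar":"Manish"]
# Position slices exclude the end position, like Python lists.
by_position = people.iloc[0:1]
print(len(by_label), len(by_position))
```

The label slice returns two rows while the position slice returns one.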

In [97]:
df3.reset_index(inplace = True)

In [98]:
df3

Unnamed: 0,Name,Gender,PIN
0,HemaSundar,Male,481
1,Manish,Male,512
2,Vamsi,,345


# File Reading

In [99]:
df4 = pd.read_csv("https://raw.githubusercontent.com/AP-Skill-Development-Corporation/DataScienceUsingPython-Internship-SRM-University/main/Day12_Pandas/iris.csv")
df4

Unnamed: 0.1,Unnamed: 0,sepal length (cm),sepal width (cm),petal length (cm),petal width (cm),Target
0,0,5.1,3.5,1.4,0.2,0
1,1,4.9,3.0,1.4,0.2,0
2,2,4.7,3.2,1.3,0.2,0
3,3,4.6,3.1,1.5,0.2,0
4,4,5.0,3.6,1.4,0.2,0
...,...,...,...,...,...,...
145,145,6.7,3.0,5.2,2.3,2
146,146,6.3,2.5,5.0,1.9,2
147,147,6.5,3.0,5.2,2.0,2
148,148,6.2,3.4,5.4,2.3,2


In [100]:
df4 = pd.read_csv("iris.csv")
df4

Unnamed: 0.1,Unnamed: 0,sepal length (cm),sepal width (cm),petal length (cm),petal width (cm),Target
0,0,5.1,3.5,1.4,0.2,0
1,1,4.9,3.0,1.4,0.2,0
2,2,4.7,3.2,1.3,0.2,0
3,3,4.6,3.1,1.5,0.2,0
4,4,5.0,3.6,1.4,0.2,0
...,...,...,...,...,...,...
145,145,6.7,3.0,5.2,2.3,2
146,146,6.3,2.5,5.0,1.9,2
147,147,6.5,3.0,5.2,2.0,2
148,148,6.2,3.4,5.4,2.3,2
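The `Unnamed: 0` column in the output above typically appears when a CSV was saved together with its index. One way to handle it, shown here on a small made-up CSV string rather than the real `iris.csv`, is to pass `index_col=0` so that the first column becomes the index.

```python
import io
import pandas as pd

# Hypothetical CSV text that was saved with its index as an unnamed first column.
csv_text = ",sepal length (cm),Target\n0,5.1,0\n1,4.9,0\n"

with_extra = pd.read_csv(io.StringIO(csv_text))             # index kept as a data column
as_index = pd.read_csv(io.StringIO(csv_text), index_col=0)  # first column used as the index

print(list(with_extra.columns))
print(list(as_index.columns))
```

When writing a file yourself, `df.to_csv(path, index=False)` avoids creating the extra column in the first place.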


In [101]:
df5 = pd.read_excel("2020-07-25.xlsx")

In [102]:
df5

Unnamed: 0.1,Unnamed: 0,Roll Number,2020-07-25
0,0,17B81A04H1,P
1,1,198A5F0019,P
2,2,17KD1A0560,P
3,3,17KH1A0455,P
4,4,1210316262,P
5,5,18P31A0555,P
6,6,18B01A0211,P
7,7,Y18IT048,P
8,8,17B81A05B2,P
9,9,169X1A04E0,P


In [104]:
df4.head()

Unnamed: 0.1,Unnamed: 0,sepal length (cm),sepal width (cm),petal length (cm),petal width (cm),Target
0,0,5.1,3.5,1.4,0.2,0
1,1,4.9,3.0,1.4,0.2,0
2,2,4.7,3.2,1.3,0.2,0
3,3,4.6,3.1,1.5,0.2,0
4,4,5.0,3.6,1.4,0.2,0


In [105]:
df4.head(2)

Unnamed: 0.1,Unnamed: 0,sepal length (cm),sepal width (cm),petal length (cm),petal width (cm),Target
0,0,5.1,3.5,1.4,0.2,0
1,1,4.9,3.0,1.4,0.2,0


In [106]:
df4.tail()

Unnamed: 0.1,Unnamed: 0,sepal length (cm),sepal width (cm),petal length (cm),petal width (cm),Target
145,145,6.7,3.0,5.2,2.3,2
146,146,6.3,2.5,5.0,1.9,2
147,147,6.5,3.0,5.2,2.0,2
148,148,6.2,3.4,5.4,2.3,2
149,149,5.9,3.0,5.1,1.8,2


In [107]:
df4.tail(4)

Unnamed: 0.1,Unnamed: 0,sepal length (cm),sepal width (cm),petal length (cm),petal width (cm),Target
146,146,6.3,2.5,5.0,1.9,2
147,147,6.5,3.0,5.2,2.0,2
148,148,6.2,3.4,5.4,2.3,2
149,149,5.9,3.0,5.1,1.8,2


In [111]:
df4.sample()

Unnamed: 0.1,Unnamed: 0,sepal length (cm),sepal width (cm),petal length (cm),petal width (cm),Target
44,44,5.1,3.8,1.9,0.4,0


In [112]:
df4.iloc[3]

Unnamed: 0           3.0
sepal length (cm)    4.6
sepal width (cm)     3.1
petal length (cm)    1.5
petal width (cm)     0.2
Target               0.0
Name: 3, dtype: float64

In [113]:
df4.iloc[100]

Unnamed: 0           100.0
sepal length (cm)      6.3
sepal width (cm)       3.3
petal length (cm)      6.0
petal width (cm)       2.5
Target                 2.0
Name: 100, dtype: float64

In [115]:
len(df4.columns)

6

In [117]:
df4.shape[0]

150

In [118]:
df4.shape

(150, 6)

In [121]:
df4["sepal length (cm)"] # accessing column as pandas series

0      5.1
1      4.9
2      4.7
3      4.6
4      5.0
      ... 
145    6.7
146    6.3
147    6.5
148    6.2
149    5.9
Name: sepal length (cm), Length: 150, dtype: float64

In [123]:
df4[["sepal length (cm)","petal width (cm)"]] # 2 columns as sub df

Unnamed: 0,sepal length (cm),petal width (cm)
0,5.1,0.2
1,4.9,0.2
2,4.7,0.2
3,4.6,0.2
4,5.0,0.2
...,...,...
145,6.7,2.3
146,6.3,1.9
147,6.5,2.0
148,6.2,2.3


In [124]:
df4.describe()

Unnamed: 0.1,Unnamed: 0,sepal length (cm),sepal width (cm),petal length (cm),petal width (cm),Target
count,150.0,150.0,150.0,150.0,150.0,150.0
mean,74.5,5.843333,3.057333,3.758,1.199333,1.0
std,43.445368,0.828066,0.435866,1.765298,0.762238,0.819232
min,0.0,4.3,2.0,1.0,0.1,0.0
25%,37.25,5.1,2.8,1.6,0.3,0.0
50%,74.5,5.8,3.0,4.35,1.3,1.0
75%,111.75,6.4,3.3,5.1,1.8,2.0
max,149.0,7.9,4.4,6.9,2.5,2.0


In [125]:
df4.sum()

Unnamed: 0           11175.0
sepal length (cm)      876.5
sepal width (cm)       458.6
petal length (cm)      563.7
petal width (cm)       179.9
Target                 150.0
dtype: float64

In [126]:
df4.count()

Unnamed: 0           150
sepal length (cm)    150
sepal width (cm)     150
petal length (cm)    150
petal width (cm)     150
Target               150
dtype: int64

In [127]:
df4.max()

Unnamed: 0           149.0
sepal length (cm)      7.9
sepal width (cm)       4.4
petal length (cm)      6.9
petal width (cm)       2.5
Target                 2.0
dtype: float64

In [128]:
df4.std()

Unnamed: 0           43.445368
sepal length (cm)     0.828066
sepal width (cm)      0.435866
petal length (cm)     1.765298
petal width (cm)      0.762238
Target                0.819232
dtype: float64

# Sorting

In [129]:
data = pd.read_csv("https://raw.githubusercontent.com/AP-Skill-Development-Corporation/DataScienceUsingPython-Internship-SRM-University/main/Day12_Pandas/Students%20(8).csv")
data

Unnamed: 0,S.No.,Registration Id,Name,Email,Mobile No
0,1,AP19110010001,Tata Lakshmi Durga Likhitha,tatalikhitha01@gmail.com,7287982736
1,2,AP19110010003,Bhuvana Venigalla,venigallabhuvana@gmail.com,7989023905
2,3,AP19110010005,Kanagala Yoga Sai Abhigna,abhignakanagala21@gmail.com,9390089242
3,4,AP19110010006,Likhitha Parvathi Tadikonda,t.likhithaparvathi@gmail.com,8099140764
4,5,AP19110010007,Jaya Ganesh Kumar Gudipati,jayaganeshkumarg@gmail.com,7330654249
...,...,...,...,...,...
140,141,AP19110020052,Mahesh,mahesh2001.sis@gmail.com,7989651824
141,142,AP19110020077,Saptharishi.D,0123saptha@gmail.com,9000666305
142,143,AP19110020077,Saptharishi Reddy.D,0123saptha@gmail.com,9000666305
143,144,AP19110020106,Karnatapu Sri Sai Dhanush,dhanushkarnatapu1440@gmail.com,9398018626


In [136]:
data.sort_index(axis = 0, ascending=True)
# axis = 0 sorts by the row index; axis = 1 sorts by the column names

Unnamed: 0,S.No.,Registration Id,Name,Email,Mobile No
0,1,AP19110010001,Tata Lakshmi Durga Likhitha,tatalikhitha01@gmail.com,7287982736
1,2,AP19110010003,Bhuvana Venigalla,venigallabhuvana@gmail.com,7989023905
2,3,AP19110010005,Kanagala Yoga Sai Abhigna,abhignakanagala21@gmail.com,9390089242
3,4,AP19110010006,Likhitha Parvathi Tadikonda,t.likhithaparvathi@gmail.com,8099140764
4,5,AP19110010007,Jaya Ganesh Kumar Gudipati,jayaganeshkumarg@gmail.com,7330654249
...,...,...,...,...,...
140,141,AP19110020052,Mahesh,mahesh2001.sis@gmail.com,7989651824
141,142,AP19110020077,Saptharishi.D,0123saptha@gmail.com,9000666305
142,143,AP19110020077,Saptharishi Reddy.D,0123saptha@gmail.com,9000666305
143,144,AP19110020106,Karnatapu Sri Sai Dhanush,dhanushkarnatapu1440@gmail.com,9398018626


In [134]:
data.sort_index(axis = 0, ascending=False)

Unnamed: 0,S.No.,Registration Id,Name,Email,Mobile No
144,145,AP1911010183,Tiwari Dharmesh Sharma,dharmeshsharma17122002@gmail.com,9704043879
143,144,AP19110020106,Karnatapu Sri Sai Dhanush,dhanushkarnatapu1440@gmail.com,9398018626
142,143,AP19110020077,Saptharishi Reddy.D,0123saptha@gmail.com,9000666305
141,142,AP19110020077,Saptharishi.D,0123saptha@gmail.com,9000666305
140,141,AP19110020052,Mahesh,mahesh2001.sis@gmail.com,7989651824
...,...,...,...,...,...
4,5,AP19110010007,Jaya Ganesh Kumar Gudipati,jayaganeshkumarg@gmail.com,7330654249
3,4,AP19110010006,Likhitha Parvathi Tadikonda,t.likhithaparvathi@gmail.com,8099140764
2,3,AP19110010005,Kanagala Yoga Sai Abhigna,abhignakanagala21@gmail.com,9390089242
1,2,AP19110010003,Bhuvana Venigalla,venigallabhuvana@gmail.com,7989023905


In [137]:
data.sort_index(axis = 1, ascending=False)

Unnamed: 0,S.No.,Registration Id,Name,Mobile No,Email
0,1,AP19110010001,Tata Lakshmi Durga Likhitha,7287982736,tatalikhitha01@gmail.com
1,2,AP19110010003,Bhuvana Venigalla,7989023905,venigallabhuvana@gmail.com
2,3,AP19110010005,Kanagala Yoga Sai Abhigna,9390089242,abhignakanagala21@gmail.com
3,4,AP19110010006,Likhitha Parvathi Tadikonda,8099140764,t.likhithaparvathi@gmail.com
4,5,AP19110010007,Jaya Ganesh Kumar Gudipati,7330654249,jayaganeshkumarg@gmail.com
...,...,...,...,...,...
140,141,AP19110020052,Mahesh,7989651824,mahesh2001.sis@gmail.com
141,142,AP19110020077,Saptharishi.D,9000666305,0123saptha@gmail.com
142,143,AP19110020077,Saptharishi Reddy.D,9000666305,0123saptha@gmail.com
143,144,AP19110020106,Karnatapu Sri Sai Dhanush,9398018626,dhanushkarnatapu1440@gmail.com


In [138]:
data.sort_index(axis = 1, ascending=True)

Unnamed: 0,Email,Mobile No,Name,Registration Id,S.No.
0,tatalikhitha01@gmail.com,7287982736,Tata Lakshmi Durga Likhitha,AP19110010001,1
1,venigallabhuvana@gmail.com,7989023905,Bhuvana Venigalla,AP19110010003,2
2,abhignakanagala21@gmail.com,9390089242,Kanagala Yoga Sai Abhigna,AP19110010005,3
3,t.likhithaparvathi@gmail.com,8099140764,Likhitha Parvathi Tadikonda,AP19110010006,4
4,jayaganeshkumarg@gmail.com,7330654249,Jaya Ganesh Kumar Gudipati,AP19110010007,5
...,...,...,...,...,...
140,mahesh2001.sis@gmail.com,7989651824,Mahesh,AP19110020052,141
141,0123saptha@gmail.com,9000666305,Saptharishi.D,AP19110020077,142
142,0123saptha@gmail.com,9000666305,Saptharishi Reddy.D,AP19110020077,143
143,dhanushkarnatapu1440@gmail.com,9398018626,Karnatapu Sri Sai Dhanush,AP19110020106,144


In [141]:
data.sort_values("Name", ascending=True)

Unnamed: 0,S.No.,Registration Id,Name,Email,Mobile No
25,26,AP19110010090,Abhinav Challa,abhinavch3011@gmail.com,9182522042
7,8,AP19110010013,Addepalli Srilekha,serevalli12@gmail.com,8688225663
65,66,AP19110010230,Akkineni Sohith Sai,saiakkinenisohith312@gmail.com,7032338671
66,67,AP19110010230,Akkineni Sohith Sai,saiakkinenisohith312@gmail.com,7032338671
40,41,AP19110010131,Anishita Kakani,anishitak@gmail.com,9399939996
...,...,...,...,...,...
35,36,AP19110010118,Vankayalapati Akhil Babu,akhilbabu3838@gmail.com,9121536598
82,83,AP19110010296,Yaminibhargavi Alapati,munnychowdaryalapati@gmail.com,8639173919
83,84,AP19110010296,Yaminibhargavi Alapati,munnychowdaryalapati@gmail.com,8639173919
105,106,AP19110010395,Yelchuri Uma Sankar Phani Kumar Guptha,umasankaryelchuri@gmail.com,9666961872


In [140]:
data.sort_values("Name",ascending=False)

Unnamed: 0,S.No.,Registration Id,Name,Email,Mobile No
91,92,AP19110010320,Yogitha Goli,yogithagoli1702@gmail.com,9701802216
105,106,AP19110010395,Yelchuri Uma Sankar Phani Kumar Guptha,umasankaryelchuri@gmail.com,9666961872
82,83,AP19110010296,Yaminibhargavi Alapati,munnychowdaryalapati@gmail.com,8639173919
83,84,AP19110010296,Yaminibhargavi Alapati,munnychowdaryalapati@gmail.com,8639173919
35,36,AP19110010118,Vankayalapati Akhil Babu,akhilbabu3838@gmail.com,9121536598
...,...,...,...,...,...
40,41,AP19110010131,Anishita Kakani,anishitak@gmail.com,9399939996
66,67,AP19110010230,Akkineni Sohith Sai,saiakkinenisohith312@gmail.com,7032338671
65,66,AP19110010230,Akkineni Sohith Sai,saiakkinenisohith312@gmail.com,7032338671
7,8,AP19110010013,Addepalli Srilekha,serevalli12@gmail.com,8688225663


In [143]:
data.sort_values(["Name","Registration Id"], ascending=False)

Unnamed: 0,S.No.,Registration Id,Name,Email,Mobile No
91,92,AP19110010320,Yogitha Goli,yogithagoli1702@gmail.com,9701802216
105,106,AP19110010395,Yelchuri Uma Sankar Phani Kumar Guptha,umasankaryelchuri@gmail.com,9666961872
82,83,AP19110010296,Yaminibhargavi Alapati,munnychowdaryalapati@gmail.com,8639173919
83,84,AP19110010296,Yaminibhargavi Alapati,munnychowdaryalapati@gmail.com,8639173919
35,36,AP19110010118,Vankayalapati Akhil Babu,akhilbabu3838@gmail.com,9121536598
...,...,...,...,...,...
40,41,AP19110010131,Anishita Kakani,anishitak@gmail.com,9399939996
65,66,AP19110010230,Akkineni Sohith Sai,saiakkinenisohith312@gmail.com,7032338671
66,67,AP19110010230,Akkineni Sohith Sai,saiakkinenisohith312@gmail.com,7032338671
7,8,AP19110010013,Addepalli Srilekha,serevalli12@gmail.com,8688225663


In [144]:
data

Unnamed: 0,S.No.,Registration Id,Name,Email,Mobile No
0,1,AP19110010001,Tata Lakshmi Durga Likhitha,tatalikhitha01@gmail.com,7287982736
1,2,AP19110010003,Bhuvana Venigalla,venigallabhuvana@gmail.com,7989023905
2,3,AP19110010005,Kanagala Yoga Sai Abhigna,abhignakanagala21@gmail.com,9390089242
3,4,AP19110010006,Likhitha Parvathi Tadikonda,t.likhithaparvathi@gmail.com,8099140764
4,5,AP19110010007,Jaya Ganesh Kumar Gudipati,jayaganeshkumarg@gmail.com,7330654249
...,...,...,...,...,...
140,141,AP19110020052,Mahesh,mahesh2001.sis@gmail.com,7989651824
141,142,AP19110020077,Saptharishi.D,0123saptha@gmail.com,9000666305
142,143,AP19110020077,Saptharishi Reddy.D,0123saptha@gmail.com,9000666305
143,144,AP19110020106,Karnatapu Sri Sai Dhanush,dhanushkarnatapu1440@gmail.com,9398018626


In [145]:
data.sort_values(["Name","Registration Id"], ascending=False, inplace = True)
data

Unnamed: 0,S.No.,Registration Id,Name,Email,Mobile No
91,92,AP19110010320,Yogitha Goli,yogithagoli1702@gmail.com,9701802216
105,106,AP19110010395,Yelchuri Uma Sankar Phani Kumar Guptha,umasankaryelchuri@gmail.com,9666961872
82,83,AP19110010296,Yaminibhargavi Alapati,munnychowdaryalapati@gmail.com,8639173919
83,84,AP19110010296,Yaminibhargavi Alapati,munnychowdaryalapati@gmail.com,8639173919
35,36,AP19110010118,Vankayalapati Akhil Babu,akhilbabu3838@gmail.com,9121536598
...,...,...,...,...,...
40,41,AP19110010131,Anishita Kakani,anishitak@gmail.com,9399939996
65,66,AP19110010230,Akkineni Sohith Sai,saiakkinenisohith312@gmail.com,7032338671
66,67,AP19110010230,Akkineni Sohith Sai,saiakkinenisohith312@gmail.com,7032338671
7,8,AP19110010013,Addepalli Srilekha,serevalli12@gmail.com,8688225663
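`ascending` also accepts one flag per sort key, and the result can be renumbered afterwards. A sketch on a small hypothetical frame `students` (not the real dataset):

```python
import pandas as pd

# Hypothetical data with a repeated name to show the tie-breaking column.
students = pd.DataFrame({
    "Name": ["Mahesh", "Akhil", "Mahesh", "Bhuvana"],
    "S.No.": [3, 1, 2, 4],
})

ordered = (
    students
    .sort_values(["Name", "S.No."], ascending=[True, False])  # Name A-Z, then S.No. high to low
    .reset_index(drop=True)                                    # renumber rows 0..n-1
)
print(ordered)
```

Without `inplace=True`, `sort_values` returns a new DataFrame and leaves `students` unchanged.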


# Combining or merging

In [150]:
d2 = {
    'Name': ['HemaSundar', 'Manish', 'Vamsi'],
     'Gender': ['Male', 'Male', np.nan],
     'PIN': [481, 512, 345]
     }
df1 = pd.DataFrame(d2)
df1

Unnamed: 0,Name,Gender,PIN
0,HemaSundar,Male,481
1,Manish,Male,512
2,Vamsi,,345


In [159]:
d3 = {
    'Name': ['HemaSundar', 'Manish', 'Vamsi',"lavanya"],
     'PIN': [481, 512, 345,56],
    "Color" : ["White","Black","Yellow","Blue"]
}
df2 = pd.DataFrame(d3)
df2

Unnamed: 0,Name,PIN,Color
0,HemaSundar,481,White
1,Manish,512,Black
2,Vamsi,345,Yellow
3,lavanya,56,Blue


In [155]:
pd.concat([df1,df2], axis = 0) # axis = 0 stacks the rows of df2 below df1

Unnamed: 0,Name,Gender,PIN,Color
0,HemaSundar,Male,481,
1,Manish,Male,512,
2,Vamsi,,345,
0,HemaSundar,,481,White
1,Manish,,512,Black
2,Vamsi,,345,Yellow


In [154]:
pd.concat([df2,df1], axis = 1) # axis = 1 places the frames side by side, aligned on the index

Unnamed: 0,Name,PIN,Color,Name.1,Gender,PIN.1
0,HemaSundar,481,White,HemaSundar,Male,481
1,Manish,512,Black,Manish,Male,512
2,Vamsi,345,Yellow,Vamsi,,345


In [156]:
df1.append(df2) # rows of df2 appended below df1; DataFrame.append was removed in pandas 2.0, use pd.concat([df1, df2]) there

Unnamed: 0,Name,Gender,PIN,Color
0,HemaSundar,Male,481,
1,Manish,Male,512,
2,Vamsi,,345,
0,HemaSundar,,481,White
1,Manish,,512,Black
2,Vamsi,,345,Yellow


In [157]:
df2.append(df1)

Unnamed: 0,Name,PIN,Color,Gender
0,HemaSundar,481,White,
1,Manish,512,Black,
2,Vamsi,345,Yellow,
0,HemaSundar,481,,Male
1,Manish,512,,Male
2,Vamsi,345,,
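The outputs above repeat the index labels 0, 1, 2 because each frame keeps its own index. A sketch of stacking with a fresh index, using two small hypothetical frames `left` and `right`:

```python
import pandas as pd

# Hypothetical frames that share only the Name column.
left = pd.DataFrame({"Name": ["HemaSundar", "Manish"], "Gender": ["Male", "Male"]})
right = pd.DataFrame({"Name": ["Vamsi"], "Color": ["Yellow"]})

# ignore_index=True discards the original labels and numbers the rows 0..n-1.
stacked = pd.concat([left, right], axis=0, ignore_index=True)
print(stacked)
```

Columns missing from one frame are filled with `NaN`, as in the `append` outputs above.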


In [160]:
pd.merge(df1,df2) # default inner join on the shared columns (Name, PIN); returns only rows common to both DataFrames

Unnamed: 0,Name,Gender,PIN,Color
0,HemaSundar,Male,481,White
1,Manish,Male,512,Black
2,Vamsi,,345,Yellow


In [161]:
df1

Unnamed: 0,Name,Gender,PIN
0,HemaSundar,Male,481
1,Manish,Male,512
2,Vamsi,,345


In [162]:
df2

Unnamed: 0,Name,PIN,Color
0,HemaSundar,481,White
1,Manish,512,Black
2,Vamsi,345,Yellow
3,lavanya,56,Blue


In [164]:
pd.merge(df2,df1, how= "right")

Unnamed: 0,Name,PIN,Color,Gender
0,HemaSundar,481,White,Male
1,Manish,512,Black,Male
2,Vamsi,345,Yellow,


In [165]:
pd.merge(df2,df1, how= "left")

Unnamed: 0,Name,PIN,Color,Gender
0,HemaSundar,481,White,Male
1,Manish,512,Black,Male
2,Vamsi,345,Yellow,
3,lavanya,56,Blue,


In [166]:
pd.merge(df2,df1,how = "outer") # union - all 

Unnamed: 0,Name,PIN,Color,Gender
0,HemaSundar,481,White,Male
1,Manish,512,Black,Male
2,Vamsi,345,Yellow,
3,lavanya,56,Blue,


In [168]:
pd.merge(df2,df1,how = "inner")  # intersection - common

Unnamed: 0,Name,PIN,Color,Gender
0,HemaSundar,481,White,Male
1,Manish,512,Black,Male
2,Vamsi,345,Yellow,
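When comparing join types it can help to name the key explicitly with `on=` and to ask pandas where each row came from with `indicator=True`. The frames `left` and `right` below are hypothetical stand-ins for `df1` and `df2`.

```python
import pandas as pd

# Hypothetical frames: one name exists only on the left, one only on the right.
left = pd.DataFrame({"Name": ["HemaSundar", "Manish", "Vamsi"], "PIN": [481, 512, 345]})
right = pd.DataFrame({"Name": ["Manish", "Vamsi", "lavanya"], "Color": ["Black", "Yellow", "Blue"]})

# indicator=True adds a _merge column with left_only, right_only or both.
merged = pd.merge(left, right, on="Name", how="outer", indicator=True)
print(merged)
```

Filtering on the `_merge` column is a quick way to list the rows that failed to match.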


In [169]:
help(pd.merge)

Help on function merge in module pandas.core.reshape.merge:

merge(left, right, how: str = 'inner', on=None, left_on=None, right_on=None, left_index: bool = False, right_index: bool = False, sort: bool = False, suffixes=('_x', '_y'), copy: bool = True, indicator: bool = False, validate=None) -> 'DataFrame'
    Merge DataFrame or named Series objects with a database-style join.
    
    The join is done on columns or indexes. If joining columns on
    columns, the DataFrame indexes *will be ignored*. Otherwise if joining indexes
    on indexes or indexes on a column or columns, the index will be passed on.
    
    Parameters
    ----------
    left : DataFrame
    right : DataFrame or named Series
        Object to merge with.
    how : {'left', 'right', 'outer', 'inner'}, default 'inner'
        Type of merge to be performed.
    
        * left: use only keys from left frame, similar to a SQL left outer join;
          preserve key order.
        * right: use only keys from right fra

# Filtering

In [170]:
data

Unnamed: 0,S.No.,Registration Id,Name,Email,Mobile No
91,92,AP19110010320,Yogitha Goli,yogithagoli1702@gmail.com,9701802216
105,106,AP19110010395,Yelchuri Uma Sankar Phani Kumar Guptha,umasankaryelchuri@gmail.com,9666961872
82,83,AP19110010296,Yaminibhargavi Alapati,munnychowdaryalapati@gmail.com,8639173919
83,84,AP19110010296,Yaminibhargavi Alapati,munnychowdaryalapati@gmail.com,8639173919
35,36,AP19110010118,Vankayalapati Akhil Babu,akhilbabu3838@gmail.com,9121536598
...,...,...,...,...,...
40,41,AP19110010131,Anishita Kakani,anishitak@gmail.com,9399939996
65,66,AP19110010230,Akkineni Sohith Sai,saiakkinenisohith312@gmail.com,7032338671
66,67,AP19110010230,Akkineni Sohith Sai,saiakkinenisohith312@gmail.com,7032338671
7,8,AP19110010013,Addepalli Srilekha,serevalli12@gmail.com,8688225663


In [171]:
data.iloc[100]

S.No.                                    137
Registration Id                AP19110020035
Name                          Jahnavi Mangam
Email              jahnavimangam17@gmail.com
Mobile No                         9948274445
Name: 136, dtype: object

In [174]:
data[data["S.No."] > 100]           # boolean masking: keep the rows where the condition is True

Unnamed: 0,S.No.,Registration Id,Name,Email,Mobile No
105,106,AP19110010395,Yelchuri Uma Sankar Phani Kumar Guptha,umasankaryelchuri@gmail.com,9666961872
127,128,AP19110010519,V Akhil,akhil0805v@gmail.com,9347657721
144,145,AP1911010183,Tiwari Dharmesh Sharma,dharmeshsharma17122002@gmail.com,9704043879
118,119,AP19110010463,Thatha Naga Jayasree,thathajayasree@gmail.com,9110727521
115,116,AP19110010449,Sumana Bandarupalli,sumanabandarupalli@gmail.com,8074434915
116,117,AP19110010449,Sumana Bandarupalli,sumanabandarupalli@gmail.com,8074434915
113,114,AP19110010431,Satya Mani Syam. D,syamdontula99@gmail.com,9133626880
141,142,AP19110020077,Saptharishi.D,0123saptha@gmail.com,9000666305
142,143,AP19110020077,Saptharishi Reddy.D,0123saptha@gmail.com,9000666305
133,134,AP19110010543,Sai Keerthi,kittu601399@gmail.com,8500074169


In [175]:
data[data["Mobile No"] > 9000000000]

Unnamed: 0,S.No.,Registration Id,Name,Email,Mobile No
91,92,AP19110010320,Yogitha Goli,yogithagoli1702@gmail.com,9701802216
105,106,AP19110010395,Yelchuri Uma Sankar Phani Kumar Guptha,umasankaryelchuri@gmail.com,9666961872
35,36,AP19110010118,Vankayalapati Akhil Babu,akhilbabu3838@gmail.com,9121536598
6,7,AP19110010011,V Naga Sai Aditya Gandavarapu,adityagandavarapu66@gmail.com,9110333929
127,128,AP19110010519,V Akhil,akhil0805v@gmail.com,9347657721
...,...,...,...,...,...
51,52,AP19110010174,Boyapati Sai Venkat,svboyapati24@gmail.com,9182708804
9,10,AP19110010015,Boorlagadda Sai Subhang,subhang51011@gmail.com,9014273040
34,35,AP19110010116,Bharath Kumar Reddy Sanikommu,bharathsanikommu@gmail.com,9110722944
40,41,AP19110010131,Anishita Kakani,anishitak@gmail.com,9399939996


In [178]:
# access the records whose mobile number starts with 8 or 9
data[data["Mobile No"] >= 8000000000]

Unnamed: 0,S.No.,Registration Id,Name,Email,Mobile No
91,92,AP19110010320,Yogitha Goli,yogithagoli1702@gmail.com,9701802216
105,106,AP19110010395,Yelchuri Uma Sankar Phani Kumar Guptha,umasankaryelchuri@gmail.com,9666961872
82,83,AP19110010296,Yaminibhargavi Alapati,munnychowdaryalapati@gmail.com,8639173919
83,84,AP19110010296,Yaminibhargavi Alapati,munnychowdaryalapati@gmail.com,8639173919
35,36,AP19110010118,Vankayalapati Akhil Babu,akhilbabu3838@gmail.com,9121536598
...,...,...,...,...,...
26,27,AP19110010092,B.C Amulya,chinmayeamulyasai@gmail.com,8099776689
67,68,AP19110010234,Attuluri Prudhvi Raj,prudhviattuluri@gmail.com,8500016935
40,41,AP19110010131,Anishita Kakani,anishitak@gmail.com,9399939996
7,8,AP19110010013,Addepalli Srilekha,serevalli12@gmail.com,8688225663


In [181]:
# access the records whose mobile number starts with 7 or 8
# a chained comparison does not work on a Series and raises the ValueError below
data[7000000000 < data["Mobile No"] < 9000000000]

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

In [180]:
# two separate filters are not combined; the notebook displays only the last expression
data[data["Mobile No"]>7000000000]
data[data["Mobile No"]<9000000000]

Unnamed: 0,S.No.,Registration Id,Name,Email,Mobile No
82,83,AP19110010296,Yaminibhargavi Alapati,munnychowdaryalapati@gmail.com,8639173919
83,84,AP19110010296,Yaminibhargavi Alapati,munnychowdaryalapati@gmail.com,8639173919
52,53,AP19110010175,Vangala Bhargava Anirudh,anirudhroy241@gmail.com,8886862576
98,99,AP19110010374,Vaishnavi Datla,vaishnavivarma87@gmail.com,8790988244
58,59,AP19110010212,Tiyyagura Durga Prasad Reddy,tdurgaprasad2002@gmail.com,7981821182
...,...,...,...,...,...
67,68,AP19110010234,Attuluri Prudhvi Raj,prudhviattuluri@gmail.com,8500016935
93,94,AP19110010340,Anjana Nallanagula,anjana.nallanagula@gmail.com,7981935617
65,66,AP19110010230,Akkineni Sohith Sai,saiakkinenisohith312@gmail.com,7032338671
66,67,AP19110010230,Akkineni Sohith Sai,saiakkinenisohith312@gmail.com,7032338671


In [183]:
data[(data["Mobile No"] > 7000000000) & (data["Mobile No"] < 9000000000)]

Unnamed: 0,S.No.,Registration Id,Name,Email,Mobile No
82,83,AP19110010296,Yaminibhargavi Alapati,munnychowdaryalapati@gmail.com,8639173919
83,84,AP19110010296,Yaminibhargavi Alapati,munnychowdaryalapati@gmail.com,8639173919
52,53,AP19110010175,Vangala Bhargava Anirudh,anirudhroy241@gmail.com,8886862576
98,99,AP19110010374,Vaishnavi Datla,vaishnavivarma87@gmail.com,8790988244
58,59,AP19110010212,Tiyyagura Durga Prasad Reddy,tdurgaprasad2002@gmail.com,7981821182
...,...,...,...,...,...
67,68,AP19110010234,Attuluri Prudhvi Raj,prudhviattuluri@gmail.com,8500016935
93,94,AP19110010340,Anjana Nallanagula,anjana.nallanagula@gmail.com,7981935617
65,66,AP19110010230,Akkineni Sohith Sai,saiakkinenisohith312@gmail.com,7032338671
66,67,AP19110010230,Akkineni Sohith Sai,saiakkinenisohith312@gmail.com,7032338671
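The chained comparison earlier in this section fails because Python evaluates `a < x < b` as `(a < x) and (x < b)`, and `and` needs a single truth value rather than a whole Series. Besides combining two masks with `&`, `Series.between` is a possible alternative. A sketch on made-up numbers in a hypothetical Series `phones`:

```python
import pandas as pd

# Hypothetical mobile numbers, not taken from the dataset.
phones = pd.Series([7287982736, 9390089242, 8099140764, 6301234567])

# between() includes both bounds by default, so this covers numbers starting with 7 or 8.
in_range = phones.between(7000000000, 8999999999)
print(phones[in_range])
```

Because the bounds are inclusive here, the limits differ slightly from the strict `>` and `<` comparisons used above.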


In [184]:
data2 = data.copy()
data2['Mobile No'] = data2['Mobile No'].astype(str)
m_num = input("Enter the number, to find the numbers starts with that number: ")
series = data2['Mobile No'].str.startswith(m_num)
data2[series]

Enter the number, to find the numbers starts with that number: 7


Unnamed: 0,S.No.,Registration Id,Name,Email,Mobile No
58,59,AP19110010212,Tiyyagura Durga Prasad Reddy,tdurgaprasad2002@gmail.com,7981821182
0,1,AP19110010001,Tata Lakshmi Durga Likhitha,tatalikhitha01@gmail.com,7287982736
64,65,AP19110010229,R. Hasitha Sree,rhasithasree123@gmail.com,7337050642
37,38,AP19110010126,R . Uday Kiran,udaykiranramaraju@gmail.com,7780273060
94,95,AP19110010346,Pushpa Latha,pushpalathaavvaru@gmail.com,7995694959
44,45,AP19110010160,Mullapudi Venkat,Venkatmullapudi14300@gmail.com,7032567895
72,73,AP19110010249,Mohit Kumar,mohitkumar.projects@gmail.com,7281809313
140,141,AP19110020052,Mahesh,mahesh2001.sis@gmail.com,7989651824
135,136,AP19110010553,Mahalakshmi Damarla,mahalakshmidamarla6@gmail.com,7901097444
23,24,AP19110010080,M.Tarun,tarunchowdary001@gmail.com,7337433099
