# pandas

### pandas is a Python package providing fast, flexible, and expressive data structures designed to make working with “relational” or “labeled” data both easy and intuitive. 

#### pandas is well suited for many different kinds of data:

1. Tabular data with heterogeneously-typed columns, as in an SQL table or Excel spreadsheet.
2. Ordered and unordered (not necessarily fixed-frequency) time series data.
3. Arbitrary matrix data (homogeneously typed or heterogeneous) with row and column labels.
4. Any other form of observational / statistical data sets. The data need not be labeled at all to be placed into a pandas data structure.


**The two primary data structures of pandas, Series (1-dimensional) and DataFrame (2-dimensional), handle the vast majority of typical use cases in finance, statistics, social science, and many areas of engineering.**


A DataFrame is a 2-dimensional data structure that can store data of different types (including characters, integers, floating point values, categorical data and more) in columns. It is similar to a spreadsheet, a SQL table or the data.frame in R.

[pandas documentation here](https://pandas.pydata.org/docs/getting_started/intro_tutorials/01_table_oriented.html)
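
As a minimal sketch of the two structures described above (the names `ages` and `people` and their values are made up for illustration), a Series is one labeled column of values, and a DataFrame is a table whose columns are each a Series:

```python
import pandas as pd

# A Series is a 1-dimensional labeled array (hypothetical values).
ages = pd.Series([22, 35, 58], name="Age")

# A DataFrame is a 2-dimensional table; columns may have different types.
people = pd.DataFrame({"Name": ["Ann", "Bob", "Cy"], "Age": [22, 35, 58]})

print(type(ages).__name__)            # Series
print(type(people).__name__)          # DataFrame
print(type(people["Age"]).__name__)   # Series: one column of the table
print(people.shape)                   # (3, 2): 3 rows, 2 columns
```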

In [1]:
# importing pandas lib as pd
import pandas as pd

In [2]:
# importing csv file 
df=pd.read_csv('services.csv')

In [3]:
# df is a data frame
type(df)

pandas.core.frame.DataFrame

In [4]:
# show the first 5 rows (head() returns 5 rows by default)
df.head()

Unnamed: 0,id,location_id,program_id,accepted_payments,alternate_name,application_process,audience,description,eligibility,email,...,interpretation_services,keywords,languages,name,required_documents,service_areas,status,wait_time,website,taxonomy_ids
0,1,1,,,,Walk in or apply by phone.,"Older adults age 55 or over, ethnic minorities...",A walk-in center for older adults that provide...,"Age 55 or over for most programs, age 60 or ov...",,...,,"ADULT PROTECTION AND CARE SERVICES, Meal Sites...",,Fair Oaks Adult Activity Center,,Colma,active,No wait.,,
1,2,2,,,,Apply by phone for an appointment.,Residents of San Mateo County age 55 or over,Provides training and job placement to eligibl...,"Age 55 or over, county resident and willing an...",,...,,"EMPLOYMENT/TRAINING SERVICES, Job Development,...",,Second Career Employment Program,,San Mateo County,active,Varies.,,
2,3,3,,,,Phone for information (403-4300 Ext. 4322).,Older adults age 55 or over who can benefit fr...,Offers supportive counseling services to San M...,Resident of San Mateo County age 55 or over,,...,,"Geriatric Counseling, Older Adults, Gay, Lesbi...",,Senior Peer Counseling,,San Mateo County,active,Varies.,,
3,4,4,,,,Apply by phone.,"Parents, children, families with problems of c...",Provides supervised visitation services and a ...,,,...,,"INDIVIDUAL AND FAMILY DEVELOPMENT SERVICES, Gr...",,Family Visitation Center,,San Mateo County,active,No wait.,,
4,5,5,,,,Phone for information.,Low-income working families with children tran...,Provides fixed 8% short term loans to eligible...,Eligibility: Low-income family with legal cust...,,...,,"COMMUNITY SERVICES, Speakers, Automobile Loans",,Economic Self-Sufficiency Program,,San Mateo County,active,,,


In [5]:
# to read last 5 data points
df.tail()

Unnamed: 0,id,location_id,program_id,accepted_payments,alternate_name,application_process,audience,description,eligibility,email,...,interpretation_services,keywords,languages,name,required_documents,service_areas,status,wait_time,website,taxonomy_ids
18,19,19,,,,Call for screening appointment (650-347-3648).,,Provides free medical and dental care to those...,Low-income person without access to health care,,...,,"HEALTH SERVICES, Outpatient Care, Community Cl...",,San Mateo Free Medical Clinic,,"Belmont, Burlingame",active,Varies.,,
19,20,20,,,,Walk in.,,no unrequired fields for this service,,,...,,,,Service with blank fields,,,defunct,,,
20,21,21,,,,By phone during business hours.,,just a test service,,,...,,,,Service for Admin Test Location,,San Mateo County,inactive,,,
21,22,22,,"Cash, Check, Credit Card",Fotos para pasaportes,Walk in or apply by phone or mail,"Profit and nonprofit businesses, the public, m...",[NOTE THIS IS NOT A REAL SERVICE--THIS IS FOR ...,,passports@example.org,...,We offer 3-way interpretation services over th...,"Salud, Medicina",Spanish,Passport Photos,Government-issued picture identification,"Alameda County, San Mateo County",active,No wait to 2 weeks.,http://www.example.com,"105, 108, 108-05, 108-05-01, 111, 111-05"
22,23,22,,,,Walk in or apply by phone or mail,"Second service and nonprofit businesses, the p...",[NOTE THIS IS NOT A REAL ORGANIZATION--THIS IS...,,,...,,"Ruby on Rails/Postgres/Redis, testing, wic",,Example Service Name,,"San Mateo County, Alameda County",active,No wait to 2 weeks,http://www.example.com,
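
Both `head()` and `tail()` accept a row count. The sketch below uses a small made-up frame (`toy`, standing in for the services data) so it runs without the CSV file:

```python
import pandas as pd

# Hypothetical 10-row frame used only for illustration.
toy = pd.DataFrame({"id": range(1, 11)})

first_three = toy.head(3)    # first 3 rows
last_two = toy.tail(2)       # last 2 rows
default_head = toy.head()    # no argument: 5 rows

print(first_three["id"].tolist())   # [1, 2, 3]
print(last_two["id"].tolist())      # [9, 10]
print(len(default_head))            # 5
```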


In [6]:
# df.columns lists the column labels; each column itself is a Series
df.columns

Index(['id', 'location_id', 'program_id', 'accepted_payments',
       'alternate_name', 'application_process', 'audience', 'description',
       'eligibility', 'email', 'fees', 'funding_sources',
       'interpretation_services', 'keywords', 'languages', 'name',
       'required_documents', 'service_areas', 'status', 'wait_time', 'website',
       'taxonomy_ids'],
      dtype='object')

In [7]:
# df.columns is an Index of column labels, not a Series
type(df.columns)

pandas.core.indexes.base.Index

In [8]:
# the column labels can also be converted to a plain Python list and stored if needed
list(df.columns)

['id',
 'location_id',
 'program_id',
 'accepted_payments',
 'alternate_name',
 'application_process',
 'audience',
 'description',
 'eligibility',
 'email',
 'fees',
 'funding_sources',
 'interpretation_services',
 'keywords',
 'languages',
 'name',
 'required_documents',
 'service_areas',
 'status',
 'wait_time',
 'website',
 'taxonomy_ids']

In [9]:
# to find the data type (dtype) of each column
df.dtypes

id                           int64
location_id                  int64
program_id                 float64
accepted_payments           object
alternate_name              object
application_process         object
audience                    object
description                 object
eligibility                 object
email                       object
fees                        object
funding_sources             object
interpretation_services     object
keywords                    object
languages                   object
name                        object
required_documents          object
service_areas               object
status                      object
wait_time                   object
website                     object
taxonomy_ids                object
dtype: object
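
In the output above, `program_id` is `float64` even though IDs are whole numbers. A likely reason is that the column contains missing values, which the default integer dtype cannot represent. The sketch below reproduces the effect with made-up CSV text (`csv_text` is an assumption, not the real file) and shows the nullable `Int64` dtype as one possible alternative:

```python
import io
import pandas as pd

# Hypothetical CSV text: program_id is blank in the first two rows.
csv_text = "id,program_id\n1,\n2,\n3,7\n"
sample = pd.read_csv(io.StringIO(csv_text))

# id has no gaps and stays int64; program_id has gaps and becomes float64.
print(sample.dtypes)

# The nullable "Int64" extension dtype keeps integers and marks gaps as <NA>.
nullable = sample["program_id"].astype("Int64")
print(nullable)
```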

In [10]:
# to access a single column
# note that a Series carries an index (0, 1, 2, ...) alongside its values, unlike a plain list
df['location_id']

0      1
1      2
2      3
3      4
4      5
5      6
6      7
7      8
8      9
9     10
10    11
11    12
12    13
13    14
14    15
15    16
16    17
17    18
18    19
19    20
20    21
21    22
22    22
Name: location_id, dtype: int64

In [11]:
# the same index appears for a text column; missing values are shown as NaN
df['audience']

0     Older adults age 55 or over, ethnic minorities...
1          Residents of San Mateo County age 55 or over
2     Older adults age 55 or over who can benefit fr...
3     Parents, children, families with problems of c...
4     Low-income working families with children tran...
5                                               Any age
6     Older adults who have memory or sensory loss, ...
7     Senior citizens age 60 or over, disabled indiv...
8        Ethnic minorities, especially Spanish speaking
9                                                   NaN
10                                                  NaN
11    Adults, parents, children in 1st-12th grades i...
12                                                  NaN
13    Individuals or families with low or no income ...
14    Adult alcoholic/drug addictive men and women w...
15                                                  NaN
16                                                  NaN
17                                              

In [12]:
print(type(df['location_id']))

<class 'pandas.core.series.Series'>


In [13]:
print(type(df['audience']))

<class 'pandas.core.series.Series'>


In [14]:
# how to retrieve multiple columns
# df['location_id','audience']
# the statement above raises a KeyError because the two names are treated as one (tuple) label;
# pass a list of column names instead, which returns a DataFrame
df[['location_id','audience']]

Unnamed: 0,location_id,audience
0,1,"Older adults age 55 or over, ethnic minorities..."
1,2,Residents of San Mateo County age 55 or over
2,3,Older adults age 55 or over who can benefit fr...
3,4,"Parents, children, families with problems of c..."
4,5,Low-income working families with children tran...
5,6,Any age
6,7,"Older adults who have memory or sensory loss, ..."
7,8,"Senior citizens age 60 or over, disabled indiv..."
8,9,"Ethnic minorities, especially Spanish speaking"
9,10,


In [15]:
# selecting with a list of column names returns a DataFrame, not a Series
type(df[['location_id','audience']])

pandas.core.frame.DataFrame
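
The difference between single and double brackets can be checked on a small made-up frame (`toy` is hypothetical; only the column names are borrowed from the services data):

```python
import pandas as pd

toy = pd.DataFrame({"location_id": [1, 2], "audience": ["Any age", None]})

one_col = toy["location_id"]                  # single label -> Series
two_cols = toy[["location_id", "audience"]]   # list of labels -> DataFrame
still_frame = toy[["location_id"]]            # one-item list -> DataFrame

# Two names without the inner list are read as a single tuple label.
raised = False
try:
    toy["location_id", "audience"]
except KeyError as err:
    raised = True
    print("KeyError:", err)
```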

In [16]:
df.columns

Index(['id', 'location_id', 'program_id', 'accepted_payments',
       'alternate_name', 'application_process', 'audience', 'description',
       'eligibility', 'email', 'fees', 'funding_sources',
       'interpretation_services', 'keywords', 'languages', 'name',
       'required_documents', 'service_areas', 'status', 'wait_time', 'website',
       'taxonomy_ids'],
      dtype='object')

In [17]:
df[['location_id','accepted_payments','fees','status','funding_sources']]

Unnamed: 0,location_id,accepted_payments,fees,status,funding_sources
0,1,,$2.50 suggested donation for lunch for age 60 ...,active,"City, County, Donations, Fees, Fundraising"
1,2,,None. Donations requested of clients who can a...,active,"County, Federal, State"
2,3,,None.,active,"County, Donations, Grants"
3,4,,"Vary according to income ($5-$90). Cash, check...",active,"County, Donations, Grants"
4,5,,$60 application fee. Cash or checks accepted.,active,"County, Grants, State"
5,6,,$55 per year membership dues. Classes have fee...,active,"Fees, Fundraising, Grants, Dues"
6,7,,$85 per day. Vary according to income for thos...,active,"Donations, Fees, Grants"
7,8,,Suggested donation of $4.25 per meal for senio...,active,"County, Donations"
8,9,,None.,active,"City, County"
9,10,,None.,active,City


In [18]:
# loading another dataset
df1=pd.read_excel('LUSID Excel - Setting up your market data.xlsx')

In [19]:
df1.head()

Unnamed: 0.1,Unnamed: 0,Unnamed: 1,Unnamed: 2,Unnamed: 3,Unnamed: 4,Unnamed: 5,Unnamed: 6,Unnamed: 7,Unnamed: 8,Unnamed: 9
0,,,,,,,,,,
1,,,,Datetimes in LUSID,,,,,,
2,,,,,,,,,,
3,,,,This sheet allows you to format datetimes for...,,,,,,
4,,,,If you have any questions please visit our:,,,,,,


In [20]:
df1.columns

Index(['Unnamed: 0', 'Unnamed: 1', 'Unnamed: 2', 'Unnamed: 3', 'Unnamed: 4',
       'Unnamed: 5', 'Unnamed: 6', 'Unnamed: 7', 'Unnamed: 8', 'Unnamed: 9'],
      dtype='object')

In [21]:
# dtypes works on this DataFrame as well
df1.dtypes

Unnamed: 0    float64
Unnamed: 1    float64
Unnamed: 2    float64
Unnamed: 3     object
Unnamed: 4     object
Unnamed: 5     object
Unnamed: 6    float64
Unnamed: 7     object
Unnamed: 8     object
Unnamed: 9     object
dtype: object

In [22]:
# loading csv data from web
df2 =pd.read_csv("https://raw.githubusercontent.com/datasciencedojo/datasets/master/titanic.csv")

In [23]:
# this is the Titanic dataset, loaded above from the datasciencedojo GitHub repository
df2.head()

Unnamed: 0,PassengerId,Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked
0,1,0,3,"Braund, Mr. Owen Harris",male,22.0,1,0,A/5 21171,7.25,,S
1,2,1,1,"Cumings, Mrs. John Bradley (Florence Briggs Th...",female,38.0,1,0,PC 17599,71.2833,C85,C
2,3,1,3,"Heikkinen, Miss. Laina",female,26.0,0,0,STON/O2. 3101282,7.925,,S
3,4,1,1,"Futrelle, Mrs. Jacques Heath (Lily May Peel)",female,35.0,1,0,113803,53.1,C123,S
4,5,0,3,"Allen, Mr. William Henry",male,35.0,0,0,373450,8.05,,S


In [24]:
df2.dtypes

PassengerId      int64
Survived         int64
Pclass           int64
Name            object
Sex             object
Age            float64
SibSp            int64
Parch            int64
Ticket          object
Fare           float64
Cabin           object
Embarked        object
dtype: object

In [25]:
df2[['PassengerId','Name','Sex','Fare','Survived']]

Unnamed: 0,PassengerId,Name,Sex,Fare,Survived
0,1,"Braund, Mr. Owen Harris",male,7.2500,0
1,2,"Cumings, Mrs. John Bradley (Florence Briggs Th...",female,71.2833,1
2,3,"Heikkinen, Miss. Laina",female,7.9250,1
3,4,"Futrelle, Mrs. Jacques Heath (Lily May Peel)",female,53.1000,1
4,5,"Allen, Mr. William Henry",male,8.0500,0
...,...,...,...,...,...
886,887,"Montvila, Rev. Juozas",male,13.0000,0
887,888,"Graham, Miss. Margaret Edith",female,30.0000,1
888,889,"Johnston, Miss. Catherine Helen ""Carrie""",female,23.4500,0
889,890,"Behr, Mr. Karl Howell",male,30.0000,1


In [26]:
df2_dummy=df2[['PassengerId','Name','Sex','Fare','Survived']]

In [27]:
type(df2_dummy)

pandas.core.frame.DataFrame

In [28]:
df2_dummy.head()

Unnamed: 0,PassengerId,Name,Sex,Fare,Survived
0,1,"Braund, Mr. Owen Harris",male,7.25,0
1,2,"Cumings, Mrs. John Bradley (Florence Briggs Th...",female,71.2833,1
2,3,"Heikkinen, Miss. Laina",female,7.925,1
3,4,"Futrelle, Mrs. Jacques Heath (Lily May Peel)",female,53.1,1
4,5,"Allen, Mr. William Henry",male,8.05,0


In [29]:
# to save a dataframe
# it will try to save indexes too (by default), to avoid it use 'index=False'
df2_dummy.to_csv('titanic_subframe.csv',index=False)

In [30]:
# by default the first row is treated as the column headers
# to read it as ordinary data instead, pass header=None
df2_dummy2=pd.read_csv('titanic_subframe.csv', header=None)
# the columns are then numbered 0, 1, 2, ... by default

In [31]:
df2_dummy2.head()

Unnamed: 0,0,1,2,3,4
0,PassengerId,Name,Sex,Fare,Survived
1,1,"Braund, Mr. Owen Harris",male,7.25,0
2,2,"Cumings, Mrs. John Bradley (Florence Briggs Th...",female,71.2833,1
3,3,"Heikkinen, Miss. Laina",female,7.925,1
4,4,"Futrelle, Mrs. Jacques Heath (Lily May Peel)",female,53.1,1


In [32]:
df2_dummy2.columns

Int64Index([0, 1, 2, 3, 4], dtype='int64')
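
The effect of `index=False` and `header=None` can be seen without touching the disk. This is only a sketch: `toy` is a made-up frame, and `to_csv()` is called without a path so that it returns the CSV text as a string:

```python
import io
import pandas as pd

toy = pd.DataFrame({"PassengerId": [1, 2], "Fare": [7.25, 71.2833]})

with_index = toy.to_csv()                 # index written as an unnamed first column
without_index = toy.to_csv(index=False)   # index left out

print(with_index.splitlines()[0])      # ,PassengerId,Fare
print(without_index.splitlines()[0])   # PassengerId,Fare

# header=None: the header line is read as a data row, columns are numbered.
raw = pd.read_csv(io.StringIO(without_index), header=None)
# Default: the first line becomes the column names.
parsed = pd.read_csv(io.StringIO(without_index))

print(list(raw.columns), raw.shape)        # [0, 1] (3, 2)
print(list(parsed.columns), parsed.shape)  # ['PassengerId', 'Fare'] (2, 2)
```

The leading comma in the first header line is the source of the `Unnamed: 0` column that appears when a file saved with its index is read back.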

In [33]:
# read_html reads only the tables (<table> elements) from a web page or HTML file

In [34]:
pip install lxml

Note: you may need to restart the kernel to use updated packages.


In [35]:
import lxml
import pandas as pd
url_data=pd.read_html("https://www.moneycontrol.com/markets/indian-indices/")

In [36]:
type(url_data)

list

In [37]:
len(url_data)

6

In [38]:
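# only the tables at indexes 2 and 3 contain usable data
df3_1=url_data[2]
df3_2=url_data[3]
# note: `+` adds the two frames element-wise (strings are joined, numbers are summed),
# which is why names appear fused in the output; pd.concat([df3_1, df3_2]) would stack the rows instead
df3 = df3_1 + df3_2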

In [39]:
df3

Unnamed: 0,Stock Name,Sector,LTP,Change,%Chg
0,Tata MotorsONGC,Auto - LCVs & HCVsOil Drilling And Exploration,1266.55,-8.45,0.78
1,Bajaj FinanceCoal India,Finance - NBFCMining & Minerals,7654.7,-96.85,0.07
2,Kotak MahindraWipro,Banks - Private SectorComputers - Software,2317.15,-18.3,-0.29
3,HDFC LifePower Grid Corp,Life & Health InsurancePower - Generation & Di...,926.3,-4.05,-0.25
4,Shriram FinanceGrasim,Finance - Leasing & Hire PurchaseDiversified,5647.1,-3.55,-0.06
5,Axis BankInfosys,Banks - Private SectorComputers - Software,2852.35,-0.35,-0.2
6,Bajaj AutoHDFC Bank,Auto - 2 & 3 WheelersBanks - Private Sector,11164.0,-73.6,-0.38
7,SBI Life InsuraUltraTechCement,Life & Health InsuranceCement - Major,13449.3,42.8,-0.38
8,Dr Reddys LabsTCS,PharmaceuticalsComputers - Software,10296.3,-35.6,-0.39
9,ICICI BankLarsen,Banks - Private SectorInfrastructure - General,4744.9,6.5,-0.33
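
The fused names in the output above ("Tata MotorsONGC") come from adding two frames with `+`, which works element by element. A minimal sketch with made-up stand-ins for the two tables (`left` and `right` and their values are hypothetical) contrasts that with `pd.concat`, which appends rows:

```python
import pandas as pd

# Hypothetical stand-ins for url_data[2] and url_data[3].
left = pd.DataFrame({"Stock Name": ["Tata Motors"], "LTP": [100.0]})
right = pd.DataFrame({"Stock Name": ["ONGC"], "LTP": [250.5]})

# `+` aligns on labels and adds cell by cell: strings join, numbers sum.
added = left + right

# concat stacks the rows; ignore_index=True renumbers them 0, 1, ...
stacked = pd.concat([left, right], ignore_index=True)

print(added)
print(stacked)
```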


In [40]:
# the object dtype usually means the column holds strings (or mixed Python objects)
df3.dtypes

Stock Name     object
Sector         object
LTP           float64
Change        float64
%Chg          float64
dtype: object

In [41]:
df3.to_csv('Stock_M_Data.csv',index=False)

In [42]:
# adding data manually
# To manually store data in a table, create a DataFrame. When using a Python dictionary of lists,
# the dictionary keys will be used as column headers and the values in each list as columns of the DataFrame.

import pandas as pd
df4 = pd.DataFrame(
        {
            "Name": [
                "Braund, Mr. Owen Harris",
                "Allen, Mr. William Henry",
                "Bonnell, Miss. Elizabeth",] ,
            "Age": [22, 35, 58],
            "Sex": ["male", "male", "female"],
        }
)

In [43]:
df4

Unnamed: 0,Name,Age,Sex
0,"Braund, Mr. Owen Harris",22,male
1,"Allen, Mr. William Henry",35,male
2,"Bonnell, Miss. Elizabeth",58,female


In [44]:
# import titanic from github web page
df5 = pd.read_csv("https://raw.githubusercontent.com/datasciencedojo/datasets/master/titanic.csv")

In [45]:
df5.head()

Unnamed: 0,PassengerId,Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked
0,1,0,3,"Braund, Mr. Owen Harris",male,22.0,1,0,A/5 21171,7.25,,S
1,2,1,1,"Cumings, Mrs. John Bradley (Florence Briggs Th...",female,38.0,1,0,PC 17599,71.2833,C85,C
2,3,1,3,"Heikkinen, Miss. Laina",female,26.0,0,0,STON/O2. 3101282,7.925,,S
3,4,1,1,"Futrelle, Mrs. Jacques Heath (Lily May Peel)",female,35.0,1,0,113803,53.1,C123,S
4,5,0,3,"Allen, Mr. William Henry",male,35.0,0,0,373450,8.05,,S


In [46]:
# The describe() method returns summary statistics for the data in the DataFrame.
# note that by default it summarizes only the numeric columns
df5.describe()

Unnamed: 0,PassengerId,Survived,Pclass,Age,SibSp,Parch,Fare
count,891.0,891.0,891.0,714.0,891.0,891.0,891.0
mean,446.0,0.383838,2.308642,29.699118,0.523008,0.381594,32.204208
std,257.353842,0.486592,0.836071,14.526497,1.102743,0.806057,49.693429
min,1.0,0.0,1.0,0.42,0.0,0.0,0.0
25%,223.5,0.0,2.0,20.125,0.0,0.0,7.9104
50%,446.0,0.0,3.0,28.0,0.0,0.0,14.4542
75%,668.5,1.0,3.0,38.0,1.0,0.0,31.0
max,891.0,1.0,3.0,80.0,8.0,6.0,512.3292
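
`describe()` skips non-numeric columns only by default. As a sketch on a small made-up frame (`toy` and its values are hypothetical), passing `include="all"` summarizes every column; statistics that do not apply to a column are shown as NaN:

```python
import pandas as pd

toy = pd.DataFrame({
    "Sex": ["male", "female", "male", "male"],
    "Fare": [7.25, 71.2833, 8.05, 13.0],
})

numeric_summary = toy.describe()             # numeric columns only
full_summary = toy.describe(include="all")   # every column

print(list(numeric_summary.columns))   # ['Fare']
print(full_summary)
```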


In [47]:
df5.dtypes

PassengerId      int64
Survived         int64
Pclass           int64
Name            object
Sex             object
Age            float64
SibSp            int64
Parch            int64
Ticket          object
Fare           float64
Cabin           object
Embarked        object
dtype: object

In [48]:
df5[['Name','Sex','Ticket','Cabin','Embarked']]

Unnamed: 0,Name,Sex,Ticket,Cabin,Embarked
0,"Braund, Mr. Owen Harris",male,A/5 21171,,S
1,"Cumings, Mrs. John Bradley (Florence Briggs Th...",female,PC 17599,C85,C
2,"Heikkinen, Miss. Laina",female,STON/O2. 3101282,,S
3,"Futrelle, Mrs. Jacques Heath (Lily May Peel)",female,113803,C123,S
4,"Allen, Mr. William Henry",male,373450,,S
...,...,...,...,...,...
886,"Montvila, Rev. Juozas",male,211536,,S
887,"Graham, Miss. Margaret Edith",female,112053,B42,S
888,"Johnston, Miss. Catherine Helen ""Carrie""",female,W./C. 6607,,S
889,"Behr, Mr. Karl Howell",male,111369,C148,C


In [49]:
df5.dtypes=="object"

PassengerId    False
Survived       False
Pclass         False
Name            True
Sex             True
Age            False
SibSp          False
Parch          False
Ticket          True
Fare           False
Cabin           True
Embarked        True
dtype: bool

In [50]:
df5.dtypes[df5.dtypes=="object"]

Name        object
Sex         object
Ticket      object
Cabin       object
Embarked    object
dtype: object

In [51]:
# note that this returns an Index of column names (it can be turned into a list with list())
df5.dtypes[df5.dtypes=='object'].index

Index(['Name', 'Sex', 'Ticket', 'Cabin', 'Embarked'], dtype='object')

In [52]:
# to obtain the dataframe with only 'object' dtype columns
df5[df5.dtypes[df5.dtypes=='object'].index]

Unnamed: 0,Name,Sex,Ticket,Cabin,Embarked
0,"Braund, Mr. Owen Harris",male,A/5 21171,,S
1,"Cumings, Mrs. John Bradley (Florence Briggs Th...",female,PC 17599,C85,C
2,"Heikkinen, Miss. Laina",female,STON/O2. 3101282,,S
3,"Futrelle, Mrs. Jacques Heath (Lily May Peel)",female,113803,C123,S
4,"Allen, Mr. William Henry",male,373450,,S
...,...,...,...,...,...
886,"Montvila, Rev. Juozas",male,211536,,S
887,"Graham, Miss. Margaret Edith",female,112053,B42,S
888,"Johnston, Miss. Catherine Helen ""Carrie""",female,W./C. 6607,,S
889,"Behr, Mr. Karl Howell",male,111369,C148,C


In [53]:
# to get a description of the above dataframe
df5[df5.dtypes[df5.dtypes=='object'].index].describe()

# from this description we can read off basic facts, such as
# the column 'Sex' having only two unique values

Unnamed: 0,Name,Sex,Ticket,Cabin,Embarked
count,891,891,891,204,889
unique,891,2,681,147,3
top,"Braund, Mr. Owen Harris",male,347082,B96 B98,S
freq,1,577,7,4,644


In [54]:
# the same expression also works for the 'int64' dtype
df5[df5.dtypes[df5.dtypes=='int64'].index]

Unnamed: 0,PassengerId,Survived,Pclass,SibSp,Parch
0,1,0,3,1,0
1,2,1,1,1,0
2,3,1,3,0,0
3,4,1,1,1,0
4,5,0,3,0,0
...,...,...,...,...,...
886,887,0,2,0,0
887,888,1,1,0,0
888,889,0,3,1,2
889,890,1,1,0,0


In [55]:
df5[df5.dtypes[df5.dtypes=='float'].index]

Unnamed: 0,Age,Fare
0,22.0,7.2500
1,38.0,71.2833
2,26.0,7.9250
3,35.0,53.1000
4,35.0,8.0500
...,...,...
886,27.0,13.0000
887,19.0,30.0000
888,,23.4500
889,26.0,30.0000


In [56]:
# obtaining records (like slicing)

In [57]:
df5.columns

Index(['PassengerId', 'Survived', 'Pclass', 'Name', 'Sex', 'Age', 'SibSp',
       'Parch', 'Ticket', 'Fare', 'Cabin', 'Embarked'],
      dtype='object')

In [58]:
# to obtain records 4 to 10 (the slice end, 11, is excluded) of columns 'Name' and 'Fare'
df5[['Name','Fare']][4:11]

Unnamed: 0,Name,Fare
4,"Allen, Mr. William Henry",8.05
5,"Moran, Mr. James",8.4583
6,"McCarthy, Mr. Timothy J",51.8625
7,"Palsson, Master. Gosta Leonard",21.075
8,"Johnson, Mrs. Oscar W (Elisabeth Vilhelmina Berg)",11.1333
9,"Nasser, Mrs. Nicholas (Adele Achem)",30.0708
10,"Sandstrom, Miss. Marguerite Rut",16.7


In [59]:
# to obtain even number records till 20 of columns 'Name' and 'Fare'
df5[['Name','Fare']][0:21:2]

Unnamed: 0,Name,Fare
0,"Braund, Mr. Owen Harris",7.25
2,"Heikkinen, Miss. Laina",7.925
4,"Allen, Mr. William Henry",8.05
6,"McCarthy, Mr. Timothy J",51.8625
8,"Johnson, Mrs. Oscar W (Elisabeth Vilhelmina Berg)",11.1333
10,"Sandstrom, Miss. Marguerite Rut",16.7
12,"Saundercock, Mr. William Henry",8.05
14,"Vestrom, Miss. Hulda Amanda Adolfina",7.8542
16,"Rice, Master. Eugene",29.125
18,"Vander Planke, Mrs. Julius (Emelia Maria Vande...",18.0
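
Row slicing can be made explicit with `iloc` (by position) and `loc` (by label). The two differ in whether the end of the slice is included, which this sketch on a made-up frame (`toy` is hypothetical) illustrates:

```python
import pandas as pd

toy = pd.DataFrame({"Name": list("abcdefgh"), "Fare": range(8)})

by_position = toy.iloc[2:5]                 # positions 2, 3, 4 (end excluded)
by_label = toy.loc[2:5, ["Name", "Fare"]]   # labels 2 to 5 (end included)
every_other = toy.iloc[0:7:2]               # positions 0, 2, 4, 6

print(by_position.index.tolist())   # [2, 3, 4]
print(by_label.index.tolist())      # [2, 3, 4, 5]
print(every_other.index.tolist())   # [0, 2, 4, 6]
```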


In [60]:
df5.head()

Unnamed: 0,PassengerId,Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked
0,1,0,3,"Braund, Mr. Owen Harris",male,22.0,1,0,A/5 21171,7.25,,S
1,2,1,1,"Cumings, Mrs. John Bradley (Florence Briggs Th...",female,38.0,1,0,PC 17599,71.2833,C85,C
2,3,1,3,"Heikkinen, Miss. Laina",female,26.0,0,0,STON/O2. 3101282,7.925,,S
3,4,1,1,"Futrelle, Mrs. Jacques Heath (Lily May Peel)",female,35.0,1,0,113803,53.1,C123,S
4,5,0,3,"Allen, Mr. William Henry",male,35.0,0,0,373450,8.05,,S


In [61]:
# to add a new column and assign '0' to every value
df5['New_Col']=0
df5.head()

Unnamed: 0,PassengerId,Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked,New_Col
0,1,0,3,"Braund, Mr. Owen Harris",male,22.0,1,0,A/5 21171,7.25,,S,0
1,2,1,1,"Cumings, Mrs. John Bradley (Florence Briggs Th...",female,38.0,1,0,PC 17599,71.2833,C85,C,0
2,3,1,3,"Heikkinen, Miss. Laina",female,26.0,0,0,STON/O2. 3101282,7.925,,S,0
3,4,1,1,"Futrelle, Mrs. Jacques Heath (Lily May Peel)",female,35.0,1,0,113803,53.1,C123,S,0
4,5,0,3,"Allen, Mr. William Henry",male,35.0,0,0,373450,8.05,,S,0


In [62]:
# to add a new column as a combination of two columns
df5['New_Col_2']= df5['PassengerId']+df5['Pclass']

In [63]:
df5.head()

Unnamed: 0,PassengerId,Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked,New_Col,New_Col_2
0,1,0,3,"Braund, Mr. Owen Harris",male,22.0,1,0,A/5 21171,7.25,,S,0,4
1,2,1,1,"Cumings, Mrs. John Bradley (Florence Briggs Th...",female,38.0,1,0,PC 17599,71.2833,C85,C,0,3
2,3,1,3,"Heikkinen, Miss. Laina",female,26.0,0,0,STON/O2. 3101282,7.925,,S,0,6
3,4,1,1,"Futrelle, Mrs. Jacques Heath (Lily May Peel)",female,35.0,1,0,113803,53.1,C123,S,0,5
4,5,0,3,"Allen, Mr. William Henry",male,35.0,0,0,373450,8.05,,S,0,8


In [64]:
# deleting a column or row
# The drop() method takes an argument axis, which can be 0 or 1.
# If axis is 0, a row is deleted. If axis is 1, a column is deleted.

# df.drop(0, axis=0)
# This would return a new dataframe without the row labeled 0 (the first row here).
# drop() does not modify the original, so assign the result to keep the change.

df5.drop('New_Col',axis=1)

Unnamed: 0,PassengerId,Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked,New_Col_2
0,1,0,3,"Braund, Mr. Owen Harris",male,22.0,1,0,A/5 21171,7.2500,,S,4
1,2,1,1,"Cumings, Mrs. John Bradley (Florence Briggs Th...",female,38.0,1,0,PC 17599,71.2833,C85,C,3
2,3,1,3,"Heikkinen, Miss. Laina",female,26.0,0,0,STON/O2. 3101282,7.9250,,S,6
3,4,1,1,"Futrelle, Mrs. Jacques Heath (Lily May Peel)",female,35.0,1,0,113803,53.1000,C123,S,5
4,5,0,3,"Allen, Mr. William Henry",male,35.0,0,0,373450,8.0500,,S,8
...,...,...,...,...,...,...,...,...,...,...,...,...,...
886,887,0,2,"Montvila, Rev. Juozas",male,27.0,0,0,211536,13.0000,,S,889
887,888,1,1,"Graham, Miss. Margaret Edith",female,19.0,0,0,112053,30.0000,B42,S,889
888,889,0,3,"Johnston, Miss. Catherine Helen ""Carrie""",female,,1,2,W./C. 6607,23.4500,,S,892
889,890,1,1,"Behr, Mr. Karl Howell",male,26.0,0,0,111369,30.0000,C148,C,891


In [65]:
df5=df5.drop('New_Col',axis=1)

In [66]:
df5

Unnamed: 0,PassengerId,Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked,New_Col_2
0,1,0,3,"Braund, Mr. Owen Harris",male,22.0,1,0,A/5 21171,7.2500,,S,4
1,2,1,1,"Cumings, Mrs. John Bradley (Florence Briggs Th...",female,38.0,1,0,PC 17599,71.2833,C85,C,3
2,3,1,3,"Heikkinen, Miss. Laina",female,26.0,0,0,STON/O2. 3101282,7.9250,,S,6
3,4,1,1,"Futrelle, Mrs. Jacques Heath (Lily May Peel)",female,35.0,1,0,113803,53.1000,C123,S,5
4,5,0,3,"Allen, Mr. William Henry",male,35.0,0,0,373450,8.0500,,S,8
...,...,...,...,...,...,...,...,...,...,...,...,...,...
886,887,0,2,"Montvila, Rev. Juozas",male,27.0,0,0,211536,13.0000,,S,889
887,888,1,1,"Graham, Miss. Margaret Edith",female,19.0,0,0,112053,30.0000,B42,S,889
888,889,0,3,"Johnston, Miss. Catherine Helen ""Carrie""",female,,1,2,W./C. 6607,23.4500,,S,892
889,890,1,1,"Behr, Mr. Karl Howell",male,26.0,0,0,111369,30.0000,C148,C,891
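
As a short sketch of the same behavior on a made-up frame (`toy` is hypothetical), `drop` also accepts the `columns=` and `index=` keywords, which avoid having to remember which axis number is which. The original frame is left untouched in both cases:

```python
import pandas as pd

toy = pd.DataFrame({"a": [1, 2], "b": [3, 4], "c": [5, 6]})

without_b = toy.drop(columns="b")   # same as toy.drop("b", axis=1)
without_row0 = toy.drop(index=0)    # same as toy.drop(0, axis=0)

print(list(toy.columns))            # ['a', 'b', 'c'] (original unchanged)
print(list(without_b.columns))      # ['a', 'c']
print(without_row0.index.tolist())  # [1]
```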


In [67]:
# Categorical()
# pd.Categorical converts the given input into a categorical data type.
# The categorical dtype is a pandas-specific type that is useful for variables with a fixed, limited set of possible values.

# Why Use Categorical Data Types?
# > Memory efficiency: categorical data can use less memory than the object dtype, especially when there are many repeated values.
# > Performance: some operations on categorical data can be faster than on plain object data.
# > Analytical clarity: categorical types make it clear that the data should be treated as a set of discrete values (categories),
#   not as numerical or free-text data.

In [68]:
df.head()

Unnamed: 0,id,location_id,program_id,accepted_payments,alternate_name,application_process,audience,description,eligibility,email,...,interpretation_services,keywords,languages,name,required_documents,service_areas,status,wait_time,website,taxonomy_ids
0,1,1,,,,Walk in or apply by phone.,"Older adults age 55 or over, ethnic minorities...",A walk-in center for older adults that provide...,"Age 55 or over for most programs, age 60 or ov...",,...,,"ADULT PROTECTION AND CARE SERVICES, Meal Sites...",,Fair Oaks Adult Activity Center,,Colma,active,No wait.,,
1,2,2,,,,Apply by phone for an appointment.,Residents of San Mateo County age 55 or over,Provides training and job placement to eligibl...,"Age 55 or over, county resident and willing an...",,...,,"EMPLOYMENT/TRAINING SERVICES, Job Development,...",,Second Career Employment Program,,San Mateo County,active,Varies.,,
2,3,3,,,,Phone for information (403-4300 Ext. 4322).,Older adults age 55 or over who can benefit fr...,Offers supportive counseling services to San M...,Resident of San Mateo County age 55 or over,,...,,"Geriatric Counseling, Older Adults, Gay, Lesbi...",,Senior Peer Counseling,,San Mateo County,active,Varies.,,
3,4,4,,,,Apply by phone.,"Parents, children, families with problems of c...",Provides supervised visitation services and a ...,,,...,,"INDIVIDUAL AND FAMILY DEVELOPMENT SERVICES, Gr...",,Family Visitation Center,,San Mateo County,active,No wait.,,
4,5,5,,,,Phone for information.,Low-income working families with children tran...,Provides fixed 8% short term loans to eligible...,Eligibility: Low-income family with legal cust...,,...,,"COMMUNITY SERVICES, Speakers, Automobile Loans",,Economic Self-Sufficiency Program,,San Mateo County,active,,,


In [69]:
pd.Categorical(df5['Survived'])

[0, 1, 1, 1, 0, ..., 0, 1, 0, 1, 0]
Length: 891
Categories (2, int64): [0, 1]

In [70]:
df5_cat_sur=pd.Categorical(df5['Survived'])

In [71]:
df5_cat_sur

[0, 1, 1, 1, 0, ..., 0, 1, 0, 1, 0]
Length: 891
Categories (2, int64): [0, 1]

In [72]:
pd.Categorical(df5['Cabin'])

[NaN, 'C85', NaN, 'C123', NaN, ..., NaN, 'B42', NaN, 'C148', NaN]
Length: 891
Categories (147, object): ['A10', 'A14', 'A16', 'A19', ..., 'F38', 'F4', 'G6', 'T']
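
The memory claim can be checked directly. This is a rough sketch on made-up data (a column of 10,000 repeated strings, not the Titanic frame); the exact byte counts depend on the pandas version, so only the comparison matters:

```python
import pandas as pd

# Hypothetical column with only two distinct values, repeated many times.
sex = pd.Series(["male", "female"] * 5000, dtype="object")
sex_cat = sex.astype("category")

bytes_object = sex.memory_usage(deep=True)
bytes_category = sex_cat.memory_usage(deep=True)
print(bytes_object, bytes_category)

# Internally each value is stored as a small integer code into the categories.
print(sex_cat.cat.categories.tolist())    # ['female', 'male']
print(sex_cat.cat.codes.head().tolist())  # [1, 0, 1, 0, 1]
```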

In [73]:
df5['Cabin'].unique()

array([nan, 'C85', 'C123', 'E46', 'G6', 'C103', 'D56', 'A6',
       'C23 C25 C27', 'B78', 'D33', 'B30', 'C52', 'B28', 'C83', 'F33',
       'F G73', 'E31', 'A5', 'D10 D12', 'D26', 'C110', 'B58 B60', 'E101',
       'F E69', 'D47', 'B86', 'F2', 'C2', 'E33', 'B19', 'A7', 'C49', 'F4',
       'A32', 'B4', 'B80', 'A31', 'D36', 'D15', 'C93', 'C78', 'D35',
       'C87', 'B77', 'E67', 'B94', 'C125', 'C99', 'C118', 'D7', 'A19',
       'B49', 'D', 'C22 C26', 'C106', 'C65', 'E36', 'C54',
       'B57 B59 B63 B66', 'C7', 'E34', 'C32', 'B18', 'C124', 'C91', 'E40',
       'T', 'C128', 'D37', 'B35', 'E50', 'C82', 'B96 B98', 'E10', 'E44',
       'A34', 'C104', 'C111', 'C92', 'E38', 'D21', 'E12', 'E63', 'A14',
       'B37', 'C30', 'D20', 'B79', 'E25', 'D46', 'B73', 'C95', 'B38',
       'B39', 'B22', 'C86', 'C70', 'A16', 'C101', 'C68', 'A10', 'E68',
       'B41', 'A20', 'D19', 'D50', 'D9', 'A23', 'B50', 'A26', 'D48',
       'E58', 'C126', 'B71', 'B51 B53 B55', 'D49', 'B5', 'B20', 'F G63',
       'C62 C64',

In [74]:
# to find records of all adults
# i.e. the records which have Age > 18 (rows with a missing Age evaluate to False)
df5['Age']>18

0       True
1       True
2       True
3       True
4       True
       ...  
886     True
887     True
888    False
889     True
890     True
Name: Age, Length: 891, dtype: bool

In [75]:
df5[df5['Age']>18]

Unnamed: 0,PassengerId,Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked,New_Col_2
0,1,0,3,"Braund, Mr. Owen Harris",male,22.0,1,0,A/5 21171,7.2500,,S,4
1,2,1,1,"Cumings, Mrs. John Bradley (Florence Briggs Th...",female,38.0,1,0,PC 17599,71.2833,C85,C,3
2,3,1,3,"Heikkinen, Miss. Laina",female,26.0,0,0,STON/O2. 3101282,7.9250,,S,6
3,4,1,1,"Futrelle, Mrs. Jacques Heath (Lily May Peel)",female,35.0,1,0,113803,53.1000,C123,S,5
4,5,0,3,"Allen, Mr. William Henry",male,35.0,0,0,373450,8.0500,,S,8
...,...,...,...,...,...,...,...,...,...,...,...,...,...
885,886,0,3,"Rice, Mrs. William (Margaret Norton)",female,39.0,0,5,382652,29.1250,,Q,889
886,887,0,2,"Montvila, Rev. Juozas",male,27.0,0,0,211536,13.0000,,S,889
887,888,1,1,"Graham, Miss. Margaret Edith",female,19.0,0,0,112053,30.0000,B42,S,889
889,890,1,1,"Behr, Mr. Karl Howell",male,26.0,0,0,111369,30.0000,C148,C,891


In [76]:
# Number of passengers older than 18
len(df5[df5['Age']>18])

575

In [78]:
# Number of passengers NOT older than 18: includes age <= 18 and rows with a missing Age
len(df5) -  len(df5[df5['Age']>18])

316
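
A comparison against a missing value (`NaN`) is always `False`, so subtracting the `Age > 18` count from the total also counts passengers aged exactly 18 and passengers with no recorded age. The `describe()` output shows that only 714 of the 891 `Age` values are present. The sketch below uses a small hypothetical `Age` column (not the Titanic data) to show how the groups can be counted separately.

```python
import numpy as np
import pandas as pd

# Hypothetical ages standing in for the 'Age' column; NaN marks a missing age
ages = pd.DataFrame({'Age': [22.0, 38.0, 18.0, 4.0, np.nan, 17.0, np.nan]})

# True counts as 1, so sum() on a boolean mask counts the matching rows
older_than_18 = (ages['Age'] > 18).sum()
under_18 = (ages['Age'] < 18).sum()
exactly_18 = (ages['Age'] == 18).sum()
missing_age = ages['Age'].isna().sum()

# Subtracting from the total lumps three different groups together
not_older_than_18 = len(ages) - older_than_18

print(older_than_18, under_18, exactly_18, missing_age, not_older_than_18)
```

Counting `Age < 18` directly avoids mixing minors with passengers whose age is unknown.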

In [79]:
df5.describe()

Unnamed: 0,PassengerId,Survived,Pclass,Age,SibSp,Parch,Fare,New_Col_2
count,891.0,891.0,891.0,714.0,891.0,891.0,891.0,891.0
mean,446.0,0.383838,2.308642,29.699118,0.523008,0.381594,32.204208,448.308642
std,257.353842,0.486592,0.836071,14.526497,1.102743,0.806057,49.693429,257.325816
min,1.0,0.0,1.0,0.42,0.0,0.0,0.0,3.0
25%,223.5,0.0,2.0,20.125,0.0,0.0,7.9104,226.0
50%,446.0,0.0,3.0,28.0,0.0,0.0,14.4542,448.0
75%,668.5,1.0,3.0,38.0,1.0,0.0,31.0,671.0
max,891.0,1.0,3.0,80.0,8.0,6.0,512.3292,894.0


In [80]:
# records of passengers who paid more than the average fare (mean = 32.204208, taken from describe() above)
df5[df5['Fare']>32.204208]

Unnamed: 0,PassengerId,Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked,New_Col_2
1,2,1,1,"Cumings, Mrs. John Bradley (Florence Briggs Th...",female,38.0,1,0,PC 17599,71.2833,C85,C,3
3,4,1,1,"Futrelle, Mrs. Jacques Heath (Lily May Peel)",female,35.0,1,0,113803,53.1000,C123,S,5
6,7,0,1,"McCarthy, Mr. Timothy J",male,54.0,0,0,17463,51.8625,E46,S,8
23,24,1,1,"Sloper, Mr. William Thompson",male,28.0,0,0,113788,35.5000,A6,S,25
27,28,0,1,"Fortune, Mr. Charles Alexander",male,19.0,3,2,19950,263.0000,C23 C25 C27,S,29
...,...,...,...,...,...,...,...,...,...,...,...,...,...
856,857,1,1,"Wick, Mrs. George Dennick (Mary Hitchcock)",female,45.0,1,1,36928,164.8667,,S,858
863,864,0,3,"Sage, Miss. Dorothy Edith ""Dolly""",female,,8,2,CA. 2343,69.5500,,S,867
867,868,0,1,"Roebling, Mr. Washington Augustus II",male,31.0,0,0,PC 17590,50.4958,A24,S,869
871,872,1,1,"Beckwith, Mrs. Richard Leonard (Sallie Monypeny)",female,47.0,1,1,11751,52.5542,D35,S,873


In [81]:
df5[df5['Fare']>32.204208][['Name','Fare']]

Unnamed: 0,Name,Fare
1,"Cumings, Mrs. John Bradley (Florence Briggs Th...",71.2833
3,"Futrelle, Mrs. Jacques Heath (Lily May Peel)",53.1000
6,"McCarthy, Mr. Timothy J",51.8625
23,"Sloper, Mr. William Thompson",35.5000
27,"Fortune, Mr. Charles Alexander",263.0000
...,...,...
856,"Wick, Mrs. George Dennick (Mary Hitchcock)",164.8667
863,"Sage, Miss. Dorothy Edith ""Dolly""",69.5500
867,"Roebling, Mr. Washington Augustus II",50.4958
871,"Beckwith, Mrs. Richard Leonard (Sallie Monypeny)",52.5542
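
The average does not have to be copied by hand from `describe()`. A minimal sketch, assuming a small hypothetical frame with made-up names and fares, computes the mean with `Series.mean()` and selects rows and columns in a single `loc[]` call:

```python
import pandas as pd

# Hypothetical passengers; names and fares are made up for illustration
fares = pd.DataFrame({
    'Name': ['A', 'B', 'C', 'D'],
    'Fare': [10.0, 20.0, 30.0, 100.0],
})

avg_fare = fares['Fare'].mean()  # mean of the column, computed at run time

# Boolean mask for the rows, list of labels for the columns
above_avg = fares.loc[fares['Fare'] > avg_fare, ['Name', 'Fare']]
print(above_avg)
```

Computing the threshold at run time keeps the filter correct if the data changes.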


In [82]:
# to find the records of people who didn't pay any fare (=0)
df5[df5['Fare']==0]

Unnamed: 0,PassengerId,Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked,New_Col_2
179,180,0,3,"Leonard, Mr. Lionel",male,36.0,0,0,LINE,0.0,,S,183
263,264,0,1,"Harrison, Mr. William",male,40.0,0,0,112059,0.0,B94,S,265
271,272,1,3,"Tornquist, Mr. William Henry",male,25.0,0,0,LINE,0.0,,S,275
277,278,0,2,"Parkes, Mr. Francis ""Frank""",male,,0,0,239853,0.0,,S,280
302,303,0,3,"Johnson, Mr. William Cahoone Jr",male,19.0,0,0,LINE,0.0,,S,306
413,414,0,2,"Cunningham, Mr. Alfred Fleming",male,,0,0,239853,0.0,,S,416
466,467,0,2,"Campbell, Mr. William",male,,0,0,239853,0.0,,S,469
481,482,0,2,"Frost, Mr. Anthony Wood ""Archie""",male,,0,0,239854,0.0,,S,484
597,598,0,3,"Johnson, Mr. Alfred",male,49.0,0,0,LINE,0.0,,S,601
633,634,0,1,"Parr, Mr. William Henry Marsh",male,,0,0,112052,0.0,,S,635


In [83]:
# to find the number of people who didn't pay any fare (=0)
len(df5[df5['Fare']==0])

15

In [84]:
# to find the passengers in pclass=1
df5[df5['Pclass']==1][['Name','Pclass','Fare']]

Unnamed: 0,Name,Pclass,Fare
1,"Cumings, Mrs. John Bradley (Florence Briggs Th...",1,71.2833
3,"Futrelle, Mrs. Jacques Heath (Lily May Peel)",1,53.1000
6,"McCarthy, Mr. Timothy J",1,51.8625
11,"Bonnell, Miss. Elizabeth",1,26.5500
23,"Sloper, Mr. William Thompson",1,35.5000
...,...,...,...
871,"Beckwith, Mrs. Richard Leonard (Sallie Monypeny)",1,52.5542
872,"Carlsson, Mr. Frans Olof",1,5.0000
879,"Potter, Mrs. Thomas Jr (Lily Alexenia Wilson)",1,83.1583
887,"Graham, Miss. Margaret Edith",1,30.0000


In [85]:
# To find the records of people who survived
df5[df5['Survived']==1]

Unnamed: 0,PassengerId,Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked,New_Col_2
1,2,1,1,"Cumings, Mrs. John Bradley (Florence Briggs Th...",female,38.0,1,0,PC 17599,71.2833,C85,C,3
2,3,1,3,"Heikkinen, Miss. Laina",female,26.0,0,0,STON/O2. 3101282,7.9250,,S,6
3,4,1,1,"Futrelle, Mrs. Jacques Heath (Lily May Peel)",female,35.0,1,0,113803,53.1000,C123,S,5
8,9,1,3,"Johnson, Mrs. Oscar W (Elisabeth Vilhelmina Berg)",female,27.0,0,2,347742,11.1333,,S,12
9,10,1,2,"Nasser, Mrs. Nicholas (Adele Achem)",female,14.0,1,0,237736,30.0708,,C,12
...,...,...,...,...,...,...,...,...,...,...,...,...,...
875,876,1,3,"Najib, Miss. Adele Kiamie ""Jane""",female,15.0,0,0,2667,7.2250,,C,879
879,880,1,1,"Potter, Mrs. Thomas Jr (Lily Alexenia Wilson)",female,56.0,0,1,11767,83.1583,C50,C,881
880,881,1,2,"Shelley, Mrs. William (Imanita Parrish Hall)",female,25.0,0,1,230433,26.0000,,S,883
887,888,1,1,"Graham, Miss. Margaret Edith",female,19.0,0,0,112053,30.0000,B42,S,889


In [86]:
# Survival counts by sex and for minors (Age < 18; rows with a missing Age are not counted as minors)
t_passengers = len(df5)

# males: total, survived, died
t_male = len(df5[df5['Sex'] == 'male'])
s_male = len(df5[(df5['Survived'] == 1) & (df5['Sex'] == 'male')])
d_male = t_male - s_male

# females: total, survived, died
t_female = len(df5[df5['Sex'] == 'female'])
s_female = len(df5[(df5['Survived'] == 1) & (df5['Sex'] == 'female')])
d_female = t_female - s_female

# minors: total, survived, died
t_minors = len(df5[df5['Age'] < 18])
s_minors = len(df5[(df5['Survived'] == 1) & (df5['Age'] < 18)])
d_minors = t_minors - s_minors

print('Total number of passengers : ',t_passengers)
print(f'Total number of Males : {t_male} \t Survived : {s_male} \t Died : {d_male}')
print(f'Total number of Females : {t_female} \t Survived : {s_female} \t Died : {d_female}')
print(f'Total number of Minors : {t_minors} \t Survived : {s_minors} \t Died : {d_minors}')

Total number of passengers :  891
Total number of Males : 577 	 Survived : 109 	 Died : 468
Total number of Females : 314 	 Survived : 233 	 Died : 81
Total number of Minors : 113 	 Survived : 61 	 Died : 52
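
Per-group totals like these can also be produced in one pass with `groupby`. The sketch below is an alternative to the filter-and-count approach, shown on a small hypothetical frame rather than the Titanic data; the column names mirror the ones used above.

```python
import pandas as pd

# Hypothetical passengers; Survived is 1 for survivors and 0 otherwise
people = pd.DataFrame({
    'Sex': ['male', 'female', 'female', 'male', 'male', 'female'],
    'Survived': [0, 1, 1, 1, 0, 0],
})

# count() gives the group size, sum() of a 0/1 column gives the survivors
summary = people.groupby('Sex')['Survived'].agg(total='count', survived='sum')
summary['died'] = summary['total'] - summary['survived']
print(summary)
```

Because the groups come from the data, no group label has to be typed into a filter by hand.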


In [87]:
# names of the passengers who paid the maximum fare
df5[df5['Fare']==df5['Fare'].max()][['Name','Fare']]

Unnamed: 0,Name,Fare
258,"Ward, Miss. Anna",512.3292
679,"Cardeza, Mr. Thomas Drake Martinez",512.3292
737,"Lesurer, Mr. Gustave J",512.3292


In [88]:
# we can also slice the rows of a DataFrame directly
# every 3rd record among the first 25 rows (positions 0 to 24)
df5[0:25:3]

Unnamed: 0,PassengerId,Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked,New_Col_2
0,1,0,3,"Braund, Mr. Owen Harris",male,22.0,1,0,A/5 21171,7.25,,S,4
3,4,1,1,"Futrelle, Mrs. Jacques Heath (Lily May Peel)",female,35.0,1,0,113803,53.1,C123,S,5
6,7,0,1,"McCarthy, Mr. Timothy J",male,54.0,0,0,17463,51.8625,E46,S,8
9,10,1,2,"Nasser, Mrs. Nicholas (Adele Achem)",female,14.0,1,0,237736,30.0708,,C,12
12,13,0,3,"Saundercock, Mr. William Henry",male,20.0,0,0,A/5. 2151,8.05,,S,16
15,16,1,2,"Hewlett, Mrs. (Mary D Kingcome)",female,55.0,0,0,248706,16.0,,S,18
18,19,0,3,"Vander Planke, Mrs. Julius (Emelia Maria Vande...",female,31.0,1,0,345763,18.0,,S,22
21,22,1,2,"Beesley, Mr. Lawrence",male,34.0,0,0,248698,13.0,D56,S,24
24,25,0,3,"Palsson, Miss. Torborg Danira",female,8.0,3,1,349909,21.075,,S,28
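
An integer slice inside plain `[]` selects rows by position, with the stop position excluded. A minimal sketch, assuming a hypothetical frame whose index labels are letters so that positions and labels cannot be confused:

```python
import pandas as pd

# Hypothetical frame whose index labels deliberately differ from row positions
letters = pd.DataFrame({'value': [10, 20, 30, 40, 50, 60, 70]},
                       index=list('abcdefg'))

every_third = letters[0:7:3]     # positions 0, 3, 6
same_rows = letters.iloc[0:7:3]  # the explicit positional form
print(every_third)
```

Writing the slice with `iloc[]` makes the positional intent explicit.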


### `loc[]` and `iloc[]`

Both `loc[]` and `iloc[]` are indexers used in pandas for selecting data from a DataFrame. `loc[]` selects by label, while `iloc[]` selects by integer position.

### `loc[]`
- **Label-based Indexing**: `loc[]` is used to access a group of rows and columns by labels or a boolean array.
- **Inclusive**: When specifying a range of labels, both the start and the end labels are included.
- **Syntax**: `df.loc[row_label, column_label]`
- **Type of Index**: Works with labels (index names) and can accept lists of labels, single labels, or slices.
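
The forms listed above can be sketched on a small hypothetical frame (the names and scores are made up) with string row labels:

```python
import pandas as pd

# Hypothetical scores indexed by student name
scores = pd.DataFrame(
    {'math': [90, 75, 60, 85], 'art': [70, 95, 80, 65], 'music': [88, 77, 66, 99]},
    index=['ann', 'bob', 'cat', 'dan'],
)

single = scores.loc['bob', 'art']                    # single labels -> one value
block = scores.loc['bob':'cat', 'math':'art']        # label slices include both ends
picked = scores.loc[['ann', 'dan'], ['music']]       # lists of labels
high_math = scores.loc[scores['math'] > 80, 'math']  # boolean array selects rows
print(block)
```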

### `iloc[]`
- **Integer-based Indexing**: `iloc[]` is used to access a group of rows and columns by integer positions (like numpy array indexing).
- **Exclusive**: When specifying a range of integers, the start is included but the end is excluded.
- **Syntax**: `df.iloc[row_index, column_index]`
- **Type of Index**: Works with integer positions and can accept lists of integers, single integers, or slices.
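
A minimal positional sketch, again on a hypothetical frame with made-up values, shows that `iloc[]` ignores the labels entirely:

```python
import pandas as pd

# Hypothetical scores; the row labels play no role in positional selection
grid = pd.DataFrame(
    {'math': [90, 75, 60, 85], 'art': [70, 95, 80, 65], 'music': [88, 77, 66, 99]},
    index=['ann', 'bob', 'cat', 'dan'],
)

first_cell = grid.iloc[0, 0]      # row 0, column 0
middle = grid.iloc[1:3, 0:2]      # rows 1-2 and columns 0-1; stop positions excluded
corners = grid.iloc[[0, 3], [2]]  # lists of integer positions
last_row = grid.iloc[-1]          # negative positions count from the end
print(middle)
```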

### Difference

| Feature            | `loc[]`                        | `iloc[]`                       |
|--------------------|--------------------------------|--------------------------------|
| Type of Indexing   | Label-based                   | Integer-based                 |
| Index Type         | Labels (names)                | Integer positions             |
| Inclusive/Exclusive| Inclusive of end label        | Exclusive of end position     |
| Syntax             | `df.loc[row_label, col_label]`| `df.iloc[row_index, col_index]`|
| Usage Example      | `df.loc[1:3, 'A':'C']`        | `df.iloc[1:3, 0:2]`           |
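
The difference matters most when the index labels are integers that do not match the row positions. A sketch under that assumption, using a hypothetical frame with labels 10, 20, 30 and 40:

```python
import pandas as pd

# Hypothetical frame: integer labels that are not the same as row positions
items = pd.DataFrame(
    {'A': [1, 2, 3, 4], 'B': [5, 6, 7, 8], 'C': [9, 10, 11, 12]},
    index=[10, 20, 30, 40],
)

by_label = items.loc[10:30, 'A':'B']   # labels 10, 20, 30; end label included
by_position = items.iloc[1:3, 0:2]     # positions 1 and 2; end position excluded
print(by_label)
print(by_position)
```

With the default index (0, 1, 2, ...), labels and positions coincide, so the only visible difference between the two indexers is whether the end of a slice is included.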


In [89]:
df5.loc[1:5, ['PassengerId','Name','Survived']]

Unnamed: 0,PassengerId,Name,Survived
1,2,"Cumings, Mrs. John Bradley (Florence Briggs Th...",1
2,3,"Heikkinen, Miss. Laina",1
3,4,"Futrelle, Mrs. Jacques Heath (Lily May Peel)",1
4,5,"Allen, Mr. William Henry",0
5,6,"Moran, Mr. James",0


In [90]:
df5.iloc[1:5, [0,3,1]]

Unnamed: 0,PassengerId,Name,Survived
1,2,"Cumings, Mrs. John Bradley (Florence Briggs Th...",1
2,3,"Heikkinen, Miss. Laina",1
3,4,"Futrelle, Mrs. Jacques Heath (Lily May Peel)",1
4,5,"Allen, Mr. William Henry",0
