# Pandas

Pandas is an open-source Python library that is commonly used for data manipulation, analysis, and visualization. It provides data structures for efficiently storing and manipulating large, heterogeneous datasets, as well as tools for data cleaning, reshaping, grouping, merging, and more.

The core data structures of Pandas are the Series (a one-dimensional array-like object) and the DataFrame (a two-dimensional table-like data structure). These data structures are highly flexible and can handle data of various types, including numerical, categorical, and textual data.
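
As a minimal sketch of the two structures described above, the following builds a small Series and DataFrame by hand. The names `ages` and `people` and all values are made up for illustration.

```python
import pandas as pd

# A Series: one-dimensional values with labels (labels here are illustrative).
ages = pd.Series([22.0, 38.0, 26.0], index=["a", "b", "c"], name="age")

# A DataFrame: a table whose columns may have different types.
people = pd.DataFrame(
    {
        "name": ["Ann", "Bob", "Cy"],
        "age": [22.0, 38.0, 26.0],
        "member": [True, False, True],
    }
)

print(ages["b"])             # look up a value by its label
print(people.shape)          # (number of rows, number of columns)
print(type(people["age"]))   # each column of a DataFrame is itself a Series
```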

Pandas also provides various functions and methods for data cleaning, such as dropping missing or duplicate values, filling in missing values, and transforming data. In addition, it offers powerful capabilities for data aggregation and grouping, which allow users to easily perform complex data analyses.
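
The cleaning and grouping steps mentioned above can be sketched on a tiny, invented table. The `sales` frame and its column names are assumptions for illustration only.

```python
import numpy as np
import pandas as pd

sales = pd.DataFrame(
    {
        "region": ["north", "north", "south", "south", "south"],
        "amount": [10.0, np.nan, 5.0, 5.0, 20.0],
    }
)

deduped = sales.drop_duplicates()           # drops the repeated ("south", 5.0) row
filled = deduped.fillna({"amount": 0.0})    # replaces the missing amount with 0.0
dropped = deduped.dropna()                  # alternative: discard rows with missing values

# Aggregate per group: total amount for each region.
totals = filled.groupby("region")["amount"].sum()
print(totals)
```

Whether to fill or drop missing values depends on the analysis; the sketch shows both only to contrast them.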

Pandas is widely used in many fields, including finance, economics, social sciences, and more. It is also frequently used in conjunction with other Python libraries, such as NumPy, Matplotlib, and Scikit-learn.

In [60]:
# `import pandas as pd` imports the Pandas library under the alias pd, a common convention in the Python data science community.
# After the import, Pandas functions and classes are accessed through the pd prefix, for example pd.read_csv.
import pandas as pd

In [61]:
# Read a CSV file into a Pandas DataFrame
pd.read_csv('services.csv')

Unnamed: 0,id,location_id,program_id,accepted_payments,alternate_name,application_process,audience,description,eligibility,email,...,interpretation_services,keywords,languages,name,required_documents,service_areas,status,wait_time,website,taxonomy_ids
0,1,1,,,,Walk in or apply by phone.,"Older adults age 55 or over, ethnic minorities...",A walk-in center for older adults that provide...,"Age 55 or over for most programs, age 60 or ov...",,...,,"ADULT PROTECTION AND CARE SERVICES, Meal Sites...",,Fair Oaks Adult Activity Center,,Colma,active,No wait.,,
1,2,2,,,,Apply by phone for an appointment.,Residents of San Mateo County age 55 or over,Provides training and job placement to eligibl...,"Age 55 or over, county resident and willing an...",,...,,"EMPLOYMENT/TRAINING SERVICES, Job Development,...",,Second Career Employment Program,,San Mateo County,active,Varies.,,
2,3,3,,,,Phone for information (403-4300 Ext. 4322).,Older adults age 55 or over who can benefit fr...,Offers supportive counseling services to San M...,Resident of San Mateo County age 55 or over,,...,,"Geriatric Counseling, Older Adults, Gay, Lesbi...",,Senior Peer Counseling,,San Mateo County,active,Varies.,,
3,4,4,,,,Apply by phone.,"Parents, children, families with problems of c...",Provides supervised visitation services and a ...,,,...,,"INDIVIDUAL AND FAMILY DEVELOPMENT SERVICES, Gr...",,Family Visitation Center,,San Mateo County,active,No wait.,,
4,5,5,,,,Phone for information.,Low-income working families with children tran...,Provides fixed 8% short term loans to eligible...,Eligibility: Low-income family with legal cust...,,...,,"COMMUNITY SERVICES, Speakers, Automobile Loans",,Economic Self-Sufficiency Program,,San Mateo County,active,,,
5,6,6,,,,Walk in or apply by phone for membership appli...,Any age,A multipurpose center offering a wide variety ...,,,...,,"ADULT PROTECTION AND CARE SERVICES, In-Home Su...",,Little House Recreational Activities,,San Mateo County,active,No wait.,,
6,7,7,,,,"Apply by phone or be referred by a doctor, soc...","Older adults who have memory or sensory loss, ...",Rosener House is a day center for older adults...,Age 18 or over,,...,,"ADULT PROTECTION AND CARE SERVICES, Adult Day ...",,Rosener House Adult Day Services,,"Belmont, Burlingame, East Palo Alto",active,No wait.,,
7,8,8,,,,Apply by phone.,"Senior citizens age 60 or over, disabled indiv...",Delivers a hot meal to the home of persons age...,Homebound person unable to cook or shop,,...,,"ADULT PROTECTION AND CARE SERVICES, Meal Sites...",,Meals on Wheels - South County,,"Belmont, East Palo Alto",active,No wait.,,
8,9,9,,,,Walk in. Proof of residency in California requ...,"Ethnic minorities, especially Spanish speaking","Provides general reading material, including b...",Resident of California to obtain a library card,,...,,"EDUCATION SERVICES, Library, Libraries, Public...",,Fair Oaks Branch,,San Mateo County,active,No wait.,,
9,10,10,,,,Walk in. Proof of California residency to rece...,,"Provides general reading and media materials, ...",Resident of California to obtain a card,,...,,"EDUCATION SERVICES, Library, Libraries, Public...",,Main Library,,San Mateo County,active,No wait.,,


In [62]:
pd.read_csv('services.csv',header=None)
# header=None tells read_csv not to treat the first row as column names: the columns get integer labels and the original header line becomes row 0 of the data.

Unnamed: 0,0,1,2,3,4,5,6,7,8,9,...,12,13,14,15,16,17,18,19,20,21
0,id,location_id,program_id,accepted_payments,alternate_name,application_process,audience,description,eligibility,email,...,interpretation_services,keywords,languages,name,required_documents,service_areas,status,wait_time,website,taxonomy_ids
1,1,1,,,,Walk in or apply by phone.,"Older adults age 55 or over, ethnic minorities...",A walk-in center for older adults that provide...,"Age 55 or over for most programs, age 60 or ov...",,...,,"ADULT PROTECTION AND CARE SERVICES, Meal Sites...",,Fair Oaks Adult Activity Center,,Colma,active,No wait.,,
2,2,2,,,,Apply by phone for an appointment.,Residents of San Mateo County age 55 or over,Provides training and job placement to eligibl...,"Age 55 or over, county resident and willing an...",,...,,"EMPLOYMENT/TRAINING SERVICES, Job Development,...",,Second Career Employment Program,,San Mateo County,active,Varies.,,
3,3,3,,,,Phone for information (403-4300 Ext. 4322).,Older adults age 55 or over who can benefit fr...,Offers supportive counseling services to San M...,Resident of San Mateo County age 55 or over,,...,,"Geriatric Counseling, Older Adults, Gay, Lesbi...",,Senior Peer Counseling,,San Mateo County,active,Varies.,,
4,4,4,,,,Apply by phone.,"Parents, children, families with problems of c...",Provides supervised visitation services and a ...,,,...,,"INDIVIDUAL AND FAMILY DEVELOPMENT SERVICES, Gr...",,Family Visitation Center,,San Mateo County,active,No wait.,,
5,5,5,,,,Phone for information.,Low-income working families with children tran...,Provides fixed 8% short term loans to eligible...,Eligibility: Low-income family with legal cust...,,...,,"COMMUNITY SERVICES, Speakers, Automobile Loans",,Economic Self-Sufficiency Program,,San Mateo County,active,,,
6,6,6,,,,Walk in or apply by phone for membership appli...,Any age,A multipurpose center offering a wide variety ...,,,...,,"ADULT PROTECTION AND CARE SERVICES, In-Home Su...",,Little House Recreational Activities,,San Mateo County,active,No wait.,,
7,7,7,,,,"Apply by phone or be referred by a doctor, soc...","Older adults who have memory or sensory loss, ...",Rosener House is a day center for older adults...,Age 18 or over,,...,,"ADULT PROTECTION AND CARE SERVICES, Adult Day ...",,Rosener House Adult Day Services,,"Belmont, Burlingame, East Palo Alto",active,No wait.,,
8,8,8,,,,Apply by phone.,"Senior citizens age 60 or over, disabled indiv...",Delivers a hot meal to the home of persons age...,Homebound person unable to cook or shop,,...,,"ADULT PROTECTION AND CARE SERVICES, Meal Sites...",,Meals on Wheels - South County,,"Belmont, East Palo Alto",active,No wait.,,
9,9,9,,,,Walk in. Proof of residency in California requ...,"Ethnic minorities, especially Spanish speaking","Provides general reading material, including b...",Resident of California to obtain a library card,,...,,"EDUCATION SERVICES, Library, Libraries, Public...",,Fair Oaks Branch,,San Mateo County,active,No wait.,,
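
The effect of `header=None` can be reproduced without the `services.csv` file. The sketch below reads a two-row CSV from an in-memory string; `csv_text` is invented for illustration.

```python
import io

import pandas as pd

csv_text = "id,name\n1,Alpha\n2,Beta\n"

with_header = pd.read_csv(io.StringIO(csv_text))
no_header = pd.read_csv(io.StringIO(csv_text), header=None)

print(list(with_header.columns))   # column names taken from the first line
print(list(no_header.columns))     # integer labels instead of names
print(len(with_header), len(no_header))  # the header line counts as data in the second case
```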


In [63]:
df=pd.read_csv('services.csv')

In [64]:
df

Unnamed: 0,id,location_id,program_id,accepted_payments,alternate_name,application_process,audience,description,eligibility,email,...,interpretation_services,keywords,languages,name,required_documents,service_areas,status,wait_time,website,taxonomy_ids
0,1,1,,,,Walk in or apply by phone.,"Older adults age 55 or over, ethnic minorities...",A walk-in center for older adults that provide...,"Age 55 or over for most programs, age 60 or ov...",,...,,"ADULT PROTECTION AND CARE SERVICES, Meal Sites...",,Fair Oaks Adult Activity Center,,Colma,active,No wait.,,
1,2,2,,,,Apply by phone for an appointment.,Residents of San Mateo County age 55 or over,Provides training and job placement to eligibl...,"Age 55 or over, county resident and willing an...",,...,,"EMPLOYMENT/TRAINING SERVICES, Job Development,...",,Second Career Employment Program,,San Mateo County,active,Varies.,,
2,3,3,,,,Phone for information (403-4300 Ext. 4322).,Older adults age 55 or over who can benefit fr...,Offers supportive counseling services to San M...,Resident of San Mateo County age 55 or over,,...,,"Geriatric Counseling, Older Adults, Gay, Lesbi...",,Senior Peer Counseling,,San Mateo County,active,Varies.,,
3,4,4,,,,Apply by phone.,"Parents, children, families with problems of c...",Provides supervised visitation services and a ...,,,...,,"INDIVIDUAL AND FAMILY DEVELOPMENT SERVICES, Gr...",,Family Visitation Center,,San Mateo County,active,No wait.,,
4,5,5,,,,Phone for information.,Low-income working families with children tran...,Provides fixed 8% short term loans to eligible...,Eligibility: Low-income family with legal cust...,,...,,"COMMUNITY SERVICES, Speakers, Automobile Loans",,Economic Self-Sufficiency Program,,San Mateo County,active,,,
5,6,6,,,,Walk in or apply by phone for membership appli...,Any age,A multipurpose center offering a wide variety ...,,,...,,"ADULT PROTECTION AND CARE SERVICES, In-Home Su...",,Little House Recreational Activities,,San Mateo County,active,No wait.,,
6,7,7,,,,"Apply by phone or be referred by a doctor, soc...","Older adults who have memory or sensory loss, ...",Rosener House is a day center for older adults...,Age 18 or over,,...,,"ADULT PROTECTION AND CARE SERVICES, Adult Day ...",,Rosener House Adult Day Services,,"Belmont, Burlingame, East Palo Alto",active,No wait.,,
7,8,8,,,,Apply by phone.,"Senior citizens age 60 or over, disabled indiv...",Delivers a hot meal to the home of persons age...,Homebound person unable to cook or shop,,...,,"ADULT PROTECTION AND CARE SERVICES, Meal Sites...",,Meals on Wheels - South County,,"Belmont, East Palo Alto",active,No wait.,,
8,9,9,,,,Walk in. Proof of residency in California requ...,"Ethnic minorities, especially Spanish speaking","Provides general reading material, including b...",Resident of California to obtain a library card,,...,,"EDUCATION SERVICES, Library, Libraries, Public...",,Fair Oaks Branch,,San Mateo County,active,No wait.,,
9,10,10,,,,Walk in. Proof of California residency to rece...,,"Provides general reading and media materials, ...",Resident of California to obtain a card,,...,,"EDUCATION SERVICES, Library, Libraries, Public...",,Main Library,,San Mateo County,active,No wait.,,


In [65]:
print(type(df))
# type(df) returns the class of the object df, here a Pandas DataFrame.

<class 'pandas.core.frame.DataFrame'>


In [66]:
df.columns
# df.columns is an attribute of a DataFrame that returns its column labels as a Pandas Index object (not a plain list).

Index(['id', 'location_id', 'program_id', 'accepted_payments',
       'alternate_name', 'application_process', 'audience', 'description',
       'eligibility', 'email', 'fees', 'funding_sources',
       'interpretation_services', 'keywords', 'languages', 'name',
       'required_documents', 'service_areas', 'status', 'wait_time', 'website',
       'taxonomy_ids'],
      dtype='object')

In [67]:
list(df.columns)
# list(df.columns) converts the column labels of the DataFrame df into a plain Python list.

['id',
 'location_id',
 'program_id',
 'accepted_payments',
 'alternate_name',
 'application_process',
 'audience',
 'description',
 'eligibility',
 'email',
 'fees',
 'funding_sources',
 'interpretation_services',
 'keywords',
 'languages',
 'name',
 'required_documents',
 'service_areas',
 'status',
 'wait_time',
 'website',
 'taxonomy_ids']
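
A short sketch of the difference between the Index returned by `.columns` and a Python list, using a small invented frame (`frame` is not part of the notebook's data):

```python
import pandas as pd

frame = pd.DataFrame({"id": [1, 2], "status": ["active", "inactive"]})

cols = frame.columns
print(type(cols).__name__)   # an Index object
print("status" in cols)      # membership tests work directly on the Index
print(cols.tolist())         # explicit conversion to a Python list
```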

In [68]:
df.head()

Unnamed: 0,id,location_id,program_id,accepted_payments,alternate_name,application_process,audience,description,eligibility,email,...,interpretation_services,keywords,languages,name,required_documents,service_areas,status,wait_time,website,taxonomy_ids
0,1,1,,,,Walk in or apply by phone.,"Older adults age 55 or over, ethnic minorities...",A walk-in center for older adults that provide...,"Age 55 or over for most programs, age 60 or ov...",,...,,"ADULT PROTECTION AND CARE SERVICES, Meal Sites...",,Fair Oaks Adult Activity Center,,Colma,active,No wait.,,
1,2,2,,,,Apply by phone for an appointment.,Residents of San Mateo County age 55 or over,Provides training and job placement to eligibl...,"Age 55 or over, county resident and willing an...",,...,,"EMPLOYMENT/TRAINING SERVICES, Job Development,...",,Second Career Employment Program,,San Mateo County,active,Varies.,,
2,3,3,,,,Phone for information (403-4300 Ext. 4322).,Older adults age 55 or over who can benefit fr...,Offers supportive counseling services to San M...,Resident of San Mateo County age 55 or over,,...,,"Geriatric Counseling, Older Adults, Gay, Lesbi...",,Senior Peer Counseling,,San Mateo County,active,Varies.,,
3,4,4,,,,Apply by phone.,"Parents, children, families with problems of c...",Provides supervised visitation services and a ...,,,...,,"INDIVIDUAL AND FAMILY DEVELOPMENT SERVICES, Gr...",,Family Visitation Center,,San Mateo County,active,No wait.,,
4,5,5,,,,Phone for information.,Low-income working families with children tran...,Provides fixed 8% short term loans to eligible...,Eligibility: Low-income family with legal cust...,,...,,"COMMUNITY SERVICES, Speakers, Automobile Loans",,Economic Self-Sufficiency Program,,San Mateo County,active,,,


In [69]:
df.tail()

Unnamed: 0,id,location_id,program_id,accepted_payments,alternate_name,application_process,audience,description,eligibility,email,...,interpretation_services,keywords,languages,name,required_documents,service_areas,status,wait_time,website,taxonomy_ids
18,19,19,,,,Call for screening appointment (650-347-3648).,,Provides free medical and dental care to those...,Low-income person without access to health care,,...,,"HEALTH SERVICES, Outpatient Care, Community Cl...",,San Mateo Free Medical Clinic,,"Belmont, Burlingame",active,Varies.,,
19,20,20,,,,Walk in.,,no unrequired fields for this service,,,...,,,,Service with blank fields,,,defunct,,,
20,21,21,,,,By phone during business hours.,,just a test service,,,...,,,,Service for Admin Test Location,,San Mateo County,inactive,,,
21,22,22,,"Cash, Check, Credit Card",Fotos para pasaportes,Walk in or apply by phone or mail,"Profit and nonprofit businesses, the public, m...",[NOTE THIS IS NOT A REAL SERVICE--THIS IS FOR ...,,passports@example.org,...,We offer 3-way interpretation services over th...,"Salud, Medicina",Spanish,Passport Photos,Government-issued picture identification,"Alameda County, San Mateo County",active,No wait to 2 weeks.,http://www.example.com,"105, 108, 108-05, 108-05-01, 111, 111-05"
22,23,22,,,,Walk in or apply by phone or mail,"Second service and nonprofit businesses, the p...",[NOTE THIS IS NOT A REAL ORGANIZATION--THIS IS...,,,...,,"Ruby on Rails/Postgres/Redis, testing, wic",,Example Service Name,,"San Mateo County, Alameda County",active,No wait to 2 weeks,http://www.example.com,


In [70]:
df.head(2)

Unnamed: 0,id,location_id,program_id,accepted_payments,alternate_name,application_process,audience,description,eligibility,email,...,interpretation_services,keywords,languages,name,required_documents,service_areas,status,wait_time,website,taxonomy_ids
0,1,1,,,,Walk in or apply by phone.,"Older adults age 55 or over, ethnic minorities...",A walk-in center for older adults that provide...,"Age 55 or over for most programs, age 60 or ov...",,...,,"ADULT PROTECTION AND CARE SERVICES, Meal Sites...",,Fair Oaks Adult Activity Center,,Colma,active,No wait.,,
1,2,2,,,,Apply by phone for an appointment.,Residents of San Mateo County age 55 or over,Provides training and job placement to eligibl...,"Age 55 or over, county resident and willing an...",,...,,"EMPLOYMENT/TRAINING SERVICES, Job Development,...",,Second Career Employment Program,,San Mateo County,active,Varies.,,


In [71]:
df.tail(2)

Unnamed: 0,id,location_id,program_id,accepted_payments,alternate_name,application_process,audience,description,eligibility,email,...,interpretation_services,keywords,languages,name,required_documents,service_areas,status,wait_time,website,taxonomy_ids
21,22,22,,"Cash, Check, Credit Card",Fotos para pasaportes,Walk in or apply by phone or mail,"Profit and nonprofit businesses, the public, m...",[NOTE THIS IS NOT A REAL SERVICE--THIS IS FOR ...,,passports@example.org,...,We offer 3-way interpretation services over th...,"Salud, Medicina",Spanish,Passport Photos,Government-issued picture identification,"Alameda County, San Mateo County",active,No wait to 2 weeks.,http://www.example.com,"105, 108, 108-05, 108-05-01, 111, 111-05"
22,23,22,,,,Walk in or apply by phone or mail,"Second service and nonprofit businesses, the p...",[NOTE THIS IS NOT A REAL ORGANIZATION--THIS IS...,,,...,,"Ruby on Rails/Postgres/Redis, testing, wic",,Example Service Name,,"San Mateo County, Alameda County",active,No wait to 2 weeks,http://www.example.com,


In [72]:
df.dtypes
# df.dtypes is an attribute of a DataFrame that returns a Series giving the data type of each column.


id                           int64
location_id                  int64
program_id                 float64
accepted_payments           object
alternate_name              object
application_process         object
audience                    object
description                 object
eligibility                 object
email                       object
fees                        object
funding_sources             object
interpretation_services     object
keywords                    object
languages                   object
name                        object
required_documents          object
service_areas               object
status                      object
wait_time                   object
website                     object
taxonomy_ids                object
dtype: object
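
One plausible reason `program_id` is reported as float64 while `id` is int64 is that the column contains missing values, which the default integer dtype cannot represent. The sketch below shows that behaviour on an invented two-row CSV; it is an illustration, not a check of the actual file.

```python
import io

import pandas as pd

csv_text = "id,program_id\n1,\n2,7\n"   # program_id is empty in the first row
frame = pd.read_csv(io.StringIO(csv_text))

print(frame.dtypes)   # id is an integer column, program_id falls back to float

# The nullable integer dtype "Int64" keeps whole numbers and still allows missing values.
nullable = frame["program_id"].astype("Int64")
print(nullable)
```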

In [73]:
df['id']
#df['id'] is a Pandas Series object that represents the 'id' column of a Pandas DataFrame df.
#A Series is a one-dimensional labeled array that can hold any data type. In this case, the 'id' column likely contains unique identifiers for each row in the DataFrame.

0      1
1      2
2      3
3      4
4      5
5      6
6      7
7      8
8      9
9     10
10    11
11    12
12    13
13    14
14    15
15    16
16    17
17    18
18    19
19    20
20    21
21    22
22    23
Name: id, dtype: int64

In [74]:
df['location_id']

0      1
1      2
2      3
3      4
4      5
5      6
6      7
7      8
8      9
9     10
10    11
11    12
12    13
13    14
14    15
15    16
16    17
17    18
18    19
19    20
20    21
21    22
22    22
Name: location_id, dtype: int64

In [75]:
type(df['location_id'])
# type(df['location_id']) returns the class of a single selected column, here a Pandas Series.

pandas.core.series.Series

In [76]:
list(df['location_id'])
# Python expression that converts the 'location_id' column of a Pandas DataFrame df into a Python list.

[1,
 2,
 3,
 4,
 5,
 6,
 7,
 8,
 9,
 10,
 11,
 12,
 13,
 14,
 15,
 16,
 17,
 18,
 19,
 20,
 21,
 22,
 22]

In [77]:
df[['location_id']]
# df[['location_id']] is a DataFrame containing only the 'location_id' column of df; passing a list of labels returns a DataFrame rather than a Series.

Unnamed: 0,location_id
0,1
1,2
2,3
3,4
4,5
5,6
6,7
7,8
8,9
9,10


In [78]:
type(df[['location_id']])

pandas.core.frame.DataFrame

In [79]:
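df['id','location_id']
# Without the inner list, the two labels form the tuple ('id', 'location_id'), which Pandas looks up as a single column label, so this raises a KeyError.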

KeyError: ('id', 'location_id')

In [80]:
df[['id','location_id']]
#df[['id', 'location_id']] is a Pandas DataFrame that contains only the 'id' and 'location_id' columns of the original DataFrame df.
#The double brackets [['id', 'location_id']] are used to select a subset of columns in a DataFrame. When you pass a list of column names as the argument to df[], Pandas will return a new DataFrame that contains only the specified columns.

Unnamed: 0,id,location_id
0,1,1
1,2,2
2,3,3
3,4,4
4,5,5
5,6,6
6,7,7
7,8,8
8,9,9
9,10,10


In [81]:
type(df[['id','location_id']])

pandas.core.frame.DataFrame
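
The three selection forms used in the cells above can be compared side by side on a small stand-in frame. The data is invented; only the column names mirror the notebook.

```python
import pandas as pd

frame = pd.DataFrame(
    {"id": [1, 2, 3], "location_id": [1, 2, 2], "status": ["a", "b", "c"]}
)

one_col = frame["id"]                  # a single label returns a Series
sub = frame[["id", "location_id"]]     # a list of labels returns a DataFrame

try:
    frame["id", "location_id"]         # a tuple is treated as ONE column label
except KeyError as err:
    print("KeyError:", err)
```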

In [82]:
df[['keywords','status']]

Unnamed: 0,keywords,status
0,"ADULT PROTECTION AND CARE SERVICES, Meal Sites...",active
1,"EMPLOYMENT/TRAINING SERVICES, Job Development,...",active
2,"Geriatric Counseling, Older Adults, Gay, Lesbi...",active
3,"INDIVIDUAL AND FAMILY DEVELOPMENT SERVICES, Gr...",active
4,"COMMUNITY SERVICES, Speakers, Automobile Loans",active
5,"ADULT PROTECTION AND CARE SERVICES, In-Home Su...",active
6,"ADULT PROTECTION AND CARE SERVICES, Adult Day ...",active
7,"ADULT PROTECTION AND CARE SERVICES, Meal Sites...",active
8,"EDUCATION SERVICES, Library, Libraries, Public...",active
9,"EDUCATION SERVICES, Library, Libraries, Public...",active


In [83]:
type(df[['keywords','status']])

pandas.core.frame.DataFrame

In [84]:
type(df['keywords'])

pandas.core.series.Series

# Reading Excel data

In [85]:
df1=pd.read_excel('LUSID Excel - Setting up your market data.xlsx')

In [86]:
df1

Unnamed: 0.1,Unnamed: 0,Unnamed: 1,Unnamed: 2,Unnamed: 3,Unnamed: 4,Unnamed: 5,Unnamed: 6,Unnamed: 7,Unnamed: 8,Unnamed: 9
0,,,,,,,,,,
1,,,,Datetimes in LUSID,,,,,,
2,,,,,,,,,,
3,,,,This sheet allows you to format datetimes for...,,,,,,
4,,,,If you have any questions please visit our:,,,,,,
5,,,,,Getting Started tutorials,,,,,
6,,,,,Knowledge Base articles,,,,,
7,,,,,or Contact us,,,,,
8,,,,,,,,,,
9,,,,,,,,,,


In [87]:
df1.head()

Unnamed: 0.1,Unnamed: 0,Unnamed: 1,Unnamed: 2,Unnamed: 3,Unnamed: 4,Unnamed: 5,Unnamed: 6,Unnamed: 7,Unnamed: 8,Unnamed: 9
0,,,,,,,,,,
1,,,,Datetimes in LUSID,,,,,,
2,,,,,,,,,,
3,,,,This sheet allows you to format datetimes for...,,,,,,
4,,,,If you have any questions please visit our:,,,,,,


In [88]:
df1.tail()

Unnamed: 0.1,Unnamed: 0,Unnamed: 1,Unnamed: 2,Unnamed: 3,Unnamed: 4,Unnamed: 5,Unnamed: 6,Unnamed: 7,Unnamed: 8,Unnamed: 9
23,,,,,Although the date can still appears without time,2019-04-10 13:30:45.550000,,,,
24,,,,,,,,,,
25,,,,,Add an hour to your datetime,2019-04-10 14:30:45.550000,,,,
26,,,,,,,,,,
27,,,,,Subtract a minute from your datetime,2019-04-10 14:29:45.550000,,,,


In [89]:
df1.dtypes

Unnamed: 0    float64
Unnamed: 1    float64
Unnamed: 2    float64
Unnamed: 3     object
Unnamed: 4     object
Unnamed: 5     object
Unnamed: 6    float64
Unnamed: 7     object
Unnamed: 8     object
Unnamed: 9     object
dtype: object

In [90]:
df1.columns

Index(['Unnamed: 0', 'Unnamed: 1', 'Unnamed: 2', 'Unnamed: 3', 'Unnamed: 4',
       'Unnamed: 5', 'Unnamed: 6', 'Unnamed: 7', 'Unnamed: 8', 'Unnamed: 9'],
      dtype='object')

In [91]:
# to_csv is a Pandas method used to write a DataFrame to a CSV (Comma-Separated Values) file. The method takes the filename to write to as an argument and can accept a variety of optional parameters to customize the output.
df1.to_csv('abc.csv')
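
By default `to_csv` also writes the row index as an unnamed first column, which reappears as `Unnamed: 0` when the file is read back. A minimal sketch using an in-memory buffer instead of a file (the frame is invented):

```python
import io

import pandas as pd

frame = pd.DataFrame({"Name": ["abcd", "efgh"], "Num": [10, 20]})

# Default: the index is written as an extra, unnamed first column.
buf = io.StringIO()
frame.to_csv(buf)
round_trip = pd.read_csv(io.StringIO(buf.getvalue()))

# index=False omits the index from the output.
buf2 = io.StringIO()
frame.to_csv(buf2, index=False)
clean = pd.read_csv(io.StringIO(buf2.getvalue()))

print(list(round_trip.columns))
print(list(clean.columns))
```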

In [92]:
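# pd.Series is the Pandas class for a one-dimensional labeled array; you can think of it as a single column of a DataFrame.
# Note: [l] wraps the list in another list, so the result has a single element (the whole list) with dtype object; pd.Series(l) would give six integer elements.
# This cell also rebinds the name df, which until now referred to the services DataFrame.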
import pandas as pd
l=[1,2,3,4,5,6]
df=pd.Series([l])
print(df)

0    [1, 2, 3, 4, 5, 6]
dtype: object


In [93]:
type(df)

pandas.core.series.Series
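
The difference between passing a list and passing a list wrapped in another list can be seen directly. The variable names below are illustrative.

```python
import pandas as pd

values = [1, 2, 3, 4, 5, 6]

nested = pd.Series([values])   # one element: the whole list, stored as an object
flat = pd.Series(values)       # six integer elements

print(len(nested), nested.dtype)
print(len(flat), flat.dtype)
print(flat.sum())              # numeric operations work on the flat Series
```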

In [94]:
# A DataFrame is a two-dimensional labeled data structure in Pandas. It can be thought of as a table with rows and columns, similar to an Excel spreadsheet or a SQL table.
# You can create a DataFrame from many different data sources, including lists, dictionaries, NumPy arrays, and CSV files.
data={'Name':['abcd','efgh','ijkl'],
      'Add':['bza','gvk','vzkp'],
      'Num':[10,20,30]
     }

In [95]:
df1=pd.DataFrame(data)

In [96]:
df1

Unnamed: 0,Name,Add,Num
0,abcd,bza,10
1,efgh,gvk,20
2,ijkl,vzkp,30


In [97]:
type(df1)

pandas.core.frame.DataFrame
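
Besides a dictionary of columns, a DataFrame can be built from other sources. The sketch below shows two of them, a list of row dictionaries and a NumPy array; all values are invented.

```python
import numpy as np
import pandas as pd

# From a list of records: each dictionary becomes one row.
from_records = pd.DataFrame(
    [{"Name": "abcd", "Num": 10}, {"Name": "efgh", "Num": 20}]
)

# From a 2-D NumPy array: column names are supplied explicitly.
from_array = pd.DataFrame(np.arange(6).reshape(3, 2), columns=["x", "y"])

print(from_records)
print(from_array)
```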

In [98]:
import pandas as pd
df2=pd.read_csv('https://raw.githubusercontent.com/datasciencedojo/datasets/master/titanic.csv')

In [99]:
df2

Unnamed: 0,PassengerId,Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked
0,1,0,3,"Braund, Mr. Owen Harris",male,22.0,1,0,A/5 21171,7.2500,,S
1,2,1,1,"Cumings, Mrs. John Bradley (Florence Briggs Th...",female,38.0,1,0,PC 17599,71.2833,C85,C
2,3,1,3,"Heikkinen, Miss. Laina",female,26.0,0,0,STON/O2. 3101282,7.9250,,S
3,4,1,1,"Futrelle, Mrs. Jacques Heath (Lily May Peel)",female,35.0,1,0,113803,53.1000,C123,S
4,5,0,3,"Allen, Mr. William Henry",male,35.0,0,0,373450,8.0500,,S
...,...,...,...,...,...,...,...,...,...,...,...,...
886,887,0,2,"Montvila, Rev. Juozas",male,27.0,0,0,211536,13.0000,,S
887,888,1,1,"Graham, Miss. Margaret Edith",female,19.0,0,0,112053,30.0000,B42,S
888,889,0,3,"Johnston, Miss. Catherine Helen ""Carrie""",female,,1,2,W./C. 6607,23.4500,,S
889,890,1,1,"Behr, Mr. Karl Howell",male,26.0,0,0,111369,30.0000,C148,C


In [100]:
df2.to_excel('excel_titanic.xlsx')

In [101]:
df2.to_html('html_titanic.html')

In [102]:
df2.to_json('json_titanic.json')

In [103]:
df2.to_csv('titanic.csv')

In [104]:
df2.columns

Index(['PassengerId', 'Survived', 'Pclass', 'Name', 'Sex', 'Age', 'SibSp',
       'Parch', 'Ticket', 'Fare', 'Cabin', 'Embarked'],
      dtype='object')

In [105]:
df2.head()

Unnamed: 0,PassengerId,Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked
0,1,0,3,"Braund, Mr. Owen Harris",male,22.0,1,0,A/5 21171,7.25,,S
1,2,1,1,"Cumings, Mrs. John Bradley (Florence Briggs Th...",female,38.0,1,0,PC 17599,71.2833,C85,C
2,3,1,3,"Heikkinen, Miss. Laina",female,26.0,0,0,STON/O2. 3101282,7.925,,S
3,4,1,1,"Futrelle, Mrs. Jacques Heath (Lily May Peel)",female,35.0,1,0,113803,53.1,C123,S
4,5,0,3,"Allen, Mr. William Henry",male,35.0,0,0,373450,8.05,,S


In [106]:
df2.tail()

Unnamed: 0,PassengerId,Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked
886,887,0,2,"Montvila, Rev. Juozas",male,27.0,0,0,211536,13.0,,S
887,888,1,1,"Graham, Miss. Margaret Edith",female,19.0,0,0,112053,30.0,B42,S
888,889,0,3,"Johnston, Miss. Catherine Helen ""Carrie""",female,,1,2,W./C. 6607,23.45,,S
889,890,1,1,"Behr, Mr. Karl Howell",male,26.0,0,0,111369,30.0,C148,C
890,891,0,3,"Dooley, Mr. Patrick",male,32.0,0,0,370376,7.75,,Q


In [107]:
df2.dtypes

PassengerId      int64
Survived         int64
Pclass           int64
Name            object
Sex             object
Age            float64
SibSp            int64
Parch            int64
Ticket          object
Fare           float64
Cabin           object
Embarked        object
dtype: object

In [108]:
df2.describe()

Unnamed: 0,PassengerId,Survived,Pclass,Age,SibSp,Parch,Fare
count,891.0,891.0,891.0,714.0,891.0,891.0,891.0
mean,446.0,0.383838,2.308642,29.699118,0.523008,0.381594,32.204208
std,257.353842,0.486592,0.836071,14.526497,1.102743,0.806057,49.693429
min,1.0,0.0,1.0,0.42,0.0,0.0,0.0
25%,223.5,0.0,2.0,20.125,0.0,0.0,7.9104
50%,446.0,0.0,3.0,28.0,0.0,0.0,14.4542
75%,668.5,1.0,3.0,38.0,1.0,0.0,31.0
max,891.0,1.0,3.0,80.0,8.0,6.0,512.3292
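
`describe()` summarizes only numeric columns by default, and its `count` row excludes missing values, which would explain why `Age` shows 714 rather than 891 above. The sketch below illustrates both points on a small invented frame and shows the different summary returned for a text column.

```python
import pandas as pd

frame = pd.DataFrame(
    {
        "Sex": ["male", "female", "female", "male", "female"],
        "Age": [22.0, 38.0, 26.0, None, 35.0],
    }
)

numeric_summary = frame.describe()       # numeric columns only
text_summary = frame["Sex"].describe()   # count, unique, top, freq for text data

print(numeric_summary)
print(text_summary)
```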


In [110]:
df2[['Name','Sex','Ticket','Cabin','Embarked']]

Unnamed: 0,Name,Sex,Ticket,Cabin,Embarked
0,"Braund, Mr. Owen Harris",male,A/5 21171,,S
1,"Cumings, Mrs. John Bradley (Florence Briggs Th...",female,PC 17599,C85,C
2,"Heikkinen, Miss. Laina",female,STON/O2. 3101282,,S
3,"Futrelle, Mrs. Jacques Heath (Lily May Peel)",female,113803,C123,S
4,"Allen, Mr. William Henry",male,373450,,S
...,...,...,...,...,...
886,"Montvila, Rev. Juozas",male,211536,,S
887,"Graham, Miss. Margaret Edith",female,112053,B42,S
888,"Johnston, Miss. Catherine Helen ""Carrie""",female,W./C. 6607,,S
889,"Behr, Mr. Karl Howell",male,111369,C148,C


In [111]:
type(df2.dtypes)

pandas.core.series.Series

In [112]:
df2.dtypes == 'object'
# df2.dtypes == 'object' is a Boolean expression that checks if the data type of each column in df2 is an object (i.e., a string or other non-numeric type).

PassengerId    False
Survived       False
Pclass         False
Name            True
Sex             True
Age            False
SibSp          False
Parch          False
Ticket          True
Fare           False
Cabin           True
Embarked        True
dtype: bool

In [113]:
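# Intended: the data types of only those columns of df2 whose dtype is object (i.e., strings).
# The mask must be built from df2 itself: df2.dtypes[df2.dtypes=='object'].
# As written, df refers to the Series created in cell 92, so df.dtypes=='object' is the single value True, and df2.dtypes[True] raises KeyError: True.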
df2.dtypes[df.dtypes=='object']

KeyError: True

In [114]:
# Intended: the names of the object-typed columns of df2, as a Pandas Index: df2.dtypes[df2.dtypes=='object'].index.
# The mask below is built from df instead of df2, so the lookup raises KeyError.
df2.dtypes[df.dtypes=='object'].index

KeyError: True

In [185]:
df2[df2.dtypes[df.dtypes=='object'].index]
# Raises KeyError because the mask uses df.dtypes instead of df2.dtypes. The intended expression is df2[df2.dtypes[df2.dtypes=='object'].index], which selects the object-typed columns of df2.

KeyError: True

In [116]:
df2[df2.dtypes[df.dtypes=='object'].index].describe()
# Raises KeyError for the same reason: the mask uses df.dtypes instead of df2.dtypes. The intended expression is df2[df2.dtypes[df2.dtypes=='object'].index].describe().

KeyError: True

In [184]:
df2[df2.dtypes[df.dtypes=='int64'].index]
# Raises KeyError because the mask uses df.dtypes instead of df2.dtypes. The intended expression is df2[df2.dtypes[df2.dtypes=='int64'].index], which returns all columns of df2 with dtype int64.

KeyError: False

In [183]:
df2[df2.dtypes[df.dtypes=='float64'].index]
# Raises KeyError because the mask uses df.dtypes instead of df2.dtypes. The intended expression is df2[df2.dtypes[df2.dtypes=='float64'].index], which returns all columns of df2 with dtype float64.

KeyError: False
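
The dtype-based selection attempted in the cells above works when the mask and the indexed object come from the same DataFrame. The sketch below uses a small invented frame and also shows `select_dtypes`, a built-in alternative. Its last lines illustrate why a mask taken from a Series (as `df` is after the `pd.Series([l])` cell) collapses to a single boolean.

```python
import pandas as pd

frame = pd.DataFrame(
    {"PassengerId": [1, 2], "Fare": [7.25, 71.2833], "Name": ["Braund", "Cumings"]}
)

int_mask = frame.dtypes == "int64"          # boolean Series, one entry per column
int_cols = frame.dtypes[int_mask].index     # names of the int64 columns
ints = frame[int_cols]                      # DataFrame restricted to those columns

# Built-in alternative: select columns by dtype family.
numeric = frame.select_dtypes(include="number")

# A Series has a single dtype, so the comparison yields one bool, not a mask.
series = pd.Series([[1, 2, 3]])
scalar = series.dtypes == "object"

print(list(int_cols), list(numeric.columns), scalar)
```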

In [119]:
df2[['Survived','Pclass']]

Unnamed: 0,Survived,Pclass
0,0,3
1,1,1
2,1,3
3,1,1
4,0,3
...,...,...
886,0,2
887,1,1
888,0,3
889,1,1


In [123]:
df2[['Survived','Pclass']][4:11]
#df2[['Survived','Pclass']][4:11] is a Pandas DataFrame that contains a subset of rows and columns from df2.
#The output of this code will be a DataFrame containing the Survived and Pclass columns for rows 4 through 10 of the original DataFrame.

Unnamed: 0,Survived,Pclass
4,0,3
5,0,3
6,0,1
7,0,3
8,1,3
9,1,2
10,1,3


In [124]:
df2[['Survived','Pclass']][10:21]

Unnamed: 0,Survived,Pclass
10,1,3
11,1,1
12,0,3
13,0,3
14,0,3
15,1,2
16,0,3
17,1,2
18,0,3
19,1,3


In [125]:
df2[['Survived','Pclass']][10:31:2]

Unnamed: 0,Survived,Pclass
10,1,3
12,0,3
14,0,3
16,0,3
18,0,3
20,0,2
22,1,3
24,0,3
26,0,3
28,1,3
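
The chained selections above pick columns first and then slice rows by position. As a sketch, the same result can be written with `iloc` (positional, end excluded) or `loc` (label-based, end included). The frame below copies the first twelve `Survived` and `Pclass` values shown earlier so the example runs without the download.

```python
import pandas as pd

frame = pd.DataFrame(
    {
        "Survived": [0, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1],
        "Pclass": [3, 1, 3, 1, 3, 3, 1, 3, 3, 2, 3, 1],
    }
)

chained = frame[["Survived", "Pclass"]][4:11]          # rows at positions 4..10
direct = frame.iloc[4:11][["Survived", "Pclass"]]      # same rows via iloc
by_label = frame.loc[4:10, ["Survived", "Pclass"]]     # loc includes both endpoints
stepped = frame.iloc[0:12:2]                           # every second row

print(chained)
print(list(stepped.index))
```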


In [127]:
df2['new_colum']=0
#df2['new_colum']=0 is a Pandas DataFrame operation that creates a new column called 'new_colum' and initializes it with the value 0 for every row of the DataFrame df2.

In [128]:
df2.head(2)

Unnamed: 0,PassengerId,Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked,new_colum
0,1,0,3,"Braund, Mr. Owen Harris",male,22.0,1,0,A/5 21171,7.25,,S,0
1,2,1,1,"Cumings, Mrs. John Bradley (Florence Briggs Th...",female,38.0,1,0,PC 17599,71.2833,C85,C,0


In [129]:
df2['new_col1'] = df2['PassengerId']+df2['Pclass']
# Creates a new column 'new_col1' holding the row-wise sum of the 'PassengerId' and 'Pclass' columns of df2.

In [130]:
df2

Unnamed: 0,PassengerId,Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked,new_colum,new_col1
0,1,0,3,"Braund, Mr. Owen Harris",male,22.0,1,0,A/5 21171,7.2500,,S,0,4
1,2,1,1,"Cumings, Mrs. John Bradley (Florence Briggs Th...",female,38.0,1,0,PC 17599,71.2833,C85,C,0,3
2,3,1,3,"Heikkinen, Miss. Laina",female,26.0,0,0,STON/O2. 3101282,7.9250,,S,0,6
3,4,1,1,"Futrelle, Mrs. Jacques Heath (Lily May Peel)",female,35.0,1,0,113803,53.1000,C123,S,0,5
4,5,0,3,"Allen, Mr. William Henry",male,35.0,0,0,373450,8.0500,,S,0,8
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
886,887,0,2,"Montvila, Rev. Juozas",male,27.0,0,0,211536,13.0000,,S,0,889
887,888,1,1,"Graham, Miss. Margaret Edith",female,19.0,0,0,112053,30.0000,B42,S,0,889
888,889,0,3,"Johnston, Miss. Catherine Helen ""Carrie""",female,,1,2,W./C. 6607,23.4500,,S,0,892
889,890,1,1,"Behr, Mr. Karl Howell",male,26.0,0,0,111369,30.0000,C148,C,0,891
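
The two column-creation cells can be reproduced on a tiny frame. This is only a sketch: `toy` mirrors the first three rows shown above, and `new_col2` is a hypothetical extra column used to show `assign`, which returns a new frame instead of modifying the original.

```python
import pandas as pd

toy = pd.DataFrame({"PassengerId": [1, 2, 3], "Pclass": [3, 1, 3]})

# A scalar is broadcast to every row.
toy["new_colum"] = 0

# Column arithmetic is element-wise and aligned on the index.
toy["new_col1"] = toy["PassengerId"] + toy["Pclass"]

# assign leaves toy untouched and returns a copy with the extra column.
extended = toy.assign(new_col2=toy["PassengerId"] * 2)

print(toy["new_col1"].tolist())  # [4, 3, 6]
```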


In [131]:
df2['Pclass']

0      3
1      1
2      3
3      1
4      3
      ..
886    2
887    1
888    3
889    1
890    3
Name: Pclass, Length: 891, dtype: int64

In [132]:
pd.Categorical(df2['Pclass'])
#pd.Categorical(df2['Pclass']) builds a categorical array from the values in the 'Pclass' column of df2. A categorical variable takes on a limited, usually fixed, set of values; here the categories are the three passenger classes 1, 2 and 3.

[3, 1, 3, 1, 3, ..., 2, 1, 3, 1, 3]
Length: 891
Categories (3, int64): [1, 2, 3]

In [133]:
pd.Categorical(df2['Survived'])

[0, 1, 1, 1, 0, ..., 0, 1, 0, 1, 0]
Length: 891
Categories (2, int64): [0, 1]

In [134]:
pd.Categorical(df2['Cabin'])

[NaN, 'C85', NaN, 'C123', NaN, ..., NaN, 'B42', NaN, 'C148', NaN]
Length: 891
Categories (147, object): ['A10', 'A14', 'A16', 'A19', ..., 'F38', 'F4', 'G6', 'T']
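
To make the categorical outputs above more concrete, the sketch below inspects the two parts of a categorical array: its categories and the integer codes that point into them. The short `cabins` Series is made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical cabin values with missing entries.
cabins = pd.Series([np.nan, "C85", np.nan, "C123", "C85"])

cat = pd.Categorical(cabins)

# Categories are the distinct non-missing values, sorted.
print(list(cat.categories))  # ['C123', 'C85']

# Each value is stored as an integer code; missing values get -1.
print(cat.codes.tolist())    # [-1, 1, -1, 0, 1]

# To keep the categorical data inside a DataFrame column, astype is the usual route.
as_column = cabins.astype("category")
```

Missing values never become a category of their own, which is why the Cabin output lists NaN among the values but not among the 147 categories.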

In [182]:
df2["Cabin"].unique()
#df2["Cabin"].unique() returns an array of the unique values in the Cabin column of df2, in order of first appearance and including NaN.

array([nan, 'C85', 'C123', 'E46', 'G6', 'C103', 'D56', 'A6',
       'C23 C25 C27', 'B78', 'D33', 'B30', 'C52', 'B28', 'C83', 'F33',
       'F G73', 'E31', 'A5', 'D10 D12', 'D26', 'C110', 'B58 B60', 'E101',
       'F E69', 'D47', 'B86', 'F2', 'C2', 'E33', 'B19', 'A7', 'C49', 'F4',
       'A32', 'B4', 'B80', 'A31', 'D36', 'D15', 'C93', 'C78', 'D35',
       'C87', 'B77', 'E67', 'B94', 'C125', 'C99', 'C118', 'D7', 'A19',
       'B49', 'D', 'C22 C26', 'C106', 'C65', 'E36', 'C54',
       'B57 B59 B63 B66', 'C7', 'E34', 'C32', 'B18', 'C124', 'C91', 'E40',
       'T', 'C128', 'D37', 'B35', 'E50', 'C82', 'B96 B98', 'E10', 'E44',
       'A34', 'C104', 'C111', 'C92', 'E38', 'D21', 'E12', 'E63', 'A14',
       'B37', 'C30', 'D20', 'B79', 'E25', 'D46', 'B73', 'C95', 'B38',
       'B39', 'B22', 'C86', 'C70', 'A16', 'C101', 'C68', 'A10', 'E68',
       'B41', 'A20', 'D19', 'D50', 'D9', 'A23', 'B50', 'A26', 'D48',
       'E58', 'C126', 'B71', 'B51 B53 B55', 'D49', 'B5', 'B20', 'F G63',
       'C62 C64',
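
A short sketch of how `unique` relates to `nunique`, using a made-up `cabins` Series rather than the real column: `unique` keeps the missing value in its result, while `nunique` leaves it out unless told otherwise.

```python
import numpy as np
import pandas as pd

# Hypothetical cabin values with missing entries.
cabins = pd.Series([np.nan, "C85", np.nan, "C123", "C85"])

uniq = cabins.unique()                          # NaN, 'C85', 'C123' in order of appearance
n_non_missing = cabins.nunique()                # NaN is ignored by default
n_with_missing = cabins.nunique(dropna=False)   # NaN counted as one more value

print(len(uniq), n_non_missing, n_with_missing)  # 3 2 3
```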

In [139]:
df2.head(3)

Unnamed: 0,PassengerId,Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked,new_colum,new_col1
0,1,0,3,"Braund, Mr. Owen Harris",male,22.0,1,0,A/5 21171,7.25,,S,0,4
1,2,1,1,"Cumings, Mrs. John Bradley (Florence Briggs Th...",female,38.0,1,0,PC 17599,71.2833,C85,C,0,3
2,3,1,3,"Heikkinen, Miss. Laina",female,26.0,0,0,STON/O2. 3101282,7.925,,S,0,6


In [140]:
df2['Age']>18

0       True
1       True
2       True
3       True
4       True
       ...  
886     True
887     True
888    False
889     True
890     True
Name: Age, Length: 891, dtype: bool

In [141]:
df2[df2['Age']>18]

Unnamed: 0,PassengerId,Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked,new_colum,new_col1
0,1,0,3,"Braund, Mr. Owen Harris",male,22.0,1,0,A/5 21171,7.2500,,S,0,4
1,2,1,1,"Cumings, Mrs. John Bradley (Florence Briggs Th...",female,38.0,1,0,PC 17599,71.2833,C85,C,0,3
2,3,1,3,"Heikkinen, Miss. Laina",female,26.0,0,0,STON/O2. 3101282,7.9250,,S,0,6
3,4,1,1,"Futrelle, Mrs. Jacques Heath (Lily May Peel)",female,35.0,1,0,113803,53.1000,C123,S,0,5
4,5,0,3,"Allen, Mr. William Henry",male,35.0,0,0,373450,8.0500,,S,0,8
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
885,886,0,3,"Rice, Mrs. William (Margaret Norton)",female,39.0,0,5,382652,29.1250,,Q,0,889
886,887,0,2,"Montvila, Rev. Juozas",male,27.0,0,0,211536,13.0000,,S,0,889
887,888,1,1,"Graham, Miss. Margaret Edith",female,19.0,0,0,112053,30.0000,B42,S,0,889
889,890,1,1,"Behr, Mr. Karl Howell",male,26.0,0,0,111369,30.0000,C148,C,0,891


In [142]:
len(df2)-len(df2[df2['Age']>18])

316
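
The count above should be read with care: a comparison against a missing Age is False, so subtracting the number of passengers over 18 from the total counts both passengers aged 18 or under and passengers whose age is unknown. A minimal sketch on made-up ages separates the two groups.

```python
import numpy as np
import pandas as pd

# Hypothetical ages, one of them missing.
ages = pd.DataFrame({"Age": [22.0, 38.0, np.nan, 2.0, 18.0]})

adult = ages["Age"] > 18                 # NaN > 18 evaluates to False
n_adult = int(adult.sum())               # True counts as 1
n_not_adult = len(ages) - n_adult        # mixes "18 or under" with "unknown"

n_missing = int(ages["Age"].isna().sum())
n_known_not_adult = int((ages["Age"] <= 18).sum())

print(n_adult, n_not_adult, n_missing, n_known_not_adult)  # 2 3 1 2
```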

In [143]:
df2['Fare']< 32.204

0       True
1      False
2       True
3      False
4       True
       ...  
886     True
887     True
888     True
889     True
890     True
Name: Fare, Length: 891, dtype: bool

In [144]:
df2[df2['Fare']< 32.204]

Unnamed: 0,PassengerId,Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked,new_colum,new_col1
0,1,0,3,"Braund, Mr. Owen Harris",male,22.0,1,0,A/5 21171,7.2500,,S,0,4
2,3,1,3,"Heikkinen, Miss. Laina",female,26.0,0,0,STON/O2. 3101282,7.9250,,S,0,6
4,5,0,3,"Allen, Mr. William Henry",male,35.0,0,0,373450,8.0500,,S,0,8
5,6,0,3,"Moran, Mr. James",male,,0,0,330877,8.4583,,Q,0,9
7,8,0,3,"Palsson, Master. Gosta Leonard",male,2.0,3,1,349909,21.0750,,S,0,11
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
886,887,0,2,"Montvila, Rev. Juozas",male,27.0,0,0,211536,13.0000,,S,0,889
887,888,1,1,"Graham, Miss. Margaret Edith",female,19.0,0,0,112053,30.0000,B42,S,0,889
888,889,0,3,"Johnston, Miss. Catherine Helen ""Carrie""",female,,1,2,W./C. 6607,23.4500,,S,0,892
889,890,1,1,"Behr, Mr. Karl Howell",male,26.0,0,0,111369,30.0000,C148,C,0,891


In [145]:
df2[df2['Fare']> 32.204]

Unnamed: 0,PassengerId,Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked,new_colum,new_col1
1,2,1,1,"Cumings, Mrs. John Bradley (Florence Briggs Th...",female,38.0,1,0,PC 17599,71.2833,C85,C,0,3
3,4,1,1,"Futrelle, Mrs. Jacques Heath (Lily May Peel)",female,35.0,1,0,113803,53.1000,C123,S,0,5
6,7,0,1,"McCarthy, Mr. Timothy J",male,54.0,0,0,17463,51.8625,E46,S,0,8
23,24,1,1,"Sloper, Mr. William Thompson",male,28.0,0,0,113788,35.5000,A6,S,0,25
27,28,0,1,"Fortune, Mr. Charles Alexander",male,19.0,3,2,19950,263.0000,C23 C25 C27,S,0,29
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
856,857,1,1,"Wick, Mrs. George Dennick (Mary Hitchcock)",female,45.0,1,1,36928,164.8667,,S,0,858
863,864,0,3,"Sage, Miss. Dorothy Edith ""Dolly""",female,,8,2,CA. 2343,69.5500,,S,0,867
867,868,0,1,"Roebling, Mr. Washington Augustus II",male,31.0,0,0,PC 17590,50.4958,A24,S,0,869
871,872,1,1,"Beckwith, Mrs. Richard Leonard (Sallie Monypeny)",female,47.0,1,1,11751,52.5542,D35,S,0,873


In [146]:
df2['Fare'] ==0

0      False
1      False
2      False
3      False
4      False
       ...  
886    False
887    False
888    False
889    False
890    False
Name: Fare, Length: 891, dtype: bool

In [147]:
df2[df2['Fare'] ==0]

Unnamed: 0,PassengerId,Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked,new_colum,new_col1
179,180,0,3,"Leonard, Mr. Lionel",male,36.0,0,0,LINE,0.0,,S,0,183
263,264,0,1,"Harrison, Mr. William",male,40.0,0,0,112059,0.0,B94,S,0,265
271,272,1,3,"Tornquist, Mr. William Henry",male,25.0,0,0,LINE,0.0,,S,0,275
277,278,0,2,"Parkes, Mr. Francis ""Frank""",male,,0,0,239853,0.0,,S,0,280
302,303,0,3,"Johnson, Mr. William Cahoone Jr",male,19.0,0,0,LINE,0.0,,S,0,306
413,414,0,2,"Cunningham, Mr. Alfred Fleming",male,,0,0,239853,0.0,,S,0,416
466,467,0,2,"Campbell, Mr. William",male,,0,0,239853,0.0,,S,0,469
481,482,0,2,"Frost, Mr. Anthony Wood ""Archie""",male,,0,0,239854,0.0,,S,0,484
597,598,0,3,"Johnson, Mr. Alfred",male,49.0,0,0,LINE,0.0,,S,0,601
633,634,0,1,"Parr, Mr. William Henry Marsh",male,,0,0,112052,0.0,,S,0,635


In [148]:
df2[df2['Fare'] ==0]['Name']

179                 Leonard, Mr. Lionel
263               Harrison, Mr. William
271        Tornquist, Mr. William Henry
277         Parkes, Mr. Francis "Frank"
302     Johnson, Mr. William Cahoone Jr
413      Cunningham, Mr. Alfred Fleming
466               Campbell, Mr. William
481    Frost, Mr. Anthony Wood "Archie"
597                 Johnson, Mr. Alfred
633       Parr, Mr. William Henry Marsh
674          Watson, Mr. Ennis Hastings
732                Knight, Mr. Robert J
806              Andrews, Mr. Thomas Jr
815                    Fry, Mr. Richard
822     Reuchlin, Jonkheer. John George
Name: Name, dtype: object

In [149]:
len(df2[df2['Fare'] ==0])

15
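
The filter, column selection and count in the cells above can also be written with a named mask. This is a sketch on a made-up frame (the names A to D are placeholders): summing a boolean mask counts the matches without building the filtered frame, and `.loc` selects rows and a column in one step.

```python
import pandas as pd

# Hypothetical frame; names and fares are placeholders.
toy = pd.DataFrame({
    "Name": ["A", "B", "C", "D"],
    "Fare": [0.0, 7.25, 0.0, 30.0],
})

free = toy["Fare"] == 0          # boolean mask, one entry per row

count = int(free.sum())          # same number as len(toy[free])
names = toy.loc[free, "Name"]    # rows from the mask, one column

print(count)           # 2
print(names.tolist())  # ['A', 'C']
```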

In [150]:
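df2['Sex']=='Male'
#The comparison is case-sensitive. The Sex column stores 'male' in lowercase, so 'Male' matches no row and every value below is False.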

0      False
1      False
2      False
3      False
4      False
       ...  
886    False
887    False
888    False
889    False
890    False
Name: Sex, Length: 891, dtype: bool

In [151]:
df2[df2['Sex']=='male']

Unnamed: 0,PassengerId,Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked,new_colum,new_col1
0,1,0,3,"Braund, Mr. Owen Harris",male,22.0,1,0,A/5 21171,7.2500,,S,0,4
4,5,0,3,"Allen, Mr. William Henry",male,35.0,0,0,373450,8.0500,,S,0,8
5,6,0,3,"Moran, Mr. James",male,,0,0,330877,8.4583,,Q,0,9
6,7,0,1,"McCarthy, Mr. Timothy J",male,54.0,0,0,17463,51.8625,E46,S,0,8
7,8,0,3,"Palsson, Master. Gosta Leonard",male,2.0,3,1,349909,21.0750,,S,0,11
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
883,884,0,2,"Banfield, Mr. Frederick James",male,28.0,0,0,C.A./SOTON 34068,10.5000,,S,0,886
884,885,0,3,"Sutehall, Mr. Henry Jr",male,25.0,0,0,SOTON/OQ 392076,7.0500,,S,0,888
886,887,0,2,"Montvila, Rev. Juozas",male,27.0,0,0,211536,13.0000,,S,0,889
889,890,1,1,"Behr, Mr. Karl Howell",male,26.0,0,0,111369,30.0000,C148,C,0,891


In [152]:
len(df2[df2['Sex']=='male'])

577

In [153]:
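df2['Sex']=='Female'
#As with 'Male', the capitalised 'Female' matches no row because the column stores 'female' in lowercase.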

0      False
1      False
2      False
3      False
4      False
       ...  
886    False
887    False
888    False
889    False
890    False
Name: Sex, Length: 891, dtype: bool

In [154]:
df2[df2['Sex']=='female']

Unnamed: 0,PassengerId,Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked,new_colum,new_col1
1,2,1,1,"Cumings, Mrs. John Bradley (Florence Briggs Th...",female,38.0,1,0,PC 17599,71.2833,C85,C,0,3
2,3,1,3,"Heikkinen, Miss. Laina",female,26.0,0,0,STON/O2. 3101282,7.9250,,S,0,6
3,4,1,1,"Futrelle, Mrs. Jacques Heath (Lily May Peel)",female,35.0,1,0,113803,53.1000,C123,S,0,5
8,9,1,3,"Johnson, Mrs. Oscar W (Elisabeth Vilhelmina Berg)",female,27.0,0,2,347742,11.1333,,S,0,12
9,10,1,2,"Nasser, Mrs. Nicholas (Adele Achem)",female,14.0,1,0,237736,30.0708,,C,0,12
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
880,881,1,2,"Shelley, Mrs. William (Imanita Parrish Hall)",female,25.0,0,1,230433,26.0000,,S,0,883
882,883,0,3,"Dahlberg, Miss. Gerda Ulrika",female,22.0,0,0,7552,10.5167,,S,0,886
885,886,0,3,"Rice, Mrs. William (Margaret Norton)",female,39.0,0,5,382652,29.1250,,Q,0,889
887,888,1,1,"Graham, Miss. Margaret Edith",female,19.0,0,0,112053,30.0000,B42,S,0,889


In [155]:
len(df2[df2['Sex']=='female'])

314
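
The all-False results for 'Male' and 'Female' come from string comparison being case-sensitive. A small sketch on a made-up `sex` Series shows two ways to guard against this: inspect the actual spellings with `value_counts`, or normalise the case before comparing.

```python
import pandas as pd

# Hypothetical values, stored in lowercase like the Sex column.
sex = pd.Series(["male", "female", "female", "male", "male"])

# value_counts reveals the exact spellings present in the column.
counts = sex.value_counts()

exact = int((sex == "Male").sum())                   # no match: wrong capitalisation
normalized = int((sex.str.lower() == "male").sum())  # case-insensitive match

print(exact, normalized)  # 0 3
```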

In [156]:
df2['Pclass']==1

0      False
1       True
2      False
3       True
4      False
       ...  
886    False
887     True
888    False
889     True
890    False
Name: Pclass, Length: 891, dtype: bool

In [157]:
df2[df2['Pclass']==1]

Unnamed: 0,PassengerId,Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked,new_colum,new_col1
1,2,1,1,"Cumings, Mrs. John Bradley (Florence Briggs Th...",female,38.0,1,0,PC 17599,71.2833,C85,C,0,3
3,4,1,1,"Futrelle, Mrs. Jacques Heath (Lily May Peel)",female,35.0,1,0,113803,53.1000,C123,S,0,5
6,7,0,1,"McCarthy, Mr. Timothy J",male,54.0,0,0,17463,51.8625,E46,S,0,8
11,12,1,1,"Bonnell, Miss. Elizabeth",female,58.0,0,0,113783,26.5500,C103,S,0,13
23,24,1,1,"Sloper, Mr. William Thompson",male,28.0,0,0,113788,35.5000,A6,S,0,25
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
871,872,1,1,"Beckwith, Mrs. Richard Leonard (Sallie Monypeny)",female,47.0,1,1,11751,52.5542,D35,S,0,873
872,873,0,1,"Carlsson, Mr. Frans Olof",male,33.0,0,0,695,5.0000,B51 B53 B55,S,0,874
879,880,1,1,"Potter, Mrs. Thomas Jr (Lily Alexenia Wilson)",female,56.0,0,1,11767,83.1583,C50,C,0,881
887,888,1,1,"Graham, Miss. Margaret Edith",female,19.0,0,0,112053,30.0000,B42,S,0,889


In [158]:
len(df2[df2['Pclass']==1])

216

In [159]:
df2['Survived']==1

0      False
1       True
2       True
3       True
4      False
       ...  
886    False
887     True
888    False
889     True
890    False
Name: Survived, Length: 891, dtype: bool

In [160]:
df2[df2['Survived']==1]

Unnamed: 0,PassengerId,Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked,new_colum,new_col1
1,2,1,1,"Cumings, Mrs. John Bradley (Florence Briggs Th...",female,38.0,1,0,PC 17599,71.2833,C85,C,0,3
2,3,1,3,"Heikkinen, Miss. Laina",female,26.0,0,0,STON/O2. 3101282,7.9250,,S,0,6
3,4,1,1,"Futrelle, Mrs. Jacques Heath (Lily May Peel)",female,35.0,1,0,113803,53.1000,C123,S,0,5
8,9,1,3,"Johnson, Mrs. Oscar W (Elisabeth Vilhelmina Berg)",female,27.0,0,2,347742,11.1333,,S,0,12
9,10,1,2,"Nasser, Mrs. Nicholas (Adele Achem)",female,14.0,1,0,237736,30.0708,,C,0,12
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
875,876,1,3,"Najib, Miss. Adele Kiamie ""Jane""",female,15.0,0,0,2667,7.2250,,C,0,879
879,880,1,1,"Potter, Mrs. Thomas Jr (Lily Alexenia Wilson)",female,56.0,0,1,11767,83.1583,C50,C,0,881
880,881,1,2,"Shelley, Mrs. William (Imanita Parrish Hall)",female,25.0,0,1,230433,26.0000,,S,0,883
887,888,1,1,"Graham, Miss. Margaret Edith",female,19.0,0,0,112053,30.0000,B42,S,0,889


In [161]:
len(df2[df2['Survived']==1])

342

In [162]:
len(df2[df2['Survived']==0])

549

In [163]:
df2['Sex']== 'female'

0      False
1       True
2       True
3       True
4      False
       ...  
886    False
887     True
888     True
889    False
890    False
Name: Sex, Length: 891, dtype: bool

In [164]:
df2['Fare']>32

0      False
1       True
2      False
3       True
4      False
       ...  
886    False
887    False
888    False
889    False
890    False
Name: Fare, Length: 891, dtype: bool

In [166]:
df[(df2['Sex']== 'female') & (df2['Fare']>32)]
#The mask combines two conditions on df2 ('Sex' equals 'female' and 'Fare' is greater than 32), but it is applied to df, a different object, so the result below is empty. Indexing df2 with the same mask, as in the next cell, keeps the matching Titanic rows.

Series([], dtype: object)

In [167]:
len(df2[(df2['Sex']== 'female') & (df2['Fare']>32)])

104

In [168]:
len(df2[(df2['Sex']== 'male') | (df2['Fare']>32)])

681
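
The combined filters above rely on the element-wise operators `&` (and) and `|` (or). Each condition needs its own parentheses because these operators bind more tightly than `==` and `>`. The sketch below uses a made-up five-row frame and also shows `~` for negation.

```python
import pandas as pd

# Hypothetical rows loosely modelled on the columns used above.
toy = pd.DataFrame({
    "Sex": ["male", "female", "female", "male", "female"],
    "Fare": [7.25, 71.2833, 7.925, 51.8625, 53.1],
})

is_female = toy["Sex"] == "female"
expensive = toy["Fare"] > 32

both = toy[is_female & expensive]                       # female AND fare above 32
either = toy[(toy["Sex"] == "male") | expensive]        # male OR fare above 32
neither = toy[~((toy["Sex"] == "male") | expensive)]    # the complement of "either"

print(len(both), len(either), len(neither))  # 2 4 1
```

Naming the masks first keeps the mask and the indexed frame tied to the same object, which avoids mixing two different frames in one expression.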

In [169]:
df2.head(4)

Unnamed: 0,PassengerId,Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked,new_colum,new_col1
0,1,0,3,"Braund, Mr. Owen Harris",male,22.0,1,0,A/5 21171,7.25,,S,0,4
1,2,1,1,"Cumings, Mrs. John Bradley (Florence Briggs Th...",female,38.0,1,0,PC 17599,71.2833,C85,C,0,3
2,3,1,3,"Heikkinen, Miss. Laina",female,26.0,0,0,STON/O2. 3101282,7.925,,S,0,6
3,4,1,1,"Futrelle, Mrs. Jacques Heath (Lily May Peel)",female,35.0,1,0,113803,53.1,C123,S,0,5


In [170]:
max(df2['Fare'])

512.3292

In [171]:
df2['Fare']==max(df2['Fare'])

0      False
1      False
2      False
3      False
4      False
       ...  
886    False
887    False
888    False
889    False
890    False
Name: Fare, Length: 891, dtype: bool

In [172]:
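df[df2['Fare']==max(df2['Fare'])]
#The mask is built from df2 but applied to df, a different object, so the result below is empty. The next cell indexes df2 instead.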

Series([], dtype: object)

In [173]:
df2[df2['Fare']==max(df2['Fare'])]['Name']

258                      Ward, Miss. Anna
679    Cardeza, Mr. Thomas Drake Martinez
737                Lesurer, Mr. Gustave J
Name: Name, dtype: object
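
The maximum-fare lookup can also be written with the Series method `max`, which skips missing values by default, whereas the built-in `max` is not NaN-aware. The sketch below uses a made-up frame with a tie and a missing fare; the names A to D are placeholders.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with a tied maximum and one missing fare.
toy = pd.DataFrame({
    "Name": ["A", "B", "C", "D"],
    "Fare": [512.3292, 7.25, 512.3292, np.nan],
})

top = toy["Fare"].max()                      # NaN is ignored

# A mask keeps every row that ties for the maximum.
ties = toy.loc[toy["Fare"] == top, "Name"]

# idxmax returns only the label of the first maximum.
first_label = toy["Fare"].idxmax()

print(top)            # 512.3292
print(ties.tolist())  # ['A', 'C']
print(first_label)    # 0
```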

In [174]:
df2[0:100:2]

Unnamed: 0,PassengerId,Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked,new_colum,new_col1
0,1,0,3,"Braund, Mr. Owen Harris",male,22.0,1,0,A/5 21171,7.25,,S,0,4
2,3,1,3,"Heikkinen, Miss. Laina",female,26.0,0,0,STON/O2. 3101282,7.925,,S,0,6
4,5,0,3,"Allen, Mr. William Henry",male,35.0,0,0,373450,8.05,,S,0,8
6,7,0,1,"McCarthy, Mr. Timothy J",male,54.0,0,0,17463,51.8625,E46,S,0,8
8,9,1,3,"Johnson, Mrs. Oscar W (Elisabeth Vilhelmina Berg)",female,27.0,0,2,347742,11.1333,,S,0,12
10,11,1,3,"Sandstrom, Miss. Marguerite Rut",female,4.0,1,1,PP 9549,16.7,G6,S,0,14
12,13,0,3,"Saundercock, Mr. William Henry",male,20.0,0,0,A/5. 2151,8.05,,S,0,16
14,15,0,3,"Vestrom, Miss. Hulda Amanda Adolfina",female,14.0,0,0,350406,7.8542,,S,0,18
16,17,0,3,"Rice, Master. Eugene",male,2.0,4,1,382652,29.125,,Q,0,20
18,19,0,3,"Vander Planke, Mrs. Julius (Emelia Maria Vande...",female,31.0,1,0,345763,18.0,,S,0,22


In [176]:
df2.iloc[0:2]
#df2.iloc[0:2] returns the first two rows of df2 using integer-based (positional) indexing; the end position 2 is excluded.

Unnamed: 0,PassengerId,Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked,new_colum,new_col1
0,1,0,3,"Braund, Mr. Owen Harris",male,22.0,1,0,A/5 21171,7.25,,S,0,4
1,2,1,1,"Cumings, Mrs. John Bradley (Florence Briggs Th...",female,38.0,1,0,PC 17599,71.2833,C85,C,0,3


In [177]:
df2.loc[0:2]
#df2.loc[0:2] returns the rows labelled 0 through 2 of df2 using label-based indexing. Unlike a positional slice, the end label is included, so three rows come back.

Unnamed: 0,PassengerId,Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked,new_colum,new_col1
0,1,0,3,"Braund, Mr. Owen Harris",male,22.0,1,0,A/5 21171,7.25,,S,0,4
1,2,1,1,"Cumings, Mrs. John Bradley (Florence Briggs Th...",female,38.0,1,0,PC 17599,71.2833,C85,C,0,3
2,3,1,3,"Heikkinen, Miss. Laina",female,26.0,0,0,STON/O2. 3101282,7.925,,S,0,6


In [178]:
df2.loc[0:2,['PassengerId','Survived','Pclass']]
#df2.loc[0:2,['PassengerId','Survived','Pclass']] returns the rows labelled 0 through 2 of df2, restricted to the PassengerId, Survived, and Pclass columns.

Unnamed: 0,PassengerId,Survived,Pclass
0,1,0,3
1,2,1,1
2,3,1,3


In [179]:
df2.iloc[0:2,['PassengerId','Survived','Pclass']]
#This raises an IndexError, because .iloc only accepts positions: each indexer must be an integer, a list of integers, or a slice, not column names.
#.iloc uses integer-based indexing, selecting rows and columns by their position, starting from 0. The first indexer selects the rows and the second selects the columns.

IndexError: .iloc requires numeric indexers, got ['PassengerId' 'Survived' 'Pclass']

In [180]:
df2.iloc[0:2,[0,1,2]]
#df2.iloc[0:2,[0,1,2]] returns the first two rows of df2 and the columns at positions 0, 1 and 2, which are PassengerId, Survived, and Pclass.

Unnamed: 0,PassengerId,Survived,Pclass
0,1,0,3
1,2,1,1
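
The `.loc` and `.iloc` cells above differ in two ways: `.loc` works with labels and includes the end of a slice, while `.iloc` works with positions and excludes it. The sketch below, on a made-up four-row frame, puts both side by side and uses `Index.get_indexer` to translate column names into positions when a positional row slice is wanted together with named columns.

```python
import pandas as pd

# Hypothetical frame with the same leading columns as df2.
toy = pd.DataFrame({
    "PassengerId": [1, 2, 3, 4],
    "Survived": [0, 1, 1, 1],
    "Pclass": [3, 1, 3, 1],
    "Sex": ["male", "female", "female", "female"],
})
wanted = ["PassengerId", "Survived", "Pclass"]

by_label = toy.loc[0:2, wanted]       # labels 0, 1, 2: three rows
by_pos = toy.iloc[0:2, [0, 1, 2]]     # positions 0, 1: two rows

# Translate names to positions so .iloc can be used with column names.
cols = toy.columns.get_indexer(wanted)
mixed = toy.iloc[0:2, cols]

print(len(by_label), len(by_pos))  # 3 2
print(cols.tolist())               # [0, 1, 2]
```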
