### PANDAS 

Pandas (Python for Data Analysis) is a Python library that focuses on data analysis tasks such as data manipulation, preprocessing, and cleaning. It provides labeled data structures and high-level functions for working with them. The two main objects in Pandas are Series and DataFrame.

In [1]:
import pandas as pd
import numpy as np

OBJECT SERIES

A Series is a one-dimensional array object with labels. It is well suited to representing and manipulating a single column of data.

In [2]:
odd_number = [1, 3, 5, 7, 9]

In [3]:
print(odd_number)

[1, 3, 5, 7, 9]


In [4]:
odd_number = [1, 3, 5, 7, 9]

odd_number = pd.Series(odd_number)

odd_number

0    1
1    3
2    5
3    7
4    9
dtype: int64

In [5]:
#convert series to array

odd_number.values

array([1, 3, 5, 7, 9], dtype=int64)

INDEX

Indexing is the process of accessing a specific element in a sequence, such as a string or list, using its position (index number). In Python, indexing begins at 0: the first element in a sequence has index 0, the second has index 1, and so on.

In [6]:
#show the index

odd_number.index

RangeIndex(start=0, stop=5, step=1)

Index of a string element

In [7]:
odd_number = [1, 3, 'five', 7, 9, 'eleven']

In [8]:
number = 'five'

In [9]:
odd_number.index(number)

2

Index with List Comprehension

Because index() returns only the first match, you can use a list comprehension with enumerate() to collect the positions of every match in the list.

In [10]:
numbers = [5, 11, 1, 3, 3, 13, 5, 9, 3, 7, 11, 9, 17]

[a for a, n in enumerate(numbers) if n == 3]

[3, 4, 8]

In [11]:
print("the indexes of 3 in numbers, found with a list comprehension, are: {}".format([a for a, n in enumerate(numbers) if n == 3]))

the indexes of 3 in numbers, found with a list comprehension, are: [3, 4, 8]


In [12]:
#access an element by position (default list indexing)

odd_number[3]

7

In [13]:
odd_number = [1, 3, 5, 7, 9]

odd_number = pd.Series(odd_number)

odd_number

0    1
1    3
2    5
3    7
4    9
dtype: int64

In [14]:
#custom index

odd_number = pd.Series([1, 3, 5, 7, 9], index=['e', 'f', 'g', 'h', 'i'])

In [15]:
odd_number

e    1
f    3
g    5
h    7
i    9
dtype: int64

In [16]:
odd_number.values

array([1, 3, 5, 7, 9], dtype=int64)

In [17]:
odd_number.index

Index(['e', 'f', 'g', 'h', 'i'], dtype='object')

There are two types of index:

1. Explicit index

The explicit index uses the labels (numeric or non-numeric) that were set as the index.

In [18]:
#explicit index

odd_number['g']

5

2. Implicit index

The implicit index uses each element's integer position (0, 1, 2, and so on), in the same way as ordinary Python sequence indexing.

In [19]:
#implicit index

odd_number[2]


5

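#Notes:

This call emits a FutureWarning, not an error. The warning says that using an integer key in square brackets to mean a position is deprecated for a Series, and that future pandas versions will always treat integer keys as labels. To access a value by position, use .iloc instead.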
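A minimal sketch of how to avoid the ambiguity, assuming the same Series as in the cells above: .loc always looks up by label and .iloc always looks up by position, so neither depends on how square brackets interpret an integer.

```python
import pandas as pd

# Same data and string labels as the example above
odd_number = pd.Series([1, 3, 5, 7, 9], index=['e', 'f', 'g', 'h', 'i'])

by_label = odd_number.loc['g']     # explicit index: look up the label 'g'
by_position = odd_number.iloc[2]   # implicit index: take the third element

print(by_label, by_position)
```

Both lookups reach the same element here because 'g' is the label at position 2.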

In [20]:
#custom integer index; some labels coincide with position numbers

odd_number1 = pd.Series([1, 3, 5, 7, 9], index=[3, 6, 1, 8, 7])

In [21]:
odd_number1

3    1
6    3
1    5
8    7
7    9
dtype: int64

In [22]:
#explicit index

odd_number1[1]

5

In [23]:
#implicit index

odd_number1[2]

KeyError: 2

#Notes:

When the index labels are themselves integers, square-bracket access with an integer is interpreted as a label lookup (the explicit index), not as a position. There is no label 2 in odd_number1, so odd_number1[2] raises a KeyError. Use odd_number1.iloc[2] to access by position.
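The behavior described in this note can be sketched as follows, using the same integer-labeled Series. The variable names other than odd_number1 are only for illustration.

```python
import pandas as pd

odd_number1 = pd.Series([1, 3, 5, 7, 9], index=[3, 6, 1, 8, 7])

value_at_label_3 = odd_number1.loc[3]      # label 3 is the first row
value_at_position_3 = odd_number1.iloc[3]  # position 3 is the fourth row

try:
    odd_number1.loc[2]                     # there is no label 2
    missing_label = False
except KeyError:
    missing_label = True

print(value_at_label_3, value_at_position_3, missing_label)
```

The same integer, 3, reaches different elements depending on whether it is read as a label or as a position, which is why the explicit accessors are safer with integer indexes.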

INDEX WITH DATA SLICING

In [24]:
#create a Series

odd_number = [1, 3, 5, 7, 9]
odd_number = pd.Series(odd_number)

print(odd_number)

0    1
1    3
2    5
3    7
4    9
dtype: int64


In [25]:
#create explicit index

odd_number = pd.Series([1, 3, 5, 7, 9], index=['e', 'f', 'g', 'h', 'i'])

In [26]:
#slicing explicit

odd_number['e':'g']

e    1
f    3
g    5
dtype: int64

In [27]:
#slicing implicit

odd_number[0:2]

e    1
f    3
dtype: int64
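The two slices above differ in how they treat the end point: a label slice includes its end label, while a positional slice excludes its end position. A small sketch with the same Series, written with .loc and .iloc to make the intent explicit:

```python
import pandas as pd

odd_number = pd.Series([1, 3, 5, 7, 9], index=['e', 'f', 'g', 'h', 'i'])

label_slice = odd_number.loc['e':'g']    # end label 'g' is included
position_slice = odd_number.iloc[0:2]    # end position 2 is excluded

print(list(label_slice.index))
print(list(position_slice.index))
```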

LOC & ILOC

**loc** is a label-based indexer: you refer to rows and columns by their labels, for example a column named 'month'. Integers may be used, but they are treated as labels, not positions. **iloc** is an integer-position-based indexer that starts from 0: you give the row and column positions as numbers, so row 0 is the first row, column 1 is the second column, and so on. Slices with loc include the end label, while slices with iloc exclude the end position.
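As a rough illustration of rows and columns together, the sketch below uses a small made-up table. The DataFrame name sales, its 'month' and 'units' columns, and the row labels 101 to 103 are all hypothetical.

```python
import pandas as pd

# Hypothetical table; names and values are made up for illustration
sales = pd.DataFrame(
    {'month': ['Jan', 'Feb', 'Mar'], 'units': [10, 12, 9]},
    index=[101, 102, 103],
)

by_label = sales.loc[102, 'month']   # row labeled 102, column named 'month'
by_position = sales.iloc[0, 1]       # first row, second column

print(by_label, by_position)
```

Note that 102 is treated as a row label by loc, even though it is an integer.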

In [28]:
#create a Series

odd_number = [1, 3, 5, 7, 9]
odd_number = pd.Series(odd_number)

print(odd_number)

0    1
1    3
2    5
3    7
4    9
dtype: int64


In [29]:
#create explicit index

odd_number2 = pd.Series([1, 3, 5, 7, 9], index=[3, 6, 1, 8, 7])

In [30]:
#use a loc

odd_number2.loc[3:1]

3    1
6    3
1    5
dtype: int64

In [31]:
#use an iloc

odd_number2.iloc[0:2]

3    1
6    3
dtype: int64

INDEX WITH DATA DICTIONARY

In [32]:
dict_snack = {'Zira':'Bengbeng',
              'Cila':'Bisvit',
              'Aya':'Biskuat',
              'Sasa':'Bonbon',
              'Lala':'Better'}

In [33]:
dict_snack

{'Zira': 'Bengbeng',
 'Cila': 'Bisvit',
 'Aya': 'Biskuat',
 'Sasa': 'Bonbon',
 'Lala': 'Better'}

In [34]:
#transform the dictionary to the Series

snack = pd.Series(dict_snack)

In [35]:
snack

Zira    Bengbeng
Cila      Bisvit
Aya      Biskuat
Sasa      Bonbon
Lala      Better
dtype: object

In [36]:
dict_age = {'Zira':7,
              'Cila':10,
              'Aya':8,
              'Sasa':6,
              'Lala':10}

In [37]:
age = pd.Series(dict_age)

In [38]:
age

Zira     7
Cila    10
Aya      8
Sasa     6
Lala    10
dtype: int64

In [39]:
snack.loc['Lala']

'Better'

In [40]:
snack.iloc[0]

'Bengbeng'

-----------------------------------------------------------

OBJECT DATAFRAME

DataFrame is a data structure that is **tabular and column-oriented**, containing labels for each row and column.

In [41]:
#import pandas & NumPy

import pandas as pd
import numpy as np

In [42]:
dict_snack = {'Zira':'Bengbeng',
              'Cila':'Bisvit',
              'Aya':'Biskuat',
              'Sasa':'Bonbon',
              'Lala':'Better'}

In [43]:
snack = pd.Series(dict_snack)

In [44]:
snack

Zira    Bengbeng
Cila      Bisvit
Aya      Biskuat
Sasa      Bonbon
Lala      Better
dtype: object

In [45]:
dict_age = {'Zira':7,
              'Cila':10,
              'Aya':8,
              'Sasa':6,
              'Lala':10}

In [46]:
age = pd.Series(dict_age)

In [47]:
age

Zira     7
Cila    10
Aya      8
Sasa     6
Lala    10
dtype: int64

Combine the Series into a DataFrame

In [48]:
kids = pd.DataFrame({"favorite snack":snack, "child's age":age})

In [49]:
kids

      favorite snack  child's age
Zira        Bengbeng            7
Cila          Bisvit           10
Aya          Biskuat            8
Sasa          Bonbon            6
Lala          Better           10


In [50]:
kids["favorite snack"]

Zira    Bengbeng
Cila      Bisvit
Aya      Biskuat
Sasa      Bonbon
Lala      Better
Name: favorite snack, dtype: object

In [51]:
kids["child's age"]

Zira     7
Cila    10
Aya      8
Sasa     6
Lala    10
Name: child's age, dtype: int64

In [52]:
kids["child's age"]["Zira"]

7

In [53]:
#explicit index

kids["favorite snack"]["Cila":"Sasa"]

Cila     Bisvit
Aya     Biskuat
Sasa     Bonbon
Name: favorite snack, dtype: object

In [54]:
#implicit index

kids["child's age"].iloc[1:3]

Cila    10
Aya      8
Name: child's age, dtype: int64

USING DATA CSV

In [55]:
#load the Titanic data

df = pd.read_csv('Titanic.csv')

In [56]:
#view the first rows of the data (default: 5 rows)

df.head()

Unnamed: 0,PassengerId,Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked
0,1,0,3,"Braund, Mr. Owen Harris",male,22.0,1,0,A/5 21171,7.25,,S
1,2,1,1,"Cumings, Mrs. John Bradley (Florence Briggs Th...",female,38.0,1,0,PC 17599,71.2833,C85,C
2,3,1,3,"Heikkinen, Miss. Laina",female,26.0,0,0,STON/O2. 3101282,7.925,,S
3,4,1,1,"Futrelle, Mrs. Jacques Heath (Lily May Peel)",female,35.0,1,0,113803,53.1,C123,S
4,5,0,3,"Allen, Mr. William Henry",male,35.0,0,0,373450,8.05,,S


#notes:
- SibSp = number of siblings/spouses aboard
- Parch = number of parents/children aboard
- Embarked = port of embarkation (C = Cherbourg, Q = Queenstown, S = Southampton)

In [57]:
#show the data info

df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 891 entries, 0 to 890
Data columns (total 12 columns):
 #   Column       Non-Null Count  Dtype  
---  ------       --------------  -----  
 0   PassengerId  891 non-null    int64  
 1   Survived     891 non-null    int64  
 2   Pclass       891 non-null    int64  
 3   Name         891 non-null    object 
 4   Sex          891 non-null    object 
 5   Age          714 non-null    float64
 6   SibSp        891 non-null    int64  
 7   Parch        891 non-null    int64  
 8   Ticket       891 non-null    object 
 9   Fare         891 non-null    float64
 10  Cabin        204 non-null    object 
 11  Embarked     889 non-null    object 
dtypes: float64(2), int64(5), object(5)
memory usage: 83.7+ KB


In [58]:
#view the number of non-nulls in the data

df.notnull().sum()

PassengerId    891
Survived       891
Pclass         891
Name           891
Sex            891
Age            714
SibSp          891
Parch          891
Ticket         891
Fare           891
Cabin          204
Embarked       889
dtype: int64

In [59]:
#view the NaN count of the data

df.isnull().sum()

PassengerId      0
Survived         0
Pclass           0
Name             0
Sex              0
Age            177
SibSp            0
Parch            0
Ticket           0
Fare             0
Cabin          687
Embarked         2
dtype: int64
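The two counts above are complementary: for every column, the non-null count plus the null count equals the number of rows. A minimal sketch on a made-up table (the sample DataFrame and its values are assumptions, not the Titanic data) that also shows the share of missing values per column:

```python
import numpy as np
import pandas as pd

# Made-up sample with two missing ages
sample = pd.DataFrame({
    'Age': [22.0, np.nan, 26.0, np.nan],
    'Fare': [7.25, 71.28, 7.92, 53.10],
})

non_null = sample.notnull().sum()       # count of present values per column
null = sample.isnull().sum()            # count of missing values per column
missing_share = sample.isnull().mean()  # fraction of missing values per column

print(non_null + null)
print(missing_share)
```

Taking the mean of the boolean mask works because True counts as 1 and False as 0.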

In [60]:
#view the last rows of the data (custom: 7 rows)

df.tail(7)

Unnamed: 0,PassengerId,Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked
884,885,0,3,"Sutehall, Mr. Henry Jr",male,25.0,0,0,SOTON/OQ 392076,7.05,,S
885,886,0,3,"Rice, Mrs. William (Margaret Norton)",female,39.0,0,5,382652,29.125,,Q
886,887,0,2,"Montvila, Rev. Juozas",male,27.0,0,0,211536,13.0,,S
887,888,1,1,"Graham, Miss. Margaret Edith",female,19.0,0,0,112053,30.0,B42,S
888,889,0,3,"Johnston, Miss. Catherine Helen ""Carrie""",female,,1,2,W./C. 6607,23.45,,S
889,890,1,1,"Behr, Mr. Karl Howell",male,26.0,0,0,111369,30.0,C148,C
890,891,0,3,"Dooley, Mr. Patrick",male,32.0,0,0,370376,7.75,,Q


In [61]:
#view the number of rows and columns

df.shape

(891, 12)

In [62]:
#view the columns

df.columns

Index(['PassengerId', 'Survived', 'Pclass', 'Name', 'Sex', 'Age', 'SibSp',
       'Parch', 'Ticket', 'Fare', 'Cabin', 'Embarked'],
      dtype='object')

In [63]:
#show the index

df.index

RangeIndex(start=0, stop=891, step=1)

In [64]:
#show summary statistics for the numeric columns

df.describe()

Unnamed: 0,PassengerId,Survived,Pclass,Age,SibSp,Parch,Fare
count,891.0,891.0,891.0,714.0,891.0,891.0,891.0
mean,446.0,0.383838,2.308642,29.699118,0.523008,0.381594,32.204208
std,257.353842,0.486592,0.836071,14.526497,1.102743,0.806057,49.693429
min,1.0,0.0,1.0,0.42,0.0,0.0,0.0
25%,223.5,0.0,2.0,20.125,0.0,0.0,7.9104
50%,446.0,0.0,3.0,28.0,0.0,0.0,14.4542
75%,668.5,1.0,3.0,38.0,1.0,0.0,31.0
max,891.0,1.0,3.0,80.0,8.0,6.0,512.3292


In [65]:
#show the mean from column Age

df['Age'].mean()

29.69911764705882

In [66]:
#show mean from column Age (another option)

df.Age.mean()

29.69911764705882

In [67]:
#show the median from column Age

df['Age'].median()

28.0

In [68]:
#show the mode from column Age

df['Age'].mode()

0    24.0
Name: Age, dtype: float64

In [69]:
#mode() returns a Series because several values can tie, so use [0] to get the first mode as a single value

df['Age'].mode()[0]

24.0
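To see why mode() returns a Series, here is a small sketch on made-up ages in which two values tie for most frequent. The sketch uses .iloc[0] to take the first entry by position.

```python
import pandas as pd

ages = pd.Series([24, 24, 30, 30, 41])  # made-up values with a tie

modes = ages.mode()          # every most-frequent value, in sorted order
first_mode = modes.iloc[0]   # the smallest of the tied modes

print(modes.tolist(), first_mode)
```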

In [70]:
#show the minimum of data from column Age

df['Age'].min()

0.42

In [71]:
#show the maximum of data from column Age

df['Age'].max()

80.0

In [73]:
#show the rows where Age is null

df[df['Age'].isnull()]

Unnamed: 0,PassengerId,Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked
5,6,0,3,"Moran, Mr. James",male,,0,0,330877,8.4583,,Q
17,18,1,2,"Williams, Mr. Charles Eugene",male,,0,0,244373,13.0000,,S
19,20,1,3,"Masselmani, Mrs. Fatima",female,,0,0,2649,7.2250,,C
26,27,0,3,"Emir, Mr. Farred Chehab",male,,0,0,2631,7.2250,,C
28,29,1,3,"O'Dwyer, Miss. Ellen ""Nellie""",female,,0,0,330959,7.8792,,Q
...,...,...,...,...,...,...,...,...,...,...,...,...
859,860,0,3,"Razi, Mr. Raihed",male,,0,0,2629,7.2292,,C
863,864,0,3,"Sage, Miss. Dorothy Edith ""Dolly""",female,,8,2,CA. 2343,69.5500,,S
868,869,0,3,"van Melkebeke, Mr. Philemon",male,,0,0,345777,9.5000,,S
878,879,0,3,"Laleff, Mr. Kristo",male,,0,0,349217,7.8958,,S


In [74]:
#same filter, using attribute access for the column

df[df.Age.isnull()]

Unnamed: 0,PassengerId,Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked
5,6,0,3,"Moran, Mr. James",male,,0,0,330877,8.4583,,Q
17,18,1,2,"Williams, Mr. Charles Eugene",male,,0,0,244373,13.0000,,S
19,20,1,3,"Masselmani, Mrs. Fatima",female,,0,0,2649,7.2250,,C
26,27,0,3,"Emir, Mr. Farred Chehab",male,,0,0,2631,7.2250,,C
28,29,1,3,"O'Dwyer, Miss. Ellen ""Nellie""",female,,0,0,330959,7.8792,,Q
...,...,...,...,...,...,...,...,...,...,...,...,...
859,860,0,3,"Razi, Mr. Raihed",male,,0,0,2629,7.2292,,C
863,864,0,3,"Sage, Miss. Dorothy Edith ""Dolly""",female,,8,2,CA. 2343,69.5500,,S
868,869,0,3,"van Melkebeke, Mr. Philemon",male,,0,0,345777,9.5000,,S
878,879,0,3,"Laleff, Mr. Kristo",male,,0,0,349217,7.8958,,S


In [75]:
#show the unique data from column Sex

df.Sex.unique()

array(['male', 'female'], dtype=object)

In [76]:
#show the number of unique data from column Sex

df.Sex.nunique()

2

In [77]:
#show the count of each value in column Sex

df.Sex.value_counts()

Sex
male      577
female    314
Name: count, dtype: int64
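value_counts() returns absolute counts by default. If proportions are wanted instead, passing normalize=True is one option. A minimal sketch on a made-up sample rather than the Titanic column:

```python
import pandas as pd

sex = pd.Series(['male', 'female', 'male', 'male'])  # made-up sample

counts = sex.value_counts()                      # absolute counts
proportions = sex.value_counts(normalize=True)   # fractions that sum to 1

print(counts['male'], proportions['male'])
```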

In [78]:
#select specific columns

df[['Survived','Name','Age']]

Unnamed: 0,Survived,Name,Age
0,0,"Braund, Mr. Owen Harris",22.0
1,1,"Cumings, Mrs. John Bradley (Florence Briggs Th...",38.0
2,1,"Heikkinen, Miss. Laina",26.0
3,1,"Futrelle, Mrs. Jacques Heath (Lily May Peel)",35.0
4,0,"Allen, Mr. William Henry",35.0
...,...,...,...
886,0,"Montvila, Rev. Juozas",27.0
887,1,"Graham, Miss. Margaret Edith",19.0
888,0,"Johnston, Miss. Catherine Helen ""Carrie""",
889,1,"Behr, Mr. Karl Howell",26.0


In [79]:
#another option (defined by list)

sel_columns = ['Survived','Name','Age']
df[sel_columns]

Unnamed: 0,Survived,Name,Age
0,0,"Braund, Mr. Owen Harris",22.0
1,1,"Cumings, Mrs. John Bradley (Florence Briggs Th...",38.0
2,1,"Heikkinen, Miss. Laina",26.0
3,1,"Futrelle, Mrs. Jacques Heath (Lily May Peel)",35.0
4,0,"Allen, Mr. William Henry",35.0
...,...,...,...
886,0,"Montvila, Rev. Juozas",27.0
887,1,"Graham, Miss. Margaret Edith",19.0
888,0,"Johnston, Miss. Catherine Helen ""Carrie""",
889,1,"Behr, Mr. Karl Howell",26.0


In [80]:
#select the rows where Survived equals 1

df[df['Survived']==1]

Unnamed: 0,PassengerId,Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked
1,2,1,1,"Cumings, Mrs. John Bradley (Florence Briggs Th...",female,38.0,1,0,PC 17599,71.2833,C85,C
2,3,1,3,"Heikkinen, Miss. Laina",female,26.0,0,0,STON/O2. 3101282,7.9250,,S
3,4,1,1,"Futrelle, Mrs. Jacques Heath (Lily May Peel)",female,35.0,1,0,113803,53.1000,C123,S
8,9,1,3,"Johnson, Mrs. Oscar W (Elisabeth Vilhelmina Berg)",female,27.0,0,2,347742,11.1333,,S
9,10,1,2,"Nasser, Mrs. Nicholas (Adele Achem)",female,14.0,1,0,237736,30.0708,,C
...,...,...,...,...,...,...,...,...,...,...,...,...
875,876,1,3,"Najib, Miss. Adele Kiamie ""Jane""",female,15.0,0,0,2667,7.2250,,C
879,880,1,1,"Potter, Mrs. Thomas Jr (Lily Alexenia Wilson)",female,56.0,0,1,11767,83.1583,C50,C
880,881,1,2,"Shelley, Mrs. William (Imanita Parrish Hall)",female,25.0,0,1,230433,26.0000,,S
887,888,1,1,"Graham, Miss. Margaret Edith",female,19.0,0,0,112053,30.0000,B42,S


In [81]:
#select the rows where Age is greater than 50

df[df['Age']>50]

Unnamed: 0,PassengerId,Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked
6,7,0,1,"McCarthy, Mr. Timothy J",male,54.0,0,0,17463,51.8625,E46,S
11,12,1,1,"Bonnell, Miss. Elizabeth",female,58.0,0,0,113783,26.5500,C103,S
15,16,1,2,"Hewlett, Mrs. (Mary D Kingcome)",female,55.0,0,0,248706,16.0000,,S
33,34,0,2,"Wheadon, Mr. Edward H",male,66.0,0,0,C.A. 24579,10.5000,,S
54,55,0,1,"Ostby, Mr. Engelhart Cornelius",male,65.0,0,1,113509,61.9792,B30,C
...,...,...,...,...,...,...,...,...,...,...,...,...
820,821,1,1,"Hays, Mrs. Charles Melville (Clara Jennings Gr...",female,52.0,1,1,12749,93.5000,B69,S
829,830,1,1,"Stone, Mrs. George Nelson (Martha Evelyn)",female,62.0,0,0,113572,80.0000,B28,
851,852,0,3,"Svensson, Mr. Johan",male,74.0,0,0,347060,7.7750,,S
857,858,1,1,"Daly, Mr. Peter Denis",male,51.0,0,0,113055,26.5500,E17,S


In [82]:
#select the rows where Age is not greater than 50 (rows with a missing Age are kept too)

df[~(df['Age']>50)]

Unnamed: 0,PassengerId,Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked
0,1,0,3,"Braund, Mr. Owen Harris",male,22.0,1,0,A/5 21171,7.2500,,S
1,2,1,1,"Cumings, Mrs. John Bradley (Florence Briggs Th...",female,38.0,1,0,PC 17599,71.2833,C85,C
2,3,1,3,"Heikkinen, Miss. Laina",female,26.0,0,0,STON/O2. 3101282,7.9250,,S
3,4,1,1,"Futrelle, Mrs. Jacques Heath (Lily May Peel)",female,35.0,1,0,113803,53.1000,C123,S
4,5,0,3,"Allen, Mr. William Henry",male,35.0,0,0,373450,8.0500,,S
...,...,...,...,...,...,...,...,...,...,...,...,...
886,887,0,2,"Montvila, Rev. Juozas",male,27.0,0,0,211536,13.0000,,S
887,888,1,1,"Graham, Miss. Margaret Edith",female,19.0,0,0,112053,30.0000,B42,S
888,889,0,3,"Johnston, Miss. Catherine Helen ""Carrie""",female,,1,2,W./C. 6607,23.4500,,S
889,890,1,1,"Behr, Mr. Karl Howell",male,26.0,0,0,111369,30.0000,C148,C
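Negating a comparison with ~ is not quite the same as writing the opposite comparison when values are missing, which is why a row with no Age appears in the output above. A short sketch on made-up ages:

```python
import numpy as np
import pandas as pd

age = pd.Series([22.0, 54.0, np.nan, 50.0])  # made-up values, one missing

not_over_50 = age[~(age > 50)]   # NaN > 50 is False, so ~ turns it into True
at_most_50 = age[age <= 50]      # NaN <= 50 is False, so the NaN row is dropped

print(len(not_over_50), len(at_most_50))
```

If rows with a missing age should be excluded, the direct comparison (<=) is the safer choice.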


In [83]:
#select the rows where Age is greater than 50 and Sex is "male"

df[(df['Age']>50) & (df['Sex']=='male')]

Unnamed: 0,PassengerId,Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked
6,7,0,1,"McCarthy, Mr. Timothy J",male,54.0,0,0,17463,51.8625,E46,S
33,34,0,2,"Wheadon, Mr. Edward H",male,66.0,0,0,C.A. 24579,10.5,,S
54,55,0,1,"Ostby, Mr. Engelhart Cornelius",male,65.0,0,1,113509,61.9792,B30,C
94,95,0,3,"Coxon, Mr. Daniel",male,59.0,0,0,364500,7.25,,S
96,97,0,1,"Goldschmidt, Mr. George B",male,71.0,0,0,PC 17754,34.6542,A5,C
116,117,0,3,"Connors, Mr. Patrick",male,70.5,0,0,370369,7.75,,Q
124,125,0,1,"White, Mr. Percival Wayland",male,54.0,0,1,35281,77.2875,D26,S
150,151,0,2,"Bateman, Rev. Robert James",male,51.0,0,0,S.O.P. 1166,12.525,,S
152,153,0,3,"Meo, Mr. Alfonzo",male,55.5,0,0,A.5. 11206,8.05,,S
155,156,0,1,"Williams, Mr. Charles Duane",male,51.0,0,1,PC 17597,61.3792,,C


In [84]:
#select the rows where Age is greater than 50 or Sex is "male"

df[(df['Age']>50) | (df['Sex']=='male')]

Unnamed: 0,PassengerId,Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked
0,1,0,3,"Braund, Mr. Owen Harris",male,22.0,1,0,A/5 21171,7.2500,,S
4,5,0,3,"Allen, Mr. William Henry",male,35.0,0,0,373450,8.0500,,S
5,6,0,3,"Moran, Mr. James",male,,0,0,330877,8.4583,,Q
6,7,0,1,"McCarthy, Mr. Timothy J",male,54.0,0,0,17463,51.8625,E46,S
7,8,0,3,"Palsson, Master. Gosta Leonard",male,2.0,3,1,349909,21.0750,,S
...,...,...,...,...,...,...,...,...,...,...,...,...
883,884,0,2,"Banfield, Mr. Frederick James",male,28.0,0,0,C.A./SOTON 34068,10.5000,,S
884,885,0,3,"Sutehall, Mr. Henry Jr",male,25.0,0,0,SOTON/OQ 392076,7.0500,,S
886,887,0,2,"Montvila, Rev. Juozas",male,27.0,0,0,211536,13.0000,,S
889,890,1,1,"Behr, Mr. Karl Howell",male,26.0,0,0,111369,30.0000,C148,C


In [85]:
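#intended: select the rows where Age is between 20 and 40
#note: as written, the second condition is Age > 40, so this cell returns ages above 40 (see the output below);
#use (df['Age']<40) as the second condition to get the 20 to 40 range

df[(df['Age']>20) & (df['Age']>40)]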

Unnamed: 0,PassengerId,Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked
6,7,0,1,"McCarthy, Mr. Timothy J",male,54.0,0,0,17463,51.8625,E46,S
11,12,1,1,"Bonnell, Miss. Elizabeth",female,58.0,0,0,113783,26.5500,C103,S
15,16,1,2,"Hewlett, Mrs. (Mary D Kingcome)",female,55.0,0,0,248706,16.0000,,S
33,34,0,2,"Wheadon, Mr. Edward H",male,66.0,0,0,C.A. 24579,10.5000,,S
35,36,0,1,"Holverson, Mr. Alexander Oskar",male,42.0,1,0,113789,52.0000,,S
...,...,...,...,...,...,...,...,...,...,...,...,...
862,863,1,1,"Swift, Mrs. Frederick Joel (Margaret Welles Ba...",female,48.0,0,0,17466,25.9292,D17,S
865,866,1,2,"Bystrom, Mrs. (Karolina)",female,42.0,0,0,236852,13.0000,,S
871,872,1,1,"Beckwith, Mrs. Richard Leonard (Sallie Monypeny)",female,47.0,1,1,11751,52.5542,D35,S
873,874,0,3,"Vander Cruyssen, Mr. Victor",male,47.0,0,0,345765,9.0000,,S
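For a range such as ages strictly between 20 and 40, the two comparisons need opposite directions: greater than the lower bound and less than the upper bound. A minimal sketch on made-up values, which also shows Series.between as an alternative (the inclusive='neither' string form assumes pandas 1.3 or newer):

```python
import pandas as pd

age = pd.Series([18.0, 25.0, 40.0, 54.0])  # made-up values

# Two comparisons combined with &
with_operators = age[(age > 20) & (age < 40)]

# Same range with between(); 'neither' excludes both end points
with_between = age[age.between(20, 40, inclusive='neither')]

print(with_operators.tolist(), with_between.tolist())
```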


In [86]:
#select columns Survived, Pclass, Age, and Sex for the rows where Age is greater than 50 or Sex is "male"
df.loc[(df['Age']>50) | (df['Sex']=='male'), ['Survived','Pclass','Age','Sex']]

Unnamed: 0,Survived,Pclass,Age,Sex
0,0,3,22.0,male
4,0,3,35.0,male
5,0,3,,male
6,0,1,54.0,male
7,0,3,2.0,male
...,...,...,...,...
883,0,2,28.0,male
884,0,3,25.0,male
886,0,2,27.0,male
889,1,1,26.0,male
