# Pandas
We will learn the basics of Pandas, a powerful open-source data analysis and manipulation library for Python built on top of the NumPy package. The name **Pandas** is derived from the term **panel data**.

In this session, we will study Pandas data structures, examine their attributes, and access their elements and values.

To get started using `Pandas`, import it into your Python program as follows:

In [1]:
import pandas as pd

The two most important data structures in `Pandas` are the `Series` and the `DataFrame` classes. A series is essentially a single variable with multiple values recorded for different observations. Putting multiple series together across the same observations gives you a dataframe.
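To make this relationship concrete, here is a minimal sketch that places two series with the same labels side by side to form a dataframe. The `mileage` values are made up for illustration, and `pd.concat` is only one of several ways to combine series.

```python
import pandas as pd

# Two series recorded for the same observations (mileage values are hypothetical)
price = pd.Series([649000, 391000], index=['swift', 'santro'], name='price')
mileage = pd.Series([22.0, 20.3], index=['swift', 'santro'], name='mileage')

# Placing the series side by side gives a dataframe with one column per series
cars = pd.concat([price, mileage], axis=1)
print(cars)
```

Each series name becomes a column name, and the shared labels become the row index.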

# Series
The Pandas `Series` object is a 1D labeled array capable of holding data of any type.

### Example
Creating series from lists

In [144]:
empty_series = pd.Series()


In [145]:
empty_series

Series([], dtype: float64)

In [4]:
empty_series = pd.Series(data = None, dtype = 'int')

In [5]:
empty_series

Series([], dtype: int32)

In [6]:
type(empty_series)

pandas.core.series.Series

In [7]:
print(empty_series)

Series([], dtype: int32)


In [8]:
prices = [649000, 391000, 5476000, 1786000, 1091000]
carnames = ['swift', 'santro', 'audi', 'elantra', 'bolero']

In [146]:
car_series = pd.Series(data = prices, index = carnames)
car_series

swift       649000
santro      391000
audi       5476000
elantra    1786000
bolero     1091000
dtype: int64

In [10]:
type(car_series)

pandas.core.series.Series

In [147]:
car_series = pd.Series(prices)
car_series

0     649000
1     391000
2    5476000
3    1786000
4    1091000
dtype: int64

In [149]:
car_series = pd.Series(prices, carnames, name = 'price')
car_series

swift       649000
santro      391000
audi       5476000
elantra    1786000
bolero     1091000
Name: price, dtype: int64

The `name` attribute for the Pandas series is optional but helpful and makes the data organization neater and easier to handle.

### Example
Creating series from dictionaries

In [13]:
entries = {'swift': 649000,
           'santro': 391000,
           'audi': 5476000,
           'elantra': 1786000,
           'bolero': 1091000}
car_series = pd.Series(data = entries, name = 'price')
car_series

swift       649000
santro      391000
audi       5476000
elantra    1786000
bolero     1091000
Name: price, dtype: int64

Notice how the index and the values are meaningfully interpreted from the keys and the values of the dictionary.

In [150]:
entries = {'swift': 649000,
           'santro': '391000',
           'audi': 5476000,
           'elantra': 1786000,
           'bolero': 1091000}
car_series = pd.Series(data = entries, name = 'price')
car_series

swift       649000
santro      391000
audi       5476000
elantra    1786000
bolero     1091000
Name: price, dtype: object

In [151]:
type(car_series['swift'])

int

In [16]:
type(car_series['santro'])

str

Notice that the series reports an `object` data type, while the individual elements keep their own Python types, here `int` and `str`.
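If the mixed types are unintended, one possible fix is to convert the values to a common numeric type. The sketch below uses `pd.to_numeric`, which is one of several options and assumes that every entry can be parsed as a number.

```python
import pandas as pd

entries = {'swift': 649000, 'santro': '391000', 'audi': 5476000}
mixed = pd.Series(entries, name='price')
print(mixed.dtype)    # object, because one value is a string

# Convert every entry to a number; this raises an error if a value cannot be parsed
clean = pd.to_numeric(mixed)
print(clean.dtype)    # an integer dtype
```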

We will study more about indexing and accessing later in the session.

### Example
Looking at some series attributes

Series have attributes similar to arrays, and a few more, that make the life of a data analyst easier.

In [153]:
entries = {'swift': 649000,
           'santro': 391000,
           'audi': 5476000,
           'elantra': 1786000,
           'bolero': 1091000}
car_series = pd.Series(data = entries, name = 'price')
car_series

swift       649000
santro      391000
audi       5476000
elantra    1786000
bolero     1091000
Name: price, dtype: int64

In [154]:
car_series.name

'price'

In [155]:
car_series.index

Index(['swift', 'santro', 'audi', 'elantra', 'bolero'], dtype='object')

In [156]:
car_series.values

array([ 649000,  391000, 5476000, 1786000, 1091000], dtype=int64)

In [21]:
car_series.dtype

dtype('int64')

In [22]:
car_series.shape

(5,)

In [23]:
car_series.ndim

1

In [24]:
car_series.size

5

In [25]:
len(car_series)

5

In [26]:
car_series.empty

False

# Accessing data from series

In this section, we will study various methods to access data from series.

### Example
Accessing data from series using logical conditions

In [27]:
entries = {'swift': 649000,
           'santro': 391000,
           'audi': 5476000,
           'elantra': 1786000,
           'bolero': 1091000}
car_series = pd.Series(data = entries, name = 'price')
car_series

swift       649000
santro      391000
audi       5476000
elantra    1786000
bolero     1091000
Name: price, dtype: int64

In [160]:
car_series > 1000000

swift      False
santro     False
audi        True
elantra     True
bolero      True
Name: price, dtype: bool

In [161]:
car_series[car_series>1000000]

audi       5476000
elantra    1786000
bolero     1091000
Name: price, dtype: int64

In [162]:
car_series[car_series > 1000000].index

Index(['audi', 'elantra', 'bolero'], dtype='object')

In [163]:
car_series[car_series > 1000000].values

array([5476000, 1786000, 1091000], dtype=int64)

In [164]:
car_series[car_series > 1000000].index[0]

'audi'

In [165]:
car_series[car_series > 1000000].index[1]

'elantra'

In [34]:
car_series[car_series > 1000000].index[2]

'bolero'

In [35]:
car_series[car_series > 1000000].values[0]

5476000

In [36]:
car_series[car_series > 1000000].values[1]

1786000

In [37]:
car_series[car_series > 1000000].values[2]

1091000

In [40]:
(car_series > 1000000) & (car_series < 2000000)

swift      False
santro     False
audi       False
elantra     True
bolero      True
Name: price, dtype: bool

In [41]:
car_series[(car_series > 1000000) & (car_series < 2000000)]

elantra    1786000
bolero     1091000
Name: price, dtype: int64

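Pandas objects such as series and dataframes keep every value attached to its index label, so a filtered result carries the labels of the matching entries along with their values. This is critical in data science to keep track of data entries and their associated values.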
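One consequence of label-based indexing is that operations between two series match values by index label, not by position. A minimal sketch, using a hypothetical `discount` series whose entries are deliberately listed in a different order:

```python
import pandas as pd

price = pd.Series({'swift': 649000, 'santro': 391000, 'audi': 5476000}, name='price')
# Hypothetical discounts, listed in a different order from the prices
discount = pd.Series({'audi': 100000, 'swift': 20000, 'santro': 10000})

# Subtraction pairs values that share a label, regardless of their position
final_price = price - discount
print(final_price)
```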

In [42]:
car_series[(car_series > 1000000) & (car_series < 2000000)].index

Index(['elantra', 'bolero'], dtype='object')

In [43]:
car_series[(car_series > 1000000) & (car_series < 2000000)].values

array([1786000, 1091000], dtype=int64)

In [44]:
car_series[(car_series > 1000000) & (car_series < 2000000)].index[0]

'elantra'

In [45]:
car_series[(car_series > 1000000) & (car_series < 2000000)].index[1]

'bolero'

In [46]:
car_series[(car_series > 1000000) & (car_series < 2000000)].values[0]

1786000

In [47]:
car_series[(car_series > 1000000) & (car_series < 2000000)].values[1]

1091000

### Quiz
Consider the series shown below:
```
cust_names = ['Chad', 'Farheen', 'Himadri', 'Monisha']
cust_bill = [256.78, 434.53, 109.25, 529.42]
cust_info = pd.Series(cust_bill, cust_names)
```
Write code to print the names of the customers who have spent more than 300 rupees.

In [166]:
cust_names = ['Chad', 'Farheen', 'Himadri', 'Monisha']
cust_bill = [256.78, 434.53, 109.25, 529.42]
cust_info = pd.Series(cust_bill, cust_names)
print(list(cust_info[cust_info > 300].index))

['Farheen', 'Monisha']


### Example
Accessing data from series using the names of the entries

In [167]:
entries = {'swift': 649000,
           'santro': 391000,
           'audi': 5476000,
           'elantra': 1786000,
           'bolero': 1091000}
car_series = pd.Series(data = entries, name = 'price')
car_series

swift       649000
santro      391000
audi       5476000
elantra    1786000
bolero     1091000
Name: price, dtype: int64

In [168]:
car_series['swift']

649000

In [169]:
car_series[['swift']]

swift    649000
Name: price, dtype: int64

In [170]:
car_series[['swift', 'audi']]

swift     649000
audi     5476000
Name: price, dtype: int64

In [53]:
car_series[['swift', 'audi', 'bolero']]

swift      649000
audi      5476000
bolero    1091000
Name: price, dtype: int64

In [54]:
car_series[['bolero', 'swift', 'audi']]

bolero    1091000
swift      649000
audi      5476000
Name: price, dtype: int64

### Example
Accessing data from series using the `.loc[]` indexer

In [55]:
entries = {'swift': 649000,
           'santro': 391000,
           'audi': 5476000,
           'elantra': 1786000,
           'bolero': 1091000}
car_series = pd.Series(data = entries, name = 'price')
car_series

swift       649000
santro      391000
audi       5476000
elantra    1786000
bolero     1091000
Name: price, dtype: int64

In [171]:
car_series.loc['swift']

649000

In [172]:
car_series.loc[['swift']]

swift    649000
Name: price, dtype: int64

In [58]:
car_series.loc[['swift', 'audi']]

swift     649000
audi     5476000
Name: price, dtype: int64

In [59]:
car_series.loc['swift':'elantra']

swift       649000
santro      391000
audi       5476000
elantra    1786000
Name: price, dtype: int64

Note that this is similar to NumPy array slicing, except that `.loc[]` includes the stop label as well.

In [174]:
car_series.loc['santro':'audi']

santro     391000
audi      5476000
Name: price, dtype: int64

In [61]:
car_series.loc[:'elantra']

swift       649000
santro      391000
audi       5476000
elantra    1786000
Name: price, dtype: int64

### Example
Accessing data from series using the `.iloc[]` indexer

In [62]:
entries = {'swift': 649000,
           'santro': 391000,
           'audi': 5476000,
           'elantra': 1786000,
           'bolero': 1091000}
car_series = pd.Series(data = entries, name = 'price')
car_series

swift       649000
santro      391000
audi       5476000
elantra    1786000
bolero     1091000
Name: price, dtype: int64

In [63]:
car_series.iloc[0]

649000

In [64]:
car_series.iloc[3]

1786000

In [65]:
car_series.iloc[[3]]

elantra    1786000
Name: price, dtype: int64

In [66]:
car_series.iloc[[0, 2, 4]]

swift      649000
audi      5476000
bolero    1091000
Name: price, dtype: int64

In [67]:
car_series.iloc[0:2]

swift     649000
santro    391000
Name: price, dtype: int64

Note that, unlike `.loc[]`, the `.iloc[]` indexer does not include the stop element. Indexing and slicing with `.iloc[]` work much like NumPy array indexing and slicing.
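The difference between the two slicing rules is easiest to see side by side. The following sketch reuses the car prices from above and takes a label slice and a position slice that both start at the first entry:

```python
import pandas as pd

car_series = pd.Series({'swift': 649000, 'santro': 391000, 'audi': 5476000,
                        'elantra': 1786000, 'bolero': 1091000}, name='price')

by_label = car_series.loc['swift':'audi']   # label slice: includes 'audi'
by_position = car_series.iloc[0:2]          # position slice: stops before position 2

print(list(by_label.index))
print(list(by_position.index))
```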

In [68]:
car_series.iloc[-1]

1091000

In [69]:
car_series.iloc[[-1]]

bolero    1091000
Name: price, dtype: int64

In [70]:
car_series.iloc[1:5:2]

santro      391000
elantra    1786000
Name: price, dtype: int64

In [71]:
car_series.iloc[::-1]

bolero     1091000
elantra    1786000
audi       5476000
santro      391000
swift       649000
Name: price, dtype: int64

In this section, we familiarized ourselves with the series object from the Pandas library. Note that the most common data structure that is used in data science from the Pandas library is the dataframe object, which is essentially a table of series with a common index. In the next section, we will study dataframes in detail. Learners are encouraged to explore more about series on their own.

### Quiz
Consider the series shown below:
```
cust_names = ['Chad', 'Farheen', 'Himadri', 'Monisha']
cust_bill = [256.78, 434.53, 109.25, 529.42]
cust_info = pd.Series(cust_bill, cust_names)
```
Use the different methods you have studied to extract the bill amounts for Chad and Monisha.

In [72]:
cust_names = ['Chad', 'Farheen', 'Himadri', 'Monisha']
cust_bill = [256.78, 434.53, 109.25, 529.42]
cust_info = pd.Series(cust_bill, cust_names)

In [73]:
cust_info[['Chad', 'Monisha']]

Chad       256.78
Monisha    529.42
dtype: float64

In [74]:
cust_info.loc[['Chad', 'Monisha']]

Chad       256.78
Monisha    529.42
dtype: float64

In [75]:
cust_info.iloc[[0, 3]]

Chad       256.78
Monisha    529.42
dtype: float64

# Dataframes
A dataframe is a 2D table made up of multiple series that share a common index. This section introduces dataframes.

### Example
Creating dataframes

In [76]:
df = pd.DataFrame()

In [77]:
df

In [78]:
type(df)

pandas.core.frame.DataFrame

In [79]:
df.empty

True

While dataframes can be created using indexed data structures such as lists, we will study the dictionary method in this section. You may explore other ways of creating dataframes on your own.
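For comparison, the list-based route can be sketched as follows: each inner list is one row, and the column names are passed separately through the `columns` parameter. The two-row sample is chosen only to keep the example short.

```python
import pandas as pd

# Each inner list is one row of the table (illustrative sample rows)
rows = [['001', 'Alice', 20],
        ['002', 'Bharadwaj', 15]]

df_from_rows = pd.DataFrame(rows, columns=['CustomerID', 'Name', 'TotalPurchases'])
print(df_from_rows)
```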

In [80]:
cust_data = {'CustomerID' : ['001', '002', '003', '004'],
             'Name': ['Alice', 'Bharadwaj', 'Chiranjeev', 'Dawood'],
             'Email': ['a@example.com', 'b@example.com', 'c@example.com', 'd@example.com'],
             'Phone':  [1234567891, 8090809080, 4145145412, 1989898122],
             'TotalPurchases':  [20, 15, 25, 10]}
df = pd.DataFrame(cust_data)

Note that each column (feature) of the dataframe is supplied as a list of values, and the dictionary maps each column name to its list.

In [81]:
df

|  | CustomerID | Name | Email | Phone | TotalPurchases |
|---|---|---|---|---|---|
| 0 | 001 | Alice | a@example.com | 1234567891 | 20 |
| 1 | 002 | Bharadwaj | b@example.com | 8090809080 | 15 |
| 2 | 003 | Chiranjeev | c@example.com | 4145145412 | 25 |
| 3 | 004 | Dawood | d@example.com | 1989898122 | 10 |


Note how the column names for the dataframe are inferred from the keys of the dictionary.

In [82]:
cust_data = {'CustomerID' : ['001', '002', '003', '004'],
            #  'Name': ['Alice', 'Bharadwaj', 'Chiranjeev', 'Dawood'],
             'Email': ['a@example.com', 'b@example.com', 'c@example.com', 'd@example.com'],
             'Phone':  [1234567891, 8090809080, 4145145412, 1989898122],
             'TotalPurchases':  [20, 15, 25, 10]}
cust_names = ['Alice', 'Bharadwaj', 'Chiranjeev', 'Dawood']
df = pd.DataFrame(data = cust_data, index = cust_names)

In [83]:
df

|  | CustomerID | Email | Phone | TotalPurchases |
|---|---|---|---|---|
| Alice | 001 | a@example.com | 1234567891 | 20 |
| Bharadwaj | 002 | b@example.com | 8090809080 | 15 |
| Chiranjeev | 003 | c@example.com | 4145145412 | 25 |
| Dawood | 004 | d@example.com | 1989898122 | 10 |


In [84]:
cust_data = {'CustomerID' : ['001', '002', '003', '004'],
             'Email': ['a@example.com', 'b@example.com', 'c@example.com', 'd@example.com'],
             'Phone':  [1234567891, 8090809080, 4145145412, 1989898122],
             'TotalPurchases':  [20, 15, 25, 10]}
cust_names = ['Alice', 'Bharadwaj', 'Chiranjeev', 'Dawood']
df = pd.DataFrame(cust_data, cust_names)

In [85]:
df

|  | CustomerID | Email | Phone | TotalPurchases |
|---|---|---|---|---|
| Alice | 001 | a@example.com | 1234567891 | 20 |
| Bharadwaj | 002 | b@example.com | 8090809080 | 15 |
| Chiranjeev | 003 | c@example.com | 4145145412 | 25 |
| Dawood | 004 | d@example.com | 1989898122 | 10 |


When you pass arguments without parameter names, they must follow the order in which the parameters are defined: for `pd.DataFrame`, `data` comes first, followed by `index`.
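As a small illustration of this point, the sketch below builds the same dataframe twice, once with positional arguments and once with keyword arguments given in the opposite order. The two-customer sample is used only for brevity.

```python
import pandas as pd

data = {'TotalPurchases': [20, 15]}
names = ['Alice', 'Bharadwaj']

by_position = pd.DataFrame(data, names)              # positional: data first, then index
by_keyword = pd.DataFrame(index=names, data=data)    # keywords: order does not matter

print(by_position.equals(by_keyword))
```

Swapping the two positional arguments would instead treat the list of names as the data, so keyword arguments are the safer choice when in doubt.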

While you may only work on reading and analyzing data initially, learning to create and manipulate dataframes on the go becomes an integral part of data science as you move ahead in your learning journey.

### Example
Reading data into dataframes

While tabular data can be loaded into the Python environment from various source files, we will study how to read CSV files in this session.

In [2]:
import pandas as pd

In [3]:
df_cars = pd.read_csv('cars_dataset.csv') # file name only; the file must be in the current working directory


In [8]:
df_cars = pd.read_csv('C:\\Users\\askpr\\Week 7 Pandas\\SME Session 1 - Introduction to Pandas\\cars_dataset.csv')
# with a full file path; replace this path with the one on your system


In [9]:
df_cars.tail()

|  | car_ID | symboling | carname | doornumber | carbody | enginelocation | wheelbase | carlength | carwidth | carheight | ... | cylindernumber | enginesize | boreratio | stroke | compressionratio | horsepower | peakrpm | citympg | highwaympg | price |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 142 | 197 | -2 | volvo 244dl | four | sedan | front | 104.3 | 188.8 | 67.2 | 56.2 | ... | four | 141 | 3.78 | 3.15 | 9.5 | 114 | 5400 | 24 | 28 | 15985.0 |
| 143 | 198 | -1 | volvo 245 | four | wagon | front | 104.3 | 188.8 | 67.2 | 57.5 | ... | four | 141 | 3.78 | 3.15 | 9.5 | 114 | 5400 | 24 | 28 | 16515.0 |
| 144 | 199 | -2 | volvo 264gl | four | sedan | front | 104.3 | 188.8 | 67.2 | 56.2 | ... | four | 130 | 3.62 | 3.15 | 7.5 | 162 | 5100 | 17 | 22 | 18420.0 |
| 145 | 200 | -1 | volvo diesel | four | wagon | front | 104.3 | 188.8 | 67.2 | 57.5 | ... | four | 130 | 3.62 | 3.15 | 7.5 | 162 | 5100 | 17 | 22 | 18950.0 |
| 146 | 204 | -1 | volvo 246 | four | sedan | front | 109.1 | 188.8 | 68.9 | 55.5 | ... | six | 145 | 3.01 | 3.4 | 23.0 | 106 | 4800 | 26 | 27 | 22470.0 |


The index label of the last row (146) shows that the dataset has 147 rows, since the default integer index starts at 0. You can also look at some of the feature/column names and the entries.

In [175]:
df_cars.head() # display the first 5 rows

|  | car_ID | symboling | carname | doornumber | carbody | enginelocation | wheelbase | carlength | carwidth | carheight | ... | cylindernumber | enginesize | boreratio | stroke | compressionratio | horsepower | peakrpm | citympg | highwaympg | price |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 3 | alfa-romero giulia | two | convertible | front | 88.6 | 168.8 | 64.1 | 48.8 | ... | four | 130 | 3.47 | 2.68 | 9.0 | 111 | 5000 | 21 | 27 | 13495.0 |
| 1 | 2 | 3 | alfa-romero stelvio | two | convertible | front | 88.6 | 168.8 | 64.1 | 48.8 | ... | four | 130 | 3.47 | 2.68 | 9.0 | 111 | 5000 | 21 | 27 | 16500.0 |
| 2 | 3 | 1 | alfa-romero Quadrifoglio | two | hatchback | front | 94.5 | 171.2 | 65.5 | 52.4 | ... | six | 152 | 2.68 | 3.47 | 9.0 | 154 | 5000 | 19 | 26 | 16500.0 |
| 3 | 4 | 2 | audi 100 ls | four | sedan | front | 99.8 | 176.6 | 66.2 | 54.3 | ... | four | 109 | 3.19 | 3.4 | 10.0 | 102 | 5500 | 24 | 30 | 13950.0 |
| 4 | 5 | 2 | audi 100ls | four | sedan | front | 99.4 | 176.6 | 66.4 | 54.3 | ... | five | 136 | 3.19 | 3.4 | 8.0 | 115 | 5500 | 18 | 22 | 17450.0 |


In [12]:
df_cars.head(2)

|  | car_ID | symboling | carname | doornumber | carbody | enginelocation | wheelbase | carlength | carwidth | carheight | ... | cylindernumber | enginesize | boreratio | stroke | compressionratio | horsepower | peakrpm | citympg | highwaympg | price |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 3 | alfa-romero giulia | two | convertible | front | 88.6 | 168.8 | 64.1 | 48.8 | ... | four | 130 | 3.47 | 2.68 | 9.0 | 111 | 5000 | 21 | 27 | 13495.0 |
| 1 | 2 | 3 | alfa-romero stelvio | two | convertible | front | 88.6 | 168.8 | 64.1 | 48.8 | ... | four | 130 | 3.47 | 2.68 | 9.0 | 111 | 5000 | 21 | 27 | 16500.0 |


The `.head()` method is useful to quickly understand a dataframe's structure.

Whether to use the default integer index or to use unique identifiers such as car names or car models as the index depends on the data analysis methodology.
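The index does not have to be fixed when the data is loaded. As a sketch on a small made-up dataframe (not the cars dataset), `set_index` and `reset_index` switch between the two choices:

```python
import pandas as pd

# Small illustrative dataframe with a default integer index
df = pd.DataFrame({'carname': ['swift', 'santro'], 'price': [649000, 391000]})

by_name = df.set_index('carname')   # carname becomes the index; returns a new dataframe
restored = by_name.reset_index()    # carname becomes a column again, with integer index

print(by_name.loc['swift', 'price'])
print(list(restored.columns))
```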

In [13]:
df_cars = pd.read_csv('cars_dataset.csv', index_col = 'carname') # use the carname column as the index
df_cars

| carname | car_ID | symboling | doornumber | carbody | enginelocation | wheelbase | carlength | carwidth | carheight | curbweight | cylindernumber | enginesize | boreratio | stroke | compressionratio | horsepower | peakrpm | citympg | highwaympg | price |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| alfa-romero giulia | 1 | 3 | two | convertible | front | 88.6 | 168.8 | 64.1 | 48.8 | 2548 | four | 130 | 3.47 | 2.68 | 9.0 | 111 | 5000 | 21 | 27 | 13495.0 |
| alfa-romero stelvio | 2 | 3 | two | convertible | front | 88.6 | 168.8 | 64.1 | 48.8 | 2548 | four | 130 | 3.47 | 2.68 | 9.0 | 111 | 5000 | 21 | 27 | 16500.0 |
| alfa-romero Quadrifoglio | 3 | 1 | two | hatchback | front | 94.5 | 171.2 | 65.5 | 52.4 | 2823 | six | 152 | 2.68 | 3.47 | 9.0 | 154 | 5000 | 19 | 26 | 16500.0 |
| audi 100 ls | 4 | 2 | four | sedan | front | 99.8 | 176.6 | 66.2 | 54.3 | 2337 | four | 109 | 3.19 | 3.40 | 10.0 | 102 | 5500 | 24 | 30 | 13950.0 |
| audi 100ls | 5 | 2 | four | sedan | front | 99.4 | 176.6 | 66.4 | 54.3 | 2824 | five | 136 | 3.19 | 3.40 | 8.0 | 115 | 5500 | 18 | 22 | 17450.0 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| volvo 244dl | 197 | -2 | four | sedan | front | 104.3 | 188.8 | 67.2 | 56.2 | 2935 | four | 141 | 3.78 | 3.15 | 9.5 | 114 | 5400 | 24 | 28 | 15985.0 |
| volvo 245 | 198 | -1 | four | wagon | front | 104.3 | 188.8 | 67.2 | 57.5 | 3042 | four | 141 | 3.78 | 3.15 | 9.5 | 114 | 5400 | 24 | 28 | 16515.0 |
| volvo 264gl | 199 | -2 | four | sedan | front | 104.3 | 188.8 | 67.2 | 56.2 | 3045 | four | 130 | 3.62 | 3.15 | 7.5 | 162 | 5100 | 17 | 22 | 18420.0 |
| volvo diesel | 200 | -1 | four | wagon | front | 104.3 | 188.8 | 67.2 | 57.5 | 3157 | four | 130 | 3.62 | 3.15 | 7.5 | 162 | 5100 | 17 | 22 | 18950.0 |


### Example
Studying dataframes

In [14]:
df_cars

| carname | car_ID | symboling | doornumber | carbody | enginelocation | wheelbase | carlength | carwidth | carheight | curbweight | cylindernumber | enginesize | boreratio | stroke | compressionratio | horsepower | peakrpm | citympg | highwaympg | price |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| alfa-romero giulia | 1 | 3 | two | convertible | front | 88.6 | 168.8 | 64.1 | 48.8 | 2548 | four | 130 | 3.47 | 2.68 | 9.0 | 111 | 5000 | 21 | 27 | 13495.0 |
| alfa-romero stelvio | 2 | 3 | two | convertible | front | 88.6 | 168.8 | 64.1 | 48.8 | 2548 | four | 130 | 3.47 | 2.68 | 9.0 | 111 | 5000 | 21 | 27 | 16500.0 |
| alfa-romero Quadrifoglio | 3 | 1 | two | hatchback | front | 94.5 | 171.2 | 65.5 | 52.4 | 2823 | six | 152 | 2.68 | 3.47 | 9.0 | 154 | 5000 | 19 | 26 | 16500.0 |
| audi 100 ls | 4 | 2 | four | sedan | front | 99.8 | 176.6 | 66.2 | 54.3 | 2337 | four | 109 | 3.19 | 3.40 | 10.0 | 102 | 5500 | 24 | 30 | 13950.0 |
| audi 100ls | 5 | 2 | four | sedan | front | 99.4 | 176.6 | 66.4 | 54.3 | 2824 | five | 136 | 3.19 | 3.40 | 8.0 | 115 | 5500 | 18 | 22 | 17450.0 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| volvo 244dl | 197 | -2 | four | sedan | front | 104.3 | 188.8 | 67.2 | 56.2 | 2935 | four | 141 | 3.78 | 3.15 | 9.5 | 114 | 5400 | 24 | 28 | 15985.0 |
| volvo 245 | 198 | -1 | four | wagon | front | 104.3 | 188.8 | 67.2 | 57.5 | 3042 | four | 141 | 3.78 | 3.15 | 9.5 | 114 | 5400 | 24 | 28 | 16515.0 |
| volvo 264gl | 199 | -2 | four | sedan | front | 104.3 | 188.8 | 67.2 | 56.2 | 3045 | four | 130 | 3.62 | 3.15 | 7.5 | 162 | 5100 | 17 | 22 | 18420.0 |
| volvo diesel | 200 | -1 | four | wagon | front | 104.3 | 188.8 | 67.2 | 57.5 | 3157 | four | 130 | 3.62 | 3.15 | 7.5 | 162 | 5100 | 17 | 22 | 18950.0 |


In [15]:
df_cars.head()

| carname | car_ID | symboling | doornumber | carbody | enginelocation | wheelbase | carlength | carwidth | carheight | curbweight | cylindernumber | enginesize | boreratio | stroke | compressionratio | horsepower | peakrpm | citympg | highwaympg | price |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| alfa-romero giulia | 1 | 3 | two | convertible | front | 88.6 | 168.8 | 64.1 | 48.8 | 2548 | four | 130 | 3.47 | 2.68 | 9.0 | 111 | 5000 | 21 | 27 | 13495.0 |
| alfa-romero stelvio | 2 | 3 | two | convertible | front | 88.6 | 168.8 | 64.1 | 48.8 | 2548 | four | 130 | 3.47 | 2.68 | 9.0 | 111 | 5000 | 21 | 27 | 16500.0 |
| alfa-romero Quadrifoglio | 3 | 1 | two | hatchback | front | 94.5 | 171.2 | 65.5 | 52.4 | 2823 | six | 152 | 2.68 | 3.47 | 9.0 | 154 | 5000 | 19 | 26 | 16500.0 |
| audi 100 ls | 4 | 2 | four | sedan | front | 99.8 | 176.6 | 66.2 | 54.3 | 2337 | four | 109 | 3.19 | 3.4 | 10.0 | 102 | 5500 | 24 | 30 | 13950.0 |
| audi 100ls | 5 | 2 | four | sedan | front | 99.4 | 176.6 | 66.4 | 54.3 | 2824 | five | 136 | 3.19 | 3.4 | 8.0 | 115 | 5500 | 18 | 22 | 17450.0 |


In [92]:
df_cars.head(n = 3)

| carname | car_ID | symboling | doornumber | carbody | enginelocation | wheelbase | carlength | carwidth | carheight | curbweight | cylindernumber | enginesize | boreratio | stroke | compressionratio | horsepower | peakrpm | citympg | highwaympg | price |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| alfa-romero giulia | 1 | 3 | two | convertible | front | 88.6 | 168.8 | 64.1 | 48.8 | 2548 | four | 130 | 3.47 | 2.68 | 9.0 | 111 | 5000 | 21 | 27 | 13495.0 |
| alfa-romero stelvio | 2 | 3 | two | convertible | front | 88.6 | 168.8 | 64.1 | 48.8 | 2548 | four | 130 | 3.47 | 2.68 | 9.0 | 111 | 5000 | 21 | 27 | 16500.0 |
| alfa-romero Quadrifoglio | 3 | 1 | two | hatchback | front | 94.5 | 171.2 | 65.5 | 52.4 | 2823 | six | 152 | 2.68 | 3.47 | 9.0 | 154 | 5000 | 19 | 26 | 16500.0 |


In [93]:
df_cars.head(2)

| carname | car_ID | symboling | doornumber | carbody | enginelocation | wheelbase | carlength | carwidth | carheight | curbweight | cylindernumber | enginesize | boreratio | stroke | compressionratio | horsepower | peakrpm | citympg | highwaympg | price |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| alfa-romero giulia | 1 | 3 | two | convertible | front | 88.6 | 168.8 | 64.1 | 48.8 | 2548 | four | 130 | 3.47 | 2.68 | 9.0 | 111 | 5000 | 21 | 27 | 13495.0 |
| alfa-romero stelvio | 2 | 3 | two | convertible | front | 88.6 | 168.8 | 64.1 | 48.8 | 2548 | four | 130 | 3.47 | 2.68 | 9.0 | 111 | 5000 | 21 | 27 | 16500.0 |


In [177]:
df_cars.tail()

| carname | car_ID | symboling | doornumber | carbody | enginelocation | wheelbase | carlength | carwidth | carheight | curbweight | cylindernumber | enginesize | boreratio | stroke | compressionratio | horsepower | peakrpm | citympg | highwaympg | price |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| volvo 244dl | 197 | -2 | four | sedan | front | 104.3 | 188.8 | 67.2 | 56.2 | 2935 | four | 141 | 3.78 | 3.15 | 9.5 | 114 | 5400 | 24 | 28 | 15985.0 |
| volvo 245 | 198 | -1 | four | wagon | front | 104.3 | 188.8 | 67.2 | 57.5 | 3042 | four | 141 | 3.78 | 3.15 | 9.5 | 114 | 5400 | 24 | 28 | 16515.0 |
| volvo 264gl | 199 | -2 | four | sedan | front | 104.3 | 188.8 | 67.2 | 56.2 | 3045 | four | 130 | 3.62 | 3.15 | 7.5 | 162 | 5100 | 17 | 22 | 18420.0 |
| volvo diesel | 200 | -1 | four | wagon | front | 104.3 | 188.8 | 67.2 | 57.5 | 3157 | four | 130 | 3.62 | 3.15 | 7.5 | 162 | 5100 | 17 | 22 | 18950.0 |
| volvo 246 | 204 | -1 | four | sedan | front | 109.1 | 188.8 | 68.9 | 55.5 | 3217 | six | 145 | 3.01 | 3.4 | 23.0 | 106 | 4800 | 26 | 27 | 22470.0 |


In [95]:
df_cars.tail(2)

| carname | car_ID | symboling | doornumber | carbody | enginelocation | wheelbase | carlength | carwidth | carheight | curbweight | cylindernumber | enginesize | boreratio | stroke | compressionratio | horsepower | peakrpm | citympg | highwaympg | price |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| volvo diesel | 200 | -1 | four | wagon | front | 104.3 | 188.8 | 67.2 | 57.5 | 3157 | four | 130 | 3.62 | 3.15 | 7.5 | 162 | 5100 | 17 | 22 | 18950.0 |
| volvo 246 | 204 | -1 | four | sedan | front | 109.1 | 188.8 | 68.9 | 55.5 | 3217 | six | 145 | 3.01 | 3.4 | 23.0 | 106 | 4800 | 26 | 27 | 22470.0 |


In [16]:
df_cars.axes

[Index(['alfa-romero giulia', 'alfa-romero stelvio', 'alfa-romero Quadrifoglio',
        'audi 100 ls', 'audi 100ls', 'audi fox', 'audi 5000', 'audi 4000',
        'audi 5000s (diesel)', 'bmw 320i',
        ...
        'vw rabbit', 'volkswagen rabbit', 'volkswagen rabbit custom',
        'volvo 145e (sw)', 'volvo 144ea', 'volvo 244dl', 'volvo 245',
        'volvo 264gl', 'volvo diesel', 'volvo 246'],
       dtype='object', name='carname', length=147),
 Index(['car_ID', 'symboling', 'doornumber', 'carbody', 'enginelocation',
        'wheelbase', 'carlength', 'carwidth', 'carheight', 'curbweight',
        'cylindernumber', 'enginesize', 'boreratio', 'stroke',
        'compressionratio', 'horsepower', 'peakrpm', 'citympg', 'highwaympg',
        'price'],
       dtype='object')]

In [17]:
df_cars.index

Index(['alfa-romero giulia', 'alfa-romero stelvio', 'alfa-romero Quadrifoglio',
       'audi 100 ls', 'audi 100ls', 'audi fox', 'audi 5000', 'audi 4000',
       'audi 5000s (diesel)', 'bmw 320i',
       ...
       'vw rabbit', 'volkswagen rabbit', 'volkswagen rabbit custom',
       'volvo 145e (sw)', 'volvo 144ea', 'volvo 244dl', 'volvo 245',
       'volvo 264gl', 'volvo diesel', 'volvo 246'],
      dtype='object', name='carname', length=147)

In [19]:
df_cars.columns # display list of columns 

Index(['car_ID', 'symboling', 'doornumber', 'carbody', 'enginelocation',
       'wheelbase', 'carlength', 'carwidth', 'carheight', 'curbweight',
       'cylindernumber', 'enginesize', 'boreratio', 'stroke',
       'compressionratio', 'horsepower', 'peakrpm', 'citympg', 'highwaympg',
       'price'],
      dtype='object')

In [99]:
df_cars.shape # (rows, columns): 147 rows and 20 columns

(147, 20)

In [100]:
df_cars.size

2940

In [180]:
df_cars.head()

| carname | car_ID | symboling | doornumber | carbody | enginelocation | wheelbase | carlength | carwidth | carheight | curbweight | cylindernumber | enginesize | boreratio | stroke | compressionratio | horsepower | peakrpm | citympg | highwaympg | price |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| alfa-romero giulia | 1 | 3 | two | convertible | front | 88.6 | 168.8 | 64.1 | 48.8 | 2548 | four | 130 | 3.47 | 2.68 | 9.0 | 111 | 5000 | 21 | 27 | 13495.0 |
| alfa-romero stelvio | 2 | 3 | two | convertible | front | 88.6 | 168.8 | 64.1 | 48.8 | 2548 | four | 130 | 3.47 | 2.68 | 9.0 | 111 | 5000 | 21 | 27 | 16500.0 |
| alfa-romero Quadrifoglio | 3 | 1 | two | hatchback | front | 94.5 | 171.2 | 65.5 | 52.4 | 2823 | six | 152 | 2.68 | 3.47 | 9.0 | 154 | 5000 | 19 | 26 | 16500.0 |
| audi 100 ls | 4 | 2 | four | sedan | front | 99.8 | 176.6 | 66.2 | 54.3 | 2337 | four | 109 | 3.19 | 3.4 | 10.0 | 102 | 5500 | 24 | 30 | 13950.0 |
| audi 100ls | 5 | 2 | four | sedan | front | 99.4 | 176.6 | 66.4 | 54.3 | 2824 | five | 136 | 3.19 | 3.4 | 8.0 | 115 | 5500 | 18 | 22 | 17450.0 |


In [179]:
df_cars.dtypes

car_ID                int64
symboling             int64
doornumber           object
carbody              object
enginelocation       object
wheelbase           float64
carlength           float64
carwidth            float64
carheight           float64
curbweight            int64
cylindernumber       object
enginesize            int64
boreratio           float64
stroke              float64
compressionratio    float64
horsepower            int64
peakrpm               int64
citympg               int64
highwaympg            int64
price               float64
dtype: object

In [102]:
df_cars.info() # column names, non-null counts, dtypes, and memory usage

<class 'pandas.core.frame.DataFrame'>
Index: 147 entries, alfa-romero giulia to volvo 246
Data columns (total 20 columns):
 #   Column            Non-Null Count  Dtype  
---  ------            --------------  -----  
 0   car_ID            147 non-null    int64  
 1   symboling         147 non-null    int64  
 2   doornumber        147 non-null    object 
 3   carbody           147 non-null    object 
 4   enginelocation    147 non-null    object 
 5   wheelbase         147 non-null    float64
 6   carlength         147 non-null    float64
 7   carwidth          147 non-null    float64
 8   carheight         147 non-null    float64
 9   curbweight        147 non-null    int64  
 10  cylindernumber    147 non-null    object 
 11  enginesize        147 non-null    int64  
 12  boreratio         147 non-null    float64
 13  stroke            147 non-null    float64
 14  compressionratio  147 non-null    float64
 15  horsepower        147 non-null    int64  
 16  peakrpm           147 non-

Generally, the `shape`, `columns`, and `index` attributes and the `.info()` method are sufficient to study the most important properties of a dataframe.

In [103]:
df_cars.keys()

Index(['car_ID', 'symboling', 'doornumber', 'carbody', 'enginelocation',
       'wheelbase', 'carlength', 'carwidth', 'carheight', 'curbweight',
       'cylindernumber', 'enginesize', 'boreratio', 'stroke',
       'compressionratio', 'horsepower', 'peakrpm', 'citympg', 'highwaympg',
       'price'],
      dtype='object')

In [104]:
df_cars.values

array([[1, 3, 'two', ..., 21, 27, 13495.0],
       [2, 3, 'two', ..., 21, 27, 16500.0],
       [3, 1, 'two', ..., 19, 26, 16500.0],
       ...,
       [199, -2, 'four', ..., 17, 22, 18420.0],
       [200, -1, 'four', ..., 17, 22, 18950.0],
       [204, -1, 'four', ..., 26, 27, 22470.0]], dtype=object)

In [105]:
df_cars.transpose()

carname,alfa-romero giulia,alfa-romero stelvio,alfa-romero Quadrifoglio,audi 100 ls,audi 100ls,audi fox,audi 5000,audi 4000,audi 5000s (diesel),bmw 320i,...,vw rabbit,volkswagen rabbit,volkswagen rabbit custom,volvo 145e (sw),volvo 144ea,volvo 244dl,volvo 245,volvo 264gl,volvo diesel,volvo 246
car_ID,1,2,3,4,5,6,8,9,10,11,...,191,192,193,195,196,197,198,199,200,204
symboling,3,3,1,2,2,2,1,1,0,2,...,3,0,0,-2,-1,-2,-1,-2,-1,-1
doornumber,two,two,two,four,four,two,four,four,two,two,...,two,four,four,four,four,four,four,four,four,four
carbody,convertible,convertible,hatchback,sedan,sedan,sedan,wagon,sedan,hatchback,sedan,...,hatchback,sedan,sedan,sedan,wagon,sedan,wagon,sedan,wagon,sedan
enginelocation,front,front,front,front,front,front,front,front,front,front,...,front,front,front,front,front,front,front,front,front,front
wheelbase,88.6,88.6,94.5,99.8,99.4,99.8,105.8,105.8,99.5,101.2,...,94.5,100.4,100.4,104.3,104.3,104.3,104.3,104.3,104.3,109.1
carlength,168.8,168.8,171.2,176.6,176.6,177.3,192.7,192.7,178.2,176.8,...,165.7,180.2,180.2,188.8,188.8,188.8,188.8,188.8,188.8,188.8
carwidth,64.1,64.1,65.5,66.2,66.4,66.3,71.4,71.4,67.9,64.8,...,64.0,66.9,66.9,67.2,67.2,67.2,67.2,67.2,67.2,68.9
carheight,48.8,48.8,52.4,54.3,54.3,53.1,55.7,55.9,52.0,54.3,...,51.4,55.1,55.1,56.2,57.5,56.2,57.5,56.2,57.5,55.5
curbweight,2548,2548,2823,2337,2824,2507,2954,3086,3053,2395,...,2221,2661,2579,2912,3034,2935,3042,3045,3157,3217


We have used the `.transpose()` method to swap the axes of the dataframe here, so the rows become columns and the columns become rows. Each row and each column of a dataframe can be viewed as a series, and a row series and a column series intersect at exactly one entry in the dataframe.
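
The idea that rows and columns are both series can be checked with a short sketch. The dataframe `toy_cars` below is a hypothetical stand-in with values copied from rows shown in this session.

```python
import pandas as pd

# Hypothetical stand-in for df_cars (three rows, two columns).
toy_cars = pd.DataFrame(
    {
        "carbody": ["convertible", "hatchback", "sedan"],
        "price": [13495.0, 16500.0, 13950.0],
    },
    index=pd.Index(
        ["alfa-romero giulia", "alfa-romero Quadrifoglio", "audi 100 ls"],
        name="carname",
    ),
)

row = toy_cars.loc["audi 100 ls"]  # one row as a series, labelled by column names
col = toy_cars["price"]            # one column as a series, labelled by row names
print(type(row).__name__, type(col).__name__)

# The row series and the column series meet at a single cell.
print(row["price"] == col["audi 100 ls"])

# After transposing, the same cell is reached with the labels swapped.
flipped = toy_cars.transpose()
print(flipped.shape)
print(flipped.loc["price", "audi 100 ls"])
```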

We will study further operations on dataframes in detail in a later session.

We have seen some of the important attributes and methods that help in quickly familiarizing ourselves with datasets. Learners are expected to explore this further on their own. You can study more about Pandas dataframes [here](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html).

### Quiz
Print the top four observations in the `df_cars` dataframe.

In [106]:
df_cars.head(4)

carname,car_ID,symboling,doornumber,carbody,enginelocation,wheelbase,carlength,carwidth,carheight,curbweight,cylindernumber,enginesize,boreratio,stroke,compressionratio,horsepower,peakrpm,citympg,highwaympg,price
alfa-romero giulia,1,3,two,convertible,front,88.6,168.8,64.1,48.8,2548,four,130,3.47,2.68,9.0,111,5000,21,27,13495.0
alfa-romero stelvio,2,3,two,convertible,front,88.6,168.8,64.1,48.8,2548,four,130,3.47,2.68,9.0,111,5000,21,27,16500.0
alfa-romero Quadrifoglio,3,1,two,hatchback,front,94.5,171.2,65.5,52.4,2823,six,152,2.68,3.47,9.0,154,5000,19,26,16500.0
audi 100 ls,4,2,four,sedan,front,99.8,176.6,66.2,54.3,2337,four,109,3.19,3.4,10.0,102,5500,24,30,13950.0


### Quiz
Print the bottom six observations in the `df_cars` dataframe.

In [107]:
df_cars.tail(6)

carname,car_ID,symboling,doornumber,carbody,enginelocation,wheelbase,carlength,carwidth,carheight,curbweight,cylindernumber,enginesize,boreratio,stroke,compressionratio,horsepower,peakrpm,citympg,highwaympg,price
volvo 144ea,196,-1,four,wagon,front,104.3,188.8,67.2,57.5,3034,four,141,3.78,3.15,9.5,114,5400,23,28,13415.0
volvo 244dl,197,-2,four,sedan,front,104.3,188.8,67.2,56.2,2935,four,141,3.78,3.15,9.5,114,5400,24,28,15985.0
volvo 245,198,-1,four,wagon,front,104.3,188.8,67.2,57.5,3042,four,141,3.78,3.15,9.5,114,5400,24,28,16515.0
volvo 264gl,199,-2,four,sedan,front,104.3,188.8,67.2,56.2,3045,four,130,3.62,3.15,7.5,162,5100,17,22,18420.0
volvo diesel,200,-1,four,wagon,front,104.3,188.8,67.2,57.5,3157,four,130,3.62,3.15,7.5,162,5100,17,22,18950.0
volvo 246,204,-1,four,sedan,front,109.1,188.8,68.9,55.5,3217,six,145,3.01,3.4,23.0,106,4800,26,27,22470.0


# Accessing data from dataframes
In this section, we will study various ways to access data from dataframes. We will focus on the fundamentals in this session and briefly look at some other important methods. Note that you will work with these operations and methods throughout your data science learning journey and beyond.

### Example
Direct access

In [108]:
df_cars

carname,car_ID,symboling,doornumber,carbody,enginelocation,wheelbase,carlength,carwidth,carheight,curbweight,cylindernumber,enginesize,boreratio,stroke,compressionratio,horsepower,peakrpm,citympg,highwaympg,price
alfa-romero giulia,1,3,two,convertible,front,88.6,168.8,64.1,48.8,2548,four,130,3.47,2.68,9.0,111,5000,21,27,13495.0
alfa-romero stelvio,2,3,two,convertible,front,88.6,168.8,64.1,48.8,2548,four,130,3.47,2.68,9.0,111,5000,21,27,16500.0
alfa-romero Quadrifoglio,3,1,two,hatchback,front,94.5,171.2,65.5,52.4,2823,six,152,2.68,3.47,9.0,154,5000,19,26,16500.0
audi 100 ls,4,2,four,sedan,front,99.8,176.6,66.2,54.3,2337,four,109,3.19,3.40,10.0,102,5500,24,30,13950.0
audi 100ls,5,2,four,sedan,front,99.4,176.6,66.4,54.3,2824,five,136,3.19,3.40,8.0,115,5500,18,22,17450.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
volvo 244dl,197,-2,four,sedan,front,104.3,188.8,67.2,56.2,2935,four,141,3.78,3.15,9.5,114,5400,24,28,15985.0
volvo 245,198,-1,four,wagon,front,104.3,188.8,67.2,57.5,3042,four,141,3.78,3.15,9.5,114,5400,24,28,16515.0
volvo 264gl,199,-2,four,sedan,front,104.3,188.8,67.2,56.2,3045,four,130,3.62,3.15,7.5,162,5100,17,22,18420.0
volvo diesel,200,-1,four,wagon,front,104.3,188.8,67.2,57.5,3157,four,130,3.62,3.15,7.5,162,5100,17,22,18950.0


In [20]:
df_cars['price']

carname
alfa-romero giulia          13495.0
alfa-romero stelvio         16500.0
alfa-romero Quadrifoglio    16500.0
audi 100 ls                 13950.0
audi 100ls                  17450.0
                             ...   
volvo 244dl                 15985.0
volvo 245                   16515.0
volvo 264gl                 18420.0
volvo diesel                18950.0
volvo 246                   22470.0
Name: price, Length: 147, dtype: float64

Note that the result is itself a Pandas series that keeps the `carname` index. So, you can continue to use the relevant access methods on it.

In [23]:
df_cars[['doornumber','carbody']] # to display multiple columns

carname,doornumber,carbody
alfa-romero giulia,two,convertible
alfa-romero stelvio,two,convertible
alfa-romero Quadrifoglio,two,hatchback
audi 100 ls,four,sedan
audi 100ls,four,sedan
...,...,...
volvo 244dl,four,sedan
volvo 245,four,wagon
volvo 264gl,four,sedan
volvo diesel,four,wagon


In [25]:
df_cars['price']['audi 100 ls']

13950.0

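You can also access the `price` column as an attribute, as shown below. This works only when the column name is a valid Python identifier (for example, it contains no spaces) and does not clash with an existing attribute or method of the dataframe.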
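
The sketch below illustrates where attribute-style access stops working. The dataframe `df` and its column names are made up for this example.

```python
import pandas as pd

# Hypothetical dataframe: one plain name, one name with a space,
# and one name that clashes with the DataFrame.count method.
df = pd.DataFrame(
    {
        "price": [13495.0, 16500.0],
        "car body": ["convertible", "hatchback"],
        "count": [1, 2],
    }
)

print(df.price)            # works: 'price' is a valid identifier
print(df["car body"])      # a name with a space needs square brackets
print(callable(df.count))  # df.count is the method, not the column
print(df["count"])         # square brackets always return the column
```

Square brackets work for every column name, so they are the safer default in scripts.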

In [111]:
df_cars.price

carname
alfa-romero giulia          13495.0
alfa-romero stelvio         16500.0
alfa-romero Quadrifoglio    16500.0
audi 100 ls                 13950.0
audi 100ls                  17450.0
                             ...   
volvo 244dl                 15985.0
volvo 245                   16515.0
volvo 264gl                 18420.0
volvo diesel                18950.0
volvo 246                   22470.0
Name: price, Length: 147, dtype: float64

In [189]:
df_cars[['carbody', 'price']]

carname,carbody,price
alfa-romero giulia,convertible,13495.0
alfa-romero stelvio,convertible,16500.0
alfa-romero Quadrifoglio,hatchback,16500.0
audi 100 ls,sedan,13950.0
audi 100ls,sedan,17450.0
...,...,...
volvo 244dl,sedan,15985.0
volvo 245,wagon,16515.0
volvo 264gl,sedan,18420.0
volvo diesel,wagon,18950.0


Direct access is convenient, but it is not as robust as we would want it to be. This is where the `.loc[]` and the `.iloc[]` methods help.
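
One way to see the difference is sketched below on a hypothetical `toy_cars` dataframe (values copied from rows shown earlier). Square brackets with a single label look up a column, so a row label raises a `KeyError`, and reaching a single cell takes two chained lookups. With `.loc[]`, the row and the column are given together.

```python
import pandas as pd

# Hypothetical stand-in for df_cars.
toy_cars = pd.DataFrame(
    {
        "carbody": ["convertible", "hatchback", "sedan"],
        "price": [13495.0, 16500.0, 13950.0],
    },
    index=pd.Index(
        ["alfa-romero giulia", "alfa-romero Quadrifoglio", "audi 100 ls"],
        name="carname",
    ),
)

# Square brackets with one label search the columns, not the rows.
try:
    toy_cars["audi 100 ls"]
except KeyError:
    print("'audi 100 ls' is a row label, not a column")

chained = toy_cars["price"]["audi 100 ls"]     # two lookups, one after the other
direct = toy_cars.loc["audi 100 ls", "price"]  # one lookup: row first, then column
print(chained, direct)
```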

### Example
Accessing using the `.loc[]` method

In [26]:
df_cars.loc[['audi 100 ls', 'volvo diesel'], ['carbody', 'price']]

carname,carbody,price
audi 100 ls,sedan,13950.0
volvo diesel,wagon,18950.0


Notice the format of the indexer. It is rows first, and then columns.

In [191]:
df_cars.loc[['audi 100 ls', 'volvo diesel'], ['boreratio', 'price']]

carname,boreratio,price
audi 100 ls,3.19,13950.0
volvo diesel,3.62,18950.0


In [192]:
df_cars.loc['alfa-romero stelvio':'audi 100ls', ['carbody', 'price']]

carname,carbody,price
alfa-romero stelvio,convertible,16500.0
alfa-romero Quadrifoglio,hatchback,16500.0
audi 100 ls,sedan,13950.0
audi 100ls,sedan,17450.0


In [118]:
df_cars.loc['alfa-romero stelvio':'audi 100ls', 'wheelbase':'curbweight']

carname,wheelbase,carlength,carwidth,carheight,curbweight
alfa-romero stelvio,88.6,168.8,64.1,48.8,2548
alfa-romero Quadrifoglio,94.5,171.2,65.5,52.4,2823
audi 100 ls,99.8,176.6,66.2,54.3,2337
audi 100ls,99.4,176.6,66.4,54.3,2824


In [119]:
df_cars.loc[:, 'wheelbase':'curbweight']

carname,wheelbase,carlength,carwidth,carheight,curbweight
alfa-romero giulia,88.6,168.8,64.1,48.8,2548
alfa-romero stelvio,88.6,168.8,64.1,48.8,2548
alfa-romero Quadrifoglio,94.5,171.2,65.5,52.4,2823
audi 100 ls,99.8,176.6,66.2,54.3,2337
audi 100ls,99.4,176.6,66.4,54.3,2824
...,...,...,...,...,...
volvo 244dl,104.3,188.8,67.2,56.2,2935
volvo 245,104.3,188.8,67.2,57.5,3042
volvo 264gl,104.3,188.8,67.2,56.2,3045
volvo diesel,104.3,188.8,67.2,57.5,3157


In [120]:
df_cars.loc['alfa-romero stelvio':'audi 100ls', :]

carname,car_ID,symboling,doornumber,carbody,enginelocation,wheelbase,carlength,carwidth,carheight,curbweight,cylindernumber,enginesize,boreratio,stroke,compressionratio,horsepower,peakrpm,citympg,highwaympg,price
alfa-romero stelvio,2,3,two,convertible,front,88.6,168.8,64.1,48.8,2548,four,130,3.47,2.68,9.0,111,5000,21,27,16500.0
alfa-romero Quadrifoglio,3,1,two,hatchback,front,94.5,171.2,65.5,52.4,2823,six,152,2.68,3.47,9.0,154,5000,19,26,16500.0
audi 100 ls,4,2,four,sedan,front,99.8,176.6,66.2,54.3,2337,four,109,3.19,3.4,10.0,102,5500,24,30,13950.0
audi 100ls,5,2,four,sedan,front,99.4,176.6,66.4,54.3,2824,five,136,3.19,3.4,8.0,115,5500,18,22,17450.0


### Example
Accessing using the `.iloc[]` method

The `.iloc[]` method follows the same format as the `.loc[]` method with two major differences. Firstly, as we already know, slicing with `.iloc[]` excludes the stop value, whereas slicing with `.loc[]` includes it. Secondly, the `.loc[]` method depends on the names of the rows and columns, so shuffling the dataset will not change the results as long as the right names are used. The `.iloc[]` method works strictly on integer positions, so the results will change if the order of the rows or columns changes.
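
Both differences can be seen in a short sketch. The dataframe `toy_cars` is a hypothetical stand-in with values copied from rows shown earlier, and reversing the row order stands in for shuffling so that the result is reproducible.

```python
import pandas as pd

# Hypothetical stand-in for df_cars.
toy_cars = pd.DataFrame(
    {
        "carbody": ["convertible", "convertible", "hatchback", "sedan", "sedan"],
        "enginesize": [130, 130, 152, 109, 136],
        "price": [13495.0, 16500.0, 16500.0, 13950.0, 17450.0],
    },
    index=pd.Index(
        [
            "alfa-romero giulia",
            "alfa-romero stelvio",
            "alfa-romero Quadrifoglio",
            "audi 100 ls",
            "audi 100ls",
        ],
        name="carname",
    ),
)

# Difference 1: a label slice includes the stop, a position slice excludes it.
by_label = toy_cars.loc["alfa-romero stelvio":"audi 100 ls", "price"]
by_position = toy_cars.iloc[1:3, 2]
print(len(by_label), len(by_position))

# Difference 2: reorder the rows and look up the "same" cell again.
reordered = toy_cars.iloc[::-1]
print(toy_cars.loc["audi 100 ls", "price"], reordered.loc["audi 100 ls", "price"])
print(toy_cars.iloc[0, 2], reordered.iloc[0, 2])
```

The label-based lookup returns the same price before and after reordering, while position `[0, 2]` points at a different car once the rows move.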

In [121]:
df_cars.iloc[0, 2]

'two'

In [27]:
df_cars.iloc[[0, 3, 6], [2, 5, 10]]

carname,doornumber,wheelbase,cylindernumber
alfa-romero giulia,two,88.6,four
audi 100 ls,four,99.8,four
audi 5000,four,105.8,five


In [123]:
df_cars.iloc[3:110, [2, 5, 10]]

carname,doornumber,wheelbase,cylindernumber
audi 100 ls,four,99.8,four
audi 100ls,four,99.4,five
audi fox,two,99.8,five
audi 5000,four,105.8,five
audi 4000,four,105.8,five
...,...,...,...
subaru dl,two,93.7,four
subaru brz,four,97.2,four
subaru baja,four,97.2,four
subaru r1,four,97.0,four


In [124]:
df_cars.iloc[[0, 3, 6], :]

carname,car_ID,symboling,doornumber,carbody,enginelocation,wheelbase,carlength,carwidth,carheight,curbweight,cylindernumber,enginesize,boreratio,stroke,compressionratio,horsepower,peakrpm,citympg,highwaympg,price
alfa-romero giulia,1,3,two,convertible,front,88.6,168.8,64.1,48.8,2548,four,130,3.47,2.68,9.0,111,5000,21,27,13495.0
audi 100 ls,4,2,four,sedan,front,99.8,176.6,66.2,54.3,2337,four,109,3.19,3.4,10.0,102,5500,24,30,13950.0
audi 5000,8,1,four,wagon,front,105.8,192.7,71.4,55.7,2954,five,136,3.19,3.4,8.5,110,5500,19,25,18920.0


### Example
Boolean indexing

In [28]:
df_cars['enginesize'] > 120

carname
alfa-romero giulia           True
alfa-romero stelvio          True
alfa-romero Quadrifoglio     True
audi 100 ls                 False
audi 100ls                   True
                            ...  
volvo 244dl                  True
volvo 245                    True
volvo 264gl                  True
volvo diesel                 True
volvo 246                    True
Name: enginesize, Length: 147, dtype: bool

In [34]:
df_cars[df_cars['enginesize'] > 120]

carname,car_ID,symboling,doornumber,carbody,enginelocation,wheelbase,carlength,carwidth,carheight,curbweight,cylindernumber,enginesize,boreratio,stroke,compressionratio,horsepower,peakrpm,citympg,highwaympg,price
alfa-romero giulia,1,3,two,convertible,front,88.6,168.8,64.1,48.8,2548,four,130,3.47,2.68,9.0,111,5000,21,27,13495.0
alfa-romero stelvio,2,3,two,convertible,front,88.6,168.8,64.1,48.8,2548,four,130,3.47,2.68,9.0,111,5000,21,27,16500.0
alfa-romero Quadrifoglio,3,1,two,hatchback,front,94.5,171.2,65.5,52.4,2823,six,152,2.68,3.47,9.0,154,5000,19,26,16500.0
audi 100ls,5,2,four,sedan,front,99.4,176.6,66.4,54.3,2824,five,136,3.19,3.40,8.0,115,5500,18,22,17450.0
audi fox,6,2,two,sedan,front,99.8,177.3,66.3,53.1,2507,five,136,3.19,3.40,8.5,110,5500,19,25,15250.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
volvo 244dl,197,-2,four,sedan,front,104.3,188.8,67.2,56.2,2935,four,141,3.78,3.15,9.5,114,5400,24,28,15985.0
volvo 245,198,-1,four,wagon,front,104.3,188.8,67.2,57.5,3042,four,141,3.78,3.15,9.5,114,5400,24,28,16515.0
volvo 264gl,199,-2,four,sedan,front,104.3,188.8,67.2,56.2,3045,four,130,3.62,3.15,7.5,162,5100,17,22,18420.0
volvo diesel,200,-1,four,wagon,front,104.3,188.8,67.2,57.5,3157,four,130,3.62,3.15,7.5,162,5100,17,22,18950.0


In [35]:
df_cars[df_cars['doornumber']=='two']

carname,car_ID,symboling,doornumber,carbody,enginelocation,wheelbase,carlength,carwidth,carheight,curbweight,cylindernumber,enginesize,boreratio,stroke,compressionratio,horsepower,peakrpm,citympg,highwaympg,price
alfa-romero giulia,1,3,two,convertible,front,88.6,168.8,64.1,48.8,2548,four,130,3.47,2.68,9.0,111,5000,21,27,13495.000
alfa-romero stelvio,2,3,two,convertible,front,88.6,168.8,64.1,48.8,2548,four,130,3.47,2.68,9.0,111,5000,21,27,16500.000
alfa-romero Quadrifoglio,3,1,two,hatchback,front,94.5,171.2,65.5,52.4,2823,six,152,2.68,3.47,9.0,154,5000,19,26,16500.000
audi fox,6,2,two,sedan,front,99.8,177.3,66.3,53.1,2507,five,136,3.19,3.40,8.5,110,5500,19,25,15250.000
audi 5000s (diesel),10,0,two,hatchback,front,99.5,178.2,67.9,52.0,3053,five,131,3.13,3.40,7.0,160,5500,16,22,17859.167
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
toyota cressida,173,2,two,convertible,front,98.4,176.2,65.6,53.0,2975,four,146,3.62,3.50,9.3,116,4800,24,30,17669.000
vokswagen rabbit,183,2,two,sedan,front,97.3,171.7,65.5,55.7,2261,four,97,3.01,3.40,23.0,52,4800,37,46,7775.000
volkswagen 1131 deluxe sedan,184,2,two,sedan,front,97.3,171.7,65.5,55.7,2209,four,109,3.19,3.40,9.0,85,5250,27,34,7975.000
vw dasher,190,3,two,convertible,front,94.5,159.3,64.2,55.6,2254,four,109,3.19,3.40,8.5,90,5500,24,29,11595.000


In [38]:
(df_cars['enginesize'] > 120) & (df_cars['carbody'] == 'sedan')

carname
alfa-romero giulia          False
alfa-romero stelvio         False
alfa-romero Quadrifoglio    False
audi 100 ls                 False
audi 100ls                   True
                            ...  
volvo 244dl                  True
volvo 245                   False
volvo 264gl                  True
volvo diesel                False
volvo 246                    True
Length: 147, dtype: bool

In [39]:
df_cars[(df_cars['enginesize'] > 120) & (df_cars['carbody'] == 'sedan')]

carname,car_ID,symboling,doornumber,carbody,enginelocation,wheelbase,carlength,carwidth,carheight,curbweight,cylindernumber,enginesize,boreratio,stroke,compressionratio,horsepower,peakrpm,citympg,highwaympg,price
audi 100ls,5,2,four,sedan,front,99.4,176.6,66.4,54.3,2824,five,136,3.19,3.4,8.0,115,5500,18,22,17450.0
audi fox,6,2,two,sedan,front,99.8,177.3,66.3,53.1,2507,five,136,3.19,3.4,8.5,110,5500,19,25,15250.0
audi 4000,9,1,four,sedan,front,105.8,192.7,71.4,55.9,3086,five,131,3.13,3.4,8.3,140,5500,17,20,23875.0
bmw x1,13,0,two,sedan,front,101.2,176.8,64.8,54.3,2710,six,164,3.31,3.19,9.0,121,4250,21,28,20970.0
bmw x3,14,0,four,sedan,front,101.2,176.8,64.8,54.3,2765,six,164,3.31,3.19,9.0,121,4250,21,28,21105.0
bmw z4,15,1,four,sedan,front,103.5,189.0,66.9,55.7,3055,six,164,3.31,3.19,9.0,121,4250,20,25,24565.0
bmw x4,16,0,four,sedan,front,103.5,189.0,66.9,55.7,3230,six,209,3.62,3.39,8.0,182,5400,16,22,30760.0
bmw x5,17,0,two,sedan,front,103.5,193.8,67.9,53.7,3380,six,209,3.62,3.39,8.0,182,5400,16,22,41315.0
jaguar xj,48,0,four,sedan,front,113.0,199.6,69.6,52.8,4066,six,258,3.63,4.17,8.1,176,4750,15,19,32250.0
jaguar xf,49,0,four,sedan,front,113.0,199.6,69.6,52.8,4066,six,258,3.63,4.17,8.1,176,4750,15,19,35550.0
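
When combining conditions as above, each comparison is wrapped in parentheses because `&` and `|` bind more tightly than `>` or `==` in Python. The keywords `and` and `or` cannot be used in their place, since they ask for a single truth value for the whole series. A small sketch on a hypothetical `toy_cars` dataframe (values copied from rows shown earlier):

```python
import pandas as pd

# Hypothetical stand-in for df_cars.
toy_cars = pd.DataFrame(
    {
        "carbody": ["convertible", "convertible", "hatchback", "sedan", "sedan"],
        "enginesize": [130, 130, 152, 109, 136],
    },
    index=pd.Index(
        [
            "alfa-romero giulia",
            "alfa-romero stelvio",
            "alfa-romero Quadrifoglio",
            "audi 100 ls",
            "audi 100ls",
        ],
        name="carname",
    ),
)

big_engine = toy_cars["enginesize"] > 120
is_sedan = toy_cars["carbody"] == "sedan"

both = big_engine & is_sedan        # element-wise AND
either = big_engine | is_sedan      # element-wise OR
not_big_engine = ~big_engine        # element-wise NOT

print(toy_cars[both])
print(toy_cars[not_big_engine])

# The keyword `and` needs one True/False for the whole series, which is ambiguous.
try:
    big_engine and is_sedan
except ValueError as err:
    print("ValueError:", err)
```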


In [129]:
enginesize_condition = df_cars['enginesize'] > 120
carbody_condition = df_cars['carbody'] == 'sedan'
price_condition = df_cars['price'] < 12500

df_cars[enginesize_condition & carbody_condition & price_condition]

carname,car_ID,symboling,doornumber,carbody,enginelocation,wheelbase,carlength,carwidth,carheight,curbweight,cylindernumber,enginesize,boreratio,stroke,compressionratio,horsepower,peakrpm,citympg,highwaympg,price
mazda glc custom l,61,0,four,sedan,front,98.8,177.8,66.5,55.5,2410,four,122,3.39,3.39,8.6,84,4800,26,32,8495.0
mitsubishi montero,86,1,four,sedan,front,96.3,172.4,65.4,51.6,2365,four,122,3.35,3.46,8.5,88,5000,25,32,6989.0
mitsubishi pajero,87,1,four,sedan,front,96.3,172.4,65.4,51.6,2405,four,122,3.35,3.46,8.5,88,5000,25,32,8189.0
saab 99le,134,2,four,sedan,front,99.1,186.6,66.5,56.1,2695,four,121,3.54,3.07,9.3,110,5250,21,28,12170.0


In [130]:
enginesize_condition = df_cars['enginesize'] > 120
carbody_condition = df_cars['carbody'] == 'sedan'
price_condition = df_cars['price'] < 12500
carheight_condition = df_cars['carheight'] > 50
df_cars[(enginesize_condition & carbody_condition & price_condition) | carheight_condition]

carname,car_ID,symboling,doornumber,carbody,enginelocation,wheelbase,carlength,carwidth,carheight,curbweight,cylindernumber,enginesize,boreratio,stroke,compressionratio,horsepower,peakrpm,citympg,highwaympg,price
alfa-romero Quadrifoglio,3,1,two,hatchback,front,94.5,171.2,65.5,52.4,2823,six,152,2.68,3.47,9.0,154,5000,19,26,16500.0
audi 100 ls,4,2,four,sedan,front,99.8,176.6,66.2,54.3,2337,four,109,3.19,3.40,10.0,102,5500,24,30,13950.0
audi 100ls,5,2,four,sedan,front,99.4,176.6,66.4,54.3,2824,five,136,3.19,3.40,8.0,115,5500,18,22,17450.0
audi fox,6,2,two,sedan,front,99.8,177.3,66.3,53.1,2507,five,136,3.19,3.40,8.5,110,5500,19,25,15250.0
audi 5000,8,1,four,wagon,front,105.8,192.7,71.4,55.7,2954,five,136,3.19,3.40,8.5,110,5500,19,25,18920.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
volvo 244dl,197,-2,four,sedan,front,104.3,188.8,67.2,56.2,2935,four,141,3.78,3.15,9.5,114,5400,24,28,15985.0
volvo 245,198,-1,four,wagon,front,104.3,188.8,67.2,57.5,3042,four,141,3.78,3.15,9.5,114,5400,24,28,16515.0
volvo 264gl,199,-2,four,sedan,front,104.3,188.8,67.2,56.2,3045,four,130,3.62,3.15,7.5,162,5100,17,22,18420.0
volvo diesel,200,-1,four,wagon,front,104.3,188.8,67.2,57.5,3157,four,130,3.62,3.15,7.5,162,5100,17,22,18950.0


In [131]:
subset_cars = df_cars[(enginesize_condition & carbody_condition & price_condition) | carheight_condition]
subset_cars

carname,car_ID,symboling,doornumber,carbody,enginelocation,wheelbase,carlength,carwidth,carheight,curbweight,cylindernumber,enginesize,boreratio,stroke,compressionratio,horsepower,peakrpm,citympg,highwaympg,price
alfa-romero Quadrifoglio,3,1,two,hatchback,front,94.5,171.2,65.5,52.4,2823,six,152,2.68,3.47,9.0,154,5000,19,26,16500.0
audi 100 ls,4,2,four,sedan,front,99.8,176.6,66.2,54.3,2337,four,109,3.19,3.40,10.0,102,5500,24,30,13950.0
audi 100ls,5,2,four,sedan,front,99.4,176.6,66.4,54.3,2824,five,136,3.19,3.40,8.0,115,5500,18,22,17450.0
audi fox,6,2,two,sedan,front,99.8,177.3,66.3,53.1,2507,five,136,3.19,3.40,8.5,110,5500,19,25,15250.0
audi 5000,8,1,four,wagon,front,105.8,192.7,71.4,55.7,2954,five,136,3.19,3.40,8.5,110,5500,19,25,18920.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
volvo 244dl,197,-2,four,sedan,front,104.3,188.8,67.2,56.2,2935,four,141,3.78,3.15,9.5,114,5400,24,28,15985.0
volvo 245,198,-1,four,wagon,front,104.3,188.8,67.2,57.5,3042,four,141,3.78,3.15,9.5,114,5400,24,28,16515.0
volvo 264gl,199,-2,four,sedan,front,104.3,188.8,67.2,56.2,3045,four,130,3.62,3.15,7.5,162,5100,17,22,18420.0
volvo diesel,200,-1,four,wagon,front,104.3,188.8,67.2,57.5,3157,four,130,3.62,3.15,7.5,162,5100,17,22,18950.0


In [132]:
subset_cars.index

Index(['alfa-romero Quadrifoglio', 'audi 100 ls', 'audi 100ls', 'audi fox',
       'audi 5000', 'audi 4000', 'audi 5000s (diesel)', 'bmw 320i', 'bmw x1',
       'bmw x3',
       ...
       'vw rabbit', 'volkswagen rabbit', 'volkswagen rabbit custom',
       'volvo 145e (sw)', 'volvo 144ea', 'volvo 244dl', 'volvo 245',
       'volvo 264gl', 'volvo diesel', 'volvo 246'],
      dtype='object', name='carname', length=137)

In [133]:
subset_cars.loc[:, 'wheelbase':'curbweight']

carname,wheelbase,carlength,carwidth,carheight,curbweight
alfa-romero Quadrifoglio,94.5,171.2,65.5,52.4,2823
audi 100 ls,99.8,176.6,66.2,54.3,2337
audi 100ls,99.4,176.6,66.4,54.3,2824
audi fox,99.8,177.3,66.3,53.1,2507
audi 5000,105.8,192.7,71.4,55.7,2954
...,...,...,...,...,...
volvo 244dl,104.3,188.8,67.2,56.2,2935
volvo 245,104.3,188.8,67.2,57.5,3042
volvo 264gl,104.3,188.8,67.2,56.2,3045
volvo diesel,104.3,188.8,67.2,57.5,3157


In [134]:
subset_cars.loc['volvo 244dl':'volvo diesel', 'wheelbase':'curbweight']

carname,wheelbase,carlength,carwidth,carheight,curbweight
volvo 244dl,104.3,188.8,67.2,56.2,2935
volvo 245,104.3,188.8,67.2,57.5,3042
volvo 264gl,104.3,188.8,67.2,56.2,3045
volvo diesel,104.3,188.8,67.2,57.5,3157


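The `.isin()` method is quite helpful when a column has to be checked against several values at once. It returns a Boolean series that can be used for conditional access from dataframes.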
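
As a rough sketch of what `.isin()` does, the example below compares it with the equivalent chain of `==` conditions joined by `|`. The dataframe `toy_cars` is a hypothetical stand-in with values copied from rows shown earlier.

```python
import pandas as pd

# Hypothetical stand-in for df_cars.
toy_cars = pd.DataFrame(
    {
        "carbody": ["convertible", "convertible", "hatchback", "sedan", "sedan"],
        "price": [13495.0, 16500.0, 16500.0, 13950.0, 17450.0],
    },
    index=pd.Index(
        [
            "alfa-romero giulia",
            "alfa-romero stelvio",
            "alfa-romero Quadrifoglio",
            "audi 100 ls",
            "audi 100ls",
        ],
        name="carname",
    ),
)

wanted = toy_cars["carbody"].isin(["hatchback", "sedan"])

# The same mask written out by hand, one comparison per value.
by_hand = (toy_cars["carbody"] == "hatchback") | (toy_cars["carbody"] == "sedan")
print(wanted.equals(by_hand))

print(toy_cars[wanted])   # rows whose carbody is in the list
print(toy_cars[~wanted])  # rows whose carbody is not in the list
```

The hand-written form grows by one comparison per value, while the list passed to `.isin()` simply gets longer.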

In [135]:
df_cars['carbody'].isin(['wagon', 'sedan'])

carname
alfa-romero giulia          False
alfa-romero stelvio         False
alfa-romero Quadrifoglio    False
audi 100 ls                  True
audi 100ls                   True
                            ...  
volvo 244dl                  True
volvo 245                    True
volvo 264gl                  True
volvo diesel                 True
volvo 246                    True
Name: carbody, Length: 147, dtype: bool

In [136]:
df_cars[df_cars['carbody'].isin(['wagon', 'sedan'])]

carname,car_ID,symboling,doornumber,carbody,enginelocation,wheelbase,carlength,carwidth,carheight,curbweight,cylindernumber,enginesize,boreratio,stroke,compressionratio,horsepower,peakrpm,citympg,highwaympg,price
audi 100 ls,4,2,four,sedan,front,99.8,176.6,66.2,54.3,2337,four,109,3.19,3.40,10.0,102,5500,24,30,13950.0
audi 100ls,5,2,four,sedan,front,99.4,176.6,66.4,54.3,2824,five,136,3.19,3.40,8.0,115,5500,18,22,17450.0
audi fox,6,2,two,sedan,front,99.8,177.3,66.3,53.1,2507,five,136,3.19,3.40,8.5,110,5500,19,25,15250.0
audi 5000,8,1,four,wagon,front,105.8,192.7,71.4,55.7,2954,five,136,3.19,3.40,8.5,110,5500,19,25,18920.0
audi 4000,9,1,four,sedan,front,105.8,192.7,71.4,55.9,3086,five,131,3.13,3.40,8.3,140,5500,17,20,23875.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
volvo 244dl,197,-2,four,sedan,front,104.3,188.8,67.2,56.2,2935,four,141,3.78,3.15,9.5,114,5400,24,28,15985.0
volvo 245,198,-1,four,wagon,front,104.3,188.8,67.2,57.5,3042,four,141,3.78,3.15,9.5,114,5400,24,28,16515.0
volvo 264gl,199,-2,four,sedan,front,104.3,188.8,67.2,56.2,3045,four,130,3.62,3.15,7.5,162,5100,17,22,18420.0
volvo diesel,200,-1,four,wagon,front,104.3,188.8,67.2,57.5,3157,four,130,3.62,3.15,7.5,162,5100,17,22,18950.0


In [137]:
df_cars[~df_cars['carbody'].isin(['wagon', 'sedan'])]

carname,car_ID,symboling,doornumber,carbody,enginelocation,wheelbase,carlength,carwidth,carheight,curbweight,cylindernumber,enginesize,boreratio,stroke,compressionratio,horsepower,peakrpm,citympg,highwaympg,price
alfa-romero giulia,1,3,two,convertible,front,88.6,168.8,64.1,48.8,2548,four,130,3.47,2.68,9.0,111,5000,21,27,13495.0
alfa-romero stelvio,2,3,two,convertible,front,88.6,168.8,64.1,48.8,2548,four,130,3.47,2.68,9.0,111,5000,21,27,16500.0
alfa-romero Quadrifoglio,3,1,two,hatchback,front,94.5,171.2,65.5,52.4,2823,six,152,2.68,3.47,9.0,154,5000,19,26,16500.0
audi 5000s (diesel),10,0,two,hatchback,front,99.5,178.2,67.9,52.0,3053,five,131,3.13,3.4,7.0,160,5500,16,22,17859.167
chevrolet impala,19,2,two,hatchback,front,88.4,141.1,60.3,53.2,1488,three,61,2.91,3.03,9.5,48,5100,47,53,5151.0
chevrolet monte carlo,20,1,two,hatchback,front,94.5,155.9,63.6,52.0,1874,four,90,3.03,3.11,9.6,70,5400,38,43,6295.0
dodge rampage,22,1,two,hatchback,front,93.7,157.3,63.8,50.8,1876,four,90,2.97,3.23,9.41,68,5500,37,41,5572.0
dodge challenger se,23,1,two,hatchback,front,93.7,157.3,63.8,50.8,1876,four,90,2.97,3.23,9.4,68,5500,31,38,6377.0
dodge d200,24,1,two,hatchback,front,93.7,157.3,63.8,50.8,2128,four,98,3.03,3.39,7.6,102,5500,24,30,7957.0
dodge monaco (sw),25,1,four,hatchback,front,93.7,157.3,63.8,50.6,1967,four,90,2.97,3.23,9.4,68,5500,31,38,6229.0


### Example
Accessing columns with specific data types

In [138]:
int_columns = df_cars.select_dtypes(include = 'int')
int_columns

carname,car_ID,symboling,curbweight,enginesize,horsepower,peakrpm,citympg,highwaympg
alfa-romero giulia,1,3,2548,130,111,5000,21,27
alfa-romero stelvio,2,3,2548,130,111,5000,21,27
alfa-romero Quadrifoglio,3,1,2823,152,154,5000,19,26
audi 100 ls,4,2,2337,109,102,5500,24,30
audi 100ls,5,2,2824,136,115,5500,18,22
...,...,...,...,...,...,...,...,...
volvo 244dl,197,-2,2935,141,114,5400,24,28
volvo 245,198,-1,3042,141,114,5400,24,28
volvo 264gl,199,-2,3045,130,162,5100,17,22
volvo diesel,200,-1,3157,130,162,5100,17,22


In [139]:
float_columns = df_cars.select_dtypes(include = 'float')
float_columns

carname,wheelbase,carlength,carwidth,carheight,boreratio,stroke,compressionratio,price
alfa-romero giulia,88.6,168.8,64.1,48.8,3.47,2.68,9.0,13495.0
alfa-romero stelvio,88.6,168.8,64.1,48.8,3.47,2.68,9.0,16500.0
alfa-romero Quadrifoglio,94.5,171.2,65.5,52.4,2.68,3.47,9.0,16500.0
audi 100 ls,99.8,176.6,66.2,54.3,3.19,3.40,10.0,13950.0
audi 100ls,99.4,176.6,66.4,54.3,3.19,3.40,8.0,17450.0
...,...,...,...,...,...,...,...,...
volvo 244dl,104.3,188.8,67.2,56.2,3.78,3.15,9.5,15985.0
volvo 245,104.3,188.8,67.2,57.5,3.78,3.15,9.5,16515.0
volvo 264gl,104.3,188.8,67.2,56.2,3.62,3.15,7.5,18420.0
volvo diesel,104.3,188.8,67.2,57.5,3.62,3.15,7.5,18950.0


In [140]:
df_cars

carname,car_ID,symboling,doornumber,carbody,enginelocation,wheelbase,carlength,carwidth,carheight,curbweight,cylindernumber,enginesize,boreratio,stroke,compressionratio,horsepower,peakrpm,citympg,highwaympg,price
alfa-romero giulia,1,3,two,convertible,front,88.6,168.8,64.1,48.8,2548,four,130,3.47,2.68,9.0,111,5000,21,27,13495.0
alfa-romero stelvio,2,3,two,convertible,front,88.6,168.8,64.1,48.8,2548,four,130,3.47,2.68,9.0,111,5000,21,27,16500.0
alfa-romero Quadrifoglio,3,1,two,hatchback,front,94.5,171.2,65.5,52.4,2823,six,152,2.68,3.47,9.0,154,5000,19,26,16500.0
audi 100 ls,4,2,four,sedan,front,99.8,176.6,66.2,54.3,2337,four,109,3.19,3.40,10.0,102,5500,24,30,13950.0
audi 100ls,5,2,four,sedan,front,99.4,176.6,66.4,54.3,2824,five,136,3.19,3.40,8.0,115,5500,18,22,17450.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
volvo 244dl,197,-2,four,sedan,front,104.3,188.8,67.2,56.2,2935,four,141,3.78,3.15,9.5,114,5400,24,28,15985.0
volvo 245,198,-1,four,wagon,front,104.3,188.8,67.2,57.5,3042,four,141,3.78,3.15,9.5,114,5400,24,28,16515.0
volvo 264gl,199,-2,four,sedan,front,104.3,188.8,67.2,56.2,3045,four,130,3.62,3.15,7.5,162,5100,17,22,18420.0
volvo diesel,200,-1,four,wagon,front,104.3,188.8,67.2,57.5,3157,four,130,3.62,3.15,7.5,162,5100,17,22,18950.0
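
The `select_dtypes` method also accepts broader type names and an `exclude` argument. The sketch below is a minimal illustration on a small, hypothetical dataframe named `df_small` (built inline so that it runs on its own; it is not the real `df_cars`). It assumes that `'number'` matches both integer and float columns, which is how Pandas documents it.

```python
import pandas as pd

# Hypothetical miniature dataframe, built inline for illustration only
df_small = pd.DataFrame(
    {
        'carbody': ['convertible', 'hatchback', 'sedan'],
        'horsepower': [111, 154, 102],
        'price': [13495.0, 16500.0, 13950.0],
    },
    index = pd.Index(['alfa-romero giulia', 'alfa-romero Quadrifoglio', 'audi 100 ls'],
                     name = 'carname'),
)

# 'number' selects integer and float columns together
numeric_columns = df_small.select_dtypes(include = 'number')

# exclude keeps every column that is NOT of the given type
non_numeric_columns = df_small.select_dtypes(exclude = 'number')
non_float_columns = df_small.select_dtypes(exclude = 'float')

print(list(numeric_columns.columns))
print(list(non_numeric_columns.columns))
print(list(non_float_columns.columns))
```

In each case `select_dtypes` returns a new dataframe, so the original dataframe keeps all of its columns.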


### Quiz
Print the `'price'`, `'enginelocation'`, `'citympg'`, and `'cylindernumber'` details of the cars `'nissan teana'`, `'toyota corolla tercel'`, and `'volkswagen type 3'` from the `df_cars` dataframe.

In [141]:
df_cars.loc[['nissan teana', 'toyota corolla tercel', 'volkswagen type 3'],
            ['price', 'enginelocation', 'citympg', 'cylindernumber']]

carname,price,enginelocation,citympg,cylindernumber
nissan teana,17199.0,front,19,six
toyota corolla tercel,9538.0,front,26,four
volkswagen type 3,8195.0,front,27,four
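
The pattern used in this answer, `.loc[row_labels, column_labels]`, can be tried out on a smaller scale. The following sketch uses a hypothetical dataframe `df_demo` with invented car names and values (none of them come from `df_cars`). It shows that the result follows the order of the labels we pass in, and that, in recent Pandas versions (1.0 and later, as far as we are aware), asking for a label that does not exist raises a `KeyError`.

```python
import pandas as pd

# Hypothetical data, invented for illustration only
df_demo = pd.DataFrame(
    {
        'price': [7000.0, 9000.0, 6500.0],
        'citympg': [31, 24, 38],
        'cylindernumber': ['four', 'six', 'four'],
    },
    index = pd.Index(['car a', 'car b', 'car c'], name = 'carname'),
)

# Rows and columns come back in the order in which the labels are listed
subset = df_demo.loc[['car c', 'car a'], ['cylindernumber', 'price']]
print(subset)

# A label that is not in the index ('car z' here) is reported as an error
missing_label_raised = False
try:
    df_demo.loc[['car a', 'car z'], ['price']]
except KeyError as error:
    missing_label_raised = True
    print('Missing label:', error)
```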


### Quiz
Print the names of the cars that have a `'carbody'` type of `'sedan'`, a `'highwaympg'` value greater than 45, and a `'price'` of no more than 22000 dollars.

In [142]:
carbody_condition = df_cars['carbody'] == 'sedan'
hmpg_condition = df_cars['highwaympg'] > 45
price_condition = df_cars['price'] <= 22000
print(list(df_cars[carbody_condition & hmpg_condition & price_condition].index))

['nissan gt-r', 'vokswagen rabbit', 'volkswagen model 111']
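
The same kind of filter can be written in other ways. The sketch below uses a hypothetical dataframe `df_demo` with invented rows (not taken from `df_cars`) to show two variants: combining the comparisons inline, where each comparison needs its own parentheses because `&` binds more tightly than `==` or `>`, and the `query` method, which takes the conditions as a string.

```python
import pandas as pd

# Hypothetical rows, invented for illustration only
df_demo = pd.DataFrame(
    {
        'carbody': ['sedan', 'sedan', 'hatchback', 'sedan'],
        'highwaympg': [50, 30, 47, 46],
        'price': [7000.0, 9000.0, 6500.0, 25000.0],
    },
    index = pd.Index(['car a', 'car b', 'car c', 'car d'], name = 'carname'),
)

# Inline comparisons: the parentheses around each condition are required here
mask = (df_demo['carbody'] == 'sedan') & (df_demo['highwaympg'] > 45) & (df_demo['price'] <= 22000)
names_from_mask = df_demo[mask].index.tolist()

# The same conditions expressed as a query string
query_text = "carbody == 'sedan' and highwaympg > 45 and price <= 22000"
names_from_query = df_demo.query(query_text).index.tolist()

print(names_from_mask)
print(names_from_query)
```

Both variants select the same rows; naming each condition first, as in the quiz answer, is often easier to read when there are many conditions.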


In this session, we studied Pandas series and dataframes, two of the most popular and important data structures for handling real-world data, especially tabular data. We examined their basic attributes and learned how to query them using a variety of access methods.