# Pandas Data Indexing and Selection Coding Practice Questions

1. Create a Pandas Series named `s` (called `ser` in the solutions below) with the letters 'a' to 'e' as its index and the integers 1 to 5 as its values. Display the Series.

In [3]:
import pandas as pd
import numpy as np

In [4]:
ser = pd.Series(np.arange(1, 6), index=['a', 'b', 'c', 'd', 'e'])
ser

a    1
b    2
c    3
d    4
e    5
dtype: int64

2. Using the Series `s`, select the value at index 'c'.

In [5]:
ser['c']

3

3. Using the Series `s`, select values at indices 'a', 'c', and 'e'.

In [6]:
ind = ['a', 'c', 'e']
ser[ind]

a    1
c    3
e    5
dtype: int64

4. Using the Series `s`, select all values greater than 2.

In [7]:
ser[ser > 2]

c    3
d    4
e    5
dtype: int64

5. Create a DataFrame named `df` with columns 'A', 'B', 'C' and the values 1 to 9 arranged in a 3x3 grid, filled row by row. Display the DataFrame.

In [49]:
df = pd.DataFrame(np.arange(1, 10).reshape(3, 3), columns=['A', 'B', 'C'])
df

   A  B  C
0  1  2  3
1  4  5  6
2  7  8  9


6. Using the DataFrame `df`, select the value at row 1 and column 'B'.

In [15]:
df.loc[1, 'B']

5

In [16]:
df.at[1, 'B']

5

In summary, use `df.at` to read or write a single element by its row and column labels; it is the faster option for scalar access. Use `df.loc` when you need more flexibility, such as selecting several rows or columns by label, slicing by label, or filtering with a boolean mask. Both are label-based; for integer-position access use `df.iat` (single element) or `df.iloc`.
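A minimal sketch of the label-based versus position-based accessors. The frame `demo` and its string row labels are hypothetical (not part of the exercises); they are chosen so that labels and positions cannot be confused.

```python
import pandas as pd

# Hypothetical frame with string row labels, so labels and positions differ.
demo = pd.DataFrame({'A': [1, 4, 7], 'B': [2, 5, 8]}, index=['x', 'y', 'z'])

single_by_label = demo.at['y', 'B']          # scalar access by label
single_by_pos = demo.iat[1, 1]               # scalar access by position
block_by_label = demo.loc[['x', 'z'], 'A']   # several rows by label
block_by_pos = demo.iloc[[0, 2], 0]          # the same rows by position

print(single_by_label, single_by_pos)
print(block_by_label.equals(block_by_pos))
```

With the default integer index used in these exercises, labels and positions happen to coincide, which is why `df.loc[1, 'B']` and `df.iloc[1, 1]` would return the same value there.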

7. Using the DataFrame `df`, select the entire row with index 2.

In [17]:
df.loc[2]

A    7
B    8
C    9
Name: 2, dtype: int64

In [27]:
# Plain [] with a scalar key looks up a column label,
# so it cannot be used to select a row by its index label.
df[2]

KeyError: 2
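The behaviour of plain `[]` on a DataFrame can be sketched as follows. The frame `demo` is a hypothetical stand-in for `df`, used so that the exercise data is left untouched.

```python
import pandas as pd

# Hypothetical stand-in for df with the default integer index.
demo = pd.DataFrame({'A': [1, 4, 7], 'B': [2, 5, 8]})

col = demo['A']      # scalar key: treated as a column label
rows = demo[0:2]     # slice: selects rows instead of columns

try:
    demo[2]          # no column is labelled 2, so this fails
    raised = False
except KeyError:
    raised = True

print(raised)
```

Because a scalar and a slice are interpreted differently, `.loc` or `.iloc` is usually the clearer choice for row selection.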

8. Using the DataFrame `df`, select the entire column named 'C'.

In [26]:
df['C']

0    3
1    6
2    9
Name: C, dtype: int64

9. Using the DataFrame `df`, select values from column 'A' where values in column 'B' are greater than 2.

In [29]:
# A single .loc call with a boolean mask avoids chained indexing.
df.loc[df['B'] > 2, 'A']

1    4
2    7
Name: A, dtype: int64

10. Using the DataFrame `df`, select values from columns 'A' and 'C' for rows with index 0 and 2.

In [38]:
# Select rows and columns in one .loc call instead of chaining two lookups.
df.loc[[0, 2], ['A', 'C']]

   A  C
0  1  3
2  7  9


11. Using the DataFrame `df`, change the value at row 1 and column 'B' to 10.

In [44]:
df_changed = df.copy()
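# Assign through one .loc call; the chained form df_changed.loc[1]['B'] = 10
# may write to a temporary copy and leave df_changed unchanged.
df_changed.loc[1, 'B'] = 10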
df_changed

   A   B  C
0  1   2  3
1  4  10  6
2  7   8  9
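A minimal sketch of why the assignment should go through a single indexing call. The frame `demo` is hypothetical and deliberately has mixed dtypes, so that pulling out a row has to produce a copy; the chained form `demo.loc[1]['B'] = 10` is written out as two explicit steps to make that visible.

```python
import warnings
import pandas as pd

# Hypothetical frame with mixed dtypes, so a row is materialised as a copy.
demo = pd.DataFrame({'B': [2, 5, 8], 'label': ['x', 'y', 'z']})

with warnings.catch_warnings():
    warnings.simplefilter('ignore')   # silence pandas' copy-related warnings
    row = demo.loc[1]                 # step 1 of the chained form
    row['B'] = 10                     # step 2 writes to the temporary row only

unchanged = demo.loc[1, 'B']          # demo still holds the original value

demo.loc[1, 'B'] = 10                 # one indexing call writes to demo itself
changed = demo.loc[1, 'B']

print(unchanged, changed)
```

Whether the chained form happens to work on a frame with a single dtype depends on the pandas version, which is a further reason to prefer `demo.loc[1, 'B'] = 10`.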


12. Using the DataFrame `df`, add a new column 'D' with values as the sum of columns 'A' and 'B'.

In [50]:
df['D'] = df['A'] + df['B']
df

   A  B  C   D
0  1  2  3   3
1  4  5  6   9
2  7  8  9  15


13. Using the DataFrame `df`, drop the column named 'C'.

In [51]:
# df.drop(columns=['C'], inplace=True)
df.drop('C', axis=1, inplace=True)
df

   A  B   D
0  1  2   3
1  4  5   9
2  7  8  15


14. Using the DataFrame `df`, filter rows where the value in column 'A' is even.

In [80]:
# This cell was run after question 19 added row 3, so that row appears below.
df[df['A'] % 2 == 0]

    A   B   D
1   4   5   9
3  10  11  12


15. Using the DataFrame `df`, set the index of the DataFrame to be the values in column 'A'.

In [54]:
df.set_index('A', inplace=True)
df

   B   D
A
1  2   3
4  5   9
7  8  15


16. Using the DataFrame `df`, reset the index of the DataFrame.

In [56]:
df.reset_index(inplace=True)
df

   A  B   D
0  1  2   3
1  4  5   9
2  7  8  15


17. Using the DataFrame `df`, select all rows where the index is greater than 2.

In [57]:
df.loc[3:]

Empty DataFrame
Columns: [A, B, D]
Index: []


In [58]:
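# More robust: a boolean mask also works when the index is not sorted,
# where a label slice such as df.loc[3:] can raise a KeyError.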
df[df.index > 2]

Empty DataFrame
Columns: [A, B, D]
Index: []

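The difference between the two approaches shows up once the index is no longer sorted. The sketch below uses a hypothetical frame `demo` with an unsorted integer index in which the label 3 does not occur.

```python
import pandas as pd

# Hypothetical frame with an unsorted index that does not contain the label 3.
demo = pd.DataFrame({'A': [10, 20, 30]}, index=[5, 1, 4])

try:
    demo.loc[3:]                 # label slice needs label 3 or a sorted index
    slice_failed = False
except KeyError:
    slice_failed = True

masked = demo[demo.index > 2]    # compares every index value, order irrelevant

print(slice_failed)
print(masked)
```

On the sorted default index used in the exercises both forms give the same result, so the mask only matters when the index may have been reordered.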

18. Using the DataFrame `df`, select all rows where the values in column 'B' are in the list [2, 4, 6].

In [66]:
df[df['B'].isin([2, 4, 6])]

   A  B  D
0  1  2  3


19. Using the DataFrame `df`, use the `.loc` method to select values from column 'A' for rows with index 1 and 3.

In [74]:
df.loc[3] = [10, 11, 12]
print(df)
df.loc[[1, 3], 'A']

    A   B   D
0   1   2   3
1   4   5   9
2   7   8  15
3  10  11  12


1     4
3    10
Name: A, dtype: int64

20. Using the DataFrame `df`, use the `.iloc` method to select values from the first column for the first and third rows.

In [79]:
df.iloc[[0, 2], 0]

0    1
2    7
Name: A, dtype: int64