Selecting columns and rows from a pandas DataFrame is a fundamental operation in data manipulation and analysis.


* # Column Selection
  
1. Selecting a Single Column
 
You can select a single column by indexing the DataFrame with the column name. The result is a Series.

In [1]:
import pandas as pd

# Create a DataFrame
data = {'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]}
df = pd.DataFrame(data)

# Select column 'A'
col_a = df['A']

print("Column A:\n", col_a)


Column A:
 0    1
1    2
2    3
Name: A, dtype: int64
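
The type of the result depends on whether a bare name or a list is passed. The following is a minimal sketch on the same example data; the variable names as_series and as_frame are illustrative and not part of the original notebook.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]})

as_series = df['A']    # a bare column name returns a Series
as_frame = df[['A']]   # a one-element list returns a DataFrame

print(type(as_series).__name__)  # Series
print(type(as_frame).__name__)   # DataFrame
print(as_frame.shape)            # (3, 1)
```

The one-column DataFrame form is handy when later code expects two-dimensional input.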


2. Selecting Multiple Columns
   
You can select multiple columns by passing a list of column names.

In [2]:
# Select columns 'A' and 'C'
cols_ac = df[['A', 'C']]

print("Columns A and C:\n", cols_ac)

Columns A and C:
    A  C
0  1  7
1  2  8
2  3  9


* # Row Selection

# 1. Selecting Rows by Position (iloc)  
The iloc indexer selects rows by integer position, counting from 0. It is used with square brackets rather than called like a method.

In [3]:
# Select the first row
row_0 = df.iloc[0]

print("First Row:\n", row_0)

# Select multiple rows (first and third rows)
rows_0_2 = df.iloc[[0, 2]]

print("First and Third Rows:\n", rows_0_2)

First Row:
 A    1
B    4
C    7
Name: 0, dtype: int64
First and Third Rows:
    A  B  C
0  1  4  7
2  3  6  9
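
Besides single positions and lists, iloc also accepts slices and negative positions. A short sketch on the same example data (the names first_two and last_row are illustrative):

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]})

# Slice by position: the stop position is excluded, as with Python lists
first_two = df.iloc[0:2]

# Negative positions count from the end
last_row = df.iloc[-1]

print(first_two.index.tolist())  # [0, 1]
print(last_row.tolist())         # [3, 6, 9]
```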


# 2. Selecting Rows by Label (loc)  
The loc indexer selects rows by index label. Like iloc, it is used with square brackets rather than called like a method.

In [4]:
# Set a new index to demonstrate label-based selection
df.index = ['row1', 'row2', 'row3']

# Select the row with label 'row1'
row_1 = df.loc['row1']

print("Row with label 'row1':\n", row_1)

# Select multiple rows with labels 'row1' and 'row3'
rows_1_3 = df.loc[['row1', 'row3']]

print("Rows with labels 'row1' and 'row3':\n", rows_1_3)

Row with label 'row1':
 A    1
B    4
C    7
Name: row1, dtype: int64
Rows with labels 'row1' and 'row3':
       A  B  C
row1  1  4  7
row3  3  6  9
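
One difference from iloc is worth seeing in code: a label slice with loc includes both endpoints, while a position slice with iloc excludes the stop. A minimal sketch, rebuilding the labelled frame so that it runs on its own:

```python
import pandas as pd

df = pd.DataFrame(
    {'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]},
    index=['row1', 'row2', 'row3'],
)

by_label = df.loc['row1':'row2']   # 'row2' is included
by_position = df.iloc[0:1]         # position 1 is excluded

print(by_label.index.tolist())     # ['row1', 'row2']
print(by_position.index.tolist())  # ['row1']
```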


* # Conditional Selection  
You can also select rows based on a condition.

In [5]:
# Select rows where column 'A' is greater than 1
cond_rows = df[df['A'] > 1]

print("Rows where column 'A' > 1:\n", cond_rows)


Rows where column 'A' > 1:
       A  B  C
row2  2  5  8
row3  3  6  9
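
Several conditions can be combined with & (and), | (or) and ~ (not). Each condition needs its own parentheses because these operators bind more tightly than comparisons. A sketch on the same labelled data, with illustrative variable names:

```python
import pandas as pd

df = pd.DataFrame(
    {'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]},
    index=['row1', 'row2', 'row3'],
)

both = df[(df['A'] > 1) & (df['B'] < 6)]       # A > 1 and B < 6
either = df[(df['A'] == 1) | (df['C'] == 9)]   # A == 1 or C == 9
negated = df[~(df['A'] > 1)]                   # not (A > 1)

print(both.index.tolist())     # ['row2']
print(either.index.tolist())   # ['row1', 'row3']
print(negated.index.tolist())  # ['row1']
```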


* # Selecting Rows and Columns Together  
You can combine both row and column selection using iloc or loc.

In [7]:
# Select a single element by position: row position 1 (the second row) and the position of column 'B'
elem = df.iloc[1, df.columns.get_loc('B')]

print("Element at row 1, column 'B':", elem)

# Select a subset of the DataFrame (first two rows, columns 'A' and 'B')
subset = df.loc[['row1', 'row2'], ['A', 'B']]

print("Subset of the DataFrame:\n", subset)


Element at row 1, column 'B': 5
Subset of the DataFrame:
       A  B
row1  1  4
row2  2  5
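
The row selector in loc can also be a boolean condition, and iloc accepts a slice on each axis. A brief sketch under the same example data (the names rows_cols and block are illustrative):

```python
import pandas as pd

df = pd.DataFrame(
    {'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]},
    index=['row1', 'row2', 'row3'],
)

# Rows where A > 1, restricted to columns 'A' and 'C'
rows_cols = df.loc[df['A'] > 1, ['A', 'C']]

# First two rows and first two columns, by position
block = df.iloc[0:2, 0:2]

print(rows_cols.values.tolist())  # [[2, 8], [3, 9]]
print(block.columns.tolist())     # ['A', 'B']
```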


* # Using at and iat for Scalar Values  
To read or write a single scalar value, at (label-based) and iat (position-based) are faster than loc and iloc.

In [8]:
# Using at
scalar_at = df.at['row1', 'A']
print("Scalar value at ('row1', 'A'):", scalar_at)

# Using iat
scalar_iat = df.iat[0, 0]
print("Scalar value at (0, 0):", scalar_iat)

Scalar value at ('row1', 'A'): 1
Scalar value at (0, 0): 1
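
The same accessors can assign a single value. A minimal sketch, rebuilding the labelled frame so that it runs on its own; the values written are arbitrary:

```python
import pandas as pd

df = pd.DataFrame(
    {'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]},
    index=['row1', 'row2', 'row3'],
)

df.at['row1', 'A'] = 100   # set by row label and column label
df.iat[2, 1] = -6          # set by position: third row, second column ('B')

print(df.at['row1', 'A'])  # 100
print(df.at['row3', 'B'])  # -6
```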


* # Adding Columns
  
1. Adding a Column with a Constant Value  
You can add a new column with a constant value to a DataFrame.

In [9]:
import pandas as pd

# Create a DataFrame
data = {'A': [1, 2, 3], 'B': [4, 5, 6]}
df = pd.DataFrame(data)

# Add a new column 'C' with a constant value
df['C'] = 7

print("DataFrame after adding column 'C':\n", df)

DataFrame after adding column 'C':
    A  B  C
0  1  4  7
1  2  5  7
2  3  6  7


2. Adding a Column with Values from a List or Series  
You can add a new column with values from a list or another Series. A list must have the same length as the DataFrame, while a Series is aligned on the index.

In [10]:
# Add a new column 'D' with values from a list
df['D'] = [10, 20, 30]

print("DataFrame after adding column 'D':\n", df)

DataFrame after adding column 'D':
    A  B  C   D
0  1  4  7  10
1  2  5  7  20
2  3  6  7  30
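
The difference between assigning a list and assigning a Series shows up when the lengths or index labels do not match. The following sketch uses a hypothetical Series named partial that covers only two of the three rows:

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})

# A Series is aligned on the index: row 1 has no match and becomes NaN
partial = pd.Series([10, 30], index=[0, 2])
df['D'] = partial
print(df['D'].isna().tolist())  # [False, True, False]

# A list is matched by position, so a wrong length is rejected
length_error_raised = False
try:
    df['E'] = [1, 2]
except ValueError as exc:
    length_error_raised = True
    print("ValueError:", exc)
```

Because of the missing value, column 'D' is stored as floating point rather than integer.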


3. Adding a Column Based on a Calculation
   
You can add a new column based on calculations from existing columns.

In [11]:
# Add a new column 'E' which is the sum of columns 'A' and 'B'
df['E'] = df['A'] + df['B']

print("DataFrame after adding column 'E':\n", df)

DataFrame after adding column 'E':
    A  B  C   D  E
0  1  4  7  10  5
1  2  5  7  20  7
2  3  6  7  30  9
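
As an alternative to assigning into the DataFrame, the assign() method returns a new DataFrame and leaves the original untouched. A short sketch; the column names E and F are illustrative:

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})

# A callable receives the DataFrame being built, so F can refer to E
with_new = df.assign(
    E=lambda d: d['A'] + d['B'],
    F=lambda d: d['E'] * 2,
)

print(df.columns.tolist())        # ['A', 'B']
print(with_new.columns.tolist())  # ['A', 'B', 'E', 'F']
print(with_new['F'].tolist())     # [10, 14, 18]
```

This form fits method chains where modifying the original in place is not wanted.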


* # Deleting Columns
  
1. Deleting a Single Column  
You can delete a single column using the drop() method. By default drop() returns a new DataFrame, so assign the result back to keep the change.

In [12]:
# Delete column 'C'
df = df.drop(columns=['C'])

print("DataFrame after deleting column 'C':\n", df)

DataFrame after deleting column 'C':
    A  B   D  E
0  1  4  10  5
1  2  5  20  7
2  3  6  30  9


2. Deleting Multiple Columns  
You can delete multiple columns at once using the drop() method.

In [13]:
# Delete columns 'D' and 'E'
df = df.drop(columns=['D', 'E'])

print("DataFrame after deleting columns 'D' and 'E':\n", df)

DataFrame after deleting columns 'D' and 'E':
    A  B
0  1  4
1  2  5
2  3  6
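
Apart from drop(), a column can be removed in place with pop() or del, and drop() can be told to skip names that do not exist. A sketch on a fresh example frame; the missing column name 'Z' is hypothetical:

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6],
                   'C': [7, 8, 9], 'D': [10, 20, 30]})

removed = df.pop('C')   # removes 'C' in place and returns it as a Series
del df['D']             # removes 'D' in place, returns nothing

# errors='ignore' suppresses the KeyError for a missing column
unchanged = df.drop(columns=['Z'], errors='ignore')

print(removed.tolist())            # [7, 8, 9]
print(df.columns.tolist())         # ['A', 'B']
print(unchanged.columns.tolist())  # ['A', 'B']
```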


* # Updating Columns
  
1. Updating Values in an Entire Column  
You can update all values in a column.

In [14]:
# Double every value in column 'A'
df['A'] = df['A'] * 2

print("DataFrame after updating column 'A':\n", df)

DataFrame after updating column 'A':
    A  B
0  2  4
1  4  5
2  6  6


2. Updating Values Based on a Condition  
You can update values in a column based on a condition.

In [15]:
# Set column 'B' to 0 in rows where the value in 'A' is greater than 2
df.loc[df['A'] > 2, 'B'] = 0

print("DataFrame after conditionally updating column 'B':\n", df)

DataFrame after conditionally updating column 'B':
    A  B
0  2  4
1  4  0
2  6  0
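
The same conditional update can be written with Series.mask() or Series.where(), which return a new Series instead of modifying values in place. A minimal sketch starting from the values the DataFrame had before the update; the column names B_masked and B_where are illustrative:

```python
import pandas as pd

df = pd.DataFrame({'A': [2, 4, 6], 'B': [4, 5, 6]})

# mask() replaces values where the condition is True
df['B_masked'] = df['B'].mask(df['A'] > 2, 0)

# where() keeps values where the condition is True and replaces the rest
df['B_where'] = df['B'].where(df['A'] <= 2, 0)

print(df['B_masked'].tolist())  # [4, 0, 0]
print(df['B_where'].tolist())   # [4, 0, 0]
```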


* # Adding, Deleting, and Updating Columns Example  
Here's an example combining adding, deleting, and updating columns:

In [16]:
# Create a DataFrame
data = {'A': [1, 2, 3], 'B': [4, 5, 6]}
df = pd.DataFrame(data)

# Add new column 'C' with values [7, 8, 9]
df['C'] = [7, 8, 9]

# Add new column 'D' based on calculation (A + B)
df['D'] = df['A'] + df['B']

# Update column 'A' to be its negative
df['A'] = -df['A']

# Conditionally update column 'B' to be 0 where column 'C' > 7
df.loc[df['C'] > 7, 'B'] = 0

# Delete column 'D'
df = df.drop(columns=['D'])

print("Final DataFrame:\n", df)

Final DataFrame:
    A  B  C
0 -1  4  7
1 -2  0  8
2 -3  0  9


* # Setting an Index
  
1. Setting an Index with set_index()  
You can set a column as the index of the DataFrame.

In [17]:
import pandas as pd

# Create a DataFrame
data = {'A': [1, 2, 3], 'B': [4, 5, 6], 'C': ['a', 'b', 'c']}
df = pd.DataFrame(data)

# Set column 'C' as the index
df = df.set_index('C')

print("DataFrame after setting index:\n", df)

DataFrame after setting index:
    A  B
C      
a  1  4
b  2  5
c  3  6
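
Once a column is the index, its values can be used as labels with loc. By default set_index() removes the column from the data; passing drop=False keeps it. A short sketch on the same example data, with illustrative variable names:

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': ['a', 'b', 'c']})

indexed = df.set_index('C')
print(indexed.loc['b', 'A'])   # 2
print(indexed.index.name)      # C

# drop=False keeps 'C' as a regular column as well as the index
kept = df.set_index('C', drop=False)
print(kept.columns.tolist())   # ['A', 'B', 'C']
```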


* # Resetting the Index
  
2. Resetting the Index with reset_index()  
You can reset the index to the default integer index and move the current index back into a column.

In [18]:
# Reset the index
df_reset = df.reset_index()

print("DataFrame after resetting index:\n", df_reset)

DataFrame after resetting index:
    C  A  B
0  a  1  4
1  b  2  5
2  c  3  6


* # Removing the Index
  
3. Removing the Index by Resetting and Dropping It  
To discard the current index entirely instead of moving it back into a column, pass drop=True to reset_index().

In [19]:
# Reset the index and drop the current index
df_reset_drop = df.reset_index(drop=True)

print("DataFrame after resetting and dropping index:\n", df_reset_drop)

DataFrame after resetting and dropping index:
    A  B
0  1  4
1  2  5
2  3  6


* # Operations between two Columns
  
# 1. Arithmetic Operations
   * Addition

In [20]:
import pandas as pd

# Create a DataFrame
data = {'A': [1, 2, 3], 'B': [4, 5, 6]}
df = pd.DataFrame(data)

# Add columns 'A' and 'B'
df['A_plus_B'] = df['A'] + df['B']

print("DataFrame after addition:\n", df)

DataFrame after addition:
    A  B  A_plus_B
0  1  4         5
1  2  5         7
2  3  6         9


  * Subtraction

In [22]:
# Subtract column 'B' from column 'A'
df['A_minus_B'] = df['A'] - df['B']

print("\nDataFrame after subtraction:\n", df)


DataFrame after subtraction:
    A  B  A_plus_B  A_minus_B
0  1  4         5         -3
1  2  5         7         -3
2  3  6         9         -3


  * Multiplication

In [23]:
# Multiply columns 'A' and 'B'
df['A_times_B'] = df['A'] * df['B']

print("\nDataFrame after multiplication:\n", df)


DataFrame after multiplication:
    A  B  A_plus_B  A_minus_B  A_times_B
0  1  4         5         -3          4
1  2  5         7         -3         10
2  3  6         9         -3         18


  * Division

In [24]:
# Divide column 'A' by column 'B'
df['A_div_B'] = df['A'] / df['B']

print("\nDataFrame after division:\n", df)


DataFrame after division:
    A  B  A_plus_B  A_minus_B  A_times_B  A_div_B
0  1  4         5         -3          4     0.25
1  2  5         7         -3         10     0.40
2  3  6         9         -3         18     0.50
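
Division deserves one extra check: dividing by zero in a numeric column does not raise an error. A small sketch with made-up values that include zeros:

```python
import math
import pandas as pd

df = pd.DataFrame({'A': [1, 0, 3], 'B': [0, 0, 2]})

ratio = df['A'] / df['B']

# 1 / 0 gives inf, 0 / 0 gives NaN, 3 / 2 gives 1.5
print(ratio.tolist())  # [inf, nan, 1.5]
```

If such values are not wanted, they can be filtered or replaced before further calculations.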


# 2. Logical (Comparison) Operations

   * Greater Than

In [25]:
# Check if column 'A' is greater than column 'B'
df['A_gt_B'] = df['A'] > df['B']

print("\nDataFrame after logical 'greater than' operation:\n", df)


DataFrame after logical 'greater than' operation:
    A  B  A_plus_B  A_minus_B  A_times_B  A_div_B  A_gt_B
0  1  4         5         -3          4     0.25   False
1  2  5         7         -3         10     0.40   False
2  3  6         9         -3         18     0.50   False


  * Equal To

In [26]:
# Check if column 'A' is equal to column 'B'
df['A_eq_B'] = df['A'] == df['B']

print("\nDataFrame after logical 'equal to' operation:\n", df)


DataFrame after logical 'equal to' operation:
    A  B  A_plus_B  A_minus_B  A_times_B  A_div_B  A_gt_B  A_eq_B
0  1  4         5         -3          4     0.25   False   False
1  2  5         7         -3         10     0.40   False   False
2  3  6         9         -3         18     0.50   False   False
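
The boolean Series produced by a comparison can be summarized or combined with other comparisons. A minimal sketch on the same A and B values; the names a_lt_b and in_range are illustrative:

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})

a_lt_b = df['A'] < df['B']

print(a_lt_b.all())   # True: A is smaller than B in every row
print(a_lt_b.sum())   # 3: True counts as 1 when summed

# Element-wise "and" of two comparisons
in_range = (df['A'] >= 2) & (df['B'] <= 5)
print(in_range.tolist())  # [False, True, False]
```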


# 3. Element-wise String Operations
  
With string columns, the + operator concatenates the values element-wise.

In [27]:
# Create a DataFrame with string columns
data = {'A': ['foo', 'bar', 'baz'], 'B': ['one', 'two', 'three']}
df_str = pd.DataFrame(data)

# Concatenate columns 'A' and 'B'
df_str['A_B_concat'] = df_str['A'] + df_str['B']

print("\nDataFrame after concatenation:\n", df_str)


DataFrame after concatenation:
      A      B A_B_concat
0  foo    one     fooone
1  bar    two     bartwo
2  baz  three   bazthree
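
A separator can be placed between the two values either by adding a literal string or by using the str.cat() method. A short sketch on the same data; the new column names are illustrative:

```python
import pandas as pd

df_str = pd.DataFrame({'A': ['foo', 'bar', 'baz'],
                       'B': ['one', 'two', 'three']})

# A plain string is broadcast to every row
df_str['with_plus'] = df_str['A'] + '_' + df_str['B']

# str.cat() does the same with an explicit separator argument
df_str['with_cat'] = df_str['A'].str.cat(df_str['B'], sep='_')

print(df_str['with_cat'].tolist())  # ['foo_one', 'bar_two', 'baz_three']
```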


# 4. Applying Custom Functions
  
You can apply a custom function row by row using the apply method with axis=1. Each row is passed to the function as a Series.

In [28]:
# Define an example custom function
def custom_operation(x, y):
    return x ** 2 + y ** 2

# Apply the custom function to columns 'A' and 'B'
df['custom'] = df.apply(lambda row: custom_operation(row['A'], row['B']), axis=1)

print("\nDataFrame after applying custom function:\n", df)


DataFrame after applying custom function:
    A  B  A_plus_B  A_minus_B  A_times_B  A_div_B  A_gt_B  A_eq_B  custom
0  1  4         5         -3          4     0.25   False   False      17
1  2  5         7         -3         10     0.40   False   False      29
2  3  6         9         -3         18     0.50   False   False      45
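
When the function uses only arithmetic that works on whole columns, as this one does, it can be called directly on the columns instead of row by row. The sketch below compares both forms on the same A and B values; the names row_wise and vectorized are illustrative:

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})

def custom_operation(x, y):
    return x ** 2 + y ** 2

# Row by row: the function is called once per row
row_wise = df.apply(lambda row: custom_operation(row['A'], row['B']), axis=1)

# Column-wise: the function is called once on whole Series
vectorized = custom_operation(df['A'], df['B'])

print(vectorized.tolist())                       # [17, 29, 45]
print(row_wise.tolist() == vectorized.tolist())  # True
```

The column-wise form is usually faster on large DataFrames, while apply remains useful for logic that cannot be expressed with column operations.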
