# Pandas Tutorial - Part 61: DataFrame Methods (info, mask)

This notebook covers two DataFrame methods:
- `info()` - Print a concise summary of a DataFrame
- `mask()` - Replace values where the condition is True

In [None]:
import pandas as pd
import numpy as np

# Set display options
pd.set_option('display.max_columns', None)
pd.set_option('display.expand_frame_repr', False)

## 1. DataFrame.info()

The `info()` method prints a concise summary of a DataFrame: the index dtype, each column's dtype and non-null count, and the DataFrame's memory usage. It writes the summary to standard output and returns `None`.

In [None]:
# Create a DataFrame with different data types
int_values = [1, 2, 3, 4, 5]
text_values = ['alpha', 'beta', 'gamma', 'delta', 'epsilon']
float_values = [0.0, 0.25, 0.5, 0.75, 1.0]
df = pd.DataFrame({
    "int_col": int_values, 
    "text_col": text_values,
    "float_col": float_values
})

print("DataFrame:")
df

In [None]:
# Print information about the DataFrame
print("DataFrame info:")
df.info(verbose=True)
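
Because `info()` prints its summary rather than returning it, it can help to redirect the output into a string, for example to write it to a log. The sketch below assumes a small stand-in DataFrame and uses the `buf` parameter with an `io.StringIO` buffer; the variable names are illustrative only.

```python
import io

import pandas as pd

# Small stand-in DataFrame (illustrative data, not the tutorial's df)
df = pd.DataFrame({
    "int_col": [1, 2, 3],
    "text_col": ["alpha", "beta", "gamma"],
    "float_col": [0.0, 0.5, 1.0],
})

# info() writes to sys.stdout by default and returns None;
# passing a writable buffer redirects the summary there instead.
buffer = io.StringIO()
result = df.info(buf=buffer)
summary_text = buffer.getvalue()

print(result is None)
print(summary_text)
```

The captured `summary_text` is an ordinary string, so it can be searched, logged, or saved to a file.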

In [None]:
# Create a DataFrame with missing values
df_with_na = pd.DataFrame({
    "A": [1, 2, np.nan, 4, 5],
    "B": [np.nan, 2, 3, 4, 5],
    "C": [1, 2, 3, np.nan, np.nan]
})

print("DataFrame with missing values:")
df_with_na

In [None]:
# Print information about the DataFrame with missing values
print("DataFrame with missing values info:")
df_with_na.info()
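
The non-null counts that `info()` prints can also be obtained as data. As a minimal sketch, `count()` gives the non-null entries per column and `isna().sum()` gives the missing entries; the `report` frame is just one way to lay the two side by side.

```python
import numpy as np
import pandas as pd

df_with_na = pd.DataFrame({
    "A": [1, 2, np.nan, 4, 5],
    "B": [np.nan, 2, 3, 4, 5],
    "C": [1, 2, 3, np.nan, np.nan],
})

non_null = df_with_na.count()      # non-null entries per column
missing = df_with_na.isna().sum()  # missing entries per column

# Combine both views into one small report
report = pd.DataFrame({"non_null": non_null, "missing": missing})
print(report)
```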

In [None]:
# Create a larger DataFrame
large_df = pd.DataFrame({
    f"col_{i}": np.random.rand(1000) for i in range(20)
})

print("Large DataFrame info:")
large_df.info()

In [None]:
# memory_usage='deep' inspects the values in object-dtype columns to report their real memory consumption
print("Memory usage with deep introspection:")
df.info(memory_usage='deep')
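
To see where the difference comes from, the per-column figures can be compared with `DataFrame.memory_usage()`. This is a sketch under one assumption: the text column is created with `dtype=object` explicitly, so that it holds Python string objects regardless of the pandas version's default string dtype.

```python
import pandas as pd

df = pd.DataFrame({
    "int_col": [1, 2, 3, 4, 5],
    # dtype=object is set explicitly so the column stores Python string objects
    "text_col": pd.Series(
        ["alpha", "beta", "gamma", "delta", "epsilon"], dtype=object
    ),
})

shallow = df.memory_usage(deep=False)  # counts only the pointers for object columns
deep = df.memory_usage(deep=True)      # also counts the string objects themselves

print(pd.DataFrame({"shallow": shallow, "deep": deep}))
```

Numeric columns report the same size either way; only the object column grows under deep introspection.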

In [None]:
# large_df has 20 columns, more than max_cols=2, so info() switches to the truncated summary
print("Info with max_cols=2:")
large_df.info(max_cols=2)
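
The effect of `max_cols` is easier to see by comparing the two summaries directly. The sketch below captures both into strings; `info_text` is a hypothetical helper written for this illustration, not a pandas function, and the random data is seeded only to make the example repeatable.

```python
import io

import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
large_df = pd.DataFrame({f"col_{i}": rng.random(1000) for i in range(20)})


def info_text(frame, **kwargs):
    """Hypothetical helper: return the info() summary as a string."""
    buffer = io.StringIO()
    frame.info(buf=buffer, **kwargs)
    return buffer.getvalue()


full = info_text(large_df)                   # per-column listing
truncated = info_text(large_df, max_cols=2)  # 20 columns > 2, so truncated

print(len(full.splitlines()), "lines in the full summary")
print(len(truncated.splitlines()), "lines in the truncated summary")
print(truncated)
```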

## 2. DataFrame.mask()

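The `mask()` method replaces values where the condition is True; if no replacement is given, they become NaN. It is the inverse of the `where()` method, which replaces values where the condition is False.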

In [None]:
# Create a Series
s = pd.Series(range(5))
print("Original Series:")
print(s)

In [None]:
# Using where() - keep values where condition is True
print("\nwhere(s > 0) - keep values where s > 0:")
print(s.where(s > 0))

In [None]:
# Using mask() - replace values where condition is True
print("\nmask(s > 0) - replace values where s > 0:")
print(s.mask(s > 0))
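
One side effect worth noting: when `mask()` fills with NaN, an integer Series is upcast to a float dtype, because NaN is a floating-point value. A minimal sketch of the difference, assuming an integer replacement value of `-1` chosen purely for illustration:

```python
import pandas as pd

s = pd.Series(range(5))

masked_nan = s.mask(s > 0)      # default replacement is NaN, so the dtype becomes float
masked_int = s.mask(s > 0, -1)  # an integer replacement keeps an integer dtype

print("original:", s.dtype)
print("masked with NaN:", masked_nan.dtype)
print("masked with -1:", masked_int.dtype)
```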

In [None]:
# Using where() with a replacement value
print("\nwhere(s > 1, 10) - replace values where s <= 1 with 10:")
print(s.where(s > 1, 10))

In [None]:
# Using mask() with a replacement value
print("\nmask(s > 1, 10) - replace values where s > 1 with 10:")
print(s.mask(s > 1, 10))

In [None]:
# Create a DataFrame
df = pd.DataFrame(np.arange(10).reshape(-1, 2), columns=['A', 'B'])
print("Original DataFrame:")
print(df)

In [None]:
# Create a condition
m = df % 3 == 0
print("Condition (m = df % 3 == 0):")
print(m)

In [None]:
# Using where() with the condition
print("\ndf.where(m, -df) - keep values where m is True, replace others with -df:")
print(df.where(m, -df))

In [None]:
# Using mask() with the condition
print("\ndf.mask(m, -df) - replace values where m is True with -df:")
print(df.mask(m, -df))

In [None]:
# Verify that where(m) is equivalent to mask(~m)
print("\nVerify that df.where(m, -df) == df.mask(~m, -df):")
print(df.where(m, -df) == df.mask(~m, -df))
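
The element-wise comparison prints a whole frame of booleans. When a single yes/no answer is preferable, one option is `DataFrame.equals()`, or `pd.testing.assert_frame_equal()` inside tests. A short sketch using the same data as above:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(10).reshape(-1, 2), columns=["A", "B"])
m = df % 3 == 0

via_where = df.where(m, -df)
via_mask = df.mask(~m, -df)

# equals() collapses the comparison into a single boolean
print(via_where.equals(via_mask))

# assert_frame_equal() raises an AssertionError if the frames differ
pd.testing.assert_frame_equal(via_where, via_mask)
```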

In [None]:
# Using mask() with a callable for the condition
print("\nUsing a callable for the condition:")
print(df.mask(lambda x: x > 5, 0))

In [None]:
# Using mask() with a callable for the replacement
print("\nUsing a callable for the replacement:")
print(df.mask(m, lambda x: x * 10))
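
Callables are most helpful in method chains, where the intermediate DataFrame has no variable name to build a condition from. The following sketch assumes a small made-up frame `raw` and clips negative values to zero after a subtraction:

```python
import pandas as pd

raw = pd.DataFrame({"A": [1, 4, 7], "B": [2, 5, 8]})

result = (
    raw
    .sub(4)                    # intermediate frame: never bound to a name
    .mask(lambda d: d < 0, 0)  # the callable receives that intermediate frame
)

print(result)
```

Without the callable, the subtraction result would have to be stored in a temporary variable before the condition could be written.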

In [None]:
# Using mask() with inplace=True
df_copy = df.copy()
print("\nBefore mask() with inplace=True:")
print(df_copy)

df_copy.mask(m, -df, inplace=True)
print("\nAfter mask() with inplace=True:")
print(df_copy)
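
As an alternative to `inplace=True`, `mask()` can be used in its default form, which returns a new object and leaves the original untouched. A minimal sketch of that behavior, reusing the same data:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(10).reshape(-1, 2), columns=["A", "B"])
m = df % 3 == 0
original = df.copy()

# Without inplace=True, mask() returns a new DataFrame
negated = df.mask(m, -df)

# The source DataFrame is unchanged
unchanged = df.equals(original)
print("df unchanged:", unchanged)
print(negated)
```

Assigning the result back, as in `df = df.mask(m, -df)`, gives the same end state as the in-place call while keeping the operation usable in method chains.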

## Summary

In this notebook, we've explored two important DataFrame methods:

1. **info()**: Prints a concise summary of a DataFrame, including the index dtype, column dtypes, non-null counts, and memory usage. This is useful for quickly checking a DataFrame's structure and spotting missing data.

2. **mask()**: Replaces values where the condition is True. It's the opposite of the `where()` method, which keeps values where the condition is True. The `mask()` method is useful for data cleaning and transformation tasks.

These methods are essential for data exploration, understanding DataFrame structure, and data manipulation in pandas.