# Chapter 1: DataFrames

## Sorting and Subsetting

Sorting reorders the rows of a DataFrame by the values in one or more columns, and subsetting keeps only the rows (or columns) that meet a condition. Both are routine first steps before analysis or visualization.

In [None]:
import pandas as pd

# Example DataFrame
data = {
    'name': ['Bella', 'Charlie', 'Lucy'],
    'breed': ['Labrador', 'Poodle', 'Chow Chow'],
    'weight_kg': [24, 23, 22]
}
dogs = pd.DataFrame(data)

# Sorting by weight_kg
sorted_dogs = dogs.sort_values('weight_kg')

# Creating a subset where weight is greater than 22 kg
subset_dogs = dogs[dogs['weight_kg'] > 22]

print(sorted_dogs)
print(subset_dogs)
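
`sort_values` also accepts a list of columns and an `ascending` argument. The following is a minimal sketch using the same three dogs, rebuilt here so the cell runs on its own. The variable names `heaviest_first` and `by_breed_then_weight` are illustrative and not part of the original example.

```python
import pandas as pd

# Same example data as above, repeated so this cell is self-contained
dogs = pd.DataFrame({
    'name': ['Bella', 'Charlie', 'Lucy'],
    'breed': ['Labrador', 'Poodle', 'Chow Chow'],
    'weight_kg': [24, 23, 22],
})

# Heaviest dogs first
heaviest_first = dogs.sort_values('weight_kg', ascending=False)

# Sort by breed alphabetically, then by weight (descending) within each breed
by_breed_then_weight = dogs.sort_values(
    ['breed', 'weight_kg'], ascending=[True, False]
)

print(heaviest_first)
print(by_breed_then_weight)
```

Note that `sort_values` returns a new DataFrame and leaves `dogs` unchanged.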

## Creating New Columns

New columns are usually derived from existing ones, for example a unit conversion or a ratio of two columns. Assigning to a column name that does not exist yet creates it.

In [None]:
# Adding a new column for weight in pounds
dogs['weight_lb'] = dogs['weight_kg'] * 2.20462

print(dogs)
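
Assigning with `dogs['weight_lb'] = ...` modifies `dogs` in place. If the original DataFrame should stay untouched, `DataFrame.assign` returns a copy with the extra column. This is a small sketch of that alternative, with the data rebuilt so the cell is self-contained and `dogs_with_lb` as an illustrative name.

```python
import pandas as pd

# Same example data as above, repeated so this cell is self-contained
dogs = pd.DataFrame({
    'name': ['Bella', 'Charlie', 'Lucy'],
    'breed': ['Labrador', 'Poodle', 'Chow Chow'],
    'weight_kg': [24, 23, 22],
})

# assign() returns a new DataFrame; dogs itself is not modified
dogs_with_lb = dogs.assign(weight_lb=dogs['weight_kg'] * 2.20462)

print(dogs_with_lb)
print('weight_lb' in dogs.columns)  # the original has no weight_lb column
```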

## Exploring a DataFrame

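- `.head()`: returns the first rows of the DataFrame (five by default).
- `.info()`: prints each column's name, data type, and non-null count.
- `.describe()`: returns summary statistics (count, mean, standard deviation, min, quartiles, max) for the numeric columns.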

In [None]:
print(dogs.head())
dogs.info()  # prints its summary directly and returns None, so no print() is needed
print(dogs.describe())
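
Two attributes often checked alongside these methods are `.shape` and `.dtypes`. The sketch below uses the original three-column data, rebuilt so the cell runs on its own; `n_rows` and `n_cols` are illustrative names.

```python
import pandas as pd

# Same example data as above, repeated so this cell is self-contained
dogs = pd.DataFrame({
    'name': ['Bella', 'Charlie', 'Lucy'],
    'breed': ['Labrador', 'Poodle', 'Chow Chow'],
    'weight_kg': [24, 23, 22],
})

# .shape is an attribute (no parentheses): a (rows, columns) tuple
n_rows, n_cols = dogs.shape
print(n_rows, n_cols)

# .dtypes lists the data type of each column
print(dogs.dtypes)
```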

## Components of a DataFrame

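A DataFrame has three main components: `.values` holds the data as a two-dimensional NumPy array, `.columns` holds the column labels, and `.index` holds the row labels. Knowing which is which makes selection and reshaping easier to reason about.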

In [None]:
print(dogs.values)
print(dogs.columns)
print(dogs.index)
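
The pandas documentation recommends `.to_numpy()` over `.values` for getting the underlying array, and `.tolist()` converts the labels to plain Python lists. The following sketch shows both on the original three-column data, rebuilt so the cell is self-contained; the variable names are illustrative.

```python
import pandas as pd

# Same example data as above, repeated so this cell is self-contained
dogs = pd.DataFrame({
    'name': ['Bella', 'Charlie', 'Lucy'],
    'breed': ['Labrador', 'Poodle', 'Chow Chow'],
    'weight_kg': [24, 23, 22],
})

# The data as a 2D NumPy array (object dtype here, since columns mix text and numbers)
values = dogs.to_numpy()

# Column and row labels as plain Python lists
column_names = dogs.columns.tolist()
row_labels = dogs.index.tolist()

print(values.shape)
print(column_names)
print(row_labels)
```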

## Subsetting and Filtering

A comparison such as `dogs['breed'] == 'Labrador'` produces a boolean Series, and passing that Series inside square brackets keeps only the rows where it is `True`.

In [None]:
# Subset by Labrador breed
labradors = dogs[dogs['breed'] == 'Labrador']

# Subset using .isin() to match any of several values
breeds = ['Labrador', 'Poodle']
subset_breeds = dogs[dogs['breed'].isin(breeds)]

print(labradors)
print(subset_breeds)
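
Conditions can also be combined. The sketch below uses `&` (and), `|` (or), and `~` (not) on the original data, rebuilt so the cell runs on its own. Each condition needs its own parentheses because these operators bind more tightly than comparisons. The variable names are illustrative.

```python
import pandas as pd

# Same example data as above, repeated so this cell is self-contained
dogs = pd.DataFrame({
    'name': ['Bella', 'Charlie', 'Lucy'],
    'breed': ['Labrador', 'Poodle', 'Chow Chow'],
    'weight_kg': [24, 23, 22],
})

# Both conditions must hold: Labradors heavier than 22 kg
heavy_labradors = dogs[(dogs['breed'] == 'Labrador') & (dogs['weight_kg'] > 22)]

# Either condition may hold: Labradors, or any dog under 23 kg
lab_or_light = dogs[(dogs['breed'] == 'Labrador') | (dogs['weight_kg'] < 23)]

# Negation: every dog that is not a Poodle
not_poodles = dogs[~(dogs['breed'] == 'Poodle')]

print(heavy_labradors)
print(lab_or_light)
print(not_poodles)
```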