# Data Selection in Pandas DataFrames

## 1. DataFrame as a Dictionary of Series:
   * Columns can be accessed using dictionary-style indexing: `data['column_name']`
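   * Attribute-style access also works when the column name is a string that is a valid Python identifier and does not clash with an existing DataFrame method or attribute: `data.column_name`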
   * New columns can be added: `data['new_column'] = values`
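
The points above can be sketched as follows. This is a minimal illustration on a hypothetical DataFrame; the column names `area`, `pop`, and `density` are assumptions chosen for the example and do not come from the notes.

```python
import pandas as pd

# Hypothetical sample data; the names are for illustration only
data = pd.DataFrame(
    {"area": [100, 200, 300], "pop": [10, 40, 90]},
    index=["a", "b", "c"],
)

# Dictionary-style and attribute-style access give the same column
area_by_key = data["area"]
area_by_attr = data.area
print(area_by_key.equals(area_by_attr))

# "pop" collides with the DataFrame.pop method, so attribute access
# returns the method, not the column; use the dictionary style instead
print(callable(data.pop))
pop_column = data["pop"]

# Add a new column computed from existing ones
data["density"] = data["pop"] / data["area"]
print(data)
```

Because of such name collisions, dictionary-style access is the safer choice, and assignment should always use `data['name'] = ...` rather than the attribute form.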

## 2. DataFrame as a Two-dimensional Array:
   * The underlying data can be accessed as a NumPy array using the `values` attribute (or the `to_numpy()` method in pandas 0.24 and later)
   * The DataFrame can be transposed using `data.T`
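
A short sketch of the array view, assuming a small hypothetical DataFrame with columns `x` and `y`. It also shows one place where the array analogy differs from the dictionary one: a single index on the array returns a row, while a single key on the DataFrame returns a column.

```python
import pandas as pd

# Hypothetical sample data for illustration
data = pd.DataFrame({"x": [1, 2, 3], "y": [4, 5, 6]}, index=["a", "b", "c"])

arr = data.values            # NumPy array of shape (3, 2)
first_row = data.values[0]   # single index on the array -> first row
first_col = data["x"]        # single key on the DataFrame -> column

transposed = data.T          # row labels become column labels
print(arr.shape)
print(transposed)
```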

## 3. Indexing Methods:
   * `loc`: Label-based indexing
     Example: `data.loc['row_label', 'column_label']`
   * `iloc`: Integer position-based indexing
     Example: `data.iloc[0, 1]`
   * `ix`: Hybrid of `loc` and `iloc` (deprecated in pandas 0.20 and removed in pandas 1.0; use `loc` or `iloc` instead)
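
The difference between `loc` and `iloc` is easiest to see when the row labels are themselves integers. The sketch below assumes a hypothetical DataFrame whose index is `[10, 20, 30]`; the column name `score` is illustrative.

```python
import pandas as pd

# Hypothetical data with integer row labels that do not start at 0
data = pd.DataFrame({"score": [70, 80, 90]}, index=[10, 20, 30])

by_label = data.loc[10, "score"]      # row labeled 10
by_position = data.iloc[0, 0]         # first row, first column
last_by_position = data.iloc[2, 0]    # third row, first column

# loc only understands labels, so position 0 is not a valid key here
try:
    data.loc[0, "score"]
except KeyError:
    print("no row is labeled 0")
```

Here `by_label` and `by_position` refer to the same cell, reached by label in one case and by position in the other.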

## 4. Slicing:
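   * Rows can be sliced by label, in which case the end label is included: `data['row1':'row3']`
   * Rows can be sliced by integer position, in which case the end position is excluded: `data[1:3]`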
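
A minimal sketch of the two slicing styles on a hypothetical four-row DataFrame (the column name `v` is an assumption). It compares each `[]` slice with its explicit `loc` or `iloc` equivalent.

```python
import pandas as pd

# Hypothetical sample data for illustration
data = pd.DataFrame({"v": [1, 2, 3, 4]},
                    index=["row1", "row2", "row3", "row4"])

label_slice = data["row1":"row3"]   # label slice: end label is included
position_slice = data[1:3]          # positional slice: end is excluded

# The explicit forms select the same rows
same_labels = label_slice.equals(data.loc["row1":"row3"])
same_positions = position_slice.equals(data.iloc[1:3])

print(list(label_slice.index))      # ['row1', 'row2', 'row3']
print(list(position_slice.index))   # ['row2', 'row3']
```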

## 5. Masking:
   * Boolean indexing is applied row-wise: `data[data.column > value]`
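
Masks can also be combined and passed to `loc` to pick columns at the same time. The following is a sketch on hypothetical data; each comparison needs its own parentheses because `&` binds more tightly than `>` and `<`.

```python
import pandas as pd

# Hypothetical sample data for illustration
data = pd.DataFrame({"A": [1, 2, 3, 4, 5], "B": [10, 20, 30, 40, 50]})

# Element-wise AND of two Boolean Series
mask = (data["A"] > 2) & (data["B"] < 50)

selected = data[mask]              # rows where both conditions hold
selected_b = data.loc[mask, "B"]   # same rows, column B only

print(selected)
```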

## 6. Modifying Data:
   * Values can be assigned through any of the indexing methods above
   * Example: `data.iloc[0, 2] = 90` sets the value in the first row, third column
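
As a sketch, the same assignment pattern works by position, by label, and through a mask. The DataFrame and its labels below are hypothetical.

```python
import pandas as pd

# Hypothetical sample data for illustration
data = pd.DataFrame(
    {"A": [1, 2, 3], "B": [10, 20, 30], "C": [100, 200, 300]},
    index=["r1", "r2", "r3"],
)

data.iloc[0, 2] = 90                # by position: first row, third column
data.loc["r2", "B"] = 25            # by label
data.loc[data["A"] > 2, "C"] = 0    # by mask: column C where A > 2

print(data)
```

Assigning through a single `loc` or `iloc` call, rather than chaining two `[]` lookups, avoids writing to a temporary copy instead of the original DataFrame.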

## 7. Additional Conventions:
   * A single label inside square brackets `[]` selects a column: `data['column_name']`
   * A slice inside `[]` selects rows: `data[1:3]`
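
The two conventions can be checked side by side. This sketch uses a hypothetical DataFrame with row labels `x`, `y`, `z`, and shows that a row label used as a single key is not found, because a single key is looked up among the columns.

```python
import pandas as pd

# Hypothetical sample data for illustration
data = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]}, index=["x", "y", "z"])

column = data["A"]    # single key -> a column, returned as a Series
rows = data[0:2]      # slice -> rows, returned as a DataFrame

# A row label is not a valid single key
try:
    data["x"]
except KeyError:
    print("'x' is a row label, not a column name")
```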

In [17]:
import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'A': [1, 2, 3, 4, 5],
    'B': [10, 20, 30, 40, 50],
    'C': [100, 200, 300, 400, 500]
}, index=['row1', 'row2', 'row3', 'row4', 'row5'])

df


      A   B    C
row1  1  10  100
row2  2  20  200
row3  3  30  300
row4  4  40  400
row5  5  50  500


In [19]:
# Dictionary-style column access
print(df['A'])



row1    1
row2    2
row3    3
row4    4
row5    5
Name: A, dtype: int64


In [21]:
# Label-based indexing with loc
print(df.loc['row2':'row4', 'B':'C'])



       B    C
row2  20  200
row3  30  300
row4  40  400


In [23]:
# Integer-based indexing with iloc
print(df.iloc[1:4, 1:3])



       B    C
row2  20  200
row3  30  300
row4  40  400


In [25]:
# Masking
print(df[df['A'] > 3])



      A   B    C
row4  4  40  400
row5  5  50  500


In [27]:
# Modifying data
df.loc['row3', 'B'] = 35
print(df)

      A   B    C
row1  1  10  100
row2  2  20  200
row3  3  35  300
row4  4  40  400
row5  5  50  500
