# Assignment

## One Column

`values` can be a:
- scalar (e.g. `int`, `str`), which is broadcast to every row
- list (must match DataFrame length)
- `np.array([value1, value2, value3, ...])` (must match DataFrame length)
- `pd.Series([value1, value2, ...], index=[idx1, idx2, ...])` (aligned by index label, so the length need not match; rows whose label is missing from the Series get `NaN`)

`colname` can be:
- A new column (will be added to the existing `df`)
- An existing column (values will be overwritten in `df`)

```python
# Using direct assignment with a new column name
df['colname'] = values

# or

# Using the assign() method
df = df.assign(colname=values)
```
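The difference between positional values (list, array) and label-aligned values (`pd.Series`) is easiest to see with a Series that covers only some of the rows. The sketch below uses made-up column names and index labels:

```python
import pandas as pd

df = pd.DataFrame({"col1": [10, 20, 30]}, index=["a", "b", "c"])

# The Series covers only two of the three labels and adds an unknown one ("z").
partial = pd.Series([1.5, 2.5, 9.9], index=["c", "a", "z"])

df["from_series"] = partial            # aligned by label, not by position
df["from_array"] = partial.to_numpy()  # positional: length must equal len(df)

print(df)
```

Row `b` has no matching label in the Series, so `from_series` holds `NaN` there, and the extra label `z` is dropped. The array version ignores labels entirely and is written in order.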

### `assign`

- Immutability: `assign()` doesn't modify the original DataFrame; it returns a new one.
- Existing columns that are re-assigned are overwritten in the returned DataFrame.
- If a value is a callable, it is called with the DataFrame and its result is assigned.

```python
# Create new columns with values derived from existing columns
df = df.assign(new_col = lambda _df: _df['col1'] + _df['col2'])

# Create multiple new columns
df = df.assign(new_col1=lambda _df: _df['col1'] + _df['col2'],
               new_col2=lambda _df: _df['col1'] - _df['col2'])
```
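As a small illustration of the points above, the following sketch (column names are made up) checks that the original frame is left untouched. It also relies on the documented behavior that later keyword arguments may refer to columns created earlier in the same `assign()` call:

```python
import pandas as pd

df = pd.DataFrame({"col1": [1, 2], "col2": [10, 20]})

result = df.assign(
    total=lambda _df: _df["col1"] + _df["col2"],
    # "total" already exists on the intermediate frame at this point
    share=lambda _df: _df["col1"] / _df["total"],
)

print(list(df.columns))      # the original keeps only col1 and col2
print(list(result.columns))  # the returned frame also has total and share
```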

## One New Row

```python
# Add a row using loc with a new index label
df.loc[new_index] = new_row_values

# or

# Using from_dict with orient='index'
new_row = {new_index: new_row_values}
new_row_df = pd.DataFrame.from_dict(new_row, orient='index', columns=df.columns)
df = pd.concat([df, new_row_df])
```
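Whether `loc` adds a row or overwrites one depends only on whether the label already exists. A minimal sketch, with hypothetical labels `r1` to `r3`:

```python
import pandas as pd

df = pd.DataFrame({"name": ["Ann", "Ben"], "score": [1, 2]}, index=["r1", "r2"])

df.loc["r3"] = ["Cy", 3]    # "r3" is new: the frame grows by one row
df.loc["r1"] = ["Anna", 9]  # "r1" exists: the row is overwritten in place

print(df)
```

It can therefore be worth checking `new_index in df.index` before using `loc` when the intent is strictly to append.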


## Row Assignment

Assign values to a row using `loc` (label-based):

```python
# Assign a list/array to all columns in row with index 'idx'
df.loc['idx'] = [value1, value2, value3, ...]

# Assign a dictionary (column_name: value pairs)
df.loc['idx'] = {'col1': value1, 'col2': value2, ...}
```

Assign values to a row using `iloc` (position-based):

```python
# Assign a list/array to all columns in row at position 0
df.iloc[0] = [value1, value2, value3, ...]
```
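Unlike `loc`, `iloc` only assigns to positions that already exist; it does not append. The sketch below (made-up data) shows the overwrite succeeding and the out-of-range assignment raising an `IndexError`:

```python
import pandas as pd

df = pd.DataFrame({"name": ["Ann", "Ben"], "score": [1, 2]}, index=["r1", "r2"])

df.iloc[1] = ["Benny", 5]  # position 1 exists, so this overwrites row "r2"

error_message = None
try:
    df.iloc[len(df)] = ["Cy", 3]  # position 2 does not exist
except IndexError as exc:
    error_message = str(exc)
    print("IndexError:", error_message)
```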

## Cell Assignment

Assign a value to a specific cell using `loc` (label-based):

```python
df.loc['row_idx', 'col_name'] = value
```

Assign a value to a specific cell using `iloc` (position-based):

```python
df.iloc[row_pos, col_pos] = value
```

Assign using `at`/`iat` (faster for single cell access):

```python
df.at['row_idx', 'col_name'] = value  # Label-based
df.iat[row_pos, col_pos] = value      # Position-based
```

### Cell Assignment of a `pd.DataFrame`

```python
# Note: the one-element list must match len(df), so this only works if df has exactly one row
df['new_col_name'] = [df2]
```


## Subset Assignment

### Assign to a subset of rows and columns

```python
# Assign to multiple rows, one column
df.loc[['idx1', 'idx2'], 'col_name'] = [value1, value2]

# Assign to multiple rows, multiple columns
df.loc[['idx1', 'idx2'], ['col1', 'col2']] = [[val1, val2], [val3, val4]]

# Assign a single value to a subset (broadcasting)
df.loc[['idx1', 'idx2'], ['col1', 'col2']] = value
```

## Conditional Assignment

### Assign values based on a condition

```python
# Basic conditional assignment
df.loc[df['col1'] > threshold, 'col2'] = new_value

# Assignment with multiple conditions
df.loc[(df['col1'] > val1) & (df['col2'] < val2), 'col3'] = new_value
```
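Conditions can also be combined with `|` (or) and negated with `~` (not). Each comparison needs its own parentheses because `&` and `|` bind more tightly than `<` and `>` in Python. A short sketch with illustrative column names and values:

```python
import pandas as pd

df = pd.DataFrame({"col1": [1, 5, 9], "col2": [10, 20, 30], "col3": ["-", "-", "-"]})

low_or_high = (df["col1"] < 2) | (df["col1"] > 8)  # element-wise OR
df.loc[low_or_high, "col3"] = "edge"
df.loc[~low_or_high, "col3"] = "middle"            # element-wise NOT

print(df)
```

Storing the boolean mask in a variable keeps the `loc` call readable and lets the same mask be reused or inverted.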

### Using numpy.where for conditional assignment

```python
df['new_col'] = np.where(df['col'] > threshold, value_if_true, value_if_false)
```
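`np.where` covers two outcomes. For three or more, one option is `np.select`, which takes a list of conditions and a matching list of values. This is a sketch with made-up thresholds and labels, not something the original notes use:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"col": [5, 15, 25, 35]})

# Conditions are checked in order; the first one that is True wins.
conditions = [df["col"] < 10, df["col"] < 30]
choices = ["low", "medium"]
df["band"] = np.select(conditions, choices, default="high")

print(df)
```

Rows that match no condition receive the `default` value, here `"high"`.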

# Examples

```python
import pandas as pd
import numpy as np
```

```python
# Create a sample DataFrame
data = {
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Age': [25, 30, 35, 40],
    'Salary': [50000, 60000, 70000, 80000]
}
df = pd.DataFrame(data, index=['p1', 'p2', 'p3', 'p4'])
df
```

| | Name | Age | Salary |
|---|---|---|---|
| p1 | Alice | 25 | 50000 |
| p2 | Bob | 30 | 60000 |
| p3 | Charlie | 35 | 70000 |
| p4 | David | 40 | 80000 |


## Example: Column Assignment

```python
# Create a new column with a single value
df_copy = df.copy()
df_copy['Department'] = 'Engineering'
df_copy
```

| | Name | Age | Salary | Department |
|---|---|---|---|---|
| p1 | Alice | 25 | 50000 | Engineering |
| p2 | Bob | 30 | 60000 | Engineering |
| p3 | Charlie | 35 | 70000 | Engineering |
| p4 | David | 40 | 80000 | Engineering |


```python
# Assign a list to a column
df_copy = df.copy()
df_copy['Department'] = ['HR', 'Engineering', 'Marketing', 'Finance']
df_copy
```

| | Name | Age | Salary | Department |
|---|---|---|---|---|
| p1 | Alice | 25 | 50000 | HR |
| p2 | Bob | 30 | 60000 | Engineering |
| p3 | Charlie | 35 | 70000 | Marketing |
| p4 | David | 40 | 80000 | Finance |


## Example: Row Assignment

```python
# Assign values to a row using loc
df_copy = df.copy()
df_copy.loc['p2'] = ['Bob Smith', 32, 65000]
df_copy
```

| | Name | Age | Salary |
|---|---|---|---|
| p1 | Alice | 25 | 50000 |
| p2 | Bob Smith | 32 | 65000 |
| p3 | Charlie | 35 | 70000 |
| p4 | David | 40 | 80000 |


```python
# Assign values to a row using iloc
df_copy = df.copy()
df_copy.iloc[2] = ['Charlie Brown', 36, 75000]
df_copy
```

| | Name | Age | Salary |
|---|---|---|---|
| p1 | Alice | 25 | 50000 |
| p2 | Bob | 30 | 60000 |
| p3 | Charlie Brown | 36 | 75000 |
| p4 | David | 40 | 80000 |


## Example: Cell Assignment

```python
# Modify a single cell using loc
df_copy = df.copy()
df_copy.loc['p1', 'Salary'] = 55000
df_copy
```

| | Name | Age | Salary |
|---|---|---|---|
| p1 | Alice | 25 | 55000 |
| p2 | Bob | 30 | 60000 |
| p3 | Charlie | 35 | 70000 |
| p4 | David | 40 | 80000 |


```python
# Modify a single cell using at (faster for single cell access)
df_copy = df.copy()
df_copy.at['p3', 'Age'] = 37
df_copy
```

| | Name | Age | Salary |
|---|---|---|---|
| p1 | Alice | 25 | 50000 |
| p2 | Bob | 30 | 60000 |
| p3 | Charlie | 37 | 70000 |
| p4 | David | 40 | 80000 |


## Example: Subset Assignment

```python
# Assign to multiple rows in one column
df_copy = df.copy()
df_copy.loc[['p1', 'p2'], 'Age'] = [26, 31]
df_copy
```

| | Name | Age | Salary |
|---|---|---|---|
| p1 | Alice | 26 | 50000 |
| p2 | Bob | 31 | 60000 |
| p3 | Charlie | 35 | 70000 |
| p4 | David | 40 | 80000 |


```python
# Assign to a subset of rows and columns
df_copy = df.copy()
df_copy.loc[['p3', 'p4'], ['Age', 'Salary']] = [[38, 75000], [42, 85000]]
df_copy
```

| | Name | Age | Salary |
|---|---|---|---|
| p1 | Alice | 25 | 50000 |
| p2 | Bob | 30 | 60000 |
| p3 | Charlie | 38 | 75000 |
| p4 | David | 42 | 85000 |


## Example: Conditional Assignment

```python
# Assign values based on a condition
df_copy = df.copy()
df_copy.loc[df_copy['Age'] > 30, 'Salary'] = 100000
df_copy
```

| | Name | Age | Salary |
|---|---|---|---|
| p1 | Alice | 25 | 50000 |
| p2 | Bob | 30 | 60000 |
| p3 | Charlie | 35 | 100000 |
| p4 | David | 40 | 100000 |


```python
# Using numpy.where for conditional assignment
df_copy = df.copy()
df_copy['SeniorityLevel'] = np.where(df_copy['Age'] >= 35, 'Senior', 'Junior')
df_copy
```

| | Name | Age | Salary | SeniorityLevel |
|---|---|---|---|---|
| p1 | Alice | 25 | 50000 | Junior |
| p2 | Bob | 30 | 60000 | Junior |
| p3 | Charlie | 35 | 70000 | Senior |
| p4 | David | 40 | 80000 | Senior |


## Example: Function-based Assignment

```python
# Apply a function to a column
df_copy = df.copy()
df_copy['Name'] = df_copy['Name'].apply(lambda x: x.upper())
df_copy
```

| | Name | Age | Salary |
|---|---|---|---|
| p1 | ALICE | 25 | 50000 |
| p2 | BOB | 30 | 60000 |
| p3 | CHARLIE | 35 | 70000 |
| p4 | DAVID | 40 | 80000 |


```python
# Create a new column using assign
df_copy = df.copy()
df_copy = df_copy.assign(Bonus=lambda x: x['Salary'] * 0.1)
df_copy
```

| | Name | Age | Salary | Bonus |
|---|---|---|---|---|
| p1 | Alice | 25 | 50000 | 5000.0 |
| p2 | Bob | 30 | 60000 | 6000.0 |
| p3 | Charlie | 35 | 70000 | 7000.0 |
| p4 | David | 40 | 80000 | 8000.0 |


## Example: Different ways of Column Assignment

```python
# 1. Using direct assignment with a scalar value
df_copy = df.copy()
df_copy['Experience'] = 5  # All rows get the same value
df_copy
```

| | Name | Age | Salary | Experience |
|---|---|---|---|---|
| p1 | Alice | 25 | 50000 | 5 |
| p2 | Bob | 30 | 60000 | 5 |
| p3 | Charlie | 35 | 70000 | 5 |
| p4 | David | 40 | 80000 | 5 |


```python
# 2. Assignment with a list (must match DataFrame length)
df_copy = df.copy()
df_copy['Experience'] = [3, 7, 10, 15]  # Each row gets a different value
df_copy
```

| | Name | Age | Salary | Experience |
|---|---|---|---|---|
| p1 | Alice | 25 | 50000 | 3 |
| p2 | Bob | 30 | 60000 | 7 |
| p3 | Charlie | 35 | 70000 | 10 |
| p4 | David | 40 | 80000 | 15 |


```python
# 3. Assignment with a numpy array
df_copy = df.copy()
df_copy['Experience'] = np.array([3, 7, 10, 15])
df_copy
```

| | Name | Age | Salary | Experience |
|---|---|---|---|---|
| p1 | Alice | 25 | 50000 | 3 |
| p2 | Bob | 30 | 60000 | 7 |
| p3 | Charlie | 35 | 70000 | 10 |
| p4 | David | 40 | 80000 | 15 |


```python
# 4. Assignment with a pandas Series
df_copy = df.copy()
experience_series = pd.Series([5, 8, 12, 20], index=['p1', 'p2', 'p3', 'p4'])
df_copy['Experience'] = experience_series

# With different index order - Series will align by index
mixed_series = pd.Series([25, 15, 5, 10], index=['p4', 'p3', 'p1', 'p2'])
df_copy['MixedOrder'] = mixed_series
df_copy
```

| | Name | Age | Salary | Experience | MixedOrder |
|---|---|---|---|---|---|
| p1 | Alice | 25 | 50000 | 5 | 5 |
| p2 | Bob | 30 | 60000 | 8 | 10 |
| p3 | Charlie | 35 | 70000 | 12 | 15 |
| p4 | David | 40 | 80000 | 20 | 25 |


```python
# 5. Using assign() method
df_copy = df.copy()
df_copy = df_copy.assign(Experience=5)
df_copy
```

| | Name | Age | Salary | Experience |
|---|---|---|---|---|
| p1 | Alice | 25 | 50000 | 5 |
| p2 | Bob | 30 | 60000 | 5 |
| p3 | Charlie | 35 | 70000 | 5 |
| p4 | David | 40 | 80000 | 5 |


```python
# 6. Using assign() with lambda function
df_copy = df.copy()
df_copy = df_copy.assign(YearlyRaise=lambda x: x['Salary'] * 0.05,
                         AdjustedSalary=lambda x: x['Salary'] + (x['Salary'] * 0.05))
df_copy
```

| | Name | Age | Salary | YearlyRaise | AdjustedSalary |
|---|---|---|---|---|---|
| p1 | Alice | 25 | 50000 | 2500.0 | 52500.0 |
| p2 | Bob | 30 | 60000 | 3000.0 | 63000.0 |
| p3 | Charlie | 35 | 70000 | 3500.0 | 73500.0 |
| p4 | David | 40 | 80000 | 4000.0 | 84000.0 |


## Example: Adding One New Row

```python
# 1. Add a row using loc with a new index label
df_copy = df.copy()
df_copy.loc['p5'] = ['Eve', 28, 65000]
df_copy
```

| | Name | Age | Salary |
|---|---|---|---|
| p1 | Alice | 25 | 50000 |
| p2 | Bob | 30 | 60000 |
| p3 | Charlie | 35 | 70000 |
| p4 | David | 40 | 80000 |
| p5 | Eve | 28 | 65000 |


```python
# 2. Using from_dict with orient='index'
df_copy = df.copy()
new_row = {'p5': ['Eve', 28, 65000]}
new_row_df = pd.DataFrame.from_dict(new_row, orient='index', columns=df_copy.columns)
df_copy = pd.concat([df_copy, new_row_df])
df_copy
```

| | Name | Age | Salary |
|---|---|---|---|
| p1 | Alice | 25 | 50000 |
| p2 | Bob | 30 | 60000 |
| p3 | Charlie | 35 | 70000 |
| p4 | David | 40 | 80000 |
| p5 | Eve | 28 | 65000 |


## Example: Row Assignment with Dictionary

```python
# Assign a dictionary to a row using loc
df_copy = df.copy()
df_copy.loc['p2'] = {'Name': 'Bob Smith', 'Age': 32, 'Salary': 65000}
df_copy
```

| | Name | Age | Salary |
|---|---|---|---|
| p1 | Alice | 25 | 50000 |
| p2 | Bob Smith | 32 | 65000 |
| p3 | Charlie | 35 | 70000 |
| p4 | David | 40 | 80000 |


## Example: Cell Assignment with iloc and iat

```python
# Modify a single cell using iloc (position-based)
df_copy = df.copy()
df_copy.iloc[0, 1] = 26  # Change Age of first row
df_copy
```

| | Name | Age | Salary |
|---|---|---|---|
| p1 | Alice | 26 | 50000 |
| p2 | Bob | 30 | 60000 |
| p3 | Charlie | 35 | 70000 |
| p4 | David | 40 | 80000 |


```python
# Modify a single cell using iat (faster for single cell access)
df_copy = df.copy()
df_copy.iat[2, 1] = 36  # Change Age of third row
df_copy
```

| | Name | Age | Salary |
|---|---|---|---|
| p1 | Alice | 25 | 50000 |
| p2 | Bob | 30 | 60000 |
| p3 | Charlie | 36 | 70000 |
| p4 | David | 40 | 80000 |


## Example: Cell Assignment of a pandas DataFrame

```python
# Create a single-row DataFrame
df_details = pd.DataFrame({
    'Department': ['Engineering'],
    'Position': ['Manager'],
    'StartDate': ['2020-01-15']
})

# Store an object in every cell of a new column.
# Note: df_details.iloc[0] is a Series (the first row of df_details), not a DataFrame,
# so each cell below holds that Series.
df_copy = df.copy()
df_copy['Details'] = [df_details.iloc[0]] * len(df_copy)  # Repeat for each row
df_copy
```

| | Name | Age | Salary | Details |
|---|---|---|---|---|
| p1 | Alice | 25 | 50000 | Department Engineering Position Ma... |
| p2 | Bob | 30 | 60000 | Department Engineering Position Ma... |
| p3 | Charlie | 35 | 70000 | Department Engineering Position Ma... |
| p4 | David | 40 | 80000 | Department Engineering Position Ma... |


## Example: Subset Assignment with Broadcasting

```python
# Assign a single value to a subset (broadcasting)
df_copy = df.copy()
df_copy.loc[['p1', 'p2'], ['Age', 'Salary']] = 0
df_copy
```

| | Name | Age | Salary |
|---|---|---|---|
| p1 | Alice | 0 | 0 |
| p2 | Bob | 0 | 0 |
| p3 | Charlie | 35 | 70000 |
| p4 | David | 40 | 80000 |


## Example: Conditional Assignment with Multiple Conditions

```python
# Assignment with multiple conditions
df_copy = df.copy()
df_copy['ExperienceLevel'] = 'Mid-level'  # Default value

# Apply multiple conditions
df_copy.loc[(df_copy['Age'] < 30) & (df_copy['Salary'] < 60000), 'ExperienceLevel'] = 'Junior'
df_copy.loc[(df_copy['Age'] >= 35) & (df_copy['Salary'] >= 70000), 'ExperienceLevel'] = 'Senior'

df_copy
```

| | Name | Age | Salary | ExperienceLevel |
|---|---|---|---|---|
| p1 | Alice | 25 | 50000 | Junior |
| p2 | Bob | 30 | 60000 | Mid-level |
| p3 | Charlie | 35 | 70000 | Senior |
| p4 | David | 40 | 80000 | Senior |
