# Lesson 3: Creating New Columns in Pandas

## Introduction
Welcome to our session on **creating new columns in Pandas**. Today, we'll learn how to add new columns to a DataFrame. This skill is essential for data cleaning and manipulation, because fields derived from existing data often make new insights visible.

By the end of this session, you'll know how to:
- Add new columns with static values.  
- Generate new columns through operations on existing columns.  
- Create new columns based on specific conditions.  

---

## Why Creating New Columns Is Important  
Creating new columns is crucial for data analysis. For instance, in a DataFrame with prices and quantities of goods sold, we may want to calculate total sales using the formula: **price × quantity**.  

```python
import pandas as pd

# Creating DataFrame: items, prices, quantities sold
df = pd.DataFrame({"Item": ["Apples", "Bananas", "Oranges"], 
                   "Price": [1.5, 0.5, 0.75], 
                   "Quantity": [10, 20, 30]})

# Create new column "Total" which is Price * Quantity 
df["Total"] = df["Price"] * df["Quantity"]
print(df)

# Output:
#       Item  Price  Quantity  Total
# 0   Apples   1.50        10   15.0
# 1  Bananas   0.50        20   10.0
# 2  Oranges   0.75        30   22.5
```
In this example, the new column **"Total"** is added using a simple operation on existing columns.  
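
As a complementary sketch, the same calculation can be written with `DataFrame.assign`, which returns a new DataFrame instead of modifying the original. The variable name `with_total` is only an illustrative choice, not something from the lesson.

```python
import pandas as pd

df = pd.DataFrame({"Item": ["Apples", "Bananas", "Oranges"],
                   "Price": [1.5, 0.5, 0.75],
                   "Quantity": [10, 20, 30]})

# assign() returns a new DataFrame with the extra column; df itself is unchanged
with_total = df.assign(Total=df["Price"] * df["Quantity"])

print(with_total["Total"].tolist())  # [15.0, 10.0, 22.5]
print("Total" in df.columns)         # False
```

This form can be handy when you want to keep the original DataFrame intact while experimenting with derived columns.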

---

## Adding New Columns with Static Values  
Adding a new column with a static value is straightforward. For example, if every item in our DataFrame is sold at the same store, you can add a **Location** column that holds the same city for every row.  

```python
# Add "Location" column with static value
df["Location"] = "New York"
print(df)

# Output:
#       Item  Price  Quantity  Total  Location
# 0   Apples   1.50        10   15.0  New York
# 1  Bananas   0.50        20   10.0  New York
# 2  Oranges   0.75        30   22.5  New York
```
The **"Location"** column is filled with the static value "New York" for all rows.  
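
To make the contrast concrete, here is a small sketch of a single value versus a list of per-row values. The `Aisle` and `Shelf` columns are hypothetical and exist only for this illustration.

```python
import pandas as pd

df = pd.DataFrame({"Item": ["Apples", "Bananas", "Oranges"]})

# A single value is repeated for every row
df["Location"] = "New York"

# A list supplies one value per row, so its length must match the row count
df["Aisle"] = [1, 1, 2]

try:
    df["Shelf"] = ["A", "B"]  # only 2 values for 3 rows
except ValueError as err:
    print("Length mismatch:", err)

print(df)
```

The static-value form is the safer choice whenever every row really should share the same entry.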

---

## New Columns Based on Conditions  
We can also create new columns based on conditions derived from existing columns. For example, given a DataFrame of student scores, we can create a column to flag whether the score is above 40.  

```python
import numpy as np

# DataFrame with student names and their scores
df = pd.DataFrame({"Student": ["Alice", "Bob", "Charlie"], 
                   "Score": [42, 37, 56]})

# Create new column "Status" that is "Pass" if Score > 40 else "Fail"
df["Status"] = np.where(df["Score"] > 40, "Pass", "Fail")
print(df)

# Output:
#    Student  Score Status
# 0    Alice     42   Pass
# 1      Bob     37   Fail
# 2  Charlie     56   Pass
```

### Explanation  
- **`np.where`** takes three arguments:  
  1. A condition (`df["Score"] > 40`).  
  2. A value if the condition is true (`"Pass"`).  
  3. A value if the condition is false (`"Fail"`).  
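
When there are more than two outcomes, one possible approach is `np.select`, which takes a list of conditions and a matching list of values. The sketch below is only an illustration: the `Grade` column, the "Merit" label, and the threshold of 50 are assumptions, not part of the lesson's data.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"Student": ["Alice", "Bob", "Charlie"],
                   "Score": [42, 37, 56]})

# Conditions are checked in order; the first one that matches wins
conditions = [df["Score"] > 50, df["Score"] > 40]
labels = ["Merit", "Pass"]  # hypothetical labels for illustration

# Rows that match no condition receive the default value
df["Grade"] = np.select(conditions, labels, default="Fail")
print(df["Grade"].tolist())  # ['Pass', 'Fail', 'Merit']
```

For a simple two-way split, `np.where` remains the shorter option.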

---

## Lesson Summary and Upcoming Practice  
In this session, we've covered:  
- Adding new columns with static values.  
- Creating columns by performing operations on existing columns.  
- Generating new columns based on conditions.  

The more you practice, the more proficient you'll become. Ready for the exercises? Let's put these techniques into action! 🚀

## Inventory Reorder Indicator

In the grocery store inventory, we use conditions to determine which items need reordering. We add a "Reorder" column that says "Yes" if an item's stock is less than 20, otherwise "No." This code shows how to do it using `np.where`. Ready to see it in action? Just hit Run!

```py
import pandas as pd
import numpy as np

# Grocery Store inventory with item names and their stock count
inventory = pd.DataFrame({"Item": ["Tomatoes", "Lettuce", "Potatoes"], "Stock": [25, 15, 40]})
# Create a new column "Reorder" where the value is "Yes" if Stock is less than 20, otherwise "No"
inventory["Reorder"] = np.where(inventory["Stock"] < 20, "Yes", "No")
print(inventory)
```

Running the code above produces the output below. It shows how `np.where` flags an item for reordering based on its stock level: only Lettuce, with a stock of 15, falls below the threshold of 20.

### Output:
```
       Item  Stock Reorder
0  Tomatoes     25      No
1   Lettuce     15     Yes
2  Potatoes     40      No
```

### Explanation:
- The `np.where` function checks the condition `inventory["Stock"] < 20`.  
- If the condition is true (stock < 20), the value in the "Reorder" column is set to **"Yes"**.  
- If false, the value is set to **"No"**.  

This is a practical example of how conditional logic can help automate inventory management!

## Inventory Calculation Correction

Brilliant progress, Space Voyager! Before we proceed further, there is a slight hiccup in our Grocery Store Inventory Management code: it's not computing the "Stock Value" properly. Can you spot the issue and adjust the sails accordingly?

```py
import pandas as pd

# DataFrame for fruits inventory with price per unit and available stock
inventory = pd.DataFrame({'Fruit': ['Apples', 'Bananas', 'Grapes'], 'Price per Unit': [0.50, 0.20, 2.00], 'Stock': [100, 200, 80]})
# Create new column "Stock Value" which is Price per Unit * Stock
inventory['Stock Value'] = inventory['Price per Unit'] * inventory['Amount']
print(inventory)
```

The issue with the code lies in the column name used in the calculation of the "Stock Value" column. The column name **`Amount`** does not exist in the `inventory` DataFrame. The correct column name should be **`Stock`**.

Here's the corrected code:

```python
import pandas as pd

# DataFrame for fruits inventory with price per unit and available stock
inventory = pd.DataFrame({
    'Fruit': ['Apples', 'Bananas', 'Grapes'],
    'Price per Unit': [0.50, 0.20, 2.00],
    'Stock': [100, 200, 80]
})

# Create new column "Stock Value" which is Price per Unit * Stock
inventory['Stock Value'] = inventory['Price per Unit'] * inventory['Stock']

print(inventory)
```

### Corrected Output:
```
     Fruit  Price per Unit  Stock  Stock Value
0   Apples             0.5    100         50.0
1  Bananas             0.2    200         40.0
2   Grapes             2.0     80        160.0
```

### Explanation of Fix:
- The code was trying to multiply **`Price per Unit`** by a non-existent column **`Amount`**, which raises a `KeyError`.  
- Replacing **`Amount`** with **`Stock`**, which holds the stock count of each fruit, resolves the issue.
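
A quick way to catch this kind of mistake is to look at the column names before using them. The following is a minimal sketch on the same inventory data; it only inspects the DataFrame and shows the error that the wrong name triggers.

```python
import pandas as pd

inventory = pd.DataFrame({
    "Fruit": ["Apples", "Bananas", "Grapes"],
    "Price per Unit": [0.50, 0.20, 2.00],
    "Stock": [100, 200, 80],
})

# List the available column names
print(inventory.columns.tolist())  # ['Fruit', 'Price per Unit', 'Stock']

# Selecting a column that does not exist raises a KeyError
try:
    inventory["Amount"]
except KeyError as err:
    print("Missing column:", err)
```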

## Adding Stock Order Column to Inventory DataFrame

Great progress, Space Voyager! Here's a challenge: The grocery store needs a new column in its inventory to indicate whether more stock should be ordered. Can you add it?

```py
import pandas as pd
import numpy as np

# DataFrame with items, in-stock quantity and minimum required stock
inventory = pd.DataFrame({
    "Item": ["Tomatoes", "Potatoes", "Onions"],
    "In_Stock": [50, 20, 15],
    "Min_Required": [30, 15, 25]
})
# TODO: Check if in-stock quantity is lower than the minimum required stock and set 'Order' column to True or False accordingly
inventory["Order"] = ___

print(inventory)
```

Here's the completed code, which adds the **"Order"** column to indicate whether more stock should be ordered:

```python
import pandas as pd
import numpy as np

# DataFrame with items, in-stock quantity, and minimum required stock
inventory = pd.DataFrame({
    "Item": ["Tomatoes", "Potatoes", "Onions"],
    "In_Stock": [50, 20, 15],
    "Min_Required": [30, 15, 25]
})

# Check if in-stock quantity is lower than the minimum required stock and set 'Order' column to True or False accordingly
inventory["Order"] = inventory["In_Stock"] < inventory["Min_Required"]

print(inventory)
```

### Output:
```
       Item  In_Stock  Min_Required  Order
0  Tomatoes        50            30  False
1  Potatoes        20            15  False
2    Onions        15            25   True
```

### Explanation:
- The condition **`inventory["In_Stock"] < inventory["Min_Required"]`** checks whether the in-stock quantity is less than the minimum required stock for each item.
- The result is stored in the new **"Order"** column, with **`True`** indicating that more stock needs to be ordered and **`False`** indicating no need for reordering.
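
Because the **"Order"** column holds Boolean values, it can also serve as a row filter. The sketch below shows one way to list only the items that need ordering; the variable name `to_order` is just an illustrative choice.

```python
import pandas as pd

inventory = pd.DataFrame({
    "Item": ["Tomatoes", "Potatoes", "Onions"],
    "In_Stock": [50, 20, 15],
    "Min_Required": [30, 15, 25],
})
inventory["Order"] = inventory["In_Stock"] < inventory["Min_Required"]

# Keep only the rows where "Order" is True
to_order = inventory[inventory["Order"]]
print(to_order["Item"].tolist())  # ['Onions']
```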

## Add a Store Location Column to Inventory Data

Excellent progress, Space Voyager! Now let's add a new detail to our inventory management system to keep track of the storage locations for our goods. Add a new column to the DataFrame; each entry should have the same value – "Downtown" – representing the store's location.
```py
import pandas as pd

# Predefined inventory data
inventory_df = pd.DataFrame({
    "Product": ["Tomatoes", "Potatoes", "Onions"],
    "InStock": [25, 30, 15]
})

# TODO: Add a new column 'StoreLocation' to inventory_df, setting all its values to the name of the store's location

print(inventory_df)
```
Here's the completed code to add a new column **`StoreLocation`** with a static value for all rows:

```python
import pandas as pd

# Predefined inventory data
inventory_df = pd.DataFrame({
    "Product": ["Tomatoes", "Potatoes", "Onions"],
    "InStock": [25, 30, 15]
})

# Add a new column 'StoreLocation' and set all its values to the store's location
inventory_df["StoreLocation"] = "Downtown"

print(inventory_df)
```

### Output:
```
    Product  InStock  StoreLocation
0  Tomatoes       25       Downtown
1  Potatoes       30       Downtown
2    Onions       15       Downtown
```

### Explanation:
- The new column **`StoreLocation`** is added with a single static value, **"Downtown"**, applied to all rows.
- This is useful for tagging inventory data with specific locations.

