# **Lesson: Introduction to Dictionaries in Python**

## **What is a Dictionary?**
A **dictionary** in Python is a collection of key-value pairs. It is a **mutable** data structure that allows **fast lookups** using keys, and since Python 3.7 it preserves the order in which keys were inserted.

### **Creating a Dictionary**



In [2]:
# Example: A simple dictionary storing account balances
bank_accounts = {
    "Alice": 1500,
    "Bob": 2300,
    "Charlie": 1800
}
print(bank_accounts)

{'Alice': 1500, 'Bob': 2300, 'Charlie': 1800}
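
Curly-brace literals are not the only way to build a dictionary. As a small sketch (the lists `names` and `balances` and the three result names are illustrative, not part of the lesson's data), the same mapping can also come from `dict()` with `zip()`, from keyword arguments, or from a dictionary comprehension:

```python
names = ["Alice", "Bob", "Charlie"]
balances = [1500, 2300, 1800]

# Pair each name with the balance at the same position
from_pairs = dict(zip(names, balances))

# Keyword arguments work when the keys are valid Python identifiers
from_kwargs = dict(Alice=1500, Bob=2300, Charlie=1800)

# A comprehension builds the dictionary one key-value pair at a time
from_comprehension = {name: balance for name, balance in zip(names, balances)}

print(from_pairs == from_kwargs == from_comprehension)  # True
```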


### **Accessing Values**

In [3]:
print(bank_accounts["Alice"])  # Output: 1500


1500
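
Square-bracket access raises a `KeyError` when the key is missing. A minimal sketch of the two usual ways to handle that, using a hypothetical customer `"Zoe"` who has no account:

```python
bank_accounts = {"Alice": 1500, "Bob": 2300, "Charlie": 1800}

# .get() returns None, or a default you choose, instead of raising KeyError
missing_balance = bank_accounts.get("Zoe")
default_balance = bank_accounts.get("Zoe", 0)
print(missing_balance, default_balance)  # None 0

# Square brackets raise KeyError for a missing key, which can be caught
try:
    bank_accounts["Zoe"]
except KeyError as err:
    print(f"No account for {err}")  # No account for 'Zoe'
```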


### **Adding & Updating Entries**

In [4]:
bank_accounts["David"] = 2000  # Adding a new key-value pair
bank_accounts["Alice"] += 500  # Updating Alice's balance
print(bank_accounts)


{'Alice': 2000, 'Bob': 2300, 'Charlie': 1800, 'David': 2000}


### **Removing Entries**

In [5]:
del bank_accounts["Charlie"]  # Removing Charlie's account
print(bank_accounts)


{'Alice': 2000, 'Bob': 2300, 'David': 2000}
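
`del` discards the value and raises `KeyError` if the key is absent. When the removed value is still needed, or the key may not exist, `pop()` is an alternative. A short sketch (the key `"Zoe"` is a hypothetical missing customer):

```python
bank_accounts = {"Alice": 2000, "Bob": 2300, "David": 2000}

# pop() removes the key and returns its value
closed_balance = bank_accounts.pop("David")

# With a default, pop() does not raise KeyError for a missing key
missing_balance = bank_accounts.pop("Zoe", None)

print(closed_balance, missing_balance)  # 2000 None
print(bank_accounts)                    # {'Alice': 2000, 'Bob': 2300}
```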


### **Checking if a Key Exists**

In [6]:
if "Bob" in bank_accounts:
    print("Bob has an account.")


Bob has an account.


### **Looping Through a Dictionary**

In [7]:
for name, balance in bank_accounts.items():
    print(f"{name} has ${balance} in their account.")


Alice has $2000 in their account.
Bob has $2300 in their account.
David has $2000 in their account.
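
Besides `.items()`, a dictionary exposes `.keys()` and `.values()`, and iterating over the dictionary itself yields its keys. The sketch below uses those views for a few common summaries; the variable names are illustrative:

```python
bank_accounts = {"Alice": 2000, "Bob": 2300, "David": 2000}

customer_names = list(bank_accounts.keys())    # keys only
total_deposits = sum(bank_accounts.values())   # values only

# max() iterates over the keys; key=bank_accounts.get compares them by balance
richest = max(bank_accounts, key=bank_accounts.get)

print(customer_names)   # ['Alice', 'Bob', 'David']
print(total_deposits)   # 6300
print(richest)          # Bob
```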


### Practice Questions

Question 1: Find High Balance Accounts

You have a dictionary of customer balances. Find and return all customers with balances above $2,000.

In [8]:
# input
accounts = {
    "Alice": 1500,
    "Bob": 2300,
    "Charlie": 1800,
    "David": 2500
}


Expected Output:

["Bob", "David"]

Question 2: Update Credit Card Limits

A bank wants to increase the credit limit for all accounts by 10%. Update the dictionary in-place.

In [9]:
# input
credit_limits = {
    "Alice": 5000,
    "Bob": 7000,
    "Charlie": 6000
}


Expected Output:

{
    "Alice": 5500,
    "Bob": 7700,
    "Charlie": 6600
}

Answers

In [10]:
# Answer 1: Find High Balance Accounts
high_balance_accounts = [name for name, balance in accounts.items() if balance > 2000]
print(high_balance_accounts)

# Answer 2: Increase Credit Limits
for name in credit_limits:
    # round() keeps the limits as whole dollars and removes floating-point noise
    credit_limits[name] = round(credit_limits[name] * 1.1)  # Increase by 10%
print(credit_limits)


['Bob', 'David']
{'Alice': 5500, 'Bob': 7700, 'Charlie': 6600}
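
Multiplying by `1.1` uses binary floating point, which is how a value such as `7700.000000000001` can arise from `7000 * 1.1`. One way to keep whole-dollar limits exact, sketched here on the assumption that the limits are integers and that rounding down is acceptable, is to stay in integer arithmetic:

```python
credit_limits = {"Alice": 5000, "Bob": 7000, "Charlie": 6000}

for name in credit_limits:
    # Multiply by 110 and floor-divide by 100: a 10% increase without floats
    credit_limits[name] = credit_limits[name] * 110 // 100

print(credit_limits)  # {'Alice': 5500, 'Bob': 7700, 'Charlie': 6600}
```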


## **Working with Nested Dictionaries**
Dictionaries can contain other dictionaries, allowing us to structure data more effectively.




In [12]:
# Example: Storing customer transaction history
transactions = {
    "Alice": {"balance": 1500, "last_transaction": -200, "is_premium": True},
    "Bob": {"balance": 2300, "last_transaction": 500, "is_premium": False},
    "Charlie": {"balance": 1800, "last_transaction": -100, "is_premium": True}
}

print(transactions["Alice"]["balance"])  # Accessing Alice's balance

1500
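
Chained square brackets fail with `KeyError` as soon as any level is missing. A minimal sketch of a more forgiving lookup that chains `.get()` with an empty dictionary as the fallback (the customer `"Zoe"` is hypothetical):

```python
transactions = {
    "Alice": {"balance": 1500, "last_transaction": -200, "is_premium": True},
}

# The inner .get() runs on Alice's record
alice_balance = transactions.get("Alice", {}).get("balance")

# For an unknown customer the outer .get() returns {}, so the inner default applies
zoe_balance = transactions.get("Zoe", {}).get("balance", 0)

print(alice_balance, zoe_balance)  # 1500 0
```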


### **Updating Nested Dictionary Values**

In [None]:
transactions["Alice"]["balance"] += 300  # Alice deposited $300
print(transactions["Alice"]["balance"])


### **Combining Dictionaries**

In [13]:
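# Example: Merging customer credit scores
credit_scores = {"Alice": 750, "Bob": 680}
# transactions.update(credit_scores) would replace each customer's whole record
# with a bare score, so add the score to the nested dictionary instead
for name, score in credit_scores.items():
    transactions[name]["credit_score"] = score
print(transactions["Bob"])


{'balance': 2300, 'last_transaction': 500, 'is_premium': False, 'credit_score': 680}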
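
For flat dictionaries, `update()`, the `|` operator (Python 3.9+), and `{**a, **b}` unpacking all let the right-hand value win when a key appears on both sides. A small sketch with hypothetical `old_scores` and `new_scores`:

```python
old_scores = {"Alice": 720, "Bob": 680}
new_scores = {"Alice": 750, "Charlie": 700}

# | returns a new dictionary and leaves both inputs unchanged (Python 3.9+)
merged = old_scores | new_scores

# Unpacking gives the same result on older Python 3 versions
merged_compat = {**old_scores, **new_scores}

print(merged)      # {'Alice': 750, 'Bob': 680, 'Charlie': 700}
print(old_scores)  # {'Alice': 720, 'Bob': 680}
```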


### **Sorting a Dictionary by Values**

In [15]:
transactions = {
    "Alice": {"balance": 1500, "last_transaction": -200, "is_premium": True},
    "Bob": {"balance": 2300, "last_transaction": 500, "is_premium": False},
    "Charlie": {"balance": 1800, "last_transaction": -100, "is_premium": True}
}

In [18]:
# Sorting accounts by balance
sorted_accounts = sorted(transactions.items(), key=lambda x: x[1]["balance"], reverse=True)
sorted_accounts

[('Bob', {'balance': 2300, 'last_transaction': 500, 'is_premium': False}),
 ('Charlie', {'balance': 1800, 'last_transaction': -100, 'is_premium': True}),
 ('Alice', {'balance': 1500, 'last_transaction': -200, 'is_premium': True})]
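
`sorted()` returns a list of `(key, value)` tuples rather than a dictionary. Passing that list to `dict()` gives a dictionary in the sorted order, because dictionaries keep insertion order in Python 3.7+. A sketch on a flat dictionary of hypothetical balances:

```python
balances = {"Alice": 1500, "Bob": 2300, "Charlie": 1800}

# Sort the (name, balance) pairs by balance, highest first, then rebuild a dict
ranked = dict(sorted(balances.items(), key=lambda item: item[1], reverse=True))

# Iterating over a dict yields its keys in insertion order
top_two = list(ranked)[:2]

print(ranked)   # {'Bob': 2300, 'Charlie': 1800, 'Alice': 1500}
print(top_two)  # ['Bob', 'Charlie']
```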

### Practice Questions

Question 3: Find VIP Customers

A VIP customer is one whose balance is above $2,000 and has a premium account. Identify all VIP customers.

In [19]:
# Input:
accounts = {
    "Alice": {"balance": 1500, "is_premium": True},
    "Bob": {"balance": 2300, "is_premium": False},
    "Charlie": {"balance": 1800, "is_premium": True},
    "David": {"balance": 2600, "is_premium": True}
}


Expected Output:

["David"]

Question 4: Identify Unprofitable Customers

A customer is unprofitable if they have fewer than 5 transactions in a year. Identify them.

In [20]:
# Input:
transactions = {
    "Alice": {"transactions": 10, "balance": 1500},
    "Bob": {"transactions": 4, "balance": 2300},
    "Charlie": {"transactions": 3, "balance": 1800},
    "David": {"transactions": 8, "balance": 2600}
}


Expected Output:

["Bob", "Charlie"]

Answers

In [21]:
# Answer 3: Find VIP Customers
vip_customers = [name for name, data in accounts.items() if data["balance"] > 2000 and data["is_premium"]]
print(vip_customers)

# Answer 4: Identify Unprofitable Customers
unprofitable_customers = [name for name, data in transactions.items() if data["transactions"] < 5]
print(unprofitable_customers)


['David']
['Bob', 'Charlie']


## Advanced Dictionary Manipulation in Python


### Scenario 1: Fraud Detection Based on Transaction Patterns

We will analyze transactions to identify potential fraud cases.

In [23]:
# Example: Identifying suspicious transactions
transactions = {
    "Alice": [100, 200, -5000, 50],  # Large negative transaction (-5000)
    "Bob": [50, 60, 70, 90],  # No fraud
    "Charlie": [-3000, 100, -100, 500]  # Large withdrawals (-3000)
}

# Flag accounts with any transaction greater than $4000 or any withdrawal over $2500
suspicious_accounts = {
    name: trans
    for name, trans in transactions.items()
    if any(t > 4000 or t < -2500 for t in trans)
}

suspicious_accounts


{'Alice': [100, 200, -5000, 50], 'Charlie': [-3000, 100, -100, 500]}

💡 Takeaway: We use dictionaries + lists + loops to detect fraud patterns.
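
A reviewer usually wants to know which transactions triggered a flag, not only which account. The sketch below applies the two thresholds stated in the scenario ($4000 for any transaction, $2500 for withdrawals) and keeps just the offending amounts; the constant names and the `flagged` dictionary are illustrative:

```python
transactions = {
    "Alice": [100, 200, -5000, 50],
    "Bob": [50, 60, 70, 90],
    "Charlie": [-3000, 100, -100, 500],
}

LARGE_AMOUNT = 4000      # threshold for any single transaction
LARGE_WITHDRAWAL = 2500  # threshold for withdrawals (negative amounts)

flagged = {}
for name, amounts in transactions.items():
    # Collect only the amounts that break one of the two thresholds
    hits = [t for t in amounts if t > LARGE_AMOUNT or t < -LARGE_WITHDRAWAL]
    if hits:
        flagged[name] = hits

print(flagged)  # {'Alice': [-5000], 'Charlie': [-3000]}
```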

### Scenario 2: Credit Limit Adjustment Based on Spending Behavior

Banks adjust credit limits based on spending consistency.

In [25]:
customers = {
    "Alice": {"monthly_spending": [1000, 1200, 1100, 1050], "credit_limit": 5000},
    "Bob": {"monthly_spending": [300, 400, 450, 500], "credit_limit": 2000},
    "Charlie": {"monthly_spending": [2000, 2500, 2400, 2600], "credit_limit": 7000}
}

# Increase credit limit for customers who consistently spend above 80% of their current limit
for name, data in customers.items():
    avg_spending = sum(data["monthly_spending"]) / len(data["monthly_spending"])
    if avg_spending > 0.8 * data["credit_limit"]:
        data["credit_limit"] *= 1.2  # Increase by 20%

customers


{'Alice': {'monthly_spending': [1000, 1200, 1100, 1050], 'credit_limit': 5000},
 'Bob': {'monthly_spending': [300, 400, 450, 500], 'credit_limit': 2000},
 'Charlie': {'monthly_spending': [2000, 2500, 2400, 2600],
  'credit_limit': 7000}}
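
In the data above, no customer's average spending exceeds 80% of their limit, so every limit is unchanged. To see the rule fire, here is a sketch with one hypothetical customer, `"Dana"`, who spends close to her limit; `round()` is used only to keep the new limit a whole number:

```python
customers = {
    "Alice": {"monthly_spending": [1000, 1200, 1100, 1050], "credit_limit": 5000},
    # Hypothetical customer: average spending 1800 against a limit of 2000
    "Dana": {"monthly_spending": [1700, 1800, 1900, 1800], "credit_limit": 2000},
}

for name, data in customers.items():
    avg_spending = sum(data["monthly_spending"]) / len(data["monthly_spending"])
    if avg_spending > 0.8 * data["credit_limit"]:
        data["credit_limit"] = round(data["credit_limit"] * 1.2)  # Increase by 20%

new_limits = {name: data["credit_limit"] for name, data in customers.items()}
print(new_limits)  # {'Alice': 5000, 'Dana': 2400}
```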

### Scenario 3: Analyzing Top Spenders Across Categories

Businesses track spending across categories like food, travel, shopping.

In [28]:
spending_data = {
    "Alice": {"food": 1200, "travel": 1500, "shopping": 1800},
    "Bob": {"food": 800, "travel": 600, "shopping": 1200},
    "Charlie": {"food": 2000, "travel": 2500, "shopping": 2800}
}

# Find the category where each person spends the most
top_categories = {name: max(data, key=data.get) for name, data in spending_data.items()}

top_categories


{'Alice': 'shopping', 'Bob': 'shopping', 'Charlie': 'shopping'}
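
The same `max(..., key=...)` pattern also works one level up. As a sketch, summing each customer's categories gives a flat dictionary of totals, from which the overall top spender follows; `totals` and `top_spender` are illustrative names:

```python
spending_data = {
    "Alice": {"food": 1200, "travel": 1500, "shopping": 1800},
    "Bob": {"food": 800, "travel": 600, "shopping": 1200},
    "Charlie": {"food": 2000, "travel": 2500, "shopping": 2800},
}

# Total spending per customer across all categories
totals = {name: sum(data.values()) for name, data in spending_data.items()}

# The customer with the highest total
top_spender = max(totals, key=totals.get)

print(totals)       # {'Alice': 4500, 'Bob': 2600, 'Charlie': 7300}
print(top_spender)  # Charlie
```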

### Practice Questions

Question 5: Find High-Risk Customers
A high-risk customer:

*   Has more than 3 withdrawals above $2000

*   OR has a balance below $1000

Identify all high-risk customers.




In [29]:
# Input:
accounts = {
    "Alice": {"transactions": [-5000, -2500, -1000, -200], "balance": 900},
    "Bob": {"transactions": [-100, -50, -30, -70], "balance": 5000},
    "Charlie": {"transactions": [-3000, -2500, -2700, -100], "balance": 1500}
}


Expected Output:

["Alice"]

Question 6: Categorizing Customers Based on Spending
Customers are classified as:



*   "Luxury Spender" if their highest spending category is travel.

*   "Budget Conscious" if their highest spending category is food.

*   "Shopper" if their highest spending category is shopping.

Classify each customer.

In [31]:
# Input:
spending_data = {
    "Alice": {"food": 1200, "travel": 1500, "shopping": 1800},
    "Bob": {"food": 1800, "travel": 1600, "shopping": 1500},
    "Charlie": {"food": 2500, "travel": 2200, "shopping": 2600}
}


Expected Output:

{"Alice": "Shopper", "Bob": "Budget Conscious", "Charlie": "Shopper"}

Answers:

In [32]:
# Answer 5: Finding High-Risk Customers
high_risk_customers = [
    name for name, data in accounts.items()
    if len([t for t in data["transactions"] if t < -2000]) > 3 or data["balance"] < 1000
]
print(high_risk_customers)

# Answer 6: Categorizing Customers Based on Spending
categories = {
    name: "Luxury Spender" if max(data, key=data.get) == "travel"
    else "Budget Conscious" if max(data, key=data.get) == "food"
    else "Shopper"
    for name, data in spending_data.items()
}
print(categories)


['Alice']
{'Alice': 'Shopper', 'Bob': 'Budget Conscious', 'Charlie': 'Shopper'}


In [39]:
import pandas as pd
import numpy as np

# Creating the first dataset: Fraudulent Transactions Detection
np.random.seed(42)
num_rows = 100

fraud_data = pd.DataFrame({
    "transaction_id": range(1, num_rows + 1),
    "customer_id": np.random.randint(1000, 2000, num_rows),
    "transaction_amount": np.random.randint(50, 5000, num_rows),
    "merchant_category": np.random.choice(["Electronics", "Grocery", "Luxury", "Travel", "Fast Food"], num_rows),
    "location": np.random.choice(["New York", "Los Angeles", "Chicago", "Houston", "Miami"], num_rows),
    "time_of_transaction": np.random.choice(["Morning", "Afternoon", "Evening", "Night"], num_rows),
    "previous_fraud_reports": np.random.randint(0, 5, num_rows)
})

# Creating the second dataset: Credit Card Default Risk
credit_data = pd.DataFrame({
    "customer_id": range(2001, 2001 + num_rows),
    "credit_limit": np.random.randint(5000, 25000, num_rows),
    "avg_utilization": np.random.uniform(30, 95, num_rows).round(2),
    "late_payments_6m": np.random.randint(0, 6, num_rows),
    "recent_large_transaction": np.random.randint(1000, 15000, num_rows)
})

# Saving to CSV files
fraud_data.to_csv("fraud_transactions.csv", index=False)
credit_data.to_csv("credit_risk.csv", index=False)




## **Risk Management in Credit Card Companies**

For the final part of this lesson, we’ll focus on **real-world risk management scenarios** in a **credit card company**.  
These two challenges require **DataFrames, lists, dictionaries, tuples, and boolean logic** while leveraging **Pandas and NumPy**.

---

### **Question 1: Identifying Customers with Unusual Spending Behavior**
A credit card company wants to flag customers with **suspicious spending patterns** based on the following conditions:  

🔴 **High Risk** (either condition):  
- Spends more than **90%** of their credit limit in a single transaction.  
- Has at least **two** transactions on the same day at different locations (potential fraud).  

🟡 **Medium Risk** (either condition, and not High Risk):  
- Spends more than **70%** but no more than **90%** of their credit limit in a single transaction.  
- Has at least **one** transaction at a foreign location.  

🟢 **Low Risk**:  
- Otherwise, they are considered **low risk**.  

### **Dataset Sample (`transactions.csv`):**  
| customer_id | date       | amount | location  | credit_limit |
|------------|------------|--------|------------|--------------|
| 101        | 2024-02-01 | 5000   | "New York" | 6000         |
| 101        | 2024-02-01 | 100    | "Los Angeles" | 6000         |
| 102        | 2024-02-02 | 2000   | "Paris"    | 5000         |
| 103        | 2024-02-03 | 1500   | "Chicago"  | 4000         |

### **Expected Output:**
```python
{"101": "High Risk", "102": "Medium Risk", "103": "Low Risk"}
```


Hint:


*   Use Pandas to merge the files
*   Use Pandas to group transactions by customer_id and date.
*   Use a dictionary to store each customer’s risk level.
*   Use tuples to track unique locations per day.
*   Use boolean logic to classify risk levels.










### **Question 2: Predicting Customers Likely to Default on Payments**

A credit card company wants to **predict customers at risk of missing payments** based on the following criteria:  

🔹 **Factors Considered:**  
- **Payment History**: If a customer has more than **2 late payments** in the last **6 months**, they are at risk.  
- **Credit Utilization**: If they use **more than 80%** of their credit limit on average.  
- **Recent Large Transactions**: If they made a transaction over **50%** of their limit in the last month.  

🛑 **High Default Risk** = Meets **all** conditions.  
⚠️ **Medium Default Risk** = Meets **two** conditions.  
✅ **Low Default Risk** = Meets **one or none**.  

### **Dataset Sample (`credit_data.csv`):**  
| customer_id | credit_limit | avg_utilization | late_payments_6m | recent_large_transaction |
|------------|--------------|----------------|------------------|--------------------------|
| 201        | 10000        | 85%            | 3                | 7000                     |
| 202        | 8000         | 75%            | 2                | 1000                     |
| 203        | 5000         | 50%            | 1                | 2000                     |

### **Expected Output:**
```python
{"201": "High Default Risk", "202": "Low Default Risk", "203": "Low Default Risk"}
```


Hint:
*   Use a dictionary to store customers and their risk levels.
*   Convert percentages to floats for comparisons.
*   Use logical conditions to determine risk classification.
*   Use NumPy or Pandas to work efficiently with data.







Answers:

Answer 1: Identifying Customers with Unusual Spending Behavior

In [33]:
import pandas as pd

# Load data. This assumes transactions.csv has the columns shown in the
# Dataset Sample above, including credit_limit, so no merge is needed.
df = pd.read_csv("transactions.csv")

# Classify each customer from their own transactions
risk_levels = {}

for customer in df["customer_id"].unique():
    cust_df = df[df["customer_id"] == customer]
    credit_limit = cust_df["credit_limit"].iloc[0]

    high_risk = (cust_df["amount"] > 0.9 * credit_limit).any()
    # More than one distinct location on the same date
    multiple_locations = (cust_df.groupby("date")["location"].nunique() > 1).any()
    medium_risk = ((cust_df["amount"] > 0.7 * credit_limit) & (cust_df["amount"] <= 0.9 * credit_limit)).any()
    # Foreign locations are matched against a small illustrative list of cities
    foreign_location = cust_df["location"].str.contains("Paris|London|Tokyo", case=False).any()

    # int() turns the NumPy integer into a plain Python key
    if high_risk or multiple_locations:
        risk_levels[int(customer)] = "High Risk"
    elif medium_risk or foreign_location:
        risk_levels[int(customer)] = "Medium Risk"
    else:
        risk_levels[int(customer)] = "Low Risk"

print(risk_levels)


{101: 'High Risk', 102: 'Medium Risk', 103: 'Low Risk'}


Answer 2: Predicting Customers Likely to Default on Payments

In [34]:
import pandas as pd

# Load data. This assumes credit_data.csv has the columns shown in the
# Dataset Sample above.
credit_data = pd.read_csv("credit_data.csv")

# Convert utilization such as "85%" (or 85) to a fraction such as 0.85
credit_data["avg_utilization"] = (
    credit_data["avg_utilization"].astype(str).str.rstrip("%").astype(float) / 100
)

# Define risk levels
risk_levels = {}

for row in credit_data.itertuples(index=False):
    # Each comparison is True or False; summing counts the conditions met
    conditions_met = sum([
        row.late_payments_6m > 2,
        row.avg_utilization > 0.80,
        row.recent_large_transaction > 0.50 * row.credit_limit,
    ])

    if conditions_met == 3:
        risk_levels[int(row.customer_id)] = "High Default Risk"
    elif conditions_met == 2:
        risk_levels[int(row.customer_id)] = "Medium Default Risk"
    else:
        risk_levels[int(row.customer_id)] = "Low Default Risk"

print(risk_levels)


{201: 'High Default Risk', 202: 'Low Default Risk', 203: 'Low Default Risk'}
