# Topic 02 - Problem 7: Handle Missing Values in a Dictionary-Based Dataset

---

## 1. About the Problem

This problem asks me to clean missing values in a dataset represented as a list of dictionaries.  
In real data science projects, datasets often come in structured formats where each record has named fields.  
To solve this problem, I will check each key-value pair and replace every `None` value with a default chosen for that column.  
This approach helps maintain dataset structure while handling missing information properly.

---


## 2. Solution Code

```python
def clean_dataset(data, defaults):
    """Return a new list of records with None values replaced by per-column defaults."""
    cleaned_records = []

    for record in data:
        cleaned_record = {}
        for key, value in record.items():
            if value is None:
                # Stays None if the column has no entry in defaults.
                cleaned_record[key] = defaults.get(key)
            else:
                cleaned_record[key] = value
        cleaned_records.append(cleaned_record)

    return cleaned_records


data = [
    {"age": 25, "salary": 50000, "city": "Dhaka"},
    {"age": None, "salary": 48000, "city": None},
    {"age": 30, "salary": None, "city": "Chittagong"},
]

defaults = {"age": 0, "salary": 0, "city": "Unknown"}

print("Cleaned dataset:", clean_dataset(data, defaults))
```

Output:

```
Cleaned dataset: [{'age': 25, 'salary': 50000, 'city': 'Dhaka'}, {'age': 0, 'salary': 48000, 'city': 'Unknown'}, {'age': 30, 'salary': 0, 'city': 'Chittagong'}]
```
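
The function above treats only `None` as missing. Data loaded from files often marks gaps with a float NaN or an empty string instead, and a record may lack a key entirely. Below is a minimal sketch of a broader check, assuming those cases should also count as missing. The names `is_missing` and `clean_dataset_extended` are hypothetical helpers for illustration, not part of the original solution.

```python
import math


def is_missing(value):
    """Return True for None, float NaN, or an empty/whitespace-only string."""
    if value is None:
        return True
    if isinstance(value, float) and math.isnan(value):
        return True
    if isinstance(value, str) and value.strip() == "":
        return True
    return False


def clean_dataset_extended(data, defaults):
    """Fill missing or absent values for every column listed in defaults."""
    cleaned_records = []
    for record in data:
        cleaned_record = dict(record)  # copy so the input is not modified
        for key, default in defaults.items():
            # .get() returns None for an absent key, so absent keys are filled too.
            if is_missing(cleaned_record.get(key)):
                cleaned_record[key] = default
        cleaned_records.append(cleaned_record)
    return cleaned_records


sample = [
    {"age": float("nan"), "salary": 48000, "city": ""},
    {"age": 30, "salary": None},  # "city" key is absent entirely
]
print(clean_dataset_extended(sample, {"age": 0, "salary": 0, "city": "Unknown"}))
```

Note that a legitimate value of `0` is kept, because the check tests for specific missing markers rather than general falsiness.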


---

## 3. Summary / Takeaways

By solving this problem, I learned how to clean structured datasets without losing column meaning.  
I understood how different columns may require different default values.  
This approach is very close to how pandas handles the same task: `DataFrame.fillna` accepts a dictionary that maps column names to fill values.  
Proper column-wise cleaning improves data consistency for analysis and machine learning.  
Next, I want to detect columns with too many missing values.
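
As a starting point for that next step, here is a rough sketch that computes the fraction of missing values per column and lists the columns above a cutoff. The function names and the default `threshold=0.5` are my own assumptions for illustration, and "missing" here means `None` or an absent key.

```python
def missing_rate_by_column(data):
    """Return {column: fraction of records where the value is None or absent}."""
    if not data:
        return {}
    # Collect every column name that appears in at least one record,
    # keeping first-seen order.
    columns = []
    for record in data:
        for key in record:
            if key not in columns:
                columns.append(key)
    total = len(data)
    return {
        column: sum(1 for record in data if record.get(column) is None) / total
        for column in columns
    }


def columns_above_threshold(data, threshold=0.5):
    """List columns whose missing rate is strictly greater than threshold."""
    rates = missing_rate_by_column(data)
    return [column for column, rate in rates.items() if rate > threshold]


data = [
    {"age": 25, "salary": 50000, "city": "Dhaka"},
    {"age": None, "salary": 48000, "city": None},
    {"age": 30, "salary": None, "city": "Chittagong"},
]
rates = missing_rate_by_column(data)
print(rates)
print(columns_above_threshold(data, threshold=0.3))
```

In the sample dataset each column is missing in one of three records, so all three columns exceed a 0.3 cutoff and none exceed the 0.5 default.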
