**Cleaning Up a Messy Grocery Dataset**

You are given a dataset representing sales of various grocery items, including details such as names, quantities, unit prices, purchase dates, and categories. However, the dataset contains some errors and missing data that you are tasked with finding and correcting.

**Task:** Without editing the inputs and using Python code alone, write a clean_data() function that returns a dataset ready for analysis. Your function should return a cleaned DataFrame with updated categories, filled-in unit prices, and positive quantities.

In [12]:
import numpy as np
import pandas as pd
from datetime import datetime

In [13]:
# Grocery sales dataset below. Your answer should return the cleaned dataset.
# This is how your code will be called.
data = {
    'Item Name': ['Apples', 'Milk', 'Bread', 'Eggs', 'Bananas', 'Cheese', 
                  'Tomatoes', 'Potatoes', 'Onions', 'Chicken',
                  'Pasta', 'Rice', 'Bread','Yogurt', 'Ice Cream', 'Cereal'],
    'Quantity': [5, 1, 1, 3, 5, 2, 3, 4, 1, 2, -3, 2, 2, 2, 1, 3],
    'Unit Price': [1.5, 2.0, np.nan, 0.2, 1.0, 3.0, 1.2, 0.5, 0.8, 5.0, 2.5, 1.0, 4.0, 4.0, 2.0, 1.0],  
    'Purchase Date': [
        datetime(2024, 4, 1),
        datetime(2024, 4, 2),
        datetime(2024, 4, 2),
        datetime(2024, 4, 3),
        datetime(2024, 4, 3),
        datetime(2024, 4, 3),
        datetime(2024, 4, 4),
        datetime(2024, 4, 4),
        datetime(2024, 4, 7),
        datetime(2024, 4, 7),
        datetime(2024, 4, 7),
        datetime(2024, 4, 1),
        datetime(2024, 4, 4),
        datetime(2024, 4, 2),
        datetime(2024, 4, 3),
        datetime(2024, 4, 1)
    ],
    'Category': ['Fruits', 'Dairy', 'Bakery', np.nan, 'Fruits', 'Dairy', 
                 'Vegetables', np.nan, 'Vegetables', 'Meat',
                 'Pasta', 'Grains', 'Bakery','Dairy', 'Desserts', 'Cereal']
}

df = pd.DataFrame(data)


In [14]:
df.head()

Unnamed: 0,Item Name,Quantity,Unit Price,Purchase Date,Category
0,Apples,5,1.5,2024-04-01,Fruits
1,Milk,1,2.0,2024-04-02,Dairy
2,Bread,1,,2024-04-02,Bakery
3,Eggs,3,0.2,2024-04-03,
4,Bananas,5,1.0,2024-04-03,Fruits


In [21]:
# Show the items that have no category
df[df['Category'].isnull()]


Unnamed: 0,Item Name,Quantity,Unit Price,Purchase Date,Category
3,Eggs,3,0.2,2024-04-03,
7,Potatoes,4,0.5,2024-04-04,
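The same kind of check can be extended to the other problems named in the task. The sketch below is only an illustration on a small hypothetical sample (the `sample` frame and the variable names are assumptions, not part of the original notebook); it counts missing values per column and lists rows with non-positive quantities.

```python
import numpy as np
import pandas as pd

# Small hypothetical sample with the same kinds of problems as the grocery data
sample = pd.DataFrame({
    'Item Name': ['Bread', 'Eggs', 'Pasta'],
    'Quantity': [1, 3, -3],
    'Unit Price': [np.nan, 0.2, 2.5],
    'Category': ['Bakery', np.nan, 'Pasta'],
})

# Count missing values in each column
missing_counts = sample.isna().sum()
print(missing_counts)

# Rows whose quantity is not strictly positive
bad_quantity = sample[sample['Quantity'] <= 0]
print(bad_quantity)
```

Running the same two checks on the full `df` should point to the missing Bread price and the negative Pasta quantity, alongside the two missing categories shown above.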


In [33]:
def clean_data(df):
    # Work on a copy so the caller's DataFrame is left unchanged
    df = df.copy()

    # Fill in the missing categories found above
    df.loc[df['Item Name'] == 'Eggs', 'Category'] = 'Dairy'
    df.loc[df['Item Name'] == 'Potatoes', 'Category'] = 'Vegetables'

    # Fill missing unit prices with the mean price of the same item
    df['Unit Price'] = df.groupby('Item Name')['Unit Price'].transform(lambda x: x.fillna(x.mean()))

    # Ensure positive quantities (a negative sign is treated as an entry error)
    df['Quantity'] = df['Quantity'].abs()

    return df
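Filling a missing price from the same item's mean, rather than back-filling from the next row, matters for this dataset because the row after the first Bread entry belongs to a different item. The sketch below is a minimal illustration on a hypothetical three-row sample (the `prices` frame and variable names are assumptions); it compares the two approaches.

```python
import numpy as np
import pandas as pd

# Hypothetical sample mirroring the two Bread rows and the Eggs row
prices = pd.DataFrame({
    'Item Name': ['Bread', 'Eggs', 'Bread'],
    'Unit Price': [np.nan, 0.2, 4.0],
})

# Back-fill copies the next row's price, regardless of which item it belongs to
backfilled = prices['Unit Price'].bfill()

# Group-mean fill uses only prices recorded for the same item
item_mean = prices.groupby('Item Name')['Unit Price'].transform('mean')
group_filled = prices['Unit Price'].fillna(item_mean)

print(backfilled.tolist())    # [0.2, 0.2, 4.0]
print(group_filled.tolist())  # [4.0, 0.2, 4.0]
```

Note that the group-mean approach leaves a price missing when an item has no recorded price at all, so a further fallback may be needed on other data.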

In [34]:
#Example usage
df = clean_data(df)
print(df)

    Item Name  Quantity  Unit Price Purchase Date    Category
0      Apples         5         1.5    2024-04-01      Fruits
1        Milk         1         2.0    2024-04-02       Dairy
2       Bread         1         4.0    2024-04-02      Bakery
3        Eggs         3         0.2    2024-04-03       Dairy
4     Bananas         5         1.0    2024-04-03      Fruits
5      Cheese         2         3.0    2024-04-03       Dairy
6    Tomatoes         3         1.2    2024-04-04  Vegetables
7    Potatoes         4         0.5    2024-04-04  Vegetables
8      Onions         1         0.8    2024-04-07  Vegetables
9     Chicken         2         5.0    2024-04-07        Meat
10      Pasta         3         2.5    2024-04-07       Pasta
11       Rice         2         1.0    2024-04-01      Grains
12      Bread         2         4.0    2024-04-04      Bakery
13     Yogurt         2         4.0    2024-04-02       Dairy
14  Ice Cream         1         2.0    2024-04-03    Desserts
15     C