## Customer Purchase Analysis

This notebook focuses on inspecting the **Review Dataset** and transforming it for further analysis. The key steps covered include:

- **Data Handling**:
  - Importing the review dataset using Pandas.
  - Inspecting the structure and contents of the dataset.

- **Data Transformation**:
  - Introducing new columns based on existing data.
  - Assigning values to these columns to prepare the dataset for advanced analysis.


The goal of this notebook is to clean, process, and structure the review dataset into a form that is suitable for deeper analysis in subsequent notebooks.


In [None]:
#loading the libraries
import pandas as pd
import json


In [None]:
#loading the review dataset
with open('./data/raw/reviews.json', 'r') as f:
  data = json.load(f)

df = pd.DataFrame(data)

In [None]:
#printing the contents of the dataframe
df

Unnamed: 0,product_link,product_name,product_description,product_price,reviews
0,https://www.nuuly.com/rent/products/mirage-hig...,Mirage High-Rise Fitted Shorts,The Australian designer couple behind cult-fav...,109,"[{'review_posted_by_username': 'CCPenn', 'user..."
1,https://www.nuuly.com/rent/products/floral-pri...,Floral Printed Cutout Maxi Dress,Founded in Mexico City by French designers Aud...,238,"[{'review_posted_by_username': 'Clothing123', ..."
2,https://www.nuuly.com/rent/products/check-prin...,Check Print Slip Skirt,,68,"[{'review_posted_by_username': 'lcolville', 'u..."
3,https://www.nuuly.com/rent/products/serve-coll...,Serve Collared Bodysuit,Gathering inspiration from the past and interp...,58,"[{'review_posted_by_username': 'SH22', 'user_s..."
4,https://www.nuuly.com/rent/products/skipper-bu...,Skipper Buttondown Top,The ultimate destination for slip-it-on and go...,139,"[{'review_posted_by_username': 'Denise21', 'us..."
...,...,...,...,...,...
3641,https://www.nuuly.com/rent/products/puzzle-gin...,Puzzle Gingham Mini Dress,Featuring fearless-yet-wearable statement piec...,242,"[{'review_posted_by_username': 'LexieDexie', '..."
3642,https://www.nuuly.com/rent/products/camella-mi...,Camella Mini Dress,Gathering inspiration from the past and interp...,128,[{'review_posted_by_username': 'Fennie_from_th...
3643,https://www.nuuly.com/rent/products/hailee-lac...,Hailee Lace Mini Dress,Gathering inspiration from the past and interp...,168,"[{'review_posted_by_username': 'mcpumpkin', 'u..."
3644,https://www.nuuly.com/rent/products/lace-up-fl...,Lace-Up Flutter Sleeve Dress,"Dubbing themselves ""trend-inspired"", ASTR The ...",89,"[{'review_posted_by_username': 'Jessguz', 'use..."


### Extracting Product Details

Below we extract the product-related columns from `df` into a new `product_df` DataFrame and prepare it for further analysis. Each product is assigned a random UUID as its `product_id`, and the product category (`product_category`) is derived from the last hyphen-separated token of the product link. The final `product_df` contains the following columns:

- `product_link`: The URL link to the product page.
- `product_name`: The name of the product.
- `product_description`: A short description of the product.
- `product_price`: The price of the product.
- `product_id`: A unique identifier for each product.
- `product_category`: The category of the product, extracted from the `product_link`.
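
The category extraction described above can be sketched on its own. This is a minimal illustration, not the notebook's exact code: the helper name `category_from_link` and the example links are hypothetical, and it parses the URL with `urllib.parse.urlsplit` so that a query string is dropped before the slug is split on hyphens.

```python
from urllib.parse import urlsplit

def category_from_link(link: str) -> str:
    """Return the last hyphen-separated token of the URL's final path segment."""
    path = urlsplit(link).path               # drops any ?query or #fragment
    slug = path.rstrip("/").split("/")[-1]   # e.g. "high-rise-fitted-shorts"
    return slug.split("-")[-1]

# Hypothetical links, for illustration only
links = [
    "https://example.com/rent/products/high-rise-fitted-shorts?color=red",
    "https://example.com/rent/products/collared-bodysuit",
]
categories = [category_from_link(link) for link in links]
print(categories)  # ['shorts', 'bodysuit']
```

Taking the last token is a heuristic: it works when the slug ends with the garment type, and it would mislabel any product whose slug ends with something else.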


In [None]:
import uuid

# generating product_id
df['product_id'] = [uuid.uuid4() for _ in range(len(df))]

#printing the dataframe
df

Unnamed: 0,product_link,product_name,product_description,product_price,reviews,product_id
0,https://www.nuuly.com/rent/products/mirage-hig...,Mirage High-Rise Fitted Shorts,The Australian designer couple behind cult-fav...,109,"[{'review_posted_by_username': 'CCPenn', 'user...",06546195-f435-4b02-87fe-387dd5cb0e84
1,https://www.nuuly.com/rent/products/floral-pri...,Floral Printed Cutout Maxi Dress,Founded in Mexico City by French designers Aud...,238,"[{'review_posted_by_username': 'Clothing123', ...",156dd7a9-659b-49ea-aaaa-48b32843decb
2,https://www.nuuly.com/rent/products/check-prin...,Check Print Slip Skirt,,68,"[{'review_posted_by_username': 'lcolville', 'u...",77777828-a12f-48a5-9961-a31fee669c33
3,https://www.nuuly.com/rent/products/serve-coll...,Serve Collared Bodysuit,Gathering inspiration from the past and interp...,58,"[{'review_posted_by_username': 'SH22', 'user_s...",ce983ee9-eb7a-43fe-9eeb-78530ba3903e
4,https://www.nuuly.com/rent/products/skipper-bu...,Skipper Buttondown Top,The ultimate destination for slip-it-on and go...,139,"[{'review_posted_by_username': 'Denise21', 'us...",9d09ffb0-a6f7-43af-8e62-2968bfb5a712
...,...,...,...,...,...,...
3641,https://www.nuuly.com/rent/products/puzzle-gin...,Puzzle Gingham Mini Dress,Featuring fearless-yet-wearable statement piec...,242,"[{'review_posted_by_username': 'LexieDexie', '...",3031c9a1-e4dc-4da1-af67-a72116b90044
3642,https://www.nuuly.com/rent/products/camella-mi...,Camella Mini Dress,Gathering inspiration from the past and interp...,128,[{'review_posted_by_username': 'Fennie_from_th...,cdaa714c-7609-4a04-9f3d-df5220f0a0e9
3643,https://www.nuuly.com/rent/products/hailee-lac...,Hailee Lace Mini Dress,Gathering inspiration from the past and interp...,168,"[{'review_posted_by_username': 'mcpumpkin', 'u...",cef75902-61f8-4ffb-a9a9-42996a187b38
3644,https://www.nuuly.com/rent/products/lace-up-fl...,Lace-Up Flutter Sleeve Dress,"Dubbing themselves ""trend-inspired"", ASTR The ...",89,"[{'review_posted_by_username': 'Jessguz', 'use...",019e80c8-0f1c-45c6-a0ea-d31f04d4b5dd


In [None]:
#copy df into product_df
product_df = df.copy(deep=True)

In [None]:
#dropping reviews column
product_df = product_df.drop(columns=["reviews"])

In [None]:
#extracting product category from the product link
product_categories = []
for product in product_df["product_link"]:
  x = product.split("-")[-1]
  product_categories.append(x.split("?")[0])
product_df["product_category"] = product_categories

In [None]:
# displaying the first few rows of the updated product_df to verify the 'product_category' column
product_df.head()

Unnamed: 0,product_link,product_name,product_description,product_price,product_id,product_category
0,https://www.nuuly.com/rent/products/mirage-hig...,Mirage High-Rise Fitted Shorts,The Australian designer couple behind cult-fav...,109,06546195-f435-4b02-87fe-387dd5cb0e84,shorts
1,https://www.nuuly.com/rent/products/floral-pri...,Floral Printed Cutout Maxi Dress,Founded in Mexico City by French designers Aud...,238,156dd7a9-659b-49ea-aaaa-48b32843decb,dress
2,https://www.nuuly.com/rent/products/check-prin...,Check Print Slip Skirt,,68,77777828-a12f-48a5-9961-a31fee669c33,skirt
3,https://www.nuuly.com/rent/products/serve-coll...,Serve Collared Bodysuit,Gathering inspiration from the past and interp...,58,ce983ee9-eb7a-43fe-9eeb-78530ba3903e,bodysuit
4,https://www.nuuly.com/rent/products/skipper-bu...,Skipper Buttondown Top,The ultimate destination for slip-it-on and go...,139,9d09ffb0-a6f7-43af-8e62-2968bfb5a712,top


In [None]:
#extracting reviews and product_id for review analysis
reviews = df[["reviews", "product_id"]]

In [None]:
reviews

Unnamed: 0,reviews,product_id
0,"[{'review_posted_by_username': 'CCPenn', 'user...",06546195-f435-4b02-87fe-387dd5cb0e84
1,"[{'review_posted_by_username': 'Clothing123', ...",156dd7a9-659b-49ea-aaaa-48b32843decb
2,"[{'review_posted_by_username': 'lcolville', 'u...",77777828-a12f-48a5-9961-a31fee669c33
3,"[{'review_posted_by_username': 'SH22', 'user_s...",ce983ee9-eb7a-43fe-9eeb-78530ba3903e
4,"[{'review_posted_by_username': 'Denise21', 'us...",9d09ffb0-a6f7-43af-8e62-2968bfb5a712
...,...,...
3641,"[{'review_posted_by_username': 'LexieDexie', '...",3031c9a1-e4dc-4da1-af67-a72116b90044
3642,[{'review_posted_by_username': 'Fennie_from_th...,cdaa714c-7609-4a04-9f3d-df5220f0a0e9
3643,"[{'review_posted_by_username': 'mcpumpkin', 'u...",cef75902-61f8-4ffb-a9a9-42996a187b38
3644,"[{'review_posted_by_username': 'Jessguz', 'use...",019e80c8-0f1c-45c6-a0ea-d31f04d4b5dd


In [None]:
#function to transform nested review data into a flat dataframe for analysis
import ast

def transform_reviews(df):
    # list to store the transformed rows
    transformed_data = []

    # iterating over each row in the DataFrame
    for _, row in df.iterrows():
        product_id = row['product_id']
        reviews = row['reviews']

        if isinstance(reviews, str):
            reviews = ast.literal_eval(reviews)

        for review in reviews:
            # appending a new row for each review dictionary
            transformed_data.append({
                "product_id": product_id,
                "customer_username": review.get("review_posted_by_username", None),
                "star_ratings": review.get("star_ratings", None),
                "review_date": review.get("review_date", None),
                "user_age": review.get("user_age", None),
                "user_height": review.get("user_height", None),
                "user_size": review.get("user_size", None),
                "user_weight": review.get("user_weight", None),
                "user_body_type": review.get("user_body_type", None),
                "user_color": review.get("user_color", None),
                "review_title": review.get("review_title", None),
                "review_content": review.get("review_text", None),
            })

    # convert the transformed data into a new DataFrame
    transformed_df = pd.DataFrame(transformed_data)

    return transformed_df
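
The same flattening can also be expressed with pandas built-ins instead of an explicit loop. The sketch below is an alternative under stated assumptions, not the notebook's method: the toy `nested` frame is made up, and it assumes each `reviews` cell is already a list of dictionaries (strings would still need `ast.literal_eval` first).

```python
import pandas as pd

# Toy data, for illustration only
nested = pd.DataFrame({
    "product_id": ["p1", "p2"],
    "reviews": [
        [{"review_posted_by_username": "a", "star_ratings": 4},
         {"review_posted_by_username": "b", "star_ratings": 5}],
        [{"review_posted_by_username": "c", "star_ratings": 3}],
    ],
})

# One row per review; products with an empty review list become NaN and are dropped
exploded = (
    nested.explode("reviews")
    .dropna(subset=["reviews"])
    .reset_index(drop=True)
)

# Expand each review dictionary into columns and re-attach the product_id
fields = pd.json_normalize(exploded["reviews"].tolist())
flat = pd.concat([exploded[["product_id"]], fields], axis=1)
print(flat)
```

Unlike `transform_reviews`, this keeps the original key names, so columns such as `review_posted_by_username` would still need renaming afterwards.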

In [None]:
#transforming the reviews data into structured format and storing it
final_df = transform_reviews(reviews)

In [None]:
final_df.head()

Unnamed: 0,product_id,customer_username,star_ratings,review_date,user_age,user_height,user_size,user_weight,user_body_type,user_color,review_title,review_content
0,06546195-f435-4b02-87fe-387dd5cb0e84,CCPenn,4,10/09/23,39.0,"5'8""",XL,214 lbs.,Hourglass,red,Super cute but too small,The item was in great condition and was super ...
1,06546195-f435-4b02-87fe-387dd5cb0e84,Cactusfeathers,3,09/18/23,35.0,"5'5""",L,144 lbs.,Straight,red,Second try,I rented these in my usual size last time and ...
2,06546195-f435-4b02-87fe-387dd5cb0e84,erinmh29,5,09/10/23,36.0,"5'7""",M,135 lbs.,Hourglass,red,Fun shorts,"Loved the color and style. Size up, I’m usuall..."
3,06546195-f435-4b02-87fe-387dd5cb0e84,Ktwesty,1,08/29/23,,"5'5""",S,130 lbs.,Apple,red,Retro tiny tiny shorts,"These shorts were very thin material, very sm..."
4,06546195-f435-4b02-87fe-387dd5cb0e84,Jassica,4,08/21/23,36.0,"5'4""",M,122 lbs.,Straight,red,They're Alright,Cute shorts. Didn't really get to wear much. P...


In [None]:
# creating a mapping of unique customer usernames to user IDs
unique_usernames = final_df["customer_username"].unique()
username_to_userid = {username: f"user_{i+1}" for i, username in enumerate(unique_usernames)}

final_df["customer_userid"] = final_df["customer_username"].map(username_to_userid)
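
A comparable mapping can be built with `pd.factorize`, which numbers values in order of first appearance. This is a hedged alternative sketch on a small sample Series, not a replacement the notebook requires.

```python
import pandas as pd

usernames = pd.Series(["CCPenn", "erinmh29", "CCPenn", "Jassica"])

# factorize assigns 0, 1, 2, ... in order of first appearance
codes, uniques = pd.factorize(usernames)

# Shift to 1-based numbering and prefix with "user_"
user_ids = "user_" + pd.Series(codes + 1, index=usernames.index).astype(str)
print(user_ids.tolist())  # ['user_1', 'user_2', 'user_1', 'user_3']
```

One difference to keep in mind: by default `factorize` gives missing usernames the code -1, so they would all become `user_0` in this sketch.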

In [None]:
#Drop the 'customer_username' column as it's no longer needed
final_df.drop("customer_username", axis=1, inplace=True)

In [None]:
#merging the product DataFrame with the transformed review DataFrame on 'product_id'
final_final_df = pd.merge(product_df, final_df, on='product_id')

In [None]:
#displaying first few rows
final_final_df.head()

Unnamed: 0,product_link,product_name,product_description,product_price,product_id,product_category,star_ratings,review_date,user_age,user_height,user_size,user_weight,user_body_type,user_color,review_title,review_content,customer_userid
0,https://www.nuuly.com/rent/products/mirage-hig...,Mirage High-Rise Fitted Shorts,The Australian designer couple behind cult-fav...,109,06546195-f435-4b02-87fe-387dd5cb0e84,shorts,4,10/09/23,39.0,"5'8""",XL,214 lbs.,Hourglass,red,Super cute but too small,The item was in great condition and was super ...,user_1
1,https://www.nuuly.com/rent/products/mirage-hig...,Mirage High-Rise Fitted Shorts,The Australian designer couple behind cult-fav...,109,06546195-f435-4b02-87fe-387dd5cb0e84,shorts,3,09/18/23,35.0,"5'5""",L,144 lbs.,Straight,red,Second try,I rented these in my usual size last time and ...,user_2
2,https://www.nuuly.com/rent/products/mirage-hig...,Mirage High-Rise Fitted Shorts,The Australian designer couple behind cult-fav...,109,06546195-f435-4b02-87fe-387dd5cb0e84,shorts,5,09/10/23,36.0,"5'7""",M,135 lbs.,Hourglass,red,Fun shorts,"Loved the color and style. Size up, I’m usuall...",user_3
3,https://www.nuuly.com/rent/products/mirage-hig...,Mirage High-Rise Fitted Shorts,The Australian designer couple behind cult-fav...,109,06546195-f435-4b02-87fe-387dd5cb0e84,shorts,1,08/29/23,,"5'5""",S,130 lbs.,Apple,red,Retro tiny tiny shorts,"These shorts were very thin material, very sm...",user_4
4,https://www.nuuly.com/rent/products/mirage-hig...,Mirage High-Rise Fitted Shorts,The Australian designer couple behind cult-fav...,109,06546195-f435-4b02-87fe-387dd5cb0e84,shorts,4,08/21/23,36.0,"5'4""",M,122 lbs.,Straight,red,They're Alright,Cute shorts. Didn't really get to wear much. P...,user_5


### Extracting Customer Details

The `create_customer_df` function extracts customer-specific details:
- `customer_userid`: A unique identifier for each customer.
- `user_age`: The age of the customer.
- `user_height`: The height of the customer.
- `user_size`: The clothing size of the customer.
- `user_weight`: The weight of the customer.
- `user_body_type`: The body type description of the customer.
- `user_color`: The customer's preferred color.

The function keeps one row per customer by dropping duplicate `customer_userid` entries. The resulting DataFrame (`customer_df`) will be used for customer analysis.
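
One consequence of deduplicating on `customer_userid` is worth seeing on a small example: `drop_duplicates` keeps the first occurrence by default, so if a customer reported different details across reviews, only the first row's values survive. The data below is made up for illustration.

```python
import pandas as pd

# Toy data: user_1 appears twice with different details
reviews = pd.DataFrame({
    "customer_userid": ["user_1", "user_2", "user_1"],
    "user_size": ["M", "L", "S"],
    "user_age": [36.0, None, 37.0],
})

# Default behaviour (keep="first"): the first review's details win
first_seen = reviews.drop_duplicates(subset="customer_userid", keep="first")

# keep="last" would instead retain the most recently listed row per customer
last_seen = reviews.drop_duplicates(subset="customer_userid", keep="last")

print(first_seen)
print(last_seen)
```

Which row is "first" depends on the row order of the input, so sorting by review date beforehand would make the choice explicit.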


In [None]:
def create_customer_df(df):
    # selecting relevant columns for customer details
    customer_columns = [
        "customer_userid", "user_age", "user_height", "user_size", "user_weight",
        "user_body_type", "user_color"
    ]

    # drop duplicates to ensure one row per unique customer_userid (the first occurrence is kept)
    customer_df = df[customer_columns].drop_duplicates(subset="customer_userid").reset_index(drop=True)

    return customer_df

In [None]:
#extract the unique customer details from the merged DataFrame
customer_df = create_customer_df(final_final_df)

In [None]:
#print the first few rows of the customer DataFrame
customer_df.head()

Unnamed: 0,customer_userid,user_age,user_height,user_size,user_weight,user_body_type,user_color
0,user_1,39.0,"5'8""",XL,214 lbs.,Hourglass,red
1,user_2,35.0,"5'5""",L,144 lbs.,Straight,red
2,user_3,36.0,"5'7""",M,135 lbs.,Hourglass,red
3,user_4,,"5'5""",S,130 lbs.,Apple,red
4,user_5,36.0,"5'4""",M,122 lbs.,Straight,red


### Extracting Review Details
The `create_review_df` function extracts review-specific details, such as product IDs, customer IDs, star ratings, review dates, titles, and content.
This DataFrame (`reviews_df`) will be used for customer review analysis.

In [None]:
def create_review_df(df):
  # selecting relevant columns for review details
  review_columns = [
      "product_id", "customer_userid", "star_ratings", "review_date", "review_title", "review_content"
  ]

  review_df = df[review_columns]

  return review_df

In [None]:
#extract review-related details from the merged DataFrame
reviews_df = create_review_df(final_final_df)

In [None]:
#display first few rows of reviews_df
reviews_df.head()

Unnamed: 0,product_id,customer_userid,star_ratings,review_date,review_title,review_content
0,06546195-f435-4b02-87fe-387dd5cb0e84,user_1,4,10/09/23,Super cute but too small,The item was in great condition and was super ...
1,06546195-f435-4b02-87fe-387dd5cb0e84,user_2,3,09/18/23,Second try,I rented these in my usual size last time and ...
2,06546195-f435-4b02-87fe-387dd5cb0e84,user_3,5,09/10/23,Fun shorts,"Loved the color and style. Size up, I’m usuall..."
3,06546195-f435-4b02-87fe-387dd5cb0e84,user_4,1,08/29/23,Retro tiny tiny shorts,"These shorts were very thin material, very sm..."
4,06546195-f435-4b02-87fe-387dd5cb0e84,user_5,4,08/21/23,They're Alright,Cute shorts. Didn't really get to wear much. P...


### Extracting Order Details
Below we derive order data from the `reviews_df` DataFrame and prepare `order_df` for further analysis. Each review is treated as one order.
Paid amount (`paid_amt`) is calculated by applying a random discount (0% to 50%) to the product price.
Order dates are simulated by subtracting a random delivery time from the review date. The delivery time is 2 to 9 days, because the upper bound of `np.random.randint(2, 10)` is exclusive.
The final `order_df` contains the following columns:
- `product_id`: Links the order to a specific product.
- `customer_userid`: Identifies the customer placing the order.
- `order_id`: A unique identifier for each order.
- `paid_amt`: The amount paid after applying a random discount.
- `order_date`: The simulated order date, i.e. the review date minus the simulated delivery time.
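
The two simulated fields can be sketched in a reproducible, vectorized form. This is an illustration under assumptions, not the notebook's code: it uses a seeded `np.random.default_rng` (the notebook uses unseeded `random` and `np.random`), rounds `paid_amt` to two decimals, and works on a made-up three-row frame.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(seed=42)  # seeded so reruns give the same values

# Toy data, for illustration only
orders = pd.DataFrame({
    "product_price": [109, 238, 68],
    "review_date": ["10/09/23", "09/18/23", "09/10/23"],
})

# Random discount between 0% and 50% of the product price
discount = rng.uniform(0.0, 0.5, size=len(orders))
orders["paid_amt"] = (orders["product_price"] * (1 - discount)).round(2)

# Order date = review date minus a random delivery time of 2 to 9 days
review_dates = pd.to_datetime(orders["review_date"], format="%m/%d/%y")
lag_days = rng.integers(2, 10, size=len(orders))  # upper bound is exclusive
orders["order_date"] = review_dates - pd.to_timedelta(lag_days, unit="D")

print(orders)
```

Seeding matters if the exported CSV files are meant to be regenerated identically; without a seed every run produces different amounts and dates.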




In [None]:
#select relevant columns for order-related data; .copy() makes order_df an
#independent DataFrame so later column assignments do not raise SettingWithCopyWarning
order_df = reviews_df[["product_id", "customer_userid", "review_date"]].copy()

In [None]:
#rename to proper column name
order_df.rename(columns={'review_date': 'order_date'}, inplace=True)

In [None]:
#generate a unique order_id for each order
order_df['order_id'] = [uuid.uuid4() for _ in range(len(order_df))]


In [None]:
#display first few rows of order_df
order_df.head()

Unnamed: 0,product_id,customer_userid,order_date,order_id
0,06546195-f435-4b02-87fe-387dd5cb0e84,user_1,10/09/23,e0c684eb-6b27-4864-a5ca-1c6ce63e390a
1,06546195-f435-4b02-87fe-387dd5cb0e84,user_2,09/18/23,b1858447-1020-4e9c-8f79-b91ba3ad85c9
2,06546195-f435-4b02-87fe-387dd5cb0e84,user_3,09/10/23,67cefe65-4148-4b21-93a2-55f81de1d199
3,06546195-f435-4b02-87fe-387dd5cb0e84,user_4,08/29/23,ec90c21c-570d-49b0-a71c-8717ab7eb146
4,06546195-f435-4b02-87fe-387dd5cb0e84,user_5,08/21/23,0b3f50a8-e62e-457c-ba43-64e73dfa21f2


In [None]:
x_df = pd.merge(product_df, order_df, on="product_id")

In [None]:
import random

#Merge the product prices into the order DataFrame
merged_df = order_df.merge(product_df, on="product_id", how="left")

# Calculate the paid_amt column
order_df["paid_amt"] = merged_df["product_price"].apply(
    lambda price: price - (price * random.uniform(0, 0.50)) if pd.notnull(price) else None
)



In [None]:
#convert the 'order_date' column (MM/DD/YY strings) to datetime format
order_df['order_date'] = pd.to_datetime(order_df['order_date'], format='%m/%d/%y')


In [None]:
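# simulating the order date by subtracting a random delivery time from the review date
import numpy as np
# np.random.randint(2, 10) draws integers from 2 to 9 (the upper bound is exclusive)
order_df["new_date"] = order_df['order_date'] - pd.to_timedelta(np.random.randint(2, 10, size=len(order_df)), unit='D')
order_df.drop("order_date", axis=1, inplace=True)
# rename() without inplace=True returns a renamed copy (displayed below); order_df itself keeps 'new_date' until the in-place rename later in the notebook
order_df.rename(columns={"new_date": "order_date"})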

Unnamed: 0,product_id,customer_userid,order_id,paid_amt,order_date
0,06546195-f435-4b02-87fe-387dd5cb0e84,user_1,e0c684eb-6b27-4864-a5ca-1c6ce63e390a,84.377450,2023-10-04
1,06546195-f435-4b02-87fe-387dd5cb0e84,user_2,b1858447-1020-4e9c-8f79-b91ba3ad85c9,65.717115,2023-09-12
2,06546195-f435-4b02-87fe-387dd5cb0e84,user_3,67cefe65-4148-4b21-93a2-55f81de1d199,80.502781,2023-09-08
3,06546195-f435-4b02-87fe-387dd5cb0e84,user_4,ec90c21c-570d-49b0-a71c-8717ab7eb146,88.216305,2023-08-23
4,06546195-f435-4b02-87fe-387dd5cb0e84,user_5,0b3f50a8-e62e-457c-ba43-64e73dfa21f2,87.464241,2023-08-17
...,...,...,...,...,...
86677,86f59ec5-ea98-4b29-a307-64b524a7f871,user_8324,eb662f37-1f79-4e10-9722-9a31a02ca87d,69.966847,2023-09-24
86678,86f59ec5-ea98-4b29-a307-64b524a7f871,user_33376,724ff29d-6902-43fc-8fec-29d7bea38abe,111.352694,2023-09-17
86679,86f59ec5-ea98-4b29-a307-64b524a7f871,user_33377,85f5835f-9361-4040-8d91-e8534bada6d4,66.193294,2023-09-21
86680,86f59ec5-ea98-4b29-a307-64b524a7f871,user_29004,52251c84-b66f-498c-a783-11fb82825f9e,122.645538,2023-09-09


In [None]:
#print order_df
order_df.head()

Unnamed: 0,product_id,customer_userid,order_id,paid_amt,new_date
0,06546195-f435-4b02-87fe-387dd5cb0e84,user_1,e0c684eb-6b27-4864-a5ca-1c6ce63e390a,84.37745,2023-10-04
1,06546195-f435-4b02-87fe-387dd5cb0e84,user_2,b1858447-1020-4e9c-8f79-b91ba3ad85c9,65.717115,2023-09-12
2,06546195-f435-4b02-87fe-387dd5cb0e84,user_3,67cefe65-4148-4b21-93a2-55f81de1d199,80.502781,2023-09-08
3,06546195-f435-4b02-87fe-387dd5cb0e84,user_4,ec90c21c-570d-49b0-a71c-8717ab7eb146,88.216305,2023-08-23
4,06546195-f435-4b02-87fe-387dd5cb0e84,user_5,0b3f50a8-e62e-457c-ba43-64e73dfa21f2,87.464241,2023-08-17


In [None]:
#print product_df
product_df.head()

Unnamed: 0,product_link,product_name,product_description,product_price,product_id,product_category
0,https://www.nuuly.com/rent/products/mirage-hig...,Mirage High-Rise Fitted Shorts,The Australian designer couple behind cult-fav...,109,06546195-f435-4b02-87fe-387dd5cb0e84,shorts
1,https://www.nuuly.com/rent/products/floral-pri...,Floral Printed Cutout Maxi Dress,Founded in Mexico City by French designers Aud...,238,156dd7a9-659b-49ea-aaaa-48b32843decb,dress
2,https://www.nuuly.com/rent/products/check-prin...,Check Print Slip Skirt,,68,77777828-a12f-48a5-9961-a31fee669c33,skirt
3,https://www.nuuly.com/rent/products/serve-coll...,Serve Collared Bodysuit,Gathering inspiration from the past and interp...,58,ce983ee9-eb7a-43fe-9eeb-78530ba3903e,bodysuit
4,https://www.nuuly.com/rent/products/skipper-bu...,Skipper Buttondown Top,The ultimate destination for slip-it-on and go...,139,9d09ffb0-a6f7-43af-8e62-2968bfb5a712,top


In [None]:
#print customer_df
customer_df.head()

Unnamed: 0,customer_userid,user_age,user_height,user_size,user_weight,user_body_type,user_color
0,user_1,39.0,"5'8""",XL,214 lbs.,Hourglass,red
1,user_2,35.0,"5'5""",L,144 lbs.,Straight,red
2,user_3,36.0,"5'7""",M,135 lbs.,Hourglass,red
3,user_4,,"5'5""",S,130 lbs.,Apple,red
4,user_5,36.0,"5'4""",M,122 lbs.,Straight,red


In [None]:
#print reviews_df
reviews_df.head()

Unnamed: 0,product_id,customer_userid,star_ratings,review_date,review_title,review_content
0,06546195-f435-4b02-87fe-387dd5cb0e84,user_1,4,10/09/23,Super cute but too small,The item was in great condition and was super ...
1,06546195-f435-4b02-87fe-387dd5cb0e84,user_2,3,09/18/23,Second try,I rented these in my usual size last time and ...
2,06546195-f435-4b02-87fe-387dd5cb0e84,user_3,5,09/10/23,Fun shorts,"Loved the color and style. Size up, I’m usuall..."
3,06546195-f435-4b02-87fe-387dd5cb0e84,user_4,1,08/29/23,Retro tiny tiny shorts,"These shorts were very thin material, very sm..."
4,06546195-f435-4b02-87fe-387dd5cb0e84,user_5,4,08/21/23,They're Alright,Cute shorts. Didn't really get to wear much. P...


In [None]:
#rename 'new_date' to 'order_date' in place so the exported file uses the final column name
order_df.rename(columns={"new_date": "order_date"}, inplace=True)

In [None]:
#exporting all the final transformed dataframes to csv

customer_df.to_csv('customer.csv', index=False)
product_df.to_csv('product.csv', index=False)
order_df.to_csv('order.csv', index=False)
reviews_df.to_csv('reviews.csv', index=False)

print("DataFrames saved to CSV files successfully!")

DataFrames saved to CSV files successfully!
