# Customer Reviews – Exploratory Data Analysis

This notebook performs basic exploratory data analysis (EDA) on a small sample of customer review text. The goal is to understand basic review characteristics, such as length and the topics mentioned, that inform later analysis.

In [None]:
import pandas as pd
import matplotlib.pyplot as plt

# Sample dataset (placeholder for public portfolio)
data = {
    'review_id': [1, 2, 3, 4, 5],
    'review_text': [
        'The battery life is excellent but the screen is dim.',
        'Customer service was unhelpful and slow.',
        'Fast delivery and great packaging.',
        'The product quality is good but delivery took too long.',
        'Excellent value for money.'
    ]
}

df = pd.DataFrame(data)
df

In [None]:
# Basic dataset information
df.info()

In [None]:
# Review length analysis: count whitespace-separated words in each review
df['review_length'] = df['review_text'].str.split().str.len()
df[['review_length']].describe()

In [None]:
# Visualise review length distribution
plt.hist(df['review_length'], bins=5)
plt.xlabel('Number of words per review')
plt.ylabel('Frequency')
plt.title('Distribution of Review Lengths')
plt.show()
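
With only five reviews, an exact tally of lengths can be easier to read than a histogram. The following is a minimal sketch of that tally in plain Python; it repeats the sample reviews so that it runs on its own, and the names `reviews`, `lengths`, and `length_counts` are illustrative rather than part of the notebook above.

```python
from collections import Counter

# Same sample reviews as in the placeholder dataset above.
reviews = [
    'The battery life is excellent but the screen is dim.',
    'Customer service was unhelpful and slow.',
    'Fast delivery and great packaging.',
    'The product quality is good but delivery took too long.',
    'Excellent value for money.',
]

# Word count per review, using the same whitespace split as str.split().
lengths = [len(text.split()) for text in reviews]

# Tally how many reviews have each length.
length_counts = Counter(lengths)
for length in sorted(length_counts):
    print(f'{length} words: {length_counts[length]} review(s)')
```

Punctuation stays attached to the preceding word under a whitespace split, so it does not change the counts here.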

## Key Observations

- Review lengths range from 4 to 10 words (mean 7), so every review in this sample is short.
- Several reviews mention multiple aspects (e.g., delivery and product quality).
- This supports the need for more granular, aspect-level analysis in later stages.
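
As a rough illustration of the multi-aspect observation, the sketch below tags each review with the aspects it mentions using a simple keyword lookup. The `ASPECT_KEYWORDS` mapping and the `find_aspects` helper are hypothetical and chosen by hand for this sample; they are not a vetted lexicon and not the method used in later stages.

```python
# Hypothetical aspect keywords (an assumption made for illustration only).
ASPECT_KEYWORDS = {
    'battery': ['battery'],
    'screen': ['screen'],
    'service': ['customer service'],
    'delivery': ['delivery'],
    'packaging': ['packaging'],
    'quality': ['quality'],
    'value': ['value'],
}

# Same sample reviews as in the placeholder dataset above.
reviews = [
    'The battery life is excellent but the screen is dim.',
    'Customer service was unhelpful and slow.',
    'Fast delivery and great packaging.',
    'The product quality is good but delivery took too long.',
    'Excellent value for money.',
]


def find_aspects(text):
    """Return the aspects whose keywords appear in the review text."""
    lowered = text.lower()
    return [
        aspect
        for aspect, keywords in ASPECT_KEYWORDS.items()
        if any(keyword in lowered for keyword in keywords)
    ]


aspects_per_review = [find_aspects(text) for text in reviews]

# Count reviews that mention more than one aspect.
multi_aspect_count = sum(len(aspects) > 1 for aspects in aspects_per_review)

for text, aspects in zip(reviews, aspects_per_review):
    print(aspects, '<-', text)
print('Reviews mentioning more than one aspect:', multi_aspect_count)
```

Substring matching of this kind is crude: it misses synonyms and ignores sentiment, so it is only useful for checking whether reviews mix topics before investing in a proper aspect-level approach.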