# Exploratory Data Analysis

This notebook performs exploratory data analysis (EDA) on the sales data from the Famus Report. The goal is to understand the structure of the data, identify patterns, and surface insights that can inform further analysis and reporting.

## Objectives
- Load the dataset
- Perform initial data exploration
- Visualize key metrics and trends
- Identify any data quality issues

## Setup
Before running the analysis, make sure the libraries imported below (pandas, NumPy, Matplotlib, and seaborn) are installed in the active environment.
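
One way to confirm the environment is ready is to look each package up before importing it. The snippet below is a minimal sketch that uses only the standard library; the helper name `missing_packages` and the `REQUIRED_PACKAGES` list are illustrative and simply mirror the imports used in this notebook.

```python
from importlib.util import find_spec

# Packages imported by the cells in this notebook.
REQUIRED_PACKAGES = ["pandas", "numpy", "matplotlib", "seaborn"]


def missing_packages(names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if find_spec(name) is None]


missing = missing_packages(REQUIRED_PACKAGES)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are available.")
```

Any package reported as missing can then be installed with the environment's package manager before the remaining cells are run.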

In [None]:
# Import necessary libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

# Set visualization style (set_theme is the current name for sns.set)
sns.set_theme(style='whitegrid')

# Load the dataset
data_path = '../data/JP Famus Report Original 05.15.25 - FAMOUS LOT DETAIL REPORT SA GRAPES 24-25.csv'
df = pd.read_csv(data_path, encoding='latin1')

# Display the first few rows of the dataset
df.head()

## Data Exploration

Let's explore the dataset to understand its structure and contents.

In [None]:
# Display basic information about the dataset
df.info()

# Describe the dataset to get statistical summaries
df.describe(include='all')
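
Since one objective is to identify data quality issues, a per-column summary of missing values and a count of duplicated rows is a reasonable first pass. The sketch below is an assumption about what such a check could look like, not part of the original analysis. The helper `quality_summary` and the small `sample` frame are hypothetical; in the notebook, `df` would be passed instead.

```python
import pandas as pd


def quality_summary(frame: pd.DataFrame) -> pd.DataFrame:
    """Per-column dtype, missing-value count and share, and distinct-value count."""
    return pd.DataFrame({
        "dtype": frame.dtypes.astype(str),
        "n_missing": frame.isna().sum(),
        "pct_missing": (frame.isna().mean() * 100).round(1),
        "n_unique": frame.nunique(dropna=True),
    })


# Hypothetical stand-in rows; in the notebook, pass `df` instead.
sample = pd.DataFrame({
    "refdate": ["2025-01-03", "2025-01-03", None, "2025-01-10"],
    "saleamt": [100.0, 100.0, 50.0, None],
})

summary = quality_summary(sample)
# Rows that repeat an earlier row exactly, across all columns.
n_duplicate_rows = int(sample.duplicated().sum())

print(summary)
print("Fully duplicated rows:", n_duplicate_rows)
```

Columns with a high share of missing values, or an unexpectedly low number of distinct values, are good candidates for a closer look before plotting.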

## Visualization

Next, we will visualize key metrics to identify trends and patterns.

In [None]:
# Example visualization: sales over time
# Convert 'refdate' to datetime; values that cannot be parsed become NaT
df['refdate'] = pd.to_datetime(df['refdate'], errors='coerce')

# Group by date and sum sales (groupby excludes rows whose date is NaT)
sales_over_time = df.groupby('refdate')['saleamt'].sum().reset_index()

# Plotting
plt.figure(figsize=(12, 6))
sns.lineplot(data=sales_over_time, x='refdate', y='saleamt')
plt.title('Total Sales Over Time')
plt.xlabel('Date')
plt.ylabel('Total Sales Amount')
plt.xticks(rotation=45)
plt.tight_layout()
plt.show()
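
Because `errors='coerce'` turns unparseable dates into `NaT` and `groupby` then leaves those rows out, it can help to count how many rows are excluded. Aggregating by month also tends to give a smoother trend than daily totals. The following is a sketch under assumptions: the helper `monthly_sales` and the sample rows are hypothetical, and the `pd.to_numeric` step only guards against `saleamt` having been read as text, which may not apply to this dataset.

```python
import pandas as pd


def monthly_sales(frame, date_col="refdate", amount_col="saleamt"):
    """Return (monthly totals, number of rows with a missing or invalid date)."""
    dates = pd.to_datetime(frame[date_col], errors="coerce")
    amounts = pd.to_numeric(frame[amount_col], errors="coerce")
    n_bad_dates = int(dates.isna().sum())
    valid = dates.notna()
    # Group the valid amounts by calendar month of their date.
    totals = amounts[valid].groupby(dates[valid].dt.to_period("M")).sum()
    return totals, n_bad_dates


# Hypothetical stand-in rows; in the notebook, pass `df` instead.
sample = pd.DataFrame({
    "refdate": ["2025-01-03", "2025-01-20", "2025-02-01", "not a date"],
    "saleamt": [10, 5, 7, 3],
})

totals, n_bad_dates = monthly_sales(sample)
print(totals)
print("Rows excluded for invalid dates:", n_bad_dates)
```

If the number of excluded rows is not negligible, the raw `refdate` values are worth inspecting before the trend is interpreted.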

## Conclusion

This notebook serves as a starting point for exploratory data analysis on the Famus Report dataset. Further analysis can be conducted based on the insights gained from this exploration.