# Walmart Sales Analysis Project
This project analyzes Walmart sales data to uncover insights into customer behavior, product line performance, and sales trends.

## Project Overview
In this project, we use the Walmart sales dataset to answer three business questions: which product lines generate the most revenue, how average transaction value differs between customer types, and which days and hours see the highest sales. Our tools include **Pandas** for data manipulation and **Tableau** for visualization.

## 1. Data Loading and Initial Inspection
We start by loading the data into a Pandas DataFrame and inspecting the first few rows to understand its structure.

In [5]:
import pandas as pd

# Load the dataset
data = pd.read_csv('WalmartSalesData.csv')
data.head()

| | Invoice ID | Branch | City | Customer type | Gender | Product line | Unit price | Quantity | Tax 5% | Total | Date | Time | Payment | cogs | gross margin percentage | gross income | Rating |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 750-67-8428 | A | Yangon | Member | Female | Health and beauty | 74.69 | 7 | 26.1415 | 548.9715 | 2019-01-05 | 13:08:00 | Ewallet | 522.83 | 4.761905 | 26.1415 | 9.1 |
| 1 | 226-31-3081 | C | Naypyitaw | Normal | Female | Electronic accessories | 15.28 | 5 | 3.82 | 80.22 | 2019-03-08 | 10:29:00 | Cash | 76.4 | 4.761905 | 3.82 | 9.6 |
| 2 | 631-41-3108 | A | Yangon | Normal | Male | Home and lifestyle | 46.33 | 7 | 16.2155 | 340.5255 | 2019-03-03 | 13:23:00 | Credit card | 324.31 | 4.761905 | 16.2155 | 7.4 |
| 3 | 123-19-1176 | A | Yangon | Member | Male | Health and beauty | 58.22 | 8 | 23.288 | 489.048 | 2019-01-27 | 20:33:00 | Ewallet | 465.76 | 4.761905 | 23.288 | 8.4 |
| 4 | 373-73-7910 | A | Yangon | Normal | Male | Sports and travel | 86.31 | 7 | 30.2085 | 634.3785 | 2019-02-08 | 10:37:00 | Ewallet | 604.17 | 4.761905 | 30.2085 | 5.3 |
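The preview suggests that several columns are derived from others: `cogs` looks like `Unit price × Quantity`, `Tax 5%` like 5% of `cogs`, and `Total` like their sum. A minimal sketch that checks this on two rows copied from the preview follows. The relationships are an assumption inferred from those rows, not something the dataset documents, and the tolerance is an arbitrary choice.

```python
import pandas as pd

# Two rows copied from the preview above
sample = pd.DataFrame({
    "Unit price": [74.69, 15.28],
    "Quantity": [7, 5],
    "Tax 5%": [26.1415, 3.82],
    "Total": [548.9715, 80.22],
    "cogs": [522.83, 76.4],
})

# Recompute the derived columns (assumed relationships)
cogs = sample["Unit price"] * sample["Quantity"]
tax = cogs * 0.05
total = cogs + tax

# Compare with the stored values using a small absolute tolerance
tolerance = 1e-6
checks = pd.DataFrame({
    "cogs_ok": (cogs - sample["cogs"]).abs() < tolerance,
    "tax_ok": (tax - sample["Tax 5%"]).abs() < tolerance,
    "total_ok": (total - sample["Total"]).abs() < tolerance,
})
print(checks)
```

Running the same comparison on the full DataFrame would flag any rows where the stored values disagree with the recomputed ones.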


## 2. Data Cleaning and Preparation
Here, we check for missing values and convert the Date and Time columns to proper date and time types. Features such as day of the week and hour are extracted later, in the Sales Trends Analysis.

In [7]:
# Check for missing values
data.isnull().sum()

Invoice ID                 0
Branch                     0
City                       0
Customer type              0
Gender                     0
Product line               0
Unit price                 0
Quantity                   0
Tax 5%                     0
Total                      0
Date                       0
Time                       0
Payment                    0
cogs                       0
gross margin percentage    0
gross income               0
Rating                     0
dtype: int64

In [9]:
# Convert Date and Time columns to appropriate formats
data['Date'] = pd.to_datetime(data['Date'])
data['Time'] = pd.to_datetime(data['Time'], format='%H:%M:%S').dt.time
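One way to confirm that the conversion took effect is to inspect the resulting dtypes. The sketch below repeats the two conversions on a small stand-in DataFrame (two date and time values copied from the preview) rather than on the full dataset.

```python
from datetime import time

import pandas as pd

# Stand-in for the real data: two Date/Time values from the preview
sample = pd.DataFrame({
    "Date": ["2019-01-05", "2019-03-08"],
    "Time": ["13:08:00", "10:29:00"],
})

# Same conversions as in the cell above
sample["Date"] = pd.to_datetime(sample["Date"])
sample["Time"] = pd.to_datetime(sample["Time"], format="%H:%M:%S").dt.time

# Date should now be a datetime64 column; Time holds datetime.time objects
date_is_datetime = pd.api.types.is_datetime64_any_dtype(sample["Date"])
time_is_time = isinstance(sample["Time"].iloc[0], time)

print(sample.dtypes)
print(date_is_datetime, time_is_time)
```

Note that pandas has no dedicated time-of-day dtype, so the `Time` column holds plain Python `datetime.time` objects and the `.dt` accessor is not available on it.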

## 3. Exploratory Data Analysis (EDA)
We explore the data to answer our business questions, starting with total revenue by product line and average transaction value by customer type.

In [11]:
# Product Line Revenue Analysis
product_revenue = data.groupby('Product line')['Total'].sum().sort_values(ascending=False)
product_revenue

Product line
Food and beverages        56144.8440
Sports and travel         55122.8265
Electronic accessories    54337.5315
Fashion accessories       54305.8950
Home and lifestyle        53861.9130
Health and beauty         49193.7390
Name: Total, dtype: float64
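Absolute totals can make the gaps between product lines look larger than they are. As a sketch, the snippet below rebuilds the Series from the output above and expresses each product line as a share of overall revenue. The variable name `revenue_share` is introduced here for illustration.

```python
import pandas as pd

# Values copied from the output above
product_revenue = pd.Series(
    {
        "Food and beverages": 56144.8440,
        "Sports and travel": 55122.8265,
        "Electronic accessories": 54337.5315,
        "Fashion accessories": 54305.8950,
        "Home and lifestyle": 53861.9130,
        "Health and beauty": 49193.7390,
    },
    name="Total",
)

# Share of overall revenue per product line, as a percentage
revenue_share = (product_revenue / product_revenue.sum() * 100).round(2)
print(revenue_share)
```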

In [13]:
# Average Transaction Value by Customer Type
avg_transaction_by_customer = data.groupby('Customer type')['Total'].mean()
avg_transaction_by_customer

Customer type
Member    327.791305
Normal    318.122856
Name: Total, dtype: float64
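A mean on its own does not show how many transactions stand behind each group. A possible refinement, sketched here on the five preview rows only (so the numbers differ from the full-dataset output above), reports the count alongside the mean using named aggregation. The column names `transactions` and `avg_total` are illustrative.

```python
import pandas as pd

# Customer type and Total for the five preview rows
sample = pd.DataFrame({
    "Customer type": ["Member", "Normal", "Normal", "Member", "Normal"],
    "Total": [548.9715, 80.22, 340.5255, 489.048, 634.3785],
})

# Count and mean per customer type in a single pass
summary = sample.groupby("Customer type")["Total"].agg(
    transactions="count",
    avg_total="mean",
)
print(summary)
```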

### Sales Trends Analysis
We examine which days of the week and which hours of the day generate the most revenue.

In [19]:
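# Extract day of the week and hour
data['DayOfWeek'] = data['Date'].dt.day_name()
# 'Time' holds datetime.time objects after the earlier conversion, so cast to str before re-parsing
data['Hour'] = pd.to_datetime(data['Time'].astype(str), format='%H:%M:%S').dt.hour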
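A monthly breakdown can be derived from the same `Date` column. The sketch below groups by calendar month with `dt.to_period('M')`, using only the five preview rows as stand-in data, so the totals are illustrative and not full-dataset results.

```python
import pandas as pd

# Date and Total for the five preview rows
sample = pd.DataFrame({
    "Date": pd.to_datetime(
        ["2019-01-05", "2019-03-08", "2019-03-03", "2019-01-27", "2019-02-08"]
    ),
    "Total": [548.9715, 80.22, 340.5255, 489.048, 634.3785],
})

# Group by calendar month and sum the transaction totals
monthly_sales = sample.groupby(sample["Date"].dt.to_period("M"))["Total"].sum()

# Use plain 'YYYY-MM' strings as labels so the result exports cleanly
monthly_sales.index = monthly_sales.index.astype(str)
print(monthly_sales)
```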

In [21]:
# Sales by Day of the Week
sales_by_day = data.groupby('DayOfWeek')['Total'].sum().sort_values(ascending=False)
sales_by_day

DayOfWeek
Saturday     56120.8095
Tuesday      51482.2455
Thursday     45349.2480
Sunday       44457.8925
Friday       43926.3405
Wednesday    43731.1350
Monday       37899.0780
Name: Total, dtype: float64
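Sorting by revenue highlights the peak day but scrambles the week. For charting, calendar order is often easier to read. A minimal sketch, rebuilding the Series from the output above and reordering it with `reindex` (the names `day_order` and `sales_by_weekday` are illustrative):

```python
import pandas as pd

# Values copied from the output above
sales_by_day = pd.Series(
    {
        "Saturday": 56120.8095,
        "Tuesday": 51482.2455,
        "Thursday": 45349.2480,
        "Sunday": 44457.8925,
        "Friday": 43926.3405,
        "Wednesday": 43731.1350,
        "Monday": 37899.0780,
    },
    name="Total",
)

# Reorder from Monday to Sunday instead of by revenue
day_order = [
    "Monday", "Tuesday", "Wednesday", "Thursday",
    "Friday", "Saturday", "Sunday",
]
sales_by_weekday = sales_by_day.reindex(day_order)
print(sales_by_weekday)
```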

In [23]:
# Sales by Hour of the Day
sales_by_hour = data.groupby('Hour')['Total'].sum().sort_values(ascending=False)
sales_by_hour

Hour
19    39699.5130
13    34723.2270
10    31421.4810
15    31179.5085
14    30828.3990
11    30377.3295
12    26065.8825
18    26030.3400
16    25226.3235
17    24445.2180
20    22969.5270
Name: Total, dtype: float64

## 4. Visualization Preparation for Tableau
To visualize insights in Tableau, we export key summaries such as product revenue and sales trends.

In [26]:
# Export product revenue and sales by day and hour for Tableau
product_revenue.to_csv('ProductRevenue.csv', index=True)
sales_by_day.to_csv('SalesByDay.csv', index=True)
sales_by_hour.to_csv('SalesByHour.csv', index=True)
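When a Series is exported directly, the CSV header comes from the index name and the Series name. If more explicit column names are wanted in Tableau, one option is to convert the Series to a DataFrame first. The sketch below writes to an in-memory string instead of a file and reads it back to confirm the layout. The column name `Revenue` is a hypothetical choice.

```python
import io

import pandas as pd

# Values copied from the product revenue output earlier in the notebook
product_revenue = pd.Series(
    {
        "Food and beverages": 56144.8440,
        "Sports and travel": 55122.8265,
        "Electronic accessories": 54337.5315,
        "Fashion accessories": 54305.8950,
        "Home and lifestyle": 53861.9130,
        "Health and beauty": 49193.7390,
    },
    name="Total",
)
product_revenue.index.name = "Product line"

# Turn the index into a regular column and rename the value column
export = product_revenue.reset_index().rename(columns={"Total": "Revenue"})

# to_csv() without a path returns the CSV text; read it back to verify
csv_text = export.to_csv(index=False)
round_trip = pd.read_csv(io.StringIO(csv_text))
print(round_trip)
```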

## 5. Conclusion
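This notebook analyzed Walmart sales data by product line, customer type, and time of sale. Food and beverages generated the most revenue (about 56,145) and Health and beauty the least (about 49,194). Members spent slightly more per transaction than Normal customers (327.79 vs. 318.12 on average). Sales peaked on Saturdays and during the 19:00 hour. The exported summaries can now be visualized in Tableau.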