## ***Exploratory Data Analysis on Global_SuperStore***

### ***Objective***
Conduct a detailed analysis of the Global Superstore Database to uncover valuable insights across multiple business dimensions, such as sales trends, profitability, customer behavior, product performance, and logistics. This project aims to enable data-driven decisions that improve profitability, customer satisfaction, and operational efficiency.

### ***Scope of Analysis***

1.	**Sales Trends and Patterns:** Identify seasonal fluctuations and observe long-term trends in sales across different product categories and regions.

2.	**Profitability Analysis:** Determine profit margins across product categories, customer segments, and regions to pinpoint high-margin areas.

3.	**Customer Segmentation:** Segment customers by purchasing behavior to identify high-value customer groups and their preferences.

4.	**Product Performance:** Assess the performance of products and categories to identify top sellers and underperforming items.

5.	**Regional Analysis:** Evaluate sales and profitability by region to identify growth opportunities and regional customer preferences.

6.	**Shipping Dynamics:** Analyze shipping preferences, delivery times, and costs to optimize logistics and enhance customer satisfaction.

7.	**Promotional Effectiveness:** Evaluate the impact of promotions and discounts on sales and profitability.

8.	**Return Analysis:** Investigate return rates and underlying reasons to minimize returns and improve product quality.

#### ***Importing Necessary Libraries***
Import the libraries needed for this analysis, covering data loading, statistical analysis, data visualization, data transformation, and merges and joins.

NumPy and pandas are used for data manipulation and numerical calculations.

Matplotlib and Seaborn are used for data visualization.

`mysql.connector` is used to connect to the MySQL database that holds the dataset.

In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import mysql.connector

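# Suppress all warnings to keep the notebook output clean.
# Note: this also hides deprecation and compatibility warnings from pandas.
import warnings
warnings.filterwarnings('ignore')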

### ***Connecting to MySQL from Python***

In [2]:
Conn = mysql.connector.connect(
    host='localhost',               # host name
    user='root',                    # the user who has privilege to the database
    password='root',                # password for user
    database='eda_practice',        # database name
)
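
The connection cell above hard-codes the credentials. As a minimal sketch of one alternative, the helper below builds the same keyword arguments from environment variables. The function name `load_db_config` and the variable names `EDA_DB_HOST`, `EDA_DB_USER`, `EDA_DB_PASSWORD` and `EDA_DB_NAME` are hypothetical and are not part of the original notebook; the defaults simply mirror the values used above.

```python
import os


def load_db_config(env=os.environ):
    """Build connection keyword arguments from environment variables.

    The variable names (EDA_DB_HOST, ...) are hypothetical. Defaults mirror
    the values hard-coded in the notebook, except the password, which has
    no default on purpose.
    """
    password = env.get("EDA_DB_PASSWORD")
    if password is None:
        raise KeyError("EDA_DB_PASSWORD is not set")
    return {
        "host": env.get("EDA_DB_HOST", "localhost"),
        "user": env.get("EDA_DB_USER", "root"),
        "password": password,
        "database": env.get("EDA_DB_NAME", "eda_practice"),
    }


# Usage (needs a running MySQL server, so it is left commented out):
# import mysql.connector
# Conn = mysql.connector.connect(**load_db_config())
```

Keeping the password out of the notebook means the file can be shared or committed without exposing it.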

In [3]:
query = 'select * from global_superstore'

In [4]:
df = pd.read_sql(query, Conn)
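
The load step above follows a general pattern: open a connection, pass a query to `pd.read_sql`, and close the connection afterwards. The sketch below shows that pattern on an in-memory SQLite database so that it runs without a MySQL server; the SQLite substitution and the three-column sample (values copied from the two rows previewed in this notebook) are assumptions made for illustration. pandas documents SQLAlchemy connectables and `sqlite3` connections as its supported inputs, so a raw `mysql.connector` connection may emit a `UserWarning`, which the warnings filter set earlier would hide.

```python
import sqlite3
from contextlib import closing

import pandas as pd

# Small stand-in for the real table: three columns, two rows.
sample = pd.DataFrame(
    {
        "Order ID": ["CA-2014-AB10015140-41954", "IN-2014-JR162107-41675"],
        "Order Date": ["2014-11-11", "2014-02-05"],
        "Sales": [221.98, 3709.4],
    }
)

# closing() guarantees the connection is closed even if the query fails.
with closing(sqlite3.connect(":memory:")) as conn:
    sample.to_sql("global_superstore", conn, index=False)
    loaded = pd.read_sql(
        "select * from global_superstore",
        conn,
        parse_dates=["Order Date"],  # convert the text column to datetime64
    )

print(loaded.dtypes)
```

Parsing dates at load time avoids a separate conversion step before any time-based grouping.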

In [5]:
df.head(2)

| | Row ID | Order ID | Order Date | Ship Date | Ship Mode | Customer ID | Customer Name | Segment | Postal Code | City | ... | Category | Sub-Category | Product Name | Sales | Quantity | Discount | Profit | Shipping Cost | Order Priority | Returned |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 40098 | CA-2014-AB10015140-41954 | 2014-11-11 | 2014-11-13 | First Class | AB-100151402 | Aaron Bergman | Consumer | 73120.0 | Oklahoma City | ... | Technology | Phones | Samsung Convoy 3 | 221.98 | 2 | 0.0 | 62.1544 | 40.77 | High | |
| 1 | 26341 | IN-2014-JR162107-41675 | 2014-02-05 | 2014-02-07 | Second Class | JR-162107 | Justin Ritter | Corporate | | Wollongong | ... | Furniture | Chairs | Novimex Executive Leather Armchair, Black | 3709.4 | 9 | 0.1 | -288.765 | 923.63 | Critical | |
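
The two previewed rows already hint at quantities the scope calls for, such as profit margin and shipping time. The sketch below rebuilds a few of their columns by hand and derives both. The column names `Shipping Days` and `Profit Margin` are hypothetical, and treating the blank `Postal Code` in the second row as a missing value is an assumption.

```python
import pandas as pd

# A few columns copied from the two rows shown by df.head(2).
preview = pd.DataFrame(
    {
        "Order Date": ["2014-11-11", "2014-02-05"],
        "Ship Date": ["2014-11-13", "2014-02-07"],
        "Postal Code": [73120.0, None],  # blank in the output, assumed missing
        "Sales": [221.98, 3709.4],
        "Profit": [62.1544, -288.765],
    }
)

# Convert the date strings so that they can be subtracted.
for col in ("Order Date", "Ship Date"):
    preview[col] = pd.to_datetime(preview[col])

# Days between order and shipment (hypothetical column name).
preview["Shipping Days"] = (preview["Ship Date"] - preview["Order Date"]).dt.days

# Profit as a fraction of sales (hypothetical column name).
preview["Profit Margin"] = preview["Profit"] / preview["Sales"]

# Count of missing values per column.
missing = preview.isna().sum()

print(preview[["Shipping Days", "Profit Margin"]])
print(missing)
```

On these two rows the second order has a negative margin despite much higher sales, which is the kind of pattern the profitability analysis is meant to surface across the full table.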


#### ***Understanding the Dataset***