# **Business Insights Amplification** 

## **1. Business Understanding**
The project, Business Insights Amplification, aims to leverage the accumulated transactional data from the year 2019 to derive meaningful insights that empower our client to enhance sales, improve operational efficiency, and identify growth opportunities. The goal is to transform raw data into actionable intelligence, providing a comprehensive view of the business landscape.

##### **Objective:**
To design and deliver a robust end-to-end Business Intelligence solution for our client, GetINNOtized. Through meticulous analysis of the 2019 transactional data, the project aims to answer critical questions, uncover patterns, and deliver strategic recommendations.

##### **Problem Statement:**
The underutilization of 2019 transactional data poses a critical challenge for our client, hindering the identification of growth opportunities, understanding product performance, and optimizing sales strategies, thereby impeding overall sales and operational efficiency.

**The stakeholders** are GetINNOtized Leadership, the Sales and Marketing Teams, the Logistics Department, the Product Development Team, and the Finance Department. Each will use the insights derived from the 2019 transactional data to drive business growth, optimize sales strategies, and improve operational efficiency.

**Business Analytics Questions to Be Answered:**

1. How much money did we make this year? 

2. Can we identify any seasonality in the sales?

3. What are our best and worst-selling products? 

4. How do sales compare to previous months or weeks? 

5. Which cities are our products delivered to most? 

6. How do product categories compare in revenue generated and quantities ordered?

7. What additional details do the findings in the data reveal?

## **2. Data Understanding**

**I. Installations and Imports**

In [1]:
# Install pyodbc and python-dotenv
%pip install pyodbc
%pip install python-dotenv
# Suppress warning messages in the notebook output
import warnings
warnings.filterwarnings('ignore')







In [2]:
# Import the necessary libraries.
# Database connection and environment variables
import pyodbc
from dotenv import dotenv_values

# Data loading and preparation
import os
import re
import zipfile

import numpy as np
import pandas as pd

# Visualization
import matplotlib.pyplot as plt
import matplotlib.ticker as ticker
import seaborn as sns

**II. Loading Datasets**

The January to June 2019 data is loaded from CSV files inside a zip archive; the later data is read from a SQL Server database.

In [13]:
# Load January to June 2019 from the zip archive
# Path to the zip file
zip_path = "./datasets.zip"
# Open the archive and read each monthly CSV into its own DataFrame
with zipfile.ZipFile(zip_path, 'r') as zip_ref:

    with zip_ref.open('Sales_January_2019.csv') as jan:
        jan_19 = pd.read_csv(jan)

    with zip_ref.open('Sales_February_2019.csv') as feb:
        feb_19 = pd.read_csv(feb)

    with zip_ref.open('Sales_March_2019.csv') as march:
        mar_19 = pd.read_csv(march)

    with zip_ref.open('Sales_April_2019.csv') as april:
        april_19 = pd.read_csv(april)

    with zip_ref.open('Sales_May_2019.csv') as may:
        may_19 = pd.read_csv(may)

    with zip_ref.open('Sales_June_2019.csv') as june:
        june_19 = pd.read_csv(june)
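
The six near-identical `with` blocks above could also be written as a loop. The following is a minimal sketch, not the notebook's own method: the helper name `load_monthly_sales` and the `Source_Month` column are hypothetical, and it assumes every file follows the `Sales_<Month>_2019.csv` naming pattern seen above.

```python
import zipfile

import pandas as pd

# Month names taken from the six file names used in the cell above.
MONTHS = ["January", "February", "March", "April", "May", "June"]


def load_monthly_sales(zip_path, months=MONTHS):
    """Read each Sales_<Month>_2019.csv from the archive and stack them."""
    frames = []
    with zipfile.ZipFile(zip_path, "r") as archive:
        for month in months:
            with archive.open(f"Sales_{month}_2019.csv") as handle:
                frame = pd.read_csv(handle)
            # Hypothetical helper column recording which file a row came from.
            frame["Source_Month"] = month
            frames.append(frame)
    # One combined DataFrame with a fresh 0..n-1 index.
    return pd.concat(frames, ignore_index=True)
```

Calling `load_monthly_sales("./datasets.zip")` would return one DataFrame for the first half of the year. The per-month variables (`jan_19`, `feb_19`, and so on) remain the better choice if later cells need each month separately.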

In [5]:
# Load environment variables from the .env file into a dictionary
environment_variables = dotenv_values('.env')
# Get the values for the credentials set in the '.env' file
database = environment_variables.get("DATABASE")
server = environment_variables.get("SERVER")
username = environment_variables.get("USERNAME")
password = environment_variables.get("PASSWORD")

# Build the ODBC connection string for SQL Server
connection_string = f"DRIVER={{SQL Server}};SERVER={server};DATABASE={database};UID={username};PWD={password}"
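
Because `.get()` returns `None` for a key that is missing from `.env`, the f-string above would silently embed the text `None` in the connection string. A minimal sketch of a guard against that follows; the function name `build_connection_string` and the `REQUIRED_KEYS` tuple are hypothetical, and the function takes any dictionary shaped like the one returned by `dotenv_values`.

```python
# Keys the notebook reads from the .env file.
REQUIRED_KEYS = ("SERVER", "DATABASE", "USERNAME", "PASSWORD")


def build_connection_string(env):
    """Build the SQL Server ODBC string, failing early on missing values."""
    missing = [key for key in REQUIRED_KEYS if not env.get(key)]
    if missing:
        raise ValueError(f"Missing .env entries: {', '.join(missing)}")
    return (
        "DRIVER={SQL Server};"
        f"SERVER={env['SERVER']};"
        f"DATABASE={env['DATABASE']};"
        f"UID={env['USERNAME']};"
        f"PWD={env['PASSWORD']}"
    )
```

With this helper, `build_connection_string(dotenv_values('.env'))` would produce the same string as the cell above when all four entries are present, and raise a readable error otherwise.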

In [6]:
# Connect to the database
connection = pyodbc.connect(connection_string)

In [20]:
# SQL query to select the July 2019 sales table from the database.
# Note: despite its name, second_half_2019 holds only July at this point.
query = 'SELECT * FROM dbo.Sales_July_2019'
second_half_2019 = pd.read_sql(query, connection)

In [21]:
second_half_2019.columns

Index(['Order_ID', 'Product', 'Quantity_Ordered', 'Price_Each', 'Order_Date',
       'Purchase_Address'],
      dtype='object')
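
The database columns use underscores (`Order_ID`, `Quantity_Ordered`, and so on). The CSV column names are not shown in this section; if they turn out to use spaces instead (an assumption, for example `Order ID`), the two sources would need matching names before they can be combined. A minimal sketch, with a hypothetical helper `normalize_columns` and illustrative stand-in data:

```python
import pandas as pd


def normalize_columns(frame):
    """Return a copy whose column names use underscores instead of spaces."""
    renamed = frame.copy()
    renamed.columns = [
        str(col).strip().replace(" ", "_") for col in renamed.columns
    ]
    return renamed


# Tiny stand-in frames; the values are illustrative, not from the dataset.
csv_like = pd.DataFrame(
    {"Order ID": [1], "Quantity Ordered": [2], "Price Each": [3.0]}
)
db_like = pd.DataFrame(
    {"Order_ID": [2], "Quantity_Ordered": [1], "Price_Each": [5.0]}
)

# After renaming, both frames share the same columns and stack cleanly.
combined = pd.concat(
    [normalize_columns(csv_like), db_like], ignore_index=True
)
```

Checking `jan_19.columns` against `second_half_2019.columns` first would confirm whether any renaming is needed at all.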