# Example usage

Welcome to the `sales_analyzer` package! This package is designed to help small businesses analyze their retail sales data efficiently, without needing extensive data analytics expertise. If you've ever felt overwhelmed by tools like pandas or scikit-learn, or wished for more retail-specific functions, you're in the right place.

In this notebook, we'll walk through how to use the `sales_analyzer` package to extract valuable insights from your sales data. We'll demonstrate key functionality on a generated sample dataset, so you can apply the same steps to your own data right away.

## Imports

In [1]:
import random
from datetime import datetime, timedelta

import numpy as np
import pandas as pd

from salesanalyzer.sales_summary_statistics import sales_summary_statistics

## Create sample data

We'll first create a sample dataset to work with.

In [3]:
def generate_random_dates(n):
    """Return n random datetimes from the two years ending now."""
    random.seed(1)
    today = datetime.now()
    days_back = 730  # two years, ignoring leap days
    two_years_ago = today - timedelta(days=days_back)

    # Each date is a random whole number of days (0 to 730) after the start
    random_dates = [
        two_years_ago + timedelta(days=random.randint(0, days_back))
        for _ in range(n)
    ]

    return random_dates

[datetime.datetime(2023, 7, 12, 15, 18, 29, 822362),
 datetime.datetime(2023, 5, 24, 15, 18, 29, 822362),
 datetime.datetime(2023, 10, 22, 15, 18, 29, 822362),
 datetime.datetime(2023, 10, 15, 15, 18, 29, 822362),
 datetime.datetime(2023, 6, 10, 15, 18, 29, 822362),
 datetime.datetime(2023, 1, 25, 15, 18, 29, 822362),
 datetime.datetime(2024, 12, 5, 15, 18, 29, 822362),
 datetime.datetime(2023, 12, 5, 15, 18, 29, 822362),
 datetime.datetime(2023, 4, 29, 15, 18, 29, 822362),
 datetime.datetime(2024, 6, 4, 15, 18, 29, 822362),
 datetime.datetime(2024, 10, 13, 15, 18, 29, 822362),
 datetime.datetime(2023, 8, 14, 15, 18, 29, 822362),
 datetime.datetime(2023, 2, 15, 15, 18, 29, 822362),
 datetime.datetime(2024, 6, 20, 15, 18, 29, 822362),
 datetime.datetime(2023, 3, 19, 15, 18, 29, 822362),
 datetime.datetime(2023, 5, 11, 15, 18, 29, 822362),
 datetime.datetime(2023, 2, 11, 15, 18, 29, 822362),
 datetime.datetime(2024, 2, 23, 15, 18, 29, 822362),
 datetime.datetime(2024, 8, 13, 15, 18, 29, 

In [5]:
def anonymize_data(obs=50):
    """Build a synthetic retail sales DataFrame with `obs` rows."""
    # Seed both generators: the stdlib one and NumPy's (used for Quantity)
    random.seed(1)
    np.random.seed(1)

    df = pd.DataFrame({})

    fake_products = ["Laptop", "Monitor", "Headphone"]
    fake_cities = ["Vancouver", "Toronto", "Calgary"]

    # InvoiceNo: random six-digit invoice numbers (duplicates are possible)
    df['InvoiceNo'] = [f'INV-{random.randint(100000, 999999)}' for _ in range(obs)]

    # StockCode: random four-digit stock codes
    df['StockCode'] = [f'SC{random.randint(1000, 9999)}' for _ in range(obs)]

    # Description: random fake product names
    df['Description'] = [random.choice(fake_products) for _ in range(obs)]

    # Quantity: small positive integers, skewed toward 1
    df['Quantity'] = [int(np.random.exponential(2)) + 1 for _ in range(obs)]

    # InvoiceDate: random dates in the last two years (this call reseeds `random`)
    df['InvoiceDate'] = generate_random_dates(obs)

    # UnitPrice: random prices between 0.50 and 50.00
    df['UnitPrice'] = [round(random.uniform(0.5, 50), 2) for _ in range(obs)]

    # CustomerID: random five-digit identifiers (duplicates are possible)
    df['CustomerID'] = [random.randint(10000, 99999) for _ in range(obs)]

    # Country: filled with random Canadian cities
    df['Country'] = [random.choice(fake_cities) for _ in range(obs)]

    return df
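
Because `generate_random_dates` measures back from `datetime.now()`, the generated dates shift with every run even though the seed is fixed. The following is a minimal sketch of a reproducible variant. The helper name `generate_fixed_dates`, its parameters, and the end date of 2025-01-01 are assumptions made for illustration and are not part of `sales_analyzer`.

```python
import random
from datetime import datetime, timedelta


def generate_fixed_dates(n, end=datetime(2025, 1, 1), span_days=730, seed=1):
    """Hypothetical helper: n random dates in the span_days before a fixed end date."""
    # A local generator keeps the global `random` state untouched
    rng = random.Random(seed)
    start = end - timedelta(days=span_days)
    return [start + timedelta(days=rng.randint(0, span_days)) for _ in range(n)]


dates = generate_fixed_dates(5)
# The same arguments always produce the same dates, whenever the code runs
print(dates == generate_fixed_dates(5))
```

Pinning the end date trades "the last two years" for a fixed window, which is usually the better choice when the output is shown in documentation.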

In [7]:
sample_data = anonymize_data()
sample_data.head()

| | InvoiceNo | StockCode | Description | Quantity | InvoiceDate | UnitPrice | CustomerID | Country |
|---|---|---|---|---|---|---|---|---|
| 0 | INV-223420 | SC7102 | Monitor | 1 | 2023-01-31 15:20:39.610514 | 31.11 | 79215 | Vancouver |
| 1 | INV-583723 | SC9216 | Monitor | 2 | 2023-03-07 15:20:39.610514 | 31.69 | 40284 | Vancouver |
| 2 | INV-700974 | SC2759 | Monitor | 1 | 2024-05-21 15:20:39.610514 | 24.96 | 83268 | Calgary |
| 3 | INV-167910 | SC5663 | Headphone | 4 | 2024-09-20 15:20:39.610514 | 42.22 | 93045 | Toronto |
| 4 | INV-844216 | SC5024 | Headphone | 2 | 2023-08-09 15:20:39.610514 | 38.64 | 48897 | Vancouver |
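
The invoice numbers and customer IDs are drawn independently with `random.randint`, so repeated values are possible in larger samples. If strictly unique identifiers matter for your analysis, one option is to sample without replacement. This is a sketch under that assumption, not how the sample data above was generated; the variable names are illustrative.

```python
import random

rng = random.Random(1)  # local generator with a fixed seed
obs = 50

# random.sample draws without replacement, so no value can repeat
invoice_numbers = [f"INV-{n}" for n in rng.sample(range(100000, 1000000), obs)]
customer_ids = rng.sample(range(10000, 100000), obs)

print(len(set(invoice_numbers)) == obs)
print(len(set(customer_ids)) == obs)
```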


## Get summary statistics

One of the key features of `sales_analyzer` is its ability to quickly generate a sales summary. Use the `sales_summary_statistics` function to generate insights like total revenue, average order value, and top-selling products.

In [8]:
sales_summary_statistics(sample_data)

| | total_revenue | unique_customers | average_order_value | top_selling_product_quantity | top_selling_product_revenue | average_revenue_per_customer |
|---|---|---|---|---|---|---|
| 0 | 3334.85 | 50 | 66.697 | Headphone | Laptop | 66.697 |
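
To make the columns of this summary concrete, here is a rough pandas sketch of how such metrics could be computed by hand. It is not the package's implementation: `summarize_sales` is a hypothetical stand-in, and the definitions used (revenue as `Quantity * UnitPrice`, average order value as revenue per unique invoice) are assumptions that happen to be consistent with the figures above. The input is the five rows shown in the `head()` output.

```python
import pandas as pd


def summarize_sales(df):
    """Hypothetical stand-in for illustration, not sales_summary_statistics."""
    revenue = df["Quantity"] * df["UnitPrice"]  # assumed revenue definition
    total_revenue = revenue.sum()
    unique_customers = df["CustomerID"].nunique()
    # Per-product totals, used to find the top sellers
    by_product = (
        df.assign(Revenue=revenue)
        .groupby("Description")[["Quantity", "Revenue"]]
        .sum()
    )
    return pd.DataFrame([{
        "total_revenue": round(total_revenue, 2),
        "unique_customers": unique_customers,
        "average_order_value": total_revenue / df["InvoiceNo"].nunique(),
        "top_selling_product_quantity": by_product["Quantity"].idxmax(),
        "top_selling_product_revenue": by_product["Revenue"].idxmax(),
        "average_revenue_per_customer": total_revenue / unique_customers,
    }])


# The five rows from the head() output earlier in this notebook
first_rows = pd.DataFrame({
    "InvoiceNo": ["INV-223420", "INV-583723", "INV-700974", "INV-167910", "INV-844216"],
    "Description": ["Monitor", "Monitor", "Monitor", "Headphone", "Headphone"],
    "Quantity": [1, 2, 1, 4, 2],
    "UnitPrice": [31.11, 31.69, 24.96, 42.22, 38.64],
    "CustomerID": [79215, 40284, 83268, 93045, 48897],
})

summary = summarize_sales(first_rows)
print(summary.T)
```

Ranking products by quantity and by revenue separately matters because the two can disagree, as in the full sample above, where Headphone leads on units sold and Laptop on revenue.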
