# 4. Combinations and Permutations

## Overview
This project focuses on **Combinations and Permutations**. It uses Python's `itertools` library together with `pandas` and `numpy` to enumerate groupings and orderings of rows in a synthetic store dataset, and it computes a metric for each grouping or ordering, such as average sales, cumulative profit, or total customer count.

## Key Topics
- **Combinations Analysis**: Evaluating different combinations of data (e.g., days, customers) and calculating associated metrics.
- **Permutations Analysis**: Exploring the effect of different orderings of data on cumulative metrics like profit or sales.
- **Customer Groupings**: Investigating the impact of customer count groupings on profitability and sales.
- **Optimization**: Using permutations to determine the most profitable arrangements of days or campaigns.

## Techniques Used
- **Python Libraries**: 
  - `pandas` for data manipulation.
  - `itertools` for generating combinations and permutations.
  - `numpy` for numerical operations.
- **Statistical Metrics**:
  - Mean, sum, and cumulative profit.
  - Filtering data based on customer groups or specific conditions.
- **Optimization Logic**:
  - Identifying the best permutation for maximum cumulative profit.
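
As a quick illustration of the distinction the topics above rely on, the sketch below contrasts `itertools.combinations` (order ignored) with `itertools.permutations` (order respected) on a tiny list. The `days` list is a made-up example, not part of the project dataset, and `math.comb` / `math.perm` assume Python 3.8 or later.

```python
import itertools
import math

# Hypothetical labels, used only to keep the example small
days = ["Mon", "Tue", "Wed", "Thu"]

# Combinations ignore order: ("Mon", "Tue") and ("Tue", "Mon") count once
pairs = list(itertools.combinations(days, 2))

# Permutations respect order: both orderings are listed separately
ordered_pairs = list(itertools.permutations(days, 2))

# The generated counts agree with the closed-form formulas
print(len(pairs), math.comb(len(days), 2))          # 6 6
print(len(ordered_pairs), math.perm(len(days), 2))  # 12 12
```

Because permutation counts grow factorially, the cells below work on small subsets or only print the first few results.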

---

In [2]:
import pandas as pd
import numpy as np
# Create a simple dataset with date, transactions, sales, profit, and customer count
data = {
    'date': pd.date_range(start='2023-01-01', periods=100, freq='D'),  # Dates over 100 days
    'transactions': np.random.randint(500, 5000, size=100),  # Random number of transactions
    'sales': np.random.uniform(10000, 50000, size=100),  # Random sales amounts
    'profit': np.random.uniform(5000, 20000, size=100),  # Random profit amounts
    'customer_count': np.random.randint(100, 1000, size=100)  # Random customer count
}

# Create the DataFrame
df = pd.DataFrame(data)

# Save the data to a CSV file (optional)
df.to_csv('store_data_with_additional_columns.csv', index=False)

# Show the first few rows of the data
print(df.head())
print(df.tail())
df.info()  # info() prints its report directly and returns None, so no print() is needed
print(df.describe())

        date  transactions         sales        profit  customer_count
0 2023-01-01          1315  41900.604509  18912.743809             731
1 2023-01-02          4465  32038.797367  15164.608433             312
2 2023-01-03          4284  46054.127015  18502.852596             651
3 2023-01-04          1146  37900.477039  11204.205096             678
4 2023-01-05          2913  11012.004290   7426.890967             545
         date  transactions         sales        profit  customer_count
95 2023-04-06           762  40848.890171  16660.199992             136
96 2023-04-07          4329  30615.807823   7971.934981             585
97 2023-04-08          1932  42410.023095  17415.706865             591
98 2023-04-09          3168  13527.452595   6794.895390             779
99 2023-04-10          1140  21334.785420  10084.305423             864
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 100 entries, 0 to 99
Data columns (total 5 columns):
 #   Column          Non-Null Count  Dt
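
The dataset above is drawn without a fixed seed, so every run produces different numbers. If reproducible output is wanted, one possible approach is sketched below using NumPy's `default_rng`. The helper name `make_store_data` and the seed value `42` are assumptions made for illustration, and a seeded run will not reproduce the figures printed in this document.

```python
import numpy as np
import pandas as pd

def make_store_data(n_days=100, seed=42):
    """Build the same kind of synthetic store dataset from a seeded generator."""
    rng = np.random.default_rng(seed)  # the seed value is an arbitrary choice
    return pd.DataFrame({
        "date": pd.date_range(start="2023-01-01", periods=n_days, freq="D"),
        "transactions": rng.integers(500, 5000, size=n_days),
        "sales": rng.uniform(10000, 50000, size=n_days),
        "profit": rng.uniform(5000, 20000, size=n_days),
        "customer_count": rng.integers(100, 1000, size=n_days),
    })

# Two calls with the same seed yield identical data
df_a = make_store_data()
df_b = make_store_data()
print(df_a.equals(df_b))  # True
```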

### Analyzing 3-Day Combinations of Daily Data

In [3]:
import itertools
import pandas as pd

# Lazily generate combinations of 3 days; C(100, 3) = 161,700, so avoid building the full list
day_combinations = itertools.combinations(df.index, 3)

# Calculate the average sales and profit for the first 5 combinations only
for combo in itertools.islice(day_combinations, 5):
    days = df.loc[list(combo)]
    avg_sales = days['sales'].mean()
    avg_profit = days['profit'].mean()
    print(f"Days: {combo}, Avg Sales: {avg_sales}, Avg Profit: {avg_profit}")

Days: (0, 1, 2), Avg Sales: 39997.84296360825, Avg Profit: 17526.73494601004
Days: (0, 1, 3), Avg Sales: 37279.959638212706, Avg Profit: 15093.852446078186
Days: (0, 1, 4), Avg Sales: 28317.13538879725, Avg Profit: 13834.747736581872
Days: (0, 1, 5), Avg Sales: 36009.36832433793, Avg Profit: 15658.646405360114
Days: (0, 1, 6), Avg Sales: 37239.14353627973, Avg Profit: 16493.834624869705
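
The cell above only prints the first few combinations. To rank every combination instead, a brute-force scan along the following lines could be used. This is a minimal sketch: `best_combination` is a hypothetical helper and the `profits` list holds made-up values rather than rows from `df`.

```python
import itertools

def best_combination(values, k):
    """Return (positions, mean) of the k-item combination with the highest mean."""
    best = max(
        itertools.combinations(range(len(values)), k),
        key=lambda idx: sum(values[i] for i in idx),
    )
    return best, sum(values[i] for i in best) / k

# Illustrative profits for five days (not taken from the dataset)
profits = [120.0, 95.0, 180.0, 60.0, 150.0]
combo, avg = best_combination(profits, 3)
print(combo, avg)  # (0, 2, 4) 150.0
```

For a plain mean the winner is simply the `k` largest values, so `df['profit'].nlargest(3)` would give the same days far more cheaply. The exhaustive scan is only worthwhile when the metric depends on how the chosen days interact.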


### Permutations of Daily Sales Patterns

In [4]:
from itertools import permutations

# Select a subset of sales data for permutation (5 values give 5! = 120 orderings)
sales_subset = df['sales'][:5]  # Taking the first 5 days of sales

# Generate permutations of the sales data
sales_permutations = list(permutations(sales_subset))

# Calculate the cumulative sales for each permutation.
# The last cumulative value is the plain total, so it is identical for every ordering;
# only the intermediate running values depend on the order.
for perm in sales_permutations[:5]:  # Limiting to first 5 permutations for brevity
    cumulative_sales = np.cumsum(perm)  # Running total of sales in this order
    print(f"Sales permutation: {perm}, Cumulative Sales: {cumulative_sales[-1]}")

Sales permutation: (41900.604509024524, 32038.79736692229, 46054.127014877944, 37900.4770386913, 11012.004290444936), Cumulative Sales: 168906.010219961
Sales permutation: (41900.604509024524, 32038.79736692229, 46054.127014877944, 11012.004290444936, 37900.4770386913), Cumulative Sales: 168906.010219961
Sales permutation: (41900.604509024524, 32038.79736692229, 37900.4770386913, 46054.127014877944, 11012.004290444936), Cumulative Sales: 168906.010219961
Sales permutation: (41900.604509024524, 32038.79736692229, 37900.4770386913, 11012.004290444936, 46054.127014877944), Cumulative Sales: 168906.010219961
Sales permutation: (41900.604509024524, 32038.79736692229, 11012.004290444936, 46054.127014877944, 37900.4770386913), Cumulative Sales: 168906.010219961
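
Every permutation above ends on the same final value because addition does not depend on order. An ordering only matters for metrics that look at the running total along the way. The sketch below uses one such metric, the number of days needed to reach a sales target. The helper `days_to_reach`, the three sales figures, and the target are all assumptions chosen for illustration.

```python
import itertools
import numpy as np

def days_to_reach(order, target):
    """Return the 1-based day on which the running total first meets the target, else None."""
    running = np.cumsum(order)
    hits = np.nonzero(running >= target)[0]
    return int(hits[0]) + 1 if hits.size else None

# Illustrative daily sales and target (not taken from the dataset)
sales = [400.0, 100.0, 300.0]
target = 500.0

for perm in itertools.permutations(sales):
    # The total is the same for every ordering; the day the target is hit is not
    print(perm, float(np.sum(perm)), days_to_reach(perm, target))
```

Here the orderings that leave the 400.0 day until last reach the target on day 3, and all others reach it on day 2.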


### Combinations of Marketing Campaigns and Their Impact on Sales

In [5]:
# Assume that each day has a unique marketing campaign
# C(100, 4) = 3,921,225 combinations, so generate them lazily instead of building a list
campaign_combinations = itertools.combinations(df.index, 4)  # 4-day combinations

# Analyze the sales impact for the first 5 campaign combinations
for combo in itertools.islice(campaign_combinations, 5):
    campaign_days = df.loc[list(combo)]
    total_sales = campaign_days['sales'].sum()
    print(f"Campaign Days: {combo}, Total Sales: {total_sales}")

Campaign Days: (0, 1, 2, 3), Total Sales: 157894.00592951605
Campaign Days: (0, 1, 2, 4), Total Sales: 131005.5331812697
Campaign Days: (0, 1, 2, 5), Total Sales: 154082.23198789172
Campaign Days: (0, 1, 2, 6), Total Sales: 157771.55762371712
Campaign Days: (0, 1, 2, 7), Total Sales: 168020.31160141947
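
With millions of 4-day combinations, keeping only the best few while streaming through the generator is one way to rank them without holding them all in memory. The sketch below combines `itertools.combinations` with `heapq.nlargest` for that purpose. `top_combinations` is a hypothetical helper and the `sales` values are made up.

```python
import heapq
import itertools
import math

def top_combinations(values, k, n):
    """Return the n highest-scoring k-item combinations as (total, positions) pairs."""
    scored = (
        (sum(values[i] for i in idx), idx)
        for idx in itertools.combinations(range(len(values)), k)
    )
    # nlargest consumes the generator while keeping only n items at a time
    return heapq.nlargest(n, scored)

print(math.comb(100, 4))  # 3921225 four-day combinations for 100 days

# Illustrative sales for five campaign days (not taken from the dataset)
sales = [10.0, 40.0, 25.0, 5.0, 30.0]
print(top_combinations(sales, 2, 3))
# [(70.0, (1, 4)), (65.0, (1, 2)), (55.0, (2, 4))]
```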


### Customer Count Groupings and Their Effect on Profit

In [6]:
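# Analyze the effect of different customer groupings (using combinations)
# C(100, 5) = 75,287,520 combinations, far too many to hold in a list, so iterate lazily
customer_combinations = itertools.combinations(df['customer_count'], 5)  # combinations of 5 daily customer counts

# Calculate the profit and sales for the first 5 groupings
# Note: isin() matches by value, so days that share a customer count are all included
for combo in itertools.islice(customer_combinations, 5):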
    avg_customers = np.mean(combo)
    related_days = df[df['customer_count'].isin(combo)]
    total_sales = related_days['sales'].sum()
    total_profit = related_days['profit'].sum()
    print(f"Customer Group: {combo}, Avg Customers: {avg_customers}, Total Sales: {total_sales}, Total Profit: {total_profit}")

Customer Group: (731, 312, 651, 678, 545), Avg Customers: 583.4, Total Sales: 168906.010219961, Total Profit: 71211.30090115542
Customer Group: (731, 312, 651, 678, 608), Avg Customers: 596.0, Total Sales: 191982.709026583, Total Profit: 76682.99690749016
Customer Group: (731, 312, 651, 678, 714), Avg Customers: 617.2, Total Sales: 195672.0346624084, Total Profit: 79188.56156601892
Customer Group: (731, 312, 651, 678, 570), Avg Customers: 588.4, Total Sales: 205920.78864011075, Total Profit: 74470.36326895427
Customer Group: (731, 312, 651, 678, 693), Avg Customers: 613.0, Total Sales: 183714.94009665918, Total Profit: 82987.34381623668
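
The cell above looks rows up by customer-count value, which silently pulls in extra days whenever two days share the same count. The sketch below shows the difference between matching by value and selecting by row label. The `toy` DataFrame is invented for the demonstration and deliberately contains a duplicate count.

```python
import pandas as pd

# Toy data with a repeated customer count (rows 0 and 2 both have 300)
toy = pd.DataFrame({
    "customer_count": [300, 450, 300, 500],
    "profit": [10.0, 20.0, 30.0, 40.0],
})

combo_labels = (0, 1)  # the two days we actually want to group
combo_counts = tuple(toy.loc[list(combo_labels), "customer_count"])

# Matching by value also picks up row 2, because its count is 300 as well
by_value = toy[toy["customer_count"].isin(combo_counts)]["profit"].sum()

# Selecting by row label keeps exactly the chosen days
by_label = toy.loc[list(combo_labels), "profit"].sum()

print(by_value, by_label)  # 60.0 30.0
```

Building the combinations from `df.index` and selecting with `df.loc`, as the day and campaign cells do, avoids this ambiguity.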


### Store Optimizer - Find the Best Days for Maximum Profit

In [7]:
# Select a subset of days for which we will calculate the permutations
subset_df = df.head(5)  # Taking the first 5 days for example

# Generate permutations of the subset of days
day_permutations = list(permutations(subset_df.index, len(subset_df)))

# Create a variable to store the best permutation and the maximum cumulative profit
best_permutation = None
max_cumulative_profit = float('-inf')  # Also works if every total were negative

# Iterate through each permutation
for perm in day_permutations:
    # Get the corresponding rows for the current permutation
    permuted_days = subset_df.loc[list(perm)]
    
    # Calculate the total profit for the current permutation
    # (a plain sum does not depend on order, so all permutations of the same days tie)
    cumulative_profit = permuted_days['profit'].sum()
    
    # Check if the current permutation gives a higher cumulative profit
    if cumulative_profit > max_cumulative_profit:
        max_cumulative_profit = cumulative_profit
        best_permutation = perm

# Output the best permutation and the maximum cumulative profit
print(f"Best permutation of days: {best_permutation}")
print(f"Maximum cumulative profit: {max_cumulative_profit}")

# Display the details of the best permutation days
print("\nDetails of best permutation days:")
print(subset_df.loc[list(best_permutation)])

Best permutation of days: (0, 1, 2, 3, 4)
Maximum cumulative profit: 71211.30090115542

Details of best permutation days:
        date  transactions         sales        profit  customer_count
0 2023-01-01          1315  41900.604509  18912.743809             731
1 2023-01-02          4465  32038.797367  15164.608433             312
2 2023-01-03          4284  46054.127015  18502.852596             651
3 2023-01-04          1146  37900.477039  11204.205096             678
4 2023-01-05          2913  11012.004290   7426.890967             545
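
Because the total profit is the same for every ordering of the same five days, the search above returns the first permutation it tries. A permutation search becomes meaningful once the objective is order-sensitive. The sketch below assumes, purely for illustration, that profit earned on later days is discounted at a hypothetical rate of 10% per day. The profit values are the first five days shown above, rounded to two decimals.

```python
import itertools

def discounted_total(profits, rate=0.1):
    """Sum profits with a per-day discount; the rate is a hypothetical assumption."""
    return sum(p / (1 + rate) ** t for t, p in enumerate(profits))

# Profit for days 0-4, rounded from the table above
profits = {0: 18912.74, 1: 15164.61, 2: 18502.85, 3: 11204.21, 4: 7426.89}

# Try every ordering of the five days and keep the one with the highest discounted total
best = max(
    itertools.permutations(profits),
    key=lambda order: discounted_total([profits[d] for d in order]),
)
print(best)  # (0, 2, 1, 3, 4)
```

Under this objective the best ordering puts the most profitable days first, which matches sorting the days by profit in descending order.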


## Output

- Created combinations of days using `itertools.combinations`.
- Calculated averages (mean) of sales and profits for selected combinations.
- Generated permutations of daily sales patterns using `itertools.permutations`.
- Computed the cumulative sales total for each permutation.
- Analyzed combinations of marketing campaigns using daily data.
- Evaluated total sales for selected marketing campaign combinations.
- Grouped customer counts using combinations and calculated total sales and profits.
- Searched the permutations of a 5-day subset for the maximum total profit.
- Used subset filtering to manage data for computationally intensive tasks like permutations and combinations.
