## Long Sleeve Jersey Filtering & Reporting

This notebook processes the “Details Category” tab from Capelli’s weekly order export to identify which players have (and have not) purchased long sleeve jerseys. It then generates two output files in the **LongSleeveOrders/** directory:

1. **`<DATE>Order_Summary_with_ls_status.csv`**  
   - Aggregated by **Player ID**  
   - Lists each Order ID, first order date, email, club, player name, and a computed long sleeve status (none, home only, away only, or both)  
2. **`Rush Soccer <DATE> - Details_with_ls_status.csv`**  
   - Original detail rows augmented with `Long Sleeve Status`, `Home Count`, and `Away Count`
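
The per-player aggregation described above can be sketched on toy data. This is only an illustration: the product names, the column names `home` and `away`, the short status labels, and the helper `label` are placeholders invented for the example, not the strings used in the real export.

```python
import pandas as pd

# Hypothetical product names standing in for the real catalogue descriptions.
HOME_LS = "HOME LONG SLEEVE JERSEY"
AWAY_LS = "AWAY LONG SLEEVE JERSEY"

# Toy detail rows: one row per purchased item.
details = pd.DataFrame({
    "Player ID": [1, 1, 2, 3, 3, 4],
    "Order ID": [100, 101, 102, 103, 103, 104],
    "Product Name": [HOME_LS, AWAY_LS, "SHORTS", AWAY_LS, "SOCKS", HOME_LS],
})

# Flag long sleeve rows, then sum the flags per player to get item counts.
counts = (
    details.assign(
        home=details["Product Name"].eq(HOME_LS),
        away=details["Product Name"].eq(AWAY_LS),
    )
    .groupby("Player ID")[["home", "away"]]
    .sum()
)

def label(row):
    """Map a player's home/away counts to one of four status labels."""
    if row["home"] > 0 and row["away"] > 0:
        return "both"
    if row["home"] > 0:
        return "home only"
    if row["away"] > 0:
        return "away only"
    return "none"

counts["status"] = counts.apply(label, axis=1)
print(counts)
```

Player 1 bought the two jerseys in separate orders, which is why the grouping key is the player rather than the order.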

### What You Need to Update Weekly

- **Source CSV filename**  
  Change the path in `pd.read_csv(...)` to the newest **Details Category** file under **shippingdates/**, for example:
  ```python
  df = pd.read_csv("shippingdates/Rush Soccer 5.11 - Rush Details Category 2025.csv")
  ```

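Because the date tag appears in both the source filename and the two output filenames, one way to keep them in sync is to derive the outputs from the source path. The sketch below assumes the source file is always named `Rush Soccer <DATE> - ...` with a numeric `M.D` tag; `build_paths` is a hypothetical helper, not part of the notebook.

```python
import re
from pathlib import Path

def build_paths(source_csv, out_dir="LongSleeveOrders"):
    """Return (summary_path, details_path) for a weekly source file."""
    # Pull the date tag (e.g. "5.11") that follows "Rush Soccer " in the filename.
    match = re.search(r"Rush Soccer (\d+\.\d+) -", Path(source_csv).name)
    if match is None:
        raise ValueError(f"No date tag found in {source_csv!r}")
    tag = match.group(1)
    out = Path(out_dir)
    return (
        out / f"{tag}Order_Summary_with_ls_status.csv",
        out / f"Rush Soccer {tag} - Details_with_ls_status.csv",
    )

summary_path, details_path = build_paths(
    "shippingdates/Rush Soccer 5.11 - Rush Details Category 2025.csv"
)
print(summary_path)
print(details_path)
```

With a helper like this, only the source path would need editing each week; a filename that does not match the assumed pattern raises an error instead of silently writing to a stale name.
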
In [3]:
import pandas as pd

# Read the CSV file into a DataFrame.
df = pd.read_csv("shippingdates/Rush Soccer 5.4 - Rush Details Category 2025.csv")

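# Restrict the analysis to a single club. The long sleeve product names defined
# below are specific to Rush Nevada, so change both together when switching clubs.
df = df[df["Club Name"] == "Rush Nevada"]
# Keep only the three kit-related categories. The .copy() avoids a
# SettingWithCopyWarning when "Order Date" is reassigned below.
df = df[df["Category"].isin(["Field Players Mandatory Kit", "Goalkeepers Mandatory Kit", "Competitive Items"])].copy()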

# Convert "Order Date" to datetime and filter for orders in 2025.
df["Order Date"] = pd.to_datetime(df["Order Date"], errors="coerce")
df = df[df["Order Date"].dt.year == 2025]

# Define the long sleeve product descriptions.
home_ls = "NEVADA RUSH BROOKLYN II RUSH SOCCER PYRAMIDS LONG SLEEVE MATCH JERSEY PROMO BLUE BLACK WHITE"
away_ls = "NEVADA RUSH BROOKLYN II RUSH SOCCER SCATTERED SHARDE LONG SLEEVE MATCH JERSEY WHITE PROMO GREY BLACK"

# Function to compute long sleeve status and counts from a list of product names.
def compute_ls_status_player(product_names):
    # Convert the list into a pandas Series for easier aggregation.
    s = pd.Series(product_names)
    count_home = s.isin([home_ls]).sum()
    count_away = s.isin([away_ls]).sum()
    total = count_home + count_away
    if count_home > 0 and count_away > 0:
        status = f"Purchased Both Long Sleeves ({total})"
    elif count_home > 0:
        status = f"Purchased Only Home Long Sleeve ({count_home})"
    elif count_away > 0:
        status = f"Purchased Only Away Long Sleeve ({count_away})"
    else:
        status = "Did Not Purchase Long Sleeve"
    return status, count_home, count_away

# Group by "Player ID" so that if a player purchases in multiple orders, they are aggregated together.
grouped = df.groupby("Player ID").agg({
    "Order ID": lambda x: " ; ".join(x.astype(str).unique()),
    "Order Date": "min",             # earliest order date for that player
    "Customer Email": "first",       # assuming one email per player
    "Club Name": "first",
    "Player Name": "first",
    "Product Name": lambda x: list(x)  # aggregate all product names into a list
}).reset_index()

# Compute long sleeve status and counts for each player.
grouped[["Long Sleeve Status", "Home Count", "Away Count"]] = grouped["Product Name"].apply(
    lambda prod_list: pd.Series(compute_ls_status_player(prod_list))
)

# Select the players who purchased at least one long sleeve jersey.
purchased_ls = grouped[grouped["Long Sleeve Status"] != "Did Not Purchase Long Sleeve"]
print("Players who purchased long sleeve jerseys:")
# Uncomment the next line to print the full list:
# print(purchased_ls[["Player ID", "Customer Email", "Player Name", "Long Sleeve Status", "Home Count", "Away Count"]])

# Print summary statistics: count of unique players by Long Sleeve Status.
summary = grouped.groupby("Long Sleeve Status").size()
print("\nSummary Statistics (Unique Players by Long Sleeve Status):")
print(summary)

# Print total counts of Home and Away long sleeve jerseys purchased across all players.
total_home = grouped["Home Count"].sum()
total_away = grouped["Away Count"].sum()
print(f"\nTotal Home Long Sleeve Jerseys Purchased: {total_home}")
print(f"Total Away Long Sleeve Jerseys Purchased: {total_away}")

# Write the aggregated order summary to CSV (the LongSleeveOrders/ directory must already exist).
grouped.to_csv("LongSleeveOrders/5.4Order_Summary_with_ls_status.csv", index=False)

# Merge the per-player status and counts back into the original detail rows by Player ID.
df = df.merge(grouped[["Player ID", "Long Sleeve Status", "Home Count", "Away Count"]],
              on="Player ID", how="left")
df.to_csv("LongSleeveOrders/Rush Soccer 5.4 - Details_with_ls_status.csv", index=False)

print("\nFiles saved: '5.4Order_Summary_with_ls_status.csv' and 'Rush Soccer 5.4 - Details_with_ls_status.csv'")


Players who purchased long sleeve jerseys:

Summary Statistics (Unique Players by Long Sleeve Status):
Long Sleeve Status
Did Not Purchase Long Sleeve           55
Purchased Both Long Sleeves (2)         8
Purchased Only Away Long Sleeve (1)     5
Purchased Only Home Long Sleeve (1)     1
dtype: int64

Total Home Long Sleeve Jerseys Purchased: 9
Total Away Long Sleeve Jerseys Purchased: 13

Files saved: '5.4Order_Summary_with_ls_status.csv' and 'Rush Soccer 5.4 - Details_with_ls_status.csv'
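
As a quick sanity check, the jersey totals can be reconciled against the per-status player counts. The sketch below copies the numbers from the output above; the dictionary keys are shorthand invented for the example. It relies on the fact that every status label in this run carries a count of 1 per jersey type (a total of 2 for "both"), so it would not hold for a week in which a player buys duplicates.

```python
# Player counts copied from the summary output above.
players_by_status = {"none": 55, "both": 8, "away_only": 5, "home_only": 1}

total_players = sum(players_by_status.values())
buyers = total_players - players_by_status["none"]

# Each "both" player bought one home and one away jersey;
# each single-kit player bought exactly one jersey.
expected_home = players_by_status["both"] + players_by_status["home_only"]
expected_away = players_by_status["both"] + players_by_status["away_only"]

print(f"Players: {total_players}, buyers: {buyers} ({buyers / total_players:.1%})")
print(f"Expected home jerseys: {expected_home}")  # reported total: 9
print(f"Expected away jerseys: {expected_away}")  # reported total: 13
```

Both expected values match the reported totals of 9 home and 13 away jerseys, and 14 of the 69 players bought at least one long sleeve.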
