# Analyzing Website Traffic Data

## Scenario:
You're a data analyst for a company that runs an e-commerce website. You’ve been given traffic data for three website features (e.g., Homepage, Product Page, Checkout Page) across six days (Monday to Saturday). The data is stored in a 2D NumPy array where each row represents a feature and each column represents the number of user visits on a given day.

## Input Data:  
```python
import numpy as np

traffic_data = np.array([
    [500, 520, 480, 550, 600, 620],  # Homepage
    [300, 310, 290, 340, 350, 360],  # Product Page
    [100, 90, 110, 120, 130, 150]    # Checkout Page
])
```

## Tasks:

### 1. Calculate Total Visits per Feature:  
Compute the total number of visits for each website feature across all six days.  

Output a 1D array with three values (one for each feature):
$$
\sum_{d=1}^{6} x_{fd}, \quad f \in \{\text{Homepage, Product Page, Checkout Page}\}
$$

### 2. Find the Day with the Most Traffic:  
Sum the visits across all features for each day and identify which day had the highest total traffic.  

Output the index of that day (0 for Monday, 1 for Tuesday, etc.):
$$
\arg\max_{d} \sum_{f} x_{fd}
$$

### 3. Detect Features with Traffic Spikes:  
A "traffic spike" occurs when the visits on a day are at least 10% higher than the visits on the previous day. For each feature (row) in the `traffic_data` array, identify every day on which a traffic spike occurs.

Output a list of arrays, where each array contains the day indices of traffic spikes for that feature. Indices run from 0 (Monday) to 5 (Saturday), but Monday has no previous day in the data, so only indices 1 to 5 can appear.
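
In the notation of the other tasks, the spike condition can be written as follows. This is a sketch: the set name $S_f$ is introduced here for illustration only, and $d$ is taken to be the zero-based day index used in the output.
$$
S_f = \{\, d \in \{1, \dots, 5\} : x_{fd} \ge 1.10 \, x_{f,d-1} \,\}
$$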



### 4. Calculate Daily Variability:  
For each feature, compute the standard deviation of visits across the six days to measure how much traffic fluctuates.  

Output a 1D array with three values (one for each feature):
$$
\sigma_f = \sqrt{\frac{1}{6} \sum_{d=1}^{6} (x_{fd} - \mu_f)^2}
$$
where $\mu_f$ is the mean traffic for feature $f$.
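
The formula above divides by 6 rather than 5, so it is the population standard deviation. A minimal sketch of how it could be computed, assuming `np.std` with its default `ddof=0` is acceptable; the names `daily_std` and `manual_std` are illustrative and not part of the task statement:

```python
import numpy as np

traffic_data = np.array([
    [500, 520, 480, 550, 600, 620],  # Homepage
    [300, 310, 290, 340, 350, 360],  # Product Page
    [100, 90, 110, 120, 130, 150],   # Checkout Page
])

# Population standard deviation per feature (row).
# ddof=0 is NumPy's default and matches the 1/6 factor in the formula.
daily_std = np.std(traffic_data, axis=1, ddof=0)

# The same quantity written out term by term, as a cross-check.
mu = traffic_data.mean(axis=1, keepdims=True)
manual_std = np.sqrt(((traffic_data - mu) ** 2).mean(axis=1))

print(daily_std)
```

Passing `ddof=1` instead would give the sample standard deviation, which divides by 5 and is therefore slightly larger.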

### 5. Rank Days by Traffic for Each Feature:  
For each feature, sort the days by the number of visits in descending order and return the indices of the days.  

Output a 2D array where each row corresponds to a feature and each column is the day index in ranked order:
$$
\text{argsort}(x_f)_{\text{descending}}
$$

## Expected Output:

- **Task 1**: A 1D array of shape (3,) with total visits per feature.
- **Task 2**: An integer representing the index of the day with the highest total traffic.
- **Task 3**: A list of three arrays, one per feature, each holding the day indices on which that feature had a traffic spike.
- **Task 4**: A 1D array of shape (3,) with the standard deviation of visits for each feature.
- **Task 5**: A 2D array of shape (3, 6) with day indices ranked by traffic for each feature.



In [1]:
import numpy as np
traffic_data = np.array([
    [500, 520, 480, 550, 600, 620],  # Homepage
    [300, 310, 290, 340, 350, 360],  # Product Page
    [100, 90, 110, 120, 130, 150]    # Checkout Page
])
traffic_data

array([[500, 520, 480, 550, 600, 620],
       [300, 310, 290, 340, 350, 360],
       [100,  90, 110, 120, 130, 150]])

## Total Visits per Feature  



In [3]:
np.sum(traffic_data,axis=1)

array([3270, 1950,  700])

## Find the Day with the Most Traffic:



In [9]:
# Day number (1 = Monday) with the highest total across all features
print(f"day - {np.argmax(traffic_data.sum(axis=0)) + 1}")

day - 6
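
The task statement asks for a zero-based index (0 for Monday), whereas the cell above reports a one-based day number. A short sketch of the zero-based form follows; `daily_totals`, `busiest_day`, and the `day_names` list are illustrative helpers:

```python
import numpy as np

traffic_data = np.array([
    [500, 520, 480, 550, 600, 620],  # Homepage
    [300, 310, 290, 340, 350, 360],  # Product Page
    [100, 90, 110, 120, 130, 150],   # Checkout Page
])
day_names = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday"]

# Sum over features (axis=0) to get one total per day.
daily_totals = traffic_data.sum(axis=0)

# argmax returns the first position of the maximum, i.e. the zero-based day index.
busiest_day = int(np.argmax(daily_totals))

print(busiest_day, day_names[busiest_day])  # 5 Saturday
```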


## Detect Features with Traffic Spikes:  



In [11]:
traffic_data

array([[500, 520, 480, 550, 600, 620],
       [300, 310, 290, 340, 350, 360],
       [100,  90, 110, 120, 130, 150]])

In [27]:
# Percentage change relative to the previous day.
# Monday has no previous day, so its column stays 0.
pct_change = np.zeros(traffic_data.shape)
pct_change[:, 1:] = np.diff(traffic_data, axis=1) / traffic_data[:, :-1] * 100
pct_change

array([[  0.        ,   4.        ,  -7.69230769,  14.58333333,
          9.09090909,   3.33333333],
       [  0.        ,   3.33333333,  -6.4516129 ,  17.24137931,
          2.94117647,   2.85714286],
       [  0.        , -10.        ,  22.22222222,   9.09090909,
          8.33333333,  15.38461538]])

In [32]:
# "At least 10% higher" means >= 10
row,coln= np.where(pct_change>=10)
row,coln

(array([0, 1, 2, 2], dtype=int64), array([3, 3, 2, 5], dtype=int64))

In [34]:
pct_change[row,coln]

array([14.58333333, 17.24137931, 22.22222222, 15.38461538])

In [36]:
print("traffic spikes on")
for i,j  in zip(row,coln):
    print(f"page{i+1}, day {j+1}")

traffic spikes on
page1, day 4
page2, day 4
page3, day 3
page3, day 6
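
The task statement asks for a list of arrays of zero-based day indices rather than printed text. One way to build that structure is sketched below, using an integer comparison so that a rise of exactly 10% is not lost to floating-point rounding. The names `is_spike`, `spike_days`, and `has_spike` are illustrative:

```python
import numpy as np

traffic_data = np.array([
    [500, 520, 480, 550, 600, 620],  # Homepage
    [300, 310, 290, 340, 350, 360],  # Product Page
    [100, 90, 110, 120, 130, 150],   # Checkout Page
])

prev_day = traffic_data[:, :-1]  # Monday .. Friday
curr_day = traffic_data[:, 1:]   # Tuesday .. Saturday

# Integer form of curr_day >= 1.10 * prev_day.
is_spike = curr_day * 10 >= prev_day * 11

# Column k of is_spike compares day k+1 with day k, hence the +1 shift.
spike_days = [np.flatnonzero(feature_row) + 1 for feature_row in is_spike]

# Per-feature boolean summary, should that form be wanted instead.
has_spike = is_spike.any(axis=1)

print(spike_days)  # [array([3]), array([3]), array([2, 5])]
```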


## Rank Days by Traffic for Each Feature

In [37]:
traffic_data

array([[500, 520, 480, 550, 600, 620],
       [300, 310, 290, 340, 350, 360],
       [100,  90, 110, 120, 130, 150]])

In [42]:
# Rank days by traffic (descending order)
ranked_days = np.argsort(-traffic_data, axis=1)
ranked_days

array([[5, 4, 3, 1, 0, 2],
       [5, 4, 3, 1, 0, 2],
       [5, 4, 3, 2, 0, 1]], dtype=int64)

In [43]:
day_names = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday"]
feature_names = ["Homepage", "Product Page", "Checkout Page"]
for i in range(ranked_days.shape[0]):
    ranked_day_names = [day_names[idx] for idx in ranked_days[i]]
    print(f"{feature_names[i]}: {ranked_day_names}")

Homepage: ['Saturday', 'Friday', 'Thursday', 'Tuesday', 'Monday', 'Wednesday']
Product Page: ['Saturday', 'Friday', 'Thursday', 'Tuesday', 'Monday', 'Wednesday']
Checkout Page: ['Saturday', 'Friday', 'Thursday', 'Wednesday', 'Monday', 'Tuesday']
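
As a sanity check on the ranking, the visit counts can be reordered with the ranked indices and confirmed to be non-increasing along each row. The sketch below assumes `np.take_along_axis` is available (NumPy 1.15 or later). The `kind="stable"` argument is an optional addition, not something the original cell uses; it keeps the earlier day first if two days tie, which does not happen in this data set.

```python
import numpy as np

traffic_data = np.array([
    [500, 520, 480, 550, 600, 620],  # Homepage
    [300, 310, 290, 340, 350, 360],  # Product Page
    [100, 90, 110, 120, 130, 150],   # Checkout Page
])

# Negating the data turns NumPy's ascending sort into a descending ranking.
ranked_days = np.argsort(-traffic_data, axis=1, kind="stable")

# Pull the visit counts out in ranked order, row by row.
ranked_visits = np.take_along_axis(traffic_data, ranked_days, axis=1)

# Every row should now be sorted from busiest to quietest day.
is_descending = bool(np.all(np.diff(ranked_visits, axis=1) <= 0))

print(ranked_visits)
print(is_descending)  # True
```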
