The fundamental difference is this:

*   **`resample()`** is for **frequency conversion**. It aggregates data over fixed, continuous time intervals (e.g., from seconds to minutes, days to months). It is "time-aware" and can create new time bins that didn't exist in the original data (e.g., when upsampling or filling gaps). Conceptually, `resample()` is a time-based `groupby()` followed by a reduction or fill method applied to each bin.
*   **`groupby()`** is for **categorical grouping**. It groups rows based on identical values in a column. When used with time data, it's not grouping by a *frequency*, but rather by a *property* or *label* derived from the timestamp, such as the hour of the day, the day of the week, or the month number.

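Important differences:
 * `resample` creates a bin for every interval in the range, so gaps in the data show up as empty bins that you can fill or interpolate (e.g., with `.ffill()` or `.interpolate()`).
 * `groupby` only produces groups for key values that actually occur in the data, so missing periods are silently skipped rather than reported as empty.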
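The difference in how gaps are handled can be seen in a minimal sketch. The series `s` and its values are made up for illustration; the only assumption is a daily series with two missing days.

```python
import pandas as pd

# Hypothetical daily readings with a gap: Jan 3 and Jan 4 are missing
s = pd.Series(
    [10.0, 20.0, 50.0],
    index=pd.to_datetime(['2023-01-01', '2023-01-02', '2023-01-05']),
)

# resample builds every daily bin in the range, so the missing days appear as NaN
daily = s.resample('D').mean()

# The empty bins can then be filled, here by linear interpolation
filled = daily.interpolate()

# groupby on the calendar date only produces groups for dates that are present
grouped = s.groupby(s.index.date).mean()

print(daily)
print(filled)
print(grouped)
```

Here `daily` has five rows (two of them `NaN`), while `grouped` has only the three dates that exist in the data.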


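You would want to use `groupby()` instead of `resample()` on time series data in scenarios like these:
* Finding the average value for each day of the week over a multi-week period.
* Grouping by a time period and another category at the same time.
* Grouping irregular events into sessions that are not tied to a fixed clock interval.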

---

### 1. Grouping by Cyclical Time Features (Ignoring the Date)

This is the most common use case. You want to analyze patterns that repeat over time cycles, irrespective of the specific date. For example, you want to compare the average activity across all Mondays vs. all Tuesdays, or see which hour of the day is busiest across your entire dataset.

`resample` cannot do this: `resample('D')` gives you one value for each calendar day, not one value per weekday. `groupby()` lets you aggregate all data points that share the same time property.

**Scenario:** Find the average value for each day of the week over a multi-week period.

```python
import pandas as pd
import numpy as np

# Create a sample time series DataFrame
# (lowercase 'h' is the hourly alias; uppercase 'H' is deprecated since pandas 2.2)
date_rng = pd.date_range(start='2023-01-01', end='2023-01-21', freq='h')
df = pd.DataFrame(date_rng, columns=['time'])
df['value'] = np.random.randint(0, 100, size=(len(date_rng)))
df = df.set_index('time')

print("--- Original Data (first 5 rows) ---")
print(df.head())

# Using groupby to find the mean value for each day of the week
# We group by the 'day_name' attribute of the DatetimeIndex
day_of_week_mean = df.groupby(df.index.day_name())['value'].mean()

# Order the results by the days of the week for better readability
days = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday']
day_of_week_mean = day_of_week_mean.reindex(days)

print("\n--- Mean Value by Day of the Week (using groupby) ---")
print(day_of_week_mean)
```

**Output** (illustrative; the data is random and unseeded, so your numbers will differ):
```
--- Original Data (first 5 rows) ---
                     value
time
2023-01-01 00:00:00     49
2023-01-01 01:00:00     91
2023-01-01 02:00:00     44
2023-01-01 03:00:00     17
2023-01-01 04:00:00     39

--- Mean Value by Day of the Week (using groupby) ---
time
Monday       49.500000
Tuesday      51.375000
Wednesday    48.791667
Thursday     47.041667
Friday       50.833333
Saturday     48.541667
Sunday       50.111111
Name: value, dtype: float64
```

Here, `groupby` collected all the values that occurred on a Monday (Jan 2, 9, 16), all the values on a Tuesday (Jan 3, 10, 17), and so on, and averaged each group.

Other examples include:
*   `df.groupby(df.index.hour)`: To find the average value for each hour of the day (0-23).
*   `df.groupby(df.index.month)`: To analyze seasonality by comparing all January data vs. all February data, etc., across multiple years.

### 2. Grouping by Time and Another Category

You often need to create groups based on a combination of a time period *and* another categorical variable. While you can chain operations, using `groupby` with `pd.Grouper` is a powerful and explicit way to do this. `pd.Grouper` allows you to specify a resampling frequency directly inside a `groupby` call.

**Scenario:** Calculate the weekly sum of values for two different categories.

```python
# Add a categorical column to our DataFrame
df['category'] = np.random.choice(['A', 'B'], size=len(df))

print("\n--- Data with Categories (first 5 rows) ---")
print(df.head())

# Group by the 'category' column AND by a weekly frequency on the index
# This is the power of combining groupby with pd.Grouper
weekly_summary = df.groupby(['category', pd.Grouper(freq='W')])['value'].sum()

print("\n--- Weekly Sum by Category (using groupby + pd.Grouper) ---")
print(weekly_summary)
```

**Output** (illustrative; the data is random and unseeded, so your numbers will differ):
```
--- Data with Categories (first 5 rows) ---
                     value category
time
2023-01-01 00:00:00     49        B
2023-01-01 01:00:00     91        A
2023-01-01 02:00:00     44        B
2023-01-01 03:00:00     17        A
2023-01-01 04:00:00     39        B

--- Weekly Sum by Category (using groupby + pd.Grouper) ---
category  time
A         2023-01-01     4327
          2023-01-08     8248
          2023-01-15     8804
          2023-01-22     4321
B         2023-01-01     4141
          2023-01-08     8356
          2023-01-15     7992
          2023-01-22     4239
Name: value, dtype: int64
```
This is much more direct than resampling each category's subset of data separately.
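For comparison, the chained form mentioned earlier can be sketched as follows. The frame `demo` is hypothetical (two full weeks of hourly data with alternating categories), chosen so that every category has data in every week; under that assumption the two spellings should give the same result.

```python
import numpy as np
import pandas as pd

# Hypothetical data: two full weeks of hourly values, Monday Jan 2 to Sunday Jan 15
idx = pd.date_range('2023-01-02', periods=14 * 24, freq='h')
demo = pd.DataFrame(
    {
        'value': np.arange(len(idx)),
        'category': np.tile(['A', 'B'], len(idx) // 2),  # alternate A, B, A, B, ...
    },
    index=idx,
)

# Style 1: a single groupby with a weekly pd.Grouper on the index
via_grouper = demo.groupby(['category', pd.Grouper(freq='W')])['value'].sum()

# Style 2: group by category first, then resample within each group
via_chain = demo.groupby('category')['value'].resample('W').sum()

print(via_grouper)
print(via_chain)
```

The two can diverge when a category has no rows in some week: the chained `resample` fills in empty bins within each category's own time range, so it is worth checking which behavior you want.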

### 3. Event-Based or Session-Based Grouping

Time series data doesn't always have a regular frequency. In user analytics, for example, you might have logs of user actions. You may want to group these actions into "sessions," where a session ends after a period of inactivity. `resample` does not help here because the groups are defined by the gaps between events, not by a fixed clock interval.

**Scenario:** Group events into sessions where a new session starts after more than 30 minutes of inactivity.

```python
# Create irregular event data
events = pd.DataFrame({
    'time': pd.to_datetime([
        '2023-01-01 10:00:00', '2023-01-01 10:02:00', '2023-01-01 10:05:00', # Session 1
        '2023-01-01 10:40:00', '2023-01-01 10:41:00',                      # Session 2
        '2023-01-01 11:30:00', '2023-01-01 11:31:00', '2023-01-01 11:33:00'  # Session 3
    ]),
    'action': ['view', 'click', 'view', 'view', 'click', 'scroll', 'view', 'click']
}).set_index('time')

# Calculate the time difference between consecutive events
time_diff = events.index.to_series().diff()

# A new session starts where the difference is > 30 mins
# Use cumsum() on this boolean Series to create a unique session ID
session_id = (time_diff > pd.Timedelta('30 minutes')).cumsum()

# Group by the session ID and aggregate
# ('first' on the action column returns the first action in each session)
session_summary = events.groupby(session_id).agg(
    first_action=('action', 'first'),
    event_count=('action', 'count')
)

print("\n--- Session Summary (using groupby on calculated session ID) ---")
print(session_summary)
```

**Output:**
```
--- Session Summary (using groupby on calculated session ID) ---
      first_action  event_count
time
0             view            3
1             view            2
2           scroll            3
```
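If you also want each session's time span, the same session-ID trick extends naturally. This is a sketch on a hypothetical `log` frame that keeps the timestamps in a regular column; the column names `start`, `end`, `events`, and `duration` are chosen here for illustration.

```python
import pandas as pd

# Hypothetical event log with two gaps longer than 30 minutes
log = pd.DataFrame({
    'time': pd.to_datetime([
        '2023-01-01 10:00', '2023-01-01 10:02', '2023-01-01 10:05',
        '2023-01-01 10:40', '2023-01-01 10:41',
        '2023-01-01 11:30', '2023-01-01 11:31', '2023-01-01 11:33',
    ]),
})

# Flag gaps > 30 minutes, then cumulative-sum the flags into a session ID.
# The first diff is NaT, which compares as False, so the first session gets ID 0.
gap = log['time'].diff() > pd.Timedelta(minutes=30)
log['session_id'] = gap.cumsum()

# First and last timestamp plus event count per session
spans = log.groupby('session_id')['time'].agg(start='min', end='max', events='count')
spans['duration'] = spans['end'] - spans['start']

print(spans)
```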

---

### Summary Table: `resample` vs. `groupby`

| Feature | `resample()` | `groupby()` |
| :--- | :--- | :--- |
| **Primary Use Case** | Changing the frequency of time series data (downsampling/upsampling). | Grouping data based on shared labels or properties. |
| **Grouping Logic** | Groups by fixed, continuous time bins (e.g., every month, every hour). | Groups by identical values (e.g., all Mondays, all hours=9). |
| **Index Requirement** | Needs a datetime-like index (`DatetimeIndex`, `PeriodIndex`, or `TimedeltaIndex`), or a datetime-like column passed via `on=`. | Can operate on any column or index attribute. |
| **Handles Missing Bins**| Yes, it creates a bin for every interval in the range, including empty ones, which you can then fill (e.g., with 0 or a forward-fill). | Generally no; with plain keys it only produces groups for values that actually exist in the data. |
| **Flexibility** | Specialized for time frequency conversion. | Highly flexible; can group by time properties, other columns, or both. |
| **Golden Rule** | Use when your question includes "per day," "every 15 minutes," or "monthly totals." | Use when your question includes "for each day of the week," "by hour of the day," or "by category and week." |