## Building three core AVWAPs anchored to three important 1-minute candles

In this part of the project, we build **three special AVWAP lines** on our SPY 1-minute data.

To understand them, we first need to understand:

- **VWAP (Volume Weighted Average Price)**
- **AVWAP (Anchored VWAP)**


### What is VWAP?

**VWAP** is like an **average price**, but with a twist:

- Minutes with **more volume** get **more weight**.
- Minutes with **less volume** get **less weight**.

So VWAP tells us:

> “At what price did most of the trading actually happen?”

How traders often think about it:

- If the **current price is far above VWAP**  
  → price may be **stretched up** (expensive vs. where most trading was).
- If the **current price is far below VWAP**  
  → price may be **stretched down** (cheap vs. where most trading was).

In short, VWAP acts like a **fair value line** for a chosen period.
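
To make the weighting idea concrete, here is a minimal sketch on three made-up bars (the numbers are illustrative only and do not come from the SPY file). It compares a plain average with a volume-weighted one, using the close as the price for simplicity.

```python
import pandas as pd

# Toy 1-minute bars (made-up values, not from the project data)
bars = pd.DataFrame({
    "close":  [100.0, 101.0, 102.0],
    "Volume": [1000, 1000, 3000],   # the last minute traded the most
})

# Plain average: every minute counts the same
simple_avg = bars["close"].mean()

# VWAP: each minute's price is weighted by its volume
vwap = (bars["close"] * bars["Volume"]).sum() / bars["Volume"].sum()

print(simple_avg)  # 101.0
print(vwap)        # pulled toward 102 because that bar has the most volume
```

Because the heaviest bar sits at 102, the VWAP ends up above the plain average.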


### What is Anchored VWAP (AVWAP)?

**Anchored VWAP (AVWAP)** is just a VWAP with a **fixed starting point**.

- We pick a **specific time** (the **anchor**).
- From that time onward, we compute VWAP using:
  - All bars from **the anchor time** up to **the current minute**.

So:

- A normal VWAP might restart each day.
- An **AVWAP** starts at a **chosen event** and keeps going forward.

Why do we anchor it?

- The anchor tells us **exactly** which part of price history is being summarized.
- Example anchors:
  - The **open** of the day
  - A **news event**
  - A **strong move** (like a fast rally or selloff)

So an AVWAP answers:

> “What is the volume-weighted average price **since this specific moment**?”
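
Written as a formula, and assuming the typical price $(H + L + C)/3$ that `compute_avwap_for_anchor` uses as the bar price, the AVWAP at minute $t$ for an anchor bar $a$ can be sketched as follows (the symbols $a$, $t$, $p_i$, $v_i$ are notation introduced here for illustration):

$$
\mathrm{AVWAP}_t \;=\; \frac{\sum_{i=a}^{t} p_i \, v_i}{\sum_{i=a}^{t} v_i},
\qquad
p_i = \frac{H_i + L_i + C_i}{3},
\qquad t \ge a
$$

Here $v_i$ is the volume of bar $i$. For bars before the anchor ($t < a$) the value is undefined, which shows up as `NaN` in the dataframe.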


### Our three AVWAP anchors in this project

In this project, we use **three important starting points (anchors)** to build three AVWAP lines:

1. **Strongest 5-minute up move**

   - We look at all 5-minute windows inside the Initial Balance period (09:30–10:30).
   - We find the 5-minute period where:
     - Price went **up the most** (largest positive change).
   - This 5-minute block marks a strong **upward burst of buying**.
   - From the **start** of this 5-minute block, we anchor an AVWAP:
     - This AVWAP tracks the **average cost of buyers** who joined during and after this strong up move.

2. **Strongest 5-minute down move**

   - We again scan all 5-minute windows inside the Initial Balance period.
   - This time, we find the 5-minute period where:
     - Price went **down the most** (largest negative change).
   - This block marks a strong **downward burst of selling**.
   - From the **start** of this 5-minute block, we anchor another AVWAP:
     - This AVWAP tracks the **average cost of sellers** who pushed the price down.

3. **Exact market open at 09:30:00**

   - We also anchor an AVWAP at the **official session open**:
     - **09:30:00 U.S. Eastern Time**
   - This AVWAP represents the **average price since the very start of the day**.
   - It captures the **overall position** of those who traded from the open onward.
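
The search for the two burst anchors can be sketched on toy data like this. It is only an illustration under assumptions: the closes are made up, `WINDOW` and the variable names are hypothetical, and the "move" is measured as the close 5 bars later minus the current close.

```python
import pandas as pd

WINDOW = 5  # bars; with 1-minute candles this is 5 minutes

# Toy closes for the first minutes of a session (made-up values)
closes = pd.Series(
    [100.0, 100.1, 100.0, 100.2, 100.1, 101.0, 100.9, 100.2, 100.0, 99.5, 99.6],
    index=pd.date_range("2025-01-02 09:30", periods=11, freq="min"),
)

# Forward change: close WINDOW bars later minus the close now.
# The last WINDOW bars have no future value and become NaN.
fwd_change = closes.shift(-WINDOW) - closes

up_anchor = fwd_change.idxmax()    # start of the strongest up window
down_anchor = fwd_change.idxmin()  # start of the strongest down window

print(up_anchor, down_anchor)
```

In this toy series the strongest rise starts at 09:30 and the strongest drop starts at 09:35.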


### Why these three AVWAPs matter

With these three anchored VWAPs, we get:

- One AVWAP for **the strongest up move** (bullish burst anchor)
- One AVWAP for **the strongest down move** (bearish burst anchor)
- One AVWAP for **the whole day starting at the open**


In [65]:
import pandas as pd
import numpy as np
from pathlib import Path

In [66]:
# Load the cleaned data with IB levels that we created in "01_ib_detect.ipynb"

PROJECT_ROOT = Path("..").resolve()

DATA_CACHE = PROJECT_ROOT / "data" / "cache"

CACHE_FILE = DATA_CACHE / "spy_1min_et_clean_with_IBlevels.csv"

df_new = pd.read_csv(CACHE_FILE, parse_dates=['datetime'])

df_new.head()

Unnamed: 0,datetime,high,low,close,Volume,ib_high,ib_low,ib_mid,ib_width,ib_width_type,day_close,prev_day,prev_close,gap,gap_dir
0,2025-09-08 09:30:00,648.86,648.24,648.26,141588,649.06,647.75,648.405,1.31,narrow,649.0,,,,0.0
1,2025-09-08 09:31:00,648.45,648.15,648.27,42118,649.06,647.75,648.405,1.31,narrow,649.0,,,,0.0
2,2025-09-08 09:32:00,648.46,648.1,648.26,37143,649.06,647.75,648.405,1.31,narrow,649.0,,,,0.0
3,2025-09-08 09:33:00,648.47,648.23,648.4,42231,649.06,647.75,648.405,1.31,narrow,649.0,,,,0.0
4,2025-09-08 09:34:00,648.68,648.32,648.665,23659,649.06,647.75,648.405,1.31,narrow,649.0,,,,0.0


In [67]:
# First we need a tool that calculates the AVWAP for a given anchor.
# We define this function before touching the data
# so that the rest of the process stays simple.

# The calculation starts once we have selected one 1-minute candle as the anchor.

# group: the selected trading day; the calculation ends at the end of that trading day
# anchor_idx: index label of the anchor bar within the selected day
# price_col="close": the price column used as the close component of the typical price
# vol_col="Volume": the volume column used in the calculations

def compute_avwap_for_anchor(group, anchor_idx, price_col="close", vol_col="Volume"):
    
    g = group.copy()  # copy the day's frame so the original cleaned dataframe is not modified

    # Ignore all bars before the anchor
    mask = g.index >= anchor_idx

    # dollar_vol     --> typical price * volume for one 1-minute candle
    # cum_dollar_vol --> running sum of dollar_vol from the anchor onward (numerator of the AVWAP formula)
    # cum_vol        --> running sum of volume from the anchor onward (denominator of the AVWAP formula)

    # Typical Price = (High + Low + Close) / 3
    # Using it lets each bar's full range contribute to the VWAP, not only its close
    g["typical_price"] = (g["high"] + g["low"] + g[price_col]) / 3

    # dollar_vol, cum_dollar_vol and cum_vol are now computed from the typical price
    g.loc[mask, "dollar_vol"] = g.loc[mask, "typical_price"] * g.loc[mask, vol_col]
    g.loc[mask, "cum_dollar_vol"] = g.loc[mask, "dollar_vol"].cumsum()
    g.loc[mask, "cum_vol"] = g.loc[mask, vol_col].cumsum()


    # The new columns "dollar_vol", "cum_dollar_vol" and "cum_vol" are created by assigning through .loc;
    # pandas adds a column automatically when the label does not exist yet.

    # AVWAP formula
    g.loc[mask, "avwap"] = g.loc[mask, "cum_dollar_vol"] / g.loc[mask, "cum_vol"]
    
    return g["avwap"]

# Final note: all calculations are vectorized pandas operations on the day's frame g, so no explicit loop is needed.


In [68]:
# With the AVWAP tool in place, we need a function that finds the anchors in our cleaned dataframe.
# It only looks inside the IB (Initial Balance) window, 09:30-10:30, when choosing the two move anchors.
# g is one trading day's frame, so the anchors are found separately for each day.

def find_anchors_for_day(g):
    g = g.sort_values("datetime").copy()

    # The open anchor is simply the first bar of the day
    open_idx = g.index[0]

    # IB filter
    # Only the IB window is used for selecting anchors; parentheses make the & / | precedence explicit
    is_reallyib = (g["datetime"].dt.hour == 9) | ((g["datetime"].dt.hour == 10) & (g["datetime"].dt.minute <= 30))
    ib = g[is_reallyib].copy()
    # Forward 5-bar change: close 5 minutes later minus the current close
    # (positive = price rose over the next 5 minutes, negative = price fell)
    ib["delta_close"] = ib["close"].shift(-5) - ib["close"]

    up_idx = ib["delta_close"].idxmax()    # index of the bar that starts the strongest 5-minute up move
    down_idx = ib["delta_close"].idxmin()  # index of the bar that starts the strongest 5-minute down move

    g["avwap_open"] = compute_avwap_for_anchor(g, open_idx)
    g["avwap_down"] = compute_avwap_for_anchor(g, down_idx)
    g["avwap_up"] = compute_avwap_for_anchor(g, up_idx)

    return g


## 1) Applying our AVWAP functions to the full dataset

We have already written two key helper functions:

- **`find_anchors_for_day`** → finds the three special anchor points for each day  
  (strongest 5-min up move, strongest 5-min down move, and the open at 09:30).
- **`compute_avwap_for_anchor`** → starting from a given anchor, it calculates the  
  **Anchored VWAP (AVWAP)** forward in time for all later 1-minute bars.

These two functions work **together**:

- First, we **find** the anchors for the day.
- Then, from each anchor, we **build** the corresponding AVWAP line.
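
The "build" half of that pairing can be checked on a tiny made-up day. This is a compact sketch, not the project function itself: `anchor_idx` is a hypothetical anchor, the prices are illustrative, and the masking is done with `Series.where` instead of `.loc`.

```python
import numpy as np
import pandas as pd

# Four made-up 1-minute bars
day = pd.DataFrame({
    "high":   [10.2, 10.4, 10.6, 10.5],
    "low":    [ 9.8, 10.0, 10.2, 10.1],
    "close":  [10.0, 10.2, 10.4, 10.3],
    "Volume": [100, 200, 100, 100],
})

anchor_idx = 1  # hypothetical anchor: the second bar

typical = (day["high"] + day["low"] + day["close"]) / 3
after_anchor = day.index >= anchor_idx

# Bars before the anchor become NaN, so they never enter the running sums
dollar_vol = (typical * day["Volume"]).where(after_anchor)
vol = day["Volume"].where(after_anchor)

avwap = dollar_vol.cumsum() / vol.cumsum()
print(avwap)
```

The first value is `NaN` (before the anchor), the second equals the anchor bar's own typical price, and later values blend in each new bar by its volume.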



### What we do in this step

Now we want to apply these functions to our **main working dataframe**:

- **`spy_1min_et_clean_with_IBlevels`**

This dataframe already contains:

- Clean 1-minute SPY data
- Initial Balance features (ib_high, ib_low, ib_mid, ib_width, etc.)

Our goal is to:

- Use `find_anchors_for_day` to locate the three anchors **for each day**.
- Use `compute_avwap_for_anchor` to calculate the **three AVWAP series**.
- Attach these AVWAP values back into **`spy_1min_et_clean_with_IBlevels`** as new columns.
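
The key point of working "for each day" is that every running sum restarts at the next session. A minimal sketch of that behaviour, assuming toy data and using the close as the price for brevity (the project code uses the typical price and `groupby(...).apply(...)` instead):

```python
import pandas as pd

# Two made-up days with two bars each
df = pd.DataFrame({
    "datetime": pd.to_datetime([
        "2025-01-02 09:30", "2025-01-02 09:31",
        "2025-01-03 09:30", "2025-01-03 09:31",
    ]),
    "close":  [100.0, 102.0, 200.0, 204.0],
    "Volume": [100, 300, 100, 100],
})

day_key = df["datetime"].dt.date          # one group per calendar day
dollar_vol = df["close"] * df["Volume"]

# Cumulative sums are taken inside each day, so day 2 starts from scratch
df["avwap_open"] = (
    dollar_vol.groupby(day_key).cumsum() / df["Volume"].groupby(day_key).cumsum()
)
print(df)
```

The first bar of the second day shows its own price again, which confirms that nothing leaks across days.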



### Final result

After this step, each 1-minute candle in  
**`spy_1min_et_clean_with_IBlevels`** will have:

- Its day’s **Initial Balance structure**
- Plus the **three AVWAP values** (from:
  - the strongest up move anchor,
  - the strongest down move anchor,
  - and the 09:30 open anchor)

This completes the **technical analysis foundation** of the project and prepares the data for:

- Directional signal studies  
- Pattern tests  
- Hypothesis evaluation based on IB and AVWAP behavior


In [69]:
df_new = df_new.groupby(df_new["datetime"].dt.date, group_keys=False).apply(find_anchors_for_day)

df_new.head(120)

Unnamed: 0,datetime,high,low,close,Volume,ib_high,ib_low,ib_mid,ib_width,ib_width_type,day_close,prev_day,prev_close,gap,gap_dir,avwap_open,avwap_down,avwap_up
0,2025-09-08 09:30:00,648.86,648.240,648.260,141588,649.06,647.75,648.405,1.31,narrow,649.0,,,,0.0,648.453333,,
1,2025-09-08 09:31:00,648.45,648.150,648.270,42118,649.06,647.75,648.405,1.31,narrow,649.0,,,,0.0,648.415886,,
2,2025-09-08 09:32:00,648.46,648.100,648.260,37143,649.06,647.75,648.405,1.31,narrow,649.0,,,,0.0,648.391911,,
3,2025-09-08 09:33:00,648.47,648.230,648.400,42231,649.06,647.75,648.405,1.31,narrow,649.0,,,,0.0,648.387859,,
4,2025-09-08 09:34:00,648.68,648.320,648.665,23659,649.06,647.75,648.405,1.31,narrow,649.0,,,,0.0,648.401650,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
115,2025-09-08 11:25:00,649.52,649.349,649.470,16953,649.06,647.75,648.405,1.31,narrow,649.0,,,,0.0,648.741570,648.877633,649.222537
116,2025-09-08 11:26:00,649.56,649.410,649.500,17960,649.06,647.75,648.405,1.31,narrow,649.0,,,,0.0,648.746174,648.883420,649.226437
117,2025-09-08 11:27:00,649.55,649.380,649.550,10358,649.06,647.75,648.405,1.31,narrow,649.0,,,,0.0,648.748815,648.886725,649.228664
118,2025-09-08 11:28:00,649.65,649.495,649.630,12108,649.06,647.75,648.405,1.31,narrow,649.0,,,,0.0,648.752284,648.891164,649.232169


In [70]:
# Checking a random time frame to confirm the AVWAP calculations look correct
# NaN values in avwap_down / avwap_up are expected early in the session:
# each AVWAP is undefined before its anchor bar, and those anchors fall inside the IB window (09:30-10:30)

df_new.iloc[390:450]

Unnamed: 0,datetime,high,low,close,Volume,ib_high,ib_low,ib_mid,ib_width,ib_width_type,day_close,prev_day,prev_close,gap,gap_dir,avwap_open,avwap_down,avwap_up
390,2025-09-09 09:30:00,649.19,648.8,648.8,87757,649.72,648.43,649.075,1.29,narrow,648.64,2025-09-08,649.0,-0.36,-1.0,648.93,,
391,2025-09-09 09:31:00,649.31,648.76,649.28,31814,649.72,648.43,649.075,1.29,narrow,648.64,2025-09-08,649.0,-0.36,-1.0,648.979666,,
392,2025-09-09 09:32:00,649.36,649.0,649.26,37873,649.72,648.43,649.075,1.29,narrow,648.64,2025-09-08,649.0,-0.36,-1.0,649.034271,,
393,2025-09-09 09:33:00,649.34,649.12,649.24,24278,649.72,648.43,649.075,1.29,narrow,648.64,2025-09-08,649.0,-0.36,-1.0,649.060865,,
394,2025-09-09 09:34:00,649.31,649.12,649.17,30995,649.72,648.43,649.075,1.29,narrow,648.64,2025-09-08,649.0,-0.36,-1.0,649.081139,,
395,2025-09-09 09:35:00,649.28,649.06,649.21,21358,649.72,648.43,649.075,1.29,narrow,648.64,2025-09-08,649.0,-0.36,-1.0,649.090463,,
396,2025-09-09 09:36:00,649.34,649.11,649.22,14937,649.72,648.43,649.075,1.29,narrow,648.64,2025-09-08,649.0,-0.36,-1.0,649.098434,,
397,2025-09-09 09:37:00,649.45,649.16,649.39,17801,649.72,648.43,649.075,1.29,narrow,648.64,2025-09-08,649.0,-0.36,-1.0,649.114105,,
398,2025-09-09 09:38:00,649.49,649.29,649.38,16478,649.72,648.43,649.075,1.29,narrow,648.64,2025-09-08,649.0,-0.36,-1.0,649.129959,,
399,2025-09-09 09:39:00,649.45,649.17,649.19,23287,649.72,648.43,649.075,1.29,narrow,648.64,2025-09-08,649.0,-0.36,-1.0,649.140597,,


In [71]:
# Looking for anomalies in the whole dataset with .info() and .describe()

df_new.info()
df_new.describe()

<class 'pandas.core.frame.DataFrame'>
Index: 21450 entries, 0 to 21449
Data columns (total 18 columns):
 #   Column         Non-Null Count  Dtype         
---  ------         --------------  -----         
 0   datetime       21450 non-null  datetime64[ns]
 1   high           21450 non-null  float64       
 2   low            21450 non-null  float64       
 3   close          21450 non-null  float64       
 4   Volume         21450 non-null  int64         
 5   ib_high        21450 non-null  float64       
 6   ib_low         21450 non-null  float64       
 7   ib_mid         21450 non-null  float64       
 8   ib_width       21450 non-null  float64       
 9   ib_width_type  21450 non-null  object        
 10  day_close      21450 non-null  float64       
 11  prev_day       21060 non-null  object        
 12  prev_close     21060 non-null  float64       
 13  gap            21060 non-null  float64       
 14  gap_dir        21450 non-null  float64       
 15  avwap_open     21450 non

Unnamed: 0,datetime,high,low,close,Volume,ib_high,ib_low,ib_mid,ib_width,day_close,prev_close,gap,gap_dir,avwap_open,avwap_down,avwap_up
count,21450,21450.0,21450.0,21450.0,21450.0,21450.0,21450.0,21450.0,21450.0,21450.0,21060.0,21060.0,21450.0,21450.0,20185.0,20158.0
mean,2025-10-15 12:44:30,668.153902,667.867178,668.010579,31363.66,669.798091,666.728364,668.263227,3.069727,668.486727,668.765556,0.082037,0.018182,668.026579,667.960132,668.139156
min,2025-09-08 09:30:00,647.51,647.22,647.31,1314.0,649.06,647.75,648.405,0.98,648.64,648.64,-21.47,-1.0,648.386941,648.234765,648.083
25%,2025-09-25 14:22:15,661.33,660.91225,661.1225,14615.5,663.23,659.63,660.975,1.85,660.74,661.9,-2.77,-1.0,661.295261,661.288241,661.555864
50%,2025-10-15 12:44:30,666.97,666.66,666.82,22888.0,670.23,666.36,668.14,2.645,669.29,669.29,0.255,0.0,666.941004,666.921274,666.964218
75%,2025-11-04 11:06:45,673.12,672.87,673.0,36983.0,677.38,672.52,674.95,4.13,674.9,674.9,2.885,1.0,672.992592,672.758035,672.871954
max,2025-11-21 15:59:00,689.7,689.52,689.59,1362579.0,689.7,688.15,688.925,7.45,688.97,688.97,12.7,1.0,689.029685,689.553333,689.029685
std,,9.467176,9.486371,9.477293,35760.77,9.540465,9.566758,9.518547,1.635759,9.387359,9.245447,5.765459,0.981504,9.386457,9.382645,9.304283


## 2) Computing slopes for each AVWAP line

We have already calculated all three AVWAP series using the correct anchors.  
Now we want to know **whether each AVWAP line is pointing up or down over time**.

In other words, we want to measure the **slope** of each AVWAP:

- If the slope is **positive** → the AVWAP is **rising**  
  → the volume-weighted average price is moving **higher** (benchmark getting more expensive).
- If the slope is **negative** → the AVWAP is **falling**  
  → the volume-weighted average price is moving **lower** (benchmark getting cheaper).
- If the slope is **near zero** → the AVWAP is **flat**  
  → no strong directional pressure around that benchmark.
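
These three cases can be turned into labels with a small tolerance around zero. The sketch below is only an illustration: `FLAT_TOL` is a hypothetical threshold that this project does not define, and the slope values are made up.

```python
import numpy as np
import pandas as pd

FLAT_TOL = 0.001  # hypothetical "near zero" band, in dollars per minute

# Made-up slope values, including one missing value
slopes = pd.Series([0.004, -0.003, 0.0002, np.nan])

labels = np.select(
    [slopes > FLAT_TOL, slopes < -FLAT_TOL, slopes.abs() <= FLAT_TOL],
    ["rising", "falling", "flat"],
    default="undefined",  # NaN slopes (e.g. before the anchor) match no condition
)
print(list(labels))
```

Any real threshold would need to be chosen from the data, for example from the slope distribution.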

This slope information is **very important** for our later analysis and hypothesis tests, because it:

- Helps us see whether the **“fair value”** (by volume) is drifting up or down.
- Gives us a simple way to detect:
  - Upward pressure vs. downward pressure
  - Trend-like vs. balanced conditions


### What we do in this step

We will create **slope columns** for each of our three AVWAP types:

- **`avwap_open`** → AVWAP anchored at the 09:30 open
- **`avwap_down`** → AVWAP anchored at the strongest 5-minute down move
- **`avwap_up`** → AVWAP anchored at the strongest 5-minute up move

For each of these AVWAP series, we compute its average change per minute over the last 5 bars  
(the current value minus the value 5 minutes earlier, divided by 5), separately for each trading day.

After this step, our dataframe will include:

- The **AVWAP levels** themselves
- The **slopes** of those AVWAPs

Together, these will be key inputs for our pattern tests and trading logic.
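
A small sketch of the slope calculation on made-up AVWAP values is shown below. The two toy days and their values are assumptions for illustration; the grouping by date and the `diff(SLOPE_WINDOW) / SLOPE_WINDOW` pattern mirror the approach used in this notebook.

```python
import numpy as np
import pandas as pd

SLOPE_WINDOW = 5  # bars

day1 = pd.date_range("2025-01-02 09:30", periods=7, freq="min")
day2 = pd.date_range("2025-01-03 09:30", periods=7, freq="min")

df = pd.DataFrame({
    "datetime": day1.append(day2),
    # Day 1 rises by 0.1 per minute, day 2 falls by 0.2 per minute (made-up values)
    "avwap_open": np.concatenate([100.0 + 0.1 * np.arange(7),
                                  200.0 - 0.2 * np.arange(7)]),
})

# Change over the last SLOPE_WINDOW bars, per minute, computed within each day
df["slope_open"] = (
    df.groupby(df["datetime"].dt.date)["avwap_open"].diff(SLOPE_WINDOW) / SLOPE_WINDOW
)
print(df)
```

The first five bars of each day have no slope, because there is no value five bars earlier within the same session.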

In [72]:
SLOPE_WINDOW = 5  # 5 bars = 5 minutes (with 1-minute candlesticks)

for col in ["avwap_open", "avwap_up", "avwap_down"]:
    name = col.split("_")[1]  # "open", "up", "down"

    df_new[f"slope_{name}"] = (
        df_new.groupby(df_new["datetime"].dt.date, group_keys=False)[col].diff(SLOPE_WINDOW) / SLOPE_WINDOW
    )


In [73]:
df_new.head(120)

Unnamed: 0,datetime,high,low,close,Volume,ib_high,ib_low,ib_mid,ib_width,ib_width_type,...,prev_day,prev_close,gap,gap_dir,avwap_open,avwap_down,avwap_up,slope_open,slope_up,slope_down
0,2025-09-08 09:30:00,648.86,648.240,648.260,141588,649.06,647.75,648.405,1.31,narrow,...,,,,0.0,648.453333,,,,,
1,2025-09-08 09:31:00,648.45,648.150,648.270,42118,649.06,647.75,648.405,1.31,narrow,...,,,,0.0,648.415886,,,,,
2,2025-09-08 09:32:00,648.46,648.100,648.260,37143,649.06,647.75,648.405,1.31,narrow,...,,,,0.0,648.391911,,,,,
3,2025-09-08 09:33:00,648.47,648.230,648.400,42231,649.06,647.75,648.405,1.31,narrow,...,,,,0.0,648.387859,,,,,
4,2025-09-08 09:34:00,648.68,648.320,648.665,23659,649.06,647.75,648.405,1.31,narrow,...,,,,0.0,648.401650,,,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
115,2025-09-08 11:25:00,649.52,649.349,649.470,16953,649.06,647.75,648.405,1.31,narrow,...,,,,0.0,648.741570,648.877633,649.222537,0.003064,0.001480,0.003698
116,2025-09-08 11:26:00,649.56,649.410,649.500,17960,649.06,647.75,648.405,1.31,narrow,...,,,,0.0,648.746174,648.883420,649.226437,0.003401,0.002256,0.004193
117,2025-09-08 11:27:00,649.55,649.380,649.550,10358,649.06,647.75,648.405,1.31,narrow,...,,,,0.0,648.748815,648.886725,649.228664,0.003335,0.002464,0.004143
118,2025-09-08 11:28:00,649.65,649.495,649.630,12108,649.06,647.75,648.405,1.31,narrow,...,,,,0.0,648.752284,648.891164,649.232169,0.003541,0.002925,0.004440


In [None]:
# Checking our results to see if there is any anomaly

df_new.info()
df_new.describe()

<class 'pandas.core.frame.DataFrame'>
Index: 21450 entries, 0 to 21449
Data columns (total 21 columns):
 #   Column         Non-Null Count  Dtype         
---  ------         --------------  -----         
 0   datetime       21450 non-null  datetime64[ns]
 1   high           21450 non-null  float64       
 2   low            21450 non-null  float64       
 3   close          21450 non-null  float64       
 4   Volume         21450 non-null  int64         
 5   ib_high        21450 non-null  float64       
 6   ib_low         21450 non-null  float64       
 7   ib_mid         21450 non-null  float64       
 8   ib_width       21450 non-null  float64       
 9   ib_width_type  21450 non-null  object        
 10  day_close      21450 non-null  float64       
 11  prev_day       21060 non-null  object        
 12  prev_close     21060 non-null  float64       
 13  gap            21060 non-null  float64       
 14  gap_dir        21450 non-null  float64       
 15  avwap_open     21450 non

Unnamed: 0,datetime,high,low,close,Volume,ib_high,ib_low,ib_mid,ib_width,day_close,prev_close,gap,gap_dir,avwap_open,avwap_down,avwap_up,slope_open,slope_up,slope_down
count,21450,21450.0,21450.0,21450.0,21450.0,21450.0,21450.0,21450.0,21450.0,21450.0,21060.0,21060.0,21450.0,21450.0,20185.0,20158.0,21175.0,19883.0,19910.0
mean,2025-10-15 12:44:30,668.153902,667.867178,668.010579,31363.66,669.798091,666.728364,668.263227,3.069727,668.486727,668.765556,0.082037,0.018182,668.026579,667.960132,668.139156,-0.000778,0.000814,-0.002118
min,2025-09-08 09:30:00,647.51,647.22,647.31,1314.0,649.06,647.75,648.405,0.98,648.64,648.64,-21.47,-1.0,648.386941,648.234765,648.083,-0.244283,-0.264991,-0.301219
25%,2025-09-25 14:22:15,661.33,660.91225,661.1225,14615.5,663.23,659.63,660.975,1.85,660.74,661.9,-2.77,-1.0,661.295261,661.288241,661.555864,-0.004135,-0.004096,-0.004245
50%,2025-10-15 12:44:30,666.97,666.66,666.82,22888.0,670.23,666.36,668.14,2.645,669.29,669.29,0.255,0.0,666.941004,666.921274,666.964218,0.00059,0.000855,0.000744
75%,2025-11-04 11:06:45,673.12,672.87,673.0,36983.0,677.38,672.52,674.95,4.13,674.9,674.9,2.885,1.0,672.992592,672.758035,672.871954,0.00432,0.004853,0.004756
max,2025-11-21 15:59:00,689.7,689.52,689.59,1362579.0,689.7,688.15,688.925,7.45,688.97,688.97,12.7,1.0,689.029685,689.553333,689.029685,0.197987,0.286227,0.102521
std,,9.467176,9.486371,9.477293,35760.77,9.540465,9.566758,9.518547,1.635759,9.387359,9.245447,5.765459,0.981504,9.386457,9.382645,9.304283,0.017374,0.021092,0.020348


## 3) Saving the final dataframe as a `.csv` file

At this stage, our dataframe is **fully prepared**.  
It now contains:

- All three **AVWAP lines**:
  - **`avwap_open`** (anchored at 09:30 open)
  - **`avwap_up`** (anchored at the strongest 5-minute up move)
  - **`avwap_down`** (anchored at the strongest 5-minute down move)
- The **slope values** for each AVWAP, showing whether:
  - The benchmark price is **rising** (becoming more expensive), or  
  - **Falling** (becoming cheaper), or  
  - Staying **flat**

And this information is computed for **every single 1-minute candlestick** in our dataset.

So this dataframe is now our **final, feature-rich version** of the SPY 1-minute data for this project.


### Why do we save it as a `.csv` in `data/cache/`?

We want to:

- Keep a **ready-to-use file** that any later notebook can load directly.
- Avoid recomputing AVWAPs and slopes every time we run the project.
- Clearly separate:
  - **Clean base data** → in `data/clean/`
  - **Enriched / feature data** (like AVWAPs and slopes) → in `data/cache/`

So in this step, we:

- Take our **final dataframe** with:
  - IB levels  
  - AVWAP levels  
  - AVWAP slopes  
- Save it as a **`.csv` file** into:

> **`data/cache/`**

After saving:

- Any later analysis notebook can simply:
  - **Read this `.csv`**
  - Immediately use all IB and AVWAP information  
  → without re-running the entire technical pipeline.
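
A minimal sketch of that save-and-reload round trip, assuming a temporary folder and a hypothetical file name (`example_features.csv`) rather than the real project paths:

```python
import tempfile
from pathlib import Path

import pandas as pd

# Tiny stand-in for the feature dataframe (made-up values)
df = pd.DataFrame({
    "datetime": pd.to_datetime(["2025-01-02 09:30", "2025-01-02 09:31"]),
    "avwap_open": [100.0, 100.5],
})

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "example_features.csv"  # hypothetical file name
    df.to_csv(path, index=False)               # index=False avoids an extra index column

    # A later notebook would reload it like this; parse_dates restores the timestamps
    loaded = pd.read_csv(path, parse_dates=["datetime"])

print(loaded.dtypes)
```

Without `parse_dates`, the `datetime` column would come back as plain text, and `.dt` accessors such as `.dt.date` would fail.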


In [75]:
from pathlib import Path

# 1) Define the project root, which is the top-level folder of our repository
PROJECT_ROOT = Path("..").resolve()

# 2) Define the path to the data/cache folder
DATA_CACHE = PROJECT_ROOT / "data" / "cache"
DATA_CACHE.mkdir(parents=True, exist_ok=True)  # create the folder if it does not exist

clean_csv_path = DATA_CACHE / "spy_1min_et_clean_with_IBlevels_and_AWVAPs.csv"

df_new.to_csv(clean_csv_path, index=False)

print("Saved CSV to:", clean_csv_path)

Saved CSV to: /Users/canka/Dev/python/DSA210-Project-Can-Karadogan/data/cache/spy_1min_et_clean_with_IBlevels_and_AWVAPs.csv
