## Gold analysis
If Wednesday's high is lower than Monday's high, you will see Wednesday's low revisited on Thursday ("revisited" means that on Thursday the price crosses Wednesday's low, from above or from below).

### Steps:
- Compute the daily highs and lows
    - resample the data (to obtain 1-day candles)
- Label each day with its day of the week (add a column that states which weekday it is)
- Check whether Wednesday's high is lower than Monday's high
    - save Wednesday's low
    - check whether Thursday's range includes Wednesday's low.
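
The steps above can be sketched in plain Python before moving to the GPU dataframes. This is only a minimal illustration: the candle values and the helper name `check_weeks` are made up for the example and are not taken from the dataset.

```python
from datetime import date

# Hypothetical daily candles: (day, high, low). Values are invented for illustration.
candles = [
    (date(2024, 1, 1), 2070.0, 2055.0),   # Monday
    (date(2024, 1, 2), 2068.0, 2050.0),   # Tuesday
    (date(2024, 1, 3), 2060.0, 2040.0),   # Wednesday: high below Monday's high
    (date(2024, 1, 4), 2048.0, 2036.0),   # Thursday: range contains 2040.0
    (date(2024, 1, 8), 2030.0, 2020.0),   # Monday
    (date(2024, 1, 10), 2045.0, 2025.0),  # Wednesday: high above Monday's high
    (date(2024, 1, 11), 2050.0, 2030.0),  # Thursday
]


def check_weeks(candles):
    """Return {(iso_year, iso_week): revisited} for weeks where the setup applies."""
    # Label every candle with its ISO year, ISO week and weekday (Monday = 1)
    weeks = {}
    for day, high, low in candles:
        iso_year, iso_week, iso_weekday = day.isocalendar()
        weeks.setdefault((iso_year, iso_week), {})[iso_weekday] = (high, low)

    results = {}
    for key, days in weeks.items():
        # The check needs Monday (1), Wednesday (3) and Thursday (4)
        if not all(d in days for d in (1, 3, 4)):
            continue
        mon_high, _ = days[1]
        wed_high, wed_low = days[3]
        thu_high, thu_low = days[4]
        # Setup: Wednesday's high is lower than Monday's high
        if wed_high < mon_high:
            # Revisit: Thursday's range includes Wednesday's low
            results[key] = thu_low <= wed_low <= thu_high
    return results


results = check_weeks(candles)
print(results)  # {(2024, 1): True}
```

Only the first week satisfies the setup, so the second week does not appear in the result at all.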

### Reading the CSV file and converting it to a Parquet one:

In [1]:
# # import the libraries
# import cudf
# import dask
# import dask.dataframe as dd
# # set the environment to cuDF so we use the GPU
# dask.config.set({"dataframe.backend": "cudf"})
# #----------------------------------------------

# xau1D = dd.read_csv('/home/edoardocame/Desktop/python_dir/xauusd-d1-bid-2014-01-01-2024-12-11T23.csv')
# xau1D['timestamp'] = dd.to_datetime(xau1D['timestamp'])
# xau1D = xau1D.set_index('timestamp', sorted=True)
# xau1D['weekday'] = xau1D.index.to_series().dt.weekday
# xau1D.head()

### Using the Parquet file:

In [1]:
# import the libraries
import dask
import dask.dataframe as dd
from dask_cuda import LocalCUDACluster
from dask.distributed import Client

# start a local GPU cluster and connect a client to it
cluster = LocalCUDACluster()
client = Client(cluster)
#----------------------------------------------
# use cuDF as the dataframe backend so the work runs on the GPU
dask.config.set({"dataframe.backend": "cudf"})


df = dd.read_parquet('/home/edoardocame/Desktop/python_dir/xauusd1D.parquet')
# daily simple return: (close_t - close_{t-1}) / close_{t-1}
df['returns'] = df['close'].diff() / df['close'].shift(1)
# ISO week number and ISO year, used to group the candles by week
df['week'] = df.index.dt.isocalendar().week
df['year'] = df.index.dt.isocalendar().year
df.head()

Perhaps you already have a cluster running?
Hosting the HTTP server on port 40971 instead


| timestamp | open | high | low | close | volume | weekday | returns | week | year |
|---|---|---|---|---|---|---|---|---|---|
| 2014-01-01 | 1203.612 | 1205.883 | 1202.302 | 1205.883 | 0.2705 | 2 | | 1 | 2014 |
| 2014-01-02 | 1205.913 | 1230.773 | 1204.893 | 1223.71 | 27.3592 | 3 | 0.014783358 | 1 | 2014 |
| 2014-01-03 | 1223.687 | 1240.153 | 1223.297 | 1236.683 | 26.3572 | 4 | 0.010601368 | 1 | 2014 |
| 2014-01-05 | 1236.983 | 1238.353 | 1233.842 | 1234.042 | 0.328 | 6 | -0.002135551 | 1 | 2014 |
| 2014-01-06 | 1234.042 | 1248.342 | 1214.626 | 1237.665 | 26.1419 | 0 | 0.002935881 | 2 | 2014 |
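
As a sanity check on the `returns` column, the same formula can be reproduced by hand from the closing prices in the output above. The sketch below uses only the Python standard library; the helper name `simple_returns` is introduced here for illustration.

```python
# Closing prices copied from the first five rows of the output above
closes = [1205.883, 1223.71, 1236.683, 1234.042, 1237.665]


def simple_returns(values):
    """Return (x_t - x_{t-1}) / x_{t-1} for each value.

    The first entry has no predecessor, so it is None
    (the dataframe shows a missing value there).
    """
    out = [None]
    for prev, curr in zip(values, values[1:]):
        out.append((curr - prev) / prev)
    return out


returns = simple_returns(closes)
for close, ret in zip(closes, returns):
    print(close, ret)
```

The computed values agree with the `returns` column to the precision shown, for example about 0.0147834 for 2014-01-02.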


### Groupby logic:
Groupby is like sorting your data into different buckets based on common characteristics. Think of it as:

- Split: data is divided into groups based on one or more keys
- Apply: a function is applied to each group independently
- Combine: results are combined into a new data structure
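
To make the three stages concrete, here is a minimal standard-library sketch that computes the weekly high by hand. The rows reuse the `high`, `week` and `year` values from the output shown earlier; in pandas-style dataframes the whole thing corresponds to `df.groupby(['year', 'week'])['high'].max()`.

```python
from collections import defaultdict

# (year, week, high) taken from the first five rows of the dataframe output
rows = [
    (2014, 1, 1205.883),
    (2014, 1, 1230.773),
    (2014, 1, 1240.153),
    (2014, 1, 1238.353),
    (2014, 2, 1248.342),
]

# Split: put each row's high into the bucket for its (year, week) key
groups = defaultdict(list)
for year, week, high in rows:
    groups[(year, week)].append(high)

# Apply + Combine: run max() on each bucket and collect the results in one dict
weekly_high = {key: max(values) for key, values in groups.items()}

print(weekly_high)  # {(2014, 1): 1240.153, (2014, 2): 1248.342}
```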
The solution below groups the candles by ISO week, flags the weeks in which Wednesday's high is lower than Monday's high, and then checks whether Thursday's price range includes Wednesday's low:

import dask_cudf as dc

def analyze_weekly_patterns(df):
    # Ensure datetime format
    df['date'] = dc.to_datetime(df.index if df.index.name else df['date'])
    
    # Extract time components
    df['day_of_week'] = df['date'].dt.dayofweek  # Monday = 0, Sunday = 6
    # Pair the ISO week with the ISO year (not the calendar year), so that
    # a week spanning New Year stays in a single group
    df['week'] = df['date'].dt.isocalendar().week
    df['year'] = df['date'].dt.isocalendar().year
    
    # Create separate dataframes for each day we need
    # Group by year and week, then get the first occurrence (should be only one per day anyway)
    monday_data = df[df['day_of_week'] == 0].groupby(['year', 'week'])['high'].first()
    
    # For Wednesday, we need the high (compared with Monday) and the low (the level to revisit)
    wednesday_data = df[df['day_of_week'] == 2].groupby(['year', 'week']).agg({
        'high': 'max',
        'low': 'min'
    }).rename(columns={'high': 'wednesday_high', 'low': 'wednesday_low'})
    
    # For Thursday, we need both high and low for the range check
    thursday_data = df[df['day_of_week'] == 3].groupby(['year', 'week']).agg({
        'high': 'max',  # Get highest point of Thursday
        'low': 'min'    # Get lowest point of Thursday
    })
    
    # Combine the data
    weekly_analysis = dc.concat([
        monday_data.to_frame('monday_high'),
        wednesday_data,
        thursday_data
    ], axis=1)
    
    # Create flags for our conditions
    weekly_analysis['wed_lower_than_mon'] = weekly_analysis['wednesday_high'] < weekly_analysis['monday_high']
    
    # Check if Thursday's range includes Wednesday's low when Wednesday's high is lower than Monday's
    weekly_analysis['thurs_crosses_wed'] = (
        (weekly_analysis['wed_lower_than_mon']) & 
        (weekly_analysis['low'] <= weekly_analysis['wednesday_low']) & 
        (weekly_analysis['high'] >= weekly_analysis['wednesday_low'])
    )
    
    # Merge back to original dataframe
    result = df.merge(
        weekly_analysis[['wed_lower_than_mon', 'thurs_crosses_wed']],
        left_on=['year', 'week'],
        right_index=True,
        how='left'
    )
    
    return result

Example usage:
df = analyze_weekly_patterns(df)