Experiment with the time series Exp3P1V5happiness.csv from https://road.affectivese.org/datasets/GraphNeuralNetwork/

One time series contains 110 timestamps.

To prepare the time series for the experiment, 10% of the data is deleted to create gaps:

In [16]:
# Imports
import pandas as pd
import math
import os
import psycopg2
from psycopg2 import OperationalError
from datetime import datetime, timedelta

In [18]:
# Repeat the rows of the original CSV file `repeat_count` times and save the result as a new CSV file in the same folder
def multiply_original_data(file_name, column_names, repeat_count):
    # Load the CSV file
    exp_ts = pd.read_csv(file_name, skiprows=1, names=column_names)
    
    # Create a new DataFrame by repeating the original DataFrame `repeat_count` times
    exp_ts_xtimes = pd.concat([exp_ts] * repeat_count, ignore_index=True)
    
    # Update the 'FrameMillis' column with sequential values
    exp_ts_xtimes['FrameMillis'] = list(range(len(exp_ts_xtimes)))
    
    # Generate the new file name with `_x{repeat_count}` appended
    new_file_name = file_name.replace('.csv', f'_x{repeat_count}.csv')
    
    # Save the new DataFrame to a CSV file
    exp_ts_xtimes.to_csv(new_file_name, sep=',', index=False, encoding='utf-8')
    
    print(f"File saved as {new_file_name}")
    return exp_ts_xtimes, new_file_name


In [19]:
# Function to delete random rows from a DataFrame and save the modified version
def delete_random_rows(df, perc_to_del, file_name, seed):
    # The number of rows to delete
    rows_to_delete = int(len(df) * perc_to_del)

    # Randomly sample rows to delete with a seed
    rows_to_delete_indices = df.sample(n=rows_to_delete, random_state=seed).index
    rows_to_delete_df = pd.DataFrame({'DeletedIndices': rows_to_delete_indices})
    
    # Store indices of deleted rows to a CSV file
    deleted_rows_file = 'deleted_rows_' + file_name
    rows_to_delete_df.to_csv(deleted_rows_file, sep=',', index=False, encoding='utf-8')
    
    # Drop the randomly selected rows
    df_with_gaps = df.drop(rows_to_delete_indices)

    file_with_gaps = file_name.replace('.csv', '_with_gaps.csv')
    
    # Save the DataFrame with gaps to a new CSV file
    df_with_gaps.to_csv(file_with_gaps, sep=',', index=False, encoding='utf-8')
    
    print(f"File saved with gaps as {file_with_gaps}")
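
The sampling step in `delete_random_rows` can be checked on a small in-memory frame. This is a minimal sketch, assuming a synthetic 110-row series with made-up values instead of the real CSV file. It shows that the same seed always selects the same rows, and that the gaps can be recovered as the `FrameMillis` values missing from the result.

```python
import pandas as pd

# Synthetic stand-in for the 110-row series (values are made up for illustration)
df = pd.DataFrame({"FrameMillis": range(110), "Value": [0.5] * 110})

# Same sampling call as in delete_random_rows, with the notebook's seed
n_to_delete = int(len(df) * 0.1)
dropped = df.sample(n=n_to_delete, random_state=10).index
df_with_gaps = df.drop(dropped)

# Re-running with the same seed selects the same rows
dropped_again = df.sample(n=n_to_delete, random_state=10).index
print(list(dropped) == list(dropped_again))

# The gaps are the FrameMillis values absent from the gapped frame
missing = sorted(set(df["FrameMillis"]) - set(df_with_gaps["FrameMillis"]))
print(len(df_with_gaps), len(missing))
```

Because the index here coincides with `FrameMillis`, the missing values equal the dropped indices. For a file whose `FrameMillis` column is not a simple 0..n-1 range, the saved `deleted_rows_` file remains the reliable record of what was removed.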

In [20]:
original_file_name = "Exp3P1V5happiness.csv"
column_names = ["FrameMillis", "Value", "EmotionRange"]
multiplication_sizes = [10, 100, 1000]
percent_to_delete = 0.1
seed = 10

# Repeat the original file at each size, save the enlarged copy, then create gaps in it
for i in multiplication_sizes:
    multiplied_df, generated_file_name = multiply_original_data(original_file_name, column_names, i)
    
    delete_random_rows(multiplied_df, percent_to_delete, generated_file_name, seed)

# Create gaps in the original file
original_df = pd.read_csv(original_file_name, skiprows=1, names=column_names)
delete_random_rows(original_df, percent_to_delete, original_file_name, seed)

File saved as Exp3P1V5happiness_x10.csv
File saved with gaps as Exp3P1V5happiness_x10_with_gaps.csv
File saved as Exp3P1V5happiness_x100.csv
File saved with gaps as Exp3P1V5happiness_x100_with_gaps.csv
File saved as Exp3P1V5happiness_x1000.csv
File saved with gaps as Exp3P1V5happiness_x1000_with_gaps.csv
File saved with gaps as Exp3P1V5happiness_with_gaps.csv
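
The number of rows removed from each file follows from the expression `int(len(df) * perc_to_del)` used in `delete_random_rows`. The short calculation below works this out for each file size, assuming the original file holds exactly 110 data rows after the header is skipped.

```python
# Row counts per file, assuming the original series has 110 data rows
original_rows = 110
percent_to_delete = 0.1

counts = {}
for factor in [1, 10, 100, 1000]:
    total = original_rows * factor
    deleted = int(total * percent_to_delete)  # same expression as in delete_random_rows
    counts[factor] = (total, deleted, total - deleted)
    print(f"x{factor}: {total} rows, {deleted} deleted, {total - deleted} remaining")
```

Under that assumption, the original file loses 11 of its 110 rows and the x1000 file loses 11000 of its 110000 rows.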


After a file has been prepared and 10% of its data deleted, it is inserted into the database for further transformations.

Connected to the database.
Timeseries table created successfully.
Data inserted into Timeseries table successfully.
Signal_Values_timestamp table created successfully.
Data inserted into Signal_Values_timestamp table successfully.
Database connection closed.
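
The code that produced the log above is not shown. As a rough sketch of the same connect, create table, insert, close sequence, the example below uses Python's built-in `sqlite3` module as a stand-in, so it runs without a database server. The notebook itself imports `psycopg2`, which suggests a PostgreSQL database. The table name `Timeseries` comes from the log. The column types, the sample rows, and the helper `load_rows` are assumptions made for illustration.

```python
import sqlite3


def load_rows(conn, rows):
    """Create a Timeseries table (hypothetical schema) and insert the given rows."""
    cur = conn.cursor()
    cur.execute(
        "CREATE TABLE IF NOT EXISTS Timeseries ("
        "FrameMillis INTEGER PRIMARY KEY, Value REAL, EmotionRange TEXT)"
    )
    cur.executemany(
        "INSERT INTO Timeseries (FrameMillis, Value, EmotionRange) VALUES (?, ?, ?)",
        rows,
    )
    conn.commit()
    # Return the row count so the caller can confirm the insert
    return cur.execute("SELECT COUNT(*) FROM Timeseries").fetchone()[0]


# Made-up sample rows; FrameMillis 2 is absent, mimicking a gap
sample_rows = [(0, 0.12, "low"), (1, 0.15, "low"), (3, 0.40, "medium")]

conn = sqlite3.connect(":memory:")  # in-memory stand-in for the real database
inserted = load_rows(conn, sample_rows)
print(f"{inserted} rows in Timeseries")
conn.close()
```

With `psycopg2` the overall shape would be similar, but the connection would come from `psycopg2.connect(...)` and the query placeholders would be `%s` instead of `?`.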
