# Original Data Sampler and Anonymizer

I created the anonymized mock dataset [data_social_all_participants.csv](data_social_all_participants.csv) from the original data file [original_data_social_all_participants.csv](), which is omitted here.

The file [original_column_names.csv](original_column_names.csv) lists the column names of the original data, one per line.
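
As a rough illustration of how the column-name file could be used, the sketch below reads such a list back and compares it with the columns of a mock dataset. It runs on in-memory toy data rather than the real files; the variable names (`names_csv`, `mock_df`, `removed`, `added`) and the toy contents are assumptions made for this example only.

```python
import io
import pandas as pd

# Toy stand-in for original_column_names.csv: one name per line, no header row.
# Only a handful of illustrative column names are used here.
names_csv = io.StringIO("participantNr\nplayer\nage\nuniqueID\nunique_rounds\n")

# header=None because the file is written without a header row
original_columns = pd.read_csv(names_csv, header=None)[0].tolist()

# Toy stand-in for the mock dataset (hypothetical contents)
mock_df = pd.DataFrame({
    "uniqueID": [1, 2],
    "unique_rounds": [1, 2],
    "group": ["adult", "adolescents"],
})

# Columns that exist in the original list but not in the mock data, and vice versa
removed = [c for c in original_columns if c not in mock_df.columns]
added = [c for c in mock_df.columns if c not in original_columns]
print("removed:", removed)
print("added:", added)
```

With the real files, `names_csv` would be the path to `original_column_names.csv` and `mock_df` would come from `pd.read_csv('data_social_all_participants.csv')`.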

In [51]:
# Import necessary libraries
import pandas as pd
import random

# Load the dataset from the CSV file
df = pd.read_csv('original_data_social_all_participants.csv')

# Save the column names to a new CSV file for reference
column_names = df.columns.tolist()
pd.DataFrame(column_names).to_csv('original_column_names.csv', index=False, header=False)

# Remove unnecessary columns from the dataset
columns_to_drop = ['participantNr', 'player', 'age']
df = df.drop(columns=columns_to_drop)

# Collect the unique values of 'uniqueID' to sample from
unique_values = df['uniqueID'].unique()

# Select 20 random unique values for anonymization
selected_values = random.sample(list(unique_values), 20)

# Keep only the rows belonging to the selected values
# (.copy() avoids pandas' SettingWithCopyWarning on the assignments below)
df = df[df['uniqueID'].isin(selected_values)].copy()

# Relabel the selected values as 1 to 20 according to their sorted order
selected_values.sort()
id_map = {value: i + 1 for i, value in enumerate(selected_values)}
df['uniqueID'] = df['uniqueID'].map(id_map)

# Assign mock 'group' and 'gender' values from the parity of the new 'uniqueID'
is_even = df['uniqueID'] % 2 == 0
df.loc[is_even, 'group'] = 'adolescents'
df.loc[~is_even, 'group'] = 'adult'
df.loc[is_even, 'gender'] = 1
df.loc[~is_even, 'gender'] = 2

# Remap the values in 'unique_rounds' to consecutive integers starting at 1,
# numbered in order of first appearance
unique_rounds = df['unique_rounds'].unique()
remapped_values = {value: i+1 for i, value in enumerate(unique_rounds)}
df.loc[:, 'unique_rounds'] = df['unique_rounds'].map(remapped_values)

# Save the processed dataset to a new CSV file
df.to_csv('data_social_all_participants.csv', index=False)
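
The sampling above uses the global `random` state, so each run selects a different set of IDs. If a repeatable mock dataset is wanted, the sample-and-relabel step could be wrapped in a seeded helper along the lines of the sketch below. This is not part of the original cell: the function name `sample_and_relabel`, its parameters, and the toy data are all hypothetical.

```python
import random
import pandas as pd

def sample_and_relabel(df, id_col, n, seed=None):
    """Keep the rows of n randomly chosen IDs and relabel them 1..n in sorted order."""
    rng = random.Random(seed)  # local generator; the global random state is untouched
    # Sort the candidates first so that a given seed always picks the same IDs
    ids = sorted(df[id_col].unique().tolist())
    chosen = sorted(rng.sample(ids, n))
    mapping = {old: new for new, old in enumerate(chosen, start=1)}
    out = df[df[id_col].isin(chosen)].copy()
    out[id_col] = out[id_col].map(mapping)
    return out

# Toy data (made up for this example)
toy = pd.DataFrame({
    "uniqueID": ["p07", "p07", "p13", "p21", "p21", "p34"],
    "score": [3, 5, 2, 8, 1, 4],
})

mock = sample_and_relabel(toy, "uniqueID", n=3, seed=0)
print(mock)
```

Calling the helper again with the same seed returns the same rows and labels, while `seed=None` falls back to a fresh random selection on every run.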