# **Table of contents**

# **Abstract**

# **1. Short introduction**

# **2. Data collection & dataset overview**

Before we start wrangling, cleaning, and analyzing the data, it's essential to understand what we are working with. Here is some vital information about our dataset and the data collection process.

The online survey was created in [Qualtrics XM](https://www.qualtrics.com/) and distributed through various social media channels. Recruitment was open from February 17, 2023, to March 29, 2023. The survey consisted of the following sections:

- **Homepage**

  Participants were informed about the study conditions, a general outline of their participation, and an estimate of its duration. The introduction mentioned that certain parts of the survey might cause some emotional discomfort. It was stated that complete anonymity would be ensured and participants could withdraw from the study at any time. Clicking the "Proceed" button at the bottom of the page indicated consent to participate.

- **Demographic metrics**

  Basic demographic data of participants were collected: age, gender, level of education, and worldview on spiritual matters. In this section, participants were also asked about their preferred survey version – masculine or feminine. Based on their choice, subsequent content was presented using either masculine or feminine forms of verbs and adjectives (Polish language).

- **Meditation/mindfulness practice**

  Participants were asked if they practice meditation/mindfulness. If they did, they were asked to estimate the number of minutes devoted to this practice in the last 30 days.

- **Psychedelics use**

  A multiple-choice list was presented for participants to mark the psychedelic compounds they had used at least once in their lives, with an option to add other substances. The list included: LSD (or 1P-LSD); psilocybin mushrooms (or synthetic psilocybin); ayahuasca; DMT (other than ayahuasca); 5-MeO-DMT; mescaline; ibogaine; salvia divinorum.

  Those who selected at least one substance were asked to estimate how many times they had used psychedelics and at what subjective doses (microdoses, low, average, high, very high). Salvia divinorum and ibogaine were eventually excluded from the psychedelics group because they do not meet the criterion for classic psychedelics (affinity for 5-HT<sub>2A</sub> receptors).

- **Transcendental experiences**

  Participants were asked if they had experienced a state of consciousness characterized by: **(1)** an altered sense of time and/or space, **(2)** a sense of awe, wonder, or fear, **(3)** ineffability, and **(4)** subjective transcendental/mystical qualities. Participants who answered negatively skipped the remaining items in this section and the entire Mystical Experience Questionnaire (MEQ30), and proceeded directly to the dependent-variable questionnaires.

  Participants who answered affirmatively were shown a multiple-choice list to select the circumstances of the experience. If the circumstances involved a psychedelic substance, they specified which compound triggered the most intense experience. Those who selected multiple circumstances were asked to specify which one accompanied the first experience and which one accompanied the most intense experience. The last item asked how long ago the most intense experience occurred.

- **Revised Mystical Experience Questionnaire (MEQ30)**

  This questionnaire, consisting of 30 items, measures the intensity of mystical experiences (MacLean et al., 2012; Barrett et al., 2015). Based on Walter Stace's classic concept (1960/1973), four dimensions are measured: *Mysticism*, *Positive Affect*, *Transcendence of Time and Space*, and *Ineffability*. Participants rated their experience on a 6-point Likert scale. The Polish translation prepared by the study's author was used (α = 0.95, N = 515).

  Confirmatory factor analysis showed a structural difference from the original. The item "Experience of amazement" loaded onto the *Ineffability* dimension rather than *Positive Affect*. However, retaining the original structure preserved slightly better reliability (α<sub>mysticism</sub> = 0.95; α<sub>positive affect</sub> = 0.80; α<sub>transcendence</sub> = 0.85; α<sub>ineffability</sub> = 0.79).

- **Perth Empathy Scale (PES)**

  This tool measures cognitive and affective empathy (Brett et al., 2022), categorized into positive and negative emotions, forming four subscales. The study used the *Positive* and *Negative Affective Empathy* subscales (α<sub>positive-affective</sub> = 0.70; α<sub>negative-affective</sub> = 0.72, N = 676), contributing to *Overall Affective Empathy* (α = 0.73; 10 items). Each item followed a structured format with responses on a 5-point Likert scale. The Polish version by Paweł Larionow and Karolina Mudło-Głagolska (2022) was used.

- **Satisfaction with Life Scale (SWLS)**

  Created by Ed Diener and colleagues (1985), this scale has 5 items rated on a 7-point Likert scale. It measures life satisfaction (example item: *If I could live my life over, I would change almost nothing*). The Polish translation by Konrad Jankowski (2015) was used (α = 0.86, N = 676).

- **Death Attitude Profile - Revised (DAP-R-PL)**

  This tool measures five areas of attitudes toward death, including types of death acceptance and avoidance, and fear of death (Wong et al., 1994). Only the *Fear of Death* subscale was used (example item: *Death is undoubtedly an unpleasant experience*) in the Polish adaptation by Paweł Brudek et al. (2020). This subscale has 10 items rated on a 7-point Likert scale (α = 0.92, N = 676).

- **Subjectively perceived influence**

  This section was for participants who reported undergoing a mystical experience. They answered three questions about the impact of the experience on empathy, life satisfaction, and fear of death, rated on a 5-point Likert scale. Participants could also share their experience in an optional text field.
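The reliability coefficients (α) reported for the questionnaires above are presumably Cronbach's alpha values, although the text does not name the coefficient explicitly. As a minimal sketch under that assumption, the function below computes alpha from a table of item responses. The function name, the toy item names, and the listwise handling of missing answers are illustrative choices, not the procedure used in the study.

```python
import pandas as pd


def cronbach_alpha(items: pd.DataFrame) -> float:
    """Cronbach's alpha for a scale (rows = respondents, columns = items)."""
    # Assumption: respondents with any missing item are dropped (listwise deletion).
    items = items.dropna()
    k = items.shape[1]                               # number of items in the scale
    item_variances = items.var(axis=0, ddof=1)       # sample variance of each item
    total_variance = items.sum(axis=1).var(ddof=1)   # variance of the summed score
    return (k / (k - 1)) * (1 - item_variances.sum() / total_variance)


# Toy responses for a hypothetical 3-item scale (not real survey data).
toy_items = pd.DataFrame({
    "item_1": [1, 2, 3, 4, 5],
    "item_2": [2, 3, 4, 5, 6],
    "item_3": [1, 3, 3, 5, 5],
})
alpha = cronbach_alpha(toy_items)
print(f"Cronbach's alpha: {alpha:.2f}")
```

With the real data, the same function could be applied to the columns of a single subscale once the masculine and feminine survey versions have been combined.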

# **3. Imports**

## **3.1. Importing libraries**

Let's first import all the libraries and packages needed to run the following code cells.

In [2]:
import pandas as pd

## **3.2. Loading dataset**

With pandas imported, we can load our dataset from the GitHub repository as a pandas DataFrame.

In [4]:
dataset_url = "https://github.com/michal-owsiak/research/raw/main/dataset.xlsx"
df = pd.read_excel(dataset_url)

Now let's check how big the loaded set is...

In [5]:
print(f"The dataset consists of {df.shape[0]} records and {df.shape[1]} variables.")

The dataset consists of 1127 records and 193 variables.


...and what all those variables are.

In [6]:
list(df.columns)

['start_date',
 'end_date',
 'progress_in_percent',
 'duration_in_seconds',
 'finished',
 'recorded_date',
 'age',
 'sex',
 'survey_version',
 'worldview',
 'education',
 'meditation_M',
 'meditation_minutes_M',
 'compound_never_M',
 'compound_LSD_M',
 'compound_psylocybin_M',
 'compound_ayahuasca_M',
 'compound_DMT_M',
 'compound_5MeODMT_M',
 'compound_mescaline_M',
 'compound_ibogaine_M',
 'compound_salvia_M',
 'compound_other_M',
 'compound_text_M',
 'use_amount_M',
 'microdose_M',
 'low_dose_M',
 'average_dose_M',
 'high_dose_M',
 'very_high_dose_M',
 'mystical_experience_M',
 'context_psychedelic_M',
 'context_other_psychoactive_M',
 'context_NDE_M',
 'context_meditation_M',
 'context_ritual_M',
 'context_hypnosis_M',
 'context_other_M',
 'context_other_psychoactive_text_M',
 'context_other_text_M',
 'trigger_compound_M',
 'order_M',
 'intensity_M',
 'how_long_ago_M',
 'MEQ30_1_M',
 'MEQ30_2_M',
 'MEQ30_3_M',
 'MEQ30_4_M',
 'MEQ30_5_M',
 'MEQ30_6_M',
 'MEQ30_7_M',
 'MEQ30_8_M',
 '

*Tip: To get a comprehensive overview of all columns, viewing the above list as a scrollable element is recommended.*

We observe that, excluding the first 11 columns, all variables are duplicated and carry either an `_F` or an `_M` suffix. This duplication stems from the survey's interactive nature: the main survey flow split into two paths, one using masculine and the other feminine language forms, depending on each participant's choice. Variables ending with `_F` contain data from the feminine version of the survey, while those ending with `_M` come from the masculine version.
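The pairing of suffixed columns can also be checked programmatically. The following is a minimal sketch of such a check. The helper name `unpaired_columns` is hypothetical, and the `_F` names in the toy list are assumed, since the column listing above is truncated before any of them appear.

```python
def unpaired_columns(columns):
    """Return base names that have an `_M` column without an `_F` twin, or vice versa."""
    masculine = {name[:-2] for name in columns if name.endswith("_M")}
    feminine = {name[:-2] for name in columns if name.endswith("_F")}
    # The symmetric difference keeps base names present in only one of the two sets.
    return sorted(masculine ^ feminine)


# Toy column list (illustrative only); "order_M" deliberately lacks an "_F" twin.
toy_columns = ["age", "sex", "meditation_M", "meditation_F", "order_M"]
missing = unpaired_columns(toy_columns)
print(missing)
```

Calling `unpaired_columns(df.columns)` on the loaded dataset should return an empty list if every suffixed variable really has a counterpart in the other survey version.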

We don't want to analyze those two types of variables separately, so merging each `_F` and `_M` pair into a single variable should be one of our first steps. Thus, let's now proceed to the **Data wrangling and cleaning** section.
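As a preview of that aggregation step, here is a minimal sketch of one way to merge each `_M`/`_F` pair into a single column. It assumes that every participant completed only one survey version, so at most one column of each pair holds a value in any row. The helper name, the toy values, and the use of `combine_first` are illustrative assumptions and may differ from the approach taken in the next section.

```python
import pandas as pd


def merge_language_versions(frame: pd.DataFrame) -> pd.DataFrame:
    """Collapse each `<name>_M` / `<name>_F` pair into a single `<name>` column."""
    merged = frame.copy()
    bases = [
        name[:-2]
        for name in frame.columns
        if name.endswith("_M") and f"{name[:-2]}_F" in frame.columns
    ]
    for base in bases:
        # Take the masculine-version answer and fill its gaps from the feminine version.
        merged[base] = frame[f"{base}_M"].combine_first(frame[f"{base}_F"])
        merged = merged.drop(columns=[f"{base}_M", f"{base}_F"])
    return merged


# Toy frame (illustrative values only, not taken from the dataset).
toy = pd.DataFrame({
    "survey_version": ["masculine", "feminine"],
    "meditation_M": ["yes", None],
    "meditation_F": [None, "no"],
})
result = merge_language_versions(toy)
print(result)
```

If both columns of a pair ever held a value in the same row, this sketch would silently keep the `_M` value, so checking for such rows first would be a sensible precaution.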