DReAMy is a python library to automatically analyse (for now only) textual dream reports. At the moment, annotations are based on different Hall & Van de Castle features (e.g., character, emotions), but we are looking forward to expanding DReAMy's capabilities. For more details on the theoretical aspect, please refer to the pre-print. You can also follow us on Twitter, to keep up with updates and relevant work!
For more details on the theoretical aspect, please refer to the.
DReAMy can be easily installed via pip, and we do recommend using a virtual environment with python 3.9 or 3.10 installed. If you wish to play/query DReAMy's model, you can do so using the 🤗 Space DReAM
.
pip install dreamy
At the moment, DReAMy has three main features:
- Data: download datasets, containing dream reports from DreamBank.
- Analyse: collect (contextualised) embeddings, dimensionality reduction, clustering. (Colab tutorial)
- Annotate: label dream reports following different HVDC features, using one of three tasks – entity recognition, sentiment analysis, and relation extraction. (Colab tutorial)
Usage examples of all features can be found in the code below, and in the tutorials in the dedicated folder.
DReAMy has direct access to two datasets. A smaller English-only (~ 22k), with more descriptive variables (such as gender and year of collection), and a larger one (~ 30k), with reports both in English and German. You can download them using the simple code below.
import dreamy
# choose between base (~ 22k reports, EN-only & more descriptive variables)
# large (~ 29k reports, reports in EN and De, only series as descriptive variables)
database = "base"
dream_bank = dreamy.get_HF_DreamBank(database=database, as_dataframe=True)
dream_bank.sample(2)
index | series | description | dreams | gender | year |
---|---|---|---|---|---|
5875 | blind-f | Blind dreamers (F) | I'm at work in the office of a rehab teacher named D, a transistor radio is on, I held it in my hand and placed it on my desk. [...]. | female | mid-1990s |
12888 | emma | Emma: 48 years of dreams | I go to Pedro's house, he is fixing his bike. [...] | female | 1949-1997 |
You can also use DReAMy to easily extract, reduce, cluster, and visualise (contextualised) encodings (i.e., vector embeddings) of dream reports, with few and simple lines of code. At the moment, you can choose between four models, which are combinations of small/large English-only/Multilingual models.
import dreamy
# Get some data in a list form
list_of_reports = dream_bank["dreams"].tolist()
# set up model and get encodings
model_size = "small" # or large
model_lang = "english" # or multi, for multilingual
device = "cpu" # se to "cuda" for GPUs
report_encodings = dreamy.get_encodings(
list_of_reports,
model_size=model_size,
language=model_lang,
device=device,
)
# reduce space
# you can choose between pca/t-sne
X, Y = dreamy.reduce_space(report_encodings, method="pca")
# Update your original dataframe with coordinates and plot
dream_bank["DR_X"], dream_bank["DR_Y"] = X, Y
You can then use your favourite library to visualise the results.
As mentioned, the Annotate features revolve around three tasks:
-
NER : (name entity recognition) which annotates reports with respect to the character appearing in a report.
-
SA: (sentiment analysis) which annotates reports with respect to which of the five Hall & Van de Castle emotions (anger, apprehension, confusion, sadn4ssm happiness) appear in a report (possibly, also which character is experiencing them)
-
RE: (relation extraction), which extracts entities (characters) in a report and the relation between them. At the moment, the only RE task available refers to the activity feature of the HVDC framework.
All the tasks are handled via the main annotate_reports
method and can be called by simply changing the task
argument. Check the dedicated tutorial for more.
task = "SA"
batch_size = 16
device = "cpu" # or "cuda" / device number (e.g., 0) for GPU
SA_predictions = dreamy.annotate_reports(
list_of_reports,
task=task,
device=device,
batch_size=batch_size,
)
SA_predictions
[[{'label': 'SD', 'score': 0.9931567311286926},
{'label': 'HA', 'score': 0.08149773627519608},
{'label': 'CO', 'score': 0.04012126475572586},
{'label': 'AN', 'score': 0.007265480700880289},
{'label': 'AP', 'score': 0.006692806258797646}],
...
Please refer to the tutorial(s) for more information and details.
If you wish to contribute, collaborate, or just ask a question, feel free to contact Lorenzo, or use the issue section.
If you use DReAMy, please cite the work
@article{BERTOLINI2024406,
title = {DReAMy: a library for the automatic analysis and annotation of dream reports with multilingual large language models},
journal = {Sleep Medicine},
volume = {115},
pages = {406-407},
year = {2024},
note = {Abstracts from the 17th World Sleep Congress},
issn = {1389-9457},
doi = {https://doi.org/10.1016/j.sleep.2023.11.1092},
url = {https://www.sciencedirect.com/science/article/pii/S1389945723015186},
author = {L. Bertolini and A. Michalak and J. Weeds}
}
DReAMy was supported by the Horizon 2020 project Humane AI (grant N° 952026)