# Text Analysis using custom dictionaries

The pyliwc package provides text analysis with the LIWC framework from Python. It quantifies linguistic and psychological features of text data, which makes it useful for researchers and data scientists working in text analytics.

In this tutorial, we analyze a single string. The goal is to gain insight into the linguistic style and psychological attributes it expresses, using both the internal dictionaries and custom dictionaries.

 

# Internal Dictionaries

The default internal dictionaries available are:


| Dictionary                                      | Language      | Parameter Value in `liwc_dict` |
|-------------------------------------------------|---------------|--------------------------------|
| *LIWC-22 Dictionary*                           | English       | LIWC22                         |
| *LIWC2015 Dictionary*                          | English       | LIWC2015                       |
| *LIWC2007 Dictionary*                          | English       | LIWC2007                       |
| *LIWC2001 Dictionary*                          | English       | LIWC2001                       |
| *DE-LIWC2015 Dictionary*                       | German        | DE-LIWC2015                    |
| *LIWC2015 Dictionary - Chinese (Simplified) (v1.5)* | Chinese (Simplified) | CHNSIMPLLIWC2015          |
| *LIWC2015 Dictionary - Chinese (Traditional) (v1.5)*  | Chinese (Traditional) | CHNTRADLIWC2015          |
| *MR-LIWC2015 Dictionary*                       | Marathi       | MRLIWC2015                     |
| *ES-LIWC2007*                                  | Spanish       | ESLIWC2007                     |
| *J-LIWC2015 Dictionary*                       | Japanese      | JLIWC2015                      |

`LIWC22` is the default option.
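Because `liwc_dict` takes either one of the names above or, as shown later in this tutorial, a path to a dictionary file, it can help to check the value before running an analysis. The following is a minimal sketch and not part of pyliwc: `INTERNAL_DICTS` is copied from the table, `classify_liwc_dict` is a hypothetical helper, and the accepted file extensions (`.dicx`, plus the older `.dic`) are an assumption based on the example file name used in this tutorial.

```python
from pathlib import Path

# Internal dictionary names, copied from the table above.
INTERNAL_DICTS = {
    "LIWC22", "LIWC2015", "LIWC2007", "LIWC2001", "DE-LIWC2015",
    "CHNSIMPLLIWC2015", "CHNTRADLIWC2015", "MRLIWC2015",
    "ESLIWC2007", "JLIWC2015",
}


def classify_liwc_dict(value: str = "LIWC22") -> str:
    """Return 'internal' for a built-in name, 'external' for a dictionary path.

    Hypothetical helper for illustration; pyliwc does not provide it.
    """
    if value in INTERNAL_DICTS:
        return "internal"
    # Assumption: user dictionaries are files ending in .dicx (or legacy .dic).
    if Path(value).suffix.lower() in {".dicx", ".dic"}:
        return "external"
    raise ValueError(f"Unknown dictionary name or unsupported file type: {value!r}")


print(classify_liwc_dict("LIWC2015"))
print(classify_liwc_dict("dictionaries/stereotype-content-dictionary.dicx"))
```

A misspelled name such as `"LIWC-22"` raises a `ValueError` here instead of being handed to the CLI.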



```python
from pyliwc import Liwc

# Initialize the Liwc instance with the LIWC CLI executable
liwc = Liwc('LIWC-22-cli.exe')

text = "On this day, we gather because we have chosen hope over fear, unity of purpose over conflict and discord."

r = liwc.analyze_string_to_json(text, liwc_dict='LIWC2015')
print(r)
```
```text
Picked up JAVA_TOOL_OPTIONS: -Dfile.encoding=UTF-8


{'Segment': 1, 'WC': 19, 'Analytic': 77.17, 'Clout': 99, 'Authentic': 19.27, 'Tone': 25.77, 'WPS': 19, 'Sixltr': 21.05, 'Dic': 78.95, 'function': 52.63, 'pronoun': 15.79, 'ppron': 10.53, 'i': 0, 'we': 10.53, 'you': 0, 'shehe': 0, 'they': 0, 'ipron': 5.26, 'article': 0, 'prep': 21.05, 'auxverb': 5.26, 'adverb': 0, 'conj': 10.53, 'negate': 0, 'verb': 10.53, 'adj': 0, 'compare': 0, 'interrog': 0, 'number': 0, 'quant': 0, 'affect': 10.53, 'posemo': 5.26, 'negemo': 5.26, 'anx': 5.26, 'anger': 0, 'sad': 0, 'social': 15.79, 'family': 0, 'friend': 0, 'female': 0, 'male': 0, 'cogproc': 15.79, 'insight': 0, 'cause': 10.53, 'discrep': 5.26, 'tentat': 5.26, 'certain': 0, 'differ': 0, 'percept': 0, 'see': 0, 'hear': 0, 'feel': 0, 'bio': 0, 'body': 0, 'health': 0, 'sexual': 0, 'ingest': 0, 'drives': 31.58, 'affiliation': 15.79, 'achieve': 5.26, 'power': 10.53, 'reward': 0, 'risk': 0, 'focuspast': 0, 'focuspresent': 10.53, 'focusfuture': 5.26, 'relativ': 21.05, 'motion': 0, 'space': 15.79, 'time': 5.
```
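The result prints as a flat mapping from category name to score, so ordinary dictionary operations are enough to summarize it. The following is a minimal sketch, assuming the result behaves like a `dict`. `sample` holds a handful of values copied from the output above, and `top_categories` and `SUMMARY_KEYS` are illustrative names, not part of pyliwc.

```python
# A few entries copied from the output above (not the full result).
sample = {
    "Segment": 1, "WC": 19, "Analytic": 77.17, "Clout": 99,
    "function": 52.63, "pronoun": 15.79, "we": 10.53,
    "prep": 21.05, "affect": 10.53, "drives": 31.58, "anger": 0,
}

# Assumption: these keys are text-level summaries rather than
# dictionary categories, so they are left out of the ranking.
SUMMARY_KEYS = {
    "Segment", "WC", "WPS", "Analytic", "Clout",
    "Authentic", "Tone", "Sixltr", "Dic",
}


def top_categories(result, n=3):
    """Return the n highest-scoring categories as (name, score) pairs."""
    scores = {k: v for k, v in result.items() if k not in SUMMARY_KEYS}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:n]


print(top_categories(sample))
# [('function', 52.63), ('drives', 31.58), ('prep', 21.05)]
```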

# User-Created LIWC Dictionaries

To analyze your text with an external dictionary file instead, pass the path to that file as the `liwc_dict` argument.

Visit [liwc.app/dictionaries](https://www.liwc.app/dictionaries/dict-user) to download a user-created dictionary.

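The example below uses the Stereotype Content Dictionary (Nicolas et al.; see the table at the end of this page), which captures the Stereotype Content Model in text. Any other downloaded dictionary file is passed the same way, for example the Personal Values Dictionary developed by Ponizovskiy et al. (2020), which measures the 10 Schwartz Values (and 4 higher-order value dimensions).
> Ponizovskiy, V., Ardag, M., Grigoryan, L., Boyd, R., Dobewall, H., & Holtz, P. (2020). Development and validation of the Personal Values Dictionary: A theory-driven tool for investigating references to basic human values in text. European Journal of Personality, 34(5), 885–902. https://doi.org/10.1002/per.2294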

```python
from pyliwc import Liwc

# Initialize the Liwc instance with the LIWC CLI executable
liwc = Liwc('LIWC-22-cli.exe')

text = "On this day, we gather because we have chosen hope over fear, unity of purpose over conflict and discord."

r = liwc.analyze_string_to_json(text, liwc_dict='dictionaries/stereotype-content-dictionary.dicx')
print(r)
```
```text
Segment                       1.00
WC                           19.00
WPS                          19.00
BigWords                     21.05
Dic                          26.32
Valence_Pos                   3.00
Valence_Neg                   4.11
Valence_Neut                 19.21
Sociability_Freq              0.00
Sociability_Direction         0.00
Morality_Freq               526.32
Morality_Direction           -5.26
Ability_Freq                  0.00
Ability_Direction             0.00
Agency_Freq                1052.63
Agency_Direction              0.00
Health_Freq                 526.32
Health_Direction             -5.26
Status_Freq                   0.00
Status_Direction              0.00
Work_Freq                     0.00
Work_Direction                0.00
Politics_Freq                 0.00
Politics_Direction            0.00
Religion_Freq                 0.00
Religion_Direction            0.00
Beliefs_Other_Freq            0.00
Beliefs_Other_Direction       0.00
Inhabitant_Freq               0.00
Country_Freq                  0.00
Feelings_Freq              1052.63
Relatives_Freq                0.00
Clothing_Freq                 0.00
Ordinariness_Freq             0.00
Ordinariness_Direction        0.00
Body_Part_Freq                0.00
Body_Property_Freq            0.00
Skin_Freq                     0.00
Body_Covering_Freq            0.00
Beauty_Freq                   0.00
Beauty_Direction              0.00
Insults_Freq                  0.00
STEM_Freq                     0.00
Humanities_Freq               0.00
Art_Freq                      0.00
Social_Groups_Freq            0.00
Lacks_Knowledge_Freq          0.00
Fortune_Freq                  0.00
AllPunc                      15.79
Period                        5.26
Comma                        10.53
QMark                         0.00
Exclam                        0.00
Apostro                       0.00
OtherP                        0.00
Emoji                         0.00
```
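Many categories in this output come in `_Freq`/`_Direction` pairs, so it can be convenient to regroup them by dimension and keep only the dimensions that actually occurred in the text. The following is a minimal sketch, assuming the scores are available as a plain `dict` (a few values are copied from the output above). `group_dimensions` is a hypothetical helper, not part of pyliwc.

```python
# A few entries copied from the output above (not the full result).
sample = {
    "WC": 19.0,
    "Sociability_Freq": 0.0, "Sociability_Direction": 0.0,
    "Morality_Freq": 526.32, "Morality_Direction": -5.26,
    "Agency_Freq": 1052.63, "Agency_Direction": 0.0,
    "Health_Freq": 526.32, "Health_Direction": -5.26,
    "Feelings_Freq": 1052.63,
}


def group_dimensions(result):
    """Group '<name>_Freq' / '<name>_Direction' keys by dimension name,
    keeping only dimensions whose frequency is non-zero."""
    grouped = {}
    for key, value in result.items():
        # Split on the last underscore so names like 'Beliefs_Other' stay whole.
        name, sep, kind = key.rpartition("_")
        if sep and kind in {"Freq", "Direction"}:
            grouped.setdefault(name, {})[kind] = value
    return {
        name: vals for name, vals in grouped.items()
        if vals.get("Freq", 0) != 0
    }


for name, vals in group_dimensions(sample).items():
    print(name, vals)
```

Keys without a `_Freq` or `_Direction` suffix, such as `WC`, are ignored by this grouping.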


# Other Dictionaries Available

| Dictionary Name                                    | Description                                                                                                     | Author                                      | Uploaded   |
|----------------------------------------------------|-----------------------------------------------------------------------------------------------------------------|---------------------------------------------|------------|
| Absolutist Dictionary                             | Measures absolutist thinking (e.g., always, never) in texts                                                      | Al-Mosaiwi & Johnstone                     | 2022-01-01 |
| Age Stereotypes Dictionary                        | Reflects eight broadly-defined stereotypes identified in past research as descriptive of older adults             | Jessica Remedios                           | 2022-01-01 |
| Agitation/Dejection Dictionary                    | Based on studies linking promotion versus prevention focus with the emotions “Agitation” and “Dejection”            | Johnsen et al.                             | 2022-01-01 |
| AI Focus Dictionary                               | 122 AI-related words for measuring a company's focus on AI                                                          | Mishra, Ewing, & Cooper                    | 2024-03-14 |
| American Indian Stereotype Dictionary             | A custom dictionary to find the prevalence of words associated with American Indian stereotypes and conventional narratives | Aleksandra Sherman                       | 2022-01-01 |
| anticoagulation                                    | Words associated with anticoagulation therapy - primarily nouns added to supplement existing dictionaries          | Peter Whittaker                            | 2023-02-11 |
| Behavioral Activation Dictionary                  | Captures linguistic indicators of planning and participation in enjoyable activities                               | Burkhardt et al.                           | 2022-01-01 |
| Big Two (Agency & Communion) Dictionary           | Measure the degree to which a person is thinking in terms of agency/communion.                                      | Pietraszkiewicz et al.                     | 2022-01-01 |
| Body Type Dictionary                              | A content analysis dictionary for automating the scoring of Fisher & Cleveland's (1958) body type framework in English-language texts. | Andrew Wilson                            | 2022-01-01 |
| Brand Personality Dictionary                      | Assesses Aaker’s five brand personality dimensions as well as 42 personality trait norms                            | Opoku et al.                               | 2022-01-01 |
| Climate Change Dictionary                         | Dictionary used to quantify mentions of climate change in online discussion                                       | Miti Shah                                  | 2023-05-23 |
| Controversial Terms Lexicon                        | A lexicon of terms that range in their degree of controversiality, particularly in terms of their use in the media. | Mejova et al.                              | 2022-01-01 |
| Corporate Social Responsibility Dictionary        | Reveals four dimensions of corporate social responsibility                                                        | Nadra Pencle & Irina Mălăescu               | 2022-01-01 |
| Cost/Benefit Dictionary                            | Measures language related to perceived costs and benefits that result from a decision or behavior.                 | Michael McCullough                         | 2022-01-01 |
| Creativity and Innovation Dictionary              | Language describing creation and/or innovation                                                                     | Neufeld and Gaucher                        | 2022-01-01 |
| Crovitz Innovator Identification Method           | Identify “innovators” and “non-innovators” using Herbert F. Crovitz’s 42 relational words                             | Greco et al.                               | 2022-01-01 |
| Dehumanization Dictionary                          | Measures several types of (de)humanization, such as mechanistic and animalistic dehumanization                     | Samantha Platten                           | 2022-01-01 |
| Diccionario de polaridad y clase de palabras - Esp| Polarity (+/-) dictionary organized by word class (Spanish)                                                         | Javier Blasco-Pascual                      | 2022-09-13 |
| Digital Orientation Dimensions                    | Digital orientation dimensions keyword classification based on the research of Kindermann et al. (2021)              | Konstantinos Emexidis                      | 2023-05-17 |
| Empath Default Dictionary                          | General content-coding dictionary from the "Empath" package                                                          | Fast, Chen, & Bernstein                    | 2022-01-01 |
| Empathic Concern Lexicon                           | An automatically created empathy dictionary extracted from document-level ratings                                   | João Sedoc                                 | 2023-09-14 |
| English Prime Dictionary                          | Captures violations of the English Prime system, a theoretical marker of cognitive inflexibility.                  | Ryan L. Boyd                               | 2022-01-01 |
| Enriched American Food Lexicon                     | Dictionary containing ~500 foods mapped to USDA categories, with some caloric information.                           | Abbar et al.                               | 2022-01-01 |
| Entrepreneurial and Mentoring Dictionary           | A dictionary of entrepreneurial and mentoring terms                                                                  | Steven D'Alessandro and Morgan Miles       | 2022-01-01 |
| extended Moral Foundations Dictionary (eMFD)      | The eMFD, unlike previous methods, is constructed from text annotations generated by a large sample of human coders.  | Hopp et al.                                | 2022-01-01 |
| Foresight Lexicon                                 | Measures the degree to which anticipation/foresight occurs. That is, words pointing to indicate where things are heading (often on the basis of recurrent behaviors). | Robert Hogenraad                           | 2022-01-01 |
| Forest Values Dictionary                           | Reflects four distinct ways in which people value forests and forest ecosystems                                       | Bengston & Xu                              | 2022-01-01 |
| General Inquirer IV                                | The "original" mainstream text analysis dictionary that is still used often today. Many categories are of questionable validity. | Philip Stone et al.                       | 2022-01-01 |
| German-Language STEM Dictionary                    | Used to measure communication about STEM (Science, Technology, Engineering, Mathematics).                          | Michael Heilemann                          | 2022-01-01 |
| Global Citizen Dictionary                          | A dictionary to assess language usage related to global citizenship                                                   | Stephen Reysen et al.                      | 2022-01-01 |
| Grant Evaluation Dictionary                        | Captures categories relevant to scientific grant review (ability, achievement, agentic, research, standout, pos eval, neg eval) | Kaatz et al.                               | 2022-01-01 |
| Grievance Dictionary                               | A psycholinguistic dictionary that can be used to automatically understand language use in the context of grievance-fueled violence threat assessment | Isabelle van der Vegt                    | 2022-01-01 |
| Home Perceptions Dictionary                        | Calculates the frequency of words describing clutter, a sense of the home as unfinished, restful words, and nature words | Saxbe & Repetti                           | 2022-01-01 |
| Honor Dictionary                                  | This dictionary is designed to diagnose “honor talk” in any text that you are interested in analyzing.                | Gelfand et al.                             | 2022-01-01 |
| Imagination Lexicon                               | Digital lexicon of 627 entries relative to imagination and transfiguration, i.e., words pointing to the unbelievable and whatever is beyond the real. | Robert Hogenraad                           | 2022-01-01 |
| Incel Violent Extremism Dictionary                 | Three types of words, including common words about violence, weapons, and some of the most recognizable incel terms  | Baele et al.                               | 2023-02-13 |
| Invective Dictionary                              | Use this dictionary to detect invective language in narrative                                                        | A. T. Panter                               | 2022-01-01 |
| Linguistic Category Model (LCM) Dictionary         | A computerized LCM analysis method                                                                                   | Yi-Tai Seih                                | 2022-01-01 |
| LIWC-UD: Urban Dictionary LIWC Supplements         | An automatically generated extension to LIWC’s dictionary which includes terms defined in Urban Dictionary            | Bahgat, Wilson, & Magdy                    | 2023-02-13 |
| Loughran-McDonald LM dic                           | Loughran-McDonald                                                                                                    | Loughran-McDonald                          | 2023-09-14 |
| Loughran-McDonald financial sentiment              | Tone dictionary                                                                                                      | Loughran-McDonald                          | 2023-03-07 |
| Loughran-McDonald Financial Sentiment Dictionary  | Dictionary for measuring positive and negative sentiment specifically in financial texts.                            | Loughran & McDonald                        | 2022-01-01 |
| MarcadoresDiscursivos - Español.dicx               | Native dictionary of Spanish discourse particles                                                                       | Javier Blasco-Pascual                      | 2022-08-25 |
| Masculine & Feminine Words                         | List of masculine and feminine words from Gaucher et al. (2011)                                                       | Maureen McCusker                           | 2022-01-01 |
| Mindfulness Dictionary                             | Two categories of mindfulness language describing the mindfulness state and the more encompassing “mindfulness journey” | Collins et al.                             | 2022-01-01 |
| Mind Perception Dictionary                         | Measures linguistic use of mind perception (words related to “agency” and “experience”) in naturalistic settings      | Schweitzer & Waytz                         | 2022-01-01 |
| Moral Foundations Dictionary 2.0                   | An updated version of the Moral Foundations Dictionary that is recommended over the original by its creators.       | Jeremy Frimer                               | 2022-01-01 |
| Moral Foundations Dictionary                        | Provides information on the proportions of virtue and vice words for each foundation on whatever corpus of text you are interested in | Jesse Graham and Jonathan Haidt           | 2022-01-01 |
| Morality-as-Cooperation Dictionary                 | Codes for constructs from the "Morality-as-Cooperation" theory, originating from ethnographic accounts of morality    | Mark Alfano, Marc Cheong, & Oliver Scott Curry | 2024-02-29 |
| Moral Justification Dictionary                      | Measures variation in justification content (deontological, consequentialist, or emotive) as a function of moral foundations | Wheeler & Laham                            | 2022-01-01 |
| Moral Universalism in French                        | A set of terminologies associated with morally universal language, based on the English version of Graham et al.        | Michael Jetter & Akim Defesche             | 2024-05-21 |
| Moral Universalism in German                        | A set of terminologies associated with morally universal language, based on the English version of Graham et al.        | Michael Jetter                              | 2024-05-21 |
| Moral Universalism in Italian                       | A set of terminologies associated with morally universal language, based on the English version of Graham et al.        | Michael Jetter & Marianna Piantavigna      | 2024-05-21 |
| Moral Universalism in Spanish                       | A set of terminologies associated with morally universal language                                                      | Michael Jetter & Raquel Bra Nunez          | 2024-05-01 |
| Motivated Social Cognition Dictionary              | Measures the degree to which a person is thinking in a way that reflects various motivated social cognitions (e.g., uncertainty avoidance) | Jayme Renfro et al.                        | 2022-01-01 |
| Mystical Language Dictionary                        | A set of words that, when encountered in an experience report, indicate the occurrence of mystical elements within the experience. | Marija Franka Žuljevića et al.            | 2023-11-03 |
| Nonconformity                                       | Nonconformity                                                                                                        | Patti                                       | 2024-03-12 |
| Nostalgia Dictionary                                | A 98-word dictionary that captures expressions of nostalgia in text.                                                 | Jia Chen                                    | 2023-05-16 |
| Open Science Dictionary                            | This is a validated dictionary that taps into open science categories (e.g., open science, preregistration, replication) | David Markowitz                             | 2022-05-08 |
| Pain Dictionary                                    | Measures the language of pain disclosure. Made with special attention to previously validated pain scales              | Wright et al.                               | 2022-01-01 |
| Personal Values Dictionary                          | Measures the 10 Schwartz Values (and 4 higher-order value dimensions).                                                | Ponizovskiy et al.                          | 2022-01-01 |
| Physiological Sensations Dictionary                 | Measures the amount that a person is describing physical sensations (e.g., bloated, dizzy, faint).                    | Shaffer, Kim, & Yoon                        | 2022-01-01 |
| Policy Position Dictionary                          | Provides estimation of the policy position of political texts (e.g., transcribed speeches) in the United Kingdom        | Laver & Garry                               | 2022-01-01 |
| Pornography Lexicon Dictionary                      | This lexicon is based on merging LIWC's internal lexica of sexuality and drives (2015) with Essam's lexicon of pornography (2017). | Encarnacion S. Arenas                      | 2022-01-01 |
| Privacy Dictionary                                  | Contains a list of words that people use when talking about privacy. Organised into 8 categories that measure different dimensions of privacy | Vasalou et al.                             | 2022-01-01 |
| promotion                                          | PROMFOC                                                                                                              | Gamache                                     | 2023-11-06 |
| Prorefugee Content Dictionary                       | Measures the proportion of words that contained prorefugee content within a text (rather than refugee content in general) | Smith et al.                                | 2022-01-01 |
| Prosocial Words Dictionary                          | Calculates the density of prosocial words in anything that a person says                                              | Jeremy Frimer                               | 2022-01-01 |
| Qualia Dictionary                                  | Words referring to the five senses, broken down into types of qualia (e.g., for vision: colors, luminance, actions, and shapes). | Molly Ireland et al.                       | 2022-01-01 |
| Regressive Imagery Dictionary                       | Dictionary designed to capture primary (i.e., primordial) and secondary (i.e., "conceptual") processes from natural language. | Colin Martindale                            | 2022-01-01 |
| Regulatory Mode Dictionary                          | Locomotion and Assessment States of Goal Pursuit                                                                      | Dana Kanze, Mark A. Conley, and E. Tory Higgins | 2022-01-01 |
| Romantic Love Dictionary                           | Measures of “emotional investment” and “attraction” for historical assessment of romantic love                         | Mauricio de Jesus Dias Martins & Nicolas Baumard | 2023-12-27 |
| Security Language Dictionary                        | Provides a reference for the comparative study of security-related linguistic repertoires in political texts (speeches, policy documents, etc.). | Stephane Baele & Olivier Sterck            | 2022-01-01 |
| Self-Care Dictionary                                | Measures the degree to which self-care words are used (e.g., diet, yoga)                                                | Xunyi Wang et al.                           | 2022-01-01 |
| Self-Determination/Self-Talk Dictionary             | Designed for use with self-talk data. Measures autonomy-supportive versus controlling language within self-talk.       | Oliver et al.                               | 2022-01-01 |
| Self-Transcendent Emotion Dictionary (STED)         | Scores texts for emotions that are invaluable to promote greater human connectedness, prosociality, and human flourishing | Ji & Raney                                 | 2022-01-01 |
| Situational 8 Dictionary (S8-LIWC)                  | Captures language that corresponds to each of the DIAMONDS dimensions of situations                                    | David G. Serfass                            | 2022-01-01 |
| Sleep Dictionary                                   | A dictionary designed to capture sleep-related communications                                                           | Ilana Ladis                                 | 2023-02-11 |
| Social Ties Dictionary                             | Assesses spontaneous indicators of important social relationships                                                     | Pressman & Cohen                            | 2022-01-01 |
| Stereotype Content Dictionary                      | A stereotype content dictionary, made using a semi-automated method, to capture the Stereotype Content Model in text | Nicolas et al.                              | 2022-01-01 |
| Stress Dictionary                                  | A dictionary used to measure psychological stress. Created based on the LIWC2007 English Dictionary.                     | Wei Wang et al.                             | 2022-01-01 |
| The Weighted Reflection-Reorganizing List          | Captures reflection on the emotional significance of an event or events in one's own or someone else's life, or in a dream/fantasy. | Murphy, Bucci, & Maskit                     | 2022-01-01 |
| Threat Dictionary                                  | This dictionary was designed to diagnose threatening language in any text that interests you.                          | Virginia K. Choi et al.                     | 2022-02-23 |
| Transactive Memory Systems (TMS) Strength          | This dictionary provides a measure of the positive and negative indicators of TMS from group conversational text.      | Jonathan Kush, Brandy Aven, Linda Argote   | 2023-08-10 |
| Water Metaphor Dictionary                          | Measures the frequency of inundation-metaphoric expressions as it relates to immigration.                               | Tyler Jimenez                               | 2022-01-01 |
| Weighted Referential Activity Dictionary           | Yields a Referential Activity (RA) score for any natural language segment.                                             | Bucci & Maskit                              | 2022-01-01 |
| Well-being Dictionary                              | Words that might indicate the presence of purpose or meaning                                                            | Ratner et al.                               | 2022-01-01 |
| Whirlall Dictionary                                | Measures words related to "whirling" and "twirling" in Rorschach responses                                             | Ryan L. Boyd                                | 2022-01-01 |
