We introduce an optimal entropy model that sets a standard for extracting the informative content of text. Benchmarks are established using 123 text files totaling 34 GB, most of which were extracted from Facebook posts through its CrowdTangle database. Eight international conflicts are examined using the optimal entropy program, Entropix.R. The optimal entropy metric reveals that the Taiwan-China conflict is more multi-topical, indicating greater relational multiplexity, than the comparably sized Ukraine-Russia conflict, which has half the topic variety. To illustrate topic extraction using community detection, we examine the Israel-Palestine conflict in more detail. The extracted groups have face validity. Optimal entropy sets a standard for semantic network analysis and enables comparative analysis of corpora in terms of informative value and topic diversity.
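To give a feel for how an entropy measure can separate a narrowly focused corpus from a multi-topical one, here is a minimal Python sketch. It computes only the plain Shannon entropy of a word-frequency distribution, which is an assumption made for illustration: it is not the optimal entropy metric and does not reproduce Entropix.R, whose definition is given in the paper rather than in this summary. The function name `word_entropy` and both toy texts are hypothetical.

```python
import math
import re
from collections import Counter


def word_entropy(text: str) -> float:
    """Shannon entropy, in bits, of the word-frequency distribution of text.

    Illustration only: this is ordinary Shannon entropy, not the
    optimal entropy metric computed by Entropix.R.
    """
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    counts = Counter(words)
    total = len(words)
    # H = -sum(p * log2(p)) over the relative frequency p of each distinct word
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Toy texts invented for this example; they are not drawn from the benchmark data.
narrow = "tanks troops tanks troops tanks troops tanks troops"
broad = "trade chips navy tourism election drills tariffs diplomacy"

h_narrow = word_entropy(narrow)
h_broad = word_entropy(broad)
print(f"narrow: {h_narrow:.2f} bits, broad: {h_broad:.2f} bits")
```

The text that repeats two words scores lower than the text that spreads evenly over eight words, which is the general intuition behind using entropy to compare topic diversity across corpora.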
The following paper describes the theoretical, methodological, and empirical bases for Entropix: https://docs.google.com/document/d/1aPMRycKKUXdS62NuRz1H9jVyI-b1rf4jfti7c8oMuf8/edit?usp=sharing
Contact me at jdanowski@gmail.com