Analyze text for the most informative semantic network data based on optimal entropy.

jdanowski/Entropix

Entropix

We introduce an optimal entropy model that sets a standard for extracting the informative content of text. Benchmarks are established using 123 text files totaling 34 GB, most of which were extracted from Facebook posts through its CrowdTangle database. Eight international conflicts are examined using the optimal entropy program, Entropix.R. The optimal entropy metric shows that the Taiwan and China conflict is more multi-topical, indicating greater relational multiplexity, than the comparably sized Ukraine and Russia conflict, which has half the topic variety. To illustrate topic extraction using community detection, we examine the conflict between Israel and Palestine in more detail. The extracted groups have face validity. Optimal entropy sets a standard for semantic network analysis and enables comparative analysis of corpora in terms of informative value and topic diversity.
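As a rough illustration of the kind of quantity involved, the sketch below computes the Shannon entropy of the word-pair frequency distribution of a text. It is not the Entropix.R algorithm, and it does not implement the optimal entropy criterion described in the paper. The function names (`word_pairs`, `shannon_entropy`, `pair_entropy`), the simple tokenizer, and the sliding-window size are all assumptions made for this example, and Python is used here only for convenience.

```python
import math
import re
from collections import Counter


def word_pairs(text, window=2):
    """Return ordered word pairs that co-occur within a sliding window.

    Hypothetical helper: window=2 pairs each word with the next word only.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    pairs = []
    for i, left in enumerate(tokens):
        for right in tokens[i + 1 : i + window]:
            pairs.append((left, right))
    return pairs


def shannon_entropy(counts):
    """Shannon entropy, in bits, of a frequency table (e.g. a Counter)."""
    total = sum(counts.values())
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def pair_entropy(text, window=2):
    """Entropy of the word-pair distribution of a text (0.0 if no pairs)."""
    counts = Counter(word_pairs(text, window))
    return shannon_entropy(counts) if counts else 0.0


if __name__ == "__main__":
    repetitive = "war war war war"
    varied = "peace talks resume amid new trade sanctions"
    print(pair_entropy(repetitive), pair_entropy(varied))
```

Under these assumptions, a text that repeats the same word pair scores near zero, while a text with many distinct pairs scores higher. That is the intuition behind using entropy to compare corpora on topic variety.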

The theoretical, methodological, and empirical bases for Entropix are described in this paper: https://docs.google.com/document/d/1aPMRycKKUXdS62NuRz1H9jVyI-b1rf4jfti7c8oMuf8/edit?usp=sharing

Contact me at jdanowski@gmail.com
