<!doctype html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Topics App</title>
<meta name="viewport" content="width=device-width, initial-scale=1">
<meta name="author" content="DARIAH-DE">
<meta name="description" content="Topics App">
{{ js_resources|safe }}
{{ css_resources|safe }}
{{ corpus_boxplot_script|safe }}
{{ heatmap_script|safe }}
{{ topics_script|safe }}
{{ documents_script|safe }}
<link rel="stylesheet" href="{{url_for('static', filename='css/bootstrap.css')}}" type="text/css" media="screen, projection" />
<link rel="stylesheet" href="{{url_for('static', filename='css/bootstrap-responsive.css')}}" type="text/css" media="screen, projection" />
<link rel="stylesheet" href="{{url_for('static', filename='css/application.css')}}" type="text/css" media="screen, projection" />
<link rel="stylesheet" href="{{url_for('static', filename='css/bootstrap-customization.css')}}" type="text/css" media="screen, projection" />
<link rel="stylesheet" href="{{url_for('static', filename='css/bootstrap-modal.css')}}" type="text/css" media="screen, projection" />
<link rel="stylesheet" href="{{url_for('static', filename='css/font-awesome.css')}}">
<script type="text/javascript" src="{{url_for('static', filename='js/jquery-1.8.2.js')}}"></script>
<script type="text/javascript" src="{{url_for('static', filename='js/bootstrap.js')}}"></script>
<script type="text/javascript" src="{{url_for('static', filename='js/globalmenu.js')}}"></script>
<link rel="shortcut icon" type="image/png" href="{{url_for('static', filename='img/page_icon.png')}}" />
</head>
<body>
<div id="content">
<div style="position: fixed; width: 100%; z-index: 100;" class="navbar navbar-inverse navbar-static-top navbar-dariah" id="top">
<div class="navbar-inner">
<div class="container-fluid">
<div class="row-fluid">
<div class="span1"></div>
<div class="span10">
<div class="nav-collapse collapse">
<ul class="nav">
<li>
<span class="brand dropdown-toggle" data-toggle="dropdown">DARIAH-DE</span>
</li>
</ul>
<ul class="nav pull-right">
<li>
<a href="{{ url_for('index') }}"><i class="icon-refresh icon-white"></i> Reset</a>
</li>
<li>
<a href="#"><i class="icon-download-alt icon-white"></i> Save Data</a>
</li>
<li>
<a href="{{ url_for('help') }}"><i class="icon-question-sign icon-white"></i> Help</a>
</li>
</ul>
</div>
</div>
</div>
</div>
</div>
</div>
<div id="content_layout" class="container-fluid">
<div style="height: 70px;"></div>
<div class="row-fluid">
<div class="span10 offset1 main-content-wrapper no-margin">
<div id="content" class="primary-area">
<h1>Topics – Easy Topic Modeling</h1>
<div id="contentInner" style="text-align:justify;">
<h2>1. Corpus and Parameter Summary</h2>
<div>
{% for table in parameter %} {{ table|safe }} {% endfor %}<br>
<center>
{{ corpus_boxplot_div|safe }}</center>
</div><br>
<h2>2. Inspecting the Topic Model</h2>
<p>Topic models are <i>unsupervised</i>: the algorithm receives no labels describing semantic structures or anything related, only pure word frequencies. Because the examples given to the algorithm are unlabeled,
there is no built-in measure of accuracy, or of how <i>good</i> your model is. It is therefore up to you to inspect the model and decide whether you are satisfied with its performance.</p>
<div class="alert alert-info">
<button type="button" class="close" data-dismiss="alert">×</button>
<b>Tip:</b> Evaluating topics quantitatively (a topic here being a list of words, as seen below) is a very challenging task. <b>Pointwise Mutual Information</b> (PMI) is one way to evaluate the semantic coherence of topics. We implemented
two variants of PMI in Python, which are available via GitHub (https://github.com/DARIAH-DE/Topics/dariah_topics/evaluation.py).
</div>
<h3>2.1. Topics</h3>
<p>Each topic is a probability distribution over the vocabulary of words found in the corpus. The top words (so-called <i>keys</i>) shown in the table below are the most probable words in each topic and describe the semantic structures
of your corpus, ideally in a meaningful way. Lists of the top keys associated with each topic are often all that is needed when the corpus is large and the inferred topics make sense in light of prior knowledge of the corpus.</p><br> {% for table in topics %} {{ table|safe }} {% endfor %}
<br>
<h3>2.2. Topics and Documents</h3>
<p>Each topic has a proportion in each document, which can be visualized in a heatmap. This view displays the kind of information that is probably most useful to literary scholars. Going beyond pure exploration, this visualization can be used to show
thematic developments over a set of texts as well as within a single text, akin to a dynamic topic model. It can also become apparent here that some topics correlate highly with a specific author or group of authors, while other topics correlate
highly with a specific text or group of texts. All in all, this displays two of LDA's properties: its use as a distant reading tool that aims to get at text meaning, and its use as a provider of data that can be further used in computational
analysis, such as document classification or authorship attribution.</p><br> {{ heatmap_div|safe }}<br><br>
<h3>2.3. Topic Proportions of Documents</h3>
<br>{{ topics_div|safe }}<br><br>
<h3>2.4. Document Proportions of Topics</h3>
<br>{{ documents_div|safe }}<br><br><br>
<h2>3. Diving Deeper into Topic Modeling</h2>
<p>We want to empower users with little or no previous experience or programming skills to create custom workflows, mostly using predefined functions within a familiar environment. So, if this practical introduction aroused your interest and
you want to <b>dive deeper into the technical parts</b>, we provide another convenient, modular workflow that can be entirely controlled from within a well-documented Jupyter notebook, integrating a total of three popular LDA implementations.</p>
<p>All resources are available via GitHub (https://github.com/DARIAH-DE/Topics).</p>
</div>
</div>
</div>
</div>
<div class="row-fluid">
<div id="footer" class="span10 offset1 no-margin footer">
<span>© 2017-2018 DARIAH-DE</span>
</div>
</div>
</div>
</div>
</body>
</html>