Fix typo
severinsimmler committed Jun 3, 2018
1 parent a704a97 commit f51929a
Showing 2 changed files with 7 additions and 7 deletions.
12 changes: 6 additions & 6 deletions application/templates/model.html
@@ -24,7 +24,7 @@
{% block content %}
<h1>Topics – Easy Topic Modeling</h1>
<div id="contentInner" style="text-align:justify;">
-<h2>1. Corpus and Parameter Summary</h2>
+<h2>1. Corpus and parameter summary</h2>
<p>All parameter settings are summarized in the following table, among others. This kind of information could be useful if you want to create more than one topic model and compare the results. The most common way to evaluate a probabilistic model is to calculate the so-called <b>log-likelihood</b>. If you increase the number of iterations, you will see that not only your topics get better and better, but also the log-likelihood will increase up to a <b>certain point</b>. This way you could find the ideal number of iterations.
{% for table in parameter %}
{{ table|safe }}
@@ -40,7 +40,7 @@ <h2>1. Corpus and Parameter Summary</h2>
<button type="button" class="close" data-dismiss="alert">&times;</button>
<b>FYI:</b> All tables and graphics shown here are available for <b>download as ZIP archive</b>. Just use the button in the toolbar.
</div>
-<h2>2. Inspecting the Topic Model</h2>
+<h2>2. Inspecting the topic model</h2>
<p>Topic models are unsupervised. It is called <i>unsupervised</i>, because you did not have any labels describing the semantic structures of your documents, but only pure word frequencies. Therefore, there is no automatic evaluation of how <i>good</i> the topics are. So, it is up to you by inspecting the model to decide whether you are satisfied with your models’ performance or not.
<div class="alert alert-info">
<button type="button" class="close" data-dismiss="alert">&times;</button>
@@ -54,12 +54,12 @@ <h3>2.1. Topics</h3>
{{ table|safe }}
{% endfor %}
<br>
-<h3>2.2. Topics and Documents</h3>
+<h3>2.2. Topics and documents</h3>
<p>Each document <i>consists</i> to a certain extent of each topic (this is the theoretical assumption of topic models). The proportions can be visualized in a heatmap. This displays the kind of information that is <b>probably most useful to literary scholars</b>. Going beyond pure exploration, this visualization can be used to show <b>thematic developments</b> over a set of texts as well as a single text, akin to a dynamic topic model. What also can become apparent here, is that some topics correlate highly with a <b>specific author or group of authors</b>, while other topics correlate highly with a <b>specific text or group of texts</b>. All in all, this displays two of LDA’s properties – its use as a <b>distant reading tool</b> that aims to get at text meaning, and its use as a provider of data that can be further used in computational analysis, such as document classification or authorship attribution.</p>
{{ heatmap_div|safe }}
<br>
<br>
-<h3>2.3. Distribution of Topics</h3>
+<h3>2.3. Distribution of topics</h3>
<p>In the following graphic you can access <i>one</i> dimension of the information displayed in the heatmap above. This might be a more clear approach, if you are interested in a specific topic, or, more precisely, how the topic is distributed over the documents of your corpus. You can use the widget to select a specific topic.
</p>
<div class="alert alert-success">
@@ -76,7 +76,7 @@ <h3>2.3. Distribution of Topics</h3>
{{ topics_div|safe }}
<br>
</div>
-<h3>2.4. Distribution of Documents</h3>
+<h3>2.4. Distribution of documents</h3>
<p>Similar to the above barchart, you can access the <i>other</i> dimension displayed in the heatmap. So, if you are intereseted in a specific <i>document</i>, you have the ability to select it via the widget and inspect its proportions.
</p>
<div class="alert alert-success">
@@ -93,7 +93,7 @@ <h3>2.4. Distribution of Documents</h3>
{{ documents_div|safe }}
<br>
</div>
-<h2>2. Delving Deeper into Topic Modeling</h2>
+<h2>2. Delving deeper into topic modeling</h2>
<p>We want to introduce users with little or no programming experience to digital methods. If this little insight into the text mining technique topic modeling has aroused your interest, and you want to delve deeper into the <b>technical parts</b>, we provide the same convenient, modular workflow which can be entirely controlled from within a well documented Jupyter notebook, integrating a total of three popular LDA implementations.</p>
<p>All resources are available via GitHub. To prevent dead links in this application, it is probably safer if you search the internet for the GitHub repository yourself. The name of the organization is <b>DARIAH-DE</b>, the name of the repository <b>Topics</b>.</p>
</div>
2 changes: 1 addition & 1 deletion tests/web_test.py
@@ -43,5 +43,5 @@ def test_model(client):
file.write("foo;bar\nfoo;bar")
resp = client.get("/model")
assert resp.status_code == 200
-assert b"Inspecting the Topic Model" in resp.data
+assert b"Inspecting the topic model" in resp.data
application.utils.unlink_content(tempdir)
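
Note on the test change: the assertion in tests/web_test.py does a byte-substring check against the response body, which is case-sensitive, so lowercasing the heading in model.html requires the matching change in the test. A minimal sketch of that behaviour follows; `rendered` is a hypothetical stand-in for `resp.data`, not the actual response of the application.

```python
# Hypothetical stand-in for resp.data (assumption: the real body is the
# fully rendered model.html, of which this is only the relevant heading).
rendered = b"<h2>2. Inspecting the topic model</h2>"

old_heading = b"Inspecting the Topic Model"
new_heading = b"Inspecting the topic model"

# Containment on bytes is an exact, case-sensitive comparison.
old_found = old_heading in rendered
new_found = new_heading in rendered

# A case-insensitive check would need both sides normalised first.
loose_found = old_heading.lower() in rendered.lower()
```

The commit keeps the strict comparison and updates the expected string, which means the test also guards the exact capitalisation of the heading.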
