Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Old norse prosody and Old Norse docs updated #810

Merged
merged 36 commits into from Aug 1, 2018

Conversation

Projects
3 participants
@clemsciences
Copy link
Member

clemsciences commented Jul 30, 2018

A little piece of code for automatic analysis of Old Norse poetry. Much better will come like Sievers' types recognition.

clemsciences added some commits Jun 25, 2018

Added invalided onsets for Old Norse, "x" and "z" sound representation
for Old Norse, updated test with the new break_geminants parameter
Merge remote-tracking branch 'origin/old_norse_prosody' into old_nors…
…e_prosody

# Conflicts:
#	cltk/prosody/old_norse/verse.py
#	docs/old_norse.rst
@codecov-io

This comment has been minimized.

Copy link

codecov-io commented Jul 30, 2018

Codecov Report

Merging #810 into master will increase coverage by 0.04%.
The diff coverage is 94.18%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #810      +/-   ##
==========================================
+ Coverage   88.52%   88.56%   +0.04%     
==========================================
  Files         168      169       +1     
  Lines       10393    10479      +86     
==========================================
+ Hits         9200     9281      +81     
- Misses       1193     1198       +5
Impacted Files Coverage Δ
cltk/prosody/old_norse/verse.py 94.18% <94.18%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update d7d3ace...01033b7. Read the comment docs.

clement and others added some commits Jul 30, 2018

clement

@kylepjohnson kylepjohnson requested a review from Sedictious Jul 30, 2018

@Sedictious
Copy link
Contributor

Sedictious left a comment

I really like the idea of additional prosody modules and this seems great so far. However, I still think some bits need further improvement before being ready to merge

* Ljóðaháttr
"""
@staticmethod
def is_fornyrdhislag(text: str):

This comment has been minimized.

@Sedictious

Sedictious Jul 31, 2018

Contributor

From what I get, you judge whether the meter is fornyrdhislag, solely upon the number of lines. While this could do, this still requires properly punctuated text, as this will fail for any text without newlines. This may be a overkill, but I think it would be worth to try and see methods for probabilistic metric partition, or at least add supplementary checks (check for alliteration rules?).

This comment has been minimized.

@clemsciences

clemsciences Jul 31, 2018

Author Member

Texts available at https://github.com/cltk/old_norse_texts_heimskringla/tree/master/S%C3%A6mundar-Edda are mostly well-formated and there are most of eddic poems! Here is an example https://github.com/cltk/old_norse_texts_heimskringla/blob/master/S%C3%A6mundar-Edda/V%C3%B6lusp%C3%A1/txt_files/complete.txt. So the format is not really a problem and the kind of formatting is enough. The aim here is more to check if a stanza is a fornyrðislag or a ljóðaháttr than to check if a stanza is prose or poetry. Moreover, eddic poems are not always well-formed, so that, for example, we can find verses without alliteration or strange alliteration.

I exclude for now probabilistic metric partition because it would need a large data set and the Poetic Edda is not that large. I think I'm going to annotate it. For skaldic poetry, it will be needed, because it is more strictly written.

Of course, I will explain specific words and concepts in the docs.

So this is not finished yet! I think I'm going to complete this pull request with a basic alliteration detector, but this would be only to help future annotations and statistics on eddic poems.

This comment has been minimized.

@Sedictious

Sedictious Jul 31, 2018

Contributor

Fair enough, I guess it would make more sense to keep things as is and worry about generalizing this later. But yes, I would too make sure to mention the formatting restrictions in the docs

:param text:
:return:
"""
l = [line for line in text.split("\n") if line != ""]

This comment has been minimized.

@Sedictious

Sedictious Jul 31, 2018

Contributor

Again, alliteration checks would be useful.

for j, viisuordh in enumerate(line):
syllabified_text[i].append([])
words = []
for word in tokenize_old_norse_words(viisuordh):

This comment has been minimized.

@Sedictious

Sedictious Jul 31, 2018

Contributor

a nitpick, but why not replace it with regex? You would also get an extra boost in speed.

This comment has been minimized.

@clemsciences

clemsciences Jul 31, 2018

Author Member

Regular expressions are most of the time not easy to understand and less easier to do (yes, I would need to think, if I used them). Time optimization is not my first priority, so maybe later.

clemsciences added some commits Jul 31, 2018

@clemsciences

This comment has been minimized.

Copy link
Member Author

clemsciences commented Aug 1, 2018

@Sedictious I think it is ready for this pull request.

@Sedictious
Copy link
Contributor

Sedictious left a comment

Great to see another prosody module!

@clemsciences clemsciences merged commit 5427c94 into cltk:master Aug 1, 2018

3 checks passed

codecov/patch 94.18% of diff hit (target 88.52%)
Details
codecov/project 88.56% (+0.04%) compared to d7d3ace
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details

@clemsciences clemsciences added this to Done in old-norse Feb 7, 2019

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.