mostly docs, and returned local variable to non-'local_'-prefixed form
chrislit committed Oct 10, 2018
1 parent 906e0fe commit f79c739
Showing 10 changed files with 250 additions and 243 deletions.
2 changes: 1 addition & 1 deletion README.rst
@@ -65,7 +65,7 @@ Abydos
:target: https://libraries.io/pypi/abydos
:alt: Libraries.io SourceRank

-.. image:: https://img.shields.io/badge/Pylint-9.52/10-green.svg
+.. image:: https://img.shields.io/badge/Pylint-9.51/10-green.svg
:target: #
:alt: Pylint Score

38 changes: 20 additions & 18 deletions abydos/compression.py
@@ -51,9 +51,9 @@ def ac_train(text):
This is based on Andrew Dalke's public domain implementation
:cite:`Dalke:2005`. It has been ported to use the fractions.Fraction class.
-:param text: The text data over which to calculate probability statistics.
-This must not contain the NUL (0x00) character because that's used to
-indicate the end of data.
+:param str text: The text data over which to calculate probability
+statistics. This must not contain the NUL (0x00) character because
+that's used to indicate the end of data.
:returns: a probability dict
:rtype: dict
@@ -117,8 +117,9 @@ def ac_encode(text, probs):
This is based on Andrew Dalke's public domain implementation
:cite:`Dalke:2005`. It has been ported to use the fractions.Fraction class.
-:param text: A string to encode
-:param probs: A probability statistics dictionary generated by ac_train
+:param str text: A string to encode
+:param dict probs: A probability statistics dictionary generated by
+ac_train
:returns: The arithmetically coded text
:rtype: tuple
@@ -162,9 +163,10 @@ def ac_decode(longval, nbits, probs):
This is based on Andrew Dalke's public domain implementation
:cite:`Dalke:2005`. It has been ported to use the fractions.Fraction class.
-:param longval: The first part of an encoded tuple from ac_encode
-:param nbits: The second part of an encoded tuple from ac_encode
-:param probs: A probability statistics dictionary generated by ac_train
+:param int longval: The first part of an encoded tuple from ac_encode
+:param int nbits: The second part of an encoded tuple from ac_encode
+:param dict probs: A probability statistics dictionary generated by
+ac_train
:returns: The arithmetically decoded text
:rtype: str
@@ -198,8 +200,8 @@ def bwt_encode(word, terminator='\0'):
together to improve compression.
Cf. :cite:`Burrows:1994`.
-:param word: the word to transform using BWT
-:param terminator: a character to add to word to signal the end of the
+:param str word: the word to transform using BWT
+:param str terminator: a character to add to word to signal the end of the
string
:returns: word encoded by BWT
:rtype: str
@@ -231,8 +233,8 @@ def bwt_decode(code, terminator='\0'):
together to improve compression. This function reverses the transform.
Cf. :cite:`Burrows:1994`.
-:param code: the word to transform from BWT form
-:param terminator: a character added to word to signal the end of the
+:param str code: the word to transform from BWT form
+:param str terminator: a character added to word to signal the end of the
string
:returns: word decoded by BWT
:rtype: str
@@ -270,9 +272,9 @@ def rle_encode(text, use_bwt=True):
Digits 0-9 cannot be in text.
-:param text: a text string to encode
-:param use_bwt: boolean indicating whether to perform BWT encoding before
-RLE encoding
+:param str text: a text string to encode
+:param bool use_bwt: boolean indicating whether to perform BWT encoding
+before RLE encoding
:returns: word decoded by BWT
:rtype: str
@@ -310,9 +312,9 @@ def rle_decode(text, use_bwt=True):
Digits 0-9 cannot have been in the original text.
-:param text: a text string to decode
-:param use_bwt: boolean indicating whether to perform BWT decoding after
-RLE decoding
+:param str text: a text string to decode
+:param bool use_bwt: boolean indicating whether to perform BWT decoding
+after RLE decoding
:returns: word decoded by BWT
:rtype: str
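
For readers unfamiliar with the transform that the `bwt_encode` and `bwt_decode` docstrings describe, here is a minimal sketch of a naive Burrows-Wheeler transform, written with the typed `:param str ...:` field style this commit introduces. The names `bwt_encode_sketch` and `bwt_decode_sketch` are hypothetical, and this is not the Abydos implementation. It assumes the terminator never occurs in the input.

```python
def bwt_encode_sketch(word, terminator='\0'):
    """Return the Burrows-Wheeler transform of a word (naive version).

    :param str word: the word to transform
    :param str terminator: a character appended to mark the end of the word
    :returns: the last column of the sorted rotation table
    :rtype: str
    """
    if terminator in word:
        raise ValueError('terminator must not occur in word')
    word += terminator
    # Build every rotation of the terminated word and sort them.
    rotations = sorted(word[i:] + word[:i] for i in range(len(word)))
    # The transform is the last character of each sorted rotation.
    return ''.join(rotation[-1] for rotation in rotations)


def bwt_decode_sketch(code, terminator='\0'):
    """Invert bwt_encode_sketch (naive version).

    :param str code: a string produced by bwt_encode_sketch
    :param str terminator: the terminator used when encoding
    :returns: the original word, without the terminator
    :rtype: str
    """
    if not code:
        return ''
    # Repeatedly prepend the code column and re-sort; after len(code)
    # rounds the table holds every rotation of the original string.
    table = [''] * len(code)
    for _ in range(len(code)):
        table = sorted(code[i] + table[i] for i in range(len(code)))
    # The rotation that ends with the terminator is the original string.
    for row in table:
        if row.endswith(terminator):
            return row[:-1]
    raise ValueError('terminator not found in code')
```

With `'$'` as the terminator, `bwt_encode_sketch('banana', '$')` gives `'annb$aa'`, and `bwt_decode_sketch('annb$aa', '$')` recovers `'banana'`. The sort-based decoder is chosen for brevity, not speed.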
34 changes: 17 additions & 17 deletions abydos/corpus.py
@@ -47,15 +47,15 @@ def __init__(self, corpus_text='', doc_split='\n\n', sent_split='\n',
- single newlines divide sentences
- other whitespace divides words
-:param corpus_text: the corpus text as a single string
-:param doc_split: a character or string used to split corpus_text into
-documents
-:param sent_split: a character or string used to split documents into
-sentences
-:param filter_chars: A list of characters (as a string, tuple, set, or
-list) to filter out of the corpus text
-:param stop_words: A list of words (as a tuple, set, or list) to filter
-out of the corpus text
+:param str corpus_text: the corpus text as a single string
+:param str doc_split: a character or string used to split corpus_text
+into documents
+:param str sent_split: a character or string used to split documents
+into sentences
+:param list filter_chars: A list of characters (as a string, tuple,
+set, or list) to filter out of the corpus text
+:param list stop_words: A list of words (as a tuple, set, or list) to
+filter out of the corpus text
>>> tqbf = 'The quick brown fox jumped over the lazy dog.\n'
>>> tqbf += 'And then it slept.\n And the dog ran off.'
@@ -87,7 +87,7 @@ def docs(self):
:returns: the paragraphs in the corpus as a list of lists of lists
of strs
-:rtype: list(list(list(str)))
+:rtype: [[[str]]]
>>> tqbf = 'The quick brown fox jumped over the lazy dog.\n'
>>> tqbf += 'And then it slept.\n And the dog ran off.'
@@ -111,7 +111,7 @@ def paras(self):
:returns: the paragraphs in the corpus as a list of lists of lists
of strs
-:rtype: list(list(list(str)))
+:rtype: [[[str]]]
>>> tqbf = 'The quick brown fox jumped over the lazy dog.\n'
>>> tqbf += 'And then it slept.\n And the dog ran off.'
@@ -131,7 +131,7 @@ def sents(self):
Each list within a sentence represents the words within that sentence.
:returns: the sentences in the corpus as a list of lists of strs
-:rtype: list(list(str))
+:rtype: [[str]]
>>> tqbf = 'The quick brown fox jumped over the lazy dog.\n'
>>> tqbf += 'And then it slept.\n And the dog ran off.'
@@ -149,7 +149,7 @@ def words(self):
r"""Return the words in the corpus as a single list.
:returns: the words in the corpus as a list of strs
-:rtype: list(str)
+:rtype: [str]
>>> tqbf = 'The quick brown fox jumped over the lazy dog.\n'
>>> tqbf += 'And then it slept.\n And the dog ran off.'
@@ -170,7 +170,7 @@ def docs_of_words(self):
Thus the sentence level of lists has been flattened.
:returns: the docs in the corpus as a list of list of strs
-:rtype: list(list(str))
+:rtype: [[str]]
>>> tqbf = 'The quick brown fox jumped over the lazy dog.\n'
>>> tqbf += 'And then it slept.\n And the dog ran off.'
@@ -216,9 +216,9 @@ def raw(self):
def idf(self, term, transform=None):
"""Calculate the Inverse Document Frequency of a term in the corpus.
-:param term: the term to calculate the IDF of
-:param transform: a function to apply to each document term before
-checking for the presence of term
+:param str term: the term to calculate the IDF of
+:param function transform: a function to apply to each document term
+before checking for the presence of term
:returns: the IDF
:rtype: float
"""
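
The body of `idf` is not shown in this hunk, so the following is only a sketch of what the docstring describes. It assumes the common definition log10(N / df), where N is the number of documents and df is the number of documents containing the term. The function name `idf_sketch`, the `docs_of_words` argument, and the choice to return infinity for an unseen term are all assumptions for illustration, not the Abydos code.

```python
from math import log10


def idf_sketch(docs_of_words, term, transform=None):
    """Calculate an inverse document frequency (assumed log10(N / df) form).

    :param list docs_of_words: the documents, each a list of word strings
    :param str term: the term to calculate the IDF of
    :param function transform: a function to apply to each document term
        before checking for the presence of term
    :returns: the IDF
    :rtype: float
    """
    docs_with_term = 0
    for doc in docs_of_words:
        doc_terms = set(doc)
        if transform is not None:
            # Normalize each document term (e.g. lowercase it) before lookup.
            doc_terms = {transform(word) for word in doc_terms}
        if term in doc_terms:
            docs_with_term += 1
    if docs_with_term == 0:
        # Assumption: a term absent from every document gets infinite IDF.
        return float('inf')
    return log10(len(docs_of_words) / docs_with_term)
```

Passing `transform=str.lower` makes the lookup case-insensitive on the document side, which is the kind of use the `transform` parameter description suggests.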
30 changes: 15 additions & 15 deletions abydos/distance.py
@@ -1819,18 +1819,18 @@ def sim_ratcliff_obershelp(src, tar):
     >>> sim_ratcliff_obershelp('ATCG', 'TAGC')
     0.5
     """
-    def _lcsstr_stl(local_src, local_tar):
+    def _lcsstr_stl(src, tar):
         """Return start positions & length for Ratcliff-Obershelp.

         Return the start position in the source string, start position in
         the target string, and length of the longest common substring of
         strings src and tar.
         """
-        lengths = np_zeros((len(local_src)+1, len(local_tar)+1), dtype=np_int)
+        lengths = np_zeros((len(src)+1, len(tar)+1), dtype=np_int)
         longest, src_longest, tar_longest = 0, 0, 0
-        for i in range(1, len(local_src)+1):
-            for j in range(1, len(local_tar)+1):
-                if local_src[i-1] == local_tar[j-1]:
+        for i in range(1, len(src)+1):
+            for j in range(1, len(tar)+1):
+                if src[i-1] == tar[j-1]:
                     lengths[i, j] = lengths[i-1, j-1] + 1
                     if lengths[i, j] > longest:
                         longest = lengths[i, j]
@@ -1840,7 +1840,7 @@ def _lcsstr_stl(local_src, local_tar):
                     lengths[i, j] = 0
         return src_longest-longest, tar_longest-longest, longest

-    def _sstr_matches(local_src, local_tar):
+    def _sstr_matches(src, tar):
         """Return the sum of substring match lengths.

         This follows the Ratcliff-Obershelp algorithm :cite:`Ratcliff:1988`:
@@ -1851,13 +1851,13 @@ def _sstr_matches(local_src, local_tar):
            return 0.
         4. Return the sum.
         """
-        src_start, tar_start, length = _lcsstr_stl(local_src, local_tar)
+        src_start, tar_start, length = _lcsstr_stl(src, tar)
         if length == 0:
             return 0
-        return (_sstr_matches(local_src[:src_start], local_tar[:tar_start]) +
+        return (_sstr_matches(src[:src_start], tar[:tar_start]) +
                 length +
-                _sstr_matches(local_src[src_start+length:],
-                              local_tar[tar_start+length:]))
+                _sstr_matches(src[src_start+length:],
+                              tar[tar_start+length:]))

     if src == tar:
         return 1.0
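
The hunks above show only fragments of the Ratcliff-Obershelp helpers, so here is a self-contained sketch of the same recursion in pure Python, with no NumPy. The function names are hypothetical. The final ratio, twice the matched characters over the combined length, is the standard Ratcliff-Obershelp formula and is assumed here because the diff does not show that line; it is consistent with the `0.5` doctest for `'ATCG'` and `'TAGC'`.

```python
def lcs_substring(src, tar):
    """Return (src_start, tar_start, length) of the longest common substring."""
    # lengths[i][j] is the length of the common suffix of src[:i] and tar[:j].
    lengths = [[0] * (len(tar) + 1) for _ in range(len(src) + 1)]
    longest, src_end, tar_end = 0, 0, 0
    for i in range(1, len(src) + 1):
        for j in range(1, len(tar) + 1):
            if src[i - 1] == tar[j - 1]:
                lengths[i][j] = lengths[i - 1][j - 1] + 1
                if lengths[i][j] > longest:
                    longest, src_end, tar_end = lengths[i][j], i, j
    return src_end - longest, tar_end - longest, longest


def matched_chars(src, tar):
    """Sum the longest-common-substring lengths found recursively."""
    src_start, tar_start, length = lcs_substring(src, tar)
    if length == 0:
        return 0
    # Recurse on the pieces to the left and to the right of the match.
    return (matched_chars(src[:src_start], tar[:tar_start]) +
            length +
            matched_chars(src[src_start + length:], tar[tar_start + length:]))


def ratcliff_obershelp_sketch(src, tar):
    """Return a similarity in [0, 1] (assumed standard formula)."""
    if src == tar:
        return 1.0
    if not src or not tar:
        return 0.0
    return 2 * matched_chars(src, tar) / (len(src) + len(tar))
```

For `'ATCG'` and `'TAGC'` the recursion matches `A` and then `C`, so the result is 2 * 2 / 8 = 0.5.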
@@ -3435,11 +3435,11 @@ def _log_manhattan_keyboard_distance(char1, char2):
                    'log-manhattan': _log_manhattan_keyboard_distance}

     def substitution_cost(char1, char2):
-        local_cost = sub_cost
-        local_cost *= (metric_dict[metric](char1, char2) +
-                       shift_cost * (_kb_array_for_char(char1) !=
-                                     _kb_array_for_char(char2)))
-        return local_cost
+        cost = sub_cost
+        cost *= (metric_dict[metric](char1, char2) +
+                 shift_cost * (_kb_array_for_char(char1) !=
+                               _kb_array_for_char(char2)))
+        return cost

     d_mat = np_zeros((len(src) + 1, len(tar) + 1), dtype=np_float32)
     for i in range(len(src) + 1):
2 changes: 1 addition & 1 deletion abydos/ngram.py
@@ -53,7 +53,7 @@ class NGramCorpus(object):
def __init__(self, corpus=None):
r"""Initialize Corpus.
-:param corpus: The Corpus from which to initialize the n-gram
+:param Corpus corpus: The Corpus from which to initialize the n-gram
corpus. By default, this is None, which initializes an empty
NGramCorpus. This can then be populated using NGramCorpus methods.
9 changes: 6 additions & 3 deletions abydos/phones.py
@@ -584,7 +584,7 @@ def ipa_to_features(ipa):
:param str ipa: the IPA representation of a phone or series of phones
:returns: a representation of the features of the input string
-:rtype: list(int)
+:rtype: [int]
>>> ipa_to_features('mut')
[2709662981243185770, 1825831513894594986, 2783230754502126250]
@@ -632,7 +632,7 @@ def get_feature(vector, feature):
'continuant', 'strident', 'lateral', 'delayed_release', 'nasal'
:returns: a list indicating presence/absence/neutrality with respect to
the feature
-:rtype: list(int)
+:rtype: [int]
>>> tails = ipa_to_features('telz')
>>> get_feature(tails, 'consonantal')
@@ -697,7 +697,10 @@ def cmp_features(feat1, feat2):
Otherwise, a float representing their similarity is returned.
-:param int feat1, feat2: Two feature bundles to compare
+:param int feat1: a feature bundle
+:param int feat2: a feature bundle
+:returns: a comparison of the feature bundles
+:rtype: float
>>> cmp_features(ipa_to_features('l')[0], ipa_to_features('l')[0])
1.0