0.3.6 #149

Merged 148 commits on Nov 17, 2018
Commits
7f113a1
branched 0.3.6
chrislit Oct 28, 2018
b208b89
made Arithmetic class
chrislit Oct 28, 2018
0e69d37
Merge branch 'master' of github.com:chrislit/abydos into 0.3.6
chrislit Oct 28, 2018
11632ce
progress towards embedding in classes
chrislit Oct 28, 2018
59c8535
added dist methods back for docs/doctests
chrislit Oct 30, 2018
8aad011
added dist methods back for docs/doctests
chrislit Oct 30, 2018
a1f308d
added removed methods back, removed (Abstract)Distance class
chrislit Oct 30, 2018
f10739e
encapusalted in classes
chrislit Oct 30, 2018
447d303
encapusalted in classes
chrislit Oct 30, 2018
721cfd9
switched to damerau_levenshtein from levenshtein + 'dam' arg
chrislit Oct 30, 2018
8adffaf
encapsulated in classes
chrislit Oct 30, 2018
bf10b1c
encapsulated in classes
chrislit Oct 31, 2018
2b6b62e
added remaining classes to exports
chrislit Oct 31, 2018
60ce043
encapsulated in classes
chrislit Oct 31, 2018
b45a326
re-enabled dist members, inheretence from Distance
chrislit Nov 1, 2018
c37754d
encapsulated in classes
chrislit Nov 1, 2018
540a1ea
encapsulated in classes
chrislit Nov 1, 2018
3b8e5d0
encapsulated in classes
chrislit Nov 1, 2018
6083d06
black formetted
chrislit Nov 1, 2018
548cb26
flake8 fixes
chrislit Nov 1, 2018
eac7a72
spacing adjustment
chrislit Nov 2, 2018
fec207a
encapsulated in classes
chrislit Nov 2, 2018
dfbdc5e
encapsulated in classes
chrislit Nov 2, 2018
6c7dc0c
embedded in classes
chrislit Nov 2, 2018
e66b042
embedded in classes
chrislit Nov 3, 2018
6a1466a
added corrections
chrislit Nov 3, 2018
1c969b7
more class-ization, moved _get_qgrams into TokenDistance class
chrislit Nov 3, 2018
7e50172
version bump
chrislit Nov 4, 2018
6544a2e
fixed doctests, formatting
chrislit Nov 4, 2018
08141a0
flake8 fixes
chrislit Nov 4, 2018
81370ac
replaced codecs calls with calls to compressor modules
chrislit Nov 4, 2018
60c582f
fixed bz2 distance (was chopping off more than header); made tests mo…
chrislit Nov 4, 2018
f962d55
encapsulated stemmers in classes
chrislit Nov 5, 2018
3cfb2a2
added PyPI downloads/month badge
chrislit Nov 5, 2018
84af913
added darglint to flake8 runs
chrislit Nov 5, 2018
5db2274
reduced to 4-space indentation
chrislit Nov 5, 2018
e73015f
converted to Google doc style
chrislit Nov 5, 2018
e99e13d
converted to Google doc style
chrislit Nov 5, 2018
e27e715
coverted src & tar args to Google doc style
chrislit Nov 5, 2018
ec1a5fb
first pass converting docs to Google doc style
chrislit Nov 6, 2018
51ef98b
completed conversion to Google docs style
chrislit Nov 6, 2018
da90832
converted docstrings to Google doc style
chrislit Nov 6, 2018
03fd125
converted to Google do style
chrislit Nov 6, 2018
c60b298
converted to Google doc style
chrislit Nov 6, 2018
fd8895b
converted docstrings to Google doc style
chrislit Nov 7, 2018
ebf1632
converted docs to Google doc style
chrislit Nov 8, 2018
1dad558
markup correction
chrislit Nov 8, 2018
c284167
markup correction
chrislit Nov 8, 2018
e46602c
Merge branch 'master' into 0.3.6
chrislit Nov 8, 2018
30631d4
beginning further breakout; added suggested futures from Google style…
chrislit Nov 8, 2018
1d6e1bb
broke files out into 1 class/file
chrislit Nov 8, 2018
139c55a
broke files out into 1 class/file
chrislit Nov 9, 2018
318b7e6
broke files out into 1 class/file
chrislit Nov 9, 2018
317513f
switched from deprecated method
chrislit Nov 10, 2018
05619d4
broke files out into 1 class/file
chrislit Nov 10, 2018
1c00db3
broke files out into 1 class/file
chrislit Nov 10, 2018
b278002
black code styling
chrislit Nov 10, 2018
bfc346c
adjusted to refactor
chrislit Nov 10, 2018
a9190eb
renamed file to match class name
chrislit Nov 10, 2018
bbc8ce4
re-ordered imports to alphabetical
chrislit Nov 10, 2018
0bd706b
renamed files to snake_case
chrislit Nov 10, 2018
4099049
addressed #137
chrislit Nov 10, 2018
322deac
refactoring tests to use classes
chrislit Nov 10, 2018
181724e
broke out tests by class
chrislit Nov 10, 2018
693f406
broke out tversky
chrislit Nov 10, 2018
710fb45
added consistent __future__ imports everywhere
chrislit Nov 10, 2018
cb0144a
refactoring tests
chrislit Nov 12, 2018
4e7d848
continued test refactoring
chrislit Nov 12, 2018
f72f98b
continued test refactoring
chrislit Nov 12, 2018
e4ba759
continued test refactoring
chrislit Nov 12, 2018
c519301
continued test refactoring
chrislit Nov 12, 2018
8df0db0
reformatted to black style
chrislit Nov 12, 2018
86020bc
continued test refactoring
chrislit Nov 13, 2018
6e4e083
continued test refactoring
chrislit Nov 13, 2018
50ea2cd
fixed NCDlzma tests to account for difference in Py2.7
chrislit Nov 13, 2018
820d879
transitioned to classes
chrislit Nov 13, 2018
affc1df
reformatted to black style
chrislit Nov 13, 2018
0ea1808
corrected formatting for flake8
chrislit Nov 13, 2018
80cdcc4
removed TODO
chrislit Nov 13, 2018
58d48d3
moved TODO to issue
chrislit Nov 13, 2018
a9735d6
cleared flake8 errors
chrislit Nov 13, 2018
a5365c3
formatting updates
chrislit Nov 13, 2018
8e8633b
added badge/infrastructure for pydocstyle to check numpy style
chrislit Nov 13, 2018
7f064ac
converted to numpy doc style
chrislit Nov 13, 2018
2909dd9
converted to numpy doc style
chrislit Nov 14, 2018
b2e32ad
converted to numpy doc style, removed "Fingerprint" from class names/…
chrislit Nov 14, 2018
6b5dbcf
converted to numpy doc style
chrislit Nov 14, 2018
b2bb9f3
converted to numpy doc style
chrislit Nov 14, 2018
12f0969
converted to numpy doc style
chrislit Nov 14, 2018
5979ef2
converted to numpy doc style
chrislit Nov 14, 2018
ea94bdf
corrected remaining numpy doc styling fixes
chrislit Nov 14, 2018
9a3ec19
updated class names
chrislit Nov 14, 2018
4629d0b
fixed doctests to ignore D202 & test some extra stuff
chrislit Nov 14, 2018
5345905
added pydocstyle to the main tox chain
chrislit Nov 14, 2018
63c7781
fixed formatting to 0-out pydocstyle rating & fix PDF building
chrislit Nov 14, 2018
f022ae4
switched to vertical list for tox environments
chrislit Nov 14, 2018
e38e3ab
workaround for cyrillic non-rendering
chrislit Nov 15, 2018
6e5cabf
updated notebooks to use class objects
chrislit Nov 15, 2018
c41ee59
changed prod to a private function
chrislit Nov 15, 2018
f6ce822
wrote most of tutorial
chrislit Nov 15, 2018
03515da
fixed typo
chrislit Nov 15, 2018
8bf8dec
added links
chrislit Nov 15, 2018
e4be0e8
assorted corrections (docs mostly)
chrislit Nov 15, 2018
f54e94e
completed tutorial
chrislit Nov 15, 2018
1a838ad
renamed file
chrislit Nov 15, 2018
aa4388d
switched to NotImplementedError
chrislit Nov 15, 2018
435a653
added badge notes
chrislit Nov 15, 2018
794bf76
doc8 fixes
chrislit Nov 15, 2018
bcc0920
added links to local tools
chrislit Nov 15, 2018
e943883
docs corrections and enhancements
chrislit Nov 16, 2018
74ef7fe
doc fixes
chrislit Nov 16, 2018
be214be
updated history
chrislit Nov 16, 2018
78950a2
fixed quote rendering
chrislit Nov 16, 2018
46ae849
added links to project page
chrislit Nov 16, 2018
be004de
disabled SF01 errors
chrislit Nov 16, 2018
30902f1
moved links
chrislit Nov 16, 2018
abd6b3e
added apostrophe
chrislit Nov 16, 2018
c9783c9
adjusted formatting
chrislit Nov 16, 2018
dd6d847
removed unused sections
chrislit Nov 16, 2018
cc9ef54
added refs for CT/stats
chrislit Nov 17, 2018
21b017a
moved tutorial into package docs
chrislit Nov 17, 2018
4515d83
removed big list from docs, may add shorter list at some point
chrislit Nov 17, 2018
1d5adff
fixed doctests for new tutorial stuff
chrislit Nov 17, 2018
71f121c
tightened lists
chrislit Nov 17, 2018
aefab00
removed trailing space
chrislit Nov 17, 2018
1fe0d54
fixed encoding
chrislit Nov 17, 2018
5e54892
removed class vars to objects
chrislit Nov 17, 2018
6374763
encapsulated in function to avoid being caught by pytest doctests
chrislit Nov 17, 2018
a3718e4
updated subproject
chrislit Nov 17, 2018
8f4860f
added config for pytest
chrislit Nov 17, 2018
9f97b50
added newline
chrislit Nov 17, 2018
ac2ff2d
changed from AppVeyor to Azure DevOps
chrislit Nov 17, 2018
a303131
flake8 fixes
chrislit Nov 17, 2018
dd6bdc8
exposed indel function
chrislit Nov 17, 2018
8b40c01
increased test coverage
chrislit Nov 17, 2018
1d6bf63
swapped order of pythons back so that coverage reports 3.6 results
chrislit Nov 17, 2018
400061e
removed branch (this wouldn't work if false)
chrislit Nov 17, 2018
c4fd191
removed limitation on length
chrislit Nov 17, 2018
11d7309
refactored base class names
chrislit Nov 17, 2018
8960c34
added tests to increase test coverage
chrislit Nov 17, 2018
15c5c55
added to coding standards
chrislit Nov 17, 2018
5e1a978
updated DOI & set release date
chrislit Nov 17, 2018
cfb8754
fixed converter (now generates data file w/o changes)
chrislit Nov 17, 2018
e9a90c2
updated with pydocstyle, indented
chrislit Nov 17, 2018
20c00b5
enabled doc8 checking of docstrings
chrislit Nov 17, 2018
5bc7440
doc8 fixes
chrislit Nov 17, 2018
f1cc38f
corrected test names
chrislit Nov 17, 2018
c0060e6
removed appveyor config
chrislit Nov 17, 2018
2 changes: 1 addition & 1 deletion .codeclimate.yml
@@ -22,4 +22,4 @@ exclude_patterns:
 - "docs/"
 - "setup.py"
 - "badge_update.py"
-- "_bmdata.py"
+- "_beider_morse_data.py"
6 changes: 3 additions & 3 deletions .travis.yml
@@ -18,17 +18,17 @@ matrix:

notifications:
email: false

# Install packages
install:
- if [[ $TRAVIS_PYTHON_VERSION == 2* ]]; then travis_retry pip install pyliblzma; fi
- travis_retry pip install coveralls
- travis_retry python setup.py install

# Run test
script:
- nosetests --verbose --with-coverage --cover-erase --cover-branches --cover-package=abydos --logging-level=INFO --process-timeout=60 --process-restartworker

# Calculate coverage
after_success:
- coveralls --verbose --rcfile=.coveragerc
14 changes: 12 additions & 2 deletions CODING_STANDARDS.rst
@@ -1,6 +1,14 @@
 CODING STANDARDS
 ----------------

+- nosetest will be used for testing
+- flake8 will be used for best practice conformance
+- pydocstyle will be used to ensure documentation style conformance to PEP257
+  (for the most part) and NumPy documentation style
+- black will be used to keep code style consistent
+
+----
+
 git commits
 ~~~~~~~~~~~

@@ -15,6 +23,8 @@ git pushes
 A git push should be performed only under the following conditions:

 - library is syntactically correct (compiling correctly) in both Python 2 & 3
-- library passes all tests according to nosetests in both Python 2 & 3
+- library passes all tests and doctests according to nosetests in both Python 2
+  & 3
 - test coverage is 100% according to nosetests
-- using the included pylint.rc, pylint reports a 10/10 rating
+- flake8 and pydocstyle should report 0 issues
+- black code styling has been applied
16 changes: 16 additions & 0 deletions HISTORY.rst
@@ -2,6 +2,22 @@ Release History
 ---------------


+0.3.6 (2018-11-17) *classy carl*
+++++++++++++++++++++++++++++++++
+
+doi:10.5281/zenodo.1490288
+
+Changes:
+
+- Most functions were encapsulated into classes.
+- Each class is broken out into its own file, with test files paralleling
+  library files.
+- Documentation was converted from Sphinx markup to Numpy style.
+- A tutorial was written for each subpackage.
+- Documentation was cleaned up, with math markup corrections and many
+  additional links.
+
+
 0.3.5 (2018-10-31) *cantankerous carl*
 ++++++++++++++++++++++++++++++++++++++

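The first change in the 0.3.6 notes (functions encapsulated into classes, with the older function names still exported, as the `abydos/compression/__init__.py` diff in this PR shows) might look roughly like the sketch below. This is not Abydos code: `RunLengthSketch` and `rle_encode_sketch` are hypothetical names, and the run-length scheme is a simplified one that assumes the input contains no digits.

```python
import re
from itertools import groupby


class RunLengthSketch(object):
    """Hypothetical class wrapping a toy run-length encoder/decoder."""

    def encode(self, text):
        """Collapse each run of a repeated character into count + character."""
        pieces = []
        for char, run in groupby(text):
            length = len(list(run))
            # Runs of length 1 are written without a count.
            pieces.append((str(length) if length > 1 else '') + char)
        return ''.join(pieces)

    def decode(self, code):
        """Expand count + character pairs back into runs."""
        return ''.join(
            char * int(count or 1)
            for count, char in re.findall(r'(\d*)(\D)', code)
        )


def rle_encode_sketch(text):
    """Function-style wrapper kept so older call sites keep working."""
    return RunLengthSketch().encode(text)


print(rle_encode_sketch('aaabccdddd'))  # 3ab2c4d
```

Keeping a thin function wrapper next to the class lets existing callers and doctests continue to run while new code moves to the object interface.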
49 changes: 32 additions & 17 deletions README.rst
@@ -8,13 +8,13 @@ Abydos
 +------------------+------------------------------------------------------+
 | Dependencies | |requires| |snyk| |pyup| |fossa| |
 +------------------+------------------------------------------------------+
-| Local Analysis | |pylint| |flake8| |black| |
+| Local Analysis | |pylint| |flake8| |pydocstyle| |black| |
 +------------------+------------------------------------------------------+
 | Usage | |docs| |mybinder| |license| |sourcerank| |zenodo| |
 +------------------+------------------------------------------------------+
 | Contribution | |cii| |waffle| |openhub| |
 +------------------+------------------------------------------------------+
-| PyPI | |pypi| |pypi-ver| |
+| PyPI | |pypi| |pypi-dl| |pypi-ver| |
 +------------------+------------------------------------------------------+
 | conda-forge | |conda| |conda-dl| |conda-platforms| |
 +------------------+------------------------------------------------------+
@@ -71,14 +71,18 @@ Abydos
     :target: https://app.fossa.io/projects/git%2Bgithub.com%2Fchrislit%2Fabydos?ref=badge_shield
     :alt: FOSSA Status

-.. |pylint| image:: https://img.shields.io/badge/Pylint-9.5/10-green.svg
+.. |pylint| image:: https://img.shields.io/badge/Pylint-9.16/10-yellowgreen.svg
     :target: #
     :alt: Pylint Score

-.. |flake8| image:: https://img.shields.io/badge/flake8-2-green.svg
+.. |flake8| image:: https://img.shields.io/badge/flake8-0-brightgreen.svg
     :target: #
     :alt: flake8 Errors

+.. |pydocstyle| image:: https://img.shields.io/badge/pydocstyle-0-brightgreen.svg
+    :target: #
+    :alt: pydocstyle Errors
+
 .. |black| image:: https://img.shields.io/badge/code%20style-black-000000.svg
     :target: https://github.com/ambv/black
     :alt: black
@@ -91,16 +95,16 @@ Abydos
     :target: https://mybinder.org/v2/gh/chrislit/abydos/master?filepath=binder
     :alt: Binder

-.. |license| image:: https://img.shields.io/badge/License-GPL%20v3-blue.svg
+.. |license| image:: https://img.shields.io/badge/License-GPL%20v3+-blue.svg
     :target: https://www.gnu.org/licenses/gpl-3.0
-    :alt: License: GPL v3
+    :alt: License: GPL v3.0+

 .. |sourcerank| image:: https://img.shields.io/librariesio/sourcerank/pypi/abydos.svg
     :target: https://libraries.io/pypi/abydos
     :alt: Libraries.io SourceRank

-.. |zenodo| image:: https://zenodo.org/badge/DOI/10.5281/zenodo.1462443.svg
-    :target: https://doi.org/10.5281/zenodo.1463204
+.. |zenodo| image:: https://zenodo.org/badge/DOI/10.5281/zenodo.1490288.svg
+    :target: https://doi.org/10.5281/zenodo.1490288
     :alt: Zenodo

 .. |cii| image:: https://bestpractices.coreinfrastructure.org/projects/1598/badge
@@ -119,7 +123,11 @@ Abydos
     :target: https://pypi.python.org/pypi/abydos
     :alt: PyPI

-.. |pypi-ver| image:: https://img.shields.io/pypi/pyversions/abydos.svg
+.. |pypi-dl| image:: https://img.shields.io/pypi/dm/abydos.svg
+    :target: https://pypi.python.org/pypi/abydos
+    :alt: PyPI downloads/month
+
+.. |pypi-ver| image:: https://img.shields.io/pypi/pyversions/abydos.svg
     :target: https://pypi.python.org/pypi/abydos
     :alt: PyPI versions

@@ -138,11 +146,12 @@ Abydos
 |

 .. image:: https://raw.githubusercontent.com/chrislit/abydos/master/abydos-small.png
+    :target: https://github.com/chrislit/abydos
     :alt: abydos
     :align: right

 |
-| Abydos NLP/IR library
+| `Abydos NLP/IR library <https://github.com/chrislit/abydos>`_
 | Copyright 2014-2018 by Christopher C. Little

 Abydos is a library of phonetic algorithms, string distance measures & metrics,
@@ -188,6 +197,7 @@ stemmers, and string fingerprinters including:
     - SoundexBR
     - NRL English-to-phoneme
     - Beider-Morse Phonetic Matching
+
 - String distance metrics
     - Levenshtein distance
     - Optimal String Alignment distance
@@ -212,7 +222,7 @@ stemmers, and string fingerprinters including:
     - Monge-Elkan similarity & distance
     - Matrix similarity
     - Needleman-Wunsch score
-    - Smither-Waterman score
+    - Smith-Waterman score
     - Gotoh score
     - Length similarity
     - Prefix, Suffix, and Identity similarity & distance
@@ -226,6 +236,7 @@ stemmers, and string fingerprinters including:
     - Typo distance
     - Indel distance
     - Synoname
+
 - Stemmers
     - the Lovins stemmer
     - the Porter and Porter2 (Snowball English) stemmers
@@ -236,6 +247,7 @@ stemmers, and string fingerprinters including:
     - Paice-Husk Stemmer
     - Schinke Latin stemmer
     - S stemmer
+
 - String Fingerprints
     - string fingerprint
     - q-gram fingerprint
@@ -248,6 +260,7 @@ stemmers, and string fingerprinters including:
     - Cisłak & Grabowski's position fingerprint
     - Synoname Toolcode

+
 -----

 Installation
@@ -279,7 +292,7 @@ To install Abydos (latest release) from PyPI using pip::

     pip install abydos

-To install from `conda-forge <https://conda-forge.org/>`_::
+To install from `conda-forge <https://anaconda.org/conda-forge/abydos>`_::

     conda install abydos

@@ -292,10 +305,10 @@ To run the whole test-suite just call tox::

     tox

-The tox setup has the following environments: py27, py36, doctest,
-py27-regression, py36-regression, pylint, pycodestyle, flake8, doc8,
-badges, docs, py27-fuzz, & py36-fuzz. So if only want to generate documentation
-(in HTML, EPUB, & PDF formats), just call::
+The tox setup has the following environments: black, py36, py27, doctest,
+py36-regression, py27-regression, py36-fuzz, py27-fuzz, pylint, pycodestyle,
+pydocstyle, flake8, doc8, badges, docs, & dist. So if you only want to generate
+documentation (in HTML, EPUB, & PDF formats), just call::

     tox -e docs

@@ -304,4 +317,6 @@ In order to only run & generate Flake8 reports, call::
     tox -e flake8

 Contributions such as bug reports, PRs, suggestions, desired new features, etc.
-are welcome through the Github Issues & Pull requests.
+are welcome through Github
+`Issues <https://github.com/chrislit/abydos/issues>`_ &
+`Pull requests <https://github.com/chrislit/abydos/pulls>`_.
30 changes: 29 additions & 1 deletion abydos/__init__.py
@@ -19,8 +19,36 @@
 """abydos.

 Abydos NLP/IR library by Christopher C. Little
+
+
+There are nine major packages that make up Abydos:
+
+- :py:mod:`.compression` for string compression classes
+- :py:mod:`.corpus` for document corpus classes
+- :py:mod:`.distance` for string distance measure & metric classes
+- :py:mod:`.fingerprint` for string fingerprint classes
+- :py:mod:`.phones` for functions relating to phones and phonemes
+- :py:mod:`.phonetic` for phonetic algorithm classes
+- :py:mod:`.stats` for statistical functions and a confusion table class
+- :py:mod:`.stemmer` for stemming classes
+- :py:mod:`.tokenizer` for tokenizer classes
+
+Classes with each package have consistent method names, as discussed below.
+A tenth package, :py:mod:`.util`, contains functions not intended for end-user
+use.
+
+----
+
 """
-__version__ = '0.3.5'
+
+from __future__ import (
+    absolute_import,
+    division,
+    print_function,
+    unicode_literals,
+)
+
+__version__ = '0.3.6'

 __all__ = [
     'compression',
43 changes: 32 additions & 11 deletions abydos/compression/__init__.py
@@ -16,30 +16,51 @@
 # You should have received a copy of the GNU General Public License
 # along with Abydos. If not, see <http://www.gnu.org/licenses/>.

-"""abydos.compression.
+r"""abydos.compression.

 The compression package defines compression and compression-related functions
 for use within Abydos, including implementations of the following:

-- arithmetic coding functions (ac_train, ac_encode, & ac_decode)
-- Burrows-Wheeler transform encoder/decoder (bwt_encode & bwt_decode)
-- Run-Length Encoding encoder/decoder (rle_encode & rle_decode)
+- :py:class:`.Arithmetic` for arithmetic coding
+- :py:class:`.BWT` for Burrows-Wheeler Transform
+- :py:class:`.RLE` for Run-Length Encoding
+
+
+Each class exposes ``encode`` and ``decode`` methods for performing and
+reversing its encoding. For example, the Burrows-Wheeler Transform can be
+performed by creating a :py:class:`.BWT` object and then calling
+:py:meth:`.BWT.encode` on a string:
+
+>>> bwt = BWT()
+>>> bwt.encode('^BANANA')
+'ANNB^AA\x00'
+
+----
+
 """

-from __future__ import unicode_literals
+from __future__ import (
+    absolute_import,
+    division,
+    print_function,
+    unicode_literals,
+)

-from ._arithmetic import ac_decode, ac_encode, ac_train
-from ._bwt import bwt_decode, bwt_encode
-from ._rle import rle_decode, rle_encode
+from ._arithmetic import Arithmetic, ac_decode, ac_encode, ac_train
+from ._bwt import BWT, bwt_decode, bwt_encode
+from ._rle import RLE, rle_decode, rle_encode

 __all__ = [
+    'Arithmetic',
+    'ac_decode',
+    'ac_encode',
+    'ac_train',
+    'BWT',
     'bwt_decode',
     'bwt_encode',
+    'RLE',
     'rle_decode',
     'rle_encode',
-    'ac_decode',
-    'ac_encode',
-    'ac_train',
 ]
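
The `encode`/`decode` pairing described in the new docstring can be illustrated with a naive Burrows-Wheeler Transform. The sketch below is written for illustration only and is not the code in `_bwt.py`: the class name `NaiveBWT` and the `terminator` parameter are assumptions, and the rotation-sorting approach is quadratic in memory, so it is only suitable for short strings.

```python
class NaiveBWT(object):
    """Hypothetical stand-in showing an encode/decode pair for the BWT."""

    def encode(self, word, terminator='\0'):
        """Return the last column of the sorted rotations of word + terminator."""
        if terminator in word:
            raise ValueError('terminator must not occur in the input')
        word += terminator
        rotations = sorted(word[i:] + word[:i] for i in range(len(word)))
        return ''.join(rotation[-1] for rotation in rotations)

    def decode(self, code, terminator='\0'):
        """Rebuild the sorted rotation table and return the original word."""
        if terminator not in code:
            raise ValueError('terminator not found in the encoded string')
        table = [''] * len(code)
        for _ in range(len(code)):
            # Prepending the encoded column and re-sorting restores one more
            # column of the rotation table on every pass.
            table = sorted(code[i] + table[i] for i in range(len(code)))
        original = next(row for row in table if row.endswith(terminator))
        return original[:-1]


bwt = NaiveBWT()
encoded = bwt.encode('^BANANA')
print(repr(encoded))        # 'ANNB^AA\x00'
print(bwt.decode(encoded))  # ^BANANA
```

With the null byte as terminator, this sketch reproduces the value shown in the doctest above, since `'\x00'` sorts before every other character in the input.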

