Initial commit

thombashi committed Oct 23, 2016
1 parent 99db565 commit c802c1c78503ad5af627417786075384be6630b6
Showing with 5,601 additions and 0 deletions.
  1. +66 −0 .gitignore
  2. +26 −0 .travis.yml
  3. +11 −0 MANIFEST.in
  4. +121 −0 README.rst
  5. +24 −0 appveyor.yml
  6. +225 −0 docs/Makefile
  7. +404 −0 docs/conf.py
  8. +21 −0 docs/index.rst
  9. +281 −0 docs/make.bat
  10. +64 −0 docs/make_readme.py
  11. +5 −0 docs/pages/examples/csv.rst
  12. +7 −0 docs/pages/examples/index.rst
  13. +34 −0 docs/pages/examples/load_csv.txt
  14. +4 −0 docs/pages/genindex.rst
  15. +39 −0 docs/pages/installation.rst
  16. +6 −0 docs/pages/introduction/badges.txt
  17. +9 −0 docs/pages/introduction/feature.txt
  18. +12 −0 docs/pages/introduction/index.rst
  19. +1 −0 docs/pages/introduction/summary.txt
  20. +12 −0 docs/pages/links.rst
  21. +9 −0 docs/pages/reference/data.rst
  22. +15 −0 docs/pages/reference/error.rst
  23. +10 −0 docs/pages/reference/index.rst
  24. +91 −0 docs/pages/reference/loader.rst
  25. +26 −0 examples/load_table_from_csv.py
  26. +27 −0 pytablereader/__init__.py
  27. +33 −0 pytablereader/_acceptor.py
  28. +24 −0 pytablereader/_constant.py
  29. 0 pytablereader/csv/__init__.py
  30. +179 −0 pytablereader/csv/core.py
  31. +42 −0 pytablereader/csv/formatter.py
  32. +182 −0 pytablereader/data.py
  33. +43 −0 pytablereader/error.py
  34. +39 −0 pytablereader/formatter.py
  35. 0 pytablereader/html/__init__.py
  36. +135 −0 pytablereader/html/core.py
  37. +90 −0 pytablereader/html/formatter.py
  38. +168 −0 pytablereader/interface.py
  39. 0 pytablereader/json/__init__.py
  40. +181 −0 pytablereader/json/core.py
  41. +167 −0 pytablereader/json/formatter.py
  42. 0 pytablereader/mediawiki/__init__.py
  43. +135 −0 pytablereader/mediawiki/core.py
  44. +20 −0 pytablereader/mediawiki/formatter.py
  45. 0 pytablereader/spreadsheet/__init__.py
  46. +71 −0 pytablereader/spreadsheet/core.py
  47. +144 −0 pytablereader/spreadsheet/excelloader.py
  48. +3 −0 requirements/docs_requirements.txt
  49. +8 −0 requirements/requirements.txt
  50. +4 −0 requirements/test_requirements.txt
  51. +5 −0 setup.cfg
  52. +59 −0 setup.py
  53. +361 −0 test/test_csvl_reader.py
  54. +256 −0 test/test_data.py
  55. +258 −0 test/test_excel_reader.py
  56. +556 −0 test/test_html_reader.py
  57. +381 −0 test/test_json_reader.py
  58. +497 −0 test/test_mediawiki_reader.py
  59. +10 −0 tox.ini
@@ -0,0 +1,66 @@
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]

# C extensions
*.so

# Distribution / packaging
.Python
env/
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
*.egg-info/
.installed.cfg
*.egg

# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*,cover

# Translations
*.mo
*.pot

# Django stuff:
*.log

# Sphinx documentation
docs/_build/

# PyBuilder
target/
desktop.ini
README.md
*.ipynb
misc/README_HEADER.rst
misc/readme_converter.py
upgrade.sh
sandbox/
uml/
_bkp/
@@ -0,0 +1,26 @@
language: python

matrix:
  include:
    - python: 2.7
      env: TOXENV=python2.7
    - python: 3.3
      env: TOXENV=python3.3
    - python: 3.4
      env: TOXENV=python3.4
    - python: 3.5
      env: TOXENV=python3.5

os:
- linux

install:
- pip install tox
- pip install coveralls

script:
- tox
- python setup.py test --addopts "-v --cov pytablereader --cov-report term-missing"

after_success:
- coveralls
@@ -0,0 +1,11 @@
include docs/pages/introduction/summary.txt
include LICENSE
include README.rst
include setup.cfg
include tox.ini

recursive-include test *
recursive-include requirements *

global-exclude __pycache__/*
global-exclude *.pyc
@@ -0,0 +1,121 @@
pytablereader
=============

.. image:: https://img.shields.io/pypi/pyversions/pytablereader.svg
    :target: https://pypi.python.org/pypi/pytablereader
.. image:: https://travis-ci.org/thombashi/pytablereader.svg?branch=master
    :target: https://travis-ci.org/thombashi/pytablereader
.. image:: https://coveralls.io/repos/github/thombashi/pytablereader/badge.svg?branch=master
    :target: https://coveralls.io/github/thombashi/pytablereader?branch=master

Summary
-------

pytablereader is a Python library to load structured table data from various data formats: CSV/HTML/JSON/MediaWiki/Excel.

Feature
-------

- Extract structured table data from various data formats:

  - CSV file/text
  - HTML file/text
  - JSON file/text
  - MediaWiki file/text
  - Microsoft Excel :superscript:`TM` file

Examples
========

Load a CSV table
----------------


.. code:: python

    from __future__ import print_function

    import pytablereader

    file_path = "sample_data.csv"
    data = "\n".join([
        '"attr_a","attr_b","attr_c"',
        '1,4,"a"',
        '2,2.1,"bb"',
        '3,120.9,"ccc"',
    ])

    # write the sample CSV text to a file
    with open(file_path, "w") as f:
        f.write(data)

    # load from a csv file ---
    loader = pytablereader.CsvTableFileLoader(file_path)
    for table_data in loader.load():
        print("load from file: {:s}".format(table_data))

    # load from a csv text ---
    loader = pytablereader.CsvTableTextLoader(data)
    for table_data in loader.load():
        print("load from text: {:s}".format(table_data))

.. code::

    load from file: table_name=sample_data, header_list=[u'attr_a', u'attr_b', u'attr_c'] record_list=[['1', '4', u'a'], ['2', '2.1', u'bb'], ['3', '120.9', u'ccc']]
    load from text: table_name=csv2, header_list=[u'attr_a', u'attr_b', u'attr_c'] record_list=[['1', '4', u'a'], ['2', '2.1', u'bb'], ['3', '120.9', u'ccc']]
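
To make the ``header_list`` / ``record_list`` split in the output above concrete, here is a minimal sketch that uses only the Python 3 standard library. It is an illustration under stated assumptions, not how pytablereader itself is implemented: ``split_csv_text`` is a hypothetical helper name, and the sketch simply assumes that the first non-empty row holds the headers.

```python
import csv
import io


def split_csv_text(csv_text):
    """Split CSV text into (header_list, record_list).

    Hypothetical helper for illustration only; it is not part of
    pytablereader. Assumes the first non-empty row is the header row.
    """
    reader = csv.reader(io.StringIO(csv_text))
    rows = [row for row in reader if row]  # skip blank lines
    if not rows:
        raise ValueError("no rows found in the CSV text")
    return rows[0], rows[1:]


# Same sample text as the README example.
data = "\n".join([
    '"attr_a","attr_b","attr_c"',
    '1,4,"a"',
    '2,2.1,"bb"',
    '3,120.9,"ccc"',
])

header_list, record_list = split_csv_text(data)
print("header_list={}".format(header_list))
print("record_list={}".format(record_list))
```

As in the output shown above, every cell comes back as a string; the standard ``csv`` module does no type conversion.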

For more information
--------------------

More examples are available at
http://pytablereader.readthedocs.org/en/latest/pages/examples/index.html

Installation
============

::

    pip install pytablereader


Dependencies
============

Python 2.7+ or 3.3+

Mandatory Python packages
----------------------------------

- `beautifulsoup4 <https://www.crummy.com/software/BeautifulSoup/>`__
- `DataProperty <https://github.com/thombashi/DataProperty>`__ (Used to extract data types)
- `jsonschema <https://github.com/Julian/jsonschema>`__
- `pathvalidate <https://github.com/thombashi/pathvalidate>`__
- `path.py <https://github.com/jaraco/path.py>`__
- `pypandoc <https://github.com/bebraw/pypandoc>`__
- `six <https://pypi.python.org/pypi/six/>`__
- `xlrd <https://github.com/python-excel/xlrd>`__

Optional packages
----------------------------------

- `lxml <http://lxml.de/installation.html>`__ (faster HTML conversion if installed)
- `pandoc <http://pandoc.org/>`__ (not a Python package; required when loading MediaWiki)


Test dependencies
-----------------

- `pytest <http://pytest.org/latest/>`__
- `pytest-runner <https://pypi.python.org/pypi/pytest-runner>`__
- `tox <https://testrun.org/tox/latest/>`__
- `XlsxWriter <http://xlsxwriter.readthedocs.io/>`__

Documentation
=============

http://pytablereader.readthedocs.org/en/latest/

Related Project
===============

- `pytablewriter <https://github.com/thombashi/pytablewriter>`__

  - Data loaded by ``pytablereader`` can be written to another table format with ``pytablewriter``.

@@ -0,0 +1,24 @@
build: false
environment:
  matrix:
    - PYTHON: "C:/Python27-x64"
    - PYTHON: "C:/Python35-x64"

init:
- "ECHO %PYTHON%"
- ps: "ls C:/Python*"

install:
- ps: (new-object net.webclient).DownloadFile('https://bootstrap.pypa.io/get-pip.py', 'C:/get-pip.py')
- "%PYTHON%/python.exe C:/get-pip.py"
- "%PYTHON%/Scripts/pip.exe --version"
- "%PYTHON%/Scripts/pip.exe install pytest"

test_script:
- "%PYTHON%/python.exe setup.py test"

notifications:
  - provider: Slack
    auth_token:
      secure: JyTQAtBzpPYiWK3eRTz/U+rvmAKopqIWE19ti4vSL/IRygV3jUVUkwET1VyTlrqOeYfNx3Kfcp7eUmHCHxFCgw==
    channel: notifications
