Single API for reading, manipulating and writing data in csv, ods, xls, xlsx and xlsm files
Python Other
Permalink
Failed to load latest commit information.
.moban.d πŸ“š update documentation. related to #141, #139 Jun 29, 2018
docs πŸ“š relocate get an array from an excel sheet. fix #137 Apr 30, 2018
examples update documentation on examples Jan 23, 2017
pyexcel πŸ“š document this flag. #138 May 2, 2018
tests πŸ‘• fix coding style #138 May 1, 2018
.gitignore πŸ“š update documentation. related to #141, #139 Jun 29, 2018
.moban.yml πŸ”¨ code refactor changelog, use changelog.yml for future change logs Jan 11, 2018
.travis.yml πŸ“š update with pyexcel-commons May 13, 2018
CHANGELOG.rst πŸ₯š 🎑 release 0.5.8, maintenance release Mar 26, 2018
LICENSE πŸ“š update documentation. related to #141, #139 Jun 29, 2018
MANIFEST.in 🀝 take the setup.py upgrade from setupmobans Oct 20, 2017
Makefile document save_as functions Jun 14, 2017
README.rst πŸ“š update documentation. related to #141, #139 Jun 29, 2018
changelog.yml πŸ₯š 🎑 release 0.5.8, maintenance release Mar 26, 2018
document.bat fix #62: should not print the optional import as an error but info. d… Oct 30, 2016
document.sh fix #62: should not print the optional import as an error but info. d… Oct 30, 2016
min_requirements.txt πŸ₯š 🎑 release 0.5.7 Jan 11, 2018
pyexcel.yml πŸ₯š 🎑 release 0.5.8, maintenance release Mar 26, 2018
requirements.txt πŸ₯š 🎑 release 0.5.7 Jan 11, 2018
rnd_requirements.txt πŸ’š update rnd packages Jan 11, 2018
setup.cfg use bdist_wheel Jun 14, 2017
setup.py πŸ“š update with pyexcel-commons May 13, 2018
test.bat πŸ“š pump up version number and sync with pyexcel-mobans Jan 11, 2018
test.sh πŸ“š pump up version number and sync with pyexcel-mobans Jan 11, 2018

README.rst

pyexcel - Let you focus on data, instead of file formats

https://raw.githubusercontent.com/pyexcel/pyexcel.github.io/master/images/patreon.png https://api.bountysource.com/badge/team?team_id=288537 https://travis-ci.org/pyexcel/pyexcel.svg?branch=master https://readthedocs.org/projects/pyexcel/badge/?version=latest

Support the project

If your company has embedded pyexcel and its components into a revenue generating product, please support me on patreon or bounty source to maintain the project and develop it further.

If you are an individual, you are welcome to support me too and for however long you feel like. As my backer, you will receive early access to pyexcel related contents.

And your issues will get prioritized if you would like to become my patreon as pyexcel pro user.

With your financial support, I will be able to invest a little bit more time in coding, documentation and writing interesting posts.

Known constraints

Fonts, colors and charts are not supported.

Introduction

Feature Highlights

  1. One application programming interface(API) to handle multiple data sources:
    • physical file
    • memory file
    • SQLAlchemy table
    • Django Model
    • Python data structures: dictionary, records and array
  2. One API to read and write data in various excel file formats.
  3. For large data sets, data streaming are supported. A genenerator can be returned to you. Checkout iget_records, iget_array, isave_as and isave_book_as.

Installation

You can install pyexcel via pip:

$ pip install pyexcel

or clone it and install it:

$ git clone https://github.com/pyexcel/pyexcel.git
$ cd pyexcel
$ python setup.py install

Usage

Please note that you will have to use '.sortable.html' in order to replicate the example.

https://github.com/pyexcel/pyexcel-sortable/raw/master/sortable.gif

>>> # pip install pyexcel-text==0.2.7.1
>>> import pyexcel as p
>>> ccs_insight2 = p.Sheet()
>>> ccs_insight2.name = "Worldwide Mobile Phone Shipments (Billions), 2017-2021"
>>> ccs_insight2.ndjson = """
... {"year": ["2017", "2018", "2019", "2020", "2021"]}
... {"smart phones": [1.53, 1.64, 1.74, 1.82, 1.90]}
... {"feature phones": [0.46, 0.38, 0.30, 0.23, 0.17]}
... """.strip()
>>> ccs_insight2
pyexcel sheet:
+----------------+------+------+------+------+------+
| year           | 2017 | 2018 | 2019 | 2020 | 2021 |
+----------------+------+------+------+------+------+
| smart phones   | 1.53 | 1.64 | 1.74 | 1.82 | 1.9  |
+----------------+------+------+------+------+------+
| feature phones | 0.46 | 0.38 | 0.3  | 0.23 | 0.17 |
+----------------+------+------+------+------+------+

Suppose you have the following data in a dictionary:

Name Age
Adam 28
Beatrice 29
Ceri 30
Dean 26

you can easily save it into an excel file using the following code:

>>> import pyexcel
>>> # make sure you had pyexcel-xls installed
>>> a_list_of_dictionaries = [
...     {
...         "Name": 'Adam',
...         "Age": 28
...     },
...     {
...         "Name": 'Beatrice',
...         "Age": 29
...     },
...     {
...         "Name": 'Ceri',
...         "Age": 30
...     },
...     {
...         "Name": 'Dean',
...         "Age": 26
...     }
... ]
>>> pyexcel.save_as(records=a_list_of_dictionaries, dest_file_name="your_file.xls")

And here's how to obtain the records:

>>> import pyexcel as p
>>> records = p.iget_records(file_name="your_file.xls")
>>> for record in records:
...     print("%s is aged at %d" % (record['Name'], record['Age']))
Adam is aged at 28
Beatrice is aged at 29
Ceri is aged at 30
Dean is aged at 26
>>> p.free_resources()

Advanced usage πŸ”₯

If you are dealing with big data, please consider these usages:

>>> def increase_everyones_age(generator):
...     for row in generator:
...         row['Age'] += 1
...         yield row
>>> def duplicate_each_record(generator):
...     for row in generator:
...         yield row
...         yield row
>>> records = p.iget_records(file_name="your_file.xls")
>>> io=p.isave_as(records=duplicate_each_record(increase_everyones_age(records)),
...     dest_file_type='csv', dest_lineterminator='\n')
>>> print(io.getvalue())
Age,Name
29,Adam
29,Adam
30,Beatrice
30,Beatrice
31,Ceri
31,Ceri
27,Dean
27,Dean
<BLANKLINE>

Two advantages of above method:

  1. Add as many wrapping functions as you want.
  2. Constant memory consumption

Available Plugins

A list of file formats supported by external plugins
Package name Supported file formats Dependencies Python versions
pyexcel-io csv, csvz [1], tsv, tsvz [2] Β  2.6, 2.7, 3.3, 3.4, 3.5, 3.6 pypy
pyexcel-xls xls, xlsx(read only), xlsm(read only) xlrd, xlwt same as above
pyexcel-xlsx xlsx openpyxl same as above
pyexcel-ods3 ods pyexcel-ezodf, lxml 2.6, 2.7, 3.3, 3.4 3.5, 3.6
pyexcel-ods ods odfpy same as above
Dedicated file reader and writers
Package name Supported file formats Dependencies Python versions
pyexcel-xlsxw xlsx(write only) XlsxWriter Python 2 and 3
pyexcel-xlsxr xlsx(read only) lxml same as above
pyexcel-odsr read only for ods, fods lxml same as above
pyexcel-htmlr html(read only) lxml,html5lib same as above
Other data renderers
Package name Supported file formats Dependencies Python versions
pyexcel-text write only:rst, mediawiki, html, latex, grid, pipe, orgtbl, plain simple read only: ndjson r/w: json tabulate 2.6, 2.7, 3.3, 3.4 3.5, 3.6, pypy
pyexcel-handsontable handsontable in html handsontable same as above
pyexcel-pygal svg chart pygal 2.7, 3.3, 3.4, 3.5 3.6, pypy
pyexcel-sortable sortable table in html csvtotable same as above
pyexcel-gantt gantt chart in html frappe-gantt except pypy, same as above

In order to manage the list of plugins installed, you need to use pip to add or remove a plugin. When you use virtualenv, you can have different plugins per virtual environment. In the situation where you have multiple plugins that does the same thing in your environment, you need to tell pyexcel which plugin to use per function call. For example, pyexcel-ods and pyexcel-odsr, and you want to get_array to use pyexcel-odsr. You need to append get_array(..., library='pyexcel-odsr').

Footnotes

[1]zipped csv file
[2]zipped tsv file

Acknowledgement

All great work have been done by odf, ezodf, xlrd, xlwt, tabulate and other individual developers. This library unites only the data access code.

.. testcode::
   :hide:

   >>> import os
   >>> os.unlink("your_file.xls")



License

New BSD License