GitHub - PandaPost/panda_post: Postgres/Python ndarray data type

PandaPost

These extensions allow you to represent Python NumPy/Pandas objects in Postgres.

Warning

This is very much a work in progress. Expect the APIs and what is in each extension to change!

Currently you can think of PandaPost more as an experiment than anything else. It seems like it would be useful to be able to natively store ndarrays, DataFrames, etc in a Postgres column. Then again, maybe not. :)

If you can make use of any of this, please drop a line to our Google Group. We’d love any feedback!

Installation

Install the pgxn client, and then pgxn install panda_post --unstable. After that you can CREATE EXTENSION "PandaPost"; in your database. Currently it installs everything in your default schema; I suggest creating a panda schema and installing there. Note that it also requires plpythonu.

To wrap all that together:

pgxn install panda_post --unstable
psql <database name>
CREATE EXTENSION plpythonu;
CREATE SCHEMA panda;
CREATE EXTENSION panda_post WITH SCHEMA panda;

Current Status

There are now casts between ndarray and all the plpython supported types (boolean, int*, float, real, numeric and text). This allows things like SELECT panda.str(array[1,1,2,2]::panda.ndarray);.

Table 1. Table General Functions

Function	Description
T	Apply ndarray T() function to input
eval	Returns the results of running python’s eval() on the input, as an ndarray. IE: `panda.repr(panda.eval('range(4)'))`
repr	Returns the python repr() of the input.
str	Returns the python str() of the input.

Table 2. Table Set Functions

Function	Description
ediff1d	Returns difference between elements in array
in1d	Returns boolean array of whether each element in `i` is present in `ar2`. See numpy.in1d.
intersect1d	Return unique list of values that are present in both `i` and `ar2`, similar to `INTERSECT` in SQL. See numpy.intersect1d.
ndunique	Return unique elements in `ar`, plus other potential output. See numpy.unique.
ndunique1	Same as ndunique() but only return the array of unique elements.
setdiff1d	Return the difference between `i` and `ar2`, similar to `EXCEPT` in SQL. See numpy.setdiff1d.
setxor1d	Return the values that are in only one (but not both) `i` and `ar2`. See numpy.setxor1d.
union1d	Return the unique values that are in either `i` or `ar2`. See numpy.union1d.

Table 3. Table Special Functions

Function	Description
create_cast	Creates casts to and from a Postgres data type and ndarray

Next up: porting the more common ndarray functions:

np.ndarray.byteswap      np.ndarray.cumsum        np.ndarray.flat          np.ndarray.min           np.ndarray.ravel         np.ndarray.shape         np.ndarray.tobytes
np.ndarray.all           np.ndarray.choose        np.ndarray.data          np.ndarray.flatten       np.ndarray.nbytes        np.ndarray.real          np.ndarray.size          np.ndarray.tofile
np.ndarray.any           np.ndarray.clip          np.ndarray.diagonal      np.ndarray.getfield      np.ndarray.ndim          np.ndarray.repeat        np.ndarray.sort          np.ndarray.tolist
np.ndarray.argmax        np.ndarray.compress      np.ndarray.dot           np.ndarray.imag          np.ndarray.newbyteorder  np.ndarray.reshape       np.ndarray.squeeze       np.ndarray.tostring
np.ndarray.argmin        np.ndarray.conj          np.ndarray.dtype         np.ndarray.item          np.ndarray.nonzero       np.ndarray.resize        np.ndarray.std           np.ndarray.trace
np.ndarray.argpartition  np.ndarray.conjugate     np.ndarray.dump          np.ndarray.itemset       np.ndarray.partition     np.ndarray.round         np.ndarray.strides       np.ndarray.transpose
np.ndarray.argsort       np.ndarray.copy          np.ndarray.dumps         np.ndarray.itemsize      np.ndarray.prod          np.ndarray.searchsorted  np.ndarray.sum           np.ndarray.var
np.ndarray.astype        np.ndarray.ctypes        np.ndarray.fill          np.ndarray.max           np.ndarray.ptp           np.ndarray.setfield      np.ndarray.swapaxes      np.ndarray.view
np.ndarray.base          np.ndarray.cumprod       np.ndarray.flags         np.ndarray.mean          np.ndarray.put           np.ndarray.setflags      np.ndarray.take

Examples

Basic examples

CREATE EXTENSION IF NOT EXISTS plpythonu;
CREATE EXTENSION panda_post;

CREATE TEMP TABLE s AS SELECT array['a','b','c']::ndarray AS s1, array['c','d']::ndarray AS s2;

-- python repr() of an ndarray of strings
SELECT repr(s1) FROM s;
          repr
-------------------------
 array(['a', 'b', 'c'], +
       dtype='|S1')
(1 row)

-- python str() of same array
SELECT str(s1) FROM s;
      str
---------------
 ['a' 'b' 'c']
(1 row)

-- exclusive-or of two ndarrays
SELECT str(setxor1d(s1, s2)) FROM s;
      str
---------------
 ['a' 'b' 'd']
(1 row)

-- Intersection
SELECT str(intersect1d(s1, s2)) FROM s;
  str
-------
 ['c']
(1 row)

This more complicated example uses the lambad extension to create a dataframe, which can currently be returned as an ndarray.

Note	Eventually there will be an actual DataFrame Postgres data type

Basic ndarray type storing a subclass (in this case, a Pandas DataFrame)

CREATE EXTENSION IF NOT EXISTS lambda;
\set df pd.DataFrame.from_dict([{"a":1,"b":"a"},{"a":2,"b":"b"}])
SELECT repr(
    lambda(
      $l$(
        ndarray
      ) RETURNS ndarray
      LANGUAGE plpythonu
      TRANSFORM FOR TYPE ndarray
      AS $body$
        import pandas as pd

        return $l$ || :'df' || $l$
      $body$
      $l$
      , NULL::ndarray
    )
  );
  repr
---------
    a  b+
 0  1  a+
 1  2  b
(1 row)

Copyright and License

PandaPost is released under a BSD liscense.

Name		Name	Last commit message	Last commit date
Latest commit History 76 Commits
pgxntool		pgxntool
sql		sql
src		src
test		test
.gitignore		.gitignore
.travis.yml		.travis.yml
LICENSE		LICENSE
META.in.json		META.in.json
META.json		META.json
Makefile		Makefile
README.asc		README.asc
panda_post.control		panda_post.control
pg-travis-test.sh		pg-travis-test.sh

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

PandaPost

Installation

Current Status

Examples

Copyright and License

About

Releases

Packages

Languages

License

PandaPost/panda_post

Folders and files

Latest commit

History

Repository files navigation

PandaPost

Installation

Current Status

Examples

Copyright and License

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages