devel/py-cykhash: Cython equivalent to khash-sets/maps
Cykhash is a cython equivalent to khash-sets/maps, efficient
implementation of isin and unique

Benefits:

    Brings functionality of khash to Python and Cython and can be used
    seamlessly in numpy or pandas.

    Numpy's world is lacking the concept of a (hash-)set. This
    shortcoming is fixed and efficient (memory- and speedwise compared
    to pandas') unique and isin are implemented.

    Python-set/dict have big memory-footprint. For some datatypes the
    overhead can be reduced by using khash by factor 4-8.
Jason W. Bacon authored and Jason W. Bacon committed Aug 6, 2023
1 parent 2eff095 commit 21e0240
Showing 3 changed files with 37 additions and 0 deletions.
20 changes: 20 additions & 0 deletions devel/py-cykhash/Makefile
@@ -0,0 +1,20 @@
PORTNAME= cykhash
DISTVERSION= 2.0.1
CATEGORIES= devel python
MASTER_SITES= CHEESESHOP
PKGNAMEPREFIX= ${PYTHON_PKGNAMEPREFIX}

MAINTAINER= jwb@FreeBSD.org
COMMENT= Cython equivalent to khash-sets/maps
WWW= https://pypi.python.org/project/cykhash/

LICENSE= MIT
LICENSE_FILE= ${WRKSRC}/LICENSE

USES= python
USE_PYTHON= autoplist cython distutils

post-stage:
	@${STRIP_CMD} ${STAGEDIR}${PYTHON_SITELIBDIR}/cykhash/*.so

.include <bsd.port.mk>
3 changes: 3 additions & 0 deletions devel/py-cykhash/distinfo
@@ -0,0 +1,3 @@
TIMESTAMP = 1691328170
SHA256 (cykhash-2.0.1.tar.gz) = b4794bc9f549114d8cf1d856d9f64e08ff5f246bf043cf369fdb414e9ceb97f7
SIZE (cykhash-2.0.1.tar.gz) = 44895
14 changes: 14 additions & 0 deletions devel/py-cykhash/pkg-descr
@@ -0,0 +1,14 @@
Cykhash is a cython equivalent to khash-sets/maps, efficient
implementation of isin and unique

Benefits:

Brings functionality of khash to Python and Cython and can be used
seamlessly in numpy or pandas.

Numpy's world is lacking the concept of a (hash-)set. This
shortcoming is fixed and efficient (memory- and speedwise compared
to pandas') unique and isin are implemented.

Python-set/dict have big memory-footprint. For some datatypes the
overhead can be reduced by using khash by factor 4-8.

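The memory-footprint claim in the description can be illustrated with a rough sketch that uses only the Python standard library. It does not call cykhash or khash. It compares the approximate per-element cost of a built-in set of integers with the 8 bytes a packed 64-bit value occupies. The helper names set_bytes_per_element and packed_bytes_per_element are hypothetical, and the measurement assumes CPython, where sys.getsizeof is supported.

```python
import sys
from array import array


def set_bytes_per_element(n):
    """Approximate bytes per element for a built-in set of n distinct ints."""
    values = set(range(n))
    # Size of the set's own hash table (does not include the elements).
    table_bytes = sys.getsizeof(values)
    # Each element is a separate Python int object with its own header.
    object_bytes = sum(sys.getsizeof(v) for v in values)
    return (table_bytes + object_bytes) / n


def packed_bytes_per_element(n):
    """Bytes per element when the same values are stored as raw 64-bit ints."""
    values = array("q", range(n))  # "q" is a signed 64-bit integer
    return values.itemsize


if __name__ == "__main__":
    n = 100_000
    print("set:   ", set_bytes_per_element(n), "bytes per element")
    print("packed:", packed_bytes_per_element(n), "bytes per element")
```

A khash table also stores flag bits and keeps spare buckets, so the packed figure is a lower bound on what a hash set of raw integers needs, not a measurement of cykhash itself.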