Expose max_internal_size and max_leaf_size as rw on the C classes
Just like on the Python classes. Fixes #166

This takes a metaclass at the C level, but it's a very simple one.

In addition to consistency, and letting the sizes be customized for an
entire application, this has some nice properties:

- We can optimize some zope.interface object storage
- We delete the DEFAULT_MAX_*_SIZE macros. Now there's only one source
  of truth for those, whether in C or Python: _datatypes.py

In addition, I was able to make a small optimization for
__slotnames__. Previously it was computed (to be empty) and then
discarded over and over (every time you pickled or deactivated an
object); now it is properly cached. This won't affect any subclasses.
jamadden committed May 19, 2021
1 parent 86fd464 commit 261d608
Showing 31 changed files with 284 additions and 103 deletions.
15 changes: 12 additions & 3 deletions CHANGES.rst
@@ -2,11 +2,20 @@
BTrees Changelog
==================

4.8.1 (unreleased)
4.9.0 (unreleased)
==================

- Nothing changed yet.

- Fix the C implementation to match the Python implementation and
allow setting custom node sizes for an entire application directly
by changing ``BTree.max_leaf_size`` and ``BTree.max_internal_size``
attributes, without having to create a new subclass. These
attributes can now also be read from the classes in the C
implementation. See `issue 166
<https://github.com/zopefoundation/BTrees/issues/166>`_.

- Add various small performance improvements for storing
zope.interface attributes on ``BTree`` and ``TreeSet`` as well as
deactivating persistent objects from this package.

4.8.0 (2021-04-14)
==================
45 changes: 27 additions & 18 deletions docs/development.rst
@@ -1,6 +1,6 @@
=====================
Developer Information
=====================
=======================
Developer Information
=======================

This document provides information for developers who maintain or extend
`BTrees`.
@@ -25,21 +25,6 @@ Configuration Macros
A string (like "IO" or "OO") that provides the prefix used for the module.
This gets used to generate type names and the internal module name string.

``DEFAULT_MAX_BUCKET_SIZE``

An int giving the maximum bucket size (number of key/value pairs). When a
bucket gets larger than this due to an insertion *into a BTREE*, it
splits. Inserting into a bucket directly doesn't split, and functions
that produce a bucket output (e.g., ``union()``) also have no bound on how
large a bucket may get. Someday this will be tunable on `BTree`
instances.

``DEFAULT_MAX_BTREE_SIZE``

An ``int`` giving the maximum size (number of children) of an internal
btree node. Someday this will be tunable on ``BTree`` instances.


Macros for Keys
---------------

@@ -194,6 +179,30 @@ Macros for Set Operations
a ``multiunion()`` function (compute a union of many input sets at high
speed). This currently makes sense only for structures with integer keys.

Datatypes
=========

There are two tunable values exposed on BTree and TreeSet classes.
Their default values are found in ``_datatypes.py`` and shared across
C and Python.


``max_leaf_size``

An int giving the maximum bucket size (number of key/value pairs).
When a bucket gets larger than this due to an insertion *into a
BTREE*, it splits. Inserting into a bucket directly doesn't split,
and functions that produce a bucket output (e.g., ``union()``)
also have no bound on how large a bucket may get. This used to
come from the C macro ``DEFAULT_MAX_BUCKET_SIZE``.


``max_internal_size``

An ``int`` giving the maximum size (number of children) of an
internal btree node. This used to come from the C macro
``DEFAULT_MAX_BTREE_SIZE``.


BTree Clues
===========
3 changes: 3 additions & 0 deletions docs/overview.rst
@@ -462,6 +462,9 @@ values for ``max_leaf_size`` or ``max_internal_size`` in your subclass::
... max_leaf_size = 500
... max_internal_size = 1000

As of version 4.9, you can also set these values directly on an
existing BTree class if you wish to tune them across your entire application.

``max_leaf_size`` is used for leaf nodes in a BTree, either Buckets or
Sets. ``max_internal_size`` is used for internal nodes, either BTrees
or TreeSets.
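
The difference between tuning in a subclass and tuning an existing class can be sketched in plain Python. ``FakeTree`` and the numbers used here are hypothetical stand-ins chosen for illustration, not real BTrees classes or defaults; the sketch only assumes that node sizes are looked up as ordinary class attributes.

```python
class FakeTree:
    """Hypothetical stand-in for a BTree class; not part of BTrees."""
    max_leaf_size = 30        # arbitrary illustrative value
    max_internal_size = 250   # arbitrary illustrative value


class TunedTree(FakeTree):
    # Subclass-level tuning: only TunedTree (and its subclasses) change.
    max_leaf_size = 500
    max_internal_size = 1000


class PlainSubclass(FakeTree):
    # Inherits whatever FakeTree currently defines.
    pass


# Class-level tuning: every class that does not override the attribute
# now sees the new value, which is the "entire application" case.
FakeTree.max_leaf_size = 120
```

After the last assignment, ``PlainSubclass`` picks up the new leaf size while ``TunedTree`` keeps its own override.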
87 changes: 68 additions & 19 deletions src/BTrees/BTreeModuleTemplate.c
@@ -55,6 +55,7 @@

static PyObject *sort_str, *reverse_str, *__setstate___str;
static PyObject *_bucket_type_str, *max_internal_size_str, *max_leaf_size_str;
static PyObject *__slotnames__str;
static PyObject *ConflictError = NULL;

static void PyVar_Assign(PyObject **v, PyObject *e) { Py_XDECREF(*v); *v=e;}
@@ -314,6 +315,7 @@ typedef struct BTree_s {
long max_leaf_size;
} BTree;

static PyTypeObject BTreeTypeType;
static PyTypeObject BTreeType;
static PyTypeObject BucketType;

@@ -583,19 +585,50 @@ VALUEMACROS_H
BTREEITEMSTEMPLATE_C
;

int
init_persist_type(PyTypeObject *type)
static int
init_type_with_meta_base(PyTypeObject *type, PyTypeObject* meta, PyTypeObject* base)
{
int result;
PyObject* slotnames;
#ifdef PY3K
((PyObject*)type)->ob_type = &PyType_Type;
((PyObject*)type)->ob_type = meta;
#else
type->ob_type = &PyType_Type;
type->ob_type = meta;
#endif
type->tp_base = cPersistenceCAPI->pertype;
type->tp_base = base;

if (PyType_Ready(type) < 0)
return 0;
/*
persistent looks for __slotnames__ in the dict at deactivation time,
and if it's not present, calls ``copyreg._slotnames``, which itself
looks in the dict again. Then it does some computation, and tries to
store the object in the dict --- which for built-in types, it can't.
So we can save some runtime if we store an empty slotnames for these classes.
*/
slotnames = PyTuple_New(0);
if (!slotnames) {
return 0;
}
result = PyDict_SetItem(type->tp_dict, __slotnames__str, slotnames);
Py_DECREF(slotnames);
return result < 0 ? 0 : 1;
}
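
The caching behaviour described in the comment inside `init_type_with_meta_base` can be observed from Python. This is a small sketch, assuming the standard-library helper `copyreg._slotnames` behaves as that comment describes; it is a private function, so details may differ between Python versions. `Plain` is a hypothetical class used only for illustration.

```python
import copyreg


class Plain:
    """Hypothetical ordinary Python class with no __slots__."""


# For a normal class the computed (empty) result is cached on the class,
# so later calls find it in the class __dict__.
first = copyreg._slotnames(Plain)
cached_on_plain = "__slotnames__" in Plain.__dict__

# For a built-in (static) type the cache write fails silently, so the
# same computation is repeated on every call.
builtin_result = copyreg._slotnames(dict)
cached_on_builtin = "__slotnames__" in dict.__dict__
```

Storing an empty tuple in `tp_dict` at type-initialization time gives the C types the same cache hit that `Plain` gets automatically.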

int /* why isn't this static? */
init_persist_type(PyTypeObject* type)
{
return init_type_with_meta_base(type, &PyType_Type, cPersistenceCAPI->pertype);
}

static int init_tree_type(PyTypeObject* type, PyTypeObject* bucket_type)
{
if (!init_type_with_meta_base(type, &BTreeTypeType, cPersistenceCAPI->pertype)) {
return 0;
}
if (PyDict_SetItem(type->tp_dict, _bucket_type_str, (PyObject*)bucket_type) < 0) {
return 0;
}
return 1;
}

@@ -644,6 +677,24 @@ module_init(void)
max_leaf_size_str = INTERN("max_leaf_size");
if (! max_leaf_size_str)
return NULL;
__slotnames__str = INTERN("__slotnames__");
if (!__slotnames__str)
return NULL;

BTreeType_setattro_allowed_names = PyTuple_Pack(
5,
/* BTree attributes */
max_internal_size_str,
max_leaf_size_str,
/* zope.interface attributes */
/*
Technically, INTERNING directly here leaks references,
but since we can't be unloaded, it's not a problem.
*/
INTERN("__implemented__"),
INTERN("__providedBy__"),
INTERN("__provides__")
);

/* Grab the ConflictError class */
interfaces = PyImport_ImportModule("BTrees.Interfaces");
@@ -694,25 +745,22 @@ module_init(void)
SetType.tp_new = PyType_GenericNew;
BTreeType.tp_new = PyType_GenericNew;
TreeSetType.tp_new = PyType_GenericNew;

if (!init_persist_type(&BucketType))
return NULL;
if (!init_persist_type(&BTreeType))
return NULL;
if (!init_persist_type(&SetType))
return NULL;
if (!init_persist_type(&TreeSetType))
return NULL;

if (PyDict_SetItem(BTreeType.tp_dict, _bucket_type_str,
(PyObject *)&BucketType) < 0)
{
fprintf(stderr, "btree failed\n");
if (!init_type_with_meta_base(&BTreeTypeType, &PyType_Type, &PyType_Type)) {
return NULL;
}
if (PyDict_SetItem(TreeSetType.tp_dict, _bucket_type_str,
(PyObject *)&SetType) < 0)
{
fprintf(stderr, "bucket failed\n");

if (!init_tree_type(&BTreeType, &BucketType)) {
return NULL;
}

if (!init_persist_type(&SetType))
return NULL;

if (!init_tree_type(&TreeSetType, &SetType)) {
return NULL;
}

@@ -727,6 +775,7 @@ module_init(void)

/* Add some symbolic constants to the module */
mod_dict = PyModule_GetDict(module);

if (PyDict_SetItemString(mod_dict, MOD_NAME_PREFIX "Bucket",
(PyObject *)&BucketType) < 0)
return NULL;
91 changes: 85 additions & 6 deletions src/BTrees/BTreeTemplate.c
@@ -20,12 +20,12 @@ _get_max_size(BTree *self, PyObject *name, long default_max)
{
PyObject *size;
long isize;

size = PyObject_GetAttr(OBJECT(OBJECT(self)->ob_type), name);
if (size == NULL) {
PyErr_Clear();
return default_max;
PyErr_Clear();
return default_max;
}

#ifdef PY3K
isize = PyLong_AsLong(size);
#else
@@ -48,7 +48,7 @@ _max_internal_size(BTree *self)
long isize;

if (self->max_internal_size > 0) return self->max_internal_size;
isize = _get_max_size(self, max_internal_size_str, DEFAULT_MAX_BTREE_SIZE);
isize = _get_max_size(self, max_internal_size_str, -1);
self->max_internal_size = isize;
return isize;
}
@@ -59,7 +59,7 @@ _max_leaf_size(BTree *self)
long isize;

if (self->max_leaf_size > 0) return self->max_leaf_size;
isize = _get_max_size(self, max_leaf_size_str, DEFAULT_MAX_BUCKET_SIZE);
isize = _get_max_size(self, max_leaf_size_str, -1);
self->max_leaf_size = isize;
return isize;
}
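
`_max_internal_size` and `_max_leaf_size` read the size from the type once and then cache it on the instance. The commit also resets that cache in `_p_deactivate`. A rough Python analogue of that pattern is sketched here. `CachingNode`, its method names, and the numbers are hypothetical and chosen for illustration only; they are not the BTrees API.

```python
class CachingNode:
    """Hypothetical stand-in showing the per-instance size cache."""

    max_leaf_size = 30  # arbitrary illustrative default

    def __init__(self):
        self._cached_leaf_size = 0  # 0 means "not looked up yet"

    def leaf_size(self):
        if self._cached_leaf_size > 0:
            return self._cached_leaf_size
        # Read from the type, falling back to -1 if the attribute is missing.
        size = getattr(type(self), "max_leaf_size", -1)
        self._cached_leaf_size = size
        return size

    def deactivate(self):
        # Forget the cached value so the next lookup consults the class again.
        self._cached_leaf_size = 0


node = CachingNode()
before = node.leaf_size()          # looked up on the class and cached
CachingNode.max_leaf_size = 500    # class-level change
stale = node.leaf_size()           # still the cached value
node.deactivate()
fresh = node.leaf_size()           # re-read from the class
```

In this sketch an instance that has already cached a size keeps using it until the cache is cleared, which is why clearing it on deactivation matters once the class attributes become writable.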
@@ -1035,6 +1035,14 @@ BTree__p_deactivate(BTree *self, PyObject *args, PyObject *keywords)
}
}

/*
Always clear our node size cache, whether we're in a jar or not. It is
only read from the type anyway, and we'll do so on the next write after
we get activated.
*/
self->max_internal_size = 0;
self->max_leaf_size = 0;

if (self->jar && self->oid)
{
ghostify = self->state == cPersistent_UPTODATE_STATE;
@@ -2496,8 +2504,79 @@ static PyNumberMethods BTree_as_number_for_nonzero = {
bucket_or, /* nb_or */
};

static PyTypeObject BTreeType = {
static PyObject* BTreeType_setattro_allowed_names; /* initialized in module */

static int
BTreeType_setattro(PyTypeObject* type, PyObject* name, PyObject* value)
{
/*
type.tp_setattro prohibits setting any attributes on a built-in type,
so we need to use our own (metaclass) type to handle it. The set of
allowable values needs to be carefully controlled.
Alternately, we could use heap-allocated types when they are supported
on all the versions we care about, because those do allow setting attributes.
*/
int allowed;
allowed = PySequence_Contains(BTreeType_setattro_allowed_names, name);
if (allowed < 0) {
return -1;
}

if (allowed) {
PyDict_SetItem(type->tp_dict, name, value);
PyType_Modified(type);
if (PyErr_Occurred()) {
return -1;
}
return 0;
}
PyErr_Format(
PyExc_TypeError,
/* distinguish the error message from what type would produce */
"BTree: can't set attributes of built-in/extension type '%s'",
type->tp_name);
return -1;
}
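
The idea behind `BTreeType_setattro` is a metaclass whose attribute setter accepts only an allowlist of names. It can be sketched in pure Python as follows. This is an analogue for illustration, not the C implementation: `RestrictedMeta`, `GuardedTree`, and the error text are hypothetical, and only the five allowed names come from the commit.

```python
ALLOWED_NAMES = frozenset({
    "max_internal_size",
    "max_leaf_size",
    "__implemented__",
    "__providedBy__",
    "__provides__",
})


class RestrictedMeta(type):
    """Hypothetical Python analogue of the C-level metaclass."""

    def __setattr__(cls, name, value):
        # Only names on the allowlist may be set on the class itself.
        if name in ALLOWED_NAMES:
            super().__setattr__(name, value)
            return
        raise TypeError(
            "can't set attribute %r of type %r" % (name, cls.__name__))


class GuardedTree(metaclass=RestrictedMeta):
    max_leaf_size = 30  # arbitrary illustrative value


GuardedTree.max_leaf_size = 500  # allowed: name is on the allowlist

try:
    GuardedTree.color = "red"  # rejected: name is not on the allowlist
except TypeError as exc:
    rejected = str(exc)
else:
    rejected = None
```

Assignments in the class body do not go through the metaclass setter, so the initial `max_leaf_size = 30` is unaffected; only later assignments on the class are filtered.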

static PyTypeObject BTreeTypeType = {
PyVarObject_HEAD_INIT(NULL, 0)
MODULE_NAME MOD_NAME_PREFIX "BTreeType",
0, /* tp_basicsize */
0, /* tp_itemsize */
0, /* tp_dealloc */
0, /* tp_print */
0, /* tp_getattr */
0, /* tp_setattr */
0, /* tp_compare */
0, /* tp_repr */
0, /* tp_as_number */
0, /* tp_as_sequence */
0, /* tp_as_mapping */
0, /* tp_hash */
0, /* tp_call */
0, /* tp_str */
0, /* tp_getattro */
(setattrofunc)BTreeType_setattro, /* tp_setattro */
0, /* tp_as_buffer */
#ifndef PY3K
Py_TPFLAGS_CHECKTYPES |
#endif
Py_TPFLAGS_DEFAULT |
Py_TPFLAGS_BASETYPE, /* tp_flags */
0, /* tp_doc */
0, /* tp_traverse */
0, /* tp_clear */
0, /* tp_richcompare */
0, /* tp_weaklistoffset */
0, /* tp_iter */
0, /* tp_iternext */
0, /* tp_methods */
0, /* tp_members */
};

static PyTypeObject BTreeType = {
PyVarObject_HEAD_INIT(&BTreeTypeType, 0)
MODULE_NAME MOD_NAME_PREFIX "BTree", /* tp_name */
sizeof(BTree), /* tp_basicsize */
0, /* tp_itemsize */
4 changes: 2 additions & 2 deletions src/BTrees/_IFBTree.c
@@ -26,8 +26,8 @@

#define MOD_NAME_PREFIX "IF"

#define DEFAULT_MAX_BUCKET_SIZE 120
#define DEFAULT_MAX_BTREE_SIZE 500



#include "_compat.h"
#include "intkeymacros.h"
4 changes: 2 additions & 2 deletions src/BTrees/_IIBTree.c
@@ -26,8 +26,8 @@

#define MOD_NAME_PREFIX "II"

#define DEFAULT_MAX_BUCKET_SIZE 120
#define DEFAULT_MAX_BTREE_SIZE 500



#include "_compat.h"
#include "intkeymacros.h"