
mmap_cache: add SID and type to struct sss_mc_rec #413

Closed
wants to merge 4 commits into from

@sumit-bose
Contributor

commented Oct 19, 2017

This patchset updates the memory cache by adding some new members to struct
sss_mc_rec. One is a hash value for SID based lookups, which will be added in
later patches.

The other is a new record type and a member indicating the type. The new type
is a link record which links an alias name, e.g. a UPN, to the original record
of the related user or group object.

Besides aliases, this link record will be used in case-insensitive setups.
E.g. if getpwnam() returns the name of an AD user as Administrator@ad.domain
but some applications or users use administrator@ad.domain for lookups, the
memory cache is currently never used because there is no entry with the hash
of 'administrator@ad.domain'. With this patch the original data record is
created as before with the hash of 'Administrator@ad.domain' and a link record
is created with the hash of 'administrator@ad.domain'. Now both lookups can be
handled by the memory cache. If another application then uses
ADMINISTRATOR@AD.DOMAIN for lookups, the first request will go to the NSS
responder, but subsequent requests can use the memory cache as well because a
link record for ADMINISTRATOR@AD.DOMAIN is created.
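The lookup flow described above can be sketched roughly as follows. This is a simplified, hypothetical model: the record and function names (`struct rec`, `lookup`, `name_hash`) are illustrative, and the real cache uses `struct sss_mc_rec` with relative pointers inside a shared memory-mapped region and its own hash function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified model of the two record types: a data record holds the
 * canonical entry, a link record only points (by hash) to a data record. */
enum rec_type { REC_TYPE_DATA, REC_TYPE_LINK };

struct rec {
    enum rec_type type;
    uint32_t hash;          /* hash of the name this record is filed under */
    const char *name;       /* canonical name (data) or alias (link) */
    uint32_t target_hash;   /* link records only: hash of the data record */
};

/* FNV-1a stands in for the cache's real hash function. */
static uint32_t name_hash(const char *s)
{
    uint32_t h = 2166136261u;
    for (; *s != '\0'; s++) { h ^= (uint8_t)*s; h *= 16777619u; }
    return h;
}

/* Look up a name; if the matching record is a link, chase it to the
 * data record it refers to. */
static const struct rec *lookup(const struct rec *tbl, size_t n,
                                const char *name)
{
    uint32_t h = name_hash(name);
    for (size_t i = 0; i < n; i++) {
        if (tbl[i].hash != h)
            continue;
        if (tbl[i].type == REC_TYPE_DATA)
            return &tbl[i];
        /* link record: second lookup by the data record's hash */
        for (size_t j = 0; j < n; j++)
            if (tbl[j].type == REC_TYPE_DATA &&
                tbl[j].hash == tbl[i].target_hash)
                return &tbl[j];
    }
    return NULL;  /* cache miss: the request goes to the NSS responder */
}
```

With a data record for 'Administrator@ad.domain' and a link record for 'administrator@ad.domain', both spellings resolve to the same data record, while an unseen spelling like ADMINISTRATOR@AD.DOMAIN still misses until its own link record is created.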

The last patch in this series adds some additional data to the user and group
lookup requests: the short name, the domain name, the short domain name and
the SID. These are needed to support SID based lookups in the memory cache and
to allow applications not to depend on the name format returned by
getpw{nam|uid}(). Upcoming patches for libsss_nss_idmap will make those
additional values available to applications; I added them here already to keep
the memory cache related changes in one PR. Applications which will benefit
are the interfaces SSSD provides e.g. to Samba related applications like
SSSD's version of libwbclient, but also IPA plugins like extdom and slapi-nis.
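A rough sketch of how such extra fields might be packed after the regular passwd strings and read back with a bounds check. The helper names (`pack_extra`, `unpack_extra`) and the length-prefixed layout are assumptions for illustration; the actual patch writes the fields with SAFEALIGN_SET_STRING into the response body.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Pack NUL-terminated strings back to back, preceded by a uint32_t total
 * length, so a client can skip the blob without knowing its layout. */
static size_t pack_extra(char *buf, size_t buf_len,
                         const char *const *strs, size_t n)
{
    size_t off = sizeof(uint32_t);
    for (size_t i = 0; i < n; i++) {
        size_t len = strlen(strs[i]) + 1;
        if (off + len > buf_len)
            return 0;                        /* would overflow the buffer */
        memcpy(buf + off, strs[i], len);
        off += len;
    }
    uint32_t total = (uint32_t)off;
    memcpy(buf, &total, sizeof(total));      /* unaligned-safe write */
    return off;
}

/* Return the idx-th extra string, with the "remaining data too small"
 * style check the review comment below refers to. */
static const char *unpack_extra(const char *buf, size_t buf_len, size_t idx)
{
    uint32_t total;
    if (buf_len < sizeof(total))
        return NULL;                         /* remaining data too small */
    memcpy(&total, buf, sizeof(total));
    if (total > buf_len)
        return NULL;                         /* claimed size exceeds data */
    size_t off = sizeof(total);
    while (off < total) {
        if (idx-- == 0)
            return buf + off;
        off += strlen(buf + off) + 1;
    }
    return NULL;                             /* index out of range */
}
```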

sumit-bose added some commits Sep 25, 2017

mmap_cache: add SID and type to struct sss_mc_rec
The type is used to indicate a new kind of memory cache record which does
not contain the user or group data but a link to a data object.
This new type can be used to handle alias names or UPNs and will help in
case-insensitive setups if clients use different cases during lookups.

The added third hash slot will be used to store the hash of the SID of
the user or group if available.

This patch updates the major version number of the memory cache as well.

Related to https://pagure.io/SSSD/sssd/issue/3193
            https://pagure.io/SSSD/sssd/issue/2727
mmap_cache: add link record to handle alias names
If the name used to lookup a user or group object is different from the
"canonical" name, i.e. the name returned by getpwnam() or getgrnam()
calls, a link record is added to the memory hash to speed up further
lookups with this name.

Related to https://pagure.io/SSSD/sssd/issue/3193
@pbrezina

Member

commented Oct 19, 2017

 /*
- * 40 seem a good compromise for slot size
- * 4 blocks are enough for the average passwd entry of 42 bytes
- * passwd records have 84 bytes of overhead, 160 - 82 = 78 bytes
- * 3 blocks can contain a very minimal entry, 120 - 82 = 38 bytes
+ * 48 seem a good compromise for slot size
+ * 4 blocks are more than two times the average passwd entry of 42 bytes
+ * passwd records have 84 bytes of overhead, 192 - 82 = 110 bytes
+ * 3 blocks can contain a typical entry, 144 - 82 = 62 bytes
  *
  * 3 blocks are enough for groups w/o users (private user groups)
- * group records have 68 bytes of overhead, 120 - 66 = 54 bytes
+ * group records have 68 bytes of overhead, 144 - 66 = 78 bytes
  */
 #define MC_SLOT_SIZE 40

You have changed it to 48 in comment but the constant is still 40.

There are also some questions raised inside the code, it would be better to resolve them. But I will let someone who understands memory cache better to review this patch set.

@simo5
Contributor

left a comment

I am not sure I really like the link data type record concept or the extra data I see in this patchset.
Do you expect to have multiple aliases? If not, then it would be better to embed the link data directly in the record by adding an alias field.
As for the extra data, I see this will just cause us to waste tons of space by writing the same domain names over and over again in each record. This is not a wise use of limited space and increases the size of records by a non-trivial margin. There are very few domains commonly in use and they are always the same; we should have a separate place where we store these names (which almost never change) and then just reference them via a short ID (almost certainly a byte will be sufficient), unless we think there will ever be setups with more than 256 domains.
Let's discuss these changes better please.
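The separate-domain-table idea could look roughly like this. This is a hypothetical sketch of the suggestion, not existing SSSD code; the type and function names are invented for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAX_DOMAINS 256

/* A single shared table of domain names; records then carry only a
 * one-byte index into it instead of repeating the full name. */
struct domain_table {
    const char *names[MAX_DOMAINS];
    unsigned count;
};

/* Return the id of a domain name, interning it on first use.
 * 0xff is reserved as an error value for the (unlikely) case of
 * more than 255 distinct domains. */
static uint8_t domain_id(struct domain_table *t, const char *name)
{
    for (unsigned i = 0; i < t->count; i++)
        if (strcmp(t->names[i], name) == 0)
            return (uint8_t)i;               /* already interned */
    if (t->count >= MAX_DOMAINS - 1)
        return 0xff;                         /* table full */
    t->names[t->count] = name;
    return (uint8_t)t->count++;
}
```

Each record would then store one byte per domain reference instead of a repeated domain-name string, which is the space saving the comment argues for.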

@@ -73,7 +74,7 @@ typedef uint32_t rel_ptr_t;
#define MC_VALID_BARRIER(val) (((val) & 0xff000000) == 0xf0000000)

#define MC_CHECK_RECORD_LENGTH(mc_ctx, rec) \
((rec)->len >= MC_HEADER_SIZE && (rec)->len != MC_INVALID_VAL32 \
((rec)->len >= MC_REC_SIZE && (rec)->len != MC_INVALID_VAL32 \

@simo5

simo5 Oct 19, 2017

Contributor

Please explain this change in the commit message (and maybe also add a comment above MC_CHECK_RECORD_LENGTH to explain what it does/why it does it that way)

* 48 seem a good compromise for slot size
* 4 blocks are more than two times the average passwd entry of 42 bytes
* passwd records have 84 bytes of overhead, 192 - 82 = 110 bytes
* 3 blocks can contain a typical entry, 144 - 82 = 62 bytes

@simo5

simo5 Oct 19, 2017

Contributor

This description needs to be improved: first of all, 48 may be a compromise, but the code says 40 ...
Later on it says the overhead is 84, but the calculations use 82 ...

@simo5

simo5 Oct 19, 2017

Contributor

Also thinking in terms of cache lines we may want to consider using 64 bytes blocks (cache lines are 64 bytes) as this has a chance of aligning well.
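The alignment argument can be made concrete with a toy calculation. Assuming the slot array itself starts on a cache-line boundary (an assumption, not something the patch guarantees), this counts how many slots you must advance before a slot start lands on a 64-byte line again:

```c
#include <assert.h>
#include <stddef.h>

#define CACHE_LINE 64

/* Number of consecutive slots between two slots whose start offset is a
 * multiple of the cache-line size, for a given slot size. */
static size_t slots_per_aligned_slot(size_t slot_size)
{
    size_t off = 0, n = 0;
    do {
        off = (off + slot_size) % CACHE_LINE;
        n++;
    } while (off != 0);
    return n;
}
```

With 40-byte slots only every 8th slot is cache-line aligned, with 48-byte slots every 4th, and with 64-byte slots every slot starts on its own line, so a single-slot record never straddles two lines.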

uint32_t hash3; /* val of third hash (usually SID of record) */
uint32_t type; /* type of record, SSS_MC_REC_TYPE_DATA or
* SSS_MC_REC_TYPE_LINK */
uint32_t b2; /* barrier 2 - 40 bytes mark, fits a slot */

@simo5

simo5 Oct 19, 2017

Contributor

this is not 40 bytes anymore, at the very least needs correction in the comment.

@@ -150,6 +161,12 @@ struct sss_mc_initgr_data {
* after gids */
};

struct sss_mc_link_data {
rel_ptr_t name; /* ptr to name string, rel. to struct base addr */
uint32_t hash; /* name hash of the related data record */

@simo5

simo5 Oct 19, 2017

Contributor

This comment is unclear: what is the hash computed over? What is it used for?
Also, why do we need an alias record?


/* Use domain->name if domain->flat_name is undefined */
/* FIXME: or should it be better "" ? */
flat_name = domain->flat_name != NULL ? domain->flat_name : domain->name;

@simo5

simo5 Oct 19, 2017

Contributor

remove "!= NULL"; if you are using this kind of short expression you may optimize that out as well

@@ -293,13 +305,15 @@ nss_protocol_fill_pwent(struct nss_ctx *nss_ctx,
SAFEALIGN_SET_STRING(&body[rp], gecos.str, gecos.len, &rp);
SAFEALIGN_SET_STRING(&body[rp], homedir.str, homedir.len, &rp);
SAFEALIGN_SET_STRING(&body[rp], shell.str, shell.len, &rp);
SAFEALIGN_SET_STRING(&body[rp], extra_data.data, extra_data.len, &rp);

@simo5

simo5 Oct 19, 2017

Contributor

Why are you formatting this information as an additional structure, instead of simply adding these fields one by one to the current structure?

/* Extra data starts after shell string */
p = result->pw_shell + strlen(result->pw_shell) + 1;
if ( (p - buffer) + sizeof(uint32_t) > data->strs_len) {
/* Remaining data to small for data size */

@simo5

simo5 Oct 19, 2017

Contributor

to -> too

@lslebodn

Contributor

commented Oct 20, 2017

The memory cache is a critical part of the client code, and it is a nice tradition to write integration tests together with new features there.

@sumit-bose

Contributor Author

commented Oct 16, 2018

I will come up with a new design later.

@sumit-bose sumit-bose closed this Oct 16, 2018
