Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ssh: use cache_req #127

Closed
wants to merge 9 commits into from
Closed

ssh: use cache_req #127

wants to merge 9 commits into from

Conversation

pbrezina
Copy link
Member

@pbrezina pbrezina commented Jan 18, 2017

These patches makes SSH responder use the cache_req interface.

Since this responder uses that same cache-domain-cache lookup logic
for host certificates I implemented host by name request in cache_req.
In order to achieve this I moved data provider lookup function from cache_req
code into plugins.

The first two patches fixes minor issues in the SSH responder and should be
pushed to earlier versions as well. The first patch fix a little issue
introduced probably by overrides and the second patch removes name qualification
since it is already qualified in the sysdb since fqname patches.

@jhrozek
Copy link
Contributor

jhrozek commented Jan 30, 2017

I will review this as these changes make the requested changes to the files provider code on review easier.

@jhrozek
Copy link
Contributor

jhrozek commented Jan 31, 2017

There is a bug when looking up keys for a user from a trusted domain:

==3945== Invalid read of size 8
==3945==    at 0x4100FE: sss_dp_callback_destructor (responder_dp.c:61)
==3945==    by 0x9019C0F: ??? (in /usr/lib64/libtalloc.so.2.1.8)
==3945==    by 0x901B7F0: _talloc_free (in /usr/lib64/libtalloc.so.2.1.8)
==3945==    by 0x77D4DBF: tevent_req_received (in /usr/lib64/libtevent.so.0.9.30)
==3945==    by 0x77D4DF8: ??? (in /usr/lib64/libtevent.so.0.9.30)
==3945==    by 0x9019C0F: ??? (in /usr/lib64/libtalloc.so.2.1.8)
==3945==    by 0x9019694: ??? (in /usr/lib64/libtalloc.so.2.1.8)
==3945==    by 0x901B7F0: _talloc_free (in /usr/lib64/libtalloc.so.2.1.8)
==3945==    by 0x77D4DBF: tevent_req_received (in /usr/lib64/libtevent.so.0.9.30)
==3945==    by 0x77D4DF8: ??? (in /usr/lib64/libtevent.so.0.9.30)
==3945==    by 0x901BADF: _talloc_free (in /usr/lib64/libtalloc.so.2.1.8)
==3945==    by 0x405F97: ssh_cmd_get_user_pubkeys_done (ssh_cmd.c:134)
==3945==  Address 0xb687390 is 688 bytes inside a block of size 797 free'd
==3945==    at 0x4C2ED4A: free (vg_replace_malloc.c:530)
==3945==    by 0x901B89A: _talloc_free (in /usr/lib64/libtalloc.so.2.1.8)
==3945==    by 0x411047: sss_dp_req_done (responder_dp.c:408)
==3945==    by 0x411E34: sss_dp_internal_get_done (responder_dp.c:840)
==3945==    by 0x5728391: ??? (in /usr/lib64/libdbus-1.so.3.16.3)
==3945==    by 0x572BCDE: dbus_connection_dispatch (in /usr/lib64/libdbus-1.so.3.16.3)
==3945==    by 0x4E8A4E8: sbus_dispatch (sssd_dbus_connection.c:96)
==3945==    by 0x77D84FF: tevent_common_loop_timer_delay (in /usr/lib64/libtevent.so.0.9.30)
==3945==    by 0x77D9518: ??? (in /usr/lib64/libtevent.so.0.9.30)
==3945==    by 0x77D7C06: ??? (in /usr/lib64/libtevent.so.0.9.30)
==3945==    by 0x77D3ABC: _tevent_loop_once (in /usr/lib64/libtevent.so.0.9.30)
==3945==    by 0x77D3CEA: tevent_common_loop_wait (in /usr/lib64/libtevent.so.0.9.30)
==3945==  Block was alloc'd at
==3945==    at 0x4C2DB9D: malloc (vg_replace_malloc.c:299)
==3945==    by 0x901F2BB: _talloc_pooled_object (in /usr/lib64/libtalloc.so.2.1.8)
==3945==    by 0x77D4ACC: _tevent_req_create (in /usr/lib64/libtevent.so.0.9.30)
==3945==    by 0x4118B9: sss_dp_internal_get_send (responder_dp.c:689)
==3945==    by 0x410C5F: sss_dp_issue_request (responder_dp.c:334)
==3945==    by 0x411351: sss_dp_get_account_send (responder_dp.c:528)
==3945==    by 0x41A0CB: cache_req_user_by_upn_dp_send (cache_req_user_by_upn.c:102)
==3945==    by 0x418266: cache_req_search_dp (cache_req_search.c:289)
==3945==    by 0x417FDB: cache_req_search_send (cache_req_search.c:232)
==3945==    by 0x416912: cache_req_next_domain (cache_req.c:544)
==3945==    by 0x416D00: cache_req_done (cache_req.c:693)
==3945==    by 0x77D4473: tevent_common_loop_immediate (in /usr/lib64/libtevent.so.0.9.30)

@pbrezina
Copy link
Member Author

pbrezina commented Feb 1, 2017

It should be fixed. Thanks.

@jhrozek
Copy link
Contributor

jhrozek commented Feb 2, 2017

The code looks good to me now but I found one regression - if you set default_domain_suffix to the AD domain and try to look up a host, the ssh responder will query the AD domain. Since hosts can only be placed in the IPA domain, we should ignore the default_domain_suffix for host searches.

And one more question is related to the first two patches -- looks like they are legitimate bugs in the 1.14 codebase, should we push them to 1.14 separately? If yes, I would prefer to have some better commit message with steps to reproduce the bug or at least a desription.

SSH responder returned invalid number of certificates when
original ad pubkey attribute was not empty. Since we always
return all certificates to the client we should add number
of results to the output not override it.
We store fully qualified name in sysdb so there is no need to append
the domain part again which result in name@domain@domain string.
This field is not actually used in ssh client so it doesn't cause
any issue but we should stay correct here.
It is not always desirable to consider default_domain from configuration
but expect none instead. For example when we search host certificates.

This is currently not used in this patch since host lookups parse
name directly with sss_parse_name but it will be used in the next
patch.
This will be used in the next plugin "host by name" where
it is not desirable to use default domain suffix if set.
Sometime is is desirable to aquire more attribute from user object
than SYSDB_PW_ATTRS set. such as user's public key.
Some sysdb methods doesn't return ldb_result as output but return
ldb_message instead. Changing sysdb to be consistent is too big
so I added this helper function that will wrap resulting message
into ldb_result.
This will allow to use cache req even for object that do not use
account request such as hosts.
@pbrezina
Copy link
Member Author

pbrezina commented Feb 3, 2017

The code looks good to me now but I found one regression - if you set default_domain_suffix to the AD domain and try to look up a host, the ssh responder will query the AD domain. Since hosts can only be placed in the IPA domain, we should ignore the default_domain_suffix for host searches.

Thanks. Fixed.

And one more question is related to the first two patches -- looks like they are legitimate bugs in the 1.14 codebase, should we push them to 1.14 separately? If yes, I would prefer to have some better commit message with steps to reproduce the bug or at least a desription.

Yes, those two patches should be pushed into 1.14. I improved commit message.

@jhrozek
Copy link
Contributor

jhrozek commented Feb 6, 2017

I'm afraid this is still not fixed correctly. Please check this gdb session when I requested an IPA host with a default_domain_suffix pointing to a Windows domain:

0x00007f0a1cc0d543 in __epoll_wait_nocancel () at ../sysdeps/unix/syscall-template.S:84
84      T_PSEUDO (SYSCALL_SYMBOL, SYSCALL_NAME, SYSCALL_NARGS)
Missing separate debuginfos, use: dnf debuginfo-install sssd-common-1.14.2-2.fc25.x86_64
(gdb) b sysdb_search_ssh_hosts
Breakpoint 1 at 0x7f0a21095b90: file /sssd/src/db/sysdb_ssh.c, line 290.
(gdb) c
Continuing.

Breakpoint 1, sysdb_search_ssh_hosts (mem_ctx=0x20abf60, domain=0x20bb680, 
    filter=0x20bd6a0 "(name=client.ipa.test)", attrs=0x20b73f0, num_hosts=0x7fffa2d19770, hosts=0x7fffa2d19768)
    at /sssd/src/db/sysdb_ssh.c:290
290         tmp_ctx = talloc_new(NULL);
(gdb) c
Continuing.

Breakpoint 1, sysdb_search_ssh_hosts (mem_ctx=0x20abf60, domain=0x20bb680, 
    filter=0x20b8440 "(name=client.ipa.test)", attrs=0x20b73f0, num_hosts=0x7fffa2d198e0, hosts=0x7fffa2d198d8)
    at /sssd/src/db/sysdb_ssh.c:290
290         tmp_ctx = talloc_new(NULL);
(gdb) n
291         if (!tmp_ctx) {
(gdb) 
295         ret = sysdb_search_custom(tmp_ctx, domain, filter,
(gdb) 
298         if (ret != EOK && ret != ENOENT) {
(gdb) p ret
$1 = 2
(gdb) p filter
$2 = 0x20b8440 "(name=client.ipa.test)"
(gdb) p *domain
$3 = {name = 0x20ac820 "win.trust.test", conn_name = 0x20b7d60 "ipa.test", provider = 0x20bc0f0 "ipa", timeout = 0, 
  enumerate = false, sd_enumerate = 0x0, fqnames = true, mpg = true, ignore_group_members = false, id_min = 1, 
  id_max = 4294967295, cache_credentials = true, cache_credentials_min_ff_length = 8, legacy_passwords = false, 
  case_sensitive = false, case_preserve = false, override_gid = 0, override_homedir = 0x0, fallback_homedir = 0x0, 
  subdomain_homedir = 0x20ab490 "/home/%d/%u", homedir_substr = 0x0, override_shell = 0x0, default_shell = 0x0, 
  user_timeout = 5400, group_timeout = 5400, netgroup_timeout = 5400, service_timeout = 5400, autofsmap_timeout = 0, 
  sudo_timeout = 0, ssh_host_timeout = 0, refresh_expired_interval = 0, subdomain_refresh_interval = 0, 
  cached_auth_timeout = 0, pwd_expiration_warning = -1, sysdb = 0x20b31d0, names = 0x20ad590, parent = 0x20aed20, 
  subdomains = 0x0, realm = 0x20baff0 "WIN.TRUST.TEST", flat_name = 0x20bc490 "WIN", 
  domain_id = 0x20b6060 "S-1-5-21-1733986419-3120376054-1365464917", trust_direction = 0, subdomains_last_checked = {
    tv_sec = 0, tv_usec = 0}, has_views = false, view_name = 0x20b85f0 "default", prev = 0x0, next = 0x20be700, 
  state = DOM_ACTIVE, sd_inherit = 0x0, forest = 0x20ae870 "win.trust.test", forest_root = 0x20bb680, 
  upn_suffixes = 0x0}
(gdb) 

This is a bigger change since both supported commands could be
rewritten for cache_req and the logic could be deleted. I decided
to also split the file into more modules and follow similar pattern
as with nss responder.

Resolves:
https://fedorahosted.org/sssd/ticket/1126
@pbrezina
Copy link
Member Author

pbrezina commented Feb 7, 2017

Try now.

@jhrozek
Copy link
Contributor

jhrozek commented Feb 7, 2017 via email

@lslebodn
Copy link
Contributor

lslebodn commented Feb 8, 2017

The commit "cache_req: add api to create ldb_result from message" broke integration tests:
http://sssd-ci.duckdns.org/logs/job/62/01/summary.html

The following commit fixed that. IMHO changing order should be safe.

I tried to run some downstream tests and they failed; need to find a reason

@jhrozek
Copy link
Contributor

jhrozek commented Feb 8, 2017

Well, I just pushed the patches:
0b7ded1..a8191ce

That's why I leave the accepted flag for quite some time before pushing..feel free to just remove it next time if you plan on testing patches more than the reviewee.

@jhrozek jhrozek closed this Feb 8, 2017
This pull request was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants