New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Missing results in sharphound group collection compared to bloodhound #20

Closed
jbfuzier opened this Issue Mar 8, 2018 · 42 comments

Comments

Projects
None yet
3 participants
@jbfuzier

jbfuzier commented Mar 8, 2018

Hi,

I am seeing some huge differences between group collection from bloodhound.ps1 & sharphound.exe.
Both were runned on the same domain from the same PC with the same account, results are consistent accross several runs.

After normalizing the data & diffing here is the verdict :

2833179 lines in sharphound
4363527 lines in bloodhound
2831326 lines in both
**1532201 lines missing in sharphound**
1853 lines missing in bloodhound

I do not see any pattern on what's missing in sharphound :(

Using commit 3355c35 (03/05/18) :
https://github.com/BloodHoundAD/BloodHound/blob/master/Ingestors/BloodHound_Old.ps1
https://github.com/BloodHoundAD/BloodHound/blob/master/Ingestors/SharpHound.exe

@rvazarkar

This comment has been minimized.

Collaborator

rvazarkar commented Mar 8, 2018

Can you validate the missing groups and make sure they exist? We've made lots of changes to sharphound that make group collection different. Can you get an idea of what the missing groups are as well?

@jbfuzier

This comment has been minimized.

jbfuzier commented Mar 8, 2018

Commands used for collection :

BloodHound_Old.ps1 :

Invoke-BloodHound -Domain domain.net -SearchForest -CollectionMethod 'Group'

SharpHound.exe:

SharpHound.exe -d domain.net -s -c Groups

@rvazarkar

This comment has been minimized.

Collaborator

rvazarkar commented Mar 8, 2018

You can't use the -d and -s flag at the same time. Just use -s for sharphound

@jbfuzier

This comment has been minimized.

jbfuzier commented Mar 8, 2018

Do you think the -d & -s could cause such issue ?

I noticed the issue first, because with sharphound, the group that is holding my domain admins has no user in it.

I just doubled check with net group admins /domain and my users are indeed inside it. Bloodhound output is also Ok.

When looking inside the diff, I do not see anything obvious, it seems pretty random, there are users, groups, computers missing. Nothing special about those.

@rvazarkar

This comment has been minimized.

Collaborator

rvazarkar commented Mar 8, 2018

Theres no functional difference between sharphound enumeration and the old bloodhound enumeration as far as pulling down groups, and I've never seen a discrepancy that large before. It's also basically impossible to debug this issue remotely.

If you can find the reason for the discrepancy, let me know. I'll leave this issue open for now.

@rvazarkar

This comment has been minimized.

Collaborator

rvazarkar commented Mar 8, 2018

As a side note, are the missing users/computers from a specific domain? Or your current domain? Or a mix of all of them

@jbfuzier

This comment has been minimized.

jbfuzier commented Mar 8, 2018

From my current domain.

@jbfuzier

This comment has been minimized.

jbfuzier commented Mar 8, 2018

Does it compile with visual studio express ?

@rvazarkar

This comment has been minimized.

Collaborator

rvazarkar commented Mar 8, 2018

I compile using Visual Studio 2017 Community

@dschaudel

This comment has been minimized.

dschaudel commented Mar 8, 2018

I have a similar issue due to LocalGroup function where the result generated by SharpHound is missing some members.

For example: Two domain groups as a member of a local administrators group. SharpHound returns one group and BloodHound both. What I‘m noticing is, that the scope of the missing group is ‚DomainLocal‘ whereas the scope of the other group is ‚Global‘.

@jbfuzier

This comment has been minimized.

jbfuzier commented Mar 9, 2018

Just to make sure I am understanding properly this :

var groups = GroupHelpers.ProcessAdObject(entry, resolved, _currentDomainSid);

is in charge of getting the list of member in a given group (identified by its DN : entry) ?

@rvazarkar

This comment has been minimized.

Collaborator

rvazarkar commented Mar 9, 2018

Thats correct. However, we don't look at group objects directly, we look at user/computer/group objects and resolve the memberof property

@jbfuzier

This comment has been minimized.

jbfuzier commented Mar 9, 2018

thanks, I will keep digging

@jbfuzier

This comment has been minimized.

jbfuzier commented Mar 9, 2018

Are you sure it is not this :

items = _utils.DoWrappedSearch(ldapFilter, SearchScope.Subtree, props, domainName);

which gets a list of all users, computers & groups and then :

var groups = GroupHelpers.ProcessAdObject(entry, resolved, _currentDomainSid);

resolves which groups those users,computers&groups are members of ?

@rvazarkar

This comment has been minimized.

Collaborator

rvazarkar commented Mar 9, 2018

The first line gets all the items. The second item is in charge of actually processing the group membership information.

@rvazarkar

This comment has been minimized.

Collaborator

rvazarkar commented Mar 9, 2018

I just pushed a bunch of changes to group collection. Why dont you give them a try and see if it works properly now. Applies to you as well @dschaudel.

@rvazarkar

This comment has been minimized.

Collaborator

rvazarkar commented Mar 9, 2018

On second thought, give me a bit more time to fix some more bugs

@rvazarkar

This comment has been minimized.

Collaborator

rvazarkar commented Mar 9, 2018

Alright...fixed some more stuff...here you go

Release.zip

@dschaudel

This comment has been minimized.

dschaudel commented Mar 10, 2018

Solved my issue! Thank you!

@jbfuzier

This comment has been minimized.

jbfuzier commented Mar 12, 2018

Unfortunatly still the same...

I did some digging, the ldap query bellow does not returns everything it should.

items = _utils.DoWrappedSearch(ldapFilter, SearchScope.Subtree, props, domainName);

I added some logging of the items returned by this query, across several runs, all results start the same, but there are a few thousands more or less results at the end of the file. My guess is that there is some kind of timeout in the ldap query (around 30mn it seems), and that in those 30mn we do not have the time to fetch all the results needed.

@jbfuzier

This comment has been minimized.

jbfuzier commented Mar 12, 2018

Bloodhound used DirectorySearcher, sharphound SearchRequest, is there a reason for the switch ?

@jbfuzier

This comment has been minimized.

jbfuzier commented Mar 12, 2018

ClientTimeout, ServerTimeLimit, ServerPageTimeLimit are all timeout values to prevent a search from running too long. The first value, ClientTimeout, controls how long the client waits. The other two time limits are imposed by the server. I suggest setting at least the ClientTimeout, to avoid the possibility of your application waiting indefinitely. It is important to note, though, that if one of the server-based time limits is hit, the entries retrieved up to that point are returned. Nothing is returned if the client time limit is exceeded first. Keep in mind that the server itself has a time limit that can be configured by the administrator, so it may time out (and return the incomplete results) before the time limit you have specified

https://msdn.microsoft.com/en-us/library/ms973834.aspx

@rvazarkar

This comment has been minimized.

Collaborator

rvazarkar commented Mar 12, 2018

We switched because the Protocols API is faster, has less overhead, and allows us to support encryption properly. If you think its a timeout problem, why not try doing each domain individually?

@rvazarkar

This comment has been minimized.

Collaborator

rvazarkar commented Mar 12, 2018

Release.zip

Heres another build you can try. I upped the LdapConnection timeout to 60 seconds from the default 30, and took out a possible bad check in the paging

@jbfuzier

This comment has been minimized.

jbfuzier commented Mar 12, 2018

The timeout is not a client side timeout, it is someting set on the server side (see quoted msdn doc).
I implemented a quick & dirty workaround for huge domain, which is the case here; with this workaround I got the same results as with bloodhound.

The idea is to split the huge ldap query into smaller one so that no query reaches the server timeout (around 30mn it seems). Firstly, I do a ldap query to fetch all OUs (|(objectClass=builtinDomain)(objectClass=organizationalUnit)) (I am not sure if users & group can be in another class of object ?). For each OU, I do a ldap query with a seachScope of OneLevel and the OU's dn as adsPath.

It is likely slower, but it allows to handle very large domains. It would also be good if we are able to get the number of results before iterating in order to check that at the end the number of iteration is equal to the number of expected results (that would allow to catch the error I had before the workaround).

I can share my changes with you if you want, but right now the code is too ugly !

@rvazarkar

This comment has been minimized.

Collaborator

rvazarkar commented Mar 12, 2018

As far as I know, theres no way to get the number of objects without actually enumerating all the objects. If this is a server side timeout, then why is the original BloodHound.ps1 working fine? It should be subject to the same timeout.

@rvazarkar

This comment has been minimized.

Collaborator

rvazarkar commented Mar 12, 2018

Also, are you on the BloodHound Slack channel? Its much easier to figure this out there

@rvazarkar

This comment has been minimized.

Collaborator

rvazarkar commented Mar 13, 2018

I think I understand why the old BloodHound.ps1 doesn't have this problem, since the query is split between collection methods. I guess we might have to go back to seperate ldap queries for each collection method if all else fails...

@jbfuzier

This comment has been minimized.

jbfuzier commented Mar 13, 2018

@jbfuzier

This comment has been minimized.

jbfuzier commented Mar 14, 2018

Here is the result of my tests, every scenarios runned 3 times to be sure.

This represent the number of results returned by the ldap query :

ldapFilter = "(|(memberof=)(primarygroupid=))"

Small variations are likely due to the "life" of the directory which is as you can see quite large.

  Run 1 Run 2 Run 3
SearchRequest 272000 (~30mn) 273000 (~30mn) 27200 (32mn)
DirectorySearcher (Large time & ram overhead) 668 921 669 462 (91mn) 669 476 (82mn)
SearchRequest with per OU search 656 599 669 097 (40mn) 669 097 (23mn)

What we can conclude from that, is that SearchRequest is somehow not reliable when dealing with very large directories. DirectorySearcher seems reliable, this is what is also used in bloodhound ps1, so that explains why bloodhound had no issue getting accurate results. Howerver it has quite a large memory & speed overhead.
Last one is the workaround I explained earlier using SearchRequest to get a list of every OUs and then SearchRequest is applied on each OUs with a OneLevel scope.

@rvazarkar

This comment has been minimized.

Collaborator

rvazarkar commented Mar 14, 2018

Did you configure DirectorySearcher in any way? Set any particular properties?

@rvazarkar

This comment has been minimized.

Collaborator

rvazarkar commented Mar 15, 2018

@jbfuzier could you try this build and see if its giving you a timeout error properly?

Sharphound.zip

@jbfuzier

This comment has been minimized.

jbfuzier commented Mar 15, 2018

I will try the new build.

Nothing special with Directory searcher config, I just did the same as in bloodhoundold.ps1.


    public System.DirectoryServices.DirectorySearcher GetDirectorySearcher(string filter, System.DirectoryServices.SearchScope scope, string[] attribs, string domainName = null, string adsPath = null)
        {
            Domain targetDomain;
            try
            {
                targetDomain = GetDomain(domainName);
            }
            catch
            {
                Verbose($"Unable to contact domain {domainName}");
                return null;
            }
            domainName = targetDomain.Name;
            adsPath = adsPath?.Replace("LDAP://", "") ?? $"DC={domainName.Replace(".", ",DC=")}";
            var request = new System.DirectoryServices.DirectorySearcher(filter, attribs);
            request.PageSize = 200;//TODO
            request.SearchScope = scope;
            return request;
        }
        
        

        public IEnumerable<Wrapper<System.DirectoryServices.SearchResult>> DoWrappedDirectorySearch(string filter, System.DirectoryServices.SearchScope scope, string[] props,
            string domainName = null, string adsPath = null, bool useGc = false)
        {
            using (var conn = useGc ? GetGcConnection(domainName) : GetLdapConnection(domainName))
            {
                if (conn == null)
                {
                    Verbose("Unable to contact LDAP");
                    yield break;
                }
                var request = GetDirectorySearcher(filter, scope, props, domainName, adsPath);

                if (request == null)
                {
                    Verbose($"Unable to contact domain {domainName}");
                    yield break;
                }
                // Might be leaking memory
                using (System.DirectoryServices.SearchResultCollection src = request.FindAll())
                {
                    foreach (System.DirectoryServices.SearchResult sr in src)
                    {
                        
                        yield return new Wrapper<System.DirectoryServices.SearchResult> { Item = sr };
                    }
                }

            }
        }
@jbfuzier

This comment has been minimized.

jbfuzier commented Mar 15, 2018

It raises an explicit message about the timeout : (~given up on operation because client wait time has been exceeded)

System.DirectoryServices.Protocols.LdapException: L'opération a été abandonnée car la limite de délai d'attente du client a été dépassée.
à System.DirectoryServices.Protocols.LdapConnection.ConstructResponse(Int32 messageId, LdapOperation operation, ResultAll resultType, TimeSpan requestTimeOut, Boolean exceptionOnTimeOut)
à System.DirectoryServices.Protocols.LdapConnection.SendRequest(DirectoryRequest request, TimeSpan requestTimeout)
à Sharphound2.Utils.d__28.MoveNext()
Waiting for enumeration threads to finish
Status: 393000 objects enumerated (+0 404,321/s --- Using 373 MB RAM )
Waiting for writer thread to finish

@rvazarkar

This comment has been minimized.

Collaborator

rvazarkar commented Mar 15, 2018

Interestingly, that indicates its a client timeout, not a server timeout. I wonder if something is getting hung up in processing thats causing issues

@jbfuzier

This comment has been minimized.

jbfuzier commented Mar 15, 2018

Nice finding ! What is weird is that is does not always stop at the same result & also not exactly at the same time.

@rvazarkar

This comment has been minimized.

Collaborator

rvazarkar commented Mar 15, 2018

@jbfuzier if you don't mind running again, can you try this new build. I upped the client timeout to 5 minutes, which is an absurd number in the context of LDAP. If it runs into the same issue, I'll re-investigate, but if we can avoid re-architecting the ldap search, I'm sure going to try. Thanks for being so helpful too, I appreciate it.

Sharphound.zip

@rvazarkar

This comment has been minimized.

Collaborator

rvazarkar commented Mar 15, 2018

Interestingly, the build you ran before had an ldap timeout double the regular, and it got an extra 150k objects compared to the first run, so this definitely looks like a client side issue

@jbfuzier

This comment has been minimized.

jbfuzier commented Mar 16, 2018

Well done !
The number of result is in the same range as with the old collector.
Did you keep the client timeout detection in this build ?

Waiting for enumeration threads to finish
Status: 845257 objects enumerated (+6257 179,4601/s --- Using 487 MB RAM )
Waiting for writer thread to finish
Finished enumeration for corp.mydaomain.net in 01:18:30.7954386
0 hosts failed ping. 0 hosts timedout.

Removed 4111261 duplicate lines from .\group_membership.csv
@rvazarkar

This comment has been minimized.

Collaborator

rvazarkar commented Mar 16, 2018

@jbfuzier

This comment has been minimized.

jbfuzier commented Mar 16, 2018

Thanks for your time.

Just for the record, here is the last messages exchanged with the non working version :

image

image

(Don't know what to make of that...)

@rvazarkar

This comment has been minimized.

Collaborator

rvazarkar commented Mar 16, 2018

Just LDAP terminating the connection, not too worried. I've committed these changes to master already. Again, thanks a lot for your patience

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment