Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Set.disjoint #1986

Merged
merged 4 commits into from Aug 13, 2018

Conversation

Projects
None yet
3 participants
@nojb
Copy link
Contributor

commented Aug 10, 2018

This adds a function to decide if two sets are disjoint. This can be done more efficiently from inside the Set module. See MPR#6450.

This version has been used at LexiFi for a long time without problems.

@nojb nojb added the stdlib label Aug 10, 2018

@gasche

This comment has been minimized.

Copy link
Member

commented Aug 10, 2018

I can see how this function works (and I believe it is correct, although I would be reassured with a couple unit tests), but I'm not sure you would ever need to join to compute disjoint. Naively I would think of the following pseudo-code:

disjoint (Node (l1, v1, r1) as t1) (Node (l2, v2, r2) as t2) =
  if v1 = v2 then false
  else if v1 < v2 then disjoint t1 l2 && disjoint r1 t2
  else if v1 > v2 then disjoint t1 r2 && disjoint l1 t2

is this less efficient?

@gasche

This comment has been minimized.

Copy link
Member

commented Aug 10, 2018

Answer: yes, this may be less efficient, as (when v1 < v2) a part of r1 will be checked against l2 twice, once as part of the first call, and once as a sub-part of the second call. So you would need something like

disjoint (Node (l1, v1, r1) as t1) (Node (l2, v2, r2) as t2) =
  if v1 = v2 then false
  else if v1 < v2 then disjoint l1 l2 && not_in v1 l2 && disjoint r1 t2
  else if v1 > v2 then disjoint r1 r2 && not_in v1 r2 && disjoint l1 t2 
@nojb

This comment has been minimized.

Copy link
Contributor Author

commented Aug 11, 2018

My understanding is that your code will perform a bit better for some inputs (thanks to avoiding the join) but potentially much worse for other ones (due to the the asymmetry in the treatment of t1 and t2).

For example, when all elements of t1 are smaller (or larger) than all elements of t2, then you will end up doing a number of recursive calls that is linear on the cardinal of the first set.

In the PR this situation is avoided by splitting t2 with respect to v1 so that at each step of the recursion one of the two recursive calls gets cut down early and so the total number of calls ends up logarithmic.

nojb added some commits Aug 13, 2018

@nojb nojb force-pushed the nojb:set_disjoint branch from a18c686 to f0c3c03 Aug 13, 2018

@gasche

gasche approved these changes Aug 13, 2018

Copy link
Member

left a comment

Good to go once CI passes.

(Would it be useful to also have Map?)

@@ -41,6 +41,9 @@ let test x s1 s2 =
(let s = S.inter s1 s2 in
fun i -> S.mem i s = (S.mem i s1 && S.mem i s2));

checkbool "disjoint"
(S.is_empty (S.inter s1 s2) = S.disjoint s1 s2);

This comment has been minimized.

Copy link
@gasche

gasche Aug 13, 2018

Member

Very nice!

@nojb nojb merged commit 894e19e into ocaml:trunk Aug 13, 2018

2 checks passed

continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details

@nojb nojb deleted the nojb:set_disjoint branch Aug 13, 2018

@@ -98,6 +98,9 @@ module type S =
val inter: t -> t -> t
(** Set intersection. *)

val disjoint: t -> t -> bool
(** Test if two sets are disjoint. *)

This comment has been minimized.

Copy link
@Armael

Armael Aug 15, 2018

Member

Sorry for arriving a bit late, but shouldn't there be a @since 4.08.0 annotation?

This comment has been minimized.

Copy link
@nojb

nojb Aug 16, 2018

Author Contributor

Fixed in 5289ee6

@vicuna vicuna referenced this pull request Mar 14, 2019

Closed

Set.disjoint #6450

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.