Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

new list and set functions #373

Merged
merged 20 commits into from
May 27, 2016
Merged

new list and set functions #373

merged 20 commits into from
May 27, 2016

Conversation

poke1024
Copy link
Contributor

This implements Gather[], GatherBy[], Tally[], Union[], Intersect[], IntersectingQ[], DisjointQ[].

It also updates the existing implementations of Complement[] and DeleteDuplicates[] so that they work in n log n if SameTest is SameQ (as opposed to the old implementations who where n^2).

It also implements SortBy[] and BinarySearch[].

return

if list.get_head_name() != 'System`List':
evaluation.error(self.get_name(), 'needlist')
Copy link
Member

@sn6uv sn6uv May 15, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should be return evaluation.message(self.get_name(), 'list', expr, 1)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't understood the difference between evaluation.message and evaluation.error yet. Is message more like a warning or is it also used for errors? I'm just wondering if the thing above should be an evaluation.error after all?

Copy link
Member

@sn6uv sn6uv May 19, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

evaluation.message acts as both a warning and an error message. It's used whenever evaluation can continue. Normally continuing just means return the expression unevaluated. In this case for example, there is no evaluation 'error', it's just that the function doesn't apply to all inputs.

evaluation.error on the other hand is used when there is an error with the evaluation process itself. For example, if the recursion depth gets above $RecursionLimit then that's an evaluation error since evaluation can't continue. We only use evauation.error when the calculation has to be aborted.

We try to keep the return values and messages compatible with Mathematica:

In[1]:= Gather[a + b + c + a]
Gather::list: List expected at position 1 in Gather[2 a + b + c].
Out[1]= Gather[2 a + b + c]

Does that make sense?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I think I understand, so it usually be evaluation.message.

@poke1024
Copy link
Contributor Author

I redid the evaluation.message stuff, and I added additional checks in SortBy[] with an error message guarding against bad things happening in the Map[].

@sn6uv
Copy link
Member

sn6uv commented May 19, 2016

Okay, I'll have a final look over tomorrow but I think this looks good to merge.

@sn6uv
Copy link
Member

sn6uv commented May 20, 2016

I think a better place for the hash tests would be e.g. test/test_hash.py.

@poke1024
Copy link
Contributor Author

Moved there now. And it runs without explicitly hooking it up. This is nice.

@sn6uv
Copy link
Member

sn6uv commented May 21, 2016

Some performance verification for Gather:
verified_gather_performance
Note the expected O(n) performance when there are both few or many collisions.

yield ea


class Intersect(_SetOperation):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor typo: the name of this function is Intersection.

Bin = self._bin

select = equivalence.select()
same = equivalence.same()
Copy link
Member

@bnjones bnjones May 22, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unless I'm missing something, the usage of .select() and .same() here seems more complicated than it needs to be. I don't think these need to return closures here.

How about refactoring these methods so this function can just call equivalance.select(elem) and equivalence.same(prototype, elem)? This would also allow _make_test_pair() to be folded into _SlowEquivalence.same().

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that the code is less readable than it should be here. I'm always concerned about Python name lookups in inner loops, that's why I came up with these lambdas in the first place, but I think the code's much cleaner without them. Changed accordingly.

@sn6uv
Copy link
Member

sn6uv commented May 25, 2016

I'll do some final tests with this but I think it's looking good to merge.

@sn6uv
Copy link
Member

sn6uv commented May 26, 2016

One minor issue I found with Intersection. When a different SameTest is used only one representative from each equivalence class should be taken. For example:
Intersection[{2, -2, 1, 3, 1}, {2, 1, -2, -1}, SameTest -> (Abs[#1] == Abs[#2] &)]
gives {-2, 1, 1, 2} expected {1, 2}.
The same issue holds for for Union.
Edit: See poke1024#1 for fix.

sn6uv and others added 2 commits May 26, 2016 20:51
@sn6uv sn6uv merged commit d1f19d4 into mathics:master May 27, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants