Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Faulty test for Grade School #151

Closed
mambocab opened this issue Dec 16, 2014 · 21 comments · Fixed by #152
Closed

Faulty test for Grade School #151

mambocab opened this issue Dec 16, 2014 · 21 comments · Fixed by #152

Comments

@mambocab
Copy link
Contributor

Problem

The README for the Grade School exercise states that the School object should be able to

[g]et a sorted list of all students in all grades. Grades should sort as 1, 2, 3, etc., and students within a grade should be sorted alphabetically by name.

However, the corresponding test does not enforce sorting by grade, since dict objects are unordered:

sorted_students = {
    3: ("Kyle",),
    4: ("Christopher", "Jennifer",),
    6: ("Kareem",)
}
self.assertEqual(sorted_students, self.school.sort())

Goal

This test should enforce the grade-wise ordering of the result of School.sort as stated in the README.

Proposed Solutions

I can think of several possible solutions:

  • change the sorted_students variable to a list of tuples, i.e.
    sorted_students = [(3, ("Kyle")), (4, ("Christopher", "Jennifer")), (6, ("Kareem"))],
  • compare against an analogous OrderedDict, which would teach users about the collections module in the standard library,
  • pass the result of the sort() call to an OrderedDict, then make sure the result is correct, i.e.
sorted_students = OrderedDict((3, ("Kyle",),)
                              (4, ("Christopher", "Jennifer",),)
                              (6, ("Kareem",)))
self.assertEqual(sorted_students, OrderedDict(self.school.sort()))

I'm partial to this last solution, since it allows sort to return any ordered iterable where each element is a 2-element ordered iterable of the form grade, student_tuple. I'd say this pretty Pythonic; we don't care whether the return value is a list, tuple, OrderedDict, generator, etc., just that it's an ordered iterable with the values in the correct order.

@sjakobi
Copy link
Contributor

sjakobi commented Dec 16, 2014

Well spotted! 👍

Leaving the choice of iterable to the user would indeed be nice. But I fear that some will still use a simple dict and pass the tests by chance when the iteration through the dict happens to be in sorted order.
It would of course be possible to test whether the return type for sort is dict or not…

What do you think?

@mambocab
Copy link
Contributor Author

Good catch! I think checking the type of the return value is reasonable:

expected = OrderedDict((3, ("Kyle",)),
                       (4, ("Christopher", "Jennifer",)),
                       (6, ("Kareem",)))
actual = self.school.sort()
self.assertFalse(type(actual) == dict, msg="return value of 'sort' cannot be of type 'dict' because 'dict's are unordered")
self.assertEqual(sorted_students, OrderedDict(actual))

I can't think of any other common unordered types users might return that would lead to false negatives, so we can probably just check against dict.

I figure the message might be helpful guidance for some users who haven't wrapped their heads around dict's lack of order, but I defer to maintainers to decide whether to use a message, and to decide on the best wording of a message.

Unless someone gets to it first, I'm happy to submit a PR with this change once a decision is reached on how to implement it. Are any changes required other than updating the test and the example implementation?

@sjakobi
Copy link
Contributor

sjakobi commented Dec 17, 2014

The message (msg) is just fine!
But it seems to me that assertNotEqual would be a more fitting method than assertFalse.

Are any changes required other than updating the test and the example implementation?

No :).

A PR would be very welcome!

@Dog
Copy link
Contributor

Dog commented Dec 17, 2014

The example isn't valid (at least in python 2).

$ python test.py
Traceback (most recent call last):
  File "test.py", line 5, in <module>
    (6, ("Kareem",)))
  File "c:\Python27\lib\collections.py", line 45, in __init__
    raise TypeError('expected at most 1 arguments, got %d' % len(args))
TypeError: expected at most 1 arguments, got 3

I think another potential false negative is a set (though it depends on the change. I'm a little confused from the example). The best way I can think to test for order is to check if reversed is callable. The only way you can (sanely) implement reversed is if there is an order.

Python 2.7.8 (default, Jun 30 2014, 16:03:49) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> test = {1, 2, 3}
>>> def go(arg):
...     for k in arg:
...         print k
...
>>> go(test)
1
2
3
>>> go(reversed(test))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: argument to reversed() must be a sequence

but

>>> test = ['a', 'b', 'c']
>>> go(test)
a
b
c
>>> go(reversed(test))
c
b
a

also for further sanity, here is reversed on dictionary and OrderedDictionary

>>> test = dict(a=1, b=2, c=3)
>>> reversed(test)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: argument to reversed() must be a sequence

>>> from collections import OrderedDict
>>> test = OrderedDict(a=1, b=2, c=3)
>>> reversed(test)
<generator object __reversed__ at 0x02A73DC8>

I kind of don't like putting OrderedDict in the test case because I think its the first thing people will read through when solving the problem. I worry people will just use OrderedDict and not stop to think about why its important.

@Dog
Copy link
Contributor

Dog commented Dec 17, 2014

I did a bit of looking and this may work:

from collections import Sequence
assert isinstance(obj, Sequence)

Example:

>>> isinstance(list(), Sequence)
True

However, this test fails for an OrderedDict:

>>> test = OrderedDict(a=1, b=2, c=3)
>>> isinstance(test, Sequence)
False

I think this may be the best way to solve the issue for now:

from collections import Sequence
self.assertTrue(isinstance(obj, Sequence) or callable(getattr(obj, '__reversed__', False)))

Examples:

>>> test = OrderedDict(a=1, b=2, c=3)
>>> isinstance(test, Sequence) or callable(getattr(test, '__reversed__', False))
True

>>> isinstance('test', Sequence) or callable(getattr('test', '__reversed__', False))
True

>>> isinstance(list(), Sequence) or callable(getattr(list(), '__reversed__', False))
True

>>> isinstance(set(), Sequence) or callable(getattr(set(), '__reversed__'), False)
False

>>> isinstance(dict(), Sequence) or callable(getattr(dict(), '__reversed__'), False)
False

@mambocab
Copy link
Contributor Author

The example isn't valid...

My mistake - I think I was editing on a tablet 😛. Corrected OrderedDict call here:

sorted_students = OrderedDict([(3, ("Kyle")),
                               (4, ("Christopher", "Jennifer")),
                               (6, ("Kareem"))])

I'm a little confused from the example

Sorry about that; let me know what I can clarify!

callable(getattr(obj, '__reversed__', False)))

Any reason to use this over hasattr(obj, '__reversed__')?

The isinstance... or hasattr... approach seems a reasonable type-check.

I worry people will just use OrderedDict and not stop to think about why its important.

True, but they'll also learn about the collections module. Just a thought -- I don't know if that's a priority for this exercise.

The benefit of using OrderedDict was that it would consume its input and put it in a checkable format regardless of its type. That is to say, it would treat all of these the same:

  • OrderedDict([('a', 1), ('b', 2), ('c', 3)])
  • [('a', 1), ('b', 2), ('c', 3)]
  • (('a', 1), ('b', 2), ('c', 3))
  • generate_pairs()

where generate_pairs is:

def generate_pairs():
    for p in [('a', 1), ('b', 2), ('c', 3)]:
        yield p

By my reckoning, these are all properly-sorted values, and using them as the input to an OrderedDict effectively normalizes them for correctness checking.

Though I may be overthinking the problem -- the spec does say that you should be able to "get a sorted list" [emphasis mine]. So maybe checking could be limited to something like

actual = self.school.sort()
self.assertEqual(actual[0], (3, ("Kyle")))
self.assertEqual(actual[1], (4, ("Christopher", "Jennifer")))
self.assertEqual(actual[2], (6, ("Kareem")))

which would disallow generator functions and OrderedDicts. That might be ok! I defer to the maintainers on this kind of decision.

@Dog
Copy link
Contributor

Dog commented Dec 18, 2014

The reason I did callable(getattr(obj, '__reversed__', False))) was for sanity. Here is where hasattr would fail:

>>> class Example:
...     def __init__(self):
...         self.__reversed__ = 'Some men just want to watch the world burn'
...
>>> hasattr(Example(), '__reversed__')
True
>>> callable(getattr(Example(), '__reversed__', False))
False

I was thinking that the test could be written similar to your last example.

Here is a version I wrote up quickly:

def test_sort_school(self):
    students = {
        3: ("Kyle",),
        4: ("Christopher", "Jennifer",),
        6: ("Kareem",)
    }

    for grade, students in students.iteritems():
        for student in students:
            self.school.add(student, grade)

    result = self.school.sort()

    # Attempts to catch false positives
    self.assertTrue(isinstance(result, Sequence) or callable(getattr(result, '__reversed__', False)))

    previous_key = None
    for key, value in result:
        self.assertTrue(previous_key < key)
        previous_key = key
        self.assertEqual(students[key], value)

I can't test it at the moment because I have three minutes of battery life left.

This isn't to say that using an OrderedDict is a bad idea. Functionality-wise it works for what is needed. I just think it may be better to avoid mentioning it since the test cases are provided.

@mambocab
Copy link
Contributor Author

self.__reversed__ = ...

I don't see why someone would do this (unless, as you say, they want to watch the world burn), I don't think checking __reversed__ is callable does any harm.

for key, value in result:
...

Beautiful. Communicates the intent much better than my proposal.

I've tested this with several small changes:

def test_sort_school(self):
    students = {
        3: ("Kyle",),
        4: ("Christopher", "Jennifer",),
        6: ("Kareem",)
    }

    for grade, students_in_grade in students.items():
        for student in students_in_grade:
            self.school.add(student, grade)

    result = self.school.sort()

    # Attempts to catch false positives
    self.assertTrue(isinstance(result, Sequence) or
                    callable(getattr(result, '__reversed__', False)))

    try:
        result = result.items()
    except AttributeError:
        pass

    previous_key = min(students.keys()) - 1
    for key, value in result:
        self.assertTrue(previous_key < key)
        previous_key = key
        self.assertEqual(students[key], value)

Changes:

  • iterates over students.items() for Python 3 compatibility
  • uses students_in_grade to avoid clobbering students
  • sets result to result.items() if it has that attribute -- i.e. if it's an OrderedDict, since iterating over an OrderedDict just iterates over the keys
  • initializes previous_key with a meaningful number instead of None.

Works on 2.7 and 3.4 on my machine.

@Dog
Copy link
Contributor

Dog commented Dec 19, 2014

I'd argue that None is more meaningful because its not a number. There is no previous key on the initial iteration.

None < any_integer is always True. It also prevents having to iterate through all the keys an extra time to figure out which is the minimum, then decrementing to simply pass the first check.

I think setting previous_key to -1 would be the next best solution, because you can not have a negative grade and it still avoids the extra iterating required for min.

I'll take a closer look at it when I get home tonight.

@sjakobi
Copy link
Contributor

sjakobi commented Dec 19, 2014

If I understand the tests correctly, they'd currently allow an empty sequence to pass.
Wouldn't it be easier to call list on result and compare that against a constant?

@Dog
Copy link
Contributor

Dog commented Dec 19, 2014

An empty sequence would pass since we only iterate the keys in result.

We can't pass result to list though, because then we lose the values:

>>> test = {'Animal': 'Dog'}
>>> print list(test)
['Animal']

Unless I misunderstand your suggestion.

@sjakobi
Copy link
Contributor

sjakobi commented Dec 19, 2014

Ok. Maybe something like result_list = list(result.items() if hasattr(result, "items") else result)?

@Dog
Copy link
Contributor

Dog commented Dec 19, 2014

I'm trying to think of what case .items() may not work for, and if using result may cause an issue. The only one I can think of is if someone decided to write their own custom container for their solution - which is a bit overkill. In that case they probably realize they would need a .items() method looking at the test case.

Also result.items() is already a list:

>>> test = {'Animal': 'Dog'}
>>> list(test.items())
[('Animal', 'Dog')]
>>> test.items()
[('Animal', 'Dog')]

So we could probably just check .items() against a constant.

@sjakobi
Copy link
Contributor

sjakobi commented Dec 19, 2014

In Python 3.4 items returns some kind of iterator:

In [2]: o = OrderedDict({1: ('Alice',)})

In [3]: o.items()
Out[3]: ItemsView(OrderedDict([(1, ('Alice',))]))

@mambocab
Copy link
Contributor Author

None is more meaningful

Fair enough! Never seen it used that way is all.

We could probably just check .items() against a constant.

That'd deal with the case where .sort returns something empty.

If we use @sjakobi's suggested result_list = ... implementation, then compare the result_list to a constant, then we've covered lists, tuples, generator functions, and OrderedDicts, I believe. That's good enough for me. I'd expect users to either be new enough to stick to the basic datatypes or experienced enough to read the test and work with the API it requires.

@sjakobi
Copy link
Contributor

sjakobi commented Dec 19, 2014

Updated test case:

    def test_sort_school(self):
        students = [
            (3, ("Kyle",)),
            (4, ("Christopher", "Jennifer",)),
            (6, ("Kareem",))
        ]

        for grade, students_in_grade in students:
            for student in students_in_grade:
                self.school.add(student, grade)

        result = self.school.sort()

        # Attempts to catch false positives
        self.assertTrue(isinstance(result, Sequence) or
                        callable(getattr(result, '__reversed__', False)))

        result_list = list(result.items() if hasattr(result, "items")
                           else result)

        self.assertEqual(result_list, students)

The test case still fails (at least with Python3.4) when sort returns a generic generator, because it neither is an instance of Sequence nor has the __reversed__ attribute.

@mambocab
Copy link
Contributor Author

The test case still fails (at least with Python3.4) when sort returns a generic generator, because it neither is an instance of Sequence nor has the __reversed__ attribute.

Good catch. How about types.GeneratorType? So the type check would be

from types import GeneratorType
...
        self.assertTrue(isinstance(result, Sequence) or
                        isinstance(result, GeneratorType) or
                        callable(getattr(result, '__reversed__', False)))

This still prevents false positives from sets and dicts:

>>> isinstance({}, GeneratorType)
False
>>> isinstance(set(), GeneratorType)
False

This holds on 3.4 and 2.7.

@mambocab
Copy link
Contributor Author

Anyone have any more revisions to propose? If not, let's do this. I'm happy to put together the PR unless someone else wants to grab the glory.

@sjakobi
Copy link
Contributor

sjakobi commented Dec 23, 2014

I just realized that we haven't talked about the packaging of the students for each grade. Do we allow tuples, lists, generators?

At the same time I'm under the impression that our test case is already complicated enough…

@mambocab
Copy link
Contributor Author

I'm ok with enforcing the use of a tuple and keeping complexity down. Any objections?

@sjakobi
Copy link
Contributor

sjakobi commented Dec 23, 2014

I'm ok with enforcing the use of a tuple and keeping complexity down.

Me too!

Looking forward to your PR!

@sjakobi sjakobi closed this as completed Dec 23, 2014
@sjakobi sjakobi reopened this Dec 23, 2014
mambocab added a commit to mambocab/xpython that referenced this issue Dec 24, 2014
The specification requires that the code return "a sorted list of all
students in all grades". However, the previous test compared the result
against a dictionary, which doesn't enforce that order. This commit
addresses that.

closes exercism#151.

Made possible by great feedback from @Dog and @sjakobi. See this thread:

exercism#151
mambocab added a commit to mambocab/xpython that referenced this issue Dec 24, 2014
The specification requires that the code return "a sorted list of all
students in all grades". However, the previous test compared the result
against a dictionary, which doesn't enforce that order. This commit
addresses that.

closes exercism#151.

Made possible by great feedback from @Dog and @sjakobi. See this thread:

exercism#151
mambocab added a commit to mambocab/xpython that referenced this issue Dec 24, 2014
The specification requires that the code return "a sorted list of all
students in all grades". However, the previous test compared the result
against a dictionary, which doesn't enforce that order. This commit
addresses that.

closes exercism#151.

Made possible by great feedback from @Dog and @sjakobi. See this thread:

exercism#151
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants