Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MAINT: Fix all tests #1656

Closed
wants to merge 50 commits into from
Closed

MAINT: Fix all tests #1656

wants to merge 50 commits into from

Conversation

thequackdaddy
Copy link

Hello,

So I really like this project but it seems to have been quite quiet lately.

I spent some time today un-pinning all the packages and getting the tests to pass again. One caveat is that odo has a problem. For this, I've just odo from a repository I built. Obviously, this is not ideal.

Let me know if this is helpful. This is an amazing set of tools.

@thequackdaddy
Copy link
Author

cc @postelrich

@coveralls
Copy link

coveralls commented Aug 22, 2017

Coverage Status

Changes Unknown when pulling 16a70ec on thequackdaddy:test into ** on blaze:master**.

Copy link
Contributor

@postelrich postelrich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@thequackdaddy Nice! This goes above and beyond my attempt to get 3.6 working. Do you want to make a PR for the odo fix? Once that is in we can try to get this merged pointing to that.

@@ -1,6 +1,6 @@
flask>=0.10.1
flask-cors
odo>=0.5.0
# odo>=0.5.1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would have to be uncommented before merge

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've made the pull request (see below).

Once you change this, I'll revert this change and push the update.

(For reasons I don't fully understand, pulling from my repo kept the version number below 0.5.0 so this kept failing :-/)

electronwill and others added 2 commits September 5, 2017 12:54
[docs] company name change from Continuum to Anaconda
[docs] company name change from Continuum to Anaconda
@necaris necaris mentioned this pull request Sep 6, 2017
@@ -131,14 +131,14 @@ def utcfromtimestamp(expr):


class nanosecond(DateTime): _dtype = datashape.int64
class week(DateTime): _dtype = datashape.int16
class week(DateTime): _dtype = datashape.int64
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change makes the tests pass; however, #1585 intentionally made the datetime types as narrow as possible.

I ran across your proposed fix while looking at the following failure when running the test py.test -s -x blaze/compute/tests/test_postgresql_compute.py::test_datetime_access[week] against current master of blaze, odo, and datashape:

obj = 'Attributes', message = 'Attribute "dtype" are different', left = dtype('int16')
right = dtype('int64'), diff = None

    def raise_assert_detail(obj, message, left, right, diff=None):
        if isinstance(left, np.ndarray):
            left = pprint_thing(left)
        if isinstance(right, np.ndarray):
            right = pprint_thing(right)
    
        msg = """{0} are different
    
    {1}
    [left]:  {2}
    [right]: {3}""".format(obj, message, left, right)
    
        if diff is not None:
            msg = msg + "\n[diff]: {diff}".format(diff=diff)
    
>       raise AssertionError(msg)
E       AssertionError: Attributes are different
E       
E       Attribute "dtype" are different
E       [left]:  int16
E       [right]: int64

/mnt/envs/blaze/lib/python3.4/site-packages/pandas/util/testing.py:1018: AssertionError

(It might make sense to break off making the aforementioned test pass into a separate issue/PR.)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This patch may checks the values, but ignores the dtypes.

$ git diff
diff --git a/blaze/compute/tests/test_postgresql_compute.py b/blaze/compute/tests/test_postgresql_compe.
py                                                                                                     
index 6c15703..608e8fb 100644
--- a/blaze/compute/tests/test_postgresql_compute.py
+++ b/blaze/compute/tests/test_postgresql_compute.py
@@ -886,4 +886,5 @@ def test_datetime_access(attr, sql_with_dts):
         compute(expr, sql_with_dts, return_type=pd.Series),
         getattr(compute(s.A, sql_with_dts, return_type=pd.Series).dt, attr),
         check_names=False,
+        check_dtype=False,
     )

compute(s.A, sql_with_dts, return_type=pd.Series).dt is an instance of pandas.tseries.common.DatetimeProperties which is going to return a Series with a dtype of int64 when accessing the week attribute.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#1660 applies the patch in the previous comment.



def test_summary_count(bank):
expr = by(t.name, how_many=t.amount.count())
result = compute(expr, bank)
assert result == [('Bob', 3), ('Alice', 2)]
assert ((result == [('Bob', 3), ('Alice', 2)]) or
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could just sort both the result and expectation

python-dateutil==2.5.3
pytz==2016.7
pyyaml==3.11
bcolz
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the point of this file was to pin the requirements we test with. Tests kept failing because downstream packages were being updated. I don't want to fix the whole world with every PR and it put a huge burden on whoever wanted to open the next PR.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment still stands

@@ -1,7 +1,7 @@
numpy >= 1.7
pandas >= 0.15.0
git+git://github.com/blaze/datashape.git
git+git://github.com/blaze/odo.git
git+git://github.com/thequackdaddy/odo.git@reversion
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

?

The results of the expressions compared in the test_datetime_access test do not
share the same dtype.

The computation which does the attribute lookup as part of the expression uses
the dtypes specified in `blaze.expr.datetime`.

The computation which does not do the attribute lookup returns a
`DatetimeProperties` instance; attribute lookups on those instances are all
`int64`.

Fix by setting the check dtype flag to `False`.

Add expected dtypes to the tests fixture and assert that the computation which
does the attribute lookup returns the correct type.
ehebert pushed a commit to quantopian/odo that referenced this pull request Oct 13, 2017
`discover` was returning a bytes dtype for the `sa.TIMESTAMP` instead of
`datetime_`.

That bug was because the SQLAlchemy dialect for MSSQL currently directly imports
`sa.TIMESTAMP`, causing a collision in `odo.backends.sql.revtypes`.

Fix by 1) creating a subclass of `mssql.TIMESTAMP` to use as the key in
`revtypes` so that it does not overwrite `sa.TIMESTAMP` 2) assign that subclass
to the mssql dialect's `'TIMESTAMP'` (using `ischema_names`), so that the
subclass will be returned by the type engine instead of `mssql.TIMESTAMP`.

The added test for the (`sa.TIMESTAMP`, `datetime_`) without the fix applied
would result in this error:
```
a = Bytes(), b = DateTime(tz=None), path = ('.measure', "['ts']", '.ty')
kwargs = {'check_dim': True, 'check_record_order': True}

    @assert_dshape_equal.register(object, object)
    def _base_case(a, b, path=None, **kwargs):
>       assert a == b, '%s != %s\n%s' % (a, b, _fmt_path(path))
E       AssertionError: bytes != datetime
E       path: _.measure['value'].ty
```

This patch includes checks for the other types besides `sa.TIMESTAMP` for better
protection against the general case of failures when using `revtypes` to map
SQLAlchemy types to dtypes.

Those extra cases exposed an issue with `sa.Float(precision=24)`.
That case is commented out to keep the fix of this patch on the `sa.TIMESTAMP` mapping.
(I would have used `pytest.param` to mark it xfail, but pytest for this project needs to be
upgraded first.)

See:
blaze#567
blaze#568
https://bitbucket.org/zzzeek/sqlalchemy/issues/4092/type-problem-with-mssqltimestamp
blaze/blaze#1656
llllllllll pushed a commit to blaze/odo that referenced this pull request Oct 16, 2017
`discover` was returning a bytes dtype for the `sa.TIMESTAMP` instead of
`datetime_`.

That bug was because the SQLAlchemy dialect for MSSQL currently directly imports
`sa.TIMESTAMP`, causing a collision in `odo.backends.sql.revtypes`.

Fix by 1) creating a subclass of `mssql.TIMESTAMP` to use as the key in
`revtypes` so that it does not overwrite `sa.TIMESTAMP` 2) assign that subclass
to the mssql dialect's `'TIMESTAMP'` (using `ischema_names`), so that the
subclass will be returned by the type engine instead of `mssql.TIMESTAMP`.

The added test for the (`sa.TIMESTAMP`, `datetime_`) without the fix applied
would result in this error:
```
a = Bytes(), b = DateTime(tz=None), path = ('.measure', "['ts']", '.ty')
kwargs = {'check_dim': True, 'check_record_order': True}

    @assert_dshape_equal.register(object, object)
    def _base_case(a, b, path=None, **kwargs):
>       assert a == b, '%s != %s\n%s' % (a, b, _fmt_path(path))
E       AssertionError: bytes != datetime
E       path: _.measure['value'].ty
```

This patch includes checks for the other types besides `sa.TIMESTAMP` for better
protection against the general case of failures when using `revtypes` to map
SQLAlchemy types to dtypes.

Those extra cases exposed an issue with `sa.Float(precision=24)`.
That case is commented out to keep the fix of this patch on the `sa.TIMESTAMP` mapping.
(I would have used `pytest.param` to mark it xfail, but pytest for this project needs to be
upgraded first.)

See:
#567
#568
https://bitbucket.org/zzzeek/sqlalchemy/issues/4092/type-problem-with-mssqltimestamp
blaze/blaze#1656
Eddie Hebert and others added 5 commits October 16, 2017 17:32
The `datetime.second` attribute was truncating to second resolution, instead of
including subsecond values.

i.e. `second` attribute of a datetime `2014-01-01 00:00:00.042` would return `0`
instead of `0.042`

Fix by changing the type of `datetime.Second` from `int64` to `float64`.

Returning a float was the behavior before the `compute_up` from `DateTime` ->
`ColumnElement` begain enforcing the specified type via a `CAST`.
msgpack seems to have moved
@coveralls
Copy link

coveralls commented Oct 20, 2017

Coverage Status

Changes Unknown when pulling 733e469 on thequackdaddy:test into ** on blaze:master**.

@postelrich
Copy link
Contributor

@thequackdaddy @ehebert @llllllllll I'm at a loss for why only 3.4 is failing with that mysql error. I tried setting LOCAL as well as set global local_infile=OK. I also made sure they're all running the latest pymysql, yet it still fails. I'm open to suggestions.

Since we're adding 3.6, any objection to removing testing for 3.4? I see a lot of projects don't make builds for 3.4.

@llllllllll
Copy link
Member

We are still using 3.4 for our blaze servers at Quantopian. We don't use mysql so I would be fine with skipping those tests on 3.4 while we migrate to 3.6.

@ehebert
Copy link
Contributor

ehebert commented Oct 20, 2017

@llllllllll, I agree, skipping the mysql tests for 3.4 is a good interim solution.

@coveralls
Copy link

coveralls commented Oct 20, 2017

Coverage Status

Changes Unknown when pulling c06ac09 on thequackdaddy:test into ** on blaze:master**.

@coveralls
Copy link

Coverage Status

Changes Unknown when pulling 834aaec on thequackdaddy:test into ** on blaze:master**.

4 similar comments
@coveralls
Copy link

Coverage Status

Changes Unknown when pulling 834aaec on thequackdaddy:test into ** on blaze:master**.

@coveralls
Copy link

Coverage Status

Changes Unknown when pulling 834aaec on thequackdaddy:test into ** on blaze:master**.

@coveralls
Copy link

Coverage Status

Changes Unknown when pulling 834aaec on thequackdaddy:test into ** on blaze:master**.

@coveralls
Copy link

coveralls commented Oct 20, 2017

Coverage Status

Changes Unknown when pulling 834aaec on thequackdaddy:test into ** on blaze:master**.

@postelrich
Copy link
Contributor

@llllllllll @ehebert @necaris Someone want to review?

@llllllllll
Copy link
Member

@postelrich I will review this on monday

@postelrich
Copy link
Contributor

Also need to do an odo py3.6 release too.

@llllllllll
Copy link
Member

I am fine with everything except the change to requirements_ci.txt. in the past not installing the same test requirements has wasted so much time and discouraged people from contributing.

@coveralls
Copy link

Coverage Status

Changes Unknown when pulling 16b5591 on thequackdaddy:test into ** on blaze:master**.

1 similar comment
@coveralls
Copy link

coveralls commented Oct 24, 2017

Coverage Status

Changes Unknown when pulling 16b5591 on thequackdaddy:test into ** on blaze:master**.

@postelrich
Copy link
Contributor

@llllllllll doesn't look like there are a lot of packages built for all 2.7, 3.4, 3.5, and 3.6. If you know a good way to come up with fixed dependencies, please go for it. Otherwise, we can get this PR merged in without pegged deps, and add it back in another PR when we drop 3.4 support.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants