WIP Py3 #280

Open
wants to merge 53 commits into
base: master
from

@borrob
Collaborator

borrob commented Sep 17, 2019

Please do NOT approve this pull request. This is Work In Progress for the python3 conversion. I have an issue with one specific test and we want to see if Travis gets the same error or if it passes without problems.

@justb4

Member

justb4 commented Sep 19, 2019

The Travis failure is only a flake8 error.

@borrob

Collaborator Author

borrob commented Sep 19, 2019

:) yes, so the normal testing passes, so I guess it is something with my setup that the one test fails on my machine.

I haven't yet had much time to work on the docker issue, but otherwise I would say we are almost there.

@justb4

Member

justb4 commented Oct 10, 2019

Started testing locally with py3 branch, will add findings in comments:

1. all tests run OK

2. filter() returns a lazy filter object (an iterator), not a list, in Py3

  File "/Users/just/project/geohealthcheck/borrob.git/GeoHealthCheck/plugins/probe/wfs.py", line 146, in expand_params
    if len(ft_namespaces) > 0:
TypeError: object of type 'filter' has no len()

3. OWSLib API call error TMS

  File "/Users/just/project/geohealthcheck/borrob.git/GeoHealthCheck/plugins/probe/tms.py", line 110, in get_metadata
    return TileMapService(resource.url, version=version)
  File "/Users/just/.pyenv/versions/3.7.1/envs/geohealthcheck3.7.1/lib/python3.7/site-packages/owslib/tms.py", line 61, in __init__
    self.version, url=self.url, un=self.username, pw=self.password
AttributeError: 'TileMapService' object has no attribute 'username'
  • Investigated: caused by the OWSLib upgrade from 0.17.1 to 0.18.0 in GHC. This is an OWSLib issue; opened geopython/OWSLib#614.

4. Basic Auth Fails

  • add a Resource URL that requires Basic Auth
  • in Edit configure Basic Authentication: Username, Password, Save
  • Test, is False
  • See error message in report:
"message": "Perform_request Err: TypeError expected bytes-like object, not str",
borrob added 5 commits Oct 11, 2019
Code review feedback by @justb4:

- add a WFS e.g. https://geodata.nationaalgeoregister.nl/aan/wfs
- in Edit try to add any Probe for bbox query
- see error in UI (cannot add Probe)
- in log

```
File "/Users/just/project/geohealthcheck/borrob.git/GeoHealthCheck/plugins/probe/wfs.py", line 146, in expand_params
    if len(ft_namespaces) > 0:
TypeError: object of type 'filter' has no len()
```

- solution: in https://github.com/borrob/GeoHealthCheck/blob/py3/GeoHealthCheck/plugins/probe/wfs.py#L142 replace line with ft_namespaces = list(filter(None, list(ft_namespaces)))
- will look for more filter() calls that assume a list result... (ok, no other occurrences)

Fixed as suggested.
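
The behaviour behind this traceback can be reproduced in isolation. The snippet below is a minimal sketch, not the code from `wfs.py`: the contents of `ft_namespaces` are made-up sample values, and only the `list(filter(None, ...))` wrapping mirrors the suggested fix.

```python
# Hypothetical sample data; in GHC the values come from the WFS metadata.
ft_namespaces = ['app', '', None, 'gml']

# In Python 3, filter() returns a lazy iterator, so len() is not available.
lazy = filter(None, ft_namespaces)
try:
    len(lazy)
except TypeError as err:
    print(err)  # object of type 'filter' has no len()

# Wrapping the call in list() restores the Python 2 behaviour.
cleaned = list(filter(None, ft_namespaces))
print(len(cleaned))  # 2
```

Passing `None` as the predicate drops all falsy entries (empty strings and `None` here), which is why two items remain.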
Code review feedback by @justb4:

- add TMS e.g. https://geodata.nationaalgeoregister.nl/tiles/service/tms/1.0.0
- add Probe for any GetTile
- exception:
```
File "/Users/just/project/geohealthcheck/borrob.git/GeoHealthCheck/plugins/probe/tms.py", line 110, in get_metadata
  return TileMapService(resource.url, version=version)
File "/Users/just/.pyenv/versions/3.7.1/envs/geohealthcheck3.7.1/lib/python3.7/site-packages/owslib/tms.py", line 61, in __init__
  self.version, url=self.url, un=self.username, pw=self.password
AttributeError: 'TileMapService' object has no attribute 'username'
```

-  Investigated: caused by the OWSLib upgrade from 0.17.1 to 0.18.0 in GHC. This is an OWSLib issue; opened geopython/OWSLib#614.

Fixed by downgrading to OWSLib 0.17.1. I made a note in the requirements.txt to
only upgrade to a new version when a solution to issue 614 is included.
From code review @justb4:

- add a Resource URL that requires Basic Auth
- in Edit configure Basic Authentication: Username, Password, Save
- Test, is False
- See error message in report:

```
"message": "Perform_request Err: TypeError expected bytes-like object, not str",

```
- analysis: may be due to: https://www.reddit.com/r/learnpython/comments/7qa0at/typeerror_a_byteslike_object_is_required_not_str/ , see comment: python version you're using is python3, while your code is for python2.
  ...the base64 lib now operates on bytes, not strings, and bytes.replace() only accepts bytes arguments, not str
- think this needs a fix here: https://github.com/borrob/GeoHealthCheck/blob/py3/GeoHealthCheck/util.py#L230 will investigate..

FIX:
Fixed by an edit to the resourceauths plugin: explicitly converting the string to bytes
@justb4

Member

justb4 commented Oct 11, 2019

Tested after pulling above commits.

5 - HttpStatusNoError Never detects errors

Was testing Basic Auth and was curious why the HTTPGet Probe succeeded even without credentials. We fell into the division trap going from Python 2 to Python 3! The problem is this line in checks.py:

    def perform(self):
        """Default check: Resource should at least give no error"""
        status = self.probe.response.status_code
==>     overall_status = status / 100
        if overall_status in [4, 5]:
            self.set_result(False, 'HTTP Error status=%d' % status)

In Python 2, / on two ints gives the floor value, but in Python 3 it gives a float.
The fix needs to be: overall_status = status // 100 AFAICS.
(Now Basic Auth always fails, even with right credentials, but is other (encoding?) issue...)
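
The difference can be seen with a few sample status codes. This is a minimal sketch: `is_http_error` is a hypothetical helper written for illustration, not a function from `checks.py`.

```python
def is_http_error(status_code: int) -> bool:
    """Return True for 4xx and 5xx responses, using floor division."""
    return status_code // 100 in (4, 5)


status = 401  # e.g. a Basic Auth request without credentials

# True division yields a float in Python 3, so the membership test fails.
print(status / 100)            # 4.01
print(status / 100 in [4, 5])  # False

# Floor division yields the status class as an int.
print(status // 100)           # 4
print(is_http_error(status))   # True
```

Note that exact multiples of 100 (400, 500) would still match with `/`, because `4.0 == 4` in Python; every other error code, such as 401 or 503, slips through.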

ad 4 - Basic Auth encoding gives wrong HTTP header value

For a correctly encoded header string like 'Basic aWF...XS1\n'
this line gives something like 'Basic b\'aWF...XS1\\n\''
(the \n is later always stripped).

@justb4


justb4 commented on 5b6722f Oct 11, 2019

Ok, thought OWSLib==0.17.1 was Python2-only but apparently not.

borrob added 4 commits Oct 11, 2019
Code review by @justb4:
`checks.py` does not report an HTTP status error when it should.
This is due to the difference in division between Python 2 and 3: Python 2 gives
the floor, while Python 3 gives a float.

-> Fixed as suggested
Code review by @justb4:
For a correctly encoded header string like 'Basic aWF...XS1\n'
this line gives something like 'Basic b\'aWF...XS1\\n\''
(the \n is later always stripped).

Fixed by formatting the return value
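
Both Basic Auth symptoms in this thread (the bytes-like `TypeError` and the `b'...'` leaking into the header) come from mixing `str` and `bytes` around base64. The following is a minimal sketch, assuming the header is built with `base64.b64encode`; the actual GHC code may use a different helper, and `basic_auth_header` and the credentials are hypothetical.

```python
import base64


def basic_auth_header(username: str, password: str) -> str:
    """Build an HTTP Basic Authorization header value as a str."""
    # base64 works on bytes in Python 3, so encode the credentials first.
    credentials = f'{username}:{password}'.encode('utf-8')
    # b64encode returns bytes; decode before formatting into the header.
    token = base64.b64encode(credentials).decode('ascii')
    return f'Basic {token}'


# Formatting the raw bytes result leaks the bytes repr into the header.
token_bytes = base64.b64encode(b'user:secret')
broken = 'Basic %s' % token_bytes
print(broken)                                # Basic b'dXNlcjpzZWNyZXQ='

print(basic_auth_header('user', 'secret'))   # Basic dXNlcjpzZWNyZXQ=
```

Decoding with `ascii` is safe here because base64 output only contains ASCII characters.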
@justb4

Member

justb4 commented Oct 12, 2019

Ok, tested again: issues 3 (OWSLib TMS) and 5 (HTTP status check) above are solved. Issue 4 (auth encoding) is solved only with SQLite as the DB.

6 - Auth info Decoding/Encoding Failure with Postgres

Only occurs when using Postgres, not with SQLite. Background: GHC encodes/encrypts a generic auth info dict as a JSON string that is stored in a text field in the DB, and decodes/decrypts it when reading. This way we can support multiple auth types with a single auth column in the resource table. So this is a different encoding than the one in 4 for HTTP auth headers, but I think it is a similar problem.
We also need to deal with existing PG DBs that have auth columns already present in resource table.

  • install psycopg2 : pip install psycopg2
  • used existing PG DB, only changed in config_site.py: SQLALCHEMY_DATABASE_URI = 'postgresql://name:passw@localhost:5432/ghc', but may create new
  • add a URL Resource (the URL does not even need to require Basic Auth)
  • in Edit add Basic Auth Username and Password
  • click Save
  • Exception:
  File "/Users/just/project/geohealthcheck/borrob.git/GeoHealthCheck/models.py", line 584, in auth_type
    return self.auth['type']
  File "/Users/just/project/geohealthcheck/borrob.git/GeoHealthCheck/models.py", line 578, in auth
    return ResourceAuth.decode(self._auth)
  File "/Users/just/project/geohealthcheck/borrob.git/GeoHealthCheck/resourceauth.py", line 87, in decode
    raise err
  File "/Users/just/project/geohealthcheck/borrob.git/GeoHealthCheck/resourceauth.py", line 83, in decode
    s = decode(APP.config['SECRET_KEY'], encoded)
  File "/Users/just/project/geohealthcheck/borrob.git/GeoHealthCheck/util.py", line 247, in decode
    string = base64.urlsafe_b64decode(string + b'===')
TypeError: can only concatenate str (not "bytes") to str

Analysis: problem appears in decode() (and probably encode() as well) in util.py :

def decode(key, string):
    string = base64.urlsafe_b64decode(string + b'===')
    string = string.decode('latin') if six.PY3 else string
    encoded_chars = []
    for i in range(len(string)):
        key_c = key[i % len(key)]
        encoded_c = chr((ord(string[i]) - ord(key_c) + 256) % 256)
        encoded_chars.append(encoded_c)
    encoded_string = ''.join(encoded_chars)
    return encoded_string

With SQLite, string is of type bytes, but with PG it is of type str.
Tried something like:

def decode(key, string):
    if type(string) is not bytes:
        string = string.encode()
    string = base64.urlsafe_b64decode(string + b'===')

But then I get a different error on Edit and Test:

  File "/Users/just/project/geohealthcheck/borrob.git/GeoHealthCheck/util.py", line 247, in decode
    string = base64.urlsafe_b64decode(string + b'===')
  File "/Users/just/.pyenv/versions/3.7.1/lib/python3.7/base64.py", line 133, in urlsafe_b64decode
    return b64decode(s)
  File "/Users/just/.pyenv/versions/3.7.1/lib/python3.7/base64.py", line 87, in b64decode
    return binascii.a2b_base64(s)
binascii.Error: Invalid base64-encoded string: number of data characters (213) cannot be 1 more than a multiple of 4
  • don't think six is required if we do Py3-only?
@borrob

Collaborator Author

borrob commented Oct 16, 2019

Weird... I would expect SQLAlchemy to deal with the abstraction and get the string/byte conversion right regardless of which database is used. I will look into it.

Agree: we should be able to drop six.

borrob added 2 commits Oct 18, 2019
From code review by @justb4:
6 - Auth info Decoding/Encoding Failure with Postgres

Only occurs when using Postgres, not with SQLite. Background: GHC
encodes/encrypts a generic auth dict info structure via JSON string to be stored
as textfield in DB. It decodes/decrypts when reading. This way we can support
multiple auth types with a single auth column in resource table. So this is
another encoding than in 4 for HTTP auth headers, but think similar problem.
We also need to deal with existing PG DBs that have auth columns already present
in resource table.

-  install psycopg2 : pip install psycopg2
-  used existing PG DB, only changed in config_site.py: SQLALCHEMY_DATABASE_URI
   = 'postgresql://name:passw@localhost:5432/ghc', but may create new
-  add URL Resource with Basic Auth (even does not have to have basic auth)
-  in Edit add Basic Auth Username and Password
-  click Save

Fixed by ensuring encode/decode takes `string` as input and gives `string` as
output. These methods now use static typing. The origin of the problem was the
difference of how SQLite and Postgres store the encoded string (either as text
or in some binary form). This solution makes the encoded authentication a
string object and *not* bytes.
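
The str-in, str-out contract described in this commit message can be sketched as follows. This is an illustration under stated assumptions, not the code from the PR: `decode` follows the key-shift logic quoted earlier in the thread, `encode` is an assumed inverse of it, and the key and payload are made-up sample values.

```python
import base64


def encode(key: str, plain: str) -> str:
    """Shift each character by the matching key character, then base64-encode.

    Assumed inverse of decode(); takes str and returns str.
    """
    shifted = ''.join(
        chr((ord(c) + ord(key[i % len(key)])) % 256)
        for i, c in enumerate(plain)
    )
    # latin-1 maps code points 0..255 one-to-one onto single bytes.
    raw = shifted.encode('latin-1')
    return base64.urlsafe_b64encode(raw).decode('ascii')


def decode(key: str, encoded: str) -> str:
    """Reverse encode(); takes str and returns str."""
    raw = base64.urlsafe_b64decode(encoded.encode('ascii'))
    shifted = raw.decode('latin-1')
    return ''.join(
        chr((ord(c) - ord(key[i % len(key)]) + 256) % 256)
        for i, c in enumerate(shifted)
    )


# Hypothetical sample values for illustration only.
key = 'not-the-real-secret'
payload = '{"type": "Basic", "data": {"username": "u", "password": "p"}}'

token = encode(key, payload)
print(type(token).__name__)           # str
print(decode(key, token) == payload)  # True
```

Because the stored value is always a `str`, it no longer matters whether the database driver hands back text or a binary type, as long as the column itself holds text. The sketch assumes the payload only contains characters below code point 256, which holds for ASCII JSON.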
@borrob

Collaborator Author

borrob commented Oct 21, 2019

We're not there yet: I noticed some paver commands still need an update because of issues with the importing of modules. Also: the docker image is building, but I haven't checked yet if it is actually working. I had to change the gunicorn configuration on a non-docker deploy.

@justb4


justb4 commented on e47a127 Oct 21, 2019

Ok, thanks for diving in! Tried with Postgres:

  • works ok (encode/decode,Test) when adding new Resource with Basic or Bearer Token Auth
  • existing Resources (from Customer deployed Python 2.7 GHC instance): stored strings are different (from when creating new locally in Py3) and get decode errors

Will investigate further. Unfortunately cannot provide the existing encoded strings.

@justb4

Member

justb4 commented Oct 25, 2019

@borrob good to see you synced with master! Will try not to do too many big changes.

The Postgres-char-issue: my bad! For security reasons the SECRET_KEY is used for encoding/decoding stored auth creds, and the Py3 instance had a different key! Using the same key: no problem. Pff, was thinking that we had a very serious encoding issue with difficult DB migrations. So we're getting closer!

borrob added 2 commits Oct 25, 2019
@borrob

Collaborator Author

borrob commented Oct 25, 2019

I fixed one paver issue and did some testing (also with docker). I think we're good to go and I'm curious what the results of the demo environment will be.

@borrob borrob marked this pull request as ready for review Oct 25, 2019
@borrob

Collaborator Author

borrob commented Oct 25, 2019

I removed the draft tag from this pull request (that was there for the automated testing with travis). Please review and let's hope we can move to py3 soon!

@justb4

Member

justb4 commented Oct 27, 2019

Yes, good, I only want to release 0.7.0 from current master first, and create a 0.7.0 maintenance branch. Then test further, in particular with Docker, then merge this PR and have quite some testing time on the demo site. OK, @tomkralidis? Let's aim for all of this before Nov 1, ok?

NB solved a nasty concurrency bug (two lines) with #301 #302 today.
