The protocol scheme was added in the phase of finding the active RM. #67

Merged

Conversation

Melchizedek13
Contributor

I ran into the error "Python request: No connection adapters were found for..." when I supplied a yarn-site.xml file and called the constructor of the ResourceManager class without any arguments.

According to the YARN documentation (https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/ResourceManagerHA.html),
the values of the parameters yarn.resourcemanager.webapp.address.rm-id & yarn.resourcemanager.webapp.https.address.rm-id do not include the protocol scheme.
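
To illustrate the point, here is a minimal sketch of reading a webapp address from a yarn-site.xml fragment and prepending the scheme before handing it to an HTTP client. It is not the code in this PR: the helper names `read_properties` and `rm_webapp_endpoint`, the sample XML, and the host name are all assumptions made for the example, and the scheme is chosen from `yarn.http.policy` purely for demonstration.

```python
import xml.etree.ElementTree as ET

# Hypothetical yarn-site.xml fragment; host and port are made up.
YARN_SITE = """
<configuration>
  <property>
    <name>yarn.http.policy</name>
    <value>HTTP_ONLY</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address.rm1</name>
    <value>example.com:8088</value>
  </property>
</configuration>
"""


def read_properties(xml_text):
    """Return a {name: value} dict for every <property> element."""
    root = ET.fromstring(xml_text)
    return {p.findtext("name"): p.findtext("value") for p in root.iter("property")}


def rm_webapp_endpoint(props, rm_id):
    """Build 'scheme://host:port' for one RM id (hypothetical helper).

    The configured value is only 'host:port', so the scheme is added here.
    """
    use_https = props.get("yarn.http.policy", "HTTP_ONLY") == "HTTPS_ONLY"
    key = "yarn.resourcemanager.webapp.{}address.{}".format(
        "https." if use_https else "", rm_id
    )
    address = props.get(key)
    if address is None:
        return None
    return "{}://{}".format("https" if use_https else "http", address)


endpoint = rm_webapp_endpoint(read_properties(YARN_SITE), "rm1")
print(endpoint)
```

With the scheme in place, the endpoint is a full URL that an HTTP library can match to a connection adapter.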

@Melchizedek13 Melchizedek13 changed the title The protocol scheme was added in phase of finding the active RM. The protocol scheme was added in the phase of finding the active RM. Dec 26, 2019
@dimon222
Collaborator

dimon222 commented Dec 26, 2019

Can you confirm that you tried it on Python 3.8.1?
I seem to have noticed a change in behaviour in this patch version of Python and detailed my findings in #69.
If possible, could you try a lower Python version and see if you still have the issue?

@Melchizedek13
Contributor Author

Melchizedek13 commented Dec 26, 2019

@dimon222 I hit the issue (the missing protocol scheme) on Python 3.8.0.

I've just updated Python to 3.8.1 and run the unit tests:

... raise exceptions.NoMockAddress(request) requests_mock.exceptions.NoMockAddress: No mock address: GET example.com://80/ololo

Indeed, not all tests pass on 3.8.1.

On Python 3.7.6 the situation is the same:

`λ C:\temp\venv\hadoop-yarn-api-python-client\Scripts\python.exe --version
Python 3.7.6

Testing started at 11:42 PM ...
C:\temp\venv\hadoop-yarn-api-python-client\Scripts\python.exe "C:\Program Files (x86)\JetBrains\PyCharm Community Edition 2019.3\plugins\python-ce\helpers\pycharm_jb_unittest_runner.py" --path C:/git/hadoop-yarn-api-python-client/tests
Launching unittests with arguments python -m unittest discover -s C:/git/hadoop-yarn-api-python-client/tests -t C:\git\hadoop-yarn-api-python-client\tests in C:\git\hadoop-yarn-api-python-client\tests

DEBUG:yarn_api_client.application_master:Get configuration from hadoop conf dir
INFO:yarn_api_client.base:API Endpoint example.com://80/ololo
DEBUG:requests_mock.adapter:GET example.com://80/ololo 404
INFO:yarn_api_client.base:API Endpoint example.com://80/ololo
DEBUG:requests_mock.adapter:GET example.com://80/ololo 200
INFO:yarn_api_client.base:API Endpoint example.com://80/ololo

Error
Traceback (most recent call last):
File "C:\Python\3.7.6\lib\unittest\case.py", line 59, in testPartExecutor
yield
File "C:\Python\3.7.6\lib\unittest\case.py", line 628, in run
testMethod()
File "C:\git\hadoop-yarn-api-python-client\tests\test_base.py", line 32, in test_valid_request_with_parameters
response = client.request('/ololo', params={"foo": 'bar'})
File "C:\git\hadoop-yarn-api-python-client\yarn_api_client\base.py", line 78, in request
response = self.session.request(method=method, url=api_endpoint, headers=headers, timeout=self.timeout, **kwargs)
File "C:\temp\venv\hadoop-yarn-api-python-client\lib\site-packages\requests\sessions.py", line 533, in request
resp = self.send(prep, **send_kwargs)
File "C:\temp\venv\hadoop-yarn-api-python-client\lib\site-packages\requests_mock\mocker.py", line 111, in _fake_send
return _original_send(session, request, **kwargs)
File "C:\temp\venv\hadoop-yarn-api-python-client\lib\site-packages\requests\sessions.py", line 646, in send
r = adapter.send(request, **kwargs)
File "C:\temp\venv\hadoop-yarn-api-python-client\lib\site-packages\requests_mock\adapter.py", line 255, in send
raise exceptions.NoMockAddress(request)
requests_mock.exceptions.NoMockAddress: No mock address: GET example.com://80/ololo

DEBUG:requests_mock.adapter:GET https://example2:8022/cluster 200
DEBUG:requests_mock.adapter:GET https://example2:8022/cluster 500
Error to access RM - HTTP Code 500
DEBUG:requests_mock.adapter:GET https://example2:8022/cluster 404
Error to access RM - HTTP Code 404
DEBUG:requests_mock.adapter:GET https://example2:8022/cluster 401
Error to access RM - HTTP Code 401
DEBUG:yarn_api_client.history_server:Get information from hadoop conf dir
DEBUG:yarn_api_client.resource_manager:Get configuration from hadoop conf dir: /etc/hadoop/conf

Ran 85 tests in 0.199s

FAILED (errors=1)

Process finished with exit code 1

Assertion failed

Assertion failed

Assertion failed

`

I used the following packages in the Python environment:
`
λ C:\temp\venv\hadoop-yarn-api-python-client\Scripts\python.exe -m pip list
Package Version


certifi 2019.11.28
chardet 3.0.4
entrypoints 0.3
flake8 3.7.9
idna 2.8
mccabe 0.6.1
mock 3.0.5
pip 19.0.3
pycodestyle 2.5.0
pyflakes 2.1.1
requests 2.22.0
requests-mock 1.7.0
setuptools 40.8.0
six 1.13.0
urllib3 1.25.7
`

@dimon222
Collaborator

dimon222 commented Dec 26, 2019

Interesting, I'm able to run the tests successfully on 3.8.0.
TravisCI also confirms that all tests pass on 3.8.0 and 3.7.1. Perhaps some change was backported to the other Python branches.
https://travis-ci.org/toidi/hadoop-yarn-api-python-client/builds/629721243

UPD:
3.7.5 - PASS
3.7.6 - FAIL
3.8.0 - PASS
3.8.1 - FAIL
So it's something very recent.

UPD2:
Found it.
https://docs.python.org/3.7/whatsnew/changelog.html#python-3-7-6-final
bpo-27657: Fix urllib.parse.urlparse() with numeric paths. A string like “path:80” is no longer parsed as a path but as a scheme (“path”) and a path (“80”).
https://bugs.python.org/issue27657

@Melchizedek13
Contributor Author

Melchizedek13 commented Dec 26, 2019

@dimon222 I think I found something interesting:

Python 3.8.1 (tags/v3.8.1:1b293b6, Dec 18 2019, 22:39:24) [MSC v.1916 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> from urllib.parse import urlparse
>>> urlparse('example.com:80')
ParseResult(scheme='example.com', netloc='', path='80', params='', query='', fragment='')
>>>
Python 3.7.4 (default, Jul  9 2019, 00:06:43)
[GCC 6.3.0 20170516] on linux
>>> from urllib.parse import urlparse
>>> urlparse('example.com:80')
ParseResult(scheme='', netloc='', path='example.com:80', params='', query='', fragment='')
>>>

This is what breaks the test "test_valid_request_with_parameters()":

    def get_client(self):
        client = base.BaseYarnAPI()
        client.service_uri = base.Uri('example.com:80')

where base.Uri is

class Uri(object):
    def __init__(self, service_endpoint):
        service_uri = urlparse(service_endpoint)
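
One way to make this parsing independent of the interpreter version is to add a scheme before calling urlparse whenever the input has none. The sketch below only illustrates that idea and is not necessarily the fix that was applied in the project; `parse_endpoint` and its `default_scheme` parameter are hypothetical names.

```python
from urllib.parse import urlparse


def parse_endpoint(service_endpoint, default_scheme="http"):
    """Return (scheme, hostname, port) for 'host:port' or a full URL.

    A bare 'host:port' string is ambiguous for urlparse, so a scheme is
    prepended first (assumption: plain HTTP unless the caller says otherwise).
    """
    if "://" not in service_endpoint:
        service_endpoint = "{}://{}".format(default_scheme, service_endpoint)
    parsed = urlparse(service_endpoint)
    return parsed.scheme, parsed.hostname, parsed.port


print(parse_endpoint("example.com:80"))            # ('http', 'example.com', 80)
print(parse_endpoint("https://example.com:8022"))  # ('https', 'example.com', 8022)
```

Because the string always contains "://" by the time it reaches urlparse, the host and port end up in the netloc regardless of how a bare "host:port" would be interpreted.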

@dimon222
Collaborator

dimon222 commented Dec 26, 2019

I think I confused this thread with a different issue altogether. My apologies.
There are essentially two issues:

  1. This PR prepends the protocol to URLs read from the configuration file, since we pass the value parsed from the config directly to the requests library when the user doesn't supply any arguments.
  2. The test is breaking because the new Python patch release changes the way urlparse works, so it will break when the user passes URLs like "localhost:80" to the constructor manually (not via the configuration file). I've prepared a fix for this in another PR: "Fix urlparse for new python version" #70.

Thanks for helping. Hopefully @lresende and @kevin-bates can assist with reviewing this PR.
Looks good to me so far.

@Melchizedek13
Contributor Author

@dimon222

You're welcome =)

I'm glad I was able to help.

@@ -103,7 +104,7 @@ def test_get_resource_endpoint(self):
 endpoint = hadoop_conf.get_resource_manager_endpoint()
-self.assertEqual('example.com:8022', endpoint)
+self.assertEqual('http://example.com:8022', endpoint)
 parse_mock.assert_called_with(hadoop_conf_path + 'yarn-site.xml',
Collaborator

Why are these tests not failing without these changes?

Collaborator

@dimon222 dimon222 Dec 29, 2019

I think the test changed because the returned value now includes the scheme (in the same PR, scroll down a bit):
https://github.com/toidi/hadoop-yarn-api-python-client/blob/f88ac6966eff575b18563ef8829967546cf3cc3f/yarn_api_client/hadoop_conf.py#L43

@Maximinho

Hello!

This test is failing because the get_resource_manager_endpoint function now returns the RM address with a protocol scheme.

@lresende
Collaborator

I have added a PR that enables builds with the latest Python 3.8.x and I still cannot see any breaking build issues - see build results.

Having said that, I will trust @dimon222's and/or @kevin-bates's judgment here. Please let me know what you think.

@dimon222
Collaborator

dimon222 commented Dec 29, 2019

@lresende the test was written on the assumption that this type of URL worked, but I guess it didn't. It's highly likely that the regression was introduced when the active_rm check was changed to use the requests library, since it's the requests library that looks for the scheme. It may have been missed because not many people read the address directly from the XML config file.
Just FYI: the test isn't failing on this because the test itself is incorrect.

Member

@kevin-bates kevin-bates left a comment

These changes look good to me. My only issue was the print() statement, but that's just following the existing precedent. I have opened #73 to address this.

@dimon222 dimon222 mentioned this pull request Jan 13, 2020
@dimon222
Collaborator

@lresende anything pending on this?

Collaborator

@lresende lresende left a comment

LGTM

@kevin-bates kevin-bates merged commit 2231963 into gateway-experiments:master Jan 21, 2020
5 participants