New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix test_policy_to_permissions test failing when there's no internet #158
Conversation
'xhtml-cache.xml' | ||
) | ||
elif 'xml_cache/xhtml-cache.xml' not in os.environ['XML_CATALOG_FILES']: | ||
os.environ['XML_CATALOG_FILES'] += ' ' + os.path.join( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not clear to me what delimiter lxml supports, be it a space or :
or if lists of paths are supported at all. Do you know of an example somewhere? I only see it referenced here: http://xmlsoft.org/catalog.html
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, from here: https://stackoverflow.com/a/8391738/2577586
Note that entries in XML_CATALOG_FILES are space-separated URLs. You can use Python's pathname2url and urljoin (with file:) to generate the URL from a pathname.
Do we need to prefix the path with file://
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know. I tested it manually by setting my XML_CATALOG_FILES environment variable to /etc/xml/catalog
and verified that the test still passes when it appends to the existing environment variable.
On this xmlsoft.org page it doesn't mention anything about file://
The user can change the default catalog behaviour by redirecting queries to its own set of catalogs, this can be done by setting the XML_CATALOG_FILES environment variable to a list of catalogs, an empty one should deactivate loading the default /etc/xml/catalog default catalog
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ruffsl It seems to work OK. . . I can pass the test without an internet connection, and my final XML_CATALOG_FILES environment variable is:
/etc/xml/catalog file%3A///sros2/install/sros2/share/sros2/xml_cache/xhtml-cache.xml
If take the following out of the test. . .
. . and add it to the (currently empty) |
Ok, I put the code that extends the XML_CATALOG_FILES environment variable into I also changed the path into a URL with 'file://' and the colon character replaced with |
I'm sorry to say that I'm not very familiar with this part of the package. Perhaps @mikaelarguedas has more context for how this fits together. |
Can anybody suggest what I need to do to move this PR forward? |
sros2/xml_cache/xhtml-cache.xml
Outdated
@@ -0,0 +1,5 @@ | |||
<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog"> | |||
<rewriteURI | |||
uriStartString="http://www.w3.org/2001/03/" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are there any other w3 documents that we use that would also have there uri prefix rewritten? Was the XML.xsd the only source of outbound network traffic?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The manual test I ran for this was:
- Build and test 'master' and observe that the tests pass
- Rip the ethernet cable out of the back of my computer (or if you want to be less dramatic
sudo ifconfig eno2 down
) and observe that the tests fail - Build and test this branch and observe that the tests pass again
- Plug the ethernet cable back into my computer and observe that the tests still pass
So I'm pretty confident that this was the only external resource, unless there are other schemas that aren't tested as part of the unit test
If another external resource with the above prefix is used, but a local copy is not added to the XML cache, then the lookup will fail.
I suppose it's possible for some other python program to import sros2 (thus getting its XML catalog injected into its environment), then attempt a lookup of another document with a prefix http://www.w3.org/2001/03/something_else
In that case, the lookup will fail.
I will change the prefix to
<rewriteURI
uriStartString="http://www.w3.org/2001/03/xml.xsd"
rewritePrefix="./xml.xsd" />
So that ONLY this file is cataloged. That should prevent us from breaking other XML lookups in the same python process as sros2
It might be handy to have a test that can assert the entire CLI works as expected when offline. Perhaps the test could force the python interpreter to use a non-existent proxy via temporary environment variable, then continue with the test suite, then unset it once finished? I'm not sure if that's possible. I'll try and make some time this week to test the pr locally. |
I will investigate this, but a solution that works across platforms seems tricky |
grepping around in the libxml2 sourcecode, I did find some promising environment variables:
Your suggestion may be easier to implement than I thought. I can set |
@ruffsl Ok, I've added the requested test. There's also a sanity check test that's skipped because it takes around 60 seconds to pass. I don't mind a test that takes a long time to fail, but a test that takes a long time to pass is problematic. I'll let you decide what to do with it. This test will fail on |
@ruffsl Have you been able to reproduce the failure with the steps I gave? You can also break libxml2's ability to fetch a schema remotely by doing
before running the test. That will also let you observe the failure on 'master' and see that things down fail using this branch |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added some suggestions. You'll want to ping @jacobperron or @mjcarroll from OSRF and ask to trigger a CI job when your ready.
sros2/test/sros2/test_cli.py
Outdated
env['HTTP_PROXY'] = 'http://example.com' | ||
tmp_dir = tempfile.mkdtemp(prefix='sros2_test') | ||
|
||
subprocess.run( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should the api be used directly, to ensure we're testing the expected library, rather whatever subprocess finds on the path? I think the other system tests use this pattern:
from sros2.api import create_keystore |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ruffsl I'm sorry, but I can't answer this question for you. - I added this test based on your statement
It might be handy to have a test that can assert the entire CLI works as expected when offline
I can try to find a way to make sure that subprocess
finds the right executable, but you're on both sides of the question "Should we invoke the API, or should we use the CLI"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you should be calling the api, as this unit test is to check if the XML expansion works offline, not that the command line verb doesn't return a non-zero for some other reason.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ruffsl If your goal is to reduce the scope of the test to check that XML expansion works offline, would you accept running just test_policy_to_permissions
with the HTTP_PROXY environment set?
sros2/test/sros2/test_cli.py
Outdated
args=['ros2', 'security', 'create_keystore', 'demo_keys'], | ||
check=True, | ||
cwd=tmp_dir, | ||
env=env |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If switching to calling the command via the api, perhaps you could use the same env_backup pattern you have bellow to set/restore the env before/after setup and teardown.
With respect to #158 (comment) , perhaps testing via added unit tests could be reapproached? My earlier statement was thinking of more of adding the fake proxy env to the entire test environment, so that the test suite by default acts as if its offline as far as the lxml is concerned. That way, if any more online requirements sneak their way in the future, via added schema references or what have you, they'd always surface using existing unit tests. I guess I'm saying that blaketing the entire test suite for sros2's cli to run offline would be simpler than adding separate offline-unit tests for every new verb or feature. I think the normative case is that sros2 cli/api should work entirely devoid of the internet, and if later we do add a feature that needs to connect to a remote resource, then we can unset the env for just those respective unit tests. |
@ruffsl Ok, I will put in some pytest hooks to set up the environment for all sros2 unit tests In the meantime, I am also encountering new problems because sros2 does not correctly install its package.xml, or register itself correctly with ament_index. See colcon/colcon-ros#74 and https://discourse.ros.org/t/can-we-chat-about-breaking-changes-in-stable-releases/10689/16 for some background information Also see the recent PR ros2/launch#323 for the fix to the same problem from the package 'launch' I will open another PR to address the missing package marker and manifest, but right now this PR is blocked until that issue can be resolved |
dfe54d9
to
dc6b2c0
Compare
@ruffsl I have removed the test that used subprocess to run the CLI and have replaced it with a Note that this branch also contains the changes from PR#160, so I'll need to rebase once that is merged. Without the changes from the
If you want to double-check that the |
Signed-off-by: Pete Baughman <pete.baughman@apex.ai>
- This changes a test-only fix into a always fix - Make the path into a URL instead of just a local file path Signed-off-by: Pete Baughman <pete.baughman@apex.ai>
Signed-off-by: Pete Baughman <pete.baughman@apex.ai>
dc6b2c0
to
9a21010
Compare
|
||
# Disable lxml2 lookup of resources on the internet by configuring it to use a proxy | ||
# that does not exist | ||
os.environ['HTTP_PROXY'] = 'http://example.com' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know enough about conftest.py, but will pytest unset this after testing this package? It might be an issue if the original proxy settings, be it empty/unset or somethings else for the build farm, is not restored afterwards.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ruffsl os.environ
will only modify the environment for the current process, so this will only affect the environment for pytests in the sros2 package
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess one other important piece of information is that pytest
is invoked (at least) once per package, so all the other packages get their own pytest
with a clean set of environment variables.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, TDIL. Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. @jacobperron or @mjcarroll , could either of you trigger a CI job?
Looks like windows doesn't like a trailing slash in the catalog path
I'll take another crack when I'm back at a computer w/ dev tools on it |
- This should fix the windows CI job Signed-off-by: Pete Baughman <pete.baughman@apex.ai>
@jacobperron Can we try once more? Thanks |
Sigh. . . OK I'm going to figure out to get a windows machine to develop this feature. It doesn't appear that lxml is using the local cataloged schema document on Windows. |
Signed-off-by: Pete Baughman <pete.baughman@apex.ai>
@ruffsl Windows doesn't like Recap:
|
Will that still be safe if/when the file path includes non-url safe characters, like spaces, etc? |
Related? vulnersCom/getsploit#1 |
Ok, I'll try putting pathname2url around just the path portion and not the 'file:' portion, or trying to quote the path instead. Edit will also test with paths that contain spaces on windows and linux. This works on linux with or without spaces in the path:
Will have to wait until I go home to test on Windows |
This should work on linux AND on windows Signed-off-by: Pete Baughman <pete.baughman@apex.ai>
46a8e2e
to
6e949b9
Compare
@jacobperron Let's try one more time. This works on my Ubuntu machine at work, and on my windows machine at home. |
…os2#158) * Install an XML catalog so we can look this schema up locally Signed-off-by: Pete Baughman <pete.baughman@apex.ai> * Mangle the XML_CATALOG_FILE when you import sros2 - This changes a test-only fix into a always fix - Make the path into a URL instead of just a local file path Signed-off-by: Pete Baughman <pete.baughman@apex.ai> * Disable lxml2 resource lookup over the internet for tests Signed-off-by: Pete Baughman <pete.baughman@apex.ai> * Remove trailing slash from path - This should fix the windows CI job Signed-off-by: Pete Baughman <pete.baughman@apex.ai> * Add README.md explaining where the cataloged schema comes from Signed-off-by: Pete Baughman <pete.baughman@apex.ai> * Escape the path, not the 'file:' part This should work on linux AND on windows Signed-off-by: Pete Baughman <pete.baughman@apex.ai>
Thanks so much guys! Any chance we can also get this on Dashing ? I've got a version here that has the three necessary commits to make this build and test successfully on the docker image |
I think the most recent Dashing sync was set for 2019-09-24, but you could chime in on the topic to ask if the sync is already out the door, or if we could squeeze in this patch: |
We probably need to get it backported to the dashing branch before we start that conversation, though. I suspect we're too late for that sync. |
Just getting it onto the dashing branch is enough for us |
Then you may need to open a rebased PR at |
My only concern about getting it back into dashing is the additional |
…os2#158) * Install an XML catalog so we can look this schema up locally Signed-off-by: Pete Baughman <pete.baughman@apex.ai> * Mangle the XML_CATALOG_FILE when you import sros2 - This changes a test-only fix into a always fix - Make the path into a URL instead of just a local file path Signed-off-by: Pete Baughman <pete.baughman@apex.ai> * Disable lxml2 resource lookup over the internet for tests Signed-off-by: Pete Baughman <pete.baughman@apex.ai> * Remove trailing slash from path - This should fix the windows CI job Signed-off-by: Pete Baughman <pete.baughman@apex.ai> * Add README.md explaining where the cataloged schema comes from Signed-off-by: Pete Baughman <pete.baughman@apex.ai> * Escape the path, not the 'file:' part This should work on linux AND on windows Signed-off-by: Pete Baughman <pete.baughman@apex.ai>
…os2#158) * Install an XML catalog so we can look this schema up locally Signed-off-by: Pete Baughman <pete.baughman@apex.ai> * Mangle the XML_CATALOG_FILE when you import sros2 - This changes a test-only fix into a always fix - Make the path into a URL instead of just a local file path Signed-off-by: Pete Baughman <pete.baughman@apex.ai> * Disable lxml2 resource lookup over the internet for tests Signed-off-by: Pete Baughman <pete.baughman@apex.ai> * Remove trailing slash from path - This should fix the windows CI job Signed-off-by: Pete Baughman <pete.baughman@apex.ai> * Add README.md explaining where the cataloged schema comes from Signed-off-by: Pete Baughman <pete.baughman@apex.ai> * Escape the path, not the 'file:' part This should work on linux AND on windows Signed-off-by: Pete Baughman <pete.baughman@apex.ai>
…os2#158) * Install an XML catalog so we can look this schema up locally Signed-off-by: Pete Baughman <pete.baughman@apex.ai> * Mangle the XML_CATALOG_FILE when you import sros2 - This changes a test-only fix into a always fix - Make the path into a URL instead of just a local file path Signed-off-by: Pete Baughman <pete.baughman@apex.ai> * Disable lxml2 resource lookup over the internet for tests Signed-off-by: Pete Baughman <pete.baughman@apex.ai> * Remove trailing slash from path - This should fix the windows CI job Signed-off-by: Pete Baughman <pete.baughman@apex.ai> * Add README.md explaining where the cataloged schema comes from Signed-off-by: Pete Baughman <pete.baughman@apex.ai> * Escape the path, not the 'file:' part This should work on linux AND on windows Signed-off-by: Pete Baughman <pete.baughman@apex.ai>
…158) (#161) * Install an XML catalog so we can look this schema up locally Signed-off-by: Pete Baughman <pete.baughman@apex.ai> * Mangle the XML_CATALOG_FILE when you import sros2 - This changes a test-only fix into a always fix - Make the path into a URL instead of just a local file path Signed-off-by: Pete Baughman <pete.baughman@apex.ai> * Disable lxml2 resource lookup over the internet for tests Signed-off-by: Pete Baughman <pete.baughman@apex.ai> * Remove trailing slash from path - This should fix the windows CI job Signed-off-by: Pete Baughman <pete.baughman@apex.ai> * Add README.md explaining where the cataloged schema comes from Signed-off-by: Pete Baughman <pete.baughman@apex.ai> * Escape the path, not the 'file:' part This should work on linux AND on windows Signed-off-by: Pete Baughman <pete.baughman@apex.ai>
This PR does the following:
Makes test_policy_to_permissions.py use the installed catalog so the test doesn't die when you can't reach www.w3.orgA quick recap
You can observe the original problem by running
colcon test
on the sros2 package without an internet connection. The test will fail withAfter this PR, the test will succeed, even without an internet connection
Open Questions
I have only fixed the test. The only way I could figure out to get lxml to use the new catalog was to add the path to the XML_CATALOG_FILES environment variable. I put some extra start-up code in test_policy_to_permissions to append (or create) this environment variable.Presumably, we would want to same behavior to happen if you were just using the sros2 package. I would like to give the package maintainers an opportunity to suggest how to do that.If we want to keep the test-only fix, we should make ament_index_python atest_depend
instead of anexec_depend
In the latest version of this PR, you get the caching at run time, not just test time
Reference:
The instructions I (more or less) followed are here: https://www.w3.org/blog/2008/09/caching-xml-data-at-install-ti/