Fix non-utf8 when found in uploaded RPMs. #1028

jortel · 2017-02-22T23:47:24Z

https://pulp.plan.io/issues/1903

It seemed to make more sense to decode the entire primary XML fragment at once instead of decoding it element by element. This approach is simpler and more comprehensive.
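To illustrate the idea, here is a minimal sketch of decoding a whole fragment in one pass. The `RAW_FRAGMENT` bytes and the `decode_fragment` helper are hypothetical stand-ins for whatever `po.xml_dump_primary_metadata()` returns; they are not part of this PR. The sketch is written for Python 3, where the raw value would be `bytes`.

```python
# Hypothetical stand-in for the raw bytes of a primary XML fragment.
# It contains a valid UTF-8 snowman (E2 98 83) followed by a stray 0x9A byte,
# which is not valid UTF-8 on its own.
RAW_FRAGMENT = (
    b"<package>"
    b"<summary>rabbit IDE</summary>"
    b"<packager>Frosty \xe2\x98\x83\x9a</packager>"
    b"</package>"
)


def decode_fragment(raw):
    """Decode the whole fragment at once.

    Invalid byte sequences are replaced with U+FFFD instead of raising,
    so every element is covered without per-element handling.
    """
    return raw.decode("utf-8", "replace")


decoded = decode_fragment(RAW_FRAGMENT)
```

A strict `RAW_FRAGMENT.decode("utf-8")` would raise `UnicodeDecodeError` on the stray byte, which is the failure mode the single `'replace'` pass avoids.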

mention-bot · 2017-02-22T23:47:25Z

@jortel, thanks for your PR! By analyzing the history of the files in this pull request, we identified @ipanova, @mhrivnak and @BrnoPCmaniak to be potential reviewers.

mhrivnak · 2017-02-24T21:12:06Z

plugins/pulp_rpm/plugins/importers/yum/parse/rpm.py

@@ -44,7 +44,9 @@ def get_package_xml(pkg_path, sumtype=util.TYPE_SHA256):
        return {}
    # RHEL6 createrepo throws a ValueError if _cachedir is not set
    po._cachedir = None
-    primary_xml_snippet = change_location_tag(po.xml_dump_primary_metadata(), pkg_path)
+    primary_xml_snippet = po.xml_dump_primary_metadata()
+    primary_xml_snippet = primary_xml_snippet.decode('utf-8', 'replace')


I notice that you moved the decoding earlier in the workflow than where `string_to_unicode` was being called. What's the reasoning for that?

mhrivnak · 2017-02-24T21:28:33Z

plugins/pulp_rpm/plugins/importers/yum/parse/rpm.py

-    end_portion = string_to_unicode(primary_xml_snippet[end_index:])
-    location = string_to_unicode("""<location href="%s"/>""" % (
-        file_utils.make_packages_relative_path(relpath)))
+    first_portion = primary_xml_snippet[:start_index]


I think the doc block above needs to be changed to reflect that it takes unicode in and returns unicode back.

makes sense.
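As a rough sketch of what a unicode-in, unicode-out doc block could look like, the following rewrites the location tag of an already-decoded snippet. The name `replace_location_tag`, its `href` parameter, and the tag-matching logic are assumptions made for illustration; the real `change_location_tag` in `rpm.py` may differ.

```python
from xml.sax.saxutils import quoteattr


def replace_location_tag(primary_xml_snippet, href):
    """
    Replace the location tag in a primary XML snippet (hypothetical helper).

    :param primary_xml_snippet: already-decoded XML text (unicode)
    :type  primary_xml_snippet: str
    :param href: value for the new location tag's href attribute (unicode)
    :type  href: str
    :return: the snippet with its location tag rewritten (unicode)
    :rtype:  str
    """
    start_index = primary_xml_snippet.find("<location ")
    if start_index < 0:
        # No location tag present; return the text unchanged.
        return primary_xml_snippet
    close_index = primary_xml_snippet.find("/>", start_index)
    if close_index < 0:
        # Tag is not self-closing as expected; leave the text alone.
        return primary_xml_snippet
    end_index = close_index + len("/>")

    first_portion = primary_xml_snippet[:start_index]
    end_portion = primary_xml_snippet[end_index:]
    # quoteattr escapes the value and adds the surrounding quotes.
    location = "<location href=%s/>" % quoteattr(href)
    return first_portion + location + end_portion
```

Because the input is decoded before it reaches the function, no per-piece conversion is needed when the portions are joined back together.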

mhrivnak · 2017-02-24T21:28:50Z

plugins/pulp_rpm/plugins/importers/yum/parse/rpm.py

@@ -44,7 +44,9 @@ def get_package_xml(pkg_path, sumtype=util.TYPE_SHA256):
        return {}
    # RHEL6 createrepo throws a ValueError if _cachedir is not set
    po._cachedir = None
-    primary_xml_snippet = change_location_tag(po.xml_dump_primary_metadata(), pkg_path)
+    primary_xml_snippet = po.xml_dump_primary_metadata()
+    primary_xml_snippet = primary_xml_snippet.decode('utf-8', 'replace')


Never mind, I read your comment on the PR which explains this.

mhrivnak · 2017-02-24T21:33:05Z

plugins/test/unit/plugins/importers/yum/parse/test_rpm.py

+      <checksum type="sha256" pkgid="YES">b88a43acd5c9239</checksum>
+      <summary>rabbit IDE</summary>
+      <description>The rabbit (professional) IDE.</description>
+      <packager>Frosty \u2603\x9a</packager>


closes #1903

mhrivnak reviewed Feb 24, 2017

mhrivnak added the question label Feb 24, 2017

mhrivnak approved these changes Feb 24, 2017

Fix non-utf8 when found in uploaded RPMs.

a4ccac9

closes #1903

jortel removed the question label Feb 24, 2017

jortel merged commit 2818e3e into pulp:2.12-dev Feb 24, 2017
