Fixed Long Description text parsing #78

Hunter275 · 2017-06-21T03:23:09Z

Fixed parsing issue with Long Description

tomasbedrich

Hello Hunter, thanks for your contribution.

I am sad to say it, but it contains multiple issues. I have tried to explain most of them in other comments. Next time, please focus on fixing one issue at a time, which would allow me to merge your fix immediately and discuss the other changes separately.

Two more comments, which cannot be attached to a specific line of code:

Please revert the mode change for the setup.py file; it makes no sense.
Please clean up the commit history before resubmitting your PR (use git rebase and squash your commits).

But thank you for your effort! Looking forward to merging your fix. Tomas

tomasbedrich · 2017-07-19T19:47:25Z

pycaching/cache.py

@@ -624,6 +695,16 @@ def load(self):

        self.location = Point.from_string(root.find(id="uxLatLon").text)

+        self.lat = str(root.find(id="ctl00_ContentBody_Location").find("a")).split("=")[2].split("&")[0]


Well, this is problematic. The a sub-element seems to be optional: it is only present for caches inside the US. If you look at the ctl00_ContentBody_Location element for any cache outside the US, it has no a sub-element, so this code cannot work. Please see the attached picture.

The other thing is that if you need lat and lon (I assume you mean cache latitude and longitude), they are already parsed in cache.location; read the docs. 😉

Therefore, I would suggest deleting the whole lat+lon code you added (the lines in the load() and load_quick() methods, plus the properties and setters).

tomasbedrich · 2017-07-19T20:02:10Z

pycaching/cache.py

+
+        self.lon = str(root.find(id="ctl00_ContentBody_Location").find("a")).split("=")[3].split("&")[0]
+
+        self.id = str(root.find(id="ctl00_ContentBody_GeoNav_logButton")).split("=")[3].split("&")[0]


Parsing a URL this way is definitely not a good idea. Please take a look at urllib.parse to see how to do this correctly. 😉

On the other hand, I don't see any reason to parse this cache ID. We already have 2 different IDs (waypoint + guid) which are used elsewhere. So why should we want another ID, which wouldn't be used anywhere?
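For illustration, a minimal sketch of reading query parameters with urllib.parse instead of chained split() calls. The URL and the parameter names (lat, lng) are hypothetical stand-ins, not taken from the actual geocaching.com markup, and query_param is a helper invented for this example.

```python
from urllib.parse import urlparse, parse_qs

# Hypothetical link, shaped like a map URL that carries coordinates
# in its query string. The real href on the site may look different.
href = "https://www.example.com/map/?lat=49.5&lng=17.25&z=16"


def query_param(url, name):
    """Return the first value of query parameter `name`, or None if absent."""
    values = parse_qs(urlparse(url).query).get(name)
    return values[0] if values else None


lat = query_param(href, "lat")
lng = query_param(href, "lng")
```

Unlike split("=")[2], this does not depend on the order or number of parameters, and a missing parameter yields None rather than an IndexError or a wrong value.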

tomasbedrich · 2017-07-19T20:19:59Z

pycaching/cache.py

+
+        self.id = str(root.find(id="ctl00_ContentBody_GeoNav_logButton")).split("=")[3].split("&")[0]
+
+        self.stateprovince = str(root.find(id="ctl00_ContentBody_Location").text.split(",")[0].split(" ")[1])


I appreciate your effort to parse a state and a country, but the second split(" ")[1] is problematic. On the website, the text reads "In STATE, COUNTRY". What if the STATE part consists of multiple words (for an example, see the screenshot attached to another comment)?

Notwithstanding that, I would prefer not to solve this in pycaching. If you need detailed address info in your application, try something like the Google Geocoding API instead. The benefit would be that you could geocode any point, not only the cache (for example multicache waypoints, parking places, etc.).

Therefore, I would also suggest deleting this and the country parsing (including the properties and setters, of course).
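To make the multi-word problem concrete, here is a small sketch. The sample string and the split_location helper are hypothetical; the sketch only assumes the "In STATE, COUNTRY" shape described above and is not proposed for inclusion in pycaching.

```python
# Hypothetical sample text in the "In STATE, COUNTRY" shape.
text = "In New York, United States"

# The approach from the diff keeps only the first word of the state.
naive_state = text.split(",")[0].split(" ")[1]  # "New", not "New York"


def split_location(raw):
    """Split 'In STATE, COUNTRY' into (state, country).

    Returns (None, country) when there is no comma in the text.
    """
    raw = raw.strip()
    if raw.startswith("In "):
        raw = raw[len("In "):]
    # Split on the last comma so the state part may contain spaces.
    state, sep, country = raw.rpartition(",")
    if not sep:
        return None, country.strip()
    return state.strip(), country.strip()


state, country = split_location(text)
```

Even this version still depends on the page wording, which is one more reason to prefer a geocoding service for address details.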

tomasbedrich · 2017-07-19T20:20:29Z

pycaching/cache.py

@@ -637,7 +718,7 @@ def load(self):

        user_content = root.find_all("div", "UserSuppliedContent")
        self.summary = user_content[0].text
-        self.description = str(user_content[1])
+        self.description = str(user_content[1].text)


Nice simple fix. Thank you!

tomasbedrich · 2017-07-19T20:24:26Z

pycaching/cache.py

@@ -729,6 +810,16 @@ def load_by_guid(self):
        self.location = Point.from_string(
            content.find("p", "LatLong Meta").text)

+        self.lat = str(content.find(id="ctl00_ContentBody_Location").find("a").find("a")).split("=")[2].split("&")[0]


None of the code below can work, because the content is loaded from the cache print page (example), which doesn't have these elements. So I would like to kindly ask you to delete these lines, irrespective of what has been written in other comments.

Hunter275 · 2017-07-24T20:48:25Z

Thanks for the input, I'll take what you said and go back to the drawing board. I wrote a majority of these changes to use with my Garmin device, so a lot of them are probably out of scope for this project.

Hunter275 added 2 commits June 20, 2017 23:20

Fixed parsing issue with Long Description

b562bdd

See tomasbedrich#77

Merge pull request #1 from Hunter275/Hunter275-patch-long-desc

7bedd57

Fixed parsing issue with Long Description

Hunter275 mentioned this pull request Jun 21, 2017

cache.description is not handled gracefully #77

Closed

Hunter275 added 4 commits June 22, 2017 17:47

Added ability to pull DMS, and ID

b4bce40

Merge branch 'master' of https://github.com/Hunter275/pycaching

593fc5f

Removed share/

e8ee438

Added ability to get Country and State/Provice)

fabfe64

tomasbedrich requested changes Jul 19, 2017

tomasbedrich added this to the 3.6.1 milestone Jul 20, 2017

Hunter275 closed this Jul 24, 2017
