Newlines in titles #9

Closed
rromanchuk opened this Issue Sep 10, 2011 · 2 comments

Contributor

rromanchuk commented Sep 10, 2011

This might be controversial, but it would be nice to strip any newlines from the value returned by Scraper#title.

<title> Carol Bartz exclusive: Yahoo "f---ed me over" - Postcards </title>

# page.title
=> "\n\t\t Carol Bartz exclusive: Yahoo "f---ed me over" - \n\t\tPostcards\t"

I ran into the above at http://postcards.blogs.fortune.cnn.com/2011/09/08/carol-bartz-fired-yahoo/. We could probably do the caller a favor by cleaning up extra markup, which is usually not expected in a title.

LMK and I'll check it in, and add the rest of the missing tests.

Owner

jaimeiniesta commented Sep 10, 2011

Sounds fine to me.

I think we should respect html markup, but not newlines or tabs.
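For illustration, the cleanup discussed here could look roughly like the sketch below. `clean_title` is a hypothetical helper name, not the gem's actual method, and this is not necessarily how the fix was implemented. It assumes the raw title arrives as a plain String (or nil) and only collapses whitespace, so any HTML markup or entities in the title are left untouched.

```ruby
# Hypothetical helper (assumed name); a sketch of the idea, not the library's code.
def clean_title(raw)
  return nil if raw.nil?
  # Collapse newlines, tabs, and runs of spaces into a single space,
  # then trim leading and trailing whitespace. Markup is not touched.
  raw.gsub(/\s+/, ' ').strip
end

raw = "\n\t\t Carol Bartz exclusive: Yahoo \"f---ed me over\" - \n\t\tPostcards\t"
puts clean_title(raw)
# prints: Carol Bartz exclusive: Yahoo "f---ed me over" - Postcards
```

Collapsing with a single regex keeps the spaces between words, which a plain delete of "\n" and "\t" would not guarantee when a line break is the only separator.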

Owner

jaimeiniesta commented Apr 28, 2012

Fixed in this commit: 8d5a1de
