Add extract_first() method to SelectorList #572

Closed
wants to merge 6 commits into from

5 participants

@shirk3y
shirk3y commented Jan 30, 2014

Related to discussion #568

@dangra
Scrapy project member
dangra commented Jan 30, 2014

cool, missing tests and docs.

@kmike kmike commented on an outdated diff Jan 30, 2014
scrapy/selector/unified.py
@@ -172,6 +172,10 @@ def re(self, regex):
     def extract(self):
         return [x.extract() for x in self]
+    def extract_first(self):
+        for x in self.extract():
@kmike
Scrapy project member
kmike added a note Jan 30, 2014

This is a bit inefficient: there is no need to build the full [x.extract() for x in self] list if we're only interested in the first value.

@shirk3y
shirk3y commented Jan 30, 2014

Yep, I optimized it a little.
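
For readers following the thread, here is a minimal sketch of what such an optimization could look like. It is not the code from this PR: the Selector class below is a hypothetical stand-in that only counts extract() calls, and SelectorList is a simplified list subclass, both assumed for illustration. The point is that iterating over the selectors themselves, rather than over self.extract(), means only the first element is ever extracted.

```python
class Selector(object):
    """Hypothetical stand-in for a selector; wraps a single string value."""

    def __init__(self, value):
        self.value = value
        self.calls = 0  # counts how many times extract() was called

    def extract(self):
        self.calls += 1
        return self.value


class SelectorList(list):
    """Simplified list of selectors, assumed for illustration only."""

    def extract(self):
        # Extracts every element and builds the full list.
        return [x.extract() for x in self]

    def extract_first(self):
        # Iterate over the selectors, not over self.extract(), so the
        # loop returns after extracting only the first element.
        for x in self:
            return x.extract()
        # Empty list: nothing matched.
        return None


selectors = SelectorList([Selector(u'first'), Selector(u'second')])
first = selectors.extract_first()
```

With this shape, the second selector's extract() is never called, and an empty SelectorList yields None.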

@kmike kmike and 1 other commented on an outdated diff Jan 31, 2014
docs/topics/selectors.rst
@@ -117,6 +117,16 @@ method, as follows::
>>> sel.xpath('//title/text()').extract()
[u'Example website']
+If you want to extract only first matched element, you must call the selector ``.extract_first()``
@kmike
Scrapy project member
kmike added a note Jan 31, 2014

I think "must" is too strong here - there are other means of taking first matched element.

@shirk3y
shirk3y added a note Feb 1, 2014

Agreed

@kmike kmike commented on an outdated diff Jan 31, 2014
docs/topics/selectors.rst
@@ -117,6 +117,16 @@ method, as follows::
>>> sel.xpath('//title/text()').extract()
[u'Example website']
+If you want to extract only first matched element, you must call the selector ``.extract_first()``
+
+ >>> sel.xpath('//ul/li').extract_first()
+ u'First list element'
@kmike
Scrapy project member
kmike added a note Jan 31, 2014
@kmike kmike and 1 other commented on an outdated diff Jan 31, 2014
docs/topics/selectors.rst
@@ -117,6 +117,16 @@ method, as follows::
>>> sel.xpath('//title/text()').extract()
[u'Example website']
+If you want to extract only first matched element, you must call the selector ``.extract_first()``
+
+ >>> sel.xpath('//ul/li').extract_first()
+ u'First list element'
+
+It returns ``None`` if no element was found:
+
+ >>> sel.xpath('//ul/li[999]').extract_first()
+ None
@kmike
Scrapy project member
kmike added a note Jan 31, 2014

Python shell doesn't print None in such cases. Maybe write "... is None" on the >>> line, followed by True as the output?

@shirk3y
shirk3y added a note Feb 1, 2014

That's true, I'll fix it soon
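
The reviewer's point can be checked with a small sketch. It assumes the standard interactive shell, whose echo behavior is implemented by sys.__displayhook__; the helper name shell_echo is hypothetical. It captures what the shell would print for a given expression value.

```python
import contextlib
import io
import sys


def shell_echo(value):
    """Return the text the default interactive shell would print for value."""
    buf = io.StringIO()
    with contextlib.redirect_stdout(buf):
        # The default display hook prints repr(value) unless value is None.
        sys.__displayhook__(value)
    return buf.getvalue()


none_output = shell_echo(None)          # nothing is printed for None
bool_output = shell_echo(None is None)  # the comparison result is echoed
```

This is why a doctest-style example reads more accurately as an "is None" comparison that echoes True than as a bare None on the output line.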

@ananana ananana referenced this pull request Mar 4, 2014
Closed

Selectorlist extract first #624

@kreedz
kreedz commented Feb 12, 2015

What about this feature? Will it be implemented?

@kmike
Scrapy project member
kmike commented Feb 12, 2015

I like this feature and I think we should add it. Every other library has this feature, even browsers have it via document.querySelector. There is a follow-up PR which fixes issues with this PR (#624). The problem is that we haven't agreed on it yet - see #568.

@kmike
Scrapy project member
kmike commented Mar 13, 2015

Closing it in favor of #624.

@kmike kmike closed this Mar 13, 2015