[DomCrawler] Added auto-discovery and explicit registration of namespaces in filter() and filterByXPath() #6650

Merged
merged 9 commits into from

5 participants

Jakub Zalas Fabien Potencier Jean-François Simon arryon Christophe Coevoet
Jakub Zalas
Collaborator
| Q | A
| --- | ---
| Bug fix: | no
| Feature addition: | yes
| Backwards compatibility break: | yes, default namespace is no longer removed in the addContent method
| Symfony2 tests pass: | yes
| Fixes the following tickets: | #4845
| Todo: | -
| License of the code: | MIT
| Documentation PR: | symfony/symfony-docs#2979

  • added support for automatic discovery and explicit registration of document namespaces for Crawler::filterXPath() and Crawler::filter()
  • improved content type guessing in Crawler::addContent()
  • [BC BREAK] Crawler::addXmlContent() no longer removes the default document namespace

I mentioned in #4845 it would probably be possible to use DOMNode::lookupNamespaceURI() to find a namespace URI by given prefix. Unfortunately we cannot use it here since we'd have to call it on a node in the namespace we're looking for.

Current implementation makes the following query to find a namespace:

$domxpath->query('(//namespace::*[name()="media"])[last()]')
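As a rough illustration of that query outside the Crawler, here is a minimal sketch using only PHP's DOM extension. The shortened sample document and the variable names are assumptions made for the example; only the XPath expression itself comes from the PR.

```php
<?php
// Shortened version of the Atom entry used later in this discussion.
$xml = <<<'XML'
<?xml version="1.0" encoding="UTF-8"?>
<entry xmlns="http://www.w3.org/2005/Atom" xmlns:media="http://search.yahoo.com/mrss/" xmlns:yt="http://gdata.youtube.com/schemas/2007">
    <media:group>
        <media:title type="plain">Chordates</media:title>
    </media:group>
</entry>
XML;

$dom = new DOMDocument();
$dom->loadXML($xml);
$domxpath = new DOMXPath($dom);

// The namespace axis yields a namespace node for every element in scope,
// so [last()] narrows the result down to a single node.
$nodes = $domxpath->query('(//namespace::*[name()="media"])[last()]');

// The value of a namespace node is the namespace URI.
$mediaUri = $nodes->length > 0 ? $nodes->item(0)->nodeValue : null;

echo $mediaUri, PHP_EOL;
```

The URI found this way is what would then be passed to DOMXPath::registerNamespace() for the prefix in question.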
Fabien Potencier
Owner

@jakzal Any news on this PR? I would love to have this in 2.3.

Jakub Zalas
Collaborator

@fabpot I stumbled across a few issues, some of them already reported by @arryon. Let me do a brain dump here. Note that we'll have similar problems with #5886.

XML content is loaded in different ways depending on a method we use.

Consider simple XML below:

<?xml version="1.0" encoding="UTF-8"?>
<entry xmlns="http://www.w3.org/2005/Atom" xmlns:media="http://search.yahoo.com/mrss/" xmlns:yt="http://gdata.youtube.com/schemas/2007">
    <id>tag:youtube.com,2008:video:kgZRZmEc9j4</id>
    <yt:accessControl action="comment" permission="allowed"/>
    <yt:accessControl action="videoRespond" permission="moderated"/>
    <media:group>
        <media:title type="plain">Chordates - CrashCourse Biology #24</media:title>
        <yt:aspectRatio>widescreen</yt:aspectRatio>
    </media:group>
</entry>

Most of the time XML is loaded as HTML, since that is the default type in the addContent() method. addContent() is also used when calling add() or passing content to the constructor.

When loading the XML with one of the mentioned methods:

$crawler = new Crawler(file_get_contents('entry.xml'));

foreach ($crawler as $domElement) {
    echo $domElement->ownerDocument->saveXML();
}

we end up with the following result:

<?xml version="1.0" standalone="yes"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
<?xml version="1.0" encoding="UTF-8"??>
<html>
    <body>
        <entry xmlns="http://www.w3.org/2005/Atom" xmlns:media="http://search.yahoo.com/mrss/" xmlns:yt="http://gdata.youtube.com/schemas/2007">
            <id>tag:youtube.com,2008:video:kgZRZmEc9j4</id>
            <accesscontrol action="comment" permission="allowed"/>
            <accesscontrol action="videoRespond" permission="moderated"/>
            <group>
                <title type="plain">Chordates - CrashCourse Biology #24</title>
                <aspectratio>widescreen</aspectratio>
            </group>
        </entry>
    </body>
</html>

Notice that the namespace attributes remain unchanged on the entry tag, but the nodes are no longer namespaced. Our XML is now embedded in an HTML document. I'm not sure if this was intended behaviour. If so, it might be worth documenting and advising users to load XML with addXmlContent().

At the moment the only way to load XML with addXmlContent() is to call it directly:

$crawler = new Crawler();
$crawler->addXmlContent(file_get_contents('entry.xml'));

foreach ($crawler as $domElement) {
    echo $domElement->ownerDocument->saveXML();
}

addXmlContent removes all the namespaces to simplify xpath expressions, so we end up with:

<?xml version="1.0" encoding="UTF-8"?>
<entry ns="http://www.w3.org/2005/Atom" media="http://search.yahoo.com/mrss/" yt="http://gdata.youtube.com/schemas/2007">
    <id>tag:youtube.com,2008:video:kgZRZmEc9j4</id>
    <accessControl action="comment" permission="allowed"/>
    <accessControl action="videoRespond" permission="moderated"/>
    <group>
        <title type="plain">Chordates - CrashCourse Biology #24</title>
        <aspectRatio>widescreen</aspectRatio>
    </group>
</entry>

It does make XPath expressions simpler, since we no longer have to use prefixes. However, it might cause problems with complex XML documents where the same node name appears in multiple namespaces. Namespaces are there to prevent name collisions; by removing them we expose ourselves to exactly that issue.
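To make the collision concern concrete, here is a small sketch with plain DOMXPath. The document is a hypothetical variant of the sample entry with an extra Atom `title` element added for illustration, and the `atom` prefix is chosen arbitrarily.

```php
<?php
// Hypothetical document: "title" exists in both the Atom and the media namespace.
$xml = <<<'XML'
<?xml version="1.0" encoding="UTF-8"?>
<entry xmlns="http://www.w3.org/2005/Atom" xmlns:media="http://search.yahoo.com/mrss/">
    <title>Atom title</title>
    <media:group>
        <media:title type="plain">Media title</media:title>
    </media:group>
</entry>
XML;

$dom = new DOMDocument();
$dom->loadXML($xml);

$xpath = new DOMXPath($dom);
$xpath->registerNamespace('atom', 'http://www.w3.org/2005/Atom');
$xpath->registerNamespace('media', 'http://search.yahoo.com/mrss/');

// With namespaces intact, each query selects exactly one element.
$atomTitles  = $xpath->query('//atom:title');
$mediaTitles = $xpath->query('//media:title');

// Matching on the local name only is roughly what is left once prefixes
// are stripped: both elements are selected and can no longer be told apart.
$anyTitles = $xpath->query('//*[local-name()="title"]');

printf("%d / %d / %d\n", $atomTitles->length, $mediaTitles->length, $anyTitles->length);
```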

The only way to keep the original XML is to use DOMDocument:

$dom = new \DOMDocument();
$dom->loadXML(file_get_contents('entry.xml'));

$crawler = new Crawler($dom);

foreach ($crawler as $domElement) {
    echo $domElement->ownerDocument->saveXML();
}

My solution only works if we load the XML with DOMDocument. In other cases either namespace definitions or nodes are modified.

Furthermore, at the moment my solution doesn't work with a default namespace (oh irony). Just because we have a default namespace, doesn't mean we can drop it from xpath expressions. That's probably the reason why code in addXmlContent is removing it.

In other words, even if we fixed the issues with the variety of ways we can load XML, we'd have to register a default namespace in some way (either by convention, or explicitly).
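The default namespace point can be reproduced with plain DOMXPath. This is a minimal sketch; the `default` prefix is just a convention picked for the example.

```php
<?php
$xml = <<<'XML'
<?xml version="1.0" encoding="UTF-8"?>
<entry xmlns="http://www.w3.org/2005/Atom">
    <id>tag:youtube.com,2008:video:kgZRZmEc9j4</id>
</entry>
XML;

$dom = new DOMDocument();
$dom->loadXML($xml);
$xpath = new DOMXPath($dom);

// In XPath 1.0 an unprefixed name test matches elements in NO namespace,
// so this finds nothing even though <entry> and <id> carry no prefix.
$unprefixed = $xpath->query('//entry/id');

// Binding the default namespace URI to some prefix makes the nodes reachable.
$xpath->registerNamespace('default', 'http://www.w3.org/2005/Atom');
$prefixed = $xpath->query('//default:entry/default:id');

printf("%d / %d\n", $unprefixed->length, $prefixed->length);
```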

IMHO we have two problems to solve here:

  • consistency in loading XMLs
  • providing users with a way to use namespace prefixes in xpath expressions
Fabien Potencier fabpot referenced this pull request from a commit
Fabien Potencier fabpot merged branch troymccabe/cssselector_namespace_support (PR #7585)
This PR was merged into the master branch.

Discussion
----------

[CssSelector] Updated parsers to support namespaces (fix for ClassParser included)

ClassParser was passing improper parameters to `ElementNode`, as well as namespaces simply not being supported in the various parsers. This is a natural extension of #6650, by properly parsing the requested CSS filter if supplied.

| Q             | A
| ------------- | ---
| Bug fix?      | yes
| New feature?  | yes
| BC breaks?    | no
| Deprecations? | no
| Tests pass?   | yes
| Fixed tickets | NA
| License       | MIT
| Doc PR        | NA

Commits
-------

3c015d5 Updated parsers to support namespaces (fix for ClassParser included)
d9d9635
Jakub Zalas jakzal closed this
Fabien Potencier
Owner

@jakzal Are you closing because you think there are no good solutions? If that's the case, can we also close #5886 and #4845?

Jakub Zalas
Collaborator

@fabpot I can't find a good solution for autodiscovery. The main reason is that even the default namespace has to be registered before querying with XPath. I'm not sure what would be a good prefix for a default namespace (or if we should be doing this at all).

#5886 is not a serious problem, since we can actually work around it (I've posted an update there; it's possible to skip the namespace in XPath). It could become a problem with complex XML documents, where multiple namespaces contain the same nodes.

If we decided that the solution for #5886 is good enough, then we could close #4845 as well.

Otherwise we'd probably need to make a BC break, to avoid playing with namespaces when loading XMLs:

// remove the default namespace to make XPath expressions simpler
@$dom->loadXML(str_replace('xmlns', 'ns', $content), LIBXML_NONET);

I'm not sure what the intention was, but it actually removes ALL the namespaces (which is not what the comment says). We definitely have to document it.
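The effect is easy to see on the raw string, before any parsing happens. The sketch below applies the same str_replace() call to a one-line sample; the preg_replace() variant is a hypothetical narrower alternative, not code from this PR, and it assumes double-quoted attribute values.

```php
<?php
$content = '<entry xmlns="http://www.w3.org/2005/Atom" xmlns:media="http://search.yahoo.com/mrss/"/>';

// What the quoted line does: every occurrence of "xmlns" is rewritten,
// including the one inside the prefixed declaration xmlns:media.
$rewritten = str_replace('xmlns', 'ns', $content);

// Hypothetical alternative that drops only the default declaration.
// It does not handle single-quoted values or "xmlns" appearing in text.
$defaultOnly = preg_replace('/\sxmlns="[^"]*"/', '', $content);

echo $rewritten, PHP_EOL;
echo $defaultOnly, PHP_EOL;
```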

Fabien Potencier
Owner

@jakzal Would it be possible to only remove the default namespace and keep the other ones? That would be the best option I think.

Jakub Zalas
Collaborator

Removing the default namespace would actually work. We'd only need to make sure it's removed consistently (no matter how you load an XML).

I'll continue working on this.

Jakub Zalas jakzal reopened this
Jakub Zalas
Collaborator

Filtering with CSS expressions fails because CssSelector::toXPath() converts nodes to lowercase. Instead of descendant-or-self::yt:accessControl it generates descendant-or-self::yt:accesscontrol.

@jfsimon any reason why node names are lowercased?

Jakub Zalas
Collaborator

@fabpot while that's perfectly valid for CSS selectors, XPath expressions are case sensitive. That's why calling CssSelector::toXPath() yields wrong results in some cases (in my case the original accessControl becomes accesscontrol).
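The case sensitivity can be checked directly against an XML document with DOMXPath, without involving CssSelector at all. A minimal sketch using the two accessControl nodes from the sample entry:

```php
<?php
$xml = <<<'XML'
<?xml version="1.0" encoding="UTF-8"?>
<entry xmlns="http://www.w3.org/2005/Atom" xmlns:yt="http://gdata.youtube.com/schemas/2007">
    <yt:accessControl action="comment" permission="allowed"/>
    <yt:accessControl action="videoRespond" permission="moderated"/>
</entry>
XML;

$dom = new DOMDocument();
$dom->loadXML($xml);

$xpath = new DOMXPath($dom);
$xpath->registerNamespace('yt', 'http://gdata.youtube.com/schemas/2007');

// Exact case matches both nodes.
$exact = $xpath->query('descendant-or-self::yt:accessControl');

// The lowercased name matches nothing in an XML document.
$lowered = $xpath->query('descendant-or-self::yt:accesscontrol');

printf("%d / %d\n", $exact->length, $lowered->length);
```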

Fabien Potencier
Owner

@jakzal I think we can just safely remove the strtolower call then, no?

Jean-François Simon

It seems that documents loaded as HTML are case-lowered (http://fr2.php.net/manual/fr/domxpath.query.php#77048) so in this case xpath must be case-lowered too. In the case of XML (which is the one here), they're not.

Case-insensitive search can be performed

Jean-François Simon

Note that the CssSelector component already supports some differences between HTML and XML.
The case could be lowered for HTML documents only.
That said, the DomCrawler component would have to be aware of the loaded document type (HTML or XML).

Jakub Zalas
Collaborator

@jfsimon Calling CssSelector::disableHtmlExtension() before CssSelector::toXPath() makes my example work. However, it's probably not what we want with HTML.

Jean-François Simon

@jakzal exactly. The HtmlExtension must be enabled for HTML queries and disabled for XML ones.
I guess this won't be easy to implement in the DomCrawler component.
That's why I talked about case-insensitive search, which could be a "cross-type" solution (but not a perfect one).

Fabien Potencier
Owner

Any news on this one?

arryon

For my own use I 'solved' this problem by naming the default namespace 'default', and keeping a registry of all other namespaces. Xpath queries using the default namespace must then begin with 'default', or any other sensible name that you could come up with.

I haven't followed this discussion since, but here's my own commit that implements adding namespaces. It could be a starting point for someone to pick this issue up:
arryon/DomCrawler@7c4a774

Fabien Potencier
Owner

@jakzal I need to make a decision about this one: can we rely on auto-detection or do we allow people to register namespaces?

Jakub Zalas
Collaborator

Auto-detection seems to be working fine and there's no need for special treatment of a default namespace. See

I'd appreciate it if someone reviewed it to make sure I'm not missing anything (ping @jfsimon @stof).

Notice I had to manually disable the html extension for css selector with CssSelector::disableHtmlExtension() to make filter() work. Does it make sense to query XML with a css selector? I could update the filter() tests to query an HTML document instead.

Christophe Coevoet
Collaborator

@jakzal I think it makes sense to be able to use CssSelector for XML documents (this is exactly why we allow disabling the HTML extension of the CssSelector component btw).

Fabien Potencier
Owner

The CssSelector and the DomCrawler components must support both XML and HTML.

Fabien Potencier
Owner

@jakzal Can you make a PR to update the docs?

Jakub Zalas
Collaborator

:+1:

Jakub Zalas jakzal referenced this pull request in symfony/symfony-docs
Closed

[DomCrawler] Documented xml namespace autodiscovery #2979

Jakub Zalas
Collaborator

There's an inconsistency in the way XML is loaded into the Crawler; I'm not sure whether it should be addressed here or as a separate issue.

If an XML document is loaded directly from a string, Crawler::addXmlContent() is used, which removes the default namespace. If it is loaded from a DOMDocument, the default namespace is not removed.

The implication for the current PR is that, if we load a DOMDocument, filtering by the default namespace doesn't work at the moment. I think the inconsistency should be solved first; otherwise we'd have two behaviours depending on the method used to load the document.

Options we have:

  • Make addXmlContent keep the default namespace and register it under a fixed alias (default?) when calling filterXPath
  • Make addDocument remove the default namespace just like addXmlContent (this would probably require dumping to a string and loading it again, or $dom->documentElement->removeAttributeNS($dom->documentElement->getAttributeNode('xmlns')->nodeValue, '');). This wouldn't involve further changes in filterXPath.
  • Document the differences in behaviour between addXmlContent and addDocument
Fabien Potencier
Owner

For the inconsistency you found, I would go with option 1. If that's not too complex, I would like to get the fix in this PR, if not, let's create an issue.

Also, can you add a note in the component CHANGELOG file about this new feature?

Jakub Zalas
Collaborator

Updated this PR with changes described in the option 1.

Note that the outcome is we HAVE to use prefixes once there's at least one non-default namespace in the document (that's how it works in php).

We could still improve this PR by:

  • letting users change the default namespace prefix (Crawler::setDefaultNamespacePrefix()) - this would be very easy
  • allowing users to register namespaces manually - not much harder
Fabien Potencier
Owner

The 2 suggestions make sense.

Jakub Zalas
Collaborator

@fabpot both suggestions are now implemented.

src/Symfony/Component/DomCrawler/Tests/CrawlerTest.php
((53 lines not shown))
/**
* @covers Symfony\Component\DomCrawler\Crawler::filter
*/
public function testFilter()
{
+ $this->markSkippedIfCssSelectorNotPresent();
Fabien Potencier Owner
fabpot added a note

Can you remove the check for the CssSelector component? We have removed all such checks everywhere, since composer must have been run in order to execute the tests.

Jakub Zalas Collaborator
jakzal added a note

:+1:

Fabien Potencier fabpot referenced this pull request from a commit
Fabien Potencier fabpot merged branch jakzal/domcrawler-namespace-autodiscovery (PR #6650)
This PR was merged into the master branch.

Discussion
----------

[DomCrawler] Added auto-discovery and explicit registration of namespaces in filter() and filterByXPath()

| Q | A
| --- | ---
| Bug fix: | no
| Feature addition: | yes
| Backwards compatibility break: | yes, default namespace is no longer removed in the `addContent` method
| Symfony2 tests pass: | yes
| Fixes the following tickets: | #4845
| Todo: | -
| License of the code: | MIT
| Documentation PR: | symfony/symfony-docs#2979

* added support for automatic discovery and explicit registration of document namespaces for `Crawler::filterXPath()` and `Crawler::filter()`
* improved content type guessing in `Crawler::addContent()`
* [BC BREAK] `Crawler::addXmlContent()` no longer removes the default document namespace

I mentioned in #4845 it would probably be possible to use [DOMNode::lookupNamespaceURI()](http://www.php.net/manual/en/domnode.lookupnamespaceuri.php) to find a namespace URI by given prefix. Unfortunately we cannot use it here since we'd have to call it on a node in the namespace we're looking for.

Current implementation makes the following query to find a namespace:
```php
$domxpath->query('(//namespace::*[name()="media"])[last()]')
```

Commits
-------

77e2fa5 [DomCrawler] Removed checks if CssSelector is present.
9110468 [DomCrawler] Enabled manual namespace registration.
be1e4e6 [DomCrawler] Enabled default namespace prefix overloading.
943d446 [DomCrawler] Updated the CHANGELOG with namespace auto-registration details.
c6fbb13 [DomCrawler] Added support for an automatic default namespace registration.
587e2dd [DomCrawler] Made that default namespace is no longer removed when loading documents with addXmlContent().
c905bba [DomCrawler] Added more tests for namespaced filtering.
6e717a3 [DomCrawler] Made sure only the default namespace is removed when loading an XML content.
e5b8abb [DomCrawler] Added auto-discovery of namespaces in Crawler::filter() and Crawler::filterByXPath().
98c0d38
Fabien Potencier fabpot merged commit 77e2fa5 into from
Jakub Zalas jakzal deleted the branch
Fabien Potencier fabpot referenced this pull request from a commit
Fabien Potencier fabpot bug #9771 Crawler default namespace fix (crudecki)
This PR was squashed before being merged into the 2.4 branch (closes #9771).

Discussion
----------

Crawler default namespace fix

| Q             | A
| ------------- | ---
| Bug fix?      | no
| New feature?  | no
| BC breaks?    | yes
| Deprecations? | no
| Tests pass?   | yes
| Fixed tickets | #9732, #6650
| License       | MIT
| Doc PR        | symfony/symfony-docs#2979

Fix backwards compatibility of xml namespaces for having only one default namespace.

Commits
-------

cfff054 Crawler default namespace fix
4a0989b
9 src/Symfony/Component/DomCrawler/CHANGELOG.md
@@ -1,6 +1,15 @@
CHANGELOG
=========
+2.4.0
+-----
+
+ * added support for automatic discovery and explicit registration of document
+ namespaces for `Crawler::filterXPath()` and `Crawler::filter()`
+ * improved content type guessing in `Crawler::addContent()`
+ * [BC BREAK] `Crawler::addXmlContent()` no longer removes the default document
+ namespace
+
2.3.0
-----
96 src/Symfony/Component/DomCrawler/Crawler.php
@@ -28,6 +28,16 @@ class Crawler extends \SplObjectStorage
protected $uri;
/**
+ * @var string The default namespace prefix to be used with XPath and CSS expressions
+ */
+ private $defaultNamespacePrefix = 'default';
+
+ /**
+ * @var array A map of manually registered namespaces
+ */
+ private $namespaces = array();
+
+ /**
* Constructor.
*
* @param mixed $node A Node to use as the base for the crawling
@@ -92,7 +102,7 @@ public function add($node)
public function addContent($content, $type = null)
{
if (empty($type)) {
- $type = 'text/html';
+ $type = 0 === strpos($content, '<?xml') ? 'application/xml' : 'text/html';
}
// DOM only for HTML/XML content
@@ -195,9 +205,7 @@ public function addXmlContent($content, $charset = 'UTF-8')
$dom = new \DOMDocument('1.0', $charset);
$dom->validateOnParse = true;
-
- // remove the default namespace to make XPath expressions simpler
- @$dom->loadXML(str_replace('xmlns', 'ns', $content), LIBXML_NONET);
+ @$dom->loadXML($content, LIBXML_NONET);
libxml_use_internal_errors($current);
libxml_disable_entity_loader($disableEntities);
@@ -579,7 +587,8 @@ public function filterXPath($xpath)
$root->appendChild($document->importNode($node, true));
}
- $domxpath = new \DOMXPath($document);
+ $prefixes = $this->findNamespacePrefixes($xpath);
+ $domxpath = $this->createDOMXPath($document, $prefixes);
return new static($domxpath->query($xpath), $this->uri);
}
@@ -710,6 +719,25 @@ public function form(array $values = null, $method = null)
}
/**
+ * Overloads a default namespace prefix to be used with XPath and CSS expressions.
+ *
+ * @param string $prefix
+ */
+ public function setDefaultNamespacePrefix($prefix)
+ {
+ $this->defaultNamespacePrefix = $prefix;
+ }
+
+ /**
+ * @param string $prefix
+ * @param string $namespace
+ */
+ public function registerNamespace($prefix, $namespace)
+ {
+ $this->namespaces[$prefix] = $namespace;
+ }
+
+ /**
* Converts string for XPath expressions.
*
* Escaped characters are: quotes (") and apostrophe (').
@@ -792,4 +820,62 @@ protected function sibling($node, $siblingDir = 'nextSibling')
return $nodes;
}
+
+ /**
+ * @param \DOMDocument $document
+ * @param array $prefixes
+ *
+ * @return \DOMXPath
+ *
+ * @throws \InvalidArgumentException
+ */
+ private function createDOMXPath(\DOMDocument $document, array $prefixes = array())
+ {
+ $domxpath = new \DOMXPath($document);
+
+ foreach ($prefixes as $prefix) {
+ $namespace = $this->discoverNamespace($domxpath, $prefix);
+ $domxpath->registerNamespace($prefix, $namespace);
+ }
+
+ return $domxpath;
+ }
+
+ /**
+ * @param \DOMXPath $domxpath
+ * @param string $prefix
+ *
+ * @return string
+ *
+ * @throws \InvalidArgumentException
+ */
+ private function discoverNamespace(\DOMXPath $domxpath, $prefix)
+ {
+ if (isset($this->namespaces[$prefix])) {
+ return $this->namespaces[$prefix];
+ }
+
+ // ask for one namespace, otherwise we'd get a collection with an item for each node
+ $namespaces = $domxpath->query(sprintf('(//namespace::*[name()="%s"])[last()]', $this->defaultNamespacePrefix === $prefix ? '' : $prefix));
+
+ if ($node = $namespaces->item(0)) {
+ return $node->nodeValue;
+ }
+
+ throw new \InvalidArgumentException(sprintf('Could not find a namespace for the prefix: "%s"', $prefix));
+ }
+
+ /**
+ * @param $xpath
+ *
+ * @return array
+ */
+ private function findNamespacePrefixes($xpath)
+ {
+ if (preg_match_all('/(?P<prefix>[a-zA-Z_][a-zA-Z_0-9\-\.]*):[^:]/', $xpath, $matches)) {
+ return array_unique($matches['prefix']);
+ }
+
+ return array();
+ }
}
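The prefix-matching pattern from findNamespacePrefixes() in the diff above can be exercised on its own. In this sketch the free function name extractPrefixes() is hypothetical, and array_values() is added only to get a plain list back; the regular expression is copied from the diff.

```php
<?php
/**
 * Standalone copy of the pattern used by findNamespacePrefixes().
 * extractPrefixes() is a hypothetical name for illustration only.
 */
function extractPrefixes(string $xpath): array
{
    // A prefix is a name followed by a single colon; "[^:]" after the colon
    // keeps axis separators such as "descendant-or-self::" from matching.
    if (preg_match_all('/(?P<prefix>[a-zA-Z_][a-zA-Z_0-9\-\.]*):[^:]/', $xpath, $matches)) {
        return array_values(array_unique($matches['prefix']));
    }

    return [];
}

print_r(extractPrefixes('//media:group/yt:aspectRatio'));
print_r(extractPrefixes('descendant-or-self::yt:accessControl'));
print_r(extractPrefixes('//li'));
```

One thing to keep in mind is that the pattern works on the raw expression text, so a colon inside a string literal (a URL in a predicate, for example) would also be picked up as a prefix.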
90 src/Symfony/Component/DomCrawler/Tests/CrawlerTest.php
@@ -11,6 +11,7 @@
namespace Symfony\Component\DomCrawler\Tests;
+use Symfony\Component\CssSelector\CssSelector;
use Symfony\Component\DomCrawler\Crawler;
class CrawlerTest extends \PHPUnit_Framework_TestCase
@@ -370,6 +371,55 @@ public function testFilterXPath()
$this->assertCount(6, $crawler->filterXPath('//li'), '->filterXPath() filters the node list with the XPath expression');
}
+ public function testFilterXPathWithDefaultNamespace()
+ {
+ $crawler = $this->createTestXmlCrawler()->filterXPath('//default:entry/default:id');
+ $this->assertCount(1, $crawler, '->filterXPath() automatically registers a namespace');
+ $this->assertSame('tag:youtube.com,2008:video:kgZRZmEc9j4', $crawler->text());
+ }
+
+ public function testFilterXPathWithCustomDefaultNamespace()
+ {
+ $crawler = $this->createTestXmlCrawler();
+ $crawler->setDefaultNamespacePrefix('x');
+ $crawler = $crawler->filterXPath('//x:entry/x:id');
+
+ $this->assertCount(1, $crawler, '->filterXPath() lets to override the default namespace prefix');
+ $this->assertSame('tag:youtube.com,2008:video:kgZRZmEc9j4', $crawler->text());
+ }
+
+ public function testFilterXPathWithNamespace()
+ {
+ $crawler = $this->createTestXmlCrawler()->filterXPath('//yt:accessControl');
+ $this->assertCount(2, $crawler, '->filterXPath() automatically registers a namespace');
+ }
+
+ public function testFilterXPathWithMultipleNamespaces()
+ {
+ $crawler = $this->createTestXmlCrawler()->filterXPath('//media:group/yt:aspectRatio');
+ $this->assertCount(1, $crawler, '->filterXPath() automatically registers multiple namespaces');
+ $this->assertSame('widescreen', $crawler->text());
+ }
+
+ /**
+ * @expectedException \InvalidArgumentException
+ * @expectedExceptionMessage Could not find a namespace for the prefix: "foo"
+ */
+ public function testFilterXPathWithAnInvalidNamespace()
+ {
+ $this->createTestXmlCrawler()->filterXPath('//media:group/foo:aspectRatio');
+ }
+
+ public function testFilterXPathWithManuallyRegisteredNamespace()
+ {
+ $crawler = $this->createTestXmlCrawler();
+ $crawler->registerNamespace('m', 'http://search.yahoo.com/mrss/');
+
+ $crawler = $crawler->filterXPath('//m:group/yt:aspectRatio');
+ $this->assertCount(1, $crawler, '->filterXPath() uses manually registered namespace');
+ $this->assertSame('widescreen', $crawler->text());
+ }
+
/**
* @covers Symfony\Component\DomCrawler\Crawler::filter
*/
@@ -384,6 +434,30 @@ public function testFilter()
$this->assertCount(6, $crawler->filter('li'), '->filter() filters the node list with the CSS selector');
}
+ public function testFilterWithDefaultNamespace()
+ {
+ $crawler = $this->createTestXmlCrawler()->filter('default|entry default|id');
+ $this->assertCount(1, $crawler, '->filter() automatically registers namespaces');
+ $this->assertSame('tag:youtube.com,2008:video:kgZRZmEc9j4', $crawler->text());
+ }
+
+ public function testFilterWithNamespace()
+ {
+ CssSelector::disableHtmlExtension();
+
+ $crawler = $this->createTestXmlCrawler()->filter('yt|accessControl');
+ $this->assertCount(2, $crawler, '->filter() automatically registers namespaces');
+ }
+
+ public function testFilterWithMultipleNamespaces()
+ {
+ CssSelector::disableHtmlExtension();
+
+ $crawler = $this->createTestXmlCrawler()->filter('media|group yt|aspectRatio');
+ $this->assertCount(1, $crawler, '->filter() automatically registers namespaces');
+ $this->assertSame('widescreen', $crawler->text());
+ }
+
public function testSelectLink()
{
$crawler = $this->createTestCrawler();
@@ -656,6 +730,22 @@ public function createTestCrawler($uri = null)
return new Crawler($dom, $uri);
}
+ protected function createTestXmlCrawler($uri = null)
+ {
+ $xml = '<?xml version="1.0" encoding="UTF-8"?>
+ <entry xmlns="http://www.w3.org/2005/Atom" xmlns:media="http://search.yahoo.com/mrss/" xmlns:yt="http://gdata.youtube.com/schemas/2007">
+ <id>tag:youtube.com,2008:video:kgZRZmEc9j4</id>
+ <yt:accessControl action="comment" permission="allowed"/>
+ <yt:accessControl action="videoRespond" permission="moderated"/>
+ <media:group>
+ <media:title type="plain">Chordates - CrashCourse Biology #24</media:title>
+ <yt:aspectRatio>widescreen</yt:aspectRatio>
+ </media:group>
+ </entry>';
+
+ return new Crawler($xml, $uri);
+ }
+
protected function createDomDocument()
{
$dom = new \DOMDocument();