Skip to content
This repository has been archived by the owner on Jan 8, 2020. It is now read-only.

Deprecate Zend\Dom\Query in favor of more logical OO approach #5356

Merged
merged 5 commits into from Nov 1, 2013

Conversation

tca3
Copy link
Contributor

@tca3 tca3 commented Oct 26, 2013

Hello everyone,

I am quite a heavy user of DOMDocument XPath queries and I have always been bothered by the Dom implementation in ZF2. The Zend\Dom\Query class seems very illogical as an object and contains quite a lot of inefficiencies and lacks access to the used DOMDocument generated from a string.

I created Zend\Dom\Document which makes more sense, because we are querying this object and not the other way around (creating a Query object that generates a DOMDocument which we then query and that we can't even access).

Zend\Dom\Query class:

$query = new Query($html);
$query->execute('a'); // Generation of a DOMDocument
$query->execute('span'); // Generation of another DOMDocument from the same input string
$query->getDOMDocument(): // Impossible

Zend\Dom\Document class:

$document = new Document($html);
$document->query('a', Document::QUERY_CSS);
$document->query('//span', Document::QUERY_XPATH);
$dom      = $document->getDomDocument();
$dom->createTextNode('<node></node>');

Problems solved:

  • Addition of a method to get the created DOMDocument instance
  • Caching of generated DOMDocument on each query
  • More logical object (Document VS Query)
  • Only one method to query a document

The methods more or less stay the same, I reused the tests already available for Query and added some for the new behavior. I deprecated Zend\Dom\Query in favor of this new class.

I'm in favor of adding this to a ZF2.* future release because I consider this more like an a new feature (new logic & different object) than a component refactor. ZF3 is very far away and Zend\Dom is currently lacking a proper DOMDocument wrapper.

return $this;
}

public function getType()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

docblock missing

@blanchonvincent
Copy link
Contributor

👍 nice refactoring. Maybe we have to keep this work for ZF3 ? cc @Maks3w

@tca3
Copy link
Contributor Author

tca3 commented Oct 26, 2013

@blanchonvincent It does qualify for a refactoring, I agree, but if we want to fully do this, we would need to change NodeList as well because I can't for the life of me figure out why it needs a cssQuery and an xPathQuery in the constructor. I would need to adress that as well as the CSS2Xpath class that should be a simple static method...

But I'd really like to see this in the next ZF2 release though :(

@blanchonvincent
Copy link
Contributor

@ThomasCantonnet I think it's too important to have this in the next release. And, yes, I have the same opinion, you need to change NodeList (you can have cssQuery && xpathQuery when you use the queryCss, because the cssQuery is transformed to a xpathQuery :) ).

* @param string|null $document
* @param string|null $forcedType Type for the provided document (see constants)
* @param string|null $forcedEncoding Encoding for the provided document
* @return Document
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@return self

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yup ok

@blanchonvincent
Copy link
Contributor

@ThomasCantonnet Maybe you can have a Zend\Dom\Document (with encoding, errors, document transformation to domdocument) and a Zend\Dom\Query (cssQuery & xpathQuery) which will use a Zend\Dom\Document.

Your code will be more decoupled and easier to test & maintain.

@tca3
Copy link
Contributor Author

tca3 commented Oct 26, 2013

@blanchonvincent Having an object just to store 2 properties and a NodeList seems like a lot of unncessary overhead to me, don't you think ? In this case all we need to do is rename NodeList to Query

@blanchonvincent
Copy link
Contributor

@ThomasCantonnet no because the "setStringDocument" and the "getDomDocument" logic will be moved into a same class, and the query logic will be on an other class.

@blanchonvincent
Copy link
Contributor

@ThomasCantonnet SRP = Single Responsibility Principle :)

@tca3
Copy link
Contributor Author

tca3 commented Oct 26, 2013

Can you give me more details on how you view this new architecture? I'm not sure I'm following.

}

return $this->domDocument;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When getDomDoument() called twice, getStringDocument() called twice ?

can be ?

if (null === $this->domDocument) {
    if (null === ($stringDocument = $this->getStringDocument())) {

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

getStringDocument() is a simple pointer to the property $stringDocument.

@weierophinney
Copy link
Member

A few things to note:

  • If you have completely new classes, this is completely Backwards Compatible (BC), meaning we can push it into 2.3.0.
  • I agree with @blanchonvincent -- have one class for Document, another for Query (I'd call the new one Zend\Dom\Document\Query to keep BC). This keeps the responsibilities clear, and makes testing easier, as Blanchon noted.
  • I also agree with Blanchon that a public setter for the document string/object doesn't make sense. This should be passed to the constructor, and remain immutable. If you want to query a different document, you should create a new instance.
  • Finally, once the design is solid, let's also update Zend\Test to use the new functionality. :)

@tca3
Copy link
Contributor Author

tca3 commented Oct 28, 2013

@weierophinney Thanks for your input, I'll look into it during the week if I have time.

On Dom\Query though, should I change all the functions while keeping BC or create a different class ? I'm not sure I can keep BC with the state of Dom\Query.

@tca3
Copy link
Contributor Author

tca3 commented Oct 30, 2013

@weierophinney @blanchonvincent Alright guys, deprecated a lot of classes but it looks much cleaner now :)

I added a Query object and refactored the NodeList object, what do you guys think about it now?

$document = new Document($html, Document::DOC_XHTML, 'UTF-8');
$query    = new Document\Query('.foo', Document\Query::TYPE_CSS);
$result   = $document->execute($query);
echo $result->count();

@weierophinney
Copy link
Member

@ThomasCantonnet I think the class NodeList must be only a container, NodeList must be have only the node list, not the factory method, no logic business. Put the logic business in the execute method in Query class.

After reviewing, I agree with this. You can remove the factory method from NodeList, and then have the execute method of Query prepare the data to inject into the NodeList.

@weierophinney
Copy link
Member

@ThomasCantonnet the parameter of the method execute is a document, strange ...

I disagree, @blanchonvincent. Look at how PHP's DOM library works, particularly DOMXPath::query -- that accepts the expression and the DOM node (or document).

With that said, @ThomasCantonnet - it may make sense to make that mirroring even more clear, and maybe do the following:

  • Remove the constructor to the Query class
  • Make the execute method signature execute($expression, Document $document, $type = static::TYPE_XPATH). Then parse the $expression if we have a CSS expression, before handing it off to DOMXPath.

Usage would then become:

$document = new Document($contents);
$nodelist    = Query($someExpression, $document, Query::TYPE_CSS);
foreach ($nodelist as $node) {
    // ...
}

Or, more succinctly, for those who like Perl golf:

foreach (Query($someExpression, new Document($contents), Query::TYPE_CSS) as $node) {
    // ...
}

Both of which look quite clean to me.

@tca3
Copy link
Contributor Author

tca3 commented Nov 1, 2013

OK guys, I removed the NodeList factory method and changed Query::execute to be a static method with Matthew's proposition for a signature.

Do we agree now ? :)

@ghost ghost assigned weierophinney Nov 1, 2013
weierophinney added a commit that referenced this pull request Nov 1, 2013
Deprecate Zend\Dom\Query in favor of more logical OO approach

Conflicts:
	tests/ZendTest/Dom/QueryTest.php
weierophinney added a commit that referenced this pull request Nov 1, 2013
- Marked Css2Xpath::transform as deprecated. It now triggers an
  E_USER_DEPRECATED error and proxies to Zend\Dom\Document\Query::cssToXpath().
weierophinney added a commit that referenced this pull request Nov 1, 2013
- Detailed the deprecation of `Zend\Dom\Css2Xpath::transform`
weierophinney added a commit that referenced this pull request Nov 1, 2013
@weierophinney weierophinney merged commit f72c22d into zendframework:develop Nov 1, 2013
@weierophinney
Copy link
Member

Merged to develop for release with 2.3.0. @ThomasCantonnet -- can you do a new PR that will update Zend\Test\PHPUnit to use the new API, please?

@tca3
Copy link
Contributor Author

tca3 commented Nov 1, 2013

Sure, I'll see what I can do!

Le 1 nov. 2013 à 22:33, weierophinney notifications@github.com a écrit :

Merged to develop for release with 2.3.0. @ThomasCantonnet -- can you do a new PR that will update Zend\Test\PHPUnit to use the new API, please?


Reply to this email directly or view it on GitHub.

@tca3 tca3 deleted the query-deprecated branch November 2, 2013 08:43
Maks3w added a commit that referenced this pull request Nov 3, 2013
Fix typo added in 064fb12
@Maks3w Maks3w mentioned this pull request Nov 3, 2013
@Maks3w
Copy link
Member

Maks3w commented Nov 3, 2013

@ThomasCantonnet @weierophinney I already made the PR for fix Zend\Test. See #5356

@Maks3w
Copy link
Member

Maks3w commented Nov 3, 2013

Well in true I fixed the deprecated call which made ZendTest\Test tests fail

weierophinney added a commit that referenced this pull request Nov 5, 2013
weierophinney added a commit that referenced this pull request Nov 5, 2013
@jmleroux
Copy link
Contributor

The documentation needs an update.

weierophinney added a commit to zendframework/zend-dom that referenced this pull request May 15, 2015
…net/query-deprecated

Deprecate Zend\Dom\Query in favor of more logical OO approach

Conflicts:
	tests/ZendTest/Dom/QueryTest.php
weierophinney added a commit to zendframework/zend-dom that referenced this pull request May 15, 2015
- Marked Css2Xpath::transform as deprecated. It now triggers an
  E_USER_DEPRECATED error and proxies to Zend\Dom\Document\Query::cssToXpath().
weierophinney added a commit to zendframework/zend-dom that referenced this pull request May 15, 2015
gianarb pushed a commit to zendframework/zend-dom that referenced this pull request May 15, 2015
Fix typo added in 064fb125603dede6a85a2a0c5409a078c69a12b5
gianarb pushed a commit to zendframework/zend-dom that referenced this pull request May 15, 2015
Issued introduced by 064fb125603dede6a85a2a0c5409a078c69a12b5
gianarb pushed a commit to zendframework/zend-dom that referenced this pull request May 15, 2015
Css2Xpath is not longer tested by the file.
gianarb pushed a commit to zendframework/zend-dom that referenced this pull request May 15, 2015
gianarb pushed a commit to zendframework/zend-dom that referenced this pull request May 15, 2015
gianarb pushed a commit to zendframework/zend-dom that referenced this pull request May 15, 2015
weierophinney added a commit to zendframework/zend-dom that referenced this pull request May 15, 2015
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants