Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Native iterate through arrays and objects of type Value #25

Closed
wants to merge 12 commits into from

Conversation

valmat
Copy link
Contributor

@valmat valmat commented Mar 11, 2014

Iterate over arrays and objects of type Value

I propose to consider the iteration of arrays and objects without the use of additional containers.

A few examples:

    void loopValue(Php::Parameters &params)
    {
        std::cout << "\nArray/Object contains " << params[0].size() << " items" << std::endl;

        // Preferred variant
        // Value::iterator& Value::end() is called only once
        for(auto &i: params[0]) {
            std::cout << "\n["<< i.ind() << "]:["<< i.key() << "]=" << i.value() << "\t | \t" << i.typestr();
        }


        // Possible variant
        // Value::iterator& Value::end() is called with each iteration
        for (auto it=params[0].begin(); it!=params[0].end(); it++) {
            std::cout << "\n["<< it->ind() << "]:["<< it->key() << "]=" << it->value() << "\t | \t" << it->typestr();
        }


        // Recursive iterations
        Php::Value v;
        for(auto &i: params[0]) {
            v = i.value();
            std::cout << std::endl << "["<< i.ind() << "]:["<< i.key() << "]=" << v << "\t | \t" << i.typestr();
            for(auto &j: v) {
                std::cout << std::endl << "\t["<< j.ind() << "]:["<< j.key() << "]="<< j.value();
            }
        }
        std::cout << std::endl;
    }

Iteration method

        for(auto &i: params[0]) {...}

is more convenient and less expensive, but if someone can conveniently otherwise, he can iterate and in this way:

        for (auto it=params[0].begin(); it!=params[0].end(); it++) { {...}

or this:

        for (Value::iterator it=params[0].begin(); it!=params[0].end(); ++it) { {...}

A few words about using.
Value::iterator supports the following methods:

Value::iterator it;

/**
*  Retrieve a string key of element of the array
*  the same as (*it).key();
* 
*  @return std::string
*/
it->key(); 

/**
*  Retrieve a integer key (index) of element of the array
*  the same as (*it).ind();
* 
*  @return unsigned long
*/
it->ind(); 

/**
*  key type is string?
*  true ‒ string, false ‒ integer
*  the same as (*it).isstr();
* 
*  @return bool
*/
it->isstr(); 

/**
*  debug function
*  the same as (*it).typestr();
*  if key type is string, return "str", else return "num"
* 
*  @return const char *
*/
it->typestry(); 

Now this new feature allows just iterate through in one direction.
If necessary, it possible deepen and enhance functionality.
At this point, this iterator is not a iterator is compatible with STL C++11 and does not support the corresponding algorithms.

Why iteration in a way better than iterating through Value::mapValue():

  1. Firstly preserved the original order of elements in the array.
  2. This method is faster, as the passage is provided once, while using Value::mapValue(), we first fill std::map<std::string,Php::Value> result, and then iterate through the filled container.
    In addition, there is an economy to initialize the container std::map and operations over them.
  3. It is very convenient.

@EmielBruijntjes
Copy link
Member

I agree for the full 100% that iterators are necessary. However, your implementation should be improved to make it perfect. I do want the iterators to be C++11 STL iterator compatible (at least compatible with forward iterators), so that they do can be used in algorithms indeed.

I have no experience in making iterators, so you will have to find out yourselves how to do this. My feeling says that we need to be working with std::pairPhp::Value,Php::Value objects somewhere in the iterator...

The following code should be functional:

Php::Value map;
map["x"] = "a";
map["y"] = "b";
map["z"] = "c";

for (auto &iter : map)
{
    std::cout << iter.first << " " << iter.second << std::endl;
}

for (auto iter = map.begin(); iter != map.end(); iter++)
{
    std::cout << iter->first << " " << iter->second << std::endl;
}

Iterator methods like 'isstr()' and so on should all be removed. These methods do not belong in the iterator class, because they do not say anything about the information that is iterated over.

@valmat
Copy link
Contributor Author

valmat commented Mar 12, 2014

@EmielBruijntjes, I'm afraid you overestimate my experience in writing iterators. This is my first iterator written in C++ :). But I promise to look into this issue to the end and I'll provide the final version is ready for use.

Also I have found a bug related with memory leaks in my implementation. So this is not my last commit on this topic.
This preliminary version I have provided for that would agreement with you architectural solution before develop it further. If you are satisfied with the approach presented by me, I can extend the functionality within the chosen approach.

As to what the end container to choose (std::pair<Php::Value,Php::Value> or whatever else). I will think about this question and provide arguments. So it will be possible to solve this question in the discussion. So far, I see the positive side as in the use of std::pair and in the use of our own container.

I propose to discuss what features we expect from the iterator.
Of course, a forward iteration

for(Value::iterator &i: params[0]) {...}

and

for (Value::iterator it=params[0].begin(); it!=params[0].end(); ++it) {...}

In addition, iterate through the arrays created by the user:

Php::Value map;
map["x"] = "a";
map["y"] = "b";
map["z"] = "c";
for (auto &iter : map) {...}

May be something else? I'll think about it, but maybe someone has ideas on this subject.

By the way, in the above example should be map.size() == 3, and now map.size() == 0. I have not looked why it so.

If you don't mind, I'll to merge the master branch into my branch ValueIterator to avoid confusion in the future. Because while I'll make the corrections, the main branch may leave and to merge will need conflict resolution.

@valmat
Copy link
Contributor Author

valmat commented Mar 12, 2014

About

Php::Value map;
map["x"] = "a";
map["y"] = "b";
map["z"] = "c";
for (auto &iter : map) {...}

This form of writing in the current implementation does not fill an array!

And I do remember that it worked before. I rummaged through a huge heap of source Zend. Thought the reason in add_assoc_zval_ex. Until not realize that something has changed in the library.
In short, in such form:

Php::Value map;
map.set("x", "a");
map.set("y", "b");
map.set("z", "c");
for (auto &iter : map) {...}

Iterate through the array works!
therefore, the question of iteration an array created in the extension think resolved.

Why doesn't fill with access via operator[], I assume that the reason for the overload of []. The operator should return a reference. Something like that:

value_type& operator[](index_type idx);
const value_type& operator[](index_type idx) const;

I now have 5 hours of night, so I will not now deal with this issue.

PS Please do not close until the Pull Request. I haven't finished the work on the iterator.

PPS It would be a good idea (maybe not now but when the php-cpp will stable) to cover the library tests. Because it is quite a normal situation, when something breaks in the process of changing the code.
Can something like this: https://github.com/php/php-src/tree/master/ext/json/tests
Here is my version: https://github.com/valmat/myscrnav/tree/master/tests (I can make it work for PHP-CPP).

@EmielBruijntjes
Copy link
Member

It is a good idea to have unit tests indeed.

I think the bug with the arrays is caused by a change I made to the Value class copy constructor.

@valmat
Copy link
Contributor Author

valmat commented Mar 13, 2014

Yes. The reason is that

HashMember(const Value *base, Type index) : _base(*base), _index(index) {}

Invokes the Value class copy constructor
I will not make changes in the class HashMember as afraid to break something in it.
May be need to do _base as pointer (Value *)?

added classes HashItemObjectReverse & HashItemArrayReverse
Added ability iterate objects of classes that implement PHP-interface Traversable or Iterator
Also some cosmetic changes, improves code readability
@valmat
Copy link
Contributor Author

valmat commented Mar 15, 2014

So. Well mainly seems done.
Have to write examples and description.
Will do it tomorrow.

… of HashItem from the class Value to the class ValueIterator. Added operator+=(int) in ValueIterator

Storage HashItem in Value was not logical. Now ValueIterator itself manages its contents. In addition there are more opportunities for iteration.
Fixed: itend = params[0].end(); itend++;
Some cosmetic changes.
@EmielBruijntjes
Copy link
Member

I didn't like feel comfortable with your implementation. In stead of merging I took the ideas from your class and turned them into a different implementation.

You had stored the position of the iterator inside the Php::Value class. This is wrong: it is the iterator that keeps track of the current position, and not the container object - otherwise it would be impossible to have two iterators at the same time.

I came up with a simpler implementation. It only has support for const iterators at this moment. In C++ STL there are iterators and const iterators, as well as regular iterators and reverse iterations. Four different implementation in total. We may want to add this to the PHP-CPP library too. For now I'm closing all the pull requests and issues regaring iterating, because it seems to work.

The implementation of Php::Value::mapValue() has also been simplified now that we have the easy to use iterator.

@valmat
Copy link
Contributor Author

valmat commented Mar 16, 2014

You had stored the position of the iterator inside the Php::Value class.

I just fixed it :) but these fixes is no time to get here.
See: valmat@6d7224c

@valmat
Copy link
Contributor Author

valmat commented Mar 16, 2014

@EmielBruijntjes, for me not important how the implementation of the iteration will get in your library. And I have nothing against what would you change my classes. So no problem.
For me, the main thing that the library was the best possible functional and perform their tasks well. Since I plan to use it in my projects.

Since it late to write descriptions and examples, let me explain some ideas that formed the basis of my implementation.

Arrays and iterable types are in php is very important place. Without them it is impossible to write even the simplest project.
Not optimal realization of the iteration in the future will affect performance.
So I tried to reduce to a minimum the potential overhead costs arising at realization of the iteration. Besides, it is good when the benchmarks look beautiful :)

Implemented support iteration of the arrays, objects (fields) and objects of classes that implement an iterable interfaces (Traversable, Iterator)
For arrays and objects is possible reverse iteration. By the way, in the php is not available!

I.e. there is a

foreach($arr as $k => $v ){...}

But there is no way to iterate backwards.

Direct iteration can be carried out as follows:

for (auto it=v.begin(), itend = v.end(); it != itend; ++it) {...}

Or so (which is the same):

for(auto &i: v) {...}

Reverse only this:

for (auto it=v.rbegin(), itend = v.rend(); it != itend; ++it) {...}

In addition, my last commit added the ability to iterate over the intervals other than [v.begin(), v.end()]:

auto it=v.begin(), itend = v.begin();
++it, itend+=4;
for (; it != itend; it++) {}

Of course, the same applies to reverse iteration.

A few words about the methods available inside a loop (class HashItem)

Value value() const;

and

Value key() const

We could use a structure with fields value and key. But retrieving the value or key is a separate operation.
Why waste resources, if we only need the keys or only values?
Thus, the use of methods instead of fields allows to retrieve data only on demand. And never spend resources in vain.

unsigned long intKey() const

and

std::string strKey() const

In some cases it may be known exactly what type of keys will be used. In this case there is no need for each iteration to create and delete an object of class Php::Value.
Just get an integer or a string type - significantly cheaper operation. It seems to me to allow the programmer to choose what he wants - it's a good idea.

bool isstr() const

This method simply lets you know the type of key. it's necessary, if there are methods intKey and strKey

Maybe in the future it will be possible to implement the same ability to remove and change the values ​​of the array (for array only) in the iteration process.

I've tested my implementation. The behavior is exactly the same, which is expected from a program written in pure php and even added some nice features that are missing in pure php.

By the way. It is not obvious and few people know about it, but in php itself the next cycle (array with integer keys)

foreach($arr as $i => $v ){...}

Works faster than the next

$n = count($arr);
for($i = 0; $i < $n; $i++ ){...}

the same goes for iteration introduced in PHP-CPP.

The reason is as php arrays are arranged inside. They are implemented as a doubly linked list.
Therefore foreach($arr as $i => $v ){...} quickly retrieves the values one by one and has the complexity O(n), and for($i = 0; $i < $n; $i++ ){...} at each iteration looks for the key and has the complexity of close to O(n**2)

I will be happy if you accept my arguments when editing my code.

@EmielBruijntjes
Copy link
Member

I see your point. Indeed, when you now iterate over a Php::Value object, we create a Php::Value object for each key and for each value. If you are only interested in the values (or only in the keys), this indeed is a waste. So it would indeed be more efficient to do it differently.

But: this is also the normal behavior when you loop over std::map. I don't want to introduce anything new for PHP-CPP, if the same sort of technologies do not also exist for standard C++. The whole idea of the PHP-CPP library is that it is easy, and that it uses the same sort of functions that people already know from PHP or from C++. For a std::map iterator it is normal to return a std::pair, so our Php::Value uses std::pair objects too. This makes it easy for a C++ programmer to start using PHP-CPP.

The original Zend engine is full of difficult macro's and strange functions because such a function or macro is sometimes more efficient. But at the same time, this is why the Zend engine is so difficult. This is exactly what we try to prevent with PHP-CPP. In PHP-CPP everything is simple, and everything works in a normal, easy to understand and logical way. And indeed - sometimes this means that we have some overhead, and that it could have been more efficient had we used a different interface. But that is the price that we are willing to pay for being an easy library.

If you are not happy about the performance, the real challenge is to improve the performance, while keeping the same simple API. I think there still is a lot of room for such performance gains, especially by running less dynamic memory allocation calls. Special iterator functions like isstr(), strKey() and intKey() introduce extra complexity, and that is not what the idea of the library is. The library should be simple and normal.

We could look at different looping systems, to loop through values without having to fetch keys

for (auto &iter : Php::Values(array))
{
     std::cout << iter << std::endl;
}

I don't know if constructs like "iter+=4" are also supported by stl iterfaces. If they are - we should of course have them in PHP-CPP too. It "iter+=4" is not standard - we also don't need it in our library.

Reverse looping using rbegin() and rend() should be implemented in PHP-CPP. It shouldn't be too difficult to achieve that.

@valmat
Copy link
Contributor Author

valmat commented Mar 16, 2014

isstr(), strKey() и intKey() - were introduced just as the additional features.
The main methods of extraction of keys and values, of course, are the key() and value().

I don't see what code

for (auto &iter : arr)
{
     f( iter.key(), iter.value() );
}

is more difficult than that

for (auto &iter : arr)
{
     f( iter.first, iter.second );
}

Moreover, it seems reasonable to call the key as key and value as value.

And finally. To use the std::map as a wrapper for the php-array is not so much practical value.

Here is an example.
Suppose we have received data from a data-storage:

ID Text
30 text1
11 text2
735 text3
734 text4
733 text5

After packing of these data in a std::map they will take the form of:

ID Text
11 text2
30 text1
733 text5
734 text4
735 text3

That is, their use is not possible because the ordering is lost.

I don't insist that my version shall be final or even optimal, but the use of std::map is still leaves the question by iterate the values of the array open.
In particular, in its current form, the library is still not suitable for writing wrappers for databases.

@EmielBruijntjes
Copy link
Member

In fact, in my opinion iter.key() and iter.value() are better than iter.first and iter.second. The method names iter.key() and iter.value() are more logical, and for a first-time programmer easier to understand. If I had to design C++ from the group up, I would always go for iter.key() and iter.value(), and would not even consider a iter.first and iter.second syntax!

BUT: the simple reality is that iter.first and iter.second are already used by the C++ standard template library (STL). All C++ programmers are familiar with STL iterators, and I want the PHP-CPP iterator to exactly match the STL iterator - so that programmers can use it exactly in the same manner as the STL. Is the STL syntax completely logical? No. Is it this the most efficient implementation for PHP-CPP? No. But is it compatible with the STL and are all C++ programmers familiar with this syntax? Yes - and that is what counts here.

About your example output example: I did not mean that you should convert a Php::Value first to a std::map before you start iterating. Of course that will result in a different order! I meant that the following code should be familiar:

// all c++ programmers know that you iterate over a std::map like this
std::map<std::string,std::string> data;
for (auto &iter : data) std::cout << iter.first << ":" << iter.second << std::endl;

// and because a PHP array variable very much resembles a std::map, one
// should be able to iterate over a PHP array in exactly the same way:
Php::Value data;
for (auto &iter : data) std::cout << iter.first << ":" << iter.second << std::endl;

@valmat
Copy link
Contributor Author

valmat commented Mar 16, 2014

Now I understand what you mean about std::map. You've already included reworked version ValueIterator in the master branch. I just now saw it.
Now, I'll look.

@valmat
Copy link
Contributor Author

valmat commented Mar 16, 2014

I ran through the tests that were with me when I wrote my version.
Here are a few notes:

  • You have removed the postfix operator++ (ValueIterator& operator++(int)), i.e. If we are talking about the standard behavior, it needs.
  • Iteration object is incorrect. protected and private fields should not get in the iterator. Only a public.
  • If php-class implements Iterator or implements IteratorAggregate, we get a segmentation fault

@EmielBruijntjes
Copy link
Member

You mentioned two issues:

  • Protected and private fields should not be in an iterator. Q: why not?
  • If PHP class implemented Iterator you run into sigsegv. I just made a PHP Iterator class, but it seems to work. Can you show a demonstration script that does this?

@valmat
Copy link
Contributor Author

valmat commented Mar 17, 2014

Protected and private fields should not be in an iterator. Q: why not?

Actually I never saw anyone has iterate regular object through the foreach. But if such a feature is implemented, it should work exactly the same as in native php.
In short, because this behavior overloads native behavior php.

If PHP class implemented Iterator you run into sigsegv. I just made a PHP Iterator class, but it seems to work. Can you show a demonstration script that does this?

sigsegv occurred to me at this example:
PHP:

<?php
function out($obj) {
    $cl_name =  is_object($obj) ?
                    get_class($obj) :
                    (is_array($obj) ? "Array" : "----");

    echo "\n\n\x1b[36m++++++++++++++++++-\x1b[1;36m" , $cl_name , "\x1b[0;36m-++++++++++++++++++\n\n";

    echo "\x1b[1;31m";
    echo "\n Native iterate:\n\n";
    foreach($obj as $k => $v) {
        echo "$k\t=>\t$v\n";
    }
    echo "\x1b[0;32m";

    (new MyClass())->loopValueObject($obj);
    echo "\x1b[0m";
}

class SimpleClass {
    public    $cl0_float = 3.14;
    public    $cl0_str1  = 'public str1';
    private   $cl0_str2  = 'private str2';
    protected $cl0_str3  = 'protected str3';
}

class impIterator implements Iterator {
    private $position = 0;
    private $array = array(
        "firstElement",
        "secondElement",
        "lastelEment",
    );  

    public function __construct() {
        $this->position = 0;
    }

    function rewind() {
        $this->position = 0;
    }

    function current() {
        return $this->array[$this->position];
    }

    function key() {
        return 'key_' . $this->position;
    }

    function next() {
        ++$this->position;
    }

    function valid() {
        return isset($this->array[$this->position]);
    }

}

class impIterAggr1 implements IteratorAggregate {
    public function getIterator() {
        return new ArrayIterator(new SimpleClass);
    }
}

class impIterAggr2 implements IteratorAggregate {
    public function getIterator() {
        return new impIterator();
    }
}

$arr  = array('qwe' => 'any text',5,'asd' => 'dfgdfgdfgdfg', 'zxccvx' => 'sdfsecvyh6bug6yfty',);
$obj  = (object)$arr;
$smpl = new SimpleClass;
$it   = new impIterator;
$iag1 = new impIterAggr1;
$iag2 = new impIterAggr2;


out($arr);
out($obj);
out($smpl);
out($it);
out($iag1);
out($iag2);

C++:

...
    void loopValue(Php::Parameters &params)
    {
        std::cout << "\nArray/Object contains " << params[0].size() << " items" << std::endl;
        for (auto it=params[0].begin(), itend = params[0].end(); it != itend; ++it) {
            std::cout << "[\x1b[0;34m"<< it->first << "\x1b[0m]="<< it->second << std::endl;
        }
        return;
    }
...

Unfortunately, now I have no opportunity. But if you want, in the evening will send the prepared tests.

@valmat
Copy link
Contributor Author

valmat commented Mar 17, 2014

By the way, problems associated with iteration private and protected fields were solved in this class: https://github.com/valmat/PHP-CPP/blob/ValueIterator/src/hashpositionwrapper.cpp

Added some tests to ValueIterator
Makefile now supports command 'make test'
@valmat
Copy link
Contributor Author

valmat commented Mar 17, 2014

As promised, I have prepared the tests.

At the same time prepared a common platform for creating unit tests.
My version is just a proposal. And don't pretend to final implementation.

In addition to the tests and control scripts will need to write a comment. Today I do not have time.

A few words about the tests themselves.
To write test was chosen format phpt - as it is, to some extent, standard. It is used when writing extensions and, from my subjective point of view, is quite easy to learn.
To run the tests now run 'make test'
To create tests were written test extension that does not require installation. I.e. the tests can be run before installation of the library.
I added tests for ValueIterator (in my case because it just works). And as for example added a couple of tests for basic capabilities.
Of course, for the tests needed to create a separate Pull Request, but since the question about them came here, I thought it acceptable to add them to the branch ValueIterator.

Tomorrow with a fresh mind, I once again look. Maybe what of any improvements.

If you accept the commit with unit tests, please see which lacks comments. Since it is difficult for me to properly formulate my thoughts on English (I try but I can dedicate a little time learning the language)

To make tests for your variant ValueIterator replace in the file /path/to/PHP-CPP/tests/cpp/include/testValueIterator.h following lines

            //std::cout << "["<< it->first << "]="<< it->second << std::endl;
            std::cout << "["<< it->key() << "]="<< it->value() << std::endl;

on

            std::cout << "["<< it->first << "]="<< it->second << std::endl;

@EmielBruijntjes
Copy link
Member

It is too much. Can you make a new pull request with just the test cases?

I am in favour of having a unit testing system.

@valmat
Copy link
Contributor Author

valmat commented Mar 18, 2014

Well. I will send a new Pull Request a little later.

@EmielBruijntjes
Copy link
Member

Your example did not crash here, but valgrind did report some unitialized memory issues - which were indeed caused by a stupid error in ValueIterator.cpp.

A unit testing framework is well appreciated.

@valmat
Copy link
Contributor Author

valmat commented Mar 18, 2014

Done. Now send.
On ValueIterator tests do not pass.
Moreover, you seem to have the same error I had: valmat@796d5bf.
Here are the logs:
003.log

---- EXPECTED OUTPUT
Array/Object contains 0 items
[cl0_float]=3.14
[cl0_str1]=public str1
---- ACTUAL OUTPUT
Array/Object contains 0 items
[cl0_float]=3.14
[cl0_str1]=public str1
[^@SimpleClass^@cl0_str2]=private str2
[^@*^@cl0_str3]=protected str3
---- FAILED

004.log


---- EXPECTED OUTPUT
Array/Object contains 0 items
[key_0]=firstElement
[key_1]=secondElement
[key_2]=lastelEment
~impIterator
---- ACTUAL OUTPUT
Array/Object contains 0 items
[^@impIterator^@position]=0

Notice: Array to string conversion in /home/valmat/work/PHP-CPP/tests/php/phpt/valueiterator/004.php on line 6
[^@impIterator^@array]=Array
~impIterator
---- FAILED

005.log

---- EXPECTED OUTPUT
Array/Object contains 0 items
[cl0_float]=3.14
[cl0_str1]=public str1
~impIterAggr1
---- ACTUAL OUTPUT
Array/Object contains 0 items
~impIterAggr1
---- FAILED

006.log

---- EXPECTED OUTPUT
Array/Object contains 0 items
[key_0]=firstElement
[key_1]=secondElement
[key_2]=lastelEment
~impIterAggr2
~impIterator
---- ACTUAL OUTPUT
Array/Object contains 0 items
~impIterAggr2
---- FAILED

valmat added a commit to valmat/PHP-CPP that referenced this pull request Mar 18, 2014
@valmat valmat mentioned this pull request Mar 18, 2014
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants