Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Removals from an ArrayCollection gets normalized wrong #285

Closed
Drachenkaetzchen opened this issue Sep 24, 2015 · 44 comments
Closed

Removals from an ArrayCollection gets normalized wrong #285

Drachenkaetzchen opened this issue Sep 24, 2015 · 44 comments

Comments

@Drachenkaetzchen
Copy link

Hi,

I just noticed that if one removes an element from a collection, the result is wrong.

Example

/**
 * @ORM\Entity
 */
class Project
{
    /**
     * @ORM\OneToMany(targetEntity="PartKeepr\ProjectBundle\Entity\ProjectPart",mappedBy="project",cascade={"persist", "remove"})
     * @Groups({"default"})
     * @var ArrayCollection
     */
    private $parts;

    public function __construct () {
        $this->parts = new ArrayCollection();
    }

    /**
     * Returns the parts array
     *
     * @return ProjectPart[] An array of ProjectPart objects
     */
    public function getParts()
    {
        return $this->parts;
    }

     /**
     * Adds a Project Part
     *
     * @param ProjectPart $projectPart An attachment to add
     */
    public function addPart($projectPart)
    {
        $projectPart->setProject($this);
        $this->parts->add($projectPart);
    }

    /**
     * Removes a Project Part
     *
     * @param ProjectPart $projectPart An attachment to remove
     */
    public function removePart($projectPart)
    {
        $projectPart->setProject(null);
        $this->parts->removeElement($projectPart);
    }
}

Note that the example class is barebones and doesn't include e.g. an identifier, which is actually the case in my project.

When I remove any part except the first, the resulting JSON has array indexes in the result:

{  
   "@context":"\/~felicitus\/PartKeepr\/web\/app_dev.php\/api\/contexts\/Project",
   "@id":"\/~felicitus\/PartKeepr\/web\/app_dev.php\/api\/projects\/25",
   "@type":"Project",
   "parts":{  
      "0":{  # This index is included which is otherwise missing
         "@id":"\/~felicitus\/PartKeepr\/web\/app_dev.php\/api\/project_parts\/758",
         "@type":"ProjectPart",
         "part":{  
            "@id":"\/~felicitus\/PartKeepr\/web\/app_dev.php\/api\/parts\/284",
            "@type":"Part",
       }
}

I would expect the resulting JSON like this:

{  
   "@context":"\/~felicitus\/PartKeepr\/web\/app_dev.php\/api\/contexts\/Project",
   "@id":"\/~felicitus\/PartKeepr\/web\/app_dev.php\/api\/projects\/25",
   "@type":"Project",
   "parts":{  
      {  # No index
         "@id":"\/~felicitus\/PartKeepr\/web\/app_dev.php\/api\/project_parts\/758",
         "@type":"ProjectPart",
         "part":{  
            "@id":"\/~felicitus\/PartKeepr\/web\/app_dev.php\/api\/parts\/284",
            "@type":"Part",
       }
}

If my adder/remover implementation is wrong, please let me know. If so, I must refer to #153 (comment) again (Sample Implementation of getters/adders/removers)

@Drachenkaetzchen
Copy link
Author

I think the root cause for this is with the json_encode function of PHP:

$data = array(
        0 => array("foo" => "bar"),
        1 => array("foo" => "bar")
);

echo json_encode($data);

$data2 = array(
        2 => array("foo" => "bar"),
        1 => array("foo" => "bar")
);

echo json_encode($data2);

Results in:

[{"foo":"bar"},{"foo":"bar"}]
{"2":{"foo":"bar"},"1":{"foo":"bar"}}

I wonder what the correct workaround is. sort() unfortunately doesn't work on ArrayCollection, and there's no option one could pass to json_encode() to ignore .

I tried the following approach, which works:

    public function getParts()
    {
        return array_values($this->parts->getIterator()->getArrayCopy());
    }

However, this is counter-intuitive and would have to be documented. I'm also not certain about the performance…

@theofidry
Copy link
Contributor

@FELICITUS I've already encountered the problem and unfortunately there is not much solutions as PHP doesn't have explicit hash map data structures.

When using ArrayCollection you can use the ::toArray():

$arr = new Doctrine\Common\Collections\ArrayCollection();

$e1 = new stdclass();
$e2 = new stdclass();

$arr->add($e1);
$arr->add($e2);

$arr->remove(0);
json_encode($arr->toArray()) // => "{"1":{}}"
json_encode(array_values($arr->toArray())) // => "[{}]"

@Drachenkaetzchen
Copy link
Author

@theofidry Yes, however, I believe this is really an issue DunglasApiBundle should solve (or a JsonEncoder which supports removing array keys). I even wonder if the hydra schema is violated with that behavior.

@theofidry
Copy link
Contributor

Personally the solution I came up with was to do the array_values in the remove method, I think it's better in terms of performance.

I believe this is really an issue DunglasApiBundle should solve

I agree it would be great to address this issue. Maybe by adding a meta tag when we want to specify the collection manipulated is actually and hashmap or having an option to handle arrays as arrays only.

@dunglas
Copy link
Member

dunglas commented Sep 25, 2015

This issue is related to Doctrine. It doesn't reindex arrays when a remove occurs for performance reasons but this lead to issues when using json_encode later (I first found this problem when using FOSRest + JMSSerializer).
The array_values option suggested by @theofidry is the better one IMO but it's not the responsibility of API Platform (it must be implemented in your domain code). However, it will be nice to have a doc explaining this edge case and the workaround.

@Drachenkaetzchen
Copy link
Author

@dunglas sorry but I have to disagree. I can't see how removals from a collection and a resulting issue is a domain-specific edge case. One can't assume that only the last item of a collection gets removed.

Think of a blog with a 1:n comments relation: If the API user removes any comment but the last, the issue will occur.

@theofidry
Copy link
Contributor

What do you think of having an annotation for hashmaps @dunglas? It's true that the root of the problem is Doctrine but they won't fix it and as @FELICITUS say, it's quite annoying as it occurs more often than not.

@dunglas
Copy link
Member

dunglas commented Sep 25, 2015

@FELICITUS OK let me reword: it's a Doctrine problem. But I think they will not fix it. I think it's not the responsibility of API Platform to fix Doctrine bugs/edge cases (we are in the process of removing all hard dependencies to Doctrine and to support other persistence system such as POMM).

I'm not for adding such annotation, it will be a pain to maintain. What do you think about an entry in the doc?

@Drachenkaetzchen
Copy link
Author

@dunglas Yes, with that I must agree. A doc entry is almost absolutely necessary. Especially if users will implement DunglasApiBundle with many entities (like I do).

@theofidry
Copy link
Contributor

I'm not for adding such annotation, it will be a pain to maintain. What do you think about an entry in the doc?

That lots of people will still encounter the issue as we use almost always use arrays rather than hashmaps and I'm not sure POMM, Eloquent or Propel are careful of that as this issue is encountered only for JSON serialization.

@dunglas
Copy link
Member

dunglas commented Sep 25, 2015

A fix can be an option in the Doctrine mapping (reorder: true or something similar). I doubt they will allow that but you can still open an issue on their bug tracker, maybe that it's a common problem.

My POV is that fixing an issue occurring in other contexts (FOSRest for instance) in API Platform is not right.
The real problem, as you pointed out, is the lack of real list structure in PHP and AFAIK this is not going to be fixed soon...

@theofidry
Copy link
Contributor

Then leave a VERY BIG mention in the doc ;)

@Drachenkaetzchen
Copy link
Author

My feeling is that this probably could be fixed in the Serializer somehow, as json_encode is most likely the culprit here due to non-consistent behavior.

@dunglas
Copy link
Member

dunglas commented Sep 25, 2015

AFAIK there is no way to fix it in the serializer, how can you distinguish when you want array(1 => 'a', 3 => 'b') => {"1": "a", "3": "b"} and for the same PHP structure (because of a Doctrine edge case) ["a", "b"]?

@theofidry
Copy link
Contributor

It's not just Doctrine, it's a problem of context: when you're manipulating collections you don't care about the order (unless you're using an hashmap), and it causes a problem only when it comes to serializing into JSON. Hence my suggestion of an annotation for the serializer.

@dunglas
Copy link
Member

dunglas commented Sep 25, 2015

It depends, if you manipulate a map or a list. The problem can be fixed in Doctrine Collection too, it should reorder the array when a operation delete occurs.

@theofidry
Copy link
Contributor

We could do that, but then if we choose this path it would means providing bridge for each Doctrine collection... well, that's a bit of a pain but that would be an extend with the overriding of only a method.

@dunglas
Copy link
Member

dunglas commented Sep 25, 2015

I was thinking about patching doctrine (provide a reorder constructor flag in for Doctrine Collection implementations for instance).

@theofidry
Copy link
Contributor

I was thinking about patching doctrine (provide a reorder constructor flag in for Doctrine Collection implementations for instance).

That would be great, assuming they'll accept it.

@Drachenkaetzchen
Copy link
Author

@theofidry that sounds good to me as well.

You guys are amazing. If you ever need something 3D-printed, let me know :)

@theofidry
Copy link
Contributor

@FELICITUS I don't now for a 3D-printed thing but maybe a little help on LoopBackApiBundle soon would be welcomed 👯 :D

@Drachenkaetzchen
Copy link
Author

@theofidry in what regards? I have an own bundle which does advanced filtering and searching, maybe merge this into LoopBackApiBundle?

@theofidry
Copy link
Contributor

Right now I'm refactoring it to supports nested search properties so it's a big WIP that I'll hopefully finish this weekend. But I would love some feedback after that and merging what can be merged in it like the JSON formatting support.

@Drachenkaetzchen
Copy link
Author

@theofidry I will write you an E-Mail in order to not pollute the issue ;)

@theofidry
Copy link
Contributor

An update on this issue.

Right now if you have an adder, you may have something like this:

public function addJob($job)
    {
        // Check for duplication
        if (false === $this->jobs->contains($job)) {
            $this->jobs->add($job);
            $this->jobs = new ArrayCollection(array_values($this->jobs->toArray()));
        }

        // Ensure the relation is bidirectional
        $job->setMandate($this);

        return $this;
    }

(Note the absence of Job type hint as otherwise the adder won't be used cc @dunglas).

Then if you do a PUT request with:

{
 "jobs": ["/api/jobs/120", "/api/jobs/121"]
}

Whatever the value of jobs was before, it will now result in ["/api/jobs/120", "/api/jobs/121"]. My point is that in this case, this show that even though we're using an adder, the result is the same as if we used a setter, which might be more performant.

@teohhanhui
Copy link
Contributor

@dunglas

AFAIK there is no way to fix it in the serializer, how can you distinguish when you want array(1 => 'a', 3 => 'b') => {"1": "a", "3": "b"} and for the same PHP structure (because of a Doctrine edge case) ["a", "b"]?

Why can't this be corrected during normalization? If we know the relation is a to-many association (metadata that we already have) we can just do array_values.

Forget about Doctrine for a bit. Even if the persistence layer is an in-memory array, the array can still become sparse, and we should properly normalize the collection.

@yelmontaser
Copy link

I think first of all that this is not a bug, but a desired behavior by json_encode function.

Note:
When encoding an array, if the keys are not a continuous numeric sequence starting from 0, all keys are encoded as strings, and specified explicitly for each key-value pair.

http://php.net/manual/en/function.json-encode.php#refsect1-function.json-encode-notes

@teohhanhui I agree, however, it must apply to Doctrine or ApiBundle?
In any case apply to the normalizer seems a simple and quick solution.

@teohhanhui
Copy link
Contributor

@dunglas @theofidry I assume the recommended approach is to do something like this then?

/**
 * @var Collection<ImageObject> An image of the item. This can be a [URL](http://schema.org/URL) or a fully described [ImageObject](http://schema.org/ImageObject).
 *
 * @ORM\ManyToMany(targetEntity="ImageObject")
 * @ORM\JoinTable(
 *     joinColumns={@ORM\JoinColumn(onDelete="CASCADE")},
 *     inverseJoinColumns={@ORM\JoinColumn(onDelete="CASCADE")}
 * )
 * @Iri("http://schema.org/image")
 */
protected $images;

/**
 * Gets images.
 *
 * @return ImageObject[]
 */
public function getImages()
{
    return $this->images->getValues();
}

@theofidry
Copy link
Contributor

@teohhanhui yeah it works. I prefer to put that in the setter instead as in my case getters are much more used than setters, but that's the gist of it.

@teohhanhui
Copy link
Contributor

@theofidry Is that possible with Doctrine collections? We can probably safely recreate a new ArrayCollection, but what if it's a PersistentCollection?

@theofidry
Copy link
Contributor

Yeap, see #285 (comment)

@teohhanhui
Copy link
Contributor

That code is unsafe (as I've pointed out...)

@theofidry
Copy link
Contributor

Ha, can you elaborate a bit on this then? Not sure to understand why.

@teohhanhui
Copy link
Contributor

A PersistentCollection has internal state. If we recreate it, it'll most likely cause problems... (I think it'll mess up the diff)

http://www.doctrine-project.org/api/orm/2.5/class-Doctrine.ORM.PersistentCollection.html

@theofidry
Copy link
Contributor

Sounds fishy indeed. I would then either go with your solution. I don't think I've ever used PersistentCollection so that's why I never encountered this case I guess.

@teohhanhui
Copy link
Contributor

Doctrine automatically creates PersistentCollection in certain (lazy loading) cases :)

Basically, you're not supposed to assume that it's an ArrayCollection.

@theofidry
Copy link
Contributor

Basically, you're not supposed to assume that it's an ArrayCollection.

But if you're initialising those fields as ArrayCollection, Doctrine won't override them will it?

@teohhanhui
Copy link
Contributor

But if you're initialising those fields as ArrayCollection, Doctrine won't override them will it?

That's only true when you're creating a new entity. You might get a PersistentCollection in its place if the entity is being retrieved from the database.

@teohhanhui
Copy link
Contributor

I think it should be safe to replace the PersistentCollection with an ArrayCollection, provided you have not yet made any changes to the collection beforehand...

@Simperfit
Copy link
Contributor

is there an ETA on this ?

@bertrandseurot
Copy link

bertrandseurot commented Oct 25, 2018

Hi all !
In ApiPlatform\Core\Serializer\AbstractItemNormalizer :

 /**
     * Normalizes a collection of relations (to-many).
     *
     * @param iterable $attributeValue
     */
    protected function normalizeCollectionOfRelations(PropertyMetadata $propertyMetadata, $attributeValue, string $resourceClass, string $format = null, array $context): array
    {
        $value = [];
        foreach ($attributeValue as $index => $obj) {
            $value[$index] = $this->normalizeRelation($propertyMetadata, $obj, $resourceClass, $format, $context);
        }
        return $value;
    }

Returning a non-indexed array solves the problem :
$value[] = $this->normalizeRelation($propertyMetadata, $obj, $resourceClass, $format, $context);

Is there a case we want to keep the indexes in a to-many relations collection ?

@benjaminrau
Copy link

@Bertranddev Thanks a lot! It tricked us for quite a while now until we noticed the wrong normalization. Then we luckely found your comment here quite fast and it solved our issue.

@bendavies
Copy link
Contributor

@darkopetreski
Copy link

darkopetreski commented Jan 28, 2024

I am experiencing the same problem in my project with Symfony and Doctrine (it is not api-platform). In the functions that modify the collections at the end, before serializing and after flushing, I clear the first level cache ($em->clear()) and then fetch fresh entity and serialize the fresh entity. It is a little overhead in some places in the system.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

10 participants