Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] Array order is not consistent #414

Closed
faruqsandi opened this issue Nov 12, 2022 · 7 comments
Closed

[BUG] Array order is not consistent #414

faruqsandi opened this issue Nov 12, 2022 · 7 comments
Labels

Comments

@faruqsandi
Copy link

faruqsandi commented Nov 12, 2022

First of all, thank you for this great library!
Describe the bug
I have a hierarchy of documents such as a course that will have multiple linked topics in an array. I put functionality in my FastAPI backend app, to change the order of topics. I see that in MongoDB Compass that the array is in the right order. But every time I call the endpoint, the Beanie Document object will return the array, not in the right order.

To Reproduce
Here is what the document looks like:
image

{
  "_id": {
    "$oid": "637011ebe4a0b602dfa188a7"
  },
  "revision_id": null,
  "author_id": 19,
  "name": "Introduction to programming",
  "description": "Lorem ipsum dolor sit amet, consectetur adipiscing elit. Suspendisse malesuada lacus ex, sit amet blandit leo lobortis eget Suspendisse malesuada lacus ex, sit amet blandit leo lobortis eget.",
  "image_url": "https://i.imgur.com/ifQ7Lxo.jpeg",
  "topics": [
    {
      "$ref": "Topic",
      "$id": {
        "$oid": "63701201e4a0b602dfa188a8"
      }
    },
    {
      "$ref": "Topic",
      "$id": {
        "$oid": "637011fbe3ae5cc480291a65"
      }
    }
  ]
}

Here is the things needed to get the doc:

class Course(Document):
    author_id: int
    name: str
    description: Optional[str]
    image_url: Optional[str]
    topics: List[Link[Topic]] = []

    class Settings:
        use_cache = False
        cache_expiration_time = datetime.timedelta(seconds=10)
        cache_capacity = 5

def topic_helper(topic: dict) -> dict:
    id = topic.pop('_id', None)
    if id:
        topic['id'] = str(id)
    return topic

async def get_topics(course_id: str) -> list:
    course = await Course.get(course_id, fetch_links=True)  # type: ignore
    topics = []
    if course:
        # await course.fetch_link(Course.topics)
        for topic in course.topics:
            await topic.fetch_all_links()  # type: ignore
            topic = topic.dict()  # type: ignore
            # _ = topic.pop('contents', None)
            topics.append(topic_helper(topic))
    return topics

And here is how I check the result:

import time
import requests
url = 'http://127.0.0.1:8000/course/637011ebe4a0b602dfa188a7/topic'

for i in range(20):
    response = requests.get(url)
    print([x['id'] for x in response.json()['data'][0]])
    time.sleep(0.2)

output:

['637011fbe3ae5cc480291a65', '63701201e4a0b602dfa188a8']
['637011fbe3ae5cc480291a65', '63701201e4a0b602dfa188a8']
['637011fbe3ae5cc480291a65', '63701201e4a0b602dfa188a8']
['637011fbe3ae5cc480291a65', '63701201e4a0b602dfa188a8']
['637011fbe3ae5cc480291a65', '63701201e4a0b602dfa188a8']
['637011fbe3ae5cc480291a65', '63701201e4a0b602dfa188a8']
['637011fbe3ae5cc480291a65', '63701201e4a0b602dfa188a8']
['637011fbe3ae5cc480291a65', '63701201e4a0b602dfa188a8']
['637011fbe3ae5cc480291a65', '63701201e4a0b602dfa188a8']
['63701201e4a0b602dfa188a8', '637011fbe3ae5cc480291a65']
['63701201e4a0b602dfa188a8', '637011fbe3ae5cc480291a65']
['637011fbe3ae5cc480291a65', '63701201e4a0b602dfa188a8']
['637011fbe3ae5cc480291a65', '63701201e4a0b602dfa188a8']
['637011fbe3ae5cc480291a65', '63701201e4a0b602dfa188a8']
['637011fbe3ae5cc480291a65', '63701201e4a0b602dfa188a8']
['637011fbe3ae5cc480291a65', '63701201e4a0b602dfa188a8']
['63701201e4a0b602dfa188a8', '637011fbe3ae5cc480291a65']
['63701201e4a0b602dfa188a8', '637011fbe3ae5cc480291a65']
['637011fbe3ae5cc480291a65', '63701201e4a0b602dfa188a8']
['637011fbe3ae5cc480291a65', '63701201e4a0b602dfa188a8']

Expected behavior
It should return an array in the right order.

Additional context

@faruqsandi
Copy link
Author

faruqsandi commented Nov 12, 2022

It seems to_list() won't return a list of objects with an order like ids.
https://github.com/roman-right/beanie/blob/2aca1994c3b6c762b26101650fb5879a4907c90e/beanie/odm/fields.py#L179

This query in MongoDB compass:

{ '_id': { $in: [ObjectId('63701201e4a0b602dfa188a8'), ObjectId('637011fbe3ae5cc480291a65')] } }

return things in order. But:

  topics1 = await Topic.find(In(Topic.id, topic_id_list)).to_list()
  topics2 = await Topic.find({'_id': {'$in': topic_id_list}}).to_list()

is not.
Sorry I am not good enough to debug things.

@roman-right
Copy link
Member

Hi,
I think this is how links are resolved. I use aggregations inside (bc normal find can not join collections) with lookup. I bet, it doesn't sort. I'll check if sort by id will not have performance issues there. And if will not - I'll make it default. If it will - I'll think about solutions. Probably optional sort parameters or smth like this.

Thank you for the issue. It is very reasonable.

@faruqsandi
Copy link
Author

Thank you for your kind response.

@github-actions
Copy link

This issue is stale because it has been open 30 days with no activity.

@github-actions github-actions bot added the Stale label Dec 20, 2022
@github-actions
Copy link

github-actions bot commented Jan 4, 2023

This issue was closed because it has been stalled for 14 days with no activity.

@github-actions github-actions bot closed this as completed Jan 4, 2023
@moonrail
Copy link

@roman-right
I also have use-cases that depend on the order of links and am unable to use fetch_links/fetch_all_links because of its reordering, which is a shame due to the usefulness of this feature.

Other than OP, the order on my setups is always the same when using fetch_*, but not as defined in DB or in Document List[Link[...]] when not resolving links.

As this issue was closed by stale bot, I'd suggest reopening it.

@moonrail
Copy link

@roman-right
This seems to fix the order in my basic tests:
https://github.com/roman-right/beanie/blob/1.18.0b1/beanie/odm/fields.py#L181-L182

The above lines replaced with:

            ids.append(PydanticObjectId(link.ref.id))  # casting to be sure to have a comparable ObjectId, as type-hinting says "Any" for `DBRef.id`
        docs = await model_class.find(In("_id", ids), with_children=True, fetch_links=fetch_links).to_list()  # type: ignore
        docs.sort(key=lambda d: ids.index(d.id))
        return docs

The overhead should be low, but of course, there is more overhead than not sorting anything.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants