This repository has been archived by the owner on Jan 9, 2023. It is now read-only.

The errata migration continues to fail with "pymongo.errors.DocumentTooLarge: BSON document too large" error #572

Closed
hao-yu opened this issue Jul 19, 2022 · 0 comments · Fixed by #573

Comments


hao-yu commented Jul 19, 2022

Cloned from https://bugzilla.redhat.com/show_bug.cgi?id=2074099

Version
satellite-6.9.9-1.el7sat.noarch
tfm-rubygem-katello-3.18.1.53-1.el7sat.noarch
tfm-rubygem-pulp_2to3_migration_client-0.10.0-1.el7sat.noarch

Describe the bug
Description of problem:

The following error was supposed to be fixed via Bugzilla https://bugzilla.redhat.com/show_bug.cgi?id=2003888, but it still occurs after upgrading to 6.9.8 when the count of errata to be migrated is very high.

pymongo.errors.DocumentTooLarge: BSON document too large (18938832 bytes) - the connected server supports BSON document sizes up to 16777216 bytes.

This happens even if we set PULP_CONTENT_PREMIGRATION_BATCH_SIZE as low as 25 and retry repeatedly.

Version-Release number of selected component (if applicable):

Satellite 6.9.8

How reproducible:

Intermittent; presumably it occurs when the errata count is very high.

Steps to Reproduce:

Same as Bugzilla https://bugzilla.redhat.com/show_bug.cgi?id=2003888.

Sync a large number of repositories so that a huge amount of errata must be migrated during the Pulp 2 to Pulp 3 content migration, then attempt the migration on Satellite 6.9.8.

Actual results:

Mar 28 14:06:50 satellite69 pulpcore-worker-3: RuntimeWarning)
Mar 28 14:06:50 satellite69 pulpcore-worker-3: pulp: rq.worker:ERROR: Traceback (most recent call last):
Mar 28 14:06:50 satellite69 pulpcore-worker-3: File "/usr/lib/python3.6/site-packages/rq/worker.py", line 936, in perform_job
Mar 28 14:06:50 satellite69 pulpcore-worker-3: rv = job.perform()
Mar 28 14:06:50 satellite69 pulpcore-worker-3: File "/usr/lib/python3.6/site-packages/rq/job.py", line 684, in perform
Mar 28 14:06:50 satellite69 pulpcore-worker-3: self._result = self._execute()
Mar 28 14:06:50 satellite69 pulpcore-worker-3: File "/usr/lib/python3.6/site-packages/rq/job.py", line 690, in _execute
Mar 28 14:06:50 satellite69 pulpcore-worker-3: return self.func(*self.args, **self.kwargs)
Mar 28 14:06:50 satellite69 pulpcore-worker-3: File "/usr/lib/python3.6/site-packages/pulp_2to3_migration/app/tasks/migrate.py", line 77, in migrate_from_pulp2
Mar 28 14:06:50 satellite69 pulpcore-worker-3: pre_migrate_all_content(plan)
Mar 28 14:06:50 satellite69 pulpcore-worker-3: File "/usr/lib/python3.6/site-packages/pulp_2to3_migration/app/pre_migration.py", line 70, in pre_migrate_all_content
Mar 28 14:06:50 satellite69 pulpcore-worker-3: pre_migrate_content_type(content_model, mutable_type, lazy_type, premigrate_hook)
Mar 28 14:06:50 satellite69 pulpcore-worker-3: File "/usr/lib/python3.6/site-packages/pulp_2to3_migration/app/pre_migration.py", line 301, in pre_migrate_content_type
Mar 28 14:06:50 satellite69 pulpcore-worker-3: record.id: record for record in batched_mongo_content_qs.no_cache()
Mar 28 14:06:50 satellite69 pulpcore-worker-3: File "/usr/lib/python3.6/site-packages/pulp_2to3_migration/app/pre_migration.py", line 300, in <dictcomp>
Mar 28 14:06:50 satellite69 pulpcore-worker-3: pulp2_content_by_id = {
Mar 28 14:06:50 satellite69 pulpcore-worker-3: File "/usr/lib/python3.6/site-packages/mongoengine/queryset/base.py", line 1590, in __next__
Mar 28 14:06:50 satellite69 pulpcore-worker-3: raw_doc = next(self._cursor)
Mar 28 14:06:50 satellite69 pulpcore-worker-3: File "/usr/lib64/python3.6/site-packages/pymongo/cursor.py", line 1207, in next
Mar 28 14:06:50 satellite69 pulpcore-worker-3: if len(self.__data) or self._refresh():
Mar 28 14:06:50 satellite69 pulpcore-worker-3: File "/usr/lib64/python3.6/site-packages/pymongo/cursor.py", line 1124, in _refresh
Mar 28 14:06:50 satellite69 pulpcore-worker-3: self.__send_message(q)
Mar 28 14:06:50 satellite69 pulpcore-worker-3: File "/usr/lib64/python3.6/site-packages/pymongo/cursor.py", line 1001, in __send_message
Mar 28 14:06:50 satellite69 pulpcore-worker-3: address=self.__address)
Mar 28 14:06:50 satellite69 pulpcore-worker-3: File "/usr/lib64/python3.6/site-packages/pymongo/mongo_client.py", line 1372, in _run_operation_with_response
Mar 28 14:06:50 satellite69 pulpcore-worker-3: exhaust=exhaust)
Mar 28 14:06:50 satellite69 pulpcore-worker-3: File "/usr/lib64/python3.6/site-packages/pymongo/mongo_client.py", line 1471, in _retryable_read
Mar 28 14:06:50 satellite69 pulpcore-worker-3: return func(session, server, sock_info, slave_ok)
Mar 28 14:06:50 satellite69 pulpcore-worker-3: File "/usr/lib64/python3.6/site-packages/pymongo/mongo_client.py", line 1366, in _cmd
Mar 28 14:06:50 satellite69 pulpcore-worker-3: unpack_res)
Mar 28 14:06:50 satellite69 pulpcore-worker-3: File "/usr/lib64/python3.6/site-packages/pymongo/server.py", line 116, in run_operation_with_response
Mar 28 14:06:50 satellite69 pulpcore-worker-3: sock_info.send_message(data, max_doc_size)
Mar 28 14:06:50 satellite69 pulpcore-worker-3: File "/usr/lib64/python3.6/site-packages/pymongo/pool.py", line 711, in send_message
Mar 28 14:06:50 satellite69 pulpcore-worker-3: (max_doc_size, self.max_bson_size))
Mar 28 14:06:50 satellite69 pulpcore-worker-3: pymongo.errors.DocumentTooLarge: BSON document too large (18938832 bytes) - the connected server supports BSON document sizes up to 16777216 bytes.

Expected results:

Satellite should honor the PULP_CONTENT_PREMIGRATION_BATCH_SIZE value and handle a large amount of errata during migration without such errors.

Additional info:

I think this is because the query document becomes too large when using "id__in=". Based on what I observed in the tests below, pymongo sends the full query to the server once, followed by "GetMore" operations. Neither the batch_size we set nor the fields we limit affects the size of that initial query. That means if we query half a million ids in one shot, we will (should) hit the BSON size limit.

I am using RPMs for testing because there are always more RPM records than errata, although I still don't understand why the migration only (or mostly) failed when migrating errata.
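
To see why batch_size and field projection don't help, note that the whole `$in` id list is encoded into the initial query document itself. A rough, pure-Python back-of-the-envelope estimate of that document's size (the `estimate_in_query_bytes` helper, the per-element byte layout, and the id length are illustrative approximations, not pymongo's actual encoder):

```python
# Rough, illustrative estimate of the BSON size of an {"id": {"$in": [...]}}
# filter. Exact numbers depend on id length and encoder overhead; this is an
# approximation, not pymongo's real encoding.

def estimate_in_query_bytes(num_ids: int, id_len: int = 24) -> int:
    """Approximate BSON bytes for a query filtering on num_ids string ids."""
    # BSON array element for a string: type byte + array-index key as a
    # cstring ("0", "1", ...) + int32 string length + utf-8 bytes + NUL.
    avg_key_len = len(str(num_ids))
    per_element = 1 + avg_key_len + 1 + 4 + id_len + 1
    overhead = 64  # document/array headers, the "$in" key, etc. (rough)
    return num_ids * per_element + overhead

MAX_BSON = 16 * 1024 * 1024  # 16 MiB server limit from the error message

# A query over a few hundred thousand ids is already in the same ballpark
# as the ~11.8 MB _Query sizes observed in the tests below.
print(estimate_in_query_bytes(243_594))             # ~9 MB under these assumptions
print(estimate_in_query_bytes(500_000) > MAX_BSON)  # True: would be rejected
```

The estimate is linear in the id count, which is why only shrinking the id list per query (not batch_size or field projection) can keep the query under the limit.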

### Test 1: No batch size and no limited fields ###

PULP_SETTINGS=/etc/pulp/settings.py pulpcore-manager shell
>>> from pulp_2to3_migration.pulp2 import connection
>>> from pulp_2to3_migration.app.plugin.rpm.pulp2_models import RPM
>>> connection.initialize()
>>> rpm_ids = RPM.objects.only("id").all().values_list("id")
>>> len(rpm_ids)
243594    <======================================= ~244K RPMs

>>> batched_mongo_content_qs = RPM.objects(id__in=rpm_ids)
>>> pulp2_content_by_id = {record.id: record for record in batched_mongo_content_qs.no_cache()}
>>>>operation
<pymongo.message._GetMore object at 0x7fae547edf48>
max_doc_size: 48 vs max_bson_size: 16777216

>>>>operation
<pymongo.message._Query object at 0x7fae6b272eb8>
max_doc_size: 11825054 vs max_bson_size: 16777216  <================== size of the query almost hits the 16MB limit

>>>>operation
<pymongo.message._GetMore object at 0x7fae921daec8>  <================ subsequent getMore Ops are fine
max_doc_size: 48 vs max_bson_size: 16777216
>>>>operation
<pymongo.message._GetMore object at 0x7fae89c7eec8>
max_doc_size: 48 vs max_bson_size: 16777216
>>>>operation
<pymongo.message._GetMore object at 0x7fae543fef48>
max_doc_size: 48 vs max_bson_size: 16777216
>>>>operation
<pymongo.message._GetMore object at 0x7fae886859c8>
max_doc_size: 48 vs max_bson_size: 16777216
>>>>operation
<pymongo.message._GetMore object at 0x7fae930b8ac8>
max_doc_size: 48 vs max_bson_size: 16777216
...


### Test 2: Setting batch size and limiting fields makes no difference ###

>>> from pulp_2to3_migration.pulp2 import connection
>>> from pulp_2to3_migration.app.plugin.rpm.pulp2_models import RPM
>>> connection.initialize()
pulp: pulp_2to3_migration.pulp2.connection:INFO: Attempting to connect to localhost:27017
>>> rpm_ids = RPM.objects.only("id").all().values_list("id")
>>> len(rpm_ids)
243594
>>> mongo_fields = set(['id', '_storage_path', '_last_updated', '_content_type_id'])
>>> batched_mongo_content_qs = RPM.objects(id__in=rpm_ids).only(*mongo_fields).batch_size(50)
>>> pulp2_content_by_id = {record.id: record for record in batched_mongo_content_qs.no_cache()}
<pymongo.message._Query object at 0x7fdf81476a98>
>>>>operation
<pymongo.message._Query object at 0x7fdf81476a98>
max_doc_size: 11825155 vs max_bson_size: 16777216  <======================
>>>>operation
<pymongo.message._GetMore object at 0x7fdf813c40c8>
max_doc_size: 63 vs max_bson_size: 16777216
>>>>operation
<pymongo.message._GetMore object at 0x7fdf813c40c8>
max_doc_size: 63 vs max_bson_size: 16777216
>>>>operation
<pymongo.message._GetMore object at 0x7fdf813c40c8>
max_doc_size: 63 vs max_bson_size: 16777216
>>>>operation
<pymongo.message._GetMore object at 0x7fdf813c40c8>
max_doc_size: 63 vs max_bson_size: 16777216
>>>>operation
<pymongo.message._GetMore object at 0x7fdf813c40c8>
...



### Test 3: Reducing the query size does reduce the BSON size ###

>>> from pulp_2to3_migration.pulp2 import connection
>>> from pulp_2to3_migration.app.plugin.rpm.pulp2_models import RPM
>>> connection.initialize()
pulp: pulp_2to3_migration.pulp2.connection:INFO: Attempting to connect to localhost:27017
>>> rpm_ids = RPM.objects.only("id").all().limit(5000).values_list("id")  <================= limit 5000
>>> len(rpm_ids)
5000
>>> mongo_fields = set(['id', '_storage_path', '_last_updated', '_content_type_id'])
>>> batched_mongo_content_qs = RPM.objects(id__in=rpm_ids).only(*mongo_fields).batch_size(50)
>>> pulp2_content_by_id = {record.id: record for record in batched_mongo_content_qs.no_cache()}
<pymongo.message._Query object at 0x7f39078bf1a8>
>>>>operation
<pymongo.message._Query object at 0x7f39078bf1a8>
max_doc_size: 234049 vs max_bson_size: 16777216   <=========================== query size is much smaller
>>>>operation
<pymongo.message._GetMore object at 0x7f39071d6cc8>
max_doc_size: 63 vs max_bson_size: 16777216
>>>>operation
<pymongo.message._GetMore object at 0x7f39071d6cc8>
max_doc_size: 63 vs max_bson_size: 16777216
>>>>operation
<pymongo.message._GetMore object at 0x7f39071d6cc8>
max_doc_size: 63 vs max_bson_size: 16777216
>>>>operation
<pymongo.message._GetMore object at 0x7f39071d6cc8>
max_doc_size: 63 vs max_bson_size: 16777216
>>>>operation
<pymongo.message._GetMore object at 0x7f39071d6cc8>
max_doc_size: 63 vs max_bson_size: 16777216
>>>>operation
<pymongo.message._GetMore object at 0x7f39071d6cc8>
max_doc_size: 63 vs max_bson_size: 16777216
...
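
Test 3 suggests the obvious fix: query the ids in fixed-size chunks so no single query document can approach the 16 MiB limit. A minimal sketch of that batching idea, with illustrative names (`chunked`, `fetch_in_batches` are not the actual code from the fix) and a toy in-memory dict standing in for the Mongo collection:

```python
from typing import Iterable, Iterator, List


def chunked(ids: List[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` ids."""
    for start in range(0, len(ids), size):
        yield ids[start:start + size]


def fetch_in_batches(ids: Iterable[str], query_fn, batch: int = 10_000) -> dict:
    """Build an {id: record} map by querying one chunk of ids at a time."""
    by_id = {}
    for chunk in chunked(list(ids), batch):
        # In the real migration this call would be something like
        # RPM.objects(id__in=chunk).only(*mongo_fields).no_cache()
        for record in query_fn(chunk):
            by_id[record["id"]] = record
    return by_id


# Toy stand-in for a Mongo collection: a keyed lookup over a local dict.
store = {f"id{i}": {"id": f"id{i}", "path": f"/x/{i}"} for i in range(25)}
result = fetch_in_batches(store, lambda c: (store[i] for i in c), batch=7)
print(len(result))  # 25 records, fetched in 4 chunks of at most 7 ids
```

With a chunk size of a few thousand ids, each query document stays in the hundreds of kilobytes (as in Test 3) regardless of the total errata count.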
hao-yu added a commit to hao-yu/pulp-2to3-migration that referenced this issue Jul 19, 2022
hao-yu added a commit to hao-yu/pulp-2to3-migration that referenced this issue Jul 26, 2022
hao-yu added a commit to hao-yu/pulp-2to3-migration that referenced this issue Jul 26, 2022
When pre-migrating errata, make query in batch to
prevent the BSON too large error.

closes pulp#572
ggainey pushed a commit that referenced this issue Aug 4, 2022
When pre-migrating errata, make query in batch to
prevent the BSON too large error.

closes #572
ggainey pushed a commit to ggainey/pulp-2to3-migration that referenced this issue Aug 5, 2022
When pre-migrating errata, make query in batch to
prevent the BSON too large error.

closes pulp#572
ggainey pushed a commit to ggainey/pulp-2to3-migration that referenced this issue Aug 5, 2022
When pre-migrating errata, make query in batch to
prevent the BSON too large error.

closes pulp#572

(cherry picked from commit 8ef32d4)
ipanova pushed a commit that referenced this issue Aug 9, 2022
When pre-migrating errata, make query in batch to
prevent the BSON too large error.

closes #572

(cherry picked from commit 8ef32d4)