Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BC break: Detected unsupported BSON type 0x06 (undefined) #663

Closed
amcsi opened this issue Oct 23, 2017 · 13 comments
Closed

BC break: Detected unsupported BSON type 0x06 (undefined) #663

amcsi opened this issue Oct 23, 2017 · 13 comments

Comments

@amcsi
Copy link

amcsi commented Oct 23, 2017

Description

Seems like if any value in a document is undefined, an exception is thrown since version 1.3+

After upgrading to mongo-php-driver 1.3+, we started getting the error message in the title. I believe this is a big BC break that wasn't clearly warned about in the changelog.
It is common to mistakenly get undefined into a value in MongoDB by accident through a badly written JS migration script, and it's not like MongoDB can be told to never store undefined.
Probably caused by:
#545
PHPC-302
https://jira.mongodb.org/secure/ReleaseNote.jspa?projectId=12484&version=17597

Version

1.3.0

The version we used before was 1.2.9 which didn't have this issue.

Environment

CentOS

Expected and Actual Behavior

Expected: undefined works as normal, being treated as NULL in PHP

Actual: exception being thrown: "Detected unsupported BSON type 0x06"

@amcsi amcsi changed the title Detected unsupported BSON type 0x06 BC break: Detected unsupported BSON type 0x06 (undefined) Oct 23, 2017
@derickr
Copy link
Contributor

derickr commented Oct 23, 2017

Where did you upgrade from? Please let us know both which extension (mongo vs mongodb), and version number.

@amcsi
Copy link
Author

amcsi commented Oct 23, 2017

@derickr the change was mongodb driver from 1.2.9 -> 1.3.0

@jmikola
Copy link
Member

jmikola commented Oct 23, 2017

Expected: undefined works as normal, being treated as NULL in PHP

Converting BSON's undefined type to PHP's null type is not strictly correct, as they are technically different types (BSON defines a separate null type) and silent conversion could result in a BSON document being inadvertently modified as it passes through PHP. That could pose a problem for another application/driver that may rely differentiating null from undefined.

The fact that the driver previously ignored these meant that an undefined value simply disappeared once it reached PHP, which could be as problematic as converting undefined to null. In both cases, the original type information was lost.

I believe this is a big BC break that wasn't clearly warned about in the changelog.

Re-reading the 1.3.0 release notes, we certainly could have done a better job highlighting this change in paragraph form, rather than leaving it within the long list of resolved JIRA tickets. I'll do my best to avoid making that mistake in the future.


@derickr: I think we may need to reconsider adding classes for these unsupported BSON types, which would allow the PHP driver to round-trip them reliably. Thoughts?

@derickr
Copy link
Contributor

derickr commented Oct 24, 2017

I don't think we should support deprecated data types, and that our current behaviour is correct as to prevent data corruption.

@amcsi
Copy link
Author

amcsi commented Oct 24, 2017

@derickr the current behaviour doesn't prevent data corruption. We got bad data due to JS migrations being run in the DB.

For example:

document.existing_property = document.non_existent_property;
// Saving an undefined value (by accident)
db.someCollection.save(document);

@jmikola
Copy link
Member

jmikola commented Oct 24, 2017

I believe the "data corruption" @derickr was referring to was due to the behavior of the 1.2.x driver, which silently ignored unsupported BSON types (i.e. undefined, DBPointer, and symbol) when converting to PHP and could result in data loss (of the fields with deprecated types) should the PHP value be written back to the database.

To that end, throwing an exception does prevent data corruption, as it prevents the user from converting BSON to PHP in the first place.

If we were to introduce classes to represent these BSON types, it would have to be in 1.4.0. Given that 1.3.0 was just released, we don't yet have a timeline for 1.4.0. Timelines aside, I still have reservations about introducing new classes that would allow users to create documents with these deprecated types. All of the "do not use this" warnings in the world can still fall on deaf ears (case in point: MongoTimestamp in the legacy driver). Additionally, if other drivers also do not support reading unsupported BSON types, the PHP driver then starts contributing to this problem.

We got bad data due to JS migrations being run in the DB.

As you said, the undefined values were introduced "by accident". I think the best course of action is attempting to rectify the bad data migration. The $type query operator can be used to search for documents where fields have the undefined type (e.g. { "foo": { "$type": "undefined" }}), after which you can use $unset to remove the deprecated types. After this is resolved, you should be able to upgrade to 1.3.x without issue.

I'm not sure if those undefined types were introduced by the shell or Node.js driver, but I would also encourage you to open a ticket in the SERVER (component: "shell") or NODE projects, respectively, to request that the behavior be fixed. IMO, neither the shell nor any driver should allow users to create deprecated BSON types (and certainly not inadvertently).


As an aside, I also noted that the driver documentation only discusses the exception for unsupported types on MongoDB\BSON\toPHP(). Since most BSON conversion actually happens during cursor iteration, I'll aim to get this documented on MongoDB\Driver\Cursor as well. I've opened PHPC-1024 to track that.

@amcsi
Copy link
Author

amcsi commented Oct 24, 2017

@jmikola

The $type query operator can be used to search for documents where fields have the undefined type (e.g. { "foo": { "$type": "undefined" }}), after which you can use $unset to remove the deprecated types.

That is assuming that we know for certain that it's 1 concrete property in 1 conrete collection, whereas in reality there are random undefined values in all collections deep within objects of arrays of objects, which so far wasn't causing problems with doctrine-mongodm-odm, because those values would continue to be just treated as NULL.

Just saying, so that you would know an example of a real world scenario of the usage of this driver.

@amcsi
Copy link
Author

amcsi commented Oct 24, 2017

Unfortunately despite undefined being deprecated, MongoDB allows for them to be set. Nothing's really done to prevent the user from being able to insert such values.
We're using MongoDB 3.4 by the way.

@jmikola
Copy link
Member

jmikola commented Oct 25, 2017

I'll revisit discussion with Derick later this week. Perhaps we can settle on a way to allow deserialization of these types in existing BSON while still making it terribly inconvenient for users to insert such values (e.g. BSON classes with private constructors). If we do create a task for that, I'll cross-reference the JIRA ticket here. That said, 1.4.0 will still be the earliest release in which we'll be able to address that.

@jacqueswaller
Copy link

We're also getting bitten by this. On a find query, the exception is only thrown if the field with "undefined" value is in your projection. However for several other queries, like "distinct", the exception is thrown regardless of which field is "undefined".

If you want to throw exceptions when I try to access "undefined", there's debatable value in that. But making a document completely unusable because a field I didn't care about is "undefined", that's not so helpful.

On that note, if the "distinct" query is processing the entire document field by field then there may be room for optimization since it should only care about the field I specified in the command.

I would also mention that it's difficult to search for documents with "undefined" values since the documents are schemaless which strictly speaking means we don't know what to search for. The only real option for flushing them out is to "findAll", in _id sorted order, with no projection on every collection of every database which is time consuming to say the least.

So ultimately it's a "fix as we find" type of thing, which would drastically easier if the exception also printed the _id of the offending document. The fieldname gives us something but if it's not an indexed field then it could take days to find the offending document depending on the collection.

@jmikola
Copy link
Member

jmikola commented Oct 26, 2017

Points noted. @amcsi, @jacqueswaller: Thank you both for the valuable feedback.

PHPC-1026 will revert the exception behavior and instead log a message, which most users will ignore. We can push this out in a 1.3.2 release within the week.

I've created PHPC-1027 to introduce classes for these deprecated types in 1.4.0 as I still don't think having the driver should ignore them when converting BSON to PHP and classes would allow for lossless round-tripping.

@jmikola
Copy link
Member

jmikola commented Oct 30, 2017

FYI: the exception behavior has been reverted and we've released 1.3.2. Please let me know if that resolves things for you guys (at least until we implement support for round-tripping these types in 1.4.0).

@jmikola
Copy link
Member

jmikola commented Jan 23, 2018

Closing this, as we haven't heard back since my last comment about PHPC-1026 being addressed in the 1.3.2 release.

Since then, we've also implemented PHPC-1027 and released 1.4.0beta1 and 1.4.0RC1. Quoting the 1.4.0beta1 release notes:

This release adds new classes for deprecated BSON types (DBPointer, Symbol, and Undefined), which will allow the driver to round-trip documents containing such types. Previously, the driver would throw an exception when it encountered a deprecated type.

These classes are also now documented on http://php.net/manual/en/book.bson.php.

@jmikola jmikola closed this as completed Jan 23, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants