-
Notifications
You must be signed in to change notification settings - Fork 1.3k
CSHARP-5345: Unable to deserialize C# DateOnly objects #1534
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| public class SerializerHelper | ||
| { | ||
| // private fields | ||
| private readonly long _allMemberFlags; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was assigned but not used
| case BsonType.Document: | ||
| value = default; | ||
| _helper.DeserializeMembers(context, (_, flag) => | ||
| if (_documentFormat is DateOnlyDocumentFormat.Classic) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want to be more forgiving and deserialize whatever we find instead of requiring that the stored documents match the configured format?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was also thinking about that and I'm not super sure of this either. My thought on this was:
- I suppose developers would prefer to have their
DateOnlyin one format in the database - If they were serializing
DateOnlybefore, then they probably are in theDateOnlyDocumentFormat .HumanReadableformat, while now they're serialized by default in theDateOnlyDocumentFormat.Classicformat. - I think it would be better to raise an error as soon as possible, to make developers aware of these inconsistencies, before they realize it further down the road
What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Developers should ensure that their documents are all serialized in the same way in the database.
However, we almost always are willing to deserialize whatever we find. I don't see why deserializing whatever document format we find is any different than deserializing Double vs DateTime vs Document etc...
Maybe we need other people to chime in also.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I corrected this part. The correction is kinda convoluted, but I'm not sure if we can make it easier on the eyes
| new SerializerHelper.Member("DateTime", Flags.DateTime), | ||
| new SerializerHelper.Member("Ticks", Flags.Ticks) | ||
| ); | ||
| if (_documentFormat is DateOnlyDocumentFormat.Classic) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It really throws me off to see is instead of ==
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I started doing that mostly because I think it is convenient to use this form when one must check if the enum is one of multiple values:
v is Enum.A or Enum.B vs v == Enum.A || v == Enum.B
And from there I started using also when there is only value, for consistency, even though there is no particular advantage here.
Of course I can put == if we prefer.
| return this; | ||
| } | ||
|
|
||
| return new DateOnlySerializer(representation, documentFormat); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably should include _documentFormat in line 291 below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Definitely
| case BsonType.Document: | ||
| value = default; | ||
| _helper.DeserializeMembers(context, (_, flag) => | ||
| if (_documentFormat is DateOnlyDocumentFormat.Classic) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Developers should ensure that their documents are all serialized in the same way in the database.
However, we almost always are willing to deserialize whatever we find. I don't see why deserializing whatever document format we find is any different than deserializing Double vs DateTime vs Document etc...
Maybe we need other people to chime in also.
| /// <returns>A reconfigured serializer.</returns> | ||
| protected override IBsonSerializer Apply(IBsonSerializer serializer) | ||
| { | ||
| if (serializer is DateOnlySerializer dateOnlySerializer) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if it's a list of DateOnly?
We probably should come up with a generic method we can reuse to apply attributes to nested types.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, this has come up also in the conventions recently
| /// <summary> | ||
| /// The document will contain "Year", "Month" and "Day" (all BsonType.Int32). | ||
| /// </summary> | ||
| HumanReadable |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not convinced these names are the best.
What do you think of DateTimeTicks and YearMonthDay?
Those are more self describing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it makes sense. A small thing I don't like is that DateTimeTicks at first glance looks like 3 elements, instead of two, but it's a small thing 😁
| public const long Year = 4; | ||
| public const long Month = 8; | ||
| public const long Day = 16; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need for two sets of flags. We can do this:
private static class Flags
{
public const long DateTime = 1;
public const long Ticks = 2;
public const long Year = 4;
public const long Month = 8;
public const long Day = 16;
public const long DateTimeTicks = DateTime | Ticks;
public const long YearMonthDay = Year | Month | Day;
}
See below for how the last two are used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This makes sense. Before I put it separately mostly for readability, but together with the rest this makes a lot of sense
|
|
||
| value = classicFormatFound ? VerifyAndMakeDateOnly(new DateTime(tickValue, DateTimeKind.Utc)) | ||
| : new DateOnly(year, month, day); | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This whole case BsonType.Document can be improved.
I suggest the following refactoring for this whole method:
public override DateOnly Deserialize(BsonDeserializationContext context, BsonDeserializationArgs args)
{
var bsonReader = context.Reader;
var bsonType = bsonReader.GetCurrentBsonType();
switch (bsonType)
{
case BsonType.DateTime:
return VerifyAndMakeDateOnly(BsonUtils.ToDateTimeFromMillisecondsSinceEpoch(bsonReader.ReadDateTime()));
case BsonType.Document:
var ticks = 0L;
var year = 0;
var month = 0;
var day = 0;
var foundMemberFlags = _helper.DeserializeMembers(context, (_, flag) =>
{
switch (flag)
{
case Flags.DateTime: bsonReader.SkipValue(); break; // ignore value (use Ticks instead)
case Flags.Ticks: ticks = Int64Serializer.Instance.Deserialize(context); break;
case Flags.Year: year = Int32Serializer.Instance.Deserialize(context); break;
case Flags.Month: month = Int32Serializer.Instance.Deserialize(context); break;
case Flags.Day: day = Int32Serializer.Instance.Deserialize(context); break;
}
});
if (foundMemberFlags == Flags.DateTimeTicks)
{
return VerifyAndMakeDateOnly(new DateTime(ticks, DateTimeKind.Utc));
}
else if (foundMemberFlags == Flags.YearMonthDay)
{
return new DateOnly(year, month, day);
}
else
{
throw new FormatException("Invalid document format.");
}
case BsonType.Decimal128:
return VerifyAndMakeDateOnly(new DateTime(_converter.ToInt64(bsonReader.ReadDecimal128()), DateTimeKind.Utc));
case BsonType.Double:
return VerifyAndMakeDateOnly(new DateTime(_converter.ToInt64(bsonReader.ReadDouble()), DateTimeKind.Utc));
case BsonType.Int32:
return VerifyAndMakeDateOnly(new DateTime(bsonReader.ReadInt32(), DateTimeKind.Utc));
case BsonType.Int64:
return VerifyAndMakeDateOnly(new DateTime(bsonReader.ReadInt64(), DateTimeKind.Utc));
case BsonType.String:
return DateOnly.ParseExact(bsonReader.ReadString(), "yyyy-MM-dd");
default:
throw CreateCannotDeserializeFromBsonTypeException(bsonType);
}
DateOnly VerifyAndMakeDateOnly(DateTime dt)
{
if (dt.TimeOfDay != TimeSpan.Zero)
{
throw new FormatException("Deserialized value has a non-zero time component.");
}
return DateOnly.FromDateTime(dt);
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't forget to modify Equals below to also compare the _documentFormam
_documentFormat.Equals(other._documentFormat)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure! This looks very nice
| ? hasDiscriminatorConvention.DiscriminatorConvention | ||
| : BsonSerializer.LookupDiscriminatorConvention(serializer.ValueType); | ||
|
|
||
| /// <summary> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@rstam do you think this would be an appropriate place for this method?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is too visible here. This extension method is visible whenever a serializer is used, but this method is actually used in very few places.
I would suggest making it another static method in SerializationHelper instead of an extension method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense. Unfortunately SerializationHelper is in the Driver project, so we can't access it from here.
I tried to give a look around at the Bson package but could not find anything that feel super suitable. But I can see we have BsonUtils and HexUtils classes, maybe we can make another one and call it SerializationUtils?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about a new internal class called SerializerConfigurator in the MongoDB.Bson.Serialization folder?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggested new method:
namespace MongoDB.Bson.Serialization
{
internal static class SerializerConfigurator
{
internal static IBsonSerializer ReconfigureSerializer<TSerializer>(IBsonSerializer serializer, Func<TSerializer, IBsonSerializer> reconfigure)
{
switch (serializer)
{
case IChildSerializerConfigurable childSerializerConfigurable:
var childSerializer = childSerializerConfigurable.ChildSerializer;
var reconfiguredChildSerializer = ReconfigureSerializer(childSerializer, reconfigure);
return reconfiguredChildSerializer != null? childSerializerConfigurable.WithChildSerializer(reconfiguredChildSerializer) : null;
case TSerializer typedSerializer:
return reconfigure(typedSerializer);
default:
return null;
}
}
}
}
Note the name of the method and the removal of the unused shouldReconfigure parameter.
While you may have a future use in mind for shouldReconfigure as a general rule don't add anything until you actually need it. You might change your mind about it between now and then.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense! I've added this and corrected the call.
| ? hasDiscriminatorConvention.DiscriminatorConvention | ||
| : BsonSerializer.LookupDiscriminatorConvention(serializer.ValueType); | ||
|
|
||
| /// <summary> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is too visible here. This extension method is visible whenever a serializer is used, but this method is actually used in very few places.
I would suggest making it another static method in SerializationHelper instead of an extension method.
| /// <returns> | ||
| /// The reconfigured serializer, or <c>null</c> if no leaf serializer could be reconfigured. | ||
| /// </returns> | ||
| internal static IBsonSerializer GetReconfigured<T>(this IBsonSerializer serializer, Func<T, IBsonSerializer> reconfigure, Func<IBsonSerializer, bool> shouldReconfigure = null ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would name the method ReconfigureSerializer.
I would rename the <T> type parameter to TSerialzier. We only use T when it could be absolutely any type whatsoever, and sometimes not even then.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand the purpose of the shouldReconfigure parameter.
I mean I understand the code here... just not WHY shouldReconfigure exists. It's never used.
If you've only added it for POSSIBLE future use, then let's not. Let's wait until it's actually needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Regarding both points, I was actually having in mind the PR for configuring the EnumRepresentationConvention to work with collections of enums that we parked until the new major version, in particular this method.
For that case, because EnumSerializer is a generic type we can't simply use serializer is TSerializer and we need to do something else. Using the GetReconfigured method in that case would look something like this:
var reconfigured = serializer.GetReconfigured<IRepresentationConfigurable>(
configurable => configurable.WithRepresentation(_representation),
bsonSerializer => serializer.ValueType.IsEnum);This is one case where the shouldReconfigure method is used, because we cannot simply check the serializer type as we do it in this PR.
Also, this shows why I didn't use TSerializer in the first place, because I thought it could cause some confusion given that it is not necessarily a serializer. Maybe it would be worth to call it TSerializer in any case to make it clearer?
I've also changed a little bit the xml docs because they were deceiving.
| var dayFound = false; | ||
|
|
||
| var tickValue = 0L; | ||
| var ticks = 0L; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also get rid of the value variable and replace assignments to value with return.
Like in line 148 below.
Refer to the suggested refactoring in this comment if necessary:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense! My bad, I didn't notice the returns in your suggestion before 🤦
| obj is DateOnlySerializer other && | ||
| _representation.Equals(other._representation); | ||
| _representation.Equals(other._representation) && | ||
| _documentFormat.Equals(other._documentFormat); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably need to refactor Equals test to test this change.
|
Please rebase on main and force push to fix the failing auth tests in the patch build. |
| ? hasDiscriminatorConvention.DiscriminatorConvention | ||
| : BsonSerializer.LookupDiscriminatorConvention(serializer.ValueType); | ||
|
|
||
| /// <summary> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about a new internal class called SerializerConfigurator in the MongoDB.Bson.Serialization folder?
| /// <returns>A reconfigured serializer.</returns> | ||
| protected override IBsonSerializer Apply(IBsonSerializer serializer) | ||
| { | ||
| var reconfiguredSerializer = serializer.GetReconfigured<DateOnlySerializer>(s => s.WithRepresentation(_representation, _documentFormat)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After moving and renaming the method I would change this line to:
var reconfiguredSerializer = SerializerConfigurator.ReconfigureSerializer(serializer, (DateOnlySerializer s) => s.WithRepresentation(_representation, _documentFormat));
| ? hasDiscriminatorConvention.DiscriminatorConvention | ||
| : BsonSerializer.LookupDiscriminatorConvention(serializer.ValueType); | ||
|
|
||
| /// <summary> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggested new method:
namespace MongoDB.Bson.Serialization
{
internal static class SerializerConfigurator
{
internal static IBsonSerializer ReconfigureSerializer<TSerializer>(IBsonSerializer serializer, Func<TSerializer, IBsonSerializer> reconfigure)
{
switch (serializer)
{
case IChildSerializerConfigurable childSerializerConfigurable:
var childSerializer = childSerializerConfigurable.ChildSerializer;
var reconfiguredChildSerializer = ReconfigureSerializer(childSerializer, reconfigure);
return reconfiguredChildSerializer != null? childSerializerConfigurable.WithChildSerializer(reconfiguredChildSerializer) : null;
case TSerializer typedSerializer:
return reconfigure(typedSerializer);
default:
return null;
}
}
}
}
Note the name of the method and the removal of the unused shouldReconfigure parameter.
While you may have a future use in mind for shouldReconfigure as a general rule don't add anything until you actually need it. You might change your mind about it between now and then.
| { | ||
| internal static class SerializerConfigurator | ||
| { | ||
| /// <summary> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
internal things don't need doc comments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They don't need it, but I think it's not bad to have it. Maybe it helps at understanding what this method does without trying to read too much the implementation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure you can leave these comments, but we're not going to start writing doc comments for internal classes in general. In general code should be written with proper naming that makes code self-describing.
Occasionally some small comments here and there may be helpful, but full-blown doc comments are not necessary.
| /// <returns>A reconfigured serializer.</returns> | ||
| protected override IBsonSerializer Apply(IBsonSerializer serializer) | ||
| { | ||
| var reconfiguredSerializer = SerializerConfigurator.ReconfigureSerializer<DateOnlySerializer>(serializer, s => s.WithRepresentation(_representation, _documentFormat)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I kind of liked my suggestion in an earlier review to put the DateOnlySerializer on the lambda parameter and let the compiler infer the generic method type from that:
var reconfiguredSerializer = SerializerConfigurator.ReconfigureSerializer(serializer, (DateOnlySerializer s) => s.WithRepresentation(_representation, _documentFormat));
Don't know if you just didn't notice that or didn't like it.
If you didn't like it we can go with what you have now.
rstam
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
| { | ||
| internal static class SerializerConfigurator | ||
| { | ||
| /// <summary> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure you can leave these comments, but we're not going to start writing doc comments for internal classes in general. In general code should be written with proper naming that makes code self-describing.
Occasionally some small comments here and there may be helpful, but full-blown doc comments are not necessary.
|
Hey guys. (lower case is because I have My question is: is it the correct final way to preserve the old behavior or this PR with |
|
@sguryev Yes, what you posted is a good "temporary" fix for the problem. When the content of this PR will be included with the next version you can either:
|



JIRA ticket