Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ignore members with default values #678

Open
ygoe opened this issue Nov 26, 2019 · 29 comments
Open

Ignore members with default values #678

ygoe opened this issue Nov 26, 2019 · 29 comments

Comments

@ygoe
Copy link

ygoe commented Nov 26, 2019

Is your feature request related to a problem? Please describe.

My application needs to send structured data over the network regularly. Not all members of that class are always required. For example, data that will never change only needs to be sent once to save bandwidth. Other data may only need to be sent when it has changed. The remote side will track old and new data and add/update it accordingly.

I cannot use MessagePack serialization like this because it will always include all members of the class. Even if I set the members to binary-minimal values (which is already complicated for DateTime) their names will still be included. Using numeric keys is not possible for versioning reasons.

Describe the solution you'd like

Optionally, when MessagePack sees a member that has its default value (default in C#, null for reference types, some form of 0 otherwise), it should simply leave out that member from serialization. The serialized data then looks as if that member wasn't there. On deserialization, it will be left out anyway and remain its default value already.

The DefaultValueAttribute should also be respected to determine the member's default value.

This option could be triggered at a member level with a new IgnoreMemberIfDefaultValue attribute, and a similar solution on the class level for all members not already ignored otherwise.

Describe alternatives you've considered

The IgnoreMember attribute is closest, but can only ignore members every time, not depending on their value.

My current workaround code is this:

/// <summary>
/// Implements a MessagePack formatter that leaves out object properties that have their default
/// value, as well as properties marked as ignored.
/// </summary>
/// <typeparam name="T">The type to format.</typeparam>
public class NoDefaultsFormatter<T> : IMessagePackFormatter<T>
{
	/// <summary>
	/// Serializes the object.
	/// </summary>
	public int Serialize(ref byte[] bytes, int offset, T value, IFormatterResolver formatterResolver)
	{
		var dict = new Dictionary<string, object>();
		foreach (var property in typeof(T).GetProperties())
		{
			if (property.GetCustomAttributes(typeof(IgnoreMemberAttribute), false).Any())
				continue;
			object propValue = property.GetValue(value);
			if (object.Equals(propValue, DeepConvert.GetDefaultValue(property.PropertyType)))
				continue;
			string camelCaseName = char.ToLowerInvariant(property.Name[0]) + property.Name.Substring(1);
			dict[camelCaseName] = propValue;
		}
		return formatterResolver.GetFormatterWithVerify<Dictionary<string, object>>().Serialize(ref bytes, offset, dict, formatterResolver);
	}

	/// <summary>
	/// Deserializes the object.
	/// Not implemented because this operation is performed by DeepConvert.
	/// </summary>
	public T Deserialize(byte[] bytes, int offset, IFormatterResolver formatterResolver, out int readSize)
	{
		throw new NotImplementedException();
	}
}

Note that this isn't optimised in anyway. Using reflection like this is the slowest possible implementation. This implementation targets MessagePack v1.7.

@AArnott
Copy link
Collaborator

AArnott commented Jan 13, 2020

I would propose we solve this using a similar pattern to JSON.NET's ShouldSerialize methods. Then it can be based on default values or anything else the code wants.

@ygoe
Copy link
Author

ygoe commented Jan 19, 2020

@AArnott That would still leave the reflection stuff to the user. It would be a lot simpler if the library already supported this option. It could use the same optimised reflection as for other parts.

@github-actions
Copy link

This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 5 days.

@AArnott
Copy link
Collaborator

AArnott commented Apr 19, 2020

Let's look at this both from the perspective of skipping serialization of values (when using property names as keys) and from skipping deserialization of null values (i.e. don't call the property setter).

@vpodlnk
Copy link

vpodlnk commented Jun 21, 2020

Hi! I'm wondering if there are any alternatives to @ygoe workaround?

@AArnott
Copy link
Collaborator

AArnott commented Jun 21, 2020

The workaround fundamentally requires a custom formatter as @ygoe describes, although there are various ways of implementing it.

@vpodlnk
Copy link

vpodlnk commented Jun 22, 2020

Understood. I have the last one)
Do you consider to add some kind of attribute to handle this case like [IgnoreWhenDefault]?

@AArnott
Copy link
Collaborator

AArnott commented Jun 22, 2020

I don't think so. Most attributes including DataMemberAttribute which we already support have a property on them with this semantic, such as DataMemberAttribute.EmitDefaultValue. The other attribute we support natively is KeyAttribute, which we define and can add a similar property to.

@vpodlnk
Copy link

vpodlnk commented Jun 23, 2020

Got it, thanks for quick answers and support, @AArnott !

@github-actions
Copy link

This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 5 days.

@ygoe
Copy link
Author

ygoe commented Sep 22, 2020

@github-actions I'm still here, if that's what you wanted to know.

pCYSl5EDgo added a commit to pCYSl5EDgo/MessagePack-CSharp that referenced this issue Nov 20, 2020
Add bool IgnoreSerializationWhenNull property to KeyAttribute.
DataMemberAttribute.EmitDefaultValue is also used.
@github-actions
Copy link

This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 5 days.

@Phoenix359
Copy link

I am also very interested in this feature (add EmitDefaultValue to KeyAttribute) and would like to go even 1 step further.
In the case of strings and collections i would also like it to NOT emit the property if their length/count properties are 0.
I expect this would reduce binary size and decode time (maybe encode time as well) and we would only transmit actual data over the wire and let the deserialization side decide the default state of "empty" properties of a model.

@Phoenix359
Copy link

To clarify, i would like the interface to be something like:

[MessagePack.MessagePackObject()]
public class SomeClass
{
    ....

    [MessagePack.Key(Name = "aName", EmitDefaultValue: false, EmitEmptyValue: false)]
    public string someStringProperty{ get; set; }
    
    [MessagePack.Key(Name = "anotherName", EmitDefaultValue: false, EmitEmptyValue: false)]
    public List<string> someListProperty{ get; set; }

    ....
}

AND if at all possible

[MessagePack.MessagePackObject(keyAsPropertyName: true, EmitDefaultValues: false, EmitEmptyValues: false)]
public class SomeClass
{
    ....
}

@github-actions
Copy link

This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 5 days.

@ejball
Copy link

ejball commented Dec 13, 2021

This would be great, even if we just start with ignoring null when writing, akin to JsonIgnoreCondition.WhenWritingNull, which can be set on JsonIgnoreAttribute or on JsonSerializerOptions. (Despite the documentation, it ignores both null reference types and null Nullable<T> value types.)

I looked into writing a PR, but my IL skills are minimal, and I can see how this isn't trivial, since the serializer can't write the number of properties until it knows how many aren't null.

@AArnott
Copy link
Collaborator

AArnott commented Dec 24, 2021

since the serializer can't write the number of properties until it knows how many aren't null.

Ooh, that's a very good point. And we probably wouldn't want to pull on a property more than once for fear of getting a different result of incurring a perf cost, so the formatter would probably need to store every value in a local variable first, then check that for null and then (conditionally) serialize the value from the local variable.

@ygoe
Copy link
Author

ygoe commented Dec 24, 2021

Sounds plausible, yet I can very much accept that performance drawback for the small objects I have. See, my workaround already does that, it collects everything in a dictionary and then writes that out. Well, maybe it's simpler to optimise my "workaround" with type caching and declare that the "solution". But type caching isn't enough, you'd also need to generate the IL to access the property values. MessagePack probably does that internally. Maybe a compiled expression comes close, that would need some benchmarking. So, since the MessagePack library already has the efficient implementation, it could still be faster with value collecting or double-getting of properties compared to my workaround. I'd suggest to allow double-getting as an option at least. It will work perfectly and very efficiently for simple objects that won't return dynamic values.

@AArnott
Copy link
Collaborator

AArnott commented Dec 25, 2021

The IL will have to be substantially changed either way. I honestly don't think the avoidance of double-getting will be a significant complicating factor.

@alex-vorobyev
Copy link

So, is it planned to be able to ignore nulls when serializing to a dictionary (using property names)?
Similar to how it was done in your Utf8Json serializer (StandardResolver.ExcludeNull).

@Difficulty-in-naming
Copy link

Difficulty-in-naming commented Jun 28, 2022

I think we can implement a temporary solution via MPC generated code for those who need this functionality

@alex-vorobyev
Copy link

I think we can implement a temporary solution via MPC generated code for those who need this functionality

Is there an example of such a solution? Thanks.

@ReferenceType
Copy link

Any update on this? Changing my serializer to MessagePack from Protobuf-net quadruples my bandwith because of this issue.

@ugurpelister
Copy link

Unfortunately this issue has been open since 2019 and there are no signs that it will be implemented soon 😔

@PatchouliTC
Copy link

PatchouliTC commented Apr 27, 2023

@ygoe I achieved a close result in a different way - compile-time checking of the relevant library code via Fody and weaving in the relevant attributes

you can check this PolymorphicMessagePack

I fork by Phylum123-PolymorphicMessagePack and completed it

in that repo,you can see how to use fody to analysis your code when complie, and add attributes like [Key(1)] /check conflict key id /auto mark and use with PolymorphicMessagePack to get another union type support

For auto ignore default value prop,I have prepare method in my repo,search

bool HasDefaultValueSet = PropHasDefaultValue(typeNode.node, property);

that will help you know is a property set default value then follow other code,you can descide to add [IgnoreMember] attribute to it or do other things

I also add some test for test&example in that repo,you can use ILSpy to see what will happen

@ts-indikaf
Copy link

@AArnott I'm dealing with large data sets and using MessagePack serializer for better performance. However, I found it's spitting fields with null values into JSON, which is making the payload larger than expected. Just wondering if this feature in the pipeline of development?

@AArnott
Copy link
Collaborator

AArnott commented Sep 14, 2023

I'm doing very little with MessagePack lately. So while this is a popular requested feature and would make sense to deliver, as I personally don't need it, and am quite busy with other OSS projects in my nights+weekends time, it hasn't bubbled up for me.

Sponsorship may be an effective way to lead a contributor of this repo to prioritize the feature. As a free and open source library, all or the majority of its development is done without any monetary benefit to its authors, and all of it is made available free of charge. But sponsorship is very much appreciated and, with agreement from a particular contributor and/or the owning team, can be a great way to support the project and help get your favorite features or bugs addressed in an upcoming version.
If sponsorship to drive completion of a feature or bug fix is an interesting option to you, please let us know. You can reach out to me privately on keybase.io or gitter.im as well.

@ygoe
Copy link
Author

ygoe commented Sep 19, 2023

@PatchouliTC I've been notified for your mention edit. I've already seen your comment after reading the recent updates here. But I'm sorry, from my own experience creating Fody modules, I have to consider entire Fody as technical debt and avoid it for any new development. Maybe C# source code generators are a better approach for this.

@PatchouliTC
Copy link

@PatchouliTC I've been notified for your mention edit. I've already seen your comment after reading the recent updates here. But I'm sorry, from my own experience creating Fody modules, I have to consider entire Fody as technical debt and avoid it for any new development. Maybe C# source code generators are a better approach for this.

However, among the currently available tools, it is possible to analyze and modify the IL content of the code that has been written or Fody;

For C# source generator, which is designed to automatically generate new blocks of code rather than modify the content of the code already written [more like the previous T4 template], you cannot dynamically add attributes through this tool, at most it generates partial classes/new properties/methods, etc.

Another option is to implement a Roslyn analyzer [such as mseeagepackanalyzer], which allows parsing at code writing time to point out possible errors and recommend fixes, so you can define [properties that have default values and classes that have messagepackobject but do not mark ignoreNumber] acts as a warning or error, and then provides a quick way to mark it[add ignoreNumber], but this also needs to be similar to letting developers manually mark ignored objects

So if messagepack lib not support this function,may be use Fody to auto weave [ignoreNumber] is the most quick and easy way

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests