Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[azure-spring-data-cosmos] Turn java.math.Big{Decimal,Integer} into simple types #40239

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

Xiphoseer
Copy link

Description

This PR adds java.math.BigDecimal and java.math.BigInteger to the set of CosmosSimpleTypes, thus extending #31417 to those types. A "simple type" in the context of spring-data-commons is one that will "not [be] recursively analyzed" (c.f. AbstractMappingContext#setSimpleTypeHolder).

Using one of these types in a POJO in Java 17 currently results in a InaccessibleObjectException as described in #36831. While the workaround mentioned there to use --add-opens works, I believe that is the wrong approach. Adding that flag will prevent spring-data-commons from causing the error by using reflection to find properties, but that doesn't mean that Jackson / the ObjectMapper will treat it as a nested POJO.

Quite the opposite actually: Jackson has a special DecimalNode for BigDecimal and several configuration options on how it is serialized to and serialized from JSON. Thus, it's very much treated as an atomic type by Jackson, and by extension this library (azure-spring-data-cosmos) and should be a "simple type" in the context of the MappingContext. My understanding is that being a simple type does not prevent converters being registered for it, so this is orthogonal to #38691 and any kind of default conversion to JSON primitives for the type.

As for the question of whether this is a bug in spring-data-commons: I do believe that Spring Data Commons should default to a behavior that does not try to use reflection on java.* classes that do not have public fields. On the other hand, it is an abstraction layer over multiple different implementations and only those can reasonably opt-in to a type being both simple and supported based on the underlying technology. Not all Spring Data Projects use a JSON / NoSQL datamodel.

The default of the SimpleTypeHandler API is a set, which elements can't be excluded from without re-doing the whole set. So my conclusion would be that it's possible to commit to BigDecimal and BigInteger being simple for azure-spring-data-cosmos today, while it's not nearly as clear for upstream Spring Data Commons. Removing them from CosmosSimpleTypes after they've been added upstream is also not a breaking change.

The / one relevant bug on spring-data-commons is spring-projects/spring-data-commons#2844, but with neither accepted, it's been causing issues for us in the services we work with. In our case, the source data model we want to import into CosmosDB is based on an XML Schema using xs:decimal, so the natural java type to use as per JAXB is java.math.BigDecimal.

All SDK Contribution checklist:

  • The pull request does not introduce [breaking changes]
  • CHANGELOG is updated for new features, bug fixes or other significant changes.
  • I have read the contribution guidelines.

General Guidelines and Best Practices

  • Title of the pull request is clear and informative.
  • There are a small number of commits, each of which have an informative message. This means that previously merged commits do not appear in the history of the PR. For more information on cleaning up the commits in your PR, see this page.

Testing Guidelines

  • Pull request includes test coverage for the included changes.

@github-actions github-actions bot added azure-spring All azure-spring related issues Community Contribution Community members are working on the issue customer-reported Issues that are reported by GitHub users external to the Azure organization. labels May 17, 2024
Copy link

Thank you for your contribution @Xiphoseer! We will review the pull request and get back to you soon.

@trande4884
Copy link
Member

trande4884 commented May 21, 2024

@saragluna can you please review? Any issues that would arise from adding these types?

@trande4884
Copy link
Member

@Xiphoseer can you also add some tests that actually write/read to a database to ensure we are getting out what we put in? Thanks!

@Xiphoseer
Copy link
Author

@microsoft-github-policy-service agree company="PRODYNA"

Copy link
Member

@saragluna saragluna left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The change looks good to me, but we should run the IT to verify it will not bring any breaking change

@Xiphoseer
Copy link
Author

@Xiphoseer can you also add some tests that actually write/read to a database to ensure we are getting out what we put in? Thanks!

I tried this and would appreciate some input on it. First, as far as CONTRIBUTING goes, I tried to run IT tests against the emulator but hit rate limiting even when explicitly starting it with the flag to disable that. Secondly, the Azure Portal wizard for "Azure Cosmos DB" now has a bunch more options to pick from (Core vs MongoDB, Serverless, Rate Limiting) which can default to values incompatible with the IT suite, 1000 RU limit is one example. It would be helpful to update CONTRIBUTING for that.

I guess the tricky part is to get this right without introducing breaking changes. My current understanding is that with or without the SimpleType changes, Jackson will special case BigDecimal and store it as a number, but I'd like to double-check that again. This means that https://learn.microsoft.com/en-us/azure/cosmos-db/nosql/modeling-data#numbers-in-json applies which talks about the fact that number will be "stored as JSON". In practice that seems to mean that JSON numbers are stored as IEEE 754 binary64 in the backend, not only some clients, so there is some loss of precision involved.

In particular, BigDecimal keeps around trailing zeroes after the decimal point (e.g. 10.50) via its scale but the roundtrip through cosmos turns that into the mathematically equivalent 10.5, which doesn't compare as equals in the context of BigDecimal.

The solution to that would be to serialize the numbers as string, which is what spring-data-mongodb does via a converter, can also be achived locally using @JsonFormat(shape = Shape.STRING) or could be done in the object mapper via feature SERIALIZE_NUMBERS_AS_STRING (probably way too broad) or an AnnotationIntrospector (bit of a misnomer as per Jackson docs) for all BigDecimal types.

What I'm unsure about is whether that last point is

  • orthogonal to this PR because it's not using field reflection either way
  • possible to configure in the object mapper without breaking any users on java pre-16/17 or having added --add-opens as per the linked issues.

@saragluna
Copy link
Member

/azp run java - cosmos - tests

Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@saragluna
Copy link
Member

@Xiphoseer can you also add some tests that actually write/read to a database to ensure we are getting out what we put in? Thanks!

I tried this and would appreciate some input on it. First, as far as CONTRIBUTING goes, I tried to run IT tests against the emulator but hit rate limiting even when explicitly starting it with the flag to disable that. Secondly, the Azure Portal wizard for "Azure Cosmos DB" now has a bunch more options to pick from (Core vs MongoDB, Serverless, Rate Limiting) which can default to values incompatible with the IT suite, 1000 RU limit is one example. It would be helpful to update CONTRIBUTING for that.

I guess the tricky part is to get this right without introducing breaking changes. My current understanding is that with or without the SimpleType changes, Jackson will special case BigDecimal and store it as a number, but I'd like to double-check that again. This means that https://learn.microsoft.com/en-us/azure/cosmos-db/nosql/modeling-data#numbers-in-json applies which talks about the fact that number will be "stored as JSON". In practice that seems to mean that JSON numbers are stored as IEEE 754 binary64 in the backend, not only some clients, so there is some loss of precision involved.

In particular, BigDecimal keeps around trailing zeroes after the decimal point (e.g. 10.50) via its scale but the roundtrip through cosmos turns that into the mathematically equivalent 10.5, which doesn't compare as equals in the context of BigDecimal.

The solution to that would be to serialize the numbers as string, which is what spring-data-mongodb does via a converter, can also be achived locally using @JsonFormat(shape = Shape.STRING) or could be done in the object mapper via feature SERIALIZE_NUMBERS_AS_STRING (probably way too broad) or an AnnotationIntrospector (bit of a misnomer as per Jackson docs) for all BigDecimal types.

What I'm unsure about is whether that last point is

  • orthogonal to this PR because it's not using field reflection either way
  • possible to configure in the object mapper without breaking any users on java pre-16/17 or having added --add-opens as per the linked issues.

I've triggered the IT pipeline. This is also my concern, since if a user already store their BigDecimal in the Cosmos DB using another version client, can the number be read using this version.

@Xiphoseer
Copy link
Author

My current understanding is that with or without the SimpleType changes, Jackson will special case BigDecimal and store it as a number, but I'd like to double-check that again.

Ran my test case in MappingCosmosConverterUnitTest without the simple type changes, which, as expected, required adding --add-opens java.base/java.math=spring.core to <javaModulesSurefireArgLine> in pom.xml and it still passes.

I have to say I'm a bit surprised by the roundtrip to string and back in

final String valueAsString = objectMapper.writeValueAsString(sourceEntity);
cosmosObjectNode = (ObjectNode) objectMapper.readTree(valueAsString);

because that's what actually looses the trailing zeroes before even reaching the backend.

It seems to have been introduced by 61b37c9, which mentions no reason to choose that method over ObjectMapper#valueToTree, but even that one does a full round-trip to JSON primitives and does not have the DecimalNode precision logic enabled.

@Xiphoseer
Copy link
Author

Xiphoseer commented May 22, 2024

I've triggered the IT pipeline.

Thanks!

This is also my concern, since if a user already store their BigDecimal in the Cosmos DB using another version client, can the number be read using this version.

After the testing mentioned above, I'm fairly confident that the SimpleType changes will indeed not affect the actual serialization / deserialization.

Edit: quite the opposite actually. The existing implementation already looses scale/precison without user intervention and this PR neither solves nor worsens that.

@kushagraThapar
Copy link
Member

Thanks @saragluna for the PR review, appreciate your help!
@trande4884 can you please run the spring integration tests on this pipeline, thanks!

@trande4884
Copy link
Member

/azp run java - spring - tests

Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@trande4884
Copy link
Member

@Xiphoseer since you are having issues with IT tests if you give me access I can push some tests to this PR

@Xiphoseer
Copy link
Author

@Xiphoseer since you are having issues with IT tests if you give me access I can push some tests to this PR

The "allow edits by maintainers" is on for this PR, but presumably that's not sufficient for some reason and you need write access to the fork?

@trande4884
Copy link
Member

@Xiphoseer I just pushed the integration tests, let me re-run the CI's

@trande4884
Copy link
Member

/azp run java - spring - tests

Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@trande4884
Copy link
Member

/azp run java - cosmos - tests

Copy link

Azure Pipelines successfully started running 1 pipeline(s).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
azure-spring All azure-spring related issues Community Contribution Community members are working on the issue customer-reported Issues that are reported by GitHub users external to the Azure organization.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants