Add support to ignore field separator #968

fscarponi · 2024-05-15T16:13:08Z

At the moment, there is no way to ignore field separator. This pr is adding a new put method to do so.

This behavior is primary to the PackageSearch team, allowing custom documents to be directly associated with the key.

…iated with the key Updated the NitriteDocument class by extracting the field and value validation into a separate method for reusability. Added a new put method in Document interface and implemented it in NitriteDocument which includes an option to ignore the field separator, allowing values to be directly associated with the key.

anidotnet · 2024-05-15T16:17:50Z

nitrite/src/main/java/org/dizitart/no2/collection/NitriteDocument.java

+    }
+
+    @Override
+    public Document put(String key, Object value, boolean ignoreFieldSeparator) {


How are you planning to get the value if you are putting values ignoring fieldSeparator? Why can't you use a different field separator than the default one?

Package Search is indexing a lot of data from repositories, including Maven Central. Nitrite is used as local caches for our plugin. As such, the strings in the data might contains any character, hence triggering the deepPut unwillingly.

Changing the shape of the document will mess up the deserialization using kotlinx.serialization. As such we need a way to serialize without changing the shape of the document.

On a side note, I would guess that this fieldSeparator "optimization" has been designed for a better way to store data on disk. If so, I would instead recommend to have NitriteDatabase to handle the transformation from a "non-nested document" to a "nested" one right before it gets stored and vice versa when reading from storage. This will ensure the field separator mechanism to be transparent from Document initialization and will allow the NitriteConfig.fieldSeparator to not be static (beinig static looks to me kinda of a hack to let any document access it regardless of the configuration of the DB it will stored in).

I would also stress this fieldSeparator transformation in the docs since it might change the shape of the queries one might do.

Alternatively, you might either:

allow to disable such mechanism altogether just like for Repositories initialization logic not compatible with Kotlinx.serialization #966

create a builder for Document that accepts as parameter the fieldSeparator that by default . and can be nullable to allow no field separator.

I am failing to understand the issue you are facing here. If for your data, deepPut is getting invoked unintentionally, that means, you have "." character in your field, which should not be the case as naming a field is under your complete control only. If the value contains any arbitrary string, it will not trigger deepPut. In other words, if your data can be converted to valid json, you should not be facing any issue in nitrite document also.

To better understand your issue, could you please provide me some example data, your kotlin entity class and the document it is generating. May be I'll be able to point out a better solution for you.

ThefieldSeparator is not designed for better storage. It is there to query nested documents in a collection, similar to mongo. Using it to put or get nested document is a convenience, nothing more.

The reason for NitriteConfig.fieldSeparator being a static field is to make it a global configuration application wide. So that all collections, indexes have uniform convention to access data from a document. It minimizes the configuration burden and unintentional bugs in the code, if user choose to customize the fieldSeparator.

Have a look at this test, the commit that introduces it is here. As I mentioned before, it makes sense that at a storage level the document is normalized using the deep put. I see the point. But it is incorrect that the act itself of adding something inside the document modifies it.

Manipulating the document should be a DBMS task, done right before storing or after retrieving it from disk.

Hi @lamba92, I was looking into the test cases of yours and got the problem you are facing. You can take a look (if you have not already) at EntityConverter here.

May be, you don't need the kotlinx.serialization here at all. EntityConverter would be much simpler and efficient solution for you. It would be faster as there is no reflection involved.

Here is the code to use it, based on your test criteria.

// test data class data class DeepPutTestData( val name: String?, val userData: Map<String, String>? ) { companion object { fun create() = DeepPutTestData( name = "aString", userData = mapOf("the.key.to.split" to "aValue") ) } } // converter class class DeepPutTestConverter : EntityConverter<DeepPutTestData> { override fun getEntityType(): Class<DeepPutTestData> { return DeepPutTestData::class.java } override fun toDocument(entity: DeepPutTestData, nitriteMapper: NitriteMapper): Document { return Document.createDocument("name", entity.name) .put("userData", entity.userData) } @Suppress("UNCHECKED_CAST") override fun fromDocument(document: Document, nitriteMapper: NitriteMapper): DeepPutTestData { val deepPutTest = DeepPutTestData( name = document.get("name") as String?, userData = document.get("userData") as Map<String, String>? ) return deepPutTest } } // test method @Test fun testDeepPut() { val db = nitrite { registerEntityConverter(DeepPutTestConverter()) // register your converter here } val repository = db.getRepository<DeepPutTestData>() val deepPutTestData = DeepPutTestData.create() repository.insert(deepPutTestData) val found = repository.find().firstOrNull() assertEquals(deepPutTestData, found) }

The above code will not trigger deepPut() in the document inside your repository.

@lamba92, Just following up. Did the EntityConverter solution solve your problem?

anidotnet · 2024-05-16T04:05:21Z

@fscarponi could you please clarify my concern above before I work on this PR?

anidotnet · 2024-05-30T16:58:58Z

Closing this PR for lack of activity. Feel free to reopen for further changes.

fscarponi changed the title ~~Add support to ignoreFieldSeparator~~ Add support to ignore field separator May 15, 2024

anidotnet reviewed May 15, 2024

View reviewed changes

anidotnet mentioned this pull request May 16, 2024

Revamp encoding-decoding with KotlinXSerializationMapper #969

Closed

anidotnet closed this May 30, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add support to ignore field separator #968

Add support to ignore field separator #968

fscarponi commented May 15, 2024

anidotnet May 15, 2024

lamba92 May 16, 2024 •

edited

anidotnet May 16, 2024

lamba92 May 17, 2024 •

edited

anidotnet May 17, 2024

anidotnet May 23, 2024 •

edited

anidotnet commented May 16, 2024

anidotnet commented May 30, 2024

Add support to ignore field separator #968

Add support to ignore field separator #968

Conversation

fscarponi commented May 15, 2024

anidotnet May 15, 2024

Choose a reason for hiding this comment

lamba92 May 16, 2024 • edited

Choose a reason for hiding this comment

anidotnet May 16, 2024

Choose a reason for hiding this comment

lamba92 May 17, 2024 • edited

Choose a reason for hiding this comment

anidotnet May 17, 2024

Choose a reason for hiding this comment

anidotnet May 23, 2024 • edited

Choose a reason for hiding this comment

anidotnet commented May 16, 2024

anidotnet commented May 30, 2024

lamba92 May 16, 2024 •

edited

lamba92 May 17, 2024 •

edited

anidotnet May 23, 2024 •

edited