Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support to ignore field separator #968

Conversation

fscarponi
Copy link

At the moment, there is no way to ignore field separator. This pr is adding a new put method to do so.

This behavior is primary to the PackageSearch team, allowing custom documents to be directly associated with the key.

…iated with the key

Updated the NitriteDocument class by extracting the field and value validation into a separate method for reusability. Added a new put method in Document interface and implemented it in NitriteDocument which includes an option to ignore the field separator, allowing values to be directly associated with the key.
@fscarponi fscarponi changed the title Add support to ignoreFieldSeparator Add support to ignore field separator May 15, 2024
}

@Override
public Document put(String key, Object value, boolean ignoreFieldSeparator) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How are you planning to get the value if you are putting values ignoring fieldSeparator? Why can't you use a different field separator than the default one?

Copy link

@lamba92 lamba92 May 16, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Package Search is indexing a lot of data from repositories, including Maven Central. Nitrite is used as local caches for our plugin. As such, the strings in the data might contains any character, hence triggering the deepPut unwillingly.

Changing the shape of the document will mess up the deserialization using kotlinx.serialization. As such we need a way to serialize without changing the shape of the document.

On a side note, I would guess that this fieldSeparator "optimization" has been designed for a better way to store data on disk. If so, I would instead recommend to have NitriteDatabase to handle the transformation from a "non-nested document" to a "nested" one right before it gets stored and vice versa when reading from storage. This will ensure the field separator mechanism to be transparent from Document initialization and will allow the NitriteConfig.fieldSeparator to not be static (beinig static looks to me kinda of a hack to let any document access it regardless of the configuration of the DB it will stored in).

I would also stress this fieldSeparator transformation in the docs since it might change the shape of the queries one might do.

Alternatively, you might either:

  1. allow to disable such mechanism altogether just like for Repositories initialization logic not compatible with Kotlinx.serialization #966
  2. create a builder for Document that accepts as parameter the fieldSeparator that by default . and can be nullable to allow no field separator.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am failing to understand the issue you are facing here. If for your data, deepPut is getting invoked unintentionally, that means, you have "." character in your field, which should not be the case as naming a field is under your complete control only. If the value contains any arbitrary string, it will not trigger deepPut. In other words, if your data can be converted to valid json, you should not be facing any issue in nitrite document also.

To better understand your issue, could you please provide me some example data, your kotlin entity class and the document it is generating. May be I'll be able to point out a better solution for you.

ThefieldSeparator is not designed for better storage. It is there to query nested documents in a collection, similar to mongo. Using it to put or get nested document is a convenience, nothing more.

The reason for NitriteConfig.fieldSeparator being a static field is to make it a global configuration application wide. So that all collections, indexes have uniform convention to access data from a document. It minimizes the configuration burden and unintentional bugs in the code, if user choose to customize the fieldSeparator.

Copy link

@lamba92 lamba92 May 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have a look at this test, the commit that introduces it is here. As I mentioned before, it makes sense that at a storage level the document is normalized using the deep put. I see the point. But it is incorrect that the act itself of adding something inside the document modifies it.

Manipulating the document should be a DBMS task, done right before storing or after retrieving it from disk.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @lamba92, I was looking into the test cases of yours and got the problem you are facing. You can take a look (if you have not already) at EntityConverter here.

May be, you don't need the kotlinx.serialization here at all. EntityConverter would be much simpler and efficient solution for you. It would be faster as there is no reflection involved.

Here is the code to use it, based on your test criteria.

// test data class
data class DeepPutTestData(
    val name: String?,
    val userData: Map<String, String>?
) {
    companion object {
        fun create() = DeepPutTestData(
            name = "aString",
            userData = mapOf("the.key.to.split" to "aValue")
        )
    }
}

// converter class
class DeepPutTestConverter : EntityConverter<DeepPutTestData> {
    override fun getEntityType(): Class<DeepPutTestData> {
        return DeepPutTestData::class.java
    }

    override fun toDocument(entity: DeepPutTestData, nitriteMapper: NitriteMapper): Document {
        return Document.createDocument("name", entity.name)
            .put("userData", entity.userData)
    }

    @Suppress("UNCHECKED_CAST")
    override fun fromDocument(document: Document, nitriteMapper: NitriteMapper): DeepPutTestData {
        val deepPutTest = DeepPutTestData(
            name = document.get("name") as String?,
            userData = document.get("userData") as Map<String, String>?
        )
        return deepPutTest
    }
}

// test method
@Test
fun testDeepPut() {
    val db = nitrite {
        registerEntityConverter(DeepPutTestConverter()) // register your converter here
    }
    val repository = db.getRepository<DeepPutTestData>()
    val deepPutTestData = DeepPutTestData.create()
    repository.insert(deepPutTestData)
    val found = repository.find().firstOrNull()
    assertEquals(deepPutTestData, found)
}

The above code will not trigger deepPut() in the document inside your repository.

Copy link
Contributor

@anidotnet anidotnet May 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lamba92, Just following up. Did the EntityConverter solution solve your problem?

@anidotnet
Copy link
Contributor

@fscarponi could you please clarify my concern above before I work on this PR?

@anidotnet
Copy link
Contributor

Closing this PR for lack of activity. Feel free to reopen for further changes.

@anidotnet anidotnet closed this May 30, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants