Migrate the serialization from Klaxon to kotlinx-serialization library #312

Open
devcrocod opened this issue Mar 20, 2023 · 6 comments
Labels
research This requires a deeper dive to gather a better understanding

@devcrocod
Contributor

This will help integrate DataFrame better with other libraries that require serialization, such as Ktor, ggdsl, and others.

@devcrocod devcrocod added the enhancement New feature or request label Mar 20, 2023
@Jolanrensen
Collaborator

Jolanrensen commented Mar 20, 2023

Didn't @belovrv have concerns about that?

@devcrocod
Contributor Author

@belovrv had doubts about the performance of kotlinx.serialization. We discussed this and concluded that there are no obvious problems. However, the task is quite time-consuming, even though part of the serialization/deserialization logic can be carried over from Klaxon, so we decided to postpone its implementation.

@zaleslaw
Collaborator

zaleslaw commented Apr 4, 2023

I'd like to take the serialization and IO tasks, especially those involving both libraries, Klaxon and kotlinx.serialization, so I will assign this to myself.

@zaleslaw zaleslaw self-assigned this Apr 4, 2023
@Jolanrensen Jolanrensen added this to the 0.11.0 milestone Apr 6, 2023
@zaleslaw zaleslaw changed the title from "change the serialization from Klaxon to kotlinx-serialization" to "Migrate the serialization from Klaxon to kotlinx-serialization library" Apr 17, 2023
@zaleslaw zaleslaw added performance Something related to how fast the library can handle data and removed enhancement New feature or request labels Apr 25, 2023
@zaleslaw zaleslaw modified the milestones: 0.11.0, Backlog Apr 25, 2023
@zaleslaw zaleslaw added research This requires a deeper dive to gather a better understanding and removed performance Something related to how fast the library can handle data labels Apr 25, 2023
@devcrocod
Contributor Author

devcrocod commented Feb 21, 2024

A bit more detail about serialization support.

Benefits:

  1. Performance. Based on my limited research, simple conversion to strings performs about the same as klaxon, since the logic is identical. Once classes are marked as serializable and custom serializers are written, kotlinx-serialization is expected to be faster than klaxon, although the actual serializer implementations will also affect this. Overall, performance should be the same or slightly better, so I wouldn't consider it the main reason either for switching to kotlinx-serialization or for sticking with klaxon.
  2. Code Refactoring. Support for kotlinx-serialization will help us improve the code responsible for JSON parsing. When I examined that code, I found some code quality issues. Dedicated serializers for dataframe will help here: they will increase the size of the codebase but improve its readability and quality.
  3. Type Inference. This will affect type parsing, as some of the load can be shifted to the kotlinx-serialization plugin. This, in turn, will help us partially eliminate the use of platform-dependent reflection. Also, kotlinx-serialization has better support for JsonElement. For example, it can identify NaN, unlike klaxon.
  4. Kotlin Ecosystem. kotlinx-serialization is an official library and part of the Kotlin ecosystem, which will make it easier for us to work within that ecosystem in the future, for example with Ktor. There are also guarantees that it will continue to be supported.
  5. Other Formats. While official support is limited, interestingly, there is support for Protobuf.
  6. Multiplatform Support. kotlinx-serialization is a multiplatform library.
  7. Flexibility and reliability. Writing custom serializers will allow better control over serialization and deserialization. For example, this can help avoid issues with serialization when a Map might be inside a column.
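
To make points 3 and 7 more concrete, here is a minimal sketch of a hand-written serializer. `NamedColumn` and `NamedColumnSerializer` are hypothetical names used only for illustration; they are not DataFrame types and not the proposed design. The sketch assumes the `kotlinx-serialization-json` artifact is on the classpath and passes the serializer explicitly, so the compiler plugin is not needed here.

```kotlin
import kotlinx.serialization.KSerializer
import kotlinx.serialization.builtins.ListSerializer
import kotlinx.serialization.builtins.serializer
import kotlinx.serialization.descriptors.SerialDescriptor
import kotlinx.serialization.descriptors.buildClassSerialDescriptor
import kotlinx.serialization.descriptors.element
import kotlinx.serialization.encoding.CompositeDecoder
import kotlinx.serialization.encoding.Decoder
import kotlinx.serialization.encoding.Encoder
import kotlinx.serialization.encoding.decodeStructure
import kotlinx.serialization.encoding.encodeStructure
import kotlinx.serialization.json.Json

// Hypothetical stand-in for a column of doubles; NOT a DataFrame class.
data class NamedColumn(val name: String, val values: List<Double>)

// Hand-written serializer: we decide exactly how the object maps to JSON.
object NamedColumnSerializer : KSerializer<NamedColumn> {
    private val valuesSerializer = ListSerializer(Double.serializer())

    override val descriptor: SerialDescriptor =
        buildClassSerialDescriptor("NamedColumn") {
            element<String>("name")
            element("values", valuesSerializer.descriptor)
        }

    override fun serialize(encoder: Encoder, value: NamedColumn) =
        encoder.encodeStructure(descriptor) {
            encodeStringElement(descriptor, 0, value.name)
            encodeSerializableElement(descriptor, 1, valuesSerializer, value.values)
        }

    override fun deserialize(decoder: Decoder): NamedColumn =
        decoder.decodeStructure(descriptor) {
            var name = ""
            var values = emptyList<Double>()
            // Elements may arrive in any order, so dispatch on the reported index.
            while (true) {
                when (val index = decodeElementIndex(descriptor)) {
                    0 -> name = decodeStringElement(descriptor, 0)
                    1 -> values = decodeSerializableElement(descriptor, 1, valuesSerializer)
                    CompositeDecoder.DECODE_DONE -> break
                    else -> error("Unexpected element index: $index")
                }
            }
            NamedColumn(name, values)
        }
}

fun main() {
    // The default Json configuration rejects NaN; this flag opts in to it.
    val json = Json { allowSpecialFloatingPointValues = true }
    val column = NamedColumn("x", listOf(1.5, Double.NaN))

    val text = json.encodeToString(NamedColumnSerializer, column)
    val decoded = json.decodeFromString(NamedColumnSerializer, text)
    println(text)
    println(decoded.values[1].isNaN())
}
```

The same approach could, in principle, be used to decide explicitly how a `Map` stored inside a column is written and read, rather than relying on generic tree conversion.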

I envision the migration from klaxon to kotlinx-serialization as follows:

  1. To start, we can simply get rid of klaxon by switching from klaxon's JsonElement to the JsonElement from kotlinx-serialization. Since the logic is identical, this won't be too difficult. We only need to account for the fact that kotlinx-serialization has JsonPrimitive, unlike klaxon, and that jsonObjectBuilder looks a bit different.
  2. This is a more complex part that can be broken down into several steps:
  • The JSON object and transformation related to the schema — https://kotlin.github.io/dataframe/read.html#specify-key-value-paths
    It will be necessary to analyze how to improve this part or keep the simple parsing of the json tree.
  • Separating serialization and deserialization, broken down by specific objects such as DataFrame and DataColumn.
  • Identifying the concrete objects that are used when working with JSON, since working with open interfaces is challenging.
  • Writing custom serializers for classes.
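
As a rough illustration of step 1, the sketch below walks a kotlinx-serialization `JsonElement` tree and builds an object with `buildJsonObject`. The helpers `primitiveToValue` and `elementToValue` are hypothetical names introduced for this example, and the type-probing order (boolean, then long, then double) is an assumption, not how dataframe currently infers types.

```kotlin
import kotlinx.serialization.json.Json
import kotlinx.serialization.json.JsonArray
import kotlinx.serialization.json.JsonElement
import kotlinx.serialization.json.JsonNull
import kotlinx.serialization.json.JsonObject
import kotlinx.serialization.json.JsonPrimitive
import kotlinx.serialization.json.add
import kotlinx.serialization.json.booleanOrNull
import kotlinx.serialization.json.buildJsonObject
import kotlinx.serialization.json.doubleOrNull
import kotlinx.serialization.json.longOrNull
import kotlinx.serialization.json.put
import kotlinx.serialization.json.putJsonArray

// Hypothetical helper: all scalars are JsonPrimitive, so the Kotlin type
// has to be recovered by probing the primitive's content.
fun primitiveToValue(p: JsonPrimitive): Any? = when {
    p is JsonNull -> null
    p.isString -> p.content // quoted in the source JSON
    else -> p.booleanOrNull ?: p.longOrNull ?: p.doubleOrNull ?: p.content
}

// Hypothetical helper: recursively convert a JSON tree to plain Kotlin values.
fun elementToValue(e: JsonElement): Any? = when (e) {
    is JsonPrimitive -> primitiveToValue(e)
    is JsonArray -> e.map(::elementToValue)
    is JsonObject -> e.mapValues { (_, v) -> elementToValue(v) }
}

fun main() {
    val parsed = Json.parseToJsonElement(
        """{"name":"a","n":1,"x":2.5,"ok":true,"tags":["t1",null]}"""
    )
    println(elementToValue(parsed))

    // Building a JSON object with the kotlinx-serialization DSL.
    val built = buildJsonObject {
        put("name", "a")
        putJsonArray("values") {
            add(1)
            add(2.5)
        }
    }
    println(built)
}
```

Because `JsonElement` is sealed, the `when` in `elementToValue` is exhaustive without an `else` branch, which is one place where the tree-walking code can be slightly tighter than before.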

@zaleslaw
Collaborator

@devcrocod could we say that this is what blocks us from being multiplatform?

@Jolanrensen
Collaborator

@zaleslaw Just one of many: #24 (comment)
