Compare DokkaSourceSetImpl by DokkaSourceSetID only #3568

whyoleg · 2024-04-15T16:39:17Z

probably fixes #3246

Note: PR is not ready to be merged, documentation is not added.

There are several questions which should be resolved first:

After implementation of pre-generation check I found out, that those checks are not run by default in TESTS - in this PR, I've enabled them to run by default (one test started to fail: JavadocDocumentableJVMSourceSetFilterTest because of js sourceSet present, which is not supported by javadoc plugin) - does anyone have anything against enabling them, as I do expect them to run?
It's not possible on current moment to create a pre-generation check for sourceSet IDs uniqueness during multi-module generation, as we don't have information about sourceSets there, only the result of Dokka execution - Do you think it's fine? I'm a little worried about this.
when comparing execution time of Dokka running with and without this commit - I haven't seen a big difference (for serialisation and coroutines) - better to recheck
We do also implement DokkaSourceSet in CLI and in tests separately - I don't think that we need to change something there, as now they are compared by identity and everything works find - WDYT?
Just out of curiosity I've replaced comparing by sourceSet IDs with comparing by identity - this caused failure of several tests (screenshot attached) - in those tests we do compare or DokkaConfiguration or DModule in assertEquals. So if we will change equals of DokkaSourceSet to use ID only, such tests will just silently stop working - we could fix those, but what about future tests and overall complexity of understanding this behaviour? WDYT?

* Add checker that all sourceSet ids are unique during generation * `verificationStage` should always run in tests * `verificationStage` should run as a first step as in `SingleModuleGeneration::generate` in tests

IgnatBeresnev · 2024-04-16T10:04:01Z

After implementation of pre-generation check I found out, that those checks are not run by default

Note to other reviewers: this is about unit tests only. I got scared for a second when reading this sentence 😅 I think it's correct to enable it, yeah

It's not possible on current moment to create a pre-generation check for sourceSet IDs uniqueness during multi-module generation, as we don't have information about sourceSets there, only the result of Dokka execution - Do you think it's fine? I'm a little worried about this.

Why are you worried about this, what problems could it lead to?

To me that's expected: source sets are only visible when you open a specific module, and each module is processed separately. The plugins are also written as if for a single module, so hashCode / equals should as expected

when comparing execution time of Dokka running with and without this commit - I haven't seen a big difference (for serialisation and coroutines) - better to recheck

We don't use Documentable / DokkaSourceSet in maps/sets/etc much, so it's also expected that we wouldn't see a noticeable difference. Google's plugin does use it heavily from what I understand, and the folks have done some profiling and identified DokkaSourceSet#equals as one of the possible performance improvements (the other one being the inclusion of sources, such as descriptors/psi, in hashCode/equals of most documentables)

We do also implement DokkaSourceSet in CLI and in tests separately - I don't think that we need to change something there, as now they are compared by identity and everything works find - WDYT?

If the spec (i.e documentation) states that the implementations should compare it by DokkaSourceSetID, I think the behavior should be aligned. I doubt that comparing by identity here would lead to bugs, but I'd change it just for consistency and to cause less questions later on

Just out of curiosity I've replaced comparing by sourceSet IDs with comparing by identity - this caused failure of several tests (screenshot attached) - in those tests we do compare or DokkaConfiguration or DModule in assertEquals. So if we will change equals of DokkaSourceSet to use ID only, such tests will just silently stop working - we could fix those, but what about future tests and overall complexity of understanding this behaviour? WDYT?

Good thought to check it!

I wouldn't say they will "stop working" - they will be working correctly from the specification's (documentation's) point of view, but I see your concern - they might deserialize some fields of a source set incorrectly and it will pass the equals check

I would actually expect specialized serialization/deserialization tests to be checking every field one by one, not just compare the two topmost objects by equals, because it is fragile otherwise. These tests basically already rely on the implementation and it being a data class that compares everything by default, which is not guaranteed at all (equals doesn't have to and most of the time doesn't compare everything - for instance, collections/arrays are frequent to be omitted), and I think the tests should be fixed regardless of this change (now or later)

As to how to fix the tests - that's a good question. Do you have any thoughts here?

I only have two ideas atm:

add more asserts that check individual fields one by one - more code to write, but better for the tests in case they fail
take inspiration from Objects.deepEquals and add a fun deepEquals(other: DokkaSourceSet) to either DokkaSourceSetImpl or even DokkaSourceSet - this would compare literally everything, same as before, and we can adapt the tests to use it instead of the regular equals; the documentation for it would state that it should compare all underlying fields, but it will be relatively slow

whyoleg · 2024-04-16T12:08:12Z

Note to other reviewers: this is about unit tests only.

Oops, my bad, fixed the PR description.

Why are you worried about this, what problems could it lead to?

I believe there should be no problems here for now. May be I'm a bit too worried about such change, and so not feeling safe without all possible checks.
But mostly it's for reference to the original issue #3246 where it mentions (don't forget about AllModulesPagePlugin) :)

We don't use Documentable / DokkaSourceSet in maps/sets/etc much

I would disagree, as we do have SourceSetDependent (which is a map with DokkaSourceSet as a key) used 45 times just in org/jetbrains/dokka/model/Documentable.kt (4/5 shared properties per documentable entity). And we use pattern documentable.property[sourceSet] quite frequently in our code. So I would expect some difference, as there is no need to perform a lot of computations there. May be it's reproduced on bigger projects or slower hardware.
I'm fine with letting Google test the performance difference afterwards on their setup.

(the other one being the inclusion of sources, such as descriptors/psi, in hashCode/equals of most documentables)

Just in case, for reference, the issue is here: #2620

I doubt that comparing by identity here would lead to bugs, but I'd change it just for consistency and to cause less questions later on

yeah, exactly my thoughts, just wanted to recheck

As to how to fix the tests - that's a good question. Do you have any thoughts here?

As DokkaSourceSet is just an interface and I can't think of any case when it could be needed to compare it's content except in our tests, I would say that we could just introduce assertSourceSetEquals (and may be the same for other entities needed) in core-test-api/core textFixtures and use it in places where those tests failed.
I will add documentation for DokkaSourceSet, that it's hashCode/equals relies on ID only and this should be enough for understanding, that equals should not be used for structural comparing.

I will have last provocative question - what about overall making DokkaSourceSet equals/hashCode use just identity? AFAIU all over the code we don't really modify it anyway? As this is the configuration part, not the documentable part (which properties are frequently changed)?

Big thanks for such detailed review! I will be able to finish the PR based on this discussion.

…on` for tests To be able to use `dokka-test-api` in Gradle plugin module, it's needed for Gradle project name to be the same as the published artifact name, because of composite builds

whyoleg · 2024-04-17T08:39:56Z

PR is ready for review:

I've decided to put asserts in dokka-test-api module
to be able to use dokka-test-api from Gradle plugin, I was needed to rename Gradle project name (doesn't affect published artefact) in the same way as we have done for dokka-core

IgnatBeresnev · 2024-04-17T10:58:26Z

Nice idea about adding DokkaConfiguration asserts to dokka-test-api!

what about overall making DokkaSourceSet equals/hashCode use just identity? AFAIU all over the code we don't really modify it anyway? As this is the configuration part, not the documentable part (which properties are frequently changed)?

If we were designing it from scratch, I don't think I would mind, because it's indeed only created during the configuration, so it shouldn't change, and I can't think of a reason for why someone might create their own implementation during the other steps (documentable, content, renderer, etc)

Right now though, given that we already have DokkaSourceSetID which is being used to identify DokkaSourceSets, I think it would add to confusion if equals started using identity (or anything else, really). I guess DokkaSourceSetID could be renamed to something like DokkaSourceSetScope to avoid that confusion

I've also just now remembered that we have dependentSourceSets: Set<DokkaSourceSetID>

dokka/dokka-subprojects/core/src/main/kotlin/org/jetbrains/dokka/configuration.kt

Line 178 in d4d457d

public val dependentSourceSets: Set<DokkaSourceSetID>

and DokkaSourceSetID is used to check the dependencies between source sets, pretty much doing DokkaSourceSet.equals(other)

dokka/dokka-subprojects/plugin-base/src/main/kotlin/org/jetbrains/dokka/base/renderers/html/HtmlRenderer.kt

Lines 43 to 44 in d4d457d

    
           .map { it.sourceSetID } 
        
           .filter { it in sourceSet.dependentSourceSets }

I think this part could also be refactored to use identity, but that comes with a whole new can of worms. But without this refactoring, I think it would be confusing and possibly error prone to use both DokkaSourceSetID and identity for comparing source sets. I think it should be one or the other

vmishenev

I would like to know if this change breaks, e.g. the Google plugin.

it's also expected that we wouldn't see a noticeable difference.

Me too. And I am not even sure about the Google plugin. In my opinion, the description of the issue is too promising)
I would use a flame graph to compare the performance.
But, anyway, it is nice to fix the comparison.

whyoleg · 2024-04-19T08:44:36Z

I would like to know if this change breaks, e.g. the Google plugin.

dokka-devsite-plugin builds successfully (with the patch from https://github.com/Kotlin/dokka/compare/sample-analysis-manual), all tests are green. No significant tests execution time difference.

Make SourceSet equals/hashCode use only it's ID

0ffc693

* Add checker that all sourceSet ids are unique during generation * `verificationStage` should always run in tests * `verificationStage` should run as a first step as in `SingleModuleGeneration::generate` in tests

whyoleg requested review from vmishenev and IgnatBeresnev April 15, 2024 16:39

whyoleg self-assigned this Apr 15, 2024

whyoleg added 3 commits April 17, 2024 10:29

Add structural assertions over DokkaSourceSet and `DokkaConfigurati…

03f3a5e

…on` for tests To be able to use `dokka-test-api` in Gradle plugin module, it's needed for Gradle project name to be the same as the published artifact name, because of composite builds

Implement equals/hashCode for other inheritors

452f444

Add docs

77c2a6b

whyoleg force-pushed the sourceset-hashcode-equals branch from 149bdee to 77c2a6b Compare April 17, 2024 08:35

whyoleg marked this pull request as ready for review April 17, 2024 08:37

IgnatBeresnev approved these changes Apr 17, 2024

View reviewed changes

vmishenev approved these changes Apr 17, 2024

View reviewed changes

whyoleg merged commit f524273 into master Apr 19, 2024
12 checks passed

whyoleg deleted the sourceset-hashcode-equals branch April 19, 2024 09:56

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Compare DokkaSourceSetImpl by DokkaSourceSetID only #3568

Compare DokkaSourceSetImpl by DokkaSourceSetID only #3568

whyoleg commented Apr 15, 2024 •

edited

Loading

IgnatBeresnev commented Apr 16, 2024 •

edited

Loading

whyoleg commented Apr 16, 2024

whyoleg commented Apr 17, 2024

IgnatBeresnev commented Apr 17, 2024 •

edited

Loading

vmishenev left a comment

whyoleg commented Apr 19, 2024

Compare DokkaSourceSetImpl by DokkaSourceSetID only #3568

Compare DokkaSourceSetImpl by DokkaSourceSetID only #3568

Conversation

whyoleg commented Apr 15, 2024 • edited Loading

IgnatBeresnev commented Apr 16, 2024 • edited Loading

whyoleg commented Apr 16, 2024

whyoleg commented Apr 17, 2024

IgnatBeresnev commented Apr 17, 2024 • edited Loading

vmishenev left a comment

Choose a reason for hiding this comment

whyoleg commented Apr 19, 2024

whyoleg commented Apr 15, 2024 •

edited

Loading

IgnatBeresnev commented Apr 16, 2024 •

edited

Loading

IgnatBeresnev commented Apr 17, 2024 •

edited

Loading