Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update CONTRIBUTING.md with the new build guide #2600

Merged
merged 11 commits into from Jan 7, 2022

Conversation

lukaszkalnik
Copy link
Contributor

The project structure has changed and the old build guide didn't work anymore so I updated the CONTRIBUTING.md guide.
I also provided a link there for the Gradle subproject task names syntax, as it may be not obvious to all.

Copy link
Member

@nomisRev nomisRev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @lukaszkalnik 🙏

cc\ @JavierSegoviaCordoba and @raulraja are updating the Gradle config.
I think this update is correct, and will stay correct moving forward?

@lukaszkalnik
Copy link
Contributor Author

I added one more commit with information how to actually find out the names of the subprojects. I also improved the layout a little bit.
BTW this is all based on my current struggles as a beginner contributor, so maybe it might come in handy for others as well ;)

@JavierSegoviaCordoba
Copy link
Member

Looks correct, except OP is missing information will come like spotless and suggest running spotlessApply before pushing.

I suggest keep this PR open a bit until we are sure if the OP needs to add only spotless info or something so.

Thank you 😀

@lukaszkalnik
Copy link
Contributor Author

@JavierSegoviaCordoba my PR is only about building the project locally, not pushing...

@JavierSegoviaCordoba
Copy link
Member

@LukasKalnik if the project is not correctly formatted before pushing, PR will fail and the contributor will have to run spotlessApply and push again with the code correctly formatted.

@lukaszkalnik
Copy link
Contributor Author

Ok, thank you for the explanation. Then please let me know if I should add anything more to this PR.

@nomisRev
Copy link
Member

@lukaszkalnik @JavierSegoviaCordoba perhaps we can add a new step for that? (I think we should really think about automating that so that any contributor is not bothered with it 😛)

Something like the following I think includes all information to correctly open a PR.

### Pushing your contribution to Arrow

The easiest way to contribute to Arrow is to create a branch from a fork, and then create a PR from Github from your PR.
Arrow is a large project that uses several tools to verify its quality, and we don't break downstream projects that rely on Arrow across versions. For this reason, we use [KtFmt](https://github.com/facebookincubator/ktfmt) and [Binary Compatibility Validator](https://github.com/Kotlin/binary-compatibility-validator) and they need to run before you commit and push your code to Github.

git clone git@github.com:username/arrow.git
git checkout -b "my-branch"
...
./gradlew spotlessApply # Format code
./gradlew apiDump # Generate api files
git add .
git commit -m "My message"

If you've included those changes for Binary Compatibility and formatted the code correctly it's time to open your PR and get your contribution into Arrow. Thanks ahead of time for your effort and contributions 🙏 

@lukaszkalnik
Copy link
Contributor Author

lukaszkalnik commented Dec 20, 2021

@nomisRev I see there is a whole section further below:
How to propose an improvement -> How to create a pull request
Probably it would make sense to add your changes there.

I suppose that the build commands in the How to generate and validate the documentation are outdated as well.

@lukaszkalnik
Copy link
Contributor Author

When trying to fix the "How to generate and validate documentation" section I've run into following issues:

./gradlew dokka

Task 'dokka' is ambiguous in root project 'arrow'. Candidates are: 'dokkaGfm', 'dokkaGfmCollector', 'dokkaGfmMultiModule', 'dokkaGfmMultimodule', 'dokkaGfmPartial', 'dokkaHtml', 'dokkaHtmlCollector', 'dokkaHtmlMultiModule', 'dokkaHtmlMultimodule', 'dokkaHtmlPartial', 'dokkaJavadoc', 'dokkaJavadocCollector', 'dokkaJavadocPartial', 'dokkaJekyll', 'dokkaJekyllCollector', 'dokkaJekyllMultiModule', 'dokkaJekyllMultimodule', 'dokkaJekyllPartial'.

Also there is no :arrow-site:runAnk task like the documentation suggests.

@lukaszkalnik
Copy link
Contributor Author

I removed the section about running Ank manually for now (as it is outcommented in pull_request.yml as well).

@nomisRev
Copy link
Member

We're in the process of updating our release process, and also how we build documentation since so much has changed from the legacy Dokka to the latest Dokka. Similarly the release process has compleltly changed from JVM only, to MPP.

Sorry for these inconveniences, and thanks for helping to make these docs more accurately.

So to reply to your questions:

Probably it would make sense to add your changes there.

Yes, you're 100% correct.

I suppose that the build commands in the How to generate and validate the documentation are outdated as well.

Yes, Ank has been replaced by the 1st party tool KotlinX Knit and it works as a kind of pre-processor to Dokka, but that is handled inside Gradle.

Publishing, and generating docs is broken atm though. So not sure how that process will still change.
It's failing with "pre-validation step in Dokka on the latest 1.6.0 version".

@lukaszkalnik
Copy link
Contributor Author

Thanks for your support. I just feel a little underqualified but with your support I will try to make the requested changes.

@lukaszkalnik
Copy link
Contributor Author

...
./gradlew spotlessApply # Format code
./gradlew apiDump # Generate api files
git add .
git commit -m "My message"

Shouldn't apiDump be only executed when adding new public APIs? binary-compatibility-validator's docs say:

The plugin provides two tasks:
apiDump — builds the project and dumps its public API in project api subfolder. API is dumped in a human-readable format. If API dump already exists, it will be overwritten.
apiCheck — builds the project and checks that project's public API is the same as golden value in project api subfolder. This task is automatically inserted into check pipeline, so both build and check tasks will start checking public API upon their execution.

So it looks like there are two cases:

  1. Committer modifies an existing feature: apiCheck task will be run as part of build to ensure he didn't break the API binary compatibility. So apiDump should not be run as it would overwrite the golden value.
  2. Committer adds a new API: then he has to run apiDump as there are no entries for his new API yet so they have to be generated for future checks.

Do I understand this correctly?

@lukaszkalnik
Copy link
Contributor Author

I also see that Spotless is not yet included in the project (hence there is no spotlessApply task). So probably it would be a good idea to wait with including it into the documentation, otherwise it might be confusing.

@lukaszkalnik
Copy link
Contributor Author

lukaszkalnik commented Dec 20, 2021

On the other hand, as the build/release process is apparently still being changed, maybe it would make sense to just merge this PR (if you think it's correct) and add all other changes in a subsequent PR when you've finalized the migration of the build process?

@nomisRev
Copy link
Member

nomisRev commented Dec 20, 2021

Hey @lukaszkalnik,

No need to feel underqualified 😄 It's a big project, and the contribution guide is out of date due to the latest changes, so this contribution is great! The Arrow team is always happy to help anyone willing to contribution 👍

Your understanding of binary-compatibility-validator is correct, and you only need to call apiDump if you add a new API but it will result in a no-op if you haven't changed the API so calling it redundantly can do no harm. Although specifying it in the guide would be very helpful!

Spotless works similarly, spotlessApply should be called locally and spotlessCheck verifies that the code is correctly formatted.

Small note:
I've been wanting to automate the task of apiCheck on CI, and I created a ticket for it. The goal is to make the binary API reviewable, by checking in these files we can keep track of the binary in git. #2521
If this is done, then you never have to locally deal with binary-compatibility-validator.

@lukaszkalnik
Copy link
Contributor Author

No need to feel underqualified 😄 It's a big project, and the contribution guide is out of date due to the latest changes, so this contribution is great! The Arrow team is always happy to help anyone willing to contribution 👍

Thanks! I'm happy to contribute, as this is a really interesting project.

Your understanding of binary-compatibility-validator is correct, and you only need to call apiDump if you add a new API but it will result in a no-op if you haven't changed the API so calling it redundantly can do no harm. Although specifying it in the guide would be very helpful!

I was thinking about the following scenario:

  1. The contributor changes some existing API in a binary-incompatible way
  2. The contributor executes apiDump, overwriting the golden standard files
  3. How does apiCheck now detect that there is a breaking binary compatibility change? Does it compare to the "old" golden standard files on main?

@lukaszkalnik
Copy link
Contributor Author

Ok, the workflow guide confirms my suspicion:

Regular workflow

  • When doing code changes that do not imply any changes in public API, no additional actions should be performed. check task on your CI will validate everything.
  • When doing code changes that imply changes in public API, whether it is a new API or adjustments in existing one, check task will start to fail. apiDump should be executed manually, the resulting diff in .api file should be verified: only signatures you expected to change should be changed.
  • Commit the resulting .api diff along with code changes.

So in the scenario when the contributor changed the API in a binary incompatible way, he has to run apiDump and verify the changed signatures manually. As soon as the result of apiDump will be committed to CI, apiCheck will stop detecting the broken binary compatibility.

So looks like when changing the API the changes to binary compatibility can only be detected by manual review, either by the contributor himself or by the reviewers of the PR. Maybe it would be a good idea to mention this as well?

@lukaszkalnik
Copy link
Contributor Author

@nomisRev Ok, I see you wrote exactly that in the issue you linked:

The goal of the Binary Compat Validator is not perse to break CI on breaking changes but to make changes in the binary reviewable.

@nomisRev
Copy link
Member

Yes, that is entirely correct. The usage of the tool seems a bit unconventional, but I think it was designed in this way because sometimes across major versions you also need the ability to break the binary sometimes to remove old deprecated code.

The burden should not be so much on the user though, since if apiDump did not work correctly then apiCheck will fail.
So this enforces that the api changes will always be visible in the PR for both the contributor, and reviewers.

If you want to verify locally that apiDump worked correctly you can run following commands.

./gradlew apiDump
./gradlew check

@JavierSegoviaCordoba if we make check depend on apiCheck then the users build should also break on ./gradlew check, right? Does that mean it's also included in build? I'm thinking perhaps we can update the check task to include everything that is necessary to reflect CI on PR, so we can simply the process for the users. Then a contributor can just run check to verify your PR will pass Github Actions.

@lukaszkalnik
Copy link
Contributor Author

Thanks for the clarification. I learn a lot from these explanations.

What about the spotlessApply task not being there currently? Is there a PR open which will add it soon? It would be confusing for the contributors if we include the spotlessApply task in the contribution guide but the task is not there yet.

@JavierSegoviaCordoba
Copy link
Member

JavierSegoviaCordoba commented Dec 20, 2021

@nomisRev apiCheck should be depending on check already. spotlessCheck too, and knit check task I think too.

./gradlew check should be the correct way to ensure everything is working, personally, I should use ./gradlew build which runs ./gradlew check plus more tasks (and this one is which is used in the CICD.

Not sure if a pre-commit should run apiDump/spotlessApply or apiCheck/spotlessCheck. Maybe a mix, a breaking change should be notified when it is being committed so I would run apiCheck, but autoformatting is "irrelevant" in term of, who cares if it fails, just fix it silently. So I would do a pre-commit that runs apiCheck and spotlessApply as a minimum.

Another option is a pre-commit running spotlessApply and later check

This pre-commit can be installed automatically when the project syncs too, so the user hasn't to do it manually.

@nomisRev
Copy link
Member

so the user hasn't to do it manually.

@JavierSegoviaCordoba totally in favor of this, but I tried Gradle githooks for spotlessApply and I hated it.
Multiple times I closed my projected thinking I commited, but the committ was aborted due to Grade running first.
So IMO it's way too slow, and apiCheck is even slower.

But both can be silently fixed on CI, without it affecting the purpose of the usage of Spotelss or Binary Validator.
I am always in favor of automating on CI if it doesn't affect the goals, and decreases burden on the developers.

Ideally, a good editorconfig or KtFmt idea plugin always keep the code correctly formatted.

@nomisRev
Copy link
Member

@lukaszkalnik Api files have been automated by @serras 🥳

@lukaszkalnik
Copy link
Contributor Author

lukaszkalnik commented Dec 25, 2021

@nomisRev can you elaborate what exactly has been automated?

Current state that I see on my other branch is:

  1. I changed some API (intentionally or not).
  2. I run build
  3. build fails because apiCheck fails. It tells me about the difference in the .api files.
  4. Now I have to either revert my API changes (if they were accidental) or, if they were intentional, manually run apiDump to make apiCheck pass.

Has this process now changed?

@lukaszkalnik
Copy link
Contributor Author

lukaszkalnik commented Dec 26, 2021

I suspect you mean the change to pull_request.yml in e2eb572 which runs apiDump instead of apiCheck on a pull request.
As I understand it, this still doesn't change what happens locally - when a build is run locally, an apiCheck runs (not apiDump), so local build will fail if there are API changes.

@lukaszkalnik
Copy link
Contributor Author

lukaszkalnik commented Dec 29, 2021

I just added some Knit annotated code snippets to my other PR and I see that the build is now failing.
Is there a reason why the knit task to (re-)generate the examples is not automatically run during build?

@lukaszkalnik
Copy link
Contributor Author

lukaszkalnik commented Jan 4, 2022

@nomisRev @JavierSegoviaCordoba do you think these changes are helpful and correct so that we can merge this PR?
If there are any changes to the build configuration later we can always create another PR with the new updates to the contributing guide.

Copy link
Member

@nomisRev nomisRev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this update is very valuable @lukaszkalnik! Thanks for updating it.
Everything looks correct to me and can be merged.


For code formatting we use [Spotless](https://github.com/diffplug/spotless/tree/main/plugin-gradle) with [KtFmt](https://github.com/facebookincubator/ktfmt) and for API binary compatibility we use [Binary Compatibility Validator](https://github.com/Kotlin/binary-compatibility-validator). They need to run before you commit and push your code to Github.

If you've included those changes for binary compatibility and formatted the code correctly it's time to open your PR and get your contribution into Arrow. Thanks ahead of time for your effort and contributions 🙏
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hanks ahead of time for your effort and contributions 🙏
❤️

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This part was actually originally your suggestion ;) I think it's very good too!

@nomisRev nomisRev merged commit 3bec032 into arrow-kt:main Jan 7, 2022
@lukaszkalnik lukaszkalnik deleted the update-contributing-guide branch January 7, 2022 16:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants