Rough draft of contributor help script #4293

Closed
wants to merge 5 commits into from

Conversation

sarayourfriend
Contributor

@sarayourfriend sarayourfriend commented May 9, 2024

Fixes

Fixes #4137 by @AetherUnbound

Description

Adds a new script get-started.sh for contributors to run. The script checks for dependencies on their system and gives suggestions for how to install them.
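As a rough illustration of the kind of check described above, here is a minimal POSIX shell sketch. It is not the script from this PR: the helper names `has_command` and `find_missing` are hypothetical, and the tool names are assumed from the sample output further down.

```shell
#!/bin/sh
# Hypothetical sketch, not the actual get-started.sh from this PR.

# Return 0 when the given command can be found on PATH.
has_command() {
  command -v "$1" >/dev/null 2>&1
}

# Print every command from the argument list that is missing, one per line.
find_missing() {
  for cmd in "$@"; do
    if ! has_command "$cmd"; then
      printf '%s\n' "$cmd"
    fi
  done
}

# Tool names are assumptions based on the sample output in this description.
missing=$(find_missing python3 just docker pnpm)

if [ -z "$missing" ]; then
  echo "All checked tools were found."
else
  echo "I detected the following missing dependencies:"
  # Prefix each missing tool with "- " to form a simple list.
  printf '%s\n' "$missing" | sed 's/^/- /'
fi
```

Checking with `command -v` only confirms that an executable exists on `PATH`; it says nothing about its version, so a real script would need extra checks for requirements like a minimum Python version.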

I've gone with a basic approach here, and we can expand or change references however we see fit. One thing I've noticed is that the installation suggestions naturally overlap with the documentation we have in the getting started page. Is there a reasonable way to merge these? Should we go "all in" on the script? Or should the script refer to the documentation pages instead of putting the suggestions in-line? The latter is probably the clearest and easiest way to remove the duplication. What do y'all think?

Testing Instructions

I've added a temporary Dockerfile and just recipe to make testing the different scenarios as easy as possible.

Run just test-get-started <target> with each of the targets defined in Dockerfile.get-started-test:

  • no-deps
  • with-python
  • with-just
  • with-docker
  • with-docker-compose
  • with-node
  • with-pnpm-direct
  • with-everything

I'll remove the Dockerfile and the just recipe before merging.

Output when there's nothing missing
Welcome to...                                                                                                                                      

   ____
  / __ \
 | |  | |_ __   ___ _ ____   _____ _ __ ___  ___
 | |  | | '_ \ / _ \ '_ \ \ / / _ \ '__/ __|/ _ \
 | |__| | |_) |  __/ | | \ V /  __/ |  \__ \  __/
  \____/| .__/ \___|_| |_|\_/ \___|_|  |___/\___|
        | |
        |_|

This script will check your local development environment for the tools required to work on Openverse.

If anything is missing, it will let you know, and provide a suggestion for where to find it.


Enabling corepack in the repository for pnpm!

Congrats! Your system appears to be all set up for Openverse development!

Try running 'just install' followed by 'just up' and then 'just init'.

Further setup instructions can be found in the quick start guide.
The guide also includes instructions for setting up individual parts of the Openverse stack.

https://docs.openverse.org/general/quickstart.html
Output when everything is missing
Welcome to...

   ____
  / __ \
 | |  | |_ __   ___ _ ____   _____ _ __ ___  ___
 | |  | | '_ \ / _ \ '_ \ \ / / _ \ '__/ __|/ _ \
 | |__| | |_) |  __/ | | \ V /  __/ |  \__ \  __/
  \____/| .__/ \___|_| |_|\_/ \___|_|  |___/\___|
        | |
        |_|

This script will check your local development environment for the tools required to work on Openverse.

If anything is missing, it will let you know, and provide a suggestion for where to find it.



====== Python language ======

Python 3.11 or later could not be found on your system.
Please update or install Python according to the instructions from the Python Foundation:

https://docs.python.org/3/using/unix.html#getting-and-installing-the-latest-version-of-python


====== 'just' command runner ======
The 'just' command runner could not be found on your system.
Try installing 'just' using your OS's package manager: https://github.com/casey/just?tab=readme-ov-file#packages

For debian or debian-derived systems (like Ubuntu) that do not have makedeb configured,
'just' also provides pre-built binaries. However, you'll need to manually keep them updated:

https://github.com/casey/just?tab=readme-ov-file#pre-built-binaries

Alternatively, you may prefer using the 'just-install' NPM package, endorsed by the 'just' project:

https://github.com/brombal/just-install#readme


====== Docker container runtime ======

Docker is missing from your system. Install it and Docker compose using Docker's instructions.

Docker engine: https://docs.docker.com/engine/install/
Docker compose: https://docs.docker.com/compose/install/

Podman is not currently supported for Openverse development.


====== pnpm Node.js package manager ======

pnpm is missing from your system, and corepack was unavailable to automatically install it using standard Node.js tooling.

For ease of use, Corepack is highly recommended and is the most flexible approach to Node.js package manager installation.

Refer to Corepack's documentation for installation instructions: https://github.com/nodejs/corepack?tab=readme-ov-file#how-to-install

Alternatively, to install pnpm directly, refer to pnpm's installation instructions: https://pnpm.io/installation#on-posix-systems


I detected the following missing dependencies:
- pnpm package manager
- Docker container runtime
- Just command runner
- Python 3.11 or greater
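The "Python 3.11 or later" requirement in the output above needs a numeric version comparison rather than a string comparison (otherwise 3.9 would sort after 3.11). A minimal sketch of one way to do that in POSIX shell follows; `version_at_least` is a hypothetical helper, not necessarily how the script in this PR does it.

```shell
#!/bin/sh
# Hypothetical sketch of a minimum-version check; not taken from the PR.

# Return 0 when version "$1" is at least "$2", comparing major and minor
# numerically (patch components are ignored).
version_at_least() {
  have_major=${1%%.*}
  have_rest=${1#*.}
  have_minor=${have_rest%%.*}
  want_major=${2%%.*}
  want_rest=${2#*.}
  want_minor=${want_rest%%.*}
  [ "$have_major" -gt "$want_major" ] ||
    { [ "$have_major" -eq "$want_major" ] && [ "$have_minor" -ge "$want_minor" ]; }
}

if command -v python3 >/dev/null 2>&1; then
  # Ask the interpreter itself for "major.minor".
  py_version=$(python3 -c 'import sys; print("%d.%d" % sys.version_info[:2])')
  if version_at_least "$py_version" 3.11; then
    echo "Found Python $py_version"
  else
    echo "Python $py_version is older than 3.11"
  fi
else
  echo "Python 3.11 or later could not be found on your system."
fi
```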

Checklist

  • My pull request has a descriptive title (not a vague title like Update index.md).
  • My pull request targets the default branch of the repository (main) or a parent feature branch.
  • My commit messages follow best practices.
  • My code follows the established code style of the repository.
  • [N/A] I added or updated tests for the changes I made (if applicable).
  • I added or updated documentation (if applicable).
  • I tried running the project locally and verified that there are no visible errors.

Developer Certificate of Origin

Developer Certificate of Origin
Version 1.1

Copyright (C) 2004, 2006 The Linux Foundation and its contributors.
1 Letterman Drive
Suite D4700
San Francisco, CA, 94129

Everyone is permitted to copy and distribute verbatim copies of this
license document, but changing it is not allowed.


Developer's Certificate of Origin 1.1

By making a contribution to this project, I certify that:

(a) The contribution was created in whole or in part by me and I
    have the right to submit it under the open source license
    indicated in the file; or

(b) The contribution is based upon previous work that, to the best
    of my knowledge, is covered under an appropriate open source
    license and I have the right under that license to submit that
    work with modifications, whether created in whole or in part
    by me, under the same open source license (unless I am
    permitted to submit under a different license), as indicated
    in the file; or

(c) The contribution was provided directly to me by some other
    person who certified (a), (b) or (c) and I have not modified
    it.

(d) I understand and agree that this project and the contribution
    are public and that a record of the contribution (including all
    personal information I submit with it, including my sign-off) is
    maintained indefinitely and may be redistributed consistent with
    this project or the open source license(s) involved.

@github-actions github-actions bot added 🏷 status: label work required Needs proper labelling before it can be worked on 🚦 status: awaiting triage Has not been triaged & therefore, not ready for work labels May 9, 2024
@openverse-bot openverse-bot added 🟩 priority: low Low priority and doesn't need to be rushed 🌟 goal: addition Addition of new feature 🤖 aspect: dx Concerns developers' experience with the codebase labels May 9, 2024
@obulat obulat added 🧱 stack: documentation Related to Sphinx documentation and removed 🏷 status: label work required Needs proper labelling before it can be worked on 🚦 status: awaiting triage Has not been triaged & therefore, not ready for work labels May 9, 2024
@sarayourfriend sarayourfriend marked this pull request as ready for review May 15, 2024 00:55
@sarayourfriend sarayourfriend requested review from a team as code owners May 15, 2024 00:55

Full-stack documentation: https://docs.openverse.org/_preview/4293

Please note that GitHub pages takes a little time to deploy newly pushed code, if the links above don't work or you see old versions, wait 5 minutes and try again.

You can check the GitHub pages deployment action list to see the current status of the deployments.


@dhruvkb
Member

dhruvkb commented May 15, 2024

Should we go "all in" on the script? Or should the script refer to the documentation pages instead of putting the suggestions in-line?

I'd love to review this PR in depth soon, but I do want to quickly chime in to voice my support for keeping the docs in the docs, and for the script to print links to the docs site. For these reasons:

  • It'll introduce contributors to the docs site and encourage them to read and search through it for common issues. It's also good to cultivate an expectation that the docs site should have answers for commonly asked questions.
  • It's easier to add rich content like formatting, images, and step-by-step guides on the docs site compared to plain text content in a script. Additionally, a web page is much more capable and interactive, with hyperlinks and search.

@sarayourfriend
Contributor Author

I agree with you, Dhruv.

Do you think we should have the instructions say something like:

  1. Install git if you don't have it
  2. Clone the repository
  3. Run the get-started script (or whatever we end up calling it) and it will tell you what dependencies you're missing (you can read about those below)

That would change the docs from "install these dependencies" to "run this script and it will tell you what to do next". It feels like there's a bit of a structural change warranted by the introduction of this script, mainly because I'm not entirely confident when it should be brought up in the documentation.

@dhruvkb
Member

dhruvkb commented May 15, 2024

We can suggest the script at the very top of the general setup page, in a "tip"-style admonition. This lets novice contributors, who will primarily be interested in a script like this, find it immediately instead of having to read through any docs first.

I'm not a fan of the curl | bash pattern but it lets us skip the Git clone step as well.

curl https://raw.githubusercontent.com/WordPress/openverse/main/get_started.sh | bash

Then the entire rest of the page can be left as-is as reading material for slightly advanced contributors who would prefer to set things up manually with more control over their systems. Any missing deps can be listed by the script as links to the very specific sections in the file for those deps. I would love it if we could go one step further and just install the missing deps for the users too (but that can make the script extremely complicated and, since we'll need sudo, potentially dangerous too).


Oh and I think the script name get_started.sh is quite good and on point, but with _ instead of the -, similar to the load_sample_data.sh script.


Additionally I feel like the script should ask the contributor what part of the stack they would like to work on because some setup can be avoided, and some deps may differ, if they decide to only work on a smaller part of the stack like the docs or the frontend (which do not need Docker) or the API or ingestion server which do not need PDM to be present locally.
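To make the idea concrete, a per-stack dependency lookup could be sketched roughly as below. This is purely illustrative: the function name `deps_for_stack`, the stack keys, and the exact tool lists are assumptions and have not been checked against the project's real requirements. Only the "docs and frontend do not need Docker" and "API and ingestion server do not need local PDM" parts come from the comment above.

```shell
#!/bin/sh
# Hypothetical sketch; the tool lists are illustrative assumptions only.

# Print the space-separated tools to check for the chosen part of the stack.
deps_for_stack() {
  case "$1" in
    docs) echo "just python3 pdm" ;;               # assumed: no Docker needed
    frontend) echo "just node pnpm" ;;             # assumed: no Docker needed
    api|ingestion_server) echo "just docker" ;;    # assumed: no local PDM needed
    *) echo "just python3 pdm docker node pnpm" ;; # unknown choice: check everything
  esac
}

# Example: list what would be checked for a frontend-only contributor.
for tool in $(deps_for_stack frontend); do
  echo "would check: $tool"
done
```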


Also about linting, I think we can keep the entire thing as optional and let the CI check it for them. Linting makes a lot of dependencies like pnpm and Docker become non-optional which may be a lot to ask of a new contributor.

@sarayourfriend
Contributor Author

sarayourfriend commented May 15, 2024

Sounds good on changing the script to link to the docs, it simplifies things quite a bit!

I'm not a fan of the curl | bash pattern but it lets us skip the Git clone step as well.

I wondered about this approach as well, but also am not a fan of it 🤔

Git is typically easy to install anywhere we support development 🤔. It's available by default on many Linux distros and macOS (as far as I understand it).

I would love it if we could go one step further and just install the missing deps for the users too (but that can make the script extremely complicated and, since we'll need sudo, potentially dangerous too).

FWIW, just to clarify, I'm firmly against this for a lot of reasons, even if we could work around the need for sudo. I don't think it actually creates less friction for contributors and the potential for wreaking havoc in someone's environment is too significant.

If anything, we could simplify things by ditching just, but that would be a huge change in our workflow. It's for some reason a recurring issue that people have trouble installing it on Ubuntu/Debian generally.

Installing things ourselves presents a huge list of problems, which I wrote about in the issue. I don't think it's something we should get in the mindset of trying to manage in the project unless we managed to move the entire development environment into Docker and made Docker the only dependency. That would be awesome, but would require making sure docker socket pass through was intuitive and easy for everyone to get working regardless of context.

Doing that would avoid all of this, Docker is trivially easy to install these days pretty much everywhere.

Additionally I feel like the script should ask the contributor what part of the stack they would like to work on because some setup can be avoided, and some deps may differ, if they decide to only work on a smaller part of the stack like the docs or the frontend (which do not need Docker) or the API or ingestion server which do not need PDM to be present locally.

That would be great, but maybe a fast follow? I agree it's a good feature, but more significant, and there are already some big questions that need answering in this basic version.

What do you think about this compared to making a development environment in a Docker image, with Docker being the only requirement? We'd still need to write some kind of initialisation script to configure git inside the container, I think? Or we could require git and docker, and everything else happens inside an openverse-dev-env Docker container?

I played around a bit with nix because it's another option for creating a zero-effort development environment... but only if nix is already available, and that's a question in itself 😅

Also about linting, I think we can keep the entire thing as optional and let the CI check it for them

The issue here is that then contributors are not running linters, and we repeatedly have to ask them to do so in PRs, or run it ourselves. This is a massive increase in the time and friction it takes for us to review PRs. I've talked about this in the discussion on this PR, #3889, but I think running the linters is a reasonable baseline expectation that you must be able to do to contribute. Same with writing unit tests when a change would require them.

We need to have some baseline expectation of being able to run development tools and follow the quick start guide. There are essentially no contributions that are less complex than that, and while it's nice to be able to accept easy one-off contributions, I question the value of them in and of themselves if we have to spend more time requesting changes or running linters on a PR ourselves than it would have taken to just implement the issue. As discussed in that PR, I don't think Openverse has the context, time, or other resources to be a significant aspect of someone's learning the basic skills of software development in Python or JavaScript. It would be great if we did, but we're a very small team working on an ambitious project, and are already short on the time and energy needed to tackle our most urgent and pressing needs.

@dhruvkb
Member

dhruvkb commented May 15, 2024

I would be all in on the devcontainers workflow where 100% of the development happens in a container with everything pre-provisioned but that has not worked out so reliably. I remember this issue with executables github.com/docker/for-mac/issues/5029.

For now, I think a simple script that only checks for all our dependencies (PDM, pnpm, Docker... everything) and reports a list of missing ones with docs links will be good enough. If it can be run with curl | bash that's good because it's one less step, but if not, that's good too because any contributor will undoubtedly need to clone the repo anyways.
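A minimal sketch of the "report missing deps with docs links" idea might look like the following. The base URL is the quick start guide linked from the PR's sample output; the `#...` anchors and the helper names `docs_link_for` and `report_missing` are hypothetical and would need to match real headings on the docs site.

```shell
#!/bin/sh
# Hypothetical sketch; the anchor names below are placeholders, not real docs anchors.

DOCS_URL="https://docs.openverse.org/general/quickstart.html"

# Print the docs link for a dependency, falling back to the top of the guide.
docs_link_for() {
  case "$1" in
    python) echo "$DOCS_URL#python" ;;
    just) echo "$DOCS_URL#just" ;;
    docker) echo "$DOCS_URL#docker" ;;
    pnpm) echo "$DOCS_URL#pnpm" ;;
    *) echo "$DOCS_URL" ;;
  esac
}

# Print one "- name: link" line per missing dependency passed as an argument.
report_missing() {
  for dep in "$@"; do
    printf '%s\n' "- $dep: $(docs_link_for "$dep")"
  done
}

# Example: pretend 'just' and 'docker' were not found.
report_missing just docker
```

Keeping the mapping in one function means the script carries no installation instructions of its own, which fits the preference expressed earlier in the thread for keeping the docs in the docs.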

What I mean to say is that this is a good start, and I'd be totally open to merging this PR as-is, albeit after a proper review, with further enhancements as needed based on contributor feedback.

@zackkrida
Member

zackkrida commented May 15, 2024

I think the approach @dhruvkb just outlined (which is basically what's here so far) is ideal.

I'll plug my related issue here, which should make actually cloning the repo much more accessible if it remains a requirement for running this script: #4329

@sarayourfriend
Contributor Author

I'm going to close this for now, because I think #4343 is a better approach that fixes all the problems we wanted to solve with this script, without introducing a bunch of complex caveats about how to install certain dependencies on systems where it's a pain to do so for one reason or another.

If we can just entirely obviate the need to think about any of this for a normal contributor, and I believe we can by using just git, bash, and docker, then I'd prefer that to trying to sort out the complexity of instructing contributors to use this script (or not) and the documentation to support it.

#4343 would also be something that all of us could use all the time when working on Openverse, and so we'd be confident that it would continue to work and remain true to the needs of the project. A get-started script like the one in this PR would essentially never be used by us directly, and probably go out of sync with the project's needs.

@sarayourfriend sarayourfriend deleted the add/openverse-setup-script branch May 16, 2024 07:05
Labels
🤖 aspect: dx Concerns developers' experience with the codebase 🌟 goal: addition Addition of new feature 🟩 priority: low Low priority and doesn't need to be rushed 🧱 stack: documentation Related to Sphinx documentation
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

Create a "dev dependencies check" script for identifying what a contributor may need
5 participants