Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC Split the repo based on the projects in the repo #744

Closed
adrinjalali opened this issue Mar 7, 2022 · 14 comments
Closed

RFC Split the repo based on the projects in the repo #744

adrinjalali opened this issue Mar 7, 2022 · 14 comments

Comments

@adrinjalali
Copy link
Contributor

Right now the huggingface_repo includes three projects;

  • huggingface_hub: which is the core library used by integration libraries and api-inference-community
  • api-inference-community which handes integration with some of the third party libraries
  • huggingface_widgets which is located under the js folder and hosts the code for the inference widgets

The three projects require different dev environments, have different development styles, and follow different release cycles.

This request for comment is for us to see if it would make sense to separate the repo into three different repositories.

cc @julien-c @LysandreJik @osanseviero @Narsil

@julien-c
Copy link
Member

julien-c commented Mar 7, 2022

no strong opinion on my side, for context we pushed to experiment with a single repo some time ago to see if it would gain "critical mass" on GitHub, but it's maybe not as important anymore

So I'm open to re-splitting.

@osanseviero
Copy link
Member

After some discussions last week I think it makes sense to split. I think having clearer release cycles for api-inference-community would be beneficial, and not having a mixed bag of everything in a single repo.

@adrinjalali, js has many things: inference snippets, widgets, snippets of code for each library, the definition of the tasks tree, etc. There are two other directories: tasks and docs, which I would move with js. Soon huggingface_hub will have new proper docs, which would be in this repo. docs would be just documentation about product usage. This has the advantage that moon-landing will just directly depend on that repo instead of huggingface_hub and huggingface_widgets.

@adrinjalali
Copy link
Contributor Author

adrinjalali commented Mar 7, 2022

Sounds good to me @osanseviero . I think on top of the API docs we need some user guides as well, which would go in the docs/ folder, but I agree that the product specific docs can (or should) move and be where js goes.

In terms of how to execute this, I suggest:

  • creating two repositories, one for api-inference-community and one for js (not sure what we're gonna call this one)
  • setup their corresponding CI and GH actions
  • mirror the existing content of this repo there
  • move issues related to those projects to those repos
  • stop accepting PRs which change these two projects in this repo and direct people to those other repos
  • move existing PRs to those repos, or merge some of the existing ones here, and keep those two repos in sync with this one while this is happening
  • make moon-landing use those repos
  • delete those projects from this repo

WDYT?

@Narsil
Copy link
Contributor

Narsil commented Mar 7, 2022

I am fine either way (single or split) !

We need to not forget moving all github actions related and adapt them in consequence too.

@osanseviero
Copy link
Member

We also need to change moon-landing to access the new repo instead of this one.

@pngwn
Copy link
Member

pngwn commented Mar 10, 2022

I personally have no opinion on this suggestion, as I do not routinely work with these projects but thought I would share my experience.

In my experience, lots of individual repos scales poorly as an org grows. Having fewer, bigger repos, increases the likelihood that the repo will be well maintained (as it is less liklely to get abandoned without intention, and generally has more eyes on it), offers an easier path for code sharing (local files/ local symlinks are far easier to maintain and utilise than cross-repository ones), makes enforcing code and quality standards easier and, in particular, makes sharing CI easier.

I don't think there are any inherent problems with multi-language mono-repos, whether custom tooling is used to support that or not.

I also think there are better mechanisms for importing dependencies than relying on the entire repo. Package repositories, both public and private, exist to address that problem.

@stevhliu
Copy link
Member

+1 for the split! ❤️

Not directly related to splitting the repo, but I think this could be a good opportunity to sort of refresh the Hub product docs. Instead of a list of questions (i.e, How can I do this? Can I do that?), I think it would be nice to give it a little more narrative/structure and focus more on user goals instead of more granular/narrow questions.

@LysandreJik
Copy link
Member

As everyone involved in the thread seems to react positively to the split, we will now go ahead and split the repository in three parts with @adrinjalali.

We're currently trying the following:

  • Keep huggingface_hub as the source of truth with all commits and files. At the end of the process, when we have verified everything works correctly as expected, we will remove the files of the two other repositories in a single commit.
  • Create a new repository huggingface/api-inference-community.
    • That repository is set at the first commit of the huggingface_hub repository (1884eff).
    • We then cherry-pick all commits that touched to the api-inference-community folder, and apply those commits only.
    • We remove the files that were included in commits that are not related to the api-inference-community.
    • We move the contents of the folder to the root of the repository.
  • We do the same with the new repository huggingface/<name_to_be_found_for_the_js_folder>:
  • We complete the checklist shared by Adrin here: RFC Split the repo based on the projects in the repo #744 (comment)

@adrinjalali
Copy link
Contributor Author

@Narsil could you please have a look at https://github.com/huggingface/api-inference-community and check if things are okay? I have a fork of the old version under my account in case we end up needing it.

@adrinjalali
Copy link
Contributor Author

I think I made sure the .github and api-inference-community are in sync here and in https://github.com/huggingface/api-inference-community

One thing I think is missing is the USER_API_TOKEN secret, which I'm not sure how to generate for the new repo.

@adrinjalali
Copy link
Contributor Author

Will now go ahead with moving api-inference-community related issues to the corresponding repo.

@LysandreJik
Copy link
Member

The js, tasks and docs folders, alongside all commits that touched to these files, have been moved to the public repository https://github.com/huggingface/hub-docs.

Similarly, there is a secret missing for the js tests which is the NETLIFY_AUTH_TOKEN.

@julien-c
Copy link
Member

@mishig25 who's "owning" the widgets or @beurkinger might be able to add a new netlify token (ping me if you can't find one)

@adrinjalali
Copy link
Contributor Author

All action items on this issue seem to be completed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

7 participants