Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement Bower mass-import #95

Closed
f-f opened this issue Nov 11, 2020 · 11 comments
Closed

Implement Bower mass-import #95

f-f opened this issue Nov 11, 2020 · 11 comments
Assignees
Milestone

Comments

@f-f
Copy link
Member

f-f commented Nov 11, 2020

We already have some code that can get all the Bower package manifests and generate our manifests, so in order to be able to do the mass-import of Bower packages to the Registry we need to implement some code that:

  • downloads the tarball for a commit hash from a GitHub repo
  • sticks the generated manifest inside
  • repackages the whole thing
  • computes the sha256 of the new tarball

Note that we are planning to have CI checks for new packages, and would be good to run them for imported packages too:

Plus we should convers the semver ranges in the dependencies to simpler ones when generating the manifest, see #43

@f-f f-f added good first issue Good for newcomers help wanted Extra attention is needed CI labels Nov 11, 2020
@JordanMartinez
Copy link
Contributor

Based on the docs, we'll need to download an archive via https://api.github.com/repos/USER/REPO/tarball/REF. For example, https://api.github.com/repos/purescript/purescript-prelude/tarball/v4.1.0

We can add the manifest inside the file via tar --append v4.1.0.json --file=downloaded-archive.tar and compute the sha256 sum via sha256sum result.tar.

Not sure whether this is an issue, but the downloaded tar will store all of its content in top-level folder:

$ tar -tf prelude.tar 
purescript-purescript-prelude-7a691ce/.eslintrc.json
purescript-purescript-prelude-7a691ce/.gitignore
purescript-purescript-prelude-7a691ce/.travis.yml
purescript-purescript-prelude-7a691ce/LICENSE
purescript-purescript-prelude-7a691ce/README.md
purescript-purescript-prelude-7a691ce/bower.json
purescript-purescript-prelude-7a691ce/package.json
purescript-purescript-prelude-7a691ce/src/Control/Applicative.purs
purescript-purescript-prelude-7a691ce/src/Control/Apply.js
...

Correct me if I'm wrong, but I'm assuming we would want to remove the toplevel purescript-purescript-prelude-7a691ce directory, correct?

purs.json -- package manifest file
.eslintrc.json
.gitignore
.travis.yml
LICENSE
README.md
bower.json
package.json
src/Control/Applicative.purs
...

@f-f
Copy link
Member Author

f-f commented Nov 16, 2020

@JordanMartinez since we'll ultimately be in control of the whole tar/untar business (spec and implementations), we can go for anything as long as we detail it properly in the spec - and I think there's value in keeping consistent with the format that gihub is using, as I feel it's slightly more convenient implementation-wise: we'll support also package destinations not on GitHub (this is not the case for Bower import though) and we'll have to clone repositories in that case. Just packaging the whole repo folder might be less messy than making folders, listing files to tar, etc

@JordanMartinez
Copy link
Contributor

@f-f While it might make the implementation easier, won't it make it harder for tools to extract that information later if the toplevel directory is always different?

@f-f
Copy link
Member Author

f-f commented Nov 16, 2020

Could be? The amount of code that deals with that problem is exactly one line in Spago, here
In any case, I don't have a strong opinion on this and I'm fine with either way as long as we spec it 🙂

@f-f
Copy link
Member Author

f-f commented Dec 2, 2020

According to this comment it looks like extracting tarballs that don't have a toplevel folder is a messy business with Nix. I don't know anything about this, but maybe @thomashoneyman can confirm?

@thomashoneyman
Copy link
Member

Sorry, I don't know about this either. I don't have an opinion on the format here, though I in general prefer less processing (use whatever GitHub gives ya) as there are fewer places to check if things go wrong.

@f-f
Copy link
Member Author

f-f commented Dec 20, 2020

I was just thinking that instead of special casing this as a "mass import" we could instead add a specific Operation named AdditionFromBower, which would be exactly the Addition, except also importing/converting the package from the Bower coordinates. In this way we could go ahead with implementing the pipeline in #96, and just add a special case in there for Bower packages. At that point we could import the packages from Bower slowly instead of having to run a batch job - this would allow for better supervision on the import code and hopefully faster iteration, and would allow us to keep the Registry in sync with the new package versions that get published on Bower.

@f-f
Copy link
Member Author

f-f commented Dec 25, 2020

I just realized that when we convert Bower manifests at the moment, the dependencies still look like this:
https://github.com/purescript/registry/blob/78ea3c90de75722df1d0ffc095ab288e5c0cbc38/examples/aff/v5.1.2.json#L7-L14

I believe we should remove the purescript- prefix, since these are supposed to refer to other registry packages, which will have their prefix removed according to #3

@f-f
Copy link
Member Author

f-f commented Jan 24, 2021

I'm having a shot at implementing this whole pipeline right now, and I now have a more concrete answer on the whole "should we keep the subdirectory in the tar archive", and my take is that "yes, we should". As it's pointed out multiple times in this SO question, not having a top level directory inside the tar archive (i.e. just packing all the files in the base level) makes it really annoying to deal with when unpacking.
So I think we should have a toplevel directory inside the tar, and this is going to be good for clients, as long as we have a standard naming scheme (which will be something like $packageName-$version)

@JordanMartinez
Copy link
Contributor

Ok, sounds good.

@f-f f-f self-assigned this Apr 11, 2021
@f-f f-f added this to the v0.1 milestone Apr 26, 2021
@f-f
Copy link
Member Author

f-f commented Sep 8, 2021

This is now implemented! The first piece was #140, and as of #210 we should be most of the way there. We opened new issues to track smaller parts of this (such as #212 and #214), so I'll close this 😊

@f-f f-f closed this as completed Sep 8, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants