Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

the legacy buildpack can still be used in "binder" or ".binder" folders #932

Open
bitnik opened this issue Jul 23, 2020 · 3 comments
Open

Comments

@bitnik
Copy link
Contributor

bitnik commented Jul 23, 2020

In #829 the legacy buildpack (LegacyBinderDockerBuildPack) is removed, but actually it can still be used in binder config folder, because detect method only checks the Dockerfile at the root:

https://github.com/jupyter/repo2docker/blob/8bbced7ded5a21b581f1f3846ffc9f87944ba799/repo2docker/buildpacks/legacy/__init__.py#L19-L23

was this on purpose? Or should we create a PR and change it to check Dockerfile in binder_dir?

Btw I know this is not a realistic usecase, but if a repo contains

  • ./Dockerfile: which uses the legacy image
  • ./binder/requirements.txt

this will also throws error saying that legacy buildpack is removed, but actually it should pick python buildpack.

@minrk
Copy link
Member

minrk commented Jul 23, 2020

I think it should not check for Dockerfile in binder_dir, since that never worked for the legacy buildpack, but it should probably add:

if self.binder_dir:
    return False

before checking the top-level Dockerfile contents to make sure legacy is never used if binder_dir is present.

@betatim
Copy link
Member

betatim commented Jul 30, 2020

Should we add this special casing? Everywhere else in repo2docker we have the "rule" that "if there is a Dockerfile it is used, nothing else counts". Here we'd start inverting that "Dockerfile wins, unless it is a legacy one, and your have config files in a binder sub-directory, but not if the config isn't in a sub-directory." (or something like that) The point being it is more complex than the current rule.

I'm wondering if there is a reason for repo owners wanting to keep the legacy Dockerfile instead of deleting it when they don't want to use it any more. If there is no use-case for keeping it I'd vote for a solution that encourages people to remove the legacy Dockerfile as a solution to problems/weird behaviour.

@minrk
Copy link
Member

minrk commented Aug 10, 2020

tl;dr: I think the logic we have now is right, and we shouldn't make any changes.

Everywhere else in repo2docker we have the "rule" that "if there is a Dockerfile it is used, nothing else counts"

Not quite, because binder_dir takes top priority and determines how we check "if there is a Dockerfile". If a binder directory exists, we consider no files outside this directory, even Dockerfiles.

The logic (excluding Legacy) is more like:

  1. resolve where binder files should be found (.binder, binder, top-level)
  2. only within this directory, is there:
    1. Dockerfile -> use it, ignore everything else
    2. other buildpacks, etc.

which results in the effective priority:

  1. binder/Dockerfile
  2. binder/requirements.txt
  3. /Dockerfile

i.e. the presence of binder/requirements.txt means top-level Dockerfile is and should be ignored. However, we currently special-case top-level Dockerfile matching the legacy Dockerfile pattern as the actual first priority, taking precedence over the presence of the binder directory.

Should we add this special casing?

I think special-casing legacy makes sense, since it is, somewhat by definition, a special case. The goal of the Legacy buildpack is to keep some fraction of old pre-repo2docker Binder repos working, and not use any of the newer repo2docker logic in that case.

Stepping back, I think the logic for picking legacy should be:

Is this repo built for the old Binder, before repo2docker was a thing?

So the question is really, in the presence of a top-level Dockerfile that looks like it was built for the old Binder, does the presence of a .binder or binder directory indicate that the repo has been updated for repo2docker instead of the old Binder? It's ambiguous because we have an old Dockerfile but also a newer binder directory with other files.

Case for using Legacy in this situation:

  • it's technically possible to have created a repo with a dir called binder with no special meaning for the old binder,
    so I could make a contrived case that would be broken by adding this and would need updating.

Case for not using Legacy in this situation:

  • an old repo had the Dockerfile for Binder v1, but was updated to work with repo2docker and this should be ignored until the legacy Dockerfile is removed

Now that I think of it, our current logic makes more sense than the change I proposed for two reasons:

  1. only the repos with more recent activity need more changes to work as desired
  2. because of the current logic, those repos don't work now, so probably don't exist, or at least aren't in use

To make this change, we would be enabling a new partial transition from Binder v1 to repo2docker without dropping support for the old way, which hasn't been operation for a couple of years now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants