Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add more flexibility into devfile project source - checkout branch, commit ID etc #12775

Closed
mshaposhnik opened this issue Feb 27, 2019 · 26 comments
Assignees
Labels
kind/task Internal things, technical debt, and to-do tasks to be performed. severity/P1 Has a major impact to usage or development of the system.

Comments

@mshaposhnik
Copy link
Contributor

mshaposhnik commented Feb 27, 2019

Description

Factories have a variety of options to contol projects sources more precisely:
branch, startPoint, commitId, fetch, branchMerge, keepDir

In this issue, we want to add support for branch and commitId into the devfile as new fields:

projects:
  - name: petclinic
    source:
      type: git
      location: 'git@github.com:spring-projects/spring-petclinic.git'
      branch: my-branch

or

projects:
  - name: petclinic
    source:
      type: git
      location: 'git@github.com:spring-projects/spring-petclinic.git'
      commitId: 42dfsa234

respectively.

@mshaposhnik mshaposhnik added kind/task Internal things, technical debt, and to-do tasks to be performed. team/platform labels Feb 27, 2019
@sunix
Copy link
Contributor

sunix commented Feb 27, 2019

I would make 'branch' available first/top priority. Not sure in which usecases the other parameters would be useful.

@sunix
Copy link
Contributor

sunix commented Feb 27, 2019

to make it simpler, i would choose the second option (not using another attribute field)

@skabashnyuk skabashnyuk added the severity/P1 Has a major impact to usage or development of the system. label Mar 6, 2019
@skabashnyuk
Copy link
Contributor

@metlos
Copy link
Contributor

metlos commented Mar 18, 2019

I've updated the description with the agreed upon scope.

@sunix
Copy link
Contributor

sunix commented Mar 18, 2019

Thanks @metlos,
Git clone with commitId is not implemented in Theia ATM.
Any idea which git command it should perform ? not sure neither about the use cases.

@metlos
Copy link
Contributor

metlos commented Mar 18, 2019

Git clone with commitId is not implemented in Theia ATM.
Any idea which git command it should perform ? not sure neither about the use cases.

The use case I guess is that it is a general purpose tool to start with the repo at a specific state for e.g. debugging a bugfix.

In Che 6, the commitId is just the commit to checkout after the clone. E.g.:

git clone <REPO_URL>
git checkout <COMMIT_ID>

@metlos
Copy link
Contributor

metlos commented Mar 18, 2019

IMHO, branch is much more useful, of course, and commitId is just a nice-to-have.

@metlos
Copy link
Contributor

metlos commented Mar 18, 2019

@mshaposhnik , @skabashnyuk any comments about the above?

@skabashnyuk
Copy link
Contributor

skabashnyuk commented Mar 19, 2019

I was looking at https://git-scm.com/docs/git-checkout

git checkout [-q] [-f] [-m] [<branch>]
git checkout [-q] [-f] [-m] --detach [<branch>]
git checkout [-q] [-f] [-m] [--detach] <commit>
git checkout [-q] [-f] [-m] [[-b|-B|--orphan] <new_branch>] [<start_point>]
git checkout [-f|--ours|--theirs|-m|--conflict=<style>] [<tree-ish>] [--] <paths>…​
git checkout [<tree-ish>] [--] <pathspec>…​
git checkout (-p|--patch) [<tree-ish>] [--] [<paths>…​]

And have one idea in mind. What if instead of branch and commitId we use checkout field for example.

projects:
  - name: petclinic
    source:
      type: git
      location: 'git@github.com:spring-projects/spring-petclinic.git'
      checkout: existedbranch
projects:
  - name: petclinic
    source:
      type: git
      location: 'git@github.com:spring-projects/spring-petclinic.git'
      checkout: -b newbranch
projects:
  - name: petclinic
    source:
      type: git
      location: 'git@github.com:spring-projects/spring-petclinic.git'
      checkout: 234a34
projects:
  - name: petclinic
    source:
      type: git
      location: 'git@github.com:spring-projects/spring-petclinic.git'
      checkout: -b mybranch --track origin/master

@sunix @mshaposhnik @metlos @l0rd @vinokurig @sleshchenko wdyt?

@l0rd
Copy link
Contributor

l0rd commented Mar 19, 2019

@skabashnyuk I prefer branch

@sunix
Copy link
Contributor

sunix commented Mar 19, 2019

@skabashnyuk because it covers 99% of our known usecase, i would use only branch for cloning to a specific branch,
commit-id could be use to perform a git reset --hard <commit-id> (after the initial git clone)
WDYT ?

@metlos
Copy link
Contributor

metlos commented Mar 19, 2019

I agree with @sunix that branch is the important thing here.

On the other hand I am not sure if we're not trying to conflate two different things - namely the cloning of sources and setting the clone to a specific state.

WRT to cloning I'd maybe prefer some of the possibilities of the actual git clone like --depth or combination of --single-branch --no-tags

projects:
- name: petclinic
  source:
    type: git
    location: ...
    clone:
      branch: v2
      depth: 1
      full: false
    checkout: 556df33

but in either case I'd postpone such things, along with @skabashnyuk's checkout idea, that I find quite nice, to some later time. branch is IMHO gonna be the most commonly used thing (by far).

@skabashnyuk
Copy link
Contributor

Ok. Let's define branch behavior.
If a branch exists - checkout, if not - create? Correct?

@metlos
Copy link
Contributor

metlos commented Mar 20, 2019

I'm not sure about create if branch doesn't exist.

Maybe we could instead just checkout the default branch and inform the user about the fact? That seems a little bit less "magic" to me and gives the user more control over the situation.

@metlos
Copy link
Contributor

metlos commented Mar 20, 2019

And what about the naming and format?

projects:
- name: petclinic
  source:
    type: git
    location: ...
    clone:
      branch: v2

or

projects:
- name: petclinic
  source:
    type: git
    location: ...
    branch: v2

or even

projects:
- name: petclinic
  source:
    type: git
    location: ...
    checkout: v2

I'm personally leaning more towards the third actually. branch is logically a checkout concept, even though one can supply it at clone time. It would also lend itself nicely to support commitId in the future without actually any change to the devfile format (we'd just need to change the implementation from supplying --branch to git clone to actually do 2 git commands - a clone and then a checkout).

For now if user would supply a commitId to the checkout parameter we'd just say that that branch doesn't exist and checkout the default branch instead.

@metlos metlos self-assigned this Mar 20, 2019
@sunix
Copy link
Contributor

sunix commented Mar 21, 2019

I would go to the simplest and closest to what we have option 2:

- name: che-theia
    source:
      type: git
      location: 'https://github.com/eclipse/che-theia.git'
      branch: che-1234-wip

@sunix
Copy link
Contributor

sunix commented Mar 21, 2019

KO for checkout because here we are trying to give some information, not talking about the action. I would not replace location with clone for instance

@sunix
Copy link
Contributor

sunix commented Mar 21, 2019

commit-id should be the result of a git reset --hard <commit-id>. Otherwise we would fall down to strange state and behaviour.

@metlos
Copy link
Contributor

metlos commented Mar 22, 2019

KO for checkout because here we are trying to give some information, not talking about the action. I would not replace location with clone for instance

IMHO, "checkout", unlike "check out", is a noun (e.g. "please go to the checkout", or "the checkout papers are at the reception"), so yes, we're giving information. It is on the same level of confusion as "branch". When you say "git branch" you're talking about an action, whereas in our case you're giving information. Welcome to the wonderful world of ambiguities of non-declined languages ;)

The reason why I am trying to avoid branch is because it doesn't lend itself nicely to later improvements. While checkout is a familiar word that people are used to when working with git and are used to give it any "treeish", i.e. branch, tag or commit id, branch can only mean one thing.

So if we in the future want to support specifying a tag or commit id, we would have to go ahead and actually change all of Theia impl, devfile schema and server side impl to add the new attribute.

And I do think that for example checking out tags or commits is of great value - e.g. take a look at how the code looked like at the release time (usually a tag) or flows like "give me an IDE for the repo in the state I'm currently looking at on github".

commit-id should not be the result of a git reset --hard . Otherwise we would fall down to strange state and behaviour.

not sure what you're referring to here.

@metlos
Copy link
Contributor

metlos commented Mar 22, 2019

That said, I'm willing to look for a different name than checkout.

These came to my mind:

  • revision
  • commit

@l0rd
Copy link
Contributor

l0rd commented Mar 22, 2019

I don't understand why you want to use checkout. The whole source section of a devfile is in some way a git pull section (we could call it source-pull) so if you want to adhere to git the attributes of the devfile source section should match the options of git pull (i.e. repository + refspec + options):

NAME
       git-pull - Fetch from and integrate with another repository or a local branch

SYNOPSIS
       git pull [options] [<repository> [<refspec>...]]

@metlos
Copy link
Contributor

metlos commented Mar 22, 2019

refspec then?

@sunix
Copy link
Contributor

sunix commented Apr 1, 2019

refspec is very confusing ... at the moment i would have to read the documentation to understand what it is doing ... which is bad
i would keep

  • branch for the branch to checkout (clone or not). From theia git clone --branch <branch> <repourl>
  • commit-id would perform a git reset --hard <commit-id>

@metlos
Copy link
Contributor

metlos commented Apr 1, 2019

A short recap of a conversation we had:

  1. our main usecase is to support "sharing a PR branch between developers"
  2. another plausible usecase is "devfile for prod environment" which could be used as the base for developing production fixes for example.

branch is ideal for the usecase 1 but not for second usecase, because branches can point to different commits at different points in time.

refspec is a "hard to parse" name but can support both usecases with a single concept. It is also a bit of a misnomer, because refspec is used as a fetch specification, https://git-scm.com/book/en/v2/Git-Internals-The-Refspec.

The implementation is going to implement this using git clone and git checkout (which actually is what git clone --branch is doing anyway, with the limitation that it cannot take commit ids). The first to clone the repo, the second to check out the concrete revision of the repository and to automatically setup tracking of the upstream branch (if we're checking out a branch at all).

So to resolve this for good, I propose the following alternative ways forward (vote for one using an emoji on this comment):

  1. support branch + commit-id as suggested above Add more flexibility into devfile project source - checkout branch, commit ID etc #12775 (comment) (vote with 👀 )
  2. keep refspec (vote with 😕 )
  3. use startPoint to point to branch/tag/commit (vote with 🚀 ) - we're not using git terminology for location either, so maybe it'd be better to stay neutral here, too. At the same time, start_point is used in git docs for checkout to mean the same thing.
  4. use revision to point to branch/tag/commit (vote with 😆 ) - the reasons are the same as above

@sunix
Copy link
Contributor

sunix commented Apr 1, 2019

branch is ideal for the usecase 1 but not for second usecase, because branches can point to different commits at different points in time.

I have edited the first option because wit branch and commit-id we could cover everything
or maybe implement branch and refspec:

  • branch for the branch to checkout (clone or not). From theia git clone --branch <branch> <repourl>. It is also the branch where changes should go or PR should be based
  • refspec would perform a git reset --hard <refspec>

@metlos
Copy link
Contributor

metlos commented Apr 8, 2019

I'm closing this because the feature is ready in devfile as designed. There is a pending PR in che-theia to add the full support for this in the IDE: eclipse-che/che-theia#144.

Currently che-theia only supports the branch attribute and ignores all others.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/task Internal things, technical debt, and to-do tasks to be performed. severity/P1 Has a major impact to usage or development of the system.
Projects
None yet
Development

No branches or pull requests

5 participants