jenkins often fails just due to cloning #155

Closed
pseudotensor opened this issue Sep 11, 2017 · 8 comments

@pseudotensor
Collaborator

pseudotensor commented Sep 11, 2017

One solution would be to keep private S3 copies of each repo and sync from them instead of git cloning, with a Jenkins job keeping the S3 copies up to date. The point is that we very rarely change the commit we point to, so it is silly to fetch from GitHub every time. The other solution is to use my shallow-clone scripts so that there is much less to download from GitHub.

Commit message: "minor doc2"

git config core.sparsecheckout # timeout=10
git checkout -f 7508244
git remote # timeout=10
git submodule init # timeout=10
git submodule sync # timeout=10
git config --get remote.origin.url # timeout=10
git submodule init # timeout=10
git config -f .gitmodules --get-regexp ^submodule.(.*).url # timeout=10
git config --get submodule.cub.url # timeout=10
git config -f .gitmodules --get submodule.cub.path # timeout=10
git submodule update --init --recursive cub
git config --get submodule.xgboost.url # timeout=10
git config -f .gitmodules --get submodule.xgboost.path # timeout=10
git submodule update --init --recursive xgboost
git config --get submodule.py3nvml.url # timeout=10
git config -f .gitmodules --get submodule.py3nvml.path # timeout=10
git submodule update --init --recursive py3nvml
git config --get submodule.scikit-learn.url # timeout=10
git config -f .gitmodules --get submodule.scikit-learn.path # timeout=10
git submodule update --init --recursive scikit-learn
ERROR: Timeout after 10 minutes
Command "git submodule update --init --recursive scikit-learn" returned status code 143:
stdout:
stderr: Cloning into 'scikit-learn'...
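
The shallow clone option mentioned above could look roughly like the following. This is only a sketch: `shallow_submodule_update` and the `DRY_RUN` switch are hypothetical names introduced for illustration, and this is not the shallow-clone script referred to in this thread. It relies on the `--depth` option of `git submodule update`. If a pinned submodule commit is not at the tip of its branch, the shallow clone may still need an extra fetch of that commit.

```shell
# Hypothetical helper for illustration only.
# Usage: shallow_submodule_update [SUBMODULE_PATH...]
# With DRY_RUN=1 it prints the git command instead of running it.
shallow_submodule_update() {
    # --depth 1 makes each submodule clone shallow (history truncated to one revision).
    set -- git submodule update --init --recursive --depth 1 -- "$@"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}
```

For example, `DRY_RUN=1 shallow_submodule_update cub scikit-learn` prints the command that would be run for those two submodules.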

@mmalohlava
Member

Sounds like a good solution to me. If I understand correctly, you propose to do a shallow clone with submodules, and then look into a "cache" (whatever it is) to find, for example, xgboost-.zip instead of doing a git checkout of the submodule.

Furthermore, we should modify the Jenkinsfile to do the following (CC: @anmol):

  • do an explicit checkout of the SCM at the beginning of the job run
  • retry on checkout failure, with a reasonable timeout
  • zip the workspace folder and pass it between the stages (see the XGBoost Jenkinsfile)
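
The "retry on checkout failure + reasonable timeout" item can be sketched at the shell level as below. This is a minimal sketch, not the Jenkinsfile change itself: `retry_with_timeout` and `RETRY_DELAY` are hypothetical names, and it assumes the coreutils `timeout` command is available on the build node.

```shell
# Hypothetical helper for illustration only.
# Usage: retry_with_timeout ATTEMPTS SECONDS COMMAND [ARGS...]
# Runs COMMAND under coreutils `timeout` and retries on any non-zero exit,
# including the exit status `timeout` reports when the limit is reached.
retry_with_timeout() {
    _rt_attempts=$1
    _rt_limit=$2
    shift 2
    _rt_n=1
    while [ "$_rt_n" -le "$_rt_attempts" ]; do
        if timeout "$_rt_limit" "$@"; then
            return 0
        fi
        echo "attempt $_rt_n/$_rt_attempts failed: $*" >&2
        _rt_n=$((_rt_n + 1))
        # Pause between attempts (default 5 seconds), but not after the last one.
        if [ "$_rt_n" -le "$_rt_attempts" ]; then
            sleep "${RETRY_DELAY:-5}"
        fi
    done
    return 1
}
```

A call such as `retry_with_timeout 3 600 git submodule update --init --recursive scikit-learn` would give each attempt 10 minutes and try up to three times.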

@mdymczyk
Contributor

Yes, it is very annoying. @abal5, is it because of some Jenkins setup, or is the network/node just slow?

@pseudotensor
Collaborator Author

It's because the scikit-learn repo is very large (around 100MB) and GitHub throttles us. It happens with cub sometimes too. A shallow clone will help, and it works as long as we use our own repo that we control (otherwise we lose track of the remote repo, and the shallow clone breaks as HEAD moves to new commits).
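
To illustrate the pinned-commit point: a shallow fetch of one specific commit does not depend on where the branch head currently is, provided the server allows fetching by commit ID. A minimal sketch under that assumption; `fetch_pinned_commit` is a hypothetical name, and the URL, commit ID and directory are supplied by the caller.

```shell
# Hypothetical helper for illustration only.
# Usage: fetch_pinned_commit REPO_URL COMMIT_ID TARGET_DIR
# Assumes the server accepts fetch requests for a commit ID.
fetch_pinned_commit() {
    # $1 = repository URL, $2 = full commit ID, $3 = target directory
    git init -q "$3" &&
    # Fetch only the requested commit, without its history.
    git -C "$3" fetch -q --depth 1 "$1" "$2" &&
    # Check out the fetched commit as a detached HEAD.
    git -C "$3" checkout -q --detach FETCH_HEAD
}
```

Because only one commit is transferred, the download is much smaller than a full clone of a large repo.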

@pseudotensor
Collaborator Author

If I knew where the Jenkins setup lives (where all the clone commands are), I could fix things myself. I don't see it in the Docker or Jenkins files in the repo. Is it stored somewhere else?

@mdymczyk
Contributor

mdymczyk commented Sep 12, 2017

@pseudotensor @abal5 will know for sure, but I think we're using the Git checkout plugin for it; it's in the Jenkinsfile:

checkout([
    $class                           : 'GitSCM',
    branches                         : scm.branches,
    doGenerateSubmoduleConfigurations: false,
    extensions                       : scm.extensions + [[$class: 'SubmoduleOption', disableSubmodules: false, recursiveSubmodules: true, reference: '', trackingSubmodules: false]],
    submoduleCfg                     : [],
    userRemoteConfigs                : scm.userRemoteConfigs
])

I'm not very knowledgeable about it, though. We probably need to experiment with the extensions option.

@pseudotensor
Collaborator Author

But where is the actual repo specified, etc.? I can try setting recursiveSubmodules: false and using scripts/gitshallow_submodules.sh, but I'll ask @abal5 before breaking things and wasting time.

@mmalohlava
Member

@pseudotensor the repo is defined in the Jenkins job itself.

By the way, what about disabling the checkout from Jenkins entirely and running @pseudotensor's script instead?

@pseudotensor
Collaborator Author

Resolved by #167
