Abstraction over fetching data from repository management services



Project Crawler

A small project for iterating over repositories and fetching files from repository management tools such as GitHub, GitLab and Bitbucket.


Fetch the repositories for an org and download a file from it

ProjectCrawler crawler = new ProjectCrawler(OptionsBuilder.builder()
  // basing the root URL we can resolve the type of repo (e.g.
  // username to access the API
  // password to access the API
  // token to access the API
  // repository type (GITHUB, BITBUCKET, GITLAB, OTHER)
// get the repos from the org
List<Repository> repositories = crawler.repositories(org);
repositories.each { Repository repo ->
  // fetch a file from the repository
  String file = crawler.fileContent(org,, repo.requestedBranch, "path/to/file.txt")
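
The comments in the options above mention that the repository type can be resolved from the root URL. As a rough illustration of that idea only, here is a minimal sketch; the `RepoTypeSketch` class and the `guessType` helper are hypothetical names that are not part of Project Crawler, and the host-matching rule is an assumption rather than the library's actual logic.

```java
import java.net.URI;
import java.util.Locale;

public class RepoTypeSketch {

    // Mirrors the repository types named in the options comments above.
    public enum RepoType { GITHUB, GITLAB, BITBUCKET, OTHER }

    // Hypothetical helper: guesses the type from the host part of the root URL.
    // The real library may resolve the type differently.
    public static RepoType guessType(String rootUrl) {
        String host = URI.create(rootUrl).getHost();
        if (host == null) {
            return RepoType.OTHER;
        }
        String h = host.toLowerCase(Locale.ROOT);
        if (h.contains("github")) {
            return RepoType.GITHUB;
        }
        if (h.contains("gitlab")) {
            return RepoType.GITLAB;
        }
        if (h.contains("bitbucket")) {
            return RepoType.BITBUCKET;
        }
        // Unknown hosts fall back to OTHER, so the type has to be set explicitly.
        return RepoType.OTHER;
    }

    public static void main(String[] args) {
        System.out.println(guessType("https://github.com"));
        System.out.println(guessType("https://scm.example.com"));
    }
}
```

A self-hosted instance whose host name does not contain the tool name would fall through to `OTHER` in this sketch, which is one reason an explicit repository type option is useful.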

For BitBucket:

Adding your own implementation

If you’re using a tool other than GitHub, GitLab or Bitbucket, you can write an implementation that integrates with that tool.

We use the standard Java ServiceLoader mechanism to load any extensions. The interface to implement is RepositoryManagementBuilder.

To do that, create a file called META-INF/services/io.cloudpipelines.projectcrawler.RepositoryManagementBuilder that is accessible on the classpath. The file should contain a line with the fully qualified name of your implementation class (e.g. com.example.TestRepositoryManagementBuilder).
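
To show how ServiceLoader discovery works in general, here is a minimal, self-contained sketch. It does not use the real RepositoryManagementBuilder interface, whose methods are not shown in this README; `ToolBuilder`, `ExampleToolBuilder` and `discover` are hypothetical stand-ins. The sketch writes a provider-configuration file into a temporary directory at runtime. ServiceLoader reads such files from META-INF/services/, named after the fully qualified name of the interface. In a real extension the file would normally be shipped inside your jar.

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.ServiceLoader;

public class ServiceLoaderSketch {

    // Hypothetical stand-in for the extension interface.
    public interface ToolBuilder {
        String toolName();
    }

    // Hypothetical implementation; ServiceLoader needs a public no-arg constructor.
    public static class ExampleToolBuilder implements ToolBuilder {
        public ExampleToolBuilder() { }

        @Override
        public String toolName() {
            return "example-tool";
        }
    }

    // Registers ExampleToolBuilder through a provider-configuration file
    // and returns the names of all implementations ServiceLoader finds.
    public static List<String> discover() throws IOException {
        Path root = Files.createTempDirectory("spi-sketch");
        Path services = Files.createDirectories(root.resolve("META-INF").resolve("services"));
        // File name = fully qualified interface name; content = implementation class name.
        Files.write(services.resolve(ToolBuilder.class.getName()),
                ExampleToolBuilder.class.getName().getBytes(StandardCharsets.UTF_8));

        List<String> names = new ArrayList<>();
        // Put the temporary directory on a class loader's search path.
        try (URLClassLoader loader = new URLClassLoader(
                new URL[] { root.toUri().toURL() },
                ServiceLoaderSketch.class.getClassLoader())) {
            for (ToolBuilder builder : ServiceLoader.load(ToolBuilder.class, loader)) {
                names.add(builder.toolName());
            }
        }
        return names;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(discover());
    }
}
```

For an actual extension, the same two pieces apply: a class implementing RepositoryManagementBuilder, and a provider-configuration file on the classpath that lists that class by its fully qualified name.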
