Unify all gems into one, generalize resources.

Separate scraper into two distinct phases: retrieval and scraping.
Retrieval fetches the files to local disk while scraping analyzes
the repository content. It's now possible to have scrapers that
scrape different kinds of resources (cookbooks and workflows so far).
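
The split described above can be pictured with a small sketch. This is not the gem's code: SketchRetriever, SketchScraper, demo_two_phases and the 'workflow.def' marker file name are hypothetical names chosen for illustration, and retrieval is simulated by writing an in-memory hash to a temporary directory instead of running git, svn or a download.

```ruby
require 'fileutils'
require 'pathname'
require 'tmpdir'

# Phase 1 (retrieval): put repository content on local disk.
# Hypothetical stand-in for a real retriever.
class SketchRetriever
  def initialize(basedir)
    @basedir = basedir
  end

  # Writes each relative path => content pair under basedir and
  # returns the directory that now holds the "checkout".
  def retrieve(files)
    files.each do |relative_path, content|
      target = File.join(@basedir, relative_path)
      FileUtils.mkdir_p(File.dirname(target))
      File.write(target, content)
    end
    @basedir
  end
end

# Phase 2 (scraping): analyze what retrieval left on disk.
# One scraper per kind of resource; the marker file decides the kind.
class SketchScraper
  def initialize(marker)
    @marker = marker
  end

  # Returns the sorted relative directories that contain the marker file.
  def scrape(dir)
    root = Pathname.new(dir)
    Dir.glob(File.join(dir, '**', @marker)).map do |path|
      Pathname.new(path).dirname.relative_path_from(root).to_s
    end.sort
  end
end

# Runs one retrieval, then two independent scrapes over the same files.
def demo_two_phases
  Dir.mktmpdir do |basedir|
    checkout = SketchRetriever.new(basedir).retrieve(
      'cookbooks/app/metadata.json'   => '{}',
      'cookbooks/db/metadata.json'    => '{}',
      'workflows/deploy/workflow.def' => ''  # assumed marker name
    )
    {
      :cookbooks => SketchScraper.new('metadata.json').scrape(checkout),
      :workflows => SketchScraper.new('workflow.def').scrape(checkout)
    }
  end
end
```

The point of the sketch is that the checkout happens once and any number of scrapers can then run over the same directory, which is what makes a second resource kind cheap to add.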
commit 3a12a6ba571215176109113ab5ee587c9e9c3e49 1 parent c17a406
@raphael raphael authored
Showing with 2,676 additions and 4,973 deletions.
  1. +2 −3 Gemfile
  2. +30 −25 Gemfile.lock
  3. +1 −1  README.rdoc
  4. +7 −10 Rakefile
  5. +32 −18 right_scraper_base/lib/right_scraper_base.rb → lib/right_scraper.rb
  6. +6 −6 {right_scraper_base/lib/right_scraper_base → lib/right_scraper}/builders/base.rb
  7. +13 −12 {right_scraper_base/lib/right_scraper_base → lib/right_scraper}/builders/filesystem.rb
  8. +6 −6 {right_scraper_base/lib/right_scraper_base → lib/right_scraper}/builders/union.rb
  9. +2 −2 {right_scraper_base/lib/right_scraper_base → lib/right_scraper}/logger.rb
  10. 0  {right_scraper_base/lib/right_scraper_base → lib/right_scraper}/loggers/noisy.rb
  11. 0  {right_scraper_git/lib/right_scraper_git → lib/right_scraper}/processes/ssh.rb
  12. +299 −0 lib/right_scraper/repositories/base.rb
  13. +19 −12 {right_scraper_base/lib/right_scraper_base → lib/right_scraper}/repositories/download.rb
  14. +27 −16 {right_scraper_git/lib/right_scraper_git → lib/right_scraper}/repositories/git.rb
  15. +4 −4 {right_scraper_base/lib/right_scraper_base → lib/right_scraper}/repositories/mock.rb
  16. +29 −18 {right_scraper_svn/lib/right_scraper_svn → lib/right_scraper}/repositories/svn.rb
  17. +70 −0 lib/right_scraper/resources/base.rb
  18. +10 −2 right_scraper_s3/lib/right_scraper_s3.rb → lib/right_scraper/resources/cookbook.rb
  19. +27 −13 ...r_libcurl/lib/right_scraper_libcurl/repositories/download.rb → lib/right_scraper/resources/workflow.rb
  20. +114 −0 lib/right_scraper/retrievers/base.rb
  21. +10 −14 {right_scraper_base/lib/right_scraper_base/scrapers → lib/right_scraper/retrievers}/checkout.rb
  22. +13 −11 ...se/lib/right_scraper_base/scrapers/command_line_download.rb → lib/right_scraper/retrievers/download.rb
  23. +11 −12 {right_scraper_git/lib/right_scraper_git/scrapers → lib/right_scraper/retrievers}/git.rb
  24. +10 −8 {right_scraper_svn/lib/right_scraper_svn/scrapers → lib/right_scraper/retrievers}/svn.rb
  25. +15 −17 {right_scraper_base/lib/right_scraper_base → lib/right_scraper}/scanners/base.rb
  26. +7 −7 ...per_base/lib/right_scraper_base/scanners/manifest.rb → lib/right_scraper/scanners/cookbook_manifest.rb
  27. +3 −4 ...per_base/lib/right_scraper_base/scanners/metadata.rb → lib/right_scraper/scanners/cookbook_metadata.rb
  28. +4 −5 right_scraper_s3/lib/right_scraper_s3/scanners/s3.rb → lib/right_scraper/scanners/cookbook_s3_upload.rb
  29. +11 −11 {right_scraper_base/lib/right_scraper_base → lib/right_scraper}/scanners/union.rb
  30. +86 −0 lib/right_scraper/scanners/workflow_manifest.rb
  31. +70 −0 lib/right_scraper/scanners/workflow_metadata.rb
  32. +85 −0 lib/right_scraper/scanners/workflow_s3_upload.rb
  33. +45 −70 {right_scraper_base/lib/right_scraper_base → lib/right_scraper}/scraper.rb
  34. +35 −8 right_scraper_git/spec/spec_helper.rb → lib/right_scraper/scraper_logger.rb
  35. +262 −0 lib/right_scraper/scrapers/base.rb
  36. +73 −0 lib/right_scraper/scrapers/cookbook.rb
  37. +88 −0 lib/right_scraper/scrapers/workflow.rb
  38. +4 −4 {right_scraper_svn/lib/right_scraper_svn → lib/right_scraper}/svn_client.rb
  39. +1 −1  {right_scraper_base/lib/right_scraper_base → lib/right_scraper}/version.rb
  40. +10 −9 right_scraper_all/right_scraper_all.gemspec → right_scraper.gemspec
  41. +13 −0 right_scraper.rconf
  42. +0 −20 right_scraper/LICENSE
  43. +0 −62 right_scraper/README.rdoc
  44. +0 −80 right_scraper/Rakefile
  45. +0 −28 right_scraper/lib/right_scraper.rb
  46. +0 −62 right_scraper/right_scraper.gemspec
  47. +0 −67 right_scraper/spec/scraper_spec.rb
  48. +0 −28 right_scraper/spec/spec_helper.rb
  49. +0 −1  right_scraper_all/.gitignore
  50. +0 −20 right_scraper_all/LICENSE
  51. +0 −62 right_scraper_all/README.rdoc
  52. +0 −66 right_scraper_all/Rakefile
  53. +0 −20 right_scraper_base/LICENSE
  54. +0 −102 right_scraper_base/README.rdoc
  55. +0 −80 right_scraper_base/Rakefile
  56. +0 −67 right_scraper_base/lib/right_scraper_base/builders/archive.rb
  57. +0 −209 right_scraper_base/lib/right_scraper_base/cookbook.rb
  58. +0 −286 right_scraper_base/lib/right_scraper_base/repository.rb
  59. +0 −162 right_scraper_base/lib/right_scraper_base/scrapers/base.rb
  60. +0 −284 right_scraper_base/lib/right_scraper_base/scrapers/filesystem.rb
  61. +0 −62 right_scraper_base/right_scraper_base.gemspec
  62. +0 −45 right_scraper_base/spec/full_scraper_helper.rb
  63. +0 −20 right_scraper_git/LICENSE
  64. +0 −72 right_scraper_git/README.rdoc
  65. +0 −80 right_scraper_git/Rakefile
  66. +0 −28 right_scraper_git/lib/right_scraper_git.rb
  67. +0 −61 right_scraper_git/right_scraper_git.gemspec
  68. +0 −466 right_scraper_git/spec/git_scraper_spec.rb
  69. +0 −20 right_scraper_libcurl/LICENSE
  70. +0 −72 right_scraper_libcurl/README.rdoc
  71. +0 −80 right_scraper_libcurl/Rakefile
  72. +0 −25 right_scraper_libcurl/lib/right_scraper_libcurl.rb
  73. +0 −132 right_scraper_libcurl/lib/right_scraper_libcurl/scrapers/download.rb
  74. +0 −62 right_scraper_libcurl/right_scraper_libcurl.gemspec
  75. +0 −119 right_scraper_libcurl/spec/libcurl_download_scraper_spec.rb
  76. +0 −75 right_scraper_libcurl/spec/libcurl_download_scraper_spec_helper.rb
  77. +0 −57 right_scraper_libcurl/spec/repository_spec.rb
  78. +0 −101 right_scraper_libcurl/spec/scraper_spec.rb
  79. +0 −29 right_scraper_libcurl/spec/spec_helper.rb
  80. +0 −20 right_scraper_s3/LICENSE
  81. +0 −74 right_scraper_s3/README.rdoc
  82. +0 −80 right_scraper_s3/Rakefile
  83. +0 −60 right_scraper_s3/right_scraper_s3.gemspec
  84. +0 −39 right_scraper_s3/spec/spec_helper.rb
  85. +0 −20 right_scraper_svn/LICENSE
  86. +0 −86 right_scraper_svn/README.rdoc
  87. +0 −80 right_scraper_svn/Rakefile
  88. +0 −26 right_scraper_svn/lib/right_scraper_svn.rb
  89. +0 −61 right_scraper_svn/right_scraper_svn.gemspec
  90. +0 −178 right_scraper_svn/spec/cookbook_spec.rb
  91. +0 −29 right_scraper_svn/spec/spec_helper.rb
  92. 0  {right_scraper_git → }/scripts/stub_ssh_askpass
  93. +13 −13 {right_scraper_base → }/spec/builder_spec.rb
  94. +13 −29 {right_scraper_base/spec/cookbooks → spec}/cookbook_helper.rb
  95. +13 −14 right_scraper_base/spec/manifest_spec.rb → spec/cookbook_manifest_spec.rb
  96. +14 −16 right_scraper_s3/spec/s3_upload_spec.rb → spec/cookbook_s3_upload_spec.rb
  97. +21 −20 ...er_base/spec/download/command_line_download_scraper_spec.rb → spec/download/download_retriever_spec.rb
  98. +14 −18 ...ownload/command_line_download_scraper_spec_helper.rb → spec/download/download_retriever_spec_helper.rb
  99. +14 −49 right_scraper_base/spec/cookbooks/command_line_download_spec.rb → spec/download/download_spec.rb
  100. +31 −25 {right_scraper_base → }/spec/download/multi_dir_spec.rb
  101. +4 −4 {right_scraper_base → }/spec/download/multi_dir_spec_helper.rb
  102. +51 −83 right_scraper_git/spec/git_spec.rb → spec/git/cookbook_spec.rb
  103. 0  {right_scraper_git/spec → spec/git}/demokey
  104. 0  {right_scraper_git/spec → spec/git}/demokey.pub
  105. 0  {right_scraper_git/spec → spec/git}/password_key
  106. 0  {right_scraper_git/spec → spec/git}/password_key.pub
  107. +6 −6 {right_scraper_git/spec → spec/git}/repository_spec.rb
  108. +505 −0 spec/git/retriever_spec.rb
  109. +19 −10 right_scraper_git/spec/git_scraper_spec_helper.rb → spec/git/retriever_spec_helper.rb
  110. +36 −33 {right_scraper_git/spec → spec/git}/scraper_spec.rb
  111. +3 −4 {right_scraper_git/spec → spec/git}/ssh_spec.rb
  112. +4 −4 {right_scraper_git/spec → spec/git}/url_spec.rb
  113. +2 −3 {right_scraper_base → }/spec/logger_spec.rb
  114. +7 −8 {right_scraper_base → }/spec/repository_spec.rb
  115. +2 −2 right_scraper_base/spec/scraper_spec_helper_base.rb → spec/retriever_spec_helper.rb
  116. +12 −9 {right_scraper_base → }/spec/scanner_spec.rb
  117. +39 −20 {right_scraper_base → }/spec/scraper_helper.rb
  118. +43 −74 {right_scraper_base → }/spec/scraper_spec.rb
  119. +28 −6 {right_scraper_base → }/spec/spec_helper.rb
  120. +97 −0 spec/svn/cookbook_spec.rb
  121. +11 −11 {right_scraper_svn/spec → spec/svn}/multi_svn_spec.rb
  122. +4 −4 {right_scraper_svn/spec → spec/svn}/multi_svn_spec_helper.rb
  123. +3 −3 {right_scraper_svn/spec → spec/svn}/repository_spec.rb
  124. +64 −64 right_scraper_svn/spec/svn_scraper_spec.rb → spec/svn/retriever_spec.rb
  125. +30 −29 {right_scraper_svn/spec → spec/svn}/scraper_spec.rb
  126. +7 −7 right_scraper_svn/spec/svn_scraper_spec_helper.rb → spec/svn/svn_retriever_spec_helper.rb
  127. +4 −4 {right_scraper_svn/spec → spec/svn}/url_spec.rb
  128. +8 −7 {right_scraper_base → }/spec/url_spec.rb
5 Gemfile
@@ -2,7 +2,7 @@ source "http://rubygems.org"
gem "json", "~> 1.4.5"
gem "blackwinter-git", "~> 1.2.7"
-gem "libarchive", "~> 0.1.1"
+gem "libarchive", "~> 0.1.2"
gem "curb", "~> 0.7.7.1"
gem "right_aws", "~> 2.0"
gem "process_watcher", "~> 0.3"
@@ -11,6 +11,5 @@ group :development do
gem "rspec", "~> 2.3"
gem "rake"
gem "flexmock"
- gem "rtags"
- gem "ruby-debug"
+ gem "ruby-debug19"
end
55 Gemfile.lock
@@ -1,33 +1,39 @@
GEM
remote: http://rubygems.org/
specs:
+ archive-tar-minitar (0.5.2)
blackwinter-git (1.2.7)
- columnize (0.3.2)
+ columnize (0.3.4)
curb (0.7.7.1)
diff-lcs (1.1.2)
- flexmock (0.8.8)
+ flexmock (0.9.0)
json (1.4.6)
- libarchive (0.1.1)
- linecache (0.43)
- process_watcher (0.3)
- rake (0.8.7)
- right_aws (2.0.0)
- right_http_connection (>= 1.2.1)
- right_http_connection (1.2.4)
- rspec (2.3.0)
- rspec-core (~> 2.3.0)
- rspec-expectations (~> 2.3.0)
- rspec-mocks (~> 2.3.0)
- rspec-core (2.3.1)
- rspec-expectations (2.3.0)
+ libarchive (0.1.2)
+ linecache19 (0.5.12)
+ ruby_core_source (>= 0.1.4)
+ process_watcher (0.4)
+ rake (0.9.2)
+ right_aws (2.1.0)
+ right_http_connection (>= 1.2.5)
+ right_http_connection (1.3.0)
+ rspec (2.6.0)
+ rspec-core (~> 2.6.0)
+ rspec-expectations (~> 2.6.0)
+ rspec-mocks (~> 2.6.0)
+ rspec-core (2.6.4)
+ rspec-expectations (2.6.0)
diff-lcs (~> 1.1.2)
- rspec-mocks (2.3.0)
- rtags (0.97)
- ruby-debug (0.10.4)
- columnize (>= 0.1)
- ruby-debug-base (~> 0.10.4.0)
- ruby-debug-base (0.10.4)
- linecache (>= 0.3)
+ rspec-mocks (2.6.0)
+ ruby-debug-base19 (0.11.25)
+ columnize (>= 0.3.1)
+ linecache19 (>= 0.5.11)
+ ruby_core_source (>= 0.1.4)
+ ruby-debug19 (0.11.6)
+ columnize (>= 0.3.1)
+ linecache19 (>= 0.5.11)
+ ruby-debug-base19 (>= 0.11.19)
+ ruby_core_source (0.1.5)
+ archive-tar-minitar (>= 0.5.2)
PLATFORMS
ruby
@@ -37,10 +43,9 @@ DEPENDENCIES
curb (~> 0.7.7.1)
flexmock
json (~> 1.4.5)
- libarchive (~> 0.1.1)
+ libarchive (~> 0.1.2)
process_watcher (~> 0.3)
rake
right_aws (~> 2.0)
rspec (~> 2.3)
- rtags
- ruby-debug
+ ruby-debug19
2  README.rdoc
@@ -17,7 +17,7 @@ cost of requiring some systems administration work external to Ruby.
require 'rubygems'
require 'right_scraper'
- scraper = RightScale::Scraper.new('/tmp')
+ scraper = RightScale::Scraper.new(:basedir => '/tmp', :kind => :cookbook)
scraper.scrape(:type => :git, :url => 'git://github.com/rightscale/right_scraper.git')
== INSTALLATION
17 Rakefile
@@ -1,5 +1,5 @@
#-- -*-ruby-*-
-# Copyright: Copyright (c) 2010 RightScale, Inc.
+# Copyright: Copyright (c) 2010-2011 RightScale, Inc.
#
# Permission is hereby granted, free of charge, to any person obtaining
# a copy of this software and associated documentation files (the
@@ -27,8 +27,8 @@ require 'bundler/setup'
require 'fileutils'
require 'rake'
require 'rspec/core/rake_task'
-require 'rake/rdoctask'
-require 'rake/gempackagetask'
+require 'rdoc/task'
+require 'rubygems/package_task'
require 'rake/clean'
task :default => 'spec'
@@ -44,7 +44,6 @@ task :gem => 'pkg' do
end
CLEAN.include('pkg')
-CLEAN.include('right_scraper_all/lib')
# == Unit Tests == #
@@ -54,7 +53,8 @@ task :specs => :spec
desc 'Run unit tests'
RSpec::Core::RakeTask.new do |t|
- t.pattern = '*/spec/**/*_spec.rb'
+ t.pattern = 'spec/**/*_spec.rb'
+ t.rspec_opts = ["--color", "--format", "nested"]
end
namespace :spec do
@@ -75,15 +75,12 @@ end
# == Documentation == #
desc "Generate API documentation to doc/rdocs/index.html"
-Rake::RDocTask.new do |rd|
+RDoc::Task.new do |rd|
rd.rdoc_dir = 'doc/rdocs'
rd.main = 'README.rdoc'
- rd.rdoc_files.include 'README.rdoc', '*/README.rdoc', "*/lib/**/*.rb"
+ rd.rdoc_files.include 'README.rdoc', 'lib/**/*.rb'
- rd.options << '--inline-source'
- rd.options << '--line-numbers'
rd.options << '--all'
- rd.options << '--fileboxes'
rd.options << '--diagram'
end
50 right_scraper_base/lib/right_scraper_base.rb → lib/right_scraper.rb
@@ -1,5 +1,5 @@
#--
-# Copyright: Copyright (c) 2010 RightScale, Inc.
+# Copyright: Copyright (c) 2010-2011 RightScale, Inc.
#
# Permission is hereby granted, free of charge, to any person obtaining
# a copy of this software and associated documentation files (the
@@ -23,20 +23,34 @@
# Explicitly list required files to make IDEs happy
require 'fileutils'
-require File.expand_path(File.join(File.dirname(__FILE__), 'right_scraper_base', 'builders', 'base'))
-require File.expand_path(File.join(File.dirname(__FILE__), 'right_scraper_base', 'builders', 'archive'))
-require File.expand_path(File.join(File.dirname(__FILE__), 'right_scraper_base', 'builders', 'filesystem'))
-require File.expand_path(File.join(File.dirname(__FILE__), 'right_scraper_base', 'builders', 'union'))
-require File.expand_path(File.join(File.dirname(__FILE__), 'right_scraper_base', 'cookbook'))
-require File.expand_path(File.join(File.dirname(__FILE__), 'right_scraper_base', 'logger'))
-require File.expand_path(File.join(File.dirname(__FILE__), 'right_scraper_base', 'repository'))
-require File.expand_path(File.join(File.dirname(__FILE__), 'right_scraper_base', 'repositories', 'download'))
-require File.expand_path(File.join(File.dirname(__FILE__), 'right_scraper_base', 'scanners', 'base'))
-require File.expand_path(File.join(File.dirname(__FILE__), 'right_scraper_base', 'scanners', 'manifest'))
-require File.expand_path(File.join(File.dirname(__FILE__), 'right_scraper_base', 'scanners', 'metadata'))
-require File.expand_path(File.join(File.dirname(__FILE__), 'right_scraper_base', 'scanners', 'union'))
-require File.expand_path(File.join(File.dirname(__FILE__), 'right_scraper_base', 'scraper'))
-require File.expand_path(File.join(File.dirname(__FILE__), 'right_scraper_base', 'scrapers', 'base'))
-require File.expand_path(File.join(File.dirname(__FILE__), 'right_scraper_base', 'scrapers', 'checkout'))
-require File.expand_path(File.join(File.dirname(__FILE__), 'right_scraper_base', 'scrapers', 'filesystem'))
-require File.expand_path(File.join(File.dirname(__FILE__), 'right_scraper_base', 'scrapers', 'command_line_download'))
+require File.expand_path(File.join(File.dirname(__FILE__), 'right_scraper', 'builders', 'base'))
+require File.expand_path(File.join(File.dirname(__FILE__), 'right_scraper', 'builders', 'filesystem'))
+require File.expand_path(File.join(File.dirname(__FILE__), 'right_scraper', 'builders', 'union'))
+require File.expand_path(File.join(File.dirname(__FILE__), 'right_scraper', 'logger'))
+require File.expand_path(File.join(File.dirname(__FILE__), 'right_scraper', 'processes', 'ssh'))
+require File.expand_path(File.join(File.dirname(__FILE__), 'right_scraper', 'repositories', 'base'))
+require File.expand_path(File.join(File.dirname(__FILE__), 'right_scraper', 'repositories', 'download'))
+require File.expand_path(File.join(File.dirname(__FILE__), 'right_scraper', 'repositories', 'git'))
+require File.expand_path(File.join(File.dirname(__FILE__), 'right_scraper', 'repositories', 'svn'))
+require File.expand_path(File.join(File.dirname(__FILE__), 'right_scraper', 'resources', 'base'))
+require File.expand_path(File.join(File.dirname(__FILE__), 'right_scraper', 'resources', 'cookbook'))
+require File.expand_path(File.join(File.dirname(__FILE__), 'right_scraper', 'resources', 'workflow'))
+require File.expand_path(File.join(File.dirname(__FILE__), 'right_scraper', 'retrievers', 'base'))
+require File.expand_path(File.join(File.dirname(__FILE__), 'right_scraper', 'retrievers', 'checkout'))
+require File.expand_path(File.join(File.dirname(__FILE__), 'right_scraper', 'retrievers', 'download'))
+require File.expand_path(File.join(File.dirname(__FILE__), 'right_scraper', 'retrievers', 'git'))
+require File.expand_path(File.join(File.dirname(__FILE__), 'right_scraper', 'retrievers', 'svn'))
+require File.expand_path(File.join(File.dirname(__FILE__), 'right_scraper', 'scanners', 'base'))
+require File.expand_path(File.join(File.dirname(__FILE__), 'right_scraper', 'scanners', 'cookbook_manifest'))
+require File.expand_path(File.join(File.dirname(__FILE__), 'right_scraper', 'scanners', 'cookbook_metadata'))
+require File.expand_path(File.join(File.dirname(__FILE__), 'right_scraper', 'scanners', 'cookbook_s3_upload'))
+require File.expand_path(File.join(File.dirname(__FILE__), 'right_scraper', 'scanners', 'union'))
+require File.expand_path(File.join(File.dirname(__FILE__), 'right_scraper', 'scanners', 'workflow_manifest'))
+require File.expand_path(File.join(File.dirname(__FILE__), 'right_scraper', 'scanners', 'workflow_metadata'))
+require File.expand_path(File.join(File.dirname(__FILE__), 'right_scraper', 'scraper'))
+require File.expand_path(File.join(File.dirname(__FILE__), 'right_scraper', 'scraper_logger'))
+require File.expand_path(File.join(File.dirname(__FILE__), 'right_scraper', 'scrapers', 'base'))
+require File.expand_path(File.join(File.dirname(__FILE__), 'right_scraper', 'scrapers', 'cookbook'))
+require File.expand_path(File.join(File.dirname(__FILE__), 'right_scraper', 'scrapers', 'workflow'))
+require File.expand_path(File.join(File.dirname(__FILE__), 'right_scraper', 'svn_client'))
+require File.expand_path(File.join(File.dirname(__FILE__), 'right_scraper', 'version'))
12 ...aper_base/lib/right_scraper_base/builders/base.rb → lib/right_scraper/builders/base.rb
@@ -1,5 +1,5 @@
#--
-# Copyright: Copyright (c) 2010 RightScale, Inc.
+# Copyright: Copyright (c) 2010-2011 RightScale, Inc.
#
# Permission is hereby granted, free of charge, to any person obtaining
# a copy of this software and associated documentation files (the
@@ -31,7 +31,7 @@ module Builders
#
# The lifecycle for a builder is as follows:
# - builder = Builder.new (once)
- # - builder.go(dir, cookbook) (many times)
+ # - builder.go(dir, resource) (many times)
# - builder.finish (once)
class Builder
# Create a new Builder. Recognizes options as given. Some
@@ -47,12 +47,12 @@ def initialize(options={})
@logger = options.fetch(:logger, Logger.new)
end
- # Run builder for this cookbook.
+ # Run builder for this resource.
#
# === Parameters
- # dir(String):: directory cookbook exists at
- # cookbook(RightScraper::Cookbook):: cookbook instance being built
- def go(dir, cookbook)
+ # dir(String):: directory resource exists at
+ # resource(Object):: resource instance being built
+ def go(dir, resource)
end
# Notification that all scans for this repository have
25 ...ase/lib/right_scraper_base/builders/filesystem.rb → lib/right_scraper/builders/filesystem.rb
@@ -1,5 +1,5 @@
#--
-# Copyright: Copyright (c) 2010 RightScale, Inc.
+# Copyright: Copyright (c) 2010-2011 RightScale, Inc.
#
# Permission is hereby granted, free of charge, to any person obtaining
# a copy of this software and associated documentation files (the
@@ -27,20 +27,21 @@ module RightScraper
module Builders
# Build metadata by scanning the filesystem.
class Filesystem < Builder
+
# Create a new filesystem scanner. In addition to the options
- # recognized by Builder, this class recognizes <tt>:scraper</tt> and
+ # recognized by Builder, this class recognizes <tt>:retriever</tt> and
# <tt>:scanner</tt>.
#
# === Options
- # <tt>:scraper</tt>:: Required. FilesystemBasedScraper currently being used
# <tt>:scanner</tt>:: Required. Scanner currently being used
+ # <tt>:ignorable_paths</tt>:: Ignore directories whose names belong to this list
#
# === Parameters
# options(Hash):: scraper options
def initialize(options={})
super
- @scraper = options.fetch(:scraper)
@scanner = options.fetch(:scanner)
+ @ignorable_paths = options[:ignorable_paths]
end
# Tell the scanner we're done.
@@ -49,16 +50,16 @@ def finish
@scanner.finish
end
- # Run builder for this cookbook.
+ # Run builder for this resource.
#
# === Parameters
- # dir(String):: directory cookbook exists at
- # cookbook(RightScraper::Cookbook):: cookbook instance being built
- def go(dir, cookbook)
+ # dir(String):: directory resource exists at
+ # resource(Object):: resource instance being built
+ def go(dir, resource)
@logger.operation(:scanning_filesystem, "rooted at #{dir}") do
- @scanner.begin(cookbook)
+ @scanner.begin(resource)
maybe_scan(Dir.new(dir), nil)
- @scanner.end(cookbook)
+ @scanner.end(resource)
end
end
@@ -72,11 +73,11 @@ def maybe_scan(directory, position)
#
# === Parameters
# directory(Dir):: directory to scan
- # position(String):: relative pathname for _directory_ from root of cookbook
+ # position(String):: relative pathname for _directory_ from root of resource
def scan(directory, position)
directory.each do |entry|
next if entry == '.' || entry == '..'
- next if @scraper.ignorable?(entry)
+ next if @ignorable_paths && @ignorable_paths.include?(entry)
fullpath = File.join(directory.path, entry)
relative_position = position ? File.join(position, entry) : entry
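
The effect of the new :ignorable_paths option on the directory walk can be sketched in isolation. The helper name each_resource_file is hypothetical and the scanner callbacks are replaced by a plain block; only the two "next if" guards are taken from the diff above. Note that the check compares bare entry names, so an ignored name is skipped at any depth of the tree.

```ruby
require 'fileutils'
require 'tmpdir'

# Recursively yields the relative path of every file under +directory+,
# skipping any entry whose name appears in +ignorable_paths+.
# Hypothetical helper; not part of the gem.
def each_resource_file(directory, ignorable_paths = nil, position = nil, &block)
  Dir.new(directory).each do |entry|
    next if entry == '.' || entry == '..'
    # Same guard as the diff: a nil list means nothing is ignored.
    next if ignorable_paths && ignorable_paths.include?(entry)

    fullpath = File.join(directory, entry)
    relative_position = position ? File.join(position, entry) : entry

    if File.directory?(fullpath)
      each_resource_file(fullpath, ignorable_paths, relative_position, &block)
    else
      yield relative_position
    end
  end
end
```

Passing the list in as an option, rather than asking the scraper through ignorable?, means the builder no longer needs a reference to the object that fetched the files.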
12 ...per_base/lib/right_scraper_base/builders/union.rb → lib/right_scraper/builders/union.rb
@@ -1,5 +1,5 @@
#--
-# Copyright: Copyright (c) 2010 RightScale, Inc.
+# Copyright: Copyright (c) 2010-2011 RightScale, Inc.
#
# Permission is hereby granted, free of charge, to any person obtaining
# a copy of this software and associated documentation files (the
@@ -38,13 +38,13 @@ def initialize(classes, options={})
@subbuilders = classes.map {|klass| klass.new(options)}
end
- # Run each builder for this cookbook.
+ # Run each builder for this resource.
#
# === Parameters
- # dir(String):: directory cookbook exists at
- # cookbook(RightScraper::Cookbook):: cookbook instance being built
- def go(dir, cookbook)
- @subbuilders.each {|builder| builder.go(dir, cookbook)}
+ # dir(String):: directory resource exists at
+ # resource(RightScraper::Resources::Base):: resource instance being built
+ def go(dir, resource)
+ @subbuilders.each {|builder| builder.go(dir, resource)}
end
# Notify subbuilders that all scans for this repository have
4 right_scraper_base/lib/right_scraper_base/logger.rb → lib/right_scraper/logger.rb
@@ -1,5 +1,5 @@
#--
-# Copyright: Copyright (c) 2010 RightScale, Inc.
+# Copyright: Copyright (c) 2010-2011 RightScale, Inc.
#
# Permission is hereby granted, free of charge, to any person obtaining
# a copy of this software and associated documentation files (the
@@ -38,7 +38,7 @@ def initialize(*args)
@exceptional = false
end
- # (RightScraper::Repository) Repository currently being examined.
+ # (RightScraper::Repositories::Base) Repository currently being examined.
attr_writer :repository
# Begin an operation that merits logging. Will call #note_error
0  ...aper_base/lib/right_scraper_base/loggers/noisy.rb → lib/right_scraper/loggers/noisy.rb
File renamed without changes
0  ...craper_git/lib/right_scraper_git/processes/ssh.rb → lib/right_scraper/processes/ssh.rb
File renamed without changes
299 lib/right_scraper/repositories/base.rb
@@ -0,0 +1,299 @@
+#--
+# Copyright: Copyright (c) 2010-2011 RightScale, Inc.
+#
+# Permission is hereby granted, free of charge, to any person obtaining
+# a copy of this software and associated documentation files (the
+# 'Software'), to deal in the Software without restriction, including
+# without limitation the rights to use, copy, modify, merge, publish,
+# distribute, sublicense, and/or sell copies of the Software, and to
+# permit persons to whom the Software is furnished to do so, subject to
+# the following conditions:
+#
+# The above copyright notice and this permission notice shall be
+# included in all copies or substantial portions of the Software.
+#
+# THE SOFTWARE IS PROVIDED 'AS IS', WITHOUT WARRANTY OF ANY KIND,
+# EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+# MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
+# IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
+# CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
+# TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
+# SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
+#++
+require 'uri'
+require 'digest/sha1'
+require 'set'
+require 'socket'
+
+module RightScraper
+
+ module Repositories
+
+ # Description of remote repository that needs to be scraped.
+ #
+ # Repository definitions inherit from this base class. A repository must
+ # register its #repo_type in @@types so that they can be used with
+ # Repositories::Base::from_hash, as follows:
+ # class ARepository < Base
+ # ...
+ #
+ # # Add this repository to the list of available types.
+ # @@types[:arepository] = ARepository
+ # end
+ #
+ # Subclasses should override #repo_type, #retriever and #to_url; when
+ # sensible, #revision should also be overridden. The most important
+ # methods are #to_url, which will return a +URI+ that completely
+ # characterizes the repository, and #retriever which returns the
+ # appropriate RightScraper::Retrievers::Base to scan that repository.
+ class Base
+
+ # Initialize repository from given hash
+ # Hash keys should correspond to attributes of this class
+ #
+ # === Parameters
+ # opts(Hash):: Hash to be converted into a RightScraper::Repositories::Base instance
+ #
+ # === Return
+ # repo(RightScraper::Repositories::Base):: Resulting repository instance
+ def self.from_hash(opts)
+ repo_class = @@types[opts[:repo_type]]
+ raise "Can't understand how to make #{opts[:repo_type]} repos" if repo_class.nil?
+ repo = repo_class.new
+ unless ENV['DEVELOPMENT']
+ validate_uri opts[:url]
+ end
+ opts.each do |k, v|
+ next if k == :repo_type
+ if [:first_credential, :second_credential].include?(k) && is_useful?(v)
+ v = useful_part(v)
+ end
+ repo.__send__("#{k.to_s}=".to_sym, v)
+ end
+ repo
+ end
+
+ # (String) Human readable repository name used for progress reports
+ attr_accessor :display_name
+
+ # (Array of String) Subdirectories in the repository to search for resources
+ attr_accessor :resources_path
+
+ # (String) URL to repository (e.g 'git://github.com/rightscale/right_scraper.git')
+ attr_accessor :url
+
+ # (String) Type of the repository. Currently one of 'git', 'svn'
+ # or 'download', implemented by the appropriate subclass. Needs
+ # to be overridden by subclasses.
+ def repo_type
+ raise NotImplementedError
+ end
+
+ # (RightScraper::Retrievers::Base class) Appropriate class for retrieving this sort of
+ # repository. Needs to be overridden appropriately by subclasses.
+ #
+ # === Options
+ # <tt>:max_bytes</tt>:: Maximum number of bytes to read
+ # <tt>:max_seconds</tt>:: Maximum number of seconds to spend reading
+ # <tt>:basedir</tt>:: Destination directory, use temp dir if not specified
+ # <tt>:logger</tt>:: Logger to use
+ #
+ # === Returns
+ # retriever(Retrievers::Base):: Corresponding retriever instance
+ def retriever(options)
+ raise NotImplementedError
+ end
+
+ # Return the revision this repository is currently looking at.
+ #
+ # === Returns
+ # String:: opaque revision type
+ def revision
+ nil
+ end
+
+ # Return a unique identifier for this repository ignoring the tags
+ # to check out.
+ #
+ # === Returns
+ # String:: opaque unique ID for this repository
+ def repository_hash
+ digest("#{PROTOCOL_VERSION}\000#{repo_type}\000#{url}")
+ end
+
+ # Return a unique identifier for this revision in this repository.
+ #
+ # === Returns
+ # String:: opaque unique ID for this revision in this repository
+ def checkout_hash
+ repository_hash
+ end
+
+ # Unique representation for this repo, should resolve to the same string
+ # for repos that should be cloned in same directory
+ #
+ # === Returns
+ # res(String):: Unique representation for this repo
+ def to_s
+ res = "#{repo_type} #{url}"
+ end
+
+ # Convert this repository to a URL in the style of resource URLs.
+ #
+ # === Returns
+ # URI:: URL representing this repository
+ def to_url
+ URI.parse(url)
+ end
+
+ # Return true if this repository and +other+ represent the same
+ # repository including the same checkout tag.
+ #
+ # === Parameters
+ # other(Repositories::Base):: repository to compare with
+ #
+ # === Returns
+ # Boolean:: true iff this repository and +other+ are the same
+ def ==(other)
+ if other.is_a?(RightScraper::Repositories::Base)
+ checkout_hash == other.checkout_hash
+ else
+ false
+ end
+ end
+
+ # Return true if this repository and +other+ represent the same
+ # repository, excluding the checkout tag.
+ #
+ # === Parameters
+ # other(Repositories::Base):: repository to compare with
+ #
+ # === Returns
+ # Boolean:: true iff this repository and +other+ are the same
+ def equal_repo?(other)
+ if other.is_a?(RightScraper::Repositories::Base)
+ repository_hash == other.repository_hash
+ else
+ false
+ end
+ end
+
+ # (Hash) Lookup table from textual description of repository type
+ # ('git', 'svn' or 'download' currently) to the class that
+ # represents that repository.
+ @@types = {} unless class_variable_defined?(:@@types)
+
+ # (Set) list of acceptable URI schemes. Initially just http, https and ftp.
+ @@okay_schemes = Set.new(["http", "https", "ftp"])
+
+ protected
+
+ # Return true iff this credential is useful. Currently "useful"
+ # means "nonempty and not all spaces".
+ def self.is_useful?(credential)
+ credential && !credential.strip.empty?
+ end
+
+ # Return the useful portion of this credential. Currently strips
+ # out any spaces.
+ def self.useful_part(credential)
+ credential.strip
+ end
+
+ # Compute a unique identifier for the given string. Currently uses SHA1.
+ #
+ # === Parameters
+ # string(String):: string to compute unique identifier for
+ #
+ # === Returns
+ # String:: unique identifier
+ def digest(string)
+ Digest::SHA1.hexdigest(string)
+ end
+
+ # Regexp matching everything not allowed in a URI and also ':',
+ # '@' and '/', to be used for encoding usernames and passwords.
+ USERPW = Regexp.new("[^#{URI::PATTERN::UNRESERVED}#{URI::PATTERN::RESERVED}]|[:@/]", false, 'N').freeze
+
+ # Return a URI with the given username and password set.
+ #
+ # === Parameters
+ # uri(URI or String):: URI to add user identification to
+ #
+ # === Returns
+ # URI:: URI with username and password identification added
+ def add_users_to(uri, username=nil, password=nil)
+ begin
+ uri = URI.parse(uri) if uri.instance_of?(String)
+ if username
+ userinfo = URI.escape(username, USERPW)
+ userinfo += ":" + URI.escape(password, USERPW) unless password.nil?
+ uri.userinfo = userinfo
+ end
+ uri
+ rescue URI::InvalidURIError
+ if uri =~ PATTERN::GIT_URI
+ user, host, path = $1, $2, $3
+ userinfo = URI.escape(user, USERPW)
+ userinfo += ":" + URI.escape(username, USERPW) unless username.nil?
+ path = "/" + path unless path.start_with?('/')
+ URI::Generic::build({:scheme => "ssh",
+ :userinfo => userinfo,
+ :host => host,
+ :path => path
+ })
+ else
+ raise
+ end
+ end
+ end
+
+ module PATTERN
+ include URI::REGEXP::PATTERN
+ GIT_URI = Regexp.new("^((?:[#{UNRESERVED}]|#{ESCAPED})*)@(#{HOST}):(#{ABS_PATH}|#{REL_PATH})$")
+ end
+
+ SSH_PORT = 22
+
+ def self.validate_uri(uri)
+ begin
+ uri = URI.parse(uri) if uri.instance_of?(String)
+ raise "Invalid URI #{uri}: don't know how to interpret scheme #{uri.scheme}" unless @@okay_schemes.include?(uri.scheme)
+ check_host(uri, uri.host, uri.port)
+ rescue URI::InvalidURIError
+ # could be a Git type URI.
+ if uri =~ PATTERN::GIT_URI
+ check_host(uri, $2, SSH_PORT)
+ else
+ raise
+ end
+ end
+ end
+
+ def self.check_host(uri, host, port)
+ begin
+ possibles = Socket.getaddrinfo(host, port, Socket::AF_INET, Socket::SOCK_STREAM, Socket::IPPROTO_TCP)
+ raise "Invalid URI #{uri}: no hosts for #{host}:#{port}" if possibles.nil? || possibles.empty?
+ possibles.each do |possible|
+ family, port, hostname, address, protocol_family, socket_type, protocol = possible
+
+ # Our EC2 gateway is not permitted.
+ raise "Invalid URI #{uri}" if address == "169.254.169.254"
+ # Loopbacks are not permitted.
+ raise "Invalid URI #{uri}" if address =~ /^127\.[0-9]+\.[0-9]+\.[0-9]+$/
+
+ # Private networks are not permitted
+ raise "Invalid URI #{uri}" if address =~ /^10\.[0-9]+\.[0-9]+\.[0-9]+$/
+ raise "Invalid URI #{uri}" if address =~ /^172\.(1[6-9]|2[0-9]|3[01])\.[0-9]+\.[0-9]+$/
+ raise "Invalid URI #{uri}" if address =~ /^192\.168\.[0-9]+\.[0-9]+$/
+ end
+ true
+ rescue SocketError
+ # means the host doesn't exist
+ raise "Invalid URI #{uri}: no hosts for #{host}:#{port}"
+ end
+ end
+
+ end
+ end
+end
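The address filtering in `check_host` can also be expressed with CIDR blocks instead of regular expressions. The following is a minimal sketch, assuming the intent is to reject the EC2 metadata address, loopback addresses, and the RFC 1918 private ranges; `FORBIDDEN_RANGES` and `forbidden_address?` are hypothetical names and are not part of right_scraper.

```ruby
require 'ipaddr'

# Hypothetical list of address ranges a scraper should refuse to contact.
FORBIDDEN_RANGES = [
  IPAddr.new('169.254.169.254/32'), # EC2 metadata endpoint
  IPAddr.new('127.0.0.0/8'),        # loopback
  IPAddr.new('10.0.0.0/8'),         # RFC 1918 private range
  IPAddr.new('172.16.0.0/12'),      # RFC 1918 private range (172.16.x.x to 172.31.x.x)
  IPAddr.new('192.168.0.0/16')      # RFC 1918 private range
].freeze

# Return true if the dotted-quad IPv4 address falls in a forbidden range.
def forbidden_address?(address)
  ip = IPAddr.new(address)
  FORBIDDEN_RANGES.any? { |range| range.include?(ip) }
end
```

Under this sketch, `forbidden_address?('172.31.0.1')` is true while `forbidden_address?('172.32.0.1')` is false, because the private block ends at 172.31.255.255.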
31 ...e/lib/right_scraper_base/repositories/download.rb → lib/right_scraper/repositories/download.rb
@@ -1,5 +1,5 @@
#--
-# Copyright: Copyright (c) 2010 RightScale, Inc.
+# Copyright: Copyright (c) 2010-2011 RightScale, Inc.
#
# Permission is hereby granted, free of charge, to any person obtaining
# a copy of this software and associated documentation files (the
@@ -20,14 +20,13 @@
# TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
# SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
#++
-require File.expand_path(File.join(File.dirname(__FILE__), '..', 'repository'))
module RightScraper
module Repositories
- # A "cookbook repository" that is just an archive file hanging off a
+ # A repository that is just an archive file hanging off a
# web server somewhere. This version uses a command line curl to
# download the archive, and command line tar to extract it.
- class Download < Repository
+ class Download < Base
# (String) Type of the repository, here 'download'.
def repo_type
:download
@@ -52,7 +51,7 @@ def to_s
res = "download #{url}"
end
- # Convert this repository to a URL in the style of Cookbook URLs.
+ # Convert this repository to a URL in the style of resource URLs.
#
# === Returns
# URI:: URL representing this repository
@@ -65,19 +64,27 @@ def to_url
# === Returns
# String:: opaque unique ID for this revision in this repository
def checkout_hash
- digest("#{RS_PROTOCOL_VERSION}\000#{repo_type}\000#{url}\000#{tag}")
+ digest("#{PROTOCOL_VERSION}\000#{repo_type}\000#{url}\000#{tag}")
end
- # (ScraperBase class) Appropriate class for scraping this sort of
- # repository.
- def scraper
- RightScraper::Scrapers::CommandLineDownload
+ # Instantiate retriever for this kind of repository
+ #
+ # === Options
+ # <tt>:max_bytes</tt>:: Maximum number of bytes to read
+ # <tt>:max_seconds</tt>:: Maximum number of seconds to spend reading
+ # <tt>:basedir</tt>:: Destination directory, use temp dir if not specified
+ # <tt>:logger</tt>:: Logger to use
+ #
+ # === Return
+ # retriever(Retrievers::Download):: Retriever for this repository
+ def retriever(options)
+ RightScraper::Retrievers::Download.new(self, options)
end
+
# Add this repository to the list of available types.
@@types[:download] = RightScraper::Repositories::Download
- # Add a way to get this class, specifically.
- @@types[:download_command_line] = RightScraper::Repositories::Download
+
end
end
end
43 ...per_git/lib/right_scraper_git/repositories/git.rb → lib/right_scraper/repositories/git.rb
@@ -1,5 +1,5 @@
#--
-# Copyright: Copyright (c) 2010 RightScale, Inc.
+# Copyright: Copyright (c) 2010-2011 RightScale, Inc.
#
# Permission is hereby granted, free of charge, to any person obtaining
# a copy of this software and associated documentation files (the
@@ -23,8 +23,18 @@
module RightScraper
module Repositories
- # A cookbook repository stored in a Git repository.
- class Git < Repository
+ # A Git repository.
+ class Git < Base
+
+ # (String) Optional, tag or branch of repository that should be downloaded
+ attr_accessor :tag
+ alias_method :revision, :tag
+
+ # (String) Optional, git private SSH key content
+ attr_accessor :first_credential
+ alias_method :ssh_key, :first_credential
+
+ # Initialize repository
def initialize(*args)
super
@tag = "master" if @tag.nil?
@@ -35,22 +45,15 @@ def repo_type
:git
end
- # (String) Optional, tag or branch of repository that should be downloaded
- attr_accessor :tag
- alias_method :revision, :tag
-
- # (String) Optional, git private SSH key content
- attr_accessor :first_credential
-
# Return a unique identifier for this revision in this repository.
#
# === Returns
# String:: opaque unique ID for this revision in this repository
def checkout_hash
- digest("#{RS_PROTOCOL_VERSION}\000#{repo_type}\000#{url}\000#{tag}")
+ digest("#{PROTOCOL_VERSION}\000#{repo_type}\000#{url}\000#{tag}")
end
- # Convert this repository to a URL in the style of Cookbook URLs.
+ # Convert this repository to a URL in the style of resource URLs.
#
# === Returns
# URI:: URL representing this repository
@@ -63,10 +66,18 @@ def to_url
uri
end
- # (ScraperBase class) Appropriate class for scraping this sort of
- # repository.
- def scraper
- RightScraper::Scrapers::Git
+ # Instantiate retriever for this kind of repository
+ #
+ # === Options
+ # <tt>:max_bytes</tt>:: Maximum number of bytes to read
+ # <tt>:max_seconds</tt>:: Maximum number of seconds to spend reading
+ # <tt>:basedir</tt>:: Destination directory, use temp dir if not specified
+ # <tt>:logger</tt>:: Logger to use
+ #
+ # === Return
+ # retriever(Retrievers::Git):: Retriever for this repository
+ def retriever(options)
+ RightScraper::Retrievers::Git.new(self, options)
end
# Add this repository to the list of available types.
8 ..._base/lib/right_scraper_base/repositories/mock.rb → lib/right_scraper/repositories/mock.rb
@@ -1,5 +1,5 @@
#--
-# Copyright: Copyright (c) 2010 RightScale, Inc.
+# Copyright: Copyright (c) 2010-2011 RightScale, Inc.
#
# Permission is hereby granted, free of charge, to any person obtaining
# a copy of this software and associated documentation files (the
@@ -20,13 +20,13 @@
# TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
# SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
#++
-require File.expand_path(File.join(File.dirname(__FILE__), '..', 'repository'))
+require File.expand_path(File.join(File.dirname(__FILE__), 'base'))
module RightScraper
module Repositories
# A "repository" that is just there for testing. This class is not
# loaded by default.
- class Mock < Repository
+ class Mock < Base
# Create a new mock repository.
def initialize
@repo_type = :mock
@@ -52,7 +52,7 @@ def to_s
res = "mock #{url}:#{tag}"
end
- # (ScraperBase class) Appropriate class for scraping this sort of
+ # (Base class) Appropriate class for scraping this sort of
# repository.
def scraper
@@scraper || raise("Scraper for mocks isn't defined yet")
47 ...per_svn/lib/right_scraper_svn/repositories/svn.rb → lib/right_scraper/repositories/svn.rb
@@ -1,5 +1,5 @@
#--
-# Copyright: Copyright (c) 2010 RightScale, Inc.
+# Copyright: Copyright (c) 2010-2011 RightScale, Inc.
#
# Permission is hereby granted, free of charge, to any person obtaining
# a copy of this software and associated documentation files (the
@@ -24,7 +24,20 @@
module RightScraper
module Repositories
# A repository that is stored in a Subversion server.
- class Svn < Repository
+ class Svn < Base
+
+ # (String) Optional, tag or branch of repository that should be downloaded
+ attr_accessor :tag
+ alias_method :revision, :tag
+
+ # (String) Optional, SVN username
+ attr_accessor :first_credential
+ alias_method :username, :first_credential
+
+ # (String) Optional, SVN password
+ attr_accessor :second_credential
+ alias_method :password, :second_credential
+
# Create a new SvnRepository. If the tag is not specified,
# defaults to HEAD.
def initialize(*args)
@@ -37,25 +50,15 @@ def repo_type
:svn
end
- # (String) Optional, tag or branch of repository that should be downloaded
- attr_accessor :tag
- alias_method :revision, :tag
-
- # (String) Optional, SVN username
- attr_accessor :first_credential
-
- # (String) Optional, SVN password
- attr_accessor :second_credential
-
# Return a unique identifier for this revision in this repository.
#
# === Returns
# String:: opaque unique ID for this revision in this repository
def checkout_hash
- digest("#{RS_PROTOCOL_VERSION}\000#{repo_type}\000#{url}\000#{tag}")
+ digest("#{PROTOCOL_VERSION}\000#{repo_type}\000#{url}\000#{tag}")
end
- # Convert this repository to a URL in the style of Cookbook URLs.
+ # Convert this repository to a URL in the style of resource URLs.
#
# === Returns
# URI:: URL representing this repository
@@ -68,10 +71,18 @@ def to_url
uri
end
- # (ScraperBase class) Appropriate class for scraping this sort of
- # repository.
- def scraper
- RightScraper::Scrapers::Svn
+ # Instantiate retriever for this kind of repository
+ #
+ # === Options
+ # <tt>:max_bytes</tt>:: Maximum number of bytes to read
+ # <tt>:max_seconds</tt>:: Maximum number of seconds to spend reading
+ # <tt>:basedir</tt>:: Destination directory, use temp dir if not specified
+ # <tt>:logger</tt>:: Logger to use
+ #
+ # === Return
+ # retriever(Retrievers::Svn):: Retriever for this repository
+ def retriever(options)
+ RightScraper::Retrievers::Svn.new(self, options)
end
# Add this repository to the list of available types.
70 lib/right_scraper/resources/base.rb
@@ -0,0 +1,70 @@
+#--
+# Copyright: Copyright (c) 2010-2011 RightScale, Inc.
+#
+# Permission is hereby granted, free of charge, to any person obtaining
+# a copy of this software and associated documentation files (the
+# 'Software'), to deal in the Software without restriction, including
+# without limitation the rights to use, copy, modify, merge, publish,
+# distribute, sublicense, and/or sell copies of the Software, and to
+# permit persons to whom the Software is furnished to do so, subject to
+# the following conditions:
+#
+# The above copyright notice and this permission notice shall be
+# included in all copies or substantial portions of the Software.
+#
+# THE SOFTWARE IS PROVIDED 'AS IS', WITHOUT WARRANTY OF ANY KIND,
+# EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+# MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
+# IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
+# CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
+# TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
+# SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
+#++
+require 'digest/sha1'
+require 'uri'
+
+module RightScraper
+
+ module Resources
+
+ # Localized representation of a resource. Contains the resource
+ # contents, and the metadata as a hash. A resource at its core is any
+ # abstraction that is statically represented by a set of files and
+ # directories and metadata.
+ #
+ # The JSON metadata for the resource is in #metadata, and the
+ # manifest is in #manifest.
+ class Base
+
+ # (Repositories::Base) Repository the resource was fetched from.
+ attr_reader :repository
+
+ # (Hash) Metadata from the resource.
+ attr_accessor :metadata
+
+ # (Hash) Manifest for resource. A hash of path => SHA-1 digests.
+ attr_accessor :manifest
+
+ # (String) Position in the repository.
+ attr_accessor :pos
+
+ # Create a new resource from the given parameters.
+ #
+ # === Parameters
+ # repo(Repositories::Base):: Repository containing this resource
+ def initialize(repo, pos)
+ @repository = repo
+ @pos = pos
+ end
+
+ # Resource hash
+ #
+ # === Return
+ # hash(String):: Hexadecimal value that uniquely identifies this resource
+ def resource_hash
+ Digest::SHA1.hexdigest("#{PROTOCOL_VERSION}\000#{@repository.checkout_hash}\000#{@pos}")
+ end
+
+ end
+ end
+end
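The `resource_hash` method above joins its inputs with NUL bytes before hashing, so that adjacent fields cannot run together. A minimal standalone sketch of the same idea follows; `EXAMPLE_PROTOCOL_VERSION` and `example_resource_hash` are hypothetical names, and the version value is an assumption, not the gem's real constant.

```ruby
require 'digest/sha1'

# Assumed value for illustration only; the gem defines its own PROTOCOL_VERSION.
EXAMPLE_PROTOCOL_VERSION = 2

# Hash the protocol version, the repository checkout hash and the position,
# separated by NUL bytes so ("ab", "c") and ("a", "bc") produce different digests.
def example_resource_hash(checkout_hash, pos)
  Digest::SHA1.hexdigest("#{EXAMPLE_PROTOCOL_VERSION}\000#{checkout_hash}\000#{pos}")
end
```

The result is a 40-character hexadecimal string that changes whenever the protocol version, the checkout hash, or the position changes.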
12 right_scraper_s3/lib/right_scraper_s3.rb → lib/right_scraper/resources/cookbook.rb
@@ -1,5 +1,5 @@
#--
-# Copyright: Copyright (c) 2010 RightScale, Inc.
+# Copyright: Copyright (c) 2010-2011 RightScale, Inc.
#
# Permission is hereby granted, free of charge, to any person obtaining
# a copy of this software and associated documentation files (the
@@ -21,4 +21,12 @@
# SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
#++
-require File.expand_path(File.join(File.dirname(__FILE__), 'right_scraper_s3', 'scanners', 's3'))
+module RightScraper
+
+ module Resources
+
+ class Cookbook < Base
+ end
+
+ end
+end
40 ...ib/right_scraper_libcurl/repositories/download.rb → lib/right_scraper/resources/workflow.rb
@@ -1,5 +1,5 @@
#--
-# Copyright: Copyright (c) 2010 RightScale, Inc.
+# Copyright: Copyright (c) 2010-2011 RightScale, Inc.
#
# Permission is hereby granted, free of charge, to any person obtaining
# a copy of this software and associated documentation files (the
@@ -22,20 +22,34 @@
#++
module RightScraper
- module Repositories
- # A "cookbook repository" that is just an archive file hanging off a
- # web server somewhere.
- class LibCurlDownload < Download
- # (ScraperBase class) Appropriate class for scraping this sort of
- # repository.
- def scraper
- RightScraper::Scrapers::LibCurlDownload
+
+ module Resources
+
+ class Workflow < Base
+
+ METADATA_EXT = '.meta'
+ DEFINITION_EXT = '.def'
+
+ # Relative path to definition file
+ # @pos must be set before this can be called
+ #
+ # === Return
+ # path(String):: Path to definition file
+ def definition_path
+ path = @pos
+ end
+
+ # Relative path to metadata file
+ # @pos must be set before this can be called
+ #
+ # === Return
+ # path(String):: Path to metadata file
+ def metadata_path
+ path = @pos.chomp(File.extname(@pos)) + METADATA_EXT if @pos
end
- # Add this repository to the list of available types.
- # @@types[:download] = RightScraper::Repositories::LibCurlDownload
- # Add a way to get this class, specifically.
- @@types[:download_libcurl] = RightScraper::Repositories::LibCurlDownload
end
+
end
end
+
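To illustrate how `metadata_path` derives the metadata file name from the definition file position, here is a small standalone sketch. `EXAMPLE_METADATA_EXT` and `example_metadata_path` are hypothetical names that mirror `METADATA_EXT` and the method body shown in the diff.

```ruby
# Mirrors Workflow::METADATA_EXT from the diff above.
EXAMPLE_METADATA_EXT = '.meta'

# Swap the extension of the definition file for the metadata extension.
def example_metadata_path(pos)
  pos.chomp(File.extname(pos)) + EXAMPLE_METADATA_EXT
end

example_metadata_path('workflows/deploy.def') # => "workflows/deploy.meta"
```

A position without an extension simply gets the metadata extension appended.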
114 lib/right_scraper/retrievers/base.rb
@@ -0,0 +1,114 @@
+#--
+# Copyright: Copyright (c) 2010-2011 RightScale, Inc.
+#
+# Permission is hereby granted, free of charge, to any person obtaining
+# a copy of this software and associated documentation files (the
+# 'Software'), to deal in the Software without restriction, including
+# without limitation the rights to use, copy, modify, merge, publish,
+# distribute, sublicense, and/or sell copies of the Software, and to
+# permit persons to whom the Software is furnished to do so, subject to
+# the following conditions:
+#
+# The above copyright notice and this permission notice shall be
+# included in all copies or substantial portions of the Software.
+#
+# THE SOFTWARE IS PROVIDED 'AS IS', WITHOUT WARRANTY OF ANY KIND,
+# EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+# MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
+# IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
+# CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
+# TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
+# SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
+#++
+
+module RightScraper
+ module Retrievers
+ # Base class for all retrievers.
+ #
+ # Retrievers fetch remote repositories into a given path.
+ # They will attempt to fetch incrementally when possible (e.g. leveraging
+ # the underlying source control management system's incremental capabilities).
+ class Base
+
+ # Integer:: optional maximum size permitted for repositories
+ attr_accessor :max_bytes
+
+ # Integer:: optional maximum number of seconds for any single
+ # retrieve operation.
+ attr_accessor :max_seconds
+
+ # RightScraper::Repositories::Base:: repository currently being retrieved
+ attr_reader :repository
+
+ # String:: Path to directory where files are retrieved
+ attr_reader :repo_dir
+
+ # Create a new retriever for the given repository. This class
+ # recognizes several options, and subclasses may recognize
+ # additional options. Only <tt>:basedir</tt> is required.
+ #
+ # === Options
+ # <tt>:basedir</tt>:: Required, base directory where all files should be retrieved
+ # <tt>:max_bytes</tt>:: Maximum number of bytes to read
+ # <tt>:max_seconds</tt>:: Maximum number of seconds to spend reading
+ # <tt>:logger</tt>:: Logger to use
+ #
+ # === Parameters
+ # repository(RightScraper::Repositories::Base):: repository to scrape
+ # options(Hash):: retriever options
+ #
+ # === Raise
+ # 'Missing base directory':: if :basedir option is missing
+ def initialize(repository, options={})
+ raise 'Missing base directory' unless options[:basedir]
+ @repository = repository
+ @max_bytes = options[:max_bytes] || nil
+ @max_seconds = options[:max_seconds] || nil
+ @basedir = options[:basedir]
+ @repo_dir = RightScraper::Retrievers::Base.repo_dir(@basedir, repository)
+ @logger = options[:logger] || RightScraper::Logger.new
+ @logger.repository = repository
+ @logger.operation(:initialize, "setting up in #{@repo_dir}") do
+ FileUtils.mkdir_p(@repo_dir)
+ end
+ end
+
+ # Paths to ignore when traversing the filesystem. Mostly used for
+ # things like Git and Subversion version control directories.
+ #
+ # === Return
+ # list(Array):: list of filenames to ignore.
+ def ignorable_paths
+ []
+ end
+
+ # Retrieve repository, overridden in subclasses
+ def retrieve
+ raise NotImplementedError
+ end
+
+ # Path to directory where given repo should be or was downloaded
+ #
+ # === Parameters
+ # root_dir(String):: Path to directory containing all scraped repositories
+ # repo(Hash|RightScraper::Repositories::Base):: Remote repository corresponding to local directory
+ #
+ # === Return
+ # String:: Path to local directory that corresponds to given repository
+ def self.repo_dir(root_dir, repo)
+ repo = RightScraper::Repositories::Base.from_hash(repo) if repo.is_a?(Hash)
+ dir_name = repo.repository_hash
+ dir_path = File.join(root_dir, dir_name)
+ "#{dir_path}/repo"
+ end
+
+ protected
+
+ # (Hash) Lookup table from textual description of scraper type
+ # ('cookbook' or 'workflow' currently) to the class that
+ # represents that scraper.
+ @@types = {} unless class_variable_defined?(:@@types)
+
+ end
+ end
+end
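The directory layout produced by `Retrievers::Base.repo_dir` can be sketched as follows. This assumes the repository hash is already known; `example_repo_dir` is a hypothetical helper and the paths are made up for illustration.

```ruby
# Each repository gets its own directory under the root, named after the
# repository hash, with the retrieved files placed in a "repo" subdirectory.
def example_repo_dir(root_dir, repository_hash)
  File.join(root_dir, repository_hash, 'repo')
end

example_repo_dir('/tmp/scrape', 'abc123') # => "/tmp/scrape/abc123/repo"
```

Keeping the files one level below the hash directory presumably leaves room for sibling data next to the checkout, though the commit does not say so.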
24 ..._base/lib/right_scraper_base/scrapers/checkout.rb → lib/right_scraper/retrievers/checkout.rb
@@ -1,5 +1,5 @@
#--
-# Copyright: Copyright (c) 2010 RightScale, Inc.
+# Copyright: Copyright (c) 2010-2011 RightScale, Inc.
#
# Permission is hereby granted, free of charge, to any person obtaining
# a copy of this software and associated documentation files (the
@@ -21,19 +21,19 @@
# SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
#++
-require File.expand_path(File.join(File.dirname(__FILE__), 'filesystem'))
-
module RightScraper
- module Scrapers
- # Base class for FS based scrapers that want to do version control
+ module Retrievers
+
+ # Base class for retrievers that want to do version control
# operations (CVS, SVN, etc.). Subclasses can get away with
# implementing only #do_checkout but to support incremental
# operation need to implement #exists? and #do_update, in addition
- # to FilesystemBasedScraper#ignorable_paths.
- class CheckoutBasedScraper < FilesystemBasedScraper
+ # to Retrievers::Base#ignorable_paths.
+ class CheckoutBasedRetriever < Base
+
# Check out repository into the directory. Occurs between
# variable initialization and beginning scraping.
- def setup_dir
+ def retrieve
if exists?
begin
@logger.operation(:updating) do
@@ -41,7 +41,7 @@ def setup_dir
end
rescue
@logger.note_error($!, :updating, "switching to using checkout")
- FileUtils.remove_entry_secure basedir
+ FileUtils.remove_entry_secure @repo_dir
@logger.operation(:checkout) do
do_checkout
end
@@ -71,13 +71,9 @@ def do_update
# Perform a de novo full checkout of the repository. Subclasses
# must override this to do anything useful.
def do_checkout
- FileUtils.mkdir_p(basedir)
+ FileUtils.mkdir_p(@repo_dir)
end
- # Path to check repository to. Should be within @basedir.
- def basedir
- File.join(@basedir, @repository.repository_hash)
- end
end
end
end
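The control flow of `CheckoutBasedRetriever#retrieve` (update an existing checkout, and fall back to a fresh checkout if the update fails) can be sketched in isolation. `SketchRetriever` below is a hypothetical stand-in with no version control behind it: its `do_update` always fails so that the fallback path is exercised, and the `.checkout` marker directory is an assumption playing the role of `.git` or `.svn`.

```ruby
require 'fileutils'
require 'tmpdir'

# Hypothetical retriever that records which path was taken.
class SketchRetriever
  attr_reader :log

  def initialize(repo_dir)
    @repo_dir = repo_dir
    @log = []
  end

  # A checkout exists if the marker directory is present.
  def exists?
    File.directory?(File.join(@repo_dir, '.checkout'))
  end

  # Simulate an incremental update that always fails.
  def do_update
    raise 'simulated update failure'
  end

  # Simulate a full checkout by creating the marker directory.
  def do_checkout
    FileUtils.mkdir_p(File.join(@repo_dir, '.checkout'))
    @log << :checkout
  end

  # Try an incremental update first; on failure wipe the directory and
  # perform a full checkout instead.
  def retrieve
    if exists?
      begin
        do_update
        @log << :update
      rescue StandardError
        FileUtils.remove_entry_secure(@repo_dir)
        do_checkout
      end
    else
      do_checkout
    end
  end
end
```

Calling `retrieve` twice on a fresh directory performs a checkout both times in this sketch, the second one because the simulated update fails.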
24 ...ht_scraper_base/scrapers/command_line_download.rb → lib/right_scraper/retrievers/download.rb
@@ -1,5 +1,5 @@
#--
-# Copyright: Copyright (c) 2010 RightScale, Inc.
+# Copyright: Copyright (c) 2010-2011 RightScale, Inc.
#
# Permission is hereby granted, free of charge, to any person obtaining
# a copy of this software and associated documentation files (the
@@ -20,28 +20,30 @@
# TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
# SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
#++
-require File.expand_path(File.join(File.dirname(__FILE__), 'filesystem'))
require 'process_watcher'
require 'tempfile'
require 'digest/sha1'
module RightScraper
- module Scrapers
- # A scraper for cookbooks stored in archives on a web server
+ module Retrievers
+ # A retriever for resources stored in archives on a web server
# somewhere. Uses command line curl and command line tar.
- class CommandLineDownload < FilesystemBasedScraper
+ class Download < Base
+
+ # Directory used to download tarballs
def workdir
File.join(@basedir, @repository.repository_hash)
end
- def basedir
+ # Path to directory where files are retrieved
+ def repo_dir
File.join(workdir, "archive")
end
- # Return next cookbook from the stream, or nil if none.
- def setup_dir
+ # Download tarball and unpack it
+ def retrieve
FileUtils.remove_entry_secure workdir if File.exists?(workdir)
- FileUtils.mkdir_p basedir
+ FileUtils.mkdir_p repo_dir
file = File.join(workdir, "package")
@logger.operation(:downloading) do
@@ -69,8 +71,8 @@ def setup_dir
else
extraction = "xf"
end
- Dir.chdir(basedir) do
- ProcessWatcher.watch("tar", [extraction, file], basedir,
+ Dir.chdir(repo_dir) do
+ ProcessWatcher.watch("tar", [extraction, file], repo_dir,
@max_bytes || -1, @max_seconds || -1) do |phase, command, exception|
@logger.note_phase(phase, :running_command, command, exception)
end
23 ...scraper_git/lib/right_scraper_git/scrapers/git.rb → lib/right_scraper/retrievers/git.rb
@@ -1,5 +1,5 @@
#--
-# Copyright: Copyright (c) 2010 RightScale, Inc.
+# Copyright: Copyright (c) 2010-2011 RightScale, Inc.
#
# Permission is hereby granted, free of charge, to any person obtaining
# a copy of this software and associated documentation files (the
@@ -20,17 +20,16 @@
# TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
# SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
#++
-require File.expand_path(File.join(File.dirname(__FILE__), '..', 'processes', 'ssh'))
require 'git'
module RightScraper
- module Scrapers
- # Scraper for cookbooks stored in a git repository.
- class Git < CheckoutBasedScraper
- # In addition to normal scraper initialization, if the
+ module Retrievers
+ # Retriever for resources stored in a git repository.
+ class Git < CheckoutBasedRetriever
+ # In addition to normal retriever initialization, if the
# underlying repository has a credential we need to initialize a
# fresh SSHAgent and add the credential to it.
- def setup_dir
+ def retrieve
RightScraper::Processes::SSHAgent.with do |agent|
agent.add_key(@repository.first_credential) unless
@repository.first_credential.nil?
@@ -45,7 +44,7 @@ def setup_dir
# Boolean:: true if the checkout already exists (and thus
# incremental updating can occur).
def exists?
- File.exists?(File.join(basedir, '.git'))
+ File.exists?(File.join(@repo_dir, '.git'))
end
def do_fetch(git)
@@ -65,7 +64,7 @@ def do_fetch(git)
# Note that if #tag is a SHA revision or a tag that exists in the
# current repository, no fetching is done.
def do_update
- git = ::Git.open(basedir)
+ git = ::Git.open(@repo_dir)
do_fetch(git)
git.reset_hard
do_checkout_revision(git)
@@ -78,13 +77,13 @@ def do_update_tag(git)
end
# Clone the remote repository. The operations are as follows:
- # * clone repository to #basedir
+ # * clone repository to @repo_dir
# * checkout #tag
# * update @repository#tag
def do_checkout
super
- git = @logger.operation(:cloning, "to #{basedir}") do
- ::Git.clone(@repository.url, basedir)
+ git = @logger.operation(:cloning, "to #{@repo_dir}") do
+ ::Git.clone(@repository.url, @repo_dir)
end
do_fetch(git)
do_checkout_revision(git)
18 ...scraper_svn/lib/right_scraper_svn/scrapers/svn.rb → lib/right_scraper/retrievers/svn.rb
@@ -1,5 +1,5 @@
#--
-# Copyright: Copyright (c) 2010 RightScale, Inc.
+# Copyright: Copyright (c) 2010-2011 RightScale, Inc.
#
# Permission is hereby granted, free of charge, to any person obtaining
# a copy of this software and associated documentation files (the
@@ -20,13 +20,15 @@
# TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
# SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
#++
-require File.expand_path(File.join(File.dirname(__FILE__), '..', 'svn_client'))
+require File.join(File.dirname(__FILE__), '..', 'svn_client')
+
require 'process_watcher'
module RightScraper
- module Scrapers
- # Scraper for cookbooks stored in a Subversion repository.
- class Svn < CheckoutBasedScraper
+ module Retrievers
+ # Retriever for svn repositories
+ class Svn < CheckoutBasedRetriever
+
include RightScraper::SvnClient
# Return true if a checkout exists. Currently tests for .svn in
@@ -36,7 +38,7 @@ class Svn < CheckoutBasedScraper
# Boolean:: true if the checkout already exists (and thus
# incremental updating can occur).
def exists?
- File.exists?(File.join(basedir, '.svn'))
+ File.exists?(File.join(@repo_dir, '.svn'))
end
# Incrementally update the checkout. The operations are as follows:
@@ -67,11 +69,11 @@ def do_update_tag
end
# Check out the remote repository. The operations are as follows:
- # * checkout repository at #tag to #basedir
+ # * checkout repository at #tag to @repo_dir
def do_checkout
super
@logger.operation(:checkout_revision) do
- run_svn_no_chdir("checkout", @repository.url, basedir, get_tag_argument)
+ run_svn_no_chdir("checkout", @repository.url, @repo_dir, get_tag_argument)
end
do_update_tag
end
32 ...aper_base/lib/right_scraper_base/scanners/base.rb → lib/right_scraper/scanners/base.rb
@@ -1,5 +1,5 @@
#--
-# Copyright: Copyright (c) 2010 RightScale, Inc.
+# Copyright: Copyright (c) 2010-2011 RightScale, Inc.
#
# Permission is hereby granted, free of charge, to any person obtaining
# a copy of this software and associated documentation files (the
@@ -21,8 +21,6 @@
# SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
#++
-require File.expand_path(File.join(File.dirname(__FILE__), '..', 'logger'))
-
module RightScraper
module Scanners
# Base class for scanning filesystems. Subclasses should override
@@ -31,16 +29,16 @@ module Scanners
#
# Overriding #new is useful for getting
# additional arguments. Overriding #begin allows you to do
- # processing before the scan of a given cookbook begins;
+ # processing before the scan of a given resource begins;
# overriding #end allows you to do processing after it completes.
#
# Most processing will occur in #notice, which notifies you that a
# file has been detected, and in #notice_dir. In #notice you are
# handed the relative position of the file from the start of the
- # cookbook; so if you were scanning <tt>/a/cookbook</tt> and
+ # resource; so if you were scanning <tt>/a/resource</tt> and
# noticed a file <tt>b/c</tt>, #notice would be called with
# <tt>"b/c"</tt>, even though the full pathname is
- # <tt>/a/cookbook/b/c</tt>. If you decide you need the actual
+ # <tt>/a/resource/b/c</tt>. If you decide you need the actual
# data, #notice takes a block which will return that data to you
# if you +yield+.
#
@@ -48,9 +46,9 @@ module Scanners
# directory. The return value determines whether you find the
# directory worth recursing into, or not--as an example, when
# looking for the <tt>metadata.json</tt> file it is never
- # necessary to descend past the topmost directory of the cookbook,
+ # necessary to descend past the topmost directory of the resource,
# but the same is not true when building a manifest.
- class Scanner
+ class Base
# Create a new Scanner. Recognizes options as given. Some
# options may be required, others optional. This class recognizes
# only _:logger_.
@@ -61,7 +59,7 @@ class Scanner
# === Parameters ===
# options(Hash):: scanner options
def initialize(options={})
- @logger = options.fetch(:logger, Logger.new)
+ @logger = options.fetch(:logger, RightScraper::Logger.new)
end
# Notification that all scans for this repository have
@@ -69,19 +67,19 @@ def initialize(options={})
def finish
end
- # Begin a scan for the given cookbook.
+ # Begin a scan for the given resource.
#
# === Parameters ===
- # cookbook(RightScraper::Cookbook):: cookbook to scan
- def begin(cookbook)
+ # resource(RightScraper::Resources::Base):: resource to scan
+ def begin(resource)
end
- # Finish a scan for the given cookbook.
+ # Finish a scan for the given resource.
#
# === Parameters ===
- # cookbook(RightScraper::Cookbook):: cookbook that just finished
+ # resource(RightScraper::Resources::Base):: resource that just finished
# scanning
- def end(cookbook)
+ def end(resource)
end
# Notice a file during scanning.
@@ -92,7 +90,7 @@ def end(cookbook)
#
# === Parameters ===
# relative_position(String):: relative pathname for _pathname_
- # from root of cookbook
+ # from root of resource
def notice(relative_position)
end
@@ -101,7 +99,7 @@ def notice(relative_position)
#
# === Parameters ===
# relative_position(String):: relative pathname for the directory
- # from root of cookbook
+ # from root of resource
#
# === Returns ===
# Boolean:: should the scanning recurse into the directory
14 ..._base/lib/right_scraper_base/scanners/manifest.rb → lib/right_scraper/scanners/cookbook_manifest.rb
@@ -1,5 +1,5 @@
#--
-# Copyright: Copyright (c) 2010 RightScale, Inc.
+# Copyright: Copyright (c) 2010-2011 RightScale, Inc.
#
# Permission is hereby granted, free of charge, to any person obtaining
# a copy of this software and associated documentation files (the
@@ -27,19 +27,19 @@
module RightScraper
module Scanners
# Build manifests from a filesystem.
- class Manifest < Scanner
+ class CookbookManifest < Base
# Create a new manifest scanner. Does not accept any new arguments.
def initialize(*args)
super
@manifest = {}
end
- # Complete a scan for the given cookbook.
+ # Complete a scan for the given resource.
#
# === Parameters ===
- # cookbook(RightScraper::Cookbook):: cookbook to scan
- def end(cookbook)
- cookbook.manifest = @manifest
+ # resource(RightScraper::Resources::Base):: resource to scan
+ def end(resource)
+ resource.manifest = @manifest
@manifest = {}
end
@@ -50,7 +50,7 @@ def end(cookbook)
# not always be necessary to read the data.
#
# === Parameters ===
- # relative_position(String):: relative pathname for file from root of cookbook
+ # relative_position(String):: relative pathname for file from root of resource
def notice(relative_position)
@manifest[relative_position] = Digest::SHA1.hexdigest(yield)
end
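The manifest built by the `notice` method above maps each relative path to the SHA-1 digest of the file contents. A minimal sketch of the resulting data structure, assuming the file contents are already in memory; `example_manifest` is a hypothetical helper, not part of the scanner API.

```ruby
require 'digest/sha1'

# Build a path => SHA-1 hex digest hash from a path => contents hash.
def example_manifest(files)
  files.each_with_object({}) do |(relative_position, data), manifest|
    manifest[relative_position] = Digest::SHA1.hexdigest(data)
  end
end
```

In the real scanner the contents come from the block passed to `notice`, so data is read only when a scanner asks for it.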
7 ..._base/lib/right_scraper_base/scanners/metadata.rb → lib/right_scraper/scanners/cookbook_metadata.rb
@@ -1,5 +1,5 @@
#--
-# Copyright: Copyright (c) 2010 RightScale, Inc.
+# Copyright: Copyright (c) 2010-2011 RightScale, Inc.
#
# Permission is hereby granted, free of charge, to any person obtaining
# a copy of this software and associated documentation files (the
@@ -21,17 +21,16 @@
# SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
#++
-require File.expand_path(File.join(File.dirname(__FILE__), 'base'))
require 'json'
module RightScraper
module Scanners
# Load cookbook metadata from a filesystem.
- class Metadata < Scanner
+ class CookbookMetadata < Base
# Begin a scan for the given cookbook.
#
# === Parameters
- # cookbook(RightScraper::Cookbook):: cookbook to scan
+ # cookbook(RightScraper::Resources::Cookbook):: cookbook to scan
def begin(cookbook)
@cookbook = cookbook
end
9 right_scraper_s3/lib/right_scraper_s3/scanners/s3.rb → lib/right_scraper/scanners/cookbook_s3_upload.rb
@@ -1,5 +1,5 @@
#--
-# Copyright: Copyright (c) 2010 RightScale, Inc.
+# Copyright: Copyright (c) 2010-2011 RightScale, Inc.
#
# Permission is hereby granted, free of charge, to any person obtaining
# a copy of this software and associated documentation files (the
@@ -26,7 +26,7 @@
module RightScraper
module Scanners
# Upload scanned files to an S3 bucket.
- class S3Upload < Scanner
+ class CookbookS3Upload < Base
# Create a new S3Upload. In addition to the options recognized
# by Scanner, this class recognizes <tt>:s3_key</tt>,
# <tt>:s3_secret</tt>, and <tt>:s3_bucket</tt> and requires all
@@ -56,11 +56,10 @@ def initialize(options={})
# === Parameters
# cookbook(RightScraper::Cookbook):: cookbook to scan
def end(cookbook)
- @bucket.put(File.join('Cooks', cookbook.cookbook_hash),
+ @bucket.put(File.join('Cooks', cookbook.resource_hash),
{
- :url => cookbook.to_url,
:metadata => cookbook.metadata,
- :manifest => cookbook.manifest,
+ :manifest => cookbook.manifest
}.to_json)
end
22 ...per_base/lib/right_scraper_base/scanners/union.rb → lib/right_scraper/scanners/union.rb
@@ -1,5 +1,5 @@
#--
-# Copyright: Copyright (c) 2010 RightScale, Inc.
+# Copyright: Copyright (c) 2010-2011 RightScale, Inc.
#
# Permission is hereby granted, free of charge, to any person obtaining
# a copy of this software and associated documentation files (the
@@ -40,20 +40,20 @@ def finish
@subscanners.each {|scanner| scanner.finish}
end
- # Begin a scan for the given cookbook.
+ # Begin a scan for the given resource.
#
# === Parameters
- # cookbook(RightScraper::Cookbook):: cookbook to scan
- def begin(cookbook)
- @subscanners.each {|scanner| scanner.begin(cookbook)}
+ # resource(RightScraper::Resources::Base):: resource to scan
+ def begin(resource)
+ @subscanners.each {|scanner| scanner.begin(resource)}
end
- # Finish a scan for the given cookbook.
+ # Finish a scan for the given resource.
#
# === Parameters
- # cookbook(RightScraper::Cookbook):: cookbook that just finished scanning
- def end(cookbook)
- @subscanners.each {|scanner| scanner.end(cookbook)}
+ # resource(RightScraper::Resources::Base):: resource that just finished scanning
+ def end(resource)
+ @subscanners.each {|scanner| scanner.end(resource)}
end
# Notice a file during scanning.
@@ -63,7 +63,7 @@ def end(cookbook)
# not always be necessary to read the data.
#
# === Parameters
- # relative_position(String):: relative pathname for the file from the root of cookbook
+ # relative_position(String):: relative pathname for the file from the root of resource
def notice(relative_position)
data = nil
@subscanners.each {|scanner| scanner.notice(relative_position) {
@@ -77,7 +77,7 @@ def notice(relative_position)
# subscanners report that they should recurse into the directory.
#
# === Parameters
- # relative_position(String):: relative pathname for directory from root of cookbook
+ # relative_position(String):: relative pathname for directory from root of resource
#
# === Returns
# Boolean:: should the scanning recurse into the directory
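Note on `notice` in the union scanner: the hunk is cut off after the inner block opens, so the sketch below is an assumption about the intent, not the file's actual body. It shows one way to forward a file to several subscanners while reading its data at most once, and not at all if no subscanner asks for it. `SketchUnion`, `SizeScanner` and `NameScanner` are hypothetical.

```ruby
# Hypothetical union scanner that forwards notice to each subscanner.
class SketchUnion
  def initialize(subscanners)
    @subscanners = subscanners
  end

  # The caller's block is invoked lazily and its result is cached, so the
  # file is read at most once no matter how many subscanners want it.
  def notice(relative_position)
    data = nil
    @subscanners.each do |scanner|
      scanner.notice(relative_position) { data ||= yield }
    end
  end
end

# Hypothetical subscanner that needs the file contents.
class SizeScanner
  attr_reader :sizes

  def initialize
    @sizes = {}
  end

  def notice(relative_position)
    @sizes[relative_position] = yield.size
  end
end

# Hypothetical subscanner that only cares about names and never yields.
class NameScanner
  attr_reader :names

  def initialize
    @names = []
  end

  def notice(relative_position)
    @names << relative_position
  end
end

reads = 0
first = SizeScanner.new
second = SizeScanner.new
names = NameScanner.new
union = SketchUnion.new([first, second, names])
union.notice('README') { reads += 1; 'hello' }
puts "reads: #{reads}"
```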
86 lib/right_scraper/scanners/workflow_manifest.rb
@@ -0,0 +1,86 @@
+#--
+# Copyright: Copyright (c) 2010-2011 RightScale, Inc.
+#
+# Permission is hereby granted, free of charge, to any person obtaining
+# a copy of this software and associated documentation files (the
+# 'Software'), to deal in the Software without restriction, including
+# without limitation the rights to use, copy, modify, merge, publish,
+# distribute, sublicense, and/or sell copies of the Software, and to
+# permit persons to whom the Software is furnished to do so, subject to
+# the following conditions:
+#
+# The above copyright notice and this permission notice shall be
+# included in all copies or substantial portions of the Software.
+#
+# THE SOFTWARE IS PROVIDED 'AS IS', WITHOUT WARRANTY OF ANY KIND,
+# EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+# MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
+# IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
+# CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
+# TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
+# SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
+#++
+
+require File.expand_path(File.join(File.dirname(__FILE__), 'base'))
+require 'digest/sha1'
+
+module RightScraper
+ module Scanners
+ # Build manifests from a filesystem.
+ class WorkflowManifest < Base
+ # Create a new manifest scanner. Does not accept any new arguments.
+ def initialize(*args)
+ super
+ @manifest = {}
+ end
+
+ # Retrieve the relative positions of the workflow files.
+ #
+ # === Parameters
+ # workflow(Resources::Workflow):: Workflow whose manifest is being built
+ def begin(workflow)
+ @workflow = workflow
+ @metadata_filename = File.basename(@workflow.metadata_path)
+ @definition_filename = File.basename(@workflow.definition_path)
+ end
+
+ # Complete a scan for the given resource.
+ #
+ # === Parameters ===
+ # resource(RightScraper::Resources::Base):: resource to scan
+ def end(resource)
+ resource.manifest = @manifest
+ @manifest = {}
+ end
+
+ # Notice a file during scanning.
+ #
+ # === Block ===
+ # Return the data for this file. We use a block because it may
+ # not always be necessary to read the data.
+ #
+ # === Parameters ===
+ # relative_position(String):: relative pathname for file from root of resource
+ def notice(relative_position)
+ if [ @metadata_filename, @definition_filename ].include?(relative_position)
+ @manifest[relative_position] = Digest::SHA1.hexdigest(yield)
+ end
+ end
+
+ # Notice a directory during scanning. Since the workflow definition and
+ # metadata live in the root directory we don't need to recurse,
+ # but we do need to go into the first directory (identified by
+ # +relative_position+ being +nil+).
+ #
+ # === Parameters
+ # relative_position(String):: relative pathname for the directory from root of workflow
+ #
+ # === Returns
+ # Boolean:: should the scanning recurse into the directory
+ def notice_dir(relative_position)
+ relative_position == nil
+ end
+
+ end
+ end
+end
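Note on `notice_dir` in the workflow scanners: returning true only for a `nil` position lets the scan enter the root directory and skip every subdirectory. The traversal code is not part of this hunk, so `sketch_walk` below is a hypothetical walker written only to exercise that contract; the file names `deploy.def` and `deploy.meta` are likewise made up.

```ruby
require 'digest/sha1'
require 'tmpdir'
require 'fileutils'

# Hypothetical scanner mirroring the root-only behaviour described above.
class SketchWorkflowManifest
  attr_reader :manifest

  def initialize(wanted)
    @wanted = wanted
    @manifest = {}
  end

  # Only hash the files we were told about; ignore everything else.
  def notice(relative_position)
    return unless @wanted.include?(relative_position)
    @manifest[relative_position] = Digest::SHA1.hexdigest(yield)
  end

  # Enter the root (nil) but no subdirectory.
  def notice_dir(relative_position)
    relative_position.nil?
  end
end

# Hypothetical walker: asks the scanner before entering each directory.
def sketch_walk(root, scanner, relative = nil)
  return unless scanner.notice_dir(relative)
  dir = relative ? File.join(root, relative) : root
  Dir.children(dir).sort.each do |entry|
    rel = relative ? File.join(relative, entry) : entry
    full = File.join(root, rel)
    if File.directory?(full)
      sketch_walk(root, scanner, rel)
    else
      scanner.notice(rel) { File.read(full) }
    end
  end
end

manifest = Dir.mktmpdir do |root|
  File.write(File.join(root, 'deploy.def'), 'define deploy()')
  File.write(File.join(root, 'deploy.meta'), '{"name":"deploy"}')
  File.write(File.join(root, 'notes.txt'), 'ignored')
  FileUtils.mkdir_p(File.join(root, 'sub'))
  File.write(File.join(root, 'sub', 'deploy.def'), 'nested copy')
  scanner = SketchWorkflowManifest.new(%w[deploy.def deploy.meta])
  sketch_walk(root, scanner)
  scanner.manifest
end
puts manifest.keys.sort.inspect
```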
70 lib/right_scraper/scanners/workflow_metadata.rb
@@ -0,0 +1,70 @@
+#--
+# Copyright: Copyright (c) 2010-2011 RightScale, Inc.
+#
+# Permission is hereby granted, free of charge, to any person obtaining
+# a copy of this software and associated documentation files (the
+# 'Software'), to deal in the Software without restriction, including
+# without limitation the rights to use, copy, modify, merge, publish,
+# distribute, sublicense, and/or sell copies of the Software, and to
+# permit persons to whom the Software is furnished to do so, subject to
+# the following conditions:
+#
+# The above copyright notice and this permission notice shall be
+# included in all copies or substantial portions of the Software.
+#
+# THE SOFTWARE IS PROVIDED 'AS IS', WITHOUT WARRANTY OF ANY KIND,
+# EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+# MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
+# IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
+# CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
+# TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
+# SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
+#++
+
+require 'json'
+
+module RightScraper
+ module Scanners
+ # Load workflow metadata from a filesystem.
+ class WorkflowMetadata < Base
+ # Begin a scan for the given workflow.
+ #
+ # === Parameters
+ # workflow(RightScraper::Resources::Workflow):: workflow to scan
+ def begin(workflow)
+ @workflow = workflow
+ @metadata_filename = File.basename(workflow.metadata_path)
+ end
+
+ # Notice a file during scanning.
+ #
+ # === Block
+ # Return the data for this file. We use a block because it may
+ # not always be necessary to read the data.
+ #
+ # === Parameters
+ # relative_position(String):: relative pathname for the file from root of workflow
+ def notice(relative_position)
+ if relative_position == @metadata_filename
+ @logger.operation(:metadata_parsing) do
+ @workflow.metadata = JSON.parse(yield)
+ end
+ end
+ end
+
+ # Notice a directory during scanning. Since the workflow definition and
+ # metadata live in the root directory we don't need to recurse,
+ # but we do need to go into the first directory (identified by
+ # +relative_position+ being +nil+).
+ #
+ # === Parameters
+ # relative_position(String):: relative pathname for the directory from root of workflow
+ #
+ # === Returns
+ # Boolean:: should the scanning recurse into the directory
+ def notice_dir(relative_position)
+ relative_position == nil
+ end
+ end
+ end
+end
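Note on `WorkflowMetadata#notice`: the metadata file is matched by basename and parsed as JSON inside a logger operation. The sketch below reproduces that flow under stated assumptions: `SketchLogger#operation` simply records the operation type and yields, which may differ from the real logger, and `SketchWorkflow` is a hypothetical struct standing in for the workflow resource.

```ruby
require 'json'

# Hypothetical logger: records the operation type, then runs the block.
class SketchLogger
  attr_reader :operations

  def initialize
    @operations = []
  end

  def operation(type)
    @operations << type
    yield
  end
end

# Hypothetical workflow resource with only the fields the scanner touches.
SketchWorkflow = Struct.new(:metadata_path, :metadata)

class SketchWorkflowMetadata
  def initialize(logger)
    @logger = logger
  end

  def begin(workflow)
    @workflow = workflow
    @metadata_filename = File.basename(workflow.metadata_path)
  end

  # Parse the file only when it is the metadata file; other files are
  # never read because the block is never called for them.
  def notice(relative_position)
    return unless relative_position == @metadata_filename
    @logger.operation(:metadata_parsing) do
      @workflow.metadata = JSON.parse(yield)
    end
  end
end

logger = SketchLogger.new
workflow = SketchWorkflow.new('/tmp/repo/deploy.meta')
scanner = SketchWorkflowMetadata.new(logger)
scanner.begin(workflow)
scanner.notice('deploy.def') { raise 'definition should not be read' }
scanner.notice('deploy.meta') { '{"name":"deploy","version":1}' }
puts workflow.metadata.inspect
```

Malformed JSON would raise `JSON::ParserError` out of `notice` in this sketch; whether the real logger rescues it is not shown in the diff.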
85 lib/right_scraper/scanners/workflow_s3_upload.rb
@@ -0,0 +1,85 @@
+#--
+# Copyright: Copyright (c) 2010-2011 RightScale, Inc.
+#
+# Permission is hereby granted, free of charge, to any person obtaining
+# a copy of this software and associated documentation files (the
+# 'Software'), to deal in the Software without restriction, including
+# without limitation the rights to use, copy, modify, merge, publish,
+# distribute, sublicense, and/or sell copies of the Software, and to
+# permit persons to whom the Software is furnished to do so, subject to
+# the following conditions:
+#
+# The above copyright notice and this permission notice shall be
+# included in all copies or substantial portions of the Software.
+#
+# THE SOFTWARE IS PROVIDED 'AS IS', WITHOUT WARRANTY OF ANY KIND,
+# EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+# MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
+# IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
+# CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
+# TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
+# SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
+#++
+require 'right_aws'
+require 'json'
+
+module RightScraper
+ module Scanners
+ # Upload workflow definition and metadata to an S3 bucket.
+ class WorkflowS3Upload < Base
+ # Create a new WorkflowS3Upload. In addition to the options recognized
+ # by Base, this class recognizes <tt>:s3_key</tt>,
+ # <tt>:s3_secret</tt>, and <tt>:s3_bucket</tt> and requires all
+ # of those.
+ #
+ # === Options