Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Changes search index format to fasten pod search --full command. #4249

Merged
merged 12 commits into from Oct 28, 2015

Conversation

manuyavuz
Copy link
Member

Detailed discussion can be found here: CocoaPods/cocoapods-search#8

  • Previous implementation was having the drawback of traversing and performing gsub on all index strings for each pod specification.
  • New implementation stores words as keys and list of pods containing corresponding word inside their specification as the values for corresponding hash keys. Therefore, while searching for a query, we only need to check if query is matched with any key in index hash, if so, we will add corresponding list of spec names to a Set object. Resulted Set object gives us the search result. Using this policy, sources manager can perform a faster search operation.

@floere
Copy link
Member

floere commented Oct 3, 2015

This is great! 🚀

Were you able to run the specs? There are two failures, reproduced here:

Pod::Informative: [!] Unable to find a pod with name, author, summary, or descriptionmatching `Chunky`
    /Users/travis/build/CocoaPods/CocoaPods/lib/cocoapods/sources_manager.rb:130:in `search_by_name': In general - can perform a full text search of the sets
    spec/unit/sources_manager_spec.rb:75:in `block (3 levels) in <module:Pod>'
    /Users/travis/build/CocoaPods/CocoaPods/spec/spec_helper/pre_flight.rb:33:in `call'
    /Users/travis/build/CocoaPods/CocoaPods/spec/spec_helper/pre_flight.rb:33:in `block in <class:Context>'
    spec/unit/sources_manager_spec.rb:73:in `block (2 levels) in <module:Pod>'
    spec/unit/sources_manager_spec.rb:37:in `block in <module:Pod>'
    spec/unit/sources_manager_spec.rb:30:in `<module:Pod>'
    spec/unit/sources_manager_spec.rb:29:in `<top (required)>'
Pod::Informative: [!] Unable to find a pod with name, author, summary, or descriptionmatching `Ch[aeiou]nky`
    /Users/travis/build/CocoaPods/CocoaPods/lib/cocoapods/sources_manager.rb:130:in `search_by_name': In general - can perform a full text regexp search of the sets
    spec/unit/sources_manager_spec.rb:82:in `block (3 levels) in <module:Pod>'
    /Users/travis/build/CocoaPods/CocoaPods/spec/spec_helper/pre_flight.rb:33:in `call'
    /Users/travis/build/CocoaPods/CocoaPods/spec/spec_helper/pre_flight.rb:33:in `block in <class:Context>'
    spec/unit/sources_manager_spec.rb:80:in `block (2 levels) in <module:Pod>'
    spec/unit/sources_manager_spec.rb:37:in `block in <module:Pod>'
    spec/unit/sources_manager_spec.rb:30:in `<module:Pod>'
    spec/unit/sources_manager_spec.rb:29:in `<top (required)>'

@manuyavuz
Copy link
Member Author

Yeah, I observed them also, but couldn't have time to fix it. Hoping to fix these tomorrow.

@floere
Copy link
Member

floere commented Oct 3, 2015

That's great! 🚀

@manuyavuz
Copy link
Member Author

Hey @floere,

I checked again and discovered that failures were caused by Core repo here not containing my changes in CocoaPods/Core#265.

How should I manage this? This PR is tightly coupled to changes in Core repo, so I guess it will not pass unless the PR in Core repo is merged.

Need suggestions.

@floere
Copy link
Member

floere commented Oct 4, 2015

Are the Core changes standalone? (They work independently?)

On Sun, Oct 4, 2015 at 3:09 PM, Muhammed Yavuz Nuzumlali
notifications@github.com wrote:

Hey @floere,
I checked again and discovered that failures were caused by Core repo here not containing my changes in CocoaPods/Core#265.
How should I manage this? This PR is tightly coupled to changes in Core repo, so I guess it will not pass unless the PR in Core repo is merged.

Need suggestions.

Reply to this email directly or view it on GitHub:
#4249 (comment)

@manuyavuz
Copy link
Member Author

Core include changes on aggregate.rb and can run standalone. However, sources_manager.rb here is dependent on the changes in aggregate code. Therefore, CocoaPods/Core#265 can pass individually, but this won't pass.

@segiddins
Copy link
Member

You can point the gem file here to your branch until the Core PR is merged

@manuyavuz
Copy link
Member Author

@segiddins I tried your suggestion but probably missing something.

Added the following line to Gemfile of CocoaPods project:

  gem 'cocoapods-core', :git => 'https://github.com/manuyavuz/Core.git', :branch => 'improvement/searchPerformance'

When I run rake spec I get the following error:

[!] There was an error parsing `Gemfile`: You cannot specify the same gem twice coming from different sources.
You specified that cocoapods-core (>= 0) should come from https://github.com/CocoaPods/Core.git (at master) and https://github.com/manuyavuz/Core.git (at improvement/searchPerformance)
. Bundler cannot continue.

 #  from /Users/manuyavuz/Development/OpenSource/Cocoapods/Rainforest/CocoaPods/Gemfile:29
 #  -------------------------------------------
 #    gem 'cocoapods-dependencies'
 >    gem 'cocoapods-core', :git => 'https://github.com/manuyavuz/Core.git', :branch => 'improvement/searchPerformance'
 #    gem 'bacon'
 #  -------------------------------------------

Is this default Core dependency come from gem 'cocoapods-dependencies'? If so, how can I solve this? Setting local dependency won't work because travis do not run using local dependencies I guess?

@manuyavuz
Copy link
Member Author

Ah, got it. Removed cp_gem line for Core dependency.

@manuyavuz manuyavuz force-pushed the improvement/searchPerformance branch from 702c1ce to ebf5d8c Compare October 5, 2015 09:25
@manuyavuz
Copy link
Member Author

@floere, previous errors have gone now. However, there are two fails for functional specs. I was not observing them in the previous commit.

I think they may be related to changes included from that time until now because I rebased the branch with master before sending the last commit.

Also the failure seems not related to any source file that I've changed.

@manuyavuz
Copy link
Member Author

Hi,

I've completed update index policy. All tests pass but got a strange error during running Inch.

--------------------------------------------------------------------------------
Running Inch
--------------------------------------------------------------------------------
Parse docs …
.............................................................................................
Evaluating …
................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................
rake aborted!
ArgumentError: wrong number of arguments (2 for 1)
/Users/travis/build/CocoaPods/CocoaPods/vendor/bundle/ruby/2.0.0/gems/colored-1.2/lib/colored.rb:83:in `color'
/Users/travis/build/CocoaPods/CocoaPods/vendor/bundle/ruby/2.0.0/gems/term-ansicolor-1.3.0/lib/term/ansicolor.rb:226:in `on_color'
/Users/travis/build/CocoaPods/CocoaPods/vendor/bundle/ruby/2.0.0/gems/inch-0.5.10/lib/inch/utils/ui.rb:65:in `__header'
/Users/travis/build/CocoaPods/CocoaPods/vendor/bundle/ruby/2.0.0/gems/inch-0.5.10/lib/inch/utils/ui.rb:47:in `header'
/Users/travis/build/CocoaPods/CocoaPods/vendor/bundle/ruby/2.0.0/gems/inch-0.5.10/lib/inch/cli/command/output/list.rb:27:in `block in display_list'
/Users/travis/build/CocoaPods/CocoaPods/vendor/bundle/ruby/2.0.0/gems/inch-0.5.10/lib/inch/cli/command/output/list.rb:22:in `each'
/Users/travis/build/CocoaPods/CocoaPods/vendor/bundle/ruby/2.0.0/gems/inch-0.5.10/lib/inch/cli/command/output/list.rb:22:in `display_list'
/Users/travis/build/CocoaPods/CocoaPods/vendor/bundle/ruby/2.0.0/gems/inch-0.5.10/lib/inch/cli/command/output/list.rb:16:in `initialize'
/Users/travis/build/CocoaPods/CocoaPods/Rakefile:344:in `new'
/Users/travis/build/CocoaPods/CocoaPods/Rakefile:344:in `block (2 levels) in <top (required)>'
/Users/travis/build/CocoaPods/CocoaPods/Rakefile:146:in `block (2 levels) in <top (required)>'
Tasks: TOP => inch:spec
(See full trace by running task with --trace)

What could be the reason?

@segiddins
Copy link
Member

That happens when inch fails, so something isn't properly documented.

@manuyavuz
Copy link
Member Author

@floere with the last commit all tests pass. You may check out them now.

@segiddins
Copy link
Member

W00t! Excellent work, @manuyavuz , really look forward to reviewing this! If you want, feel free to give me your email so we can get you into our slack org?

@manuyavuz
Copy link
Member Author

Thanks @segiddins! I would be very happy to join the slack team because I'm hoping to contribute more. You can use the email in my github profile.

Waiting to hear for more!

sources.each do |source|
UI.section "Updating spec repo `#{source.name}`" do
Dir.chdir(source.repo) do
begin
prev_commit_hash = (git! %w(rev-parse HEAD)).strip
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we move this updating logic into Source ? i.e. so it has an update method that returns the changes spec paths?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You mean moving also the following, right?

output = git! %w(pull --ff-only)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yup!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@segiddins should we also move UI code there into source?

I don't think it's suitable because we would need config object inside source to handle UI things.

I'm thinking of making a method returning a tuple variable including git pull output and changed paths. It would be sth like this:

output, changed_spec_paths = source.update

Does this fit?

@segiddins
Copy link
Member

Yup

@manuyavuz manuyavuz force-pushed the improvement/searchPerformance branch 3 times, most recently from 9bd70c7 to 868d911 Compare October 17, 2015 15:53
def save_search_index(index)
require 'json'
@updated_search_index = index
File.open(search_index_path, 'w') do |file|
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

search_index_path should be a Pathname

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this way works too. Current implementation also performs open in the same way as the following:

      File.open(search_index_path, 'w') do |file|
        file.write(search_index.to_yaml)
      end

What kind of problem may we face?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No problem, we just try to stay away from using strings as paths and use pathnames instead

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it indeed gives a pathname. It's a method of sources_manager as the following (line 202):

      # @return [Pathname] The path where the search index should be stored.
      #
      def search_index_path
        Config.instance.search_index_file
      end

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in that case, lets just use search_index_path.open

@manuyavuz
Copy link
Member Author

I've completed implementation and fixed multi word search issue. Solving that required a change in cocoapods-search plugin also (CocoaPods/cocoapods-search#10)

I think this is ready for final review too.

@manuyavuz
Copy link
Member Author

@segiddins Wouldn't original update method (therefore the pod repo update command) be blocked if I wait for that pid to finish?

@segiddins
Copy link
Member

Good point. I'd just expect that the background method get's called then, manually call the synchronous method, and then verify

@manuyavuz
Copy link
Member Author

I actually did as you suggest in unit tests of SourcesManager class. However, not sure if this is an appropriate thing to do in a functional spec. Failed test is the following:

    it 'updates a repository' do
      upstream = SpecHelper.temporary_directory + 'upstream'
      FileUtils.cp_r(test_repo_path, upstream)
      Dir.chdir(test_repo_path) do
        `git remote add origin #{upstream}`
        `git remote -v`
        `git fetch -q`
        `git branch --set-upstream-to=origin/master master`
      end
      lambda { command('repo', 'update').run }.should.not.raise
    end

Is it alright to change it like the following?:

    it 'updates a repository' do
      upstream = SpecHelper.temporary_directory + 'upstream'
      FileUtils.cp_r(test_repo_path, upstream)
      Dir.chdir(test_repo_path) do
        `git remote add origin #{upstream}`
        `git remote -v`
        `git fetch -q`
        `git branch --set-upstream-to=origin/master master`
      end
      # Add expect for bg method
      SourcesManager.expects(:update_search_index_if_needed_in_background).with({}).returns(nil)
      lambda { command('repo', 'update').run }.should.not.raise
    end

If this is OK, I'll make a last rebase and push final version.

@segiddins
Copy link
Member

IMHO that's fine

- Previous implementation was having the drawback of traversing and performing `gsub` on all index strings for each pod specification.
- New implementation stores words as keys and list of pods containing corresponding word inside their specification as the values for corresponding hash keys. Therefore, while searching for a query, we only need to check if query is matched with any key in index hash, if so, we will add corresponding list of spec names to a Set object. Resulted Set object gives us the search result. Using this policy, sources manager can perform a faster search operation.
    - Override update_git_repo method of Source class in SourcesManager, and uses Executable instead of backticks. This provides streaming command output to STDOUT.
@manuyavuz manuyavuz force-pushed the improvement/searchPerformance branch from fa15c40 to 7cfdff8 Compare October 27, 2015 16:07
@manuyavuz manuyavuz force-pushed the improvement/searchPerformance branch 2 times, most recently from 5a5f96f to 956ca76 Compare October 27, 2015 17:24
@manuyavuz manuyavuz force-pushed the improvement/searchPerformance branch from 956ca76 to a7f504f Compare October 27, 2015 17:38
@manuyavuz
Copy link
Member Author

Offenses made me sick 😤

I think this is ready to 🚀 also. You can now make a last glance.

SourcesManager.update(test_repo_path.basename.to_s, true)
UI.output.should.match /is up to date/
end

it 'uses the only fast forward git option' do
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test can come back

@segiddins
Copy link
Member

😄 super well done!

@floere
Copy link
Member

floere commented Oct 27, 2015

👍

@manuyavuz
Copy link
Member Author

It appeared again 😉

@segiddins
Copy link
Member

🚀 🚢
Great work @manuyavuz!

segiddins added a commit that referenced this pull request Oct 28, 2015
Changes search index format to fasten `pod search --full` command.
@segiddins segiddins merged commit f757808 into CocoaPods:master Oct 28, 2015
@manuyavuz
Copy link
Member Author

Great!

@manuyavuz manuyavuz deleted the improvement/searchPerformance branch October 28, 2015 19:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants